From: Changpeng Liu <changpeng.liu@intel.com>
To: dev@dpdk.org
Cc: yuanhan.liu@intel.com, james.r.harris@intel.com
Date: Thu, 15 Sep 2016 08:28:17 +0800
Message-Id: <1473899298-4580-1-git-send-email-changpeng.liu@intel.com>
In-Reply-To: <1473855300-3066-1-git-send-email-changpeng.liu@intel.com>
References: <1473855300-3066-1-git-send-email-changpeng.liu@intel.com>
Subject: [dpdk-dev] [PATCH v2 1/2] vhost: change the vhost library to a common framework which can support more VIRTIO devices

For storage virtualization use cases, vhost-scsi is becoming a popular solution for exposing storage to VMs, but no user space vhost-scsi-user solution exists today. SPDK (Storage Performance Development Kit, https://github.com/spdk/spdk) will provide a user space vhost-scsi target to support multiple VMs through QEMU. Since SPDK is built on top of the DPDK libraries, we would like to use the DPDK vhost library as the communication channel between QEMU and the vhost-scsi target application.

Currently the DPDK vhost library supports only the VIRTIO_ID_NET device type; we would like to extend it to support VIRTIO_ID_SCSI and VIRTIO_ID_BLK as well. Most of the vhost library can be reused; there are only a few differences:

1. A VIRTIO SCSI device has a different vring layout than a VIRTIO NET device: at least 3 vring queues are needed for the SCSI device type.
2. VIRTIO SCSI needs several extra message operation codes, such as SCSI_SET_ENDPOINT/SCSI_CLEAR_ENDPOINT.

First, we would like to extend the DPDK vhost library into a common framework that makes it easy to add other VIRTIO device types. To implement this, we add a new data structure, virtio_dev, which delivers socket messages to the different VIRTIO devices; each specific VIRTIO device registers its callbacks with virtio_dev. Secondly, we would like to upstream a patch to the QEMU community that adds the vhost-scsi specific operation commands, such as SCSI_SET_ENDPOINT and SCSI_CLEAR_ENDPOINT, as well as the user space feature bits. Finally, once the QEMU patch set is merged, we will add VIRTIO_ID_SCSI support to the DPDK vhost library, together with an example vhost-scsi target application that can attach a SCSI device to a VM.

This patch set changes the vhost library into a common framework to which other VIRTIO device types can be added in the future.
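To give a feel for how a new device type is expected to plug into the framework, below is a minimal sketch (illustration only, not part of this patch: only struct virtio_dev, struct vhost_virtqueue and the fn_table callbacks exist in this series; every vhost_scsi_* name is a hypothetical placeholder) of how a VIRTIO_ID_SCSI backend could register its callbacks, mirroring what vhost_net_device_init() does for net:

	/*
	 * Hypothetical sketch, NOT part of this patch.  Only struct
	 * virtio_dev, struct vhost_virtqueue and the virtio_dev_table
	 * callbacks come from this series; all vhost_scsi_* names are
	 * placeholders for illustration.
	 */
	#include <stdint.h>
	#include <stddef.h>

	#include "vhost_device.h"

	/* virtio-scsi needs at least 3 vrings: control, event and request. */
	#define VHOST_SCSI_MIN_QUEUES	3

	static struct vhost_virtqueue *scsi_vqs[VHOST_SCSI_MIN_QUEUES];

	static uint64_t
	vhost_scsi_get_features(struct virtio_dev *device)
	{
		(void)device;
		return 0;	/* SCSI feature bits would be advertised here */
	}

	static struct vhost_virtqueue *
	vhost_scsi_get_queues(struct virtio_dev *device, uint16_t queue_id)
	{
		(void)device;
		if (queue_id >= VHOST_SCSI_MIN_QUEUES)
			return NULL;
		return scsi_vqs[queue_id];
	}

	static uint32_t
	vhost_scsi_get_queue_num(struct virtio_dev *device)
	{
		(void)device;
		return VHOST_SCSI_MIN_QUEUES;
	}

	/* Counterpart of vhost_net_device_init(): fill in the SCSI callbacks. */
	void
	vhost_scsi_device_init(struct virtio_dev *device)
	{
		device->fn_table.vhost_dev_get_features = vhost_scsi_get_features;
		device->fn_table.vhost_dev_get_queues = vhost_scsi_get_queues;
		device->fn_table.vhost_dev_get_queue_num = vhost_scsi_get_queue_num;
		/*
		 * The remaining virtio_dev_table callbacks would be
		 * registered the same way once the SCSI message handlers
		 * exist.
		 */
	}

vhost_new_device() would then simply grow a VIRTIO_ID_SCSI case that calls vhost_scsi_device_init() alongside the existing VIRTIO_ID_NET case.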
Signed-off-by: Changpeng Liu --- lib/librte_vhost/Makefile | 4 +- lib/librte_vhost/rte_virtio_dev.h | 140 ++++++++ lib/librte_vhost/rte_virtio_net.h | 97 +----- lib/librte_vhost/socket.c | 6 +- lib/librte_vhost/vhost.c | 421 ------------------------ lib/librte_vhost/vhost.h | 288 ----------------- lib/librte_vhost/vhost_device.h | 230 +++++++++++++ lib/librte_vhost/vhost_net.c | 659 ++++++++++++++++++++++++++++++++++++++ lib/librte_vhost/vhost_net.h | 126 ++++++++ lib/librte_vhost/vhost_user.c | 451 +++++++++++++------------- lib/librte_vhost/vhost_user.h | 17 +- lib/librte_vhost/virtio_net.c | 37 ++- 12 files changed, 1426 insertions(+), 1050 deletions(-) create mode 100644 lib/librte_vhost/rte_virtio_dev.h delete mode 100644 lib/librte_vhost/vhost.c delete mode 100644 lib/librte_vhost/vhost.h create mode 100644 lib/librte_vhost/vhost_device.h create mode 100644 lib/librte_vhost/vhost_net.c create mode 100644 lib/librte_vhost/vhost_net.h diff --git a/lib/librte_vhost/Makefile b/lib/librte_vhost/Makefile index 415ffc6..af30491 100644 --- a/lib/librte_vhost/Makefile +++ b/lib/librte_vhost/Makefile @@ -47,11 +47,11 @@ LDLIBS += -lnuma endif # all source are stored in SRCS-y -SRCS-$(CONFIG_RTE_LIBRTE_VHOST) := fd_man.c socket.c vhost.c vhost_user.c \ +SRCS-$(CONFIG_RTE_LIBRTE_VHOST) := fd_man.c socket.c vhost_net.c vhost_user.c \ virtio_net.c # install includes -SYMLINK-$(CONFIG_RTE_LIBRTE_VHOST)-include += rte_virtio_net.h +SYMLINK-$(CONFIG_RTE_LIBRTE_VHOST)-include += rte_virtio_net.h rte_virtio_dev.h # dependencies DEPDIRS-$(CONFIG_RTE_LIBRTE_VHOST) += lib/librte_eal diff --git a/lib/librte_vhost/rte_virtio_dev.h b/lib/librte_vhost/rte_virtio_dev.h new file mode 100644 index 0000000..e3c857a --- /dev/null +++ b/lib/librte_vhost/rte_virtio_dev.h @@ -0,0 +1,140 @@ +/*- + * BSD LICENSE + * + * Copyright(c) 2010-2014 Intel Corporation. All rights reserved. + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. + * * Neither the name of Intel Corporation nor the names of its + * contributors may be used to endorse or promote products derived + * from this software without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ */ + +#ifndef _VIRTIO_DEV_H_ +#define _VIRTIO_DEV_H_ + +/* Device types and capabilities flags */ +#define RTE_VHOST_USER_CLIENT (1ULL << 0) +#define RTE_VHOST_USER_NO_RECONNECT (1ULL << 1) +#define RTE_VHOST_USER_TX_ZERO_COPY (1ULL << 2) + +#define RTE_VHOST_USER_DEV_NET (1ULL << 32) + +/** + * Device and vring operations. + * + * Make sure to set VIRTIO_DEV_RUNNING to the device flags in new_device and + * remove it in destroy_device. + * + */ +struct virtio_net_device_ops { + int (*new_device)(int vid); /**< Add device. */ + void (*destroy_device)(int vid); /**< Remove device. */ + + int (*vring_state_changed)(int vid, uint16_t queue_id, int enable); /**< triggered when a vring is enabled or disabled */ + + void *reserved[5]; /**< Reserved for future extension */ +}; + +/** + * Disable features in feature_mask. Returns 0 on success. + */ +int rte_vhost_feature_disable(uint64_t feature_mask); + +/** + * Enable features in feature_mask. Returns 0 on success. + */ +int rte_vhost_feature_enable(uint64_t feature_mask); + +/* Returns currently supported vhost features */ +uint64_t rte_vhost_feature_get(void); + +int rte_vhost_enable_guest_notification(int vid, uint16_t queue_id, int enable); + +/** + * Register vhost driver. path could be different for multiple + * instance support. + */ +int rte_vhost_driver_register(const char *path, uint64_t flags); + +/* Unregister vhost driver. This is only meaningful to vhost user. */ +int rte_vhost_driver_unregister(const char *path); + +/* Start vhost driver session blocking loop. */ +int rte_vhost_driver_session_start(void); + +/** + * Get the numa node from which the virtio net device's memory + * is allocated. + * + * @param vid + * virtio-net device ID + * + * @return + * The numa node, -1 on failure + */ +int rte_vhost_get_numa_node(int vid); + +/** + * Get the number of queues the device supports. + * + * @param vid + * virtio-net device ID + * + * @return + * The number of queues, 0 on failure + */ +uint32_t rte_vhost_get_queue_num(int vid); + +/** + * Get how many avail entries are left in the queue + * + * @param vid + * virtio-net device ID + * @param queue_id + * virtio queue index + * + * @return + * num of avail entries left + */ +uint16_t rte_vhost_avail_entries(int vid, uint16_t queue_id); + +/** + * Get the virtio net device's ifname. For vhost-cuse, ifname is the + * path of the char device. For vhost-user, ifname is the vhost-user + * socket file path. + * + * @param vid + * virtio-net device ID + * @param buf + * The buffer to store the queried ifname + * @param len + * The length of buf + * + * @return + * 0 on success, -1 on failure + */ +int rte_vhost_get_ifname(int vid, char *buf, size_t len); + +#endif /* _VIRTIO_DEV_H_ */ \ No newline at end of file diff --git a/lib/librte_vhost/rte_virtio_net.h b/lib/librte_vhost/rte_virtio_net.h index 3ddc9ca..86ede8a 100644 --- a/lib/librte_vhost/rte_virtio_net.h +++ b/lib/librte_vhost/rte_virtio_net.h @@ -50,107 +50,14 @@ #include #include #include - -#define RTE_VHOST_USER_CLIENT (1ULL << 0) -#define RTE_VHOST_USER_NO_RECONNECT (1ULL << 1) -#define RTE_VHOST_USER_TX_ZERO_COPY (1ULL << 2) +#include /* Enum for virtqueue management. */ enum {VIRTIO_RXQ, VIRTIO_TXQ, VIRTIO_QNUM}; -/** - * Device and vring operations. - */ -struct virtio_net_device_ops { - int (*new_device)(int vid); /**< Add device. */ - void (*destroy_device)(int vid); /**< Remove device. 
*/ - - int (*vring_state_changed)(int vid, uint16_t queue_id, int enable); /**< triggered when a vring is enabled or disabled */ - - void *reserved[5]; /**< Reserved for future extension */ -}; - -/** - * Disable features in feature_mask. Returns 0 on success. - */ -int rte_vhost_feature_disable(uint64_t feature_mask); - -/** - * Enable features in feature_mask. Returns 0 on success. - */ -int rte_vhost_feature_enable(uint64_t feature_mask); - -/* Returns currently supported vhost features */ -uint64_t rte_vhost_feature_get(void); - -int rte_vhost_enable_guest_notification(int vid, uint16_t queue_id, int enable); - -/** - * Register vhost driver. path could be different for multiple - * instance support. - */ -int rte_vhost_driver_register(const char *path, uint64_t flags); - -/* Unregister vhost driver. This is only meaningful to vhost user. */ -int rte_vhost_driver_unregister(const char *path); - /* Register callbacks. */ int rte_vhost_driver_callback_register(struct virtio_net_device_ops const * const); -/* Start vhost driver session blocking loop. */ -int rte_vhost_driver_session_start(void); - -/** - * Get the numa node from which the virtio net device's memory - * is allocated. - * - * @param vid - * virtio-net device ID - * - * @return - * The numa node, -1 on failure - */ -int rte_vhost_get_numa_node(int vid); -/** - * Get the number of queues the device supports. - * - * @param vid - * virtio-net device ID - * - * @return - * The number of queues, 0 on failure - */ -uint32_t rte_vhost_get_queue_num(int vid); - -/** - * Get the virtio net device's ifname. For vhost-cuse, ifname is the - * path of the char device. For vhost-user, ifname is the vhost-user - * socket file path. - * - * @param vid - * virtio-net device ID - * @param buf - * The buffer to stored the queried ifname - * @param len - * The length of buf - * - * @return - * 0 on success, -1 on failure - */ -int rte_vhost_get_ifname(int vid, char *buf, size_t len); - -/** - * Get how many avail entries are left in the queue - * - * @param vid - * virtio-net device ID - * @param queue_id - * virtio queue index - * - * @return - * num of avail entires left - */ -uint16_t rte_vhost_avail_entries(int vid, uint16_t queue_id); /** * This function adds buffers to the virtio devices RX virtqueue. 
Buffers can @@ -191,4 +98,4 @@ uint16_t rte_vhost_enqueue_burst(int vid, uint16_t queue_id, uint16_t rte_vhost_dequeue_burst(int vid, uint16_t queue_id, struct rte_mempool *mbuf_pool, struct rte_mbuf **pkts, uint16_t count); -#endif /* _VIRTIO_NET_H_ */ +#endif /* _VIRTIO_NET_H_ */ \ No newline at end of file diff --git a/lib/librte_vhost/socket.c b/lib/librte_vhost/socket.c index 5c3962d..1474c98 100644 --- a/lib/librte_vhost/socket.c +++ b/lib/librte_vhost/socket.c @@ -49,7 +49,7 @@ #include #include "fd_man.h" -#include "vhost.h" +#include "vhost_device.h" #include "vhost_user.h" /* @@ -62,6 +62,7 @@ struct vhost_user_socket { int connfd; bool is_server; bool reconnect; + int type; bool tx_zero_copy; }; @@ -194,7 +195,7 @@ vhost_user_add_connection(int fd, struct vhost_user_socket *vsocket) return; } - vid = vhost_new_device(); + vid = vhost_new_device(vsocket->type); if (vid == -1) { close(fd); free(conn); @@ -525,6 +526,7 @@ rte_vhost_driver_register(const char *path, uint64_t flags) goto out; } + vsocket->type = VIRTIO_ID_NET; vhost_user.vsockets[vhost_user.vsocket_cnt++] = vsocket; out: diff --git a/lib/librte_vhost/vhost.c b/lib/librte_vhost/vhost.c deleted file mode 100644 index 5461e5b..0000000 --- a/lib/librte_vhost/vhost.c +++ /dev/null @@ -1,421 +0,0 @@ -/*- - * BSD LICENSE - * - * Copyright(c) 2010-2016 Intel Corporation. All rights reserved. - * All rights reserved. - * - * Redistribution and use in source and binary forms, with or without - * modification, are permitted provided that the following conditions - * are met: - * - * * Redistributions of source code must retain the above copyright - * notice, this list of conditions and the following disclaimer. - * * Redistributions in binary form must reproduce the above copyright - * notice, this list of conditions and the following disclaimer in - * the documentation and/or other materials provided with the - * distribution. - * * Neither the name of Intel Corporation nor the names of its - * contributors may be used to endorse or promote products derived - * from this software without specific prior written permission. - * - * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS - * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT - * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR - * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT - * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, - * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT - * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, - * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY - * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT - * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE - * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. - */ - -#include -#include -#include -#include -#include -#ifdef RTE_LIBRTE_VHOST_NUMA -#include -#endif - -#include -#include -#include -#include -#include -#include - -#include "vhost.h" - -#define VHOST_USER_F_PROTOCOL_FEATURES 30 - -/* Features supported by this lib. 
*/ -#define VHOST_SUPPORTED_FEATURES ((1ULL << VIRTIO_NET_F_MRG_RXBUF) | \ - (1ULL << VIRTIO_NET_F_CTRL_VQ) | \ - (1ULL << VIRTIO_NET_F_CTRL_RX) | \ - (1ULL << VIRTIO_NET_F_GUEST_ANNOUNCE) | \ - (VHOST_SUPPORTS_MQ) | \ - (1ULL << VIRTIO_F_VERSION_1) | \ - (1ULL << VHOST_F_LOG_ALL) | \ - (1ULL << VHOST_USER_F_PROTOCOL_FEATURES) | \ - (1ULL << VIRTIO_NET_F_HOST_TSO4) | \ - (1ULL << VIRTIO_NET_F_HOST_TSO6) | \ - (1ULL << VIRTIO_NET_F_CSUM) | \ - (1ULL << VIRTIO_NET_F_GUEST_CSUM) | \ - (1ULL << VIRTIO_NET_F_GUEST_TSO4) | \ - (1ULL << VIRTIO_NET_F_GUEST_TSO6)) - -uint64_t VHOST_FEATURES = VHOST_SUPPORTED_FEATURES; - -struct virtio_net *vhost_devices[MAX_VHOST_DEVICE]; - -/* device ops to add/remove device to/from data core. */ -struct virtio_net_device_ops const *notify_ops; - -struct virtio_net * -get_device(int vid) -{ - struct virtio_net *dev = vhost_devices[vid]; - - if (unlikely(!dev)) { - RTE_LOG(ERR, VHOST_CONFIG, - "(%d) device not found.\n", vid); - } - - return dev; -} - -static void -cleanup_vq(struct vhost_virtqueue *vq, int destroy) -{ - if ((vq->callfd >= 0) && (destroy != 0)) - close(vq->callfd); - if (vq->kickfd >= 0) - close(vq->kickfd); -} - -/* - * Unmap any memory, close any file descriptors and - * free any memory owned by a device. - */ -void -cleanup_device(struct virtio_net *dev, int destroy) -{ - uint32_t i; - - vhost_backend_cleanup(dev); - - for (i = 0; i < dev->virt_qp_nb; i++) { - cleanup_vq(dev->virtqueue[i * VIRTIO_QNUM + VIRTIO_RXQ], destroy); - cleanup_vq(dev->virtqueue[i * VIRTIO_QNUM + VIRTIO_TXQ], destroy); - } -} - -/* - * Release virtqueues and device memory. - */ -static void -free_device(struct virtio_net *dev) -{ - uint32_t i; - - for (i = 0; i < dev->virt_qp_nb; i++) - rte_free(dev->virtqueue[i * VIRTIO_QNUM]); - - rte_free(dev); -} - -static void -init_vring_queue(struct vhost_virtqueue *vq, int qp_idx) -{ - memset(vq, 0, sizeof(struct vhost_virtqueue)); - - vq->kickfd = VIRTIO_UNINITIALIZED_EVENTFD; - vq->callfd = VIRTIO_UNINITIALIZED_EVENTFD; - - /* Backends are set to -1 indicating an inactive device. 
*/ - vq->backend = -1; - - /* always set the default vq pair to enabled */ - if (qp_idx == 0) - vq->enabled = 1; - - TAILQ_INIT(&vq->zmbuf_list); -} - -static void -init_vring_queue_pair(struct virtio_net *dev, uint32_t qp_idx) -{ - uint32_t base_idx = qp_idx * VIRTIO_QNUM; - - init_vring_queue(dev->virtqueue[base_idx + VIRTIO_RXQ], qp_idx); - init_vring_queue(dev->virtqueue[base_idx + VIRTIO_TXQ], qp_idx); -} - -static void -reset_vring_queue(struct vhost_virtqueue *vq, int qp_idx) -{ - int callfd; - - callfd = vq->callfd; - init_vring_queue(vq, qp_idx); - vq->callfd = callfd; -} - -static void -reset_vring_queue_pair(struct virtio_net *dev, uint32_t qp_idx) -{ - uint32_t base_idx = qp_idx * VIRTIO_QNUM; - - reset_vring_queue(dev->virtqueue[base_idx + VIRTIO_RXQ], qp_idx); - reset_vring_queue(dev->virtqueue[base_idx + VIRTIO_TXQ], qp_idx); -} - -int -alloc_vring_queue_pair(struct virtio_net *dev, uint32_t qp_idx) -{ - struct vhost_virtqueue *virtqueue = NULL; - uint32_t virt_rx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_RXQ; - uint32_t virt_tx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_TXQ; - - virtqueue = rte_malloc(NULL, - sizeof(struct vhost_virtqueue) * VIRTIO_QNUM, 0); - if (virtqueue == NULL) { - RTE_LOG(ERR, VHOST_CONFIG, - "Failed to allocate memory for virt qp:%d.\n", qp_idx); - return -1; - } - - dev->virtqueue[virt_rx_q_idx] = virtqueue; - dev->virtqueue[virt_tx_q_idx] = virtqueue + VIRTIO_TXQ; - - init_vring_queue_pair(dev, qp_idx); - - dev->virt_qp_nb += 1; - - return 0; -} - -/* - * Reset some variables in device structure, while keeping few - * others untouched, such as vid, ifname, virt_qp_nb: they - * should be same unless the device is removed. - */ -void -reset_device(struct virtio_net *dev) -{ - uint32_t i; - - dev->features = 0; - dev->protocol_features = 0; - dev->flags = 0; - - for (i = 0; i < dev->virt_qp_nb; i++) - reset_vring_queue_pair(dev, i); -} - -/* - * Function is called from the CUSE open function. The device structure is - * initialised and a new entry is added to the device configuration linked - * list. - */ -int -vhost_new_device(void) -{ - struct virtio_net *dev; - int i; - - dev = rte_zmalloc(NULL, sizeof(struct virtio_net), 0); - if (dev == NULL) { - RTE_LOG(ERR, VHOST_CONFIG, - "Failed to allocate memory for new dev.\n"); - return -1; - } - - for (i = 0; i < MAX_VHOST_DEVICE; i++) { - if (vhost_devices[i] == NULL) - break; - } - if (i == MAX_VHOST_DEVICE) { - RTE_LOG(ERR, VHOST_CONFIG, - "Failed to find a free slot for new device.\n"); - return -1; - } - - vhost_devices[i] = dev; - dev->vid = i; - - return i; -} - -/* - * Function is called from the CUSE release function. This function will - * cleanup the device and remove it from device configuration linked list. - */ -void -vhost_destroy_device(int vid) -{ - struct virtio_net *dev = get_device(vid); - - if (dev == NULL) - return; - - if (dev->flags & VIRTIO_DEV_RUNNING) { - dev->flags &= ~VIRTIO_DEV_RUNNING; - notify_ops->destroy_device(vid); - } - - cleanup_device(dev, 1); - free_device(dev); - - vhost_devices[vid] = NULL; -} - -void -vhost_set_ifname(int vid, const char *if_name, unsigned int if_len) -{ - struct virtio_net *dev; - unsigned int len; - - dev = get_device(vid); - if (dev == NULL) - return; - - len = if_len > sizeof(dev->ifname) ? 
- sizeof(dev->ifname) : if_len; - - strncpy(dev->ifname, if_name, len); - dev->ifname[sizeof(dev->ifname) - 1] = '\0'; -} - -void -vhost_enable_tx_zero_copy(int vid) -{ - struct virtio_net *dev = get_device(vid); - - if (dev == NULL) - return; - - dev->tx_zero_copy = 1; -} - -int -rte_vhost_get_numa_node(int vid) -{ -#ifdef RTE_LIBRTE_VHOST_NUMA - struct virtio_net *dev = get_device(vid); - int numa_node; - int ret; - - if (dev == NULL) - return -1; - - ret = get_mempolicy(&numa_node, NULL, 0, dev, - MPOL_F_NODE | MPOL_F_ADDR); - if (ret < 0) { - RTE_LOG(ERR, VHOST_CONFIG, - "(%d) failed to query numa node: %d\n", vid, ret); - return -1; - } - - return numa_node; -#else - RTE_SET_USED(vid); - return -1; -#endif -} - -uint32_t -rte_vhost_get_queue_num(int vid) -{ - struct virtio_net *dev = get_device(vid); - - if (dev == NULL) - return 0; - - return dev->virt_qp_nb; -} - -int -rte_vhost_get_ifname(int vid, char *buf, size_t len) -{ - struct virtio_net *dev = get_device(vid); - - if (dev == NULL) - return -1; - - len = RTE_MIN(len, sizeof(dev->ifname)); - - strncpy(buf, dev->ifname, len); - buf[len - 1] = '\0'; - - return 0; -} - -uint16_t -rte_vhost_avail_entries(int vid, uint16_t queue_id) -{ - struct virtio_net *dev; - struct vhost_virtqueue *vq; - - dev = get_device(vid); - if (!dev) - return 0; - - vq = dev->virtqueue[queue_id]; - if (!vq->enabled) - return 0; - - return *(volatile uint16_t *)&vq->avail->idx - vq->last_used_idx; -} - -int -rte_vhost_enable_guest_notification(int vid, uint16_t queue_id, int enable) -{ - struct virtio_net *dev = get_device(vid); - - if (dev == NULL) - return -1; - - if (enable) { - RTE_LOG(ERR, VHOST_CONFIG, - "guest notification isn't supported.\n"); - return -1; - } - - dev->virtqueue[queue_id]->used->flags = VRING_USED_F_NO_NOTIFY; - return 0; -} - -uint64_t rte_vhost_feature_get(void) -{ - return VHOST_FEATURES; -} - -int rte_vhost_feature_disable(uint64_t feature_mask) -{ - VHOST_FEATURES = VHOST_FEATURES & ~feature_mask; - return 0; -} - -int rte_vhost_feature_enable(uint64_t feature_mask) -{ - if ((feature_mask & VHOST_SUPPORTED_FEATURES) == feature_mask) { - VHOST_FEATURES = VHOST_FEATURES | feature_mask; - return 0; - } - return -1; -} - -/* - * Register ops so that we can add/remove device to data core. - */ -int -rte_vhost_driver_callback_register(struct virtio_net_device_ops const * const ops) -{ - notify_ops = ops; - - return 0; -} diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h deleted file mode 100644 index 7e4a15e..0000000 --- a/lib/librte_vhost/vhost.h +++ /dev/null @@ -1,288 +0,0 @@ -/*- - * BSD LICENSE - * - * Copyright(c) 2010-2014 Intel Corporation. All rights reserved. - * All rights reserved. - * - * Redistribution and use in source and binary forms, with or without - * modification, are permitted provided that the following conditions - * are met: - * - * * Redistributions of source code must retain the above copyright - * notice, this list of conditions and the following disclaimer. - * * Redistributions in binary form must reproduce the above copyright - * notice, this list of conditions and the following disclaimer in - * the documentation and/or other materials provided with the - * distribution. - * * Neither the name of Intel Corporation nor the names of its - * contributors may be used to endorse or promote products derived - * from this software without specific prior written permission. 
- * - * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS - * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT - * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR - * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT - * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, - * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT - * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, - * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY - * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT - * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE - * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. - */ - -#ifndef _VHOST_NET_CDEV_H_ -#define _VHOST_NET_CDEV_H_ -#include -#include -#include -#include -#include -#include - -#include - -#include "rte_virtio_net.h" - -/* Used to indicate that the device is running on a data core */ -#define VIRTIO_DEV_RUNNING 1 - -/* Backend value set by guest. */ -#define VIRTIO_DEV_STOPPED -1 - -#define BUF_VECTOR_MAX 256 - -/** - * Structure contains buffer address, length and descriptor index - * from vring to do scatter RX. - */ -struct buf_vector { - uint64_t buf_addr; - uint32_t buf_len; - uint32_t desc_idx; -}; - -/* - * A structure to hold some fields needed in zero copy code path, - * mainly for associating an mbuf with the right desc_idx. - */ -struct zcopy_mbuf { - struct rte_mbuf *mbuf; - uint32_t desc_idx; - uint16_t in_use; - - TAILQ_ENTRY(zcopy_mbuf) next; -}; -TAILQ_HEAD(zcopy_mbuf_list, zcopy_mbuf); - -/** - * Structure contains variables relevant to RX/TX virtqueues. - */ -struct vhost_virtqueue { - struct vring_desc *desc; - struct vring_avail *avail; - struct vring_used *used; - uint32_t size; - - uint16_t last_avail_idx; - volatile uint16_t last_used_idx; -#define VIRTIO_INVALID_EVENTFD (-1) -#define VIRTIO_UNINITIALIZED_EVENTFD (-2) - - /* Backend value to determine if device should started/stopped */ - int backend; - /* Used to notify the guest (trigger interrupt) */ - int callfd; - /* Currently unused as polling mode is enabled */ - int kickfd; - int enabled; - - /* Physical address of used ring, for logging */ - uint64_t log_guest_addr; - - uint16_t nr_zmbuf; - uint16_t zmbuf_size; - uint16_t last_zmbuf_idx; - struct zcopy_mbuf *zmbufs; - struct zcopy_mbuf_list zmbuf_list; -} __rte_cache_aligned; - -/* Old kernels have no such macro defined */ -#ifndef VIRTIO_NET_F_GUEST_ANNOUNCE - #define VIRTIO_NET_F_GUEST_ANNOUNCE 21 -#endif - - -/* - * Make an extra wrapper for VIRTIO_NET_F_MQ and - * VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MAX as they are - * introduced since kernel v3.8. This makes our - * code buildable for older kernel. - */ -#ifdef VIRTIO_NET_F_MQ - #define VHOST_MAX_QUEUE_PAIRS VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MAX - #define VHOST_SUPPORTS_MQ (1ULL << VIRTIO_NET_F_MQ) -#else - #define VHOST_MAX_QUEUE_PAIRS 1 - #define VHOST_SUPPORTS_MQ 0 -#endif - -/* - * Define virtio 1.0 for older kernels - */ -#ifndef VIRTIO_F_VERSION_1 - #define VIRTIO_F_VERSION_1 32 -#endif - -struct guest_page { - uint64_t guest_phys_addr; - uint64_t host_phys_addr; - uint64_t size; -}; - -/** - * Device structure contains all configuration information relating - * to the device. 
- */ -struct virtio_net { - /* Frontend (QEMU) memory and memory region information */ - struct virtio_memory *mem; - uint64_t features; - uint64_t protocol_features; - int vid; - uint32_t flags; - uint16_t vhost_hlen; - /* to tell if we need broadcast rarp packet */ - rte_atomic16_t broadcast_rarp; - uint32_t virt_qp_nb; - int tx_zero_copy; - struct vhost_virtqueue *virtqueue[VHOST_MAX_QUEUE_PAIRS * 2]; -#define IF_NAME_SZ (PATH_MAX > IFNAMSIZ ? PATH_MAX : IFNAMSIZ) - char ifname[IF_NAME_SZ]; - uint64_t log_size; - uint64_t log_base; - uint64_t log_addr; - struct ether_addr mac; - - uint32_t nr_guest_pages; - uint32_t max_guest_pages; - struct guest_page *guest_pages; -} __rte_cache_aligned; - -/** - * Information relating to memory regions including offsets to - * addresses in QEMUs memory file. - */ -struct virtio_memory_region { - uint64_t guest_phys_addr; - uint64_t guest_user_addr; - uint64_t host_user_addr; - uint64_t size; - void *mmap_addr; - uint64_t mmap_size; - int fd; -}; - - -/** - * Memory structure includes region and mapping information. - */ -struct virtio_memory { - uint32_t nregions; - struct virtio_memory_region regions[0]; -}; - - -/* Macros for printing using RTE_LOG */ -#define RTE_LOGTYPE_VHOST_CONFIG RTE_LOGTYPE_USER1 -#define RTE_LOGTYPE_VHOST_DATA RTE_LOGTYPE_USER1 - -#ifdef RTE_LIBRTE_VHOST_DEBUG -#define VHOST_MAX_PRINT_BUFF 6072 -#define LOG_LEVEL RTE_LOG_DEBUG -#define LOG_DEBUG(log_type, fmt, args...) RTE_LOG(DEBUG, log_type, fmt, ##args) -#define PRINT_PACKET(device, addr, size, header) do { \ - char *pkt_addr = (char *)(addr); \ - unsigned int index; \ - char packet[VHOST_MAX_PRINT_BUFF]; \ - \ - if ((header)) \ - snprintf(packet, VHOST_MAX_PRINT_BUFF, "(%d) Header size %d: ", (device->vid), (size)); \ - else \ - snprintf(packet, VHOST_MAX_PRINT_BUFF, "(%d) Packet size %d: ", (device->vid), (size)); \ - for (index = 0; index < (size); index++) { \ - snprintf(packet + strnlen(packet, VHOST_MAX_PRINT_BUFF), VHOST_MAX_PRINT_BUFF - strnlen(packet, VHOST_MAX_PRINT_BUFF), \ - "%02hhx ", pkt_addr[index]); \ - } \ - snprintf(packet + strnlen(packet, VHOST_MAX_PRINT_BUFF), VHOST_MAX_PRINT_BUFF - strnlen(packet, VHOST_MAX_PRINT_BUFF), "\n"); \ - \ - LOG_DEBUG(VHOST_DATA, "%s", packet); \ -} while (0) -#else -#define LOG_LEVEL RTE_LOG_INFO -#define LOG_DEBUG(log_type, fmt, args...) 
do {} while (0) -#define PRINT_PACKET(device, addr, size, header) do {} while (0) -#endif - -extern uint64_t VHOST_FEATURES; -#define MAX_VHOST_DEVICE 1024 -extern struct virtio_net *vhost_devices[MAX_VHOST_DEVICE]; - -/* Convert guest physical Address to host virtual address */ -static inline uint64_t __attribute__((always_inline)) -gpa_to_vva(struct virtio_net *dev, uint64_t gpa) -{ - struct virtio_memory_region *reg; - uint32_t i; - - for (i = 0; i < dev->mem->nregions; i++) { - reg = &dev->mem->regions[i]; - if (gpa >= reg->guest_phys_addr && - gpa < reg->guest_phys_addr + reg->size) { - return gpa - reg->guest_phys_addr + - reg->host_user_addr; - } - } - - return 0; -} - -/* Convert guest physical address to host physical address */ -static inline phys_addr_t __attribute__((always_inline)) -gpa_to_hpa(struct virtio_net *dev, uint64_t gpa, uint64_t size) -{ - uint32_t i; - struct guest_page *page; - - for (i = 0; i < dev->nr_guest_pages; i++) { - page = &dev->guest_pages[i]; - - if (gpa >= page->guest_phys_addr && - gpa + size < page->guest_phys_addr + page->size) { - return gpa - page->guest_phys_addr + - page->host_phys_addr; - } - } - - return 0; -} - -struct virtio_net_device_ops const *notify_ops; -struct virtio_net *get_device(int vid); - -int vhost_new_device(void); -void cleanup_device(struct virtio_net *dev, int destroy); -void reset_device(struct virtio_net *dev); -void vhost_destroy_device(int); - -int alloc_vring_queue_pair(struct virtio_net *dev, uint32_t qp_idx); - -void vhost_set_ifname(int, const char *if_name, unsigned int if_len); -void vhost_enable_tx_zero_copy(int vid); - -/* - * Backend-specific cleanup. Defined by vhost-cuse and vhost-user. - */ -void vhost_backend_cleanup(struct virtio_net *dev); - -#endif /* _VHOST_NET_CDEV_H_ */ diff --git a/lib/librte_vhost/vhost_device.h b/lib/librte_vhost/vhost_device.h new file mode 100644 index 0000000..7101bb0 --- /dev/null +++ b/lib/librte_vhost/vhost_device.h @@ -0,0 +1,230 @@ +/*- + * BSD LICENSE + * + * Copyright(c) 2010-2014 Intel Corporation. All rights reserved. + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. + * * Neither the name of Intel Corporation nor the names of its + * contributors may be used to endorse or promote products derived + * from this software without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + * A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#ifndef _VHOST_DEVICE_H_ +#define _VHOST_DEVICE_H_ + +#include + +#include "vhost_net.h" +#include "vhost_user.h" + +/* Used to indicate that the device is running on a data core */ +#define VIRTIO_DEV_RUNNING 1 + +/* Backend value set by guest. */ +#define VIRTIO_DEV_STOPPED -1 + +/** + * Structure contains variables relevant to RX/TX virtqueues. + */ +struct vhost_virtqueue { + struct vring_desc *desc; + struct vring_avail *avail; + struct vring_used *used; + uint32_t size; + + uint16_t last_avail_idx; + volatile uint16_t last_used_idx; +#define VIRTIO_INVALID_EVENTFD (-1) +#define VIRTIO_UNINITIALIZED_EVENTFD (-2) + + /* Backend value to determine if device should be started/stopped */ + int backend; + /* Used to notify the guest (trigger interrupt) */ + int callfd; + /* Currently unused as polling mode is enabled */ + int kickfd; + int enabled; + + /* Physical address of used ring, for logging */ + uint64_t log_guest_addr; + + uint16_t nr_zmbuf; + uint16_t zmbuf_size; + uint16_t last_zmbuf_idx; + struct zcopy_mbuf *zmbufs; + struct zcopy_mbuf_list zmbuf_list; +} __rte_cache_aligned; + +struct virtio_dev; + +struct virtio_dev_table { + int (*vhost_dev_ready)(struct virtio_dev *dev); + struct vhost_virtqueue* (*vhost_dev_get_queues)(struct virtio_dev *dev, uint16_t queue_id); + void (*vhost_dev_cleanup)(struct virtio_dev *dev, int destroy); + void (*vhost_dev_free)(struct virtio_dev *dev); + void (*vhost_dev_reset)(struct virtio_dev *dev); + uint64_t (*vhost_dev_get_features)(struct virtio_dev *dev); + int (*vhost_dev_set_features)(struct virtio_dev *dev, uint64_t features); + uint64_t (*vhost_dev_get_protocol_features)(struct virtio_dev *dev); + int (*vhost_dev_set_protocol_features)(struct virtio_dev *dev, uint64_t features); + uint32_t (*vhost_dev_get_default_queue_num)(struct virtio_dev *dev); + uint32_t (*vhost_dev_get_queue_num)(struct virtio_dev *dev); + uint16_t (*vhost_dev_get_avail_entries)(struct virtio_dev *dev, uint16_t queue_id); + int (*vhost_dev_get_vring_base)(struct virtio_dev *dev, struct vhost_virtqueue *vq); + int (*vhost_dev_set_vring_num)(struct virtio_dev *dev, struct vhost_virtqueue *vq); + int (*vhost_dev_set_vring_call)(struct virtio_dev *dev, struct vhost_vring_file *file); + int (*vhost_dev_set_log_base)(struct virtio_dev *dev, int fd, uint64_t size, uint64_t off); +}; + +struct virtio_dev { + /* Frontend (QEMU) memory and memory region information */ + struct virtio_memory *mem; + int vid; + uint32_t flags; + #define IF_NAME_SZ (PATH_MAX > IFNAMSIZ ? 
PATH_MAX : IFNAMSIZ) + char ifname[IF_NAME_SZ]; + + uint32_t dev_type; + union { + struct virtio_net net_dev; + } dev; + + uint32_t nr_guest_pages; + uint32_t max_guest_pages; + struct guest_page *guest_pages; + + const struct virtio_net_device_ops *notify_ops; + struct virtio_dev_table fn_table; +} __rte_cache_aligned; + +extern struct virtio_net_device_ops const *notify_ops; + +/* + * Define virtio 1.0 for older kernels + */ +#ifndef VIRTIO_F_VERSION_1 + #define VIRTIO_F_VERSION_1 32 +#endif + +struct guest_page { + uint64_t guest_phys_addr; + uint64_t host_phys_addr; + uint64_t size; +}; + +/** + * Information relating to memory regions including offsets to + * addresses in QEMUs memory file. + */ +struct virtio_memory_region { + uint64_t guest_phys_addr; + uint64_t guest_user_addr; + uint64_t host_user_addr; + uint64_t size; + void *mmap_addr; + uint64_t mmap_size; + int fd; +}; + +/** + * Memory structure includes region and mapping information. + */ +struct virtio_memory { + uint32_t nregions; + struct virtio_memory_region regions[0]; +}; + + +/* Macros for printing using RTE_LOG */ +#define RTE_LOGTYPE_VHOST_CONFIG RTE_LOGTYPE_USER1 +#define RTE_LOGTYPE_VHOST_DATA RTE_LOGTYPE_USER1 + +#ifdef RTE_LIBRTE_VHOST_DEBUG +#define VHOST_MAX_PRINT_BUFF 6072 +#define LOG_LEVEL RTE_LOG_DEBUG +#define LOG_DEBUG(log_type, fmt, args...) RTE_LOG(DEBUG, log_type, fmt, ##args) +#define PRINT_PACKET(device, addr, size, header) do { \ + char *pkt_addr = (char *)(addr); \ + unsigned int index; \ + char packet[VHOST_MAX_PRINT_BUFF]; \ + \ + if ((header)) \ + snprintf(packet, VHOST_MAX_PRINT_BUFF, "(%d) Header size %d: ", (device->vid), (size)); \ + else \ + snprintf(packet, VHOST_MAX_PRINT_BUFF, "(%d) Packet size %d: ", (device->vid), (size)); \ + for (index = 0; index < (size); index++) { \ + snprintf(packet + strnlen(packet, VHOST_MAX_PRINT_BUFF), VHOST_MAX_PRINT_BUFF - strnlen(packet, VHOST_MAX_PRINT_BUFF), \ + "%02hhx ", pkt_addr[index]); \ + } \ + snprintf(packet + strnlen(packet, VHOST_MAX_PRINT_BUFF), VHOST_MAX_PRINT_BUFF - strnlen(packet, VHOST_MAX_PRINT_BUFF), "\n"); \ + \ + LOG_DEBUG(VHOST_DATA, "%s", packet); \ +} while (0) +#else +#define LOG_LEVEL RTE_LOG_INFO +#define LOG_DEBUG(log_type, fmt, args...) 
do {} while (0) +#define PRINT_PACKET(device, addr, size, header) do {} while (0) +#endif + +/* Convert guest physical Address to host virtual address */ +static inline uint64_t __attribute__((always_inline)) +gpa_to_vva(struct virtio_dev *dev, uint64_t gpa) +{ + struct virtio_memory_region *reg; + uint32_t i; + + for (i = 0; i < dev->mem->nregions; i++) { + reg = &dev->mem->regions[i]; + if (gpa >= reg->guest_phys_addr && + gpa < reg->guest_phys_addr + reg->size) { + return gpa - reg->guest_phys_addr + + reg->host_user_addr; + } + } + + return 0; +} + +/* Convert guest physical address to host physical address */ +static inline phys_addr_t __attribute__((always_inline)) +gpa_to_hpa(struct virtio_dev *dev, uint64_t gpa, uint64_t size) +{ + uint32_t i; + struct guest_page *page; + + for (i = 0; i < dev->nr_guest_pages; i++) { + page = &dev->guest_pages[i]; + + if (gpa >= page->guest_phys_addr && + gpa + size < page->guest_phys_addr + page->size) { + return gpa - page->guest_phys_addr + + page->host_phys_addr; + } + } + + return 0; +} + +#endif \ No newline at end of file diff --git a/lib/librte_vhost/vhost_net.c b/lib/librte_vhost/vhost_net.c new file mode 100644 index 0000000..f141b32 --- /dev/null +++ b/lib/librte_vhost/vhost_net.c @@ -0,0 +1,659 @@ +/*- + * BSD LICENSE + * + * Copyright(c) 2010-2016 Intel Corporation. All rights reserved. + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. + * * Neither the name of Intel Corporation nor the names of its + * contributors may be used to endorse or promote products derived + * from this software without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#include +#include +#include +#include +#include +#include +#ifdef RTE_LIBRTE_VHOST_NUMA +#include +#endif +#include + +#include +#include +#include +#include +#include +#include + +#include "vhost_net.h" +#include "vhost_device.h" + +#define VHOST_USER_F_PROTOCOL_FEATURES 30 + +/* Features supported by this lib. 
*/ +#define VHOST_SUPPORTED_FEATURES ((1ULL << VIRTIO_NET_F_MRG_RXBUF) | \ + (1ULL << VIRTIO_NET_F_CTRL_VQ) | \ + (1ULL << VIRTIO_NET_F_CTRL_RX) | \ + (1ULL << VIRTIO_NET_F_GUEST_ANNOUNCE) | \ + (VHOST_SUPPORTS_MQ) | \ + (1ULL << VIRTIO_F_VERSION_1) | \ + (1ULL << VHOST_F_LOG_ALL) | \ + (1ULL << VHOST_USER_F_PROTOCOL_FEATURES) | \ + (1ULL << VIRTIO_NET_F_HOST_TSO4) | \ + (1ULL << VIRTIO_NET_F_HOST_TSO6) | \ + (1ULL << VIRTIO_NET_F_CSUM) | \ + (1ULL << VIRTIO_NET_F_GUEST_CSUM) | \ + (1ULL << VIRTIO_NET_F_GUEST_TSO4) | \ + (1ULL << VIRTIO_NET_F_GUEST_TSO6)) + +uint64_t VHOST_FEATURES = VHOST_SUPPORTED_FEATURES; + +/* device ops to add/remove device to/from data core. */ +struct virtio_net_device_ops const *notify_ops = NULL; + +struct virtio_net * +get_net_device(struct virtio_dev *dev) +{ + if (!dev) + return NULL; + + return &dev->dev.net_dev; +} + +static void +cleanup_vq(struct vhost_virtqueue *vq, int destroy) +{ + if ((vq->callfd >= 0) && (destroy != 0)) + close(vq->callfd); + if (vq->kickfd >= 0) + close(vq->kickfd); +} + +/* + * Unmap any memory, close any file descriptors and + * free any memory owned by a device. + */ +static void +cleanup_device(struct virtio_dev *device, int destroy) +{ + struct virtio_net *dev = get_net_device(device); + uint32_t i; + + dev->features = 0; + dev->protocol_features = 0; + + for (i = 0; i < dev->virt_qp_nb; i++) { + cleanup_vq(dev->virtqueue[i * VIRTIO_QNUM + VIRTIO_RXQ], destroy); + cleanup_vq(dev->virtqueue[i * VIRTIO_QNUM + VIRTIO_TXQ], destroy); + } + + if (dev->log_addr) { + munmap((void *)(uintptr_t)dev->log_addr, dev->log_size); + dev->log_addr = 0; + } +} + +/* + * Release virtqueues and device memory. + */ +static void +free_device(struct virtio_dev *device) +{ + struct virtio_net *dev = get_net_device(device); + uint32_t i; + + for (i = 0; i < dev->virt_qp_nb; i++) + rte_free(dev->virtqueue[i * VIRTIO_QNUM]); + + rte_free(dev); +} + +static void +init_vring_queue(struct vhost_virtqueue *vq, int qp_idx) +{ + memset(vq, 0, sizeof(struct vhost_virtqueue)); + + vq->kickfd = VIRTIO_UNINITIALIZED_EVENTFD; + vq->callfd = VIRTIO_UNINITIALIZED_EVENTFD; + + /* Backends are set to -1 indicating an inactive device. 
*/ + vq->backend = -1; + + /* always set the default vq pair to enabled */ + if (qp_idx == 0) + vq->enabled = 1; + + TAILQ_INIT(&vq->zmbuf_list); +} + +static void +init_vring_queue_pair(struct virtio_net *dev, uint32_t qp_idx) +{ + uint32_t base_idx = qp_idx * VIRTIO_QNUM; + + init_vring_queue(dev->virtqueue[base_idx + VIRTIO_RXQ], qp_idx); + init_vring_queue(dev->virtqueue[base_idx + VIRTIO_TXQ], qp_idx); +} + +static void +reset_vring_queue(struct vhost_virtqueue *vq, int qp_idx) +{ + int callfd; + + callfd = vq->callfd; + init_vring_queue(vq, qp_idx); + vq->callfd = callfd; +} + +static void +reset_vring_queue_pair(struct virtio_net *dev, uint32_t qp_idx) +{ + uint32_t base_idx = qp_idx * VIRTIO_QNUM; + + reset_vring_queue(dev->virtqueue[base_idx + VIRTIO_RXQ], qp_idx); + reset_vring_queue(dev->virtqueue[base_idx + VIRTIO_TXQ], qp_idx); +} + +static int +alloc_vring_queue_pair(struct virtio_net *dev, uint32_t qp_idx) +{ + struct vhost_virtqueue *virtqueue = NULL; + uint32_t virt_rx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_RXQ; + uint32_t virt_tx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_TXQ; + + virtqueue = rte_malloc(NULL, + sizeof(struct vhost_virtqueue) * VIRTIO_QNUM, 0); + if (virtqueue == NULL) { + RTE_LOG(ERR, VHOST_CONFIG, + "Failed to allocate memory for virt qp:%d.\n", qp_idx); + return -1; + } + + dev->virtqueue[virt_rx_q_idx] = virtqueue; + dev->virtqueue[virt_tx_q_idx] = virtqueue + VIRTIO_TXQ; + + init_vring_queue_pair(dev, qp_idx); + + dev->virt_qp_nb += 1; + + return 0; +} + +/* + * Reset some variables in device structure, while keeping few + * others untouched, such as vid, ifname, virt_qp_nb: they + * should be same unless the device is removed. + */ +static void +reset_device(struct virtio_dev *device) +{ + struct virtio_net *dev = get_net_device(device); + uint32_t i; + + for (i = 0; i < dev->virt_qp_nb; i++) + reset_vring_queue_pair(dev, i); +} + +static uint64_t +vhost_dev_get_features(struct virtio_dev *dev) +{ + if (dev == NULL) + return 0; + + return VHOST_FEATURES; +} + +static int +vhost_dev_set_features(struct virtio_dev *device, uint64_t features) +{ + struct virtio_net *dev = get_net_device(device); + + if (features & ~VHOST_FEATURES) + return -1; + + dev->features = features; + if (dev->features & + ((1 << VIRTIO_NET_F_MRG_RXBUF) | (1ULL << VIRTIO_F_VERSION_1))) { + dev->vhost_hlen = sizeof(struct virtio_net_hdr_mrg_rxbuf); + } else { + dev->vhost_hlen = sizeof(struct virtio_net_hdr); + } + LOG_DEBUG(VHOST_CONFIG, + "(%d) mergeable RX buffers %s, virtio 1 %s\n", + device->vid, + (dev->features & (1 << VIRTIO_NET_F_MRG_RXBUF)) ? "on" : "off", + (dev->features & (1ULL << VIRTIO_F_VERSION_1)) ? 
"on" : "off"); + + return 0; +} + +static int +vhost_dev_set_vring_num(struct virtio_dev *device, + struct vhost_virtqueue *vq) +{ + struct virtio_net *dev = get_net_device(device); + + if (dev->tx_zero_copy) { + vq->nr_zmbuf = 0; + vq->last_zmbuf_idx = 0; + vq->zmbuf_size = vq->size; + vq->zmbufs = rte_zmalloc(NULL, vq->zmbuf_size * + sizeof(struct zcopy_mbuf), 0); + if (vq->zmbufs == NULL) { + RTE_LOG(WARNING, VHOST_CONFIG, + "failed to allocate mem for zero copy; " + "zero copy is force disabled\n"); + dev->tx_zero_copy = 0; + } + } + + return 0; +} + +static int +vq_is_ready(struct vhost_virtqueue *vq) +{ + return vq && vq->desc && + vq->kickfd != VIRTIO_UNINITIALIZED_EVENTFD && + vq->callfd != VIRTIO_UNINITIALIZED_EVENTFD; +} + +static int +vhost_dev_is_ready(struct virtio_dev *device) +{ + struct virtio_net *dev = get_net_device(device); + struct vhost_virtqueue *rvq, *tvq; + uint32_t i; + + for (i = 0; i < dev->virt_qp_nb; i++) { + rvq = dev->virtqueue[i * VIRTIO_QNUM + VIRTIO_RXQ]; + tvq = dev->virtqueue[i * VIRTIO_QNUM + VIRTIO_TXQ]; + + if (!vq_is_ready(rvq) || !vq_is_ready(tvq)) { + RTE_LOG(INFO, VHOST_CONFIG, + "virtio is not ready for processing.\n"); + return 0; + } + } + + RTE_LOG(INFO, VHOST_CONFIG, + "virtio is now ready for processing.\n"); + return 1; +} + +static int +vhost_dev_set_vring_call(struct virtio_dev *device, struct vhost_vring_file *file) +{ + struct virtio_net *dev = get_net_device(device); + struct vhost_virtqueue *vq; + uint32_t cur_qp_idx; + + /* + * FIXME: VHOST_SET_VRING_CALL is the first per-vring message + * we get, so we do vring queue pair allocation here. + */ + cur_qp_idx = file->index / VIRTIO_QNUM; + if (cur_qp_idx + 1 > dev->virt_qp_nb) { + if (alloc_vring_queue_pair(dev, cur_qp_idx) < 0) + return -1; + } + + vq = dev->virtqueue[file->index]; + assert(vq != NULL); + + if (vq->callfd >= 0) + close(vq->callfd); + + vq->callfd = file->fd; + return 0; +} + +static int +vhost_dev_set_protocol_features(struct virtio_dev *device, + uint64_t protocol_features) +{ + struct virtio_net *dev = get_net_device(device); + + if (protocol_features & ~VHOST_USER_PROTOCOL_FEATURES) + return -1; + + dev->protocol_features = protocol_features; + return 0; +} + +static uint64_t +vhost_dev_get_protocol_features(struct virtio_dev *dev) +{ + if (dev == NULL) + return 0; + + return VHOST_USER_PROTOCOL_FEATURES; +} + +static uint32_t +vhost_dev_get_default_queue_num(struct virtio_dev *dev) +{ + if (dev == NULL) + return 0; + + return VHOST_MAX_QUEUE_PAIRS; +} + +static uint32_t +vhost_dev_get_queue_num(struct virtio_dev *device) +{ + struct virtio_net *dev; + if (device == NULL) + return 0; + + dev = get_net_device(device); + return dev->virt_qp_nb; +} + +static uint16_t +vhost_dev_get_avail_entries(struct virtio_dev *device, uint16_t queue_id) +{ + struct virtio_net *dev = get_net_device(device); + struct vhost_virtqueue *vq; + + vq = dev->virtqueue[queue_id]; + if (!vq->enabled) + return 0; + + return *(volatile uint16_t *)&vq->avail->idx - vq->last_used_idx; +} + +void +vhost_enable_tx_zero_copy(int vid) +{ + struct virtio_dev *device = get_device(vid); + struct virtio_net *dev; + + if (device == NULL) + return; + + dev = get_net_device(device); + dev->tx_zero_copy = 1; +} + +static void +free_zmbufs(struct vhost_virtqueue *vq) +{ + struct zcopy_mbuf *zmbuf, *next; + + for (zmbuf = TAILQ_FIRST(&vq->zmbuf_list); + zmbuf != NULL; zmbuf = next) { + next = TAILQ_NEXT(zmbuf, next); + + rte_pktmbuf_free(zmbuf->mbuf); + TAILQ_REMOVE(&vq->zmbuf_list, zmbuf, next); + } + + 
rte_free(vq->zmbufs); +} + +static int +vhost_dev_get_vring_base(struct virtio_dev *device, struct vhost_virtqueue *vq) +{ + struct virtio_net *dev = get_net_device(device); + + /* + * Based on current qemu vhost-user implementation, this message is + * sent and only sent in vhost_vring_stop. + * TODO: cleanup the vring, it isn't usable since here. + */ + if (vq->kickfd >= 0) + close(vq->kickfd); + + vq->kickfd = VIRTIO_UNINITIALIZED_EVENTFD; + + if (dev->tx_zero_copy) + free_zmbufs(vq); + + return 0; +} + +static int +vhost_dev_set_log_base(struct virtio_dev *device, int fd, uint64_t size, uint64_t off) +{ + void *addr; + struct virtio_net *dev = get_net_device(device); + + /* + * mmap from 0 to workaround a hugepage mmap bug: mmap will + * fail when offset is not page size aligned. + */ + addr = mmap(0, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0); + close(fd); + if (addr == MAP_FAILED) { + RTE_LOG(ERR, VHOST_CONFIG, "mmap log base failed!\n"); + return -1; + } + + /* + * Free previously mapped log memory on occasionally + * multiple VHOST_USER_SET_LOG_BASE. + */ + if (dev->log_addr) { + munmap((void *)(uintptr_t)dev->log_addr, dev->log_size); + } + dev->log_addr = (uint64_t)(uintptr_t)addr; + dev->log_base = dev->log_addr + off; + dev->log_size = size; + + return 0; +} + +/* + * An rarp packet is constructed and broadcasted to notify switches about + * the new location of the migrated VM, so that packets from outside will + * not be lost after migration. + * + * However, we don't actually "send" a rarp packet here, instead, we set + * a flag 'broadcast_rarp' to let rte_vhost_dequeue_burst() inject it. + */ +int +vhost_user_send_rarp(struct virtio_dev *device, struct VhostUserMsg *msg) +{ + struct virtio_net *dev = get_net_device(device); + uint8_t *mac = (uint8_t *)&msg->payload.u64; + + RTE_LOG(DEBUG, VHOST_CONFIG, + ":: mac: %02x:%02x:%02x:%02x:%02x:%02x\n", + mac[0], mac[1], mac[2], mac[3], mac[4], mac[5]); + memcpy(dev->mac.addr_bytes, mac, 6); + + /* + * Set the flag to inject a RARP broadcast packet at + * rte_vhost_dequeue_burst(). + * + * rte_smp_wmb() is for making sure the mac is copied + * before the flag is set. 
+ */ + rte_smp_wmb(); + rte_atomic16_set(&dev->broadcast_rarp, 1); + + return 0; +} + +static struct vhost_virtqueue * +vhost_dev_get_queues(struct virtio_dev *device, uint16_t queue_id) +{ + struct virtio_net *dev = get_net_device(device); + struct vhost_virtqueue *vq; + + vq = dev->virtqueue[queue_id]; + + return vq; +} + +void +vhost_net_device_init(struct virtio_dev *device) +{ + struct virtio_net *dev = get_net_device(device); + + device->fn_table.vhost_dev_ready = vhost_dev_is_ready; + device->fn_table.vhost_dev_get_queues = vhost_dev_get_queues; + device->fn_table.vhost_dev_cleanup = cleanup_device; + device->fn_table.vhost_dev_free = free_device; + device->fn_table.vhost_dev_reset = reset_device; + device->fn_table.vhost_dev_get_features = vhost_dev_get_features; + device->fn_table.vhost_dev_set_features = vhost_dev_set_features; + device->fn_table.vhost_dev_get_protocol_features = vhost_dev_get_protocol_features; + device->fn_table.vhost_dev_set_protocol_features = vhost_dev_set_protocol_features; + device->fn_table.vhost_dev_get_default_queue_num = vhost_dev_get_default_queue_num; + device->fn_table.vhost_dev_get_queue_num = vhost_dev_get_queue_num; + device->fn_table.vhost_dev_get_avail_entries = vhost_dev_get_avail_entries; + device->fn_table.vhost_dev_get_vring_base = vhost_dev_get_vring_base; + device->fn_table.vhost_dev_set_vring_num = vhost_dev_set_vring_num; + device->fn_table.vhost_dev_set_vring_call = vhost_dev_set_vring_call; + device->fn_table.vhost_dev_set_log_base = vhost_dev_set_log_base; + + dev->device = device; +} + +uint64_t rte_vhost_feature_get(void) +{ + return VHOST_FEATURES; +} + +int rte_vhost_feature_disable(uint64_t feature_mask) +{ + VHOST_FEATURES = VHOST_FEATURES & ~feature_mask; + return 0; +} + +int rte_vhost_feature_enable(uint64_t feature_mask) +{ + if ((feature_mask & VHOST_SUPPORTED_FEATURES) == feature_mask) { + VHOST_FEATURES = VHOST_FEATURES | feature_mask; + return 0; + } + return -1; +} + +int +rte_vhost_get_numa_node(int vid) +{ +#ifdef RTE_LIBRTE_VHOST_NUMA + struct virtio_dev *dev = get_device(vid); + int numa_node; + int ret; + + if (dev == NULL) + return -1; + + ret = get_mempolicy(&numa_node, NULL, 0, dev, + MPOL_F_NODE | MPOL_F_ADDR); + if (ret < 0) { + RTE_LOG(ERR, VHOST_CONFIG, + "(%d) failed to query numa node: %d\n", vid, ret); + return -1; + } + + return numa_node; +#else + RTE_SET_USED(vid); + return -1; +#endif +} + +uint32_t +rte_vhost_get_queue_num(int vid) +{ + struct virtio_dev *device = get_device(vid); + + if (device == NULL) + return 0; + + if (device->fn_table.vhost_dev_get_queue_num) + return device->fn_table.vhost_dev_get_queue_num(device); + + return 0; +} + +int +rte_vhost_get_ifname(int vid, char *buf, size_t len) +{ + struct virtio_dev *dev = get_device(vid); + + if (dev == NULL) + return -1; + + len = RTE_MIN(len, sizeof(dev->ifname)); + + strncpy(buf, dev->ifname, len); + buf[len - 1] = '\0'; + + return 0; +} + +uint16_t +rte_vhost_avail_entries(int vid, uint16_t queue_id) +{ + struct virtio_dev *device; + + device = get_device(vid); + if (!device) + return 0; + + if (device->fn_table.vhost_dev_get_avail_entries) + return device->fn_table.vhost_dev_get_avail_entries(device, queue_id); + + return 0; +} + +int +rte_vhost_enable_guest_notification(int vid, uint16_t queue_id, int enable) +{ + struct virtio_dev *device = get_device(vid); + struct vhost_virtqueue *vq; + + if (device == NULL) + return -1; + + vq = device->fn_table.vhost_dev_get_queues(device, queue_id); + if (enable) { + RTE_LOG(ERR, VHOST_CONFIG, + 
"guest notification isn't supported.\n"); + return -1; + } + + vq->used->flags = VRING_USED_F_NO_NOTIFY; + return 0; +} + +/* + * Register ops so that we can add/remove device to data core. + */ +int +rte_vhost_driver_callback_register(struct virtio_net_device_ops const * const ops) +{ + notify_ops = ops; + + return 0; +} \ No newline at end of file diff --git a/lib/librte_vhost/vhost_net.h b/lib/librte_vhost/vhost_net.h new file mode 100644 index 0000000..53b6b16 --- /dev/null +++ b/lib/librte_vhost/vhost_net.h @@ -0,0 +1,126 @@ +/*- + * BSD LICENSE + * + * Copyright(c) 2010-2014 Intel Corporation. All rights reserved. + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. + * * Neither the name of Intel Corporation nor the names of its + * contributors may be used to endorse or promote products derived + * from this software without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#ifndef _VHOST_NET_H_ +#define _VHOST_NET_H_ +#include +#include +#include +#include +#include +#include + +#include + +#include "rte_virtio_net.h" +#include "vhost_user.h" + +#define VHOST_USER_PROTOCOL_F_MQ 0 +#define VHOST_USER_PROTOCOL_F_LOG_SHMFD 1 +#define VHOST_USER_PROTOCOL_F_RARP 2 + +#define VHOST_USER_PROTOCOL_FEATURES ((1ULL << VHOST_USER_PROTOCOL_F_MQ) | \ + (1ULL << VHOST_USER_PROTOCOL_F_LOG_SHMFD) |\ + (1ULL << VHOST_USER_PROTOCOL_F_RARP)) + +#define BUF_VECTOR_MAX 256 + +/** + * Structure contains buffer address, length and descriptor index + * from vring to do scatter RX. + */ +struct buf_vector { + uint64_t buf_addr; + uint32_t buf_len; + uint32_t desc_idx; +}; + +/* + * A structure to hold some fields needed in zero copy code path, + * mainly for associating an mbuf with the right desc_idx. + */ +struct zcopy_mbuf { + struct rte_mbuf *mbuf; + uint32_t desc_idx; + uint16_t in_use; + + TAILQ_ENTRY(zcopy_mbuf) next; +}; +TAILQ_HEAD(zcopy_mbuf_list, zcopy_mbuf); + +/* Old kernels have no such macro defined */ +#ifndef VIRTIO_NET_F_GUEST_ANNOUNCE + #define VIRTIO_NET_F_GUEST_ANNOUNCE 21 +#endif + +/* + * Make an extra wrapper for VIRTIO_NET_F_MQ and + * VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MAX as they are + * introduced since kernel v3.8. This makes our + * code buildable for older kernel. 
+ */ +#ifdef VIRTIO_NET_F_MQ + #define VHOST_MAX_QUEUE_PAIRS VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MAX + #define VHOST_SUPPORTS_MQ (1ULL << VIRTIO_NET_F_MQ) +#else + #define VHOST_MAX_QUEUE_PAIRS 1 + #define VHOST_SUPPORTS_MQ 0 +#endif + +/** + * Device structure contains all configuration information relating + * to the device. + */ +struct virtio_net { + uint64_t features; + uint64_t protocol_features; + uint16_t vhost_hlen; + uint64_t log_size; + uint64_t log_base; + uint64_t log_addr; + /* to tell if we need to broadcast a rarp packet */ + rte_atomic16_t broadcast_rarp; + uint32_t virt_qp_nb; + int tx_zero_copy; + struct vhost_virtqueue *virtqueue[VHOST_MAX_QUEUE_PAIRS * 2]; + struct ether_addr mac; + /* transport layer device context */ + struct virtio_dev *device; +} __rte_cache_aligned; + +void vhost_enable_tx_zero_copy(int vid); +int vhost_user_send_rarp(struct virtio_dev *device, struct VhostUserMsg *msg); +void vhost_net_device_init(struct virtio_dev *device); +struct virtio_net *get_net_device(struct virtio_dev *dev); + +#endif /* _VHOST_NET_H_ */ \ No newline at end of file diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c index ff995d5..90c4b03 100644 --- a/lib/librte_vhost/vhost_user.c +++ b/lib/librte_vhost/vhost_user.c @@ -48,9 +48,13 @@ #include #include -#include "vhost.h" +#include "vhost_device.h" +#include "vhost_net.h" #include "vhost_user.h" +#define MAX_VHOST_DEVICE 1024 +struct virtio_dev *vhost_devices[MAX_VHOST_DEVICE]; + static const char *vhost_message_str[VHOST_USER_MAX] = { [VHOST_USER_NONE] = "VHOST_USER_NONE", [VHOST_USER_GET_FEATURES] = "VHOST_USER_GET_FEATURES", @@ -85,7 +89,7 @@ get_blk_size(int fd) } static void -free_mem_region(struct virtio_net *dev) +free_mem_region(struct virtio_dev *dev) { uint32_t i; struct virtio_memory_region *reg; @@ -102,18 +106,99 @@ free_mem_region(struct virtio_net *dev) } } -void -vhost_backend_cleanup(struct virtio_net *dev) +static void +vhost_backend_cleanup(struct virtio_dev *dev) { if (dev->mem) { free_mem_region(dev); - rte_free(dev->mem); + free(dev->mem); dev->mem = NULL; } - if (dev->log_addr) { - munmap((void *)(uintptr_t)dev->log_addr, dev->log_size); - dev->log_addr = 0; +} + +struct virtio_dev * +get_device(int vid) +{ + struct virtio_dev *dev = vhost_devices[vid]; + + if (unlikely(!dev)) { + RTE_LOG(ERR, VHOST_CONFIG, + "(%d) device not found.\n", vid); + } + + return dev; +} + +/* + * Called when a new vhost-user connection is established. The device + * structure is initialised and a new entry is added to the device table. + */ +int +vhost_new_device(int type) +{ + struct virtio_dev *dev; + int i; + + dev = rte_zmalloc(NULL, sizeof(struct virtio_dev), 0); + if (dev == NULL) { + RTE_LOG(ERR, VHOST_CONFIG, + "Failed to allocate memory for new dev.\n"); + return -1; + } + + for (i = 0; i < MAX_VHOST_DEVICE; i++) { + if (vhost_devices[i] == NULL) + break; + } + if (i == MAX_VHOST_DEVICE) { + RTE_LOG(ERR, VHOST_CONFIG, + "Failed to find a free slot for new device.\n"); + rte_free(dev); + return -1; + } + + switch (type) { + case VIRTIO_ID_NET: + assert(notify_ops != NULL); + dev->notify_ops = notify_ops; + vhost_net_device_init(dev); + break; + default: + rte_free(dev); + return -1; + } + + vhost_devices[i] = dev; + dev->vid = i; + dev->dev_type = type; + assert(dev->fn_table.vhost_dev_get_queues != NULL); + + return i; +} + +/* + * Called when the vhost-user connection is closed. This function cleans up + * the device and removes it from the device table.
+ */ +void +vhost_destroy_device(int vid) +{ + struct virtio_dev *dev = get_device(vid); + + if (dev == NULL) + return; + + if (dev->flags & VIRTIO_DEV_RUNNING) { + dev->flags &= ~VIRTIO_DEV_RUNNING; + dev->notify_ops->destroy_device(vid); } + + vhost_backend_cleanup(dev); + if (dev->fn_table.vhost_dev_cleanup) + dev->fn_table.vhost_dev_cleanup(dev, 1); + if (dev->fn_table.vhost_dev_free) + dev->fn_table.vhost_dev_free(dev); + + vhost_devices[vid] = NULL; } /* @@ -126,16 +211,28 @@ vhost_user_set_owner(void) return 0; } +/* + * Called when we receive a VHOST_USER_RESET_OWNER message. + */ static int -vhost_user_reset_owner(struct virtio_net *dev) +vhost_user_reset_owner(struct virtio_dev *dev) { + if (dev == NULL) + return -1; + if (dev->flags & VIRTIO_DEV_RUNNING) { dev->flags &= ~VIRTIO_DEV_RUNNING; - notify_ops->destroy_device(dev->vid); + dev->notify_ops->destroy_device(dev->vid); } - cleanup_device(dev, 0); - reset_device(dev); + dev->flags = 0; + + vhost_backend_cleanup(dev); + if (dev->fn_table.vhost_dev_cleanup) + dev->fn_table.vhost_dev_cleanup(dev, 0); + if (dev->fn_table.vhost_dev_reset) + dev->fn_table.vhost_dev_reset(dev); + return 0; } @@ -143,61 +240,61 @@ vhost_user_reset_owner(struct virtio_net *dev) * The features that we support are requested. */ static uint64_t -vhost_user_get_features(void) +vhost_user_get_features(struct virtio_dev *dev) { - return VHOST_FEATURES; + if (dev == NULL) + return 0; + + if (dev->fn_table.vhost_dev_get_features) + return dev->fn_table.vhost_dev_get_features(dev); + + return 0; } /* * We receive the negotiated features supported by us and the virtio device. */ static int -vhost_user_set_features(struct virtio_net *dev, uint64_t features) +vhost_user_set_features(struct virtio_dev *dev, uint64_t features) { - if (features & ~VHOST_FEATURES) - return -1; + int ret = 0; - dev->features = features; - if (dev->features & - ((1 << VIRTIO_NET_F_MRG_RXBUF) | (1ULL << VIRTIO_F_VERSION_1))) { - dev->vhost_hlen = sizeof(struct virtio_net_hdr_mrg_rxbuf); - } else { - dev->vhost_hlen = sizeof(struct virtio_net_hdr); - } - LOG_DEBUG(VHOST_CONFIG, - "(%d) mergeable RX buffers %s, virtio 1 %s\n", - dev->vid, - (dev->features & (1 << VIRTIO_NET_F_MRG_RXBUF)) ? "on" : "off", - (dev->features & (1ULL << VIRTIO_F_VERSION_1)) ? "on" : "off"); + if (dev->fn_table.vhost_dev_set_features) + ret = dev->fn_table.vhost_dev_set_features(dev, features); - return 0; + return ret; +} + +void +vhost_set_ifname(int vid, const char *if_name, unsigned int if_len) +{ + struct virtio_dev *dev; + unsigned int len; + + dev = get_device(vid); + if (dev == NULL) + return; + + len = if_len > sizeof(dev->ifname) ? + sizeof(dev->ifname) : if_len; + + strncpy(dev->ifname, if_name, len); + dev->ifname[sizeof(dev->ifname) - 1] = '\0'; } /* * The virtio device sends us the size of the descriptor ring.
*/ static int -vhost_user_set_vring_num(struct virtio_net *dev, - struct vhost_vring_state *state) +vhost_user_set_vring_num(struct virtio_dev *dev, struct vhost_vring_state *state) { - struct vhost_virtqueue *vq = dev->virtqueue[state->index]; + struct vhost_virtqueue *vq; + vq = dev->fn_table.vhost_dev_get_queues(dev, state->index); vq->size = state->num; - if (dev->tx_zero_copy) { - vq->nr_zmbuf = 0; - vq->last_zmbuf_idx = 0; - vq->zmbuf_size = vq->size; - vq->zmbufs = rte_zmalloc(NULL, vq->zmbuf_size * - sizeof(struct zcopy_mbuf), 0); - if (vq->zmbufs == NULL) { - RTE_LOG(WARNING, VHOST_CONFIG, - "failed to allocate mem for zero copy; " - "zero copy is force disabled\n"); - dev->tx_zero_copy = 0; - } - } - + if (dev->fn_table.vhost_dev_set_vring_num) + dev->fn_table.vhost_dev_set_vring_num(dev, vq); return 0; } @@ -206,11 +303,11 @@ vhost_user_set_vring_num(struct virtio_net *dev, * same numa node as the memory of vring descriptor. */ #ifdef RTE_LIBRTE_VHOST_NUMA -static struct virtio_net* -numa_realloc(struct virtio_net *dev, int index) +static struct virtio_dev* +numa_realloc(struct virtio_dev *dev, int index) { int oldnode, newnode; - struct virtio_net *old_dev; + struct virtio_dev *old_dev; struct vhost_virtqueue *old_vq, *vq; int ret; @@ -222,7 +319,7 @@ numa_realloc(struct virtio_net *dev, int index) return dev; old_dev = dev; - vq = old_vq = dev->virtqueue[index]; + vq = old_vq = dev->fn_table.vhost_dev_get_queues(dev, index); ret = get_mempolicy(&newnode, NULL, 0, old_vq->desc, MPOL_F_NODE | MPOL_F_ADDR); @@ -277,8 +374,8 @@ out: return dev; } #else -static struct virtio_net* -numa_realloc(struct virtio_net *dev, int index __rte_unused) +static struct virtio_dev* +numa_realloc(struct virtio_dev *dev, int index __rte_unused) { return dev; } @@ -289,7 +386,7 @@ numa_realloc(struct virtio_net *dev, int index __rte_unused) * used to convert the ring addresses to our address space. */ static uint64_t -qva_to_vva(struct virtio_net *dev, uint64_t qva) +qva_to_vva(struct virtio_dev *dev, uint64_t qva) { struct virtio_memory_region *reg; uint32_t i; @@ -313,15 +410,14 @@ qva_to_vva(struct virtio_net *dev, uint64_t qva) * This function then converts these to our address space. */ static int -vhost_user_set_vring_addr(struct virtio_net *dev, struct vhost_vring_addr *addr) +vhost_user_set_vring_addr(struct virtio_dev *dev, struct vhost_vring_addr *addr) { struct vhost_virtqueue *vq; if (dev->mem == NULL) return -1; - /* addr->index refers to the queue index. The txq 1, rxq is 0. */ - vq = dev->virtqueue[addr->index]; + vq = dev->fn_table.vhost_dev_get_queues(dev, addr->index); /* The addresses are converted from QEMU virtual to Vhost virtual. */ vq->desc = (struct vring_desc *)(uintptr_t)qva_to_vva(dev, @@ -334,7 +430,7 @@ vhost_user_set_vring_addr(struct virtio_net *dev, struct vhost_vring_addr *addr) } dev = numa_realloc(dev, addr->index); - vq = dev->virtqueue[addr->index]; + vq = dev->fn_table.vhost_dev_get_queues(dev, addr->index); vq->avail = (struct vring_avail *)(uintptr_t)qva_to_vva(dev, addr->avail_user_addr); @@ -381,17 +477,19 @@ vhost_user_set_vring_addr(struct virtio_net *dev, struct vhost_vring_addr *addr) * The virtio device sends us the available ring last used index.
*/ static int -vhost_user_set_vring_base(struct virtio_net *dev, - struct vhost_vring_state *state) +vhost_user_set_vring_base(struct virtio_dev *dev, struct vhost_vring_state *state) { - dev->virtqueue[state->index]->last_used_idx = state->num; - dev->virtqueue[state->index]->last_avail_idx = state->num; + struct vhost_virtqueue *vq; + + vq = dev->fn_table.vhost_dev_get_queues(dev, state->index); + vq->last_used_idx = state->num; + vq->last_avail_idx = state->num; return 0; } static void -add_one_guest_page(struct virtio_net *dev, uint64_t guest_phys_addr, +add_one_guest_page(struct virtio_dev *dev, uint64_t guest_phys_addr, uint64_t host_phys_addr, uint64_t size) { struct guest_page *page, *last_page; @@ -419,7 +517,7 @@ add_one_guest_page(struct virtio_net *dev, uint64_t guest_phys_addr, } static void -add_guest_pages(struct virtio_net *dev, struct virtio_memory_region *reg, +add_guest_pages(struct virtio_dev *dev, struct virtio_memory_region *reg, uint64_t page_size) { uint64_t reg_size = reg->size; @@ -450,7 +548,7 @@ add_guest_pages(struct virtio_net *dev, struct virtio_memory_region *reg, #ifdef RTE_LIBRTE_VHOST_DEBUG /* TODO: enable it only in debug mode? */ static void -dump_guest_pages(struct virtio_net *dev) +dump_guest_pages(struct virtio_dev *dev) { uint32_t i; struct guest_page *page; @@ -474,7 +572,7 @@ dump_guest_pages(struct virtio_net *dev) #endif static int -vhost_user_set_mem_table(struct virtio_net *dev, struct VhostUserMsg *pmsg) +vhost_user_set_mem_table(struct virtio_dev *dev, struct VhostUserMsg *pmsg) { struct VhostUserMemory memory = pmsg->payload.memory; struct virtio_memory_region *reg; @@ -488,7 +586,7 @@ vhost_user_set_mem_table(struct virtio_net *dev, struct VhostUserMsg *pmsg) /* Remove from the data plane. */ if (dev->flags & VIRTIO_DEV_RUNNING) { dev->flags &= ~VIRTIO_DEV_RUNNING; - notify_ops->destroy_device(dev->vid); + dev->notify_ops->destroy_device(dev->vid); } if (dev->mem) { @@ -588,41 +686,22 @@ err_mmap: } static int -vq_is_ready(struct vhost_virtqueue *vq) -{ - return vq && vq->desc && - vq->kickfd != VIRTIO_UNINITIALIZED_EVENTFD && - vq->callfd != VIRTIO_UNINITIALIZED_EVENTFD; -} - -static int -virtio_is_ready(struct virtio_net *dev) +virtio_is_ready(struct virtio_dev *dev) { - struct vhost_virtqueue *rvq, *tvq; - uint32_t i; - - for (i = 0; i < dev->virt_qp_nb; i++) { - rvq = dev->virtqueue[i * VIRTIO_QNUM + VIRTIO_RXQ]; - tvq = dev->virtqueue[i * VIRTIO_QNUM + VIRTIO_TXQ]; - - if (!vq_is_ready(rvq) || !vq_is_ready(tvq)) { - RTE_LOG(INFO, VHOST_CONFIG, - "virtio is not ready for processing.\n"); - return 0; - } - } + if (dev->fn_table.vhost_dev_ready) + return dev->fn_table.vhost_dev_ready(dev); - RTE_LOG(INFO, VHOST_CONFIG, - "virtio is now ready for processing.\n"); - return 1; + return 0; } +/* + * The virtio device sends an eventfd that is used to signal the guest; + * store it through the transport's vhost_dev_set_vring_call callback.
+ */ static void -vhost_user_set_vring_call(struct virtio_net *dev, struct VhostUserMsg *pmsg) +vhost_user_set_vring_call(struct virtio_dev *dev, struct VhostUserMsg *pmsg) { struct vhost_vring_file file; - struct vhost_virtqueue *vq; - uint32_t cur_qp_idx; file.index = pmsg->payload.u64 & VHOST_USER_VRING_IDX_MASK; if (pmsg->payload.u64 & VHOST_USER_VRING_NOFD_MASK) @@ -632,23 +711,8 @@ vhost_user_set_vring_call(struct virtio_net *dev, struct VhostUserMsg *pmsg) RTE_LOG(INFO, VHOST_CONFIG, "vring call idx:%d file:%d\n", file.index, file.fd); - /* - * FIXME: VHOST_SET_VRING_CALL is the first per-vring message - * we get, so we do vring queue pair allocation here. - */ - cur_qp_idx = file.index / VIRTIO_QNUM; - if (cur_qp_idx + 1 > dev->virt_qp_nb) { - if (alloc_vring_queue_pair(dev, cur_qp_idx) < 0) - return; - } - - vq = dev->virtqueue[file.index]; - assert(vq != NULL); - - if (vq->callfd >= 0) - close(vq->callfd); - - vq->callfd = file.fd; + if (dev->fn_table.vhost_dev_set_vring_call) + dev->fn_table.vhost_dev_set_vring_call(dev, &file); } /* @@ -656,11 +720,14 @@ vhost_user_set_vring_call(struct virtio_net *dev, struct VhostUserMsg *pmsg) * device is ready for packet processing. */ static void -vhost_user_set_vring_kick(struct virtio_net *dev, struct VhostUserMsg *pmsg) +vhost_user_set_vring_kick(struct virtio_dev *dev, struct VhostUserMsg *pmsg) { struct vhost_vring_file file; struct vhost_virtqueue *vq; + if (!dev) + return; + file.index = pmsg->payload.u64 & VHOST_USER_VRING_IDX_MASK; if (pmsg->payload.u64 & VHOST_USER_VRING_NOFD_MASK) file.fd = VIRTIO_INVALID_EVENTFD; @@ -668,69 +735,44 @@ vhost_user_set_vring_kick(struct virtio_net *dev, struct VhostUserMsg *pmsg) file.fd = pmsg->fds[0]; RTE_LOG(INFO, VHOST_CONFIG, "vring kick idx:%d file:%d\n", file.index, file.fd); - - vq = dev->virtqueue[file.index]; + vq = dev->fn_table.vhost_dev_get_queues(dev, file.index); if (vq->kickfd >= 0) close(vq->kickfd); + vq->kickfd = file.fd; if (virtio_is_ready(dev) && !(dev->flags & VIRTIO_DEV_RUNNING)) { - if (dev->tx_zero_copy) { - RTE_LOG(INFO, VHOST_CONFIG, - "Tx zero copy is enabled\n"); - } - - if (notify_ops->new_device(dev->vid) == 0) + if (dev->notify_ops->new_device(dev->vid) == 0) dev->flags |= VIRTIO_DEV_RUNNING; } } -static void -free_zmbufs(struct vhost_virtqueue *vq) -{ - struct zcopy_mbuf *zmbuf, *next; - - for (zmbuf = TAILQ_FIRST(&vq->zmbuf_list); - zmbuf != NULL; zmbuf = next) { - next = TAILQ_NEXT(zmbuf, next); - - rte_pktmbuf_free(zmbuf->mbuf); - TAILQ_REMOVE(&vq->zmbuf_list, zmbuf, next); - } - - rte_free(vq->zmbufs); -} - /* * when virtio is stopped, qemu will send us the GET_VRING_BASE message. */ static int -vhost_user_get_vring_base(struct virtio_net *dev, - struct vhost_vring_state *state) +vhost_user_get_vring_base(struct virtio_dev *dev, struct vhost_vring_state *state) { + struct vhost_virtqueue *vq; + if (dev == NULL) + return -1; + /* We have to stop the queue (virtio) if it is running. 
*/ if (dev->flags & VIRTIO_DEV_RUNNING) { dev->flags &= ~VIRTIO_DEV_RUNNING; - notify_ops->destroy_device(dev->vid); + dev->notify_ops->destroy_device(dev->vid); } + vq = dev->fn_table.vhost_dev_get_queues(dev, state->index); + /* Here we are safe to get the last used index */ + state->num = vq->last_used_idx; - state->num = dev->virtqueue[state->index]->last_used_idx; + if (dev->fn_table.vhost_dev_get_vring_base) + dev->fn_table.vhost_dev_get_vring_base(dev, vq); RTE_LOG(INFO, VHOST_CONFIG, "vring base idx:%d file:%d\n", state->index, state->num); - /* - * Based on current qemu vhost-user implementation, this message is - * sent and only sent in vhost_vring_stop. - * TODO: cleanup the vring, it isn't usable since here. - */ - if (dev->virtqueue[state->index]->kickfd >= 0) - close(dev->virtqueue[state->index]->kickfd); - - dev->virtqueue[state->index]->kickfd = VIRTIO_UNINITIALIZED_EVENTFD; - - if (dev->tx_zero_copy) - free_zmbufs(dev->virtqueue[state->index]); return 0; } @@ -740,39 +782,54 @@ vhost_user_get_vring_base(struct virtio_net *dev, * enable the virtio queue pair. */ static int -vhost_user_set_vring_enable(struct virtio_net *dev, - struct vhost_vring_state *state) +vhost_user_set_vring_enable(struct virtio_dev *dev, struct vhost_vring_state *state) { + struct vhost_virtqueue *vq; int enable = (int)state->num; + if (dev == NULL) + return -1; + RTE_LOG(INFO, VHOST_CONFIG, "set queue enable: %d to qp idx: %d\n", enable, state->index); - if (notify_ops->vring_state_changed) - notify_ops->vring_state_changed(dev->vid, state->index, enable); - - dev->virtqueue[state->index]->enabled = enable; + if (dev->notify_ops->vring_state_changed) + dev->notify_ops->vring_state_changed(dev->vid, state->index, enable); + + vq = dev->fn_table.vhost_dev_get_queues(dev, state->index); + vq->enabled = enable; return 0; } static void -vhost_user_set_protocol_features(struct virtio_net *dev, - uint64_t protocol_features) +vhost_user_set_protocol_features(struct virtio_dev *dev, uint64_t protocol_features) { - if (protocol_features & ~VHOST_USER_PROTOCOL_FEATURES) + if (dev == NULL) return; - dev->protocol_features = protocol_features; + if (dev->fn_table.vhost_dev_set_protocol_features) + dev->fn_table.vhost_dev_set_protocol_features(dev, protocol_features); +} + +static uint64_t +vhost_user_get_protocol_features(struct virtio_dev *dev) +{ + if (dev == NULL) + return 0; + + if (dev->fn_table.vhost_dev_get_protocol_features) + return dev->fn_table.vhost_dev_get_protocol_features(dev); + + return 0; } static int -vhost_user_set_log_base(struct virtio_net *dev, struct VhostUserMsg *msg) +vhost_user_set_log_base(struct virtio_dev *dev, struct VhostUserMsg *msg) { int fd = msg->fds[0]; uint64_t size, off; - void *addr; if (fd < 0) { RTE_LOG(ERR, VHOST_CONFIG, "invalid log fd: %d\n", fd); @@ -792,58 +849,20 @@ vhost_user_set_log_base(struct virtio_net *dev, struct VhostUserMsg *msg) "log mmap size: %"PRId64", offset: %"PRId64"\n", size, off); - /* - * mmap from 0 to workaround a hugepage mmap bug: mmap will - * fail when offset is not page size aligned. - */ - addr = mmap(0, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0); - close(fd); - if (addr == MAP_FAILED) { - RTE_LOG(ERR, VHOST_CONFIG, "mmap log base failed!\n"); - return -1; - } - - /* - * Free previously mapped log memory on occasionally - * multiple VHOST_USER_SET_LOG_BASE.
- */ - if (dev->log_addr) { - munmap((void *)(uintptr_t)dev->log_addr, dev->log_size); - } - dev->log_addr = (uint64_t)(uintptr_t)addr; - dev->log_base = dev->log_addr + off; - dev->log_size = size; + if (dev->fn_table.vhost_dev_set_log_base) + return dev->fn_table.vhost_dev_set_log_base(dev, fd, size, off); return 0; } -/* - * An rarp packet is constructed and broadcasted to notify switches about - * the new location of the migrated VM, so that packets from outside will - * not be lost after migration. - * - * However, we don't actually "send" a rarp packet here, instead, we set - * a flag 'broadcast_rarp' to let rte_vhost_dequeue_burst() inject it. - */ -static int -vhost_user_send_rarp(struct virtio_net *dev, struct VhostUserMsg *msg) +static uint32_t +vhost_user_get_queue_num(struct virtio_dev *dev) { - uint8_t *mac = (uint8_t *)&msg->payload.u64; - - RTE_LOG(DEBUG, VHOST_CONFIG, - ":: mac: %02x:%02x:%02x:%02x:%02x:%02x\n", - mac[0], mac[1], mac[2], mac[3], mac[4], mac[5]); - memcpy(dev->mac.addr_bytes, mac, 6); + if (dev == NULL) + return 0; - /* - * Set the flag to inject a RARP broadcast packet at - * rte_vhost_dequeue_burst(). - * - * rte_smp_wmb() is for making sure the mac is copied - * before the flag is set. - */ - rte_smp_wmb(); - rte_atomic16_set(&dev->broadcast_rarp, 1); + if (dev->fn_table.vhost_dev_get_queue_num) + return dev->fn_table.vhost_dev_get_queue_num(dev); return 0; } @@ -899,7 +918,7 @@ send_vhost_message(int sockfd, struct VhostUserMsg *msg) int vhost_user_msg_handler(int vid, int fd) { - struct virtio_net *dev; + struct virtio_dev *dev; struct VhostUserMsg msg; int ret; @@ -926,7 +945,7 @@ vhost_user_msg_handler(int vid, int fd) vhost_message_str[msg.request]); switch (msg.request) { case VHOST_USER_GET_FEATURES: - msg.payload.u64 = vhost_user_get_features(); + msg.payload.u64 = vhost_user_get_features(dev); msg.size = sizeof(msg.payload.u64); send_vhost_message(fd, &msg); break; @@ -935,7 +954,7 @@ vhost_user_msg_handler(int vid, int fd) break; case VHOST_USER_GET_PROTOCOL_FEATURES: - msg.payload.u64 = VHOST_USER_PROTOCOL_FEATURES; + msg.payload.u64 = vhost_user_get_protocol_features(dev); msg.size = sizeof(msg.payload.u64); send_vhost_message(fd, &msg); break; @@ -996,7 +1015,7 @@ vhost_user_msg_handler(int vid, int fd) break; case VHOST_USER_GET_QUEUE_NUM: - msg.payload.u64 = VHOST_MAX_QUEUE_PAIRS; + msg.payload.u64 = vhost_user_get_queue_num(dev); msg.size = sizeof(msg.payload.u64); send_vhost_message(fd, &msg); break; @@ -1014,4 +1033,4 @@ vhost_user_msg_handler(int vid, int fd) } return 0; -} +} \ No newline at end of file diff --git a/lib/librte_vhost/vhost_user.h b/lib/librte_vhost/vhost_user.h index ba78d32..59f80f2 100644 --- a/lib/librte_vhost/vhost_user.h +++ b/lib/librte_vhost/vhost_user.h @@ -38,19 +38,12 @@ #include #include "rte_virtio_net.h" +#include "rte_virtio_dev.h" /* refer to hw/virtio/vhost-user.c */ #define VHOST_MEMORY_MAX_NREGIONS 8 -#define VHOST_USER_PROTOCOL_F_MQ 0 -#define VHOST_USER_PROTOCOL_F_LOG_SHMFD 1 -#define VHOST_USER_PROTOCOL_F_RARP 2 - -#define VHOST_USER_PROTOCOL_FEATURES ((1ULL << VHOST_USER_PROTOCOL_F_MQ) | \ - (1ULL << VHOST_USER_PROTOCOL_F_LOG_SHMFD) |\ - (1ULL << VHOST_USER_PROTOCOL_F_RARP)) - typedef enum VhostUserRequest { VHOST_USER_NONE = 0, VHOST_USER_GET_FEATURES = 1, @@ -117,12 +110,16 @@ typedef struct VhostUserMsg { /* The version of the protocol we support */ #define VHOST_USER_VERSION 0x1 - /* vhost_user.c */ int vhost_user_msg_handler(int vid, int fd); /* socket.c */ int read_fd_message(int sockfd, char 
*buf, int buflen, int *fds, int fd_num); int send_fd_message(int sockfd, char *buf, int buflen, int *fds, int fd_num); +void vhost_set_ifname(int vid, const char *if_name, unsigned int if_len); +int vhost_new_device(int type); +void vhost_destroy_device(int vid); + +struct virtio_dev *get_device(int vid); -#endif +#endif \ No newline at end of file diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c index 277b150..c11e9b2 100644 --- a/lib/librte_vhost/virtio_net.c +++ b/lib/librte_vhost/virtio_net.c @@ -45,7 +45,8 @@ #include #include -#include "vhost.h" +#include "vhost_net.h" +#include "vhost_device.h" #define MAX_PKT_BURST 32 #define VHOST_LOG_PAGE 4096 @@ -147,7 +148,7 @@ copy_mbuf_to_desc(struct virtio_net *dev, struct vhost_virtqueue *vq, struct virtio_net_hdr_mrg_rxbuf virtio_hdr = {{0, 0, 0, 0, 0, 0}, 0}; desc = &vq->desc[desc_idx]; - desc_addr = gpa_to_vva(dev, desc->addr); + desc_addr = gpa_to_vva(dev->device, desc->addr); /* * Checking of 'desc_addr' placed outside of 'unlikely' macro to avoid * performance issue with some versions of gcc (4.8.4 and 5.3.0) which @@ -187,7 +188,7 @@ copy_mbuf_to_desc(struct virtio_net *dev, struct vhost_virtqueue *vq, return -1; desc = &vq->desc[desc->next]; - desc_addr = gpa_to_vva(dev, desc->addr); + desc_addr = gpa_to_vva(dev->device, desc->addr); if (unlikely(!desc_addr)) return -1; @@ -232,7 +233,7 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id, - LOG_DEBUG(VHOST_DATA, "(%d) %s\n", dev->vid, __func__); + LOG_DEBUG(VHOST_DATA, "(%d) %s\n", dev->device->vid, __func__); if (unlikely(!is_valid_virt_queue_idx(queue_id, 0, dev->virt_qp_nb))) { RTE_LOG(ERR, VHOST_DATA, "(%d) %s: invalid virtqueue idx %d.\n", - dev->vid, __func__, queue_id); + dev->device->vid, __func__, queue_id); return 0; } @@ -395,7 +396,7 @@ copy_mbuf_to_desc_mergeable(struct virtio_net *dev, struct vhost_virtqueue *vq, - LOG_DEBUG(VHOST_DATA, "(%d) current index %d | end index %d\n", dev->vid, cur_idx, end_idx); + LOG_DEBUG(VHOST_DATA, "(%d) current index %d | end index %d\n", dev->device->vid, cur_idx, end_idx); - desc_addr = gpa_to_vva(dev, buf_vec[vec_idx].buf_addr); + desc_addr = gpa_to_vva(dev->device, buf_vec[vec_idx].buf_addr); if (buf_vec[vec_idx].buf_len < dev->vhost_hlen || !desc_addr) return 0; @@ -432,7 +433,7 @@ copy_mbuf_to_desc_mergeable(struct virtio_net *dev, struct vhost_virtqueue *vq, } vec_idx++; - desc_addr = gpa_to_vva(dev, buf_vec[vec_idx].buf_addr); + desc_addr = gpa_to_vva(dev->device, buf_vec[vec_idx].buf_addr); if (unlikely(!desc_addr)) return 0; @@ -487,7 +488,7 @@ virtio_dev_merge_rx(struct virtio_net *dev, uint16_t queue_id, - LOG_DEBUG(VHOST_DATA, "(%d) %s\n", dev->vid, __func__); + LOG_DEBUG(VHOST_DATA, "(%d) %s\n", dev->device->vid, __func__); if (unlikely(!is_valid_virt_queue_idx(queue_id, 0, dev->virt_qp_nb))) { RTE_LOG(ERR, VHOST_DATA, "(%d) %s: invalid virtqueue idx %d.\n", - dev->vid, __func__, queue_id); + dev->device->vid, __func__, queue_id); return 0; } @@ -537,10 +538,12 @@ uint16_t rte_vhost_enqueue_burst(int vid, uint16_t queue_id, struct rte_mbuf **pkts, uint16_t count) { - struct virtio_net *dev = get_device(vid); + struct virtio_dev *device = get_device(vid); + struct virtio_net *dev; - if (!dev) + if (!device) return 0; + dev = get_net_device(device); if (dev->features & (1 << VIRTIO_NET_F_MRG_RXBUF)) return virtio_dev_merge_rx(dev, queue_id, pkts, count); @@ -734,7 +737,7 @@ copy_desc_to_mbuf(struct virtio_net *dev, struct vhost_virtqueue *vq, if (unlikely(desc->len < dev->vhost_hlen)) return -1; - desc_addr = gpa_to_vva(dev, desc->addr); + desc_addr = gpa_to_vva(dev->device, desc->addr); if (unlikely(!desc_addr)) return -1; @@ -771,7 +774,7 @@ copy_desc_to_mbuf(struct virtio_net *dev, struct vhost_virtqueue *vq, (desc->flags &
VRING_DESC_F_NEXT) != 0)) { desc = &vq->desc[desc->next]; - desc_addr = gpa_to_vva(dev, desc->addr); + desc_addr = gpa_to_vva(dev->device, desc->addr); if (unlikely(!desc_addr)) return -1; @@ -800,7 +803,7 @@ copy_desc_to_mbuf(struct virtio_net *dev, struct vhost_virtqueue *vq, * copy is enabled. */ if (dev->tx_zero_copy && - (hpa = gpa_to_hpa(dev, desc->addr + desc_offset, cpy_len))) { + (hpa = gpa_to_hpa(dev->device, desc->addr + desc_offset, cpy_len))) { cur->data_len = cpy_len; cur->data_off = 0; cur->buf_addr = (void *)(uintptr_t)desc_addr; @@ -833,7 +836,7 @@ copy_desc_to_mbuf(struct virtio_net *dev, struct vhost_virtqueue *vq, return -1; desc = &vq->desc[desc->next]; - desc_addr = gpa_to_vva(dev, desc->addr); + desc_addr = gpa_to_vva(dev->device, desc->addr); if (unlikely(!desc_addr)) return -1; @@ -924,6 +927,7 @@ uint16_t rte_vhost_dequeue_burst(int vid, uint16_t queue_id, struct rte_mempool *mbuf_pool, struct rte_mbuf **pkts, uint16_t count) { + struct virtio_dev *device; struct virtio_net *dev; struct rte_mbuf *rarp_mbuf = NULL; struct vhost_virtqueue *vq; @@ -933,13 +937,14 @@ rte_vhost_dequeue_burst(int vid, uint16_t queue_id, uint16_t free_entries; uint16_t avail_idx; - dev = get_device(vid); - if (!dev) + device = get_device(vid); + if (!device) return 0; + dev = get_net_device(device); if (unlikely(!is_valid_virt_queue_idx(queue_id, 1, dev->virt_qp_nb))) { RTE_LOG(ERR, VHOST_DATA, "(%d) %s: invalid virtqueue idx %d.\n", - dev->vid, __func__, queue_id); + dev->device->vid, __func__, queue_id); return 0; } -- 1.9.3
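To illustrate the extension point this series creates, here is a minimal, hypothetical sketch of how a second device type (vhost-scsi) could plug its callbacks into the framework. Only struct virtio_dev, the fn_table members and vhost_new_device() come from this patch; everything named vhost_scsi_*, the VIRTIO_ID_SCSI value and the three-queue minimum are assumptions about the follow-up series, and the readiness test is deliberately simplified compared with the net backend's kickfd/callfd checks.

/*
 * Hypothetical sketch, not part of this patch: a vhost-scsi backend
 * registering with the virtio_dev framework.
 */
#include <stdlib.h>
#include <stdint.h>

#include "vhost_device.h"	/* struct virtio_dev / fn_table, added by this patch */

#define VIRTIO_ID_SCSI 8	/* assumption: would live next to VIRTIO_ID_NET */
#define VHOST_SCSI_QUEUES 3	/* assumption: controlq, eventq and one requestq */

struct vhost_scsi_dev {
	struct vhost_virtqueue *virtqueue[VHOST_SCSI_QUEUES];
	struct virtio_dev *device;	/* transport layer device context */
};

/* Single instance for brevity; a real backend would keep a per-vid table. */
static struct vhost_scsi_dev *g_scsi_dev;

static struct vhost_virtqueue *
vhost_scsi_get_queues(struct virtio_dev *device, uint16_t queue_id)
{
	(void)device;

	if (queue_id >= VHOST_SCSI_QUEUES)
		return NULL;

	return g_scsi_dev->virtqueue[queue_id];
}

static int
vhost_scsi_ready(struct virtio_dev *device)
{
	uint16_t i;

	/* Ready once every vring has its descriptor table mapped. */
	for (i = 0; i < VHOST_SCSI_QUEUES; i++) {
		struct vhost_virtqueue *vq = vhost_scsi_get_queues(device, i);

		if (vq == NULL || vq->desc == NULL)
			return 0;
	}

	return 1;
}

/*
 * Mirrors vhost_net_device_init(): only vhost_dev_get_queues is dereferenced
 * unconditionally by the message handler; the other callbacks are guarded by
 * NULL checks in vhost_user.c and may stay unset.
 */
int
vhost_scsi_device_init(struct virtio_dev *device)
{
	g_scsi_dev = calloc(1, sizeof(*g_scsi_dev));
	if (g_scsi_dev == NULL)
		return -1;

	device->fn_table.vhost_dev_ready = vhost_scsi_ready;
	device->fn_table.vhost_dev_get_queues = vhost_scsi_get_queues;
	g_scsi_dev->device = device;

	return 0;
}

Wiring this up would then be one more case in the vhost_new_device() switch (case VIRTIO_ID_SCSI: calling vhost_scsi_device_init()), which is exactly the per-type registration path the commit message describes.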