From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mga09.intel.com (mga09.intel.com [134.134.136.24]) by dpdk.org (Postfix) with ESMTP id 9E3A54CBB for ; Fri, 30 Mar 2018 12:01:00 +0200 (CEST) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga001.jf.intel.com ([10.7.209.18]) by orsmga102.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 30 Mar 2018 03:01:00 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.48,381,1517904000"; d="scan'208";a="43494316" Received: from unknown (HELO dpdk99.sh.intel.com) ([10.67.110.156]) by orsmga001.jf.intel.com with ESMTP; 30 Mar 2018 03:00:58 -0700 From: Zhihong Wang To: dev@dpdk.org Cc: jianfeng.tan@intel.com, tiwei.bie@intel.com, maxime.coquelin@redhat.com, yliu@fridaylinux.org, cunming.liang@intel.com, xiao.w.wang@intel.com, dan.daly@intel.com, Zhihong Wang Message-Id: <20180310100148.12030-3-zhihong.wang@intel.com> X-Mailer: git-send-email 2.13.6 In-Reply-To: <20180310100148.12030-1-zhihong.wang@intel.com> References: <1517614137-62926-1-git-send-email-zhihong.wang@intel.com> <20180310100148.12030-1-zhihong.wang@intel.com> Subject: [dpdk-dev] [PATCH v4 2/5] vhost: support selective datapath X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Fri, 30 Mar 2018 10:01:01 -0000 X-Original-Date: Sat, 10 Mar 2018 18:01:45 +0800 X-List-Received-Date: Fri, 30 Mar 2018 10:01:01 -0000 This patch set introduces support for selective datapath in DPDK vhost-user lib. vDPA stands for vhost Data Path Acceleration. The idea is to support virtio ring compatible devices to serve virtio driver directly to enable datapath acceleration. A set of device ops is defined for device specific operations: a. queue_num_get: Called to get supported queue number of the device. b. feature_get: Called to get supported features of the device. c. protocol_feature_get: Called to get supported protocol features of the device. d. dev_conf: Called to configure the actual device when the virtio device becomes ready. e. dev_close: Called to close the actual device when the virtio device is stopped. f. vring_state_set: Called to change the state of the vring in the actual device when vring state changes. g. feature_set: Called to set the negotiated features to device. h. migration_done: Called to allow the device to response to RARP sending. i. get_vfio_group_fd: Called to get the VFIO group fd of the device. j. get_vfio_device_fd: Called to get the VFIO device fd of the device. k. get_notify_area: Called to get the notify area info of the queue. Signed-off-by: Zhihong Wang --- Changes in v4: 1. Remove the "engine" concept in the lib. --- Changes in v2: 1. Add VFIO related vDPA device ops. lib/librte_vhost/Makefile | 4 +- lib/librte_vhost/rte_vdpa.h | 94 +++++++++++++++++++++++++++++++++ lib/librte_vhost/rte_vhost_version.map | 6 +++ lib/librte_vhost/vdpa.c | 96 ++++++++++++++++++++++++++++++++++ 4 files changed, 198 insertions(+), 2 deletions(-) create mode 100644 lib/librte_vhost/rte_vdpa.h create mode 100644 lib/librte_vhost/vdpa.c diff --git a/lib/librte_vhost/Makefile b/lib/librte_vhost/Makefile index 5d6c6abae..37044ac03 100644 --- a/lib/librte_vhost/Makefile +++ b/lib/librte_vhost/Makefile @@ -22,9 +22,9 @@ LDLIBS += -lrte_eal -lrte_mempool -lrte_mbuf -lrte_ethdev -lrte_net # all source are stored in SRCS-y SRCS-$(CONFIG_RTE_LIBRTE_VHOST) := fd_man.c iotlb.c socket.c vhost.c \ - vhost_user.c virtio_net.c + vhost_user.c virtio_net.c vdpa.c # install includes -SYMLINK-$(CONFIG_RTE_LIBRTE_VHOST)-include += rte_vhost.h +SYMLINK-$(CONFIG_RTE_LIBRTE_VHOST)-include += rte_vhost.h rte_vdpa.h include $(RTE_SDK)/mk/rte.lib.mk diff --git a/lib/librte_vhost/rte_vdpa.h b/lib/librte_vhost/rte_vdpa.h new file mode 100644 index 000000000..a4bbbd93d --- /dev/null +++ b/lib/librte_vhost/rte_vdpa.h @@ -0,0 +1,94 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2018 Intel Corporation + */ + +#ifndef _RTE_VDPA_H_ +#define _RTE_VDPA_H_ + +/** + * @file + * + * Device specific vhost lib + */ + +#include +#include "rte_vhost.h" + +#define MAX_VDPA_NAME_LEN 128 + +enum vdpa_addr_type { + PCI_ADDR, + VDPA_ADDR_MAX +}; + +struct rte_vdpa_dev_addr { + enum vdpa_addr_type type; + union { + uint8_t __dummy[64]; + struct rte_pci_addr pci_addr; + }; +}; + +/* Get capabilities of this device */ +typedef int (*vdpa_dev_queue_num_get_t)(int did, uint32_t *queue_num); +typedef int (*vdpa_dev_feature_get_t)(int did, uint64_t *features); + +/* Driver configure/close the device */ +typedef int (*vdpa_dev_conf_t)(int vid); +typedef int (*vdpa_dev_close_t)(int vid); + +/* Enable/disable this vring */ +typedef int (*vdpa_vring_state_set_t)(int vid, int vring, int state); + +/* Set features when changed */ +typedef int (*vdpa_feature_set_t)(int vid); + +/* Destination operations when migration done */ +typedef int (*vdpa_migration_done_t)(int vid); + +/* Get the vfio group fd */ +typedef int (*vdpa_get_vfio_group_fd_t)(int vid); + +/* Get the vfio device fd */ +typedef int (*vdpa_get_vfio_device_fd_t)(int vid); + +/* Get the notify area info of the queue */ +typedef int (*vdpa_get_notify_area_t)(int vid, int qid, uint64_t *offset, + uint64_t *size); +/* Device ops */ +struct rte_vdpa_dev_ops { + vdpa_dev_queue_num_get_t queue_num_get; + vdpa_dev_feature_get_t feature_get; + vdpa_dev_feature_get_t protocol_feature_get; + vdpa_dev_conf_t dev_conf; + vdpa_dev_close_t dev_close; + vdpa_vring_state_set_t vring_state_set; + vdpa_feature_set_t feature_set; + vdpa_migration_done_t migration_done; + vdpa_get_vfio_group_fd_t get_vfio_group_fd; + vdpa_get_vfio_device_fd_t get_vfio_device_fd; + vdpa_get_notify_area_t get_notify_area; +}; + +struct rte_vdpa_device { + struct rte_vdpa_dev_addr addr; + struct rte_vdpa_dev_ops *ops; +} __rte_cache_aligned; + +extern struct rte_vdpa_device *vdpa_devices[]; +extern uint32_t vdpa_device_num; + +/* Register a vdpa device, return did if successful, -1 on failure */ +int __rte_experimental +rte_vdpa_register_device(struct rte_vdpa_dev_addr *addr, + struct rte_vdpa_dev_ops *ops); + +/* Unregister a vdpa device, return -1 on failure */ +int __rte_experimental +rte_vdpa_unregister_device(int did); + +/* Find did of a vdpa device, return -1 on failure */ +int __rte_experimental +rte_vdpa_find_device_id(struct rte_vdpa_dev_addr *addr); + +#endif /* _RTE_VDPA_H_ */ diff --git a/lib/librte_vhost/rte_vhost_version.map b/lib/librte_vhost/rte_vhost_version.map index df0103129..7bcffb490 100644 --- a/lib/librte_vhost/rte_vhost_version.map +++ b/lib/librte_vhost/rte_vhost_version.map @@ -59,3 +59,9 @@ DPDK_18.02 { rte_vhost_vring_call; } DPDK_17.08; + +EXPERIMENTAL { + rte_vdpa_register_device; + rte_vdpa_unregister_device; + rte_vdpa_find_device_id; +} DPDK_18.02; diff --git a/lib/librte_vhost/vdpa.c b/lib/librte_vhost/vdpa.c new file mode 100644 index 000000000..0c950d45f --- /dev/null +++ b/lib/librte_vhost/vdpa.c @@ -0,0 +1,96 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2018 Intel Corporation + */ + +/** + * @file + * + * Device specific vhost lib + */ + +#include + +#include +#include "rte_vdpa.h" +#include "vhost.h" + +struct rte_vdpa_device *vdpa_devices[MAX_VHOST_DEVICE]; +uint32_t vdpa_device_num; + +static int is_same_vdpa_dev_addr(struct rte_vdpa_dev_addr *a, + struct rte_vdpa_dev_addr *b) +{ + int ret = 0; + + if (a->type != b->type) + return -1; + + switch (a->type) { + case PCI_ADDR: + if (a->pci_addr.domain != b->pci_addr.domain || + a->pci_addr.bus != b->pci_addr.bus || + a->pci_addr.devid != b->pci_addr.devid || + a->pci_addr.function != b->pci_addr.function) + ret = -1; + break; + default: + break; + } + + return ret; +} + +int rte_vdpa_register_device(struct rte_vdpa_dev_addr *addr, + struct rte_vdpa_dev_ops *ops) +{ + struct rte_vdpa_device *dev; + char device_name[MAX_VDPA_NAME_LEN]; + int i; + + if (vdpa_device_num >= MAX_VHOST_DEVICE) + return -1; + + for (i = 0; i < MAX_VHOST_DEVICE; i++) { + if (vdpa_devices[i] == NULL) + break; + } + + sprintf(device_name, "vdpa-dev-%d", i); + dev = rte_zmalloc(device_name, sizeof(struct rte_vdpa_device), + RTE_CACHE_LINE_SIZE); + if (!dev) + return -1; + + memcpy(&dev->addr, addr, sizeof(struct rte_vdpa_dev_addr)); + dev->ops = ops; + vdpa_devices[i] = dev; + vdpa_device_num++; + + return i; +} + +int rte_vdpa_unregister_device(int did) +{ + if (did < 0 || did >= MAX_VHOST_DEVICE || vdpa_devices[did] == NULL) + return -1; + + rte_free(vdpa_devices[did]); + vdpa_devices[did] = NULL; + vdpa_device_num--; + + return did; +} + +int rte_vdpa_find_device_id(struct rte_vdpa_dev_addr *addr) +{ + struct rte_vdpa_device *dev; + int i; + + for (i = 0; i < MAX_VHOST_DEVICE; ++i) { + dev = vdpa_devices[i]; + if (dev && is_same_vdpa_dev_addr(&dev->addr, addr) == 0) + return i; + } + + return -1; +} -- 2.13.6