From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mx1.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by dpdk.org (Postfix) with ESMTP id EA8C54B4B for ; Sat, 31 Mar 2018 08:10:20 +0200 (CEST) Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.rdu2.redhat.com [10.11.54.5]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 5A9D4F63FE; Sat, 31 Mar 2018 06:10:20 +0000 (UTC) Received: from [10.36.112.16] (ovpn-112-16.ams2.redhat.com [10.36.112.16]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 9B4AE6352D; Sat, 31 Mar 2018 06:10:18 +0000 (UTC) To: Zhihong Wang , dev@dpdk.org Cc: jianfeng.tan@intel.com, tiwei.bie@intel.com, yliu@fridaylinux.org, cunming.liang@intel.com, xiao.w.wang@intel.com, dan.daly@intel.com References: <1517614137-62926-1-git-send-email-zhihong.wang@intel.com> <20180310100148.12030-1-zhihong.wang@intel.com> <20180310100148.12030-3-zhihong.wang@intel.com> From: Maxime Coquelin Message-ID: Date: Sat, 31 Mar 2018 08:10:16 +0200 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.6.0 MIME-Version: 1.0 In-Reply-To: <20180310100148.12030-3-zhihong.wang@intel.com> Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 7bit X-Scanned-By: MIMEDefang 2.79 on 10.11.54.5 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.1]); Sat, 31 Mar 2018 06:10:20 +0000 (UTC) X-Greylist: inspected by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.1]); Sat, 31 Mar 2018 06:10:20 +0000 (UTC) for IP:'10.11.54.5' DOMAIN:'int-mx05.intmail.prod.int.rdu2.redhat.com' HELO:'smtp.corp.redhat.com' FROM:'maxime.coquelin@redhat.com' RCPT:'' Subject: Re: [dpdk-dev] [PATCH v4 2/5] vhost: support selective datapath X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 31 Mar 2018 06:10:21 -0000 On 03/10/2018 11:01 AM, Zhihong Wang wrote: > This patch set introduces support for selective datapath in DPDK vhost-user > lib. vDPA stands for vhost Data Path Acceleration. The idea is to support > virtio ring compatible devices to serve virtio driver directly to enable > datapath acceleration. > > A set of device ops is defined for device specific operations: > > a. queue_num_get: Called to get supported queue number of the device. > > b. feature_get: Called to get supported features of the device. > > c. protocol_feature_get: Called to get supported protocol features of > the device. > > d. dev_conf: Called to configure the actual device when the virtio > device becomes ready. > > e. dev_close: Called to close the actual device when the virtio device > is stopped. > > f. vring_state_set: Called to change the state of the vring in the > actual device when vring state changes. > > g. feature_set: Called to set the negotiated features to device. > > h. migration_done: Called to allow the device to response to RARP > sending. > > i. get_vfio_group_fd: Called to get the VFIO group fd of the device. > > j. get_vfio_device_fd: Called to get the VFIO device fd of the device. > > k. get_notify_area: Called to get the notify area info of the queue. > > Signed-off-by: Zhihong Wang > --- > Changes in v4: > > 1. Remove the "engine" concept in the lib. > > --- > Changes in v2: > > 1. Add VFIO related vDPA device ops. > > lib/librte_vhost/Makefile | 4 +- > lib/librte_vhost/rte_vdpa.h | 94 +++++++++++++++++++++++++++++++++ > lib/librte_vhost/rte_vhost_version.map | 6 +++ > lib/librte_vhost/vdpa.c | 96 ++++++++++++++++++++++++++++++++++ > 4 files changed, 198 insertions(+), 2 deletions(-) > create mode 100644 lib/librte_vhost/rte_vdpa.h > create mode 100644 lib/librte_vhost/vdpa.c > > diff --git a/lib/librte_vhost/Makefile b/lib/librte_vhost/Makefile > index 5d6c6abae..37044ac03 100644 > --- a/lib/librte_vhost/Makefile > +++ b/lib/librte_vhost/Makefile > @@ -22,9 +22,9 @@ LDLIBS += -lrte_eal -lrte_mempool -lrte_mbuf -lrte_ethdev -lrte_net > > # all source are stored in SRCS-y > SRCS-$(CONFIG_RTE_LIBRTE_VHOST) := fd_man.c iotlb.c socket.c vhost.c \ > - vhost_user.c virtio_net.c > + vhost_user.c virtio_net.c vdpa.c > > # install includes > -SYMLINK-$(CONFIG_RTE_LIBRTE_VHOST)-include += rte_vhost.h > +SYMLINK-$(CONFIG_RTE_LIBRTE_VHOST)-include += rte_vhost.h rte_vdpa.h > > include $(RTE_SDK)/mk/rte.lib.mk > diff --git a/lib/librte_vhost/rte_vdpa.h b/lib/librte_vhost/rte_vdpa.h > new file mode 100644 > index 000000000..a4bbbd93d > --- /dev/null > +++ b/lib/librte_vhost/rte_vdpa.h > @@ -0,0 +1,94 @@ > +/* SPDX-License-Identifier: BSD-3-Clause > + * Copyright(c) 2018 Intel Corporation > + */ > + > +#ifndef _RTE_VDPA_H_ > +#define _RTE_VDPA_H_ > + > +/** > + * @file > + * > + * Device specific vhost lib > + */ > + > +#include > +#include "rte_vhost.h" > + > +#define MAX_VDPA_NAME_LEN 128 > + > +enum vdpa_addr_type { > + PCI_ADDR, > + VDPA_ADDR_MAX > +}; > + > +struct rte_vdpa_dev_addr { > + enum vdpa_addr_type type; > + union { > + uint8_t __dummy[64]; > + struct rte_pci_addr pci_addr; > + }; > +}; > + > +/* Get capabilities of this device */ > +typedef int (*vdpa_dev_queue_num_get_t)(int did, uint32_t *queue_num); > +typedef int (*vdpa_dev_feature_get_t)(int did, uint64_t *features); > + > +/* Driver configure/close the device */ > +typedef int (*vdpa_dev_conf_t)(int vid); > +typedef int (*vdpa_dev_close_t)(int vid); > + > +/* Enable/disable this vring */ > +typedef int (*vdpa_vring_state_set_t)(int vid, int vring, int state); > + > +/* Set features when changed */ > +typedef int (*vdpa_feature_set_t)(int vid); > + > +/* Destination operations when migration done */ > +typedef int (*vdpa_migration_done_t)(int vid); > + > +/* Get the vfio group fd */ > +typedef int (*vdpa_get_vfio_group_fd_t)(int vid); > + > +/* Get the vfio device fd */ > +typedef int (*vdpa_get_vfio_device_fd_t)(int vid); > + > +/* Get the notify area info of the queue */ > +typedef int (*vdpa_get_notify_area_t)(int vid, int qid, uint64_t *offset, > + uint64_t *size); > +/* Device ops */ > +struct rte_vdpa_dev_ops { > + vdpa_dev_queue_num_get_t queue_num_get; > + vdpa_dev_feature_get_t feature_get; > + vdpa_dev_feature_get_t protocol_feature_get; > + vdpa_dev_conf_t dev_conf; > + vdpa_dev_close_t dev_close; > + vdpa_vring_state_set_t vring_state_set; > + vdpa_feature_set_t feature_set; > + vdpa_migration_done_t migration_done; > + vdpa_get_vfio_group_fd_t get_vfio_group_fd; > + vdpa_get_vfio_device_fd_t get_vfio_device_fd; > + vdpa_get_notify_area_t get_notify_area; Maybe you could reserve some room here to avoid breaking the ABI in the future if we need to add some optional ops. > +}; > + > +struct rte_vdpa_device { > + struct rte_vdpa_dev_addr addr; > + struct rte_vdpa_dev_ops *ops; > +} __rte_cache_aligned; > + > +extern struct rte_vdpa_device *vdpa_devices[]; > +extern uint32_t vdpa_device_num; > + > +/* Register a vdpa device, return did if successful, -1 on failure */ > +int __rte_experimental > +rte_vdpa_register_device(struct rte_vdpa_dev_addr *addr, > + struct rte_vdpa_dev_ops *ops); > + > +/* Unregister a vdpa device, return -1 on failure */ > +int __rte_experimental > +rte_vdpa_unregister_device(int did); > + > +/* Find did of a vdpa device, return -1 on failure */ > +int __rte_experimental > +rte_vdpa_find_device_id(struct rte_vdpa_dev_addr *addr); > + > +#endif /* _RTE_VDPA_H_ */ > diff --git a/lib/librte_vhost/rte_vhost_version.map b/lib/librte_vhost/rte_vhost_version.map > index df0103129..7bcffb490 100644 > --- a/lib/librte_vhost/rte_vhost_version.map > +++ b/lib/librte_vhost/rte_vhost_version.map > @@ -59,3 +59,9 @@ DPDK_18.02 { > rte_vhost_vring_call; > > } DPDK_17.08; > + > +EXPERIMENTAL { > + rte_vdpa_register_device; > + rte_vdpa_unregister_device; > + rte_vdpa_find_device_id; I think you need also to declare the new structs here, not only the new functions. > +} DPDK_18.02; > diff --git a/lib/librte_vhost/vdpa.c b/lib/librte_vhost/vdpa.c > new file mode 100644 > index 000000000..0c950d45f > --- /dev/null > +++ b/lib/librte_vhost/vdpa.c > @@ -0,0 +1,96 @@ > +/* SPDX-License-Identifier: BSD-3-Clause > + * Copyright(c) 2018 Intel Corporation > + */ > + > +/** > + * @file > + * > + * Device specific vhost lib > + */ > + > +#include > + > +#include > +#include "rte_vdpa.h" > +#include "vhost.h" > + > +struct rte_vdpa_device *vdpa_devices[MAX_VHOST_DEVICE]; > +uint32_t vdpa_device_num; > + > +static int is_same_vdpa_dev_addr(struct rte_vdpa_dev_addr *a, > + struct rte_vdpa_dev_addr *b) > +{ Given the boolean nature of the function name, I would return 1 if same device, 0 if different. > + int ret = 0; > + > + if (a->type != b->type) > + return -1; > + > + switch (a->type) { > + case PCI_ADDR: > + if (a->pci_addr.domain != b->pci_addr.domain || > + a->pci_addr.bus != b->pci_addr.bus || > + a->pci_addr.devid != b->pci_addr.devid || > + a->pci_addr.function != b->pci_addr.function) > + ret = -1; > + break; > + default: > + break; > + } > + > + return ret; > +} > + > +int rte_vdpa_register_device(struct rte_vdpa_dev_addr *addr, > + struct rte_vdpa_dev_ops *ops) > +{ > + struct rte_vdpa_device *dev; > + char device_name[MAX_VDPA_NAME_LEN]; > + int i; > + > + if (vdpa_device_num >= MAX_VHOST_DEVICE) > + return -1; > + > + for (i = 0; i < MAX_VHOST_DEVICE; i++) { > + if (vdpa_devices[i] == NULL) > + break; You might want to check same device isn't being registering a second time, and return an error in that case. This is not a blocker though, and can be done in a dedicated patch. > + } > + > + sprintf(device_name, "vdpa-dev-%d", i); > + dev = rte_zmalloc(device_name, sizeof(struct rte_vdpa_device), > + RTE_CACHE_LINE_SIZE); > + if (!dev) > + return -1; > + > + memcpy(&dev->addr, addr, sizeof(struct rte_vdpa_dev_addr)); > + dev->ops = ops; > + vdpa_devices[i] = dev; > + vdpa_device_num++; > + > + return i; > +} > + > +int rte_vdpa_unregister_device(int did) > +{ > + if (did < 0 || did >= MAX_VHOST_DEVICE || vdpa_devices[did] == NULL) > + return -1; > + > + rte_free(vdpa_devices[did]); > + vdpa_devices[did] = NULL; > + vdpa_device_num--; > + > + return did; > +} > + > +int rte_vdpa_find_device_id(struct rte_vdpa_dev_addr *addr) > +{ > + struct rte_vdpa_device *dev; > + int i; > + > + for (i = 0; i < MAX_VHOST_DEVICE; ++i) { > + dev = vdpa_devices[i]; > + if (dev && is_same_vdpa_dev_addr(&dev->addr, addr) == 0) > + return i; > + } > + > + return -1; > +} >