From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from dpdk.org (dpdk.org [92.243.14.124])
 by inbox.dpdk.org (Postfix) with ESMTP id 48D38A04BC;
 Tue, 29 Sep 2020 18:17:13 +0200 (CEST)
Received: from [92.243.14.124] (localhost [127.0.0.1])
 by dpdk.org (Postfix) with ESMTP id 3F2F61D9FC;
 Tue, 29 Sep 2020 18:15:03 +0200 (CEST)
Received: from us-smtp-delivery-124.mimecast.com
 (us-smtp-delivery-124.mimecast.com [216.205.24.124])
 by dpdk.org (Postfix) with ESMTP id 8156F1D9FC
 for ; Tue, 29 Sep 2020 18:15:00 +0200 (CEST)
Dkim-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com;
 s=mimecast20190719; t=1601396099;
 h=from:from:reply-to:subject:subject:date:date:message-id:message-id:
 to:to:cc:cc:mime-version:mime-version:content-type:content-type:
 content-transfer-encoding:content-transfer-encoding:
 in-reply-to:in-reply-to:references:references;
 bh=0KOCYPHA1xqDcS3gcpfFbpEAbBgx3Kee0+yDrzh30HI=;
 b=UYImkexFZIhZFwOb5dVD5VeiVJBkphNH9mYf3nfVCrWjDQnLPDbTztExbi/31gw6/wzC1n
 AIDUqvSITFPgV1DVPVzpb8/fGPKHqCxDO2amudX5Badsd9pndQANifDz91IrD5JPnOKP5+
 zpMMx3U9GsCq3gKWXxBIKhEBYjZZwKk=
Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com
 [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id
 us-mta-591-W9CBjr3cMkidw0AqxQ47EA-1; Tue, 29 Sep 2020 12:14:57 -0400
X-MC-Unique: W9CBjr3cMkidw0AqxQ47EA-1
Received: from smtp.corp.redhat.com
 (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by mimecast-mx01.redhat.com (Postfix) with ESMTPS id D919F800688;
 Tue, 29 Sep 2020 16:14:55 +0000 (UTC)
Received: from localhost.localdomain (unknown [10.36.110.36])
 by smtp.corp.redhat.com (Postfix) with ESMTP id 1437E5C1BD;
 Tue, 29 Sep 2020 16:14:46 +0000 (UTC)
From: Maxime Coquelin
To: dev@dpdk.org, chenbo.xia@intel.com, patrick.fu@intel.com,
 amorenoz@redhat.com
Cc: Maxime Coquelin
Date: Tue, 29 Sep 2020 18:14:04 +0200
Message-Id: <20200929161404.124580-9-maxime.coquelin@redhat.com>
In-Reply-To: <20200929161404.124580-1-maxime.coquelin@redhat.com>
References: <20200929161404.124580-1-maxime.coquelin@redhat.com>
MIME-Version: 1.0
X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16
Authentication-Results: relay.mimecast.com;
 auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=maxime.coquelin@redhat.com
X-Mimecast-Spam-Score: 0
X-Mimecast-Originator: redhat.com
Content-Transfer-Encoding: 8bit
Content-Type: text/plain; charset="US-ASCII"
Subject: [dpdk-dev] [PATCH v3 8/8] net/virtio: introduce Vhost-vDPA backend
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: dev-bounces@dpdk.org
Sender: "dev"

vhost-vDPA is a new virtio backend type introduced by the vDPA kernel
framework, which provides an abstraction of vDPA devices and exposes a
unified control interface through a char device.

This patch adds support for the vhost-vDPA backend. As with the
existing vhost kernel backend, a set of virtio_user ops is introduced
for the vhost-vDPA backend to handle device-specific operations such as:
 - device setup
 - ioctl message handling
 - queue pair enabling
 - DMA map/unmap

vDPA-relevant ioctl codes and data structures are also defined in this
patch.

Signed-off-by: Maxime Coquelin
---
 drivers/net/virtio/meson.build                |   1 +
 drivers/net/virtio/virtio_user/vhost.h        |   1 +
 drivers/net/virtio/virtio_user/vhost_vdpa.c   | 298 ++++++++++++++++++
 .../net/virtio/virtio_user/virtio_user_dev.c  |   6 +
 4 files changed, 306 insertions(+)
 create mode 100644 drivers/net/virtio/virtio_user/vhost_vdpa.c

diff --git a/drivers/net/virtio/meson.build b/drivers/net/virtio/meson.build
index 3fd6051f4b..eaed46373d 100644
--- a/drivers/net/virtio/meson.build
+++ b/drivers/net/virtio/meson.build
@@ -42,6 +42,7 @@ if is_linux
 		'virtio_user/vhost_kernel.c',
 		'virtio_user/vhost_kernel_tap.c',
 		'virtio_user/vhost_user.c',
+		'virtio_user/vhost_vdpa.c',
 		'virtio_user/virtio_user_dev.c')
 	deps += ['bus_vdev']
 endif
diff --git a/drivers/net/virtio/virtio_user/vhost.h b/drivers/net/virtio/virtio_user/vhost.h
index 2e71995a79..210a3704e7 100644
--- a/drivers/net/virtio/virtio_user/vhost.h
+++ b/drivers/net/virtio/virtio_user/vhost.h
@@ -113,5 +113,6 @@ struct virtio_user_backend_ops {
 
 extern struct virtio_user_backend_ops virtio_ops_user;
 extern struct virtio_user_backend_ops virtio_ops_kernel;
+extern struct virtio_user_backend_ops virtio_ops_vdpa;
 
 #endif
diff --git a/drivers/net/virtio/virtio_user/vhost_vdpa.c b/drivers/net/virtio/virtio_user/vhost_vdpa.c
new file mode 100644
index 0000000000..c7b9349fc8
--- /dev/null
+++ b/drivers/net/virtio/virtio_user/vhost_vdpa.c
@@ -0,0 +1,298 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Red Hat Inc.
+ */
+
+#include
+#include
+#include
+#include
+#include
+
+#include
+
+#include "vhost.h"
+#include "virtio_user_dev.h"
+
+/* vhost kernel & vdpa ioctls */
+#define VHOST_VIRTIO 0xAF
+#define VHOST_GET_FEATURES _IOR(VHOST_VIRTIO, 0x00, __u64)
+#define VHOST_SET_FEATURES _IOW(VHOST_VIRTIO, 0x00, __u64)
+#define VHOST_SET_OWNER _IO(VHOST_VIRTIO, 0x01)
+#define VHOST_RESET_OWNER _IO(VHOST_VIRTIO, 0x02)
+#define VHOST_SET_MEM_TABLE _IOW(VHOST_VIRTIO, 0x03, void *)
+#define VHOST_SET_LOG_BASE _IOW(VHOST_VIRTIO, 0x04, __u64)
+#define VHOST_SET_LOG_FD _IOW(VHOST_VIRTIO, 0x07, int)
+#define VHOST_SET_VRING_NUM _IOW(VHOST_VIRTIO, 0x10, struct vhost_vring_state)
+#define VHOST_SET_VRING_ADDR _IOW(VHOST_VIRTIO, 0x11, struct vhost_vring_addr)
+#define VHOST_SET_VRING_BASE _IOW(VHOST_VIRTIO, 0x12, struct vhost_vring_state)
+#define VHOST_GET_VRING_BASE _IOWR(VHOST_VIRTIO, 0x12, struct vhost_vring_state)
+#define VHOST_SET_VRING_KICK _IOW(VHOST_VIRTIO, 0x20, struct vhost_vring_file)
+#define VHOST_SET_VRING_CALL _IOW(VHOST_VIRTIO, 0x21, struct vhost_vring_file)
+#define VHOST_SET_VRING_ERR _IOW(VHOST_VIRTIO, 0x22, struct vhost_vring_file)
+#define VHOST_NET_SET_BACKEND _IOW(VHOST_VIRTIO, 0x30, struct vhost_vring_file)
+#define VHOST_VDPA_GET_DEVICE_ID _IOR(VHOST_VIRTIO, 0x70, __u32)
+#define VHOST_VDPA_GET_STATUS _IOR(VHOST_VIRTIO, 0x71, __u8)
+#define VHOST_VDPA_SET_STATUS _IOW(VHOST_VIRTIO, 0x72, __u8)
+#define VHOST_VDPA_SET_VRING_ENABLE _IOW(VHOST_VIRTIO, 0x75, \
+					     struct vhost_vring_state)
+
+static uint64_t vhost_req_user_to_vdpa[] = {
+	[VHOST_USER_SET_OWNER] = VHOST_SET_OWNER,
+	[VHOST_USER_RESET_OWNER] = VHOST_RESET_OWNER,
+	[VHOST_USER_SET_FEATURES] = VHOST_SET_FEATURES,
+	[VHOST_USER_GET_FEATURES] = VHOST_GET_FEATURES,
+	[VHOST_USER_SET_VRING_CALL] = VHOST_SET_VRING_CALL,
+	[VHOST_USER_SET_VRING_NUM] = VHOST_SET_VRING_NUM,
+	[VHOST_USER_SET_VRING_BASE] = VHOST_SET_VRING_BASE,
+	[VHOST_USER_GET_VRING_BASE] = VHOST_GET_VRING_BASE,
+	[VHOST_USER_SET_VRING_ADDR] = VHOST_SET_VRING_ADDR,
+	[VHOST_USER_SET_VRING_KICK] = VHOST_SET_VRING_KICK,
+	[VHOST_USER_SET_MEM_TABLE] = VHOST_SET_MEM_TABLE,
+	[VHOST_USER_SET_STATUS] = VHOST_VDPA_SET_STATUS,
+	[VHOST_USER_GET_STATUS] = VHOST_VDPA_GET_STATUS,
+	[VHOST_USER_SET_VRING_ENABLE] = VHOST_VDPA_SET_VRING_ENABLE,
+};
+
+/* no alignment requirement */
+struct vhost_iotlb_msg {
+	uint64_t iova;
+	uint64_t size;
+	uint64_t uaddr;
+#define VHOST_ACCESS_RO 0x1
+#define VHOST_ACCESS_WO 0x2
+#define VHOST_ACCESS_RW 0x3
+	uint8_t perm;
+#define VHOST_IOTLB_MISS 1
+#define VHOST_IOTLB_UPDATE 2
+#define VHOST_IOTLB_INVALIDATE 3
+#define VHOST_IOTLB_ACCESS_FAIL 4
+	uint8_t type;
+};
+
+#define VHOST_IOTLB_MSG_V2 0x2
+
+struct vhost_msg {
+	uint32_t type;
+	uint32_t reserved;
+	union {
+		struct vhost_iotlb_msg iotlb;
+		uint8_t padding[64];
+	};
+};
+
+static int
+vhost_vdpa_dma_map(struct virtio_user_dev *dev, void *addr,
+		   uint64_t iova, size_t len)
+{
+	struct vhost_msg msg = {};
+
+	msg.type = VHOST_IOTLB_MSG_V2;
+	msg.iotlb.type = VHOST_IOTLB_UPDATE;
+	msg.iotlb.iova = iova;
+	msg.iotlb.uaddr = (uint64_t)(uintptr_t)addr;
+	msg.iotlb.size = len;
+	msg.iotlb.perm = VHOST_ACCESS_RW;
+
+	if (write(dev->vhostfd, &msg, sizeof(msg)) != sizeof(msg)) {
+		PMD_DRV_LOG(ERR, "Failed to send IOTLB update (%s)",
+				strerror(errno));
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+vhost_vdpa_dma_unmap(struct virtio_user_dev *dev, __rte_unused void *addr,
+		     uint64_t iova, size_t len)
+{
+	struct vhost_msg msg = {};
+
+	msg.type = VHOST_IOTLB_MSG_V2;
+	msg.iotlb.type = VHOST_IOTLB_INVALIDATE;
+	msg.iotlb.iova = iova;
+	msg.iotlb.size = len;
+
+	if (write(dev->vhostfd, &msg, sizeof(msg)) != sizeof(msg)) {
+		PMD_DRV_LOG(ERR, "Failed to send IOTLB invalidate (%s)",
+				strerror(errno));
+		return -1;
+	}
+
+	return 0;
+}
+
+
+static int
+vhost_vdpa_map_contig(const struct rte_memseg_list *msl,
+		const struct rte_memseg *ms, size_t len, void *arg)
+{
+	struct virtio_user_dev *dev = arg;
+
+	if (msl->external)
+		return 0;
+
+	return vhost_vdpa_dma_map(dev, ms->addr, ms->iova, len);
+}
+
+static int
+vhost_vdpa_map(const struct rte_memseg_list *msl, const struct rte_memseg *ms,
+		void *arg)
+{
+	struct virtio_user_dev *dev = arg;
+
+	/* skip external memory that isn't a heap */
+	if (msl->external && !msl->heap)
+		return 0;
+
+	/* skip any segments with invalid IOVA addresses */
+	if (ms->iova == RTE_BAD_IOVA)
+		return 0;
+
+	/* if IOVA mode is VA, we've already mapped the internal segments */
+	if (!msl->external && rte_eal_iova_mode() == RTE_IOVA_VA)
+		return 0;
+
+	return vhost_vdpa_dma_map(dev, ms->addr, ms->iova, ms->len);
+}
+
+static int
+vhost_vdpa_dma_map_all(struct virtio_user_dev *dev)
+{
+	vhost_vdpa_dma_unmap(dev, NULL, 0, SIZE_MAX);
+
+	if (rte_eal_iova_mode() == RTE_IOVA_VA) {
+		/* with IOVA as VA mode, we can get away with mapping contiguous
+		 * chunks rather than going page-by-page.
+		 */
+		int ret = rte_memseg_contig_walk_thread_unsafe(
+				vhost_vdpa_map_contig, dev);
+		if (ret)
+			return ret;
+		/* we have to continue the walk because we've skipped the
+		 * external segments during the contig walk.
+		 */
+	}
+	return rte_memseg_walk_thread_unsafe(vhost_vdpa_map, dev);
+}
+
+/* with below features, vhost vdpa does not need to do the checksum and TSO,
+ * this info will be passed to virtio_user through virtio net header.
+ */
+#define VHOST_VDPA_GUEST_OFFLOADS_MASK	\
+	((1ULL << VIRTIO_NET_F_GUEST_CSUM) |	\
+	 (1ULL << VIRTIO_NET_F_GUEST_TSO4) |	\
+	 (1ULL << VIRTIO_NET_F_GUEST_TSO6) |	\
+	 (1ULL << VIRTIO_NET_F_GUEST_ECN)  |	\
+	 (1ULL << VIRTIO_NET_F_GUEST_UFO))
+
+#define VHOST_VDPA_HOST_OFFLOADS_MASK		\
+	((1ULL << VIRTIO_NET_F_HOST_TSO4) |	\
+	 (1ULL << VIRTIO_NET_F_HOST_TSO6) |	\
+	 (1ULL << VIRTIO_NET_F_CSUM))
+
+static int
+vhost_vdpa_ioctl(struct virtio_user_dev *dev,
+		   enum vhost_user_request req,
+		   void *arg)
+{
+	int ret = -1;
+	uint64_t req_vdpa;
+
+	PMD_DRV_LOG(INFO, "%s", vhost_msg_strings[req]);
+
+	req_vdpa = vhost_req_user_to_vdpa[req];
+
+	if (req_vdpa == VHOST_SET_MEM_TABLE)
+		return vhost_vdpa_dma_map_all(dev);
+
+	if (req_vdpa == VHOST_SET_FEATURES) {
+		/* WORKAROUND */
+		*(uint64_t *)arg |= 1ULL << VIRTIO_F_IOMMU_PLATFORM;
+
+		/* Multiqueue not supported for now */
+		*(uint64_t *)arg &= ~(1ULL << VIRTIO_NET_F_MQ);
+	}
+
+	switch (req_vdpa) {
+	case VHOST_SET_VRING_NUM:
+	case VHOST_SET_VRING_ADDR:
+	case VHOST_SET_VRING_BASE:
+	case VHOST_GET_VRING_BASE:
+	case VHOST_SET_VRING_KICK:
+	case VHOST_SET_VRING_CALL:
+		PMD_DRV_LOG(DEBUG, "vhostfd=%d, index=%u",
+			    dev->vhostfd, *(unsigned int *)arg);
+		break;
+	default:
+		break;
+	}
+
+	ret = ioctl(dev->vhostfd, req_vdpa, arg);
+	if (ret < 0)
+		PMD_DRV_LOG(ERR, "%s failed: %s",
+			    vhost_msg_strings[req], strerror(errno));
+
+	return ret;
+}
+
+/**
+ * Set up environment to talk with a vhost vdpa backend.
+ *
+ * @return
+ *   - (-1) if fail to set up;
+ *   - (>=0) if successful.
+ */
+static int
+vhost_vdpa_setup(struct virtio_user_dev *dev)
+{
+	uint32_t did = (uint32_t)-1;
+
+	dev->vhostfd = open(dev->path, O_RDWR);
+	if (dev->vhostfd < 0) {
+		PMD_DRV_LOG(ERR, "Failed to open %s: %s\n",
+				dev->path, strerror(errno));
+		return -1;
+	}
+
+	if (ioctl(dev->vhostfd, VHOST_VDPA_GET_DEVICE_ID, &did) < 0 ||
+			did != VIRTIO_ID_NETWORK) {
+		PMD_DRV_LOG(ERR, "Invalid vdpa device ID: %u\n", did);
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+vhost_vdpa_enable_queue_pair(struct virtio_user_dev *dev,
+			       uint16_t pair_idx,
+			       int enable)
+{
+	int i;
+
+	if (dev->qp_enabled[pair_idx] == enable)
+		return 0;
+
+	for (i = 0; i < 2; ++i) {
+		struct vhost_vring_state state = {
+			.index = pair_idx * 2 + i,
+			.num   = enable,
+		};
+
+		if (vhost_vdpa_ioctl(dev, VHOST_USER_SET_VRING_ENABLE, &state))
+			return -1;
+	}
+
+	dev->qp_enabled[pair_idx] = enable;
+
+	return 0;
+}
+
+struct virtio_user_backend_ops virtio_ops_vdpa = {
+	.setup = vhost_vdpa_setup,
+	.send_request = vhost_vdpa_ioctl,
+	.enable_qp = vhost_vdpa_enable_queue_pair,
+	.dma_map = vhost_vdpa_dma_map,
+	.dma_unmap = vhost_vdpa_dma_unmap,
+};
diff --git a/drivers/net/virtio/virtio_user/virtio_user_dev.c b/drivers/net/virtio/virtio_user/virtio_user_dev.c
index 63424656e3..3681758ef1 100644
--- a/drivers/net/virtio/virtio_user/virtio_user_dev.c
+++ b/drivers/net/virtio/virtio_user/virtio_user_dev.c
@@ -389,6 +389,12 @@ virtio_user_dev_setup(struct virtio_user_dev *dev)
 				dev->vhostfds[q] = -1;
 				dev->tapfds[q] = -1;
 			}
+		} else if (dev->backend_type ==
+				VIRTIO_USER_BACKEND_VHOST_VDPA) {
+			dev->ops = &virtio_ops_vdpa;
+		} else {
+			PMD_DRV_LOG(ERR, "Unknown backend type");
+			return -1;
 		}
 	}
 
-- 
2.26.2
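
Illustration only, not part of the patch: the DMA map/unmap path above sends one struct vhost_msg per request with a single write() on the vhost-vDPA fd. The standalone sketch below shows how the two message kinds are filled in, and that the padded union makes the struct 72 bytes on common Linux ABIs. The struct layout and constants are copied from the patch. The helper names iotlb_fill_update() and iotlb_fill_invalidate() are hypothetical, and no device is opened.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Constants and layout copied from the patch above. */
#define VHOST_ACCESS_RW		0x3
#define VHOST_IOTLB_UPDATE	2
#define VHOST_IOTLB_INVALIDATE	3
#define VHOST_IOTLB_MSG_V2	0x2

struct vhost_iotlb_msg {
	uint64_t iova;
	uint64_t size;
	uint64_t uaddr;
	uint8_t perm;
	uint8_t type;
};

struct vhost_msg {
	uint32_t type;
	uint32_t reserved;
	union {
		struct vhost_iotlb_msg iotlb;
		uint8_t padding[64];
	};
};

/* 8 bytes of header plus the 64-byte union (assumes a common Linux ABI). */
static_assert(sizeof(struct vhost_msg) == 72, "unexpected vhost_msg size");

/* Hypothetical helper: fill a map request, as vhost_vdpa_dma_map() does. */
static inline void
iotlb_fill_update(struct vhost_msg *msg, void *addr, uint64_t iova, size_t len)
{
	memset(msg, 0, sizeof(*msg));
	msg->type = VHOST_IOTLB_MSG_V2;
	msg->iotlb.type = VHOST_IOTLB_UPDATE;
	msg->iotlb.iova = iova;
	msg->iotlb.uaddr = (uint64_t)(uintptr_t)addr;
	msg->iotlb.size = len;
	msg->iotlb.perm = VHOST_ACCESS_RW;
}

/* Hypothetical helper: fill an unmap request; uaddr and perm stay zero. */
static inline void
iotlb_fill_invalidate(struct vhost_msg *msg, uint64_t iova, size_t len)
{
	memset(msg, 0, sizeof(*msg));
	msg->type = VHOST_IOTLB_MSG_V2;
	msg->iotlb.type = VHOST_IOTLB_INVALIDATE;
	msg->iotlb.iova = iova;
	msg->iotlb.size = len;
}
```

A caller would fill a message with one of the helpers and then pass &msg and sizeof(msg) to write() on the vhost-vDPA fd, treating any short write as an error, as the patch does.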