From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mailout2.w1.samsung.com (mailout2.w1.samsung.com
[210.118.77.12]) by dpdk.org (Postfix) with ESMTP id CDF505A26
for ; Mon, 11 Jan 2016 11:42:24 +0100 (CET)
Received: from eucpsbgm2.samsung.com (unknown [203.254.199.245])
by mailout2.w1.samsung.com
(Oracle Communications Messaging Server 7.0.5.31.0 64bit (built May 5 2014))
with ESMTP id <0O0S003ZKB2NUC90@mailout2.w1.samsung.com> for dev@dpdk.org;
Mon, 11 Jan 2016 10:42:23 +0000 (GMT)
X-AuditID: cbfec7f5-f79b16d000005389-fa-5693870e22ba
Received: from eusync3.samsung.com ( [203.254.199.213])
by eucpsbgm2.samsung.com (EUCPMTA) with SMTP id B8.84.21385.E0783965; Mon,
11 Jan 2016 10:42:22 +0000 (GMT)
Received: from fedinw7x64 ([106.109.131.169])
by eusync3.samsung.com (Oracle Communications Messaging Server 7.0.5.31.0
64bit (built May 5 2014))
with ESMTPA id <0O0S00BLFB2LGG40@eusync3.samsung.com>; Mon,
11 Jan 2016 10:42:22 +0000 (GMT)
From: Pavel Fedin
To: 'Jianfeng Tan' , dev@dpdk.org
References: <1446748276-132087-1-git-send-email-jianfeng.tan@intel.com>
<1452426182-86851-1-git-send-email-jianfeng.tan@intel.com>
<1452426182-86851-4-git-send-email-jianfeng.tan@intel.com>
In-reply-to: <1452426182-86851-4-git-send-email-jianfeng.tan@intel.com>
Date: Mon, 11 Jan 2016 13:42:21 +0300
Message-id: <056a01d14c5c$c0ad9a80$4208cf80$@samsung.com>
MIME-version: 1.0
Content-type: text/plain; charset=US-ASCII
Content-transfer-encoding: 7bit
X-Mailer: Microsoft Outlook 14.0
Thread-index: AQLOfZuJ4skKn5NxqR5aD7duH+7e0wGBr/PNAd1Vtd6c4MQqYA==
Content-language: ru
Cc: nakajima.yoshihiro@lab.ntt.co.jp, mst@redhat.com,
ann.zhuangyanying@huawei.com
Subject: Re: [dpdk-dev] [PATCH 3/4] virtio/vdev: add ways to interact with
vhost
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: patches and discussions about DPDK
X-List-Received-Date: Mon, 11 Jan 2016 10:42:25 -0000
Hello!
Please see my comments inline.
> -----Original Message-----
> From: Jianfeng Tan [mailto:jianfeng.tan@intel.com]
> Sent: Sunday, January 10, 2016 2:43 PM
> To: dev@dpdk.org
> Cc: rich.lane@bigswitch.com; yuanhan.liu@linux.intel.com; mst@redhat.com;
> nakajima.yoshihiro@lab.ntt.co.jp; huawei.xie@intel.com; mukawa@igel.co.jp;
> p.fedin@samsung.com; michael.qiu@intel.com; ann.zhuangyanying@huawei.com; Jianfeng Tan
> Subject: [PATCH 3/4] virtio/vdev: add ways to interact with vhost
>
> The backend depends on the type of the vhost file: vhost-user is used
> if the given path points to a unix socket; vhost-net is used if the
> given path points to a char device.
>
> NOTE: CONFIG_RTE_VIRTIO_VDEV is now kept undefined by default and needs
> to be uncommented when in use.
>
> Signed-off-by: Huawei Xie
> Signed-off-by: Jianfeng Tan
> ---
> config/common_linuxapp | 5 +
> drivers/net/virtio/Makefile | 4 +
> drivers/net/virtio/vhost.c | 734 +++++++++++++++++++++++++++++++++++++
> drivers/net/virtio/vhost.h | 192 ++++++++++
> drivers/net/virtio/virtio_ethdev.h | 5 +-
> drivers/net/virtio/virtio_pci.h | 52 ++-
> 6 files changed, 990 insertions(+), 2 deletions(-)
> create mode 100644 drivers/net/virtio/vhost.c
> create mode 100644 drivers/net/virtio/vhost.h
>
> diff --git a/config/common_linuxapp b/config/common_linuxapp
> index 74bc515..f76e162 100644
> --- a/config/common_linuxapp
> +++ b/config/common_linuxapp
> @@ -534,3 +534,8 @@ CONFIG_RTE_APP_TEST=y
> CONFIG_RTE_TEST_PMD=y
> CONFIG_RTE_TEST_PMD_RECORD_CORE_CYCLES=n
> CONFIG_RTE_TEST_PMD_RECORD_BURST_STATS=n
> +
> +#
> +# Enable virtio support for container
> +#
> +CONFIG_RTE_VIRTIO_VDEV=y
> diff --git a/drivers/net/virtio/Makefile b/drivers/net/virtio/Makefile
> index 43835ba..0877023 100644
> --- a/drivers/net/virtio/Makefile
> +++ b/drivers/net/virtio/Makefile
> @@ -52,6 +52,10 @@ SRCS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += virtio_rxtx.c
> SRCS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += virtio_ethdev.c
> SRCS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += virtio_rxtx_simple.c
>
> +ifeq ($(CONFIG_RTE_VIRTIO_VDEV),y)
> + SRCS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += vhost.c
> +endif
> +
> # this lib depends upon:
> DEPDIRS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += lib/librte_eal lib/librte_ether
> DEPDIRS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += lib/librte_mempool lib/librte_mbuf
> diff --git a/drivers/net/virtio/vhost.c b/drivers/net/virtio/vhost.c
> new file mode 100644
> index 0000000..e423e02
> --- /dev/null
> +++ b/drivers/net/virtio/vhost.c
> @@ -0,0 +1,734 @@
> +/*-
> + * BSD LICENSE
> + *
> + * Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
> + * All rights reserved.
> + *
> + * Redistribution and use in source and binary forms, with or without
> + * modification, are permitted provided that the following conditions
> + * are met:
> + *
> + * * Redistributions of source code must retain the above copyright
> + * notice, this list of conditions and the following disclaimer.
> + * * Redistributions in binary form must reproduce the above copyright
> + * notice, this list of conditions and the following disclaimer in
> + * the documentation and/or other materials provided with the
> + * distribution.
> + * * Neither the name of Intel Corporation nor the names of its
> + * contributors may be used to endorse or promote products derived
> + * from this software without specific prior written permission.
> + *
> + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> + */
> +
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +
> +#include
> +#include
> +#include
> +
> +#include "virtio_pci.h"
> +#include "virtio_logs.h"
> +#include "virtio_ethdev.h"
> +#include "virtqueue.h"
> +#include "vhost.h"
> +
> +static int
> +vhost_user_write(int fd, void *buf, int len, int *fds, int fd_num)
> +{
> + struct msghdr msgh;
> + struct iovec iov;
> + int r;
> +
> + size_t fd_size = fd_num * sizeof(int);
> + char control[CMSG_SPACE(fd_size)];
> + struct cmsghdr *cmsg;
> +
> + memset(&msgh, 0, sizeof(msgh));
> + memset(control, 0, sizeof(control));
> +
> + iov.iov_base = (uint8_t *)buf;
> + iov.iov_len = len;
> +
> + msgh.msg_iov = &iov;
> + msgh.msg_iovlen = 1;
> +
> + msgh.msg_control = control;
> + msgh.msg_controllen = sizeof(control);
> +
> + cmsg = CMSG_FIRSTHDR(&msgh);
> +
> + cmsg->cmsg_len = CMSG_LEN(fd_size);
> + cmsg->cmsg_level = SOL_SOCKET;
> + cmsg->cmsg_type = SCM_RIGHTS;
> + memcpy(CMSG_DATA(cmsg), fds, fd_size);
> +
> + do {
> + r = sendmsg(fd, &msgh, 0);
> + } while (r < 0 && errno == EINTR);
> +
> + return r;
> +}
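To illustrate the descriptor-passing technique used by vhost_user_write() above, here is a standalone sketch of sending and receiving file descriptors with SCM_RIGHTS. It is an illustration under assumptions, not the patch's code: `send_with_fds`, `recv_with_fd`, `union fd_control` and `MAX_FDS` are hypothetical names, the control buffer is fixed-size rather than a VLA, and the control message is attached only when there is at least one descriptor to pass.

```c
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

#define MAX_FDS 8 /* assumed cap, same value as VHOST_MEMORY_MAX_NREGIONS */

/* Control buffer sized for MAX_FDS descriptors, aligned for struct cmsghdr. */
union fd_control {
	char buf[CMSG_SPACE(MAX_FDS * sizeof(int))];
	struct cmsghdr align;
};

/* Send len bytes; attach fd_num descriptors only when fd_num > 0. */
static ssize_t
send_with_fds(int sock, void *buf, size_t len, const int *fds, int fd_num)
{
	struct msghdr msgh;
	struct iovec iov;
	union fd_control control;
	ssize_t r;

	if (fd_num < 0 || fd_num > MAX_FDS)
		return -1;

	memset(&msgh, 0, sizeof(msgh));
	memset(&control, 0, sizeof(control));
	iov.iov_base = buf;
	iov.iov_len = len;
	msgh.msg_iov = &iov;
	msgh.msg_iovlen = 1;

	if (fd_num > 0) {
		size_t fd_size = (size_t)fd_num * sizeof(int);
		struct cmsghdr *cmsg;

		msgh.msg_control = control.buf;
		msgh.msg_controllen = CMSG_SPACE(fd_size);
		cmsg = CMSG_FIRSTHDR(&msgh);
		cmsg->cmsg_len = CMSG_LEN(fd_size);
		cmsg->cmsg_level = SOL_SOCKET;
		cmsg->cmsg_type = SCM_RIGHTS;
		memcpy(CMSG_DATA(cmsg), fds, fd_size);
	}

	do {
		r = sendmsg(sock, &msgh, 0);
	} while (r < 0 && errno == EINTR);
	return r;
}

/* Receive up to len bytes and, if present, the first passed descriptor. */
static ssize_t
recv_with_fd(int sock, void *buf, size_t len, int *fd_out)
{
	struct msghdr msgh;
	struct iovec iov;
	union fd_control control;
	struct cmsghdr *cmsg;
	ssize_t r;

	*fd_out = -1;
	memset(&msgh, 0, sizeof(msgh));
	iov.iov_base = buf;
	iov.iov_len = len;
	msgh.msg_iov = &iov;
	msgh.msg_iovlen = 1;
	msgh.msg_control = control.buf;
	msgh.msg_controllen = sizeof(control.buf);

	r = recvmsg(sock, &msgh, 0);
	if (r < 0)
		return r;

	for (cmsg = CMSG_FIRSTHDR(&msgh); cmsg; cmsg = CMSG_NXTHDR(&msgh, cmsg)) {
		if (cmsg->cmsg_level == SOL_SOCKET && cmsg->cmsg_type == SCM_RIGHTS) {
			memcpy(fd_out, CMSG_DATA(cmsg), sizeof(int));
			break;
		}
	}
	return r;
}
```

The receiver gets its own descriptor number referring to the same open file, which is how a vhost-user backend can mmap hugepage files opened by the frontend.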
> +
> +static int
> +vhost_user_read(int fd, VhostUserMsg *msg)
> +{
> + uint32_t valid_flags = VHOST_USER_REPLY_MASK | VHOST_USER_VERSION;
> + int ret, sz_hdr = VHOST_USER_HDR_SIZE, sz_payload;
> +
> + ret = recv(fd, (void *)msg, sz_hdr, 0);
> + if (ret < sz_hdr) {
> + PMD_DRV_LOG(ERR, "Failed to recv msg hdr: %d instead of %d.",
> + ret, sz_hdr);
> + goto fail;
> + }
> +
> + /* validate msg flags */
> + if (msg->flags != (valid_flags)) {
> + PMD_DRV_LOG(ERR, "Failed to recv msg: flags 0x%x instead of 0x%x.",
> + msg->flags, valid_flags);
> + goto fail;
> + }
> +
> + sz_payload = msg->size;
> + if (sz_payload) {
> + ret = recv(fd, (void *)((uint8_t *)msg + sz_hdr), sz_payload, 0);
> + if (ret < sz_payload) {
> + PMD_DRV_LOG(ERR, "Failed to recv msg payload: %d instead of %d.",
> + ret, msg->size);
> + goto fail;
> + }
> + }
> +
> + return 0;
> +
> +fail:
> + return -1;
> +}
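On a stream socket a single recv() may legitimately return fewer bytes than requested, so vhost_user_read() above can report a failure on a short read. One common way to handle this is a read-until-complete loop. The sketch below assumes a blocking socket; `recv_all` is a hypothetical helper name, not something in the patch.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Read exactly len bytes. Returns 0 on success, -1 on error or early EOF. */
static int
recv_all(int fd, void *buf, size_t len)
{
	char *p = buf;

	while (len > 0) {
		ssize_t r = recv(fd, p, len, 0);

		if (r < 0) {
			if (errno == EINTR)
				continue; /* interrupted, retry */
			return -1;
		}
		if (r == 0)
			return -1; /* peer closed before the full message arrived */
		p += r;
		len -= (size_t)r;
	}
	return 0;
}
```

With such a helper, the header and the payload could each be read with one call. The payload size taken from the header would still need to be checked against the size of the receive buffer before it is used.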
> +
> +static VhostUserMsg m __attribute__ ((unused));
> +
> +static void
> +prepare_vhost_memory_user(VhostUserMsg *msg, int fds[])
> +{
> + int i, num;
> + struct back_file *huges;
> + struct vhost_memory_region *mr;
> +
> + num = rte_eal_get_backfile_info(&huges);
> +
> + if (num > VHOST_MEMORY_MAX_NREGIONS)
> + rte_panic("%d hugepage files exceed the maximum of %d for "
> + "vhost-user\n", num, VHOST_MEMORY_MAX_NREGIONS);
> +
> + for (i = 0; i < num; ++i) {
> + mr = &msg->payload.memory.regions[i];
> + mr->guest_phys_addr = (uint64_t)huges[i].addr; /* use vaddr! */
> + mr->userspace_addr = (uint64_t)huges[i].addr;
> + mr->memory_size = huges[i].size;
> + mr->mmap_offset = 0;
> + fds[i] = open(huges[i].filepath, O_RDWR);
> + }
> +
> + msg->payload.memory.nregions = num;
> + msg->payload.memory.padding = 0;
> + free(huges);
> +}
> +
> +static int
> +vhost_user_sock(struct virtio_hw *hw, unsigned long int req, void *arg)
> +{
> + VhostUserMsg msg;
> + struct vhost_vring_file *file = 0;
> + int need_reply = 0;
> + int fds[VHOST_MEMORY_MAX_NREGIONS];
> + int fd_num = 0;
> + int i, len;
> +
> + msg.request = req;
> + msg.flags = VHOST_USER_VERSION;
> + msg.size = 0;
> +
> + switch (req) {
> + case VHOST_USER_GET_FEATURES:
> + need_reply = 1;
> + break;
> +
> + case VHOST_USER_SET_FEATURES:
> + case VHOST_USER_SET_LOG_BASE:
> + msg.payload.u64 = *((__u64 *)arg);
> + msg.size = sizeof(m.payload.u64);
> + break;
> +
> + case VHOST_USER_SET_OWNER:
> + case VHOST_USER_RESET_OWNER:
> + break;
> +
> + case VHOST_USER_SET_MEM_TABLE:
> + prepare_vhost_memory_user(&msg, fds);
> + fd_num = msg.payload.memory.nregions;
> + msg.size = sizeof(m.payload.memory.nregions);
> + msg.size += sizeof(m.payload.memory.padding);
> + msg.size += fd_num * sizeof(struct vhost_memory_region);
> + break;
> +
> + case VHOST_USER_SET_LOG_FD:
> + fds[fd_num++] = *((int *)arg);
> + break;
> +
> + case VHOST_USER_SET_VRING_NUM:
> + case VHOST_USER_SET_VRING_BASE:
> + memcpy(&msg.payload.state, arg, sizeof(struct vhost_vring_state));
> + msg.size = sizeof(m.payload.state);
> + break;
> +
> + case VHOST_USER_GET_VRING_BASE:
> + memcpy(&msg.payload.state, arg, sizeof(struct vhost_vring_state));
> + msg.size = sizeof(m.payload.state);
> + need_reply = 1;
> + break;
> +
> + case VHOST_USER_SET_VRING_ADDR:
> + memcpy(&msg.payload.addr, arg, sizeof(struct vhost_vring_addr));
> + msg.size = sizeof(m.payload.addr);
> + break;
> +
> + case VHOST_USER_SET_VRING_KICK:
> + case VHOST_USER_SET_VRING_CALL:
> + case VHOST_USER_SET_VRING_ERR:
> + file = arg;
> + msg.payload.u64 = file->index & VHOST_USER_VRING_IDX_MASK;
> + msg.size = sizeof(m.payload.u64);
> + if (file->fd > 0)
> + fds[fd_num++] = file->fd;
> + else
> + msg.payload.u64 |= VHOST_USER_VRING_NOFD_MASK;
> + break;
> +
> + default:
> + PMD_DRV_LOG(ERR, "vhost-user trying to send unhandled msg type");
> + return -1;
> + }
> +
> + len = VHOST_USER_HDR_SIZE + msg.size;
> + if (vhost_user_write(hw->vhostfd, &msg, len, fds, fd_num) < 0)
> + return -1;
> +
> + if (req == VHOST_USER_SET_MEM_TABLE)
> + for (i = 0; i < fd_num; ++i)
> + close(fds[i]);
> +
> + if (need_reply) {
> + if (vhost_user_read(hw->vhostfd, &msg) < 0)
> + return -1;
> +
> + if (req != msg.request) {
> + PMD_DRV_LOG(ERR, "Received unexpected msg type.");
> + return -1;
> + }
> +
> + switch (req) {
> + case VHOST_USER_GET_FEATURES:
> + if (msg.size != sizeof(m.payload.u64)) {
> + PMD_DRV_LOG(ERR, "Received bad msg size.");
> + return -1;
> + }
> + *((__u64 *)arg) = msg.payload.u64;
> + break;
> + case VHOST_USER_GET_VRING_BASE:
> + if (msg.size != sizeof(m.payload.state)) {
> + PMD_DRV_LOG(ERR, "Received bad msg size.");
> + return -1;
> + }
> + memcpy(arg, &msg.payload.state, sizeof(struct vhost_vring_state));
> + break;
> + default:
> + PMD_DRV_LOG(ERR, "Received unexpected msg type.");
> + return -1;
> + }
> + }
> +
> + return 0;
> +}
> +
> +static int
> +vhost_kernel_ioctl(struct virtio_hw *hw, unsigned long int req, void *arg)
> +{
> + return ioctl(hw->vhostfd, req, arg);
> +}
> +
> +enum {
> + VHOST_MSG_SET_OWNER,
> + VHOST_MSG_SET_FEATURES,
> + VHOST_MSG_GET_FEATURES,
> + VHOST_MSG_SET_VRING_CALL,
> + VHOST_MSG_SET_VRING_NUM,
> + VHOST_MSG_SET_VRING_BASE,
> + VHOST_MSG_SET_VRING_ADDR,
> + VHOST_MSG_SET_VRING_KICK,
> + VHOST_MSG_SET_MEM_TABLE,
> + VHOST_MSG_MAX,
> +};
> +
> +static const char *vhost_msg_strings[] = {
> + "VHOST_MSG_SET_OWNER",
> + "VHOST_MSG_SET_FEATURES",
> + "VHOST_MSG_GET_FEATURES",
> + "VHOST_MSG_SET_VRING_CALL",
> + "VHOST_MSG_SET_VRING_NUM",
> + "VHOST_MSG_SET_VRING_BASE",
> + "VHOST_MSG_SET_VRING_ADDR",
> + "VHOST_MSG_SET_VRING_KICK",
> + "VHOST_MSG_SET_MEM_TABLE",
> + NULL,
> +};
> +
> +static unsigned long int vhost_req_map[][2] = {
> + {VHOST_SET_OWNER, VHOST_USER_SET_OWNER},
> + {VHOST_SET_FEATURES, VHOST_USER_SET_FEATURES},
> + {VHOST_GET_FEATURES, VHOST_USER_GET_FEATURES},
> + {VHOST_SET_VRING_CALL, VHOST_USER_SET_VRING_CALL},
> + {VHOST_SET_VRING_NUM, VHOST_USER_SET_VRING_NUM},
> + {VHOST_SET_VRING_BASE, VHOST_USER_SET_VRING_BASE},
> + {VHOST_SET_VRING_ADDR, VHOST_USER_SET_VRING_ADDR},
> + {VHOST_SET_VRING_KICK, VHOST_USER_SET_VRING_KICK},
> + {VHOST_SET_MEM_TABLE, VHOST_USER_SET_MEM_TABLE},
> +};
> +
> +static int
> +vhost_call(struct virtio_hw *hw, unsigned long int req_orig, void *arg)
> +{
> + int req_new;
> + int ret = 0;
> +
> + if (req_orig >= VHOST_MSG_MAX)
> + rte_panic("invalid req: %lu\n", req_orig);
> +
> + PMD_DRV_LOG(INFO, "%s\n", vhost_msg_strings[req_orig]);
> + req_new = vhost_req_map[req_orig][hw->type];
> + if (hw->type == VHOST_USER)
> + ret = vhost_user_sock(hw, req_new, arg);
> + else
> + ret = vhost_kernel_ioctl(hw, req_new, arg);
> +
> + if (ret < 0)
> + rte_panic("vhost_call %s failed: %s\n",
> + vhost_msg_strings[req_orig], strerror(errno));
> +
> + return ret;
> +}
> +
> +static void
> +kick_one_vq(struct virtio_hw *hw, struct virtqueue *vq, unsigned queue_sel)
> +{
> + int callfd, kickfd;
> + struct vhost_vring_file file;
> +
> + /* An invalid fd could disable the call path, but vhost-dpdk uses it to
> + * judge whether the device is alive, so two real eventfds are needed.
> + */
> + /* Of all per-virtqueue messages, VHOST_SET_VRING_CALL must come
> + * first because vhost depends on this message to allocate the
> + * virtqueue pair.
> + */
> + callfd = eventfd(0, O_CLOEXEC | O_NONBLOCK);
> + if (callfd < 0)
> + rte_panic("callfd error, %s\n", strerror(errno));
> +
> + file.index = queue_sel;
> + file.fd = callfd;
> + vhost_call(hw, VHOST_MSG_SET_VRING_CALL, &file);
> + hw->callfds[queue_sel] = callfd;
> +
> + struct vhost_vring_state state;
> + state.index = queue_sel;
> + state.num = vq->vq_ring.num;
> + vhost_call(hw, VHOST_MSG_SET_VRING_NUM, &state);
> +
> + state.num = 0; /* no reservation */
> + vhost_call(hw, VHOST_MSG_SET_VRING_BASE, &state);
> +
> + struct vhost_vring_addr addr = {
> + .index = queue_sel,
> + .desc_user_addr = (uint64_t)vq->vq_ring.desc,
> + .avail_user_addr = (uint64_t)vq->vq_ring.avail,
> + .used_user_addr = (uint64_t)vq->vq_ring.used,
> + .log_guest_addr = 0,
> + .flags = 0, /* disable log */
> + };
> + vhost_call(hw, VHOST_MSG_SET_VRING_ADDR, &addr);
> +
> + /* Of all per-virtqueue messages, VHOST_SET_VRING_KICK must come
> + * last because vhost depends on this message to judge whether
> + * virtio_is_ready().
> + */
> +
> + kickfd = eventfd(0, O_CLOEXEC | O_NONBLOCK);
> + if (kickfd < 0)
> + rte_panic("kickfd error, %s\n", strerror(errno));
> +
> + file.fd = kickfd;
> + vhost_call(hw, VHOST_MSG_SET_VRING_KICK, &file);
> + hw->kickfds[queue_sel] = kickfd;
> +}
> +
> +/**
> + * Merge those virtually adjacent memsegs into one region.
> + */
> +static void
> +prepare_vhost_memory_kernel(struct vhost_memory_kernel **p_vm)
> +{
> + unsigned i, j, k = 0;
> + struct rte_memseg *seg;
> + struct vhost_memory_region *mr;
> + struct vhost_memory_kernel *vm;
> +
> + vm = malloc(sizeof(struct vhost_memory_kernel)
> + + RTE_MAX_MEMSEG * sizeof(struct vhost_memory_region));
> +
> + for (i = 0; i < RTE_MAX_MEMSEG; ++i) {
> + seg = &rte_eal_get_configuration()->mem_config->memseg[i];
> + if (seg->addr == NULL)
> + break;
> +
> + int new_region = 1;
> + for (j = 0; j < k; ++j) {
> + mr = &vm->regions[j];
> +
> + if (mr->userspace_addr + mr->memory_size
> + == (uint64_t)seg->addr) {
> + mr->memory_size += seg->len;
> + new_region = 0;
> + break;
> + }
> +
> + if ((uint64_t)seg->addr + seg->len
> + == mr->userspace_addr) {
> + mr->guest_phys_addr = (uint64_t)seg->addr;
> + mr->userspace_addr = (uint64_t)seg->addr;
> + mr->memory_size += seg->len;
> + new_region = 0;
> + break;
> + }
> + }
> +
> + if (new_region == 0)
> + continue;
> +
> + mr = &vm->regions[k++];
> + mr->guest_phys_addr = (uint64_t)seg->addr; /* use vaddr here! */
> + mr->userspace_addr = (uint64_t)seg->addr;
> + mr->memory_size = seg->len;
> + mr->mmap_offset = 0;
> + }
> +
> + vm->nregions = k;
> + vm->padding = 0;
> + *p_vm = vm;
> +}
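The merging step in prepare_vhost_memory_kernel() above can be shown in isolation. The following is a simplified sketch with a hypothetical `struct region` and helper `merge_adjacent`, not the patch's data structures: each incoming segment is either appended to the end of an existing region, prepended to its start, or stored as a new region.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical region descriptor: start address and size in bytes. */
struct region {
	uint64_t addr;
	uint64_t size;
};

/*
 * Merge virtually adjacent segments into regions.
 * out must have room for n entries; returns the number of regions written.
 */
static size_t
merge_adjacent(const struct region *segs, size_t n, struct region *out)
{
	size_t i, j, k = 0;

	for (i = 0; i < n; ++i) {
		int merged = 0;

		for (j = 0; j < k; ++j) {
			/* segment starts exactly where region j ends */
			if (out[j].addr + out[j].size == segs[i].addr) {
				out[j].size += segs[i].size;
				merged = 1;
				break;
			}
			/* segment ends exactly where region j starts */
			if (segs[i].addr + segs[i].size == out[j].addr) {
				out[j].addr = segs[i].addr;
				out[j].size += segs[i].size;
				merged = 1;
				break;
			}
		}
		if (!merged)
			out[k++] = segs[i];
	}
	return k;
}
```

Note that a single pass like this does not join two already-stored regions when a later segment bridges the gap between them, so the result may depend on the order of the input.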
> +
> +static void kick_all_vq(struct virtio_hw *hw)
> +{
> + int ret;
> + unsigned i, queue_sel, nvqs;
> + struct rte_eth_dev_data *data = hw->data;
> +
> + if (hw->type == VHOST_KERNEL) {
> + struct vhost_memory_kernel *vm = NULL;
> + prepare_vhost_memory_kernel(&vm);
> + vhost_call(hw, VHOST_MSG_SET_MEM_TABLE, vm);
> + free(vm);
> + } else {
> + /* construct vhost_memory inside prepare_vhost_memory_user() */
> + vhost_call(hw, VHOST_MSG_SET_MEM_TABLE, NULL);
> + }
> +
> + for (i = 0; i < data->nb_rx_queues; ++i) {
> + queue_sel = 2 * i + VTNET_SQ_RQ_QUEUE_IDX;
> + kick_one_vq(hw, data->rx_queues[i], queue_sel);
> + }
> + for (i = 0; i < data->nb_tx_queues; ++i) {
> + queue_sel = 2 * i + VTNET_SQ_TQ_QUEUE_IDX;
> + kick_one_vq(hw, data->tx_queues[i], queue_sel);
> + }
> +
> + /* After setting up all virtqueues, call set_features again
> + * so that these features are applied to each virtqueue on
> + * the vhost side.
> + */
> + uint64_t features = hw->guest_features;
> + features &= ~(1ull << VIRTIO_NET_F_MAC);
> + vhost_call(hw, VHOST_MSG_SET_FEATURES, &features);
> + if (ioctl(hw->backfd, TUNSETVNETHDRSZ, &hw->vtnet_hdr_size) == -1)
> + rte_panic("TUNSETVNETHDRSZ failed: %s\n", strerror(errno));
> + PMD_DRV_LOG(INFO, "set features:%"PRIx64"\n", features);
> +
> + if (hw->type == VHOST_KERNEL) {
> + struct vhost_vring_file file;
> +
> + file.fd = hw->backfd;
> + nvqs = data->nb_rx_queues + data->nb_tx_queues;
> + for (file.index = 0; file.index < nvqs; ++file.index) {
> + ret = vhost_kernel_ioctl(hw, VHOST_NET_SET_BACKEND, &file);
> + if (ret < 0)
> + rte_panic("VHOST_NET_SET_BACKEND failed, %s\n",
> + strerror(errno));
> + }
> + }
> +
> + /* TODO: VHOST_SET_LOG_BASE */
> +}
> +
> +void
> +virtio_ioport_write(struct virtio_hw *hw, uint64_t addr, uint32_t val)
> +{
> + uint64_t guest_features;
> +
> + switch (addr) {
> + case VIRTIO_PCI_GUEST_FEATURES:
> + guest_features = val;
> + guest_features &= ~(1ull << VIRTIO_NET_F_MAC);
> + vhost_call(hw, VHOST_MSG_SET_FEATURES, &guest_features);
> + break;
> + case VIRTIO_PCI_QUEUE_PFN:
> + /* do nothing */
> + break;
> + case VIRTIO_PCI_QUEUE_SEL:
> + hw->queue_sel = val;
> + break;
> + case VIRTIO_PCI_STATUS:
> + if (val & VIRTIO_CONFIG_S_DRIVER_OK)
> + kick_all_vq(hw);
> + hw->status = val & 0xFF;
> + break;
> + case VIRTIO_PCI_QUEUE_NOTIFY:
> + {
> + int ret;
> + uint64_t buf = 1;
> + ret = write(hw->kickfds[val], &buf, sizeof(uint64_t));
> + if (ret == -1)
> + rte_panic("VIRTIO_PCI_QUEUE_NOTIFY failed: %s\n",
> + strerror(errno));
> + break;
> + }
> + default:
> + PMD_DRV_LOG(ERR, "unexpected address %"PRIu64" value 0x%x\n",
> + addr, val);
> + break;
> + }
> +}
> +
> +uint32_t
> +virtio_ioport_read(struct virtio_hw *hw, uint64_t addr)
> +{
> + uint32_t ret = 0xFFFFFFFF;
> + uint64_t host_features;
> +
> + PMD_DRV_LOG(INFO, "addr: %"PRIu64"\n", addr);
> +
> + switch (addr) {
> + case VIRTIO_PCI_HOST_FEATURES:
> + vhost_call(hw, VHOST_MSG_GET_FEATURES, &host_features);
> + PMD_DRV_LOG(INFO, "get_features: %"PRIx64"\n", host_features);
> + if (hw->mac_specified)
> + host_features |= (1ull << VIRTIO_NET_F_MAC);
> + /* disable it until we support CQ */
> + host_features &= ~(1ull << VIRTIO_NET_F_CTRL_RX);
> + ret = host_features;
> + break;
> + case VIRTIO_PCI_GUEST_FEATURES:
> + ret = hw->guest_features;
> + break;
> + case VIRTIO_PCI_QUEUE_NUM:
> + ret = hw->queue_num;
> + break;
> + case VIRTIO_PCI_QUEUE_SEL:
> + ret = hw->queue_sel;
> + break;
> + case VIRTIO_PCI_STATUS:
> + ret = hw->status;
> + break;
> + case 20: /* mac addr: 0~3 */
> + if (hw->mac_specified) {
> + uint32_t m0 = hw->mac_addr[0],
> + m1 = hw->mac_addr[1],
> + m2 = hw->mac_addr[2],
> + m3 = hw->mac_addr[3];
> + ret = (m3 << 24) | (m2 << 16) | (m1 << 8) | m0;
> + }
> + break;
> + case 24: /* mac addr: 4~5 */
> + if (hw->mac_specified) {
> + uint32_t m4 = hw->mac_addr[4],
> + m5 = hw->mac_addr[5];
> + ret = (m5 << 8) | m4;
> + }
> + break;
> + default:
> + PMD_DRV_LOG(ERR, "%"PRIu64" (r) not supported\n", addr);
> + break;
> + }
> +
> + return ret;
> +}
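The two MAC cases in virtio_ioport_read() above (offsets 20 and 24) pack the six address bytes into two 32-bit reads, with the lowest-numbered byte in the least significant bits. A small sketch of the same packing, using hypothetical helper names `mac_lo32` and `mac_hi16`:

```c
#include <stdint.h>

/* Bytes 0..3 of the MAC, byte 0 in bits 0..7. */
static uint32_t
mac_lo32(const uint8_t mac[6])
{
	return ((uint32_t)mac[3] << 24) | ((uint32_t)mac[2] << 16) |
	       ((uint32_t)mac[1] << 8) | (uint32_t)mac[0];
}

/* Bytes 4..5 of the MAC, byte 4 in bits 0..7; upper 16 bits are zero. */
static uint32_t
mac_hi16(const uint8_t mac[6])
{
	return ((uint32_t)mac[5] << 8) | (uint32_t)mac[4];
}
```

For the address 00:11:22:33:44:55 this gives 0x33221100 for the first read and 0x00005544 for the second.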
> +
> +#define TUN_DEF_SNDBUF (1ull << 20)
> +
> +static void
> +vhost_kernel_backend_setup(struct virtio_hw *hw)
> +{
> + int fd;
> + int len = sizeof(struct virtio_net_hdr);
> + int req_mq = 0;
> + int sndbuf = TUN_DEF_SNDBUF;
> + unsigned int features;
> + struct ifreq ifr;
> +
> + /* TODO:
> + * 1. get and set offload capability, tap_probe_has_ufo, tap_fd_set_offload
> + * 2. verify we can get and set vnet_hdr_len, tap_probe_vnet_hdr_len
> +
> + * 3. get number of memory regions from vhost module parameter
> + * max_mem_regions, supported in newer version linux kernel
> + */
> +
> + fd = open(PATH_NET_TUN, O_RDWR);
> + if (fd < 0)
> + rte_panic("open %s error, %s\n", PATH_NET_TUN, strerror(errno));
> +
> + memset(&ifr, 0, sizeof(ifr));
> + ifr.ifr_flags = IFF_TAP | IFF_NO_PI;
> +
> + if (ioctl(fd, TUNGETFEATURES, &features) == -1)
> + rte_panic("TUNGETFEATURES failed: %s", strerror(errno));
> +
> + if (features & IFF_ONE_QUEUE)
> + ifr.ifr_flags |= IFF_ONE_QUEUE;
> +
> + if (features & IFF_VNET_HDR)
> + ifr.ifr_flags |= IFF_VNET_HDR;
> + else
> + rte_panic("vnet_hdr requested, but kernel does not support\n");
> +
> + if (req_mq) {
> + if (features & IFF_MULTI_QUEUE)
> + ifr.ifr_flags |= IFF_MULTI_QUEUE;
> + else
> + rte_panic("multiqueue requested, but kernel does not support\n");
> + }
> +
> + strncpy(ifr.ifr_name, "tap%d", IFNAMSIZ);
> + if (ioctl(fd, TUNSETIFF, (void *) &ifr) == -1)
> + rte_panic("TUNSETIFF failed: %s", strerror(errno));
> + fcntl(fd, F_SETFL, O_NONBLOCK);
> +
> + if (ioctl(fd, TUNSETVNETHDRSZ, &len) == -1)
> + rte_panic("TUNSETVNETHDRSZ failed: %s\n", strerror(errno));
> +
> + if (ioctl(fd, TUNSETSNDBUF, &sndbuf) == -1)
> + rte_panic("TUNSETSNDBUF failed: %s", strerror(errno));
> +
> + hw->backfd = fd;
> +
> + hw->vhostfd = open(hw->path, O_RDWR);
> + if (hw->vhostfd == -1)
> + rte_panic("open %s failed: %s\n", hw->path, strerror(errno));
> +}
> +
> +static void
> +vhost_user_backend_setup(struct virtio_hw *hw)
> +{
> + int fd;
> + int flag;
> + struct sockaddr_un un;
> +
> + fd = socket(AF_UNIX, SOCK_STREAM, 0);
> + if (fd < 0)
> + rte_panic("socket error, %s\n", strerror(errno));
> +
> + flag = fcntl(fd, F_GETFD);
> + fcntl(fd, F_SETFD, flag | FD_CLOEXEC);
> +
> + memset(&un, 0, sizeof(un));
> + un.sun_family = AF_UNIX;
> + snprintf(un.sun_path, sizeof(un.sun_path), "%s", hw->path);
> + if (connect(fd, (struct sockaddr *)&un, sizeof(un)) < 0) {
> + PMD_DRV_LOG(ERR, "connect error, %s\n", strerror(errno));
> + exit(-1);
> + }
> +
> + hw->vhostfd = fd;
> +}
> +
> +void
> +virtio_vdev_init(struct rte_eth_dev_data *data, const char *path,
> + int nb_rx, int nb_tx, int nb_cq __attribute__ ((unused)),
> + int queue_num, char *mac)
> +{
> + int i;
> + int ret;
> + struct stat s;
> + uint32_t tmp[ETHER_ADDR_LEN];
> + struct virtio_hw *hw = data->dev_private;
> +
> + hw->data = data;
> + hw->path = strdup(path);
> + hw->max_rx_queues = nb_rx;
> + hw->max_tx_queues = nb_tx;
> + hw->queue_num = queue_num;
> + hw->mac_specified = 0;
> + if (mac) {
> + ret = sscanf(mac, "%x:%x:%x:%x:%x:%x", &tmp[0], &tmp[1],
> + &tmp[2], &tmp[3], &tmp[4], &tmp[5]);
> + if (ret == ETHER_ADDR_LEN) {
> + for (i = 0; i < ETHER_ADDR_LEN; ++i)
> + hw->mac_addr[i] = (uint8_t)tmp[i];
> + hw->mac_specified = 1;
> + }
> + }
> +
> + /* TODO: cq */
> +
> + ret = stat(hw->path, &s);
> + if (ret < 0)
> + rte_panic("stat: %s failed, %s\n", hw->path, strerror(errno));
> +
> + switch (s.st_mode & S_IFMT) {
> + case S_IFCHR:
> + hw->type = VHOST_KERNEL;
> + vhost_kernel_backend_setup(hw);
> + break;
> + case S_IFSOCK:
> + hw->type = VHOST_USER;
> + vhost_user_backend_setup(hw);
> + break;
> + default:
> + rte_panic("unknown file type of %s\n", hw->path);
> + }
> + if (vhost_call(hw, VHOST_MSG_SET_OWNER, NULL) == -1)
> + rte_panic("vhost set_owner failed: %s\n", strerror(errno));
> +}
> diff --git a/drivers/net/virtio/vhost.h b/drivers/net/virtio/vhost.h
> new file mode 100644
> index 0000000..c7517f6
> --- /dev/null
> +++ b/drivers/net/virtio/vhost.h
> @@ -0,0 +1,192 @@
> +/*-
> + * BSD LICENSE
> + *
> + * Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
> + * All rights reserved.
> + *
> + * Redistribution and use in source and binary forms, with or without
> + * modification, are permitted provided that the following conditions
> + * are met:
> + *
> + * * Redistributions of source code must retain the above copyright
> + * notice, this list of conditions and the following disclaimer.
> + * * Redistributions in binary form must reproduce the above copyright
> + * notice, this list of conditions and the following disclaimer in
> + * the documentation and/or other materials provided with the
> + * distribution.
> + * * Neither the name of Intel Corporation nor the names of its
> + * contributors may be used to endorse or promote products derived
> + * from this software without specific prior written permission.
> + *
> + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> + */
> +
> +#ifndef _VHOST_NET_USER_H
> +#define _VHOST_NET_USER_H
> +
> +#include
> +#include
> +#include
> +
> +#define VHOST_MEMORY_MAX_NREGIONS 8
> +
> +struct vhost_vring_state {
> + unsigned int index;
> + unsigned int num;
> +};
> +
> +struct vhost_vring_file {
> + unsigned int index;
> + int fd;
> +};
> +
> +struct vhost_vring_addr {
> + unsigned int index;
> + /* Option flags. */
> + unsigned int flags;
> + /* Flag values: */
> + /* Whether log address is valid. If set enables logging. */
> +#define VHOST_VRING_F_LOG 0
> +
> + /* Start of array of descriptors (virtually contiguous) */
> + uint64_t desc_user_addr;
> + /* Used structure address. Must be 32 bit aligned */
> + uint64_t used_user_addr;
> + /* Available structure address. Must be 16 bit aligned */
> + uint64_t avail_user_addr;
> + /* Logging support. */
> + /* Log writes to used structure, at offset calculated from specified
> + * address. Address must be 32 bit aligned. */
> + uint64_t log_guest_addr;
> +};
> +
> +#define VIRTIO_CONFIG_S_DRIVER_OK 4
> +
> +typedef enum VhostUserRequest {
> + VHOST_USER_NONE = 0,
> + VHOST_USER_GET_FEATURES = 1,
> + VHOST_USER_SET_FEATURES = 2,
> + VHOST_USER_SET_OWNER = 3,
> + VHOST_USER_RESET_OWNER = 4,
> + VHOST_USER_SET_MEM_TABLE = 5,
> + VHOST_USER_SET_LOG_BASE = 6,
> + VHOST_USER_SET_LOG_FD = 7,
> + VHOST_USER_SET_VRING_NUM = 8,
> + VHOST_USER_SET_VRING_ADDR = 9,
> + VHOST_USER_SET_VRING_BASE = 10,
> + VHOST_USER_GET_VRING_BASE = 11,
> + VHOST_USER_SET_VRING_KICK = 12,
> + VHOST_USER_SET_VRING_CALL = 13,
> + VHOST_USER_SET_VRING_ERR = 14,
> + VHOST_USER_GET_PROTOCOL_FEATURES = 15,
> + VHOST_USER_SET_PROTOCOL_FEATURES = 16,
> + VHOST_USER_GET_QUEUE_NUM = 17,
> + VHOST_USER_SET_VRING_ENABLE = 18,
> + VHOST_USER_MAX
> +} VhostUserRequest;
> +
> +struct vhost_memory_region {
> + uint64_t guest_phys_addr;
> + uint64_t memory_size; /* bytes */
> + uint64_t userspace_addr;
> + uint64_t mmap_offset;
> +};
> +struct vhost_memory_kernel {
> + uint32_t nregions;
> + uint32_t padding;
> + struct vhost_memory_region regions[0];
> +};
> +
> +struct vhost_memory {
> + uint32_t nregions;
> + uint32_t padding;
> + struct vhost_memory_region regions[VHOST_MEMORY_MAX_NREGIONS];
> +};
> +
> +typedef struct VhostUserMsg {
> + VhostUserRequest request;
> +
> +#define VHOST_USER_VERSION_MASK 0x3
> +#define VHOST_USER_REPLY_MASK (0x1 << 2)
> + uint32_t flags;
> + uint32_t size; /* the following payload size */
> + union {
> +#define VHOST_USER_VRING_IDX_MASK 0xff
> +#define VHOST_USER_VRING_NOFD_MASK (0x1 << 8)
> + uint64_t u64;
> + struct vhost_vring_state state;
> + struct vhost_vring_addr addr;
> + struct vhost_memory memory;
> + } payload;
> + int fds[VHOST_MEMORY_MAX_NREGIONS];
> +} __attribute((packed)) VhostUserMsg;
> +
> +#define VHOST_USER_HDR_SIZE offsetof(VhostUserMsg, payload.u64)
> +#define VHOST_USER_PAYLOAD_SIZE (sizeof(VhostUserMsg) - VHOST_USER_HDR_SIZE)
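The header size computed with offsetof() above depends on the structure being packed. A reduced sketch makes the arithmetic concrete. `struct demo_msg`, `DEMO_HDR_SIZE` and `demo_wire_len` are hypothetical names, the payload union is cut down to two members, and the packed attribute is the GCC/Clang extension the patch also relies on.

```c
#include <stddef.h>
#include <stdint.h>

/* Reduced stand-in for VhostUserMsg: three 32-bit header fields + payload. */
struct demo_msg {
	uint32_t request;
	uint32_t flags;
	uint32_t size; /* size of the payload that follows */
	union {
		uint64_t u64;
		uint8_t raw[64];
	} payload;
} __attribute__((packed));

/* Header is everything before the payload: 3 * 4 = 12 bytes when packed. */
#define DEMO_HDR_SIZE offsetof(struct demo_msg, payload)

/* Number of bytes to put on the wire for a given message. */
static size_t
demo_wire_len(const struct demo_msg *m)
{
	return DEMO_HDR_SIZE + m->size;
}
```

Without the packed attribute, common 64-bit ABIs would align the uint64_t member to 8 bytes and place the payload at offset 16, so the header length would no longer match the 12 bytes of the three fields.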
> +
> +/* The version of the protocol we support */
> +#define VHOST_USER_VERSION 0x1
> +
> +/* ioctls */
> +
> +#define VHOST_VIRTIO 0xAF
> +
> +#define VHOST_GET_FEATURES _IOR(VHOST_VIRTIO, 0x00, __u64)
> +#define VHOST_SET_FEATURES _IOW(VHOST_VIRTIO, 0x00, __u64)
> +#define VHOST_SET_OWNER _IO(VHOST_VIRTIO, 0x01)
> +#define VHOST_RESET_OWNER _IO(VHOST_VIRTIO, 0x02)
> +#define VHOST_SET_MEM_TABLE _IOW(VHOST_VIRTIO, 0x03, struct vhost_memory_kernel)
> +#define VHOST_SET_LOG_BASE _IOW(VHOST_VIRTIO, 0x04, __u64)
> +#define VHOST_SET_LOG_FD _IOW(VHOST_VIRTIO, 0x07, int)
> +#define VHOST_SET_VRING_NUM _IOW(VHOST_VIRTIO, 0x10, struct vhost_vring_state)
> +#define VHOST_SET_VRING_ADDR _IOW(VHOST_VIRTIO, 0x11, struct vhost_vring_addr)
> +#define VHOST_SET_VRING_BASE _IOW(VHOST_VIRTIO, 0x12, struct vhost_vring_state)
> +#define VHOST_GET_VRING_BASE _IOWR(VHOST_VIRTIO, 0x12, struct vhost_vring_state)
> +#define VHOST_SET_VRING_KICK _IOW(VHOST_VIRTIO, 0x20, struct vhost_vring_file)
> +#define VHOST_SET_VRING_CALL _IOW(VHOST_VIRTIO, 0x21, struct vhost_vring_file)
> +#define VHOST_SET_VRING_ERR _IOW(VHOST_VIRTIO, 0x22, struct vhost_vring_file)
> +#define VHOST_NET_SET_BACKEND _IOW(VHOST_VIRTIO, 0x30, struct vhost_vring_file)
> +
> +/*****************************************************************************/
> +
> +/* Ioctl defines */
> +#define TUNSETIFF _IOW('T', 202, int)
> +#define TUNGETFEATURES _IOR('T', 207, unsigned int)
> +#define TUNSETOFFLOAD _IOW('T', 208, unsigned int)
> +#define TUNGETIFF _IOR('T', 210, unsigned int)
> +#define TUNSETSNDBUF _IOW('T', 212, int)
> +#define TUNGETVNETHDRSZ _IOR('T', 215, int)
> +#define TUNSETVNETHDRSZ _IOW('T', 216, int)
> +#define TUNSETQUEUE _IOW('T', 217, int)
> +#define TUNSETVNETLE _IOW('T', 220, int)
> +#define TUNSETVNETBE _IOW('T', 222, int)
> +
> +/* TUNSETIFF ifr flags */
> +#define IFF_TAP 0x0002
> +#define IFF_NO_PI 0x1000
> +#define IFF_ONE_QUEUE 0x2000
> +#define IFF_VNET_HDR 0x4000
> +#define IFF_MULTI_QUEUE 0x0100
> +#define IFF_ATTACH_QUEUE 0x0200
> +#define IFF_DETACH_QUEUE 0x0400
> +
> +/* Features for GSO (TUNSETOFFLOAD). */
> +#define TUN_F_CSUM 0x01 /* You can hand me unchecksummed packets. */
> +#define TUN_F_TSO4 0x02 /* I can handle TSO for IPv4 packets */
> +#define TUN_F_TSO6 0x04 /* I can handle TSO for IPv6 packets */
> +#define TUN_F_TSO_ECN 0x08 /* I can handle TSO with ECN bits. */
> +#define TUN_F_UFO 0x10 /* I can handle UFO packets */
> +
> +#define PATH_NET_TUN "/dev/net/tun"
> +
> +#endif
> diff --git a/drivers/net/virtio/virtio_ethdev.h b/drivers/net/virtio/virtio_ethdev.h
> index ae2d47d..9e1ecb3 100644
> --- a/drivers/net/virtio/virtio_ethdev.h
> +++ b/drivers/net/virtio/virtio_ethdev.h
> @@ -122,5 +122,8 @@ uint16_t virtio_xmit_pkts_simple(void *tx_queue, struct rte_mbuf **tx_pkts,
> #define VTNET_LRO_FEATURES (VIRTIO_NET_F_GUEST_TSO4 | \
> VIRTIO_NET_F_GUEST_TSO6 | VIRTIO_NET_F_GUEST_ECN)
>
> -
> +#ifdef RTE_VIRTIO_VDEV
> +void virtio_vdev_init(struct rte_eth_dev_data *data, const char *path,
> + int nb_rx, int nb_tx, int nb_cq, int queue_num, char *mac);
> +#endif
> #endif /* _VIRTIO_ETHDEV_H_ */
> diff --git a/drivers/net/virtio/virtio_pci.h b/drivers/net/virtio/virtio_pci.h
> index 47f722a..af05ae2 100644
> --- a/drivers/net/virtio/virtio_pci.h
> +++ b/drivers/net/virtio/virtio_pci.h
> @@ -147,7 +147,6 @@ struct virtqueue;
> * rest are per-device feature bits.
> */
> #define VIRTIO_TRANSPORT_F_START 28
> -#define VIRTIO_TRANSPORT_F_END 32
I understand that this #define is not used, but maybe we should do this cleanup as a separate patch? Otherwise it's hard to
track this change (I believe this definition had some use in the past).
>
> /* The Guest publishes the used index for which it expects an interrupt
> * at the end of the avail ring. Host should ignore the avail->flags field. */
> @@ -165,6 +164,7 @@ struct virtqueue;
>
> struct virtio_hw {
> struct virtqueue *cvq;
> +#define VIRTIO_VDEV_IO_BASE 0xffffffff
> uint32_t io_base;
> uint32_t guest_features;
> uint32_t max_tx_queues;
> @@ -174,6 +174,21 @@ struct virtio_hw {
> uint8_t use_msix;
> uint8_t started;
> uint8_t mac_addr[ETHER_ADDR_LEN];
> +#ifdef RTE_VIRTIO_VDEV
> +#define VHOST_KERNEL 0
> +#define VHOST_USER 1
> + int type; /* type of backend */
> + uint32_t queue_num;
> + char *path;
> + int mac_specified;
> + int vhostfd;
> + int backfd; /* tap device used in vhost-net */
> + int callfds[VIRTIO_MAX_VIRTQUEUES * 2 + 1];
> + int kickfds[VIRTIO_MAX_VIRTQUEUES * 2 + 1];
> + uint32_t queue_sel;
> + uint8_t status;
> + struct rte_eth_dev_data *data;
> +#endif
Actually, I am currently working on this too, and I decided to use a different approach. I moved these extra fields into a
separate structure and changed 'io_base' to a pointer, so that it can hold a pointer to this extra structure. The device type can
easily be determined with a (dev->dev_type == RTE_ETH_DEV_PCI) check, so you don't need the VIRTIO_VDEV_IO_BASE magic value.
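
To illustrate the idea, here is a minimal sketch, not my actual patch. All sketch_* names are hypothetical, and keeping
dev_type inside the hw structure is an assumption made only to keep the example self-contained (in the driver the type
would come from the ethdev, as described above).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; this is not the real DPDK layout. */
enum sketch_dev_type { SKETCH_DEV_PCI, SKETCH_DEV_VIRTUAL };

/* Backend-only state, kept out of the common hw structure. */
struct sketch_vdev_priv {
	int type;           /* kernel vhost or vhost-user */
	int vhostfd;
	uint32_t queue_sel;
	uint8_t status;
};

struct sketch_hw {
	enum sketch_dev_type dev_type; /* assumption: stored here for brevity */
	void *io_base;      /* PCI: I/O base; virtual: struct sketch_vdev_priv * */
};

/* The device type replaces the magic io_base value as the discriminator. */
static inline int sketch_is_vdev(const struct sketch_hw *hw)
{
	return hw->dev_type != SKETCH_DEV_PCI;
}

/* Only valid for virtual devices; io_base then carries the private data. */
static inline struct sketch_vdev_priv *sketch_vdev(struct sketch_hw *hw)
{
	assert(sketch_is_vdev(hw));
	return hw->io_base;
}
```

With this layout the common structure needs no #ifdef block, and PCI-only builds pay nothing for the extra fields.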
> };
>
> /*
> @@ -229,6 +244,39 @@ outl_p(unsigned int data, unsigned int port)
> #define VIRTIO_PCI_REG_ADDR(hw, reg) \
> (unsigned short)((hw)->io_base + (reg))
>
> +#ifdef RTE_VIRTIO_VDEV
> +uint32_t virtio_ioport_read(struct virtio_hw *, uint64_t);
> +void virtio_ioport_write(struct virtio_hw *, uint64_t, uint32_t);
> +
> +#define VIRTIO_READ_REG_1(hw, reg) \
> + (hw->io_base != VIRTIO_VDEV_IO_BASE) ? \
> + inb((VIRTIO_PCI_REG_ADDR((hw), (reg)))) \
> + :virtio_ioport_read(hw, reg)
> +#define VIRTIO_WRITE_REG_1(hw, reg, value) \
> + (hw->io_base != VIRTIO_VDEV_IO_BASE) ? \
> + outb_p((unsigned char)(value), (VIRTIO_PCI_REG_ADDR((hw), (reg)))) \
> + :virtio_ioport_write(hw, reg, value)
> +
> +#define VIRTIO_READ_REG_2(hw, reg) \
> + (hw->io_base != VIRTIO_VDEV_IO_BASE) ? \
> + inw((VIRTIO_PCI_REG_ADDR((hw), (reg)))) \
> + :virtio_ioport_read(hw, reg)
> +#define VIRTIO_WRITE_REG_2(hw, reg, value) \
> + (hw->io_base != VIRTIO_VDEV_IO_BASE) ? \
> + outw_p((unsigned short)(value), (VIRTIO_PCI_REG_ADDR((hw), (reg)))) \
> + :virtio_ioport_write(hw, reg, value)
> +
> +#define VIRTIO_READ_REG_4(hw, reg) \
> + (hw->io_base != VIRTIO_VDEV_IO_BASE) ? \
> + inl((VIRTIO_PCI_REG_ADDR((hw), (reg)))) \
> + :virtio_ioport_read(hw, reg)
> +#define VIRTIO_WRITE_REG_4(hw, reg, value) \
> + (hw->io_base != VIRTIO_VDEV_IO_BASE) ? \
> + outl_p((unsigned int)(value), (VIRTIO_PCI_REG_ADDR((hw), (reg)))) \
> + :virtio_ioport_write(hw, reg, value)
I also decided to add two fields to 'hw' that store pointers to these accessors. I think this should be faster than testing
io_base on every access; admittedly, this is not performance-critical code, because it's executed only during initialization.
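
A rough sketch of what I mean follows. The sketch_* names and the small emulated register array are hypothetical, chosen
only so that the example compiles on its own; a PCI backend would install wrappers around inb()/outb_p() and friends in
the same two fields.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_NUM_REGS 8

struct sketch_hw;

/* Accessor signatures mirror virtio_ioport_read()/virtio_ioport_write(). */
typedef uint32_t (*sketch_read_fn)(struct sketch_hw *hw, uint64_t reg);
typedef void (*sketch_write_fn)(struct sketch_hw *hw, uint64_t reg,
				uint32_t value);

struct sketch_hw {
	sketch_read_fn read_reg;    /* set once at init, per backend */
	sketch_write_fn write_reg;
	uint32_t regs[SKETCH_NUM_REGS]; /* stand-in for emulated device state */
};

/* Virtual-device backend: registers live in memory, not in I/O ports. */
static uint32_t sketch_vdev_read(struct sketch_hw *hw, uint64_t reg)
{
	assert(reg < SKETCH_NUM_REGS);
	return hw->regs[reg];
}

static void sketch_vdev_write(struct sketch_hw *hw, uint64_t reg,
			      uint32_t value)
{
	assert(reg < SKETCH_NUM_REGS);
	hw->regs[reg] = value;
}

static void sketch_vdev_init(struct sketch_hw *hw)
{
	memset(hw, 0, sizeof(*hw));
	hw->read_reg = sketch_vdev_read;
	hw->write_reg = sketch_vdev_write;
}

/* One definition for all backends, with no #ifdef and no conditional. */
#define SKETCH_READ_REG(hw, reg) ((hw)->read_reg((hw), (reg)))
#define SKETCH_WRITE_REG(hw, reg, value) \
	((hw)->write_reg((hw), (reg), (value)))
```

A side benefit is that the macros expand to a single parenthesized call. The conditional form quoted above expands to a
bare ?: expression, so something like VIRTIO_READ_REG_1(hw, reg) & mask would apply the mask only to the second branch.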
> +
> +#else /* RTE_VIRTIO_VDEV */
> +
> #define VIRTIO_READ_REG_1(hw, reg) \
> inb((VIRTIO_PCI_REG_ADDR((hw), (reg))))
> #define VIRTIO_WRITE_REG_1(hw, reg, value) \
> @@ -244,6 +292,8 @@ outl_p(unsigned int data, unsigned int port)
> #define VIRTIO_WRITE_REG_4(hw, reg, value) \
> outl_p((unsigned int)(value), (VIRTIO_PCI_REG_ADDR((hw), (reg))))
>
> +#endif /* RTE_VIRTIO_VDEV */
> +
> static inline int
> vtpci_with_feature(struct virtio_hw *hw, uint32_t bit)
> {
> --
> 2.1.4
Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia