From: Tetsuya Mukawa
To: "Tan, Jianfeng", dev@dpdk.org
Date: Fri, 4 Mar 2016 14:05:22 +0900
Subject: Re: [dpdk-dev] [PATCH v3 5/6] virtio: Add support for qtest virtio-net PMD
Message-ID: <56D91792.1060406@igel.co.jp>
In-Reply-To: <56D8F06F.3060705@intel.com>
References: <1455075613-3605-3-git-send-email-mukawa@igel.co.jp> <1456129075-14909-6-git-send-email-mukawa@igel.co.jp> <56D8F06F.3060705@intel.com>

On 2016/03/04 11:18, Tan, Jianfeng wrote:
> Hi Tetsuya,
>
> Seems that this patch is too long. Is it possible to split into
> multiple commits?

Hi Jianfeng,

Sure, will do.

>
> On 2/22/2016 4:17 PM, Tetsuya Mukawa wrote:
>> The patch adds a new virtio-net PMD configuration that allows the PMD to
>> work on host as if the PMD is in VM.
>> Here is new configuration for virtio-net PMD.
>>  - CONFIG_RTE_VIRTIO_VDEV_QTEST
>> To use this mode, EAL needs map all hugepages as one file. Also the file
>> should be mapped between (1 << 31) and (1 << 44). And start address
>> should be aligned by EAL memory size.
>>
>> To allocate like above, use below options.
>>  --single-file
>>  --range-virtaddr=0x80000000-0x100000000000
>>  --align-memsize
>> If a free regions isn't found, EAL will return error.
>>
>> To prepare virtio-net device on host, the users need to invoke QEMU
>> process in special qtest mode. This mode is mainly used for testing QEMU
>> devices from outer process. In this mode, no guest runs.
>> Here is QEMU command line.
>>
>>  $ qemu-system-x86_64 \
>>      -machine pc-i440fx-1.4,accel=qtest \
>>      -display none -qtest-log /dev/null \
>>      -qtest unix:/tmp/socket,server \
>>      -netdev type=tap,script=/etc/qemu-ifup,id=net0,queues=1 \
>>      -device virtio-net-pci,netdev=net0,mq=on,disable-modern=false,addr=3 \
>>      -chardev socket,id=chr1,path=/tmp/ivshmem,server \
>>      -device ivshmem,size=1G,chardev=chr1,vectors=1,addr=4
>>
>>  * Should use qemu-2.5.1, or above.
>>  * QEMU process is needed per port.
>>  * virtio-1.0 device are only supported.
>>  * The vhost backends like vhost-net and vhost-user can be specified.
>>  * In most cases, just using above command is enough, but you can also
>>    specify other QEMU virtio-net options.
>>  * Only checked "pc-i440fx-1.4" machine, but may work with other
>>    machines.
>>  * Should not add "--enable-kvm" to QEMU command line.
>
> Correct me if wrong: all control msgs go through qemu process, e.g.,
> tx notifications and rx interrupts need follow frontend-qemu-backend
> path. Question: qemu is started without --enable-kvm, as I understand,
> ioeventfd, the basis of kickfd/callfd, will not be available. So how
> does qemu kick backend or be kicked by backend?

Actually, the vhost-backend process will receive kickfd and callfd as -1.
(Currently, we have a bug in librte_vhost, because the library treats -1
as the "not initialized" state. But without "--enable-kvm", QEMU itself
sets kickfd and callfd to -1 when it initializes them.
I will send a patch for the issue with the next patch series.)

In our case, both the virtio-net driver and the vhost-backend driver are
PMDs. So we don't use kickfd and callfd, right?

If you are worried about the vhost-net case, the vhost-net kernel thread
will work without ioeventfd and irqfd.
In this case, the virtio-net PMD can kick vhost-net by accessing the
VIRTIO_PCI_QUEUE_NOTIFY register.
(vhost-net doesn't need to kick the virtio-net driver, because the driver
is a PMD.)
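
To make the -1 problem above concrete, here is a minimal sketch of one way
a backend could tell "never set" apart from "set to -1 by QEMU". All names
here (the two sentinel macros, struct vq_fds and its helpers) are
hypothetical and only for illustration. This is not the actual librte_vhost
code, and not necessarily how the fix will look.

```c
#include <stdbool.h>

/* Hypothetical sentinel values, chosen for this sketch only. */
#define EVENTFD_INVALID        (-1)  /* QEMU sent -1: no eventfd is used */
#define EVENTFD_UNINITIALIZED  (-2)  /* nothing received from QEMU yet   */

/* Hypothetical per-virtqueue descriptor pair. */
struct vq_fds {
	int kickfd;
	int callfd;
};

/* Mark both descriptors as "not received yet". */
static void
vq_fds_reset(struct vq_fds *vq)
{
	vq->kickfd = EVENTFD_UNINITIALIZED;
	vq->callfd = EVENTFD_UNINITIALIZED;
}

/* The queue is ready once both values were received, even if they are -1. */
static bool
vq_fds_ready(const struct vq_fds *vq)
{
	return vq->kickfd != EVENTFD_UNINITIALIZED &&
	       vq->callfd != EVENTFD_UNINITIALIZED;
}

/* Only signal the guest side when a real descriptor exists. */
static bool
vq_fds_can_signal(const struct vq_fds *vq)
{
	return vq->callfd >= 0;
}
```

With separate sentinels, a queue whose descriptors were explicitly set to -1
counts as ready, while a polling backend simply never writes to callfd.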
>
>>
>> After invoking QEMU, the PMD can connect to QEMU process using unix
>> domain sockets. Over these sockets, virtio-net, ivshmem and piix3
>> device in QEMU are probed by the PMD.
>> Here is example of command line.
>>
>>  $ testpmd -c f -n 1 -m 1024 --no-pci --single-file \
>>      --range-virtaddr=0x80000000-0x100000000000 --align-memsize \
>>      --vdev="eth_qtest_virtio0,qtest=/tmp/socket,ivshmem=/tmp/ivshmem"\
>>      -- --disable-hw-vlan --txqflags=0xf00 -i
>>
>> Please specify same unix domain sockets and memory size in both QEMU
>> and DPDK command lines like above.
>> The share memory size should be power of 2, because ivshmem only
>> accepts such memory size.
>>
>> Signed-off-by: Tetsuya Mukawa
>> ---
>>  config/common_linuxapp             |    1 +
>>  drivers/net/virtio/Makefile        |    4 +
>>  drivers/net/virtio/qtest.c         | 1342 ++++++++++++++++++++++++++++++++++++
>>  drivers/net/virtio/qtest.h         |   65 ++
>>  drivers/net/virtio/virtio_ethdev.c |  383 +++++++++-
>>  drivers/net/virtio/virtio_pci.c    |  364 +++++++++-
>>  drivers/net/virtio/virtio_pci.h    |    5 +-
>>  7 files changed, 2122 insertions(+), 42 deletions(-)
>>  create mode 100644 drivers/net/virtio/qtest.c
>>  create mode 100644 drivers/net/virtio/qtest.h
>>
>> diff --git a/config/common_linuxapp b/config/common_linuxapp
>> index 452f39c..f6e53bc 100644
>> --- a/config/common_linuxapp
>> +++ b/config/common_linuxapp
>> @@ -533,3 +533,4 @@ CONFIG_RTE_TEST_PMD_RECORD_BURST_STATS=n
>>  # Enable virtio support for container
>>  #
>>  CONFIG_RTE_VIRTIO_VDEV=y
>> +CONFIG_RTE_VIRTIO_VDEV_QTEST=y
>> diff --git a/drivers/net/virtio/Makefile b/drivers/net/virtio/Makefile
>> index ef920f9..6c11378 100644
>> --- a/drivers/net/virtio/Makefile
>> +++ b/drivers/net/virtio/Makefile
>> @@ -56,6 +56,10 @@ ifeq ($(CONFIG_RTE_VIRTIO_VDEV),y)
>>  SRCS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += vhost_embedded.c
>>  endif
>> +ifeq ($(CONFIG_RTE_VIRTIO_VDEV_QTEST),y)
>> + SRCS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += qtest.c
>> +endif
>> +
>>  # this lib depends upon:
>>  DEPDIRS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += lib/librte_eal lib/librte_ether
>>  DEPDIRS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += lib/librte_mempool lib/librte_mbuf
>> diff --git a/drivers/net/virtio/qtest.c b/drivers/net/virtio/qtest.c
>> new file mode 100644
>> index 0000000..061aab5
>> --- /dev/null
>> +++ b/drivers/net/virtio/qtest.c
>> @@ -0,0 +1,1342 @@
>> +/*-
>> + *   BSD LICENSE
>> + *
>> + *   Copyright(c) 2016 IGEL Co., Ltd. All rights reserved.
>> + *   All rights reserved.
>> + *
>> + *   Redistribution and use in source and binary forms, with or without
>> + *   modification, are permitted provided that the following conditions
>> + *   are met:
>> + *
>> + *     * Redistributions of source code must retain the above copyright
>> + *       notice, this list of conditions and the following disclaimer.
>> + *     * Redistributions in binary form must reproduce the above copyright
>> + *       notice, this list of conditions and the following disclaimer in
>> + *       the documentation and/or other materials provided with the
>> + *       distribution.
>> + *     * Neither the name of IGEL Co., Ltd. nor the names of its
>> + *       contributors may be used to endorse or promote products derived
>> + *       from this software without specific prior written permission.
>> + *
>> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
>> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
>> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
>> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
>> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
>> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
>> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
>> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
>> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
>> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
>> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
>> + */
>> +#include
>> +#include
>> +#include
>> +#include
>> +#include
>> +#include
>> +#include
>> +#include
>> +#include
>> +#include
>> +#include
>> +#include
>> +#include
>> +#include
>> +
>> +#include
>> +#include
>> +#include
>> +#include
>> +
>> +#include "virtio_pci.h"
>> +#include "virtio_logs.h"
>> +#include "virtio_ethdev.h"
>> +#include "qtest.h"
>> +
>> +#define NB_BAR 6
>> +
>> +/* PIIX3 configuration registers */
>> +#define PIIX3_REG_ADDR_PIRQA 0x60
>> +#define PIIX3_REG_ADDR_PIRQB 0x61
>> +#define PIIX3_REG_ADDR_PIRQC 0x62
>> +#define PIIX3_REG_ADDR_PIRQD 0x63
>> +
>> +/* Device information */
>> +#define VIRTIO_NET_DEVICE_ID 0x1000
>> +#define VIRTIO_NET_VENDOR_ID 0x1af4
>> +#define VIRTIO_NET_IRQ_NUM 10
>> +#define IVSHMEM_DEVICE_ID 0x1110
>> +#define IVSHMEM_VENDOR_ID 0x1af4
>> +#define IVSHMEM_PROTOCOL_VERSION 0
>> +#define PIIX3_DEVICE_ID 0x7000
>> +#define PIIX3_VENDOR_ID 0x8086
>> +
>> +/* ------------------------------------------------------------
>> + * IO port mapping of qtest guest
>> + * ------------------------------------------------------------
>> + * 0x0000 - 0xbfff : not used
>> + * 0xc000 - 0xc03f : virtio-net(BAR0)
>> + * 0xc040 - 0xffff : not used
>> + *
>> + * ------------------------------------------------------------
>> + * Memory mapping of qtest quest
>> + * ------------------------------------------------------------
>> + * 0x00000000_00000000 - 0x00000000_3fffffff : not used
>> + * 0x00000000_40000000 - 0x00000000_40000fff : virtio-net(BAR1)
>> + * 0x00000000_40001000 - 0x00000000_40ffffff : not used
>> + * 0x00000000_41000000 - 0x00000000_417fffff : virtio-net(BAR4)
>> + * 0x00000000_41800000 - 0x00000000_41ffffff : not used
>> + * 0x00000000_42000000 - 0x00000000_420000ff : ivshmem(BAR0)
>> + * 0x00000000_42000100 - 0x00000000_42ffffff : not used
>> + * 0x00000000_80000000 - 0xffffffff_ffffffff : ivshmem(BAR2)
>> + *
>
> Is it possible to arrange multiple virtio-net devices here? What's the
> challenges?

Yes, you can manage multiple virtio-net devices here, as long as you
define a correct memory map for them.

>
> Seems that lots of below code do the same work as libqos. So can we
> just link libqos? Or we need to maintain this code.

The problem is that libqos is GPL. So I wrote the code from scratch.

>
>> + * We can only specify start address of a region. The region size
>> + * will be defined by the device implementation in QEMU.
>> + * The size will be pow of 2 according to the PCI specification.
>> + * Also, the region start address should be aligned by region size.
>> + *
>> + * BAR2 of ivshmem will be used to mmap DPDK application memory.
>> + * So this address will be dynamically changed, but not to overlap
>> + * others, it should be mmaped between above addresses. Such allocation
>> + * is done by EAL. Check rte_eal_get_free_region() also.
>> + */
>> +#define VIRTIO_NET_IO_START 0xc000
>> +#define VIRTIO_NET_MEMORY1_START 0x40000000
>> +#define VIRTIO_NET_MEMORY2_START 0x41000000
>> +#define IVSHMEM_MEMORY_START 0x42000000
>> +
>> +#define PCI_CONFIG_ADDR(_bus, _device, _function, _offset) ( \
>> +	(1 << 31) | ((_bus) & 0xff) << 16 | ((_device) & 0x1f) << 11 | \
>> +	((_function) & 0x7) << 8 | ((_offset) & 0xfc))
>> +
>> +static char interrupt_message[32];
>> +
>> +enum qtest_pci_bar_type {
>> +	QTEST_PCI_BAR_DISABLE = 0,
>> +	QTEST_PCI_BAR_IO,
>> +	QTEST_PCI_BAR_MEMORY_UNDER_1MB,
>> +	QTEST_PCI_BAR_MEMORY_32,
>> +	QTEST_PCI_BAR_MEMORY_64
>> +};
>> +
>> +struct qtest_pci_bar {
>> +	enum qtest_pci_bar_type type;
>> +	uint8_t addr;
>> +	uint64_t region_start;
>> +	uint64_t region_size;
>> +};
>> +
>> +struct qtest_session;
>> +TAILQ_HEAD(qtest_pci_device_list, qtest_pci_device);
>> +struct qtest_pci_device {
>> +	TAILQ_ENTRY(qtest_pci_device) next;
>> +	const char *name;
>> +	uint16_t device_id;
>> +	uint16_t vendor_id;
>> +	uint8_t bus_addr;
>> +	uint8_t device_addr;
>> +	struct qtest_pci_bar bar[NB_BAR];
>> +	int (*init)(struct qtest_session *s, struct qtest_pci_device *dev);
>> +};
>> +
>> +union qtest_pipefds {
>> +	struct {
>> +		int pipefd[2];
>> +	};
>> +	struct {
>> +		int readfd;
>> +		int writefd;
>> +	};
>> +};
>> +
>> +struct qtest_session {
>> +	int qtest_socket;
>> +	pthread_mutex_t qtest_session_lock;
>> +
>> +	struct qtest_pci_device_list head;
>> +	int ivshmem_socket;
>> +
>> +	pthread_t event_th;
>> +	char *evq;
>> +	char *evq_dequeue_ptr;
>> +	size_t evq_total_len;
>> +
>> +	union qtest_pipefds msgfds;
>> +
>> +	pthread_t intr_th;
>> +	int eventfd;
>> +	rte_atomic16_t enable_intr;
>> +	rte_intr_callback_fn cb;
>> +	void *cb_arg;
>> +	struct rte_eth_dev_data *eth_data;
>> +};
>> +
>> +static int
>> +qtest_raw_send(int fd, char *buf, size_t count)
>> +{
>> +	size_t len = count;
>> +	size_t total_len = 0;
>> +	int ret = 0;
>> +
>> +	while (len > 0) {
>> +		ret = write(fd, buf, len);
>> +		if (ret == -1) {
>> +			if (errno == EINTR)
>> +				continue;
>> +			return ret;
>> +		}
>> +		if (ret == (int)len)
>> +			break;
>
> This _if_ complicates this function, remove this and rely on while
> condition to quit? Same as in the below qtest_raw_recv().
>

Yes, I will remove it.

Thanks,
Tetsuya

> Thanks,
> Jianfeng
>
>> +		total_len += ret;
>> +		buf += ret;
>> +		len -= ret;
>> +	}
>> +	return total_len + ret;
>> +}
>> +
>> +static int
>> +qtest_raw_recv(int fd, char *buf, size_t count)
>> +{
>> +	size_t len = count;
>> +	size_t total_len = 0;
>> +	int ret = 0;
>> +
>> +	while (len > 0) {
>> +		ret = read(fd, buf, len);
>> +		if (ret <= 0) {
>> +			if (errno == EINTR) {
>> +				continue;
>> +			}
>> +			return ret;
>> +		}
>> +		if (ret == (int)len)
>> +			break;
>> +		if (*(buf + ret - 1) == '\n')
>> +			break;
>> +		total_len += ret;
>> +		buf += ret;
>> +		len -= ret;
>> +	}
>> +	return total_len + ret;
>> +}
>> +
> [...]
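
For reference, the send loop without the extra _if_ could look roughly like
the sketch below, relying only on the while condition to terminate. The
function name raw_send_all() and the ssize_t return type are my assumptions
for this illustration; it is not the code that will appear in the next
version of the patch.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Sketch only: write exactly `count` bytes to `fd`, retrying on EINTR.
 * Returns the number of bytes written, or -1 on any other write() error.
 */
static ssize_t
raw_send_all(int fd, const char *buf, size_t count)
{
	size_t total_len = 0;

	/* The loop ends as soon as everything has been written. */
	while (total_len < count) {
		ssize_t ret = write(fd, buf + total_len, count - total_len);

		if (ret == -1) {
			if (errno == EINTR)
				continue;
			return -1;
		}
		total_len += (size_t)ret;
	}
	return (ssize_t)total_len;
}
```

Because the byte count is accumulated in one place, the return value no
longer needs the "total_len + ret" adjustment after the loop.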