From mboxrd@z Thu Jan 1 00:00:00 1970
From: Pavel Fedin
To: 'Jianfeng Tan' , dev@dpdk.org
References: <1446748276-132087-1-git-send-email-jianfeng.tan@intel.com>
<1452426182-86851-1-git-send-email-jianfeng.tan@intel.com>
<1452426182-86851-5-git-send-email-jianfeng.tan@intel.com>
In-reply-to: <1452426182-86851-5-git-send-email-jianfeng.tan@intel.com>
Date: Tue, 12 Jan 2016 10:45:59 +0300
Message-id: <009b01d14d0d$47c85540$d758ffc0$@samsung.com>
MIME-version: 1.0
Content-type: text/plain; charset=us-ascii
Content-transfer-encoding: 7bit
X-Mailer: Microsoft Outlook 14.0
Thread-index: AQLOfZuJ4skKn5NxqR5aD7duH+7e0wGBr/PNAjtEweac3zaDYA==
Content-language: ru
Cc: nakajima.yoshihiro@lab.ntt.co.jp, mst@redhat.com,
ann.zhuangyanying@huawei.com
Subject: Re: [dpdk-dev] [PATCH 4/4] virtio/vdev: add a new vdev named
eth_cvio
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: patches and discussions about DPDK
X-List-Received-Date: Tue, 12 Jan 2016 07:46:03 -0000
Hello!

See my comments inline.

> -----Original Message-----
> From: Jianfeng Tan [mailto:jianfeng.tan@intel.com]
> Sent: Sunday, January 10, 2016 2:43 PM
> To: dev@dpdk.org
> Cc: rich.lane@bigswitch.com; yuanhan.liu@linux.intel.com; mst@redhat.com;
> nakajima.yoshihiro@lab.ntt.co.jp; huawei.xie@intel.com; mukawa@igel.co.jp;
> p.fedin@samsung.com; michael.qiu@intel.com; ann.zhuangyanying@huawei.com; Jianfeng Tan
> Subject: [PATCH 4/4] virtio/vdev: add a new vdev named eth_cvio
>
> Add a new virtual device named eth_cvio, it can be used just like
> eth_ring, eth_null, etc.
>
> Configured parameters include:
> - rx (optional, 1 by default): number of rx, only allowed to be
> 1 for now.
> - tx (optional, 1 by default): number of tx, only allowed to be
> 1 for now.
> - cq (optional, 0 by default): if ctrl queue is enabled, not
> supported for now.
> - mac (optional): mac address, random value will be given if not
> specified.
> - queue_num (optional, 256 by default): size of virtqueue.
> - path (mandatory): path of vhost, depends on the file type:
> vhost-user is used if the given path points to
> a unix socket; vhost-net is used if the given
> path points to a char device.
>
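
For reference, the path rule quoted above (unix socket means vhost-user, char device means vhost-net) can be sketched with stat(2). This is only a minimal sketch, not the code from the patch: the enum and both function names are hypothetical and chosen for illustration; only stat(), S_ISSOCK() and S_ISCHR() are standard POSIX.

```c
#define _XOPEN_SOURCE 700 /* for S_ISSOCK/S_IFSOCK under strict compilers */

#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical backend kinds; these names are not taken from the patch. */
enum cvio_backend {
	CVIO_BACKEND_INVALID = 0,
	CVIO_BACKEND_VHOST_USER, /* path points to a unix socket */
	CVIO_BACKEND_VHOST_NET   /* path points to a char device */
};

/* Classify a file mode as reported in st_mode by stat(2). */
static enum cvio_backend
cvio_backend_from_mode(mode_t mode)
{
	if (S_ISSOCK(mode))
		return CVIO_BACKEND_VHOST_USER;
	if (S_ISCHR(mode))
		return CVIO_BACKEND_VHOST_NET;
	return CVIO_BACKEND_INVALID;
}

/* Look at what the path points to and pick a backend accordingly. */
static enum cvio_backend
cvio_detect_backend(const char *path)
{
	struct stat st;

	assert(path != NULL);
	if (stat(path, &st) < 0)
		return CVIO_BACKEND_INVALID; /* missing or inaccessible path */
	return cvio_backend_from_mode(st.st_mode);
}
```

With such a helper, a path that is neither a socket nor a char device (or that does not exist) can be rejected early with an error instead of being passed on to the backend setup.
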
> The major difference with original virtio for vm is that, here we
> use virtual address instead of physical address for vhost to
> calculate relative address.
>
> When enable CONFIG_RTE_VIRTIO_VDEV (enabled by default), the compiled
> library can be used in both VM and container environment.
>
> Examples:
> a. Use vhost-net as a backend
> sudo numactl -N 1 -m 1 ./examples/l2fwd/build/l2fwd -c 0x100000 -n 4 \
> -m 1024 --no-pci --single-file --file-prefix=l2fwd \
> --vdev=eth_cvio0,mac=00:01:02:03:04:05,path=/dev/vhost-net \
> -- -p 0x1
>
> b. Use vhost-user as a backend
> numactl -N 1 -m 1 ./examples/l2fwd/build/l2fwd -c 0x100000 -n 4 -m 1024 \
> --no-pci --single-file --file-prefix=l2fwd \
> --vdev=eth_cvio0,mac=00:01:02:03:04:05,path= \
> -- -p 0x1
>
> Signed-off-by: Huawei Xie
> Signed-off-by: Jianfeng Tan
> ---
> drivers/net/virtio/virtio_ethdev.c | 338 +++++++++++++++++++++++++-------
> drivers/net/virtio/virtio_ethdev.h | 1 +
> drivers/net/virtio/virtio_pci.h | 24 +--
> drivers/net/virtio/virtio_rxtx.c | 11 +-
> drivers/net/virtio/virtio_rxtx_simple.c | 14 +-
> drivers/net/virtio/virtqueue.h | 13 +-
> 6 files changed, 302 insertions(+), 99 deletions(-)
>
> diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c
> index d928339..6e46060 100644
> --- a/drivers/net/virtio/virtio_ethdev.c
> +++ b/drivers/net/virtio/virtio_ethdev.c
> @@ -56,6 +56,7 @@
> #include
> #include
> #include
> +#include
>
> #include "virtio_ethdev.h"
> #include "virtio_pci.h"
> @@ -174,14 +175,14 @@ virtio_send_command(struct virtqueue *vq, struct virtio_pmd_ctrl *ctrl,
> * One RX packet for ACK.
> */
> vq->vq_ring.desc[head].flags = VRING_DESC_F_NEXT;
> - vq->vq_ring.desc[head].addr = vq->virtio_net_hdr_mz->phys_addr;
> + vq->vq_ring.desc[head].addr = vq->virtio_net_hdr_mem;
> vq->vq_ring.desc[head].len = sizeof(struct virtio_net_ctrl_hdr);
> vq->vq_free_cnt--;
> i = vq->vq_ring.desc[head].next;
>
> for (k = 0; k < pkt_num; k++) {
> vq->vq_ring.desc[i].flags = VRING_DESC_F_NEXT;
> - vq->vq_ring.desc[i].addr = vq->virtio_net_hdr_mz->phys_addr
> + vq->vq_ring.desc[i].addr = vq->virtio_net_hdr_mem
> + sizeof(struct virtio_net_ctrl_hdr)
> + sizeof(ctrl->status) + sizeof(uint8_t)*sum;
> vq->vq_ring.desc[i].len = dlen[k];
> @@ -191,7 +192,7 @@ virtio_send_command(struct virtqueue *vq, struct virtio_pmd_ctrl *ctrl,
> }
>
> vq->vq_ring.desc[i].flags = VRING_DESC_F_WRITE;
> - vq->vq_ring.desc[i].addr = vq->virtio_net_hdr_mz->phys_addr
> + vq->vq_ring.desc[i].addr = vq->virtio_net_hdr_mem
> + sizeof(struct virtio_net_ctrl_hdr);
> vq->vq_ring.desc[i].len = sizeof(ctrl->status);
> vq->vq_free_cnt--;
> @@ -374,68 +375,85 @@ int virtio_dev_queue_setup(struct rte_eth_dev *dev,
> }
> }
>
> - /*
> - * Virtio PCI device VIRTIO_PCI_QUEUE_PF register is 32bit,
> - * and only accepts 32 bit page frame number.
> - * Check if the allocated physical memory exceeds 16TB.
> - */
> - if ((mz->phys_addr + vq->vq_ring_size - 1) >> (VIRTIO_PCI_QUEUE_ADDR_SHIFT + 32)) {
> - PMD_INIT_LOG(ERR, "vring address shouldn't be above 16TB!");
> - rte_free(vq);
> - return -ENOMEM;
> - }
> -
> memset(mz->addr, 0, sizeof(mz->len));
> vq->mz = mz;
> - vq->vq_ring_mem = mz->phys_addr;
> vq->vq_ring_virt_mem = mz->addr;
> - PMD_INIT_LOG(DEBUG, "vq->vq_ring_mem: 0x%"PRIx64, (uint64_t)mz->phys_addr);
> - PMD_INIT_LOG(DEBUG, "vq->vq_ring_virt_mem: 0x%"PRIx64, (uint64_t)(uintptr_t)mz->addr);
> +
> + if (dev->dev_type == RTE_ETH_DEV_PCI) {
> + vq->vq_ring_mem = mz->phys_addr;
> +
> + /* Virtio PCI device VIRTIO_PCI_QUEUE_PF register is 32bit,
> + * and only accepts 32 bit page frame number.
> + * Check if the allocated physical memory exceeds 16TB.
> + */
> + uint64_t last_physaddr = vq->vq_ring_mem + vq->vq_ring_size - 1;
> + if (last_physaddr >> (VIRTIO_PCI_QUEUE_ADDR_SHIFT + 32)) {
> + PMD_INIT_LOG(ERR, "vring address shouldn't be above 16TB!");
> + rte_free(vq);
> + return -ENOMEM;
> + }
> + }
> +#ifdef RTE_VIRTIO_VDEV
> + else
> + vq->vq_ring_mem = (phys_addr_t)mz->addr; /* Use vaddr!!! */
> +#endif
> +
> + PMD_INIT_LOG(DEBUG, "vq->vq_ring_mem: 0x%"PRIx64,
> + (uint64_t)vq->vq_ring_mem);
> + PMD_INIT_LOG(DEBUG, "vq->vq_ring_virt_mem: 0x%"PRIx64,
> + (uint64_t)(uintptr_t)vq->vq_ring_virt_mem);
> vq->virtio_net_hdr_mz = NULL;
> vq->virtio_net_hdr_mem = 0;
>
> + uint64_t hdr_size = 0;
> if (queue_type == VTNET_TQ) {
> /*
> * For each xmit packet, allocate a virtio_net_hdr
> */
> snprintf(vq_name, sizeof(vq_name), "port%d_tvq%d_hdrzone",
> dev->data->port_id, queue_idx);
> - vq->virtio_net_hdr_mz = rte_memzone_reserve_aligned(vq_name,
> - vq_size * hw->vtnet_hdr_size,
> - socket_id, 0, RTE_CACHE_LINE_SIZE);
> - if (vq->virtio_net_hdr_mz == NULL) {
> - if (rte_errno == EEXIST)
> - vq->virtio_net_hdr_mz =
> - rte_memzone_lookup(vq_name);
> - if (vq->virtio_net_hdr_mz == NULL) {
> - rte_free(vq);
> - return -ENOMEM;
> - }
> - }
> - vq->virtio_net_hdr_mem =
> - vq->virtio_net_hdr_mz->phys_addr;
> - memset(vq->virtio_net_hdr_mz->addr, 0,
> - vq_size * hw->vtnet_hdr_size);
> + hdr_size = vq_size * hw->vtnet_hdr_size;
> } else if (queue_type == VTNET_CQ) {
> /* Allocate a page for control vq command, data and status */
> snprintf(vq_name, sizeof(vq_name), "port%d_cvq_hdrzone",
> dev->data->port_id);
> - vq->virtio_net_hdr_mz = rte_memzone_reserve_aligned(vq_name,
> - PAGE_SIZE, socket_id, 0, RTE_CACHE_LINE_SIZE);
> - if (vq->virtio_net_hdr_mz == NULL) {
> + hdr_size = PAGE_SIZE;
> + }
> +
> + if (hdr_size) { /* queue_type is VTNET_TQ or VTNET_CQ */
> + mz = rte_memzone_reserve_aligned(vq_name,
> + hdr_size, socket_id, 0, RTE_CACHE_LINE_SIZE);
> + if (mz == NULL) {
> if (rte_errno == EEXIST)
> - vq->virtio_net_hdr_mz =
> - rte_memzone_lookup(vq_name);
> - if (vq->virtio_net_hdr_mz == NULL) {
> + mz = rte_memzone_lookup(vq_name);
> + if (mz == NULL) {
> rte_free(vq);
> return -ENOMEM;
> }
> }
> - vq->virtio_net_hdr_mem =
> - vq->virtio_net_hdr_mz->phys_addr;
> - memset(vq->virtio_net_hdr_mz->addr, 0, PAGE_SIZE);
> + vq->virtio_net_hdr_mz = mz;
> + vq->virtio_net_hdr_vaddr = mz->addr;
> + memset(vq->virtio_net_hdr_vaddr, 0, hdr_size);
> +
> + if (dev->dev_type == RTE_ETH_DEV_PCI)
> + vq->virtio_net_hdr_mem = mz->phys_addr;
> +#ifdef RTE_VIRTIO_VDEV
> + else
> + vq->virtio_net_hdr_mem = (phys_addr_t)mz->addr; /* Use vaddr!!! */
> +#endif
> }
>
> + struct rte_mbuf *m = NULL;
> + if (dev->dev_type == RTE_ETH_DEV_PCI)
> + vq->offset = (uintptr_t)&m->buf_addr;
> +#ifdef RTE_VIRTIO_VDEV
> + else {
> + vq->offset = (uintptr_t)&m->buf_physaddr;
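
Not sure, but shouldn't these two be swapped? Originally, for PCI devices, we
used buf_physaddr, but here the PCI branch takes the offset of buf_addr and the
vdev branch takes the offset of buf_physaddr.
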
> +#if (RTE_BYTE_ORDER == RTE_BIG_ENDIAN) && (__WORDSIZE == 32)
> + vq->offset += 4;
> +#endif
> + }
> +#endif
> /*
> * Set guest physical address of the virtqueue
> * in VIRTIO_PCI_QUEUE_PFN config register of device
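
To make the question about the two offsets in this hunk concrete, here is a minimal sketch of the mapping one would expect if the PCI path is to keep using buf_physaddr and the vdev path is to use the virtual address. Everything in it is an assumption for illustration: struct fake_mbuf is a stand-in with only three fields and is not the real rte_mbuf layout, and fake_dev_type, select_addr_offset() and mbuf_data_dma_addr() are hypothetical names, not code from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for the few mbuf fields involved; not the real rte_mbuf layout. */
struct fake_mbuf {
	void *buf_addr;        /* virtual address of the data buffer */
	uint64_t buf_physaddr; /* physical address of the data buffer */
	uint16_t data_off;     /* offset of the packet data in the buffer */
};

enum fake_dev_type { FAKE_DEV_PCI, FAKE_DEV_VIRTUAL };

/* Expected mapping: physical address for PCI, virtual address for vdev. */
static size_t
select_addr_offset(enum fake_dev_type type)
{
	if (type == FAKE_DEV_PCI)
		return offsetof(struct fake_mbuf, buf_physaddr);
	return offsetof(struct fake_mbuf, buf_addr);
}

/* Read the address stored at the chosen offset and add data_off. */
static uint64_t
mbuf_data_dma_addr(const struct fake_mbuf *mb, size_t offset)
{
	uint64_t base = 0;

	assert(mb != NULL);
	if (offset == offsetof(struct fake_mbuf, buf_addr)) {
		/* Pointer-sized read, so 32-bit builds are handled too. */
		uintptr_t va;

		memcpy(&va, (const char *)mb + offset, sizeof(va));
		base = (uint64_t)va;
	} else {
		memcpy(&base, (const char *)mb + offset, sizeof(base));
	}
	return base + mb->data_off;
}
```

If the vdev case really is meant to read buf_physaddr (for example because that field is filled with a virtual address elsewhere), a comment in the patch saying so would help.
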
> @@ -491,8 +509,10 @@ virtio_dev_close(struct rte_eth_dev *dev)
> PMD_INIT_LOG(DEBUG, "virtio_dev_close");
>
> /* reset the NIC */
> - if (pci_dev->driver->drv_flags & RTE_PCI_DRV_INTR_LSC)
> - vtpci_irq_config(hw, VIRTIO_MSI_NO_VECTOR);
> + if (dev->dev_type == RTE_ETH_DEV_PCI) {
> + if (pci_dev->driver->drv_flags & RTE_PCI_DRV_INTR_LSC)
> + vtpci_irq_config(hw, VIRTIO_MSI_NO_VECTOR);
> + }
> vtpci_reset(hw);
> hw->started = 0;
> virtio_dev_free_mbufs(dev);
> @@ -1233,8 +1253,9 @@ virtio_interrupt_handler(__rte_unused struct rte_intr_handle *handle,
> isr = vtpci_isr(hw);
> PMD_DRV_LOG(INFO, "interrupt status = %#x", isr);
>
> - if (rte_intr_enable(&dev->pci_dev->intr_handle) < 0)
> - PMD_DRV_LOG(ERR, "interrupt enable failed");
> + if (dev->dev_type == RTE_ETH_DEV_PCI)
> + if (rte_intr_enable(&dev->pci_dev->intr_handle) < 0)
> + PMD_DRV_LOG(ERR, "interrupt enable failed");
>
> if (isr & VIRTIO_PCI_ISR_CONFIG) {
> if (virtio_dev_link_update(dev, 0) == 0)
> @@ -1287,11 +1308,18 @@ eth_virtio_dev_init(struct rte_eth_dev *eth_dev)
>
> pci_dev = eth_dev->pci_dev;
>
> - if (virtio_resource_init(pci_dev) < 0)
> - return -1;
> -
> - hw->use_msix = virtio_has_msix(&pci_dev->addr);
> - hw->io_base = (uint32_t)(uintptr_t)pci_dev->mem_resource[0].addr;
> + if (eth_dev->dev_type == RTE_ETH_DEV_PCI) {
> + if (virtio_resource_init(pci_dev) < 0)
> + return -1;
> + hw->use_msix = virtio_has_msix(&pci_dev->addr);
> + hw->io_base = (uint32_t)(uintptr_t)pci_dev->mem_resource[0].addr;
> + }
> +#ifdef RTE_VIRTIO_VDEV
> + else {
> + hw->use_msix = 0;
> + hw->io_base = VIRTIO_VDEV_IO_BASE;
> + }
> +#endif
>
> /* Reset the device although not necessary at startup */
> vtpci_reset(hw);
> @@ -1304,10 +1332,12 @@ eth_virtio_dev_init(struct rte_eth_dev *eth_dev)
> virtio_negotiate_features(hw);
>
> /* If host does not support status then disable LSC */
> - if (!vtpci_with_feature(hw, VIRTIO_NET_F_STATUS))
> - pci_dev->driver->drv_flags &= ~RTE_PCI_DRV_INTR_LSC;
> + if (eth_dev->dev_type == RTE_ETH_DEV_PCI) {
> + if (!vtpci_with_feature(hw, VIRTIO_NET_F_STATUS))
> + pci_dev->driver->drv_flags &= ~RTE_PCI_DRV_INTR_LSC;
>
> - rte_eth_copy_pci_info(eth_dev, pci_dev);
> + rte_eth_copy_pci_info(eth_dev, pci_dev);
> + }
>
> rx_func_get(eth_dev);
>
> @@ -1383,15 +1413,16 @@ eth_virtio_dev_init(struct rte_eth_dev *eth_dev)
>
> PMD_INIT_LOG(DEBUG, "hw->max_rx_queues=%d hw->max_tx_queues=%d",
> hw->max_rx_queues, hw->max_tx_queues);
> - PMD_INIT_LOG(DEBUG, "port %d vendorID=0x%x deviceID=0x%x",
> - eth_dev->data->port_id, pci_dev->id.vendor_id,
> - pci_dev->id.device_id);
> -
> - /* Setup interrupt callback */
> - if (pci_dev->driver->drv_flags & RTE_PCI_DRV_INTR_LSC)
> - rte_intr_callback_register(&pci_dev->intr_handle,
> - virtio_interrupt_handler, eth_dev);
> -
> + if (eth_dev->dev_type == RTE_ETH_DEV_PCI) {
> + PMD_INIT_LOG(DEBUG, "port %d vendorID=0x%x deviceID=0x%x",
> + eth_dev->data->port_id, pci_dev->id.vendor_id,
> + pci_dev->id.device_id);
> +
> + /* Setup interrupt callback */
> + if (pci_dev->driver->drv_flags & RTE_PCI_DRV_INTR_LSC)
> + rte_intr_callback_register(&pci_dev->intr_handle,
> + virtio_interrupt_handler, eth_dev);
> + }
> virtio_dev_cq_start(eth_dev);
>
> return 0;
> @@ -1424,10 +1455,12 @@ eth_virtio_dev_uninit(struct rte_eth_dev *eth_dev)
> eth_dev->data->mac_addrs = NULL;
>
> /* reset interrupt callback */
> - if (pci_dev->driver->drv_flags & RTE_PCI_DRV_INTR_LSC)
> - rte_intr_callback_unregister(&pci_dev->intr_handle,
> - virtio_interrupt_handler,
> - eth_dev);
> + if (eth_dev->dev_type == RTE_ETH_DEV_PCI) {
> + if (pci_dev->driver->drv_flags & RTE_PCI_DRV_INTR_LSC)
> + rte_intr_callback_unregister(&pci_dev->intr_handle,
> + virtio_interrupt_handler,
> + eth_dev);
> + }
>
> PMD_INIT_LOG(DEBUG, "dev_uninit completed");
>
> @@ -1491,11 +1524,13 @@ virtio_dev_configure(struct rte_eth_dev *dev)
> return -ENOTSUP;
> }
>
> - if (pci_dev->driver->drv_flags & RTE_PCI_DRV_INTR_LSC)
> - if (vtpci_irq_config(hw, 0) == VIRTIO_MSI_NO_VECTOR) {
> - PMD_DRV_LOG(ERR, "failed to set config vector");
> - return -EBUSY;
> - }
> + if (dev->dev_type == RTE_ETH_DEV_PCI) {
> + if (pci_dev->driver->drv_flags & RTE_PCI_DRV_INTR_LSC)
> + if (vtpci_irq_config(hw, 0) == VIRTIO_MSI_NO_VECTOR) {
> + PMD_DRV_LOG(ERR, "failed to set config vector");
> + return -EBUSY;
> + }
> + }
>
> return 0;
> }
> @@ -1689,3 +1724,162 @@ static struct rte_driver rte_virtio_driver = {
> };
>
> PMD_REGISTER_DRIVER(rte_virtio_driver);
> +
> +#ifdef RTE_VIRTIO_VDEV
> +
> +static const char *valid_args[] = {
> +#define ETH_CVIO_ARG_RX_NUM "rx"
> + ETH_CVIO_ARG_RX_NUM,
> +#define ETH_CVIO_ARG_TX_NUM "tx"
> + ETH_CVIO_ARG_TX_NUM,
> +#define ETH_CVIO_ARG_CQ_NUM "cq"
> + ETH_CVIO_ARG_CQ_NUM,
> +#define ETH_CVIO_ARG_MAC "mac"
> + ETH_CVIO_ARG_MAC,
> +#define ETH_CVIO_ARG_PATH "path"
> + ETH_CVIO_ARG_PATH,
> +#define ETH_CVIO_ARG_QUEUE_SIZE "queue_num"
> + ETH_CVIO_ARG_QUEUE_SIZE,
> + NULL
> +};
> +
> +static int
> +get_string_arg(const char *key __rte_unused,
> + const char *value, void *extra_args)
> +{
> + if ((value == NULL) || (extra_args == NULL))
> + return -EINVAL;
> +
> + strcpy(extra_args, value);
> +
> + return 0;
> +}
> +
> +static int
> +get_integer_arg(const char *key __rte_unused,
> + const char *value, void *extra_args)
> +{
> + uint64_t *p_u64 = extra_args;
> +
> + if ((value == NULL) || (extra_args == NULL))
> + return -EINVAL;
> +
> + *p_u64 = (uint64_t)strtoull(value, NULL, 0);
> +
> + return 0;
> +}
> +
> +static struct rte_eth_dev *
> +cvio_eth_dev_alloc(const char *name)
> +{
> + struct rte_eth_dev *eth_dev;
> + struct rte_eth_dev_data *data;
> + struct virtio_hw *hw;
> +
> + eth_dev = rte_eth_dev_allocate(name, RTE_ETH_DEV_VIRTUAL);
> + if (eth_dev == NULL)
> + rte_panic("cannot alloc rte_eth_dev\n");
> +
> + data = eth_dev->data;
> +
> + hw = rte_zmalloc(NULL, sizeof(*hw), 0);
> + if (!hw)
> + rte_panic("malloc virtio_hw failed\n");
> +
> + data->dev_private = hw;
> + data->numa_node = SOCKET_ID_ANY;
> + eth_dev->pci_dev = NULL;
> + /* will be used in virtio_dev_info_get() */
> + eth_dev->driver = &rte_virtio_pmd;
> + /* TODO: eth_dev->link_intr_cbs */
> + return eth_dev;
> +}
> +
> +#define CVIO_DEF_CQ_EN 0
> +#define CVIO_DEF_Q_NUM 1
> +#define CVIO_DEF_Q_SZ 256
> +/*
> + * Dev initialization routine. Invoked once for each virtio vdev at
> + * EAL init time, see rte_eal_dev_init().
> + * Returns 0 on success.
> + */
> +static int
> +rte_cvio_pmd_devinit(const char *name, const char *params)
> +{
> + struct rte_kvargs *kvlist = NULL;
> + struct rte_eth_dev *eth_dev = NULL;
> + uint64_t nb_rx = CVIO_DEF_Q_NUM;
> + uint64_t nb_tx = CVIO_DEF_Q_NUM;
> + uint64_t nb_cq = CVIO_DEF_CQ_EN;
> + uint64_t queue_num = CVIO_DEF_Q_SZ;
> + char sock_path[256];
> + char mac_addr[32];
> + int flag_mac = 0;
> +
> + if (params == NULL || params[0] == '\0')
> + rte_panic("arg %s is mandatory for eth_cvio\n",
> + ETH_CVIO_ARG_QUEUE_SIZE);
> +
> + kvlist = rte_kvargs_parse(params, valid_args);
> + if (!kvlist)
> + rte_panic("error when parsing param\n");
> +
> + if (rte_kvargs_count(kvlist, ETH_CVIO_ARG_PATH) == 1)
> + rte_kvargs_process(kvlist, ETH_CVIO_ARG_PATH,
> + &get_string_arg, sock_path);
> + else
> + rte_panic("arg %s is mandatory for eth_cvio\n",
> + ETH_CVIO_ARG_QUEUE_SIZE);
> +
> + if (rte_kvargs_count(kvlist, ETH_CVIO_ARG_MAC) == 1) {
> + rte_kvargs_process(kvlist, ETH_CVIO_ARG_MAC,
> + &get_string_arg, mac_addr);
> + flag_mac = 1;
> + }
> +
> + if (rte_kvargs_count(kvlist, ETH_CVIO_ARG_QUEUE_SIZE) == 1)
> + rte_kvargs_process(kvlist, ETH_CVIO_ARG_QUEUE_SIZE,
> + &get_integer_arg, &queue_num);
> +
> + if (rte_kvargs_count(kvlist, ETH_CVIO_ARG_RX_NUM) == 1)
> + rte_kvargs_process(kvlist, ETH_CVIO_ARG_RX_NUM,
> + &get_integer_arg, &nb_rx);
> +
> + if (rte_kvargs_count(kvlist, ETH_CVIO_ARG_TX_NUM) == 1)
> + rte_kvargs_process(kvlist, ETH_CVIO_ARG_TX_NUM,
> + &get_integer_arg, &nb_tx);
> +
> + if (rte_kvargs_count(kvlist, ETH_CVIO_ARG_CQ_NUM) == 1)
> + rte_kvargs_process(kvlist, ETH_CVIO_ARG_CQ_NUM,
> + &get_integer_arg, &nb_cq);
> +
> + eth_dev = cvio_eth_dev_alloc(name);
> +
> + virtio_vdev_init(eth_dev->data, sock_path,
> + nb_rx, nb_tx, nb_cq, queue_num,
> + (flag_mac) ? mac_addr : NULL);
> +
> + /* originally, this will be called in rte_eal_pci_probe() */
> + eth_virtio_dev_init(eth_dev);
> +
> + return 0;
> +}
> +
> +static int
> +rte_cvio_pmd_devuninit(const char *name)
> +{
> + /* TODO: if it's last one, memory init, free memory */
> + rte_panic("%s: %s", __func__, name);
> + return 0;
> +}
> +
> +static struct rte_driver rte_cvio_driver = {
> + .name = "eth_cvio",
> + .type = PMD_VDEV,
> + .init = rte_cvio_pmd_devinit,
> + .uninit = rte_cvio_pmd_devuninit,
> +};
> +
> +PMD_REGISTER_DRIVER(rte_cvio_driver);
> +
> +#endif
> diff --git a/drivers/net/virtio/virtio_ethdev.h b/drivers/net/virtio/virtio_ethdev.h
> index 9e1ecb3..90890b4 100644
> --- a/drivers/net/virtio/virtio_ethdev.h
> +++ b/drivers/net/virtio/virtio_ethdev.h
> @@ -126,4 +126,5 @@ uint16_t virtio_xmit_pkts_simple(void *tx_queue, struct rte_mbuf
> **tx_pkts,
> void virtio_vdev_init(struct rte_eth_dev_data *data, const char *path,
> int nb_rx, int nb_tx, int nb_cq, int queue_num, char *mac);
> #endif
> +
> #endif /* _VIRTIO_ETHDEV_H_ */
> diff --git a/drivers/net/virtio/virtio_pci.h b/drivers/net/virtio/virtio_pci.h
> index af05ae2..d79bd05 100644
> --- a/drivers/net/virtio/virtio_pci.h
> +++ b/drivers/net/virtio/virtio_pci.h
> @@ -249,31 +249,31 @@ uint32_t virtio_ioport_read(struct virtio_hw *, uint64_t);
> void virtio_ioport_write(struct virtio_hw *, uint64_t, uint32_t);
>
> #define VIRTIO_READ_REG_1(hw, reg) \
> - (hw->io_base != VIRTIO_VDEV_IO_BASE) ? \
> + ((hw->io_base != VIRTIO_VDEV_IO_BASE) ? \
> inb((VIRTIO_PCI_REG_ADDR((hw), (reg)))) \
> - :virtio_ioport_read(hw, reg)
> + :virtio_ioport_read(hw, reg))
> #define VIRTIO_WRITE_REG_1(hw, reg, value) \
> - (hw->io_base != VIRTIO_VDEV_IO_BASE) ? \
> + ((hw->io_base != VIRTIO_VDEV_IO_BASE) ? \
> outb_p((unsigned char)(value), (VIRTIO_PCI_REG_ADDR((hw), (reg)))) \
> - :virtio_ioport_write(hw, reg, value)
> + :virtio_ioport_write(hw, reg, value))
>
> #define VIRTIO_READ_REG_2(hw, reg) \
> - (hw->io_base != VIRTIO_VDEV_IO_BASE) ? \
> + ((hw->io_base != VIRTIO_VDEV_IO_BASE) ? \
> inw((VIRTIO_PCI_REG_ADDR((hw), (reg)))) \
> - :virtio_ioport_read(hw, reg)
> + :virtio_ioport_read(hw, reg))
> #define VIRTIO_WRITE_REG_2(hw, reg, value) \
> - (hw->io_base != VIRTIO_VDEV_IO_BASE) ? \
> + ((hw->io_base != VIRTIO_VDEV_IO_BASE) ? \
> outw_p((unsigned short)(value), (VIRTIO_PCI_REG_ADDR((hw), (reg)))) \
> - :virtio_ioport_write(hw, reg, value)
> + :virtio_ioport_write(hw, reg, value))
>
> #define VIRTIO_READ_REG_4(hw, reg) \
> - (hw->io_base != VIRTIO_VDEV_IO_BASE) ? \
> + ((hw->io_base != VIRTIO_VDEV_IO_BASE) ? \
> inl((VIRTIO_PCI_REG_ADDR((hw), (reg)))) \
> - :virtio_ioport_read(hw, reg)
> + :virtio_ioport_read(hw, reg))
> #define VIRTIO_WRITE_REG_4(hw, reg, value) \
> - (hw->io_base != VIRTIO_VDEV_IO_BASE) ? \
> + ((hw->io_base != VIRTIO_VDEV_IO_BASE) ? \
> outl_p((unsigned int)(value), (VIRTIO_PCI_REG_ADDR((hw), (reg)))) \
> - :virtio_ioport_write(hw, reg, value)
> + :virtio_ioport_write(hw, reg, value))

These parenthesis fixups should be squashed into patch #3 of this series.

>
> #else /* RTE_VIRTIO_VDEV */
>
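
As a side note on why the outer parentheses in the macros above matter: a conditional expression without them absorbs whatever operator follows the macro invocation. The sketch below is a toy reduction, not the driver code; read_pci(), read_vdev() and the READ_REG_* macros are hypothetical stand-ins for the two register access paths.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the two register access paths. */
static uint32_t read_pci(void)  { return 0x10; }
static uint32_t read_vdev(void) { return 0x20; }

/* Without outer parentheses the conditional swallows what follows it. */
#define READ_REG_BAD(is_pci)  (is_pci) ? read_pci() : read_vdev()
/* With outer parentheses the macro behaves like a single value. */
#define READ_REG_GOOD(is_pci) ((is_pci) ? read_pci() : read_vdev())

/* Expands to: (is_pci) ? read_pci() : (read_vdev() & 0x0f) */
static uint32_t
masked_bad(int is_pci)
{
	return READ_REG_BAD(is_pci) & 0x0f;
}

/* Expands to: ((is_pci) ? read_pci() : read_vdev()) & 0x0f */
static uint32_t
masked_good(int is_pci)
{
	return READ_REG_GOOD(is_pci) & 0x0f;
}
```

Here masked_bad(1) returns 0x10 because the mask only binds to the second branch, while masked_good(1) returns 0 as intended.
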
> diff --git a/drivers/net/virtio/virtio_rxtx.c b/drivers/net/virtio/virtio_rxtx.c
> index 74b39ef..dd07ba7 100644
> --- a/drivers/net/virtio/virtio_rxtx.c
> +++ b/drivers/net/virtio/virtio_rxtx.c
> @@ -191,8 +191,7 @@ virtqueue_enqueue_recv_refill(struct virtqueue *vq, struct rte_mbuf
> *cookie)
>
> start_dp = vq->vq_ring.desc;
> start_dp[idx].addr =
> - (uint64_t)(cookie->buf_physaddr + RTE_PKTMBUF_HEADROOM
> - - hw->vtnet_hdr_size);
> + RTE_MBUF_DATA_DMA_ADDR(cookie, vq->offset) - hw->vtnet_hdr_size;
> start_dp[idx].len =
> cookie->buf_len - RTE_PKTMBUF_HEADROOM + hw->vtnet_hdr_size;
> start_dp[idx].flags = VRING_DESC_F_WRITE;
> @@ -237,7 +236,7 @@ virtqueue_enqueue_xmit(struct virtqueue *txvq, struct rte_mbuf *cookie)
>
> for (; ((seg_num > 0) && (cookie != NULL)); seg_num--) {
> idx = start_dp[idx].next;
> - start_dp[idx].addr = RTE_MBUF_DATA_DMA_ADDR(cookie);
> + start_dp[idx].addr = RTE_MBUF_DATA_DMA_ADDR(cookie, txvq->offset);
> start_dp[idx].len = cookie->data_len;
> start_dp[idx].flags = VRING_DESC_F_NEXT;
> cookie = cookie->next;
> @@ -343,7 +342,7 @@ virtio_dev_vring_start(struct virtqueue *vq, int queue_type)
> VIRTIO_WRITE_REG_2(vq->hw, VIRTIO_PCI_QUEUE_SEL,
> vq->vq_queue_index);
> VIRTIO_WRITE_REG_4(vq->hw, VIRTIO_PCI_QUEUE_PFN,
> - vq->mz->phys_addr >> VIRTIO_PCI_QUEUE_ADDR_SHIFT);
> + vq->vq_ring_mem >> VIRTIO_PCI_QUEUE_ADDR_SHIFT);
> } else if (queue_type == VTNET_TQ) {
> if (use_simple_rxtx) {
> int mid_idx = vq->vq_nentries >> 1;
> @@ -366,12 +365,12 @@ virtio_dev_vring_start(struct virtqueue *vq, int queue_type)
> VIRTIO_WRITE_REG_2(vq->hw, VIRTIO_PCI_QUEUE_SEL,
> vq->vq_queue_index);
> VIRTIO_WRITE_REG_4(vq->hw, VIRTIO_PCI_QUEUE_PFN,
> - vq->mz->phys_addr >> VIRTIO_PCI_QUEUE_ADDR_SHIFT);
> + vq->vq_ring_mem >> VIRTIO_PCI_QUEUE_ADDR_SHIFT);
> } else {
> VIRTIO_WRITE_REG_2(vq->hw, VIRTIO_PCI_QUEUE_SEL,
> vq->vq_queue_index);
> VIRTIO_WRITE_REG_4(vq->hw, VIRTIO_PCI_QUEUE_PFN,
> - vq->mz->phys_addr >> VIRTIO_PCI_QUEUE_ADDR_SHIFT);
> + vq->vq_ring_mem >> VIRTIO_PCI_QUEUE_ADDR_SHIFT);
> }
> }
>
> diff --git a/drivers/net/virtio/virtio_rxtx_simple.c
> b/drivers/net/virtio/virtio_rxtx_simple.c
> index ff3c11a..3a14a4e 100644
> --- a/drivers/net/virtio/virtio_rxtx_simple.c
> +++ b/drivers/net/virtio/virtio_rxtx_simple.c
> @@ -80,8 +80,8 @@ virtqueue_enqueue_recv_refill_simple(struct virtqueue *vq,
> vq->sw_ring[desc_idx] = cookie;
>
> start_dp = vq->vq_ring.desc;
> - start_dp[desc_idx].addr = (uint64_t)((uintptr_t)cookie->buf_physaddr +
> - RTE_PKTMBUF_HEADROOM - sizeof(struct virtio_net_hdr));
> + start_dp[desc_idx].addr = RTE_MBUF_DATA_DMA_ADDR(cookie, vq->offset)
> + - sizeof(struct virtio_net_hdr);
> start_dp[desc_idx].len = cookie->buf_len -
> RTE_PKTMBUF_HEADROOM + sizeof(struct virtio_net_hdr);
>
> @@ -118,9 +118,8 @@ virtio_rxq_rearm_vec(struct virtqueue *rxvq)
> p = (uintptr_t)&sw_ring[i]->rearm_data;
> *(uint64_t *)p = rxvq->mbuf_initializer;
>
> - start_dp[i].addr =
> - (uint64_t)((uintptr_t)sw_ring[i]->buf_physaddr +
> - RTE_PKTMBUF_HEADROOM - sizeof(struct virtio_net_hdr));
> + start_dp[i].addr = RTE_MBUF_DATA_DMA_ADDR(sw_ring[i], rxvq->offset)
> + - sizeof(struct virtio_net_hdr);
> start_dp[i].len = sw_ring[i]->buf_len -
> RTE_PKTMBUF_HEADROOM + sizeof(struct virtio_net_hdr);
> }
> @@ -366,7 +365,7 @@ virtio_xmit_pkts_simple(void *tx_queue, struct rte_mbuf **tx_pkts,
> txvq->vq_descx[desc_idx + i].cookie = tx_pkts[i];
> for (i = 0; i < nb_tail; i++) {
> start_dp[desc_idx].addr =
> - RTE_MBUF_DATA_DMA_ADDR(*tx_pkts);
> + RTE_MBUF_DATA_DMA_ADDR(*tx_pkts, txvq->offset);
> start_dp[desc_idx].len = (*tx_pkts)->pkt_len;
> tx_pkts++;
> desc_idx++;
> @@ -377,7 +376,8 @@ virtio_xmit_pkts_simple(void *tx_queue, struct rte_mbuf **tx_pkts,
> for (i = 0; i < nb_commit; i++)
> txvq->vq_descx[desc_idx + i].cookie = tx_pkts[i];
> for (i = 0; i < nb_commit; i++) {
> - start_dp[desc_idx].addr = RTE_MBUF_DATA_DMA_ADDR(*tx_pkts);
> + start_dp[desc_idx].addr = RTE_MBUF_DATA_DMA_ADDR(*tx_pkts,
> + txvq->offset);
> start_dp[desc_idx].len = (*tx_pkts)->pkt_len;
> tx_pkts++;
> desc_idx++;
> diff --git a/drivers/net/virtio/virtqueue.h b/drivers/net/virtio/virtqueue.h
> index 61b3137..dc0b656 100644
> --- a/drivers/net/virtio/virtqueue.h
> +++ b/drivers/net/virtio/virtqueue.h
> @@ -66,8 +66,14 @@ struct rte_mbuf;
>
> #define VIRTQUEUE_MAX_NAME_SZ 32
>
> -#define RTE_MBUF_DATA_DMA_ADDR(mb) \
> +#ifdef RTE_VIRTIO_VDEV
> +#define RTE_MBUF_DATA_DMA_ADDR(mb, offset) \
> + (uint64_t)((uintptr_t)(*(void **)((uintptr_t)mb + offset)) \
> + + (mb)->data_off)
> +#else
> +#define RTE_MBUF_DATA_DMA_ADDR(mb, offset) \
> (uint64_t) ((mb)->buf_physaddr + (mb)->data_off)
> +#endif /* RTE_VIRTIO_VDEV */
>
> #define VTNET_SQ_RQ_QUEUE_IDX 0
> #define VTNET_SQ_TQ_QUEUE_IDX 1
> @@ -167,7 +173,8 @@ struct virtqueue {
>
> void *vq_ring_virt_mem; /**< linear address of vring*/
> unsigned int vq_ring_size;
> - phys_addr_t vq_ring_mem; /**< physical address of vring */
> + phys_addr_t vq_ring_mem; /**< phys address of vring for pci dev,
> + virt addr of vring for vdev */
>
> struct vring vq_ring; /**< vring keeping desc, used and avail */
> uint16_t vq_free_cnt; /**< num of desc available */
> @@ -186,8 +193,10 @@ struct virtqueue {
> */
> uint16_t vq_used_cons_idx;
> uint16_t vq_avail_idx;
> + uint16_t offset; /**< relative offset to obtain addr in mbuf */
> uint64_t mbuf_initializer; /**< value to init mbufs. */
> phys_addr_t virtio_net_hdr_mem; /**< hdr for each xmit packet */
> + void *virtio_net_hdr_vaddr; /**< linear address of vring*/
>
> struct rte_mbuf **sw_ring; /**< RX software ring. */
> /* dummy mbuf, for wraparound when processing RX ring. */
> --
> 2.1.4
Kind regards,
Pavel Fedin
Senior Engineer
Samsung Electronics Research center Russia