From: Wenwu Ma <wenwux.ma@intel.com>
To: dev@dpdk.org
Cc: maxime.coquelin@redhat.com, chenbo.xia@intel.com, cheng1.jiang@intel.com,
	Wenwu Ma <wenwux.ma@intel.com>
Date: Fri, 18 Jun 2021 20:03:05 +0000
Message-Id: <20210618200305.662515-5-wenwux.ma@intel.com>
In-Reply-To: <20210618200305.662515-1-wenwux.ma@intel.com>
References: <20210602083110.5530-1-yuanx.wang@intel.com>
	<20210618200305.662515-1-wenwux.ma@intel.com>
Subject: [dpdk-dev] [PATCH v2 4/4] examples/vhost: support vhost async dequeue data path

This patch adds the vhost async dequeue data path to the vhost sample.
The vswitch can leverage IOAT to accelerate the vhost async dequeue
data path.

Signed-off-by: Wenwu Ma <wenwux.ma@intel.com>
---
 doc/guides/sample_app_ug/vhost.rst |   9 +-
 examples/vhost/ioat.c              |  34 +++++--
 examples/vhost/ioat.h              |   4 +
 examples/vhost/main.c              | 141 ++++++++++++++++++++---------
 examples/vhost/main.h              |   5 +
 5 files changed, 140 insertions(+), 53 deletions(-)

diff --git a/doc/guides/sample_app_ug/vhost.rst b/doc/guides/sample_app_ug/vhost.rst
index 9afde9c7f5..63dcf181e1 100644
--- a/doc/guides/sample_app_ug/vhost.rst
+++ b/doc/guides/sample_app_ug/vhost.rst
@@ -169,9 +169,12 @@ demonstrates how to use the async vhost APIs. It's used in combination with dmas
 **--dmas**
 This parameter is used to specify the assigned DMA device of a vhost device.
 Async vhost-user net driver will be used if --dmas is set. For example
---dmas [txd0@00:04.0,txd1@00:04.1] means use DMA channel 00:04.0 for vhost
-device 0 enqueue operation and use DMA channel 00:04.1 for vhost device 1
-enqueue operation.
+--dmas [txd0@00:04.0,txd1@00:04.1,rxd0@00:04.2,rxd1@00:04.3] means use
+DMA channel 00:04.0/00:04.2 for vhost device 0 enqueue/dequeue operations
+and DMA channel 00:04.1/00:04.3 for vhost device 1 enqueue/dequeue
+operations. The device index corresponds to the socket file order:
+vhost device 0 is created through the first socket file, vhost device 1
+through the second socket file, and so on.
 
 Common Issues
 -------------
diff --git a/examples/vhost/ioat.c b/examples/vhost/ioat.c
index bf4e033bdb..179ae87deb 100644
--- a/examples/vhost/ioat.c
+++ b/examples/vhost/ioat.c
@@ -21,6 +21,8 @@ struct packet_tracker {
 
 struct packet_tracker cb_tracker[MAX_VHOST_DEVICE];
 
+int vid2txd[MAX_VHOST_DEVICE];
+
 int
 open_ioat(const char *value)
 {
@@ -60,6 +62,8 @@ open_ioat(const char *value)
 		goto out;
 	}
 	while (i < args_nr) {
+		char *txd, *rxd;
+		bool is_txd;
 		char *arg_temp = dma_arg[i];
 		uint8_t sub_nr;
 		sub_nr = rte_strsplit(arg_temp, strlen(arg_temp), ptrs, 2, '@');
@@ -68,10 +72,20 @@ open_ioat(const char *value)
 			goto out;
 		}
 
-		start = strstr(ptrs[0], "txd");
-		if (start == NULL) {
+		int async_flag;
+		txd = strstr(ptrs[0], "txd");
+		rxd = strstr(ptrs[0], "rxd");
+		if (txd == NULL && rxd == NULL) {
 			ret = -1;
 			goto out;
+		} else if (txd) {
+			is_txd = true;
+			start = txd;
+			async_flag = ASYNC_RX_VHOST;
+		} else {
+			is_txd = false;
+			start = rxd;
+			async_flag = ASYNC_TX_VHOST;
 		}
 
 		start += 3;
@@ -81,7 +95,8 @@ open_ioat(const char *value)
 			goto out;
 		}
 
-		vring_id = 0 + VIRTIO_RXQ;
+		vring_id = is_txd ? VIRTIO_RXQ : VIRTIO_TXQ;
+
 		if (rte_pci_addr_parse(ptrs[1],
 			&(dma_info + vid)->dmas[vring_id].addr) < 0) {
 			ret = -1;
@@ -105,6 +120,7 @@
 		(dma_info + vid)->dmas[vring_id].dev_id = dev_id;
 		(dma_info + vid)->dmas[vring_id].is_valid = true;
+		(dma_info + vid)->async_flag |= async_flag;
 		config.ring_size = IOAT_RING_SIZE;
 		config.hdls_disable = true;
 		if (rte_rawdev_configure(dev_id, &info, sizeof(config)) < 0) {
@@ -126,13 +142,16 @@ ioat_transfer_data_cb(int vid, uint16_t queue_id,
 		struct rte_vhost_async_status *opaque_data, uint16_t count)
 {
 	uint32_t i_desc;
-	uint16_t dev_id = dma_bind[vid].dmas[queue_id * 2 + VIRTIO_RXQ].dev_id;
 	struct rte_vhost_iov_iter *src = NULL;
 	struct rte_vhost_iov_iter *dst = NULL;
 	unsigned long i_seg;
 	unsigned short mask = MAX_ENQUEUED_SIZE - 1;
-	unsigned short write = cb_tracker[dev_id].next_write;
 
+	if (queue_id >= MAX_RING_COUNT)
+		return -1;
+
+	uint16_t dev_id = dma_bind[vid2txd[vid]].dmas[queue_id].dev_id;
+	unsigned short write = cb_tracker[dev_id].next_write;
 	if (!opaque_data) {
 		for (i_desc = 0; i_desc < count; i_desc++) {
 			src = descs[i_desc].src;
@@ -170,7 +189,7 @@ ioat_check_completed_copies_cb(int vid, uint16_t queue_id,
 		struct rte_vhost_async_status *opaque_data,
 		uint16_t max_packets)
 {
-	if (!opaque_data) {
+	if (!opaque_data && (queue_id < MAX_RING_COUNT)) {
 		uintptr_t dump[255];
 		int n_seg;
 		unsigned short read, write;
@@ -178,8 +197,7 @@ ioat_check_completed_copies_cb(int vid, uint16_t queue_id,
 		unsigned short mask = MAX_ENQUEUED_SIZE - 1;
 		unsigned short i;
 
-		uint16_t dev_id = dma_bind[vid].dmas[queue_id * 2
-				+ VIRTIO_RXQ].dev_id;
+		uint16_t dev_id = dma_bind[vid2txd[vid]].dmas[queue_id].dev_id;
 		n_seg = rte_ioat_completed_ops(dev_id, 255, NULL, NULL, dump, dump);
 		if (n_seg < 0) {
 			RTE_LOG(ERR,
diff --git a/examples/vhost/ioat.h b/examples/vhost/ioat.h
index 1aa28ed6a3..c3d5c2344a 100644
--- a/examples/vhost/ioat.h
+++ b/examples/vhost/ioat.h
@@ -12,6 +12,9 @@
 #define MAX_VHOST_DEVICE 1024
 #define IOAT_RING_SIZE 4096
 #define MAX_ENQUEUED_SIZE 4096
+#define MAX_RING_COUNT 2
+#define ASYNC_RX_VHOST 1
+#define ASYNC_TX_VHOST 2
 
 struct dma_info {
 	struct rte_pci_addr addr;
@@ -20,6 +23,7 @@ struct dma_info {
 };
 
 struct dma_for_vhost {
+	int async_flag;
 	struct dma_info dmas[RTE_MAX_QUEUES_PER_PORT * 2];
 	uint16_t nr;
 };
diff --git a/examples/vhost/main.c b/examples/vhost/main.c
index aebdc3a566..8a65e525ff 100644
--- a/examples/vhost/main.c
+++ b/examples/vhost/main.c
@@ -25,7 +25,6 @@
 #include <rte_tcp.h>
 #include <rte_pause.h>
 
-#include "ioat.h"
 #include "main.h"
 
 #ifndef MAX_QUEUES
@@ -93,8 +92,6 @@ static int client_mode;
 
 static int builtin_net_driver;
 
-static int async_vhost_driver;
-
 static char *dma_type;
 
 /* Specify timeout (in useconds) between retries on RX. */
@@ -679,7 +676,6 @@ us_vhost_parse_args(int argc, char **argv)
 				us_vhost_usage(prgname);
 				return -1;
 			}
-			async_vhost_driver = 1;
 			break;
 
 		case OPT_CLIENT_NUM:
@@ -897,7 +893,7 @@ drain_vhost(struct vhost_dev *vdev)
 				__ATOMIC_SEQ_CST);
 	}
 
-	if (!async_vhost_driver)
+	if ((dma_bind[vid2txd[vdev->vid]].async_flag & ASYNC_RX_VHOST) == 0)
 		free_pkts(m, nr_xmit);
 }
 
@@ -1237,10 +1233,19 @@ drain_eth_rx(struct vhost_dev *vdev)
 				__ATOMIC_SEQ_CST);
 	}
 
-	if (!async_vhost_driver)
+	if ((dma_bind[vid2txd[vdev->vid]].async_flag & ASYNC_RX_VHOST) == 0)
 		free_pkts(pkts, rx_count);
 }
 
+uint16_t async_dequeue_pkts(struct vhost_dev *dev, uint16_t queue_id,
+			struct rte_mempool *mbuf_pool,
+			struct rte_mbuf **pkts, uint16_t count)
+{
+	int nr_inflight;
+	return rte_vhost_async_try_dequeue_burst(dev->vid, queue_id,
+			mbuf_pool, pkts, count, &nr_inflight);
+}
+
 uint16_t sync_dequeue_pkts(struct vhost_dev *dev, uint16_t queue_id,
 			struct rte_mempool *mbuf_pool,
 			struct rte_mbuf **pkts, uint16_t count)
@@ -1392,12 +1397,90 @@ destroy_device(int vid)
 		"(%d) device has been removed from data core\n",
 		vdev->vid);
 
-	if (async_vhost_driver)
+	if (dma_bind[vid2txd[vid]].async_flag & ASYNC_RX_VHOST)
 		rte_vhost_async_channel_unregister(vid, VIRTIO_RXQ);
+	if (dma_bind[vid2txd[vid]].async_flag & ASYNC_TX_VHOST)
+		rte_vhost_async_channel_unregister(vid, VIRTIO_TXQ);
 
 	rte_free(vdev);
 }
 
+static int
+get_txd_id(int vid)
+{
+	int i;
+	char ifname[PATH_MAX];
+	rte_vhost_get_ifname(vid, ifname, sizeof(ifname));
+
+	for (i = 0; i < nb_sockets; i++) {
+		char *file = socket_files + i * PATH_MAX;
+		if (strcmp(file, ifname) == 0)
+			return i;
+	}
+
+	return -1;
+}
+
+static int
+init_vhost_queue_ops(int vid)
+{
+	int i = get_txd_id(vid);
+	if (i == -1)
+		return -1;
+
+	vid2txd[vid] = i;
+	if (builtin_net_driver) {
+		vdev_queue_ops[vid].enqueue_pkt_burst = builtin_enqueue_pkts;
+		vdev_queue_ops[vid].dequeue_pkt_burst = builtin_dequeue_pkts;
+	} else {
+		if (dma_bind[i].async_flag & ASYNC_RX_VHOST) {
+			vdev_queue_ops[vid].enqueue_pkt_burst =
+						async_enqueue_pkts;
+		} else {
+			vdev_queue_ops[vid].enqueue_pkt_burst =
+						sync_enqueue_pkts;
+		}
+
+		if (dma_bind[i].async_flag & ASYNC_TX_VHOST) {
+			vdev_queue_ops[vid].dequeue_pkt_burst =
+						async_dequeue_pkts;
+		} else {
+			vdev_queue_ops[vid].dequeue_pkt_burst =
+						sync_dequeue_pkts;
+		}
+	}
+
+	return 0;
+}
+
+static int
+vhost_async_channel_register(int vid)
+{
+	int ret = 0;
+	struct rte_vhost_async_features f;
+	struct rte_vhost_async_channel_ops channel_ops;
+
+	if (dma_type != NULL && strncmp(dma_type, "ioat", 4) == 0) {
+		channel_ops.transfer_data = ioat_transfer_data_cb;
+		channel_ops.check_completed_copies =
+			ioat_check_completed_copies_cb;
+
+		f.async_inorder = 1;
+		f.async_threshold = 256;
+
+		if (dma_bind[vid2txd[vid]].async_flag & ASYNC_RX_VHOST) {
+			ret |= rte_vhost_async_channel_register(vid, VIRTIO_RXQ,
+					f.intval, &channel_ops);
+		}
+		if (dma_bind[vid2txd[vid]].async_flag & ASYNC_TX_VHOST) {
+			ret |= rte_vhost_async_channel_register(vid, VIRTIO_TXQ,
+					f.intval, &channel_ops);
+		}
+	}
+
+	return ret;
+}
+
 /*
  * A new device is added to a data core. First the device is added to the main linked list
  * and then allocated to a specific data core.
@@ -1431,20 +1514,8 @@ new_device(int vid)
 		}
 	}
 
-	if (builtin_net_driver) {
-		vdev_queue_ops[vid].enqueue_pkt_burst = builtin_enqueue_pkts;
-		vdev_queue_ops[vid].dequeue_pkt_burst = builtin_dequeue_pkts;
-	} else {
-		if (async_vhost_driver) {
-			vdev_queue_ops[vid].enqueue_pkt_burst =
-						async_enqueue_pkts;
-		} else {
-			vdev_queue_ops[vid].enqueue_pkt_burst =
-						sync_enqueue_pkts;
-		}
-
-		vdev_queue_ops[vid].dequeue_pkt_burst = sync_dequeue_pkts;
-	}
+	if (init_vhost_queue_ops(vid) != 0)
+		return -1;
 
 	if (builtin_net_driver)
 		vs_vhost_net_setup(vdev);
@@ -1473,28 +1544,13 @@ new_device(int vid)
 	rte_vhost_enable_guest_notification(vid, VIRTIO_RXQ, 0);
 	rte_vhost_enable_guest_notification(vid, VIRTIO_TXQ, 0);
 
+	int ret = vhost_async_channel_register(vid);
+
 	RTE_LOG(INFO, VHOST_DATA,
 		"(%d) device has been added to data core %d\n",
 		vid, vdev->coreid);
 
-	if (async_vhost_driver) {
-		struct rte_vhost_async_features f;
-		struct rte_vhost_async_channel_ops channel_ops;
-
-		if (dma_type != NULL && strncmp(dma_type, "ioat", 4) == 0) {
-			channel_ops.transfer_data = ioat_transfer_data_cb;
-			channel_ops.check_completed_copies =
-				ioat_check_completed_copies_cb;
-
-			f.async_inorder = 1;
-			f.async_threshold = 256;
-
-			return rte_vhost_async_channel_register(vid, VIRTIO_RXQ,
-				f.intval, &channel_ops);
-		}
-	}
-
-	return 0;
+	return ret;
 }
 
 /*
@@ -1735,10 +1791,11 @@ main(int argc, char *argv[])
 
 	for (i = 0; i < nb_sockets; i++) {
 		char *file = socket_files + i * PATH_MAX;
-		if (async_vhost_driver)
-			flags = flags | RTE_VHOST_USER_ASYNC_COPY;
+		uint64_t flag = flags;
+		if (dma_bind[i].async_flag != 0)
+			flag |= RTE_VHOST_USER_ASYNC_COPY;
 
-		ret = rte_vhost_driver_register(file, flags);
+		ret = rte_vhost_driver_register(file, flag);
 		if (ret != 0) {
 			unregister_drivers(i);
 			rte_exit(EXIT_FAILURE,
diff --git a/examples/vhost/main.h b/examples/vhost/main.h
index 7cd8a11a45..5a892ed08d 100644
--- a/examples/vhost/main.h
+++ b/examples/vhost/main.h
@@ -9,6 +9,8 @@
 
 #include <sys/queue.h>
 
+#include "ioat.h"
+
 /* Macros for printing using RTE_LOG */
 #define RTE_LOGTYPE_VHOST_CONFIG RTE_LOGTYPE_USER1
 #define RTE_LOGTYPE_VHOST_DATA RTE_LOGTYPE_USER2
@@ -18,6 +20,9 @@ enum {VIRTIO_RXQ, VIRTIO_TXQ, VIRTIO_QNUM};
 
 #define MAX_PKT_BURST 32		/* Max burst size for RX/TX */
 
+extern struct dma_for_vhost dma_bind[MAX_VHOST_DEVICE];
+extern int vid2txd[MAX_VHOST_DEVICE];
+
 struct device_statistics {
 	uint64_t tx;
 	uint64_t tx_total;
-- 
2.25.1
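
Note for reviewers: the fragment below is illustrative only and is not part
of the patch. It sketches how a forwarding core consumes the per-device ops
table that init_vhost_queue_ops() fills in; vdev_queue_ops, VIRTIO_TXQ,
MAX_PKT_BURST and free_pkts() are the sample's own names, while the enclosing
function is a hypothetical stand-in for the sample's drain logic.

	static void
	drain_virtio_tx_once(struct vhost_dev *vdev, struct rte_mempool *mbuf_pool)
	{
		struct rte_mbuf *pkts[MAX_PKT_BURST];
		uint16_t count;

		/*
		 * Resolves to async_dequeue_pkts() when an rxd DMA channel
		 * was bound to this device via --dmas, and to
		 * sync_dequeue_pkts() otherwise.
		 */
		count = vdev_queue_ops[vdev->vid].dequeue_pkt_burst(vdev,
				VIRTIO_TXQ, mbuf_pool, pkts, MAX_PKT_BURST);

		/* A real switch would forward these; the sketch simply drops them. */
		if (count)
			free_pkts(pkts, count);
	}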
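
To exercise both the enqueue and the dequeue data paths, the sample can be
launched with both txd and rxd entries in --dmas, along the lines of the
command below. The cores, socket paths, and IOAT PCI addresses are
placeholders following the documentation example above, and the binary name
depends on the build setup:

	./dpdk-vhost -l 1-3 -n 4 -- \
		--socket-file /tmp/sock0 --socket-file /tmp/sock1 \
		--dma-type ioat \
		--dmas [txd0@00:04.0,txd1@00:04.1,rxd0@00:04.2,rxd1@00:04.3] \
		--client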