To: "Liu, Yong", "Fu, Patrick"
Cc: "Jiang, Cheng1", "Liang, Cunming", "dev@dpdk.org", "Xia, Chenbo", "Wang, Zhihong", "Ye, Xiaolong"
References: <1591869725-13331-1-git-send-email-patrick.fu@intel.com> <1591869725-13331-2-git-send-email-patrick.fu@intel.com> <86228AFD5BCD8E4EBFD2B90117B5E81E635F601B@SHSMSX103.ccr.corp.intel.com>
From: Maxime Coquelin
Date: Thu, 25 Jun 2020 15:42:28 +0200
In-Reply-To: <86228AFD5BCD8E4EBFD2B90117B5E81E635F601B@SHSMSX103.ccr.corp.intel.com>
Subject: Re: [dpdk-dev] [PATCH v1 1/2] vhost: introduce async data path registration API
List-Id: DPDK patches and discussions
Sender: "dev"

On 6/18/20 7:50 AM, Liu, Yong wrote:
> Thanks, Patrick. So comments are inline.
>
>> -----Original Message-----
>> From: dev On Behalf Of patrick.fu@intel.com
>> Sent: Thursday, June 11, 2020 6:02 PM
>> To: dev@dpdk.org; maxime.coquelin@redhat.com; Xia, Chenbo;
>> Wang, Zhihong; Ye, Xiaolong
>> Cc: Fu, Patrick; Jiang, Cheng1; Liang, Cunming
>> Subject: [dpdk-dev] [PATCH v1 1/2] vhost: introduce async data path
>> registration API
>>
>> From: Patrick
>>
>> This patch introduces registration/un-registration APIs
>> for async data path together with all required data
>> structures and DMA callback function proto-types.
>>
>> Signed-off-by: Patrick
>> ---
>>  lib/librte_vhost/Makefile          |   3 +-
>>  lib/librte_vhost/rte_vhost.h       |   1 +
>>  lib/librte_vhost/rte_vhost_async.h | 134 +++++++++++++++++++++++++++++++++++++
>>  lib/librte_vhost/socket.c          |  20 ++++++
>>  lib/librte_vhost/vhost.c           |  74 +++++++++++++++++++-
>>  lib/librte_vhost/vhost.h           |  30 ++++++++-
>>  lib/librte_vhost/vhost_user.c      |  28 ++++++--
>>  7 files changed, 283 insertions(+), 7 deletions(-)
>>  create mode 100644 lib/librte_vhost/rte_vhost_async.h
>>
>> diff --git a/lib/librte_vhost/Makefile b/lib/librte_vhost/Makefile
>> index e592795..3aed094 100644
>> --- a/lib/librte_vhost/Makefile
>> +++ b/lib/librte_vhost/Makefile
>> @@ -41,7 +41,8 @@ SRCS-$(CONFIG_RTE_LIBRTE_VHOST) := fd_man.c iotlb.c socket.c vhost.c \
>>  					vhost_user.c virtio_net.c vdpa.c
>>
>>  # install includes
>> -SYMLINK-$(CONFIG_RTE_LIBRTE_VHOST)-include += rte_vhost.h rte_vdpa.h
>> +SYMLINK-$(CONFIG_RTE_LIBRTE_VHOST)-include += rte_vhost.h rte_vdpa.h \
>> +						rte_vhost_async.h
>>
> Hi Patrick,
> Please also update meson build for newly added file.
>
> Thanks,
> Marvin
>
>>  # only compile vhost crypto when cryptodev is enabled
>>  ifeq ($(CONFIG_RTE_LIBRTE_CRYPTODEV),y)
>> diff --git a/lib/librte_vhost/rte_vhost.h b/lib/librte_vhost/rte_vhost.h
>> index d43669f..cec4d07 100644
>> --- a/lib/librte_vhost/rte_vhost.h
>> +++ b/lib/librte_vhost/rte_vhost.h
>> @@ -35,6 +35,7 @@
>>  #define RTE_VHOST_USER_EXTBUF_SUPPORT	(1ULL << 5)
>>  /* support only linear buffers (no chained mbufs) */
>>  #define RTE_VHOST_USER_LINEARBUF_SUPPORT	(1ULL << 6)
>> +#define RTE_VHOST_USER_ASYNC_COPY	(1ULL << 7)
>>
>>  /** Protocol features. */
>>  #ifndef VHOST_USER_PROTOCOL_F_MQ
>> diff --git a/lib/librte_vhost/rte_vhost_async.h b/lib/librte_vhost/rte_vhost_async.h
>> new file mode 100644
>> index 0000000..82f2ebe
>> --- /dev/null
>> +++ b/lib/librte_vhost/rte_vhost_async.h
>> @@ -0,0 +1,134 @@
>> +/* SPDX-License-Identifier: BSD-3-Clause
>> + * Copyright(c) 2018 Intel Corporation
>> + */
>
> s/2018/2020/
>
>> +
>> +#ifndef _RTE_VHOST_ASYNC_H_
>> +#define _RTE_VHOST_ASYNC_H_
>> +
>> +#include "rte_vhost.h"
>> +
>> +/**
>> + * iovec iterator
>> + */
>> +struct iov_it {
>> +	/** offset to the first byte of interesting data */
>> +	size_t offset;
>> +	/** total bytes of data in this iterator */
>> +	size_t count;
>> +	/** pointer to the iovec array */
>> +	struct iovec *iov;
>> +	/** number of iovec in this iterator */
>> +	unsigned long nr_segs;
>> +};
>
> Patrick,
> I think a structure named "it" is too generic to understand; please use a more meaningful name like "iov_iter".

I would also add that, being part of the Vhost API, it needs to be prefixed
with rte_vhost_. This comment applies to all structures in this header.
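
To illustrate the naming point, here is a minimal sketch of how the prefixed definitions could look. The names rte_vhost_iov_iter, rte_vhost_async_desc and rte_vhost_async_status are only suggestions, and the helper at the end is hypothetical (it is not in the patch and is only there to exercise the struct):

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h>

/* Suggested names only: same fields as in the patch, with the
 * rte_vhost_ prefix applied. The final names are up to the author. */
struct rte_vhost_iov_iter {
	size_t offset;		/* offset to the first byte of interesting data */
	size_t count;		/* total bytes of data in this iterator */
	struct iovec *iov;	/* pointer to the iovec array */
	unsigned long nr_segs;	/* number of iovec in this iterator */
};

struct rte_vhost_async_desc {
	struct rte_vhost_iov_iter *src;	/* source memory iterator */
	struct rte_vhost_iov_iter *dst;	/* destination memory iterator */
};

struct rte_vhost_async_status {
	uintptr_t *src_opaque_data;
	uintptr_t *dst_opaque_data;
};

/* Hypothetical helper, not part of the patch: sum the segment lengths. */
static size_t
rte_vhost_iov_iter_seg_bytes(const struct rte_vhost_iov_iter *it)
{
	size_t total = 0;
	unsigned long i;

	for (i = 0; i < it->nr_segs; i++)
		total += it->iov[i].iov_len;
	return total;
}
```

With the prefix in place, applications including this header are far less likely to clash with their own iov_it or dma_* definitions.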
>> +
>> +/**
>> + * dma transfer descriptor pair
>> + */
>> +struct dma_trans_desc {
>> +	/** source memory iov_it */
>> +	struct iov_it *src;
>> +	/** destination memory iov_it */
>> +	struct iov_it *dst;
>> +};
>> +
>
> This patch series is named async copy, and DMA is just one async copy method, supplied by the underlying hardware.
> IMHO, the structure would be better named "async_copy_desc", which matches the overall concept.
>
>> +/**
>> + * dma transfer status
>> + */
>> +struct dma_trans_status {
>> +	/** An array of application specific data for source memory */
>> +	uintptr_t *src_opaque_data;
>> +	/** An array of application specific data for destination memory */
>> +	uintptr_t *dst_opaque_data;
>> +};
>> +
> Same as previous comment.
>
>> +/**
>> + * dma operation callbacks to be implemented by applications
>> + */
>> +struct rte_vhost_async_channel_ops {
>> +	/**
>> +	 * instruct a DMA channel to perform copies for a batch of packets
>> +	 *
>> +	 * @param vid
>> +	 *  id of vhost device to perform data copies
>> +	 * @param queue_id
>> +	 *  queue id to perform data copies
>> +	 * @param descs
>> +	 *  an array of DMA transfer memory descriptors
>> +	 * @param opaque_data
>> +	 *  opaque data pair sending to DMA engine
>> +	 * @param count
>> +	 *  number of elements in the "descs" array
>> +	 * @return
>> +	 *  -1 on failure, number of descs processed on success
>> +	 */
>> +	int (*transfer_data)(int vid, uint16_t queue_id,
>> +		struct dma_trans_desc *descs,
>> +		struct dma_trans_status *opaque_data,
>> +		uint16_t count);
>> +	/**
>> +	 * check copy-completed packets from a DMA channel
>> +	 * @param vid
>> +	 *  id of vhost device to check copy completion
>> +	 * @param queue_id
>> +	 *  queue id to check copyp completion
>> +	 * @param opaque_data
>> +	 *  buffer to receive the opaque data pair from DMA engine
>> +	 * @param max_packets
>> +	 *  max number of packets could be completed
>> +	 * @return
>> +	 *  -1 on failure, number of iov segments completed on success
>> +	 */
>> +	int (*check_completed_copies)(int vid, uint16_t queue_id,
>> +		struct dma_trans_status *opaque_data,
>> +		uint16_t max_packets);
>> +};
>> +
>> +/**
>> + * dma channel feature bit definition
>> + */
>> +struct dma_channel_features {
>> +	union {
>> +		uint32_t intval;
>> +		struct {
>> +			uint32_t inorder:1;
>> +			uint32_t resvd0115:15;
>> +			uint32_t threshold:12;
>> +			uint32_t resvd2831:4;
>> +		};
>> +	};
>> +};
>> +
>
> Naming the feature bits "intval" may cause confusion; why not name the field after its meaning, like "engine_features"?
> I'm not sure whether the format "resvd0115" matches the DPDK coding style. As far as I remember, DPDK uses resvd_0 and resvd_1 for two reserved elements.
>
>> +/**
>> + * register a dma channel for vhost
>> + *
>> + * @param vid
>> + *  vhost device id DMA channel to be attached to
>> + * @param queue_id
>> + *  vhost queue id DMA channel to be attached to
>> + * @param features
>> + *  DMA channel feature bit
>> + *    b0       : DMA supports inorder data transfer
>> + *    b1  - b15: reserved
>> + *    b16 - b27: Packet length threshold for DMA transfer
>> + *    b28 - b31: reserved
>> + * @param ops
>> + *  DMA operation callbacks
>> + * @return
>> + *  0 on success, -1 on failures
>> + */
>> +int rte_vhost_async_channel_register(int vid, uint16_t queue_id,
>> +	uint32_t features, struct rte_vhost_async_channel_ops *ops);
>> +
>> +/**
>> + * unregister a dma channel for vhost
>> + *
>> + * @param vid
>> + *  vhost device id DMA channel to be detached
>> + * @param queue_id
>> + *  vhost queue id DMA channel to be detached
>> + * @return
>> + *  0 on success, -1 on failures
>> + */
>> +int rte_vhost_async_channel_unregister(int vid, uint16_t queue_id);
>> +
>> +#endif /* _RTE_VDPA_H_ */
>> diff --git a/lib/librte_vhost/socket.c b/lib/librte_vhost/socket.c
>> index 0a66ef9..f817783 100644
>> --- a/lib/librte_vhost/socket.c
>> +++ b/lib/librte_vhost/socket.c
>> @@ -42,6 +42,7 @@ struct vhost_user_socket {
>>  	bool use_builtin_virtio_net;
>>  	bool extbuf;
>>  	bool linearbuf;
>> +	bool async_copy;
>>
>>  	/*
>>  	 * The "supported_features" indicates the feature bits the
>> @@ -210,6 +211,7 @@ struct vhost_user {
>>  	size_t size;
>>  	struct vhost_user_connection *conn;
>>  	int ret;
>> +	struct virtio_net *dev;
>>
>>  	if (vsocket == NULL)
>>  		return;
>> @@ -241,6 +243,13 @@ struct vhost_user {
>>  	if (vsocket->linearbuf)
>>  		vhost_enable_linearbuf(vid);
>>
>> +	if (vsocket->async_copy) {
>> +		dev = get_device(vid);
>> +
>> +		if (dev)
>> +			dev->async_copy = 1;
>> +	}
>> +
>
> IMHO, the user can choose which queues use async copy, as backend hardware resources are limited.
> So should the async_copy enable flag be saved in the virtqueue structure?
>
>>  	VHOST_LOG_CONFIG(INFO, "new device, handle is %d\n", vid);
>>
>>  	if (vsocket->notify_ops->new_connection) {
>> @@ -891,6 +900,17 @@ struct vhost_user_reconnect_list {
>>  		goto out_mutex;
>>  	}
>>
>> +	vsocket->async_copy = flags & RTE_VHOST_USER_ASYNC_COPY;
>> +
>> +	if (vsocket->async_copy &&
>> +		(flags & (RTE_VHOST_USER_IOMMU_SUPPORT |
>> +		RTE_VHOST_USER_POSTCOPY_SUPPORT))) {
>> +		VHOST_LOG_CONFIG(ERR, "error: enabling async copy and IOMMU "
>> +			"or post-copy feature simultaneously is not "
>> +			"supported\n");
>> +		goto out_mutex;
>> +	}
>> +
>>  	/*
>>  	 * Set the supported features correctly for the builtin vhost-user
>>  	 * net driver.
>> diff --git a/lib/librte_vhost/vhost.c b/lib/librte_vhost/vhost.c
>> index 0266318..e6b688a 100644
>> --- a/lib/librte_vhost/vhost.c
>> +++ b/lib/librte_vhost/vhost.c
>> @@ -332,8 +332,13 @@
>>  {
>>  	if (vq_is_packed(dev))
>>  		rte_free(vq->shadow_used_packed);
>> -	else
>> +	else {
>>  		rte_free(vq->shadow_used_split);
>> +		if (vq->async_pkts_pending)
>> +			rte_free(vq->async_pkts_pending);
>> +		if (vq->async_pending_info)
>> +			rte_free(vq->async_pending_info);
>> +	}
>>  	rte_free(vq->batch_copy_elems);
>>  	rte_mempool_free(vq->iotlb_pool);
>>  	rte_free(vq);
>> @@ -1527,3 +1532,70 @@ int rte_vhost_extern_callback_register(int vid,
>>  	if (vhost_data_log_level >= 0)
>>  		rte_log_set_level(vhost_data_log_level, RTE_LOG_WARNING);
>>  }
>> +
>> +int rte_vhost_async_channel_register(int vid, uint16_t queue_id,
>> +					uint32_t features,
>> +					struct rte_vhost_async_channel_ops *ops)
>> +{
>> +	struct vhost_virtqueue *vq;
>> +	struct virtio_net *dev = get_device(vid);
>> +	struct dma_channel_features f;
>> +
>> +	if (dev == NULL || ops == NULL)
>> +		return -1;
>> +
>> +	f.intval = features;
>> +
>> +	vq = dev->virtqueue[queue_id];
>> +
>> +	if (vq == NULL)
>> +		return -1;
>> +
>> +	/** packed queue is not supported */
>> +	if (vq_is_packed(dev) || !f.inorder)
>> +		return -1;
>> +
> Virtio already has an in_order concept; these two names are so alike that they can easily be mixed up. Please consider how to distinguish them.
>
>> +	if (ops->check_completed_copies == NULL ||
>> +		ops->transfer_data == NULL)
>> +		return -1;
>> +
>
> The previous error conditions are unlikely to be true; the unlikely() macro may be helpful for understanding.
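
As a sketch of what the unlikely() suggestion could look like: DPDK provides unlikely() in rte_branch_prediction.h. The stand-in macro below exists only so the snippet builds on its own with GCC or Clang, and the struct and function names are simplified placeholders, not the ones from the patch:

```c
#include <stddef.h>

/* Local stand-in for DPDK's unlikely() (rte_branch_prediction.h),
 * defined here only to keep the sketch self-contained. */
#ifndef unlikely
#define unlikely(x) __builtin_expect(!!(x), 0)
#endif

/* Simplified, hypothetical version of the callback table in the patch. */
struct async_channel_ops {
	int (*transfer_data)(int vid);
	int (*check_completed_copies)(int vid);
};

/* Returns 0 when both callbacks are set, -1 otherwise. The error
 * branches are marked unlikely() since valid input is the common case. */
static int
validate_async_ops(const struct async_channel_ops *ops)
{
	if (unlikely(ops == NULL))
		return -1;
	if (unlikely(ops->check_completed_copies == NULL ||
			ops->transfer_data == NULL))
		return -1;
	return 0;
}

/* Stub callback, only used to try out the check above. */
static int
dummy_cb(int vid)
{
	return vid;
}
```

Since registration is a control-path call, the hint is more about documenting intent than about performance.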
>
>> +	rte_spinlock_lock(&vq->access_lock);
>> +
>> +	vq->async_ops.check_completed_copies = ops->check_completed_copies;
>> +	vq->async_ops.transfer_data = ops->transfer_data;
>> +
>> +	vq->async_inorder = f.inorder;
>> +	vq->async_threshold = f.threshold;
>> +
>> +	vq->async_registered = true;
>> +
>> +	rte_spinlock_unlock(&vq->access_lock);
>> +
>> +	return 0;
>> +}
>> +
>> +int rte_vhost_async_channel_unregister(int vid, uint16_t queue_id)
>> +{
>> +	struct vhost_virtqueue *vq;
>> +	struct virtio_net *dev = get_device(vid);
>> +
>> +	if (dev == NULL)
>> +		return -1;
>> +
>> +	vq = dev->virtqueue[queue_id];
>> +
>> +	if (vq == NULL)
>> +		return -1;
>> +
>> +	rte_spinlock_lock(&vq->access_lock);
>> +
>> +	vq->async_ops.transfer_data = NULL;
>> +	vq->async_ops.check_completed_copies = NULL;
>> +
>> +	vq->async_registered = false;
>> +
>> +	rte_spinlock_unlock(&vq->access_lock);
>> +
>> +	return 0;
>> +}
>> +
>> diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h
>> index df98d15..a7fbe23 100644
>> --- a/lib/librte_vhost/vhost.h
>> +++ b/lib/librte_vhost/vhost.h
>> @@ -23,6 +23,8 @@
>>  #include "rte_vhost.h"
>>  #include "rte_vdpa.h"
>>
>> +#include "rte_vhost_async.h"
>> +
>>  /* Used to indicate that the device is running on a data core */
>>  #define VIRTIO_DEV_RUNNING 1
>>  /* Used to indicate that the device is ready to operate */
>> @@ -39,6 +41,11 @@
>>
>>  #define VHOST_LOG_CACHE_NR 32
>>
>> +#define MAX_PKT_BURST 32
>> +
>> +#define VHOST_MAX_ASYNC_IT (MAX_PKT_BURST * 2)
>> +#define VHOST_MAX_ASYNC_VEC (BUF_VECTOR_MAX * 2)
>> +
>>  #define PACKED_DESC_ENQUEUE_USED_FLAG(w)	\
>>  	((w) ? (VRING_DESC_F_AVAIL | VRING_DESC_F_USED | VRING_DESC_F_WRITE) : \
>>  		VRING_DESC_F_WRITE)
>> @@ -200,6 +207,25 @@ struct vhost_virtqueue {
>>  	TAILQ_HEAD(, vhost_iotlb_entry) iotlb_list;
>>  	int iotlb_cache_nr;
>>  	TAILQ_HEAD(, vhost_iotlb_entry) iotlb_pending_list;
>> +
>> +	/* operation callbacks for async dma */
>> +	struct rte_vhost_async_channel_ops async_ops;
>> +
>> +	struct iov_it it_pool[VHOST_MAX_ASYNC_IT];
>> +	struct iovec vec_pool[VHOST_MAX_ASYNC_VEC];
>> +
>> +	/* async data transfer status */
>> +	uintptr_t **async_pkts_pending;
>> +	#define ASYNC_PENDING_INFO_N_MSK 0xFFFF
>> +	#define ASYNC_PENDING_INFO_N_SFT 16
>> +	uint64_t *async_pending_info;
>> +	uint16_t async_pkts_idx;
>> +	uint16_t async_pkts_inflight_n;
>> +
>> +	/* vq async features */
>> +	bool async_inorder;
>> +	bool async_registered;
>> +	uint16_t async_threshold;
>>  } __rte_cache_aligned;
>>
>>  /* Old kernels have no such macros defined */
>> @@ -353,6 +379,7 @@ struct virtio_net {
>>  	int16_t broadcast_rarp;
>>  	uint32_t nr_vring;
>>  	int dequeue_zero_copy;
>> +	int async_copy;
>>  	int extbuf;
>>  	int linearbuf;
>>  	struct vhost_virtqueue *virtqueue[VHOST_MAX_QUEUE_PAIRS * 2];
>> @@ -702,7 +729,8 @@ uint64_t translate_log_addr(struct virtio_net *dev, struct vhost_virtqueue *vq,
>>  	/* Don't kick guest if we don't reach index specified by guest. */
>>  	if (dev->features & (1ULL << VIRTIO_RING_F_EVENT_IDX)) {
>>  		uint16_t old = vq->signalled_used;
>> -		uint16_t new = vq->last_used_idx;
>> +		uint16_t new = vq->async_pkts_inflight_n ?
>> +					vq->used->idx:vq->last_used_idx;
>>  		bool signalled_used_valid = vq->signalled_used_valid;
>>
>>  		vq->signalled_used = new;
>> diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
>> index 84bebad..d7600bf 100644
>> --- a/lib/librte_vhost/vhost_user.c
>> +++ b/lib/librte_vhost/vhost_user.c
>> @@ -464,12 +464,25 @@
>>  	} else {
>>  		if (vq->shadow_used_split)
>>  			rte_free(vq->shadow_used_split);
>> +		if (vq->async_pkts_pending)
>> +			rte_free(vq->async_pkts_pending);
>> +		if (vq->async_pending_info)
>> +			rte_free(vq->async_pending_info);
>> +
>>  		vq->shadow_used_split = rte_malloc(NULL,
>>  				vq->size * sizeof(struct vring_used_elem),
>>  				RTE_CACHE_LINE_SIZE);
>> -		if (!vq->shadow_used_split) {
>> +		vq->async_pkts_pending = rte_malloc(NULL,
>> +				vq->size * sizeof(uintptr_t),
>> +				RTE_CACHE_LINE_SIZE);
>> +		vq->async_pending_info = rte_malloc(NULL,
>> +				vq->size * sizeof(uint64_t),
>> +				RTE_CACHE_LINE_SIZE);
>> +		if (!vq->shadow_used_split ||
>> +			!vq->async_pkts_pending ||
>> +			!vq->async_pending_info) {
>>  			VHOST_LOG_CONFIG(ERR,
>> -					"failed to allocate memory for shadow used ring.\n");
>> +					"failed to allocate memory for vq internal data.\n");
>
> If async copy is not enabled, there is no need to allocate the related structures.
>
>>  			return RTE_VHOST_MSG_RESULT_ERR;
>>  		}
>>  	}
>> @@ -1147,7 +1160,8 @@
>>  		goto err_mmap;
>>  	}
>>
>> -	populate = (dev->dequeue_zero_copy) ? MAP_POPULATE : 0;
>> +	populate = (dev->dequeue_zero_copy || dev->async_copy) ?
>> +		MAP_POPULATE : 0;
>>  	mmap_addr = mmap(NULL, mmap_size, PROT_READ | PROT_WRITE,
>>  			MAP_SHARED | populate, fd, 0);
>>
>> @@ -1162,7 +1176,7 @@
>>  	reg->host_user_addr = (uint64_t)(uintptr_t)mmap_addr +
>>  			mmap_offset;
>>
>> -	if (dev->dequeue_zero_copy)
>> +	if (dev->dequeue_zero_copy || dev->async_copy)
>>  		if (add_guest_pages(dev, reg, alignment) < 0) {
>>  			VHOST_LOG_CONFIG(ERR,
>>  				"adding guest pages to region %u failed.\n",
>> @@ -1945,6 +1959,12 @@ static int vhost_user_set_vring_err(struct virtio_net **pdev __rte_unused,
>>  	} else {
>>  		rte_free(vq->shadow_used_split);
>>  		vq->shadow_used_split = NULL;
>> +		if (vq->async_pkts_pending)
>> +			rte_free(vq->async_pkts_pending);
>> +		if (vq->async_pending_info)
>> +			rte_free(vq->async_pending_info);
>> +		vq->async_pkts_pending = NULL;
>> +		vq->async_pending_info = NULL;
>>  	}
>>
>>  	rte_free(vq->batch_copy_elems);
>> --
>> 1.8.3.1
>
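
Regarding the feature word passed to rte_vhost_async_channel_register(): the layout documented in the patch (b0 in-order, b16 - b27 packet length threshold, everything else reserved) could also be read with explicit masks and shifts, which would sidestep the question of how to name the reserved bit-fields. A minimal sketch, where every macro, struct and function name is made up for illustration and is not part of the patch:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; bit positions follow the comment in the patch:
 * b0 in-order, b16 - b27 packet length threshold, the rest reserved. */
#define ASYNC_F_INORDER_BIT	0
#define ASYNC_F_THRESHOLD_SHIFT	16
#define ASYNC_F_THRESHOLD_MASK	0xFFFu

struct async_features {
	bool inorder;
	uint16_t threshold;
};

/* Pack the two fields into a 32-bit feature word; reserved bits stay 0. */
static uint32_t
async_features_encode(bool inorder, uint16_t threshold)
{
	return ((uint32_t)inorder << ASYNC_F_INORDER_BIT) |
		(((uint32_t)threshold & ASYNC_F_THRESHOLD_MASK)
			<< ASYNC_F_THRESHOLD_SHIFT);
}

/* Unpack a feature word; reserved bits are ignored. */
static struct async_features
async_features_decode(uint32_t features)
{
	struct async_features f;

	f.inorder = (features >> ASYNC_F_INORDER_BIT) & 1u;
	f.threshold = (features >> ASYNC_F_THRESHOLD_SHIFT) &
			ASYNC_F_THRESHOLD_MASK;
	return f;
}
```

For example, an in-order channel with a 256-byte threshold would be encoded as 0x01000001 under this layout.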