From: "Fu, Patrick" <patrick.fu@intel.com>
To: "Liu, Yong" <yong.liu@intel.com>
Cc: "Jiang, Cheng1" <cheng1.jiang@intel.com>,
	"Liang, Cunming" <cunming.liang@intel.com>,
	"dev@dpdk.org" <dev@dpdk.org>,
	"maxime.coquelin@redhat.com" <maxime.coquelin@redhat.com>,
	"Xia, Chenbo" <chenbo.xia@intel.com>,
	"Wang, Zhihong" <zhihong.wang@intel.com>,
	"Ye, Xiaolong" <xiaolong.ye@intel.com>
Subject: Re: [dpdk-dev] [PATCH v1 2/2] vhost: introduce async enqueue for split ring
Date: Thu, 18 Jun 2020 11:36:43 +0000
Message-ID: <BYAPR11MB3735144871FCAE6C26BE763B849B0@BYAPR11MB3735.namprd11.prod.outlook.com>
In-Reply-To: <86228AFD5BCD8E4EBFD2B90117B5E81E635F613A@SHSMSX103.ccr.corp.intel.com>

Hi,

> -----Original Message-----
> From: Liu, Yong <yong.liu@intel.com>
> Sent: Thursday, June 18, 2020 2:57 PM
> To: Fu, Patrick <patrick.fu@intel.com>
> Cc: Fu, Patrick <patrick.fu@intel.com>; Jiang, Cheng1
> <cheng1.jiang@intel.com>; Liang, Cunming <cunming.liang@intel.com>;
> dev@dpdk.org; maxime.coquelin@redhat.com; Xia, Chenbo
> <chenbo.xia@intel.com>; Wang, Zhihong <zhihong.wang@intel.com>; Ye,
> Xiaolong <xiaolong.ye@intel.com>
> Subject: RE: [dpdk-dev] [PATCH v1 2/2] vhost: introduce async enqueue for
> split ring
> 
> Thanks, Patrick. Some comments are inline.
> 
> >
> > From: Patrick <patrick.fu@intel.com>
> >
> > This patch implements the async enqueue data path for split ring.
> >
> > Signed-off-by: Patrick <patrick.fu@intel.com>
> > ---
> >  lib/librte_vhost/rte_vhost_async.h |  38 +++
> >  lib/librte_vhost/virtio_net.c      | 538
> > ++++++++++++++++++++++++++++++++++++-
> >  2 files changed, 574 insertions(+), 2 deletions(-)
> >
> > diff --git a/lib/librte_vhost/rte_vhost_async.h
> > b/lib/librte_vhost/rte_vhost_async.h
> > index 82f2ebe..efcba0a 100644
> > --- a/lib/librte_vhost/rte_vhost_async.h
> > +++ b/lib/librte_vhost/rte_vhost_async.h
> > +
> > +static __rte_always_inline int
> > +async_mbuf_to_desc(struct virtio_net *dev, struct vhost_virtqueue *vq,
> > +			struct rte_mbuf *m, struct buf_vector *buf_vec,
> > +			uint16_t nr_vec, uint16_t num_buffers,
> > +			struct iovec *src_iovec, struct iovec *dst_iovec,
> > +			struct iov_it *src_it, struct iov_it *dst_it) {
> 
> There are too many arguments in this function; please check whether it will
> impact performance.
> 
I guess src_iovec & dst_iovec could be removed from the parameter list. 
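
For illustration, here is a minimal sketch of how the copy descriptors could
be reached through the iterators alone (the iov_it layout is my reading of
this patch, and async_iter_add() is a hypothetical helper, not part of the
series):

#include <sys/uio.h>		/* struct iovec */
#include <rte_common.h>		/* __rte_always_inline */

/* Assumed layout of iov_it, per this patch; treat as a sketch. */
struct iov_it {
	size_t offset;		/* offset into the first segment */
	size_t count;		/* total bytes described */
	struct iovec *iov;	/* backing segment array */
	unsigned long nr_segs;	/* valid entries in iov[] */
};

/* Hypothetical helper: append one copy segment through the iterator,
 * so the raw src_iovec/dst_iovec parameters are no longer needed. */
static __rte_always_inline void
async_iter_add(struct iov_it *it, void *base, size_t len)
{
	it->iov[it->nr_segs].iov_base = base;
	it->iov[it->nr_segs].iov_len = len;
	it->nr_segs++;
	it->count += len;
}

With that, src_it/dst_it alone carry everything async_mbuf_to_desc() needs,
trimming two pointer arguments from the signature.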

> > +	uint32_t vec_idx = 0;
> > +	uint32_t mbuf_offset, mbuf_avail;
> > +	uint32_t buf_offset, buf_avail;
> > +	uint64_t buf_addr, buf_iova, buf_len;
> > +	uint32_t cpy_len, cpy_threshold;
> > +	uint64_t hdr_addr;
> > +	struct rte_mbuf *hdr_mbuf;
> > +	struct batch_copy_elem *batch_copy = vq->batch_copy_elems;
> > +	struct virtio_net_hdr_mrg_rxbuf tmp_hdr, *hdr = NULL;
> > +	int error = 0;
> > +
> > +	uint32_t tlen = 0;
> > +	int tvec_idx = 0;
> > +	void *hpa;
> > +
> > +	if (unlikely(m == NULL)) {
> > +		error = -1;
> > +		goto out;
> > +	}
> > +
> > +	cpy_threshold = vq->async_threshold;
> > +
> > +	buf_addr = buf_vec[vec_idx].buf_addr;
> > +	buf_iova = buf_vec[vec_idx].buf_iova;
> > +	buf_len = buf_vec[vec_idx].buf_len;
> > +
> > +	if (unlikely(buf_len < dev->vhost_hlen && nr_vec <= 1)) {
> > +		error = -1;
> > +		goto out;
> > +	}
> > +
> > +	hdr_mbuf = m;
> > +	hdr_addr = buf_addr;
> > +	if (unlikely(buf_len < dev->vhost_hlen))
> > +		hdr = &tmp_hdr;
> > +	else
> > +		hdr = (struct virtio_net_hdr_mrg_rxbuf
> > *)(uintptr_t)hdr_addr;
> > +
> > +	VHOST_LOG_DATA(DEBUG, "(%d) RX: num merge buffers %d\n",
> > +		dev->vid, num_buffers);
> > +
> > +	if (unlikely(buf_len < dev->vhost_hlen)) {
> > +		buf_offset = dev->vhost_hlen - buf_len;
> > +		vec_idx++;
> > +		buf_addr = buf_vec[vec_idx].buf_addr;
> > +		buf_iova = buf_vec[vec_idx].buf_iova;
> > +		buf_len = buf_vec[vec_idx].buf_len;
> > +		buf_avail = buf_len - buf_offset;
> > +	} else {
> > +		buf_offset = dev->vhost_hlen;
> > +		buf_avail = buf_len - dev->vhost_hlen;
> > +	}
> > +
> > +	mbuf_avail  = rte_pktmbuf_data_len(m);
> > +	mbuf_offset = 0;
> > +
> > +	while (mbuf_avail != 0 || m->next != NULL) {
> > +		/* done with current buf, get the next one */
> > +		if (buf_avail == 0) {
> > +			vec_idx++;
> > +			if (unlikely(vec_idx >= nr_vec)) {
> > +				error = -1;
> > +				goto out;
> > +			}
> > +
> > +			buf_addr = buf_vec[vec_idx].buf_addr;
> > +			buf_iova = buf_vec[vec_idx].buf_iova;
> > +			buf_len = buf_vec[vec_idx].buf_len;
> > +
> > +			buf_offset = 0;
> > +			buf_avail  = buf_len;
> > +		}
> > +
> > +		/* done with current mbuf, get the next one */
> > +		if (mbuf_avail == 0) {
> > +			m = m->next;
> > +
> > +			mbuf_offset = 0;
> > +			mbuf_avail  = rte_pktmbuf_data_len(m);
> > +		}
> > +
> > +		if (hdr_addr) {
> > +			virtio_enqueue_offload(hdr_mbuf, &hdr->hdr);
> > +			if (rxvq_is_mergeable(dev))
> > +				ASSIGN_UNLESS_EQUAL(hdr->num_buffers,
> > +						num_buffers);
> > +
> > +			if (unlikely(hdr == &tmp_hdr)) {
> > +				copy_vnet_hdr_to_desc(dev, vq, buf_vec,
> > hdr);
> > +			} else {
> > +				PRINT_PACKET(dev, (uintptr_t)hdr_addr,
> > +						dev->vhost_hlen, 0);
> > +				vhost_log_cache_write_iova(dev, vq,
> > +						buf_vec[0].buf_iova,
> > +						dev->vhost_hlen);
> > +			}
> > +
> > +			hdr_addr = 0;
> > +		}
> > +
> > +		cpy_len = RTE_MIN(buf_avail, mbuf_avail);
> > +
> > +		if (unlikely(cpy_len >= cpy_threshold)) {
> > +			hpa = (void *)(uintptr_t)gpa_to_hpa(dev,
> > +					buf_iova + buf_offset, cpy_len);
> 
> I have one question here. If the user has called async copy directly, should
> the vhost library still check the copy threshold for software fallback?
> If a software fallback is needed, IMHO it would be more suitable to handle it
> in the copy device driver.
> 
Technically, we can delegate the threshold judgement from vhost to the
application callbacks. However, that would significantly increase the
complexity of the callback implementations, which would then have to maintain
correct ordering between DMA-copied data and CPU-copied data. Meanwhile, it
doesn't actually help boost performance compared with the current vhost
implementation. Considering the threshold is a generic design, I would still
prefer to keep it in vhost.
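
To make the ordering point concrete, here is a self-contained sketch (all
names are illustrative, not the patch's helpers) of the per-segment dispatch
vhost performs; because the decision lives in one place, vhost alone knows
which bytes completed synchronously and which are still in flight on the DMA
engine:

#include <stdint.h>
#include <string.h>
#include <sys/uio.h>

/* Sketch: a copy segment is either queued for the DMA engine (length
 * at or above the threshold, with a valid host-physical mapping) or
 * done immediately by the CPU.  Returns the bytes left pending async
 * completion. */
static inline uint32_t
dispatch_copy(void *dst, void *hpa_dst, void *src, uint32_t len,
		uint32_t threshold, struct iovec *src_iov,
		struct iovec *dst_iov, uint32_t *nr_async)
{
	if (len >= threshold && hpa_dst != NULL) {
		/* async path: record the segment for the DMA engine */
		src_iov[*nr_async].iov_base = src;
		src_iov[*nr_async].iov_len = len;
		dst_iov[*nr_async].iov_base = hpa_dst;
		dst_iov[*nr_async].iov_len = len;
		(*nr_async)++;
		return len;
	}
	/* sync path: copy right away on the CPU */
	memcpy(dst, src, len);
	return 0;
}

Pushing the threshold into the application callbacks would replicate exactly
this logic, plus the completion-ordering bookkeeping, in every backend.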

> IMHO, the cost of checking and fixing the virtio header in the async copy
> function will be too high.
> Since this is the async copy datapath, would it be possible to eliminate the
> cost of calculating the segmented addresses?
> 
Yes, I believe the async data path certainly brings new opportunities to apply
further optimizations. However, in the current time frame, settling down the
overall async framework is the priority.
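
As one example of the kind of follow-up optimization meant here (purely
illustrative; lookup_hpa() stands in for the real gpa_to_hpa() table walk and
is not a DPDK API): consecutive segments often fall inside the same guest
memory region, so the translation could be memoized to avoid walking the
mapping table per segment:

#include <stdint.h>

struct hpa_cache {
	uint64_t gpa_start;	/* guest-physical base of cached region */
	uint64_t hpa_start;	/* matching host-physical base */
	uint64_t size;		/* cached region length */
};

static inline uint64_t
gpa_to_hpa_cached(struct hpa_cache *c, uint64_t gpa, uint64_t len,
		uint64_t (*lookup_hpa)(uint64_t gpa, uint64_t len))
{
	/* fast path: segment falls inside the last translated region */
	if (gpa >= c->gpa_start && gpa + len <= c->gpa_start + c->size)
		return c->hpa_start + (gpa - c->gpa_start);

	/* slow path: translate once and refresh the cache */
	c->hpa_start = lookup_hpa(gpa, len);
	c->gpa_start = gpa;
	c->size = len;
	return c->hpa_start;
}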

Thanks,

Patrick

Thread overview: 36+ messages
2020-06-11 10:02 [dpdk-dev] [PATCH v1 0/2] introduce asynchronous data path for vhost patrick.fu
2020-06-11 10:02 ` [dpdk-dev] [PATCH v1 1/2] vhost: introduce async data path registration API patrick.fu
2020-06-18  5:50   ` Liu, Yong
2020-06-18  9:08     ` Fu, Patrick
2020-06-19  0:40       ` Liu, Yong
2020-06-25 13:42     ` Maxime Coquelin
2020-06-26 14:28   ` Maxime Coquelin
2020-06-29  1:15     ` Fu, Patrick
2020-06-26 14:44   ` Maxime Coquelin
2020-06-11 10:02 ` [dpdk-dev] [PATCH v1 2/2] vhost: introduce async enqueue for split ring patrick.fu
2020-06-18  6:56   ` Liu, Yong
2020-06-18 11:36     ` Fu, Patrick [this message]
2020-06-26 14:39   ` Maxime Coquelin
2020-06-26 14:46   ` Maxime Coquelin
2020-06-29  1:25     ` Fu, Patrick
2020-06-26 14:42 ` [dpdk-dev] [PATCH v1 0/2] introduce asynchronous data path for vhost Maxime Coquelin
2020-07-03 10:27 ` [dpdk-dev] [PATCH v3 " patrick.fu
2020-07-03 10:27   ` [dpdk-dev] [PATCH v3 1/2] vhost: introduce async enqueue registration API patrick.fu
2020-07-03 10:27   ` [dpdk-dev] [PATCH v3 2/2] vhost: introduce async enqueue for split ring patrick.fu
2020-07-03 12:21 ` [dpdk-dev] [PATCH v4 0/2] introduce asynchronous data path for vhost patrick.fu
2020-07-03 12:21   ` [dpdk-dev] [PATCH v4 1/2] vhost: introduce async enqueue registration API patrick.fu
2020-07-06  3:05     ` Liu, Yong
2020-07-06  9:08       ` Fu, Patrick
2020-07-03 12:21   ` [dpdk-dev] [PATCH v4 2/2] vhost: introduce async enqueue for split ring patrick.fu
2020-07-06 11:53 ` [dpdk-dev] [PATCH v5 0/2] introduce asynchronous data path for vhost patrick.fu
2020-07-06 11:53   ` [dpdk-dev] [PATCH v5 1/2] vhost: introduce async enqueue registration API patrick.fu
2020-07-06 11:53   ` [dpdk-dev] [PATCH v5 2/2] vhost: introduce async enqueue for split ring patrick.fu
2020-07-07  5:07 ` [dpdk-dev] [PATCH v6 0/2] introduce asynchronous data path for vhost patrick.fu
2020-07-07  5:07   ` [dpdk-dev] [PATCH v6 1/2] vhost: introduce async enqueue registration API patrick.fu
2020-07-07  8:22     ` Xia, Chenbo
2020-07-07  5:07   ` [dpdk-dev] [PATCH v6 2/2] vhost: introduce async enqueue for split ring patrick.fu
2020-07-07  8:22     ` Xia, Chenbo
2020-07-07 16:45   ` [dpdk-dev] [PATCH v6 0/2] introduce asynchronous data path for vhost Ferruh Yigit
2020-07-20 13:26   ` Maxime Coquelin
2020-07-21  2:28     ` Fu, Patrick
2020-07-21  8:28       ` Maxime Coquelin
