DPDK patches and discussions
From: "Fu, Patrick" <patrick.fu@intel.com>
To: Maxime Coquelin <maxime.coquelin@redhat.com>,
	"dev@dpdk.org" <dev@dpdk.org>,
	"Xia, Chenbo" <chenbo.xia@intel.com>
Subject: Re: [dpdk-dev] [PATCH v4] vhost: fix async copy fail on multi-page buffers
Date: Wed, 29 Jul 2020 02:05:48 +0000
Message-ID: <DM5PR1101MB216916391F84F09603100CAB84700@DM5PR1101MB2169.namprd11.prod.outlook.com>
In-Reply-To: <DM5PR1101MB21696FD2CFADBB948600155A84700@DM5PR1101MB2169.namprd11.prod.outlook.com>

Hi Maxime,

> -----Original Message-----
> From: Fu, Patrick
> Sent: Wednesday, July 29, 2020 9:40 AM
> To: 'Maxime Coquelin' <maxime.coquelin@redhat.com>; dev@dpdk.org; Xia,
> Chenbo <Chenbo.Xia@intel.com>
> Subject: RE: [PATCH v4] vhost: fix async copy fail on multi-page buffers
> 
> Hi Maxime,
> 
> > -----Original Message-----
> > From: Maxime Coquelin <maxime.coquelin@redhat.com>
> > Sent: Tuesday, July 28, 2020 9:55 PM
> > To: Fu, Patrick <patrick.fu@intel.com>; dev@dpdk.org; Xia, Chenbo
> > <chenbo.xia@intel.com>
> > Subject: Re: [PATCH v4] vhost: fix async copy fail on multi-page
> > buffers
> >
> >
> >
> > On 7/28/20 5:28 AM, patrick.fu@intel.com wrote:
> > > From: Patrick Fu <patrick.fu@intel.com>
> > >
> > > > Async copy fails when a single ring buffer vector is split across
> > > > multiple physical pages. This happens because the current hpa address
> > > > translation function doesn't handle multi-page buffers. A new gpa to
> > > > hpa address conversion function, which returns the hpa of the first
> > > > host page hit, is implemented in this patch. The async data path
> > > > repeatedly calls this new function to construct a multi-segment
> > > > async copy descriptor for ring buffers crossing physical page boundaries.
> > >
> > > > Fixes: cd6760da1076 ("vhost: introduce async enqueue for split ring")
> > >
> > > Signed-off-by: Patrick Fu <patrick.fu@intel.com>
> > > ---
> > > v2:
> > >  - change commit message and title
> > >  - v1 patch used CPU to copy multi-page buffers; v2 patch split the
> > > copy into multiple async copy segments whenever possible
> > >
> > > v3:
> > >  - added fixline
> > >
> > > v4:
> > > >  - fix mistranslation of a gpa whose length equals the host
> > > >    page size
> > >
> > > >  lib/librte_vhost/vhost.h      | 50 +++++++++++++++++++++++++++++++++++
> > >  lib/librte_vhost/virtio_net.c | 40 +++++++++++++++++-----------
> > >  2 files changed, 75 insertions(+), 15 deletions(-)
> > >
> > > > diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
> > > > index 95a0bc19f..124a33a10 100644
> > > --- a/lib/librte_vhost/virtio_net.c
> > > +++ b/lib/librte_vhost/virtio_net.c
> > > @@ -980,6 +980,7 @@ async_mbuf_to_desc(struct virtio_net *dev,
> > > struct
> > vhost_virtqueue *vq,
> > >  	struct batch_copy_elem *batch_copy = vq->batch_copy_elems;
> > >  	struct virtio_net_hdr_mrg_rxbuf tmp_hdr, *hdr = NULL;
> > >  	int error = 0;
> > > +	uint64_t mapped_len;
> > >
> > >  	uint32_t tlen = 0;
> > >  	int tvec_idx = 0;
> > > @@ -1072,24 +1073,31 @@ async_mbuf_to_desc(struct virtio_net *dev,
> > > struct vhost_virtqueue *vq,
> > >
> > >  		cpy_len = RTE_MIN(buf_avail, mbuf_avail);
> > >
> > > -		if (unlikely(cpy_len >= cpy_threshold)) {
> > > -			hpa = (void *)(uintptr_t)gpa_to_hpa(dev,
> > > -					buf_iova + buf_offset, cpy_len);
> > > +		while (unlikely(cpy_len && cpy_len >= cpy_threshold)) {
> > > +			hpa = (void *)(uintptr_t)gpa_to_first_hpa(dev,
> > > +					buf_iova + buf_offset,
> > > +					cpy_len, &mapped_len);
> > >
> > > -			if (unlikely(!hpa)) {
> > > -				error = -1;
> > > -				goto out;
> > > -			}
> > > +			if (unlikely(!hpa || mapped_len < cpy_threshold))
> > > +				break;
> > >
> > >  			async_fill_vec(src_iovec + tvec_idx,
> > >  				(void *)(uintptr_t)rte_pktmbuf_iova_offset(m,
> > > -						mbuf_offset), cpy_len);
> > > +				mbuf_offset), (size_t)mapped_len);
> > >
> > > -			async_fill_vec(dst_iovec + tvec_idx, hpa, cpy_len);
> > > +			async_fill_vec(dst_iovec + tvec_idx,
> > > +					hpa, (size_t)mapped_len);
> > >
> > > -			tlen += cpy_len;
> > > +			tlen += (uint32_t)mapped_len;
> > > +			cpy_len -= (uint32_t)mapped_len;
> > > +			mbuf_avail  -= (uint32_t)mapped_len;
> > > +			mbuf_offset += (uint32_t)mapped_len;
> > > +			buf_avail  -= (uint32_t)mapped_len;
> > > +			buf_offset += (uint32_t)mapped_len;
> > >  			tvec_idx++;
> > > -		} else {
> > > +		}
> > > +
> > > +		if (likely(cpy_len)) {
> > >  			if (unlikely(vq->batch_copy_nb_elems >= vq->size)) {
> > >  				rte_memcpy(
> > >  				(void *)((uintptr_t)(buf_addr + buf_offset)),
> > @@ -1112,10
> > > +1120,12 @@ async_mbuf_to_desc(struct virtio_net *dev, struct
> > vhost_virtqueue *vq,
> > >  			}
> > >  		}
> > >
> > > -		mbuf_avail  -= cpy_len;
> > > -		mbuf_offset += cpy_len;
> > > -		buf_avail  -= cpy_len;
> > > -		buf_offset += cpy_len;
> > > +		if (cpy_len) {
> > > +			mbuf_avail  -= cpy_len;
> > > +			mbuf_offset += cpy_len;
> > > +			buf_avail  -= cpy_len;
> > > +			buf_offset += cpy_len;
> > > +		}
> >
> > > Is it really necessary to check that the copy length is not 0?
> >
> The intention is to optimize for the case where ring buffers are NOT split (which
> should be the most common case). In that case, cpy_len will be zero and this
> "if" statement saves a couple of cycles. That said, the actual
> difference is minor. I'm open to either adding an "unlikely" to the "if" or
> removing the "if". I would like to hear your opinion before submitting a modified
> patch.
> 

I have found a better way to handle this case: combine this "if" logic with the
previous one. Please review my v5 patch for the code change.
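
For readers without the v5 patch at hand, one possible shape of such a merge
is sketched below. This is an illustrative model only and is not taken from
the v5 patch: struct copy_state, advance() and copy_step() are hypothetical
names, and the async_len parameter stands in for the bytes already consumed
by the async segments (assumed to be no larger than cpy_len). The point is
that the remainder copy and the offset bookkeeping share one guarded block.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the local counters in async_mbuf_to_desc(). */
struct copy_state {
	uint32_t mbuf_avail, mbuf_offset;
	uint32_t buf_avail, buf_offset;
	uint32_t async_bytes;	/* bytes handed to the async copy engine */
	uint32_t cpu_bytes;	/* bytes left to the CPU copy path */
};

/* Move both cursors forward by len bytes. */
static void
advance(struct copy_state *s, uint32_t len)
{
	s->mbuf_avail  -= len;
	s->mbuf_offset += len;
	s->buf_avail   -= len;
	s->buf_offset  += len;
}

/*
 * Model of one iteration: async_len bytes (assumed <= cpy_len) go to the
 * async engine, and whatever remains is handled in a single block that
 * performs the CPU copy and the bookkeeping together, so no separate
 * "if (cpy_len)" is needed afterwards.
 */
static void
copy_step(struct copy_state *s, uint32_t cpy_len, uint32_t async_len)
{
	if (async_len) {
		s->async_bytes += async_len;
		advance(s, async_len);
		cpy_len -= async_len;
	}

	if (cpy_len) {
		s->cpu_bytes += cpy_len;	/* the real code would memcpy here */
		advance(s, cpy_len);
	}
}
```

When the whole buffer is covered by async segments, the second block is
skipped entirely, which is the common case described above.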

Thanks,

Patrick
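
The page-splitting idea in the quoted commit message can be illustrated with
a small standalone model. This is a sketch under stated assumptions, not the
code added to lib/librte_vhost/vhost.h: struct page_map, struct seg,
first_hpa() and split_range() are hypothetical names, the page table is a
plain array, and the copy threshold and CPU-copy fallback from the patch are
left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record for one contiguous guest-to-host mapping. */
struct page_map {
	uint64_t gpa;	/* guest physical start address */
	uint64_t hpa;	/* host physical start address */
	uint64_t size;	/* length of the mapping in bytes */
};

/* One destination segment of a multi-segment copy. */
struct seg {
	uint64_t hpa;
	uint64_t len;
};

/*
 * Translate gpa and report in *mapped_len how many of the requested len
 * bytes are contiguous within the page containing gpa. If no page covers
 * gpa, *mapped_len is set to 0 and 0 is returned.
 */
static uint64_t
first_hpa(const struct page_map *pages, size_t nr_pages,
	  uint64_t gpa, uint64_t len, uint64_t *mapped_len)
{
	for (size_t i = 0; i < nr_pages; i++) {
		const struct page_map *p = &pages[i];

		if (gpa >= p->gpa && gpa < p->gpa + p->size) {
			uint64_t off = gpa - p->gpa;
			uint64_t left = p->size - off;

			*mapped_len = len < left ? len : left;
			return p->hpa + off;
		}
	}
	*mapped_len = 0;
	return 0;
}

/*
 * Split [gpa, gpa + len) into one segment per host page touched.
 * Returns the segment count, or -1 if a translation fails or segs[]
 * is too small.
 */
static int
split_range(const struct page_map *pages, size_t nr_pages,
	    uint64_t gpa, uint64_t len, struct seg *segs, size_t max_segs)
{
	size_t n = 0;

	while (len > 0) {
		uint64_t mapped;
		uint64_t hpa = first_hpa(pages, nr_pages, gpa, len, &mapped);

		if (mapped == 0 || n == max_segs)
			return -1;
		segs[n].hpa = hpa;
		segs[n].len = mapped;
		n++;
		gpa += mapped;
		len -= mapped;
	}
	return (int)n;
}
```

With two 4 KiB pages mapped to non-adjacent host addresses, a 4 KiB buffer
starting in the middle of the first page yields two segments, while a
page-aligned 4 KiB buffer yields exactly one.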


Thread overview: 16+ messages
2020-07-20  2:52 [dpdk-dev] [PATCH v1] vhost: support cross page buf in async data path patrick.fu
2020-07-20 16:39 ` Maxime Coquelin
2020-07-21  2:57   ` Fu, Patrick
2020-07-21  8:35     ` Maxime Coquelin
2020-07-21  9:01       ` Fu, Patrick
2020-07-24 13:49 ` [dpdk-dev] [PATCH v2] vhost: fix async copy fail on multi-page buffers patrick.fu
2020-07-27  6:33 ` [dpdk-dev] [PATCH v3] " patrick.fu
2020-07-27 13:14   ` Xia, Chenbo
2020-07-28  3:09     ` Fu, Patrick
2020-07-28  3:28 ` [dpdk-dev] [PATCH v4] " patrick.fu
2020-07-28 13:55   ` Maxime Coquelin
2020-07-29  1:40     ` Fu, Patrick
2020-07-29  2:05       ` Fu, Patrick [this message]
2020-07-29  2:04 ` [dpdk-dev] [PATCH v5] " Patrick Fu
2020-07-29 14:24   ` Maxime Coquelin
2020-07-29 14:55   ` Maxime Coquelin
