DPDK patches and discussions
* [PATCH] vhost: fix get hpa fail from guest pages
@ 2021-11-11  6:27 Yuan Wang
  2021-11-15  8:04 ` Xia, Chenbo
  0 siblings, 1 reply; 8+ messages in thread
From: Yuan Wang @ 2021-11-11  6:27 UTC
  To: maxime.coquelin, chenbo.xia
  Cc: dev, jiayu.hu, xuan.ding, wenwux.ma, xingguang.he, yvonnex.yang,
	yuanx.wang

When processing front-end memory region messages,
vhost saves the guest/host physical address mappings to guest pages
and merges adjacent pages if their hpa is contiguous.
However, gpa is likely not contiguous in PA mode,
so merging changes the gpa range that an entry covers.
This patch distinguishes the case of discontiguous gpa
and does a range lookup on gpa when doing a binary search.

Fixes: e246896178e ("vhost: get guest/host physical address mappings")
Fixes: 6563cf92380 ("vhost: fix async copy on multi-page buffers")

Signed-off-by: Yuan Wang <yuanx.wang@intel.com>
---
 lib/vhost/vhost.h      | 18 ++++++++++++++++--
 lib/vhost/vhost_user.c | 15 +++++++++++----
 2 files changed, 27 insertions(+), 6 deletions(-)

diff --git a/lib/vhost/vhost.h b/lib/vhost/vhost.h
index 7085e0885c..b3f0c1d07c 100644
--- a/lib/vhost/vhost.h
+++ b/lib/vhost/vhost.h
@@ -587,6 +587,20 @@ static __rte_always_inline int guest_page_addrcmp(const void *p1,
 	return 0;
 }
 
+static __rte_always_inline int guest_page_rangecmp(const void *p1, const void *p2)
+{
+	const struct guest_page *page1 = (const struct guest_page *)p1;
+	const struct guest_page *page2 = (const struct guest_page *)p2;
+
+	if (page1->guest_phys_addr >= page2->guest_phys_addr) {
+		if (page1->guest_phys_addr < page2->guest_phys_addr + page2->size)
+			return 0;
+		else
+			return 1;
+	} else
+		return -1;
+}
+
 static __rte_always_inline rte_iova_t
 gpa_to_first_hpa(struct virtio_net *dev, uint64_t gpa,
 	uint64_t gpa_size, uint64_t *hpa_size)
@@ -597,9 +611,9 @@ gpa_to_first_hpa(struct virtio_net *dev, uint64_t gpa,
 
 	*hpa_size = gpa_size;
 	if (dev->nr_guest_pages >= VHOST_BINARY_SEARCH_THRESH) {
-		key.guest_phys_addr = gpa & ~(dev->guest_pages[0].size - 1);
+		key.guest_phys_addr = gpa;
 		page = bsearch(&key, dev->guest_pages, dev->nr_guest_pages,
-			       sizeof(struct guest_page), guest_page_addrcmp);
+			       sizeof(struct guest_page), guest_page_rangecmp);
 		if (page) {
 			if (gpa + gpa_size <=
 					page->guest_phys_addr + page->size) {
diff --git a/lib/vhost/vhost_user.c b/lib/vhost/vhost_user.c
index a781346c4d..7d58fde458 100644
--- a/lib/vhost/vhost_user.c
+++ b/lib/vhost/vhost_user.c
@@ -999,10 +999,17 @@ add_one_guest_page(struct virtio_net *dev, uint64_t guest_phys_addr,
 	if (dev->nr_guest_pages > 0) {
 		last_page = &dev->guest_pages[dev->nr_guest_pages - 1];
 		/* merge if the two pages are continuous */
-		if (host_phys_addr == last_page->host_phys_addr +
-				      last_page->size) {
-			last_page->size += size;
-			return 0;
+		if (host_phys_addr == last_page->host_phys_addr + last_page->size) {
+			if (rte_eal_iova_mode() == RTE_IOVA_VA) {
+				last_page->size += size;
+				return 0;
+			}
+
+			if (rte_eal_iova_mode() == RTE_IOVA_PA &&
+				guest_phys_addr == last_page->guest_phys_addr + last_page->size) {
+				last_page->size += size;
+				return 0;
+			}
 		}
 	}
 
-- 
2.25.1
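
The range lookup described in the commit message can be tried outside DPDK
with a minimal sketch. It assumes a simplified struct guest_page holding only
the three fields the patch touches, and lookup_hpa() is a hypothetical helper
written for this illustration, not a function from lib/vhost. The point it
shows: once entries have different sizes, masking the key with the size of
entry 0 and matching on the start address no longer works, whereas a
comparator that tests "key inside [gpa, gpa + size)" still does.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Simplified stand-in for DPDK's struct guest_page (assumption: only
 * the three fields used by the patch are modelled here). */
struct guest_page {
	uint64_t guest_phys_addr;
	uint64_t host_phys_addr;
	uint64_t size;
};

/* bsearch() comparator: p1 is the key, p2 is an array element.
 * Returns 0 when the key address falls inside [gpa, gpa + size). */
static int
guest_page_rangecmp(const void *p1, const void *p2)
{
	const struct guest_page *key = p1;
	const struct guest_page *page = p2;

	if (key->guest_phys_addr < page->guest_phys_addr)
		return -1;
	if (key->guest_phys_addr >= page->guest_phys_addr + page->size)
		return 1;
	return 0;
}

/* Hypothetical helper: translate gpa to a host address, or return 0
 * when no entry covers it. Entries must be sorted by guest_phys_addr
 * and must not overlap. */
static uint64_t
lookup_hpa(const struct guest_page *pages, size_t nr, uint64_t gpa)
{
	struct guest_page key = { .guest_phys_addr = gpa };
	const struct guest_page *page;

	assert(pages != NULL);
	page = bsearch(&key, pages, nr, sizeof(*pages), guest_page_rangecmp);
	if (page == NULL)
		return 0;
	return page->host_phys_addr + (gpa - page->guest_phys_addr);
}
```

With a table holding one 0x3000-byte entry at gpa 0x0 and one 0x1000-byte
entry at gpa 0x8000, a lookup of gpa 0x2010 lands in the first entry even
though 0x2010 is not the start of any entry. The sketch uses 0 as a
"not found" value for brevity; real code would need a distinct error path.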



* RE: [PATCH] vhost: fix get hpa fail from guest pages
  2021-11-11  6:27 [PATCH] vhost: fix get hpa fail from guest pages Yuan Wang
@ 2021-11-15  8:04 ` Xia, Chenbo
  2021-11-16  8:07   ` Xia, Chenbo
  0 siblings, 1 reply; 8+ messages in thread
From: Xia, Chenbo @ 2021-11-15  8:04 UTC
  To: Wang, YuanX, maxime.coquelin
  Cc: dev, Hu, Jiayu, Ding, Xuan, Ma, WenwuX, He, Xingguang, Yang, YvonneX

> -----Original Message-----
> From: Wang, YuanX <yuanx.wang@intel.com>
> Sent: Thursday, November 11, 2021 2:27 PM
> To: maxime.coquelin@redhat.com; Xia, Chenbo <chenbo.xia@intel.com>
> Cc: dev@dpdk.org; Hu, Jiayu <jiayu.hu@intel.com>; Ding, Xuan
> <xuan.ding@intel.com>; Ma, WenwuX <wenwux.ma@intel.com>; He, Xingguang
> <xingguang.he@intel.com>; Yang, YvonneX <yvonnex.yang@intel.com>; Wang, YuanX
> <yuanx.wang@intel.com>
> Subject: [PATCH] vhost: fix get hpa fail from guest pages
> 
> When processing front-end memory regions messages,
> vhost saves the guest/host physical address mappings to guest pages
> and merges adjacent contiguous pages if hpa is contiguous,
> however gpa is likely not contiguous in PA mode
> and merging will cause the gpa range to change.
> This patch distinguishes the case of discontinuous gpa
> and does a range lookup on gpa when doing a binary search.
> 
> Fixes: e246896178e("vhost: get guest/host physical address mappings")
> Fixes: 6563cf92380 ("vhost: fix async copy on multi-page buffers")
> 
> Signed-off-by: Yuan Wang <yuanx.wang@intel.com>
> ---
>  lib/vhost/vhost.h      | 18 ++++++++++++++++--
>  lib/vhost/vhost_user.c | 15 +++++++++++----
>  2 files changed, 27 insertions(+), 6 deletions(-)
> 
> diff --git a/lib/vhost/vhost.h b/lib/vhost/vhost.h
> index 7085e0885c..b3f0c1d07c 100644
> --- a/lib/vhost/vhost.h
> +++ b/lib/vhost/vhost.h
> @@ -587,6 +587,20 @@ static __rte_always_inline int guest_page_addrcmp(const
> void *p1,
>  	return 0;
>  }
> 
> +static __rte_always_inline int guest_page_rangecmp(const void *p1, const void
> *p2)
> +{
> +	const struct guest_page *page1 = (const struct guest_page *)p1;
> +	const struct guest_page *page2 = (const struct guest_page *)p2;
> +
> +	if (page1->guest_phys_addr >= page2->guest_phys_addr) {
> +		if (page1->guest_phys_addr < page2->guest_phys_addr + page2->size)
> +			return 0;
> +		else
> +			return 1;
> +	} else
> +		return -1;
> +}
> +
>  static __rte_always_inline rte_iova_t
>  gpa_to_first_hpa(struct virtio_net *dev, uint64_t gpa,
>  	uint64_t gpa_size, uint64_t *hpa_size)
> @@ -597,9 +611,9 @@ gpa_to_first_hpa(struct virtio_net *dev, uint64_t gpa,
> 
>  	*hpa_size = gpa_size;
>  	if (dev->nr_guest_pages >= VHOST_BINARY_SEARCH_THRESH) {
> -		key.guest_phys_addr = gpa & ~(dev->guest_pages[0].size - 1);
> +		key.guest_phys_addr = gpa;
>  		page = bsearch(&key, dev->guest_pages, dev->nr_guest_pages,
> -			       sizeof(struct guest_page), guest_page_addrcmp);
> +			       sizeof(struct guest_page), guest_page_rangecmp);
>  		if (page) {
>  			if (gpa + gpa_size <=
>  					page->guest_phys_addr + page->size) {
> diff --git a/lib/vhost/vhost_user.c b/lib/vhost/vhost_user.c
> index a781346c4d..7d58fde458 100644
> --- a/lib/vhost/vhost_user.c
> +++ b/lib/vhost/vhost_user.c
> @@ -999,10 +999,17 @@ add_one_guest_page(struct virtio_net *dev, uint64_t
> guest_phys_addr,
>  	if (dev->nr_guest_pages > 0) {
>  		last_page = &dev->guest_pages[dev->nr_guest_pages - 1];
>  		/* merge if the two pages are continuous */
> -		if (host_phys_addr == last_page->host_phys_addr +
> -				      last_page->size) {
> -			last_page->size += size;
> -			return 0;
> +		if (host_phys_addr == last_page->host_phys_addr + last_page->size) {
> +			if (rte_eal_iova_mode() == RTE_IOVA_VA) {
> +				last_page->size += size;
> +				return 0;
> +			}

This raises a question: in IOVA_VA mode, what ensures that HPA and GPA are
both contiguous?

Maxime & Yuan, any thoughts?

Thanks,
Chenbo

> +
> +			if (rte_eal_iova_mode() == RTE_IOVA_PA &&
> +				guest_phys_addr == last_page->guest_phys_addr + last_page->size) {
> +				last_page->size += size;
> +				return 0;
> +			}
>  		}
>  	}
> 
> --
> 2.25.1



* RE: [PATCH] vhost: fix get hpa fail from guest pages
  2021-11-15  8:04 ` Xia, Chenbo
@ 2021-11-16  8:07   ` Xia, Chenbo
  2021-11-16  8:47     ` Ding, Xuan
  0 siblings, 1 reply; 8+ messages in thread
From: Xia, Chenbo @ 2021-11-16  8:07 UTC
  To: Xia, Chenbo, Wang, YuanX, maxime.coquelin
  Cc: dev, Hu, Jiayu, Ding, Xuan, Ma, WenwuX, He, Xingguang, Yang, YvonneX

> -----Original Message-----
> From: Xia, Chenbo <chenbo.xia@intel.com>
> Sent: Monday, November 15, 2021 4:04 PM
> To: Wang, YuanX <yuanx.wang@intel.com>; maxime.coquelin@redhat.com
> Cc: dev@dpdk.org; Hu, Jiayu <jiayu.hu@intel.com>; Ding, Xuan
> <xuan.ding@intel.com>; Ma, WenwuX <wenwux.ma@intel.com>; He, Xingguang
> <xingguang.he@intel.com>; Yang, YvonneX <yvonnex.yang@intel.com>
> Subject: RE: [PATCH] vhost: fix get hpa fail from guest pages
> 
> > -----Original Message-----
> > From: Wang, YuanX <yuanx.wang@intel.com>
> > Sent: Thursday, November 11, 2021 2:27 PM
> > To: maxime.coquelin@redhat.com; Xia, Chenbo <chenbo.xia@intel.com>
> > Cc: dev@dpdk.org; Hu, Jiayu <jiayu.hu@intel.com>; Ding, Xuan
> > <xuan.ding@intel.com>; Ma, WenwuX <wenwux.ma@intel.com>; He, Xingguang
> > <xingguang.he@intel.com>; Yang, YvonneX <yvonnex.yang@intel.com>; Wang,
> YuanX
> > <yuanx.wang@intel.com>
> > Subject: [PATCH] vhost: fix get hpa fail from guest pages
> >
> > When processing front-end memory regions messages,
> > vhost saves the guest/host physical address mappings to guest pages
> > and merges adjacent contiguous pages if hpa is contiguous,
> > however gpa is likely not contiguous in PA mode
> > and merging will cause the gpa range to change.
> > This patch distinguishes the case of discontinuous gpa
> > and does a range lookup on gpa when doing a binary search.
> >
> > Fixes: e246896178e("vhost: get guest/host physical address mappings")
> > Fixes: 6563cf92380 ("vhost: fix async copy on multi-page buffers")
> >
> > Signed-off-by: Yuan Wang <yuanx.wang@intel.com>
> > ---
> >  lib/vhost/vhost.h      | 18 ++++++++++++++++--
> >  lib/vhost/vhost_user.c | 15 +++++++++++----
> >  2 files changed, 27 insertions(+), 6 deletions(-)
> >
> > diff --git a/lib/vhost/vhost.h b/lib/vhost/vhost.h
> > index 7085e0885c..b3f0c1d07c 100644
> > --- a/lib/vhost/vhost.h
> > +++ b/lib/vhost/vhost.h
> > @@ -587,6 +587,20 @@ static __rte_always_inline int guest_page_addrcmp(const
> > void *p1,
> >  	return 0;
> >  }
> >
> > +static __rte_always_inline int guest_page_rangecmp(const void *p1, const
> void
> > *p2)
> > +{
> > +	const struct guest_page *page1 = (const struct guest_page *)p1;
> > +	const struct guest_page *page2 = (const struct guest_page *)p2;
> > +
> > +	if (page1->guest_phys_addr >= page2->guest_phys_addr) {
> > +		if (page1->guest_phys_addr < page2->guest_phys_addr + page2->size)
> > +			return 0;
> > +		else
> > +			return 1;
> > +	} else
> > +		return -1;
> > +}
> > +
> >  static __rte_always_inline rte_iova_t
> >  gpa_to_first_hpa(struct virtio_net *dev, uint64_t gpa,
> >  	uint64_t gpa_size, uint64_t *hpa_size)
> > @@ -597,9 +611,9 @@ gpa_to_first_hpa(struct virtio_net *dev, uint64_t gpa,
> >
> >  	*hpa_size = gpa_size;
> >  	if (dev->nr_guest_pages >= VHOST_BINARY_SEARCH_THRESH) {
> > -		key.guest_phys_addr = gpa & ~(dev->guest_pages[0].size - 1);
> > +		key.guest_phys_addr = gpa;
> >  		page = bsearch(&key, dev->guest_pages, dev->nr_guest_pages,
> > -			       sizeof(struct guest_page), guest_page_addrcmp);
> > +			       sizeof(struct guest_page), guest_page_rangecmp);
> >  		if (page) {
> >  			if (gpa + gpa_size <=
> >  					page->guest_phys_addr + page->size) {
> > diff --git a/lib/vhost/vhost_user.c b/lib/vhost/vhost_user.c
> > index a781346c4d..7d58fde458 100644
> > --- a/lib/vhost/vhost_user.c
> > +++ b/lib/vhost/vhost_user.c
> > @@ -999,10 +999,17 @@ add_one_guest_page(struct virtio_net *dev, uint64_t
> > guest_phys_addr,
> >  	if (dev->nr_guest_pages > 0) {
> >  		last_page = &dev->guest_pages[dev->nr_guest_pages - 1];
> >  		/* merge if the two pages are continuous */
> > -		if (host_phys_addr == last_page->host_phys_addr +
> > -				      last_page->size) {
> > -			last_page->size += size;
> > -			return 0;
> > +		if (host_phys_addr == last_page->host_phys_addr + last_page->size)
> > {
> > +			if (rte_eal_iova_mode() == RTE_IOVA_VA) {
> > +				last_page->size += size;
> > +				return 0;
> > +			}
> 
> This makes me think about a question: In IOVA_VA mode, what ensures HPA and
> GPA are
> both contiguous?

When I wrote the previous email, I thought host_phys_addr was the HPA, but in VA
mode it is the host IOVA, i.e. the VA in DPDK's case. So in most cases GPA and VA
will both be contiguous when the contiguous pages are all in one memory region.
But I think we should not make that assumption: some vhost master may send
GPA-contiguous pages in different memory regions, and then the VAs after mmap
may not be contiguous.

Maxime, what do you think?

Thanks,
Chenbo
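
One way to avoid relying on that assumption is to merge only when the new
page follows the previous entry in both address spaces, whatever the IOVA
mode. The sketch below is an illustration of that idea, not the code that was
eventually applied; struct guest_page is simplified and try_merge_page() is a
hypothetical name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for DPDK's struct guest_page (assumption). In VA
 * mode host_phys_addr would hold the host IOVA, as noted above. */
struct guest_page {
	uint64_t guest_phys_addr;
	uint64_t host_phys_addr;
	uint64_t size;
};

/* Hypothetical helper: extend *last and return true only when the new
 * page directly follows it in BOTH guest and host address space.
 * Otherwise leave *last untouched and return false, so the caller
 * appends a separate entry. */
static bool
try_merge_page(struct guest_page *last, uint64_t guest_phys_addr,
	       uint64_t host_phys_addr, uint64_t size)
{
	assert(last != NULL);

	if (guest_phys_addr != last->guest_phys_addr + last->size)
		return false;
	if (host_phys_addr != last->host_phys_addr + last->size)
		return false;

	last->size += size;
	return true;
}
```

With this check a merged entry always describes one linear gpa-to-host
mapping, so a lookup can compute the host address as
host_phys_addr + (gpa - guest_phys_addr) without caring how many pages were
folded into the entry.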

> 
> Maxime & Yuan, any thought?
> 
> Thanks,
> Chenbo
> 
> > +
> > +			if (rte_eal_iova_mode() == RTE_IOVA_PA &&
> > +				guest_phys_addr == last_page->guest_phys_addr +
> > last_page->size) {
> > +				last_page->size += size;
> > +				return 0;
> > +			}
> >  		}
> >  	}
> >
> > --
> > 2.25.1



* RE: [PATCH] vhost: fix get hpa fail from guest pages
  2021-11-16  8:07   ` Xia, Chenbo
@ 2021-11-16  8:47     ` Ding, Xuan
  2021-11-16  8:51       ` Maxime Coquelin
  2021-11-16  9:01       ` Xia, Chenbo
  0 siblings, 2 replies; 8+ messages in thread
From: Ding, Xuan @ 2021-11-16  8:47 UTC
  To: Xia, Chenbo, Wang, YuanX, maxime.coquelin
  Cc: dev, Hu, Jiayu, Ma, WenwuX, He, Xingguang, Yang, YvonneX

Hi Chenbo,

> -----Original Message-----
> From: Xia, Chenbo <chenbo.xia@intel.com>
> Sent: Tuesday, November 16, 2021 4:07 PM
> To: Xia, Chenbo <chenbo.xia@intel.com>; Wang, YuanX
> <yuanx.wang@intel.com>; maxime.coquelin@redhat.com
> Cc: dev@dpdk.org; Hu, Jiayu <jiayu.hu@intel.com>; Ding, Xuan
> <xuan.ding@intel.com>; Ma, WenwuX <wenwux.ma@intel.com>; He,
> Xingguang <xingguang.he@intel.com>; Yang, YvonneX
> <yvonnex.yang@intel.com>
> Subject: RE: [PATCH] vhost: fix get hpa fail from guest pages
> 
> > -----Original Message-----
> > From: Xia, Chenbo <chenbo.xia@intel.com>
> > Sent: Monday, November 15, 2021 4:04 PM
> > To: Wang, YuanX <yuanx.wang@intel.com>; maxime.coquelin@redhat.com
> > Cc: dev@dpdk.org; Hu, Jiayu <jiayu.hu@intel.com>; Ding, Xuan
> > <xuan.ding@intel.com>; Ma, WenwuX <wenwux.ma@intel.com>; He,
> Xingguang
> > <xingguang.he@intel.com>; Yang, YvonneX <yvonnex.yang@intel.com>
> > Subject: RE: [PATCH] vhost: fix get hpa fail from guest pages
> >
> > > -----Original Message-----
> > > From: Wang, YuanX <yuanx.wang@intel.com>
> > > Sent: Thursday, November 11, 2021 2:27 PM
> > > To: maxime.coquelin@redhat.com; Xia, Chenbo <chenbo.xia@intel.com>
> > > Cc: dev@dpdk.org; Hu, Jiayu <jiayu.hu@intel.com>; Ding, Xuan
> > > <xuan.ding@intel.com>; Ma, WenwuX <wenwux.ma@intel.com>; He,
> > > Xingguang <xingguang.he@intel.com>; Yang, YvonneX
> > > <yvonnex.yang@intel.com>; Wang,
> > YuanX
> > > <yuanx.wang@intel.com>
> > > Subject: [PATCH] vhost: fix get hpa fail from guest pages
> > >
> > > When processing front-end memory regions messages, vhost saves the
> > > guest/host physical address mappings to guest pages and merges
> > > adjacent contiguous pages if hpa is contiguous, however gpa is
> > > likely not contiguous in PA mode and merging will cause the gpa
> > > range to change.
> > > This patch distinguishes the case of discontinuous gpa and does a
> > > range lookup on gpa when doing a binary search.
> > >
> > > Fixes: e246896178e("vhost: get guest/host physical address
> > > mappings")
> > > Fixes: 6563cf92380 ("vhost: fix async copy on multi-page buffers")
> > >
> > > Signed-off-by: Yuan Wang <yuanx.wang@intel.com>
> > > ---
> > >  lib/vhost/vhost.h      | 18 ++++++++++++++++--
> > >  lib/vhost/vhost_user.c | 15 +++++++++++----
> > >  2 files changed, 27 insertions(+), 6 deletions(-)
> > >
> > > diff --git a/lib/vhost/vhost.h b/lib/vhost/vhost.h index
> > > 7085e0885c..b3f0c1d07c 100644
> > > --- a/lib/vhost/vhost.h
> > > +++ b/lib/vhost/vhost.h
> > > @@ -587,6 +587,20 @@ static __rte_always_inline int
> > > guest_page_addrcmp(const void *p1,
> > >  	return 0;
> > >  }
> > >
> > > +static __rte_always_inline int guest_page_rangecmp(const void *p1,
> > > +const
> > void
> > > *p2)
> > > +{
> > > +	const struct guest_page *page1 = (const struct guest_page *)p1;
> > > +	const struct guest_page *page2 = (const struct guest_page *)p2;
> > > +
> > > +	if (page1->guest_phys_addr >= page2->guest_phys_addr) {
> > > +		if (page1->guest_phys_addr < page2->guest_phys_addr +
> page2->size)
> > > +			return 0;
> > > +		else
> > > +			return 1;
> > > +	} else
> > > +		return -1;
> > > +}
> > > +
> > >  static __rte_always_inline rte_iova_t  gpa_to_first_hpa(struct
> > > virtio_net *dev, uint64_t gpa,
> > >  	uint64_t gpa_size, uint64_t *hpa_size) @@ -597,9 +611,9 @@
> > > gpa_to_first_hpa(struct virtio_net *dev, uint64_t gpa,
> > >
> > >  	*hpa_size = gpa_size;
> > >  	if (dev->nr_guest_pages >= VHOST_BINARY_SEARCH_THRESH) {
> > > -		key.guest_phys_addr = gpa & ~(dev->guest_pages[0].size - 1);
> > > +		key.guest_phys_addr = gpa;
> > >  		page = bsearch(&key, dev->guest_pages, dev->nr_guest_pages,
> > > -			       sizeof(struct guest_page), guest_page_addrcmp);
> > > +			       sizeof(struct guest_page), guest_page_rangecmp);
> > >  		if (page) {
> > >  			if (gpa + gpa_size <=
> > >  					page->guest_phys_addr + page->size)
> { diff --git
> > > a/lib/vhost/vhost_user.c b/lib/vhost/vhost_user.c index
> > > a781346c4d..7d58fde458 100644
> > > --- a/lib/vhost/vhost_user.c
> > > +++ b/lib/vhost/vhost_user.c
> > > @@ -999,10 +999,17 @@ add_one_guest_page(struct virtio_net *dev,
> > > uint64_t guest_phys_addr,
> > >  	if (dev->nr_guest_pages > 0) {
> > >  		last_page = &dev->guest_pages[dev->nr_guest_pages - 1];
> > >  		/* merge if the two pages are continuous */
> > > -		if (host_phys_addr == last_page->host_phys_addr +
> > > -				      last_page->size) {
> > > -			last_page->size += size;
> > > -			return 0;
> > > +		if (host_phys_addr == last_page->host_phys_addr +
> > > +last_page->size)
> > > {
> > > +			if (rte_eal_iova_mode() == RTE_IOVA_VA) {
> > > +				last_page->size += size;
> > > +				return 0;
> > > +			}
> >
> > This makes me think about a question: In IOVA_VA mode, what ensures
> > HPA and GPA are both contiguous?
> 
> When I wrote this email, I thought host_phys_addr is HPA but in VA mode, it's
> Host IOVA, aka VA in DPDK's case. So in most cases GPA and VA will be both
> contiguous when the contiguous pages are all in one memory region. But I think
> we should not make such assumption as some vhost master may send GPA-
> contiguous pages in different memory region, then the VA after mmap may not
> be contiguous. 

We currently do the async VFIO mapping at page granularity.
For 4K/2M pages, without this assumption the pages would not be merged, and the
number of mappings may exceed the IOMMU's capability. That would limit DMA
devices to 1G hugepages only.

Thanks,
Xuan
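
To put rough numbers on this, the sketch below counts how many page-granular
mappings a region needs when nothing is merged. It is plain arithmetic under
one reading of the concern above (one mapping per unmerged page); the helper
name is hypothetical and no particular IOMMU limit is assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: number of mappings needed to cover region_size
 * bytes when every page of page_size bytes is mapped separately. */
static uint64_t
nr_unmerged_mappings(uint64_t region_size, uint64_t page_size)
{
	assert(page_size != 0);
	return (region_size + page_size - 1) / page_size;
}
```

For a 4 GiB region this gives 1048576 mappings with 4K pages, 2048 with 2M
pages and 4 with 1G pages, which is why merging matters far more for the
smaller page sizes.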

> 
> Maxime, What do you think?
> 
> Thanks,
> Chenbo
> 
> >
> > Maxime & Yuan, any thought?
> >
> > Thanks,
> > Chenbo
> >
> > > +
> > > +			if (rte_eal_iova_mode() == RTE_IOVA_PA &&
> > > +				guest_phys_addr == last_page-
> >guest_phys_addr +
> > > last_page->size) {
> > > +				last_page->size += size;
> > > +				return 0;
> > > +			}
> > >  		}
> > >  	}
> > >
> > > --
> > > 2.25.1



* Re: [PATCH] vhost: fix get hpa fail from guest pages
  2021-11-16  8:47     ` Ding, Xuan
@ 2021-11-16  8:51       ` Maxime Coquelin
  2021-11-16  9:01       ` Xia, Chenbo
  1 sibling, 0 replies; 8+ messages in thread
From: Maxime Coquelin @ 2021-11-16  8:51 UTC
  To: Ding, Xuan, Xia, Chenbo, Wang, YuanX
  Cc: dev, Hu, Jiayu, Ma, WenwuX, He, Xingguang, Yang, YvonneX

Hi,

On 11/16/21 09:47, Ding, Xuan wrote:
> Hi Chenbo,
> 
>> -----Original Message-----
>> From: Xia, Chenbo <chenbo.xia@intel.com>
>> Sent: Tuesday, November 16, 2021 4:07 PM
>> To: Xia, Chenbo <chenbo.xia@intel.com>; Wang, YuanX
>> <yuanx.wang@intel.com>; maxime.coquelin@redhat.com
>> Cc: dev@dpdk.org; Hu, Jiayu <jiayu.hu@intel.com>; Ding, Xuan
>> <xuan.ding@intel.com>; Ma, WenwuX <wenwux.ma@intel.com>; He,
>> Xingguang <xingguang.he@intel.com>; Yang, YvonneX
>> <yvonnex.yang@intel.com>
>> Subject: RE: [PATCH] vhost: fix get hpa fail from guest pages
>>
>>> -----Original Message-----
>>> From: Xia, Chenbo <chenbo.xia@intel.com>
>>> Sent: Monday, November 15, 2021 4:04 PM
>>> To: Wang, YuanX <yuanx.wang@intel.com>; maxime.coquelin@redhat.com
>>> Cc: dev@dpdk.org; Hu, Jiayu <jiayu.hu@intel.com>; Ding, Xuan
>>> <xuan.ding@intel.com>; Ma, WenwuX <wenwux.ma@intel.com>; He,
>> Xingguang
>>> <xingguang.he@intel.com>; Yang, YvonneX <yvonnex.yang@intel.com>
>>> Subject: RE: [PATCH] vhost: fix get hpa fail from guest pages
>>>
>>>> -----Original Message-----
>>>> From: Wang, YuanX <yuanx.wang@intel.com>
>>>> Sent: Thursday, November 11, 2021 2:27 PM
>>>> To: maxime.coquelin@redhat.com; Xia, Chenbo <chenbo.xia@intel.com>
>>>> Cc: dev@dpdk.org; Hu, Jiayu <jiayu.hu@intel.com>; Ding, Xuan
>>>> <xuan.ding@intel.com>; Ma, WenwuX <wenwux.ma@intel.com>; He,
>>>> Xingguang <xingguang.he@intel.com>; Yang, YvonneX
>>>> <yvonnex.yang@intel.com>; Wang,
>>> YuanX
>>>> <yuanx.wang@intel.com>
>>>> Subject: [PATCH] vhost: fix get hpa fail from guest pages
>>>>
>>>> When processing front-end memory regions messages, vhost saves the
>>>> guest/host physical address mappings to guest pages and merges
>>>> adjacent contiguous pages if hpa is contiguous, however gpa is
>>>> likely not contiguous in PA mode and merging will cause the gpa
>>>> range to change.
>>>> This patch distinguishes the case of discontinuous gpa and does a
>>>> range lookup on gpa when doing a binary search.
>>>>
>>>> Fixes: e246896178e("vhost: get guest/host physical address
>>>> mappings")
>>>> Fixes: 6563cf92380 ("vhost: fix async copy on multi-page buffers")
>>>>
>>>> Signed-off-by: Yuan Wang <yuanx.wang@intel.com>
>>>> ---
>>>>   lib/vhost/vhost.h      | 18 ++++++++++++++++--
>>>>   lib/vhost/vhost_user.c | 15 +++++++++++----
>>>>   2 files changed, 27 insertions(+), 6 deletions(-)
>>>>
>>>> diff --git a/lib/vhost/vhost.h b/lib/vhost/vhost.h index
>>>> 7085e0885c..b3f0c1d07c 100644
>>>> --- a/lib/vhost/vhost.h
>>>> +++ b/lib/vhost/vhost.h
>>>> @@ -587,6 +587,20 @@ static __rte_always_inline int
>>>> guest_page_addrcmp(const void *p1,
>>>>   	return 0;
>>>>   }
>>>>
>>>> +static __rte_always_inline int guest_page_rangecmp(const void *p1,
>>>> +const
>>> void
>>>> *p2)
>>>> +{
>>>> +	const struct guest_page *page1 = (const struct guest_page *)p1;
>>>> +	const struct guest_page *page2 = (const struct guest_page *)p2;
>>>> +
>>>> +	if (page1->guest_phys_addr >= page2->guest_phys_addr) {
>>>> +		if (page1->guest_phys_addr < page2->guest_phys_addr +
>> page2->size)
>>>> +			return 0;
>>>> +		else
>>>> +			return 1;
>>>> +	} else
>>>> +		return -1;
>>>> +}
>>>> +
>>>>   static __rte_always_inline rte_iova_t  gpa_to_first_hpa(struct
>>>> virtio_net *dev, uint64_t gpa,
>>>>   	uint64_t gpa_size, uint64_t *hpa_size) @@ -597,9 +611,9 @@
>>>> gpa_to_first_hpa(struct virtio_net *dev, uint64_t gpa,
>>>>
>>>>   	*hpa_size = gpa_size;
>>>>   	if (dev->nr_guest_pages >= VHOST_BINARY_SEARCH_THRESH) {
>>>> -		key.guest_phys_addr = gpa & ~(dev->guest_pages[0].size - 1);
>>>> +		key.guest_phys_addr = gpa;
>>>>   		page = bsearch(&key, dev->guest_pages, dev->nr_guest_pages,
>>>> -			       sizeof(struct guest_page), guest_page_addrcmp);
>>>> +			       sizeof(struct guest_page), guest_page_rangecmp);
>>>>   		if (page) {
>>>>   			if (gpa + gpa_size <=
>>>>   					page->guest_phys_addr + page->size)
>> { diff --git
>>>> a/lib/vhost/vhost_user.c b/lib/vhost/vhost_user.c index
>>>> a781346c4d..7d58fde458 100644
>>>> --- a/lib/vhost/vhost_user.c
>>>> +++ b/lib/vhost/vhost_user.c
>>>> @@ -999,10 +999,17 @@ add_one_guest_page(struct virtio_net *dev,
>>>> uint64_t guest_phys_addr,
>>>>   	if (dev->nr_guest_pages > 0) {
>>>>   		last_page = &dev->guest_pages[dev->nr_guest_pages - 1];
>>>>   		/* merge if the two pages are continuous */
>>>> -		if (host_phys_addr == last_page->host_phys_addr +
>>>> -				      last_page->size) {
>>>> -			last_page->size += size;
>>>> -			return 0;
>>>> +		if (host_phys_addr == last_page->host_phys_addr +
>>>> +last_page->size)
>>>> {
>>>> +			if (rte_eal_iova_mode() == RTE_IOVA_VA) {
>>>> +				last_page->size += size;
>>>> +				return 0;
>>>> +			}
>>>
>>> This makes me think about a question: In IOVA_VA mode, what ensures
>>> HPA and GPA are both contiguous?
>>
>> When I wrote this email, I thought host_phys_addr is HPA but in VA mode, it's
>> Host IOVA, aka VA in DPDK's case. So in most cases GPA and VA will be both
>> contiguous when the contiguous pages are all in one memory region. But I think
>> we should not make such assumption as some vhost master may send GPA-
>> contiguous pages in different memory region, then the VA after mmap may not
>> be contiguous.
> 
> Now we do async vfio mapping at page granularity.
> For 4K/2M pages, without this assumption, pages not be merged may
> exceed the IOMMU's capability. This limits only use 1G hugepage for DMA device.

I'm sorry, but I'm not sure I understand your explanation.
Could you please rephrase it?

Thanks,
Maxime

> Thanks,
> Xuan
> 
>>
>> Maxime, What do you think?
>>
>> Thanks,
>> Chenbo
>>
>>>
>>> Maxime & Yuan, any thought?
>>>
>>> Thanks,
>>> Chenbo
>>>
>>>> +
>>>> +			if (rte_eal_iova_mode() == RTE_IOVA_PA &&
>>>> +				guest_phys_addr == last_page-
>>> guest_phys_addr +
>>>> last_page->size) {
>>>> +				last_page->size += size;
>>>> +				return 0;
>>>> +			}
>>>>   		}
>>>>   	}
>>>>
>>>> --
>>>> 2.25.1
> 



* RE: [PATCH] vhost: fix get hpa fail from guest pages
  2021-11-16  8:47     ` Ding, Xuan
  2021-11-16  8:51       ` Maxime Coquelin
@ 2021-11-16  9:01       ` Xia, Chenbo
  2021-11-16 10:08         ` Ding, Xuan
  1 sibling, 1 reply; 8+ messages in thread
From: Xia, Chenbo @ 2021-11-16  9:01 UTC
  To: Ding, Xuan, Wang, YuanX, maxime.coquelin
  Cc: dev, Hu, Jiayu, Ma, WenwuX, He, Xingguang, Yang, YvonneX

> -----Original Message-----
> From: Ding, Xuan <xuan.ding@intel.com>
> Sent: Tuesday, November 16, 2021 4:47 PM
> To: Xia, Chenbo <chenbo.xia@intel.com>; Wang, YuanX <yuanx.wang@intel.com>;
> maxime.coquelin@redhat.com
> Cc: dev@dpdk.org; Hu, Jiayu <jiayu.hu@intel.com>; Ma, WenwuX
> <wenwux.ma@intel.com>; He, Xingguang <xingguang.he@intel.com>; Yang, YvonneX
> <yvonnex.yang@intel.com>
> Subject: RE: [PATCH] vhost: fix get hpa fail from guest pages
> 
> Hi Chenbo,
> 
> > -----Original Message-----
> > From: Xia, Chenbo <chenbo.xia@intel.com>
> > Sent: Tuesday, November 16, 2021 4:07 PM
> > To: Xia, Chenbo <chenbo.xia@intel.com>; Wang, YuanX
> > <yuanx.wang@intel.com>; maxime.coquelin@redhat.com
> > Cc: dev@dpdk.org; Hu, Jiayu <jiayu.hu@intel.com>; Ding, Xuan
> > <xuan.ding@intel.com>; Ma, WenwuX <wenwux.ma@intel.com>; He,
> > Xingguang <xingguang.he@intel.com>; Yang, YvonneX
> > <yvonnex.yang@intel.com>
> > Subject: RE: [PATCH] vhost: fix get hpa fail from guest pages
> >
> > > -----Original Message-----
> > > From: Xia, Chenbo <chenbo.xia@intel.com>
> > > Sent: Monday, November 15, 2021 4:04 PM
> > > To: Wang, YuanX <yuanx.wang@intel.com>; maxime.coquelin@redhat.com
> > > Cc: dev@dpdk.org; Hu, Jiayu <jiayu.hu@intel.com>; Ding, Xuan
> > > <xuan.ding@intel.com>; Ma, WenwuX <wenwux.ma@intel.com>; He,
> > Xingguang
> > > <xingguang.he@intel.com>; Yang, YvonneX <yvonnex.yang@intel.com>
> > > Subject: RE: [PATCH] vhost: fix get hpa fail from guest pages
> > >
> > > > -----Original Message-----
> > > > From: Wang, YuanX <yuanx.wang@intel.com>
> > > > Sent: Thursday, November 11, 2021 2:27 PM
> > > > To: maxime.coquelin@redhat.com; Xia, Chenbo <chenbo.xia@intel.com>
> > > > Cc: dev@dpdk.org; Hu, Jiayu <jiayu.hu@intel.com>; Ding, Xuan
> > > > <xuan.ding@intel.com>; Ma, WenwuX <wenwux.ma@intel.com>; He,
> > > > Xingguang <xingguang.he@intel.com>; Yang, YvonneX
> > > > <yvonnex.yang@intel.com>; Wang,
> > > YuanX
> > > > <yuanx.wang@intel.com>
> > > > Subject: [PATCH] vhost: fix get hpa fail from guest pages
> > > >
> > > > When processing front-end memory regions messages, vhost saves the
> > > > guest/host physical address mappings to guest pages and merges
> > > > adjacent contiguous pages if hpa is contiguous, however gpa is
> > > > likely not contiguous in PA mode and merging will cause the gpa
> > > > range to change.
> > > > This patch distinguishes the case of discontinuous gpa and does a
> > > > range lookup on gpa when doing a binary search.
> > > >
> > > > Fixes: e246896178e("vhost: get guest/host physical address
> > > > mappings")
> > > > Fixes: 6563cf92380 ("vhost: fix async copy on multi-page buffers")
> > > >
> > > > Signed-off-by: Yuan Wang <yuanx.wang@intel.com>
> > > > ---
> > > >  lib/vhost/vhost.h      | 18 ++++++++++++++++--
> > > >  lib/vhost/vhost_user.c | 15 +++++++++++----
> > > >  2 files changed, 27 insertions(+), 6 deletions(-)
> > > >
> > > > diff --git a/lib/vhost/vhost.h b/lib/vhost/vhost.h index
> > > > 7085e0885c..b3f0c1d07c 100644
> > > > --- a/lib/vhost/vhost.h
> > > > +++ b/lib/vhost/vhost.h
> > > > @@ -587,6 +587,20 @@ static __rte_always_inline int
> > > > guest_page_addrcmp(const void *p1,
> > > >  	return 0;
> > > >  }
> > > >
> > > > +static __rte_always_inline int guest_page_rangecmp(const void *p1,
> > > > +const
> > > void
> > > > *p2)
> > > > +{
> > > > +	const struct guest_page *page1 = (const struct guest_page *)p1;
> > > > +	const struct guest_page *page2 = (const struct guest_page *)p2;
> > > > +
> > > > +	if (page1->guest_phys_addr >= page2->guest_phys_addr) {
> > > > +		if (page1->guest_phys_addr < page2->guest_phys_addr +
> > page2->size)
> > > > +			return 0;
> > > > +		else
> > > > +			return 1;
> > > > +	} else
> > > > +		return -1;
> > > > +}
> > > > +
> > > >  static __rte_always_inline rte_iova_t  gpa_to_first_hpa(struct
> > > > virtio_net *dev, uint64_t gpa,
> > > >  	uint64_t gpa_size, uint64_t *hpa_size) @@ -597,9 +611,9 @@
> > > > gpa_to_first_hpa(struct virtio_net *dev, uint64_t gpa,
> > > >
> > > >  	*hpa_size = gpa_size;
> > > >  	if (dev->nr_guest_pages >= VHOST_BINARY_SEARCH_THRESH) {
> > > > -		key.guest_phys_addr = gpa & ~(dev->guest_pages[0].size - 1);
> > > > +		key.guest_phys_addr = gpa;
> > > >  		page = bsearch(&key, dev->guest_pages, dev->nr_guest_pages,
> > > > -			       sizeof(struct guest_page), guest_page_addrcmp);
> > > > +			       sizeof(struct guest_page), guest_page_rangecmp);
> > > >  		if (page) {
> > > >  			if (gpa + gpa_size <=
> > > >  					page->guest_phys_addr + page->size)
> > { diff --git
> > > > a/lib/vhost/vhost_user.c b/lib/vhost/vhost_user.c index
> > > > a781346c4d..7d58fde458 100644
> > > > --- a/lib/vhost/vhost_user.c
> > > > +++ b/lib/vhost/vhost_user.c
> > > > @@ -999,10 +999,17 @@ add_one_guest_page(struct virtio_net *dev,
> > > > uint64_t guest_phys_addr,
> > > >  	if (dev->nr_guest_pages > 0) {
> > > >  		last_page = &dev->guest_pages[dev->nr_guest_pages - 1];
> > > >  		/* merge if the two pages are continuous */
> > > > -		if (host_phys_addr == last_page->host_phys_addr +
> > > > -				      last_page->size) {
> > > > -			last_page->size += size;
> > > > -			return 0;
> > > > +		if (host_phys_addr == last_page->host_phys_addr +
> > > > +last_page->size)
> > > > {
> > > > +			if (rte_eal_iova_mode() == RTE_IOVA_VA) {
> > > > +				last_page->size += size;
> > > > +				return 0;
> > > > +			}
> > >
> > > This makes me think about a question: In IOVA_VA mode, what ensures
> > > HPA and GPA are both contiguous?
> >
> > When I wrote this email, I thought host_phys_addr is HPA but in VA mode,
> it's
> > Host IOVA, aka VA in DPDK's case. So in most cases GPA and VA will be both
> > contiguous when the contiguous pages are all in one memory region. But I
> think
> > we should not make such assumption as some vhost master may send GPA-
> > contiguous pages in different memory region, then the VA after mmap may not
> > be contiguous.
> 
> Now we do async vfio mapping at page granularity.
> For 4K/2M pages, without this assumption, pages not be merged may
> exceed the IOMMU's capability. This limits only use 1G hugepage for DMA device.

I was not saying merging should not be done, only that it should be done with
correct logic.

That is to say, the pages that should be merged will still be merged, but the
logic should handle all cases.
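
For illustration only, a minimal sketch of what stricter merge logic could look
like: merge the new page into the last entry only when both the guest physical
range and the host address range continue it. The struct and function names
below are simplified, hypothetical stand-ins, not the actual vhost code or the
final fix.

```c
#include <stdint.h>

/* Simplified stand-in for a vhost guest page entry (hypothetical layout). */
struct guest_page_sketch {
	uint64_t guest_phys_addr;
	uint64_t host_phys_addr; /* host IOVA: PA or VA depending on IOVA mode */
	uint64_t size;
};

/*
 * Try to merge a new page into the last recorded entry.
 * Merges only if BOTH the host range and the guest range are contiguous
 * with the last entry. Returns 1 if merged, 0 if the caller has to
 * append a new entry instead.
 */
static inline int
try_merge_page(struct guest_page_sketch *last, uint64_t gpa, uint64_t hpa,
	       uint64_t size)
{
	if (hpa == last->host_phys_addr + last->size &&
	    gpa == last->guest_phys_addr + last->size) {
		last->size += size;
		return 1;
	}
	return 0;
}
```

With a check like this, a page whose host address is contiguous but whose GPA
jumps elsewhere is kept as a separate entry, regardless of the IOVA mode.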

Thanks,
Chenbo

> 
> Thanks,
> Xuan
> 
> >
> > Maxime, What do you think?
> >
> > Thanks,
> > Chenbo
> >
> > >
> > > Maxime & Yuan, any thought?
> > >
> > > Thanks,
> > > Chenbo
> > >
> > > > +
> > > > +			if (rte_eal_iova_mode() == RTE_IOVA_PA &&
> > > > +				guest_phys_addr == last_page->guest_phys_addr + last_page->size) {
> > > > +				last_page->size += size;
> > > > +				return 0;
> > > > +			}
> > > >  		}
> > > >  	}
> > > >
> > > > --
> > > > 2.25.1



* RE: [PATCH] vhost: fix get hpa fail from guest pages
  2021-11-16  9:01       ` Xia, Chenbo
@ 2021-11-16 10:08         ` Ding, Xuan
  2021-11-16 10:21           ` Maxime Coquelin
  0 siblings, 1 reply; 8+ messages in thread
From: Ding, Xuan @ 2021-11-16 10:08 UTC (permalink / raw)
  To: Xia, Chenbo, Wang, YuanX, maxime.coquelin
  Cc: dev, Hu, Jiayu, Ma, WenwuX, He, Xingguang, Yang, YvonneX

Hi,

> -----Original Message-----
> From: Xia, Chenbo <chenbo.xia@intel.com>
> Sent: Tuesday, November 16, 2021 5:01 PM
> To: Ding, Xuan <xuan.ding@intel.com>; Wang, YuanX <yuanx.wang@intel.com>;
> maxime.coquelin@redhat.com
> Cc: dev@dpdk.org; Hu, Jiayu <jiayu.hu@intel.com>; Ma, WenwuX
> <wenwux.ma@intel.com>; He, Xingguang <xingguang.he@intel.com>; Yang,
> YvonneX <yvonnex.yang@intel.com>
> Subject: RE: [PATCH] vhost: fix get hpa fail from guest pages
> 
> > -----Original Message-----
> > From: Ding, Xuan <xuan.ding@intel.com>
> > Sent: Tuesday, November 16, 2021 4:47 PM
> > To: Xia, Chenbo <chenbo.xia@intel.com>; Wang, YuanX
> > <yuanx.wang@intel.com>; maxime.coquelin@redhat.com
> > Cc: dev@dpdk.org; Hu, Jiayu <jiayu.hu@intel.com>; Ma, WenwuX
> > <wenwux.ma@intel.com>; He, Xingguang <xingguang.he@intel.com>; Yang,
> > YvonneX <yvonnex.yang@intel.com>
> > Subject: RE: [PATCH] vhost: fix get hpa fail from guest pages
> >
> > Hi Chenbo,
> >
> > > -----Original Message-----
> > > From: Xia, Chenbo <chenbo.xia@intel.com>
> > > Sent: Tuesday, November 16, 2021 4:07 PM
> > > To: Xia, Chenbo <chenbo.xia@intel.com>; Wang, YuanX
> > > <yuanx.wang@intel.com>; maxime.coquelin@redhat.com
> > > Cc: dev@dpdk.org; Hu, Jiayu <jiayu.hu@intel.com>; Ding, Xuan
> > > <xuan.ding@intel.com>; Ma, WenwuX <wenwux.ma@intel.com>; He,
> > > Xingguang <xingguang.he@intel.com>; Yang, YvonneX
> > > <yvonnex.yang@intel.com>
> > > Subject: RE: [PATCH] vhost: fix get hpa fail from guest pages
> > >
> > > > -----Original Message-----
> > > > From: Xia, Chenbo <chenbo.xia@intel.com>
> > > > Sent: Monday, November 15, 2021 4:04 PM
> > > > > To: Wang, YuanX <yuanx.wang@intel.com>; maxime.coquelin@redhat.com
> > > > > Cc: dev@dpdk.org; Hu, Jiayu <jiayu.hu@intel.com>; Ding, Xuan
> > > > > <xuan.ding@intel.com>; Ma, WenwuX <wenwux.ma@intel.com>; He, Xingguang
> > > > > <xingguang.he@intel.com>; Yang, YvonneX <yvonnex.yang@intel.com>
> > > > Subject: RE: [PATCH] vhost: fix get hpa fail from guest pages
> > > >
> > > > > -----Original Message-----
> > > > > From: Wang, YuanX <yuanx.wang@intel.com>
> > > > > Sent: Thursday, November 11, 2021 2:27 PM
> > > > > To: maxime.coquelin@redhat.com; Xia, Chenbo
> > > > > <chenbo.xia@intel.com>
> > > > > Cc: dev@dpdk.org; Hu, Jiayu <jiayu.hu@intel.com>; Ding, Xuan
> > > > > <xuan.ding@intel.com>; Ma, WenwuX <wenwux.ma@intel.com>; He,
> > > > > Xingguang <xingguang.he@intel.com>; Yang, YvonneX
> > > > > > <yvonnex.yang@intel.com>; Wang, YuanX <yuanx.wang@intel.com>
> > > > > Subject: [PATCH] vhost: fix get hpa fail from guest pages
> > > > >
> > > > > When processing front-end memory regions messages, vhost saves
> > > > > the guest/host physical address mappings to guest pages and
> > > > > merges adjacent contiguous pages if hpa is contiguous, however
> > > > > gpa is likely not contiguous in PA mode and merging will cause
> > > > > the gpa range to change.
> > > > > This patch distinguishes the case of discontinuous gpa and does
> > > > > a range lookup on gpa when doing a binary search.
> > > > >
> > > > > > Fixes: e246896178e("vhost: get guest/host physical address mappings")
> > > > > > Fixes: 6563cf92380 ("vhost: fix async copy on multi-page buffers")
> > > > >
> > > > > Signed-off-by: Yuan Wang <yuanx.wang@intel.com>
> > > > > ---
> > > > >  lib/vhost/vhost.h      | 18 ++++++++++++++++--
> > > > >  lib/vhost/vhost_user.c | 15 +++++++++++----
> > > > >  2 files changed, 27 insertions(+), 6 deletions(-)
> > > > >
> > > > > > diff --git a/lib/vhost/vhost.h b/lib/vhost/vhost.h
> > > > > > index 7085e0885c..b3f0c1d07c 100644
> > > > > > --- a/lib/vhost/vhost.h
> > > > > > +++ b/lib/vhost/vhost.h
> > > > > > @@ -587,6 +587,20 @@ static __rte_always_inline int guest_page_addrcmp(const void *p1,
> > > > > >  	return 0;
> > > > > >  }
> > > > > >
> > > > > > +static __rte_always_inline int guest_page_rangecmp(const void *p1, const void *p2)
> > > > > > +{
> > > > > > +	const struct guest_page *page1 = (const struct guest_page *)p1;
> > > > > > +	const struct guest_page *page2 = (const struct guest_page *)p2;
> > > > > > +
> > > > > > +	if (page1->guest_phys_addr >= page2->guest_phys_addr) {
> > > > > > +		if (page1->guest_phys_addr < page2->guest_phys_addr + page2->size)
> > > > > > +			return 0;
> > > > > > +		else
> > > > > > +			return 1;
> > > > > > +	} else
> > > > > > +		return -1;
> > > > > > +}
> > > > > > +
> > > > > >  static __rte_always_inline rte_iova_t
> > > > > >  gpa_to_first_hpa(struct virtio_net *dev, uint64_t gpa,
> > > > > >  	uint64_t gpa_size, uint64_t *hpa_size)
> > > > > > @@ -597,9 +611,9 @@ gpa_to_first_hpa(struct virtio_net *dev, uint64_t gpa,
> > > > > >
> > > > > >  	*hpa_size = gpa_size;
> > > > > >  	if (dev->nr_guest_pages >= VHOST_BINARY_SEARCH_THRESH) {
> > > > > > -		key.guest_phys_addr = gpa & ~(dev->guest_pages[0].size - 1);
> > > > > > +		key.guest_phys_addr = gpa;
> > > > > >  		page = bsearch(&key, dev->guest_pages, dev->nr_guest_pages,
> > > > > > -			       sizeof(struct guest_page), guest_page_addrcmp);
> > > > > > +			       sizeof(struct guest_page), guest_page_rangecmp);
> > > > > >  		if (page) {
> > > > > >  			if (gpa + gpa_size <=
> > > > > >  					page->guest_phys_addr + page->size) {
> > > > > > diff --git a/lib/vhost/vhost_user.c b/lib/vhost/vhost_user.c
> > > > > > index a781346c4d..7d58fde458 100644
> > > > > > --- a/lib/vhost/vhost_user.c
> > > > > > +++ b/lib/vhost/vhost_user.c
> > > > > > @@ -999,10 +999,17 @@ add_one_guest_page(struct virtio_net *dev, uint64_t guest_phys_addr,
> > > > >  	if (dev->nr_guest_pages > 0) {
> > > > >  		last_page = &dev->guest_pages[dev->nr_guest_pages - 1];
> > > > >  		/* merge if the two pages are continuous */
> > > > > -		if (host_phys_addr == last_page->host_phys_addr +
> > > > > -				      last_page->size) {
> > > > > -			last_page->size += size;
> > > > > -			return 0;
> > > > > +		if (host_phys_addr == last_page->host_phys_addr +
> > > > > +last_page->size)
> > > > > {
> > > > > +			if (rte_eal_iova_mode() == RTE_IOVA_VA) {
> > > > > +				last_page->size += size;
> > > > > +				return 0;
> > > > > +			}
> > > >
> > > > This makes me think about a question: In IOVA_VA mode, what
> > > > ensures HPA and GPA are both contiguous?
> > >
> > > > When I wrote this email, I thought host_phys_addr is HPA but in VA
> > > > mode, it's Host IOVA, aka VA in DPDK's case. So in most cases GPA and
> > > > VA will be both contiguous when the contiguous pages are all in one
> > > > memory region. But I think we should not make such assumption as some
> > > > vhost master may send GPA-contiguous pages in different memory region,
> > > > then the VA after mmap may not be contiguous.
> >
> > Now we do async vfio mapping at page granularity.
> > For 4K/2M pages, without this assumption, the pages that are not merged may
> > exceed the IOMMU's capability. This limits us to using only 1G hugepages for
> > the DMA device.
> 
> I was not saying merging should not be done, only that it should be done with
> correct logic.
> 
> That is to say, the pages that should be merged will still be merged, but the
> logic should handle all cases.

Sorry, I misunderstood your point. I just wanted to add that PA mode will
have lots of pages to map in the 4K/2M cases, which may exceed the IOMMU's
capability. But that is reasonable, and it is what I hope to explain in the
doc.
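
To make the concern concrete, here is a back-of-the-envelope sketch of how many
page-granular mappings are needed for a given amount of guest memory when no
pages are merged. The helper name and the 4 GiB guest size used below are
assumptions chosen purely for illustration; the actual IOMMU limit depends on
the platform and is not modelled here.

```c
#include <stdint.h>

/*
 * Number of page-granular DMA mappings needed if no pages are merged
 * (hypothetical helper; rounds up to whole pages).
 */
static inline uint64_t
nr_mappings(uint64_t mem_bytes, uint64_t page_bytes)
{
	return (mem_bytes + page_bytes - 1) / page_bytes;
}
```

For an assumed 4 GiB guest this gives 1048576 mappings with 4K pages, 2048
with 2M pages, and 4 with 1G pages, which is why the unmerged 4K/2M cases are
the ones at risk.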

Thanks,
Xuan

> 
> Thanks,
> Chenbo
> 
> >
> > Thanks,
> > Xuan
> >
> > >
> > > Maxime, What do you think?
> > >
> > > Thanks,
> > > Chenbo
> > >
> > > >
> > > > Maxime & Yuan, any thought?
> > > >
> > > > Thanks,
> > > > Chenbo
> > > >
> > > > > +
> > > > > +			if (rte_eal_iova_mode() == RTE_IOVA_PA &&
> > > > > +				guest_phys_addr == last_page-
> > > >guest_phys_addr +
> > > > > last_page->size) {
> > > > > +				last_page->size += size;
> > > > > +				return 0;
> > > > > +			}
> > > > >  		}
> > > > >  	}
> > > > >
> > > > > --
> > > > > 2.25.1



* Re: [PATCH] vhost: fix get hpa fail from guest pages
  2021-11-16 10:08         ` Ding, Xuan
@ 2021-11-16 10:21           ` Maxime Coquelin
  0 siblings, 0 replies; 8+ messages in thread
From: Maxime Coquelin @ 2021-11-16 10:21 UTC (permalink / raw)
  To: Ding, Xuan, Xia, Chenbo, Wang, YuanX
  Cc: dev, Hu, Jiayu, Ma, WenwuX, He, Xingguang, Yang, YvonneX



On 11/16/21 11:08, Ding, Xuan wrote:
> Hi,
> 
> Sorry, I misunderstood your point. I just wanted to add that PA mode will
> have lots of pages to map in the 4K/2M cases, which may exceed the IOMMU's
> capability. But that is reasonable, and it is what I hope to explain in the
> doc.

We are running out of time for the v21.11 release. Do you agree to postpone
the fix to the next release? It can be backported to v21.11 LTS later.

> Thanks,
> Xuan
> 


end of thread, other threads:[~2021-11-16 10:21 UTC | newest]

Thread overview: 8+ messages
2021-11-11  6:27 [PATCH] vhost: fix get hpa fail from guest pages Yuan Wang
2021-11-15  8:04 ` Xia, Chenbo
2021-11-16  8:07   ` Xia, Chenbo
2021-11-16  8:47     ` Ding, Xuan
2021-11-16  8:51       ` Maxime Coquelin
2021-11-16  9:01       ` Xia, Chenbo
2021-11-16 10:08         ` Ding, Xuan
2021-11-16 10:21           ` Maxime Coquelin
