patches for DPDK stable branches
 help / color / mirror / Atom feed
* [dpdk-stable] [PATCH 1/2] mempool: use actual IOVA addresses when populating
@ 2019-11-14 13:58 Anatoly Burakov
  2019-11-14 13:58 ` [dpdk-stable] [PATCH 2/2] mempool: remove check for bad IOVA " Anatoly Burakov
                   ` (3 more replies)
  0 siblings, 4 replies; 8+ messages in thread
From: Anatoly Burakov @ 2019-11-14 13:58 UTC (permalink / raw)
  To: dev; +Cc: Olivier Matz, Andrew Rybchenko, david.marchand, stable

Currently, when mempool is being populated, we get IOVA address
of every segment using rte_mem_virt2iova(). This works for internal
memory, but does not really work for external memory, and does not
work on platforms which return RTE_BAD_IOVA as a result of this
call (such as FreeBSD). Moreover, even when it works, the function
in question will do unnecessary pagewalks in IOVA as PA mode, as
it falls back to rte_mem_virt2phy() instead of just doing a lookup in
internal memseg table.

To fix it, replace the call to first attempt to look through the
internal memseg table (this takes care of internal and external memory),
and fall back to rte_mem_virt2iova() when unable to perform VA->IOVA
translation via memseg table.

Fixes: 66cc45e293ed ("mem: replace memseg with memseg lists")
Cc: stable@dpdk.org

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_mempool/rte_mempool.c | 17 +++++++++++++++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 40cae3eb67..8da2e471c7 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -356,6 +356,19 @@ rte_mempool_populate_iova(struct rte_mempool *mp, char *vaddr,
 	return ret;
 }
 
+static rte_iova_t
+get_iova(void *addr)
+{
+	struct rte_memseg *ms;
+
+	/* try registered memory first */
+	ms = rte_mem_virt2memseg(addr, NULL);
+	if (ms == NULL || ms->iova == RTE_BAD_IOVA)
+		/* fall back to actual physical address */
+		return rte_mem_virt2iova(addr);
+	return ms->iova + RTE_PTR_DIFF(addr, ms->addr);
+}
+
 /* Populate the mempool with a virtual area. Return the number of
  * objects added, or a negative value on error.
  */
@@ -375,7 +388,7 @@ rte_mempool_populate_virt(struct rte_mempool *mp, char *addr,
 	for (off = 0; off < len &&
 		     mp->populated_size < mp->size; off += phys_len) {
 
-		iova = rte_mem_virt2iova(addr + off);
+		iova = get_iova(addr + off);
 
 		if (iova == RTE_BAD_IOVA && rte_eal_has_hugepages()) {
 			ret = -EINVAL;
@@ -391,7 +404,7 @@ rte_mempool_populate_virt(struct rte_mempool *mp, char *addr,
 		     phys_len = RTE_MIN(phys_len + pg_sz, len - off)) {
 			rte_iova_t iova_tmp;
 
-			iova_tmp = rte_mem_virt2iova(addr + off + phys_len);
+			iova_tmp = get_iova(addr + off + phys_len);
 
 			if (iova_tmp == RTE_BAD_IOVA ||
 					iova_tmp != iova + phys_len)
-- 
2.17.1

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [dpdk-stable] [PATCH 2/2] mempool: remove check for bad IOVA when populating
  2019-11-14 13:58 [dpdk-stable] [PATCH 1/2] mempool: use actual IOVA addresses when populating Anatoly Burakov
@ 2019-11-14 13:58 ` Anatoly Burakov
  2019-11-15  8:48   ` Olivier Matz
  2019-11-19  1:45   ` Li, WenjieX A
  2019-11-15  8:46 ` [dpdk-stable] [PATCH 1/2] mempool: use actual IOVA addresses " Olivier Matz
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 8+ messages in thread
From: Anatoly Burakov @ 2019-11-14 13:58 UTC (permalink / raw)
  To: dev; +Cc: Olivier Matz, Andrew Rybchenko, david.marchand, stable

Currently, mempool will check if IOVA is bad for a segment, and reject
the IOVA if hugepages are also enabled. This check is wrong because now
that we have external memory segments, they are allowed to have their
IOVA's to be invalid. This check also doesn't make much sense in the
first place, because the following code can handle bad IOVA's perfectly
well (and in fact, this check is not triggering a failure when
--no-huge option is enabled), so there is not much sense to check for
this in the first place.

Fixes: 950e8fb4e194 ("mem: allow registering external memory areas")
Cc: stable@dpdk.org

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    An alternative implementation would be to add a socket ID check to see
    if the memory being allocated from belongs to an external segment.

 lib/librte_mempool/rte_mempool.c | 5 -----
 1 file changed, 5 deletions(-)

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 8da2e471c7..78d8eb941e 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -390,11 +390,6 @@ rte_mempool_populate_virt(struct rte_mempool *mp, char *addr,
 
 		iova = get_iova(addr + off);
 
-		if (iova == RTE_BAD_IOVA && rte_eal_has_hugepages()) {
-			ret = -EINVAL;
-			goto fail;
-		}
-
 		/* populate with the largest group of contiguous pages */
 		for (phys_len = RTE_MIN(
 			(size_t)(RTE_PTR_ALIGN_CEIL(addr + off + 1, pg_sz) -
-- 
2.17.1

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [dpdk-stable] [PATCH 1/2] mempool: use actual IOVA addresses when populating
  2019-11-14 13:58 [dpdk-stable] [PATCH 1/2] mempool: use actual IOVA addresses when populating Anatoly Burakov
  2019-11-14 13:58 ` [dpdk-stable] [PATCH 2/2] mempool: remove check for bad IOVA " Anatoly Burakov
@ 2019-11-15  8:46 ` Olivier Matz
  2019-11-15 10:12   ` Burakov, Anatoly
  2019-11-19  1:45 ` Li, WenjieX A
  2019-11-19 20:56 ` David Marchand
  3 siblings, 1 reply; 8+ messages in thread
From: Olivier Matz @ 2019-11-15  8:46 UTC (permalink / raw)
  To: Anatoly Burakov; +Cc: dev, Andrew Rybchenko, david.marchand, stable

On Thu, Nov 14, 2019 at 01:58:20PM +0000, Anatoly Burakov wrote:
> Currently, when mempool is being populated, we get IOVA address
> of every segment using rte_mem_virt2iova(). This works for internal
> memory, but does not really work for external memory, and does not
> work on platforms which return RTE_BAD_IOVA as a result of this
> call (such as FreeBSD). Moreover, even when it works, the function
> in question will do unnecessary pagewalks in IOVA as PA mode, as
> it falls back to rte_mem_virt2phy() instead of just doing a lookup in
> internal memseg table.
> 
> To fix it, replace the call to first attempt to look through the
> internal memseg table (this takes care of internal and external memory),
> and fall back to rte_mem_virt2iova() when unable to perform VA->IOVA
> translation via memseg table.
> 
> Fixes: 66cc45e293ed ("mem: replace memseg with memseg lists")

Shouldn't we also add
Fixes: 035ee5bea5ef ("mempool: remove optimistic IOVA-contiguous allocation")
?

From what I understand, even if the problem existed in populate_virt()
before, this is the commit that makes the problem visible in most cases.

> Cc: stable@dpdk.org
> 
> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>

Acked-by: Olivier Matz <olivier.matz@6wind.com>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [dpdk-stable] [PATCH 2/2] mempool: remove check for bad IOVA when populating
  2019-11-14 13:58 ` [dpdk-stable] [PATCH 2/2] mempool: remove check for bad IOVA " Anatoly Burakov
@ 2019-11-15  8:48   ` Olivier Matz
  2019-11-19  1:45   ` Li, WenjieX A
  1 sibling, 0 replies; 8+ messages in thread
From: Olivier Matz @ 2019-11-15  8:48 UTC (permalink / raw)
  To: Anatoly Burakov; +Cc: dev, Andrew Rybchenko, david.marchand, stable

On Thu, Nov 14, 2019 at 01:58:21PM +0000, Anatoly Burakov wrote:
> Currently, mempool will check if IOVA is bad for a segment, and reject
> the IOVA if hugepages are also enabled. This check is wrong because now
> that we have external memory segments, they are allowed to have their
> IOVA's to be invalid. This check also doesn't make much sense in the
> first place, because the following code can handle bad IOVA's perfectly
> well (and in fact, this check is not triggering a failure when
> --no-huge option is enabled), so there is not much sense to check for
> this in the first place.
> 
> Fixes: 950e8fb4e194 ("mem: allow registering external memory areas")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>

Acked-by: Olivier Matz <olivier.matz@6wind.com>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [dpdk-stable] [PATCH 1/2] mempool: use actual IOVA addresses when populating
  2019-11-15  8:46 ` [dpdk-stable] [PATCH 1/2] mempool: use actual IOVA addresses " Olivier Matz
@ 2019-11-15 10:12   ` Burakov, Anatoly
  0 siblings, 0 replies; 8+ messages in thread
From: Burakov, Anatoly @ 2019-11-15 10:12 UTC (permalink / raw)
  To: Olivier Matz; +Cc: dev, Andrew Rybchenko, david.marchand, stable

On 15-Nov-19 8:46 AM, Olivier Matz wrote:
> On Thu, Nov 14, 2019 at 01:58:20PM +0000, Anatoly Burakov wrote:
>> Currently, when mempool is being populated, we get IOVA address
>> of every segment using rte_mem_virt2iova(). This works for internal
>> memory, but does not really work for external memory, and does not
>> work on platforms which return RTE_BAD_IOVA as a result of this
>> call (such as FreeBSD). Moreover, even when it works, the function
>> in question will do unnecessary pagewalks in IOVA as PA mode, as
>> it falls back to rte_mem_virt2phy() instead of just doing a lookup in
>> internal memseg table.
>>
>> To fix it, replace the call to first attempt to look through the
>> internal memseg table (this takes care of internal and external memory),
>> and fall back to rte_mem_virt2iova() when unable to perform VA->IOVA
>> translation via memseg table.
>>
>> Fixes: 66cc45e293ed ("mem: replace memseg with memseg lists")
> 
> Shouldn't we also add
> Fixes: 035ee5bea5ef ("mempool: remove optimistic IOVA-contiguous allocation")
> ?
> 
>>From what I understand, even if the problem existed in populate_virt()
> before, this is the commit that makes the problem visible in most cases.

The commit didn't introduce the issue - only revealed it. So the commit 
itself isn't buggy, and there's nothing to fix :) However, i'm not 
opposed to sharing the blame!

> 
>> Cc: stable@dpdk.org
>>
>> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
> 
> Acked-by: Olivier Matz <olivier.matz@6wind.com>
> 


-- 
Thanks,
Anatoly

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [dpdk-stable] [PATCH 2/2] mempool: remove check for bad IOVA when populating
  2019-11-14 13:58 ` [dpdk-stable] [PATCH 2/2] mempool: remove check for bad IOVA " Anatoly Burakov
  2019-11-15  8:48   ` Olivier Matz
@ 2019-11-19  1:45   ` Li, WenjieX A
  1 sibling, 0 replies; 8+ messages in thread
From: Li, WenjieX A @ 2019-11-19  1:45 UTC (permalink / raw)
  To: Burakov, Anatoly, dev
  Cc: Olivier Matz, Andrew Rybchenko, david.marchand, stable, Chen, BoX C

Tested-by: Chen, BoX C <box.c.chen@intel.com>

> -----Original Message-----
> From: stable [mailto:stable-bounces@dpdk.org] On Behalf Of Anatoly Burakov
> Sent: Thursday, November 14, 2019 9:58 PM
> To: dev@dpdk.org
> Cc: Olivier Matz <olivier.matz@6wind.com>; Andrew Rybchenko
> <arybchenko@solarflare.com>; david.marchand@redhat.com; stable@dpdk.org
> Subject: [dpdk-stable] [PATCH 2/2] mempool: remove check for bad IOVA when
> populating
> 
> Currently, mempool will check if IOVA is bad for a segment, and reject the IOVA
> if hugepages are also enabled. This check is wrong because now that we have
> external memory segments, they are allowed to have their IOVA's to be invalid.
> This check also doesn't make much sense in the first place, because the
> following code can handle bad IOVA's perfectly well (and in fact, this check is
> not triggering a failure when --no-huge option is enabled), so there is not much
> sense to check for this in the first place.
> 
> Fixes: 950e8fb4e194 ("mem: allow registering external memory areas")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
> ---
> 
> Notes:
>     An alternative implementation would be to add a socket ID check to see
>     if the memory being allocated from belongs to an external segment.
> 
>  lib/librte_mempool/rte_mempool.c | 5 -----
>  1 file changed, 5 deletions(-)
> 
> diff --git a/lib/librte_mempool/rte_mempool.c
> b/lib/librte_mempool/rte_mempool.c
> index 8da2e471c7..78d8eb941e 100644
> --- a/lib/librte_mempool/rte_mempool.c
> +++ b/lib/librte_mempool/rte_mempool.c
> @@ -390,11 +390,6 @@ rte_mempool_populate_virt(struct rte_mempool *mp,
> char *addr,
> 
>  		iova = get_iova(addr + off);
> 
> -		if (iova == RTE_BAD_IOVA && rte_eal_has_hugepages()) {
> -			ret = -EINVAL;
> -			goto fail;
> -		}
> -
>  		/* populate with the largest group of contiguous pages */
>  		for (phys_len = RTE_MIN(
>  			(size_t)(RTE_PTR_ALIGN_CEIL(addr + off + 1, pg_sz) -
> --
> 2.17.1

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [dpdk-stable] [PATCH 1/2] mempool: use actual IOVA addresses when populating
  2019-11-14 13:58 [dpdk-stable] [PATCH 1/2] mempool: use actual IOVA addresses when populating Anatoly Burakov
  2019-11-14 13:58 ` [dpdk-stable] [PATCH 2/2] mempool: remove check for bad IOVA " Anatoly Burakov
  2019-11-15  8:46 ` [dpdk-stable] [PATCH 1/2] mempool: use actual IOVA addresses " Olivier Matz
@ 2019-11-19  1:45 ` Li, WenjieX A
  2019-11-19 20:56 ` David Marchand
  3 siblings, 0 replies; 8+ messages in thread
From: Li, WenjieX A @ 2019-11-19  1:45 UTC (permalink / raw)
  To: Burakov, Anatoly, dev
  Cc: Olivier Matz, Andrew Rybchenko, david.marchand, stable, Chen, BoX C

Tested-by: Chen, BoX C <box.c.chen@intel.com>

> -----Original Message-----
> From: stable [mailto:stable-bounces@dpdk.org] On Behalf Of Anatoly Burakov
> Sent: Thursday, November 14, 2019 9:58 PM
> To: dev@dpdk.org
> Cc: Olivier Matz <olivier.matz@6wind.com>; Andrew Rybchenko
> <arybchenko@solarflare.com>; david.marchand@redhat.com; stable@dpdk.org
> Subject: [dpdk-stable] [PATCH 1/2] mempool: use actual IOVA addresses when
> populating
> 
> Currently, when mempool is being populated, we get IOVA address of every
> segment using rte_mem_virt2iova(). This works for internal memory, but does
> not really work for external memory, and does not work on platforms which
> return RTE_BAD_IOVA as a result of this call (such as FreeBSD). Moreover, even
> when it works, the function in question will do unnecessary pagewalks in IOVA
> as PA mode, as it falls back to rte_mem_virt2phy() instead of just doing a lookup
> in internal memseg table.
> 
> To fix it, replace the call to first attempt to look through the internal memseg
> table (this takes care of internal and external memory), and fall back to
> rte_mem_virt2iova() when unable to perform VA->IOVA translation via memseg
> table.
> 
> Fixes: 66cc45e293ed ("mem: replace memseg with memseg lists")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
> ---
>  lib/librte_mempool/rte_mempool.c | 17 +++++++++++++++--
>  1 file changed, 15 insertions(+), 2 deletions(-)
> 
> diff --git a/lib/librte_mempool/rte_mempool.c
> b/lib/librte_mempool/rte_mempool.c
> index 40cae3eb67..8da2e471c7 100644
> --- a/lib/librte_mempool/rte_mempool.c
> +++ b/lib/librte_mempool/rte_mempool.c
> @@ -356,6 +356,19 @@ rte_mempool_populate_iova(struct rte_mempool *mp,
> char *vaddr,
>  	return ret;
>  }
> 
> +static rte_iova_t
> +get_iova(void *addr)
> +{
> +	struct rte_memseg *ms;
> +
> +	/* try registered memory first */
> +	ms = rte_mem_virt2memseg(addr, NULL);
> +	if (ms == NULL || ms->iova == RTE_BAD_IOVA)
> +		/* fall back to actual physical address */
> +		return rte_mem_virt2iova(addr);
> +	return ms->iova + RTE_PTR_DIFF(addr, ms->addr); }
> +
>  /* Populate the mempool with a virtual area. Return the number of
>   * objects added, or a negative value on error.
>   */
> @@ -375,7 +388,7 @@ rte_mempool_populate_virt(struct rte_mempool *mp,
> char *addr,
>  	for (off = 0; off < len &&
>  		     mp->populated_size < mp->size; off += phys_len) {
> 
> -		iova = rte_mem_virt2iova(addr + off);
> +		iova = get_iova(addr + off);
> 
>  		if (iova == RTE_BAD_IOVA && rte_eal_has_hugepages()) {
>  			ret = -EINVAL;
> @@ -391,7 +404,7 @@ rte_mempool_populate_virt(struct rte_mempool *mp,
> char *addr,
>  		     phys_len = RTE_MIN(phys_len + pg_sz, len - off)) {
>  			rte_iova_t iova_tmp;
> 
> -			iova_tmp = rte_mem_virt2iova(addr + off + phys_len);
> +			iova_tmp = get_iova(addr + off + phys_len);
> 
>  			if (iova_tmp == RTE_BAD_IOVA ||
>  					iova_tmp != iova + phys_len)
> --
> 2.17.1

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [dpdk-stable] [PATCH 1/2] mempool: use actual IOVA addresses when populating
  2019-11-14 13:58 [dpdk-stable] [PATCH 1/2] mempool: use actual IOVA addresses when populating Anatoly Burakov
                   ` (2 preceding siblings ...)
  2019-11-19  1:45 ` Li, WenjieX A
@ 2019-11-19 20:56 ` David Marchand
  3 siblings, 0 replies; 8+ messages in thread
From: David Marchand @ 2019-11-19 20:56 UTC (permalink / raw)
  To: Anatoly Burakov
  Cc: dev, Olivier Matz, Andrew Rybchenko, dpdk stable, box.c.chen,
	WenjieX A Li

On Thu, Nov 14, 2019 at 2:58 PM Anatoly Burakov
<anatoly.burakov@intel.com> wrote:
>
> Currently, when mempool is being populated, we get IOVA address
> of every segment using rte_mem_virt2iova(). This works for internal
> memory, but does not really work for external memory, and does not
> work on platforms which return RTE_BAD_IOVA as a result of this
> call (such as FreeBSD). Moreover, even when it works, the function
> in question will do unnecessary pagewalks in IOVA as PA mode, as
> it falls back to rte_mem_virt2phy() instead of just doing a lookup in
> internal memseg table.
>
> To fix it, replace the call to first attempt to look through the
> internal memseg table (this takes care of internal and external memory),
> and fall back to rte_mem_virt2iova() when unable to perform VA->IOVA
> translation via memseg table.
>
> Fixes: 66cc45e293ed ("mem: replace memseg with memseg lists")
> Cc: stable@dpdk.org
>
> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>

Acked-by: Olivier Matz <olivier.matz@6wind.com>
Tested-by: Bo Chen <box.c.chen@intel.com>

Series applied, thanks.

--
David Marchand


^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2019-11-19 20:56 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-11-14 13:58 [dpdk-stable] [PATCH 1/2] mempool: use actual IOVA addresses when populating Anatoly Burakov
2019-11-14 13:58 ` [dpdk-stable] [PATCH 2/2] mempool: remove check for bad IOVA " Anatoly Burakov
2019-11-15  8:48   ` Olivier Matz
2019-11-19  1:45   ` Li, WenjieX A
2019-11-15  8:46 ` [dpdk-stable] [PATCH 1/2] mempool: use actual IOVA addresses " Olivier Matz
2019-11-15 10:12   ` Burakov, Anatoly
2019-11-19  1:45 ` Li, WenjieX A
2019-11-19 20:56 ` David Marchand

patches for DPDK stable branches

This inbox may be cloned and mirrored by anyone:

	git clone --mirror https://inbox.dpdk.org/stable/0 stable/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 stable stable/ https://inbox.dpdk.org/stable \
		stable@dpdk.org
	public-inbox-index stable

Example config snippet for mirrors.
Newsgroup available over NNTP:
	nntp://inbox.dpdk.org/inbox.dpdk.stable


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git