When allocating a mempool which is larger than the largest
available area, it can take a lot of time:

a- the mempool calculate the required memory size, and tries
to allocate it, it fails
b- then it tries to allocate the largest available area (this
does not request new huge pages)
c- add this zone to the mempool, this triggers the allocation
of a mem hdr, which request a new huge page
d- back to a- until mempool is populated or until there is no
more memory

This can take a lot of time to finally fail (several minutes): in step
a- it takes all available hugepages on the system, then release them
after it fails.

The problem appeared with commit eba11e364614 ("mempool: reduce wasted
space on populate"), because smaller chunks are now allowed. Previously,
it had to be at least one page size, which is not the case in step b-.

To fix this, implement our own way to allocate the largest available
area instead of using the feature from memzone: if an allocation fails,
try to divide the size by 2 and retry. When the requested size falls
below min_chunk_size, stop and return an error.
Fixes: eba11e364614 ("mempool: reduce wasted space on populate")
Cc: stable@dpdk.org

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_mempool/rte_mempool.c | 29 ++++++++++++-----------------
 1 file changed, 12 insertions(+), 17 deletions(-)

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index bda361ce6..03c8d984c 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -481,6 +481,7 @@ rte_mempool_populate_default(struct rte_mempool *mp)
     unsigned mz_id, n;
     int ret;
     bool need_iova_contig_obj;
+    size_t max_alloc_size = SIZE_MAX;
 
     ret = mempool_ops_alloc_once(mp);
     if (ret != 0)
@@ -560,30 +561,24 @@ rte_mempool_populate_default(struct rte_mempool *mp)
     if (min_chunk_size == (size_t)mem_size)
         mz_flags |= RTE_MEMZONE_IOVA_CONTIG;
 
-    mz = rte_memzone_reserve_aligned(mz_name, mem_size,
+    /* Allocate a memzone, retrying with a smaller area on ENOMEM */
+    do {
+        mz = rte_memzone_reserve_aligned(mz_name,
+            RTE_MIN((size_t)mem_size, max_alloc_size),
             mp->socket_id, mz_flags, align);
 
-    /* don't try reserving with 0 size if we were asked to reserve
-     * IOVA-contiguous memory.
-     */
-    if (min_chunk_size < (size_t)mem_size && mz == NULL) {
-        /* not enough memory, retry with the biggest zone we
-         * have
-         */
-        mz = rte_memzone_reserve_aligned(mz_name, 0,
-            mp->socket_id, mz_flags, align);
-    }
+        if (mz == NULL && rte_errno != ENOMEM)
+            break;
+
+        max_alloc_size = RTE_MIN(max_alloc_size,
+            (size_t)mem_size) / 2;
+    } while (max_alloc_size >= min_chunk_size);
+
     if (mz == NULL) {
         ret = -rte_errno;
         goto fail;
     }
 
-    if (mz->len < min_chunk_size) {
-        rte_memzone_free(mz);
-        ret = -ENOMEM;
-        goto fail;
-    }
-
     if (need_iova_contig_obj)
         iova = mz->iova;
     else
-- 
2.20.1
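The retry strategy in the patch is easy to model outside DPDK. Below is a minimal sketch with a mock allocator — `mock_reserve`, `largest_free_area` and all sizes are illustrative stand-ins, not the real rte_memzone API. Starting from the full mem_size, the request cap is halved on each ENOMEM until a reservation succeeds or the cap drops below min_chunk_size:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for rte_memzone_reserve_aligned(): succeeds only
 * when the requested length fits in the largest free area, otherwise sets
 * errno to ENOMEM and returns 0. */
static size_t largest_free_area = 600;

static size_t mock_reserve(size_t len)
{
    if (len <= largest_free_area)
        return len;             /* a "zone" of the requested length */
    errno = ENOMEM;
    return 0;
}

/* The retry strategy from the patch: request min(mem_size, cap), and on
 * ENOMEM halve the cap until it falls below min_chunk_size. */
static size_t reserve_largest(size_t mem_size, size_t min_chunk_size)
{
    size_t max_alloc_size = SIZE_MAX;
    size_t zone = 0;

    do {
        size_t len = mem_size < max_alloc_size ? mem_size : max_alloc_size;

        zone = mock_reserve(len);
        if (zone == 0 && errno != ENOMEM)
            break;              /* unexpected error: give up immediately */
        max_alloc_size = (mem_size < max_alloc_size ?
                          mem_size : max_alloc_size) / 2;
    } while (zone == 0 && max_alloc_size >= min_chunk_size);

    return zone;                /* 0 means failure */
}
```

With `largest_free_area = 600`, a 2048-byte request walks 2048 → 1024 → 512 and succeeds at 512; if min_chunk_size were 700, the loop would stop before trying 512 and fail instead of consuming and releasing hugepages repeatedly.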
On 09-Jan-20 1:27 PM, Olivier Matz wrote:
> When allocating a mempool which is larger than the largest
> available area, it can take a lot of time:
>
> a- the mempool calculate the required memory size, and tries
> to allocate it, it fails
> b- then it tries to allocate the largest available area (this
> does not request new huge pages)
> c- add this zone to the mempool, this triggers the allocation
> of a mem hdr, which request a new huge page
> d- back to a- until mempool is populated or until there is no
> more memory
>
> This can take a lot of time to finally fail (several minutes): in step
> a- it takes all available hugepages on the system, then release them
> after it fails.
>
> The problem appeared with commit eba11e364614 ("mempool: reduce wasted
> space on populate"), because smaller chunks are now allowed. Previously,
> it had to be at least one page size, which is not the case in step b-.
>
> To fix this, implement our own way to allocate the largest available
> area instead of using the feature from memzone: if an allocation fails,
> try to divide the size by 2 and retry. When the requested size falls
> below min_chunk_size, stop and return an error.
>
> Fixes: eba11e364614 ("mempool: reduce wasted space on populate")
> Cc: stable@dpdk.org
>
> Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
> ---
I don't particularly like the idea of working around this issue as
opposed to fixing it memzone-side, but since there's currently no plan
to address this in the memzone allocator, this should work much better
than before.
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
--
Thanks,
Anatoly
Hi Olivier,
> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of Olivier Matz
> Sent: Thursday, January 9, 2020 3:28 PM
> To: dev@dpdk.org
> Cc: Andrew Rybchenko <arybchenko@solarflare.com>; Anatoly Burakov
> <anatoly.burakov@intel.com>; stable@dpdk.org
> Subject: [dpdk-dev] [PATCH] mempool: fix slow allocation of large mempools
>
> When allocating a mempool which is larger than the largest available area, it
> can take a lot of time:
>
> a- the mempool calculate the required memory size, and tries
> to allocate it, it fails
> b- then it tries to allocate the largest available area (this
> does not request new huge pages)
> c- add this zone to the mempool, this triggers the allocation
> of a mem hdr, which request a new huge page
> d- back to a- until mempool is populated or until there is no
> more memory
>
> This can take a lot of time to finally fail (several minutes): in step
> a- it takes all available hugepages on the system, then release them after it
> fails.
>
> The problem appeared with commit eba11e364614 ("mempool: reduce
> wasted space on populate"), because smaller chunks are now allowed.
> Previously, it had to be at least one page size, which is not the case in step b-.
>
> To fix this, implement our own way to allocate the largest available area
> instead of using the feature from memzone: if an allocation fails, try to divide
> the size by 2 and retry. When the requested size falls below min_chunk_size,
> stop and return an error.
>
> Fixes: eba11e364614 ("mempool: reduce wasted space on populate")
> Cc: stable@dpdk.org
>
> Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
> ---
Testpmd (testpmd -n4 -- -i) fails to start after applying this patch with:
"""
EAL: Error - exiting with code: 1
Cause: Creation of mbuf pool for socket 0 failed: File exists
"""
This is why the check ci/iol-mellanox-Performance is failing (not sure if the other tests are failing for the same reason).
Regards,
Ali
Hi Ali,
On Thu, Jan 09, 2020 at 04:06:53PM +0000, Ali Alnubani wrote:
> Hi Olivier,
>
> > -----Original Message-----
> > From: dev <dev-bounces@dpdk.org> On Behalf Of Olivier Matz
> > Sent: Thursday, January 9, 2020 3:28 PM
> > To: dev@dpdk.org
> > Cc: Andrew Rybchenko <arybchenko@solarflare.com>; Anatoly Burakov
> > <anatoly.burakov@intel.com>; stable@dpdk.org
> > Subject: [dpdk-dev] [PATCH] mempool: fix slow allocation of large mempools
> >
> > When allocating a mempool which is larger than the largest available area, it
> > can take a lot of time:
> >
> > a- the mempool calculate the required memory size, and tries
> > to allocate it, it fails
> > b- then it tries to allocate the largest available area (this
> > does not request new huge pages)
> > c- add this zone to the mempool, this triggers the allocation
> > of a mem hdr, which request a new huge page
> > d- back to a- until mempool is populated or until there is no
> > more memory
> >
> > This can take a lot of time to finally fail (several minutes): in step
> > a- it takes all available hugepages on the system, then release them after it
> > fails.
> >
> > The problem appeared with commit eba11e364614 ("mempool: reduce
> > wasted space on populate"), because smaller chunks are now allowed.
> > Previously, it had to be at least one page size, which is not the case in step b-.
> >
> > To fix this, implement our own way to allocate the largest available area
> > instead of using the feature from memzone: if an allocation fails, try to divide
> > the size by 2 and retry. When the requested size falls below min_chunk_size,
> > stop and return an error.
> >
> > Fixes: eba11e364614 ("mempool: reduce wasted space on populate")
> > Cc: stable@dpdk.org
> >
> > Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
> > ---
>
> Testpmd (testpmd -n4 -- -i) fails to start after applying this patch with:
> """
> EAL: Error - exiting with code: 1
> Cause: Creation of mbuf pool for socket 0 failed: File exists
> """
>
> This is why the check ci/iol-mellanox-Performance is failing (not sure if the other tests are failing for the same reason).
Thanks for the report.
I should have retested after my "little rework"... :)
I'll send a v2 with this fix:
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -572,7 +572,7 @@ rte_mempool_populate_default(struct rte_mempool *mp)
             max_alloc_size = RTE_MIN(max_alloc_size,
                 (size_t)mem_size) / 2;
-        } while (max_alloc_size >= min_chunk_size);
+        } while (mz == NULL && max_alloc_size >= min_chunk_size);
 
         if (mz == NULL) {
             ret = -rte_errno;
Olivier
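The reason the missing mz == NULL check surfaced as "File exists" rather than a hang: memzone names are unique, so after a successful reservation the v1 loop iterated again and re-reserved the same name, which fails with EEXIST. A standalone sketch — `mock_reserve`, `populate_once` and the zone name are illustrative stand-ins, not the real rte_memzone API — contrasts the two loop conditions:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Mock of a named reservation: like memzones, reserving a name that is
 * already taken fails with EEXIST. */
static char reserved_name[32];

static size_t mock_reserve(const char *name, size_t len)
{
    if (strcmp(reserved_name, name) == 0) {
        errno = EEXIST;
        return 0;
    }
    strncpy(reserved_name, name, sizeof(reserved_name) - 1);
    return len;
}

/* v1 kept looping after a successful reservation and re-reserved the same
 * name; v2 adds the "zone == 0" check to the loop condition. */
static int populate_once(size_t mem_size, size_t min_chunk_size, int v2)
{
    size_t max_alloc_size = SIZE_MAX;
    size_t zone = 0;

    reserved_name[0] = '\0';    /* fresh state for each call in this sketch */

    do {
        size_t len = mem_size < max_alloc_size ? mem_size : max_alloc_size;

        zone = mock_reserve("mp_zone_0", len);
        if (zone == 0 && errno != ENOMEM)
            break;              /* with the v1 condition, EEXIST lands here */
        max_alloc_size = (mem_size < max_alloc_size ?
                          mem_size : max_alloc_size) / 2;
    } while ((!v2 || zone == 0) && max_alloc_size >= min_chunk_size);

    return zone != 0 ? 0 : -errno;
}
```

With the v1 condition the first reservation succeeds, the loop runs again, the second reservation returns EEXIST, and the populate fails — matching the testpmd "Creation of mbuf pool ... File exists" report.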
On 1/9/20 4:27 PM, Olivier Matz wrote:
> When allocating a mempool which is larger than the largest
> available area, it can take a lot of time:
> 
> a- the mempool calculate the required memory size, and tries
> to allocate it, it fails
> b- then it tries to allocate the largest available area (this
> does not request new huge pages)
> c- add this zone to the mempool, this triggers the allocation
> of a mem hdr, which request a new huge page
> d- back to a- until mempool is populated or until there is no
> more memory
> 
> This can take a lot of time to finally fail (several minutes): in step
> a- it takes all available hugepages on the system, then release them
> after it fails.
> 
> The problem appeared with commit eba11e364614 ("mempool: reduce wasted
> space on populate"), because smaller chunks are now allowed. Previously,
> it had to be at least one page size, which is not the case in step b-.
> 
> To fix this, implement our own way to allocate the largest available
> area instead of using the feature from memzone: if an allocation fails,
> try to divide the size by 2 and retry. When the requested size falls
> below min_chunk_size, stop and return an error.
> 
> Fixes: eba11e364614 ("mempool: reduce wasted space on populate")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Olivier Matz <olivier.matz@6wind.com>

LGTM except already mentioned bug with missing mz == NULL to retry loop.
Plus one minor question below.

> ---
>  lib/librte_mempool/rte_mempool.c | 29 ++++++++++++-----------------
>  1 file changed, 12 insertions(+), 17 deletions(-)
> 
> diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
> index bda361ce6..03c8d984c 100644
> --- a/lib/librte_mempool/rte_mempool.c
> +++ b/lib/librte_mempool/rte_mempool.c
> @@ -481,6 +481,7 @@ rte_mempool_populate_default(struct rte_mempool *mp)
>      unsigned mz_id, n;
>      int ret;
>      bool need_iova_contig_obj;
> +    size_t max_alloc_size = SIZE_MAX;
> 
>      ret = mempool_ops_alloc_once(mp);
>      if (ret != 0)
> @@ -560,30 +561,24 @@ rte_mempool_populate_default(struct rte_mempool *mp)
>      if (min_chunk_size == (size_t)mem_size)
>          mz_flags |= RTE_MEMZONE_IOVA_CONTIG;
> 
> -    mz = rte_memzone_reserve_aligned(mz_name, mem_size,
> +    /* Allocate a memzone, retrying with a smaller area on ENOMEM */
> +    do {
> +        mz = rte_memzone_reserve_aligned(mz_name,
> +            RTE_MIN((size_t)mem_size, max_alloc_size),
>              mp->socket_id, mz_flags, align);
> 
> -    /* don't try reserving with 0 size if we were asked to reserve
> -     * IOVA-contiguous memory.
> -     */
> -    if (min_chunk_size < (size_t)mem_size && mz == NULL) {
> -        /* not enough memory, retry with the biggest zone we
> -         * have
> -         */
> -        mz = rte_memzone_reserve_aligned(mz_name, 0,
> -            mp->socket_id, mz_flags, align);
> -    }
> +        if (mz == NULL && rte_errno != ENOMEM)
> +            break;
> +
> +        max_alloc_size = RTE_MIN(max_alloc_size,
> +            (size_t)mem_size) / 2;

Does it make sense to make max_alloc_size multiple of
min_chunk_size here? I think it could help to waste less
memory space.

> +    } while (max_alloc_size >= min_chunk_size);
> +
>      if (mz == NULL) {
>          ret = -rte_errno;
>          goto fail;
>      }
> 
> -    if (mz->len < min_chunk_size) {
> -        rte_memzone_free(mz);
> -        ret = -ENOMEM;
> -        goto fail;
> -    }
> -
>      if (need_iova_contig_obj)
>          iova = mz->iova;
>      else
> 
Hi,

On Fri, Jan 10, 2020 at 12:53:24PM +0300, Andrew Rybchenko wrote:
> On 1/9/20 4:27 PM, Olivier Matz wrote:
> > When allocating a mempool which is larger than the largest
> > available area, it can take a lot of time:
> > 
> > a- the mempool calculate the required memory size, and tries
> > to allocate it, it fails
> > b- then it tries to allocate the largest available area (this
> > does not request new huge pages)
> > c- add this zone to the mempool, this triggers the allocation
> > of a mem hdr, which request a new huge page
> > d- back to a- until mempool is populated or until there is no
> > more memory
> > 
> > This can take a lot of time to finally fail (several minutes): in step
> > a- it takes all available hugepages on the system, then release them
> > after it fails.
> > 
> > The problem appeared with commit eba11e364614 ("mempool: reduce wasted
> > space on populate"), because smaller chunks are now allowed. Previously,
> > it had to be at least one page size, which is not the case in step b-.
> > 
> > To fix this, implement our own way to allocate the largest available
> > area instead of using the feature from memzone: if an allocation fails,
> > try to divide the size by 2 and retry. When the requested size falls
> > below min_chunk_size, stop and return an error.
> > 
> > Fixes: eba11e364614 ("mempool: reduce wasted space on populate")
> > Cc: stable@dpdk.org
> > 
> > Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
> 
> LGTM except already mentioned bug with missing mz == NULL to retry loop.
> Plus one minor question below.

> > ---
> >  lib/librte_mempool/rte_mempool.c | 29 ++++++++++++-----------------
> >  1 file changed, 12 insertions(+), 17 deletions(-)
> > 
> > diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
> > index bda361ce6..03c8d984c 100644
> > --- a/lib/librte_mempool/rte_mempool.c
> > +++ b/lib/librte_mempool/rte_mempool.c
> > @@ -481,6 +481,7 @@ rte_mempool_populate_default(struct rte_mempool *mp)
> >      unsigned mz_id, n;
> >      int ret;
> >      bool need_iova_contig_obj;
> > +    size_t max_alloc_size = SIZE_MAX;
> > 
> >      ret = mempool_ops_alloc_once(mp);
> >      if (ret != 0)
> > @@ -560,30 +561,24 @@ rte_mempool_populate_default(struct rte_mempool *mp)
> >      if (min_chunk_size == (size_t)mem_size)
> >          mz_flags |= RTE_MEMZONE_IOVA_CONTIG;
> > 
> > -    mz = rte_memzone_reserve_aligned(mz_name, mem_size,
> > +    /* Allocate a memzone, retrying with a smaller area on ENOMEM */
> > +    do {
> > +        mz = rte_memzone_reserve_aligned(mz_name,
> > +            RTE_MIN((size_t)mem_size, max_alloc_size),
> >              mp->socket_id, mz_flags, align);
> > 
> > -    /* don't try reserving with 0 size if we were asked to reserve
> > -     * IOVA-contiguous memory.
> > -     */
> > -    if (min_chunk_size < (size_t)mem_size && mz == NULL) {
> > -        /* not enough memory, retry with the biggest zone we
> > -         * have
> > -         */
> > -        mz = rte_memzone_reserve_aligned(mz_name, 0,
> > -            mp->socket_id, mz_flags, align);
> > -    }
> > +        if (mz == NULL && rte_errno != ENOMEM)
> > +            break;
> > +
> > +        max_alloc_size = RTE_MIN(max_alloc_size,
> > +            (size_t)mem_size) / 2;
> 
> Does it make sense to make max_alloc_size multiple of
> min_chunk_size here? I think it could help to waste less
> memory space.

I don't think it's worth doing it: I agree it could avoid wasting some
space, but the waste is only significant if max_alloc_size is in the
same order of magnitude as min_chunk_size. And this would only happen
when we are running out of memory.

Also, as populate_virt() will skip page boundaries, keeping a multiple
of min_chunk_size may not make sense in that case.

> > +    } while (max_alloc_size >= min_chunk_size);
> > +
> >      if (mz == NULL) {
> >          ret = -rte_errno;
> >          goto fail;
> >      }
> > 
> > -    if (mz->len < min_chunk_size) {
> > -        rte_memzone_free(mz);
> > -        ret = -ENOMEM;
> > -        goto fail;
> > -    }
> > -
> >      if (need_iova_contig_obj)
> >          iova = mz->iova;
> >      else
> > 
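Olivier's point can be checked with a rough computation (the sizes below are illustrative, not taken from the patch): rounding each retry down to a multiple of min_chunk_size only reclaims a noticeable fraction of the request when the request is already close to min_chunk_size, i.e. when memory is nearly exhausted anyway.

```c
#include <assert.h>
#include <stddef.h>

/* Space that rounding a request down to a multiple of min_chunk_size
 * would reclaim, i.e. the remainder left over by the plain halving
 * scheme. (A rough model: real waste also depends on object layout
 * and page boundaries.) */
static size_t waste_if_not_rounded(size_t request, size_t min_chunk_size)
{
    return request % min_chunk_size;
}
```

For a 1 MiB request with a 2560-byte min_chunk_size the remainder is 1536 bytes (about 0.15% of the request); for a 4096-byte request near the tail of the halving sequence the same 1536 bytes is about 37% — significant only when sizes are of the same order.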
When allocating a mempool which is larger than the largest
available area, it can take a lot of time:

a- the mempool calculate the required memory size, and tries
to allocate it, it fails
b- then it tries to allocate the largest available area (this
does not request new huge pages)
c- add this zone to the mempool, this triggers the allocation
of a mem hdr, which request a new huge page
d- back to a- until mempool is populated or until there is no
more memory

This can take a lot of time to finally fail (several minutes): in step
a- it takes all available hugepages on the system, then release them
after it fails.

The problem appeared with commit eba11e364614 ("mempool: reduce wasted
space on populate"), because smaller chunks are now allowed. Previously,
it had to be at least one page size, which is not the case in step b-.

To fix this, implement our own way to allocate the largest available
area instead of using the feature from memzone: if an allocation fails,
try to divide the size by 2 and retry. When the requested size falls
below min_chunk_size, stop and return an error.
Fixes: eba11e364614 ("mempool: reduce wasted space on populate")
Cc: stable@dpdk.org

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---

v2:
* fix missing check on mz == NULL condition

 lib/librte_mempool/rte_mempool.c | 29 ++++++++++++-----------------
 1 file changed, 12 insertions(+), 17 deletions(-)

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 1eae10e27..a68a69040 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -499,6 +499,7 @@ rte_mempool_populate_default(struct rte_mempool *mp)
     unsigned mz_id, n;
     int ret;
     bool need_iova_contig_obj;
+    size_t max_alloc_size = SIZE_MAX;
 
     ret = mempool_ops_alloc_once(mp);
     if (ret != 0)
@@ -578,30 +579,24 @@ rte_mempool_populate_default(struct rte_mempool *mp)
     if (min_chunk_size == (size_t)mem_size)
         mz_flags |= RTE_MEMZONE_IOVA_CONTIG;
 
-    mz = rte_memzone_reserve_aligned(mz_name, mem_size,
+    /* Allocate a memzone, retrying with a smaller area on ENOMEM */
+    do {
+        mz = rte_memzone_reserve_aligned(mz_name,
+            RTE_MIN((size_t)mem_size, max_alloc_size),
             mp->socket_id, mz_flags, align);
 
-    /* don't try reserving with 0 size if we were asked to reserve
-     * IOVA-contiguous memory.
-     */
-    if (min_chunk_size < (size_t)mem_size && mz == NULL) {
-        /* not enough memory, retry with the biggest zone we
-         * have
-         */
-        mz = rte_memzone_reserve_aligned(mz_name, 0,
-            mp->socket_id, mz_flags, align);
-    }
+        if (mz == NULL && rte_errno != ENOMEM)
+            break;
+
+        max_alloc_size = RTE_MIN(max_alloc_size,
+            (size_t)mem_size) / 2;
+    } while (mz == NULL && max_alloc_size >= min_chunk_size);
+
     if (mz == NULL) {
         ret = -rte_errno;
         goto fail;
     }
 
-    if (mz->len < min_chunk_size) {
-        rte_memzone_free(mz);
-        ret = -ENOMEM;
-        goto fail;
-    }
-
     if (need_iova_contig_obj)
         iova = mz->iova;
     else
-- 
2.20.1
On Fri, Jan 17, 2020 at 10:51:49AM +0100, Olivier Matz wrote:
> When allocating a mempool which is larger than the largest
> available area, it can take a lot of time:
> 
> a- the mempool calculate the required memory size, and tries
> to allocate it, it fails
> b- then it tries to allocate the largest available area (this
> does not request new huge pages)
> c- add this zone to the mempool, this triggers the allocation
> of a mem hdr, which request a new huge page
> d- back to a- until mempool is populated or until there is no
> more memory
> 
> This can take a lot of time to finally fail (several minutes): in step
> a- it takes all available hugepages on the system, then release them
> after it fails.
> 
> The problem appeared with commit eba11e364614 ("mempool: reduce wasted
> space on populate"), because smaller chunks are now allowed. Previously,
> it had to be at least one page size, which is not the case in step b-.
> 
> To fix this, implement our own way to allocate the largest available
> area instead of using the feature from memzone: if an allocation fails,
> try to divide the size by 2 and retry. When the requested size falls
> below min_chunk_size, stop and return an error.
> 
> Fixes: eba11e364614 ("mempool: reduce wasted space on populate")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Olivier Matz <olivier.matz@6wind.com>

Sorry I forgot to report Anatoly's ack on v1
http://patchwork.dpdk.org/patch/64370/

Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>

> ---
> 
> v2:
> * fix missing check on mz == NULL condition
> 
> 
>  lib/librte_mempool/rte_mempool.c | 29 ++++++++++++-----------------
>  1 file changed, 12 insertions(+), 17 deletions(-)
> 
> diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
> index 1eae10e27..a68a69040 100644
> --- a/lib/librte_mempool/rte_mempool.c
> +++ b/lib/librte_mempool/rte_mempool.c
> @@ -499,6 +499,7 @@ rte_mempool_populate_default(struct rte_mempool *mp)
>      unsigned mz_id, n;
>      int ret;
>      bool need_iova_contig_obj;
> +    size_t max_alloc_size = SIZE_MAX;
> 
>      ret = mempool_ops_alloc_once(mp);
>      if (ret != 0)
> @@ -578,30 +579,24 @@ rte_mempool_populate_default(struct rte_mempool *mp)
>      if (min_chunk_size == (size_t)mem_size)
>          mz_flags |= RTE_MEMZONE_IOVA_CONTIG;
> 
> -    mz = rte_memzone_reserve_aligned(mz_name, mem_size,
> +    /* Allocate a memzone, retrying with a smaller area on ENOMEM */
> +    do {
> +        mz = rte_memzone_reserve_aligned(mz_name,
> +            RTE_MIN((size_t)mem_size, max_alloc_size),
>              mp->socket_id, mz_flags, align);
> 
> -    /* don't try reserving with 0 size if we were asked to reserve
> -     * IOVA-contiguous memory.
> -     */
> -    if (min_chunk_size < (size_t)mem_size && mz == NULL) {
> -        /* not enough memory, retry with the biggest zone we
> -         * have
> -         */
> -        mz = rte_memzone_reserve_aligned(mz_name, 0,
> -            mp->socket_id, mz_flags, align);
> -    }
> +        if (mz == NULL && rte_errno != ENOMEM)
> +            break;
> +
> +        max_alloc_size = RTE_MIN(max_alloc_size,
> +            (size_t)mem_size) / 2;
> +    } while (mz == NULL && max_alloc_size >= min_chunk_size);
> +
>      if (mz == NULL) {
>          ret = -rte_errno;
>          goto fail;
>      }
> 
> -    if (mz->len < min_chunk_size) {
> -        rte_memzone_free(mz);
> -        ret = -ENOMEM;
> -        goto fail;
> -    }
> -
>      if (need_iova_contig_obj)
>          iova = mz->iova;
>      else
> --
> 2.20.1
> 
On 1/17/20 1:01 PM, Olivier Matz wrote:
> On Fri, Jan 17, 2020 at 10:51:49AM +0100, Olivier Matz wrote:
>> When allocating a mempool which is larger than the largest
>> available area, it can take a lot of time:
>>
>> a- the mempool calculate the required memory size, and tries
>> to allocate it, it fails
>> b- then it tries to allocate the largest available area (this
>> does not request new huge pages)
>> c- add this zone to the mempool, this triggers the allocation
>> of a mem hdr, which request a new huge page
>> d- back to a- until mempool is populated or until there is no
>> more memory
>>
>> This can take a lot of time to finally fail (several minutes): in step
>> a- it takes all available hugepages on the system, then release them
>> after it fails.
>>
>> The problem appeared with commit eba11e364614 ("mempool: reduce wasted
>> space on populate"), because smaller chunks are now allowed. Previously,
>> it had to be at least one page size, which is not the case in step b-.
>>
>> To fix this, implement our own way to allocate the largest available
>> area instead of using the feature from memzone: if an allocation fails,
>> try to divide the size by 2 and retry. When the requested size falls
>> below min_chunk_size, stop and return an error.
>>
>> Fixes: eba11e364614 ("mempool: reduce wasted space on populate")
>> Cc: stable@dpdk.org
>>
>> Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
>
> Sorry I forgot to report Anatoly's ack on v1
> http://patchwork.dpdk.org/patch/64370/
>
> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
> -----Original Message-----
> From: Olivier Matz <olivier.matz@6wind.com>
> Sent: Friday, January 17, 2020 11:52 AM
> To: dev@dpdk.org
> Cc: Ali Alnubani <alialnu@mellanox.com>; Anatoly Burakov
> <anatoly.burakov@intel.com>; Andrew Rybchenko
> <arybchenko@solarflare.com>; Raslan Darawsheh
> <rasland@mellanox.com>; stable@dpdk.org; Thomas Monjalon
> <thomasm@mellanox.com>
> Subject: [PATCH v2] mempool: fix slow allocation of large mempools
>
> When allocating a mempool which is larger than the largest available area, it
> can take a lot of time:
>
> a- the mempool calculate the required memory size, and tries
> to allocate it, it fails
> b- then it tries to allocate the largest available area (this
> does not request new huge pages)
> c- add this zone to the mempool, this triggers the allocation
> of a mem hdr, which request a new huge page
> d- back to a- until mempool is populated or until there is no
> more memory
>
> This can take a lot of time to finally fail (several minutes): in step
> a- it takes all available hugepages on the system, then release them after it
> fails.
>
> The problem appeared with commit eba11e364614 ("mempool: reduce
> wasted space on populate"), because smaller chunks are now allowed.
> Previously, it had to be at least one page size, which is not the case in step b-.
>
> To fix this, implement our own way to allocate the largest available area
> instead of using the feature from memzone: if an allocation fails, try to divide
> the size by 2 and retry. When the requested size falls below min_chunk_size,
> stop and return an error.
>
> Fixes: eba11e364614 ("mempool: reduce wasted space on populate")
> Cc: stable@dpdk.org
>
> Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
> ---
>
> v2:
> * fix missing check on mz == NULL condition
>
Tested-by: Ali Alnubani <alialnu@mellanox.com>
17/01/2020 11:09, Andrew Rybchenko:
> On 1/17/20 1:01 PM, Olivier Matz wrote:
> > On Fri, Jan 17, 2020 at 10:51:49AM +0100, Olivier Matz wrote:
> >> When allocating a mempool which is larger than the largest
> >> available area, it can take a lot of time:
> >>
> >> a- the mempool calculate the required memory size, and tries
> >> to allocate it, it fails
> >> b- then it tries to allocate the largest available area (this
> >> does not request new huge pages)
> >> c- add this zone to the mempool, this triggers the allocation
> >> of a mem hdr, which request a new huge page
> >> d- back to a- until mempool is populated or until there is no
> >> more memory
> >>
> >> This can take a lot of time to finally fail (several minutes): in step
> >> a- it takes all available hugepages on the system, then release them
> >> after it fails.
> >>
> >> The problem appeared with commit eba11e364614 ("mempool: reduce wasted
> >> space on populate"), because smaller chunks are now allowed. Previously,
> >> it had to be at least one page size, which is not the case in step b-.
> >>
> >> To fix this, implement our own way to allocate the largest available
> >> area instead of using the feature from memzone: if an allocation fails,
> >> try to divide the size by 2 and retry. When the requested size falls
> >> below min_chunk_size, stop and return an error.
> >>
> >> Fixes: eba11e364614 ("mempool: reduce wasted space on populate")
> >> Cc: stable@dpdk.org
> >>
> >> Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
> >
> > Sorry I forgot to report Anatoly's ack on v1
> > http://patchwork.dpdk.org/patch/64370/
> >
> > Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
>
> Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Applied, thanks