* [dpdk-dev] [PATCH] eal: unmap unneed dpdk VA spaces for legacy mem
@ 2019-03-08 5:38 Lilijun
2019-03-08 9:37 ` Burakov, Anatoly
From: Lilijun @ 2019-03-08 5:38 UTC
To: dev; +Cc: jerry.zhang, ian.stokes, Lilijun
Compared to DPDK 16.11, the VA space of a DPDK application process has grown to more than 30G.
Here we can unmap the unneeded VA space in rte_memseg_list.
Signed-off-by: Lilijun <jerry.lilijun@huawei.com>
---
lib/librte_eal/linuxapp/eal/eal_memory.c | 13 ++++++++++++-
1 file changed, 12 insertions(+), 1 deletion(-)
diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
index 32feb41..56abdd2 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -1626,8 +1626,19 @@ void numa_error(char *where)
 		if (msl->base_va == NULL)
 			continue;
 		/* skip lists where there is at least one page allocated */
-		if (msl->memseg_arr.count > 0)
+		if (msl->memseg_arr.count > 0) {
+			if (internal_config.legacy_mem) {
+				struct rte_fbarray *arr = &msl->memseg_arr;
+				int idx = rte_fbarray_find_next_free(arr, 0);
+
+				while (idx >= 0) {
+					void *va = (void*)((char*)msl->base_va + idx * msl->page_sz);
+					munmap(va, msl->page_sz);
+					idx = rte_fbarray_find_next_free(arr, idx + 1);
+				}
+			}
 			continue;
+		}
 		/* this is an unused list, deallocate it */
 		mem_sz = msl->len;
 		munmap(msl->base_va, mem_sz);
--
1.8.3.1
* Re: [dpdk-dev] [PATCH] eal: unmap unneed dpdk VA spaces for legacy mem
2019-03-08 5:38 [dpdk-dev] [PATCH] eal: unmap unneed dpdk VA spaces for legacy mem Lilijun
@ 2019-03-08 9:37 ` Burakov, Anatoly
2019-03-12 1:47 ` Lilijun (Jerry, Cloud Networking)
From: Burakov, Anatoly @ 2019-03-08 9:37 UTC
To: Lilijun, dev; +Cc: jerry.zhang, ian.stokes
On 08-Mar-19 5:38 AM, Lilijun wrote:
> Comparing dpdk VA spaces to dpdk 16.11, the dpdk app process's VA spaces increase to above 30G.
> Here we can unmap the unneed VA spaces in rte_memseg_list.
>
> Signed-off-by: Lilijun <jerry.lilijun@huawei.com>
> ---
> lib/librte_eal/linuxapp/eal/eal_memory.c | 13 ++++++++++++-
> 1 file changed, 12 insertions(+), 1 deletion(-)
>
> diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
> index 32feb41..56abdd2 100644
> --- a/lib/librte_eal/linuxapp/eal/eal_memory.c
> +++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
> @@ -1626,8 +1626,19 @@ void numa_error(char *where)
> if (msl->base_va == NULL)
> continue;
> /* skip lists where there is at least one page allocated */
> - if (msl->memseg_arr.count > 0)
> + if (msl->memseg_arr.count > 0) {
> + if (internal_config.legacy_mem) {
> + struct rte_fbarray *arr = &msl->memseg_arr;
> + int idx = rte_fbarray_find_next_free(arr, 0);
> +
> + while (idx >= 0) {
> + void *va = (void*)((char*)msl->base_va + idx * msl->page_sz);
> + munmap(va, msl->page_sz);
> + idx = rte_fbarray_find_next_free(arr, idx + 1);
> + }
I am not entirely convinced this change is safe to do. Technically, this
space is marked as free, so correctly written code should not attempt to
access it; however, it is still potentially dangerous to have a memory area
that is supposed to be allocated (according to the data structures'
parameters), but isn't.
If you are deallocating the VA space, ideally you should also resize the
memseg list (as in, change its length), because that leftover memory
area is no longer valid. However, this then presents us with a mismatch
between (va_start + len) and (va_start + page_sz * memseg_arr.len),
which may break things further.
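The mismatch described above can be written down as a small check. This is only a sketch: `struct msl_view` and both helper functions are hypothetical, simplified stand-ins for the relevant `rte_memseg_list` fields (`base_va`, `len`, `page_sz`, and the length of the backing fbarray), not DPDK API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of the rte_memseg_list fields discussed here. */
struct msl_view {
	void *base_va;        /* start of the reserved VA range */
	size_t len;           /* length of the VA range in bytes */
	uint64_t page_sz;     /* page size used by this list */
	unsigned int arr_len; /* number of slots in the backing fbarray */
};

/* True when the VA length equals what the fbarray is able to index. */
static bool
msl_view_is_consistent(const struct msl_view *v)
{
	assert(v != NULL && v->page_sz != 0);
	return (uint64_t)v->len == v->page_sz * v->arr_len;
}

/* True when fbarray slot idx still lies inside the VA range [0, len). */
static bool
msl_view_slot_is_mapped(const struct msl_view *v, unsigned int idx)
{
	assert(v != NULL);
	return idx < v->arr_len &&
		((uint64_t)idx + 1) * v->page_sz <= (uint64_t)v->len;
}
```

If `len` is shrunk while `arr_len` stays the same, the first check fails and some slots that the fbarray still reports as valid indices no longer correspond to mapped addresses.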
May I ask what the purpose of this change is? I mean, I understand the
part about unused VA space sitting there, but what is the consequence of
that? This isn't the 32-bit codepath, and in 64-bit there's plenty of
address space to go around, and this memory doesn't take up any system
resources anyway because it is read-only anonymous memory, and is
therefore backed by the zero page instead of real pages. So, what's wrong
with just leaving it there?
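For readers unfamiliar with this kind of reservation, the following is a minimal sketch of a read-only anonymous mapping on Linux. The function names `reserve_va` and `sum_first_bytes` are made up for illustration, and the mmap flags are an assumption about what "read-only anonymous memory" means here, not a copy of the EAL code.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS with strict -std= modes */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Reserve len bytes of address space as read-only anonymous memory. */
static void *
reserve_va(size_t len)
{
	void *p = mmap(NULL, len, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	return p == MAP_FAILED ? NULL : p;
}

/* Read one byte from every page; untouched anonymous memory reads as zero. */
static unsigned long
sum_first_bytes(const void *base, size_t len, size_t page_sz)
{
	const volatile unsigned char *p = base;
	unsigned long sum = 0;
	size_t off;

	assert(base != NULL && page_sz != 0);
	for (off = 0; off < len; off += page_sz)
		sum += p[off];
	return sum;
}
```

Reading the range returns zeros without the process ever writing to it, which is why such a reservation costs address space rather than memory.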
I don't see any advantage to this change, and I see plenty of
disadvantages, so for now I'm inclined to NACK this particular patch.
_However_, I should note that if you feel this is a very important feature
to have and would still like to implement it, my advice would be to look
at how the 32-bit code works, and model the 64-bit implementation after
that, because the 32-bit codepath does exactly what you propose, and doesn't
leave unused address space.
> + }
> continue;
> + }
> /* this is an unused list, deallocate it */
> mem_sz = msl->len;
> munmap(msl->base_va, mem_sz);
>
--
Thanks,
Anatoly
* Re: [dpdk-dev] [PATCH] eal: unmap unneed dpdk VA spaces for legacy mem
2019-03-08 9:37 ` Burakov, Anatoly
@ 2019-03-12 1:47 ` Lilijun (Jerry, Cloud Networking)
2019-03-12 11:02 ` Burakov, Anatoly
From: Lilijun (Jerry, Cloud Networking) @ 2019-03-12 1:47 UTC
To: Burakov, Anatoly, dev; +Cc: jerry.zhang, ian.stokes
Hi Anatoly,
> -----Original Message-----
> From: Burakov, Anatoly [mailto:anatoly.burakov@intel.com]
> Sent: Friday, March 08, 2019 5:38 PM
> To: Lilijun (Jerry, Cloud Networking) <jerry.lilijun@huawei.com>;
> dev@dpdk.org
> Cc: jerry.zhang@intel.com; ian.stokes@intel.com
> Subject: Re: [dpdk-dev] [PATCH] eal: unmap unneed dpdk VA spaces for
> legacy mem
>
> On 08-Mar-19 5:38 AM, Lilijun wrote:
> > Comparing dpdk VA spaces to dpdk 16.11, the dpdk app process's VA spaces increase to above 30G.
> > Here we can unmap the unneed VA spaces in rte_memseg_list.
> >
> > Signed-off-by: Lilijun <jerry.lilijun@huawei.com>
> > ---
> > lib/librte_eal/linuxapp/eal/eal_memory.c | 13 ++++++++++++-
> > 1 file changed, 12 insertions(+), 1 deletion(-)
> >
> > diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
> > index 32feb41..56abdd2 100644
> > --- a/lib/librte_eal/linuxapp/eal/eal_memory.c
> > +++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
> > @@ -1626,8 +1626,19 @@ void numa_error(char *where)
> > if (msl->base_va == NULL)
> > continue;
> > /* skip lists where there is at least one page allocated */
> > - if (msl->memseg_arr.count > 0)
> > + if (msl->memseg_arr.count > 0) {
> > + if (internal_config.legacy_mem) {
> > + struct rte_fbarray *arr = &msl->memseg_arr;
> > + int idx = rte_fbarray_find_next_free(arr, 0);
> > +
> > + while (idx >= 0) {
> > + void *va = (void*)((char*)msl->base_va + idx * msl->page_sz);
> > + munmap(va, msl->page_sz);
> > + idx = rte_fbarray_find_next_free(arr, idx + 1);
> > + }
>
> I am not entirely convinced this change is safe to do. Technically, this space is
> marked as free, so correctly written code should not attempt to access it,
> however it is still potentially dangerous to have memory area that is
> supposed to be allocated (according to data structures'
> parameters), but isn't.
>
> If you are deallocating the VA space, ideally you should also resize the
> memseg list (as in, change its length), because that leftover memory area is
> no longer valid. However, this then presents us with a mismatch between
> (va_start + len) and (va_start + page_sz * memseg_arr.len), which may
> break things further.
Yes, you're right; we need to resize the memseg list here. I will update the patch if it is still needed.
>
> May i ask what is the purpose of this change? I mean, i understand the part
> about unused VA space sitting there, but what is the consequence of that?
> This isn't 32-bit codepath, and in 64-bit there's plenty of address space to go
> around, and this memory doesn't take up any system resources anyway
> because it is read-only anonymous memory, and is therefore backed by zero
> page instead of real pages. So, what's wrong with just leaving it there?
The larger VA space causes an issue: when a DPDK app crashes, the coredump file becomes too large.
Thanks.
>
> I don't see any advantage of this change, and i see plenty of disadvantages,
> so for now i'm inclined to NACK this particular patch.
>
> _However_, i should note that if you feel this is very important feature to
> have and would still like to implement it, my advise would be to look at how
> 32-bit code works, and model the 64-bit implementation after that, because
> 32-bit codepath does exactly what you propose, and doesn't leave unused
> address space.
>
> > + }
> > continue;
> > + }
> > /* this is an unused list, deallocate it */
> > mem_sz = msl->len;
> > munmap(msl->base_va, mem_sz);
> >
>
>
> --
> Thanks,
> Anatoly
* Re: [dpdk-dev] [PATCH] eal: unmap unneed dpdk VA spaces for legacy mem
2019-03-12 1:47 ` Lilijun (Jerry, Cloud Networking)
@ 2019-03-12 11:02 ` Burakov, Anatoly
From: Burakov, Anatoly @ 2019-03-12 11:02 UTC
To: Lilijun (Jerry, Cloud Networking), dev; +Cc: jerry.zhang, ian.stokes
On 12-Mar-19 1:47 AM, Lilijun (Jerry, Cloud Networking) wrote:
> Hi Anatoly,
>
>> -----Original Message-----
>> From: Burakov, Anatoly [mailto:anatoly.burakov@intel.com]
>> Sent: Friday, March 08, 2019 5:38 PM
>> To: Lilijun (Jerry, Cloud Networking) <jerry.lilijun@huawei.com>;
>> dev@dpdk.org
>> Cc: jerry.zhang@intel.com; ian.stokes@intel.com
>> Subject: Re: [dpdk-dev] [PATCH] eal: unmap unneed dpdk VA spaces for
>> legacy mem
>>
>> On 08-Mar-19 5:38 AM, Lilijun wrote:
>>> Comparing dpdk VA spaces to dpdk 16.11, the dpdk app process's VA spaces increase to above 30G.
>>> Here we can unmap the unneed VA spaces in rte_memseg_list.
>>>
>>> Signed-off-by: Lilijun <jerry.lilijun@huawei.com>
>>> ---
>>> lib/librte_eal/linuxapp/eal/eal_memory.c | 13 ++++++++++++-
>>> 1 file changed, 12 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
>>> index 32feb41..56abdd2 100644
>>> --- a/lib/librte_eal/linuxapp/eal/eal_memory.c
>>> +++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
>>> @@ -1626,8 +1626,19 @@ void numa_error(char *where)
>>> if (msl->base_va == NULL)
>>> continue;
>>> /* skip lists where there is at least one page allocated */
>>> - if (msl->memseg_arr.count > 0)
>>> + if (msl->memseg_arr.count > 0) {
>>> + if (internal_config.legacy_mem) {
>>> + struct rte_fbarray *arr = &msl->memseg_arr;
>>> + int idx = rte_fbarray_find_next_free(arr, 0);
>>> +
>>> + while (idx >= 0) {
>>> + void *va = (void*)((char*)msl->base_va + idx * msl->page_sz);
>>> + munmap(va, msl->page_sz);
>>> + idx = rte_fbarray_find_next_free(arr, idx + 1);
>>> + }
>>
>> I am not entirely convinced this change is safe to do. Technically, this space is
>> marked as free, so correctly written code should not attempt to access it,
>> however it is still potentially dangerous to have memory area that is
>> supposed to be allocated (according to data structures'
>> parameters), but isn't.
>>
>> If you are deallocating the VA space, ideally you should also resize the
>> memseg list (as in, change its length), because that leftover memory area is
>> no longer valid. However, this then presents us with a mismatch between
>> (va_start + len) and (va_start + page_sz * memseg_arr.len), which may
>> break things further.
>
> Yes, you're right, here we need resize the memseg length. I will update it if this patch is needed.
Resizing the memseg list is not the best course of action because fbarray
itself doesn't support resizing, so you'll end up with a mismatch
between the length of memory and the length of the fbarray backing the
memseg list. See the suggestion below for implementation.
>>
>> May i ask what is the purpose of this change? I mean, i understand the part
>> about unused VA space sitting there, but what is the consequence of that?
>> This isn't 32-bit codepath, and in 64-bit there's plenty of address space to go
>> around, and this memory doesn't take up any system resources anyway
>> because it is read-only anonymous memory, and is therefore backed by zero
>> page instead of real pages. So, what's wrong with just leaving it there?
>
> This change will cause a issues: when dpdk apps crashed, the coredump file will become too large.
>
> Thanks.
You must have different default coredump settings than I do, because I
haven't seen Linux attempting to dump the entire address space before (I
have seen FreeBSD do that, mind you...).
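On Linux, the setting in question is the per-process bitmask in /proc/<pid>/coredump_filter, documented in core(5); bit 0 controls whether anonymous private mappings are dumped. As a rough sketch for comparing settings between two machines, the mask can be decoded as below. The helper names are hypothetical, and reading the file itself is left out so the parsing stays self-contained.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Bit 0 of coredump_filter selects anonymous private mappings (see core(5)). */
#define COREDUMP_ANON_PRIVATE (1UL << 0)

/* Parse the hexadecimal text read from /proc/<pid>/coredump_filter. */
static bool
parse_coredump_filter(const char *text, unsigned long *mask)
{
	char *end = NULL;
	unsigned long v;

	assert(text != NULL && mask != NULL);
	errno = 0;
	v = strtoul(text, &end, 16);
	if (errno != 0 || end == text)
		return false; /* out of range, or no hex digits at all */
	*mask = v;
	return true;
}

/* True when the mask asks the kernel to dump anonymous private mappings. */
static bool
dumps_anon_private(unsigned long mask)
{
	return (mask & COREDUMP_ANON_PRIVATE) != 0;
}
```

Running `cat /proc/self/coredump_filter` on both systems and feeding the output through a check like this would show whether the two setups really differ in what they dump.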
>
>>
>> I don't see any advantage of this change, and i see plenty of disadvantages,
>> so for now i'm inclined to NACK this particular patch.
>>
>> _However_, i should note that if you feel this is very important feature to
>> have and would still like to implement it, my advise would be to look at how
>> 32-bit code works, and model the 64-bit implementation after that, because
>> 32-bit codepath does exactly what you propose, and doesn't leave unused
>> address space.
The above is the way to go as far as implementing this particular
feature goes: this has to be done at memseg list allocation time, not
after the fact, when memseg lists are already allocated.
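To illustrate the general idea of sizing at allocation time, here is a sketch that reserves exactly as much VA as the known page count requires, so there is no unused tail to trim later. It is not the 32-bit EAL code: `va_len_for_pages` and `reserve_exact` are hypothetical names, and the mmap flags are an assumption.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS with strict -std= modes */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Compute the VA length for n_pages pages of page_sz, rejecting overflow. */
static bool
va_len_for_pages(uint64_t page_sz, unsigned int n_pages, size_t *len)
{
	assert(len != NULL);
	if (page_sz == 0 || n_pages == 0)
		return false;
	if (page_sz > SIZE_MAX / n_pages)
		return false; /* page_sz * n_pages would not fit in size_t */
	*len = (size_t)page_sz * n_pages;
	return true;
}

/* Reserve exactly the VA needed; returns NULL on failure. */
static void *
reserve_exact(uint64_t page_sz, unsigned int n_pages, size_t *len_out)
{
	size_t len;
	void *p;

	if (!va_len_for_pages(page_sz, n_pages, &len))
		return NULL;
	p = mmap(NULL, len, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return NULL;
	*len_out = len;
	return p;
}
```

With this shape, the list length and the backing array length can be derived from the same `n_pages` value, which avoids the mismatch discussed earlier in the thread.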
>>
>>> + }
>>> continue;
>>> + }
>>> /* this is an unused list, deallocate it */
>>> mem_sz = msl->len;
>>> munmap(msl->base_va, mem_sz);
>>>
>>
>>
>> --
>> Thanks,
>> Anatoly
--
Thanks,
Anatoly