DPDK patches and discussions
* [dpdk-dev] [PATCH] eal: decrease the memory init time with many hugepages setup
@ 2015-04-02 11:30 jerry.lilijun
  2015-04-02 12:55 ` Thomas Monjalon
  0 siblings, 1 reply; 7+ messages in thread
From: jerry.lilijun @ 2015-04-02 11:30 UTC (permalink / raw)
  To: dev

From: Lilijun <jerry.lilijun@huawei.com>

In the function map_all_hugepages(), hugepage memory is actually allocated by
memset(virtaddr, 0, hugepage_sz). As a result, DPDK memory initialization takes
about 40s when 40000 2M hugepages are set up in the host OS.
In fact, writing a single byte is enough to complete the allocation.

Signed-off-by: Lilijun <jerry.lilijun@huawei.com>
---
 lib/librte_eal/linuxapp/eal/eal_memory.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
index 5f9f92e..8bbee9c 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -378,7 +378,7 @@ map_all_hugepages(struct hugepage_file *hugepg_tbl,
 
 		if (orig) {
 			hugepg_tbl[i].orig_va = virtaddr;
-			memset(virtaddr, 0, hugepage_sz);
+			memset(virtaddr, 0, 1);
 		}
 		else {
 			hugepg_tbl[i].final_va = virtaddr;
-- 
1.9.4.msysgit.1
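
As a rough sanity check on the figures in the commit message, the sketch below works out how much memory the full-page memset covers and what average zeroing rate 40 s implies. It assumes "2M" means 2 MiB; the function names are hypothetical and not part of DPDK.

```c
#include <stdint.h>

/* Total bytes covered by n_pages hugepages of page_sz bytes each. */
uint64_t total_hugepage_bytes(uint64_t n_pages, uint64_t page_sz)
{
    return n_pages * page_sz;
}

/* Average rate in MiB/s implied by zeroing total_bytes in `seconds`. */
double implied_mib_per_sec(uint64_t total_bytes, double seconds)
{
    return (double)total_bytes / (1024.0 * 1024.0) / seconds;
}
```

Calling total_hugepage_bytes(40000, 2 * 1024 * 1024) gives 83,886,080,000 bytes (78.125 GiB), and implied_mib_per_sec() on that total with 40.0 seconds gives 2000 MiB/s. The reported 40 s is therefore about what a single-threaded pass over that much memory would be expected to cost.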


* Re: [dpdk-dev] [PATCH] eal: decrease the memory init time with many hugepages setup
  2015-04-02 11:30 [dpdk-dev] [PATCH] eal: decrease the memory init time with many hugepages setup jerry.lilijun
@ 2015-04-02 12:55 ` Thomas Monjalon
  2015-04-02 13:41   ` Jay Rolette
  0 siblings, 1 reply; 7+ messages in thread
From: Thomas Monjalon @ 2015-04-02 12:55 UTC (permalink / raw)
  To: jerry.lilijun; +Cc: dev

2015-04-02 19:30, jerry.lilijun@huawei.com:
> From: Lilijun <jerry.lilijun@huawei.com>
> 
> In the function map_all_hugepages(), hugepage memory is truly allocated by
> memset(virtaddr, 0, hugepage_sz). Then it costs about 40s to finish the
> dpdk memory initialization when 40000 2M hugepages are setup in host os.

Yes it's something we should try to reduce.

> In fact we can only write one byte to finish  the allocation.

Isn't it a security hole?

This article speaks about "prezeroing optimizations" in Linux kernel:
	http://landley.net/writing/memory-faq.txt


* Re: [dpdk-dev] [PATCH] eal: decrease the memory init time with many hugepages setup
  2015-04-02 12:55 ` Thomas Monjalon
@ 2015-04-02 13:41   ` Jay Rolette
  2015-04-03  9:04     ` Gonzalez Monroy, Sergio
  0 siblings, 1 reply; 7+ messages in thread
From: Jay Rolette @ 2015-04-02 13:41 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: DPDK

On Thu, Apr 2, 2015 at 7:55 AM, Thomas Monjalon <thomas.monjalon@6wind.com>
wrote:

> 2015-04-02 19:30, jerry.lilijun@huawei.com:
> > From: Lilijun <jerry.lilijun@huawei.com>
> >
> > In the function map_all_hugepages(), hugepage memory is truly allocated
> by
> > memset(virtaddr, 0, hugepage_sz). Then it costs about 40s to finish the
> > dpdk memory initialization when 40000 2M hugepages are setup in host os.
>
> Yes it's something we should try to reduce.
>

I have a patch in my tree that does the same opto, but it is commented out
right now. In our case, 2/3 of the startup time for our entire app was
due to that particular call: memset(virtaddr, 0, hugepage_sz). Just
zeroing 1 byte per huge page reduces that by 30% in my tests.

The only reason I have it commented out is that I didn't have time to make
sure there weren't side-effects for DPDK or my app. For normal shared
memory on Linux, pages are initialized to zero automatically once they are
touched, so the memset isn't required but I wasn't sure whether that
applied to huge pages. Also wasn't sure how hugetlbfs factored into the
equation.

Hopefully someone can chime in on that. Would love to uncomment the opto :)

> In fact we can only write one byte to finish  the allocation.
>
> Isn't it a security hole?
>

Not necessarily. If the kernel pre-zeros the huge pages via CoW like normal
pages, then definitely not.

Even if the kernel doesn't pre-zero the pages, if DPDK takes care of
properly initializing memory structures on startup as they are carved out
of the huge pages, then it isn't a security hole. However, that approach is
susceptible to bit rot... You can audit the code and make sure everything
is kosher at first, but you have to worry about new code making assumptions
about how memory is initialized.


> This article speaks about "prezeroing optimizations" in Linux kernel:
>         http://landley.net/writing/memory-faq.txt


I read through that when I was trying to figure out whether huge pages
were pre-zeroed or not. It doesn't talk about huge pages much beyond why
they are useful for reducing TLB swaps.

Jay
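
The behaviour described above for normal shared memory (pages read as zero once touched) can be checked with a small sketch. It uses an ordinary anonymous shared mapping of one base page, not hugetlbfs, and the function name is hypothetical.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Map one fresh shared page, write a single byte to fault it in, then
 * check that every byte of the page reads back as zero.
 * Returns 1 if the page is all zeros, 0 if not, -1 if mapping failed.
 */
int fresh_shared_page_is_zeroed(void)
{
    long page_sz = sysconf(_SC_PAGESIZE);
    if (page_sz <= 0)
        return -1;

    unsigned char *p = mmap(NULL, (size_t)page_sz, PROT_READ | PROT_WRITE,
                            MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    p[0] = 0; /* the only write: this is what triggers the page fault */

    int ok = 1;
    for (long i = 0; i < page_sz; i++) {
        if (p[i] != 0) {
            ok = 0;
            break;
        }
    }

    munmap(p, (size_t)page_sz);
    return ok;
}
```

On Linux this is expected to return 1. It says nothing by itself about hugepages, which is the open question in the message above.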


* Re: [dpdk-dev] [PATCH] eal: decrease the memory init time with many hugepages setup
  2015-04-02 13:41   ` Jay Rolette
@ 2015-04-03  9:04     ` Gonzalez Monroy, Sergio
  2015-04-03  9:14       ` Thomas Monjalon
  0 siblings, 1 reply; 7+ messages in thread
From: Gonzalez Monroy, Sergio @ 2015-04-03  9:04 UTC (permalink / raw)
  To: Jay Rolette; +Cc: DPDK

On 02/04/2015 14:41, Jay Rolette wrote:
> On Thu, Apr 2, 2015 at 7:55 AM, Thomas Monjalon <thomas.monjalon@6wind.com>
> wrote:
>
>> 2015-04-02 19:30, jerry.lilijun@huawei.com:
>>> From: Lilijun <jerry.lilijun@huawei.com>
>>>
>>> In the function map_all_hugepages(), hugepage memory is truly allocated
>> by
>>> memset(virtaddr, 0, hugepage_sz). Then it costs about 40s to finish the
>>> dpdk memory initialization when 40000 2M hugepages are setup in host os.
>> Yes it's something we should try to reduce.
>>
> I have a patch in my tree that does the same opto, but it is commented out
> right now. In our case, 2/3's of the startup time for our entire app was
> due to that particular call - memset(virtaddr, 0, hugepage_sz). Just
> zeroing 1 byte per huge page reduces that by 30% in my tests.
>
> The only reason I have it commented out is that I didn't have time to make
> sure there weren't side-effects for DPDK or my app. For normal shared
> memory on Linux, pages are initialized to zero automatically once they are
> touched, so the memset isn't required but I wasn't sure whether that
> applied to huge pages. Also wasn't sure how hugetlbfs factored into the
> equation.
>
> Hopefully someone can chime in on that. Would love to uncomment the opto :)
>
I think the opto/patch is good ;)

I had a look at the Linux kernel sources (mm/hugetlb.c) and, at least since
2.6.32 (the minimum Linux kernel version supported by DPDK), the kernel clears
the hugepage (clear_huge_page) when it faults (hugetlb_no_page).

Primary DPDK apps do clear_hugedir, clearing previously allocated hugepages,
thus triggering hugepage faults (hugetlb_no_page) during map_all_hugepages.

Note that even when we exit a primary DPDK app, hugepages remain allocated,
which is why apps such as dump_cfg are able to retrieve config/memory
information.

Sergio
>> In fact we can only write one byte to finish  the allocation.
>>
>> Isn't it a security hole?
>>
> Not necessarily. If the kernel pre-zeros the huge pages via CoW like normal
> pages, then definitely not.
>
> Even if the kernel doesn't pre-zero the pages, if DPDK takes care of
> properly initializing memory structures on startup as they are carved out
> of the huge pages, then it isn't a security hole. However, that approach is
> susceptible to bit rot... You can audit the code and make sure everything
> is kosher at first, but you have to worry about new code making assumptions
> about how memory is initialized.
>
>
>> This article speaks about "prezeroing optimizations" in Linux kernel:
>>          http://landley.net/writing/memory-faq.txt
>
> I read through that when I was trying to figure out what whether huge pages
> were pre-zeroed or not. It doesn't talk about huge pages much beyond why
> they are useful for reducing TLB swaps.
>
> Jay
>
>
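
Sergio's observation that the kernel clears a hugepage when it faults can be probed from user space with a sketch like the one below. It makes several assumptions: it uses an anonymous MAP_HUGETLB mapping as a stand-in for the hugetlbfs file mappings DPDK creates, it assumes a 2 MiB default hugepage size, and the function and macro names are hypothetical. If no hugepage can be reserved, it reports that instead of guessing.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/* Assumed hugepage size; adjust if the system default differs. */
#define ASSUMED_HUGEPAGE_SZ (2UL * 1024 * 1024)

/*
 * Map one hugepage, write a single byte to fault it in, then scan the
 * whole page. Returns 1 if all bytes are zero, 0 if any is not, and -1
 * if a hugepage mapping is unavailable on this system.
 */
int touched_hugepage_is_zeroed(void)
{
#ifdef MAP_HUGETLB
    unsigned char *p = mmap(NULL, ASSUMED_HUGEPAGE_SZ,
                            PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
    if (p == MAP_FAILED)
        return -1; /* no hugepages reserved, or a different hugepage size */

    p[0] = 0; /* one-byte touch, as in the patch */

    int ok = 1;
    for (size_t i = 0; i < ASSUMED_HUGEPAGE_SZ; i++) {
        if (p[i] != 0) {
            ok = 0;
            break;
        }
    }

    munmap(p, ASSUMED_HUGEPAGE_SZ);
    return ok;
#else
    return -1; /* MAP_HUGETLB not defined on this platform */
#endif
}
```

A result of 1 is consistent with the kernel zeroing the page at fault time; a result of -1 only means the check could not be run.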


* Re: [dpdk-dev] [PATCH] eal: decrease the memory init time with many hugepages setup
  2015-04-03  9:04     ` Gonzalez Monroy, Sergio
@ 2015-04-03  9:14       ` Thomas Monjalon
  2015-04-03  9:37         ` Lilijun
  2015-04-03 12:00         ` Gonzalez Monroy, Sergio
  0 siblings, 2 replies; 7+ messages in thread
From: Thomas Monjalon @ 2015-04-03  9:14 UTC (permalink / raw)
  To: Gonzalez Monroy, Sergio, Lilijun; +Cc: dev

2015-04-03 10:04, Gonzalez Monroy, Sergio:
> On 02/04/2015 14:41, Jay Rolette wrote:
> > On Thu, Apr 2, 2015 at 7:55 AM, Thomas Monjalon <thomas.monjalon@6wind.com>
> > wrote:
> >
> >> 2015-04-02 19:30, jerry.lilijun@huawei.com:
> >>> From: Lilijun <jerry.lilijun@huawei.com>
> >>>
> >>> In the function map_all_hugepages(), hugepage memory is truly allocated
> >> by
> >>> memset(virtaddr, 0, hugepage_sz). Then it costs about 40s to finish the
> >>> dpdk memory initialization when 40000 2M hugepages are setup in host os.
> >> Yes it's something we should try to reduce.
> >>
> > I have a patch in my tree that does the same opto, but it is commented out
> > right now. In our case, 2/3's of the startup time for our entire app was
> > due to that particular call - memset(virtaddr, 0, hugepage_sz). Just
> > zeroing 1 byte per huge page reduces that by 30% in my tests.
> >
> > The only reason I have it commented out is that I didn't have time to make
> > sure there weren't side-effects for DPDK or my app. For normal shared
> > memory on Linux, pages are initialized to zero automatically once they are
> > touched, so the memset isn't required but I wasn't sure whether that
> > applied to huge pages. Also wasn't sure how hugetlbfs factored into the
> > equation.
> >
> > Hopefully someone can chime in on that. Would love to uncomment the opto :)
> >
> I think the opto/patch is good ;)
> 
> I had a look at the Linux kernel sources (mm/hugetlb.c)and at least 
> since 2.6.32 (minimum
> Linux kernel version supported by DPDK) the kernel clears the hugepage 
> (clear_huge_page)
> when it faults (hugetlb_no_page).
> 
> Primary DPDK apps do clear_hugedir, clearing previously allocated 
> hugepages, thus triggering
> hugepage faults (hugetlb_no_page) during map_all_hugepages.
> 
> Note that even when we exit a primary DPDK app, hugepages remain 
> allocated, reason why
> apps such as dump_cfg are able to retrieve config/memory information.

OK, thanks Sergio.

So the patch should add a comment explaining that the memset is there to
trigger the page fault, and why 1 byte is enough.
I think we should also consider the remap_all_hugepages() function.

> >> Isn't it a security hole?
> >>
> > Not necessarily. If the kernel pre-zeros the huge pages via CoW like normal
> > pages, then definitely not.
> >
> > Even if the kernel doesn't pre-zero the pages, if DPDK takes care of
> > properly initializing memory structures on startup as they are carved out
> > of the huge pages, then it isn't a security hole. However, that approach is
> > susceptible to bit rot... You can audit the code and make sure everything
> > is kosher at first, but you have to worry about new code making assumptions
> > about how memory is initialized.
> >
> >> This article speaks about "prezeroing optimizations" in Linux kernel:
> >>          http://landley.net/writing/memory-faq.txt
> >
> > I read through that when I was trying to figure out what whether huge pages
> > were pre-zeroed or not. It doesn't talk about huge pages much beyond why
> > they are useful for reducing TLB swaps.
> >
> > Jay
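
One possible shape for the comment requested above, written as a standalone helper so it can be compiled outside the EAL. This is a sketch only: the helper name is hypothetical, and the wording of any comment in a follow-up patch may differ.

```c
#include <string.h>

/*
 * Fault in the page that starts at virtaddr.
 *
 * Writing a single byte is enough to make the kernel allocate the page.
 * Per the discussion in this thread, the kernel clears a hugepage when
 * it is first faulted in, so zeroing the whole page again from user
 * space would only repeat that work.
 */
void fault_in_page(void *virtaddr)
{
    memset(virtaddr, 0, 1);
}
```

In map_all_hugepages() this would correspond to the single changed line in the patch, with the comment placed directly above it.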


* Re: [dpdk-dev] [PATCH] eal: decrease the memory init time with many hugepages setup
  2015-04-03  9:14       ` Thomas Monjalon
@ 2015-04-03  9:37         ` Lilijun
  2015-04-03 12:00         ` Gonzalez Monroy, Sergio
  1 sibling, 0 replies; 7+ messages in thread
From: Lilijun @ 2015-04-03  9:37 UTC (permalink / raw)
  To: Thomas Monjalon, Gonzalez Monroy, Sergio; +Cc: dev

On 2015/4/3 17:14, Thomas Monjalon wrote:
> 2015-04-03 10:04, Gonzalez Monroy, Sergio:
>> On 02/04/2015 14:41, Jay Rolette wrote:
>>> On Thu, Apr 2, 2015 at 7:55 AM, Thomas Monjalon <thomas.monjalon@6wind.com>
>>> wrote:
>>>
>>>> 2015-04-02 19:30, jerry.lilijun@huawei.com:
>>>>> From: Lilijun <jerry.lilijun@huawei.com>
>>>>>
>>>>> In the function map_all_hugepages(), hugepage memory is truly allocated
>>>> by
>>>>> memset(virtaddr, 0, hugepage_sz). Then it costs about 40s to finish the
>>>>> dpdk memory initialization when 40000 2M hugepages are setup in host os.
>>>> Yes it's something we should try to reduce.
>>>>
>>> I have a patch in my tree that does the same opto, but it is commented out
>>> right now. In our case, 2/3's of the startup time for our entire app was
>>> due to that particular call - memset(virtaddr, 0, hugepage_sz). Just
>>> zeroing 1 byte per huge page reduces that by 30% in my tests.
>>>
>>> The only reason I have it commented out is that I didn't have time to make
>>> sure there weren't side-effects for DPDK or my app. For normal shared
>>> memory on Linux, pages are initialized to zero automatically once they are
>>> touched, so the memset isn't required but I wasn't sure whether that
>>> applied to huge pages. Also wasn't sure how hugetlbfs factored into the
>>> equation.
>>>
>>> Hopefully someone can chime in on that. Would love to uncomment the opto :)
>>>
>> I think the opto/patch is good ;)
>>
>> I had a look at the Linux kernel sources (mm/hugetlb.c)and at least 
>> since 2.6.32 (minimum
>> Linux kernel version supported by DPDK) the kernel clears the hugepage 
>> (clear_huge_page)
>> when it faults (hugetlb_no_page).
>>
>> Primary DPDK apps do clear_hugedir, clearing previously allocated 
>> hugepages, thus triggering
>> hugepage faults (hugetlb_no_page) during map_all_hugepages.
>>
>> Note that even when we exit a primary DPDK app, hugepages remain 
>> allocated, reason why
>> apps such as dump_cfg are able to retrieve config/memory information.
> 
> OK, thanks Sergio.
> 
> So the patch should add a comment to explain page fault reason of memset and
> why 1 byte is enough.
> I think we should also consider remap_all_hugepages() function.

Thanks very much.
I will update the comments and send it again.


> 
>>>> Isn't it a security hole?
>>>>
>>> Not necessarily. If the kernel pre-zeros the huge pages via CoW like normal
>>> pages, then definitely not.
>>>
>>> Even if the kernel doesn't pre-zero the pages, if DPDK takes care of
>>> properly initializing memory structures on startup as they are carved out
>>> of the huge pages, then it isn't a security hole. However, that approach is
>>> susceptible to bit rot... You can audit the code and make sure everything
>>> is kosher at first, but you have to worry about new code making assumptions
>>> about how memory is initialized.
>>>
>>>> This article speaks about "prezeroing optimizations" in Linux kernel:
>>>>          http://landley.net/writing/memory-faq.txt
>>>
>>> I read through that when I was trying to figure out what whether huge pages
>>> were pre-zeroed or not. It doesn't talk about huge pages much beyond why
>>> they are useful for reducing TLB swaps.
>>>
>>> Jay
> 
> 
> 
> .
> 


* Re: [dpdk-dev] [PATCH] eal: decrease the memory init time with many hugepages setup
  2015-04-03  9:14       ` Thomas Monjalon
  2015-04-03  9:37         ` Lilijun
@ 2015-04-03 12:00         ` Gonzalez Monroy, Sergio
  1 sibling, 0 replies; 7+ messages in thread
From: Gonzalez Monroy, Sergio @ 2015-04-03 12:00 UTC (permalink / raw)
  To: Thomas Monjalon, Lilijun; +Cc: dev

On 03/04/2015 10:14, Thomas Monjalon wrote:
> 2015-04-03 10:04, Gonzalez Monroy, Sergio:
>> On 02/04/2015 14:41, Jay Rolette wrote:
>>> On Thu, Apr 2, 2015 at 7:55 AM, Thomas Monjalon <thomas.monjalon@6wind.com>
>>> wrote:
>>>
>>>> 2015-04-02 19:30, jerry.lilijun@huawei.com:
>>>>> From: Lilijun <jerry.lilijun@huawei.com>
>>>>>
>>>>> In the function map_all_hugepages(), hugepage memory is truly allocated
>>>> by
>>>>> memset(virtaddr, 0, hugepage_sz). Then it costs about 40s to finish the
>>>>> dpdk memory initialization when 40000 2M hugepages are setup in host os.
>>>> Yes it's something we should try to reduce.
>>>>
>>> I have a patch in my tree that does the same opto, but it is commented out
>>> right now. In our case, 2/3's of the startup time for our entire app was
>>> due to that particular call - memset(virtaddr, 0, hugepage_sz). Just
>>> zeroing 1 byte per huge page reduces that by 30% in my tests.
>>>
>>> The only reason I have it commented out is that I didn't have time to make
>>> sure there weren't side-effects for DPDK or my app. For normal shared
>>> memory on Linux, pages are initialized to zero automatically once they are
>>> touched, so the memset isn't required but I wasn't sure whether that
>>> applied to huge pages. Also wasn't sure how hugetlbfs factored into the
>>> equation.
>>>
>>> Hopefully someone can chime in on that. Would love to uncomment the opto :)
>>>
>> I think the opto/patch is good ;)
>>
>> I had a look at the Linux kernel sources (mm/hugetlb.c)and at least
>> since 2.6.32 (minimum
>> Linux kernel version supported by DPDK) the kernel clears the hugepage
>> (clear_huge_page)
>> when it faults (hugetlb_no_page).
>>
>> Primary DPDK apps do clear_hugedir, clearing previously allocated
>> hugepages, thus triggering
>> hugepage faults (hugetlb_no_page) during map_all_hugepages.
>>
>> Note that even when we exit a primary DPDK app, hugepages remain
>> allocated, reason why
>> apps such as dump_cfg are able to retrieve config/memory information.
> OK, thanks Sergio.
>
> So the patch should add a comment to explain page fault reason of memset and
> why 1 byte is enough.
> I think we should also consider remap_all_hugepages() function.
Good point!
You are right, I don't think we would even need to do memset at all in
remap_all_hugepages, as we have already touched/allocated those pages.

Sergio
>>>> Isn't it a security hole?
>>>>
>>> Not necessarily. If the kernel pre-zeros the huge pages via CoW like normal
>>> pages, then definitely not.
>>>
>>> Even if the kernel doesn't pre-zero the pages, if DPDK takes care of
>>> properly initializing memory structures on startup as they are carved out
>>> of the huge pages, then it isn't a security hole. However, that approach is
>>> susceptible to bit rot... You can audit the code and make sure everything
>>> is kosher at first, but you have to worry about new code making assumptions
>>> about how memory is initialized.
>>>
>>>> This article speaks about "prezeroing optimizations" in Linux kernel:
>>>>           http://landley.net/writing/memory-faq.txt
>>> I read through that when I was trying to figure out what whether huge pages
>>> were pre-zeroed or not. It doesn't talk about huge pages much beyond why
>>> they are useful for reducing TLB swaps.
>>>
>>> Jay
>
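
The point about remap_all_hugepages can be illustrated with a sketch: once a page of a file has been touched through one mapping, a second mapping of the same file sees the same page without any further write. The example assumes an ordinary temporary file in place of a hugetlbfs file; the function name is hypothetical, and the variable names only echo the orig_va and final_va fields from the patch.

```c
#include <stddef.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Touch a file-backed page through a first mapping, then map the same
 * file again and read it without writing.
 * Returns 1 if the second mapping sees the first mapping's data,
 * 0 if not, -1 on setup failure.
 */
int second_mapping_sees_first(void)
{
    long page_sz = sysconf(_SC_PAGESIZE);
    if (page_sz <= 0)
        return -1;

    FILE *f = tmpfile();
    if (f == NULL)
        return -1;

    int fd = fileno(f);
    if (ftruncate(fd, page_sz) != 0) {
        fclose(f);
        return -1;
    }

    unsigned char *orig_va = mmap(NULL, (size_t)page_sz,
                                  PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (orig_va == MAP_FAILED) {
        fclose(f);
        return -1;
    }
    orig_va[0] = 0x5a; /* first touch: the page is allocated here */

    unsigned char *final_va = mmap(NULL, (size_t)page_sz,
                                   PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (final_va == MAP_FAILED) {
        munmap(orig_va, (size_t)page_sz);
        fclose(f);
        return -1;
    }

    /* No write through final_va: the page already exists. */
    int ok = (final_va[0] == 0x5a);

    munmap(final_va, (size_t)page_sz);
    munmap(orig_va, (size_t)page_sz);
    fclose(f);
    return ok;
}
```

A return value of 1 shows the second mapping reusing the already-allocated page, which is the reason given above for dropping the memset in the remap step.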


Thread overview: 7+ messages
2015-04-02 11:30 [dpdk-dev] [PATCH] eal: decrease the memory init time with many hugepages setup jerry.lilijun
2015-04-02 12:55 ` Thomas Monjalon
2015-04-02 13:41   ` Jay Rolette
2015-04-03  9:04     ` Gonzalez Monroy, Sergio
2015-04-03  9:14       ` Thomas Monjalon
2015-04-03  9:37         ` Lilijun
2015-04-03 12:00         ` Gonzalez Monroy, Sergio
