DPDK usage discussions
* [dpdk-users] DPDK program huge core file size
From: James Huang @ 2021-02-19 19:18 UTC
  To: users

On CentOS 7, we observed that our program (based on DPDK 19.11) produces a
huge core file, 100+ GB, far bigger than the expected <4 GB, even though the
system has only 16 GB of memory installed and hugepages of 1 GB size are
allocated at boot time. This happens whether the core file is created by a
program crash (segfault) or by running the gcore tool.

On CentOS 6, with the program based on DPDK 17.05, the core file is the
expected size.

On CentOS 7, we tried various combinations of the process's coredump_filter
bits and found that only clearing bit 0 avoids the huge core file. However,
with bit 0 cleared the core file is small (200 MB) and useless for debugging,
e.g. the gdb bt command produces no output.

Is there a way to avoid dumping the hugepage memory while keeping the rest
of the process memory in the core file?

The following compares the program's pmap output.
On CentOS 6, the hugepage is mapped at a high address in the process's
address space:
...
00007f4e80000000 1048576K rw-s-  /mnt/huge_1GB/rtemap_0
00007f4ec0000000    2048K rw-s-  /sys/devices/pci0000:00/0000:00:02.0/0000:04:00.0/resource0
00007f4ec0200000      16K rw-s-  /sys/devices/pci0000:00/0000:00:02.0/0000:04:00.0/resource4
00007f4ec0204000    2048K rw-s-  /sys/devices/pci0000:00/0000:00:02.0/0000:04:00.1/resource0
00007f4ec0404000      16K rw-s-  /sys/devices/pci0000:00/0000:00:02.0/0000:04:00.1/resource4
...


On CentOS 7, the hugepage is mapped at a low address in the process's
address space, next to large read-only anonymous regions:
...
0000000100000000     20K rw-s- config
0000000100005000    184K rw-s- fbarray_memzone
0000000100033000      4K rw-s- fbarray_memseg-1048576k-0-0
0000000140000000 1048576K rw-s- rtemap_0
0000000180000000 32505856K r----   [ anon ]
0000000940000000      4K rw-s- fbarray_memseg-1048576k-0-1
0000000980000000 33554432K r----   [ anon ]
0000001180000000      4K rw-s- fbarray_memseg-1048576k-0-2
00000011c0000000 33554432K r----   [ anon ]
00000019c0000000      4K rw-s- fbarray_memseg-1048576k-0-3
0000001a00000000 33554432K r----   [ anon ]
0000002200000000   1024K rw-s- resource0
0000002200100000     16K rw-s- resource3
0000002200104000   1024K rw-s- resource0
0000002200204000     16K rw-s- resource3
...

Thanks,
-James


* Re: [dpdk-users] DPDK program huge core file size
From: James Huang @ 2021-02-23 19:22 UTC
  To: users

UPDATE: a core file triggered with 'kill -6' does not include the hugepage
memory zone.

Is there a way to keep the hugepage memory zone out of the core file when
running the gcore command?



* Re: [dpdk-users] DPDK program huge core file size
From: Li Feng @ 2021-02-24  3:59 UTC
  To: James Huang; +Cc: users

I think you should update your DPDK to the latest version.
I fixed this issue some months ago.

d72e4042c - mem: exclude unused memory from core dump

Thanks,
Feng Li


* Re: [dpdk-users] DPDK program huge core file size
From: James Huang @ 2021-02-25 17:37 UTC
  To: Li Feng; +Cc: users

@feng, thank you for the info. We'll pull the fix from
eal_common_memory.c:eal_get_virtual_area() and give it a try.

Regards,
James Huang



* Re: [dpdk-users] DPDK program huge core file size
From: James Huang @ 2021-02-25 18:23 UTC
  To: Li Feng; +Cc: users

UPDATE: yes, the fix works; it uses the madvise() system call.

Regards,
James Huang



* Re: [dpdk-users] DPDK program huge core file size
From: David Marchand @ 2021-02-26 16:00 UTC
  To: Li Feng, James Huang; +Cc: users

On Wed, Feb 24, 2021 at 5:00 AM Li Feng <fengli@smartx.com> wrote:
>
> I think you should update your dpdk to the latest.
> I have fixed this issue some months ago.
>
> d72e4042c - mem: exclude unused memory from core dump

No need to go to the latest release; this commit was recently backported
to 19.11.
http://git.dpdk.org/dpdk-stable/commit/?h=19.11&id=d3ceba92eff783d2c995387e5ed8509045578748

You can wait for the next 19.11.x release.


-- 
David Marchand

