DPDK usage discussions
* Secondary process stuck in rte_eal_memory_init
@ 2022-08-23 14:54 Anna Tauzzi
  2022-08-24  9:14 ` Antonio Di Bacco
  0 siblings, 1 reply; 4+ messages in thread
From: Anna Tauzzi @ 2022-08-23 14:54 UTC
  To: users

I have a primary process that spawns a secondary process. The primary runs
on NUMA node 1 and the secondary on NUMA node 0.
The secondary process starts up, but when it calls rte_eal_init() it gets
stuck with this backtrace:

flock()
sync_walk()
rte_memseg_list_walk_thread_unsafe()
eal_memalloc_sync_with_primary()
rte_eal_hugepage_attach()
rte_eal_memory_init()
rte_eal_init.cold()

While the secondary is starting, the primary may be allocating memory on
different NUMA nodes. I suspect this because the problem goes away if, in
the primary, I replace the DPDK memory allocation functions (rte_zalloc...)
with a plain memalign.

* Re: Secondary process stuck in rte_eal_memory_init
  2022-08-23 14:54 Secondary process stuck in rte_eal_memory_init Anna Tauzzi
@ 2022-08-24  9:14 ` Antonio Di Bacco
  2022-08-24  9:18   ` Anna Tauzzi
  0 siblings, 1 reply; 4+ messages in thread
From: Antonio Di Bacco @ 2022-08-24  9:14 UTC
  To: Anna Tauzzi; +Cc: users

Can you try launching the secondary with some delay in order not to
overlap with memory allocations done in the primary?
Is your primary allocating memory on NUMA 0 where the secondary is running?
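
The delayed launch suggested above could be sketched roughly as follows.
This is only a minimal illustration in plain POSIX C, not DPDK code:
spawn_after_delay is a hypothetical helper name, and the delay value and
the secondary's command line are left to the caller. The wait happens in
the forked child, so the primary is free to carry on with its own
allocations in the meantime.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of DPDK): fork a child that sleeps for
 * delay_sec seconds and then execs the program described by argv.
 * Returns the child's pid in the parent, or -1 if fork() fails.
 */
pid_t spawn_after_delay(unsigned int delay_sec, char *const argv[])
{
    assert(argv != NULL && argv[0] != NULL);

    pid_t pid = fork();
    if (pid == 0) {
        /* Child: wait first, then replace ourselves with the target. */
        sleep(delay_sec);
        execvp(argv[0], argv);
        _exit(127); /* only reached if execvp() failed */
    }
    return pid; /* parent: child's pid, or -1 on fork failure */
}
```

The parent would typically keep the returned pid and later call waitpid()
on it to collect the secondary's exit status.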

* Re: Secondary process stuck in rte_eal_memory_init
  2022-08-24  9:14 ` Antonio Di Bacco
@ 2022-08-24  9:18   ` Anna Tauzzi
  2022-08-24 10:11     ` Anna Tauzzi
  0 siblings, 1 reply; 4+ messages in thread
From: Anna Tauzzi @ 2022-08-24  9:18 UTC
  To: Antonio Di Bacco; +Cc: users

I already tried the first suggestion with no luck; the secondary always
gets stuck:

#0  0x00007fc6d3eb05ab in flock () at ../sysdeps/unix/syscall-template.S:78
#1  0x00007fc6d3ba1343 in sync_walk () from /usr/local/lib/librte_eal.so.22
#2  0x00007fc6d3b8402b in rte_memseg_list_walk_thread_unsafe () from /usr/local/lib/librte_eal.so.22
#3  0x00007fc6d3ba18bf in eal_memalloc_sync_with_primary () from /usr/local/lib/librte_eal.so.22
#4  0x00007fc6d3ba24b5 in rte_eal_hugepage_attach () from /usr/local/lib/librte_eal.so.22
#5  0x00007fc6d3b848f1 in rte_eal_memory_init () from /usr/local/lib/librte_eal.so.22
#6  0x00007fc6d3b782aa in rte_eal_init.cold () from /usr/local/lib/librte_eal.so.22

As for the second question: if I prevent the primary from allocating on the
NUMA node where the secondary is running, the secondary doesn't get stuck.





* Re: Secondary process stuck in rte_eal_memory_init
  2022-08-24  9:18   ` Anna Tauzzi
@ 2022-08-24 10:11     ` Anna Tauzzi
  0 siblings, 0 replies; 4+ messages in thread
From: Anna Tauzzi @ 2022-08-24 10:11 UTC
  To: Antonio Di Bacco; +Cc: users

Using the lslocks command on Linux, I see that the primary holds a lock on
/mnt/huge2M and the secondary is waiting for a lock on the same directory.

SECONDARY  2416270 FLOCK      WRITE* 0     0          0 /mnt/huge2M... 2416174
PRIMARY    2416174 FLOCK      WRITE  0     0          0 /mnt/huge2M...

Is the primary supposed to hold a permanent lock on /mnt/huge2M?
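
For anyone wanting to check the same thing from code rather than with
lslocks, here is a minimal sketch of a non-blocking flock() probe. It is
plain POSIX/Linux C, not DPDK code: probe_flock is a hypothetical helper
name, and the path to pass in is whatever lslocks reports on your system.
Unlike the blocking flock() in the backtrace, LOCK_NB makes the call
return immediately when another process holds the lock. Note that when the
lock is free, the probe takes it for a brief moment before releasing it.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of DPDK): check whether an exclusive
 * flock() on `path` could be taken right now, without blocking.
 *
 * Returns  1 if the lock was free (it is taken and released at once),
 *          0 if another open file description holds a conflicting lock,
 *         -1 on any other error (errno is set).
 */
int probe_flock(const char *path)
{
    assert(path != NULL);

    /* O_RDONLY works for both regular files and directories. */
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    int result;
    if (flock(fd, LOCK_EX | LOCK_NB) == 0) {
        flock(fd, LOCK_UN);     /* lock was free: give it back */
        result = 1;
    } else if (errno == EWOULDBLOCK) {
        result = 0;             /* somebody else holds the lock */
    } else {
        result = -1;            /* unexpected failure */
    }

    int saved_errno = errno;    /* close() may overwrite errno */
    close(fd);
    errno = saved_errno;
    return result;
}
```

A return value of 0 for the hugepage directory would match the WRITE*
(waiting) entry shown in the lslocks output above.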




