Using the lslocks command on Linux, I see that the primary holds a lock on /mnt/huge2M and the secondary is waiting for a lock on the same directory:

SECONDARY 2416270 FLOCK WRITE* 0 0 0 /mnt/huge2M... 2416174
PRIMARY   2416174 FLOCK WRITE  0 0 0 /mnt/huge2M...

Is a PRIMARY supposed to hold a permanent lock on /mnt/huge2M?

On Wed, 24 Aug 2022 at 11:18, Anna Tauzzi wrote:

> I already tried the first suggestion with no luck; the secondary always gets
> stuck:
>
> #0 0x00007fc6d3eb05ab in flock () at ../sysdeps/unix/syscall-template.S:78
> #1 0x00007fc6d3ba1343 in sync_walk () from /usr/local/lib/librte_eal.so.22
> #2 0x00007fc6d3b8402b in rte_memseg_list_walk_thread_unsafe () from /usr/local/lib/librte_eal.so.22
> #3 0x00007fc6d3ba18bf in eal_memalloc_sync_with_primary () from /usr/local/lib/librte_eal.so.22
> #4 0x00007fc6d3ba24b5 in rte_eal_hugepage_attach () from /usr/local/lib/librte_eal.so.22
> #5 0x00007fc6d3b848f1 in rte_eal_memory_init () from /usr/local/lib/librte_eal.so.22
> #6 0x00007fc6d3b782aa in rte_eal_init.cold () from /usr/local/lib/librte_eal.so.22
>
> As for the second question: if I prevent the primary from allocating on the
> NUMA node where the secondary is running, the secondary doesn't get stuck.
>
> On Wed, 24 Aug 2022 at 11:14, Antonio Di Bacco <a.dibacco.ks@gmail.com> wrote:
>
>> Can you try launching the secondary with some delay, so that it does not
>> overlap with memory allocations done in the primary?
>> Is your primary allocating memory on NUMA 0, where the secondary is
>> running?
>>
>> On Tue, Aug 23, 2022 at 4:54 PM Anna Tauzzi wrote:
>> >
>> > I have a primary process that spawns a secondary process. The primary is on
>> > NUMA 1 while the secondary is on NUMA 0.
>> > The secondary process starts up, but when calling rte_eal_init it gets
>> > stuck with this backtrace:
>> >
>> > flock()
>> > sync_walk()
>> > rte_memseg_list_walk_thread_unsafe()
>> > eal_memalloc_sync_with_primary()
>> > rte_eal_hugepage_attach()
>> > rte_eal_memory_init()
>> > rte_eal_init.cold()
>> >
>> > While the secondary is starting, it is possible that the primary is
>> > allocating memory on different NUMA nodes. I'm saying this because if, in
>> > the primary, I replace the DPDK memory allocation function (rte_zalloc...)
>> > with a plain memalign, I don't get this problem.
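
The lslocks observation at the top of this thread could also be cross-checked from a small helper program. The following is only a sketch: probe_flock is a hypothetical function, not part of DPDK, and it assumes the lock in question is a plain flock() on the path, as the FLOCK entries and the flock() frame in the backtrace suggest. It tries a non-blocking exclusive lock, so it never hangs the way the secondary does.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of DPDK): try to take an exclusive flock on
 * `path` without blocking.
 * Returns 0 if the lock was free (it is released again before returning),
 *         1 if another open file description holds a conflicting lock,
 *        -1 on any other error (errno is set by open() or flock()).
 */
int probe_flock(const char *path)
{
    int result;
    int saved_errno;
    int fd = open(path, O_RDONLY); /* O_RDONLY also works for a directory */

    if (fd < 0)
        return -1;

    if (flock(fd, LOCK_EX | LOCK_NB) == 0) {
        /* Nobody held the lock: drop it again right away. */
        flock(fd, LOCK_UN);
        result = 0;
    } else if (errno == EWOULDBLOCK) {
        /* Someone else holds it; a blocking flock() would wait here. */
        result = 1;
    } else {
        result = -1;
    }

    /* Preserve errno across close() so the caller can inspect it. */
    saved_errno = errno;
    close(fd);
    errno = saved_errno;
    return result;
}
```

Calling probe_flock("/mnt/huge2M") from a separate process while the secondary is stuck should return 1 for as long as the primary still holds the lock. Polling it over time would show whether the lock is released between allocations or held continuously.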