DPDK usage discussions
 help / color / mirror / Atom feed
From: "Burakov, Anatoly" <anatoly.burakov@intel.com>
To: Tom Barbette <barbette@kth.se>,
	Shahaf Shuler <shahafs@mellanox.com>,
	Yongseok Koh <yskoh@mellanox.com>,
	Raslan Darawsheh <rasland@mellanox.com>,
	Thomas Monjalon <thomas@monjalon.net>,
	"Iremonger, Bernard" <bernard.iremonger@intel.com>
Cc: "users@dpdk.org" <users@dpdk.org>
Subject: Re: [dpdk-users] Unregistered mempool in secondary
Date: Thu, 6 Dec 2018 15:39:45 +0000	[thread overview]
Message-ID: <C6ECDF3AB251BE4894318F4E45123697824FC9F1@IRSMSX109.ger.corp.intel.com> (raw)
In-Reply-To: <1544110125447.14917@kth.se>

I'm not familiar enough with the mlx5 stuff to offer anything useful here. From the memory side, everything looks entirely unsuspicious, except for the fact that you shouldn't run your secondary process with the same coremask as the primary (it will lead to mempool cache corruption, among other things).

Thanks,
Anatoly
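As an illustration of the coremask advice above, the two processes can be given disjoint core lists on the EAL command line. This is only a sketch: the binary name is a placeholder and the exact core split is an assumption; the PCI addresses are the ones from this thread.

```shell
# Primary on cores 0-7 (hypothetical application binary "./app").
sudo ./app -l 0-7 -w 0000:03:00.0 -w 0000:82:00.0 --proc-type=primary

# Secondary on a disjoint core list (8-15 here); per-lcore mempool
# caches are indexed by lcore id, so overlapping core lists would let
# the two processes read and write the same cache slots concurrently.
sudo ./app -l 8-15 -w 0000:03:00.0 -w 0000:82:00.0 --proc-type=secondary
```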


> -----Original Message-----
> From: Tom Barbette [mailto:barbette@kth.se]
> Sent: Thursday, December 6, 2018 3:29 PM
> To: Burakov, Anatoly <anatoly.burakov@intel.com>; Shahaf Shuler
> <shahafs@mellanox.com>; Yongseok Koh <yskoh@mellanox.com>; Raslan
> Darawsheh <rasland@mellanox.com>; Thomas Monjalon
> <thomas@monjalon.net>; Iremonger, Bernard
> <bernard.iremonger@intel.com>
> Cc: users@dpdk.org
> Subject: RE: Unregistered mempool in secondary
> 
> Some more info:
> - --legacy-mem does not solve the issue (had to try...)
> - I can confirm it *does* work when we use two ports from the same device,
> that is -w 03:00.0 and -w 03:00.1 instead of the second being 82:00.0. If ports
> are totally isolated with regard to memory regions, then this is indeed an
> issue caused by using multiple NUMA nodes in the secondary.
> - It fails as soon as a packet flows from the left port (socket 0) to the right
> port (socket 1).
> 
> Below is the log of the primary :
> EAL: Detected lcore 0 as core 0 on socket 0 ...
> EAL: Detected lcore 14 as core 7 on socket 0
> EAL: Detected lcore 15 as core 7 on socket 1
> EAL: Support maximum 128 logical core(s) by configuration.
> EAL: Detected 16 lcore(s)
> EAL: Detected 2 NUMA nodes
> EAL: RTE Version: 'DPDK 18.11.0'
> EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
> EAL: Module /sys/module/vfio_pci not found! error 2 (No such file or
> directory)
> EAL: VFIO PCI modules not loaded
> EAL: DPAA Bus not present. Skipping.
> EAL: Probing VFIO support...
> EAL: Module /sys/module/vfio not found! error 2 (No such file or directory)
> EAL: VFIO modules not loaded, skipping VFIO support...
> EAL: Ask a virtual area of 0x2e000 bytes
> EAL: Virtual area found at 0x600000000000 (size = 0x2e000)
> EAL: Setting up physically contiguous memory...
> EAL: Setting maximum number of open files to 1048576
> EAL: Detected memory type: socket_id:0 hugepage_sz:1073741824
> EAL: Detected memory type: socket_id:1 hugepage_sz:1073741824
> EAL: Creating 4 segment lists: n_segs:32 socket_id:0
> hugepage_sz:1073741824
> EAL: Ask a virtual area of 0x1000 bytes
> EAL: Virtual area found at 0x60000002e000 (size = 0x1000)
> EAL: Memseg list allocated: 0x100000kB at socket 0
> EAL: Ask a virtual area of 0x800000000 bytes
> EAL: Virtual area found at 0x600040000000 (size = 0x800000000)
> EAL: Ask a virtual area of 0x1000 bytes
> EAL: Virtual area found at 0x600840000000 (size = 0x1000)
> EAL: Memseg list allocated: 0x100000kB at socket 0
> EAL: Ask a virtual area of 0x800000000 bytes
> EAL: Virtual area found at 0x600880000000 (size = 0x800000000)
> EAL: Ask a virtual area of 0x1000 bytes
> EAL: Virtual area found at 0x601080000000 (size = 0x1000)
> EAL: Memseg list allocated: 0x100000kB at socket 0
> EAL: Ask a virtual area of 0x800000000 bytes
> EAL: Virtual area found at 0x6010c0000000 (size = 0x800000000)
> EAL: Ask a virtual area of 0x1000 bytes
> EAL: Virtual area found at 0x6018c0000000 (size = 0x1000)
> EAL: Memseg list allocated: 0x100000kB at socket 0
> EAL: Ask a virtual area of 0x800000000 bytes
> EAL: Virtual area found at 0x601900000000 (size = 0x800000000)
> EAL: Creating 4 segment lists: n_segs:32 socket_id:1
> hugepage_sz:1073741824
> EAL: Ask a virtual area of 0x1000 bytes
> EAL: Virtual area found at 0x602100000000 (size = 0x1000) ...
> EAL: TSC frequency is ~3200000 KHz
> EAL: Master lcore 0 is ready (tid=7f51acf2ac00;cpuset=[0])
> EAL: lcore 8 is ready (tid=7f51a0795700;cpuset=[8])
> EAL: lcore 7 is ready (tid=7f51a0f96700;cpuset=[7])
> EAL: lcore 6 is ready (tid=7f51a1797700;cpuset=[6])
> EAL: lcore 9 is ready (tid=7f519ff94700;cpuset=[9])
> EAL: lcore 3 is ready (tid=7f51a2f9a700;cpuset=[3])
> EAL: lcore 1 is ready (tid=7f51a3f9c700;cpuset=[1])
> EAL: lcore 2 is ready (tid=7f51a379b700;cpuset=[2])
> EAL: lcore 11 is ready (tid=7f519ef92700;cpuset=[11])
> EAL: lcore 10 is ready (tid=7f519f793700;cpuset=[10])
> EAL: lcore 4 is ready (tid=7f51a2799700;cpuset=[4])
> EAL: lcore 5 is ready (tid=7f51a1f98700;cpuset=[5])
> EAL: lcore 13 is ready (tid=7f519df90700;cpuset=[13])
> EAL: lcore 12 is ready (tid=7f519e791700;cpuset=[12])
> EAL: lcore 15 is ready (tid=7f519cf8e700;cpuset=[15])
> EAL: lcore 14 is ready (tid=7f519d78f700;cpuset=[14])
> EAL: Trying to obtain current memory policy.
> EAL: Setting policy MPOL_PREFERRED for socket 0
> EAL: Restoring previous memory policy: 0
> EAL: request: mp_malloc_sync
> EAL: Heap on socket 0 was expanded by 1024MB
> EAL: PCI device 0000:03:00.0 on NUMA socket 0
> EAL:   probe driver: 15b3:1017 net_mlx5
> EAL: Mem event callback 'MLX5_MEM_EVENT_CB:(nil)' registered
> EAL: PCI device 0000:82:00.0 on NUMA socket 1
> EAL:   probe driver: 15b3:1017 net_mlx5
> EAL: Trying to obtain current memory policy.
> EAL: Setting policy MPOL_PREFERRED for socket 1
> EAL: Restoring previous memory policy: 0
> EAL: Calling mem event callback 'MLX5_MEM_EVENT_CB:(nil)'
> EAL: request: mp_malloc_sync
> EAL: Heap on socket 1 was expanded by 1024MB
> EAL: Module /sys/module/vfio not found! error 2 (No such file or directory)
> ...
> fd0 : using queues from 0 to 15
> fd1 : using queues from 0 to 15
> Initializing DPDK
> ... (at this point queues and devices are started, but no rte_eth_rx_burst is
> called) ...
> 
> 
> Below is the log of the secondary :
> 
> EAL: Detected lcore 0 as core 0 on socket 0
> EAL: Detected lcore 1 as core 0 on socket 1 ...
> EAL: Detected lcore 14 as core 7 on socket 0
> EAL: Detected lcore 15 as core 7 on socket 1
> EAL: Support maximum 128 logical core(s) by configuration.
> EAL: Detected 16 lcore(s)
> EAL: Detected 2 NUMA nodes
> EAL: Multi-process socket
> /var/run/dpdk/rte/mp_socket_9585_25762c2ecf59
> EAL: Module /sys/module/vfio_pci not found! error 2 (No such file or
> directory)
> EAL: VFIO PCI modules not loaded
> EAL: request: bus_vdev_mp
> EAL: msg: bus_vdev_mp
> EAL: reply: bus_vdev_mp
> EAL: msg: bus_vdev_mp
> EAL: DPAA Bus not present. Skipping.
> EAL: Probing VFIO support...
> EAL: Module /sys/module/vfio not found! error 2 (No such file or directory)
> EAL: VFIO modules not loaded, skipping VFIO support...
> EAL: Ask a virtual area of 0x2e000 bytes
> EAL: Virtual area found at 0x600000000000 (size = 0x2e000)
> EAL: Setting up physically contiguous memory...
> EAL: Setting maximum number of open files to 1048576
> EAL: Ask a virtual area of 0x1000 bytes
> ....
> EAL: Virtual area found at 0x7f1c4c551000 (size = 0x1000)
> EAL: Ask a virtual area of 0x1000 bytes
> EAL: Virtual area found at 0x7f1c4c550000 (size = 0x1000)
> EAL: TSC frequency is ~3200000 KHz
> EAL: Master lcore 0 is ready (tid=7f1c4c526c00;cpuset=[0])
> EAL: PCI device 0000:03:00.0 on NUMA socket 0
> EAL:   probe driver: 15b3:1017 net_mlx5
> EAL: Mem event callback 'MLX5_MEM_EVENT_CB:(nil)' registered
> EAL: PCI device 0000:82:00.0 on NUMA socket 1
> EAL:   probe driver: 15b3:1017 net_mlx5
> EAL: Module /sys/module/vfio not found! error 2 (No such file or directory)
> ...
> slaveFD1C0 : using queues from 0 to 0
> slaveFD0C0 : using queues from 0 to 0
> ...
> Initializing DPDK
> Found DPDK primary pool #0 click_mempool_0
> Found DPDK primary pool #1 click_mempool_1
> ... (at this point lcore 0 is reading packets from queue 0 of both devices) ...
> ... (packet flows)...
> net_mlx5: port 1 using address (0x60006e19ac00) of unregistered mempool
> in secondary process, please create mempool before rte_eth_dev_start()
> net_mlx5: port 1 using address (0x60006e19a2c0) of unregistered mempool
> in secondary process, please create mempool before rte_eth_dev_start()
> net_mlx5: port 1 using address (0x60006e199980) of unregistered mempool
> in secondary process, please create mempool before rte_eth_dev_start()
> net_mlx5: port 1 using address (0x60006e199040) of unregistered mempool
> in secondary process, please create mempool before rte_eth_dev_start()
> 
> After a few more of those lines, the slave dies.
> 
> Thanks,
> 
> Tom
> 
> ________________________________________
> From: users <users-bounces@dpdk.org> on behalf of Tom Barbette
> <barbette@kth.se> Sent: Thursday, 6 December 2018 15:53 To: Burakov,
> Anatoly; Shahaf Shuler; Yongseok Koh; Raslan Darawsheh; Thomas Monjalon;
> Iremonger, Bernard Cc: users@dpdk.org Subject: Re: [dpdk-users]
> Unregistered mempool in secondary
> 
> Hi Anatoly,
> 
> 
> I'm not sure about testpmd's secondary-process support. I see in the mailing
> list history a patch proposal called "app/testpmd: improve multiprocess
> support" from 2016, but there does not seem to be any of the usual code path
> to detect whether the current process is a primary or a secondary and
> create/attach pools accordingly. I guess it does not support multiprocess.
> 
> 
> We wanted to reproduce my problem with testpmd, but if it's not compatible
> with multiprocess, we'll have to figure out something else...
> 
> 
> I'll come back with more details/logs using my application.
> 
> 
> 
> Tom
> 
> 
> 
> ________________________________
> From: Burakov, Anatoly <anatoly.burakov@intel.com> Sent: Thursday, 6
> December 2018 14:49 To: Shahaf Shuler; Tom Barbette; Yongseok Koh; Raslan
> Darawsheh; Thomas Monjalon; Iremonger, Bernard Cc: users@dpdk.org
> Subject: RE: Unregistered mempool in secondary
> 
> It would be good to get more detailed logs (--log-level=eal,8 or similar for
> mempool) to see exactly what fails, and why. However, I’m not sure I
> understand what is going on there. Since when does testpmd support
> secondary processes?
> 
> Thanks,
> Anatoly
> 
> From: Shahaf Shuler [mailto:shahafs@mellanox.com]
> Sent: Thursday, December 6, 2018 1:46 PM
> To: Tom Barbette <barbette@kth.se>; Yongseok Koh
> <yskoh@mellanox.com>; Raslan Darawsheh <rasland@mellanox.com>;
> Thomas Monjalon <thomas@monjalon.net>; Burakov, Anatoly
> <anatoly.burakov@intel.com>; Iremonger, Bernard
> <bernard.iremonger@intel.com>
> Cc: users@dpdk.org
> Subject: RE: Unregistered mempool in secondary
> 
> Adding some folks which may help.
> 
> Raslan – does the secondary-process test pass in our regression?
> 
> From: Tom Barbette <barbette@kth.se<mailto:barbette@kth.se>>
> Sent: Thursday, December 6, 2018 1:53 PM
> To: Shahaf Shuler
> <shahafs@mellanox.com<mailto:shahafs@mellanox.com>>; Yongseok Koh
> <yskoh@mellanox.com<mailto:yskoh@mellanox.com>>
> Cc: users@dpdk.org<mailto:users@dpdk.org>
> Subject: RE: Unregistered mempool in secondary
> 
> 
> No, that testpmd problem occurs with all NICs. I just can't get testpmd to work
> in primary+secondary mode. Technically, the command lines below should at
> least start, but they don't. So I can't reproduce the problem at hand with
> testpmd...
> 
> 
> 
> If some testpmd expert could jump in here and show how to run a secondary
> process with testpmd, that would be great. Then I may be able to reproduce it.
> 
> 
> 
> Tom
> 
> 
> 
> 
> 
> ________________________________
> From: Shahaf Shuler
> <shahafs@mellanox.com<mailto:shahafs@mellanox.com>>
> Sent: Thursday, 6 December 2018 10:54
> To: Tom Barbette; Yongseok Koh
> Cc: users@dpdk.org<mailto:users@dpdk.org>
> Subject: RE: Unregistered mempool in secondary
> 
> Only w/ Mellanox NICs or in general?
> 
> From: Tom Barbette <barbette@kth.se<mailto:barbette@kth.se>>
> Sent: Thursday, December 6, 2018 11:38 AM
> To: Shahaf Shuler
> <shahafs@mellanox.com<mailto:shahafs@mellanox.com>>; Yongseok Koh
> <yskoh@mellanox.com<mailto:yskoh@mellanox.com>>
> Cc: users@dpdk.org<mailto:users@dpdk.org>
> Subject: RE: Unregistered mempool in secondary
> 
> 
> I'm trying to reproduce with testpmd. But when I launch:
> 
> sudo ~/dpdk/x86_64-native-linuxapp-gcc/app/testpmd -w "0000:03:00.0" -w
> "0000:82:00.0" -- -i
> 
> sudo ~/dpdk/x86_64-native-linuxapp-gcc/app/testpmd -w "0000:03:00.0" -w
> "0000:82:00.0" --proc-type=secondary -- -i
> 
> 
> 
> I get on the secondary:
> EAL: Error - exiting with code: 1
> 
> Cause: Creation of mbuf pool for socket 0 failed: File exists
> 
> I also had that with 18.11.
> 
> Tom
> 
> 
> 
> ________________________________
> From: Shahaf Shuler
> <shahafs@mellanox.com<mailto:shahafs@mellanox.com>>
> Sent: Thursday, 6 December 2018 10:21
> To: Tom Barbette; Yongseok Koh
> Cc: users@dpdk.org<mailto:users@dpdk.org>
> Subject: RE: Unregistered mempool in secondary
> 
> Do you reproduce this issue w/ testpmd?
> Can you provide the instruction on how to?
> 
> From: Tom Barbette <barbette@kth.se<mailto:barbette@kth.se>>
> Sent: Thursday, December 6, 2018 10:19 AM
> To: Shahaf Shuler
> <shahafs@mellanox.com<mailto:shahafs@mellanox.com>>; Yongseok Koh
> <yskoh@mellanox.com<mailto:yskoh@mellanox.com>>
> Cc: users@dpdk.org<mailto:users@dpdk.org>
> Subject: RE: Unregistered mempool in secondary
> 
> 
> Hi Shahaf,
> 
> 
> 
> Both devices are initialized in the primary with rte_eth_dev_start, and the
> queues as well (each with its own packet pool). It's just that the secondary
> reads from the already-initialized queues.
> 
> 
> 
> If you're mentioning something more that needs to be done, then I did not
> understand. Which function would that be? One to ask port 0 to register
> port 1's pools and vice versa?
> 
> 
> 
> Thanks !
> 
> 
> 
> Tom
> 
> 
> 
> ________________________________
> De : Shahaf Shuler
> <shahafs@mellanox.com<mailto:shahafs@mellanox.com>>
> Envoyé : jeudi 6 décembre 2018 08:18
> À : Tom Barbette; Yongseok Koh
> Cc : users@dpdk.org<mailto:users@dpdk.org>
> Objet : RE: Unregistered mempool in secondary
> 
> Hi Tom,
> 
> There is complete isolation between the resources of each port.
> Meaning, a mempool registered on one port cannot be used on the other
> (without explicit registration).
> 
> In order for your solution to work, the primary process needs to probe both
> ports, so that the mempool_walk will happen on both and the pool will be
> registered to both.
> 
> From: Tom Barbette <barbette@kth.se<mailto:barbette@kth.se>>
> Sent: Thursday, December 6, 2018 1:41 AM
> To: Shahaf Shuler
> <shahafs@mellanox.com<mailto:shahafs@mellanox.com>>; Yongseok Koh
> <yskoh@mellanox.com<mailto:yskoh@mellanox.com>>
> Cc: users@dpdk.org<mailto:users@dpdk.org>
> Subject: Unregistered mempool in secondary
> 
> Hi mlx5 maintainers,
> 
> Since we started using a second ConnectX-5 NIC plugged into our second CPU
> socket, we see the following message when packets flow through that
> port:
> 
> net_mlx5: port 1 using address (0x6000712742c0) of unregistered mempool
> in secondary process, please create mempool before rte_eth_dev_start()
> 
> The secondary process has all of the primary process's pools registered via
> rte_mempool_walk (2, as we have 2 NUMA nodes). But the address in the
> error message does not actually fall within any of the
> [pool->mz->addr_64, pool->mz->addr_64 + pool->mz->size] ranges. So maybe
> it comes from a pool created automatically by the primary? If so, how can we
> force it to be created before we call rte_eth_dev_start()?
> 
> We think the only change is that the second port is now on a dedicated NIC on
> the second socket, instead of being the second port of the same NIC on the
> first socket. But we are not 100% sure this is the triggering event. The same
> test worked before, with DPDK 18.08.
> 
> Thanks,
> Tom
> 
> 


  reply	other threads:[~2018-12-06 15:39 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-12-05 23:41 Tom Barbette
2018-12-06  7:18 ` Shahaf Shuler
2018-12-06  8:18   ` Tom Barbette
2018-12-06  9:21     ` Shahaf Shuler
2018-12-06  9:37       ` Tom Barbette
2018-12-06  9:54         ` Shahaf Shuler
2018-12-06 11:52           ` Tom Barbette
2018-12-06 13:45             ` Shahaf Shuler
2018-12-06 13:49               ` Burakov, Anatoly
2018-12-06 14:53                 ` Tom Barbette
2018-12-06 15:28                   ` Tom Barbette
2018-12-06 15:39                     ` Burakov, Anatoly [this message]
2018-12-06 17:27                       ` Tom Barbette
2018-12-06 17:50                         ` Cliff Burdick
2018-12-06 21:53                           ` Tom Barbette

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=C6ECDF3AB251BE4894318F4E45123697824FC9F1@IRSMSX109.ger.corp.intel.com \
    --to=anatoly.burakov@intel.com \
    --cc=barbette@kth.se \
    --cc=bernard.iremonger@intel.com \
    --cc=rasland@mellanox.com \
    --cc=shahafs@mellanox.com \
    --cc=thomas@monjalon.net \
    --cc=users@dpdk.org \
    --cc=yskoh@mellanox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).