Mr. Ferber, much appreciated. I knew this metal box came to me with its two Mellanox NICs bonded. I used their utility to unbond them, but alas it did not finish the job: /etc/network/interfaces was still wrong. I fixed the config and rebooted.
Voila. Success. ibv_devinfo now shows two devices. And the DPDK app runs.
Your point about bonding nudged me to double check.
Now, on a second machine, where I believe I have done everything I did on the first, the DPDK application still looks for the wrong driver (mlx4 rather than mlx5):
EAL: Detected CPU lcores: 16
EAL: Detected NUMA nodes: 1
EAL: Detected shared linkage of DPDK
EAL: libmlx4.so.1: cannot open shared object file: No such file or directory
EAL: FATAL: Cannot init plugins
EAL: Cannot init plugins

Somehow I fixed this on the first machine, but I cannot duplicate that success on the second machine. Both machines show similar ifconfig output, and both report two devices:
device node GUID
------ ----------------
mlx5_0 0c42a103007ea9b8
mlx5_1 0c42a103007ea9b9
device node GUID
------ ----------------
mlx5_0 0c42a103007ea3ec
mlx5_1 0c42a103007ea3ed
root@server:~/Dev/reinvent/scripts# ibv_devinfo
hca_id: mlx5_0
transport: InfiniBand (0)
fw_ver: 14.32.1010
node_guid: 0c42:a103:007e:a3ec
sys_image_guid: 0c42:a103:007e:a3ec
vendor_id: 0x02c9
vendor_part_id: 4117
hw_ver: 0x0
board_id: MT_2420110034
phys_port_cnt: 1
port: 1
state: PORT_ACTIVE (4)
max_mtu: 4096 (5)
active_mtu: 1024 (3)
sm_lid: 0
port_lid: 0
port_lmc: 0x00
link_layer: Ethernet
hca_id: mlx5_1
transport: InfiniBand (0)
fw_ver: 14.32.1010
node_guid: 0c42:a103:007e:a3ed
sys_image_guid: 0c42:a103:007e:a3ec
vendor_id: 0x02c9
vendor_part_id: 4117
hw_ver: 0x0
board_id: MT_2420110034
phys_port_cnt: 1
port: 1
state: PORT_ACTIVE (4)
max_mtu: 4096 (5)
active_mtu: 1024 (3)
sm_lid: 0
port_lid: 0
port_lmc: 0x00
link_layer: Ethernet
ibv_devinfo
hca_id: mlx5_0
transport: InfiniBand (0)
fw_ver: 14.32.1010
node_guid: 0c42:a103:007e:a9b8
sys_image_guid: 0c42:a103:007e:a9b8
vendor_id: 0x02c9
vendor_part_id: 4117
hw_ver: 0x0
board_id: MT_2420110034
phys_port_cnt: 1
port: 1
state: PORT_ACTIVE (4)
max_mtu: 4096 (5)
active_mtu: 1024 (3)
sm_lid: 0
port_lid: 0
port_lmc: 0x00
link_layer: Ethernet
hca_id: mlx5_1
transport: InfiniBand (0)
fw_ver: 14.32.1010
node_guid: 0c42:a103:007e:a9b9
sys_image_guid: 0c42:a103:007e:a9b8
vendor_id: 0x02c9
vendor_part_id: 4117
hw_ver: 0x0
board_id: MT_2420110034
phys_port_cnt: 1
port: 1
state: PORT_ACTIVE (4)
max_mtu: 4096 (5)
active_mtu: 1024 (3)
sm_lid: 0
port_lid: 0
port_lmc: 0x00
link_layer: Ethernet
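
Since the failure on the second machine is the loader not finding libmlx4.so.1, one way to compare the two hosts is to pull the missing library names out of the EAL log and ask the dynamic loader on each host whether it can resolve them. The following is only a minimal Python sketch of that check. The function names are my own, and it assumes the log lines have the same shape as the EAL output quoted at the top of this message. It does not explain why an mlx4 library is requested in the first place, only whether it is resolvable on a given host.

```python
import ctypes
import re

# EAL output copied from the failing run quoted at the top of this message.
SAMPLE_LOG = """\
EAL: Detected CPU lcores: 16
EAL: Detected NUMA nodes: 1
EAL: Detected shared linkage of DPDK
EAL: libmlx4.so.1: cannot open shared object file: No such file or directory
EAL: FATAL: Cannot init plugins
"""

# Matches dlopen-style failures of the form "EAL: <name>: cannot open shared object file".
_MISSING = re.compile(r"^EAL:\s+(\S+): cannot open shared object file", re.MULTILINE)


def missing_shared_objects(log_text):
    """Return the library names the log reports as unopenable, in order, without repeats."""
    names = []
    for name in _MISSING.findall(log_text):
        if name not in names:
            names.append(name)
    return names


def loader_can_open(name):
    """Return True if the dynamic loader on this host can resolve `name`."""
    try:
        ctypes.CDLL(name)
    except OSError:
        return False
    return True


if __name__ == "__main__":
    for lib in missing_shared_objects(SAMPLE_LOG):
        status = "loadable" if loader_can_open(lib) else "NOT loadable"
        print(f"{lib}: {status}")
```

Running it on both machines with each machine's own log should show whether the two hosts differ in what the loader can see, which is a narrower question than whether the NIC configuration matches.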
On Tue, Apr 5, 2022 at 1:00 PM Erez Ferber <erezferber@gmail.com> wrote:

Hi,

Based on your output, the ConnectX-4 Lx device is configured in LAG mode, managed via the kernel bonding scripts. In this mode, both physical functions share a single port (mlx5_bond_0). You should probe only the first PCI BDF, 01:00.0, not the second one.

By the way, the --dpdk installation flag should not be necessary; it is an old flag kept for Mellanox OFED builds lower than 5.x.

Regards,
Erez

On Tue, 5 Apr 2022 at 19:17, fwefew 4t4tg <7532yahoo@gmail.com> wrote:

I built the current version of DPDK directly from dpdk.org after I installed the current OFED Mellanox driver set:
* MLNX_OFED_LINUX-5.5-1.0.3.2-ubuntu20.04-x86_64.iso
with ./install --dpdk
I am using a Mellanox Technologies MT27710 Family [ConnectX-4 Lx] which is Ethernet only; there is no IB mode for this NIC. This is a MT_2420110034 board. However, when I run dpdk-testpmd I see "No Verbs device matches PCI device 0000:01:00.1, are kernel drivers loaded?"
EAL: Detected CPU lcores: 16
EAL: Detected NUMA nodes: 1
EAL: Detected static linkage of DPDK
EAL: Selected IOVA mode 'PA'
EAL: No free 2048 kB hugepages reported on node 0
EAL: VFIO support initialized
EAL: Probe PCI driver: mlx5_pci (15b3:1015) device: 0000:01:00.1 (socket 0)
mlx5_common: No Verbs device matches PCI device 0000:01:00.1, are kernel drivers loaded?
mlx5_common: Verbs device not found: 01:00.1
mlx5_common: Failed to initialize device context.
EAL: Requested device 0000:01:00.1 cannot be used
EAL: Bus (pci) probe failed.
As far as I can see all the kernel modules are loaded:
lsmod | egrep "(ib|mlx)" | sort
ib_cm 53248 2 rdma_cm,ib_ipoib
ib_core 368640 8 rdma_cm,ib_ipoib,iw_cm,ib_umad,rdma_ucm,ib_uverbs,mlx5_ib,ib_cm
ib_ipoib 135168 0
ib_umad 24576 0
ib_uverbs 139264 2 rdma_ucm,mlx5_ib
libahci 36864 1 ahci
libcrc32c 16384 2 btrfs,raid456
mlx5_core 1634304 1 mlx5_ib
mlx5_ib 397312 0
mlx_compat 69632 11 rdma_cm,ib_ipoib,mlxdevm,iw_cm,ib_umad,ib_core,rdma_ucm,ib_uverbs,mlx5_ib,ib_cm,mlx5_core
mlxdevm 172032 1 mlx5_core
mlxfw 32768 1 mlx5_core
pci_hyperv_intf 16384 1 mlx5_core
psample 20480 1 mlx5_core
tls 94208 2 bonding,mlx5_core
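
The "are the kernel modules loaded?" question can be answered mechanically instead of by eye. The sketch below is Python with names of my own choosing. It reads lsmod output and reports which modules from a required list are absent. The required list (mlx5_core, mlx5_ib, ib_uverbs) is an assumption based on my reading of the DPDK mlx5 guide, so adjust it if your setup needs something else.

```python
# Assumed prerequisite modules for the mlx5 PMD; not taken from this thread.
REQUIRED = ("mlx5_core", "mlx5_ib", "ib_uverbs")

# A few rows copied from the lsmod output above.
SAMPLE_LSMOD = """\
ib_core 368640 8 rdma_cm,ib_ipoib,iw_cm,ib_umad,rdma_ucm,ib_uverbs,mlx5_ib,ib_cm
ib_uverbs 139264 2 rdma_ucm,mlx5_ib
mlx5_core 1634304 1 mlx5_ib
mlx5_ib 397312 0
"""


def loaded_modules(lsmod_text):
    """Collect module names from the first column of lsmod output."""
    names = set()
    for line in lsmod_text.splitlines():
        fields = line.split()
        # Skip blank lines and the "Module  Size  Used by" header if present.
        if fields and fields[0] != "Module":
            names.add(fields[0])
    return names


def missing_modules(lsmod_text, required=REQUIRED):
    """Return the required modules that do not appear in the lsmod output."""
    present = loaded_modules(lsmod_text)
    return [name for name in required if name not in present]


print(missing_modules(SAMPLE_LSMOD))
```

An empty result only says the modules are loaded. As the rest of this thread shows, that is not sufficient when the ports are still bonded.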
root@dc-c3-small-x86-01:~/Dev/reinvent/scripts# mst status -v
MST modules:
------------
MST PCI module is not loaded
MST PCI configuration module loaded
PCI devices:
------------
DEVICE_TYPE MST PCI RDMA NET NUMA
ConnectX4LX(rev:0) /dev/mst/mt4117_pciconf0.1 01:00.1 mlx5_bond_0 net-bond0 -1
ConnectX4LX(rev:0) /dev/mst/mt4117_pciconf0 01:00.0 mlx5_bond_0 net-bond0 -1
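
The sign that the ports are still in LAG mode is the RDMA column reading mlx5_bond_0 for both PCI functions. The following is a minimal Python sketch that picks such rows out of `mst status -v` output. The function name is hypothetical, and the column order (DEVICE_TYPE, MST, PCI, RDMA, NET, NUMA) is assumed from the table above, not from any mst documentation.

```python
# The two device rows copied from the mst status output above.
SAMPLE_MST = """\
DEVICE_TYPE MST PCI RDMA NET NUMA
ConnectX4LX(rev:0) /dev/mst/mt4117_pciconf0.1 01:00.1 mlx5_bond_0 net-bond0 -1
ConnectX4LX(rev:0) /dev/mst/mt4117_pciconf0 01:00.0 mlx5_bond_0 net-bond0 -1
"""


def bonded_functions(mst_text):
    """Return (pci_bdf, rdma_name) pairs for rows whose RDMA device name contains 'bond'."""
    rows = []
    for line in mst_text.splitlines():
        fields = line.split()
        # Device rows have six columns and an MST path in the second one;
        # this skips the header, separators, and the module status lines.
        if len(fields) >= 6 and fields[1].startswith("/dev/mst/"):
            pci, rdma = fields[2], fields[3]
            if "bond" in rdma:
                rows.append((pci, rdma))
    return rows


for pci, rdma in bonded_functions(SAMPLE_MST):
    print(f"{pci} is still attached to {rdma}")
```

For the output above this flags both 01:00.1 and 01:00.0 as attached to mlx5_bond_0. An empty result would suggest the bond has been fully torn down.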