From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: by dpdk.org (Postfix, from userid 33) id 718C62BF3; Sun, 3 Mar 2019 01:44:05 +0100 (CET)
From: bugzilla@dpdk.org
To: dev@dpdk.org
Date: Sun, 03 Mar 2019 00:44:05 +0000
X-Bugzilla-Reason: AssignedTo
X-Bugzilla-Type: new
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: DPDK
X-Bugzilla-Component: ethdev
X-Bugzilla-Version: 18.11
X-Bugzilla-Keywords:
X-Bugzilla-Severity: normal
X-Bugzilla-Who: debugnetiq1@yahoo.ca
X-Bugzilla-Status: CONFIRMED
X-Bugzilla-Resolution:
X-Bugzilla-Priority: Normal
X-Bugzilla-Assigned-To: dev@dpdk.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields: bug_id short_desc product version rep_platform op_sys bug_status bug_severity priority component assigned_to reporter target_milestone
Message-ID:
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://bugs.dpdk.org/
Auto-Submitted: auto-generated
X-Auto-Response-Suppress: All
MIME-Version: 1.0
Subject: [dpdk-dev] [Bug 219] DPDK 18.11 builds with MLX4/MLX5 support but testpmd won't recognize the device
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Sun, 03 Mar 2019 00:44:05 -0000

https://bugs.dpdk.org/show_bug.cgi?id=219

            Bug ID: 219
           Summary: DPDK 18.11 builds with MLX4/MLX5 support but testpmd won't
                    recognize the device
           Product: DPDK
           Version: 18.11
          Hardware: x86
                OS: Linux
            Status: CONFIRMED
          Severity: normal
          Priority: Normal
         Component: ethdev
          Assignee: dev@dpdk.org
          Reporter: debugnetiq1@yahoo.ca
  Target Milestone: ---

For testing built 2 versions of DPDK-18.11
- one with static libs (CONFIG_RTE_BUILD_SHARED_LIB=n)
- one with shared libs (CONFIG_RTE_BUILD_SHARED_LIB=y)

In a nutshell, after building and verifying all of the below, croaks:
- with the static-libs DPDK complains of not finding shared
libs (However DPDK is built by default with static libs)
- with the shared-libs complains of not finding the device
- incidentally pktgen-dpdk, built against the same DPDK static build, complains of the same issue

With the static-libs DPDK complains of not finding librte_pmd_mlx4_glue.so.18.02.0

# /opt/dpdk_install/dpdk-18.11/install/bin/testpmd \
>     -l 1-3 \
>     -n 4 \
>     -w aec9:00:02.0 \
>     --vdev="net_vdev_netvsc0,iface=eth1" \
>     -- --port-topology=chained \
>     --nb-cores 1 \
>     --forward-mode=txonly \
>     --eth-peer=0,00:0d:3a:53:13:b7 \
>     --stats-period 1
PMD: mlx4.c:947: mlx4_glue_init(): cannot load glue library: librte_pmd_mlx4_glue.so.18.02.0: cannot open shared object file: No such file or directory
PMD: mlx4.c:965: mlx4_glue_init(): cannot initialize PMD due to missing run-time dependency on rdma-core libraries (libibverbs, libmlx4)
net_mlx5: mlx5.c:1712: mlx5_glue_init(): cannot load glue library: librte_pmd_mlx5_glue.so.18.11.0: cannot open shared object file: No such file or directory
net_mlx5: mlx5.c:1730: mlx5_glue_init(): cannot initialize PMD due to missing run-time dependency on rdma-core libraries (libibverbs, libmlx5)
EAL: Detected 4 lcore(s)
EAL: Detected 1 NUMA nodes
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: No free hugepages reported in hugepages-1048576kB
EAL: Debug dataplane logs available - lower performance
EAL: Probing VFIO support...
EAL: VFIO support initialized
EAL: WARNING: cpu flags constant_tsc=yes nonstop_tsc=no -> using unreliable clock cycles !
net_vdev_netvsc: probably using routed NetVSC interface "eth1" (index 3)
rte_pmd_tap_probe(): Initializing pmd_tap for net_tap_vsc0 as dtap0
Set txonly packet forwarding mode
Warning: NUMA should be configured manually by using --port-numa-config and --ring-numa-config parameters along with --numa.
testpmd: create a new mbuf pool : n=163456, size=2176, socket=0
testpmd: preferred mempool ops selected: ring_mp_mc
Configuring Port 0 (socket 0)
Port 0: 00:0D:3A:18:A1:73
Checking link statuses...
Done
No commandline core given, start packet forwarding
txonly packet forwarding - ports=1 - cores=1 - streams=1 - NUMA support enabled, MP allocation mode: native
Logical Core 2 (socket 0) forwards packets on 1 streams:
  RX P=0/Q=0 (socket 0) -> TX P=0/Q=0 (socket 0) peer=00:0D:3A:53:13:B7

  txonly packet forwarding packets/burst=32
  packet len=64 - nb packet segments=1
  nb forwarding cores=1 - nb forwarding ports=1

With the shared-libs DPDK complains about mlx4_pci_probe(): cannot access device

# /opt/dpdk_install/dpdk-18.11/install/bin/testpmd \
>     -l 1-3 \
>     -d /opt/dpdk_install/dpdk-18.11/install/lib \
>     -n 4 \
>     -w aec9:00:02.0 \
>     --vdev="net_vdev_netvsc0,iface=eth1" \
>     -- --port-topology=chained \
>     --nb-cores 1 \
>     --forward-mode=txonly \
>     --eth-peer=0,00:0d:3a:53:13:b7 \
>     --stats-period 1
EAL: Detected 4 lcore(s)
EAL: Detected 1 NUMA nodes
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: No free hugepages reported in hugepages-1048576kB
EAL: Debug dataplane logs available - lower performance
EAL: Probing VFIO support...
EAL: VFIO support initialized
EAL: WARNING: cpu flags constant_tsc=yes nonstop_tsc=no -> using unreliable clock cycles !
EAL: PCI device aec9:00:02.0 on NUMA socket 0
EAL:   probe driver: 15b3:1004 net_mlx4
PMD: mlx4.c:564: mlx4_pci_probe(): cannot access device, is mlx4_ib loaded?
EAL: Requested device aec9:00:02.0 cannot be used
net_vdev_netvsc: probably using routed NetVSC interface "eth1" (index 3)
rte_pmd_tap_probe(): Initializing pmd_tap for net_tap_vsc0 as dtap0
Set txonly packet forwarding mode
Warning: NUMA should be configured manually by using --port-numa-config and --ring-numa-config parameters along with --numa.
testpmd: create a new mbuf pool : n=163456, size=2176, socket=0
testpmd: preferred mempool ops selected: ring_mp_mc
Configuring Port 0 (socket 0)
Port 0: 00:0D:3A:18:A1:73
Checking link statuses...
Done
No commandline core given, start packet forwarding
txonly packet forwarding - ports=1 - cores=1 - streams=1 - NUMA support enabled, MP allocation mode: native
Logical Core 2 (socket 0) forwards packets on 1 streams:
  RX P=0/Q=0 (socket 0) -> TX P=0/Q=0 (socket 0) peer=00:0D:3A:53:13:B7
...

Nonetheless regardless of errors both versions seem to start sending packets - not convinced this is really true

Port statistics ====================================
  ######################## NIC statistics for port 0 ########################
  RX-packets: 0          RX-missed: 0          RX-bytes: 0
  RX-errors: 0
  RX-nombuf: 0
  TX-packets: 231072     TX-errors: 0          TX-bytes: 14788608

  Throughput (since last show)
  Rx-pps: 0
  Tx-pps: 201230
  ############################################################################

Incidentally the "mlx4_pci_probe(): cannot access device" error is the same
flagged by pktgen-dpdk (which however crashes) - i.e. there is a common bug in
DPDK w/ respect to MLX4 impacting both testpmd and pktgen

# ./app/x86_64-native-linuxapp-gcc/pktgen -w aec9:00:02.0 -l 1-3 -n 4 -m 4096 -- -m [2-3].0 -l /var/tmp/pktgen.log -T

Copyright (c) <2010-2019>, Intel Corporation. All rights reserved. Powered by DPDK
EAL: Detected 4 lcore(s)
EAL: Detected 1 NUMA nodes
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: No free hugepages reported in hugepages-1048576kB
EAL: Debug dataplane logs available - lower performance
EAL: Probing VFIO support...
EAL: VFIO support initialized
EAL: WARNING: cpu flags constant_tsc=yes nonstop_tsc=no -> using unreliable clock cycles !
EAL: PCI device aec9:00:02.0 on NUMA socket 0
EAL:   probe driver: 15b3:1004 net_mlx4
PMD: mlx4.c:564: mlx4_pci_probe(): cannot access device, is mlx4_ib loaded?
EAL: Requested device aec9:00:02.0 cannot be used
Lua 5.3.5 Copyright (C) 1994-2018 Lua.org, PUC-Rio

*** Copyright (c) <2010-2019>, Intel Corporation. All rights reserved.
*** Pktgen created by: Keith Wiles -- >>> Powered by DPDK <<<

!PANIC!: *** Did not find any ports to use ***
PANIC in pktgen_config_ports():
*** Did not find any ports to use ***6: [./app/x86_64-native-linuxapp-gcc/pktgen_() [0x48726f]]
5: [/lib64/libc.so.6(__libc_start_main+0xf5) [0x7fe75bf693d5]]
4: [./app/x86_64-native-linuxapp-gcc/pktgen_(main+0x630) [0x47efe0]]
3: [./app/x86_64-native-linuxapp-gcc/pktgen_(pktgen_config_ports+0x1611) [0x4afcb1]]
2: [./app/x86_64-native-linuxapp-gcc/pktgen_(__rte_panic+0xb8) [0x469ec2]]
1: [./app/x86_64-native-linuxapp-gcc/pktgen_(rte_dump_stack+0x1a) [0x582baa]]
./app/x86_64-native-linuxapp-gcc/pktgen: line 7: 6970 Aborted                 $(dirname "$0")/pktgen_ "$@"

Here is what I did:
- installed Mellanox OFED 4.5.1 from "sources"

wget http://www.mellanox.com/downloads/ofed/MLNX_OFED-4.5-1.0.1.0/MLNX_OFED_LINUX-4.5-1.0.1.0-rhel7.6-x86_64.tgz && tar -zxf MLNX_OFED_LINUX-4.5-1.0.1.0-rhel7.6-x86_64.tgz
cd MLNX_OFED_LINUX-4.5-1.0.1.0-rhel7.6-x86_64.tgz
./mlnxofedinstall --dpdk --upstream-libs --add-kernel-support --enable-mlnx_tune

This builds and installs all userland OFED components, then installing the kmod drivers

cd /tmp/MLNX_OFED_LINUX-4.5-1.0.1.0-3.10.0-957.5.1.el7.x86_64/MLNX_OFED_LINUX-4.5-1.0.1.0-rhel7.6-ext/RPMS && yum install mlnx-ofa_kernel-modules-4.5-OFED.4.5.1.0.1.1.gb4fdfac.kver.3.10.0_957.5.1.el7.x86_64.x86_64.rpm

Now building dpdk-18.11 - download, untar then enable in ./config/common_base
CONFIG_RTE_LIBRTE_MLX4_PMD=y and CONFIG_RTE_LIBRTE_MLX5_PMD="y"

export DPDK_DIR=/opt/dpdk_install/dpdk-18.11
cd $DPDK_DIR
export DPDK_BUILD=$DPDK_DIR/install
export RTE_SDK=$DPDK_DIR
export DPDK_TARGET=x86_64-native-linuxapp-gcc
export RTE_TARGET=x86_64-native-linuxapp-gcc

Enabling the following in config/common_base

CONFIG_RTE_BUILD_SHARED_LIB=n
CONFIG_RTE_LIBRTE_MLX4_PMD=y
CONFIG_RTE_LIBRTE_MLX4_DEBUG=y
CONFIG_RTE_LIBRTE_MLX4_DLOPEN_DEPS=n
CONFIG_RTE_LIBRTE_MLX5_PMD=y
CONFIG_RTE_LIBRTE_MLX5_DEBUG=y
CONFIG_RTE_LIBRTE_MLX5_DLOPEN_DEPS=n
CONFIG_RTE_LOG_DP_LEVEL=RTE_LOG_DEBUG

make config T=$DPDK_TARGET
make install T=$DPDK_TARGET DESTDIR=install

It builds fine - by default with static libraries generated under install/lib -
I can see the generated libs

ls -l install/lib/*mlx*
-rw-r--r--. 1 root root 2126350 Mar 2 22:12 install/lib/librte_pmd_mlx4.a
-rw-r--r--. 1 root root 6613402 Mar 2 22:12 install/lib/librte_pmd_mlx5.a

# lspci -v -n
...
aec9:00:02.0 0200: 15b3:1004
        Subsystem: 15b3:61b0
        Flags: fast devsel, NUMA node 0
        Memory at fe0800000 (64-bit, prefetchable) [size=8M]
        Capabilities: [60] Express Endpoint, MSI 00
        Capabilities: [9c] MSI-X: Enable- Count=24 Masked-
        Capabilities: [40] Power Management version 0
        Kernel driver in use: vfio-pci
        Kernel modules: mlx4_core

# lsmod | grep mlx
mlx5_fpga_tools 14392 0
mlx5_ib 339996 0
ib_uverbs 125872 3 mlx5_ib,ib_ucm,rdma_ucm
mlx5_core 919535 2 mlx5_ib,mlx5_fpga_tools
mlxfw 18227 1 mlx5_core
mlx4_ib 211832 0
ib_core 294554 10 rdma_cm,ib_cm,iw_cm,mlx4_ib,mlx5_ib,ib_ucm,ib_umad,ib_uverbs,rdma_ucm,ib_ipoib
mlx4_en 146509 0
mlx4_core 360644 2 mlx4_en,mlx4_ib
mlx_compat 28730 15 rdma_cm,ib_cm,iw_cm,mlx4_en,mlx4_ib,mlx5_ib,mlx5_fpga_tools,ib_ucm,ib_core,ib_umad,ib_uverbs,mlx4_core,mlx5_core,rdma_ucm,ib_ipoib
devlink 48345 4 mlx4_en,mlx4_ib,mlx4_core,mlx5_core
ptp 19231 3 hv_utils,mlx4_en,mlx5_core

Now testing with testpmd but first some sanity check

# find /lib/modules/3.10.0-957.5.1.el7.x86_64/ -type f -name "*mlx*" | xargs ls -l
-rwxr--r--. 1 root root 47688 Mar 2 17:34 /lib/modules/3.10.0-957.5.1.el7.x86_64/extra/mlnx-ofa_kernel/compat/mlx_compat.ko
-rwxr--r--. 1 root root 353296 Mar 2 17:34 /lib/modules/3.10.0-957.5.1.el7.x86_64/extra/mlnx-ofa_kernel/drivers/infiniband/hw/mlx4/mlx4_ib.ko
-rwxr--r--. 1 root root 554568 Mar 2 17:34 /lib/modules/3.10.0-957.5.1.el7.x86_64/extra/mlnx-ofa_kernel/drivers/infiniband/hw/mlx5/mlx5_ib.ko
-rwxr--r--. 1 root root 573648 Mar 2 17:34 /lib/modules/3.10.0-957.5.1.el7.x86_64/extra/mlnx-ofa_kernel/drivers/net/ethernet/mellanox/mlx4/mlx4_core.ko
-rwxr--r--. 1 root root 255656 Mar 2 17:34 /lib/modules/3.10.0-957.5.1.el7.x86_64/extra/mlnx-ofa_kernel/drivers/net/ethernet/mellanox/mlx4/mlx4_en.ko
-rwxr--r--. 1 root root 1433680 Mar 2 17:34 /lib/modules/3.10.0-957.5.1.el7.x86_64/extra/mlnx-ofa_kernel/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.ko
-rwxr--r--. 1 root root 25728 Mar 2 17:35 /lib/modules/3.10.0-957.5.1.el7.x86_64/extra/mlnx-ofa_kernel/drivers/net/ethernet/mellanox/mlx5/fpga/mlx5_fpga_tools.ko
-rwxr--r--. 1 root root 24728 Mar 2 17:35 /lib/modules/3.10.0-957.5.1.el7.x86_64/extra/mlnx-ofa_kernel/drivers/net/ethernet/mellanox/mlxfw/mlxfw.ko

# cat /etc/modprobe.d/ofed_mlx4.conf
(all options commented out)

With mlx4_ib loaded we have 2 pairs of interfaces eth0 + eth2, eth1 + eth3
Each pair is sharing same MAC

# ip link show
2: eth0: mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 00:0d:3a:4d:49:98 brd ff:ff:ff:ff:ff:ff
3: eth1: mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 00:0d:3a:18:a1:73 brd ff:ff:ff:ff:ff:ff
4: eth2: mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 00:0d:3a:4d:49:98 brd ff:ff:ff:ff:ff:ff
5: eth3: mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 00:0d:3a:18:a1:73 brd ff:ff:ff:ff:ff:ff

lshw | less
  *-network:1
       description: Ethernet interface
       product: MT27500/MT27520 Family [ConnectX-3/ConnectX-3 Pro Virtual Function]
       vendor: Mellanox Technologies
       physical id: 2
       bus info: pci@aec9:00:02.0
       logical name: eth3
       version: 00
       serial: 00:0d:3a:18:a1:73
       width: 64 bits
       clock: 33MHz
       capabilities: pciexpress msix pm bus_master cap_list ethernet physical fibre autonegotiation
       configuration: autonegotiation=on broadcast=yes driver=mlx4_en driverversion=4.5-1.0.1 duplex=full firmware=2.41.7004 latency=0 link=yes multicast=yes slave=yes
       resources: iomemory:f0-ef irq:0 memory:fe0800000-fe0ffffff

Using second pair with DPDK (1'st pair has the mgmt IP)

# dpdk-devbind.py --force --bind mlx4_core aec9:00:02.0
# dpdk-devbind.py -s

Network devices using DPDK-compatible driver
============================================
aec9:00:02.0 'MT27500/MT27520 Family [ConnectX-3/ConnectX-3 Pro Virtual Function] 1004' drv=vfio-pci unused=mlx4_core

Other Network devices
=====================
a3f2:00:02.0 'MT27500/MT27520 Family [ConnectX-3/ConnectX-3 Pro Virtual Function] 1004' unused=mlx4_core,vfio-pci

We have 4 cores and huge mem pages

# cpu_layout.py
======================================================================
Core and Socket Information (as reported by '/sys/devices/system/cpu')
======================================================================

cores = [0, 1]
sockets = [0]

       Socket 0
       --------
Core 0 [0, 1]
Core 1 [2, 3]

# grep -i huge /proc/meminfo
AnonHugePages: 20480 kB
HugePages_Total: 2048
HugePages_Free: 2048
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB

# mount | grep -i huge
cgroup on /sys/fs/cgroup/hugetlb type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,hugetlb)
hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime,seclabel)
hugetlbfs on /mnt/huge type hugetlbfs (rw,relatime,seclabel)
none on /mnt/huge_2mb type hugetlbfs (rw,relatime,seclabel,pagesize=2MB)

-- 
You are receiving this mail because:
You are the assignee for the bug.