From: "Wiles, Keith"
To: "Johnson, Brian"
CC: James Bensley, "Rosen, Rami", "users@dpdk.org"
Date: Tue, 14 Nov 2017 20:20:48 +0000
Message-ID: <199CE1B4-47DC-4ACB-BF21-509155591C2B@intel.com>
Subject: Re: [dpdk-users] Hard Crash with X710 and Pktgen
List-Id: DPDK usage discussions

Sent from my iPhone

On Nov 14, 2017, at 2:59 AM, Johnson, Brian wrote:

Do you have Intel VT-d enabled in the BIOS and intel_iommu=on iommu=pt in grub?
I have had some issues with enabling this where other devices such as RAID controllers cause the OS to not boot after applying these settings. Removing the grub line and upgrading the RAID firmware fixed the issue.

IOMMU

An input-output memory management unit (IOMMU) is required for safely driving DMA-capable hardware from userspace, and because of that it is a prerequisite for using VFIO. Not all systems have one though, so you'll need to check that the hardware supports it and that it is enabled in the BIOS settings (VT-d, or Virtualization Technology for Directed I/O, on Intel systems).

Finally, IOMMU needs to be explicitly enabled in the kernel as well. To do so, add either intel_iommu=on (for Intel systems) or amd_iommu=on (for AMD systems) to the kernel command line. In addition, it is recommended to use the iommu=pt option, which improves IO performance for devices in the host.

Once the system boots up, check the contents of the /sys/kernel/iommu_groups/ directory. If it is non-empty, you have successfully set up IOMMU.

To permanently add this to the kernel command line, append it to GRUB_CMDLINE_LINUX in /etc/default/grub and then execute:

# grub2-mkconfig -o /boot/grub2/grub.cfg

On Nov 14, 2017, at 12:38 AM, James Bensley wrote:

Hi guys,

Thank you all for your responses! I initially thought this was unlikely to be a Pktgen-specific issue; however, I thought it worth mentioning that I am using Pktgen, just in case.

On 14 November 2017 at 04:49, Muhammad Zain-ul-Abideen wrote:

where is -p argument

Do you mean upper case "-P"? Lower case "p" isn't an option in "Pktgen --help" output

Pktgen once used -p with a portmask value; it is not used anymore, though I still see that option being used in some command lines.

On 13 November 2017 at 19:35, James Bensley wrote:

$ sudo ./app/x86_64-native-linuxapp-gcc/pktgen -l 2-6 -n 1 -w 09:00.0 -w 09:00.1 -v -- -P -m [3-4].0 [5-6].1

On 13 November 2017 at 21:49, Wiles, Keith wrote:

For the hard lockup problem, try using the testpmd application and see if that has the same problem. If not, then it will be next week before I can look at it.

Let me know if testpmd works or not.

Thanks for taking the time to reply whilst traveling, Keith!

Sadly I'm getting the same behaviour with testpmd:

$ ./dpdk-devbind.py --status-dev net

Network devices using DPDK-compatible driver
============================================
0000:09:00.0 'Ethernet Controller X710 for 10GbE SFP+ 1572' drv=igb_uio unused=
0000:09:00.1 'Ethernet Controller X710 for 10GbE SFP+ 1572' drv=igb_uio unused=

Network devices using kernel driver
===================================
0000:02:00.0 'I350 Gigabit Network Connection 1521' if=eno1 drv=igb unused=igb_uio *Active*
0000:02:00.1 'I350 Gigabit Network Connection 1521' if=eno2 drv=igb unused=igb_uio
0000:06:00.0 'NetXtreme BCM5719 Gigabit Ethernet PCIe 1657' if=ens3f0 drv=tg3 unused=igb_uio
0000:06:00.1 'NetXtreme BCM5719 Gigabit Ethernet PCIe 1657' if=ens3f1 drv=tg3 unused=igb_uio
0000:06:00.2 'NetXtreme BCM5719 Gigabit Ethernet PCIe 1657' if=ens3f2 drv=tg3 unused=igb_uio *Active*
0000:06:00.3 'NetXtreme BCM5719 Gigabit Ethernet PCIe 1657' if=ens3f3 drv=tg3 unused=igb_uio

$ sudo ./testpmd -l 0-3 -n 4 -- -i --portmask=0x1 --nb-cores=2
EAL: Detected 16 lcore(s)
EAL: No free hugepages reported in hugepages-1048576kB
EAL: Probing VFIO support...
EAL: PCI device 0000:02:00.0 on NUMA socket 0
EAL: probe driver: 8086:1521 net_e1000_igb
EAL: PCI device 0000:02:00.1 on NUMA socket 0
EAL: probe driver: 8086:1521 net_e1000_igb
EAL: PCI device 0000:09:00.0 on NUMA socket 0
EAL: probe driver: 8086:1572 net_i40e
<< HARD LOCK-UP >>

$ ./dpdk-devbind.py --status-dev net

Network devices using DPDK-compatible driver
============================================
0000:09:00.0 'Ethernet Controller X710 for 10GbE SFP+ 1572' drv=vfio-pci unused=igb_uio
0000:09:00.1 'Ethernet Controller X710 for 10GbE SFP+ 1572' drv=vfio-pci unused=igb_uio

Network devices using kernel driver
===================================
0000:02:00.0 'I350 Gigabit Network Connection 1521' if=eno1 drv=igb unused=igb_uio,vfio-pci *Active*
0000:02:00.1 'I350 Gigabit Network Connection 1521' if=eno2 drv=igb unused=igb_uio,vfio-pci
0000:06:00.0 'NetXtreme BCM5719 Gigabit Ethernet PCIe 1657' if=ens3f0 drv=tg3 unused=igb_uio,vfio-pci
0000:06:00.1 'NetXtreme BCM5719 Gigabit Ethernet PCIe 1657' if=ens3f1 drv=tg3 unused=igb_uio,vfio-pci
0000:06:00.2 'NetXtreme BCM5719 Gigabit Ethernet PCIe 1657' if=ens3f2 drv=tg3 unused=igb_uio,vfio-pci *Active*
0000:06:00.3 'NetXtreme BCM5719 Gigabit Ethernet PCIe 1657' if=ens3f3 drv=tg3 unused=igb_uio,vfio-pci

$ sudo ./testpmd -l 0-3 -n 4 -- -i --portmask=0x1 --nb-cores=2
EAL: Detected 16 lcore(s)
EAL: No free hugepages reported in hugepages-1048576kB
EAL: Probing VFIO support...
EAL: VFIO support initialized
EAL: PCI device 0000:02:00.0 on NUMA socket 0
EAL: probe driver: 8086:1521 net_e1000_igb
EAL: PCI device 0000:02:00.1 on NUMA socket 0
EAL: probe driver: 8086:1521 net_e1000_igb
EAL: PCI device 0000:09:00.0 on NUMA socket 0
EAL: probe driver: 8086:1572 net_i40e
EAL: 0000:09:00.0 failed to select IOMMU type
EAL: Requested device 0000:09:00.0 cannot be used
EAL: PCI device 0000:09:00.1 on NUMA socket 0
EAL: probe driver: 8086:1572 net_i40e
EAL: 0000:09:00.1 failed to select IOMMU type
EAL: Requested device 0000:09:00.1 cannot be used
EAL: No probed ethernet devices
Interactive-mode selected
USER1: create a new mbuf pool : n=171456, size=2176, socket=0
Done

On 14 November 2017 at 06:29, Rosen, Rami wrote:

Second, I think the root cause for not finding the ports is around this message ("failed to select IOMMU type") in:
...
EAL: probe driver: 8086:1572 net_i40e
EAL: 0000:09:00.0 failed to select IOMMU type
EAL: Requested device 0000:09:00.0 cannot be used
EAL: PCI device 0000:09:00.1 on NUMA socket 0
EAL: probe driver: 8086:1572 net_i40e
EAL: 0000:09:00.1 failed to select IOMMU type
EAL: Requested device 0000:09:00.1 cannot be used
...

What does "find /sys/kernel/iommu_groups/ -type l" give?
Could it be that 0000:09:00.0 and 0000:09:00.1 belong to an IOMMU group in which there are other devices?

To me (a layman) this looks correct, do you agree?

$ find /sys/kernel/iommu_groups/ -type l | grep 09
/sys/kernel/iommu_groups/36/devices/0000:09:00.0
/sys/kernel/iommu_groups/37/devices/0000:09:00.1

$ find /sys/kernel/iommu_groups/ -type l | grep -E "36|37"
/sys/kernel/iommu_groups/36/devices/0000:09:00.0
/sys/kernel/iommu_groups/37/devices/0000:09:00.1

Cheers,
James.
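
For anyone repeating the IOMMU group check above, here is a rough sketch of the same lookup as two shell functions. It matches the PCI address exactly instead of grepping for "36|37", which would also match any other path that happens to contain those digits. The function names and the optional sysfs-root argument are my own assumptions for illustration (the second argument exists only so the helpers can be tried against a fake directory tree); nothing here is part of DPDK or dpdk-devbind.py.

```shell
# Sketch only: helper names and the optional root argument are illustrative
# assumptions, not DPDK tooling.

# Print the IOMMU group number that contains the given PCI address.
# $1: full PCI address, e.g. 0000:09:00.0
# $2: sysfs root, defaults to /sys
iommu_group_of() {
    local addr="$1" root="${2:-/sys}" entry
    for entry in "$root"/kernel/iommu_groups/*/devices/"$addr"; do
        # If the glob matched nothing it stays literal and both tests fail.
        if [ -e "$entry" ] || [ -L "$entry" ]; then
            entry="${entry%/devices/*}"    # strip "/devices/<addr>"
            printf '%s\n' "${entry##*/}"   # keep only the group number
            return 0
        fi
    done
    return 1
}

# Print every device in the same IOMMU group as the given PCI address,
# one per line (the device itself is included in the output).
# $1: full PCI address, $2: sysfs root, defaults to /sys
iommu_group_peers() {
    local addr="$1" root="${2:-/sys}" group dev
    group="$(iommu_group_of "$addr" "$root")" || return 1
    for dev in "$root"/kernel/iommu_groups/"$group"/devices/*; do
        printf '%s\n' "${dev##*/}"
    done
}
```

Running "iommu_group_peers 0000:09:00.0" on the box in question should list only that one address if the port really is alone in its group, which is what the find output above suggests. A non-zero exit status means the address was not found in any group.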