From: bugzilla@dpdk.org
To: dev@dpdk.org
Date: Wed, 29 Jul 2020 14:26:53 +0000
Subject: [dpdk-dev] [Bug 518] Disabling a VF with XL710 still sees the
 traffic and the link is still high

https://bugs.dpdk.org/show_bug.cgi?id=518

            Bug ID: 518
           Summary: Disabling a VF with XL710 still sees the traffic and
                    the link is still high
           Product: DPDK
           Version: 18.05
          Hardware: x86
                OS: Linux
            Status: UNCONFIRMED
          Severity: critical
          Priority: Normal
         Component: ethdev
          Assignee: dev@dpdk.org
          Reporter: Muthurajan.Jayakumar@intel.com
  Target Milestone: ---

CONTENTS:
SUMMARY
DETAILED DESCRIPTION
STEPS TO REPRODUCE
FURTHER CLARIFICATION IN Q&A FORM

SUMMARY:

Actual results: Even though we disable the VF, we still see traffic going out.

Expected results: Disabling the VF should stop all the traffic, no matter what
the guest is doing with the PCI device.

The concern here is that, when an operator decides to disable a VF, no matter
what the guest is running or doing, it is expected that the traffic should
stop.

DETAILED DESCRIPTION:

A qemu-kvm instance configured with a PCI passthrough device from an i40e VF
binds the device with igb_uio. When we disable the VF state on the hypervisor,
the guest does not get a NIC down event, and the traffic is not stopped until
we stop trex completely; stopping the flows is not enough.

In one test setup, we pushed the traffic with trex on a rhel7.5 guest
(3.10.0-862.3.2.el7.x86_64), and in another setup the VNF application is based
on debian 8.11 (kernel 3.16.43) - both see the problem. In the rhel7.5 setup,
3.10.0-1127.10.1.el7.x86_64 runs on the hypervisor.

How often is this reproducible: ALL THE TIME.
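As an aside, here is a minimal sketch for mapping a passthrough VF back to its
index on the hypervisor, assuming the standard sysfs SR-IOV layout (the PF
name p1p1 matches the one used in the steps below):

~~~
# each VF of the PF appears as a virtfn<N> symlink under the PF's PCI
# device; <N> is the index used in "ip link set <pf> vf <N> ..." commands
ls -l /sys/class/net/p1p1/device/virtfn*
# the PF also lists its VFs (MAC, vlan, link-state) in its ip link output
ip link show p1p1
~~~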
STEPS TO REPRODUCE:

Using a trex/testpmd image, configure the flavor [3].

- trex requires that we have 2 NICs, so I have one ovs port for ssh'ing and
  2 SR-IOV ports.
- As stated previously, when the VF is disabled, we don't see any event in
  dmesg on the guest.
- On one tab (basically trex stateless mode, with server/client), start trex
  like this: ./t-rex-64 -i -c 1
- On the other tab, start the trex console like this: ./trex-console
- From the console, start traffic like this:
~~~
trex>start -f stl/lot_flows.py -m 100000pps --force -t fsize=64
~~~
- On one of our servers on the same VLAN, start looping tcpdump with a sleep
  like this (this is to throttle the output of tcpdump):
~~~
while true; do tcpdump -c 10 -enni p7p1 ether src fa:16:3e:2d:6b:35; sleep 1; done
~~~
- fa:16:3e:2d:6b:35 is one of the VFs' MACs. We see a LOT of random traffic
  like this [4], so that confirms that the test is working.
- On the compute, then disable the VFs like this:
~~~
# ip l set p1p2 vf 15 state disable
# ip l set p1p1 vf 15 state disable
~~~
- But tcpdump keeps returning traffic, no decrease at all.

Now the interesting bit (keep in mind that the VFs are still disabled at this
point on the hypervisor):

- We stopped the flows but left trex running. As soon as we stop the flows,
  tcpdump stops showing traffic, as expected.
- When we resume the flows, traffic resumes on tcpdump.
- When we kill trex, we don't see the traffic anymore in tcpdump, as expected.
- When we restart trex, with the VFs still disabled, it fails with a link
  down [5] this time.

Actual results: Even though we disable the VF, we still see traffic going out.

Expected results: Disabling the VF should stop all the traffic, no matter what
the guest is doing with the PCI device.

Additional info:

[1]
~~~
cd dpdk/x86_64-native-linuxapp-gcc/kmod/
modprobe uio
insmod igb_uio.ko
cd ~/dpdk/usertools/
./dpdk-devbind.py -b igb_uio 0000:00:05.0
./dpdk-devbind.py -b igb_uio 0000:00:06.0
cd ~/v2.28
vi /etc/trex_cfg.yaml
./t-rex-64 -i -c 1
~~~

[2]
~~~
- port_limit: 2
  version: 2
  interfaces: ['00:05.0', '00:06.0']
  port_info:
    - dest_mac: f2:12:23:a7:dc:8c
      src_mac: fa:16:3e:2d:6b:35
      ip: 1.1.1.1
      default_gw: 2.2.2.2
    - dest_mac: f2:12:23:a7:dc:8c
      src_mac: fa:16:3e:fd:2f:20
      ip: 2.2.2.2
      default_gw: 1.1.1.1
  platform:
    master_thread_id: 2
    latency_thread_id: 3
    dual_if:
      - socket: 0
        threads: [4,5,6,7,8,9]
~~~
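As an aside, a minimal sketch for confirming that the binding from [1] took
effect before trex is started; the --status option of dpdk-devbind.py is
standard, though the exact output grouping may differ between DPDK versions:

~~~
# both VFs from [1] should be listed under "Network devices using
# DPDK-compatible driver" with drv=igb_uio
cd ~/dpdk/usertools/
./dpdk-devbind.py --status
~~~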
[3]
~~~
[stack@undercloud-0 ~]$ openstack flavor show 0d42bf3d-c486-4b97-9a10-0d2dbde68681
+----------------------------+------------------------------------------------+
| Field                      | Value                                          |
+----------------------------+------------------------------------------------+
| OS-FLV-DISABLED:disabled   | False                                          |
| OS-FLV-EXT-DATA:ephemeral  | 0                                              |
| access_project_ids         | None                                           |
| disk                       | 10                                             |
| id                         | 0d42bf3d-c486-4b97-9a10-0d2dbde68681           |
| name                       | ess-large-pinned                               |
| os-flavor-access:is_public | True                                           |
| properties                 | aggregate_instance_extra_specs:sriov='true',   |
|                            | hw:cpu_policy='dedicated',                     |
|                            | hw:mem_page_size='large', hw:numa_nodes='2'    |
| ram                        | 16384                                          |
| rxtx_factor                | 1.0                                            |
| swap                       |                                                |
| vcpus                      | 10                                             |
+----------------------------+------------------------------------------------+
~~~

[4]
~~~
15:47:29.321093 fa:16:3e:2d:6b:35 > f2:12:23:a7:dc:8c, ethertype 802.1Q
(0x8100), length 64: vlan 1170, p 0, ethertype IPv4, 17.47.251.102.1026 >
192.168.6.115.15: UDP, length 18
15:47:29.321094 fa:16:3e:2d:6b:35 > f2:12:23:a7:dc:8c, ethertype 802.1Q
(0x8100), length 64: vlan 1170, p 0, ethertype IPv4, 17.3.49.80.1026 >
192.168.61.161.15: UDP, length 18
15:47:29.321095 fa:16:3e:2d:6b:35 > f2:12:23:a7:dc:8c, ethertype 802.1Q
(0x8100), length 64: vlan 1170, p 0, ethertype IPv4, 16.25.104.29.1026 >
192.168.155.125.15: UDP, length 18
15:47:29.321096 fa:16:3e:2d:6b:35 > f2:12:23:a7:dc:8c, ethertype 802.1Q
(0x8100), length 64: vlan 1170, p 0, ethertype IPv4, 17.3.49.80.1026 >
192.168.61.161.15: UDP, length 18
15:47:29.321097 fa:16:3e:2d:6b:35 > f2:12:23:a7:dc:8c, ethertype 802.1Q
(0x8100), length 64: vlan 1170, p 0, ethertype IPv4, 16.1.100.207.1026 >
192.168.72.36.15: UDP, length 18
15:47:29.321098 fa:16:3e:2d:6b:35 > f2:12:23:a7:dc:8c, ethertype 802.1Q
(0x8100), length 64: vlan 1170, p 0, ethertype IPv4, 16.25.104.29.1026 >
192.168.155.125.15: UDP, length 18
15:47:29.321099 fa:16:3e:2d:6b:35 > f2:12:23:a7:dc:8c, ethertype 802.1Q
(0x8100), length 64: vlan 1170, p 0, ethertype IPv4, 17.3.49.80.1026 >
192.168.61.161.15: UDP, length 18
15:47:29.321099 fa:16:3e:2d:6b:35 > f2:12:23:a7:dc:8c, ethertype 802.1Q
(0x8100), length 64: vlan 1170, p 0, ethertype IPv4, 16.25.104.29.1026 >
192.168.155.125.15: UDP, length 18
15:47:29.321101 fa:16:3e:2d:6b:35 > f2:12:23:a7:dc:8c, ethertype 802.1Q
(0x8100), length 64: vlan 1170, p 0, ethertype IPv4, 16.1.100.207.1026 >
192.168.72.36.15: UDP, length 18
15:47:29.321101 fa:16:3e:2d:6b:35 > f2:12:23:a7:dc:8c, ethertype 802.1Q
(0x8100), length 64: vlan 1170, p 0, ethertype IPv4, 17.47.251.102.1026 >
192.168.6.115.15: UDP, length 18
~~~
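As an aside, a minimal sketch for double-checking on the hypervisor that the
disable commands from the steps above actually took effect (PF name and VF
index taken from this report):

~~~
# the vf 15 entry in the PF's ip link output should now report
# "link-state disable"
ip link show p1p1 | grep "vf 15"
~~~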
[5]
~~~
[root@ess-testpmd v2.28]# ./t-rex-64 -i -c 1
Killing Scapy server... Scapy server is killed
Starting Scapy server.... Scapy server is started
The ports are bound/configured.
Starting TRex v2.28 please wait ...
set driver name net_i40e_vf
driver capability : TCP_UDP_OFFLOAD
checksum-offload disabled by user
zmq publisher at: tcp://*:4500
Number of ports found: 2
wait 1 sec .
port : 0
------------
link : Link Down
promiscuous : 0

port : 1
------------
link : Link Down
promiscuous : 0

EAL: Error - exiting with code: 1
Cause: One of the links is down
~~~

FURTHER CLARIFICATION IN Q&A FORM:

The concern here is that, when an operator decides to disable a VF, no matter
what the guest is running or doing, it is expected that the traffic should
stop.

QUESTION 1:
What happens when you don't use DPDK? Does everything work as you expect? Is
it happening only with DPDK?

ANSWER 1:
The problem is only with UIO and DPDK. When UIO/DPDK are not used, it works
fine - as shown below in a good working condition:

~~~
:~ # uname -r
3.10.0-1127.el7.x86_64
:~ # ip link set ens3f0 vf 0 state disable
:~ # ip link set ens3f0 vf 0 state enable
~~~

Inside the guest, the VF link goes down/up as expected:

~~~
guest:~ # uname -r
3.10.0-1127.el7.x86_64
[  205.307255] iavf 0000:00:09.0 eth0: NIC Link is Down
[  209.378792] iavf 0000:00:09.0 eth0: NIC Link is Up 40 Gbps Full Duplex
~~~

However, incoming traffic can still be observed on the VF:

~~~
[  396.915807] iavf 0000:00:09.0 eth0: NIC Link is Down
root@rhel7:~ # tcpdump -ni eth0
[  410.880940] device eth0 entered promiscuous mode
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
08:54:38.060449 IP 172.16.30.46 > 172.16.62.94: ICMP echo request, id 46960,
seq 366, length 64
[  412.418969] iavf 0000:00:09.0: Entering promiscuous mode
[  412.423119] iavf 0000:00:09.0: Entering multicast promiscuous mode
08:54:39.084458 IP 172.16.30.46 > 172.16.62.94: ICMP echo request, id 46960,
seq 367, length 64
08:54:40.108415 IP 172.16.30.46 > 172.16.62.94: ICMP echo request, id 46960,
seq 368, length 64
~~~

QUESTION 2:
OK. The problem is seen only with DPDK/UIO in the guest and not otherwise in
the guest. In addition to seeing the packets, what is the link status inside
the guest?

ANSWER 2:
Not only do we see the outgoing packets, but the link never goes down inside
the VM. Also, DPDK isn't running on the host, but inside the VM. As far as
the host is concerned, it's only a standard SR-IOV VF passed to the VM. We'd
expect that, when the host disables the link state, no traffic should be
going out of or into the VM, and the link should appear as being down inside
the VM. It's a bit like setting a port to "shut" on a switch while the
traffic still flows in and out of that port.

QUESTION 3:
What is the behaviour when you use the i40evf driver in the guest - that is,
not using uio/dpdk in the guest?

ANSWER 3:
When we use the i40evf driver, the link does go down as well. This is only
happening with igb_uio.

QUESTION 4:
Have you looked at the commands to toggle the link-state of the VF? Please
find below the options for the VF link-state from the ip link man page:

- auto: a reflection of the PF link state (default)
- enable: lets the VF communicate with other VFs on this host even if the PF
  link state is down
- disable: causes the HW to drop any packets sent by the VF

Based on what "disable" means for the link-state of the VF, when the
link-state showed "disable" for the VF, did you see traffic coming into and
going out of the VF both ways, or one way for incoming traffic only?

ANSWER 4:
When we set the VF state to disabled, we do see incoming and outgoing traffic
on the network. Also, the instance never shows the link down event in dmesg.
This is the reason why we have opened this bugzilla.
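For reference, a minimal sketch of the three link-state options quoted in
QUESTION 4, applied to the PF/VF pair used in the reproduction steps
(p1p1, vf 15):

~~~
# auto: the VF link mirrors the PF link state (default)
ip link set p1p1 vf 15 state auto
# enable: the VF link is kept up even if the PF link is down
ip link set p1p1 vf 15 state enable
# disable: the HW is expected to drop packets sent by the VF
ip link set p1p1 vf 15 state disable
~~~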
QUESTION 5:
Can you please reproduce this without DPDK/UIO, so we can find out whether
the problem occurs in the non-DPDK guest case as well?

ANSWER 5:
The problem is seen only with DPDK/UIO in the guest - not otherwise. The
concern is that disabling the VF link state should cut the traffic, no matter
what driver the guest has loaded. There are some situations where disabling
traffic is crucial. We believe it's like not being able to set a port to
"shut" on a switch.

QUESTION 6:
Based on what you described, I think this new feature for the Intel i40e
driver might help with the situation. BTW, the patch is not upstream yet; it
is currently on Intel-Wired-LAN for public review. Please check it out and
let me know if you have any input:

i40e: Add support for a new feature: Total Port Shutdown
https://patchwork.ozlabs.org/project/intel-wired-lan/patch/20200617000555.15985-1-arkadiusz.kubalewski@intel.com/

PLEASE NOTE that this patch is targeted at the entire PF; it is not
VF-specific.

ANSWER 6:
The requirement is to have the possibility to address each VF individually -
not the whole PF. So this patch won't work for us.

QUESTION 7:
Have you tried this? Kindly refer to the i40e readme, where the following
commands are available related to this case.

How to disable the link state for a VF:
---------------------------------------
echo disable > /sys/class/net/<pf>/device/sriov/<vf>/link_state

The above is for sriov. For testing purposes, to isolate the problem, is it
possible to try the above command with sriov, to see whether you get what you
want? Kindly find related commands in addition:

How to set a VF to track the PF link state:
-------------------------------------------
echo AUTO > /sys/class/net/<pf>/device/sriov/<vf>/link_state

How to read the link status:
----------------------------
cat /sys/class/net/<pf>/device/sriov/<vf>/link_state

ANSWER 7:
Using "ip link set <pf> vf X state disable" is the same as echo'ing disable
into /sys/class/net, so it shows the same problem.

QUESTION 8:
Can you please let us know the s/w versions you are using for the following:

1) DPDK version
2) XL710 firmware version - ethtool -i will give the firmware version
3) Kernel PF driver version for XL710 - ethtool -i will give the kernel PF
   driver version
4) IAVF driver version for XL710

The reason for asking is that older s/w versions had a disconnect in the
communication between VF and PF regarding link status changes, hence we would
like to know the s/w version of each of the above components. Please provide
the above four pieces of information.
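A minimal sketch of how the requested versions could be collected, assuming
the PF is p1p1 on the hypervisor and the VF shows up as eth0 inside a
kernel-driver guest (ethtool -i and modinfo are standard; the interface names
are taken from this report):

~~~
# 1) DPDK version: recorded in the DPDK source tree used for the build
#    (trex bundles its own DPDK, so the trex release notes apply here)
# 2) + 3) XL710 firmware and kernel PF driver version, on the hypervisor:
ethtool -i p1p1            # driver, version and firmware-version lines
# 4) iavf/i40evf driver version, inside a kernel-driver guest:
ethtool -i eth0
modinfo iavf | grep -i ^version
~~~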