From mboxrd@z Thu Jan 1 00:00:00 1970
From: bugzilla@dpdk.org
To: dev@dpdk.org
Subject: [DPDK/ethdev Bug 1592] AF_PACKET PMD loops back packets on veth with tc
Date: Thu, 05 Dec 2024 23:01:44 +0000
Message-ID: <bug-1592-3@http.bugs.dpdk.org/>
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary=17334397040.B3418AAf.322596
List-Id: DPDK patches and discussions <dev.dpdk.org>
List-Archive: <http://mails.dpdk.org/archives/dev/>

--17334397040.B3418AAf.322596
Content-Type: text/plain; charset=UTF-8

https://bugs.dpdk.org/show_bug.cgi?id=1592

            Bug ID: 1592
           Summary: AF_PACKET PMD loops back packets on veth with tc
           Product: DPDK
           Version: unspecified
          Hardware: All
                OS: All
            Status: UNCONFIRMED
          Severity: normal
          Priority: Normal
         Component: ethdev
          Assignee: dev@dpdk.org
          Reporter: thea.rossman@cs.stanford.edu
  Target Milestone: ---

Found using AF_PACKET PMD on veth with Linux TC via Mininet. Ubuntu 24.04.1 LTS.

At a high level, I have a basic Mininet topology with `h1 <-> r1 <-> h2`. (Each link represents a pair of veths.) `r1` is a transparent middlebox running the DPDK `skeleton/basicfwd.c` example, which should forward packets through r1 (eth0 -> eth1 and vice versa). I ping h1 <-> h2, expecting `r1` to perform this bridging.

The DPDK basicfwd program was started with the vdev configuration:
`--vdev=eth_af_packet0,iface=r1-eth0,qdisc_bypass=0 --vdev=eth_af_packet1,iface=r1-eth1,qdisc_bypass=0 &`.
(I turn off qdisc_bypass because I'd like to use TC.)

When only one of r1's veths has TC params (e.g., delay), this all works fine.

However, when both of r1's veths (r1-eth0 and r1-eth1) have TC params set up, traffic is looped forever. Here's an example output from h1:

```
...
64 bytes from 10.0.1.9: icmp_seq=1 ttl=64 time=161 ms (DUP!)
64 bytes from 10.0.1.9: icmp_seq=1 ttl=64 time=161 ms (DUP!)
64 bytes from 10.0.1.9: icmp_seq=1 ttl=64 time=161 ms (DUP!)
64 bytes from 10.0.1.9: icmp_seq=1 ttl=64 time=161 ms (DUP!)
64 bytes from 10.0.1.9: icmp_seq=1 ttl=64 time=161 ms (DUP!)
64 bytes from 10.0.1.9: icmp_seq=1 ttl=64 time=181 ms (DUP!)
...
(continues perpetually until stopped)
```

Note that this issue only occurs if TC parameters (e.g., delay) are configured on both egress veths. If qdisc_bypass is disabled on both, but no actual TC parameters are set, the duplication does not happen.

I took packet captures (tcpdump), and I also printed the packets that the application was actually seeing. It appears that when the middlebox writes a packet to a socket, it immediately reads the same packet back out of the same socket.

I also tried making small modifications in the DPDK driver to investigate whether this could be a Linux or Mininet issue (e.g., packet mmap + TC?) vs. a DPDK driver issue. I reduced the number of RX/TX queues to 1, and I also turned off packet mmap (by removing the setsockopt calls in rte_eth_af_packet.c for `PACKET_RX_RING` and `PACKET_TX_RING` -- not sure if there's anything else I would need to do here?).

A basic Python script for repro is below. This works fine when only one of the `r1.cmd("tc qdisc add dev r1-ethX root netem delay 10ms")` commands is present.
When both are added, the pings begin to loop.

```python
from mininet.net import Mininet
from mininet.link import TCLink
from mininet.cli import CLI

net = Mininet(controller=None, link=TCLink)

# Add hosts in same subnet
h1 = net.addHost('h1', ip='10.0.1.10/24', mac='00:00:00:00:00:01')
h2 = net.addHost('h2', ip='10.0.1.9/24', mac='00:00:00:00:00:02')
r1 = net.addHost('r1')

net.addLink(r1, h1)
net.addLink(r1, h2)

net.build()

# Configure TC on egress
# (could also be done by adding `delay` param to `addLink` above)
h2.cmd("tc qdisc add dev h2-eth0 root netem delay 10ms")
h1.cmd("tc qdisc add dev h1-eth0 root netem delay 10ms")
r1.cmd("tc qdisc add dev r1-eth0 root netem delay 10ms")
r1.cmd("tc qdisc add dev r1-eth1 root netem delay 10ms")

# Start basicfwd in the background with one AF_PACKET vdev per veth
r1.cmd('sudo /path/to/dpdk-24.07/examples/skeleton/build/basicfwd 0 1 '
       '--vdev=eth_af_packet0,iface=r1-eth0,qdisc_bypass=0 '
       '--vdev=eth_af_packet1,iface=r1-eth1,qdisc_bypass=0 &')

CLI(net)

net.stop()
```

I think there is a bug here, though I haven't been able to figure out for sure what might be going on or confirm 100% whether the issue is in the PMD.
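
One possible way to narrow down the "reads the same packet out from the same socket" observation without DPDK in the picture is sketched below. It opens a plain AF_PACKET socket on one interface, sends a single marked frame, and records the `pkttype` of any copy that the same socket reads back. This is only a sketch under assumptions: the helper names, the `MARKER` payload, and the use of the local experimental EtherType 0x88B5 are illustrative choices, not anything from DPDK or from this report. It also relies on a regular packet socket transmitting through the qdisc layer unless `PACKET_QDISC_BYPASS` is set. It needs CAP_NET_RAW (e.g., run as root inside the `r1` namespace).

```python
import socket
import struct
import time

ETH_P_ALL = 0x0003           # <linux/if_ether.h>: receive every protocol
ETH_P_EXPERIMENTAL = 0x88B5  # IEEE 802 local experimental EtherType (assumed choice)
PACKET_OUTGOING = 4          # <linux/if_packet.h>: frame seen on the transmit path
MARKER = b"af-packet-echo-probe"  # hypothetical payload used to recognise the probe


def build_probe_frame(src_mac, dst_mac=b"\xff" * 6):
    """Return a minimum-size (60-byte) Ethernet frame carrying MARKER."""
    header = struct.pack("!6s6sH", dst_mac, src_mac, ETH_P_EXPERIMENTAL)
    return (header + MARKER).ljust(60, b"\x00")


def is_probe(frame):
    """True if the frame payload starts with MARKER."""
    return frame[14:14 + len(MARKER)] == MARKER


def probe_self_echo(iface, timeout=1.0):
    """Send one probe on iface; return the pkttype of each copy read back
    by the SAME socket within `timeout` seconds."""
    seen = []
    with socket.socket(socket.AF_PACKET, socket.SOCK_RAW,
                       socket.htons(ETH_P_ALL)) as s:
        s.bind((iface, 0))
        src_mac = s.getsockname()[4]  # hardware address of the bound interface
        s.send(build_probe_frame(src_mac))
        deadline = time.monotonic() + timeout
        while True:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                break
            s.settimeout(remaining)
            try:
                frame, addr = s.recvfrom(65535)
            except socket.timeout:
                break
            if is_probe(frame):
                seen.append(addr[2])  # addr = (ifname, proto, pkttype, hatype, hwaddr)
    return seen
```

Calling `probe_self_echo("r1-eth0")` once with the netem qdisc installed and once without, and comparing the returned lists, might help separate a kernel-level effect from a PMD-level one. If the probe comes back with `pkttype == PACKET_OUTGOING` only when netem is present, that would suggest looking at how packet sockets and the qdisc interact rather than at the PMD alone. That interpretation is a hypothesis, not something confirmed here.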
-- 
You are receiving this mail because:
You are the assignee for the bug.

--17334397040.B3418AAf.322596--