From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <dev-bounces@dpdk.org>
Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124])
	by inbox.dpdk.org (Postfix) with ESMTP id 3AE4B45E3D;
	Fri,  6 Dec 2024 00:01:47 +0100 (CET)
Received: from mails.dpdk.org (localhost [127.0.0.1])
	by mails.dpdk.org (Postfix) with ESMTP id DAC204027D;
	Fri,  6 Dec 2024 00:01:46 +0100 (CET)
Received: from inbox.dpdk.org (inbox.dpdk.org [95.142.172.178])
 by mails.dpdk.org (Postfix) with ESMTP id A555D40267
 for <dev@dpdk.org>; Fri,  6 Dec 2024 00:01:44 +0100 (CET)
Received: by inbox.dpdk.org (Postfix, from userid 33)
 id 8E7D145E3E; Fri,  6 Dec 2024 00:01:44 +0100 (CET)
From: bugzilla@dpdk.org
To: dev@dpdk.org
Subject: [DPDK/ethdev Bug 1592] AF_PACKET PMD loops back packets on veth with
 tc
Date: Thu, 05 Dec 2024 23:01:44 +0000
X-Bugzilla-Reason: AssignedTo
X-Bugzilla-Type: new
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: DPDK
X-Bugzilla-Component: ethdev
X-Bugzilla-Version: unspecified
X-Bugzilla-Keywords: 
X-Bugzilla-Severity: normal
X-Bugzilla-Who: thea.rossman@cs.stanford.edu
X-Bugzilla-Status: UNCONFIRMED
X-Bugzilla-Resolution: 
X-Bugzilla-Priority: Normal
X-Bugzilla-Assigned-To: dev@dpdk.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags: 
X-Bugzilla-Changed-Fields: bug_id short_desc product version rep_platform
 op_sys bug_status bug_severity priority component assigned_to reporter
 target_milestone
Message-ID: <bug-1592-3@http.bugs.dpdk.org/>
Content-Type: multipart/alternative; boundary=17334397040.B3418AAf.322596
Content-Transfer-Encoding: 7bit
X-Bugzilla-URL: http://bugs.dpdk.org/
Auto-Submitted: auto-generated
X-Auto-Response-Suppress: All
MIME-Version: 1.0
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: DPDK patches and discussions <dev.dpdk.org>
List-Unsubscribe: <https://mails.dpdk.org/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://mails.dpdk.org/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <https://mails.dpdk.org/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
Errors-To: dev-bounces@dpdk.org


--17334397040.B3418AAf.322596
Date: Fri, 6 Dec 2024 00:01:44 +0100
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://bugs.dpdk.org/
Auto-Submitted: auto-generated
X-Auto-Response-Suppress: All

https://bugs.dpdk.org/show_bug.cgi?id=3D1592

            Bug ID: 1592
           Summary: AF_PACKET PMD loops back packets on veth with tc
           Product: DPDK
           Version: unspecified
          Hardware: All
                OS: All
            Status: UNCONFIRMED
          Severity: normal
          Priority: Normal
         Component: ethdev
          Assignee: dev@dpdk.org
          Reporter: thea.rossman@cs.stanford.edu
  Target Milestone: ---

Found using AF_PACKET PMD on veth with Linux TC via Mininet. Ubuntu 24.04.1
LTS.

High-level, I have a basic Mininet topology with `h1 <-> r1 <-> h2`. (Each =
link
represents a pair of veths.) `r1` is a transparent middlebox running the DP=
DK
`skeleton/basicfwd.c` example, which should forward packets through r1 (eth=
0 ->
eth1 and vice versa). I ping h1 <-> h2, expecting `r1` to perform this
bridging.=20

The DPDK basicfwd program was started with the vdev configuration:
`--vdev=3Deth_af_packet0,iface=3Dr1-eth0,qdisc_bypass=3D0
--vdev=3Deth_af_packet1,iface=3Dr1-eth1,qdisc_bypass=3D0 &`.

(I turn off qdisc_bypass because I'd like to use TC.)=20
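
For reference, a minimal sketch of how the two `--vdev` arguments above
can be assembled. `af_packet_vdev` and `basicfwd_vdev_args` are
hypothetical helpers, not part of DPDK; they only format the `iface` and
`qdisc_bypass` devargs already shown in this report.

```python
def af_packet_vdev(index, iface, qdisc_bypass):
    """Return one --vdev argument for the AF_PACKET PMD.

    Hypothetical helper: it only formats the devargs string
    used in this report and does not validate the interface.
    """
    name = "eth_af_packet{}".format(index)
    bypass = 1 if qdisc_bypass else 0
    return "--vdev={},iface={},qdisc_bypass={}".format(
        name, iface, bypass)


def basicfwd_vdev_args(ifaces, qdisc_bypass):
    """Return one --vdev argument per interface, in order."""
    return [af_packet_vdev(i, iface, qdisc_bypass)
            for i, iface in enumerate(ifaces)]


if __name__ == "__main__":
    for arg in basicfwd_vdev_args(["r1-eth0", "r1-eth1"], False):
        print(arg)
```

Passing `False` for `qdisc_bypass` gives the configuration used here, so
that transmitted packets go through the qdisc layer.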

When one of r1's veths has TC params (e.g., delay), this all works fine.=20

However, when both of r1's veths (r1-eth0 and r1-eth1) have TC params set u=
p,
traffic is looped forever. Here's an example output from h1:=20

```
...
64 bytes from 10.0.1.9: icmp_seq=3D1 ttl=3D64 time=3D161 ms (DUP!)
64 bytes from 10.0.1.9: icmp_seq=3D1 ttl=3D64 time=3D161 ms (DUP!)
64 bytes from 10.0.1.9: icmp_seq=3D1 ttl=3D64 time=3D161 ms (DUP!)
64 bytes from 10.0.1.9: icmp_seq=3D1 ttl=3D64 time=3D161 ms (DUP!)
64 bytes from 10.0.1.9: icmp_seq=3D1 ttl=3D64 time=3D161 ms (DUP!)
64 bytes from 10.0.1.9: icmp_seq=3D1 ttl=3D64 time=3D181 ms (DUP!)
... (continues perpetually until stopped)
```
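
To quantify the duplication rather than eyeball it, the ping output can
be summarized per sequence number. This is a small sketch; `count_dups`
is a hypothetical helper, and it assumes the `icmp_seq` field and the
trailing `(DUP!)` marker are formatted as in the excerpt above.

```python
import re
from collections import Counter

# Captures the sequence number of each reply line.
SEQ_RE = re.compile(r"icmp_seq=(\d+)")


def count_dups(ping_output):
    """Return {icmp_seq: number of replies flagged (DUP!)}."""
    dups = Counter()
    for line in ping_output.splitlines():
        match = SEQ_RE.search(line)
        # Only replies that ping itself marked as duplicates count.
        if match and line.rstrip().endswith("(DUP!)"):
            dups[int(match.group(1))] += 1
    return dict(dups)
```

Applied to the six reply lines above, it should report six duplicates for
sequence 1. A count that keeps growing for a single sequence number is
what distinguishes this loop from an occasional duplicate.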

Note that this issue only occurs if TC parameters (e.g., delay) are configu=
red
on both egress veths. If qdisc_bypass is disabled on both, but no actual TC
parameters are set, the duplication does not happen.=20

I took packet captures (tcpdump), and I also printed packets that the
application was actually seeing. It appears that when the middlebox writes a
packet to a socket, it immediately reads the same packet out from the same
socket.=20
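
One way to cross-check this outside DPDK is to open a separate AF_PACKET
socket on the same interface and look at the packet type the kernel
reports for each frame. The sketch below is a diagnostic aid, not the
PMD's code; `is_outgoing` and `sniff` are hypothetical names. Frames
tagged PACKET_OUTGOING are ones the kernel reports as transmitted by the
local host on that interface.

```python
import socket

# Value from <linux/if_packet.h>; the getattr fallback is only
# there so the sketch imports on non-Linux hosts.
PACKET_OUTGOING = getattr(socket, "PACKET_OUTGOING", 4)

PKTTYPE_NAMES = {
    0: "host",
    1: "broadcast",
    2: "multicast",
    3: "otherhost",
    4: "outgoing",
}


def is_outgoing(addr):
    """True if a recvfrom() address is tagged PACKET_OUTGOING.

    addr is the AF_PACKET tuple
    (ifname, proto, pkttype, hatype, hwaddr).
    """
    return addr[2] == PACKET_OUTGOING


def sniff(ifname, count):
    """Print the packet type of `count` frames seen on ifname.

    Needs CAP_NET_RAW, so run it as root inside r1's namespace.
    """
    eth_p_all = 0x0003  # receive every protocol
    sock = socket.socket(socket.AF_PACKET, socket.SOCK_RAW,
                         socket.htons(eth_p_all))
    with sock:
        sock.bind((ifname, 0))
        for _ in range(count):
            frame, addr = sock.recvfrom(65535)
            kind = PKTTYPE_NAMES.get(addr[2], "other")
            print(ifname, kind, len(frame), "bytes")
```

For example, `sniff("r1-eth1", 20)` run alongside basicfwd would show
whether the frames seen on r1-eth1 are tagged as outgoing.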

I also tried making small modifications in the DPDK driver to investigate
whether this could be a Linux or Mininet issue (e.g., packet mmap + TC?) vs=
. a
DPDK driver issue. I reduced the # of RX/TX queues to 1, and I also turned =
off
packet mmap (by removing the setsockopt calls in rte_eth_af_packet.c for
`PACKET_RX_RING` and `PACKET_TX_RING` -- not sure if there's anything else I
would need to do here?).=20=20

A basic python script for repro is below. This works fine when only one of =
the
`r1.cmd("tc qdisc add dev r1-ethX root netem delay 10ms")` commands is pres=
ent.
When both are added, the pings begin to loop.=20

```python
from mininet.net import Mininet
from mininet.link import TCLink
from mininet.cli import CLI

net =3D Mininet(controller=3DNone, link=3DTCLink)
# Add hosts in same subnet
h1 =3D net.addHost('h1', ip=3D'10.0.1.10/24', mac=3D'00:00:00:00:00:01')
h2 =3D net.addHost('h2', ip=3D'10.0.1.9/24', mac=3D'00:00:00:00:00:02')
r1 =3D net.addHost('r1')
net.addLink(r1, h1)
net.addLink(r1, h2)
net.build()

# Configure TC on egress=20
# (could also be done by adding `delay` param to `addLink` above)
h2.cmd("tc qdisc add dev h2-eth0 root netem delay 10ms")
h1.cmd("tc qdisc add dev h1-eth0 root netem delay 10ms")
r1.cmd("tc qdisc add dev r1-eth0 root netem delay 10ms")
r1.cmd("tc qdisc add dev r1-eth1 root netem delay 10ms")

r1.cmd('sudo /path/to/dpdk-24.07/examples/skeleton/build/basicfwd 0 1 '
       '--vdev=3Deth_af_packet0,iface=3Dr1-eth0,qdisc_bypass=3D0 '
       '--vdev=3Deth_af_packet1,iface=3Dr1-eth1,qdisc_bypass=3D0 &')

CLI(net)
net.stop()
```
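
To bisect which netem configuration triggers the loop, the script can be
run once per subset of r1's interfaces. The helpers below are a sketch;
`netem_cmd` and `netem_subsets` are hypothetical names, and the 10ms
delay is simply the value used in the script above.

```python
from itertools import combinations


def netem_cmd(dev, delay):
    """Return the tc command that adds a netem delay on dev."""
    return "tc qdisc add dev %s root netem delay %s" % (dev, delay)


def netem_subsets(devs):
    """Yield every subset of devs, smallest first."""
    for size in range(len(devs) + 1):
        for subset in combinations(devs, size):
            yield list(subset)


if __name__ == "__main__":
    for subset in netem_subsets(["r1-eth0", "r1-eth1"]):
        print("case:", subset or "no netem on r1")
        for dev in subset:
            print("   ", netem_cmd(dev, "10ms"))
```

This prints four cases. Of these, only the last one, with netem on both
r1-eth0 and r1-eth1, produces the looping described in this report.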

I think there is a bug here, though I haven't been able to figure out for s=
ure
what might be going on or confirm 100% whether the issue is in the PMD.


--17334397040.B3418AAf.322596
Date: Fri, 6 Dec 2024 00:01:44 +0100
MIME-Version: 1.0
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://bugs.dpdk.org/
Auto-Submitted: auto-generated
X-Auto-Response-Suppress: All

<html>
    <head>
      <base href=3D"https://bugs.dpdk.org/">
    </head>
    <body><table border=3D"1" cellspacing=3D"0" cellpadding=3D"8" class=3D"=
bz_new_table">
        <tr>
          <th>Bug ID</th>
          <td><a class=3D"bz_bug_link=20
          bz_status_UNCONFIRMED "
   title=3D"UNCONFIRMED - AF_PACKET PMD loops back packets on veth with tc"
   href=3D"https://bugs.dpdk.org/show_bug.cgi?id=3D1592">1592</a>
          </td>
        </tr>

        <tr>
          <th>Summary</th>
          <td>AF_PACKET PMD loops back packets on veth with tc
          </td>
        </tr>

        <tr>
          <th>Product</th>
          <td>DPDK
          </td>
        </tr>

        <tr>
          <th>Version</th>
          <td>unspecified
          </td>
        </tr>

        <tr>
          <th>Hardware</th>
          <td>All
          </td>
        </tr>

        <tr>
          <th>OS</th>
          <td>All
          </td>
        </tr>

        <tr>
          <th>Status</th>
          <td>UNCONFIRMED
          </td>
        </tr>

        <tr>
          <th>Severity</th>
          <td>normal
          </td>
        </tr>

        <tr>
          <th>Priority</th>
          <td>Normal
          </td>
        </tr>

        <tr>
          <th>Component</th>
          <td>ethdev
          </td>
        </tr>

        <tr>
          <th>Assignee</th>
          <td>dev&#64;dpdk.org
          </td>
        </tr>

        <tr>
          <th>Reporter</th>
          <td>thea.rossman&#64;cs.stanford.edu
          </td>
        </tr>

        <tr>
          <th>Target Milestone</th>
          <td>---
          </td>
        </tr></table>
      <p>
        <div class=3D"bz_comment_block">
          <pre class=3D"bz_comment_text">Found using AF_PACKET PMD on veth =
with Linux TC via Mininet. Ubuntu 24.04.1
LTS.

High-level, I have a basic Mininet topology with `h1 &lt;-&gt; r1 &lt;-&gt;=
 h2`. (Each link
represents a pair of veths.) `r1` is a transparent middlebox running the DP=
DK
`skeleton/basicfwd.c` example, which should forward packets through r1 (eth=
0 -&gt;
eth1 and vice versa). I ping h1 &lt;-&gt; h2, expecting `r1` to perform this
bridging.=20

The DPDK basicfwd program was started with the vdev configuration:
`--vdev=3Deth_af_packet0,iface=3Dr1-eth0,qdisc_bypass=3D0
--vdev=3Deth_af_packet1,iface=3Dr1-eth1,qdisc_bypass=3D0 &amp;`.

(I turn off qdisc_bypass because I'd like to use TC.)=20

When one of r1's veths has TC params (e.g., delay), this all works fine.=20

However, when both of r1's veths (r1-eth0 and r1-eth1) have TC params set u=
p,
traffic is looped forever. Here's an example output from h1:=20

```
...
64 bytes from 10.0.1.9: icmp_seq=3D1 ttl=3D64 time=3D161 ms (DUP!)
64 bytes from 10.0.1.9: icmp_seq=3D1 ttl=3D64 time=3D161 ms (DUP!)
64 bytes from 10.0.1.9: icmp_seq=3D1 ttl=3D64 time=3D161 ms (DUP!)
64 bytes from 10.0.1.9: icmp_seq=3D1 ttl=3D64 time=3D161 ms (DUP!)
64 bytes from 10.0.1.9: icmp_seq=3D1 ttl=3D64 time=3D161 ms (DUP!)
64 bytes from 10.0.1.9: icmp_seq=3D1 ttl=3D64 time=3D181 ms (DUP!)
... (continues perpetually until stopped)
```

Note that this issue only occurs if TC parameters (e.g., delay) are configu=
red
on both egress veths. If qdisc_bypass is disabled on both, but no actual TC
parameters are set, the duplication does not happen.=20

I took packet captures (tcpdump), and I also printed packets that the
application was actually seeing. It appears that when the middlebox writes a
packet to a socket, it immediately reads the same packet out from the same
socket.=20

I also tried making small modifications in the DPDK driver to investigate
whether this could be a Linux or Mininet issue (e.g., packet mmap + TC?) vs=
. a
DPDK driver issue. I reduced the # of RX/TX queues to 1, and I also turned =
off
packet mmap (by removing the setsockopt calls in rte_eth_af_packet.c for
`PACKET_RX_RING` and `PACKET_TX_RING` -- not sure if there's anything else I
would need to do here?).=20=20

A basic python script for repro is below. This works fine when only one of =
the
`r1.cmd(&quot;tc qdisc add dev r1-ethX root netem delay 10ms&quot;)` comman=
ds is present.
When both are added, the pings begin to loop.=20

```python
from mininet.net import Mininet
from mininet.link import TCLink
from mininet.cli import CLI

net =3D Mininet(controller=3DNone, link=3DTCLink)
# Add hosts in same subnet
h1 =3D net.addHost('h1', ip=3D'10.0.1.10/24', mac=3D'00:00:00:00:00:01')
h2 =3D net.addHost('h2', ip=3D'10.0.1.9/24', mac=3D'00:00:00:00:00:02')
r1 =3D net.addHost('r1')
net.addLink(r1, h1)
net.addLink(r1, h2)
net.build()

# Configure TC on egress=20
# (could also be done by adding `delay` param to `addLink` above)
h2.cmd(&quot;tc qdisc add dev h2-eth0 root netem delay 10ms&quot;)
h1.cmd(&quot;tc qdisc add dev h1-eth0 root netem delay 10ms&quot;)
r1.cmd(&quot;tc qdisc add dev r1-eth0 root netem delay 10ms&quot;)
r1.cmd(&quot;tc qdisc add dev r1-eth1 root netem delay 10ms&quot;)

r1.cmd('sudo /path/to/dpdk-24.07/examples/skeleton/build/basicfwd 0 1 '
       '--vdev=3Deth_af_packet0,iface=3Dr1-eth0,qdisc_bypass=3D0 '
       '--vdev=3Deth_af_packet1,iface=3Dr1-eth1,qdisc_bypass=3D0 &amp;')

CLI(net)
net.stop()
```

I think there is a bug here, though I haven't been able to figure out for s=
ure
what might be going on or confirm 100% whether the issue is in the PMD.
          </pre>
        </div>
      </p>


    </body>
</html>=

--17334397040.B3418AAf.322596--