From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124])
by inbox.dpdk.org (Postfix) with ESMTP id B181343BCF;
Fri, 1 Mar 2024 14:22:57 +0100 (CET)
Received: from mails.dpdk.org (localhost [127.0.0.1])
by mails.dpdk.org (Postfix) with ESMTP id 7B1464335A;
Fri, 1 Mar 2024 14:22:57 +0100 (CET)
Received: from inbox.dpdk.org (inbox.dpdk.org [95.142.172.178])
by mails.dpdk.org (Postfix) with ESMTP id 9BA444026C
for ; Fri, 1 Mar 2024 14:22:55 +0100 (CET)
Received: by inbox.dpdk.org (Postfix, from userid 33)
id 956BE43BD1; Fri, 1 Mar 2024 14:22:55 +0100 (CET)
From: bugzilla@dpdk.org
To: dev@dpdk.org
Subject: [DPDK/examples Bug 1391] examples/l3fwd: in event-mode
hash.txadapter.txq is not always updated
Date: Fri, 01 Mar 2024 13:22:55 +0000
X-Bugzilla-Reason: AssignedTo
X-Bugzilla-Type: new
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: DPDK
X-Bugzilla-Component: examples
X-Bugzilla-Version: unspecified
X-Bugzilla-Keywords:
X-Bugzilla-Severity: normal
X-Bugzilla-Who: konstantin.v.ananyev@yandex.ru
X-Bugzilla-Status: UNCONFIRMED
X-Bugzilla-Resolution:
X-Bugzilla-Priority: Normal
X-Bugzilla-Assigned-To: dev@dpdk.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields: bug_id short_desc product version rep_platform
op_sys bug_status bug_severity priority component assigned_to reporter cc
target_milestone
Message-ID:
Content-Type: multipart/alternative; boundary=17092993750.8CbBa302.656936
Content-Transfer-Encoding: 7bit
X-Bugzilla-URL: http://bugs.dpdk.org/
Auto-Submitted: auto-generated
X-Auto-Response-Suppress: All
MIME-Version: 1.0
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: DPDK patches and discussions
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: dev-bounces@dpdk.org
--17092993750.8CbBa302.656936
Date: Fri, 1 Mar 2024 14:22:55 +0100
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://bugs.dpdk.org/
Auto-Submitted: auto-generated
X-Auto-Response-Suppress: All
https://bugs.dpdk.org/show_bug.cgi?id=1391
Bug ID: 1391
Summary: examples/l3fwd: in event-mode hash.txadapter.txq is
not always updated
Product: DPDK
Version: unspecified
Hardware: All
OS: All
Status: UNCONFIRMED
Severity: normal
Priority: Normal
Component: examples
Assignee: dev@dpdk.org
Reporter: konstantin.v.ananyev@yandex.ru
CC: pbhagavatula@marvell.com
Target Milestone: ---
Reproducible with the latest main branch.
l3fwd in event mode with the SW eventdev on mlx5 PMDs can crash:
./dpdk-l3fwd --lcores=49,51,53,55,57 -n 6 \
    -a ca:00.0 -a ca:00.1 -a cb:00.0 -a cb:00.1 \
    -s 0x8000000000000 --vdev event_sw0 -- \
    -L -P -p f --rx-queue-size 1024 --tx-queue-size 1024 \
    --mode eventdev --eventq-sched=ordered \
    --rule_ipv4=test/l3fwd_lpm_v4_u1.cfg --rule_ipv6=test/l3fwd_lpm_v6_u1.cfg \
    --no-numa
Thread 4 "dpdk-worker51" received signal SIGSEGV, Segmentation fault.
0x000000000135d27f in rte_eth_tx_buffer (tx_pkt=0x17f3ea780, buffer=0x10,
    queue_id=43, port_id=1) at ../lib/ethdev/rte_ethdev.h:6637
6637            buffer->pkts[buffer->length++] = tx_pkt;
(gdb) bt
#0  0x000000000135d27f in rte_eth_tx_buffer (tx_pkt=0x17f3ea780, buffer=0x10,
    queue_id=43, port_id=1) at ../lib/ethdev/rte_ethdev.h:6637
#1  txa_service_tx (txa=0x11f89959c0, ev=0x7ffff2f23e10, n=16)
    at ../lib/eventdev/rte_event_eth_tx_adapter.c:631
#2  0x000000000135d3ef in txa_service_func (args=0x11f89959c0)
    at ../lib/eventdev/rte_event_eth_tx_adapter.c:666
#3  0x00000000015d30e1 in service_runner_do_callback (s=0x11ffffe100,
    cs=0x11fffe8500, service_idx=2) at ../lib/eal/common/rte_service.c:405
#4  0x00000000015d3429 in service_run (i=2, cs=0x11fffe8500, service_mask=7,
    s=0x11ffffe100, serialize_mt_unsafe=1)
    at ../lib/eal/common/rte_service.c:441
#5  0x00000000015d363f in service_runner_func (arg=0x0)
    at ../lib/eal/common/rte_service.c:513
#6  0x00000000015c12c1 in eal_thread_loop (arg=0x33)
    at ../lib/eal/common/eal_common_thread.c:212
#7  0x00000000015e1b98 in eal_worker_thread_loop (arg=0x33)
    at ../lib/eal/linux/eal.c:916
#8  0x00007ffff5ff76ea in start_thread () from /lib64/libpthread.so.0
#9  0x00007ffff5d0fa8f in clone () from /lib64/libc.so.6
Obviously 'queue_id=43' is wrong here, and it crashed while trying to access
an unconfigured TX queue.
What is happening here is a coincidence of two different problems:
1. The EVENT framework silently and unconditionally re-uses mbuf::hash.fdir
for its own purposes:
struct {
        uint32_t reserved1;
        uint16_t reserved2;
        uint16_t txq;
        /**< The event eth Tx adapter uses this field
         * to store Tx queue id.
         * @see rte_event_eth_tx_adapter_txq_set()
         */
} txadapter; /**< Eventdev ethdev Tx adapter */
In particular, txa_service_tx() expects hash.txadapter.txq to contain a valid
TX queue index, but l3fwd does not always set it.
Usually that is harmless for this particular app: only queue 0 is in use, and
l3fwd does not configure the PMDs to overwrite the mbuf::hash.fdir.hi value
(RTE_MBUF_F_RX_FDIR).
But if, for whatever reason, a PMD does overwrite mbuf::hash.fdir.hi with some
non-zero value, then we are in trouble.
2. That is exactly what is happening here: the mlx5 driver sometimes
superfluously updates mbuf::hash.fdir.hi.
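The overlap described in point 1 can be illustrated with a small stand-alone C
model. This is only a sketch: `union model_hash` and `txq_after_fdir_write()`
are hypothetical names that mirror just the fdir and txadapter views discussed
above, not the real struct rte_mbuf, and the value passed in is an assumption,
since this report does not say what the mlx5 driver actually wrote.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the mbuf hash union; NOT the real struct rte_mbuf.
 * Only the two views discussed in this report are modelled. */
union model_hash {
	struct {
		uint32_t lo;
		uint32_t hi;	/* may be written by a PMD on Rx */
	} fdir;
	struct {
		uint32_t reserved1;
		uint16_t reserved2;
		uint16_t txq;	/* read by the Tx adapter */
	} txadapter;
};

/* txq lives in the last two bytes of the storage used by fdir.hi. */
static_assert(offsetof(union model_hash, fdir.hi) == 4, "fdir.hi offset");
static_assert(offsetof(union model_hash, txadapter.txq) == 6, "txq offset");

/* Return the txq value the Tx adapter would see after a PMD stored fdir_hi
 * and nobody set txq afterwards. */
static uint16_t
txq_after_fdir_write(uint32_t fdir_hi)
{
	union model_hash h = { .fdir = { .lo = 0, .hi = 0 } };

	h.fdir.hi = fdir_hi;
	return h.txadapter.txq;
}
```

For example, txq_after_fdir_write(0x002B002B) returns 43 regardless of host
byte order, because both 16-bit halves of the stored word are 0x002B. That
input was picked only to reproduce the queue_id seen in the backtrace.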
The fix I applied locally is straightforward: *always* set
hash.txadapter.txq to a proper value before calling rte_event_enqueue_burst().
See the diff at the end of this report for details.
Note that this is not a complete fix, as the same needs to be done for the
other code paths (em, fib, acl, etc.).
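The difference between the old and the new ordering can be sketched with a
simplified model. Everything here is hypothetical (struct model_mbuf,
MODEL_NB_TXQ, the model_* functions); it is not DPDK code and only mimics the
single decision that matters: whether txq is written on every path or only on
the direct-Tx path.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_NB_TXQ 1	/* assumption: only Tx queue 0 is configured */

/* Hypothetical stand-in for an mbuf: only the field the Tx adapter reads. */
struct model_mbuf {
	uint16_t txq;
};

/* Model of the adapter-side requirement: txq must name a configured queue. */
static bool
model_txq_is_valid(const struct model_mbuf *m)
{
	return m->txq < MODEL_NB_TXQ;
}

/* Worker before the change: txq is written only on the direct-Tx path,
 * so a stale value survives when events are forwarded to the Tx queue. */
static void
model_worker_old(struct model_mbuf *m, bool tx_direct)
{
	if (tx_direct)
		m->txq = 0;
}

/* Worker after the change: txq is written on every path before enqueue. */
static void
model_worker_new(struct model_mbuf *m, bool tx_direct)
{
	(void)tx_direct;	/* both paths now treat txq the same way */
	m->txq = 0;
}
```

Starting from a stale txq of 43 with tx_direct == false, model_worker_old()
leaves the mbuf invalid for the adapter, while model_worker_new() resets it to
queue 0.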
More generally, I don't understand why the EVENT framework keeps using
hash.fdir for its own purposes, especially in a completely silent and
unconditional way.
I think it would be much cleaner to switch to an mbuf dynfield/dynflag based
approach.
diff --git a/examples/l3fwd/l3fwd_lpm.c b/examples/l3fwd/l3fwd_lpm.c
index a484a33089..ef9838aef3 100644
--- a/examples/l3fwd/l3fwd_lpm.c
+++ b/examples/l3fwd/l3fwd_lpm.c
@@ -285,6 +285,8 @@ lpm_event_loop_single(struct l3fwd_event_resources *evt_rsrc,
                        continue;
                }

+               rte_event_eth_tx_adapter_txq_set(ev.mbuf, 0);
+
                if (flags & L3FWD_EVENT_TX_ENQ) {
                        ev.queue_id = tx_q_id;
                        ev.op = RTE_EVENT_OP_FORWARD;
@@ -295,7 +297,6 @@ lpm_event_loop_single(struct l3fwd_event_resources *evt_rsrc,
                }

                if (flags & L3FWD_EVENT_TX_DIRECT) {
-                       rte_event_eth_tx_adapter_txq_set(ev.mbuf, 0);
                        do {
                                enq = rte_event_eth_tx_adapter_enqueue(
                                        event_d_id, event_p_id, &ev, 1, 0);
@@ -344,11 +345,8 @@ lpm_event_loop_burst(struct l3fwd_event_resources *evt_rsrc,
                                events[i].op = RTE_EVENT_OP_FORWARD;
                        }

-                       if (flags & L3FWD_EVENT_TX_DIRECT)
-                               rte_event_eth_tx_adapter_txq_set(events[i].mbuf,
-                                                                0);
-
                        lpm_process_event_pkt(lconf, events[i].mbuf);
+                       rte_event_eth_tx_adapter_txq_set(events[i].mbuf, 0);
                }

                if (flags & L3FWD_EVENT_TX_ENQ) {
-- 
You are receiving this mail because:
You are the assignee for the bug.