DPDK patches and discussions
* [dpdk-dev] [Bug 60] rte_event_port_unlink() causes subsequent events to end up in wrong port
@ 2018-06-04  7:21 bugzilla
  2018-06-04  8:20 ` Jerin Jacob
  0 siblings, 1 reply; 6+ messages in thread
From: bugzilla @ 2018-06-04  7:21 UTC
  To: dev

https://dpdk.org/tracker/show_bug.cgi?id=60

            Bug ID: 60
           Summary: rte_event_port_unlink() causes subsequent events to
                    end up in wrong port
           Product: DPDK
           Version: 17.11
          Hardware: x86
                OS: Linux
            Status: CONFIRMED
          Severity: major
          Priority: Normal
         Component: eventdev
          Assignee: dev@dpdk.org
          Reporter: matias.elo@nokia.com
  Target Milestone: ---

Created attachment 8
  --> https://dpdk.org/tracker/attachment.cgi?id=8&action=edit
Test application

I'm seeing some unexpected(?) behavior when calling rte_event_port_unlink()
with the SW eventdev driver (DPDK 17.11.2/18.02.1,
RTE_EVENT_MAX_QUEUES_PER_DEV=255). After calling rte_event_port_unlink(),
the enqueued events may end up either back at the unlinked port or at port
zero.

Scenario:

- Run SW eventdev on a service core
- Start eventdev with e.g. 16 ports. Each core will have a dedicated port.
- Create 1 atomic queue and link all active ports to it (some ports may not
be linked).
- Allocate some events and enqueue them to the created queue
- Next, each worker core does a number of scheduling rounds concurrently.
E.g.

uint64_t rx_events = 0;
while(rx_events < SCHED_ROUNDS) {
        num_deq = rte_event_dequeue_burst(dev_id, port_id, ev, 1, 0);

        if (num_deq) {
                rx_events++;
                rte_event_enqueue_burst(dev_id, port_id, ev, 1);
        }
}

- This works fine, but problems occur when doing cleanup after the first
loop finishes on some core.
E.g.

rte_event_port_unlink(dev_id, port_id, NULL, 0);

while(1) {
        num_deq = rte_event_dequeue_burst(dev_id, port_id, ev, 1, 0);

        if (num_deq == 0)
                break;

        rte_event_enqueue_burst(dev_id, port_id, ev, 1);
}

- The events enqueued in the cleanup loop will randomly end up either back at
the same port (which has already been unlinked) or at port zero, which is not
used (port_id is mapped from rte_lcore_id).

As far as I understand the eventdev API, an eventdev port shouldn't have to be
linked to the target queue for enqueue to work properly.

I've attached a simple test application for reproducing this issue.
# sudo ./eventdev --vdev event_sw0 -s 0x2

Below is an example rte_event_dev_dump() output when processing events with two
cores (ports 2 and 3). The rest of the ports are not linked at all, but events
still end up at port zero, stalling the system.


Regards,
Matias

EventDev todo-fix-name: ports 16, qids 1
        rx   908342
        drop 0
        tx   908342
        sched calls: 42577156
        sched cq/qid call: 43120490
        sched no IQ enq: 42122057
        sched no CQ enq: 42122064
        inflight 32, credits: 4064
  Port 0 
        rx   0  drop 0  tx   2  inflight 2
        Max New: 1024   Avg cycles PP: 0        Credits: 0
        Receive burst distribution:
                0:-nan% 
        rx ring used:    0      free: 4096
        cq ring used:    2      free:   14
  Port 1 
        rx   0  drop 0  tx   0  inflight 0
        Max New: 1024   Avg cycles PP: 0        Credits: 0
        Receive burst distribution:
                0:-nan% 
        rx ring used:    0      free: 4096
        cq ring used:    0      free:   16
  Port 2 
        rx   524292     drop 0  tx   524290     inflight 0
        Max New: 1024   Avg cycles PP: 190      Credits: 30
        Receive burst distribution:
                0:98% 1-4:1.82% 
        rx ring used:    0      free: 4096
        cq ring used:    0      free:   16
  Port 3 
        rx   384050     drop 0  tx   384050     inflight 0
        Max New: 1024   Avg cycles PP: 191      Credits: 0
        Receive burst distribution:
                0:100% 1-4:0.04% 
        rx ring used:    0      free: 4096
        cq ring used:    0      free:   16
...
  Port 15 
        rx   0  drop 0  tx   0  inflight 0
        Max New: 1024   Avg cycles PP: 0        Credits: 0
        Receive burst distribution:
                0:-nan% 
        rx ring used:    0      free: 4096
        cq ring used:    0      free:   16
  Queue 0 (Atomic)
        rx   908342     drop 0  tx   908342
        Per Port Stats:
          Port 0: Pkts: 2       Flows: 1
          Port 1: Pkts: 0       Flows: 0
          Port 2: Pkts: 524290  Flows: 0
          Port 3: Pkts: 384050  Flows: 0
          Port 4: Pkts: 0       Flows: 0
          Port 5: Pkts: 0       Flows: 0
          Port 6: Pkts: 0       Flows: 0
          Port 7: Pkts: 0       Flows: 0
          Port 8: Pkts: 0       Flows: 0
          Port 9: Pkts: 0       Flows: 0
          Port 10: Pkts: 0      Flows: 0
          Port 11: Pkts: 0      Flows: 0
          Port 12: Pkts: 0      Flows: 0
          Port 13: Pkts: 0      Flows: 0
          Port 14: Pkts: 0      Flows: 0
          Port 15: Pkts: 0      Flows: 0
        -- iqs empty --


* Re: [dpdk-dev] [Bug 60] rte_event_port_unlink() causes subsequent events to end up in wrong port
@ 2018-06-19  9:20 Elo, Matias (Nokia - FI/Espoo)
  2018-06-26 13:35 ` Maxim Uvarov
  0 siblings, 1 reply; 6+ messages in thread
From: Elo, Matias (Nokia - FI/Espoo) @ 2018-06-19  9:20 UTC
  To: harry.van.haaren; +Cc: dev, jerin.jacob

> I think this should handle the unlink case you mention, however perhaps you have identified a genuine bug. If you have more info or a sample config / app that easily demonstrates the issue that would help reproduce/debug here? 


Hi Harry,

The bug report includes a simple test application that demonstrates the issue. I've done some further digging, and the following simple patch seems to fix the issue of events ending up in the wrong ports.


diff --git a/drivers/event/sw/sw_evdev_scheduler.c b/drivers/event/sw/sw_evdev_scheduler.c
index 8a2c9d4f9..57298345d 100644
--- a/drivers/event/sw/sw_evdev_scheduler.c
+++ b/drivers/event/sw/sw_evdev_scheduler.c
@@ -79,9 +79,11 @@ sw_schedule_atomic_to_cq(struct sw_evdev *sw, struct sw_qid * const qid,
 		int cq = fid->cq;
 
 		if (cq < 0) {
-			uint32_t cq_idx = qid->cq_next_tx++;
-			if (qid->cq_next_tx == qid->cq_num_mapped_cqs)
+			uint32_t cq_idx;
+			if (qid->cq_next_tx >= qid->cq_num_mapped_cqs)
 				qid->cq_next_tx = 0;
+			cq_idx = qid->cq_next_tx++;
+
 			cq = qid->cq_map[cq_idx];
 
 			/* find least used */
@@ -168,9 +170,11 @@ sw_schedule_parallel_to_cq(struct sw_evdev *sw, struct sw_qid * const qid,
 		do {
 			if (++cq_check_count > qid->cq_num_mapped_cqs)
 				goto exit;
-			cq = qid->cq_map[cq_idx];
-			if (++cq_idx == qid->cq_num_mapped_cqs)
+
+			if (cq_idx >= qid->cq_num_mapped_cqs)
 				cq_idx = 0;
+			cq = qid->cq_map[cq_idx++];
+
 		} while (rte_event_ring_free_count(
 				sw->ports[cq].cq_worker_ring) == 0 ||
 				sw->ports[cq].inflights == SW_PORT_HIST_LIST);
@@ -251,6 +255,9 @@ sw_schedule_qid_to_cq(struct sw_evdev *sw)
 		if (iq_num >= SW_IQS_MAX)
 			continue;
 
+		if (qid->cq_num_mapped_cqs == 0)
+			continue;
+
 		uint32_t pkts_done = 0;
 		uint32_t count = iq_ring_count(qid->iq[iq_num]);


However, events from atomic/ordered queues may still end up getting stuck when unlinking (scheduled back to the unlinked port). In the case of atomic queues, the problem seems to be related to the (struct sw_fid_t *)fid->cq fields being invalid. With ordered queues, events get stuck in the reorder buffer.

-Matias

* Re: [dpdk-dev] [Bug 60] rte_event_port_unlink() causes subsequent events to end up in wrong port
@ 2018-06-19  9:20 Elo, Matias (Nokia - FI/Espoo)
  0 siblings, 0 replies; 6+ messages in thread
From: Elo, Matias (Nokia - FI/Espoo) @ 2018-06-19  9:20 UTC
  To: jerin.jacob; +Cc: dev

> Not related to this question: are you planning to use rte_event_port_unlink() in the fast path?
> Does rte_event_stop() work for you, if it is in the slow path?

Hi Jerin,

Sorry for missing your question earlier. We need rte_event_port_link() /
rte_event_port_unlink() for doing load balancing, so calling rte_event_stop()
isn't an option.

-Matias


end of thread, other threads:[~2018-06-26 13:35 UTC | newest]

Thread overview: 6+ messages
2018-06-04  7:21 [dpdk-dev] [Bug 60] rte_event_port_unlink() causes subsequent events to end up in wrong port bugzilla
2018-06-04  8:20 ` Jerin Jacob
2018-06-05 16:43   ` Van Haaren, Harry
2018-06-19  9:20 Elo, Matias (Nokia - FI/Espoo)
2018-06-26 13:35 ` Maxim Uvarov
2018-06-19  9:20 Elo, Matias (Nokia - FI/Espoo)
