DPDK patches and discussions
From: "Bly, Mike" <mbly@ciena.com>
To: "dev@dpdk.org" <dev@dpdk.org>
Subject: memif thread race condition on memif.disconnect()
Date: Wed, 11 Oct 2023 19:57:56 +0000
Message-ID: <BYAPR04MB4325C64E8B14347B37EFF237CFCCA@BYAPR04MB4325.namprd04.prod.outlook.com>

Hello,

We have run into a timing issue between threads when using the memif interface type and need some guidance.

Our application has a DPDK-based process that operates (among other things) a memif server interface. The problem occurs when this memif interface receives a memif.disconnect message from the remote client while the process is in the middle of an rte_eth_rx_burst() on the same interface. Because the interrupt (IRQ) message handling runs on its own thread, separate from the DPDK worker thread doing the rx_burst, the disconnect frees the memif regions while the worker is still reading the ring, and the process crashes. The backtraces are shared below. How do we ensure guard rails are in place so that the rx burst exits gracefully when a disconnect occurs? Alternatively, how do we properly modify the code so that the response to the disconnect callback is deferred until the rx burst has completed?
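
To make the first option concrete, here is a minimal sketch of such a guard, written as a standalone C11 model rather than as memif code. Everything in it is hypothetical (struct guarded_queue, guarded_init(), guarded_rx(), guarded_disconnect()), the malloc'd array only stands in for the shared-memory region, and it assumes a single worker thread per queue. The worker announces that it is inside the burst before checking the connected flag; the control thread clears the flag first and then waits for the burst to finish before freeing anything.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a memif queue; not the driver's real struct. */
struct guarded_queue {
    atomic_bool connected; /* cleared by the control thread on disconnect */
    atomic_int  in_rx;     /* 1 while the (single) worker is inside a burst */
    int        *ring;      /* stands in for the shared-memory region */
    size_t      ring_len;
};

/* Allocate the fake ring and mark the queue connected. Returns 0 on success. */
static int guarded_init(struct guarded_queue *q, size_t len)
{
    size_t i;

    assert(q != NULL);
    q->ring = malloc(len * sizeof(*q->ring));
    if (q->ring == NULL)
        return -1;
    for (i = 0; i < len; i++)
        q->ring[i] = (int)i;
    q->ring_len = len;
    atomic_init(&q->in_rx, 0);
    atomic_init(&q->connected, true);
    return 0;
}

/* Worker side: returns the number of entries copied, 0 if disconnected. */
static size_t guarded_rx(struct guarded_queue *q, int *out, size_t n)
{
    size_t i, cnt = 0;

    /* Announce entry first, then check the flag. With the default
     * sequentially consistent atomics, either this thread sees
     * connected == false, or the control thread sees in_rx == 1. */
    atomic_store(&q->in_rx, 1);
    if (atomic_load(&q->connected)) {
        for (i = 0; i < n && i < q->ring_len; i++)
            out[cnt++] = q->ring[i];
    }
    atomic_store(&q->in_rx, 0);
    return cnt;
}

/* Control side: block new bursts, wait out the current one, then free. */
static void guarded_disconnect(struct guarded_queue *q)
{
    atomic_store(&q->connected, false);
    while (atomic_load(&q->in_rx) != 0)
        ; /* spin; a real implementation might pause or yield here */
    free(q->ring);
    q->ring = NULL;
    q->ring_len = 0;
}
```

The cost of this pattern is two atomic stores and one atomic load on every burst, plus a short spin on the control thread during disconnect. Whether that is acceptable in the memif rx path, and where such a guard would belong in the driver, is not something this sketch settles.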

We are using DPDK 21.11.2. I have diffed dpdk-stable:22.11.3 in ./drivers/net/memif against it, but I do not see anything obvious that would address this. I did a similar diff against dpdk:23.07 and do not see anything obvious there either.

-Mike

(gdb) thread 1
[Switching to thread 1 (Thread 0x7f17e2813600 (LWP 470))]
#0  0x00007f17e374d225 in eth_memif_rx (queue=0x1189023b00, bufs=0x7f17e28100e8, nb_pkts=32) at ../git/drivers/net/memif/rte_eth_memif.c:338
338                     last_slot = __atomic_load_n(&ring->head, __ATOMIC_ACQUIRE);
(gdb) bt
#0  0x00007f17e374d225 in eth_memif_rx (queue=0x1189023b00, bufs=0x7f17e28100e8, nb_pkts=32) at ../git/drivers/net/memif/rte_eth_memif.c:338
#1  0x000000000047e6fb in rte_eth_rx_burst (nb_pkts=32, rx_pkts=0x7f17e28100e8, queue_id=0, port_id=<optimized out>) at /usr/include/rte_ethdev.h:5368
#2  pmd_main_loop () at ../git/swfw/api/src/swfwPmd.c:1086
#3  0x000000000047f309 in pmd_launch_one_lcore (dummy=<optimized out>) at ../git/my_process.c:1157
#4  0x00007f17f7070e7c in eal_thread_loop (arg=<optimized out>) at ../git/lib/eal/linux/eal_thread.c:146
#5  0x00007f17f4c3da72 in start_thread (arg=<optimized out>) at pthread_create.c:442
#6  0x00007f17f4cbf930 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
(gdb) l
333             ring_size = 1 << mq->log2_ring_size;
334             mask = ring_size - 1;
335
336             if (type == MEMIF_RING_C2S) {
337                     cur_slot = mq->last_head;
338                     last_slot = __atomic_load_n(&ring->head, __ATOMIC_ACQUIRE);
339             } else {
340                     cur_slot = mq->last_tail;
341                     last_slot = __atomic_load_n(&ring->tail, __ATOMIC_ACQUIRE);
342             }
(gdb) p ring->head
Cannot access memory at address 0x7f17d8e58006

(gdb) thread 19
[Switching to thread 19 (Thread 0x7f17f0804600 (LWP 468))]
#0  0x00007f17f4caf97b in __GI___close (fd=494) at ../sysdeps/unix/sysv/linux/close.c:27
27        return SYSCALL_CANCEL (close, fd);
(gdb) bt
#0  0x00007f17f4caf97b in __GI___close (fd=494) at ../sysdeps/unix/sysv/linux/close.c:27
#1  0x00007f17e374f01f in memif_free_regions (dev=dev@entry=0x7f17f727f000 <rte_eth_devices+99072>) at ../git/drivers/net/memif/rte_eth_memif.c:882
#2  0x00007f17e37475d0 in memif_disconnect (dev=0x7f17f727f000 <rte_eth_devices+99072>) at ../git/drivers/net/memif/memif_socket.c:623
#3  0x00007f17f7091bd2 in eal_intr_process_interrupts (nfds=<optimized out>, events=<optimized out>) at ../git/lib/eal/linux/eal_interrupts.c:1026
#4  eal_intr_handle_interrupts (totalfds=<optimized out>, pfd=20) at ../git/lib/eal/linux/eal_interrupts.c:1100
#5  eal_intr_thread_main (arg=<optimized out>) at ../git/lib/eal/linux/eal_interrupts.c:1172
#6  0x00007f17f4c3da72 in start_thread (arg=<optimized out>) at pthread_create.c:442
#7  0x00007f17f4cbf930 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
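
The second backtrace shows the regions being freed directly on the EAL interrupt thread. The deferral option asked about above can be modelled as follows: the control thread only records that a disconnect was requested, and the worker thread performs the teardown itself between bursts, so the region is never freed underneath a running burst. This is a standalone C11 sketch, not memif code; struct deferred_port, deferred_init(), request_disconnect() and worker_poll() are hypothetical names, and the malloc'd array stands in for the mapped region.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical model of a port whose teardown is owned by the worker. */
struct deferred_port {
    atomic_bool teardown_pending; /* set by the control thread */
    int        *ring;             /* touched only by the worker thread */
    size_t      ring_len;
};

/* Allocate the fake ring. Returns 0 on success. */
static int deferred_init(struct deferred_port *p, size_t len)
{
    size_t i;

    assert(p != NULL);
    p->ring = malloc(len * sizeof(*p->ring));
    if (p->ring == NULL)
        return -1;
    for (i = 0; i < len; i++)
        p->ring[i] = (int)i;
    p->ring_len = len;
    atomic_init(&p->teardown_pending, false);
    return 0;
}

/* Control thread: record the request and return without freeing anything. */
static void request_disconnect(struct deferred_port *p)
{
    atomic_store_explicit(&p->teardown_pending, true, memory_order_release);
}

/* Worker thread: one loop iteration. Handles a pending teardown before
 * touching the ring. Returns the number of entries copied. */
static size_t worker_poll(struct deferred_port *p, int *out, size_t n)
{
    size_t i, cnt = 0;

    if (atomic_exchange_explicit(&p->teardown_pending, false,
                                 memory_order_acq_rel)) {
        /* No burst is in progress here, so freeing is safe. */
        free(p->ring);
        p->ring = NULL;
        p->ring_len = 0;
        return 0;
    }
    if (p->ring == NULL)
        return 0; /* already torn down */
    for (i = 0; i < n && i < p->ring_len; i++)
        out[cnt++] = p->ring[i];
    return cnt;
}
```

The trade-off is that the teardown waits until the worker's next loop iteration, and a worker that stops polling would never release the region. Whether the memif control protocol tolerates that delay is not established here.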

Thread overview: 2+ messages
2023-10-11 19:57 Bly, Mike [this message]
2023-10-30 19:18 ` Stephen Hemminger