DPDK patches and discussions
* [dpdk-dev] [Bug 491] Timers synchronously resetting or stopping other timers as part of their callback can cause hangs
@ 2020-06-22 15:17 bugzilla
From: bugzilla @ 2020-06-22 15:17 UTC (permalink / raw)
  To: dev

https://bugs.dpdk.org/show_bug.cgi?id=491

            Bug ID: 491
           Summary: Timers synchronously resetting or stopping other
                    timers as part of their callback can cause hangs
           Product: DPDK
           Version: 19.11
          Hardware: All
                OS: All
            Status: UNCONFIRMED
          Severity: normal
          Priority: Normal
         Component: other
          Assignee: dev@dpdk.org
          Reporter: davidajackson2013@gmail.com
  Target Milestone: ---

Created attachment 108
  --> https://bugs.dpdk.org/attachment.cgi?id=108&action=edit
Patch for test_timer.c to reproduce single lcore issue

If a timer's callback calls rte_timer_reset_sync() or rte_timer_stop_sync() on
another timer that is due to expire in the same call to rte_timer_manage() on
the same lcore, the rte_timer_*_sync() call loops indefinitely and the lcore
hangs.

rte_timer_manage() first works out which timers have expired in a given
iteration and sets them all to the RUNNING state, before running their
callbacks. A timer in the RUNNING state can only be moved to the CONFIG state
(which is required in order to reset or stop it) by the owning lcore, and only
if it is the timer whose callback is currently executing. A callback that
synchronously resets or stops another timer that is already in the RUNNING
state therefore waits indefinitely for rte_timer_manage() to run that timer,
while rte_timer_manage() is itself waiting for the callback to return.
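
To make the sequence above concrete, here is a small self-contained model of
it in plain C. It is only a sketch under simplifying assumptions and is not
DPDK code: every model_* name is hypothetical, there is a single lcore, and the
retry loop is bounded so that the model reports "would hang" instead of
actually hanging.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the timer states named in this report. */
enum model_state { MODEL_STOP, MODEL_PENDING, MODEL_RUNNING, MODEL_CONFIG };

struct model_timer {
    enum model_state state;
    void (*cb)(struct model_timer *self, void *arg);
    void *arg;
};

/* Timer whose callback is executing right now (single lcore model). */
struct model_timer *model_running_tim;
int model_last_result;

/* Rule from the report: a RUNNING timer may only move to CONFIG if it
 * is the timer whose callback is currently executing. */
int model_try_config(struct model_timer *tim)
{
    if (tim->state == MODEL_RUNNING && tim != model_running_tim)
        return -1;
    tim->state = MODEL_CONFIG;
    return 0;
}

/* Bounded "retry until it works" loop. Returns 0 on success, or -1 if
 * the spin budget ran out (an unbounded loop would spin forever). */
int model_stop_sync_bounded(struct model_timer *tim, unsigned max_spins)
{
    for (unsigned i = 0; i < max_spins; i++) {
        if (model_try_config(tim) == 0) {
            tim->state = MODEL_STOP;
            return 0;
        }
    }
    return -1;
}

/* Callback that synchronously stops the timer passed in arg. */
void model_stop_other_cb(struct model_timer *self, void *arg)
{
    (void)self;
    model_last_result = model_stop_sync_bounded(arg, 1000);
}

void model_noop_cb(struct model_timer *self, void *arg)
{
    (void)self;
    (void)arg;
}

/* Two phases: mark every expired timer RUNNING, then run the callbacks. */
void model_manage(struct model_timer **expired, size_t n)
{
    assert(expired != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        expired[i]->state = MODEL_RUNNING;
    for (size_t i = 0; i < n; i++) {
        model_running_tim = expired[i];
        expired[i]->cb(expired[i], expired[i]->arg);
        if (expired[i]->state == MODEL_RUNNING)
            expired[i]->state = MODEL_STOP;
        model_running_tim = NULL;
    }
}

/* Returns what the first timer's callback saw: 0 if the stop succeeded,
 * -1 if it could never succeed. both_expire selects whether the second
 * timer expires in the same manage call as the first. */
int model_demo(int both_expire)
{
    struct model_timer t2 = { MODEL_PENDING, model_noop_cb, NULL };
    struct model_timer t1 = { MODEL_PENDING, model_stop_other_cb, &t2 };
    struct model_timer *expired[] = { &t1, &t2 };

    model_last_result = 0;
    model_manage(expired, both_expire ? 2 : 1);
    return model_last_result;
}
```

In this model, model_demo(0) lets the stop go through because the second timer
is still pending, while model_demo(1) exhausts its spin budget because the
second timer was already marked RUNNING before the first callback ran.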

I've attached a patch for test_timer.c which adds an extra timer to reproduce
this. The test may need a few runs before it hangs, and adding a log to
rte_timer_reset_sync() helped confirm that this was the cause.

I have reproduced this on 19.11, and I believe it also reproduces on both
earlier and later versions.

There is a second part to this bug: the same hang can also happen between
lcores, although I have not managed to get a test to reproduce this. Suppose
two lcores (A and B, say) run rte_timer_manage() such that two timers (say A1
and B1) are being run at the same time, and A2 and B2 are two timers about to
be run on their respective lcores as part of the same rte_timer_manage() calls.
If A1's callback tries to (synchronously) reset B2 and B1's callback tries to
(synchronously) reset A2, both lcores A and B hang for the same reason as in
the single lcore case.
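
One way to see why neither lcore can make progress is to write the scenario
down as a "waits on" relation between the four timers and follow it round. The
sketch below is only an illustrative model in plain C and does not use any
DPDK code; the model_* names are hypothetical, and it assumes each timer waits
on at most one other timer.

```c
#include <stddef.h>

/* The four timers from the scenario above. */
enum { MODEL_A1, MODEL_A2, MODEL_B1, MODEL_B2, MODEL_NTIMERS };
#define MODEL_NONE (-1)

/* waits_on[x] is the timer that must finish before x can finish, or
 * MODEL_NONE. Returns 1 if following the chain from start leads back
 * to start within n steps, i.e. start is part of a wait cycle. */
int model_has_wait_cycle(const int waits_on[], int n, int start)
{
    int cur = start;

    for (int steps = 0; steps < n; steps++) {
        cur = waits_on[cur];
        if (cur == MODEL_NONE)
            return 0;
        if (cur == start)
            return 1;
    }
    return 0;
}

/* Builds the relation for the two-lcore scenario and checks whether A1
 * ends up waiting on itself. */
int model_cross_lcore_deadlock(int a1_resets_b2, int b1_resets_a2)
{
    int waits_on[MODEL_NTIMERS];

    /* A2 and B2 are already RUNNING but queued behind A1 and B1 on
     * their own lcores, so they cannot run until those return. */
    waits_on[MODEL_A2] = MODEL_A1;
    waits_on[MODEL_B2] = MODEL_B1;

    /* A synchronous reset makes the caller wait for the target timer. */
    waits_on[MODEL_A1] = a1_resets_b2 ? MODEL_B2 : MODEL_NONE;
    waits_on[MODEL_B1] = b1_resets_a2 ? MODEL_A2 : MODEL_NONE;

    return model_has_wait_cycle(waits_on, MODEL_NTIMERS, MODEL_A1);
}
```

With both synchronous resets present the chain is A1, B2, B1, A2 and back to
A1, so model_cross_lcore_deadlock(1, 1) reports a cycle. Dropping either reset
breaks the chain, which matches the description that both callbacks are needed
for the cross-lcore hang.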
