* [RFC] rwlock: prevent readers from starving writers @ 2022-07-07 20:12 Stephen Hemminger 2022-07-08 19:22 ` Honnappa Nagarahalli 2022-07-19 20:27 ` [PATCH v2] " Stephen Hemminger 0 siblings, 2 replies; 10+ messages in thread From: Stephen Hemminger @ 2022-07-07 20:12 UTC (permalink / raw) To: dev; +Cc: Stephen Hemminger The original reader/writer lock in DPDK can cause a stream of readers to starve writers. The new version uses an additional bit to indicate that a writer is waiting and which keeps readers from starving the writer. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> --- Would like this to be in 22.11, but needs some more review lib/eal/include/generic/rte_rwlock.h | 93 ++++++++++++++++++---------- 1 file changed, 61 insertions(+), 32 deletions(-) diff --git a/lib/eal/include/generic/rte_rwlock.h b/lib/eal/include/generic/rte_rwlock.h index da9bc3e9c0e2..725cd19ffb27 100644 --- a/lib/eal/include/generic/rte_rwlock.h +++ b/lib/eal/include/generic/rte_rwlock.h @@ -13,7 +13,7 @@ * This file defines an API for read-write locks. The lock is used to * protect data that allows multiple readers in parallel, but only * one writer. All readers are blocked until the writer is finished - * writing. + * writing. This version will not starve writers. * */ @@ -28,10 +28,17 @@ extern "C" { /** * The rte_rwlock_t type. * - * cnt is -1 when write lock is held, and > 0 when read locks are held. + * Readers increment the counter by RW_READ (4) + * Writers set the RWLOCK_WRITE bit when lock is held + * and set the RWLOCK_WAIT bit while waiting. */ + +#define RTE_RWLOCK_WAIT 0x1 /* Writer is waiting */ +#define RTE_RWLOCK_WRITE 0x2 /* Writer has the lock */ +#define RTE_RWLOCK_READ 0x4 /* Reader increment */ + typedef struct { - volatile int32_t cnt; /**< -1 when W lock held, > 0 when R locks held. 
*/ + volatile int32_t cnt; } rte_rwlock_t; /** @@ -61,17 +68,24 @@ static inline void rte_rwlock_read_lock(rte_rwlock_t *rwl) { int32_t x; - int success = 0; - while (success == 0) { + while (1) { x = __atomic_load_n(&rwl->cnt, __ATOMIC_RELAXED); /* write lock is held */ - if (x < 0) { + if (x & (RTE_RWLOCK_WAIT | RTE_RWLOCK_WRITE)) { rte_pause(); continue; } - success = __atomic_compare_exchange_n(&rwl->cnt, &x, x + 1, 1, - __ATOMIC_ACQUIRE, __ATOMIC_RELAXED); + + /* Try to get read lock */ + x = __atomic_add_fetch(&rwl->cnt, RTE_RWLOCK_READ, + __ATOMIC_ACQUIRE); + if (!(x & (RTE_RWLOCK_WAIT | RTE_RWLOCK_WRITE))) + return; + + /* Undo */ + __atomic_fetch_sub(&rwl->cnt, RTE_RWLOCK_READ, + __ATOMIC_RELEASE); } } @@ -93,17 +107,23 @@ static inline int rte_rwlock_read_trylock(rte_rwlock_t *rwl) { int32_t x; - int success = 0; - while (success == 0) { - x = __atomic_load_n(&rwl->cnt, __ATOMIC_RELAXED); - /* write lock is held */ - if (x < 0) - return -EBUSY; - success = __atomic_compare_exchange_n(&rwl->cnt, &x, x + 1, 1, - __ATOMIC_ACQUIRE, __ATOMIC_RELAXED); - } + x = __atomic_load_n(&rwl->cnt, __ATOMIC_RELAXED); + + /* write lock is held */ + if (x & (RTE_RWLOCK_WAIT | RTE_RWLOCK_WRITE)) + return -EBUSY; + + /* Try to get read lock */ + x = __atomic_add_fetch(&rwl->cnt, RTE_RWLOCK_READ, + __ATOMIC_ACQUIRE); + + if (x & (RTE_RWLOCK_WAIT | RTE_RWLOCK_WRITE)) { + __atomic_fetch_sub(&rwl->cnt, RTE_RWLOCK_READ, + __ATOMIC_RELEASE); + return -EBUSY; + } return 0; } @@ -116,7 +136,7 @@ rte_rwlock_read_trylock(rte_rwlock_t *rwl) static inline void rte_rwlock_read_unlock(rte_rwlock_t *rwl) { - __atomic_fetch_sub(&rwl->cnt, 1, __ATOMIC_RELEASE); + __atomic_fetch_sub(&rwl->cnt, RTE_RWLOCK_READ, __ATOMIC_RELEASE); } /** @@ -139,11 +159,12 @@ rte_rwlock_write_trylock(rte_rwlock_t *rwl) int32_t x; x = __atomic_load_n(&rwl->cnt, __ATOMIC_RELAXED); - if (x != 0 || __atomic_compare_exchange_n(&rwl->cnt, &x, -1, 1, - __ATOMIC_ACQUIRE, __ATOMIC_RELAXED) == 0) + if (x < 
RTE_RWLOCK_WRITE && + __atomic_compare_exchange_n(&rwl->cnt, &x, x + RTE_RWLOCK_WRITE, + 1, __ATOMIC_ACQUIRE, __ATOMIC_RELAXED)) + return 0; + else return -EBUSY; - - return 0; } /** @@ -156,18 +177,26 @@ static inline void rte_rwlock_write_lock(rte_rwlock_t *rwl) { int32_t x; - int success = 0; - while (success == 0) { + while (1) { x = __atomic_load_n(&rwl->cnt, __ATOMIC_RELAXED); - /* a lock is held */ - if (x != 0) { - rte_pause(); - continue; + + /* No readers or writers */ + if (x < RTE_RWLOCK_WRITE) { + /* Turn off RTE_RWLOCK_WAIT, turn on RTE_RWLOCK_WRITE */ + if (__atomic_compare_exchange_n(&rwl->cnt, &x, RTE_RWLOCK_WRITE, 1, + __ATOMIC_ACQUIRE, __ATOMIC_RELAXED)) + return; } - success = __atomic_compare_exchange_n(&rwl->cnt, &x, -1, 1, - __ATOMIC_ACQUIRE, __ATOMIC_RELAXED); - } + + /* Turn on writer wait bit */ + if (!(x & RTE_RWLOCK_WAIT)) + __atomic_fetch_or(&rwl->cnt, RTE_RWLOCK_WAIT, __ATOMIC_RELAXED); + + /* Wait until can try to take the lock */ + while (__atomic_load_n(&rwl->cnt, __ATOMIC_RELAXED) > RTE_RWLOCK_WAIT) + rte_pause(); + } } /** @@ -179,7 +208,7 @@ rte_rwlock_write_lock(rte_rwlock_t *rwl) static inline void rte_rwlock_write_unlock(rte_rwlock_t *rwl) { - __atomic_store_n(&rwl->cnt, 0, __ATOMIC_RELEASE); + __atomic_fetch_sub(&rwl->cnt, RTE_RWLOCK_WRITE, __ATOMIC_RELEASE); } /** -- 2.35.1 ^ permalink raw reply [flat|nested] 10+ messages in thread
* RE: [RFC] rwlock: prevent readers from starving writers 2022-07-07 20:12 [RFC] rwlock: prevent readers from starving writers Stephen Hemminger @ 2022-07-08 19:22 ` Honnappa Nagarahalli 2022-07-08 22:04 ` Morten Brørup 2022-07-19 20:27 ` [PATCH v2] " Stephen Hemminger 1 sibling, 1 reply; 10+ messages in thread From: Honnappa Nagarahalli @ 2022-07-08 19:22 UTC (permalink / raw) To: Stephen Hemminger, dev; +Cc: nd, nd <snip> > > The original reader/writer lock in DPDK can cause a stream of readers to > starve writers. > > The new version uses an additional bit to indicate that a writer is waiting and > which keeps readers from starving the writer. This addition makes sense. I am wondering if we should create a new lock. Is it possible that some applications are dependent on the current behavior? > > Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> > --- > Would like this to be in 22.11, but needs some more review > > lib/eal/include/generic/rte_rwlock.h | 93 ++++++++++++++++++---------- > 1 file changed, 61 insertions(+), 32 deletions(-) > > diff --git a/lib/eal/include/generic/rte_rwlock.h > b/lib/eal/include/generic/rte_rwlock.h > index da9bc3e9c0e2..725cd19ffb27 100644 > --- a/lib/eal/include/generic/rte_rwlock.h > +++ b/lib/eal/include/generic/rte_rwlock.h > @@ -13,7 +13,7 @@ > * This file defines an API for read-write locks. The lock is used to > * protect data that allows multiple readers in parallel, but only > * one writer. All readers are blocked until the writer is finished > - * writing. > + * writing. This version will not starve writers. > * > */ > > @@ -28,10 +28,17 @@ extern "C" { > /** > * The rte_rwlock_t type. > * > - * cnt is -1 when write lock is held, and > 0 when read locks are held. > + * Readers increment the counter by RW_READ (4) > + * Writers set the RWLOCK_WRITE bit when lock is held > + * and set the RWLOCK_WAIT bit while waiting. 
> */ > + > +#define RTE_RWLOCK_WAIT 0x1 /* Writer is waiting */ > +#define RTE_RWLOCK_WRITE 0x2 /* Writer has the lock */ > +#define RTE_RWLOCK_READ 0x4 /* Reader increment */ > + > typedef struct { > - volatile int32_t cnt; /**< -1 when W lock held, > 0 when R locks held. > */ > + volatile int32_t cnt; > } rte_rwlock_t; > > /** > @@ -61,17 +68,24 @@ static inline void > rte_rwlock_read_lock(rte_rwlock_t *rwl) { > int32_t x; > - int success = 0; > > - while (success == 0) { > + while (1) { > x = __atomic_load_n(&rwl->cnt, __ATOMIC_RELAXED); > /* write lock is held */ > - if (x < 0) { > + if (x & (RTE_RWLOCK_WAIT | RTE_RWLOCK_WRITE)) { > rte_pause(); > continue; > } > - success = __atomic_compare_exchange_n(&rwl->cnt, &x, x > + 1, 1, > - __ATOMIC_ACQUIRE, > __ATOMIC_RELAXED); > + > + /* Try to get read lock */ > + x = __atomic_add_fetch(&rwl->cnt, RTE_RWLOCK_READ, > + __ATOMIC_ACQUIRE); > + if (!(x & (RTE_RWLOCK_WAIT | RTE_RWLOCK_WRITE))) > + return; > + > + /* Undo */ > + __atomic_fetch_sub(&rwl->cnt, RTE_RWLOCK_READ, > + __ATOMIC_RELEASE); > } > } > > @@ -93,17 +107,23 @@ static inline int > rte_rwlock_read_trylock(rte_rwlock_t *rwl) { > int32_t x; > - int success = 0; > > - while (success == 0) { > - x = __atomic_load_n(&rwl->cnt, __ATOMIC_RELAXED); > - /* write lock is held */ > - if (x < 0) > - return -EBUSY; > - success = __atomic_compare_exchange_n(&rwl->cnt, &x, x > + 1, 1, > - __ATOMIC_ACQUIRE, > __ATOMIC_RELAXED); > - } > + x = __atomic_load_n(&rwl->cnt, __ATOMIC_RELAXED); > + > + /* write lock is held */ > + if (x & (RTE_RWLOCK_WAIT | RTE_RWLOCK_WRITE)) > + return -EBUSY; > + > + /* Try to get read lock */ > + x = __atomic_add_fetch(&rwl->cnt, RTE_RWLOCK_READ, > + __ATOMIC_ACQUIRE); > + > + if (x & (RTE_RWLOCK_WAIT | RTE_RWLOCK_WRITE)) { > + __atomic_fetch_sub(&rwl->cnt, RTE_RWLOCK_READ, > + __ATOMIC_RELEASE); > > + return -EBUSY; > + } > return 0; > } > > @@ -116,7 +136,7 @@ rte_rwlock_read_trylock(rte_rwlock_t *rwl) static > inline void 
rte_rwlock_read_unlock(rte_rwlock_t *rwl) { > - __atomic_fetch_sub(&rwl->cnt, 1, __ATOMIC_RELEASE); > + __atomic_fetch_sub(&rwl->cnt, RTE_RWLOCK_READ, > __ATOMIC_RELEASE); > } > > /** > @@ -139,11 +159,12 @@ rte_rwlock_write_trylock(rte_rwlock_t *rwl) > int32_t x; > > x = __atomic_load_n(&rwl->cnt, __ATOMIC_RELAXED); > - if (x != 0 || __atomic_compare_exchange_n(&rwl->cnt, &x, -1, 1, > - __ATOMIC_ACQUIRE, __ATOMIC_RELAXED) == 0) > + if (x < RTE_RWLOCK_WRITE && > + __atomic_compare_exchange_n(&rwl->cnt, &x, x + > RTE_RWLOCK_WRITE, > + 1, __ATOMIC_ACQUIRE, > __ATOMIC_RELAXED)) > + return 0; > + else > return -EBUSY; > - > - return 0; > } > > /** > @@ -156,18 +177,26 @@ static inline void > rte_rwlock_write_lock(rte_rwlock_t *rwl) { > int32_t x; > - int success = 0; > > - while (success == 0) { > + while (1) { > x = __atomic_load_n(&rwl->cnt, __ATOMIC_RELAXED); > - /* a lock is held */ > - if (x != 0) { > - rte_pause(); > - continue; > + > + /* No readers or writers */ > + if (x < RTE_RWLOCK_WRITE) { > + /* Turn off RTE_RWLOCK_WAIT, turn on > RTE_RWLOCK_WRITE */ > + if (__atomic_compare_exchange_n(&rwl->cnt, &x, > RTE_RWLOCK_WRITE, 1, > + > __ATOMIC_ACQUIRE, __ATOMIC_RELAXED)) > + return; > } > - success = __atomic_compare_exchange_n(&rwl->cnt, &x, - > 1, 1, > - __ATOMIC_ACQUIRE, > __ATOMIC_RELAXED); > - } > + > + /* Turn on writer wait bit */ > + if (!(x & RTE_RWLOCK_WAIT)) > + __atomic_fetch_or(&rwl->cnt, RTE_RWLOCK_WAIT, > __ATOMIC_RELAXED); > + > + /* Wait until can try to take the lock */ > + while (__atomic_load_n(&rwl->cnt, __ATOMIC_RELAXED) > > RTE_RWLOCK_WAIT) > + rte_pause(); > + } > } > > /** > @@ -179,7 +208,7 @@ rte_rwlock_write_lock(rte_rwlock_t *rwl) static > inline void rte_rwlock_write_unlock(rte_rwlock_t *rwl) { > - __atomic_store_n(&rwl->cnt, 0, __ATOMIC_RELEASE); > + __atomic_fetch_sub(&rwl->cnt, RTE_RWLOCK_WRITE, > __ATOMIC_RELEASE); > } > > /** > -- > 2.35.1 ^ permalink raw reply [flat|nested] 10+ messages in thread
* RE: [RFC] rwlock: prevent readers from starving writers 2022-07-08 19:22 ` Honnappa Nagarahalli @ 2022-07-08 22:04 ` Morten Brørup 2022-07-09 16:22 ` Stephen Hemminger 2022-07-09 16:25 ` Stephen Hemminger 0 siblings, 2 replies; 10+ messages in thread From: Morten Brørup @ 2022-07-08 22:04 UTC (permalink / raw) To: Honnappa Nagarahalli, Stephen Hemminger, dev; +Cc: nd, nd > From: Honnappa Nagarahalli [mailto:Honnappa.Nagarahalli@arm.com] > Sent: Friday, 8 July 2022 21.22 > > <snip> > > > > The original reader/writer lock in DPDK can cause a stream of readers > to > > starve writers. > > > > The new version uses an additional bit to indicate that a writer is > waiting and > > which keeps readers from starving the writer. > This addition makes sense. > I am wondering if we should create a new lock. Is it possible that some > applications are dependent on the current behavior? Any reader risks having to wait a while for a writer to finish its work. In my opinion, this implementation only increases the probability of that risk occurring, but it doesn't change the writer's impact on the readers. Therefore, I think this improved implementation can replace the old rwlock. > > > > > Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> > > --- > > Would like this to be in 22.11, but needs some more review > > > > lib/eal/include/generic/rte_rwlock.h | 93 ++++++++++++++++++-------- > -- > > 1 file changed, 61 insertions(+), 32 deletions(-) > > > > diff --git a/lib/eal/include/generic/rte_rwlock.h > > b/lib/eal/include/generic/rte_rwlock.h > > index da9bc3e9c0e2..725cd19ffb27 100644 > > --- a/lib/eal/include/generic/rte_rwlock.h > > +++ b/lib/eal/include/generic/rte_rwlock.h > > @@ -13,7 +13,7 @@ > > * This file defines an API for read-write locks. The lock is used > to > > * protect data that allows multiple readers in parallel, but only > > * one writer. All readers are blocked until the writer is finished > > - * writing. > > + * writing. 
This version will not starve writers. > > * > > */ > > > > @@ -28,10 +28,17 @@ extern "C" { > > /** > > * The rte_rwlock_t type. > > * > > - * cnt is -1 when write lock is held, and > 0 when read locks are > held. > > + * Readers increment the counter by RW_READ (4) > > + * Writers set the RWLOCK_WRITE bit when lock is held > > + * and set the RWLOCK_WAIT bit while waiting. > > */ > > + > > +#define RTE_RWLOCK_WAIT 0x1 /* Writer is waiting */ > > +#define RTE_RWLOCK_WRITE 0x2 /* Writer has the lock */ > > +#define RTE_RWLOCK_READ 0x4 /* Reader increment */ > > + > > typedef struct { > > - volatile int32_t cnt; /**< -1 when W lock held, > 0 when R locks > held. > > */ > > + volatile int32_t cnt; Not signed anymore, so consider uint32_t. Suggest also rename to cnt_state or similar, since it is not just a counter anymore. > > } rte_rwlock_t; > > > > /** > > @@ -61,17 +68,24 @@ static inline void > > rte_rwlock_read_lock(rte_rwlock_t *rwl) { > > int32_t x; > > - int success = 0; > > > > - while (success == 0) { > > + while (1) { > > x = __atomic_load_n(&rwl->cnt, __ATOMIC_RELAXED); > > /* write lock is held */ Held -> Held or pending, not just held. Add question mark, or move inside the if block. > > - if (x < 0) { > > + if (x & (RTE_RWLOCK_WAIT | RTE_RWLOCK_WRITE)) { > > rte_pause(); > > continue; > > } > > - success = __atomic_compare_exchange_n(&rwl->cnt, &x, x > > + 1, 1, > > - __ATOMIC_ACQUIRE, > > __ATOMIC_RELAXED); > > + > > + /* Try to get read lock */ > > + x = __atomic_add_fetch(&rwl->cnt, RTE_RWLOCK_READ, > > + __ATOMIC_ACQUIRE); > > + if (!(x & (RTE_RWLOCK_WAIT | RTE_RWLOCK_WRITE))) > > + return; > > + > > + /* Undo */ Undo -> Unable, so release the read lock. 
> > + __atomic_fetch_sub(&rwl->cnt, RTE_RWLOCK_READ, > > + __ATOMIC_RELEASE); > > } > > } > > > > @@ -93,17 +107,23 @@ static inline int > > rte_rwlock_read_trylock(rte_rwlock_t *rwl) { > > int32_t x; > > - int success = 0; > > > > - while (success == 0) { > > - x = __atomic_load_n(&rwl->cnt, __ATOMIC_RELAXED); > > - /* write lock is held */ > > - if (x < 0) > > - return -EBUSY; > > - success = __atomic_compare_exchange_n(&rwl->cnt, &x, x > > + 1, 1, > > - __ATOMIC_ACQUIRE, > > __ATOMIC_RELAXED); > > - } > > + x = __atomic_load_n(&rwl->cnt, __ATOMIC_RELAXED); > > + > > + /* write lock is held */ Same comment as above. > > + if (x & (RTE_RWLOCK_WAIT | RTE_RWLOCK_WRITE)) > > + return -EBUSY; > > + > > + /* Try to get read lock */ > > + x = __atomic_add_fetch(&rwl->cnt, RTE_RWLOCK_READ, > > + __ATOMIC_ACQUIRE); > > + > > + if (x & (RTE_RWLOCK_WAIT | RTE_RWLOCK_WRITE)) { Add a comment, e.g.: Unable, so release the read lock. > > + __atomic_fetch_sub(&rwl->cnt, RTE_RWLOCK_READ, > > + __ATOMIC_RELEASE); > > > > + return -EBUSY; > > + } > > return 0; > > } > > > > @@ -116,7 +136,7 @@ rte_rwlock_read_trylock(rte_rwlock_t *rwl) > static > > inline void rte_rwlock_read_unlock(rte_rwlock_t *rwl) { > > - __atomic_fetch_sub(&rwl->cnt, 1, __ATOMIC_RELEASE); > > + __atomic_fetch_sub(&rwl->cnt, RTE_RWLOCK_READ, > > __ATOMIC_RELEASE); > > } > > > > /** > > @@ -139,11 +159,12 @@ rte_rwlock_write_trylock(rte_rwlock_t *rwl) > > int32_t x; > > > > x = __atomic_load_n(&rwl->cnt, __ATOMIC_RELAXED); > > - if (x != 0 || __atomic_compare_exchange_n(&rwl->cnt, &x, -1, 1, > > - __ATOMIC_ACQUIRE, __ATOMIC_RELAXED) == 0) > > + if (x < RTE_RWLOCK_WRITE && > > + __atomic_compare_exchange_n(&rwl->cnt, &x, x + > > RTE_RWLOCK_WRITE, > > + 1, __ATOMIC_ACQUIRE, > > __ATOMIC_RELAXED)) > > + return 0; > > + else > > return -EBUSY; > > - > > - return 0; > > } > > > > /** > > @@ -156,18 +177,26 @@ static inline void > > rte_rwlock_write_lock(rte_rwlock_t *rwl) { > > int32_t x; > > - int success = 0; > > > 
> - while (success == 0) { > > + while (1) { > > x = __atomic_load_n(&rwl->cnt, __ATOMIC_RELAXED); > > - /* a lock is held */ > > - if (x != 0) { > > - rte_pause(); > > - continue; > > + > > + /* No readers or writers */ Add question mark, or move inside if block. > > + if (x < RTE_RWLOCK_WRITE) { > > + /* Turn off RTE_RWLOCK_WAIT, turn on > > RTE_RWLOCK_WRITE */ > > + if (__atomic_compare_exchange_n(&rwl->cnt, &x, > > RTE_RWLOCK_WRITE, 1, > > + > > __ATOMIC_ACQUIRE, __ATOMIC_RELAXED)) > > + return; > > } > > - success = __atomic_compare_exchange_n(&rwl->cnt, &x, - > > 1, 1, > > - __ATOMIC_ACQUIRE, > > __ATOMIC_RELAXED); > > - } > > + > > + /* Turn on writer wait bit */ > > + if (!(x & RTE_RWLOCK_WAIT)) > > + __atomic_fetch_or(&rwl->cnt, RTE_RWLOCK_WAIT, > > __ATOMIC_RELAXED); > > + > > + /* Wait until can try to take the lock */ > > + while (__atomic_load_n(&rwl->cnt, __ATOMIC_RELAXED) > > > RTE_RWLOCK_WAIT) > > + rte_pause(); > > + } > > } > > > > /** > > @@ -179,7 +208,7 @@ rte_rwlock_write_lock(rte_rwlock_t *rwl) static > > inline void rte_rwlock_write_unlock(rte_rwlock_t *rwl) { > > - __atomic_store_n(&rwl->cnt, 0, __ATOMIC_RELEASE); > > + __atomic_fetch_sub(&rwl->cnt, RTE_RWLOCK_WRITE, > > __ATOMIC_RELEASE); > > } > > > > /** > > -- > > 2.35.1 > Always the creative mind, Stephen. :-) You might consider adding/updating even more comments. Acked-by: Morten Brørup <mb@smartsharesystems.com> ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [RFC] rwlock: prevent readers from starving writers 2022-07-08 22:04 ` Morten Brørup @ 2022-07-09 16:22 ` Stephen Hemminger 2022-07-09 16:25 ` Stephen Hemminger 1 sibling, 0 replies; 10+ messages in thread From: Stephen Hemminger @ 2022-07-09 16:22 UTC (permalink / raw) To: Morten Brørup; +Cc: Honnappa Nagarahalli, dev, nd On Sat, 9 Jul 2022 00:04:27 +0200 Morten Brørup <mb@smartsharesystems.com> wrote: > > > typedef struct { > > > - volatile int32_t cnt; /**< -1 when W lock held, > 0 when R locks > > held. > > > */ > > > + volatile int32_t cnt; > > Not signed anymore, so consider uint32_t. Suggest also rename to cnt_state or similar, since it is not just a counter anymore. I tried that, but rte_wait_until uses a signed value. ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [RFC] rwlock: prevent readers from starving writers 2022-07-08 22:04 ` Morten Brørup 2022-07-09 16:22 ` Stephen Hemminger @ 2022-07-09 16:25 ` Stephen Hemminger 1 sibling, 0 replies; 10+ messages in thread From: Stephen Hemminger @ 2022-07-09 16:25 UTC (permalink / raw) To: Morten Brørup; +Cc: Honnappa Nagarahalli, dev, nd On Sat, 9 Jul 2022 00:04:27 +0200 Morten Brørup <mb@smartsharesystems.com> wrote: > Always the creative mind, Stephen. :-) > > You might consider adding/updating even more comments. > > Acked-by: Morten Brørup <mb@smartsharesystems.com> The motivation is that our workload is reader/writer-lock heavy with a small number of threads. Therefore the number of atomic operations per lock matters, but starvation is bad. And any compare-exchange on ARM is expensive and should be avoided if possible. The concept here came from this great page: https://locklessinc.com/articles/locks/ Will add the link in the next version. ^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH v2] rwlock: prevent readers from starving writers 2022-07-07 20:12 [RFC] rwlock: prevent readers from starving writers Stephen Hemminger 2022-07-08 19:22 ` Honnappa Nagarahalli @ 2022-07-19 20:27 ` Stephen Hemminger 2022-07-19 21:52 ` Morten Brørup 2022-10-03 10:01 ` David Marchand 1 sibling, 2 replies; 10+ messages in thread From: Stephen Hemminger @ 2022-07-19 20:27 UTC (permalink / raw) To: dev; +Cc: Stephen Hemminger, Morten Brørup Modify reader/writer lock to avoid starvation of writer. The previous implementation would cause a writer to get starved if readers kept acquiring the lock. The new version uses an additional bit to indicate that a writer is waiting and which keeps readers from starving the writer. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Morten Brørup <mb@smartsharesystems.com> --- v2 - incorporate feedback, change from RFC to PATCH lib/eal/include/generic/rte_rwlock.h | 119 +++++++++++++++++++-------- 1 file changed, 83 insertions(+), 36 deletions(-) diff --git a/lib/eal/include/generic/rte_rwlock.h b/lib/eal/include/generic/rte_rwlock.h index da9bc3e9c0e2..59ec54110444 100644 --- a/lib/eal/include/generic/rte_rwlock.h +++ b/lib/eal/include/generic/rte_rwlock.h @@ -15,23 +15,46 @@ * one writer. All readers are blocked until the writer is finished * writing. * + * This version does not give preference to readers or writers + * and does not starve either readers or writers. + * + * See also: + * https://locklessinc.com/articles/locks/ */ #ifdef __cplusplus extern "C" { #endif +#include <rte_branch_prediction.h> #include <rte_common.h> -#include <rte_atomic.h> #include <rte_pause.h> /** * The rte_rwlock_t type. * - * cnt is -1 when write lock is held, and > 0 when read locks are held. + * Readers increment the counter by RTE_RWLOCK_READ (4) + * Writers set the RTE_RWLOCK_WRITE bit when lock is held + * and set the RTE_RWLOCK_WAIT bit while waiting. 
+ * + * 31 2 1 0 + * +-------------------+-+-+ + * | readers | | | + * +-------------------+-+-+ + * ^ ^ + * | | + * WRITE: lock held ----/ | + * WAIT: writer pending --/ */ + +#define RTE_RWLOCK_WAIT 0x1 /* Writer is waiting */ +#define RTE_RWLOCK_WRITE 0x2 /* Writer has the lock */ +#define RTE_RWLOCK_MASK (RTE_RWLOCK_WAIT | RTE_RWLOCK_WRITE) + /* Writer is waiting or has lock */ +#define RTE_RWLOCK_READ 0x4 /* Reader increment */ + typedef struct { - volatile int32_t cnt; /**< -1 when W lock held, > 0 when R locks held. */ + int32_t cnt; } rte_rwlock_t; /** @@ -61,17 +84,24 @@ static inline void rte_rwlock_read_lock(rte_rwlock_t *rwl) { int32_t x; - int success = 0; - while (success == 0) { - x = __atomic_load_n(&rwl->cnt, __ATOMIC_RELAXED); - /* write lock is held */ - if (x < 0) { + while (1) { + /* Wait while writer is present or pending */ + while (__atomic_load_n(&rwl->cnt, __ATOMIC_RELAXED) + & RTE_RWLOCK_MASK) rte_pause(); - continue; - } - success = __atomic_compare_exchange_n(&rwl->cnt, &x, x + 1, 1, - __ATOMIC_ACQUIRE, __ATOMIC_RELAXED); + + /* Try to get read lock */ + x = __atomic_add_fetch(&rwl->cnt, RTE_RWLOCK_READ, + __ATOMIC_ACQUIRE); + + /* If no writer, then acquire was successful */ + if (likely(!(x & RTE_RWLOCK_MASK))) + return; + + /* Lost race with writer, backout the change. 
*/ + __atomic_fetch_sub(&rwl->cnt, RTE_RWLOCK_READ, + __ATOMIC_RELAXED); } } @@ -93,17 +123,24 @@ static inline int rte_rwlock_read_trylock(rte_rwlock_t *rwl) { int32_t x; - int success = 0; - while (success == 0) { - x = __atomic_load_n(&rwl->cnt, __ATOMIC_RELAXED); - /* write lock is held */ - if (x < 0) - return -EBUSY; - success = __atomic_compare_exchange_n(&rwl->cnt, &x, x + 1, 1, - __ATOMIC_ACQUIRE, __ATOMIC_RELAXED); - } + x = __atomic_load_n(&rwl->cnt, __ATOMIC_RELAXED); + + /* fail if write lock is held or writer is pending */ + if (x & RTE_RWLOCK_MASK) + return -EBUSY; + /* Try to get read lock */ + x = __atomic_add_fetch(&rwl->cnt, RTE_RWLOCK_READ, + __ATOMIC_ACQUIRE); + + /* Back out if writer raced in */ + if (unlikely(x & RTE_RWLOCK_MASK)) { + __atomic_fetch_sub(&rwl->cnt, RTE_RWLOCK_READ, + __ATOMIC_RELEASE); + + return -EBUSY; + } return 0; } @@ -116,7 +153,7 @@ rte_rwlock_read_trylock(rte_rwlock_t *rwl) static inline void rte_rwlock_read_unlock(rte_rwlock_t *rwl) { - __atomic_fetch_sub(&rwl->cnt, 1, __ATOMIC_RELEASE); + __atomic_fetch_sub(&rwl->cnt, RTE_RWLOCK_READ, __ATOMIC_RELEASE); } /** @@ -139,11 +176,12 @@ rte_rwlock_write_trylock(rte_rwlock_t *rwl) int32_t x; x = __atomic_load_n(&rwl->cnt, __ATOMIC_RELAXED); - if (x != 0 || __atomic_compare_exchange_n(&rwl->cnt, &x, -1, 1, - __ATOMIC_ACQUIRE, __ATOMIC_RELAXED) == 0) + if (x < RTE_RWLOCK_WRITE && + __atomic_compare_exchange_n(&rwl->cnt, &x, x + RTE_RWLOCK_WRITE, + 1, __ATOMIC_ACQUIRE, __ATOMIC_RELAXED)) + return 0; + else return -EBUSY; - - return 0; } /** @@ -156,18 +194,27 @@ static inline void rte_rwlock_write_lock(rte_rwlock_t *rwl) { int32_t x; - int success = 0; - while (success == 0) { + while (1) { x = __atomic_load_n(&rwl->cnt, __ATOMIC_RELAXED); - /* a lock is held */ - if (x != 0) { - rte_pause(); - continue; + + /* No readers or writers? 
*/ + if (likely(x < RTE_RWLOCK_WRITE)) { + /* Turn off RTE_RWLOCK_WAIT, turn on RTE_RWLOCK_WRITE */ + if (__atomic_compare_exchange_n(&rwl->cnt, &x, RTE_RWLOCK_WRITE, 1, + __ATOMIC_ACQUIRE, __ATOMIC_RELAXED)) + return; } - success = __atomic_compare_exchange_n(&rwl->cnt, &x, -1, 1, - __ATOMIC_ACQUIRE, __ATOMIC_RELAXED); - } + + /* Turn on writer wait bit */ + if (!(x & RTE_RWLOCK_WAIT)) + __atomic_fetch_or(&rwl->cnt, RTE_RWLOCK_WAIT, __ATOMIC_RELAXED); + + /* Wait until no readers befor trying again */ + while (__atomic_load_n(&rwl->cnt, __ATOMIC_RELAXED) > RTE_RWLOCK_WAIT) + rte_pause(); + + } } /** @@ -179,7 +226,7 @@ rte_rwlock_write_lock(rte_rwlock_t *rwl) static inline void rte_rwlock_write_unlock(rte_rwlock_t *rwl) { - __atomic_store_n(&rwl->cnt, 0, __ATOMIC_RELEASE); + __atomic_fetch_sub(&rwl->cnt, RTE_RWLOCK_WRITE, __ATOMIC_RELEASE); } /** -- 2.35.1 ^ permalink raw reply [flat|nested] 10+ messages in thread
* RE: [PATCH v2] rwlock: prevent readers from starving writers 2022-07-19 20:27 ` [PATCH v2] " Stephen Hemminger @ 2022-07-19 21:52 ` Morten Brørup 2022-07-19 22:33 ` Stephen Hemminger 2022-10-03 10:01 ` David Marchand 1 sibling, 1 reply; 10+ messages in thread From: Morten Brørup @ 2022-07-19 21:52 UTC (permalink / raw) To: Stephen Hemminger, dev > From: Stephen Hemminger [mailto:stephen@networkplumber.org] > Sent: Tuesday, 19 July 2022 22.28 > > Modify reader/writer lock to avoid starvation of writer. The previous > implementation would cause a writer to get starved if readers kept > acquiring the lock. The new version uses an additional bit to indicate > that a writer is waiting and which keeps readers from starving the > writer. > > Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> > Acked-by: Morten Brørup <mb@smartsharesystems.com> > --- > v2 - incorporate feedback, change from RFC to PATCH > > lib/eal/include/generic/rte_rwlock.h | 119 +++++++++++++++++++-------- > 1 file changed, 83 insertions(+), 36 deletions(-) > > diff --git a/lib/eal/include/generic/rte_rwlock.h > b/lib/eal/include/generic/rte_rwlock.h > index da9bc3e9c0e2..59ec54110444 100644 > --- a/lib/eal/include/generic/rte_rwlock.h > +++ b/lib/eal/include/generic/rte_rwlock.h > @@ -15,23 +15,46 @@ > * one writer. All readers are blocked until the writer is finished > * writing. > * > + * This version does not give preference to readers or writers > + * and does not starve either readers or writers. > + * > + * See also: > + * https://locklessinc.com/articles/locks/ > */ > > #ifdef __cplusplus > extern "C" { > #endif > > +#include <rte_branch_prediction.h> > #include <rte_common.h> > -#include <rte_atomic.h> > #include <rte_pause.h> > > /** > * The rte_rwlock_t type. > * > - * cnt is -1 when write lock is held, and > 0 when read locks are > held. 
> + * Readers increment the counter by RTE_RWLOCK_READ (4) > + * Writers set the RTE_RWLOCK_WRITE bit when lock is held > + * and set the RTE_RWLOCK_WAIT bit while waiting. > + * > + * 31 2 1 0 > + * +-------------------+-+-+ > + * | readers | | | > + * +-------------------+-+-+ > + * ^ ^ > + * | | > + * WRITE: lock held ----/ | > + * WAIT: writer pending --/ > */ > + > +#define RTE_RWLOCK_WAIT 0x1 /* Writer is waiting */ > +#define RTE_RWLOCK_WRITE 0x2 /* Writer has the lock */ > +#define RTE_RWLOCK_MASK (RTE_RWLOCK_WAIT | RTE_RWLOCK_WRITE) > + /* Writer is waiting or has lock */ > +#define RTE_RWLOCK_READ 0x4 /* Reader increment */ > + > typedef struct { > - volatile int32_t cnt; /**< -1 when W lock held, > 0 when R locks > held. */ > + int32_t cnt; > } rte_rwlock_t; > > /** > @@ -61,17 +84,24 @@ static inline void > rte_rwlock_read_lock(rte_rwlock_t *rwl) > { > int32_t x; > - int success = 0; > > - while (success == 0) { > - x = __atomic_load_n(&rwl->cnt, __ATOMIC_RELAXED); > - /* write lock is held */ > - if (x < 0) { > + while (1) { > + /* Wait while writer is present or pending */ > + while (__atomic_load_n(&rwl->cnt, __ATOMIC_RELAXED) > + & RTE_RWLOCK_MASK) > rte_pause(); > - continue; > - } > - success = __atomic_compare_exchange_n(&rwl->cnt, &x, x + 1, > 1, > - __ATOMIC_ACQUIRE, __ATOMIC_RELAXED); > + > + /* Try to get read lock */ > + x = __atomic_add_fetch(&rwl->cnt, RTE_RWLOCK_READ, > + __ATOMIC_ACQUIRE); > + > + /* If no writer, then acquire was successful */ > + if (likely(!(x & RTE_RWLOCK_MASK))) > + return; > + > + /* Lost race with writer, backout the change. 
*/ > + __atomic_fetch_sub(&rwl->cnt, RTE_RWLOCK_READ, > + __ATOMIC_RELAXED); > } > } > > @@ -93,17 +123,24 @@ static inline int > rte_rwlock_read_trylock(rte_rwlock_t *rwl) > { > int32_t x; > - int success = 0; > > - while (success == 0) { > - x = __atomic_load_n(&rwl->cnt, __ATOMIC_RELAXED); > - /* write lock is held */ > - if (x < 0) > - return -EBUSY; > - success = __atomic_compare_exchange_n(&rwl->cnt, &x, x + 1, > 1, > - __ATOMIC_ACQUIRE, __ATOMIC_RELAXED); > - } > + x = __atomic_load_n(&rwl->cnt, __ATOMIC_RELAXED); > + > + /* fail if write lock is held or writer is pending */ > + if (x & RTE_RWLOCK_MASK) > + return -EBUSY; > > + /* Try to get read lock */ > + x = __atomic_add_fetch(&rwl->cnt, RTE_RWLOCK_READ, > + __ATOMIC_ACQUIRE); > + > + /* Back out if writer raced in */ > + if (unlikely(x & RTE_RWLOCK_MASK)) { > + __atomic_fetch_sub(&rwl->cnt, RTE_RWLOCK_READ, > + __ATOMIC_RELEASE); > + > + return -EBUSY; > + } > return 0; > } > > @@ -116,7 +153,7 @@ rte_rwlock_read_trylock(rte_rwlock_t *rwl) > static inline void > rte_rwlock_read_unlock(rte_rwlock_t *rwl) > { > - __atomic_fetch_sub(&rwl->cnt, 1, __ATOMIC_RELEASE); > + __atomic_fetch_sub(&rwl->cnt, RTE_RWLOCK_READ, __ATOMIC_RELEASE); > } > > /** > @@ -139,11 +176,12 @@ rte_rwlock_write_trylock(rte_rwlock_t *rwl) > int32_t x; > > x = __atomic_load_n(&rwl->cnt, __ATOMIC_RELAXED); > - if (x != 0 || __atomic_compare_exchange_n(&rwl->cnt, &x, -1, 1, > - __ATOMIC_ACQUIRE, __ATOMIC_RELAXED) == 0) > + if (x < RTE_RWLOCK_WRITE && "x < RTE_RWLOCK_WRITE" will permit this writer thread to race a waiting writer thread, while the waiting writer thread is executing rte_pause(). Have you considered "!x" instead, giving priority to the waiting thread? I suppose your solution is better, because we know that this writer thread is actively running, while the waiting writer thread may have been put on hold by the O/S scheduler.
> + __atomic_compare_exchange_n(&rwl->cnt, &x, x + > RTE_RWLOCK_WRITE, Only a matter of taste, but I would prefer "x | RTE_RWLOCK_WRITE" over "x + RTE_RWLOCK_WRITE". You can leave it as is. > + 1, __ATOMIC_ACQUIRE, __ATOMIC_RELAXED)) > + return 0; > + else > return -EBUSY; > - > - return 0; > } > > /** > @@ -156,18 +194,27 @@ static inline void > rte_rwlock_write_lock(rte_rwlock_t *rwl) > { > int32_t x; > - int success = 0; > > - while (success == 0) { > + while (1) { > x = __atomic_load_n(&rwl->cnt, __ATOMIC_RELAXED); > - /* a lock is held */ > - if (x != 0) { > - rte_pause(); > - continue; > + > + /* No readers or writers? */ > + if (likely(x < RTE_RWLOCK_WRITE)) { > + /* Turn off RTE_RWLOCK_WAIT, turn on RTE_RWLOCK_WRITE > */ > + if (__atomic_compare_exchange_n(&rwl->cnt, &x, > RTE_RWLOCK_WRITE, 1, > + __ATOMIC_ACQUIRE, > __ATOMIC_RELAXED)) > + return; See comment below; this is the point I refer to as the "next race". > } > - success = __atomic_compare_exchange_n(&rwl->cnt, &x, -1, 1, > - __ATOMIC_ACQUIRE, __ATOMIC_RELAXED); > - } > + > + /* Turn on writer wait bit */ > + if (!(x & RTE_RWLOCK_WAIT)) > + __atomic_fetch_or(&rwl->cnt, RTE_RWLOCK_WAIT, > __ATOMIC_RELAXED); Is there a risk of race with two writer threads at this location? If a reader is active, and two writer threads reach this point simultaneously, they will both set RTE_RWLOCK_WAIT here. And then, when the reader thread is done, one of the writer thread will win the next race and replace RTE_RWLOCK_WAIT by RTE_RWLOCK_WRITE. The winning thread will then do its job and afterwards clear RTE_RWLOCK_WRITE. This means that both RTE_RWLOCK_WAIT and RTE_RWLOCK_WRITE have been cleared, but RTE_RWLOCK_WAIT should remain set for the writer thread that lost the race. Did I miss something? It does work with only one writer thread, though. > + > + /* Wait until no readers befor trying again */ Typo: befor -> before. 
> +		while (__atomic_load_n(&rwl->cnt, __ATOMIC_RELAXED) > RTE_RWLOCK_WAIT)
> +			rte_pause();
> +
> 	}
> }
>
> /**
> @@ -179,7 +226,7 @@ rte_rwlock_write_lock(rte_rwlock_t *rwl)
> static inline void
> rte_rwlock_write_unlock(rte_rwlock_t *rwl)
> {
> -	__atomic_store_n(&rwl->cnt, 0, __ATOMIC_RELEASE);
> +	__atomic_fetch_sub(&rwl->cnt, RTE_RWLOCK_WRITE, __ATOMIC_RELEASE);

Yes. This is correct, regardless of whether another writer thread is waiting or not. (Reviewed for one writer thread using rte_rwlock_write_lock() and another using rte_rwlock_write_trylock().)

> }
>
> /**
> --
> 2.35.1
>

The comments in this version are good too.

^ permalink raw reply	[flat|nested] 10+ messages in thread
* Re: [PATCH v2] rwlock: prevent readers from starving writers
  2022-07-19 21:52   ` Morten Brørup
@ 2022-07-19 22:33     ` Stephen Hemminger
  2022-07-20  6:48       ` Morten Brørup
  0 siblings, 1 reply; 10+ messages in thread
From: Stephen Hemminger @ 2022-07-19 22:33 UTC (permalink / raw)
To: Morten Brørup; +Cc: dev

> > /**
> > @@ -179,7 +226,7 @@ rte_rwlock_write_lock(rte_rwlock_t *rwl)
> > static inline void
> > rte_rwlock_write_unlock(rte_rwlock_t *rwl)
> > {
> > -	__atomic_store_n(&rwl->cnt, 0, __ATOMIC_RELEASE);
> > +	__atomic_fetch_sub(&rwl->cnt, RTE_RWLOCK_WRITE,
> > __ATOMIC_RELEASE);
>
> Yes. This is correct, regardless of whether another writer thread is waiting or not. (Reviewed for one writer thread using rte_rwlock_write_lock() and another using rte_rwlock_write_trylock().)

Was trying to stick to the original logic.

After the writer releases, we want both writers and readers to be able to get in equally. This provides a measure of fairness (no preference), so writers can't starve readers either.
* RE: [PATCH v2] rwlock: prevent readers from starving writers
  2022-07-19 22:33     ` Stephen Hemminger
@ 2022-07-20  6:48       ` Morten Brørup
  0 siblings, 0 replies; 10+ messages in thread
From: Morten Brørup @ 2022-07-20 6:48 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: dev

> From: Stephen Hemminger [mailto:stephen@networkplumber.org]
> Sent: Wednesday, 20 July 2022 00.34
>
> > > /**
> > > @@ -179,7 +226,7 @@ rte_rwlock_write_lock(rte_rwlock_t *rwl)
> > > static inline void
> > > rte_rwlock_write_unlock(rte_rwlock_t *rwl)
> > > {
> > > -	__atomic_store_n(&rwl->cnt, 0, __ATOMIC_RELEASE);
> > > +	__atomic_fetch_sub(&rwl->cnt, RTE_RWLOCK_WRITE,
> > > __ATOMIC_RELEASE);
> >
> > Yes. This is correct, regardless of whether another writer thread is
> > waiting or not. (Reviewed for one writer thread using
> > rte_rwlock_write_lock() and another using rte_rwlock_write_trylock().)
>
> Was trying to stick to the original logic.
>
> After the writer releases, we want both writers and readers to be able
> to get in equally. This provides a measure of fairness (no preference),
> so writers can't starve readers either.

OK; I was thinking that writers had preference.

I was about to request you to document this somewhere, but you already noted it above the link to the Lockless Inc. article.

I didn't review the __ATOMIC_ACQUIRE/RELEASE/RELAXED access modes, but all other aspects look good.

Reviewed-by: Morten Brørup <mb@smartsharesystems.com>
* Re: [PATCH v2] rwlock: prevent readers from starving writers
  2022-07-19 20:27 ` [PATCH v2] " Stephen Hemminger
  2022-07-19 21:52   ` Morten Brørup
@ 2022-10-03 10:01   ` David Marchand
  1 sibling, 0 replies; 10+ messages in thread
From: David Marchand @ 2022-10-03 10:01 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: dev, Morten Brørup

On Tue, Jul 19, 2022 at 10:28 PM Stephen Hemminger
<stephen@networkplumber.org> wrote:
>
> Modify reader/writer lock to avoid starvation of writer. The previous
> implementation would cause a writer to get starved if readers kept
> acquiring the lock. The new version uses an additional bit to indicate
> that a writer is waiting, which keeps readers from starving the writer.
>
> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> Acked-by: Morten Brørup <mb@smartsharesystems.com>

Applied, thanks.

--
David Marchand
end of thread, other threads:[~2022-10-03 10:01 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-07-07 20:12 [RFC] rwlock: prevent readers from starving writers Stephen Hemminger
2022-07-08 19:22 ` Honnappa Nagarahalli
2022-07-08 22:04   ` Morten Brørup
2022-07-09 16:22     ` Stephen Hemminger
2022-07-09 16:25     ` Stephen Hemminger
2022-07-19 20:27 ` [PATCH v2] " Stephen Hemminger
2022-07-19 21:52   ` Morten Brørup
2022-07-19 22:33     ` Stephen Hemminger
2022-07-20  6:48       ` Morten Brørup
2022-10-03 10:01   ` David Marchand