From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 053DDA00C5; Sat, 9 Jul 2022 00:04:33 +0200 (CEST) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id E09D1406B4; Sat, 9 Jul 2022 00:04:32 +0200 (CEST) Received: from smartserver.smartsharesystems.com (smartserver.smartsharesystems.com [77.243.40.215]) by mails.dpdk.org (Postfix) with ESMTP id 898B74021E for ; Sat, 9 Jul 2022 00:04:31 +0200 (CEST) X-MimeOLE: Produced By Microsoft Exchange V6.5 Content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable Subject: RE: [RFC] rwlock: prevent readers from starving writers Date: Sat, 9 Jul 2022 00:04:27 +0200 Message-ID: <98CBD80474FA8B44BF855DF32C47DC35D871BB@smartserver.smartshare.dk> In-Reply-To: X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: [RFC] rwlock: prevent readers from starving writers Thread-Index: AQHYkj3qoG6JNj5SAkukgWeYSzkbi6102xTAgAAnY8A= References: <20220707201226.618611-1-stephen@networkplumber.org> From: =?iso-8859-1?Q?Morten_Br=F8rup?= To: "Honnappa Nagarahalli" , "Stephen Hemminger" , Cc: "nd" , "nd" X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org > From: Honnappa Nagarahalli [mailto:Honnappa.Nagarahalli@arm.com] > Sent: Friday, 8 July 2022 21.22 >=20 > > > > > The original reader/writer lock in DPDK can cause a stream of = readers > to > > starve writers. > > > > The new version uses an additional bit to indicate that a writer is > waiting and > > which keeps readers from starving the writer. > This addition makes sense. > I am wondering if we should create a new lock. Is it possible that = some > applications are dependent on the current behavior? Any reader risks having to wait a while for a writer to finish its work. In my opinion, this implementation only increases the probability of = that risk occurring, but it doesn't change the writer's impact on the = readers. Therefore, I think this improved implementation can replace the = old rwlock. >=20 > > > > Signed-off-by: Stephen Hemminger > > --- > > Would like this to be in 22.11, but needs some more review > > > > lib/eal/include/generic/rte_rwlock.h | 93 = ++++++++++++++++++-------- > -- > > 1 file changed, 61 insertions(+), 32 deletions(-) > > > > diff --git a/lib/eal/include/generic/rte_rwlock.h > > b/lib/eal/include/generic/rte_rwlock.h > > index da9bc3e9c0e2..725cd19ffb27 100644 > > --- a/lib/eal/include/generic/rte_rwlock.h > > +++ b/lib/eal/include/generic/rte_rwlock.h > > @@ -13,7 +13,7 @@ > > * This file defines an API for read-write locks. The lock is used > to > > * protect data that allows multiple readers in parallel, but only > > * one writer. All readers are blocked until the writer is finished > > - * writing. > > + * writing. This version will not starve writers. > > * > > */ > > > > @@ -28,10 +28,17 @@ extern "C" { > > /** > > * The rte_rwlock_t type. > > * > > - * cnt is -1 when write lock is held, and > 0 when read locks are > held. > > + * Readers increment the counter by RW_READ (4) > > + * Writers set the RWLOCK_WRITE bit when lock is held > > + * and set the RWLOCK_WAIT bit while waiting. > > */ > > + > > +#define RTE_RWLOCK_WAIT 0x1 /* Writer is waiting */ > > +#define RTE_RWLOCK_WRITE 0x2 /* Writer has the lock */ > > +#define RTE_RWLOCK_READ 0x4 /* Reader increment */ > > + > > typedef struct { > > - volatile int32_t cnt; /**< -1 when W lock held, > 0 when R locks > held. > > */ > > + volatile int32_t cnt; Not signed anymore, so consider uint32_t. Suggest also rename to = cnt_state or similar, since it is not just a counter anymore. > > } rte_rwlock_t; > > > > /** > > @@ -61,17 +68,24 @@ static inline void > > rte_rwlock_read_lock(rte_rwlock_t *rwl) { > > int32_t x; > > - int success =3D 0; > > > > - while (success =3D=3D 0) { > > + while (1) { > > x =3D __atomic_load_n(&rwl->cnt, __ATOMIC_RELAXED); > > /* write lock is held */ Held -> Held or pending, not just held. Add question mark, or move = inside the if block. > > - if (x < 0) { > > + if (x & (RTE_RWLOCK_WAIT | RTE_RWLOCK_WRITE)) { > > rte_pause(); > > continue; > > } > > - success =3D __atomic_compare_exchange_n(&rwl->cnt, &x, x > > + 1, 1, > > - __ATOMIC_ACQUIRE, > > __ATOMIC_RELAXED); > > + > > + /* Try to get read lock */ > > + x =3D __atomic_add_fetch(&rwl->cnt, RTE_RWLOCK_READ, > > + __ATOMIC_ACQUIRE); > > + if (!(x & (RTE_RWLOCK_WAIT | RTE_RWLOCK_WRITE))) > > + return; > > + > > + /* Undo */ Undo -> Unable, so release the read lock. > > + __atomic_fetch_sub(&rwl->cnt, RTE_RWLOCK_READ, > > + __ATOMIC_RELEASE); > > } > > } > > > > @@ -93,17 +107,23 @@ static inline int > > rte_rwlock_read_trylock(rte_rwlock_t *rwl) { > > int32_t x; > > - int success =3D 0; > > > > - while (success =3D=3D 0) { > > - x =3D __atomic_load_n(&rwl->cnt, __ATOMIC_RELAXED); > > - /* write lock is held */ > > - if (x < 0) > > - return -EBUSY; > > - success =3D __atomic_compare_exchange_n(&rwl->cnt, &x, x > > + 1, 1, > > - __ATOMIC_ACQUIRE, > > __ATOMIC_RELAXED); > > - } > > + x =3D __atomic_load_n(&rwl->cnt, __ATOMIC_RELAXED); > > + > > + /* write lock is held */ Same comment as above. > > + if (x & (RTE_RWLOCK_WAIT | RTE_RWLOCK_WRITE)) > > + return -EBUSY; > > + > > + /* Try to get read lock */ > > + x =3D __atomic_add_fetch(&rwl->cnt, RTE_RWLOCK_READ, > > + __ATOMIC_ACQUIRE); > > + > > + if (x & (RTE_RWLOCK_WAIT | RTE_RWLOCK_WRITE)) { Add a comment, e.g.: Unable, so release the read lock. > > + __atomic_fetch_sub(&rwl->cnt, RTE_RWLOCK_READ, > > + __ATOMIC_RELEASE); > > > > + return -EBUSY; > > + } > > return 0; > > } > > > > @@ -116,7 +136,7 @@ rte_rwlock_read_trylock(rte_rwlock_t *rwl) > static > > inline void rte_rwlock_read_unlock(rte_rwlock_t *rwl) { > > - __atomic_fetch_sub(&rwl->cnt, 1, __ATOMIC_RELEASE); > > + __atomic_fetch_sub(&rwl->cnt, RTE_RWLOCK_READ, > > __ATOMIC_RELEASE); > > } > > > > /** > > @@ -139,11 +159,12 @@ rte_rwlock_write_trylock(rte_rwlock_t *rwl) > > int32_t x; > > > > x =3D __atomic_load_n(&rwl->cnt, __ATOMIC_RELAXED); > > - if (x !=3D 0 || __atomic_compare_exchange_n(&rwl->cnt, &x, -1, 1, > > - __ATOMIC_ACQUIRE, __ATOMIC_RELAXED) =3D=3D 0) > > + if (x < RTE_RWLOCK_WRITE && > > + __atomic_compare_exchange_n(&rwl->cnt, &x, x + > > RTE_RWLOCK_WRITE, > > + 1, __ATOMIC_ACQUIRE, > > __ATOMIC_RELAXED)) > > + return 0; > > + else > > return -EBUSY; > > - > > - return 0; > > } > > > > /** > > @@ -156,18 +177,26 @@ static inline void > > rte_rwlock_write_lock(rte_rwlock_t *rwl) { > > int32_t x; > > - int success =3D 0; > > > > - while (success =3D=3D 0) { > > + while (1) { > > x =3D __atomic_load_n(&rwl->cnt, __ATOMIC_RELAXED); > > - /* a lock is held */ > > - if (x !=3D 0) { > > - rte_pause(); > > - continue; > > + > > + /* No readers or writers */ Add question mark, or move inside if block. > > + if (x < RTE_RWLOCK_WRITE) { > > + /* Turn off RTE_RWLOCK_WAIT, turn on > > RTE_RWLOCK_WRITE */ > > + if (__atomic_compare_exchange_n(&rwl->cnt, &x, > > RTE_RWLOCK_WRITE, 1, > > + > > __ATOMIC_ACQUIRE, __ATOMIC_RELAXED)) > > + return; > > } > > - success =3D __atomic_compare_exchange_n(&rwl->cnt, &x, - > > 1, 1, > > - __ATOMIC_ACQUIRE, > > __ATOMIC_RELAXED); > > - } > > + > > + /* Turn on writer wait bit */ > > + if (!(x & RTE_RWLOCK_WAIT)) > > + __atomic_fetch_or(&rwl->cnt, RTE_RWLOCK_WAIT, > > __ATOMIC_RELAXED); > > + > > + /* Wait until can try to take the lock */ > > + while (__atomic_load_n(&rwl->cnt, __ATOMIC_RELAXED) > > > RTE_RWLOCK_WAIT) > > + rte_pause(); > > + } > > } > > > > /** > > @@ -179,7 +208,7 @@ rte_rwlock_write_lock(rte_rwlock_t *rwl) static > > inline void rte_rwlock_write_unlock(rte_rwlock_t *rwl) { > > - __atomic_store_n(&rwl->cnt, 0, __ATOMIC_RELEASE); > > + __atomic_fetch_sub(&rwl->cnt, RTE_RWLOCK_WRITE, > > __ATOMIC_RELEASE); > > } > > > > /** > > -- > > 2.35.1 >=20 Always the creative mind, Stephen. :-) You might consider adding/updating even more comments. Acked-by: Morten Br=F8rup