From mboxrd@z Thu Jan 1 00:00:00 1970
From: Stephen Hemminger
To: dev@dpdk.org
Cc: Stephen Hemminger
Subject: [RFC] rwlock: prevent readers from starving writers
Date: Thu, 7 Jul 2022 13:12:26 -0700
Message-Id: <20220707201226.618611-1-stephen@networkplumber.org>
X-Mailer: git-send-email 2.35.1
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
List-Id: DPDK patches and discussions

The original reader/writer lock in DPDK can cause a stream of readers
to starve writers.

The new version uses an additional bit to indicate that a writer is
waiting, which keeps readers from starving the writer.

Signed-off-by: Stephen Hemminger
---
Would like this to be in 22.11, but needs some more review

 lib/eal/include/generic/rte_rwlock.h | 93 ++++++++++++++++++----------
 1 file changed, 61 insertions(+), 32 deletions(-)

diff --git a/lib/eal/include/generic/rte_rwlock.h b/lib/eal/include/generic/rte_rwlock.h
index da9bc3e9c0e2..725cd19ffb27 100644
--- a/lib/eal/include/generic/rte_rwlock.h
+++ b/lib/eal/include/generic/rte_rwlock.h
@@ -13,7 +13,7 @@
  * This file defines an API for read-write locks. The lock is used to
  * protect data that allows multiple readers in parallel, but only
  * one writer. All readers are blocked until the writer is finished
- * writing.
+ * writing. This version will not starve writers.
  *
  */
 
@@ -28,10 +28,17 @@ extern "C" {
 /**
  * The rte_rwlock_t type.
  *
- * cnt is -1 when write lock is held, and > 0 when read locks are held.
+ * Readers increment the counter by RTE_RWLOCK_READ (4)
+ * Writers set the RTE_RWLOCK_WRITE bit when lock is held
+ * and set the RTE_RWLOCK_WAIT bit while waiting.
  */
+
+#define RTE_RWLOCK_WAIT 0x1 /* Writer is waiting */
+#define RTE_RWLOCK_WRITE 0x2 /* Writer has the lock */
+#define RTE_RWLOCK_READ 0x4 /* Reader increment */
+
 typedef struct {
-	volatile int32_t cnt; /**< -1 when W lock held, > 0 when R locks held. */
+	volatile int32_t cnt;
 } rte_rwlock_t;
 
 /**
@@ -61,17 +68,24 @@ static inline void
 rte_rwlock_read_lock(rte_rwlock_t *rwl)
 {
 	int32_t x;
-	int success = 0;
 
-	while (success == 0) {
+	while (1) {
 		x = __atomic_load_n(&rwl->cnt, __ATOMIC_RELAXED);
 		/* write lock is held */
-		if (x < 0) {
+		if (x & (RTE_RWLOCK_WAIT | RTE_RWLOCK_WRITE)) {
 			rte_pause();
 			continue;
 		}
-		success = __atomic_compare_exchange_n(&rwl->cnt, &x, x + 1, 1,
-					__ATOMIC_ACQUIRE, __ATOMIC_RELAXED);
+
+		/* Try to get read lock */
+		x = __atomic_add_fetch(&rwl->cnt, RTE_RWLOCK_READ,
+				       __ATOMIC_ACQUIRE);
+		if (!(x & (RTE_RWLOCK_WAIT | RTE_RWLOCK_WRITE)))
+			return;
+
+		/* Undo */
+		__atomic_fetch_sub(&rwl->cnt, RTE_RWLOCK_READ,
+				   __ATOMIC_RELEASE);
 	}
 }
 
@@ -93,17 +107,23 @@ static inline int
 rte_rwlock_read_trylock(rte_rwlock_t *rwl)
 {
 	int32_t x;
-	int success = 0;
 
-	while (success == 0) {
-		x = __atomic_load_n(&rwl->cnt, __ATOMIC_RELAXED);
-		/* write lock is held */
-		if (x < 0)
-			return -EBUSY;
-		success = __atomic_compare_exchange_n(&rwl->cnt, &x, x + 1, 1,
-					__ATOMIC_ACQUIRE, __ATOMIC_RELAXED);
-	}
+	x = __atomic_load_n(&rwl->cnt, __ATOMIC_RELAXED);
+
+	/* write lock is held */
+	if (x & (RTE_RWLOCK_WAIT | RTE_RWLOCK_WRITE))
+		return -EBUSY;
+
+	/* Try to get read lock */
+	x = __atomic_add_fetch(&rwl->cnt, RTE_RWLOCK_READ,
+			       __ATOMIC_ACQUIRE);
+
+	if (x & (RTE_RWLOCK_WAIT | RTE_RWLOCK_WRITE)) {
+		__atomic_fetch_sub(&rwl->cnt, RTE_RWLOCK_READ,
+				   __ATOMIC_RELEASE);
+		return -EBUSY;
+	}
 
 	return 0;
 }
 
@@ -116,7 +136,7 @@ rte_rwlock_read_trylock(rte_rwlock_t *rwl)
 static inline void
 rte_rwlock_read_unlock(rte_rwlock_t *rwl)
 {
-	__atomic_fetch_sub(&rwl->cnt, 1, __ATOMIC_RELEASE);
+	__atomic_fetch_sub(&rwl->cnt, RTE_RWLOCK_READ, __ATOMIC_RELEASE);
 }
 
 /**
@@ -139,11 +159,12 @@ rte_rwlock_write_trylock(rte_rwlock_t *rwl)
 	int32_t x;
 
 	x = __atomic_load_n(&rwl->cnt, __ATOMIC_RELAXED);
-	if (x != 0 || __atomic_compare_exchange_n(&rwl->cnt, &x, -1, 1,
-			      __ATOMIC_ACQUIRE, __ATOMIC_RELAXED) == 0)
+	if (x < RTE_RWLOCK_WRITE &&
+	    __atomic_compare_exchange_n(&rwl->cnt, &x, x + RTE_RWLOCK_WRITE,
+					1, __ATOMIC_ACQUIRE, __ATOMIC_RELAXED))
+		return 0;
+	else
 		return -EBUSY;
-
-	return 0;
 }
 
 /**
@@ -156,18 +177,26 @@ static inline void
 rte_rwlock_write_lock(rte_rwlock_t *rwl)
 {
 	int32_t x;
-	int success = 0;
 
-	while (success == 0) {
+	while (1) {
 		x = __atomic_load_n(&rwl->cnt, __ATOMIC_RELAXED);
-		/* a lock is held */
-		if (x != 0) {
-			rte_pause();
-			continue;
+
+		/* No readers or writers */
+		if (x < RTE_RWLOCK_WRITE) {
+			/* Turn off RTE_RWLOCK_WAIT, turn on RTE_RWLOCK_WRITE */
+			if (__atomic_compare_exchange_n(&rwl->cnt, &x, RTE_RWLOCK_WRITE, 1,
+							__ATOMIC_ACQUIRE, __ATOMIC_RELAXED))
+				return;
 		}
-		success = __atomic_compare_exchange_n(&rwl->cnt, &x, -1, 1,
-					__ATOMIC_ACQUIRE, __ATOMIC_RELAXED);
-	}
+
+		/* Turn on writer wait bit */
+		if (!(x & RTE_RWLOCK_WAIT))
+			__atomic_fetch_or(&rwl->cnt, RTE_RWLOCK_WAIT, __ATOMIC_RELAXED);
+
+		/* Wait until can try to take the lock */
+		while (__atomic_load_n(&rwl->cnt, __ATOMIC_RELAXED) > RTE_RWLOCK_WAIT)
+			rte_pause();
+	}
 }
 
 /**
@@ -179,7 +208,7 @@ rte_rwlock_write_lock(rte_rwlock_t *rwl)
 static inline void
 rte_rwlock_write_unlock(rte_rwlock_t *rwl)
 {
-	__atomic_store_n(&rwl->cnt, 0, __ATOMIC_RELEASE);
+	__atomic_fetch_sub(&rwl->cnt, RTE_RWLOCK_WRITE, __ATOMIC_RELEASE);
 }
 
 /**
-- 
2.35.1
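
The counter encoding can be tried outside DPDK with a small single-threaded
sketch. The code below is not part of the patch: it uses portable C11
<stdatomic.h> instead of the GCC __atomic builtins, every demo_* name is
hypothetical, and demo_write_lock_attempt() models only one pass of the
rte_rwlock_write_lock() loop without spinning. It also uses a strong
compare-exchange (the patch uses the weak form) so that the single-threaded
walk-through is deterministic.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdint.h>

/* Same bit layout as the patch; every demo_* name is hypothetical. */
#define DEMO_WAIT  0x1 /* a writer is waiting */
#define DEMO_WRITE 0x2 /* a writer holds the lock */
#define DEMO_READ  0x4 /* added once per reader */

typedef struct { _Atomic int32_t cnt; } demo_rwlock_t;

/* Reader: back off if a writer holds the lock or is waiting for it. */
static int demo_read_trylock(demo_rwlock_t *l)
{
	int32_t x = atomic_load_explicit(&l->cnt, memory_order_relaxed);

	if (x & (DEMO_WAIT | DEMO_WRITE))
		return -EBUSY;

	/* fetch_add returns the old value; add DEMO_READ to get the new one */
	x = atomic_fetch_add_explicit(&l->cnt, DEMO_READ,
				      memory_order_acquire) + DEMO_READ;
	if (x & (DEMO_WAIT | DEMO_WRITE)) {
		/* a writer appeared in between: undo the increment */
		atomic_fetch_sub_explicit(&l->cnt, DEMO_READ,
					  memory_order_release);
		return -EBUSY;
	}
	return 0;
}

static void demo_read_unlock(demo_rwlock_t *l)
{
	atomic_fetch_sub_explicit(&l->cnt, DEMO_READ, memory_order_release);
}

/* One pass of the writer loop: take the lock or announce the wait. */
static int demo_write_lock_attempt(demo_rwlock_t *l)
{
	int32_t x = atomic_load_explicit(&l->cnt, memory_order_relaxed);

	/* No readers and no writer: set WRITE, which also clears WAIT. */
	if (x < DEMO_WRITE &&
	    atomic_compare_exchange_strong_explicit(&l->cnt, &x, DEMO_WRITE,
			memory_order_acquire, memory_order_relaxed))
		return 0;

	/* Lock is busy: make sure the wait bit is set. */
	if (!(x & DEMO_WAIT))
		atomic_fetch_or_explicit(&l->cnt, DEMO_WAIT,
					 memory_order_relaxed);
	return -EBUSY;
}

static void demo_write_unlock(demo_rwlock_t *l)
{
	atomic_fetch_sub_explicit(&l->cnt, DEMO_WRITE, memory_order_release);
}

/* Walk through the starvation scenario; returns the final counter. */
int demo_scenario(void)
{
	demo_rwlock_t l;

	atomic_init(&l.cnt, 0);
	assert(demo_read_trylock(&l) == 0);            /* cnt == 4 */
	assert(demo_write_lock_attempt(&l) == -EBUSY); /* cnt == 5, WAIT set */
	assert(demo_read_trylock(&l) == -EBUSY);       /* new reader refused */
	demo_read_unlock(&l);                          /* cnt == 1 */
	assert(demo_write_lock_attempt(&l) == 0);      /* cnt == 2, WAIT cleared */
	assert(demo_read_trylock(&l) == -EBUSY);       /* writer holds the lock */
	demo_write_unlock(&l);                         /* cnt == 0 */
	return atomic_load(&l.cnt);
}
```

Calling demo_scenario() from a main() should return 0. The step to watch is
the third assertion: once the wait bit is set, a new reader is refused even
though only a reader holds the lock, which is what lets the writer get in
after the existing reader leaves.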