From: Mattias Rönnblom
CC: Thomas Monjalon, David Marchand, Mattias Rönnblom, Ola Liljedahl
Subject: [RFC] eal: add seqlock
Date: Fri, 25 Mar 2022 21:24:28 +0100
Message-ID: <20220325202428.94628-1-mattias.ronnblom@ericsson.com>
X-Mailer: git-send-email 2.25.1
List-Id: DPDK patches and discussions

A sequence lock (seqlock) is a synchronization primitive which allows
for data-race free, low-overhead, high-frequency reads, especially for
data structures that are shared across many cores and updated
relatively infrequently.

A seqlock permits multiple parallel readers. The variant of seqlock
implemented in this patch supports multiple writers as well. A
spinlock is used for writer-writer serialization.

To avoid resource reclamation and other issues, the data protected by
a seqlock should preferably be self-contained (i.e., no pointers
[except to constant data]).

One way to think about seqlocks is that they provide a means to
perform atomic operations on data objects larger than what the native
atomic machine instructions allow for.

DPDK seqlocks are not preemption safe on the writer side. Preemption
of a writer thread inside its critical section affects performance
(readers retry, and other writers spin, until the preempted writer
completes), not correctness.

A seqlock contains a sequence number, which can be thought of as the
generation of the data it protects.

A reader will
  1. Load the sequence number (sn).
  2. Load, in arbitrary order, the seqlock-protected data.
  3. Load the sn again.
  4. Check if the first and second sn are equal, and even numbered.
     If they are not, discard the loaded data, and restart from 1.

The first three steps need to be ordered using suitable memory fences.

A writer will
  1. Take the spinlock, to serialize writer access.
  2. Load the sn.
  3. Store the original sn + 1 as the new sn.
  4. Perform loads and stores to the seqlock-protected data.
  5. Store the original sn + 2 as the new sn.
  6. Release the spinlock.

Proper memory fencing is required to make sure the first sn store, the
data stores, and the second sn store appear to the reader in the
mentioned order.

The sn loads and stores must be atomic, but the data loads and stores
need not be.

The original seqlock design and implementation was done by Stephen
Hemminger. This is an independent implementation, using C11 atomics.

This RFC version lacks API documentation.

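The reader and writer steps listed above can be sketched in standalone
C11 roughly as follows. This is only an illustration under stated
assumptions, not the code in this patch: struct seqlock, struct pair,
read_begin(), read_retry(), write_begin(), write_end(), pair_write()
and pair_read() are hypothetical names, <stdatomic.h> is used instead
of the GCC __atomic builtins, and an atomic_flag stands in for
rte_spinlock_t.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* All names below are hypothetical; this is not the rte_seqlock API. */
struct seqlock {
	_Atomic uint64_t sn; /* odd while a write is in progress */
	atomic_flag writer;  /* minimal spinlock serializing writers */
};

struct pair {
	struct seqlock lock;
	uint64_t a; /* invariant maintained by writers: a == b */
	uint64_t b;
};

static uint64_t
read_begin(struct seqlock *sl)
{
	/* Reader step 1. Acquire keeps the data loads after this load. */
	return atomic_load_explicit(&sl->sn, memory_order_acquire);
}

static bool
read_retry(struct seqlock *sl, uint64_t begin_sn)
{
	uint64_t end_sn;

	/* Keep the data loads (step 2) before the second sn load (step 3). */
	atomic_thread_fence(memory_order_acquire);
	end_sn = atomic_load_explicit(&sl->sn, memory_order_relaxed);

	/* Step 4: retry if a write was in progress or has completed. */
	return (begin_sn & 1) || begin_sn != end_sn;
}

static void
write_begin(struct seqlock *sl)
{
	uint64_t sn;

	/* Writer step 1: take the spinlock. */
	while (atomic_flag_test_and_set_explicit(&sl->writer,
						 memory_order_acquire))
		;

	/* Steps 2-3: make the sn odd before any data store. */
	sn = atomic_load_explicit(&sl->sn, memory_order_relaxed);
	atomic_store_explicit(&sl->sn, sn + 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);
}

static void
write_end(struct seqlock *sl)
{
	uint64_t sn = atomic_load_explicit(&sl->sn, memory_order_relaxed);

	assert(sn & 1); /* a write must be in progress */

	/* Step 5: release keeps the data stores before this sn store. */
	atomic_store_explicit(&sl->sn, sn + 1, memory_order_release);

	/* Step 6: release the spinlock. */
	atomic_flag_clear_explicit(&sl->writer, memory_order_release);
}

static void
pair_write(struct pair *p, uint64_t value)
{
	write_begin(&p->lock);
	p->a = value; /* step 4: plain stores */
	p->b = value;
	write_end(&p->lock);
}

static void
pair_read(struct pair *p, uint64_t *a, uint64_t *b)
{
	uint64_t sn;

	do {
		sn = read_begin(&p->lock);
		*a = p->a; /* plain loads; discarded on retry */
		*b = p->b;
	} while (read_retry(&p->lock, sn));
}
```

A caller would start from a zero-initialized struct pair (sn 0, flag
clear), call pair_write() from any number of writer threads and
pair_read() from any number of reader threads. The sketch assumes, as
the description above does, that plain (non-atomic) data accesses are
acceptable because the reader discards anything it loaded while a
write was in progress; a stricter reading of the C11 memory model
would use relaxed atomic accesses for the data as well.
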
Reviewed-by: Ola Liljedahl
Signed-off-by: Mattias Rönnblom
---
 app/test/meson.build          |   2 +
 app/test/test_seqlock.c       | 197 ++++++++++++++++++++++++++++++++++
 lib/eal/common/meson.build    |   1 +
 lib/eal/common/rte_seqlock.c  |  12 +++
 lib/eal/include/meson.build   |   1 +
 lib/eal/include/rte_seqlock.h |  84 +++++++++++++++
 lib/eal/version.map           |   3 +
 7 files changed, 300 insertions(+)
 create mode 100644 app/test/test_seqlock.c
 create mode 100644 lib/eal/common/rte_seqlock.c
 create mode 100644 lib/eal/include/rte_seqlock.h

diff --git a/app/test/meson.build b/app/test/meson.build
index 5fc1dd1b7b..5e418e8766 100644
--- a/app/test/meson.build
+++ b/app/test/meson.build
@@ -125,6 +125,7 @@ test_sources = files(
         'test_rwlock.c',
         'test_sched.c',
         'test_security.c',
+        'test_seqlock.c',
         'test_service_cores.c',
         'test_spinlock.c',
         'test_stack.c',
@@ -214,6 +215,7 @@ fast_tests = [
         ['rwlock_rde_wro_autotest', true],
         ['sched_autotest', true],
         ['security_autotest', false],
+        ['seqlock_autotest', true],
         ['spinlock_autotest', true],
         ['stack_autotest', false],
         ['stack_lf_autotest', false],
diff --git a/app/test/test_seqlock.c b/app/test/test_seqlock.c
new file mode 100644
index 0000000000..a727e16caf
--- /dev/null
+++ b/app/test/test_seqlock.c
@@ -0,0 +1,197 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2022 Ericsson AB
+ */
+
+#include <rte_seqlock.h>
+
+#include <rte_cycles.h>
+#include <rte_malloc.h>
+#include <rte_random.h>
+
+#include <inttypes.h>
+
+#include "test.h"
+
+struct data {
+	rte_seqlock_t lock;
+
+	uint64_t a;
+	uint64_t b __rte_cache_aligned;
+	uint64_t c __rte_cache_aligned;
+} __rte_cache_aligned;
+
+struct reader {
+	struct data *data;
+	uint8_t stop;
+};
+
+#define WRITER_RUNTIME (2.0) /* s */
+
+#define WRITER_MAX_DELAY (100) /* us */
+
+#define INTERRUPTED_WRITER_FREQUENCY (1000)
+#define WRITER_INTERRUPT_TIME (1) /* us */
+
+static int
+writer_start(void *arg)
+{
+	struct data *data = arg;
+	uint64_t deadline;
+
+	deadline = rte_get_timer_cycles() +
+		WRITER_RUNTIME * rte_get_timer_hz();
+
+	while (rte_get_timer_cycles() < deadline) {
+		bool interrupted;
+		uint64_t new_value;
+		unsigned int delay;
+
+		new_value = rte_rand();
+
+		interrupted = rte_rand_max(INTERRUPTED_WRITER_FREQUENCY) == 0;
+
+		rte_seqlock_write_begin(&data->lock);
+
+		data->c = new_value;
+
+		/* These compiler barriers (both on the test reader
+		 * and the test writer side) are here to ensure that
+		 * loads/stores *usually* happen in test program order
+		 * (always on a TSO machine). They are arranged in such
+		 * a way that the writer stores in a different order
+		 * than the reader loads, to emulate an arbitrary
+		 * order. A real application using a seqlock does not
+		 * require any compiler barriers.
+		 */
+		rte_compiler_barrier();
+		data->b = new_value;
+
+		if (interrupted)
+			rte_delay_us_block(WRITER_INTERRUPT_TIME);
+
+		rte_compiler_barrier();
+		data->a = new_value;
+
+		rte_seqlock_write_end(&data->lock);
+
+		delay = rte_rand_max(WRITER_MAX_DELAY);
+
+		rte_delay_us_block(delay);
+	}
+
+	return 0;
+}
+
+#define INTERRUPTED_READER_FREQUENCY (1000)
+#define READER_INTERRUPT_TIME (1000) /* us */
+
+static int
+reader_start(void *arg)
+{
+	struct reader *r = arg;
+	int rc = 0;
+
+	while (__atomic_load_n(&r->stop, __ATOMIC_RELAXED) == 0 && rc == 0) {
+		struct data *data = r->data;
+		bool interrupted;
+		uint64_t a;
+		uint64_t b;
+		uint64_t c;
+		uint64_t sn;
+
+		interrupted = rte_rand_max(INTERRUPTED_READER_FREQUENCY) == 0;
+
+		do {
+			sn = rte_seqlock_read_begin(&data->lock);
+
+			a = data->a;
+			/* See writer_start() for an explanation why
+			 * these barriers are here.
+			 */
+			rte_compiler_barrier();
+
+			if (interrupted)
+				rte_delay_us_block(READER_INTERRUPT_TIME);
+
+			c = data->c;
+
+			rte_compiler_barrier();
+			b = data->b;
+
+		} while (rte_seqlock_read_retry(&data->lock, sn));
+
+		if (a != b || b != c) {
+			printf("Reader observed inconsistent data values "
+			       "%" PRIu64 " %" PRIu64 " %" PRIu64 "\n",
+			       a, b, c);
+			rc = -1;
+		}
+	}
+
+	return rc;
+}
+
+static void
+reader_stop(struct reader *reader)
+{
+	__atomic_store_n(&reader->stop, 1, __ATOMIC_RELAXED);
+}
+
+#define NUM_WRITERS (2)
+#define MIN_NUM_READERS (2)
+#define MAX_READERS (RTE_MAX_LCORE - NUM_WRITERS - 1)
+#define MIN_LCORE_COUNT (NUM_WRITERS + MIN_NUM_READERS + 1)
+
+static int
+test_seqlock(void)
+{
+	struct reader readers[MAX_READERS];
+	unsigned int num_readers;
+	unsigned int num_lcores;
+	unsigned int i;
+	unsigned int lcore_id;
+	unsigned int writer_lcore_ids[NUM_WRITERS] = { 0 };
+	unsigned int reader_lcore_ids[MAX_READERS];
+	int rc = 0;
+
+	num_lcores = rte_lcore_count();
+
+	if (num_lcores < MIN_LCORE_COUNT)
+		return -1;
+
+	num_readers = num_lcores - NUM_WRITERS - 1;
+
+	struct data *data = rte_zmalloc(NULL, sizeof(struct data), 0);
+
+	i = 0;
+	RTE_LCORE_FOREACH_WORKER(lcore_id) {
+		if (i < NUM_WRITERS) {
+			rte_eal_remote_launch(writer_start, data, lcore_id);
+			writer_lcore_ids[i] = lcore_id;
+		} else {
+			unsigned int reader_idx = i - NUM_WRITERS;
+			struct reader *reader = &readers[reader_idx];
+
+			reader->data = data;
+			reader->stop = 0;
+
+			rte_eal_remote_launch(reader_start, reader, lcore_id);
+			reader_lcore_ids[reader_idx] = lcore_id;
+		}
+		i++;
+	}
+
+	for (i = 0; i < NUM_WRITERS; i++)
+		if (rte_eal_wait_lcore(writer_lcore_ids[i]) != 0)
+			rc = -1;
+
+	for (i = 0; i < num_readers; i++) {
+		reader_stop(&readers[i]);
+		if (rte_eal_wait_lcore(reader_lcore_ids[i]) != 0)
+			rc = -1;
+	}
+
+	return rc;
+}
+
+REGISTER_TEST_COMMAND(seqlock_autotest, test_seqlock);
diff --git a/lib/eal/common/meson.build b/lib/eal/common/meson.build
index 917758cc65..a41343bfed 100644
--- a/lib/eal/common/meson.build
+++ b/lib/eal/common/meson.build
@@ -35,6 +35,7 @@ sources += files(
         'rte_malloc.c',
         'rte_random.c',
         'rte_reciprocal.c',
+        'rte_seqlock.c',
         'rte_service.c',
         'rte_version.c',
 )
diff --git a/lib/eal/common/rte_seqlock.c b/lib/eal/common/rte_seqlock.c
new file mode 100644
index 0000000000..d4fe648799
--- /dev/null
+++ b/lib/eal/common/rte_seqlock.c
@@ -0,0 +1,12 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2022 Ericsson AB
+ */
+
+#include <rte_seqlock.h>
+
+void
+rte_seqlock_init(rte_seqlock_t *seqlock)
+{
+	seqlock->sn = 0;
+	rte_spinlock_init(&seqlock->lock);
+}
diff --git a/lib/eal/include/meson.build b/lib/eal/include/meson.build
index 9700494816..48df5f1a21 100644
--- a/lib/eal/include/meson.build
+++ b/lib/eal/include/meson.build
@@ -36,6 +36,7 @@ headers += files(
         'rte_per_lcore.h',
         'rte_random.h',
         'rte_reciprocal.h',
+        'rte_seqlock.h',
         'rte_service.h',
         'rte_service_component.h',
         'rte_string_fns.h',
diff --git a/lib/eal/include/rte_seqlock.h b/lib/eal/include/rte_seqlock.h
new file mode 100644
index 0000000000..b975ca848a
--- /dev/null
+++ b/lib/eal/include/rte_seqlock.h
@@ -0,0 +1,84 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2022 Ericsson AB
+ */
+
+#ifndef _RTE_SEQLOCK_H_
+#define _RTE_SEQLOCK_H_
+
+#include <stdbool.h>
+#include <stdint.h>
+
+#include <rte_atomic.h>
+#include <rte_branch_prediction.h>
+#include <rte_spinlock.h>
+
+struct rte_seqlock {
+	uint64_t sn;
+	rte_spinlock_t lock;
+};
+
+typedef struct rte_seqlock rte_seqlock_t;
+
+__rte_experimental
+void
+rte_seqlock_init(rte_seqlock_t *seqlock);
+
+__rte_experimental
+static inline uint64_t
+rte_seqlock_read_begin(const rte_seqlock_t *seqlock)
+{
+	/* __ATOMIC_ACQUIRE to prevent loads after (in program order)
+	 * from happening before the sn load. Synchronizes-with the
+	 * store release in rte_seqlock_write_end().
+	 */
+	return __atomic_load_n(&seqlock->sn, __ATOMIC_ACQUIRE);
+}
+
+__rte_experimental
+static inline bool
+rte_seqlock_read_retry(const rte_seqlock_t *seqlock, uint64_t begin_sn)
+{
+	uint64_t end_sn;
+
+	/* make sure the data loads happen before the sn load */
+	rte_atomic_thread_fence(__ATOMIC_ACQUIRE);
+
+	end_sn = __atomic_load_n(&seqlock->sn, __ATOMIC_RELAXED);
+
+	return unlikely(begin_sn & 1 || begin_sn != end_sn);
+}
+
+__rte_experimental
+static inline void
+rte_seqlock_write_begin(rte_seqlock_t *seqlock)
+{
+	uint64_t sn;
+
+	/* to synchronize with other writers */
+	rte_spinlock_lock(&seqlock->lock);
+
+	sn = seqlock->sn + 1;
+
+	__atomic_store_n(&seqlock->sn, sn, __ATOMIC_RELAXED);
+
+	/* __ATOMIC_RELEASE to prevent stores after (in program order)
+	 * from happening before the sn store.
+	 */
+	rte_atomic_thread_fence(__ATOMIC_RELEASE);
+}
+
+__rte_experimental
+static inline void
+rte_seqlock_write_end(rte_seqlock_t *seqlock)
+{
+	uint64_t sn;
+
+	sn = seqlock->sn + 1;
+
+	/* synchronizes-with the load acquire in rte_seqlock_read_begin() */
+	__atomic_store_n(&seqlock->sn, sn, __ATOMIC_RELEASE);
+
+	rte_spinlock_unlock(&seqlock->lock);
+}
+
+#endif /* _RTE_SEQLOCK_H_ */
diff --git a/lib/eal/version.map b/lib/eal/version.map
index b53eeb30d7..4a9d0ed899 100644
--- a/lib/eal/version.map
+++ b/lib/eal/version.map
@@ -420,6 +420,9 @@ EXPERIMENTAL {
 	rte_intr_instance_free;
 	rte_intr_type_get;
 	rte_intr_type_set;
+
+	# added in 22.07
+	rte_seqlock_init;
 };
 
 INTERNAL {
-- 
2.25.1