From: David Marchand
Date: Thu, 28 Apr 2022 12:28:20 +0200
Subject: Re: [PATCH v4] eal: add seqlock
To: Mattias Rönnblom
Cc: dev, Thomas Monjalon, onar.olsen@ericsson.com, Honnappa Nagarahalli,
 nd, "Ananyev, Konstantin", Morten Brørup, Stephen Hemminger,
 hofors@lysator.liu.se, Ola Liljedahl
In-Reply-To: <20220408142442.157192-1-mattias.ronnblom@ericsson.com>
References: <20220408142442.157192-1-mattias.ronnblom@ericsson.com>
List-Id: DPDK patches and discussions

Hello Mattias,

On Fri, Apr 8, 2022 at 4:25 PM Mattias Rönnblom wrote:
>
> A sequence lock (seqlock) is a synchronization primitive which allows
> for data-race free, low-overhead,
> high-frequency reads, especially for data structures shared across
> many cores and which are updated relatively infrequently.
>
> A seqlock permits multiple parallel readers. The variant of seqlock
> implemented in this patch supports multiple writers as well. A
> spinlock is used for writer-writer serialization.
>
> To avoid resource reclamation and other issues, the data protected by
> a seqlock is best off being self-contained (i.e., no pointers [except
> to constant data]).
>
> One way to think about seqlocks is that they provide means to perform
> atomic operations on data objects larger than what the native atomic
> machine instructions allow for.
>
> DPDK seqlocks are not preemption safe on the writer side. A thread
> preemption affects performance, not correctness.
>
> A seqlock contains a sequence number, which can be thought of as the
> generation of the data it protects.
>
> A reader will
>   1. Load the sequence number (sn).
>   2. Load, in arbitrary order, the seqlock-protected data.
>   3. Load the sn again.
>   4. Check if the first and second sn are equal, and even numbered.
>      If they are not, discard the loaded data, and restart from 1.
>
> The first three steps need to be ordered using suitable memory fences.
>
> A writer will
>   1. Take the spinlock, to serialize writer access.
>   2. Load the sn.
>   3. Store the original sn + 1 as the new sn.
>   4. Perform loads and stores to the seqlock-protected data.
>   5. Store the original sn + 2 as the new sn.
>   6. Release the spinlock.
>
> Proper memory fencing is required to make sure the first sn store, the
> data stores, and the second sn store appear to the reader in the
> mentioned order.
>
> The sn loads and stores must be atomic, but the data loads and stores
> need not be.
>
> The original seqlock design and implementation was done by Stephen
> Hemminger. This is an independent implementation, using C11 atomics.
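
For illustration only, the reader and writer steps quoted above can be sketched with C11 atomics roughly as follows. This is not the code from the patch: every demo_* name and the two-int payload are hypothetical, the writer spinlock is reduced to a bare atomic flag, and the plain (non-atomic) data accesses follow the commit message's statement that data loads and stores need not be atomic.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical sketch; not the rte_seqlock implementation. */
struct demo_seqlock {
	_Atomic uint32_t sn;     /* sequence number, odd while a write is in progress */
	atomic_bool writer_busy; /* minimal spinlock serializing writers */
};

/* Hypothetical self-contained payload (no pointers). */
struct demo_data {
	struct demo_seqlock lock;
	int x;
	int y;
};

static inline void
demo_init(struct demo_data *d)
{
	atomic_init(&d->lock.sn, 0);
	atomic_init(&d->lock.writer_busy, false);
	d->x = 0;
	d->y = 0;
}

static inline uint32_t
demo_read_begin(struct demo_seqlock *sl)
{
	/* Reader step 1. Acquire keeps the data loads after this load. */
	return atomic_load_explicit(&sl->sn, memory_order_acquire);
}

static inline bool
demo_read_retry(struct demo_seqlock *sl, uint32_t begin_sn)
{
	/* An odd sn means a write was in progress: retry right away. */
	if (begin_sn & 1)
		return true;
	/* Keep the data loads ordered before the second sn load. */
	atomic_thread_fence(memory_order_acquire);
	/* Reader steps 3 and 4. */
	return atomic_load_explicit(&sl->sn, memory_order_relaxed) != begin_sn;
}

static inline void
demo_write_lock(struct demo_seqlock *sl)
{
	uint32_t sn;

	/* Writer step 1: spin until the flag is ours. */
	while (atomic_exchange_explicit(&sl->writer_busy, true,
					memory_order_acquire))
		;
	/* Writer steps 2 and 3: make the sn odd. */
	sn = atomic_load_explicit(&sl->sn, memory_order_relaxed);
	atomic_store_explicit(&sl->sn, sn + 1, memory_order_relaxed);
	/* Order the first sn store before the data stores. */
	atomic_thread_fence(memory_order_release);
}

static inline void
demo_write_unlock(struct demo_seqlock *sl)
{
	uint32_t sn = atomic_load_explicit(&sl->sn, memory_order_relaxed);

	/* Writer step 5. Release orders the data stores before this store. */
	atomic_store_explicit(&sl->sn, sn + 1, memory_order_release);
	/* Writer step 6. */
	atomic_store_explicit(&sl->writer_busy, false, memory_order_release);
}

static inline void
demo_read(struct demo_data *d, int *x, int *y)
{
	uint32_t sn;

	do {
		sn = demo_read_begin(&d->lock);
		/* Reader step 2: plain loads, discarded if a retry is needed. */
		*x = d->x;
		*y = d->y;
	} while (demo_read_retry(&d->lock, sn));
}

static inline void
demo_write(struct demo_data *d, int x, int y)
{
	demo_write_lock(&d->lock);
	/* Writer step 4: plain stores. */
	d->x = x;
	d->y = y;
	demo_write_unlock(&d->lock);
}
```

A caller would run demo_init() once, then call demo_write() from any writer thread and demo_read() from any reader thread. The sn is even whenever no write is in flight, which is why the retry check can reject an odd begin value without reloading the sn.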
>
> For more information on seqlocks, see
> https://en.wikipedia.org/wiki/Seqlock

The revision changelog should go after the commit log, separated with ---.

>
> PATCH v4:
>   * Reverted to Linux kernel style naming on the read side.
>   * Bail out early from the retry function if an odd sequence
>     number is encountered.
>   * Added experimental warnings in the API documentation.
>   * Static initializer now uses named field initialization.
>   * Various tweaks to API documentation (including the example).
>
> PATCH v3:
>   * Renamed both read and write-side critical section begin/end functions
>     to better match rwlock naming, per Ola Liljedahl's suggestion.
>   * Added 'extern "C"' guards for C++ compatibility.
>   * Refer to the main lcore as the main lcore, and nothing else.
>
> PATCH v2:
>   * Skip instead of fail unit test in case too few lcores are available.
>   * Use main lcore for testing, reducing the minimum number of lcores
>     required to run the unit tests to four.
>   * Consistently refer to sn field as the "sequence number" in the
>     documentation.
>   * Fixed spelling mistakes in documentation.
>
> Updates since RFC:
>   * Added API documentation.
>   * Added link to Wikipedia article in the commit message.
>   * Changed seqlock sequence number field from uint64_t (which was
>     overkill) to uint32_t. The sn type needs to be sufficiently large
>     to assure no reader will read a sn, access the data, and then read
>     the same sn, but the sn has been incremented enough times to have
>     wrapped during the read, and arrived back at the original sn.
>   * Added RTE_SEQLOCK_INITIALIZER macro for static initialization.
>   * Replaced the rte_seqlock struct + separate rte_seqlock_t typedef
>     with an anonymous struct typedef:ed to rte_seqlock_t.
>
> Acked-by: Morten Brørup
> Reviewed-by: Ola Liljedahl
> Signed-off-by: Mattias Rönnblom

We are missing a MAINTAINERS update, either with a new section for
this lock (like for MCS and ticket locks), or adding the new test code
under the EAL API and common code section (like the rest of the locks).

This new lock is not referenced in doxygen (see doc/api/doxy-api-index.md).

It's worth a release notes update for advertising this new lock type.

[snip]

> diff --git a/app/test/test_seqlock.c b/app/test/test_seqlock.c
> new file mode 100644
> index 0000000000..3f1ce53678
> --- /dev/null
> +++ b/app/test/test_seqlock.c

[snip]

> +/* Only a compile-time test */
> +static rte_seqlock_t __rte_unused static_init_lock = RTE_SEQLOCK_INITIALIZER;
> +
> +static int
> +test_seqlock(void)
> +{
> +	struct reader readers[MAX_READERS];
> +	unsigned int num_readers;
> +	unsigned int num_lcores;
> +	unsigned int i;
> +	unsigned int lcore_id;
> +	unsigned int reader_lcore_ids[MAX_READERS];
> +	unsigned int worker_writer_lcore_id = 0;
> +	int rc = 0;

A unit test is supposed to use TEST_* macros as return values.
I concede the other locks' unit tests return 0 or -1 (which is
equivalent, given the TEST_SUCCESS / TEST_FAILED values).
We can go with 0 / -1 (and a cleanup could be done later on app/test
globally), but at least change to TEST_SKIPPED when lacking lcores
(see below).

> +
> +	num_lcores = rte_lcore_count();
> +
> +	if (num_lcores < MIN_LCORE_COUNT) {
> +		printf("Too few cores to run test. Skipping.\n");
> +		return 0;

return TEST_SKIPPED;

> +	}
> +
> +	num_readers = num_lcores - NUM_WRITERS;
> +
> +	struct data *data = rte_zmalloc(NULL, sizeof(struct data), 0);
> +
> +	i = 0;
> +	RTE_LCORE_FOREACH_WORKER(lcore_id) {
> +		if (i == 0) {
> +			rte_eal_remote_launch(writer_run, data, lcore_id);
> +			worker_writer_lcore_id = lcore_id;
> +		} else {
> +			unsigned int reader_idx = i - 1;
> +			struct reader *reader = &readers[reader_idx];
> +
> +			reader->data = data;
> +			reader->stop = 0;
> +
> +			rte_eal_remote_launch(reader_run, reader, lcore_id);
> +			reader_lcore_ids[reader_idx] = lcore_id;
> +		}
> +		i++;
> +	}
> +
> +	if (writer_run(data) != 0 ||
> +	    rte_eal_wait_lcore(worker_writer_lcore_id) != 0)
> +		rc = -1;
> +
> +	for (i = 0; i < num_readers; i++) {
> +		reader_stop(&readers[i]);
> +		if (rte_eal_wait_lcore(reader_lcore_ids[i]) != 0)
> +			rc = -1;
> +	}
> +
> +	return rc;
> +}
> +
> +REGISTER_TEST_COMMAND(seqlock_autotest, test_seqlock);
> diff --git a/lib/eal/common/meson.build b/lib/eal/common/meson.build
> index 917758cc65..a41343bfed 100644
> --- a/lib/eal/common/meson.build
> +++ b/lib/eal/common/meson.build
> @@ -35,6 +35,7 @@ sources += files(
>          'rte_malloc.c',
>          'rte_random.c',
>          'rte_reciprocal.c',
> +	'rte_seqlock.c',

The indent is not correct; please use spaces in meson files.

>          'rte_service.c',
>          'rte_version.c',
>  )
> diff --git a/lib/eal/include/rte_seqlock.h b/lib/eal/include/rte_seqlock.h
> new file mode 100644
> index 0000000000..961816aa10
> --- /dev/null
> +++ b/lib/eal/include/rte_seqlock.h
> @@ -0,0 +1,319 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2022 Ericsson AB
> + */
> +
> +#ifndef _RTE_SEQLOCK_H_
> +#define _RTE_SEQLOCK_H_
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +/**
> + * @file
> + * RTE Seqlock

Nit: the mention of RTE adds nothing; I'd remove it.
> + *
> + * A sequence lock (seqlock) is a synchronization primitive allowing
> + * multiple, parallel, readers to efficiently and safely (i.e., in a
> + * data-race free manner) access lock-protected data. The RTE seqlock
> + * permits multiple writers as well. A spinlock is used for
> + * writer-writer synchronization.
> + *
> + * A reader never blocks a writer. Very high frequency writes may
> + * prevent readers from making progress.
> + *
> + * A seqlock is not preemption-safe on the writer side. If a writer is
> + * preempted, it may block readers until the writer thread is allowed
> + * to continue. Heavy computations should be kept out of the
> + * writer-side critical section, to avoid delaying readers.
> + *
> + * Seqlocks are useful for data which are read by many cores, at a
> + * high frequency, and relatively infrequently written to.
> + *
> + * One way to think about seqlocks is that they provide means to
> + * perform atomic operations on objects larger than what the native
> + * machine instructions allow for.
> + *
> + * To avoid resource reclamation issues, the data protected by a
> + * seqlock should typically be kept self-contained (e.g., no pointers
> + * to mutable, dynamically allocated data).
> + *
> + * Example usage:
> + * @code{.c}
> + * #define MAX_Y_LEN (16)
> + * // Application-defined example data structure, protected by a seqlock.
> + * struct config {
> + *         rte_seqlock_t lock;
> + *         int param_x;
> + *         char param_y[MAX_Y_LEN];
> + * };
> + *
> + * // Accessor function for reading config fields.
> + * void
> + * config_read(const struct config *config, int *param_x, char *param_y)
> + * {
> + *         uint32_t sn;
> + *
> + *         do {
> + *                 sn = rte_seqlock_read_begin(&config->lock);
> + *
> + *                 // Loads may be atomic or non-atomic, as in this example.
> + *                 *param_x = config->param_x;
> + *                 strcpy(param_y, config->param_y);
> + *                 // An alternative to an immediate retry is to abort and
> + *                 // try again at some later time, assuming progress is
> + *                 // possible without the data.
> + *         } while (rte_seqlock_read_retry(&config->lock, sn));
> + * }
> + *
> + * // Accessor function for writing config fields.
> + * void
> + * config_update(struct config *config, int param_x, const char *param_y)
> + * {
> + *         rte_seqlock_write_lock(&config->lock);
> + *         // Stores may be atomic or non-atomic, as in this example.
> + *         config->param_x = param_x;
> + *         strcpy(config->param_y, param_y);
> + *         rte_seqlock_write_unlock(&config->lock);
> + * }
> + * @endcode
> + *
> + * @see
> + * https://en.wikipedia.org/wiki/Seqlock.
> + */

The rest lgtm.

-- 
David Marchand