From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <dev-bounces@dpdk.org>
Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124])
	by inbox.dpdk.org (Postfix) with ESMTP id 27DFAA00C4;
	Sun,  1 May 2022 15:46:23 +0200 (CEST)
Received: from [217.70.189.124] (localhost [127.0.0.1])
	by mails.dpdk.org (Postfix) with ESMTP id 1BE674069D;
	Sun,  1 May 2022 15:46:23 +0200 (CEST)
Received: from mail.lysator.liu.se (mail.lysator.liu.se [130.236.254.3])
 by mails.dpdk.org (Postfix) with ESMTP id 286F04003F
 for <dev@dpdk.org>; Sun,  1 May 2022 15:46:22 +0200 (CEST)
Received: from mail.lysator.liu.se (localhost [127.0.0.1])
 by mail.lysator.liu.se (Postfix) with ESMTP id 4454A105A5
 for <dev@dpdk.org>; Sun,  1 May 2022 15:46:20 +0200 (CEST)
Received: by mail.lysator.liu.se (Postfix, from userid 1004)
 id 42E0B103B3; Sun,  1 May 2022 15:46:20 +0200 (CEST)
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on
 hermod.lysator.liu.se
X-Spam-Level: 
X-Spam-Status: No, score=-3.2 required=5.0 tests=ALL_TRUSTED, AWL, NICE_REPLY_A,
 T_SCC_BODY_TEXT_LINE autolearn=disabled version=3.4.6
X-Spam-Score: -3.2
Received: from [192.168.1.59] (unknown [62.63.215.114])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange ECDHE (P-256) server-signature RSA-PSS (2048 bits) server-digest
 SHA256) (No client certificate requested)
 by mail.lysator.liu.se (Postfix) with ESMTPSA id 546A11033C;
 Sun,  1 May 2022 15:46:14 +0200 (CEST)
Message-ID: <3e9b0560-9306-0826-4911-1cf0e72c30d9@lysator.liu.se>
Date: Sun, 1 May 2022 15:46:13 +0200
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101
 Thunderbird/91.8.1
Subject: Re: [PATCH v4] eal: add seqlock
Content-Language: en-US
To: David Marchand <david.marchand@redhat.com>,
 =?UTF-8?Q?Mattias_R=c3=b6nnblom?= <mattias.ronnblom@ericsson.com>
Cc: dev <dev@dpdk.org>, Thomas Monjalon <thomas@monjalon.net>,
 onar.olsen@ericsson.com, Honnappa Nagarahalli
 <Honnappa.Nagarahalli@arm.com>, nd <nd@arm.com>,
 "Ananyev, Konstantin" <konstantin.ananyev@intel.com>,
 =?UTF-8?Q?Morten_Br=c3=b8rup?= <mb@smartsharesystems.com>,
 Stephen Hemminger <stephen@networkplumber.org>,
 Ola Liljedahl <ola.liljedahl@arm.com>
References: <c118c4ec-a7f4-5115-5b93-c7cd50b3ef99@ericsson.com>
 <20220408142442.157192-1-mattias.ronnblom@ericsson.com>
 <CAJFAV8xS_egr1Qm7DrcBFs6GU4t7Ehw8FoWm=zBJWpzRTamPYg@mail.gmail.com>
From: =?UTF-8?Q?Mattias_R=c3=b6nnblom?= <hofors@lysator.liu.se>
In-Reply-To: <CAJFAV8xS_egr1Qm7DrcBFs6GU4t7Ehw8FoWm=zBJWpzRTamPYg@mail.gmail.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
X-Virus-Scanned: ClamAV using ClamSMTP
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: DPDK patches and discussions <dev.dpdk.org>
List-Unsubscribe: <https://mails.dpdk.org/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://mails.dpdk.org/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <https://mails.dpdk.org/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
Errors-To: dev-bounces@dpdk.org

On 2022-04-28 12:28, David Marchand wrote:
> Hello Mattias,
> 
> On Fri, Apr 8, 2022 at 4:25 PM Mattias Rönnblom
> <mattias.ronnblom@ericsson.com> wrote:
>>
>> A sequence lock (seqlock) is synchronization primitive which allows
>> for data-race free, low-overhead, high-frequency reads, especially for
>> data structures shared across many cores and which are updated
>> relatively infrequently.
>>
>> A seqlock permits multiple parallel readers. The variant of seqlock
>> implemented in this patch supports multiple writers as well. A
>> spinlock is used for writer-writer serialization.
>>
>> To avoid resource reclamation and other issues, the data protected by
>> a seqlock is best off being self-contained (i.e., no pointers [except
>> to constant data]).
>>
>> One way to think about seqlocks is that they provide means to perform
>> atomic operations on data objects larger what the native atomic
>> machine instructions allow for.
>>
>> DPDK seqlocks are not preemption safe on the writer side. A thread
>> preemption affects performance, not correctness.
>>
>> A seqlock contains a sequence number, which can be thought of as the
>> generation of the data it protects.
>>
>> A reader will
>>    1. Load the sequence number (sn).
>>    2. Load, in arbitrary order, the seqlock-protected data.
>>    3. Load the sn again.
>>    4. Check if the first and second sn are equal, and even numbered.
>>       If they are not, discard the loaded data, and restart from 1.
>>
>> The first three steps need to be ordered using suitable memory fences.
>>
>> A writer will
>>    1. Take the spinlock, to serialize writer access.
>>    2. Load the sn.
>>    3. Store the original sn + 1 as the new sn.
>>    4. Perform load and stores to the seqlock-protected data.
>>    5. Store the original sn + 2 as the new sn.
>>    6. Release the spinlock.
>>
>> Proper memory fencing is required to make sure the first sn store, the
>> data stores, and the second sn store appear to the reader in the
>> mentioned order.
>>
>> The sn loads and stores must be atomic, but the data loads and stores
>> need not be.
>>
>> The original seqlock design and implementation was done by Stephen
>> Hemminger. This is an independent implementation, using C11 atomics.
>>
>> For more information on seqlocks, see
>> https://en.wikipedia.org/wiki/Seqlock
> 
> Revisions changelog should be after commitlog, separated with ---.
> 

OK.

>>
>> PATCH v4:
>>    * Reverted to Linux kernel style naming on the read side.
>>    * Bail out early from the retry function if an odd sequence
>>      number is encountered.
>>    * Added experimental warnings in the API documentation.
>>    * Static initializer now uses named field initialization.
>>    * Various tweaks to API documentation (including the example).
>>
>> PATCH v3:
>>    * Renamed both read and write-side critical section begin/end functions
>>      to better match rwlock naming, per Ola Liljedahl's suggestion.
>>    * Added 'extern "C"' guards for C++ compatibility.
>>    * Refer to the main lcore as the main lcore, and nothing else.
>>
>> PATCH v2:
>>    * Skip instead of fail unit test in case too few lcores are available.
>>    * Use main lcore for testing, reducing the minimum number of lcores
>>      required to run the unit tests to four.
>>    * Consistently refer to sn field as the "sequence number" in the
>>      documentation.
>>    * Fixed spelling mistakes in documentation.
>>
>> Updates since RFC:
>>    * Added API documentation.
>>    * Added link to Wikipedia article in the commit message.
>>    * Changed seqlock sequence number field from uint64_t (which was
>>      overkill) to uint32_t. The sn type needs to be sufficiently large
>>      to assure no reader will read a sn, access the data, and then read
>>      the same sn, but the sn has been incremented enough times to have
>>      wrapped during the read, and arrived back at the original sn.
>>    * Added RTE_SEQLOCK_INITIALIZER macro for static initialization.
>>    * Removed the rte_seqlock struct + separate rte_seqlock_t typedef
>>      with an anonymous struct typedef:ed to rte_seqlock_t.
>>
>> Acked-by: Morten Brørup <mb@smartsharesystems.com>
>> Reviewed-by: Ola Liljedahl <ola.liljedahl@arm.com>
>> Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
> 
> We are missing a MAINTAINERS update, either with a new section for
> this lock (like for MCS and ticket locks), or adding the new test code
> under the EAL API and common code section (like rest of the locks).
> 

OK. I added a new section, and myself as the maintainer. If you want to 
merge it with EAL and the common code that fine with me as well. Let me 
know in that case. The seqlock is a tiny bit of code, but there are some 
intricacies (like always when C11 memory models matter).

> This new lock is not referenced in doxygen (see doc/api/doxy-api-index.md).
> 

OK.

> It's worth a release notes update for advertising this new lock type.
> 

OK.

> [snip]
> 
> 
>> diff --git a/app/test/test_seqlock.c b/app/test/test_seqlock.c
>> new file mode 100644
>> index 0000000000..3f1ce53678
>> --- /dev/null
>> +++ b/app/test/test_seqlock.c
> 
> [snip]
> 
>> +/* Only a compile-time test */
>> +static rte_seqlock_t __rte_unused static_init_lock = RTE_SEQLOCK_INITIALIZER;
>> +
>> +static int
>> +test_seqlock(void)
>> +{
>> +       struct reader readers[MAX_READERS];
>> +       unsigned int num_readers;
>> +       unsigned int num_lcores;
>> +       unsigned int i;
>> +       unsigned int lcore_id;
>> +       unsigned int reader_lcore_ids[MAX_READERS];
>> +       unsigned int worker_writer_lcore_id = 0;
>> +       int rc = 0;
> 
> A unit test is supposed to use TEST_* macros as return values.
> I concede other locks unit tests return 0 or -1 (which is equivalent,
> given TEST_SUCCESS / TEST_FAILED values).
> We can go with 0 / -1 (and a cleanup could be done later on app/test
> globally), but at least change to TEST_SKIPPED when lacking lcores
> (see below).
> 

I'll fix it right away instead.

> 
>> +
>> +       num_lcores = rte_lcore_count();
>> +
>> +       if (num_lcores < MIN_LCORE_COUNT) {
>> +               printf("Too few cores to run test. Skipping.\n");
>> +               return 0;
> 
> return TEST_SKIPPED;
> 
> 

Excellent. I remember asking myself if there was such a return code when 
I wrote that code, but then I forgot to actually go look for it.

>> +       }
>> +
>> +       num_readers = num_lcores - NUM_WRITERS;
>> +
>> +       struct data *data = rte_zmalloc(NULL, sizeof(struct data), 0);
>> +
>> +       i = 0;
>> +       RTE_LCORE_FOREACH_WORKER(lcore_id) {
>> +               if (i == 0) {
>> +                       rte_eal_remote_launch(writer_run, data, lcore_id);
>> +                       worker_writer_lcore_id = lcore_id;
>> +               } else {
>> +                       unsigned int reader_idx = i - 1;
>> +                       struct reader *reader = &readers[reader_idx];
>> +
>> +                       reader->data = data;
>> +                       reader->stop = 0;
>> +
>> +                       rte_eal_remote_launch(reader_run, reader, lcore_id);
>> +                       reader_lcore_ids[reader_idx] = lcore_id;
>> +               }
>> +               i++;
>> +       }
>> +
>> +       if (writer_run(data) != 0 ||
>> +           rte_eal_wait_lcore(worker_writer_lcore_id) != 0)
>> +               rc = -1;
>> +
>> +       for (i = 0; i < num_readers; i++) {
>> +               reader_stop(&readers[i]);
>> +               if (rte_eal_wait_lcore(reader_lcore_ids[i]) != 0)
>> +                       rc = -1;
>> +       }
>> +
>> +       return rc;
>> +}
>> +
>> +REGISTER_TEST_COMMAND(seqlock_autotest, test_seqlock);
>> diff --git a/lib/eal/common/meson.build b/lib/eal/common/meson.build
>> index 917758cc65..a41343bfed 100644
>> --- a/lib/eal/common/meson.build
>> +++ b/lib/eal/common/meson.build
>> @@ -35,6 +35,7 @@ sources += files(
>>           'rte_malloc.c',
>>           'rte_random.c',
>>           'rte_reciprocal.c',
>> +       'rte_seqlock.c',
> 
> Indent is not correct, please use spaces for meson files.
> 

OK.

> 
>>           'rte_service.c',
>>           'rte_version.c',
>>   )
>> diff --git a/lib/eal/include/rte_seqlock.h b/lib/eal/include/rte_seqlock.h
>> new file mode 100644
>> index 0000000000..961816aa10
>> --- /dev/null
>> +++ b/lib/eal/include/rte_seqlock.h
>> @@ -0,0 +1,319 @@
>> +/* SPDX-License-Identifier: BSD-3-Clause
>> + * Copyright(c) 2022 Ericsson AB
>> + */
>> +
>> +#ifndef _RTE_SEQLOCK_H_
>> +#define _RTE_SEQLOCK_H_
>> +
>> +#ifdef __cplusplus
>> +extern "C" {
>> +#endif
>> +
>> +/** >> + * @file
>> + * RTE Seqlock
> 
> Nit: mention of RTE adds nothing, I'd remove it.
> 

I agree, but there seemed to be a convention that called for this.

> 
>> + *
>> + * A sequence lock (seqlock) is a synchronization primitive allowing
>> + * multiple, parallel, readers to efficiently and safely (i.e., in a
>> + * data-race free manner) access lock-protected data. The RTE seqlock
>> + * permits multiple writers as well. A spinlock is used to
>> + * writer-writer synchronization.
>> + *
>> + * A reader never blocks a writer. Very high frequency writes may
>> + * prevent readers from making progress.
>> + *
>> + * A seqlock is not preemption-safe on the writer side. If a writer is
>> + * preempted, it may block readers until the writer thread is allowed
>> + * to continue. Heavy computations should be kept out of the
>> + * writer-side critical section, to avoid delaying readers.
>> + *
>> + * Seqlocks are useful for data which are read by many cores, at a
>> + * high frequency, and relatively infrequently written to.
>> + *
>> + * One way to think about seqlocks is that they provide means to
>> + * perform atomic operations on objects larger than what the native
>> + * machine instructions allow for.
>> + *
>> + * To avoid resource reclamation issues, the data protected by a
>> + * seqlock should typically be kept self-contained (e.g., no pointers
>> + * to mutable, dynamically allocated data).
>> + *
>> + * Example usage:
>> + * @code{.c}
>> + * #define MAX_Y_LEN (16)
>> + * // Application-defined example data structure, protected by a seqlock.
>> + * struct config {
>> + *         rte_seqlock_t lock;
>> + *         int param_x;
>> + *         char param_y[MAX_Y_LEN];
>> + * };
>> + *
>> + * // Accessor function for reading config fields.
>> + * void
>> + * config_read(const struct config *config, int *param_x, char *param_y)
>> + * {
>> + *         uint32_t sn;
>> + *
>> + *         do {
>> + *                 sn = rte_seqlock_read_begin(&config->lock);
>> + *
>> + *                 // Loads may be atomic or non-atomic, as in this example.
>> + *                 *param_x = config->param_x;
>> + *                 strcpy(param_y, config->param_y);
>> + *                 // An alternative to an immediate retry is to abort and
>> + *                 // try again at some later time, assuming progress is
>> + *                 // possible without the data.
>> + *         } while (rte_seqlock_read_retry(&config->lock));
>> + * }
>> + *
>> + * // Accessor function for writing config fields.
>> + * void
>> + * config_update(struct config *config, int param_x, const char *param_y)
>> + * {
>> + *         rte_seqlock_write_lock(&config->lock);
>> + *         // Stores may be atomic or non-atomic, as in this example.
>> + *         config->param_x = param_x;
>> + *         strcpy(config->param_y, param_y);
>> + *         rte_seqlock_write_unlock(&config->lock);
>> + * }
>> + * @endcode
>> + *
>> + * @see
>> + * https://en.wikipedia.org/wiki/Seqlock.
>> + */
> 
> 
> The rest lgtm.
> 
> 

Thanks a lot David for the review!