From mboxrd@z Thu Jan  1 00:00:00 1970
MIME-Version: 1.0
References: <20230830103303.2428995-1-artemyko@nvidia.com>
 <20230906095227.1032271-1-artemyko@nvidia.com>
In-Reply-To: <20230906095227.1032271-1-artemyko@nvidia.com>
From: David Marchand <david.marchand@redhat.com>
Date: Wed, 6 Sep 2023 14:52:16 +0200
Message-ID: <CAJFAV8zP7zv2yeAFZ-vCeSU2Km2bUzsFG0c2dZar0AT1WRuvUw@mail.gmail.com>
Subject: Re: [PATCH v3] eal: fix memory initialization deadlock
To: Artemy Kovalyov <artemyko@nvidia.com>
Cc: dev@dpdk.org, Thomas Monjalon <thomas@monjalon.net>,
 Ophir Munk <ophirmu@nvidia.com>, 
 stable@dpdk.org, Anatoly Burakov <anatoly.burakov@intel.com>, 
 =?UTF-8?Q?Morten_Br=C3=B8rup?= <mb@smartsharesystems.com>, 
 Stephen Hemminger <stephen@networkplumber.org>
X-Mimecast-Spam-Score: 0
X-Mimecast-Originator: redhat.com
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-BeenThere: stable@dpdk.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: patches for DPDK stable branches <stable.dpdk.org>
List-Unsubscribe: <https://mails.dpdk.org/options/stable>,
 <mailto:stable-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://mails.dpdk.org/archives/stable/>
List-Post: <mailto:stable@dpdk.org>
List-Help: <mailto:stable-request@dpdk.org?subject=help>
List-Subscribe: <https://mails.dpdk.org/listinfo/stable>,
 <mailto:stable-request@dpdk.org?subject=subscribe>
Errors-To: stable-bounces@dpdk.org

On Wed, Sep 6, 2023 at 11:53 AM Artemy Kovalyov <artemyko@nvidia.com> wrote:
>
> The issue arose from a change in the DPDK read-write lock
> implementation. That change added a new flag, RTE_RWLOCK_WAIT, which
> prevents new read locks from being taken while a write lock is queued.
> As a side effect, a recursive read lock (the same execution thread
> acquiring the lock twice) can now trigger a sequence of events that
> ends in a deadlock:
>
> 1. Process 1 takes the first read lock.
> 2. Process 2 attempts to take a write lock, setting RTE_RWLOCK_WAIT
>    because a read lock is held. Process 2 then enters a wait loop
>    until the read lock is released.
> 3. Process 1 tries to take a second read lock. Because a write lock
>    is waiting (RTE_RWLOCK_WAIT is set), it also enters a wait loop
>    until the write lock is acquired and then released.
>
> Both processes end up in a blocked state, unable to proceed, resulting
> in a deadlock scenario.
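
To make the sequence above concrete, here is a minimal single-threaded
sketch that replays the three steps against a toy lock word. It is only
a model: the toy_* names and the TOY_* bit layout are hypothetical, and
the "try" functions make a single attempt where the real lock would spin.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy lock word layout; hypothetical names, not DPDK's definitions. */
#define TOY_WAIT  0x1u /* a writer is queued */
#define TOY_WRITE 0x2u /* a writer holds the lock */
#define TOY_MASK  (TOY_WAIT | TOY_WRITE)
#define TOY_READ  0x4u /* increment per reader */

struct toy_rwlock { uint32_t cnt; };

/* One attempt at a read lock: refused while a writer holds or waits. */
static bool toy_read_trylock(struct toy_rwlock *l)
{
	if (l->cnt & TOY_MASK)
		return false;
	l->cnt += TOY_READ;
	return true;
}

static void toy_read_unlock(struct toy_rwlock *l)
{
	l->cnt -= TOY_READ;
}

/* One attempt at a write lock: on failure the WAIT flag stays set. */
static bool toy_write_trylock(struct toy_rwlock *l)
{
	if (l->cnt < TOY_WRITE) {	/* no readers, no writer */
		l->cnt = TOY_WRITE;	/* clears WAIT, sets WRITE */
		return true;
	}
	l->cnt |= TOY_WAIT;
	return false;
}

static void toy_write_unlock(struct toy_rwlock *l)
{
	l->cnt &= ~TOY_WRITE;
}

/* Replays the three steps; returns true if both parties are stuck. */
bool toy_replay_deadlock(void)
{
	struct toy_rwlock l = { 0 };

	bool first_read = toy_read_trylock(&l);	/* step 1: succeeds */
	bool writer = toy_write_trylock(&l);	/* step 2: refused, sets WAIT */
	assert(l.cnt == (TOY_READ | TOY_WAIT));
	bool second_read = toy_read_trylock(&l);	/* step 3: refused by WAIT */

	/* The writer waits for the reader's unlock, which the reader can
	 * never reach because its second read lock does not return. */
	return first_read && !writer && !second_read;
}

/* Same start, but the reader unlocks instead of locking again. */
bool toy_replay_without_recursion(void)
{
	struct toy_rwlock l = { 0 };

	bool first_read = toy_read_trylock(&l);
	bool writer_early = toy_write_trylock(&l);	/* refused, sets WAIT */
	toy_read_unlock(&l);
	bool writer_late = toy_write_trylock(&l);	/* now succeeds */
	toy_write_unlock(&l);

	return first_read && !writer_early && writer_late && l.cnt == 0;
}
```

In this model both functions return true: the recursive variant leaves
reader and writer refusing each other, while the non-recursive variant
lets the writer through once the single read lock is dropped.
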
>
> Following that change, the RW-lock no longer supports recursion: a
> thread must not take a read lock if it already holds one. The problem
> arises during initialization: rte_eal_init() acquires
> memory_hotplug_lock, and later the call sequence
> rte_eal_memory_init() -> eal_memalloc_init() ->
> rte_memseg_list_walk() acquires it again without releasing it. This
> can deadlock if another process concurrently requests a write lock on
> the same memory_hotplug_lock. To resolve this, replace
> rte_memseg_list_walk() with rte_memseg_list_walk_thread_unsafe().
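
The locked/unlocked split described above could be sketched roughly as
follows. This is an illustration under stated assumptions, not the EAL
code: every toy_* name is hypothetical, the lock is reduced to two
fields, and toy_walk() returns -1 at the point where a real spinning
read lock would wait forever.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the EAL's types or functions. */
struct toy_lock {
	unsigned int readers;
	bool writer_waiting;
};

typedef int (*toy_walk_cb)(int seg, void *arg);

static const int toy_segs[] = { 10, 20, 30 };

/* Walks without locking: the caller must already hold the lock. */
static int toy_walk_thread_unsafe(toy_walk_cb cb, void *arg)
{
	for (size_t i = 0; i < sizeof(toy_segs) / sizeof(toy_segs[0]); i++) {
		int ret = cb(toy_segs[i], arg);
		if (ret != 0)
			return ret;
	}
	return 0;
}

/* Takes the read lock itself, then walks. */
static int toy_walk(struct toy_lock *l, toy_walk_cb cb, void *arg)
{
	if (l->writer_waiting)
		return -1;	/* a spinning lock would block here */
	l->readers++;
	int ret = toy_walk_thread_unsafe(cb, arg);
	l->readers--;
	return ret;
}

static int toy_sum_cb(int seg, void *arg)
{
	*(int *)arg += seg;
	return 0;
}

/* Init path that already holds the read lock while a writer is queued. */
int toy_init(bool use_unsafe_walk, int *sum)
{
	struct toy_lock l = { .readers = 1, .writer_waiting = true };
	int ret;

	*sum = 0;
	if (use_unsafe_walk)
		ret = toy_walk_thread_unsafe(toy_sum_cb, sum);
	else
		ret = toy_walk(&l, toy_sum_cb, sum);

	assert(l.readers == 1);	/* the outer read lock is still held */
	return ret;
}
```

With the locking walk the nested acquisition is refused (toy_init(false,
...) returns -1 in this model); with the thread-unsafe walk the caller's
existing read lock covers the traversal and the callback sees all three
segments.
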
>
> Also add a lock annotation to rte_memseg_list_walk() so that bugs
> similar to this one are caught at compile time.

The annotations are not necessary for the fix (which we will likely
backport to LTS versions).
Please split this change into two patches to separate the annotations
from the fix.


>
> Bugzilla ID: 1277
> Fixes: 832cecc03d77 ("rwlock: prevent readers from starving writers")
> Cc: stable@dpdk.org
>
> Signed-off-by: Artemy Kovalyov <artemyko@nvidia.com>


-- 
David Marchand