From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Mon, 10 Mar 2025 14:24:57 -0700
From: Stephen Hemminger <stephen@networkplumber.org>
To: Mikhail Malofeev <mdmalofeev@gmail.com>
Cc: users@dpdk.org
Subject: Re: DPDK multiprocessing issue: what if primary process dies
Message-ID: <20250310142457.424318f4@hermes.local>
In-Reply-To: <CAF6DLt681eC8J-EUFCKJsTOVpSG0naO24NSvA1CO0_Y6fbvw4w@mail.gmail.com>
References: <CAF6DLt681eC8J-EUFCKJsTOVpSG0naO24NSvA1CO0_Y6fbvw4w@mail.gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
List-Id: DPDK usage discussions <users.dpdk.org>

On Mon, 10 Mar 2025 17:29:09 +0100
Mikhail Malofeev <mdmalofeev@gmail.com> wrote:

> Hello everyone,
> 
> I am new to DPDK and I am implementing a DPDK-based application which might
> be running as a multi-process application. My understanding is that one of
> the processes should be the primary one, responsible for initializing and
> managing the hugepages. All the other processes should be secondary,
> meaning that they are connected to the primary process during DPDK
> initialization to get information about hugepages (please correct me if I
> am mistaken here).

That is one usage model, but secondary processes add overhead
and are certainly not required.

> I am interested in the scenario where there are multiple DPDK processes,
> and the primary process dies. As far as I understand, existing secondary
> processes will continue running; however, I will not be able to start new
> DPDK processes:

Once the primary process dies, the DPDK application is no longer in a
viable state. The most common model is to use a service manager (like
systemd) to recover when the primary process dies.

The secondary processes should all monitor for notification that
the primary has died, and exit when it does.
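
As a rough illustration of that monitoring, here is a minimal sketch in
plain POSIX C. It does not link against DPDK. It assumes the primary holds
an fcntl() lock on some file for its whole lifetime, so the lock goes away
when the primary exits or crashes. The function names and the lock-file
path are hypothetical. DPDK provides rte_eal_primary_proc_alive() in
rte_eal.h for this kind of check, and a real secondary would probably call
that rather than roll its own.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 1 if some other process currently holds a
 * lock on lock_path, 0 otherwise (including when the file cannot be opened).
 */
int primary_alive(const char *lock_path)
{
    assert(lock_path != NULL);

    int fd = open(lock_path, O_RDONLY);
    if (fd < 0)
        return 0;

    /* Ask the kernel whether a whole-file write lock would conflict. */
    struct flock fl = {
        .l_type = F_WRLCK,
        .l_whence = SEEK_SET,
        .l_start = 0,
        .l_len = 0,
    };
    int rc = fcntl(fd, F_GETLK, &fl);
    close(fd);

    if (rc < 0)
        return 0;
    /* F_UNLCK means nobody else holds a conflicting lock. */
    return fl.l_type != F_UNLCK;
}

/*
 * Hypothetical helper: poll every poll_ms milliseconds and return once the
 * primary's lock is gone. The caller is expected to clean up and exit.
 */
void wait_for_primary_exit(const char *lock_path, unsigned poll_ms)
{
    struct timespec ts = {
        .tv_sec = poll_ms / 1000,
        .tv_nsec = (long)(poll_ms % 1000) * 1000000L,
    };

    while (primary_alive(lock_path))
        nanosleep(&ts, NULL);
}
```

A secondary could run wait_for_primary_exit() on a control thread, or call
primary_alive() periodically from its main loop, and exit once the primary
is gone so that the service manager can restart everything cleanly.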

> 1) I cannot start a new primary process because it will try to initialize
> hugepages and this will fail since some of them are in use by existing
> secondary processes (source:
> https://stackoverflow.com/questions/74602244/dpdk-multi-process-kill-a-primary-process-and-restart-as-a-secondary-doesnt-wo
> )
> 2) I cannot start a new secondary process because there is no primary
> process to connect to and get the hugepages info, so the secondary process
> won't be able to complete the initialization.
> 
> Hence, if my primary process terminates, what should I do in order to start
> it again?
> 
> I have come up with two possible solutions:
> 1) If the primary process dies, I will have to kill all the secondary
> processes and restart everything. This solution is not ideal because I want
> to minimize the amount of time when processes are not running.
> 2) Have a dummy primary process that does not contain any critical business
> logic and is solely responsible for DPDK initialization. Then, all the
> business-critical applications must be secondary processes. If one of them
> dies, I can simply restart it without interrupting other processes. This
> solution sounds better than the first, but it is still not perfect as it
> requires running an additional dummy process which I would prefer to avoid.
> 
> Could anyone please advise me on the idiomatic way of dealing with these
> multiprocessing issues?
> 

Use systemd, and do a full restart.