Date: Thu, 30 Nov 2023 19:24:01 +0300
From: Dmitry Kozlyuk
To: Fuji Nafiul
Cc: users@dpdk.org
Subject: Re: how to make dpdk processes tolerable to segmantation fault?
Message-ID: <20231130192401.2e3f3c4c@sovereign>

2023-11-30 13:45 (UTC+0600), Fuji Nafiul:
> In a normal c program, I saw that the segmentation fault in 1 loosely
> coupled thread doesn't necessarily affect other threads or the main
> program. There, I can check all the threads by process ID of it in every
> certain period of time and if some unexpected segmentation fault occurs or
> got killed I can re run the thread and it works fine. I can later monitor
> the logs and inspect the situation.
>
> But I saw that, segmentation fault or other unexpected error in remotely
> launched (using DPDK) functions on different core affects the whole dpdk
> process and whole dpdk program crashes..
> why is that?
>
> Is there any alternative way to handle this scenario ? How can I take
> measures for unexpected future error occurrence where I should auto rerun
> dpdk remote processes in live system?

Please consider running the buggy code that causes SIGSEGV
in a separate process rather than in a thread.
If it must use DPDK, can it be made an independent app?

DPDK is unlikely to ever support the described scenario.
Continuing to run a process after SIGSEGV is inherently unsafe.
Specifically, DPDK communicates with its lcore threads
using pipes allocated at startup.
If such a thread crashed and a SIGSEGV handler that does not kill the app
were installed, the communication would hang.
More generally, DPDK employs user-space synchronization primitives,
which cannot recover if one of the threads using them crashes.
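
The "separate process" suggestion above can be sketched roughly as follows.
This is only an illustration using plain POSIX calls (fork, waitpid), not
any DPDK API. The names run_supervised, worker_fn, ok_worker and
crashing_worker are hypothetical and do not come from DPDK or from this
thread. By default SIGSEGV terminates the whole process that raised it, so
the fragile code is isolated in a child and the parent restarts the child
when it is killed by a signal. A real DPDK worker would probably be its own
executable that initializes DPDK independently; that part is an assumption
and is not shown.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical worker type: its return value becomes the child's exit code. */
typedef int (*worker_fn)(void *arg);

/*
 * Run `worker` in a child process. If the child is killed by a signal
 * (for example SIGSEGV), start it again, at most `max_restarts` times.
 * Returns the number of crashes observed, or -1 if fork/waitpid fails.
 */
int run_supervised(worker_fn worker, void *arg, int max_restarts)
{
    int crashes = 0;

    assert(worker != NULL && max_restarts >= 0);

    for (;;) {
        int status = 0;
        pid_t pid = fork();

        if (pid < 0)
            return -1;
        if (pid == 0)
            _exit(worker(arg)); /* child: run the fragile code, then exit */

        /* Parent: wait for this child, retrying if interrupted. */
        while (waitpid(pid, &status, 0) < 0) {
            if (errno != EINTR)
                return -1;
        }

        if (WIFEXITED(status))
            return crashes; /* the worker finished on its own */

        if (WIFSIGNALED(status)) {
            crashes++;
            fprintf(stderr, "worker %ld killed by signal %d\n",
                    (long)pid, WTERMSIG(status));
            if (crashes > max_restarts)
                return crashes; /* give up after too many crashes */
        }
    }
}

/* Demo worker that finishes normally. */
int ok_worker(void *arg)
{
    (void)arg;
    return 0;
}

/* Demo worker that simulates a crash by raising SIGSEGV in the child. */
int crashing_worker(void *arg)
{
    (void)arg;
    signal(SIGSEGV, SIG_DFL); /* make sure the default action applies */
    raise(SIGSEGV);
    return 1; /* not reached when the signal terminates the child */
}
```

Calling run_supervised(crashing_worker, NULL, 2) starts the worker once and
restarts it twice before giving up. The parent itself is never affected by
the child's SIGSEGV, and it keeps no shared pipes or locks with the crashed
child that could be left in a broken state.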