Date: Tue, 22 Jul 2025 15:28:30 -0700
From: Stephen Hemminger
To: Ivan Malov
Cc: Khadem Ullah <14pwcse1224@uetpeshawar.edu.pk>, Bruce Richardson,
 Thomas Monjalon, Ferruh Yigit, Andrew Rybchenko, dev@dpdk.org,
 dpdk stable
Subject: Re: [PATCH] lib/ethdev: fix segfault in secondary process by
 validating dev_private pointer
Message-ID: <20250722152830.2f8f08ac@hermes.local>
In-Reply-To: <55209b79-6845-5c25-bb8c-e2ecb3f0e290@arknetworks.am>

On Tue, 22 Jul 2025 23:05:06 +0400 (+04)
Ivan Malov wrote:

> There is a difference between control path and data path. Always has been. Yes,
> on data path, DPDK has historically sought better performance, but on the slow
> path, checks have typically been implemented, even in the flow API, with the
> only exception being "asynchronous flow" APIs, which are meant to be fast-path.
>
> Yes, the idea to have a "secondary process reference counter" in 'rte_device'
> to be either guarded with its own lock or accessed atomically by 'rte_dev_probe'
> and 'rte_dev_remove' (to increment and decrement/check respectively) as well as
> by 'rte_eth_dev_close' and 'rte_eth_dev_reset' (to decrement/check) may not be
> a hill to die on, to be honest, and might be wrong, so I have no strong opinion.
>
> What scares me most in this idea is that, one may still end up with certain
> entry points overlooked, rendering the whole effort worthless.
>

Please don't top post.

The DPDK control path has (up to now) assumed that control operations are
only done from a single thread on each port. There is also the issue of
hotplug, but that is separate. For example, if two threads start and stop
the same port, bad things happen and NIC drivers break. This is not well
documented, and a section on it needs to go into the thread safety chapter
of the programmer's guide.

The whole thread safety section is out of date, and doesn't reference RCU
when it should. It also doesn't cover hotplug or weird secondary processes
that fork.

There is also the issue of how primary/secondary monitoring works. Right now
the secondary monitors the primary by periodically polling a lock file. This
is inherently a racy method and leads to problems. It needs to be redesigned
to use a blocking method, something like spawning a thread in the secondary
that uses some part of the existing Unix domain IPC to get a notification
when the primary crashes or wants to exit. Ideally it would support a
synchronous handshake with all primaries and the asynchronous case when the
primary crashes.

The point is that band-aids in the ethdev layer won't fix it well enough.
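
To make the blocking-notification idea concrete, here is a minimal sketch
in plain POSIX C. It is not DPDK code and does not use the existing
multi-process IPC: the names wait_for_primary(), demo() and enum peer_event
are hypothetical, and a socketpair() plus fork() stands in for a real
primary/secondary pair. It only shows the mechanism: a blocking recv() on a
Unix domain socket wakes up either when the peer sends an explicit goodbye
byte or when the kernel closes the peer's end because the process died. In
a real design this wait would presumably sit in a dedicated thread in the
secondary.

```c
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical event codes, for illustration only. */
enum peer_event {
	PEER_ERROR = -1,  /* recv() or setup failed */
	PEER_GONE = 0,    /* peer's end was closed without a goodbye */
	PEER_GOODBYE = 1, /* peer announced that it wants to exit */
};

/*
 * Block until the primary says goodbye or disappears.
 * No timer and no lock-file polling: recv() sleeps in the kernel and
 * returns 0 (EOF) as soon as the last descriptor for the peer's end
 * is closed, which also happens when the peer process is killed.
 */
static enum peer_event wait_for_primary(int fd)
{
	char byte;
	ssize_t n;

	assert(fd >= 0);
	for (;;) {
		n = recv(fd, &byte, 1, 0);
		if (n > 0)
			return PEER_GOODBYE;
		if (n == 0)
			return PEER_GONE;
		if (errno == EINTR)
			continue;          /* interrupted by a signal, retry */
		if (errno == ECONNRESET)
			return PEER_GONE;  /* peer vanished with data pending */
		return PEER_ERROR;
	}
}

/*
 * Demo harness (assumption: fork() stands in for a separate primary).
 * The child plays the primary; the parent plays the secondary and
 * returns what it observed.
 */
static enum peer_event demo(int primary_says_goodbye)
{
	int sv[2];
	pid_t pid;
	enum peer_event ev;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
		return PEER_ERROR;

	pid = fork();
	if (pid < 0) {
		close(sv[0]);
		close(sv[1]);
		return PEER_ERROR;
	}
	if (pid == 0) {
		/* "primary": optionally announce exit, then terminate */
		close(sv[0]);
		if (primary_says_goodbye && write(sv[1], "Q", 1) != 1)
			_exit(2);
		_exit(primary_says_goodbye ? 0 : 1); /* kernel closes sv[1] */
	}

	/* "secondary": drop our copy of the primary's end, or EOF never comes */
	close(sv[1]);
	ev = wait_for_primary(sv[0]);
	close(sv[0]);
	waitpid(pid, NULL, 0);
	return ev;
}
```

With this sketch, demo(0) models the asynchronous case (the primary dies
without notice) and yields PEER_GONE, while demo(1) models the start of a
synchronous handshake and yields PEER_GOODBYE. The design point is that
both cases arrive on the same descriptor, so there is no window between
"check" and "act" as there is with a periodically polled lock file.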