Date: Fri, 14 Feb 2025 08:31:02 -0800
From: Stephen Hemminger
To: "Van Haaren, Harry"
Cc: NAGENDRA BALAGANI, "users@dpdk.org"
Subject: Re: Query Regarding Race Condition Between Packet Reception and Device Stop in DPDK
Message-ID: <20250214083102.240c8da9@hermes.local>

On Fri, 14 Feb 2025 09:22:59 +0000
"Van Haaren, Harry" wrote:

> > From: NAGENDRA BALAGANI
> > Sent: Friday, February 14, 2025 8:43 AM
> > To: users@dpdk.org
> > Subject: Query Regarding Race Condition Between Packet Reception and Device Stop in DPDK
> >
> > Hi Team,
>
> Hi Nagendra,
>
> > We are facing a race condition in our DPDK application where one thread is reading packets from a queue using rte_eth_rx_burst(), while another thread is attempting to stop the device using rte_eth_dev_stop(). This is causing instability, as the reading thread may still be accessing the queues while the device is being stopped.
>
> This is as expected - it is not valid to stop a device while other cores are still using it.
>
> > Could you please suggest the best way to mitigate this race condition without impacting fast-path performance? We want to ensure safe synchronization while maintaining high throughput.
>
> There are many possible implementations, but the end result of them all is "ensure that the dataplane core is NOT polling a device that is stopping".
>
> 1) One implementation uses a "force_quit" boolean value (see dpdk/examples/l2fwd/main.c for an example). This approach changes the lcore's "while (1)" polling loop into a "while (!force_quit)" loop. (Note there is some nuance around the "volatile" keyword for the boolean to ensure it is reloaded on each iteration, but that's off topic.)
>
> 2) Another, more flexible/powerful implementation could be some form of message passing. For example, imagine the dataplane thread and the control-plane thread (the one stopping the ethdev) can communicate by sending an "event" to each other. When a "stop polling" event is received by the dataplane thread, it disables polling for just that eth device/queue and responds with a "stopped polling" reply. On receiving the "stopped polling" event, the thread that wants to stop the eth device can now safely do so.
>
> Both of these implementations have no datapath performance impact:
> 1) a single boolean check (shared-state cache line, likely already in the core's cache) per iteration of the polling loop is very lightweight;
> 2) an "event ringbuffer" check (when empty, also shared state, likely in cache) per iteration is also very light.
>
> General notes on the above:
> There's even an option to check the boolean/event ringbuffer only once every N iterations: this causes even less overhead, but increases the latency of the event action/reply on the datapath thread. As almost always, it depends on what's important for your use case!
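
To make option 2) concrete, here is a minimal sketch of that stop-polling handshake, assuming a pair of rte_rings as the event channel; the event values, ring names, and helper functions are illustrative only, not taken from any DPDK example.

/*
 * Sketch of the "stop polling" handshake, using two rte_rings as the
 * event channel (this choice, and all names, are illustrative; the rings
 * would be created at init time with rte_ring_create(), e.g. as
 * single-producer/single-consumer rings of size 8).
 */
#include <stdbool.h>
#include <stdint.h>
#include <rte_ethdev.h>
#include <rte_mbuf.h>
#include <rte_pause.h>
#include <rte_ring.h>

#define BURST_SIZE 32

enum ctl_event { EV_STOP_POLLING = 1, EV_STOPPED_POLLING = 2 };

static struct rte_ring *to_worker;    /* control plane -> dataplane */
static struct rte_ring *from_worker;  /* dataplane -> control plane */

/* Dataplane lcore: poll RX, check the control ring once per iteration. */
static int
worker_main(void *arg)
{
	uint16_t port = *(uint16_t *)arg;
	struct rte_mbuf *pkts[BURST_SIZE];
	bool polling = true;
	void *ev;

	for (;;) {
		if (polling) {
			uint16_t nb = rte_eth_rx_burst(port, 0, pkts, BURST_SIZE);
			for (uint16_t i = 0; i < nb; i++)
				rte_pktmbuf_free(pkts[i]); /* stand-in for real work */
		}
		/* The empty-ring check is the only per-iteration overhead. */
		if (rte_ring_dequeue(to_worker, &ev) == 0 &&
		    (uintptr_t)ev == EV_STOP_POLLING) {
			polling = false; /* stop touching this port's queues */
			rte_ring_enqueue(from_worker,
					 (void *)(uintptr_t)EV_STOPPED_POLLING);
		}
	}
}

/* Control thread: request "stop polling", wait for the reply, then stop. */
static void
stop_port(uint16_t port)
{
	void *ev;

	rte_ring_enqueue(to_worker, (void *)(uintptr_t)EV_STOP_POLLING);
	while (rte_ring_dequeue(from_worker, &ev) != 0 ||
	       (uintptr_t)ev != EV_STOPPED_POLLING)
		rte_pause(); /* spin until "stopped polling" arrives */
	rte_eth_dev_stop(port); /* now safe: the worker no longer polls it */
}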
>
> The main difference between implementations 1) and 2) above can be captured by this phrase: "Do not communicate by sharing memory; instead, share memory by communicating.", which I read in the Rust docs here: https://doc.rust-lang.org/book/ch16-02-message-passing.html. 1) literally shares memory (both threads access the force_quit value directly). 2) focuses on communicating, which avoids the race condition in a more powerful and elegant way (and is future-proof too - it allows adding new event types cleanly, which the force_quit bool does not). I like this design mentality, as it is a good, high-performance, scalable way for threads to interact, and it scales to future needs: so I recommend approach 2.

One other solution is to reserve the main lcore for control operations. In a couple of projects we had the main lcore spawn the workers, then sleep on epoll to handle control requests from another source (a unix domain socket). When the stop request came in, the main thread would set a flag (an atomic variable) and wait for the worker lcores to finish. Then it would do the stop and other maintenance operations. This worked out much cleaner than doing control in the workers.
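
A rough sketch of that layout, assuming a C11 atomic flag for the stop request; device/queue setup and the real control source (epoll on a unix domain socket in those projects) are elided, and all names, including the wait_for_stop_request() placeholder, are illustrative.

/*
 * "Main lcore does control" layout: workers poll, the main lcore waits
 * for a control request, sets the stop flag, waits for the workers, and
 * only then stops the port.
 */
#include <stdatomic.h>
#include <stdio.h>
#include <rte_eal.h>
#include <rte_ethdev.h>
#include <rte_lcore.h>
#include <rte_mbuf.h>

static atomic_bool stop_requested;

/* Worker lcore: poll until the main lcore asks us to stop. */
static int
worker_main(void *arg)
{
	uint16_t port = *(uint16_t *)arg;
	struct rte_mbuf *pkts[32];

	while (!atomic_load_explicit(&stop_requested, memory_order_relaxed)) {
		uint16_t nb = rte_eth_rx_burst(port, 0, pkts, 32);
		for (uint16_t i = 0; i < nb; i++)
			rte_pktmbuf_free(pkts[i]); /* stand-in for real work */
	}
	return 0;
}

/* Placeholder for the control source (epoll on a unix domain socket). */
static void
wait_for_stop_request(void)
{
	getchar(); /* here: stop when the operator presses Enter */
}

int
main(int argc, char **argv)
{
	static uint16_t port; /* single port 0 for brevity */
	unsigned int lcore_id;

	if (rte_eal_init(argc, argv) < 0)
		return -1;
	/* ... mempool, rte_eth_dev_configure(), queue setup, rte_eth_dev_start() ... */

	/* Main lcore only launches workers and then handles control. */
	RTE_LCORE_FOREACH_WORKER(lcore_id)
		rte_eal_remote_launch(worker_main, &port, lcore_id);

	wait_for_stop_request();

	atomic_store_explicit(&stop_requested, true, memory_order_relaxed);
	rte_eal_mp_wait_lcore();  /* all workers have left their poll loops */
	rte_eth_dev_stop(port);   /* safe: no lcore touches the queues now */
	rte_eth_dev_close(port);
	rte_eal_cleanup();
	return 0;
}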