From: Thomas Monjalon
To: "Yigit, Ferruh", Bing Zhao, "Ananyev, Konstantin"
Cc: "andrew.rybchenko@oktetlabs.ru", "dev@dpdk.org"
Subject: Re: [PATCH 2/2] ethdev: fix the race condition for fp ops reset
Date: Wed, 10 Nov 2021 15:57:57 +0100
Message-ID: <9516865.9GKQoxltXr@thomas>
References: <20211022211407.315068-1-bingz@nvidia.com> <16525cf6-bf10-e637-3f03-f4d415ff6bb1@intel.com>
List-Id: DPDK patches and discussions

10/11/2021 15:37, Ananyev, Konstantin:
>
> Hi Ferruh,
>
> > >> 22/10/2021 23:14, Bing Zhao:
> > >>> In the function "eth_dev_fp_ops_reset", a structure assignment
> > >>> operation is used to reset one queue's callback functions, etc., but
> > >>> it is not thread safe.
> > >>>
> > >>> The structure assignment is not atomic, a lot of instructions will
> > >>> be generated.
> > >>> Right now, since not all the fields are needed, the
> > >>> fields in the "dummy_ops" which is not set explicitly will be 0s
> > >>> based on the specification and compiler behavior. In order to make
> > >>> "fpo" has the same content with "dummy_ops", some clearing to 0
> > >>> operation is needed.
> > >>>
> > >>> By checking the object instructions (e.g. with GCC 4.8.5)
> > >>>    0x0000000000a58317 <+35>: mov %rsi,%rdi
> > >>>    0x0000000000a5831a <+38>: mov %rdx,%rcx
> > >>> => 0x0000000000a5831d <+41>: rep stos %rax,%es:(%rdi)
> > >>>    0x0000000000a58320 <+44>: mov -0x38(%rsp),%rax
> > >>>    0x0000000000a58325 <+49>: lea -0xe0(%rip),%rdx
> > >>> // # 0xa5824c
> > >>>
> > >>> It shows that "rep stos" will clear the "fpo" structure before
> > >>> assigning new values.
> > >>>
> > >>> In the other thread, if some data path Tx / Rx functions are still
> > >>> running, there is a risk to get 0 instead of the correct dummy
> > >>> content.
> > >>> 1. qd = p->rxq.data[queue_id]
> > >>> 2. (void **)&p->rxq.clbk[queue_id]
> > >>> "data" and "clbk" may be observed with NULL (0) in other threads.
> > >>> Even it is temporary, the accessing to a NULL pointer will cause a
> > >>> crash. Using "memcpy" could get rid of this.
> > >>>
> > >>> Fixes: c87d435a4d79 ("ethdev: copy fast-path API into separate structure")
> > >>> Cc: konstantin.ananyev@intel.com
> > >>>
> > >>> Signed-off-by: Bing Zhao
> > >>> ---
> > >>> --- a/lib/ethdev/ethdev_private.c
> > >>> +++ b/lib/ethdev/ethdev_private.c
> > >>> @@ -206,7 +206,7 @@ eth_dev_fp_ops_reset(struct rte_eth_fp_ops *fpo)
> > >>>  .txq = {.data = dummy_data, .clbk = dummy_data,},
> > >>>  };
> > >>>
> > >>> - *fpo = dummy_ops;
> > >>> + rte_memcpy(fpo, &dummy_ops, sizeof(struct rte_eth_fp_ops));
> > >>
> > >> That's not trivial.
> > >> Please add a comment to briefly explain that memcpy avoids zeroing of a simple assignment.
> > >>
> > >
> > > I think that patch is based on two totally wrong assumptions:
> > > 1) ethdev data-path and control-path API is MT-safe.
> > > With current design it is not.
> > > When calling rx/tx_burst it is the caller's responsibility to make sure that the given port is
> > > already properly configured and started. Also it is the user's responsibility to guarantee
> > > that no other thread is doing dev_stop for the same port simultaneously.
> > > And vice versa, when calling dev_stop(), it is the user's responsibility to ensure that
> > > no other thread is doing rx/tx_burst for the given port simultaneously.
> > > If your app doesn't follow these principles, then it is a bug that needs to be fixed.
> > > 2) rte_memcpy() provides some sort of atomicity and it is safe to use it on its own
> > > in MT environment. That's totally wrong.
> > > In both cases the compiler has total freedom to perform the copy in any order it likes
> > > (let's say it can first read the whole source data into some temporary buffer (SIMD register),
> > > and then write it in one go, or it can do the same trick with 'rep stos' as above).
> > > Moreover the CPU itself can reorder instructions.
> > > So if you need this copy to be atomic you need to use some sort of
> > > sync primitives along with it (mutex, rwlock, rcu, etc.).
> > > But as I said above, right now ethdev API is not MT-safe, so it is not required.
> > >
> > > To summarise - there is no point to make these changes,
> > > and patch comment is wrong and misleading.
> > Can we mark this patch as rejected now?
>
> I believe so.
>
> > Patch seems trying to cover a wrong application usage, and it should
> > be addressed in the application level.

Yes
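
For illustration only, here is a minimal sketch of the point quoted above: neither a struct assignment nor a memcpy is atomic, so a reader can only be guaranteed a consistent view through a real synchronization primitive. It is not DPDK code. "struct fp_ops" is a hypothetical, much simplified stand-in for "struct rte_eth_fp_ops", and all function names are made up. Instead of overwriting the structure in place, the sketch publishes a pointer to an immutable structure with C11 atomics, so a reader sees either the old set of ops or the new one, never a half-written mix.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for struct rte_eth_fp_ops. */
struct fp_ops {
	uint16_t (*rx_burst)(void *rxq, uint16_t nb_pkts);
	void *rxq_data;
};

/* Dummy callback: safe to call at any time, never receives anything. */
static uint16_t dummy_rx(void *rxq, uint16_t nb_pkts)
{
	(void)rxq;
	(void)nb_pkts;
	return 0;
}

/* Example "real" callback: pretends every requested packet was received. */
static uint16_t example_rx(void *rxq, uint16_t nb_pkts)
{
	(void)rxq;
	return nb_pkts;
}

static int dummy_queue;

/* Both structures are fully built before any reader can see them. */
static const struct fp_ops dummy_ops = {
	.rx_burst = dummy_rx, .rxq_data = &dummy_queue,
};
static const struct fp_ops example_ops = {
	.rx_burst = example_rx, .rxq_data = &dummy_queue,
};

/* The only shared mutable state is this single pointer. */
static _Atomic(const struct fp_ops *) current_ops = &dummy_ops;

/* Control path: release store publishes a complete structure. */
static void fp_ops_publish(const struct fp_ops *ops)
{
	assert(ops != NULL);
	atomic_store_explicit(&current_ops, ops, memory_order_release);
}

static void fp_ops_reset(void)
{
	fp_ops_publish(&dummy_ops);
}

/* Data path: acquire load, then use only the snapshot that was loaded. */
static uint16_t port_rx_burst(uint16_t nb_pkts)
{
	const struct fp_ops *ops =
		atomic_load_explicit(&current_ops, memory_order_acquire);

	return ops->rx_burst(ops->rxq_data, nb_pkts);
}
```

Note that this only removes the torn read. A data-path thread that loaded the old pointer may still be inside the old callback after the reset, so the control path would still need a way to wait for readers (for example an RCU-style quiescent state) before tearing the queues down. That remains the application's responsibility, as stated in the thread.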