From: Gongming Chen <chengm11@chinatelecom.cn>
To: Maxime Coquelin <maxime.coquelin@redhat.com>
Cc: chenbox@nvidia.com, dev@dpdk.org, stable@dpdk.org
Subject: Re: [PATCH v1] vhost: fix crash caused by accessing a freed vsocket
Date: Tue, 9 Jul 2024 07:50:14 +0800
Message-Id: <7813527B-514C-4B05-B568-7C2D9F49214E@chinatelecom.cn>
In-Reply-To: <1c002c66-08a7-465c-bb56-a22abb7c840e@redhat.com>
References: <1c002c66-08a7-465c-bb56-a22abb7c840e@redhat.com>

> On Jul 2, 2024, at 3:48 PM, Maxime Coquelin <maxime.coquelin@redhat.com> wrote:
>
> Hi Gongming,
>
> On 5/10/24 09:28, Gongming Chen wrote:
>> Hi Maxime and Chenbo,
>> Do you have any suggestions for how to address this?
>> Looking forward to hearing from you!
>
> Could you please have a try with the latest DPDK main branch,
> and if it still reproduces, rebase your series on top of it.
>
> I don't think it has been fixed, but we've made significant changes to
> fdman in this release, so we need a rebase anyway.
>
> Thanks in advance,
> Maxime

Hi Maxime,

This bug still exists. I have rebased on the latest main branch and
submitted a v4 version.
Thank you for your review; I look forward to hearing from you!

Thanks,
Gongming

>> Thanks,
>> Gongming
>>
>>> On Apr 3, 2024, at 11:52 PM, Gongming Chen <chengm11@chinatelecom.cn> wrote:
>>>
>>> Hi Maxime,
>>> Thanks for the review.
>>>
>>>> On Apr 3, 2024, at 5:39 PM, Maxime Coquelin <maxime.coquelin@redhat.com> wrote:
>>>>
>>>> Hi Gongming,
>>>>
>>>> It's the 9th time this patch has been sent.
>>>> I'm not sure whether there are changes between them or these are just
>>>> re-sends, but that's something to avoid.
>>>>
>>>
>>> Sorry, there's something wrong with my mailbox.
>>> I will send a v1 version as the latest patch, but they are actually the same.
>>>
>>>> If there are differences, you should use versioning to highlight them.
>>>> If unsure, please check the contribution guidelines first.
>>>>
>>>> Regarding the patch itself, I don't know if this is avoidable, but I
>>>> would prefer we do not introduce yet another lock in there.
>>>>
>>>> Thanks,
>>>> Maxime
>>>>
>>>
>>> I totally agree with you.
>>> That is why I initially hoped to solve this problem without
>>> introducing a new lock. However, it did not work out as expected:
>>>
>>> 1. The vsocket is shared between the event and reconnect threads by
>>> passing the vsocket pointer around. Therefore, there is no way to
>>> protect the vsocket with a simple per-vsocket lock.
>>>
>>> 2. The event and reconnect threads can hand vsocket pointers to each
>>> other, so locking the two threads separately cannot guarantee that
>>> the vsocket is no longer being accessed.
>>>
>>> 3. Therefore, with respect to the vsocket resource, the event and
>>> reconnect threads share the same critical section. Only by locking
>>> out both threads at the same time can we ensure that the vsocket is
>>> no longer accessed and can be freed safely.
>>>
>>> Currently, the app config, event, and reconnect threads each have a
>>> lock for the resources they maintain: vhost_user.mutex,
>>> pfdset->fd_mutex, and reconn_list.mutex, respectively.
>>>
>>> I think a thread-level lock is missing here to protect the critical
>>> section across threads, much like RCU-style protection.
>>>
>>> Once the app config thread acquires the write lock, it is guaranteed
>>> that the event and reconnect threads are outside the critical
>>> section, so it can completely clean up the resources associated with
>>> the vsocket and free it safely.
>>>
>>> Looking ahead, if more resources like the vsocket are ever added,
>>> this thread-level lock can likewise ensure they are released only
>>> after a complete cleanup.
>>>
>>> This way the threading model is clearer, and the complicated trylock
>>> approach is no longer needed.
>>>
>>> Thanks,
>>> Gongming
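
For reference, here is a minimal sketch of the thread-level rwlock
scheme described above. It uses plain POSIX rwlocks for brevity (the
actual series would use DPDK's own locking and thread wrappers), and
fdset_event_dispatch_once(), vhost_user_reconnect_once(), and
vsocket_cleanup_and_free() are hypothetical stand-ins for the existing
fdset dispatch loop, reconnect loop, and vsocket cleanup path:

#include <pthread.h>
#include <stddef.h>

/*
 * Hypothetical stand-ins for the real loop bodies and cleanup path;
 * the actual code lives in lib/vhost/socket.c and lib/vhost/fd_man.c.
 */
struct vhost_user_socket;
void fdset_event_dispatch_once(void);
void vhost_user_reconnect_once(void);
void vsocket_cleanup_and_free(struct vhost_user_socket *vsocket);

/* One thread-level lock shared by the event and reconnect threads. */
static pthread_rwlock_t vhost_thread_lock = PTHREAD_RWLOCK_INITIALIZER;

/*
 * Event thread: each dispatch iteration runs under the read lock, so
 * any vsocket pointer it picks up from the fdset stays valid until
 * the iteration ends.
 */
static void *event_thread(void *arg)
{
	(void)arg;
	for (;;) {
		pthread_rwlock_rdlock(&vhost_thread_lock);
		fdset_event_dispatch_once();	/* may dereference vsockets */
		pthread_rwlock_unlock(&vhost_thread_lock);
	}
	return NULL;
}

/*
 * Reconnect thread: same pattern. Both threads may hold the read lock
 * concurrently, so they do not serialize against each other.
 */
static void *reconnect_thread(void *arg)
{
	(void)arg;
	for (;;) {
		pthread_rwlock_rdlock(&vhost_thread_lock);
		vhost_user_reconnect_once();	/* may dereference vsockets */
		pthread_rwlock_unlock(&vhost_thread_lock);
	}
	return NULL;
}

/*
 * App config thread: acquiring the write lock blocks until both loops
 * are outside their critical sections, so the vsocket can be unlinked
 * from the fdset and the reconnect list and then freed without a
 * use-after-free.
 */
void vhost_unregister_vsocket(struct vhost_user_socket *vsocket)
{
	pthread_rwlock_wrlock(&vhost_thread_lock);
	vsocket_cleanup_and_free(vsocket);
	pthread_rwlock_unlock(&vhost_thread_lock);
}

The key property is that the write lock acts as a barrier: once the
app config thread (e.g. in rte_vhost_driver_unregister()) holds it,
neither the event thread nor the reconnect thread can be inside an
iteration that still references the vsocket, which gives the RCU-like
guarantee described above without per-vsocket locks or trylock loops.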