Date: Tue, 12 Sep 2023 15:18:10 +0300
From: Dmitry Kozlyuk
To: "Li, Ming3"
Cc: "dev@dpdk.org", Tyler Retzlaff
Subject: Re: [PATCH v2] windows/virt2phys: fix block MDL not updated
Message-ID: <20230912151810.2962113c@sovereign>
References: <20230911082229.1479773-1-ming3.li@intel.com> <20230911130936.1485584-1-ming3.li@intel.com> <20230912005041.0e09671c@sovereign>

2023-09-12 11:13 (UTC+0000), Li, Ming3:
> > Are any of these bugs related?
> > If so, please mention "Bugzilla ID: xxxx" in the commit message.
> >
> > https://bugs.dpdk.org/show_bug.cgi?id=1201
> > https://bugs.dpdk.org/show_bug.cgi?id=1213
>
> Sure, will do.
>
> I cannot reproduce them in my environment, but from the messages,
> they both mention that some pages are not unlocked after exit, so they could be related.
>
> For example, Bug 1201 only occurs on Windows 2019; could it be caused by an
> OS limitation, such that some memory segment got freed and then allocated at the
> same virtual address again? Maybe someone can use this patch to check from the
> TraceView logs whether the 'refresh' behavior occurs.

I've posted a comment in BZ 1201 (the bugs are from the same user)
inviting them to test your patch; let's see.

[...]

> > > To address this, a refresh function has been added. If a block with
> > > the same base address is detected in the driver's context, the MDL's
> > > physical address is compared with the real physical address.
> > > If they don't match, the MDL within the block is released and rebuilt
> > > to store the correct mapping.
> >
> > What if the size is different?
> > Should it be updated for the refreshed block along with the MDL?
>
> The size of a single MDL is always 2 MB since it describes a hugepage here.
> (at least from my observation :))

Your observation is correct; the DPDK memalloc layer currently works this way.

> An allocated buffer larger than 2 MB has
> several mem segs (each with its own MDL); most buggy mem segs are those that
> possess a whole hugepage: these segments are freed along with the buffer,
> so their MDLs become invalid.
>
> Since the block is just a wrapper for an MDL and a list entry,
> the refresh action should be applied to the whole block.

There is always a single MDL per block, but it can describe multiple pages
(generally, if used beyond DPDK).
Suppose there was a block for one page.
Then this page was deallocated and allocated again,
but this time in the middle of a multi-page region.
With your patch this will work, but that one-page block will just be lost
(never found, because its MDL base VA does not match the region start VA).
The downside is that the memory remains locked.
The solution could be to check, when inserting a new block,
whether there are existing blocks covered by the new one,
and if so, to free those blocks, as they correspond to deallocated regions.
I think this can be done with another patch, to limit the scope of this one.

Ideally virt2phys should not be doing this guesswork at all.
DPDK could just tell it when pages are allocated and freed,
but this requires some rework of the userspace part.
Just thinking out loud.
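To make the covered-blocks idea concrete, here is a rough, untested sketch;
the structure and helper names are approximations, not the actual virt2phys code:

/* Untested sketch: when locking a new region, drop any existing blocks
 * fully covered by it, because their original regions must have been
 * deallocated. Assumes a block layout like
 * struct virt2phys_block { LIST_ENTRY next; PMDL mdl; }.
 */
static void
virt2phys_drop_covered_blocks(PLIST_ENTRY blocks, void *base, size_t size)
{
	char *start = (char *)base;
	char *end = start + size;
	PLIST_ENTRY node, next;

	for (node = blocks->Flink; node != blocks; node = next) {
		struct virt2phys_block *block;
		char *block_start, *block_end;

		next = node->Flink;
		block = CONTAINING_RECORD(node, struct virt2phys_block, next);
		block_start = (char *)MmGetMdlVirtualAddress(block->mdl);
		block_end = block_start + MmGetMdlByteCount(block->mdl);

		/* Skip the equal-base block: the lookup finds it
		 * and the refresh path handles it instead.
		 */
		if (block_start == start)
			continue;

		if (block_start >= start && block_end <= end) {
			RemoveEntryList(node);
			MmUnlockPages(block->mdl);
			IoFreeMdl(block->mdl);
			ExFreePoolWithTag(block, 0); /* real tag is driver-specific */
		}
	}
}

This would run under the same lock as the list insertion,
so it needs no new synchronization.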
[...]

> > > 	/* Don't lock the same memory twice. */
> > > 	if (block != NULL) {
> > > +		KeAcquireSpinLock(g_lock, &irql);
> > > +		status = virt2phys_block_refresh(block, base, size);
> > > +		KeReleaseSpinLock(g_lock, irql);
> >
> > Is it safe to do all the external calls holding this spinlock?
> > I can't confirm from the doc that ZwQueryVirtualMemory(), for example, does
> > not access pageable data.
> > And virt2phys_lock_memory() raises exceptions, although it handles them.
> > Other stuff seems safe.
> >
> > The rest of the code only takes the lock to access block and process lists, which
> > are allocated from the non-paged pool.
> > Now that I think of it, this may be insufficient because the code and the static
> > variables are not marked as non-paged.
> >
> > The translation IOCTL performance is not critical, so maybe it is worth replacing
> > the spinlock with just a global mutex, WDYT?
>
> In the upcoming v3 patch, the lock will be used for block removal, which won't fail.
>
> I'm relatively new to Windows driver development. From my perspective, the use
> of a spinlock seems appropriate in this driver. Maybe a read-write lock would be
> more effective here?

It is correctness that I am concerned with, not efficiency.
Translating VA to IOVA is not performance-critical;
the spinlock is used just because it seemed sufficient.
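If we do switch to a global mutex, the change is mechanical. A rough,
untested sketch (virt2phys_refresh_locked() is a made-up wrapper name;
virt2phys_block_refresh() is from your patch):

/* Untested sketch of replacing the spinlock with a kernel mutex. */
static KMUTEX g_mutex;

/* In DriverEntry(), instead of KeInitializeSpinLock():
 * KeInitializeMutex(&g_mutex, 0);
 */

static NTSTATUS
virt2phys_refresh_locked(struct virt2phys_block *block,
		void *base, size_t size)
{
	NTSTATUS status;

	/* Unlike a spinlock, a KMUTEX does not raise IRQL, so pageable
	 * code and PASSIVE_LEVEL calls like ZwQueryVirtualMemory()
	 * remain safe inside the critical region.
	 */
	KeWaitForSingleObject(&g_mutex, Executive, KernelMode,
			FALSE, NULL);
	status = virt2phys_block_refresh(block, base, size);
	KeReleaseMutex(&g_mutex, FALSE);
	return status;
}

A kernel mutex would also lift the pageable-data restrictions discussed below.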
Relating the code to the docs [1]:

* The code within a critical region guarded by a spin lock must neither be
  pageable nor make any references to pageable data.

  - Process and block structures are allocated from the non-paged pool - OK.
  - The code is not marked as non-pageable - FAIL, though it has never fired.

* The code within a critical region guarded by a spin lock can neither call
  any external function that might access pageable data...

  - MDL manipulation and page locking can run at "dispatch" IRQL - OK.
  - ZwQueryVirtualMemory() - unsure.

  ... or raise an exception, nor can it generate any exceptions.

  - MmProbeAndLockPages() does generate an exception on failure, but it is
    handled - unsure.

* The caller should release the spin lock with KeReleaseSpinLock as quickly
  as possible.

  - Before the patch, there was a fixed number of operations under the lock - OK.
  - After the patch, there is more work under the lock, although it seems to me
    that all of it can be done at "dispatch" IRQL - unsure.

I've added Tyler from Microsoft, he might know more.

[1]: https://learn.microsoft.com/en-us/windows-hardware/drivers/ddi/wdm/nf-wdm-keacquirespinlock
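P.S. The exception in the second item comes from the page-locking call;
the usual guard, per the WDM docs (a generic idiom with a made-up wrapper
name, not the driver code verbatim):

/* MmProbeAndLockPages() reports failure by raising an exception,
 * so it must run inside a structured exception handler.
 */
static NTSTATUS
lock_user_pages(PMDL mdl)
{
	NTSTATUS status = STATUS_SUCCESS;

	__try {
		MmProbeAndLockPages(mdl, UserMode, IoWriteAccess);
	} __except (EXCEPTION_EXECUTE_HANDLER) {
		status = GetExceptionCode();
	}
	return status;
}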