From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Tue, 16 Nov 2021 09:51:53 +0100
From: Maxime Coquelin <maxime.coquelin@redhat.com>
To: "Ding, Xuan", "Xia, Chenbo", "Wang, YuanX"
Cc: dev@dpdk.org, "Hu, Jiayu", "Ma, WenwuX", "He, Xingguang", "Yang, YvonneX"
Subject: Re: [PATCH] vhost: fix get hpa fail from guest pages
References: <20211111062725.108297-1-yuanx.wang@intel.com>
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.2.0
List-Id: DPDK patches and discussions

Hi,

On 11/16/21 09:47, Ding, Xuan wrote:
> Hi Chenbo,
>
>> -----Original Message-----
>> From: Xia, Chenbo
>> Sent: Tuesday, November 16, 2021 4:07 PM
>> To: Xia, Chenbo; Wang, YuanX; maxime.coquelin@redhat.com
>> Cc: dev@dpdk.org; Hu, Jiayu; Ding, Xuan; Ma, WenwuX; He, Xingguang;
>> Yang, YvonneX
>> Subject: RE: [PATCH] vhost: fix get hpa fail from guest pages
>>
>>> -----Original Message-----
>>> From: Xia, Chenbo
>>> Sent: Monday, November 15, 2021 4:04 PM
>>> To: Wang, YuanX; maxime.coquelin@redhat.com
>>> Cc: dev@dpdk.org; Hu, Jiayu; Ding, Xuan; Ma, WenwuX; He, Xingguang;
>>> Yang, YvonneX
>>> Subject: RE: [PATCH] vhost: fix get hpa fail from guest pages
>>>
>>>> -----Original Message-----
>>>> From: Wang, YuanX
>>>> Sent: Thursday, November 11, 2021 2:27 PM
>>>> To: maxime.coquelin@redhat.com; Xia, Chenbo
>>>> Cc: dev@dpdk.org; Hu, Jiayu; Ding, Xuan; Ma, WenwuX; He, Xingguang;
>>>> Yang, YvonneX; Wang, YuanX
>>>> Subject: [PATCH] vhost: fix get hpa fail from guest pages
>>>>
>>>> When processing front-end memory region messages, vhost saves the
>>>> guest/host physical address mappings to guest pages and merges
>>>> adjacent pages when their HPAs are contiguous. However, in PA mode
>>>> the GPAs are likely not contiguous, so merging changes the GPA range
>>>> of the page. This patch distinguishes the discontiguous-GPA case and
>>>> performs a range lookup on the GPA in the binary search.
>>>>
>>>> Fixes: e246896178e ("vhost: get guest/host physical address mappings")
>>>> Fixes: 6563cf92380 ("vhost: fix async copy on multi-page buffers")
>>>>
>>>> Signed-off-by: Yuan Wang <yuanx.wang@intel.com>
>>>> ---
>>>>  lib/vhost/vhost.h      | 18 ++++++++++++++++--
>>>>  lib/vhost/vhost_user.c | 15 +++++++++++----
>>>>  2 files changed, 27 insertions(+), 6 deletions(-)
>>>>
>>>> diff --git a/lib/vhost/vhost.h b/lib/vhost/vhost.h
>>>> index 7085e0885c..b3f0c1d07c 100644
>>>> --- a/lib/vhost/vhost.h
>>>> +++ b/lib/vhost/vhost.h
>>>> @@ -587,6 +587,20 @@ static __rte_always_inline int guest_page_addrcmp(const void *p1,
>>>>  	return 0;
>>>>  }
>>>>
>>>> +static __rte_always_inline int guest_page_rangecmp(const void *p1, const void *p2)
>>>> +{
>>>> +	const struct guest_page *page1 = (const struct guest_page *)p1;
>>>> +	const struct guest_page *page2 = (const struct guest_page *)p2;
>>>> +
>>>> +	if (page1->guest_phys_addr >= page2->guest_phys_addr) {
>>>> +		if (page1->guest_phys_addr < page2->guest_phys_addr + page2->size)
>>>> +			return 0;
>>>> +		else
>>>> +			return 1;
>>>> +	} else
>>>> +		return -1;
>>>> +}
>>>> +
>>>>  static __rte_always_inline rte_iova_t
>>>>  gpa_to_first_hpa(struct virtio_net *dev, uint64_t gpa,
>>>>  	uint64_t gpa_size, uint64_t *hpa_size)
>>>> @@ -597,9 +611,9 @@ gpa_to_first_hpa(struct virtio_net *dev, uint64_t gpa,
>>>>
>>>>  	*hpa_size = gpa_size;
>>>>  	if (dev->nr_guest_pages >= VHOST_BINARY_SEARCH_THRESH) {
>>>> -		key.guest_phys_addr = gpa & ~(dev->guest_pages[0].size - 1);
>>>> +		key.guest_phys_addr = gpa;
>>>>  		page = bsearch(&key, dev->guest_pages, dev->nr_guest_pages,
>>>> -				sizeof(struct guest_page), guest_page_addrcmp);
>>>> +				sizeof(struct guest_page), guest_page_rangecmp);
>>>>  		if (page) {
>>>>  			if (gpa + gpa_size <=
>>>>  					page->guest_phys_addr + page->size) {
>>>> diff --git a/lib/vhost/vhost_user.c b/lib/vhost/vhost_user.c
>>>> index a781346c4d..7d58fde458 100644
>>>> --- a/lib/vhost/vhost_user.c
>>>> +++ b/lib/vhost/vhost_user.c
>>>> @@ -999,10 +999,17 @@ add_one_guest_page(struct virtio_net *dev, uint64_t guest_phys_addr,
>>>>  	if (dev->nr_guest_pages > 0) {
>>>>  		last_page = &dev->guest_pages[dev->nr_guest_pages - 1];
>>>>  		/* merge if the two pages are continuous */
>>>> -		if (host_phys_addr == last_page->host_phys_addr +
>>>> -				last_page->size) {
>>>> -			last_page->size += size;
>>>> -			return 0;
>>>> +		if (host_phys_addr == last_page->host_phys_addr + last_page->size) {
>>>> +			if (rte_eal_iova_mode() == RTE_IOVA_VA) {
>>>> +				last_page->size += size;
>>>> +				return 0;
>>>> +			}
>>>
>>> This makes me think about a question: in IOVA_VA mode, what ensures
>>> that HPA and GPA are both contiguous?
>>
>> When I wrote this email, I thought host_phys_addr was the HPA, but in
>> VA mode it is the host IOVA, i.e. a VA in DPDK's case. So in most cases
>> GPA and VA will both be contiguous when the contiguous pages are all in
>> one memory region. But I think we should not make such an assumption,
>> as some vhost master may send GPA-contiguous pages in different memory
>> regions, and then the VAs after mmap may not be contiguous.
>
> Now we do async vfio mapping at page granularity.
> For 4K/2M pages, without this assumption, pages that are not merged may
> exceed the IOMMU's capability. This limits DMA devices to using only 1G
> hugepages.

I'm sorry, but I'm not sure I understand your explanation.
Could you please rephrase it?

Thanks,
Maxime

> Thanks,
> Xuan
>
>>
>> Maxime, what do you think?
>>
>> Thanks,
>> Chenbo
>>
>>>
>>> Maxime & Yuan, any thoughts?
>>>
>>> Thanks,
>>> Chenbo
>>>
>>>> +
>>>> +			if (rte_eal_iova_mode() == RTE_IOVA_PA &&
>>>> +				guest_phys_addr == last_page->guest_phys_addr +
>>>> +				last_page->size) {
>>>> +				last_page->size += size;
>>>> +				return 0;
>>>> +			}
>>>>  		}
>>>>  	}
>>>>
>>>> --
>>>> 2.25.1
>