From: "Liu, Yong"
To: Gavin Hu, "maxime.coquelin@redhat.com", "Ye, Xiaolong", "Wang, Zhihong"
CC: "dev@dpdk.org", nd
Date: Wed, 1 Apr 2020 13:01:15 +0000
Message-ID: <86228AFD5BCD8E4EBFD2B90117B5E81E63524453@SHSMSX103.ccr.corp.intel.com>
References: <20200316153353.112897-1-yong.liu@intel.com> <20200401145011.67357-1-yong.liu@intel.com> <20200401145011.67357-2-yong.liu@intel.com>
In-Reply-To:
Subject: Re: [dpdk-dev] [PATCH v2 2/2] vhost: cache gpa to hpa translation
List-Id: DPDK patches and discussions

> -----Original Message-----
> From: Gavin Hu
> Sent: Wednesday, April 1, 2020 6:07 PM
> To: Liu, Yong; maxime.coquelin@redhat.com; Ye, Xiaolong; Wang, Zhihong
> Cc: dev@dpdk.org; nd
> Subject: RE: [dpdk-dev] [PATCH v2 2/2] vhost: cache gpa to hpa translation
>
> Hi Marvin,
>
> > -----Original Message-----
> > From: dev On Behalf Of Marvin Liu
> > Sent: Wednesday, April 1, 2020 10:50 PM
> > To: maxime.coquelin@redhat.com; xiaolong.ye@intel.com;
> > zhihong.wang@intel.com
> > Cc: dev@dpdk.org; Marvin Liu
> > Subject: [dpdk-dev] [PATCH v2 2/2] vhost: cache gpa to hpa translation
> >
> > If Tx zero copy is enabled, the gpa to hpa mapping table is updated one
> > entry at a time.
> > This will harm performance when the guest memory backend uses 2M
> > hugepages. Add a cached mapping table which is sorted by usage
> > sequence. Address translation first checks the cached mapping table,
> > then checks the unsorted mapping table if no match is found.
> >
> > Signed-off-by: Marvin Liu
> >
> > diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h
> > index 2087d1400..5cb0e83dd 100644
> > --- a/lib/librte_vhost/vhost.h
> > +++ b/lib/librte_vhost/vhost.h
> > @@ -368,7 +368,9 @@ struct virtio_net {
> >  	struct vhost_device_ops const *notify_ops;
> >
> >  	uint32_t nr_guest_pages;
> > +	uint32_t nr_cached_guest_pages;
> >  	uint32_t max_guest_pages;
> > +	struct guest_page *cached_guest_pages;
> >  	struct guest_page *guest_pages;
> >
> >  	int slave_req_fd;
> > @@ -553,12 +555,25 @@ gpa_to_hpa(struct virtio_net *dev, uint64_t gpa,
> >  	   uint64_t size)
> >  {
> >  	uint32_t i;
> >  	struct guest_page *page;
> > +	uint32_t cached_pages = dev->nr_cached_guest_pages;
> > +
> > +	for (i = 0; i < cached_pages; i++) {
> > +		page = &dev->cached_guest_pages[i];
> > +		if (gpa >= page->guest_phys_addr &&
> > +		    gpa + size < page->guest_phys_addr + page->size) {
> > +			return gpa - page->guest_phys_addr +
> > +			       page->host_phys_addr;
> > +		}
> > +	}
> Sorry, I did not see any speedup with cached guest pages in comparison to
> the old code below.
> Is it not a simple copy?
> Would it be a better idea to use a hash instead to speed up the translation?
> /Gavin

Hi Gavin,
This change just reorders the overall mapping table according to usage
sequence. The virtio driver will most likely reuse recently recycled
buffers, so the search finds a match near the beginning of the cached
table. It is a simple fix for the performance issue. Using a hash for the
index would cost much more in the normal case.
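For illustration, here is a minimal self-contained sketch of the lookup
order described above (not the exact patch): the cached table, ordered by
first use, is searched first; a miss falls back to the full unsorted table
and the hit is copied to the end of the cache. Struct and field names
follow the patch, but the wrapper struct and function name below are
simplified for illustration and are not part of the patch; allocation is
assumed to have been done elsewhere, with the cache pre-sized to hold all
pages as in the patch.

#include <stdint.h>
#include <string.h>

struct guest_page {
	uint64_t guest_phys_addr;
	uint64_t host_phys_addr;
	uint64_t size;
};

struct page_tables {
	uint32_t nr_guest_pages;
	uint32_t nr_cached_guest_pages;
	struct guest_page *guest_pages;        /* unsorted, from SET_MEM_TABLE */
	struct guest_page *cached_guest_pages; /* ordered by first use */
};

static uint64_t
gpa_to_hpa_sketch(struct page_tables *t, uint64_t gpa, uint64_t size)
{
	struct guest_page *page;
	uint32_t i;

	/* Fast path: pages already used by this device; if the guest reuses
	 * recently recycled buffers, the match is near the beginning. */
	for (i = 0; i < t->nr_cached_guest_pages; i++) {
		page = &t->cached_guest_pages[i];
		if (gpa >= page->guest_phys_addr &&
		    gpa + size < page->guest_phys_addr + page->size)
			return gpa - page->guest_phys_addr + page->host_phys_addr;
	}

	/* Slow path: scan the full table and promote the hit into the cache
	 * (the cache is pre-sized to hold every page, so no bounds check). */
	for (i = 0; i < t->nr_guest_pages; i++) {
		page = &t->guest_pages[i];
		if (gpa >= page->guest_phys_addr &&
		    gpa + size < page->guest_phys_addr + page->size) {
			memcpy(&t->cached_guest_pages[t->nr_cached_guest_pages],
			       page, sizeof(*page));
			t->nr_cached_guest_pages++;
			return gpa - page->guest_phys_addr + page->host_phys_addr;
		}
	}

	return 0; /* no mapping found */
}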
Regards,
Marvin

> >
> >  	for (i = 0; i < dev->nr_guest_pages; i++) {
> >  		page = &dev->guest_pages[i];
> >
> >  		if (gpa >= page->guest_phys_addr &&
> >  		    gpa + size < page->guest_phys_addr + page->size) {
> > +			rte_memcpy(&dev->cached_guest_pages[cached_pages],
> > +				page, sizeof(struct guest_page));
> > +			dev->nr_cached_guest_pages++;
> >  			return gpa - page->guest_phys_addr +
> >  			       page->host_phys_addr;
> >  		}
> >
> > diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
> > index 79fcb9d19..1bae1fddc 100644
> > --- a/lib/librte_vhost/vhost_user.c
> > +++ b/lib/librte_vhost/vhost_user.c
> > @@ -192,7 +192,9 @@ vhost_backend_cleanup(struct virtio_net *dev)
> >  	}
> >
> >  	rte_free(dev->guest_pages);
> > +	rte_free(dev->cached_guest_pages);
> >  	dev->guest_pages = NULL;
> > +	dev->cached_guest_pages = NULL;
> >
> >  	if (dev->log_addr) {
> >  		munmap((void *)(uintptr_t)dev->log_addr, dev->log_size);
> > @@ -898,7 +900,7 @@ add_one_guest_page(struct virtio_net *dev, uint64_t guest_phys_addr,
> >  		   uint64_t host_phys_addr, uint64_t size)
> >  {
> >  	struct guest_page *page, *last_page;
> > -	struct guest_page *old_pages;
> > +	struct guest_page *old_pages, *old_cached_pages;
> >
> >  	if (dev->nr_guest_pages == dev->max_guest_pages) {
> >  		dev->max_guest_pages *= 2;
> > @@ -906,9 +908,19 @@ add_one_guest_page(struct virtio_net *dev, uint64_t guest_phys_addr,
> >  		dev->guest_pages = rte_realloc(dev->guest_pages,
> >  					dev->max_guest_pages * sizeof(*page),
> >  					RTE_CACHE_LINE_SIZE);
> > -		if (dev->guest_pages == NULL) {
> > +		old_cached_pages = dev->cached_guest_pages;
> > +		dev->cached_guest_pages = rte_realloc(dev->cached_guest_pages,
> > +					dev->max_guest_pages * sizeof(*page),
> > +					RTE_CACHE_LINE_SIZE);
> > +		dev->nr_cached_guest_pages = 0;
> > +		if (dev->guest_pages == NULL ||
> > +			dev->cached_guest_pages == NULL) {
> >  			VHOST_LOG_CONFIG(ERR, "cannot realloc guest_pages\n");
> >  			rte_free(old_pages);
> > +			rte_free(old_cached_pages);
> > +			dev->guest_pages = NULL;
> > +			dev->cached_guest_pages = NULL;
> >  			return -1;
> >  		}
> >  	}
> > @@ -1078,6 +1090,20 @@ vhost_user_set_mem_table(struct virtio_net **pdev, struct VhostUserMsg *msg,
> >  		}
> >  	}
> >
> > +	if (dev->cached_guest_pages == NULL) {
> > +		dev->cached_guest_pages = rte_zmalloc(NULL,
> > +						dev->max_guest_pages *
> > +						sizeof(struct guest_page),
> > +						RTE_CACHE_LINE_SIZE);
> > +		if (dev->cached_guest_pages == NULL) {
> > +			VHOST_LOG_CONFIG(ERR,
> > +				"(%d) failed to allocate memory "
> > +				"for dev->cached_guest_pages\n",
> > +				dev->vid);
> > +			return RTE_VHOST_MSG_RESULT_ERR;
> > +		}
> > +	}
> > +
> >  	dev->mem = rte_zmalloc("vhost-mem-table", sizeof(struct rte_vhost_memory) +
> >  		sizeof(struct rte_vhost_mem_region) * memory->nregions, 0);
> >  	if (dev->mem == NULL) {
> > --
> > 2.17.1