From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Mei, JianweiX"
To: "Burakov, Anatoly", "dev@dpdk.org"
CC: "stable@dpdk.org"
Thread-Topic: [dpdk-dev] [PATCH 18.11] vfio: map contiguous areas in one go
Date: Thu, 17 Sep 2020 02:48:41 +0000
Message-ID: <3d46008ec97b4bee980c9d91cc4c955c@intel.com>
References: <4275aaf7d248f0e2a05ac21c05d682ce7ea4ad56.1600255570.git.anatoly.burakov@intel.com>
In-Reply-To: <4275aaf7d248f0e2a05ac21c05d682ce7ea4ad56.1600255570.git.anatoly.burakov@intel.com>
Subject: Re: [dpdk-stable] [dpdk-dev] [PATCH 18.11] vfio: map contiguous areas in one go
List-Id: patches for DPDK stable branches
Errors-To: stable-bounces@dpdk.org
Sender: "stable"

Tested-by: Mei, JianweiX <jianweix.mei@intel.com>

-----Original Message-----
From: dev On Behalf Of Anatoly Burakov
Sent: Wednesday, September 16, 2020 7:26 PM
To: dev@dpdk.org
Cc: stable@dpdk.org
Subject: [dpdk-dev] [PATCH 18.11] vfio: map contiguous areas in one go

Currently, when we are creating DMA mappings for memory that's either
external or is backed by hugepages in IOVA as PA mode, we assume that each
page is necessarily discontiguous. This may not actually be the case,
especially for external memory, where the user is able to create their own
IOVA table and make it contiguous. This is a problem because VFIO has a
limited number of DMA mappings, and it does not appear to concatenate them:
it treats each mapping as separate, even when they cover adjacent areas.

Fix this so that we always map contiguous memory in a single chunk, as
opposed to mapping each segment separately.
Signed-off-by: Anatoly Burakov
---

Notes:
    Resend for stable as original patch [1] no longer applies

    [1] http://patches.dpdk.org/patch/66041/

 lib/librte_eal/linuxapp/eal/eal_vfio.c | 59 ++++++++++++++++++++++----
 1 file changed, 51 insertions(+), 8 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c b/lib/librte_eal/linuxapp/eal/eal_vfio.c
index 2f84e0f215..be969bbaac 100644
--- a/lib/librte_eal/linuxapp/eal/eal_vfio.c
+++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c
@@ -510,9 +510,11 @@ static void
 vfio_mem_event_callback(enum rte_mem_event type, const void *addr, size_t len,
 		void *arg __rte_unused)
 {
+	rte_iova_t iova_start, iova_expected;
 	struct rte_memseg_list *msl;
 	struct rte_memseg *ms;
 	size_t cur_len = 0;
+	uint64_t va_start;
 
 	msl = rte_mem_virt2memseg_list(addr);
 
@@ -530,22 +532,63 @@ vfio_mem_event_callback(enum rte_mem_event type, const void *addr, size_t len,
 
 	/* memsegs are contiguous in memory */
 	ms = rte_mem_virt2memseg(addr, msl);
+
+	/*
+	 * This memory is not guaranteed to be contiguous, but it still could
+	 * be, or it could have some small contiguous chunks. Since the number
+	 * of VFIO mappings is limited, and VFIO appears to not concatenate
+	 * adjacent mappings, we have to do this ourselves.
+	 *
+	 * So, find contiguous chunks, then map them.
+	 */
+	va_start = ms->addr_64;
+	iova_start = iova_expected = ms->iova;
 	while (cur_len < len) {
+		bool new_contig_area = ms->iova != iova_expected;
+		bool last_seg = (len - cur_len) == ms->len;
+		bool skip_last = false;
+
+		/* only do mappings when current contiguous area ends */
+		if (new_contig_area) {
+			if (type == RTE_MEM_EVENT_ALLOC)
+				vfio_dma_mem_map(default_vfio_cfg, va_start,
+						iova_start,
+						iova_expected - iova_start, 1);
+			else
+				vfio_dma_mem_map(default_vfio_cfg, va_start,
+						iova_start,
+						iova_expected - iova_start, 0);
+			va_start = ms->addr_64;
+			iova_start = ms->iova;
+		}
 		/* some memory segments may have invalid IOVA */
 		if (ms->iova == RTE_BAD_IOVA) {
 			RTE_LOG(DEBUG, EAL, "Memory segment at %p has bad IOVA, skipping\n",
 					ms->addr);
-			goto next;
+			skip_last = true;
 		}
-		if (type == RTE_MEM_EVENT_ALLOC)
-			vfio_dma_mem_map(default_vfio_cfg, ms->addr_64,
-					ms->iova, ms->len, 1);
-		else
-			vfio_dma_mem_map(default_vfio_cfg, ms->addr_64,
-					ms->iova, ms->len, 0);
-next:
+		iova_expected = ms->iova + ms->len;
 		cur_len += ms->len;
 		++ms;
+
+		/*
+		 * don't count previous segment, and don't attempt to
+		 * dereference a potentially invalid pointer.
+		 */
+		if (skip_last && !last_seg) {
+			iova_expected = iova_start = ms->iova;
+			va_start = ms->addr_64;
+		} else if (!skip_last && last_seg) {
+			/* this is the last segment and we're not skipping */
+			if (type == RTE_MEM_EVENT_ALLOC)
+				vfio_dma_mem_map(default_vfio_cfg, va_start,
+						iova_start,
+						iova_expected - iova_start, 1);
+			else
+				vfio_dma_mem_map(default_vfio_cfg, va_start,
+						iova_start,
+						iova_expected - iova_start, 0);
+		}
 	}
 }
 
-- 
2.17.1