From: "Ding, Xuan"
To: Nipun Gupta, "dev@dpdk.org", "thomas@monjalon.net", "Burakov, Anatoly", "ferruh.yigit@amd.com"
CC: "nikhil.agarwal@amd.com", "He, Xingguang", "Ling, WeiX"
Subject: RE: [PATCH] vfio: do not coalesce DMA mappings
Date: Thu, 29 Jun 2023 08:21:14 +0000
References: <20221230095853.1323616-1-nipun.gupta@amd.com>
In-Reply-To: <20221230095853.1323616-1-nipun.gupta@amd.com>
List-Id: DPDK patches and discussions

Hi Nipun,

Thank you for taking the time to read this email.

Our QA team found that since commit "a399d7b5a994: do not coalesce DMA
mappings" was introduced, testpmd fails to start with the "--no-huge"
parameter and shows:

EAL: Cannot set up DMA remapping, error 28 (No space left on device)

They reported it on the DPDK Bugzilla:
https://bugs.dpdk.org/show_bug.cgi?id=1235

I understand this change is meant to stay consistent with the kernel by not
allowing memory segments to be merged. The side effect is that testpmd with
"--no-huge" cannot start, because the resulting number of pages exceeds the
capability of the IOMMU.

Is this expected? Should we remove "--no-huge" from our test case?

Regards,
Xuan

> -----Original Message-----
> From: Nipun Gupta
> Sent: Friday, December 30, 2022 5:59 PM
> To: dev@dpdk.org; thomas@monjalon.net; Burakov, Anatoly;
> ferruh.yigit@amd.com
> Cc: nikhil.agarwal@amd.com; Nipun Gupta
> Subject: [PATCH] vfio: do not coalesce DMA mappings
>
> At the cleanup time when dma unmap is done, linux kernel does not allow
> unmap of individual segments which were coalesced together while creating
> the DMA map for type1 IOMMU mappings. So, this change updates the
> mapping of the memory segments(hugepages) on a per-page basis.
>
> Signed-off-by: Nipun Gupta
> ---
>
> When hotplug of devices is used, multiple pages gets coalesced and a single
> mapping gets created for these pages (using APIs
> rte_memseg_contig_walk() and type1_map_contig()). On the cleanup time
> when the memory is released, the VFIO does not clean up that memory and
> following error is observed in the eal for 2MB hugepages:
> EAL: Unexpected size 0 of DMA remapping cleared instead of 2097152
>
> This is because VFIO does not clear the DMA (refer API
> vfio_dma_do_unmap() -
> https://elixir.bootlin.com/linux/latest/source/drivers/vfio/vfio_iommu_type1.c#L1330),
> where it checks the dma mapping where it checks for IOVA to free:
> https://elixir.bootlin.com/linux/latest/source/drivers/vfio/vfio_iommu_type1.c#L1418.
>
> Thus this change updates the mapping to be created individually instead of
> coalescing them.
>
>  lib/eal/linux/eal_vfio.c | 29 -----------------------------
>  1 file changed, 29 deletions(-)
>
> diff --git a/lib/eal/linux/eal_vfio.c b/lib/eal/linux/eal_vfio.c
> index 549b86ae1d..56edccb0db 100644
> --- a/lib/eal/linux/eal_vfio.c
> +++ b/lib/eal/linux/eal_vfio.c
> @@ -1369,19 +1369,6 @@ rte_vfio_get_group_num(const char *sysfs_base,
>  	return 1;
>  }
>
> -static int
> -type1_map_contig(const struct rte_memseg_list *msl, const struct rte_memseg *ms,
> -		size_t len, void *arg)
> -{
> -	int *vfio_container_fd = arg;
> -
> -	if (msl->external)
> -		return 0;
> -
> -	return vfio_type1_dma_mem_map(*vfio_container_fd, ms->addr_64, ms->iova,
> -			len, 1);
> -}
> -
>  static int
>  type1_map(const struct rte_memseg_list *msl, const struct rte_memseg *ms,
>  		void *arg)
> @@ -1396,10 +1383,6 @@ type1_map(const struct rte_memseg_list *msl, const struct rte_memseg *ms,
>  	if (ms->iova == RTE_BAD_IOVA)
>  		return 0;
>
> -	/* if IOVA mode is VA, we've already mapped the internal segments */
> -	if (!msl->external && rte_eal_iova_mode() == RTE_IOVA_VA)
> -		return 0;
> -
>  	return vfio_type1_dma_mem_map(*vfio_container_fd, ms->addr_64, ms->iova,
>  			ms->len, 1);
>  }
> @@ -1464,18 +1447,6 @@ vfio_type1_dma_mem_map(int vfio_container_fd, uint64_t vaddr, uint64_t iova,
>  static int
>  vfio_type1_dma_map(int vfio_container_fd)
>  {
> -	if (rte_eal_iova_mode() == RTE_IOVA_VA) {
> -		/* with IOVA as VA mode, we can get away with mapping contiguous
> -		 * chunks rather than going page-by-page.
> -		 */
> -		int ret = rte_memseg_contig_walk(type1_map_contig,
> -				&vfio_container_fd);
> -		if (ret)
> -			return ret;
> -		/* we have to continue the walk because we've skipped the
> -		 * external segments during the config walk.
> -		 */
> -	}
>  	return rte_memseg_walk(type1_map, &vfio_container_fd);
>  }
>
> --
> 2.25.1
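
As a rough illustration of why per-page mapping can fail with "--no-huge",
the sketch below counts the DMA mappings needed when every page is mapped
individually and compares that count against a fixed entry limit. It assumes
the limit being hit is the vfio_iommu_type1 "dma_entry_limit" module
parameter with its default of 65535; this thread does not confirm that. The
helper names are hypothetical and are not part of DPDK or the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* ASSUMPTION: default of the vfio_iommu_type1 "dma_entry_limit" module
 * parameter. The value is not stated anywhere in this thread.
 */
#define ASSUMED_DMA_ENTRY_LIMIT 65535ULL

/* Hypothetical helper: number of DMA mappings needed when each page of
 * mem_bytes is mapped on its own (rounded up to whole pages).
 */
static inline uint64_t
mappings_needed(uint64_t mem_bytes, uint64_t page_bytes)
{
	assert(page_bytes != 0);
	return (mem_bytes + page_bytes - 1) / page_bytes;
}

/* Hypothetical helper: true if per-page mapping of mem_bytes stays within
 * the assumed entry limit.
 */
static inline bool
fits_entry_limit(uint64_t mem_bytes, uint64_t page_bytes)
{
	return mappings_needed(mem_bytes, page_bytes) <= ASSUMED_DMA_ENTRY_LIMIT;
}
```

Under these assumptions, 1 GiB of memory (an illustrative figure, not taken
from the bug report) needs 262144 mappings with 4 KiB pages, so
fits_entry_limit(1ULL << 30, 4096) is false, while the same memory in 2 MiB
hugepages needs only 512 mappings and fits.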