From: Xueming Li
To: Nithin Dabilpuram
CC: Anatoly Burakov, David Christensen, dpdk stable
Date: Mon, 10 May 2021 22:50:17 +0800
Message-ID: <20210510145017.26193-1-xuemingl@nvidia.com>
Subject: [dpdk-stable] patch 'vfio: do not merge contiguous areas' has been queued to stable release 20.11.2

Hi,

FYI, your patch has been queued to stable release 20.11.2

Note it hasn't been pushed to http://dpdk.org/browse/dpdk-stable yet.
It will be pushed if I get no objections before 05/12/21. So please
shout if anyone has objections.

Also note that after the patch there's a diff of the upstream commit vs the
patch applied to the branch. This will indicate if there was any rebasing
needed to apply to the stable branch. If there were code changes for rebasing
(ie: not only metadata diffs), please double check that the rebase was
correctly done.

Queued patches are on a temporary branch at:
https://github.com/steevenlee/dpdk

This queued commit can be viewed at:
https://github.com/steevenlee/dpdk/commit/0e42f2b7ea5c0f28e220d253428ed85af735829c

Thanks.

Xueming Li

---
From 0e42f2b7ea5c0f28e220d253428ed85af735829c Mon Sep 17 00:00:00 2001
From: Nithin Dabilpuram
Date: Fri, 15 Jan 2021 13:02:41 +0530
Subject: [PATCH] vfio: do not merge contiguous areas

[ upstream commit 016763c219580292c8b05059c7452a7a11d0d19e ]

In order to save DMA entries limited by kernel both for external
memory and hugepage memory, an attempt was made to map physically
contiguous memory in one go. This cannot be done as VFIO IOMMU type1
does not support partially unmapping a previously mapped memory
region while Heap can request for multi page mapping and
partial unmapping.
Hence for going back to old method of mapping/unmapping at
memseg granularity, this commit reverts
commit d1c7c0cdf7ba ("vfio: map contiguous areas in one go")

Also add documentation on what module parameter needs to be used
to increase the per-container dma map limit for VFIO.

Fixes: d1c7c0cdf7ba ("vfio: map contiguous areas in one go")

Signed-off-by: Nithin Dabilpuram
Acked-by: Anatoly Burakov
Acked-by: David Christensen
---
 doc/guides/linux_gsg/linux_drivers.rst | 10 +++++
 lib/librte_eal/linux/eal_vfio.c        | 59 ++++----------------------
 2 files changed, 18 insertions(+), 51 deletions(-)

diff --git a/doc/guides/linux_gsg/linux_drivers.rst b/doc/guides/linux_gsg/linux_drivers.rst
index 90635a45d9..c6b6881ea2 100644
--- a/doc/guides/linux_gsg/linux_drivers.rst
+++ b/doc/guides/linux_gsg/linux_drivers.rst
@@ -25,6 +25,16 @@ To make use of VFIO, the ``vfio-pci`` module must be loaded:
 VFIO kernel is usually present by default in all distributions,
 however please consult your distributions documentation to make sure that is the case.
 
+For DMA mapping of either external memory or hugepages, VFIO interface is used.
+VFIO does not support partial unmap of once mapped memory. Hence DPDK's memory is
+mapped in hugepage granularity or system page granularity. Number of DMA
+mappings is limited by kernel with user locked memory limit of a process (rlimit)
+for system/hugepage memory. Another per-container overall limit applicable both
+for external memory and system memory was added in kernel 5.1 defined by
+VFIO module parameter ``dma_entry_limit`` with a default value of 64K.
+When application is out of DMA entries, these limits need to be adjusted to
+increase the allowed limit.
+
 Since Linux version 5.7,
 the ``vfio-pci`` module supports the creation of virtual functions.
 After the PF is bound to ``vfio-pci`` module,
diff --git a/lib/librte_eal/linux/eal_vfio.c b/lib/librte_eal/linux/eal_vfio.c
index 050082444e..64b134d530 100644
--- a/lib/librte_eal/linux/eal_vfio.c
+++ b/lib/librte_eal/linux/eal_vfio.c
@@ -517,11 +517,9 @@ static void
 vfio_mem_event_callback(enum rte_mem_event type, const void *addr, size_t len,
 		void *arg __rte_unused)
 {
-	rte_iova_t iova_start, iova_expected;
 	struct rte_memseg_list *msl;
 	struct rte_memseg *ms;
 	size_t cur_len = 0;
-	uint64_t va_start;
 
 	msl = rte_mem_virt2memseg_list(addr);
 
@@ -539,63 +537,22 @@ vfio_mem_event_callback(enum rte_mem_event type, const void *addr, size_t len,
 
 	/* memsegs are contiguous in memory */
 	ms = rte_mem_virt2memseg(addr, msl);
-
-	/*
-	 * This memory is not guaranteed to be contiguous, but it still could
-	 * be, or it could have some small contiguous chunks. Since the number
-	 * of VFIO mappings is limited, and VFIO appears to not concatenate
-	 * adjacent mappings, we have to do this ourselves.
-	 *
-	 * So, find contiguous chunks, then map them.
-	 */
-	va_start = ms->addr_64;
-	iova_start = iova_expected = ms->iova;
 	while (cur_len < len) {
-		bool new_contig_area = ms->iova != iova_expected;
-		bool last_seg = (len - cur_len) == ms->len;
-		bool skip_last = false;
-
-		/* only do mappings when current contiguous area ends */
-		if (new_contig_area) {
-			if (type == RTE_MEM_EVENT_ALLOC)
-				vfio_dma_mem_map(default_vfio_cfg, va_start,
-						iova_start,
-						iova_expected - iova_start, 1);
-			else
-				vfio_dma_mem_map(default_vfio_cfg, va_start,
-						iova_start,
-						iova_expected - iova_start, 0);
-			va_start = ms->addr_64;
-			iova_start = ms->iova;
-		}
 		/* some memory segments may have invalid IOVA */
 		if (ms->iova == RTE_BAD_IOVA) {
 			RTE_LOG(DEBUG, EAL, "Memory segment at %p has bad IOVA, skipping\n",
 					ms->addr);
-			skip_last = true;
+			goto next;
 		}
-		iova_expected = ms->iova + ms->len;
+		if (type == RTE_MEM_EVENT_ALLOC)
+			vfio_dma_mem_map(default_vfio_cfg, ms->addr_64,
+					ms->iova, ms->len, 1);
+		else
+			vfio_dma_mem_map(default_vfio_cfg, ms->addr_64,
+					ms->iova, ms->len, 0);
+next:
 		cur_len += ms->len;
 		++ms;
-
-		/*
-		 * don't count previous segment, and don't attempt to
-		 * dereference a potentially invalid pointer.
-		 */
-		if (skip_last && !last_seg) {
-			iova_expected = iova_start = ms->iova;
-			va_start = ms->addr_64;
-		} else if (!skip_last && last_seg) {
-			/* this is the last segment and we're not skipping */
-			if (type == RTE_MEM_EVENT_ALLOC)
-				vfio_dma_mem_map(default_vfio_cfg, va_start,
-						iova_start,
-						iova_expected - iova_start, 1);
-			else
-				vfio_dma_mem_map(default_vfio_cfg, va_start,
-						iova_start,
-						iova_expected - iova_start, 0);
-		}
 	}
 }
 
-- 
2.25.1

---
  Diff of the applied patch vs upstream commit (please double-check if non-empty:
---
--- -	2021-05-10 22:48:45.767754600 +0800
+++ 0001-vfio-do-not-merge-contiguous-areas.patch	2021-05-10 22:48:45.380000000 +0800
@@ -1 +1 @@
-From 016763c219580292c8b05059c7452a7a11d0d19e Mon Sep 17 00:00:00 2001
+From 0e42f2b7ea5c0f28e220d253428ed85af735829c Mon Sep 17 00:00:00 2001
@@ -5,0 +6,2 @@
+[ upstream commit 016763c219580292c8b05059c7452a7a11d0d19e ]
+
@@ -20 +21,0 @@
-Cc: stable@dpdk.org
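
Illustrative note (not part of the queued patch): the commit message says
that merged mappings break because VFIO IOMMU type1 cannot partially unmap
a region that was mapped in one call. The sketch below is a toy user-space
model of that rule only. It does not call VFIO or DPDK; every name in it
(toy_map_add, toy_unmap, toy_unmap_second_segment) is hypothetical, and the
"an unmap may only remove whole entries" rule is an assumption taken from
the commit message rather than a copy of the kernel logic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_MAPS 16

/* One entry of the toy DMA mapping table (hypothetical, not a kernel type). */
struct toy_map {
	uint64_t iova;
	uint64_t len;
	bool used;
};

static struct toy_map toy_table[TOY_MAX_MAPS];

/* Forget all mappings. */
static void toy_reset(void)
{
	for (size_t i = 0; i < TOY_MAX_MAPS; i++)
		toy_table[i].used = false;
}

/* Record one mapping; returns 0 on success, -1 if the table is full. */
static int toy_map_add(uint64_t iova, uint64_t len)
{
	for (size_t i = 0; i < TOY_MAX_MAPS; i++) {
		if (!toy_table[i].used) {
			toy_table[i] = (struct toy_map){ iova, len, true };
			return 0;
		}
	}
	return -1;
}

/*
 * Unmap [iova, iova + len). Assumed rule: the range may only remove whole
 * entries; if it would cut an existing entry in two, fail with -1 and
 * leave the table unchanged.
 */
static int toy_unmap(uint64_t iova, uint64_t len)
{
	const uint64_t end = iova + len;

	/* First pass: reject ranges that overlap an entry without covering it. */
	for (size_t i = 0; i < TOY_MAX_MAPS; i++) {
		const struct toy_map *m = &toy_table[i];

		if (!m->used)
			continue;
		const uint64_t m_end = m->iova + m->len;
		const bool overlaps = m->iova < end && iova < m_end;
		const bool covered = iova <= m->iova && m_end <= end;

		if (overlaps && !covered)
			return -1;
	}
	/* Second pass: drop every entry fully inside the range. */
	for (size_t i = 0; i < TOY_MAX_MAPS; i++) {
		struct toy_map *m = &toy_table[i];

		if (m->used && iova <= m->iova && m->iova + m->len <= end)
			m->used = false;
	}
	return 0;
}

/*
 * Map two IOVA-adjacent segments, either merged into one entry or as one
 * entry per segment, then try to unmap only the second segment.
 */
static int toy_unmap_second_segment(bool merge)
{
	const uint64_t seg_len = UINT64_C(0x200000);  /* 2 MB, a common hugepage size */
	const uint64_t iova = UINT64_C(0x100000000);  /* arbitrary example address */
	int rc;

	toy_reset();
	if (merge) {
		rc = toy_map_add(iova, 2 * seg_len);
	} else {
		rc = toy_map_add(iova, seg_len);
		if (rc == 0)
			rc = toy_map_add(iova + seg_len, seg_len);
	}
	assert(rc == 0);
	(void)rc;
	return toy_unmap(iova + seg_len, seg_len);
}
```

In this model toy_unmap_second_segment(true) fails, because the request would
split the single merged entry, while toy_unmap_second_segment(false) succeeds
at the cost of one table entry per segment. That is the trade-off the revert
accepts, and it is why the documentation hunk points at ``dma_entry_limit``
and the locked-memory rlimit as the limits to adjust when entries run out.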