From: Shahaf Shuler <shahafs@mellanox.com>
To: anatoly.burakov@intel.com, yskoh@mellanox.com, thomas@monjalon.net, ferruh.yigit@intel.com, nhorman@tuxdriver.com, gaetan.rivet@6wind.com
Cc: dev@dpdk.org
Date: Sun, 10 Mar 2019 10:27:57 +0200
Subject: [dpdk-dev] [PATCH v4 0/6] introduce DMA memory mapping for external memory

The DPDK APIs expose three different modes for working with memory used for DMA:

1. Use DPDK-owned memory (backed by the DPDK-provided hugepages). This memory is allocated by the DPDK libraries, included in the DPDK memory system (memseg lists) and automatically DMA mapped by the DPDK layers.

2. Use memory allocated by the user and registered with the DPDK memory system. Upon registration of the memory, the DPDK layers will DMA map it to all needed devices. After registration, allocation of this memory is done with the rte_*malloc APIs.

3. Use memory allocated by the user and not registered with the DPDK memory system. This is for users who want tight control over this memory (e.g. to avoid the rte_malloc header).
   The user should allocate the memory, register it through the rte_extmem_register API, and call a DMA map function in order to map it to the different devices.

The scope of this patchset is #3 above.

Currently the only way to map external memory is through VFIO (rte_vfio_dma_map). While VFIO is common, there are other vendors which use different ways to map memory (e.g. Mellanox and NXP [1]).

The work in this patchset moves the DMA mapping to vendor-agnostic APIs. Device-level DMA map and unmap APIs were added. Currently those APIs are implemented only for PCI devices. For devices on the PCI bus, the PCI driver can expose its own map and unmap functions to be used for the mapping. In case the driver doesn't provide any, the memory will be mapped, if possible, to the IOMMU through the VFIO APIs.

Application usage of those APIs is quite simple:
* allocate memory
* call rte_extmem_register on the memory chunk
* take a device, and query its rte_device
* call the device-specific mapping function for this device

Future work will deprecate the rte_vfio_dma_map and rte_vfio_dma_unmap APIs, leaving the rte device APIs as the preferred option for the user.

On v4:
 - Changed rte_dev_dma_map errno to ENOTSUP in case the bus doesn't support the DMA map API.

On v3:
 - Fixed compilation issue on FreeBSD.
 - Fixed forgotten rte_bus_dma_map to rte_dev_dma_map.
 - Removed __rte_deprecated from the vfio functions until rte_dev_dma_map becomes non-experimental.
 - Changed the error return value to always be -1, with a proper errno.
 - Used rte_mem_virt2memseg_list instead of rte_mem_virt2memseg to verify memory is registered.
 - Added the above check also on dma_unmap calls.
 - Added a note in the API doc that the memory must be registered in advance.
 - Added a debug log to report the case where memory mapping to vfio was skipped.

On v2:
 - Added a warning in the release notes about the API change in vfio.
 - Moved function doc to the prototype declaration.
 - Used dma_map and dma_unmap instead of map and unmap.
 - Used RTE_VFIO_DEFAULT_CONTAINER_FD instead of the fixed value -1.
 - Moved the bus functions to eal_common_dev.c and renamed them properly.
 - Changed the eth device iterator to use RTE_DEV_FOREACH.
 - Enforced that memory is registered with rte_extmem_* prior to mapping.
 - Used EEXIST as the only possible return value from type1 vfio IOMMU mapping.

[1] https://patches.dpdk.org/patch/47796/

Shahaf Shuler (6):
  vfio: allow DMA map of memory for the default vfio fd
  vfio: don't fail to DMA map if memory is already mapped
  bus: introduce device level DMA memory mapping
  net/mlx5: refactor external memory registration
  net/mlx5: support PCI device DMA map and unmap
  doc: deprecation notice for VFIO DMA map APIs

 doc/guides/prog_guide/env_abstraction_layer.rst |   2 +-
 doc/guides/rel_notes/deprecation.rst            |   4 +
 doc/guides/rel_notes/release_19_05.rst          |   3 +
 drivers/bus/pci/pci_common.c                    |  48 ++++
 drivers/bus/pci/rte_bus_pci.h                   |  40 ++++
 drivers/net/mlx5/mlx5.c                         |   2 +
 drivers/net/mlx5/mlx5_mr.c                      | 225 ++++++++++++++++---
 drivers/net/mlx5/mlx5_rxtx.h                    |   5 +
 lib/librte_eal/common/eal_common_dev.c          |  34 +++
 lib/librte_eal/common/include/rte_bus.h         |  44 ++++
 lib/librte_eal/common/include/rte_dev.h         |  47 ++++
 lib/librte_eal/common/include/rte_vfio.h        |   8 +-
 lib/librte_eal/linuxapp/eal/eal_vfio.c          |  42 +++-
 lib/librte_eal/rte_eal_version.map              |   2 +
 14 files changed, 468 insertions(+), 38 deletions(-)

-- 
2.12.0
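P.S. The four usage steps from the cover letter (allocate, rte_extmem_register, query the rte_device, device-level DMA map) could be sketched roughly as below. This is a minimal sketch against the APIs introduced by this series, not code from the patches themselves: the function name map_external_memory, the size/page-size constants, and the assumption that the chunk is 2 MB-hugepage-backed and that VA is used as IOVA (iova-va mode) are all illustrative, and error handling is abbreviated.

```c
#include <stdint.h>
#include <rte_dev.h>
#include <rte_ethdev.h>
#include <rte_memory.h>

#define EXT_MEM_LEN (64u << 20) /* example: 64 MB chunk */
#define EXT_PAGE_SZ (2u << 20)  /* example: chunk backed by 2 MB pages */

/* addr: user-allocated, page-aligned memory NOT coming from rte_malloc
 * (mode #3 above). port_id: an already-probed ethdev port. */
static int
map_external_memory(uint16_t port_id, void *addr)
{
	struct rte_eth_dev_info dev_info;
	int ret;

	/* Step 2: register the chunk so DPDK knows about it. NULL/0 for
	 * the IOVA table lets EAL resolve the addresses itself. */
	ret = rte_extmem_register(addr, EXT_MEM_LEN, NULL, 0, EXT_PAGE_SZ);
	if (ret != 0)
		return ret;

	/* Step 3: take a device and query its rte_device. */
	rte_eth_dev_info_get(port_id, &dev_info);

	/* Step 4: device-level DMA map. The bus picks the driver's own
	 * map callback if it has one, otherwise falls back to VFIO.
	 * Here VA is reused as IOVA (assumes iova-va mode). */
	return rte_dev_dma_map(dev_info.device, addr,
			       (uint64_t)(uintptr_t)addr, EXT_MEM_LEN);
}
```

The corresponding teardown would mirror this with rte_dev_dma_unmap followed by rte_extmem_unregister, since (per v3/v4 notes) both map and unmap require the memory to be registered.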