From: Anatoly Burakov
To: dev@dpdk.org
Cc: laszlo.madarassy@ericsson.com, laszlo.vadkerti@ericsson.com,
    andras.kovacs@ericsson.com, winnie.tian@ericsson.com,
    daniel.andrasi@ericsson.com, janos.kobor@ericsson.com,
    srinath.mannam@broadcom.com, scott.branden@broadcom.com,
    ajit.khaparde@broadcom.com, keith.wiles@intel.com,
    bruce.richardson@intel.com, thomas@monjalon.net
Date: Tue, 4 Sep 2018 14:11:35 +0100
Subject: [dpdk-dev] [PATCH 00/16] Support externally allocated memory in DPDK

This is a proposal to enable using externally allocated memory in DPDK.

In a nutshell, here is what is being done:

- Index internal malloc heaps by NUMA node index, rather than by the NUMA
  node itself (external heaps will have IDs in order of creation)
- Add an identifier string to each malloc heap, to uniquely identify it
- Each new heap will receive a unique socket ID that will be used by the
  allocator to decide from which heap (internal or external) to allocate
  the requested amount of memory
- Allow creating named heaps and adding/removing memory to/from those heaps
- Allocate memseg lists at runtime, to keep track of IOVA addresses of
  externally allocated memory
  - If IOVA addresses aren't provided, use RTE_BAD_IOVA
- Allow malloc and memzones to allocate from external heaps
- Allow other data structures to allocate from external heaps

The responsibility to ensure memory is accessible before using it is on
the shoulders of the user - there is no checking done with regards to
validity of the memory (nor could there be...).

The general approach is to create a heap and add memory into it. For any
other process wishing to use the same memory, said memory must first be
attached (otherwise some things will not work).

A design decision was made to make multiprocess synchronization a manual
process. Due to underlying issues with attaching to fbarrays in secondary
processes, this design was deemed to be better: we don't want to fail to
create an external heap in the primary because something in the secondary
has failed, when in fact we may not even have wanted this memory to be
accessible in the secondary in the first place.

Using external memory in multiprocess is *hard*, because not only does the
memory space need to be preallocated, it also needs to be attached in each
process so that other processes can access the page table. The attach API
call may or may not succeed, depending on memory layout, for reasons
similar to other multiprocess failures. This is treated as a "known issue"
for this release.
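
To make the heap indexing and socket ID scheme described above more
concrete, here is a minimal standalone sketch. It is not code from this
patchset and does not use the DPDK API: every name in it
(heap_add_internal, heap_create_external, heap_get_socket,
heap_index_for_socket) and both constants (MAX_HEAPS, EXT_SOCKET_ID_BASE)
are hypothetical, picked only for illustration. It assumes internal heaps
are stored in the order their NUMA nodes were detected, and that external
heaps get socket IDs from a counter starting above any real NUMA node ID.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical constants, for illustration only. */
#define MAX_HEAPS 8
#define NAME_LEN 32
#define EXT_SOCKET_ID_BASE 256 /* assumed first socket ID for external heaps */

struct heap {
	char name[NAME_LEN]; /* empty string marks a free slot */
	int socket_id;       /* NUMA node ID, or a generated external ID */
};

static struct heap heaps[MAX_HEAPS];
static int next_ext_socket_id = EXT_SOCKET_ID_BASE;

/* Return the array index of the heap with this name, or -1. */
static int heap_find(const char *name)
{
	for (int i = 0; i < MAX_HEAPS; i++)
		if (heaps[i].name[0] != '\0' &&
				strcmp(heaps[i].name, name) == 0)
			return i;
	return -1;
}

/* Store a heap in the first free slot; return its index, or -1. */
static int heap_register(const char *name, int socket_id)
{
	if (name == NULL || name[0] == '\0' || strlen(name) >= NAME_LEN)
		return -1;
	if (heap_find(name) >= 0)
		return -1; /* heap names must be unique */
	for (int i = 0; i < MAX_HEAPS; i++) {
		if (heaps[i].name[0] == '\0') {
			strcpy(heaps[i].name, name);
			heaps[i].socket_id = socket_id;
			return i;
		}
	}
	return -1; /* table is full */
}

/* Internal heap: its socket ID is the NUMA node ID, but its array
 * index is just the order in which it was registered. */
int heap_add_internal(const char *name, int numa_node)
{
	return heap_register(name, numa_node);
}

/* External heap: gets a freshly generated socket ID, returned to the
 * caller so it can be passed to allocation functions. */
int heap_create_external(const char *name)
{
	int idx = heap_register(name, next_ext_socket_id);

	if (idx < 0)
		return -1;
	assert(idx < MAX_HEAPS);
	return next_ext_socket_id++;
}

/* Query the socket ID of a named heap, or -1 if there is no such heap. */
int heap_get_socket(const char *name)
{
	int idx = heap_find(name);

	return idx < 0 ? -1 : heaps[idx].socket_id;
}

/* Allocator side: translate a socket ID into a heap array index. */
int heap_index_for_socket(int socket_id)
{
	for (int i = 0; i < MAX_HEAPS; i++)
		if (heaps[i].name[0] != '\0' &&
				heaps[i].socket_id == socket_id)
			return i;
	return -1;
}
```

In this model, a system with NUMA nodes 0 and 2 would keep its internal
heaps at indexes 0 and 1, and the first external heap would land at index
2 with socket ID 256. A caller would create the heap, remember the
returned socket ID, and pass it wherever a socket ID is accepted, which is
why the socket ID validity checks in several libraries are dropped in this
series.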

RFC -> v1 changes:
- Removed the "named heaps" API, allocate using fake socket ID instead
- Added multiprocess support
- Everything is now thread-safe
- Numerous bugfixes and API improvements

Anatoly Burakov (16):
  mem: add length to memseg list
  mem: allow memseg lists to be marked as external
  malloc: index heaps using heap ID rather than NUMA node
  mem: do not check for invalid socket ID
  flow_classify: do not check for invalid socket ID
  pipeline: do not check for invalid socket ID
  sched: do not check for invalid socket ID
  malloc: add name to malloc heaps
  malloc: add function to query socket ID of named heap
  malloc: allow creating malloc heaps
  malloc: allow destroying heaps
  malloc: allow adding memory to named heaps
  malloc: allow removing memory from named heaps
  malloc: allow attaching to external memory chunks
  malloc: allow detaching from external memory
  test: add unit tests for external memory support

 config/common_base                            |   1 +
 config/rte_config.h                           |   1 +
 drivers/bus/fslmc/fslmc_vfio.c                |   7 +-
 drivers/bus/pci/linux/pci.c                   |   2 +-
 drivers/net/mlx4/mlx4_mr.c                    |   3 +
 drivers/net/mlx5/mlx5.c                       |   5 +-
 drivers/net/mlx5/mlx5_mr.c                    |   3 +
 drivers/net/virtio/virtio_user/vhost_kernel.c |   5 +-
 lib/librte_eal/bsdapp/eal/eal.c               |   3 +
 lib/librte_eal/bsdapp/eal/eal_memory.c        |   9 +-
 lib/librte_eal/common/eal_common_memory.c     |   9 +-
 lib/librte_eal/common/eal_common_memzone.c    |   8 +-
 .../common/include/rte_eal_memconfig.h        |   6 +-
 lib/librte_eal/common/include/rte_malloc.h    | 181 +++++++++
 .../common/include/rte_malloc_heap.h          |   3 +
 lib/librte_eal/common/include/rte_memory.h    |   9 +
 lib/librte_eal/common/malloc_heap.c           | 287 +++++++++++--
 lib/librte_eal/common/malloc_heap.h           |  17 +
 lib/librte_eal/common/rte_malloc.c            | 383 ++++++++++++++++-
 lib/librte_eal/linuxapp/eal/eal.c             |   3 +
 lib/librte_eal/linuxapp/eal/eal_memalloc.c    |  12 +-
 lib/librte_eal/linuxapp/eal/eal_memory.c      |   4 +-
 lib/librte_eal/linuxapp/eal/eal_vfio.c        |  17 +-
 lib/librte_eal/rte_eal_version.map            |   7 +
 lib/librte_flow_classify/rte_flow_classify.c  |   3 +-
 lib/librte_mempool/rte_mempool.c              |  31 +-
 lib/librte_pipeline/rte_pipeline.c            |   3 +-
 lib/librte_sched/rte_sched.c                  |   2 +-
 test/test/Makefile                            |   1 +
 test/test/autotest_data.py                    |  14 +-
 test/test/meson.build                         |   1 +
 test/test/test_external_mem.c                 | 384 ++++++++++++++++++
 test/test/test_malloc.c                       |   3 +
 test/test/test_memzone.c                      |   3 +
 34 files changed, 1346 insertions(+), 84 deletions(-)
 create mode 100644 test/test/test_external_mem.c

-- 
2.17.1