From: Anatoly Burakov
To: dev@dpdk.org
Cc: thomas@monjalon.net, hemant.agrawal@nxp.com, bruce.richardson@intel.com,
 ferruh.yigit@intel.com, konstantin.ananyev@intel.com,
 jerin.jacob@caviumnetworks.com, olivier.matz@6wind.com,
 stephen@networkplumber.org, nhorman@tuxdriver.com, david.marchand@6wind.com,
 gowrishankar.m@linux.vnet.ibm.com
Date: Thu, 31 May 2018 11:57:47 +0100
Subject: [dpdk-dev] [RFC 0/3] Make device mapping more reliable

Currently, memory for device maps is allocated ad hoc, by calculating the end
of the VA space allocated for hugepages and crossing fingers in hopes that
those addresses
will be free in both primary and secondary processes. This leads to situations
such as this:

EAL: Detected 88 lcore(s)
EAL: Detected 2 NUMA nodes
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket_178323_8af2229603de4
EAL: Probing VFIO support...
EAL: VFIO support initialized
EAL: PCI device 0000:81:00.0 on NUMA socket 1
EAL: probe driver: 8086:1563 net_ixgbe
EAL: Cannot mmap device resource file /sys/bus/pci/devices/0000:81:00.0/resource0 to address: 0x7ff7f5800000
EAL: Requested device 0000:81:00.0 cannot be used
EAL: Error - exiting with code: 1
Cause: No Ethernet ports - bye

As can be seen from the above log, the secondary process initialized
successfully, but device BAR mapping failed, which resulted in missing ports
in the secondary process.

This patchset is an attempt to fix this problem once and for all, by using the
same method for device mappings that we already use for memory. That is, all
of the device memory is preallocated in advance, so that initialization either
succeeds and allows for device mappings, or fails outright (whereas currently
we may end up in an in-between situation, where init has succeeded but device
mappings have failed).

This change breaks the ABI, so it is not for this release. However, I'd like
to hear feedback on the approach and whether there are potential problems with
other buses/use cases that I didn't think of.

Anatoly Burakov (3):
  fbarray: allow zero-sized elements
  mem: add device memory reserve/free API
  bus/pci: use the new device memory API for BAR mapping

 drivers/bus/pci/linux/pci_init.h              |   1 -
 drivers/bus/pci/linux/pci_uio.c               |  11 +-
 drivers/bus/pci/linux/pci_vfio.c              |  27 +-
 lib/librte_eal/common/eal_common_fbarray.c    |  10 +-
 lib/librte_eal/common/eal_common_memory.c     | 270 ++++++++++++++++--
 .../common/include/rte_eal_memconfig.h        |  18 ++
 lib/librte_eal/common/include/rte_memory.h    |  40 +++
 lib/librte_pci/Makefile                       |   1 +
 lib/librte_pci/rte_pci.c                      |  20 +-
 9 files changed, 350 insertions(+), 48 deletions(-)

-- 
2.17.0
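
For readers unfamiliar with the idea of reserving address space up front, the
approach described above can be sketched roughly as follows. This is only an
illustration under my own assumptions, not the API added by this patchset: the
names dev_va_pool, dev_va_pool_init, dev_va_pool_reserve, dev_va_pool_map and
dev_va_pool_destroy are hypothetical, and the sketch leaves out sharing the
reservation between primary and secondary processes. It reserves a region with
PROT_NONE at init time, hands out page-aligned pieces of it, and maps a file
over a piece with MAP_FIXED.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS on glibc in strict modes */
#endif
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical pool of address space set aside for device mappings. */
struct dev_va_pool {
	uint8_t *base; /* start of the reserved region */
	size_t len;    /* total reserved length, in bytes */
	size_t used;   /* bytes already handed out */
};

/* Reserve len bytes of address space at init time. PROT_NONE means nothing
 * is accessible yet; the range is merely kept away from other mmap() users. */
int dev_va_pool_init(struct dev_va_pool *p, size_t len)
{
	void *addr = mmap(NULL, len, PROT_NONE,
			MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (addr == MAP_FAILED)
		return -1; /* fail outright, before any device is probed */
	p->base = addr;
	p->len = len;
	p->used = 0;
	return 0;
}

/* Hand out a page-aligned piece of the reservation, or NULL if it is
 * exhausted or the request is empty. */
void *dev_va_pool_reserve(struct dev_va_pool *p, size_t len)
{
	size_t page = (size_t)sysconf(_SC_PAGESIZE);
	size_t need = (len + page - 1) & ~(page - 1); /* round up to a page */

	if (len == 0 || need < len || need > p->len - p->used)
		return NULL;
	void *addr = p->base + p->used;
	p->used += need;
	return addr;
}

/* Map a file (e.g. a device resource file) over a reserved piece. MAP_FIXED
 * is acceptable here only because the range is part of our own reservation. */
void *dev_va_pool_map(void *reserved, size_t len, int fd, off_t off)
{
	return mmap(reserved, len, PROT_READ | PROT_WRITE,
			MAP_SHARED | MAP_FIXED, fd, off);
}

/* Release the whole reservation, including anything mapped over it. */
int dev_va_pool_destroy(struct dev_va_pool *p)
{
	return munmap(p->base, p->len);
}
```

With something along these lines, running out of address space shows up as a
failure of dev_va_pool_init() or dev_va_pool_reserve() rather than as a failed
mmap() of a BAR later on.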