From: Chenbo Xia <chenbo.xia@intel.com>
To: dev@dpdk.org
Cc: skori@marvell.com
Subject: [RFC 0/4] Support VFIO sparse mmap in PCI bus
Date: Tue, 18 Apr 2023 13:30:08 +0800
Message-ID: <20230418053012.10667-1-chenbo.xia@intel.com>
This series adds support for a standard VFIO capability, called
sparse mmap, to the PCI bus. In the Linux kernel, it is defined
as VFIO_REGION_INFO_CAP_SPARSE_MMAP. With sparse mmap, DPDK does
not mmap the whole BAR region into the process; it mmaps only
the parts of the BAR that the kernel reports as mmap-able in the
sparse mmap information. The DPDK process can access the rest of
the BAR region, which is not mmap-ed, through the pread/pwrite
system calls. Sparse mmap is useful when the kernel does not
want userspace to mmap the whole BAR region, or when the kernel
wants to control access to specific parts of the BAR region.
Vendors can choose whether to enable this feature for their
devices in their specific kernel modules.
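As a rough illustration of how userspace might discover the
mmap-able areas, the sketch below walks a region-info capability
chain looking for the sparse mmap capability. The struct and
macro definitions are local stand-ins that are assumed to mirror
the layout in <linux/vfio.h>; real code would include that header
and fill the buffer with the VFIO_DEVICE_GET_REGION_INFO ioctl.
find_sparse_cap() is a hypothetical helper name and is not taken
from this series.

```c
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins assumed to mirror <linux/vfio.h>; real code should
 * include that header instead of redefining these. */
#define SKETCH_REGION_INFO_FLAG_CAPS        (1u << 3)
#define SKETCH_REGION_INFO_CAP_SPARSE_MMAP  1

struct sketch_region_info {
	uint32_t argsz;
	uint32_t flags;
	uint32_t index;
	uint32_t cap_offset;	/* offset of the first capability in the buffer */
	uint64_t size;
	uint64_t offset;
};

struct sketch_cap_header {
	uint16_t id;
	uint16_t version;
	uint32_t next;		/* offset of the next capability, 0 ends the chain */
};

struct sketch_sparse_area {
	uint64_t offset;	/* start of the mmap-able area within the BAR */
	uint64_t size;
};

struct sketch_cap_sparse_mmap {
	struct sketch_cap_header header;
	uint32_t nr_areas;
	uint32_t reserved;
	struct sketch_sparse_area areas[];
};

/* Hypothetical helper: return the sparse mmap capability found in a
 * region-info buffer, or NULL if the region does not advertise one. */
static const struct sketch_cap_sparse_mmap *
find_sparse_cap(const void *buf, size_t buf_len)
{
	const struct sketch_region_info *info = buf;
	uint32_t off;

	if (buf_len < sizeof(*info) ||
	    !(info->flags & SKETCH_REGION_INFO_FLAG_CAPS))
		return NULL;

	for (off = info->cap_offset; off != 0; ) {
		const struct sketch_cap_header *hdr;

		if (off > buf_len || buf_len - off < sizeof(*hdr))
			return NULL;
		hdr = (const struct sketch_cap_header *)((const char *)buf + off);

		if (hdr->id == SKETCH_REGION_INFO_CAP_SPARSE_MMAP) {
			const struct sketch_cap_sparse_mmap *sp = (const void *)hdr;
			size_t room = buf_len - off;

			/* Make sure the header and all areas fit in the buffer. */
			if (room < sizeof(*sp) ||
			    (room - sizeof(*sp)) / sizeof(sp->areas[0]) < sp->nr_areas)
				return NULL;
			return sp;
		}
		/* Only follow forward links so a corrupt chain cannot loop. */
		if (hdr->next <= off)
			return NULL;
		off = hdr->next;
	}
	return NULL;
}
```

A caller would iterate over areas[0..nr_areas) and mmap each one;
a NULL result means the region has no sparse mmap information.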
In this patchset:
Patches 1-3 mainly introduce BAR access APIs so that a driver
can use them to access a specific BAR through the pread/pwrite
system calls when part of the BAR is not mmap-able.
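The kind of pread/pwrite-based BAR access described here could
look roughly like the sketch below. bar_read() and bar_write()
are hypothetical names, not the helpers added by these patches,
and 'region_off' is assumed to be the offset of the BAR region
within the device file descriptor as reported by the kernel.

```c
#include <errno.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read 'len' bytes at 'bar_off' inside a BAR whose
 * region starts at 'region_off' in the device file descriptor.
 * Returns 0 on success or a negative errno value. */
static int
bar_read(int dev_fd, uint64_t region_off, uint64_t bar_off,
	 void *buf, size_t len)
{
	ssize_t n = pread(dev_fd, buf, len, (off_t)(region_off + bar_off));

	if (n < 0)
		return -errno;
	/* Treat a short read as an I/O error. */
	return (size_t)n == len ? 0 : -EIO;
}

/* Hypothetical helper: write counterpart of bar_read(). */
static int
bar_write(int dev_fd, uint64_t region_off, uint64_t bar_off,
	  const void *buf, size_t len)
{
	ssize_t n = pwrite(dev_fd, buf, len, (off_t)(region_off + bar_off));

	if (n < 0)
		return -errno;
	return (size_t)n == len ? 0 : -EIO;
}
```

Each access costs a system call, so this path suits registers
that are touched rarely rather than the datapath.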
Patch 4 adds the VFIO sparse mmap support itself. One open
question is whether all sparse mmap regions should be mapped
into a contiguous virtual address range that follows the
device-specific BAR layout. In theory, there are three options
to support this feature.
Option 1: Map sparse mmap regions independently
===============================================
In this approach, we mmap each sparse mmap region one by one,
and each region can be located anywhere in the process address
space. Accessing the mmap-ed BAR is then no longer as simple as
'bar_base_address + bar_offset': the driver needs to check the
sparse mmap information to access a specific BAR register.
Patch 4 in this patchset adopts this option. The driver API
change is introduced in bus_pci_driver.h, and the corresponding
changes are made in all drivers. For now I assume that drivers
do not support this feature, so they do not check the
'is_sparse' flag and assume it to be false. Note that this does
not break any driver, and each vendor can add the related logic
when it starts to support this feature. I did it this way only
to avoid adding complexity to drivers that do not want to
support this feature.
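To make the extra lookup concrete, here is a minimal sketch of
how a driver might translate a BAR offset into an address under
this option. The struct layout and the bar_reg_addr() helper are
assumptions made for illustration; only the 'is_sparse' flag
name comes from the description above, and the actual
definitions in bus_pci_driver.h may differ.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-area record: where one mmap-able area lives. */
struct sketch_sparse_map {
	uint64_t offset;	/* start of the area within the BAR */
	uint64_t size;		/* length of the area */
	void *addr;		/* where this area was mmap-ed */
};

/* Hypothetical BAR description, for illustration only. */
struct sketch_bar {
	bool is_sparse;		/* true if only parts of the BAR are mapped */
	void *addr;		/* whole-BAR mapping when !is_sparse */
	uint64_t len;		/* BAR size */
	unsigned int nr_areas;
	struct sketch_sparse_map areas[4];
};

/* Return a pointer to 'len' bytes at 'bar_off', or NULL when that range
 * is not mmap-ed and the caller must fall back to pread/pwrite. */
static void *
bar_reg_addr(const struct sketch_bar *bar, uint64_t bar_off, size_t len)
{
	unsigned int i;

	if (len > bar->len || bar_off > bar->len - len)
		return NULL;

	/* Non-sparse BAR: the usual base + offset arithmetic. */
	if (!bar->is_sparse)
		return (char *)bar->addr + bar_off;

	/* Sparse BAR: find the area that fully contains the access. */
	for (i = 0; i < bar->nr_areas; i++) {
		const struct sketch_sparse_map *a = &bar->areas[i];

		if (bar_off >= a->offset && len <= a->size &&
		    bar_off - a->offset <= a->size - len)
			return (char *)a->addr + (bar_off - a->offset);
	}
	return NULL;
}
```

A driver that never sets 'is_sparse' takes the first branch only,
which is why existing drivers keep working unchanged.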
Option 2: Map sparse mmap regions based on device-specific BAR layout
======================================================================
In this approach, the sparse mmap regions are mapped into a
contiguous virtual address range that follows the
device-specific BAR layout. For example, suppose the BAR size is
0x4000 and only 0-0x1000 (sparse mmap region #1) and
0x3000-0x4000 (sparse mmap region #2) can be mmap-ed. Region #1
is mapped at 'base_addr' and region #2 is mapped at
'base_addr + 0x3000'. The advantage is that the driver can still
access BAR registers as 'bar_base_address + bar_offset', so no
driver API change is needed. However, the address range from
'base_addr + 0x1000' to 'base_addr + 0x3000' may need to be
reserved, which could waste address space or memory (when we use
the MAP_ANONYMOUS and MAP_PRIVATE flags to reserve this range).
Meanwhile, the driver needs to know which parts of the BAR are
mmap-ed (this is possible since the ranges are defined by the
vendor's specific kernel module).
Option 3: Support both option 1 & 2
===================================
We could define a driver flag that lets each driver choose which
approach it prefers, since each option has its own pros and cons.
Please share your comments. Thanks!
Chenbo Xia (4):
bus/pci: introduce an internal representation of PCI device
bus/pci: avoid depending on private value in kernel source
bus/pci: introduce helper for MMIO read and write
bus/pci: add VFIO sparse mmap support
drivers/baseband/acc/rte_acc100_pmd.c | 6 +-
drivers/baseband/acc/rte_vrb_pmd.c | 6 +-
.../fpga_5gnr_fec/rte_fpga_5gnr_fec.c | 6 +-
drivers/baseband/fpga_lte_fec/fpga_lte_fec.c | 6 +-
drivers/bus/pci/bsd/pci.c | 43 +-
drivers/bus/pci/bus_pci_driver.h | 24 +-
drivers/bus/pci/linux/pci.c | 91 +++-
drivers/bus/pci/linux/pci_init.h | 14 +-
drivers/bus/pci/linux/pci_uio.c | 34 +-
drivers/bus/pci/linux/pci_vfio.c | 445 ++++++++++++++----
drivers/bus/pci/pci_common.c | 57 ++-
drivers/bus/pci/pci_common_uio.c | 12 +-
drivers/bus/pci/private.h | 25 +-
drivers/bus/pci/rte_bus_pci.h | 48 ++
drivers/bus/pci/version.map | 3 +
drivers/common/cnxk/roc_dev.c | 4 +-
drivers/common/cnxk/roc_dpi.c | 2 +-
drivers/common/cnxk/roc_ml.c | 22 +-
drivers/common/qat/dev/qat_dev_gen1.c | 2 +-
drivers/common/qat/dev/qat_dev_gen4.c | 4 +-
drivers/common/sfc_efx/sfc_efx.c | 2 +-
drivers/compress/octeontx/otx_zip.c | 4 +-
drivers/crypto/ccp/ccp_dev.c | 4 +-
drivers/crypto/cnxk/cnxk_cryptodev_ops.c | 2 +-
drivers/crypto/nitrox/nitrox_device.c | 4 +-
drivers/crypto/octeontx/otx_cryptodev_ops.c | 6 +-
drivers/crypto/virtio/virtio_pci.c | 6 +-
drivers/dma/cnxk/cnxk_dmadev.c | 2 +-
drivers/dma/hisilicon/hisi_dmadev.c | 6 +-
drivers/dma/idxd/idxd_pci.c | 4 +-
drivers/dma/ioat/ioat_dmadev.c | 2 +-
drivers/event/dlb2/pf/dlb2_main.c | 16 +-
drivers/event/octeontx/ssovf_probe.c | 38 +-
drivers/event/octeontx/timvf_probe.c | 18 +-
drivers/event/skeleton/skeleton_eventdev.c | 2 +-
drivers/mempool/octeontx/octeontx_fpavf.c | 6 +-
drivers/net/ark/ark_ethdev.c | 4 +-
drivers/net/atlantic/atl_ethdev.c | 2 +-
drivers/net/avp/avp_ethdev.c | 20 +-
drivers/net/axgbe/axgbe_ethdev.c | 4 +-
drivers/net/bnx2x/bnx2x_ethdev.c | 6 +-
drivers/net/bnxt/bnxt_ethdev.c | 8 +-
drivers/net/cpfl/cpfl_ethdev.c | 4 +-
drivers/net/cxgbe/cxgbe_ethdev.c | 2 +-
drivers/net/cxgbe/cxgbe_main.c | 2 +-
drivers/net/cxgbe/cxgbevf_ethdev.c | 2 +-
drivers/net/cxgbe/cxgbevf_main.c | 2 +-
drivers/net/e1000/em_ethdev.c | 4 +-
drivers/net/e1000/igb_ethdev.c | 4 +-
drivers/net/ena/ena_ethdev.c | 4 +-
drivers/net/enetc/enetc_ethdev.c | 2 +-
drivers/net/enic/enic_main.c | 4 +-
drivers/net/fm10k/fm10k_ethdev.c | 2 +-
drivers/net/gve/gve_ethdev.c | 4 +-
drivers/net/hinic/base/hinic_pmd_hwif.c | 14 +-
drivers/net/hns3/hns3_ethdev.c | 2 +-
drivers/net/hns3/hns3_ethdev_vf.c | 2 +-
drivers/net/hns3/hns3_rxtx.c | 4 +-
drivers/net/i40e/i40e_ethdev.c | 2 +-
drivers/net/iavf/iavf_ethdev.c | 2 +-
drivers/net/ice/ice_dcf.c | 2 +-
drivers/net/ice/ice_ethdev.c | 2 +-
drivers/net/idpf/idpf_ethdev.c | 4 +-
drivers/net/igc/igc_ethdev.c | 2 +-
drivers/net/ionic/ionic_dev_pci.c | 2 +-
drivers/net/ixgbe/ixgbe_ethdev.c | 4 +-
drivers/net/liquidio/lio_ethdev.c | 4 +-
drivers/net/nfp/nfp_ethdev.c | 2 +-
drivers/net/nfp/nfp_ethdev_vf.c | 6 +-
drivers/net/nfp/nfpcore/nfp_cpp_pcie_ops.c | 4 +-
drivers/net/ngbe/ngbe_ethdev.c | 2 +-
drivers/net/octeon_ep/otx_ep_ethdev.c | 2 +-
drivers/net/octeontx/base/octeontx_pkivf.c | 6 +-
drivers/net/octeontx/base/octeontx_pkovf.c | 12 +-
drivers/net/qede/qede_main.c | 6 +-
drivers/net/sfc/sfc.c | 2 +-
drivers/net/thunderx/nicvf_ethdev.c | 2 +-
drivers/net/txgbe/txgbe_ethdev.c | 2 +-
drivers/net/txgbe/txgbe_ethdev_vf.c | 2 +-
drivers/net/virtio/virtio_pci.c | 6 +-
drivers/net/vmxnet3/vmxnet3_ethdev.c | 4 +-
drivers/raw/cnxk_bphy/cnxk_bphy.c | 10 +-
drivers/raw/cnxk_bphy/cnxk_bphy_cgx.c | 6 +-
drivers/raw/ifpga/afu_pmd_n3000.c | 4 +-
drivers/raw/ifpga/ifpga_rawdev.c | 6 +-
drivers/raw/ntb/ntb_hw_intel.c | 8 +-
drivers/vdpa/ifc/ifcvf_vdpa.c | 6 +-
drivers/vdpa/sfc/sfc_vdpa_hw.c | 2 +-
drivers/vdpa/sfc/sfc_vdpa_ops.c | 2 +-
lib/eal/include/rte_vfio.h | 1 -
90 files changed, 853 insertions(+), 352 deletions(-)
--
2.17.1