DPDK patches and discussions
From: Sunil Kumar Kori <skori@marvell.com>
To: "Xia, Chenbo" <chenbo.xia@intel.com>, "dev@dpdk.org" <dev@dpdk.org>
Cc: "techboard@dpdk.org" <techboard@dpdk.org>,
	"thomas@monjalon.net" <thomas@monjalon.net>,
	"Richardson, Bruce" <bruce.richardson@intel.com>,
	"ferruh.yigit@amd.com" <ferruh.yigit@amd.com>,
	"david.marchand@redhat.com" <david.marchand@redhat.com>,
	"Cao, Yahui" <yahui.cao@intel.com>,
	"Li, Miao" <miao.li@intel.com>
Subject: RE: [RFC 0/4] Support VFIO sparse mmap in PCI bus
Date: Mon, 8 May 2023 03:04:56 +0000
Message-ID: <CO6PR18MB3860F3368913B338B56CD432B4719@CO6PR18MB3860.namprd18.prod.outlook.com>
In-Reply-To: <SN6PR11MB3504CBB8C1B0B24D3216C8149C719@SN6PR11MB3504.namprd11.prod.outlook.com>

+1 for option 2.

Thanks & Regards
Sunil Kumar Kori

> -----Original Message-----
> From: Xia, Chenbo <chenbo.xia@intel.com>
> Sent: Monday, May 8, 2023 7:43 AM
> To: dev@dpdk.org
> Cc: Sunil Kumar Kori <skori@marvell.com>; techboard@dpdk.org;
> thomas@monjalon.net; Richardson, Bruce <bruce.richardson@intel.com>;
> ferruh.yigit@amd.com; david.marchand@redhat.com; Cao, Yahui
> <yahui.cao@intel.com>; Li, Miao <miao.li@intel.com>
> Subject: [EXT] RE: [RFC 0/4] Support VFIO sparse mmap in PCI bus
> 
> Gentle ping for comments.
> 
> After rethinking, I would personally choose option 2: it requires no driver
> API change for PMDs and, looking at the current code, it is how we already
> handle the case where the MSI-X table can't be mmap-ed.
> 
> Thanks,
> Chenbo
> 
> > -----Original Message-----
> > From: Chenbo Xia <chenbo.xia@intel.com>
> > Sent: Tuesday, April 18, 2023 1:30 PM
> > To: dev@dpdk.org
> > Cc: skori@marvell.com
> > Subject: [RFC 0/4] Support VFIO sparse mmap in PCI bus
> >
> > This series introduces a VFIO standard capability, called sparse mmap,
> > to the PCI bus. In the Linux kernel, it is defined as
> > VFIO_REGION_INFO_CAP_SPARSE_MMAP. Sparse mmap means that instead of
> > mmap-ing the whole BAR region into the DPDK process, only part of the
> > BAR region is mmap-ed, after getting the sparse mmap information from
> > the kernel.
> > For the rest of the BAR region that is not mmap-ed, the DPDK process can
> > use pread/pwrite system calls to access it. Sparse mmap is useful when
> > the kernel does not want userspace to mmap the whole BAR region, or when
> > the kernel wants to control access to a specific BAR region. Vendors can
> > choose whether to enable this feature for their devices in their
> > specific kernel modules.
> >
> > In this patchset:
> >
> > Patches 1-3 mainly introduce BAR access APIs so that a driver
> > can use them to access a specific BAR using pread/pwrite system calls
> > when part of the BAR is not mmap-able.
> >
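
The pread/pwrite path described above can be sketched roughly as follows.
This is only an illustration, not the API added by patches 1-3: the names
bar_read/bar_write and their parameters are hypothetical. It assumes
'region_off' is the offset of the BAR inside the device fd (with VFIO,
the 'offset' field returned by VFIO_DEVICE_GET_REGION_INFO) and that
off_t is 64 bits wide.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helpers, for illustration only.
 * Read or write 'len' bytes at 'bar_off' inside a BAR that starts at
 * 'region_off' in the device file descriptor 'fd'.
 * Return 0 on success, -1 on error or on a short transfer.
 */
static int
bar_read(int fd, uint64_t region_off, uint64_t bar_off, void *buf, size_t len)
{
	ssize_t n;

	assert(buf != NULL);
	n = pread(fd, buf, len, (off_t)(region_off + bar_off));
	return (n == (ssize_t)len) ? 0 : -1;
}

static int
bar_write(int fd, uint64_t region_off, uint64_t bar_off,
	  const void *buf, size_t len)
{
	ssize_t n;

	assert(buf != NULL);
	n = pwrite(fd, buf, len, (off_t)(region_off + bar_off));
	return (n == (ssize_t)len) ? 0 : -1;
}
```

A driver would call such helpers only for BAR offsets that are not
mmap-ed; mmap-ed offsets can keep using plain loads and stores.
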
> > Patch 4 finally adds the VFIO sparse mmap support. One open question is
> > whether all sparse mmap regions should be mapped to a contiguous virtual
> > address region that follows the device-specific BAR layout. In
> > theory, there are three options to support this feature.
> >
> > Option 1: Map sparse mmap regions independently
> > ===============================================
> > In this approach, we mmap each sparse mmap region one by one and each
> > region can be located anywhere in the process address space. However,
> > accessing the mmap-ed BAR will not be as easy as 'bar_base_address +
> > bar_offset': the driver needs to check the sparse mmap information to
> > access a specific BAR register.
> >
> > Patch 4 in this patchset adopts this option. A driver API change is
> > introduced in bus_pci_driver.h. Corresponding changes in all drivers
> > are also made. For now I am assuming drivers do not support this
> > feature, so they do not check the 'is_sparse' flag but assume it to
> > be false. Note that this will not break any driver, and each vendor can
> > add the related logic when they start to support this feature. I did it
> > this way only because I don't want to introduce complexity to drivers
> > that do not want to support this feature.
> >
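
To make the cost of option 1 concrete, here is a minimal sketch of the
lookup a driver would need before each register access. The types and
the function name are hypothetical and are not the structures used in
patch 4; the only assumption is that each sparse area records its BAR
offset, its size and the address it was mmap-ed at.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical types for illustration; not the ones from patch 4. */
struct sparse_area {
	uint64_t offset; /* start of the area inside the BAR */
	uint64_t size;   /* length of the area in bytes */
	void *addr;      /* where this area was mmap-ed; may be anywhere */
};

struct sparse_bar {
	const struct sparse_area *areas;
	unsigned int nr_areas;
};

/*
 * Translate a BAR offset into a virtual address.
 * Returns NULL when [bar_off, bar_off + len) does not lie entirely
 * inside one mmap-ed area; the caller would then fall back to
 * pread/pwrite.
 */
static void *
sparse_bar_addr(const struct sparse_bar *bar, uint64_t bar_off, uint64_t len)
{
	unsigned int i;

	assert(bar != NULL);
	for (i = 0; i < bar->nr_areas; i++) {
		const struct sparse_area *a = &bar->areas[i];

		/* Written this way to avoid overflow in bar_off + len. */
		if (bar_off >= a->offset && len <= a->size &&
		    bar_off - a->offset <= a->size - len)
			return (uint8_t *)a->addr + (bar_off - a->offset);
	}
	return NULL;
}
```

The linear scan is cheap for a handful of areas, but it sits on every
register access, which is the extra driver complexity mentioned above.
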
> > Option 2: Map sparse mmap regions based on device-specific BAR layout
> > =====================================================================
> > In this approach, the sparse mmap regions are mapped to a contiguous
> > virtual address region that follows the device-specific BAR layout.
> > For example, the BAR size is 0x4000 and only 0-0x1000 (sparse mmap
> > region #1) and 0x3000-0x4000 (sparse mmap region #2) can be mmap-ed.
> > Region #1 will be mapped at 'base_addr' and region #2 will be mapped
> > at 'base_addr + 0x3000'. The advantage is that the driver can still
> > access all BAR registers as 'bar_base_address + bar_offset' and we
> > don't need to introduce any driver API change.
> > However, the address space range 'base_addr + 0x1000' to 'base_addr +
> > 0x3000' may need to be reserved, so it could waste address space or
> > memory (when we use the MAP_ANONYMOUS and MAP_PRIVATE flags to
> > reserve this range). Meanwhile, the driver needs to know which part of
> > the BAR is mmap-ed (this is possible since the range is defined by the
> > vendor's specific kernel module).
> >
> > Option 3: Support both option 1 & 2
> > ===================================
> > We could define a driver flag to let the driver choose which way it
> > prefers, since each option has its own pros and cons.
> >
> > Please share your comments, Thanks!
> >
> >
> > Chenbo Xia (4):
> >   bus/pci: introduce an internal representation of PCI device
> >   bus/pci: avoid depending on private value in kernel source
> >   bus/pci: introduce helper for MMIO read and write
> >   bus/pci: add VFIO sparse mmap support
> >
> >  drivers/baseband/acc/rte_acc100_pmd.c         |   6 +-
> >  drivers/baseband/acc/rte_vrb_pmd.c            |   6 +-
> >  .../fpga_5gnr_fec/rte_fpga_5gnr_fec.c         |   6 +-
> >  drivers/baseband/fpga_lte_fec/fpga_lte_fec.c  |   6 +-
> >  drivers/bus/pci/bsd/pci.c                     |  43 +-
> >  drivers/bus/pci/bus_pci_driver.h              |  24 +-
> >  drivers/bus/pci/linux/pci.c                   |  91 +++-
> >  drivers/bus/pci/linux/pci_init.h              |  14 +-
> >  drivers/bus/pci/linux/pci_uio.c               |  34 +-
> >  drivers/bus/pci/linux/pci_vfio.c              | 445 ++++++++++++++----
> >  drivers/bus/pci/pci_common.c                  |  57 ++-
> >  drivers/bus/pci/pci_common_uio.c              |  12 +-
> >  drivers/bus/pci/private.h                     |  25 +-
> >  drivers/bus/pci/rte_bus_pci.h                 |  48 ++
> >  drivers/bus/pci/version.map                   |   3 +
> >  drivers/common/cnxk/roc_dev.c                 |   4 +-
> >  drivers/common/cnxk/roc_dpi.c                 |   2 +-
> >  drivers/common/cnxk/roc_ml.c                  |  22 +-
> >  drivers/common/qat/dev/qat_dev_gen1.c         |   2 +-
> >  drivers/common/qat/dev/qat_dev_gen4.c         |   4 +-
> >  drivers/common/sfc_efx/sfc_efx.c              |   2 +-
> >  drivers/compress/octeontx/otx_zip.c           |   4 +-
> >  drivers/crypto/ccp/ccp_dev.c                  |   4 +-
> >  drivers/crypto/cnxk/cnxk_cryptodev_ops.c      |   2 +-
> >  drivers/crypto/nitrox/nitrox_device.c         |   4 +-
> >  drivers/crypto/octeontx/otx_cryptodev_ops.c   |   6 +-
> >  drivers/crypto/virtio/virtio_pci.c            |   6 +-
> >  drivers/dma/cnxk/cnxk_dmadev.c                |   2 +-
> >  drivers/dma/hisilicon/hisi_dmadev.c           |   6 +-
> >  drivers/dma/idxd/idxd_pci.c                   |   4 +-
> >  drivers/dma/ioat/ioat_dmadev.c                |   2 +-
> >  drivers/event/dlb2/pf/dlb2_main.c             |  16 +-
> >  drivers/event/octeontx/ssovf_probe.c          |  38 +-
> >  drivers/event/octeontx/timvf_probe.c          |  18 +-
> >  drivers/event/skeleton/skeleton_eventdev.c    |   2 +-
> >  drivers/mempool/octeontx/octeontx_fpavf.c     |   6 +-
> >  drivers/net/ark/ark_ethdev.c                  |   4 +-
> >  drivers/net/atlantic/atl_ethdev.c             |   2 +-
> >  drivers/net/avp/avp_ethdev.c                  |  20 +-
> >  drivers/net/axgbe/axgbe_ethdev.c              |   4 +-
> >  drivers/net/bnx2x/bnx2x_ethdev.c              |   6 +-
> >  drivers/net/bnxt/bnxt_ethdev.c                |   8 +-
> >  drivers/net/cpfl/cpfl_ethdev.c                |   4 +-
> >  drivers/net/cxgbe/cxgbe_ethdev.c              |   2 +-
> >  drivers/net/cxgbe/cxgbe_main.c                |   2 +-
> >  drivers/net/cxgbe/cxgbevf_ethdev.c            |   2 +-
> >  drivers/net/cxgbe/cxgbevf_main.c              |   2 +-
> >  drivers/net/e1000/em_ethdev.c                 |   4 +-
> >  drivers/net/e1000/igb_ethdev.c                |   4 +-
> >  drivers/net/ena/ena_ethdev.c                  |   4 +-
> >  drivers/net/enetc/enetc_ethdev.c              |   2 +-
> >  drivers/net/enic/enic_main.c                  |   4 +-
> >  drivers/net/fm10k/fm10k_ethdev.c              |   2 +-
> >  drivers/net/gve/gve_ethdev.c                  |   4 +-
> >  drivers/net/hinic/base/hinic_pmd_hwif.c       |  14 +-
> >  drivers/net/hns3/hns3_ethdev.c                |   2 +-
> >  drivers/net/hns3/hns3_ethdev_vf.c             |   2 +-
> >  drivers/net/hns3/hns3_rxtx.c                  |   4 +-
> >  drivers/net/i40e/i40e_ethdev.c                |   2 +-
> >  drivers/net/iavf/iavf_ethdev.c                |   2 +-
> >  drivers/net/ice/ice_dcf.c                     |   2 +-
> >  drivers/net/ice/ice_ethdev.c                  |   2 +-
> >  drivers/net/idpf/idpf_ethdev.c                |   4 +-
> >  drivers/net/igc/igc_ethdev.c                  |   2 +-
> >  drivers/net/ionic/ionic_dev_pci.c             |   2 +-
> >  drivers/net/ixgbe/ixgbe_ethdev.c              |   4 +-
> >  drivers/net/liquidio/lio_ethdev.c             |   4 +-
> >  drivers/net/nfp/nfp_ethdev.c                  |   2 +-
> >  drivers/net/nfp/nfp_ethdev_vf.c               |   6 +-
> >  drivers/net/nfp/nfpcore/nfp_cpp_pcie_ops.c    |   4 +-
> >  drivers/net/ngbe/ngbe_ethdev.c                |   2 +-
> >  drivers/net/octeon_ep/otx_ep_ethdev.c         |   2 +-
> >  drivers/net/octeontx/base/octeontx_pkivf.c    |   6 +-
> >  drivers/net/octeontx/base/octeontx_pkovf.c    |  12 +-
> >  drivers/net/qede/qede_main.c                  |   6 +-
> >  drivers/net/sfc/sfc.c                         |   2 +-
> >  drivers/net/thunderx/nicvf_ethdev.c           |   2 +-
> >  drivers/net/txgbe/txgbe_ethdev.c              |   2 +-
> >  drivers/net/txgbe/txgbe_ethdev_vf.c           |   2 +-
> >  drivers/net/virtio/virtio_pci.c               |   6 +-
> >  drivers/net/vmxnet3/vmxnet3_ethdev.c          |   4 +-
> >  drivers/raw/cnxk_bphy/cnxk_bphy.c             |  10 +-
> >  drivers/raw/cnxk_bphy/cnxk_bphy_cgx.c         |   6 +-
> >  drivers/raw/ifpga/afu_pmd_n3000.c             |   4 +-
> >  drivers/raw/ifpga/ifpga_rawdev.c              |   6 +-
> >  drivers/raw/ntb/ntb_hw_intel.c                |   8 +-
> >  drivers/vdpa/ifc/ifcvf_vdpa.c                 |   6 +-
> >  drivers/vdpa/sfc/sfc_vdpa_hw.c                |   2 +-
> >  drivers/vdpa/sfc/sfc_vdpa_ops.c               |   2 +-
> >  lib/eal/include/rte_vfio.h                    |   1 -
> >  90 files changed, 853 insertions(+), 352 deletions(-)
> >
> > --
> > 2.17.1

