DPDK patches and discussions
From: "Xia, Chenbo" <chenbo.xia@intel.com>
To: "Li, Miao" <miao.li@intel.com>, "dev@dpdk.org" <dev@dpdk.org>
Cc: "skori@marvell.com" <skori@marvell.com>,
	"thomas@monjalon.net" <thomas@monjalon.net>,
	"david.marchand@redhat.com" <david.marchand@redhat.com>,
	"ferruh.yigit@amd.com" <ferruh.yigit@amd.com>,
	"Cao, Yahui" <yahui.cao@intel.com>,
	"Burakov, Anatoly" <anatoly.burakov@intel.com>
Subject: RE: [PATCH v3 4/4] bus/pci: add VFIO sparse mmap support
Date: Mon, 29 May 2023 09:25:03 +0000
Message-ID: <SN6PR11MB3504BDBE330C16E9C14C362E9C4A9@SN6PR11MB3504.namprd11.prod.outlook.com>
In-Reply-To: <20230525163116.682000-5-miao.li@intel.com>

> -----Original Message-----
> From: Li, Miao <miao.li@intel.com>
> Sent: Friday, May 26, 2023 12:31 AM
> To: dev@dpdk.org
> Cc: skori@marvell.com; thomas@monjalon.net; david.marchand@redhat.com;
> ferruh.yigit@amd.com; Xia, Chenbo <chenbo.xia@intel.com>; Cao, Yahui
> <yahui.cao@intel.com>; Burakov, Anatoly <anatoly.burakov@intel.com>
> Subject: [PATCH v3 4/4] bus/pci: add VFIO sparse mmap support
> 
> This patch adds sparse mmap support in PCI bus. Sparse mmap is a
> capability defined in VFIO which allows multiple mmap areas in one
> VFIO region.
> 
> In this patch, the sparse mmap regions are mapped to one continuous
> virtual address region that follows device-specific BAR layout. So,
> driver can still access all mapped sparse mmap regions by using
> 'bar_base_address + bar_offset'.
> 
> Signed-off-by: Miao Li <miao.li@intel.com>
> Signed-off-by: Chenbo Xia <chenbo.xia@intel.com>
> ---
>  drivers/bus/pci/linux/pci_vfio.c | 104 +++++++++++++++++++++++++++----
>  drivers/bus/pci/private.h        |   2 +
>  2 files changed, 94 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/bus/pci/linux/pci_vfio.c
> b/drivers/bus/pci/linux/pci_vfio.c
> index 24b0795fbd..c411909976 100644
> --- a/drivers/bus/pci/linux/pci_vfio.c
> +++ b/drivers/bus/pci/linux/pci_vfio.c
> @@ -673,6 +673,54 @@ pci_vfio_mmap_bar(int vfio_dev_fd, struct
> mapped_pci_resource *vfio_res,
>  	return 0;
>  }
> 
> +static int
> +pci_vfio_sparse_mmap_bar(int vfio_dev_fd, struct mapped_pci_resource
> *vfio_res,
> +		int bar_index, int additional_flags)
> +{
> +	struct pci_map *bar = &vfio_res->maps[bar_index];
> +	struct vfio_region_sparse_mmap_area *sparse;
> +	void *bar_addr;
> +	uint32_t i;
> +
> +	if (bar->size == 0) {
> +		RTE_LOG(DEBUG, EAL, "Bar size is 0, skip BAR%d\n", bar_index);
> +		return 0;
> +	}
> +
> +	/* reserve the address using an inaccessible mapping */
> +	bar_addr = mmap(bar->addr, bar->size, 0, MAP_PRIVATE |
> +			MAP_ANONYMOUS | additional_flags, -1, 0);
> +	if (bar_addr != MAP_FAILED) {
> +		void *map_addr = NULL;
> +		for (i = 0; i < bar->nr_areas; i++) {
> +			sparse = &bar->areas[i];
> +			if (sparse->size) {
> +				void *addr = RTE_PTR_ADD(bar_addr,
> (uintptr_t)sparse->offset);
> +				map_addr = pci_map_resource(addr, vfio_dev_fd,
> +					bar->offset + sparse->offset, sparse->size,
> +					RTE_MAP_FORCE_ADDRESS);
> +				if (map_addr == NULL) {
> +					munmap(bar_addr, bar->size);
> +					RTE_LOG(ERR, EAL, "Failed to map pci BAR%d\n",
> +						bar_index);
> +					goto err_map;
> +				}
> +			}
> +		}
> +	} else {
> +		RTE_LOG(ERR, EAL, "Failed to create inaccessible mapping for BAR%d\n",
> +			bar_index);
> +		goto err_map;
> +	}
> +
> +	bar->addr = bar_addr;
> +	return 0;
> +
> +err_map:
> +	bar->nr_areas = 0;
> +	return -1;
> +}
> +
>  /*
>   * region info may contain capability headers, so we need to keep
> reallocating
>   * the memory until we match allocated memory size with argsz.
> @@ -875,6 +923,8 @@ pci_vfio_map_resource_primary(struct rte_pci_device
> *dev)
> 
>  	for (i = 0; i < vfio_res->nb_maps; i++) {
>  		void *bar_addr;
> +		struct vfio_info_cap_header *hdr;
> +		struct vfio_region_info_cap_sparse_mmap *sparse;
> 
>  		ret = pci_vfio_get_region_info(vfio_dev_fd, &reg, i);
>  		if (ret < 0) {
> @@ -920,12 +970,33 @@ pci_vfio_map_resource_primary(struct rte_pci_device
> *dev)
>  		maps[i].size = reg->size;
>  		maps[i].path = NULL; /* vfio doesn't have per-resource paths
> */
> 
> -		ret = pci_vfio_mmap_bar(vfio_dev_fd, vfio_res, i, 0);
> -		if (ret < 0) {
> -			RTE_LOG(ERR, EAL, "%s mapping BAR%i failed: %s\n",
> -					pci_addr, i, strerror(errno));
> -			free(reg);
> -			goto err_vfio_res;
> +		hdr = pci_vfio_info_cap(reg, VFIO_REGION_INFO_CAP_SPARSE_MMAP);
> +
> +		if (hdr != NULL) {
> +			sparse = container_of(hdr,
> +				struct vfio_region_info_cap_sparse_mmap, header);
> +			if (sparse->nr_areas > 0) {
> +				maps[i].nr_areas = sparse->nr_areas;
> +				maps[i].areas = sparse->areas;

I just noticed that this is wrong: the memory that the pointer 'sparse' points
to (it lies inside 'reg') will be freed at the end. maps[i].areas needs to be
allocated by rte_zmalloc and freed correctly. Otherwise it could lead to a
segfault in the secondary process when it tries to access maps[i].areas.

Will fix this in v4.
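
For illustration only, here is a minimal sketch of the kind of deep copy
described above; it is not the v4 code. The struct names 'sparse_area' and
'map_areas' and both helper functions are hypothetical stand-ins for
struct vfio_region_sparse_mmap_area and for the two new pci_map fields, and
calloc()/free() stand in for rte_zmalloc()/rte_free() so that the sketch
builds without DPDK. Unlike rte_zmalloc, calloc memory is private to one
process, so the stand-in only demonstrates the ownership fix, not
visibility from the secondary process.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct vfio_region_sparse_mmap_area. */
struct sparse_area {
	uint64_t offset;
	uint64_t size;
};

/* Hypothetical stand-in for the nr_areas/areas fields added to pci_map. */
struct map_areas {
	uint32_t nr_areas;
	struct sparse_area *areas;
};

/*
 * Copy nr_areas entries out of the temporary region-info buffer (src)
 * into memory owned by dst, so dst stays valid after the buffer is freed.
 * ASSUMPTION: calloc() stands in for rte_zmalloc() here.
 * Returns 0 on success, -1 on allocation failure.
 */
int
map_areas_copy(struct map_areas *dst, const struct sparse_area *src,
		uint32_t nr_areas)
{
	assert(dst != NULL);

	dst->nr_areas = 0;
	dst->areas = NULL;
	if (nr_areas == 0)
		return 0;

	assert(src != NULL);
	dst->areas = calloc(nr_areas, sizeof(*dst->areas));
	if (dst->areas == NULL)
		return -1;

	memcpy(dst->areas, src, nr_areas * sizeof(*dst->areas));
	dst->nr_areas = nr_areas;
	return 0;
}

/* Release the copy; free() stands in for rte_free(). */
void
map_areas_free(struct map_areas *m)
{
	free(m->areas);
	m->areas = NULL;
	m->nr_areas = 0;
}
```

With such a copy, freeing the region-info buffer no longer invalidates the
stored areas, and every error and cleanup path has to release the copy
exactly once.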

Thanks,
Chenbo

> +			}
> +		}
> +
> +		if (maps[i].nr_areas > 0) {
> +			ret = pci_vfio_sparse_mmap_bar(vfio_dev_fd, vfio_res, i,
> 0);
> +			if (ret < 0) {
> +				RTE_LOG(ERR, EAL, "%s sparse mapping BAR%i failed: %s\n",
> +						pci_addr, i, strerror(errno));
> +				free(reg);
> +				goto err_vfio_res;
> +			}
> +		} else {
> +			ret = pci_vfio_mmap_bar(vfio_dev_fd, vfio_res, i, 0);
> +			if (ret < 0) {
> +				RTE_LOG(ERR, EAL, "%s mapping BAR%i failed: %s\n",
> +						pci_addr, i, strerror(errno));
> +				free(reg);
> +				goto err_vfio_res;
> +			}
>  		}
> 
>  		dev->mem_resource[i].addr = maps[i].addr;
> @@ -1008,11 +1079,20 @@ pci_vfio_map_resource_secondary(struct
> rte_pci_device *dev)
>  	maps = vfio_res->maps;
> 
>  	for (i = 0; i < vfio_res->nb_maps; i++) {
> -		ret = pci_vfio_mmap_bar(vfio_dev_fd, vfio_res, i, MAP_FIXED);
> -		if (ret < 0) {
> -			RTE_LOG(ERR, EAL, "%s mapping BAR%i failed: %s\n",
> -					pci_addr, i, strerror(errno));
> -			goto err_vfio_dev_fd;
> +		if (maps[i].nr_areas > 0) {
> +			ret = pci_vfio_sparse_mmap_bar(vfio_dev_fd, vfio_res, i,
> 0);
> +			if (ret < 0) {
> +				RTE_LOG(ERR, EAL, "%s sparse mapping BAR%i failed: %s\n",
> +						pci_addr, i, strerror(errno));
> +				goto err_vfio_dev_fd;
> +			}
> +		} else {
> +			ret = pci_vfio_mmap_bar(vfio_dev_fd, vfio_res, i, 0);
> +			if (ret < 0) {
> +				RTE_LOG(ERR, EAL, "%s mapping BAR%i failed: %s\n",
> +						pci_addr, i, strerror(errno));
> +				goto err_vfio_dev_fd;
> +			}
>  		}
> 
>  		dev->mem_resource[i].addr = maps[i].addr;
> @@ -1062,7 +1142,7 @@ find_and_unmap_vfio_resource(struct
> mapped_pci_res_list *vfio_res_list,
>  		break;
>  	}
> 
> -	if  (vfio_res == NULL)
> +	if (vfio_res == NULL)
>  		return vfio_res;
> 
>  	RTE_LOG(INFO, EAL, "Releasing PCI mapped resource for %s\n",
> diff --git a/drivers/bus/pci/private.h b/drivers/bus/pci/private.h
> index 2d6991ccb7..8b0ce73533 100644
> --- a/drivers/bus/pci/private.h
> +++ b/drivers/bus/pci/private.h
> @@ -121,6 +121,8 @@ struct pci_map {
>  	uint64_t offset;
>  	uint64_t size;
>  	uint64_t phaddr;
> +	uint32_t nr_areas;
> +	struct vfio_region_sparse_mmap_area *areas;
>  };
> 
>  struct pci_msix_table {
> --
> 2.25.1

