DPDK patches and discussions
From: "Burakov, Anatoly" <anatoly.burakov@intel.com>
To: Xiao Wang <xiao.w.wang@intel.com>, ferruh.yigit@intel.com
Cc: dev@dpdk.org, maxime.coquelin@redhat.com, zhihong.wang@intel.com,
	tiwei.bie@intel.com, jianfeng.tan@intel.com,
	cunming.liang@intel.com, dan.daly@intel.com, thomas@monjalon.net,
	gaetan.rivet@6wind.com, hemant.agrawal@nxp.com,
	Junjie Chen <junjie.j.chen@intel.com>
Subject: Re: [dpdk-dev] [PATCH v6 1/4] eal/vfio: add multiple container support
Date: Thu, 12 Apr 2018 15:03:38 +0100
Message-ID: <974c9cd0-87c4-6ab1-0787-9278a7379fda@intel.com>
In-Reply-To: <20180412071956.66178-2-xiao.w.wang@intel.com>

On 12-Apr-18 8:19 AM, Xiao Wang wrote:
> Currently eal vfio framework binds vfio group fd to the default
> container fd during rte_vfio_setup_device, while in some cases,
> e.g. vDPA (vhost data path acceleration), we want to put vfio group
> to a separate container and program IOMMU via this container.
> 
> This patch adds some APIs to support container creating and device
> binding with a container.
> 
> A driver could use "rte_vfio_create_container" helper to create a
> new container from eal, use "rte_vfio_bind_group" to bind a device
> to the newly created container.
> 
> During rte_vfio_setup_device, the container bound with the device
> will be used for IOMMU setup.
> 
> Signed-off-by: Junjie Chen <junjie.j.chen@intel.com>
> Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
> ---

Apologies for the late review. Some comments below.

<...>

>   
> +struct rte_memseg;
> +
>   /**
>    * Setup vfio_cfg for the device identified by its address.
>    * It discovers the configured I/O MMU groups or sets a new one for the device.
> @@ -131,6 +133,117 @@ rte_vfio_clear_group(int vfio_group_fd);
>   }
>   #endif
>   

<...>

> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change, or be removed, without prior notice
> + *
> + * Perform dma mapping for devices in a container.
> + *
> + * @param container_fd
> + *   the specified container fd
> + *
> + * @param dma_type
> + *   the dma map type
> + *
> + * @param ms
> + *   the dma address region to map
> + *
> + * @return
> + *    0 if successful
> + *   <0 if failed
> + */
> +int __rte_experimental
> +rte_vfio_dma_map(int container_fd, int dma_type, const struct rte_memseg *ms);
> +

First of all, why memseg instead of va/iova/len? This ties the API to
the internals of DPDK's memory representation for no good reason. Not
all memory comes in memsegs, so this makes the API unnecessarily
specific to DPDK memory.
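
To illustrate the point, here is a minimal sketch of a map call that
takes va/iova/len directly. The names sketch_dma_region and
sketch_container_dma_map are hypothetical, not part of the patch or of
EAL, and the body only validates and records the request instead of
issuing a VFIO ioctl.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical description of one DMA mapping request; not an EAL type. */
struct sketch_dma_region {
	uint64_t vaddr; /* process virtual address of the region */
	uint64_t iova;  /* IO virtual address the device will use */
	uint64_t len;   /* length of the region in bytes */
};

/*
 * Hypothetical map call taking plain va/iova/len, so the caller can
 * describe any memory, not only memory backed by a struct rte_memseg.
 * This sketch only validates and records the request; a real
 * implementation would program the IOMMU through the container here.
 */
static int
sketch_container_dma_map(int container_fd, uint64_t vaddr, uint64_t iova,
		uint64_t len, struct sketch_dma_region *out)
{
	if (container_fd < 0 || len == 0 || out == NULL)
		return -1;
	/* reject ranges that wrap around the 64-bit address space */
	if (vaddr + len < vaddr || iova + len < iova)
		return -1;

	out->vaddr = vaddr;
	out->iova = iova;
	out->len = len;
	return 0;
}
```

A caller holding a memseg can still pass its address, IOVA and length
to such a function, while a caller with externally allocated memory is
not forced to fabricate a memseg first.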

Also, why provide the DMA type? There's already a VFIO type pointer in
vfio_config. You can set this pointer for every newly created container,
so the user wouldn't have to care about the IOMMU type. Is it not
possible to figure out the DMA type from within EAL VFIO? If not, maybe
provide an API to do so, e.g. rte_vfio_container_set_dma_type()?
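
A rough sketch of the idea, assuming the container config records its
IOMMU type once and the map call looks it up. All sketch_* names are
hypothetical stand-ins, not the actual vfio_config layout, and the
handler does not touch VFIO.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical per-IOMMU-type operations table. */
struct sketch_iommu_type {
	const char *name;
	int (*dma_map)(int container_fd, uint64_t vaddr, uint64_t iova,
			uint64_t len);
};

/* Hypothetical container config: the IOMMU type is stored per container. */
struct sketch_container {
	int container_fd;
	const struct sketch_iommu_type *iommu_type;
};

/* Stand-in map handler; a real one would call into VFIO. */
static int
sketch_type1_map(int container_fd, uint64_t vaddr, uint64_t iova,
		uint64_t len)
{
	(void)vaddr;
	(void)iova;
	return (container_fd >= 0 && len > 0) ? 0 : -1;
}

static const struct sketch_iommu_type sketch_type1 = {
	.name = "Type 1",
	.dma_map = sketch_type1_map,
};

/* The caller passes no DMA type: it is taken from the container. */
static int
sketch_dma_map(const struct sketch_container *c, uint64_t vaddr,
		uint64_t iova, uint64_t len)
{
	if (c == NULL || c->iommu_type == NULL)
		return -1; /* IOMMU type not set for this container yet */
	return c->iommu_type->dma_map(c->container_fd, vaddr, iova, len);
}
```

With this shape the dma_type argument disappears from the public
map/unmap calls, and only the code that sets up the container needs to
know which IOMMU type is in use.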

This will also need to be rebased on top of latest HEAD, because a
similar DMA map/unmap API has already been added there, only without
the container parameter. Perhaps rename these new functions to
rte_vfio_container_(create|destroy|dma_map|dma_unmap)?

> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change, or be removed, without prior notice
> + *
> + * Perform dma unmapping for devices in a container.
> + *
> + * @param container_fd
> + *   the specified container fd
> + *
> + * @param dma_type
> + *    the dma map type
> + *
> + * @param ms
> + *   the dma address region to unmap
> + *
> + * @return
> + *    0 if successful
> + *   <0 if failed
> + */
> +int __rte_experimental
> +rte_vfio_dma_unmap(int container_fd, int dma_type, const struct rte_memseg *ms);
> +
>   #endif /* VFIO_PRESENT */
>   

<...>

> @@ -75,8 +53,8 @@ vfio_get_group_fd(int iommu_group_no)
>   		if (vfio_group_fd < 0) {
>   			/* if file not found, it's not an error */
>   			if (errno != ENOENT) {
> -				RTE_LOG(ERR, EAL, "Cannot open %s: %s\n", filename,
> -						strerror(errno));
> +				RTE_LOG(ERR, EAL, "Cannot open %s: %s\n",
> +					filename, strerror(errno));

This looks like an unintended change.

>   				return -1;
>   			}
>   
> @@ -86,8 +64,10 @@ vfio_get_group_fd(int iommu_group_no)
>   			vfio_group_fd = open(filename, O_RDWR);
>   			if (vfio_group_fd < 0) {
>   				if (errno != ENOENT) {
> -					RTE_LOG(ERR, EAL, "Cannot open %s: %s\n", filename,
> -							strerror(errno));
> +					RTE_LOG(ERR, EAL,
> +						"Cannot open %s: %s\n",
> +						filename,
> +						strerror(errno));

This looks like an unintended change.

>   					return -1;
>   				}
>   				return 0;
> @@ -95,21 +75,19 @@ vfio_get_group_fd(int iommu_group_no)
>   			/* noiommu group found */
>   		}
>   
> -		cur_grp->group_no = iommu_group_no;
> -		cur_grp->fd = vfio_group_fd;
> -		vfio_cfg.vfio_active_groups++;
>   		return vfio_group_fd;
>   	}
> -	/* if we're in a secondary process, request group fd from the primary
> +	/*
> +	 * if we're in a secondary process, request group fd from the primary
>   	 * process via our socket
>   	 */

This looks like an unintended change.

>   	else {
> -		int socket_fd, ret;
> -
> -		socket_fd = vfio_mp_sync_connect_to_primary();
> +		int ret;
> +		int socket_fd = vfio_mp_sync_connect_to_primary();
>   
>   		if (socket_fd < 0) {
> -			RTE_LOG(ERR, EAL, "  cannot connect to primary process!\n");
> +			RTE_LOG(ERR, EAL,
> +				"  cannot connect to primary process!\n");

This looks like an unintended change.

>   			return -1;
>   		}
>   		if (vfio_mp_sync_send_request(socket_fd, SOCKET_REQ_GROUP) < 0) {
> @@ -122,6 +100,7 @@ vfio_get_group_fd(int iommu_group_no)
>   			close(socket_fd);
>   			return -1;
>   		}
> +
>   		ret = vfio_mp_sync_receive_request(socket_fd);

This looks like an unintended change.

(hint: "git revert -n HEAD && git add -p" is your friend :) )

>   		switch (ret) {
>   		case SOCKET_NO_FD:
> @@ -132,9 +111,6 @@ vfio_get_group_fd(int iommu_group_no)
>   			/* if we got the fd, store it and return it */
>   			if (vfio_group_fd > 0) {
>   				close(socket_fd);
> -				cur_grp->group_no = iommu_group_no;
> -				cur_grp->fd = vfio_group_fd;
> -				vfio_cfg.vfio_active_groups++;
>   				return vfio_group_fd;
>   			}
>   			/* fall-through on error */
> @@ -147,70 +123,349 @@ vfio_get_group_fd(int iommu_group_no)
>   	return -1;

<...>

> +int __rte_experimental
> +rte_vfio_create_container(void)
> +{
> +	struct vfio_config *vfio_cfg;
> +	int i;
> +
> +	/* Find an empty slot to store new vfio config */
> +	for (i = 1; i < VFIO_MAX_CONTAINERS; i++) {
> +		if (vfio_cfgs[i] == NULL)
> +			break;
> +	}
> +
> +	if (i == VFIO_MAX_CONTAINERS) {
> +		RTE_LOG(ERR, EAL, "exceed max vfio container limit\n");
> +		return -1;
> +	}
> +
> +	vfio_cfgs[i] = rte_zmalloc("vfio_container", sizeof(struct vfio_config),
> +		RTE_CACHE_LINE_SIZE);
> +	if (vfio_cfgs[i] == NULL)
> +		return -ENOMEM;

Is there a specific reason why 1) dynamic allocation is used (as opposed
to just storing a static array), and 2) DPDK memory allocation is used?
This seems like an unnecessary complication.

Even if you were to decide to allocate memory instead of having a static
array, you'll have to register for rte_eal_cleanup() to delete any
allocated containers on DPDK exit. But, as I said, I think it would be
better to keep it as a static array.
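
Something along these lines is what I have in mind. This is only a
sketch: the sketch_* names, the in_use flag and the limit of 4 are
hypothetical, and the container fd is passed in instead of being opened
here. Starting the search at slot 1 mirrors the loop in the patch.

```c
#include <stddef.h>

#define SKETCH_MAX_CONTAINERS 4 /* hypothetical limit for this sketch */

/* Hypothetical, trimmed-down container config. */
struct sketch_vfio_config {
	int in_use;        /* non-zero while the slot is allocated */
	int container_fd;  /* meaningful only while in_use is set */
	int active_groups;
};

/* Static storage: nothing to allocate, nothing to free on exit. */
static struct sketch_vfio_config sketch_cfgs[SKETCH_MAX_CONTAINERS];

/* Claim a free slot; slot 0 is left alone, as in the patch's loop. */
static int
sketch_container_create(int container_fd)
{
	int i;

	if (container_fd < 0)
		return -1;
	for (i = 1; i < SKETCH_MAX_CONTAINERS; i++) {
		if (!sketch_cfgs[i].in_use) {
			sketch_cfgs[i].in_use = 1;
			sketch_cfgs[i].container_fd = container_fd;
			sketch_cfgs[i].active_groups = 0;
			return i;
		}
	}
	return -1; /* all slots taken */
}

/* Release a slot by marking it free; no rte_free() involved. */
static int
sketch_container_destroy(int slot)
{
	if (slot < 1 || slot >= SKETCH_MAX_CONTAINERS ||
			!sketch_cfgs[slot].in_use)
		return -1;
	sketch_cfgs[slot].in_use = 0;
	sketch_cfgs[slot].container_fd = -1;
	return 0;
}
```

The -ENOMEM path and the rte_free() on the error path both go away,
and there is no heap state left to clean up on exit.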

> +
> +	RTE_LOG(INFO, EAL, "alloc container at slot %d\n", i);
> +	vfio_cfg = vfio_cfgs[i];
> +	vfio_cfg->vfio_active_groups = 0;
> +	vfio_cfg->vfio_container_fd = vfio_get_container_fd();
> +
> +	if (vfio_cfg->vfio_container_fd < 0) {
> +		rte_free(vfio_cfgs[i]);
> +		vfio_cfgs[i] = NULL;
> +		return -1;
> +	}
> +
> +	for (i = 0; i < VFIO_MAX_GROUPS; i++) {
> +		vfio_cfg->vfio_groups[i].group_no = -1;
> +		vfio_cfg->vfio_groups[i].fd = -1;
> +		vfio_cfg->vfio_groups[i].devices = 0;
> +	}

<...>

> @@ -665,41 +931,80 @@ vfio_get_group_no(const char *sysfs_base,
>   }
>   
>   static int
> -vfio_type1_dma_map(int vfio_container_fd)
> +do_vfio_type1_dma_map(int vfio_container_fd, const struct rte_memseg *ms)

<...>


> +static int
> +do_vfio_type1_dma_unmap(int vfio_container_fd, const struct rte_memseg *ms)

APIs such as these two were recently added to DPDK.

-- 
Thanks,
Anatoly
