DPDK patches and discussions
From: "Wang, Xiao W" <xiao.w.wang@intel.com>
To: "Burakov, Anatoly" <anatoly.burakov@intel.com>,
	"Yigit, Ferruh" <ferruh.yigit@intel.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>,
	"maxime.coquelin@redhat.com" <maxime.coquelin@redhat.com>,
	"Wang, Zhihong" <zhihong.wang@intel.com>,
	"Bie, Tiwei" <tiwei.bie@intel.com>,
	"Tan, Jianfeng" <jianfeng.tan@intel.com>,
	 "Liang, Cunming" <cunming.liang@intel.com>,
	"Daly, Dan" <dan.daly@intel.com>,
	 "thomas@monjalon.net" <thomas@monjalon.net>,
	"gaetan.rivet@6wind.com" <gaetan.rivet@6wind.com>,
	"hemant.agrawal@nxp.com" <hemant.agrawal@nxp.com>,
	"Chen, Junjie J" <junjie.j.chen@intel.com>
Subject: Re: [dpdk-dev] [PATCH v6 1/4] eal/vfio: add multiple container support
Date: Thu, 12 Apr 2018 16:07:41 +0000
Message-ID: <B7F2E978279D1D49A3034B7786DACF406F88D1F2@SHSMSX101.ccr.corp.intel.com>
In-Reply-To: <974c9cd0-87c4-6ab1-0787-9278a7379fda@intel.com>

Hi Anatoly,

> -----Original Message-----
> From: Burakov, Anatoly
> Sent: Thursday, April 12, 2018 10:04 PM
> To: Wang, Xiao W <xiao.w.wang@intel.com>; Yigit, Ferruh
> <ferruh.yigit@intel.com>
> Cc: dev@dpdk.org; maxime.coquelin@redhat.com; Wang, Zhihong
> <zhihong.wang@intel.com>; Bie, Tiwei <tiwei.bie@intel.com>; Tan, Jianfeng
> <jianfeng.tan@intel.com>; Liang, Cunming <cunming.liang@intel.com>; Daly,
> Dan <dan.daly@intel.com>; thomas@monjalon.net; gaetan.rivet@6wind.com;
> hemant.agrawal@nxp.com; Chen, Junjie J <junjie.j.chen@intel.com>
> Subject: Re: [PATCH v6 1/4] eal/vfio: add multiple container support
> 
> On 12-Apr-18 8:19 AM, Xiao Wang wrote:
> > Currently eal vfio framework binds vfio group fd to the default
> > container fd during rte_vfio_setup_device, while in some cases,
> > e.g. vDPA (vhost data path acceleration), we want to put vfio group
> > to a separate container and program IOMMU via this container.
> >
> > This patch adds some APIs to support container creating and device
> > binding with a container.
> >
> > A driver could use "rte_vfio_create_container" helper to create a
> > new container from eal, use "rte_vfio_bind_group" to bind a device
> > to the newly created container.
> >
> > During rte_vfio_setup_device, the container bound with the device
> > will be used for IOMMU setup.
> >
> > Signed-off-by: Junjie Chen <junjie.j.chen@intel.com>
> > Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
> > Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> > Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
> > ---
> 
> Apologies for late review. Some comments below.
> 
> <...>
> 
> >
> > +struct rte_memseg;
> > +
> >   /**
> >    * Setup vfio_cfg for the device identified by its address.
> >    * It discovers the configured I/O MMU groups or sets a new one for the device.
> > @@ -131,6 +133,117 @@ rte_vfio_clear_group(int vfio_group_fd);
> >   }
> >   #endif
> >
> 
> <...>
> 
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this API may change, or be removed, without prior notice
> > + *
> > + * Perform dma mapping for devices in a container.
> > + *
> > + * @param container_fd
> > + *   the specified container fd
> > + *
> > + * @param dma_type
> > + *   the dma map type
> > + *
> > + * @param ms
> > + *   the dma address region to map
> > + *
> > + * @return
> > + *    0 if successful
> > + *   <0 if failed
> > + */
> > +int __rte_experimental
> > +rte_vfio_dma_map(int container_fd, int dma_type, const struct rte_memseg *ms);
> > +
> 
> First of all, why memseg, instead of va/iova/len? This seems like
> unnecessary attachment to internals of DPDK memory representation. Not
> all memory comes in memsegs, this makes the API unnecessarily specific
> to DPDK memory.

Agreed, will use va/iova/len.
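
As a rough illustration of a va/iova/len style entry point: the sketch below is
only an assumption about the shape such a prototype could take. The
sketch_container_dma_map name is a hypothetical placeholder, and the body is a
stub that validates its arguments rather than programming the IOMMU.

```c
#include <stdint.h>

/*
 * Hypothetical prototype shape: the caller passes a plain va/iova/len
 * triple instead of a struct rte_memseg. The body is a stub that only
 * validates its arguments; a real version would program the IOMMU.
 */
static int
sketch_container_dma_map(int container_fd, uint64_t vaddr, uint64_t iova,
		uint64_t len)
{
	if (container_fd < 0 || len == 0)
		return -1;
	/* reject regions whose end would wrap around the address space */
	if (vaddr > UINT64_MAX - len || iova > UINT64_MAX - len)
		return -1;
	return 0;
}
```

With this shape, memory that does not live in a memseg can be passed in the
same way as DPDK-managed memory.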

> 
> Also, why provide the DMA type? There's already a VFIO type pointer in
> vfio_config - you can set this pointer for every newly created container,
> so the user wouldn't have to care about IOMMU type. Is it not possible
> to figure out DMA type from within EAL VFIO? If not, maybe provide an
> API to do so, e.g. rte_vfio_container_set_dma_type()?

It's possible; EAL VFIO should be able to figure out a container's DMA type.
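
One way this could work, sketched under assumptions: each container keeps a
pointer to its IOMMU type, set once at setup, and the map call dispatches
through it so the caller never passes a DMA type. All sketch_ names below are
hypothetical, and the type1 handler is a stub rather than the real ioctl path.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical per-IOMMU-type descriptor holding the map callback. */
struct sketch_iommu_type {
	const char *name;
	int (*dma_map)(int container_fd, uint64_t vaddr, uint64_t iova,
			uint64_t len);
};

/* Stub handler; a real one would issue the type-specific VFIO ioctl. */
static int
sketch_type1_map(int container_fd, uint64_t vaddr, uint64_t iova,
		uint64_t len)
{
	(void)container_fd;
	(void)vaddr;
	(void)iova;
	return len == 0 ? -1 : 0;
}

static const struct sketch_iommu_type sketch_type1 = {
	.name = "type1",
	.dma_map = sketch_type1_map,
};

/* Hypothetical container record: the type is chosen when it is set up. */
struct sketch_container {
	int fd;
	const struct sketch_iommu_type *type;
};

/* The caller supplies no DMA type; it is looked up from the container. */
static int
sketch_container_map(const struct sketch_container *c, uint64_t vaddr,
		uint64_t iova, uint64_t len)
{
	if (c == NULL || c->type == NULL || c->type->dma_map == NULL)
		return -1;
	return c->type->dma_map(c->fd, vaddr, iova, len);
}
```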

> 
> This will also need to be rebased on top of latest HEAD because there
> already is a similar DMA map/unmap API added, only without the container
> parameter. Perhaps rename these new functions to
> rte_vfio_container_(create|destroy|dma_map|dma_unmap)?

OK, will check the latest HEAD and rebase on that.

> 
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this API may change, or be removed, without prior notice
> > + *
> > + * Perform dma unmapping for devices in a container.
> > + *
> > + * @param container_fd
> > + *   the specified container fd
> > + *
> > + * @param dma_type
> > + *    the dma map type
> > + *
> > + * @param ms
> > + *   the dma address region to unmap
> > + *
> > + * @return
> > + *    0 if successful
> > + *   <0 if failed
> > + */
> > +int __rte_experimental
> > +rte_vfio_dma_unmap(int container_fd, int dma_type, const struct rte_memseg *ms);
> > +
> >   #endif /* VFIO_PRESENT */
> >
> 
> <...>
> 
> > @@ -75,8 +53,8 @@ vfio_get_group_fd(int iommu_group_no)
> >   		if (vfio_group_fd < 0) {
> >   			/* if file not found, it's not an error */
> >   			if (errno != ENOENT) {
> > -				RTE_LOG(ERR, EAL, "Cannot open %s: %s\n", filename,
> > -						strerror(errno));
> > +				RTE_LOG(ERR, EAL, "Cannot open %s: %s\n",
> > +					filename, strerror(errno));
> 
> This looks like an unintended change.
> 
> >   				return -1;
> >   			}
> >
> > @@ -86,8 +64,10 @@ vfio_get_group_fd(int iommu_group_no)
> >   			vfio_group_fd = open(filename, O_RDWR);
> >   			if (vfio_group_fd < 0) {
> >   				if (errno != ENOENT) {
> > -					RTE_LOG(ERR, EAL, "Cannot open %s: %s\n", filename,
> > -							strerror(errno));
> > +					RTE_LOG(ERR, EAL,
> > +						"Cannot open %s: %s\n",
> > +						filename,
> > +						strerror(errno));
> 
> This looks like an unintended change.
> 
> >   					return -1;
> >   				}
> >   				return 0;
> > @@ -95,21 +75,19 @@ vfio_get_group_fd(int iommu_group_no)
> >   			/* noiommu group found */
> >   		}
> >
> > -		cur_grp->group_no = iommu_group_no;
> > -		cur_grp->fd = vfio_group_fd;
> > -		vfio_cfg.vfio_active_groups++;
> >   		return vfio_group_fd;
> >   	}
> > -	/* if we're in a secondary process, request group fd from the primary
> > +	/*
> > +	 * if we're in a secondary process, request group fd from the primary
> >   	 * process via our socket
> >   	 */
> 
> This looks like an unintended change.
> 
> >   	else {
> > -		int socket_fd, ret;
> > -
> > -		socket_fd = vfio_mp_sync_connect_to_primary();
> > +		int ret;
> > +		int socket_fd = vfio_mp_sync_connect_to_primary();
> >
> >   		if (socket_fd < 0) {
> > -			RTE_LOG(ERR, EAL, "  cannot connect to primary process!\n");
> > +			RTE_LOG(ERR, EAL,
> > +				"  cannot connect to primary process!\n");
> 
> This looks like an unintended change.
> 
> >   			return -1;
> >   		}
> >   		if (vfio_mp_sync_send_request(socket_fd, SOCKET_REQ_GROUP) < 0) {
> > @@ -122,6 +100,7 @@ vfio_get_group_fd(int iommu_group_no)
> >   			close(socket_fd);
> >   			return -1;
> >   		}
> > +
> >   		ret = vfio_mp_sync_receive_request(socket_fd);
> 
> This looks like an unintended change.
> 
> (hint: "git revert -n HEAD && git add -p" is your friend :) )

Thanks, will remove these diffs.

> 
> >   		switch (ret) {
> >   		case SOCKET_NO_FD:
> > @@ -132,9 +111,6 @@ vfio_get_group_fd(int iommu_group_no)
> >   			/* if we got the fd, store it and return it */
> >   			if (vfio_group_fd > 0) {
> >   				close(socket_fd);
> > -				cur_grp->group_no = iommu_group_no;
> > -				cur_grp->fd = vfio_group_fd;
> > -				vfio_cfg.vfio_active_groups++;
> >   				return vfio_group_fd;
> >   			}
> >   			/* fall-through on error */
> > @@ -147,70 +123,349 @@ vfio_get_group_fd(int iommu_group_no)
> >   	return -1;
> 
> <...>
> 
> > +int __rte_experimental
> > +rte_vfio_create_container(void)
> > +{
> > +	struct vfio_config *vfio_cfg;
> > +	int i;
> > +
> > +	/* Find an empty slot to store new vfio config */
> > +	for (i = 1; i < VFIO_MAX_CONTAINERS; i++) {
> > +		if (vfio_cfgs[i] == NULL)
> > +			break;
> > +	}
> > +
> > +	if (i == VFIO_MAX_CONTAINERS) {
> > +		RTE_LOG(ERR, EAL, "exceed max vfio container limit\n");
> > +		return -1;
> > +	}
> > +
> > +	vfio_cfgs[i] = rte_zmalloc("vfio_container", sizeof(struct vfio_config),
> > +		RTE_CACHE_LINE_SIZE);
> > +	if (vfio_cfgs[i] == NULL)
> > +		return -ENOMEM;
> 
> Is there a specific reason why 1) dynamic allocation is used (as opposed
> to just storing a static array), and 2) DPDK memory allocation is used?
> This seems like unnecessary complication.
> 
> Even if you were to decide to allocate memory instead of having a static
> array, you'll have to register for rte_eal_cleanup() to delete any
> allocated containers on DPDK exit. But, as I said, I think it would be
> better to keep it as a static array.
>

Thanks for the suggestion, a static array looks simpler and cleaner.
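
A minimal sketch of the static-array approach, under assumptions: the sketch_
names, the in_use flag, and the small limit are hypothetical, and slot 0 is
treated as reserved for the default container because the quoted loop starts
at 1. Nothing is allocated, so there is nothing to free on exit.

```c
#include <stddef.h>

#define SKETCH_MAX_CONTAINERS 4

/* Hypothetical per-container record kept in static storage. */
struct sketch_vfio_config {
	int in_use;
	int container_fd;
	int active_groups;
};

/* Zero-initialized at program start; slot 0 is left for the default. */
static struct sketch_vfio_config sketch_cfgs[SKETCH_MAX_CONTAINERS];

/* Claim the first free slot and return its index, or -1 if none is left. */
static int
sketch_container_create(int container_fd)
{
	int i;

	if (container_fd < 0)
		return -1;
	for (i = 1; i < SKETCH_MAX_CONTAINERS; i++) {
		if (!sketch_cfgs[i].in_use) {
			sketch_cfgs[i].in_use = 1;
			sketch_cfgs[i].container_fd = container_fd;
			sketch_cfgs[i].active_groups = 0;
			return i;
		}
	}
	return -1;
}

/* Release a slot by resetting it; no memory is freed. */
static int
sketch_container_destroy(int slot)
{
	if (slot < 1 || slot >= SKETCH_MAX_CONTAINERS ||
			!sketch_cfgs[slot].in_use)
		return -1;
	sketch_cfgs[slot] = (struct sketch_vfio_config){0};
	return 0;
}
```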
 
> > +
> > +	RTE_LOG(INFO, EAL, "alloc container at slot %d\n", i);
> > +	vfio_cfg = vfio_cfgs[i];
> > +	vfio_cfg->vfio_active_groups = 0;
> > +	vfio_cfg->vfio_container_fd = vfio_get_container_fd();
> > +
> > +	if (vfio_cfg->vfio_container_fd < 0) {
> > +		rte_free(vfio_cfgs[i]);
> > +		vfio_cfgs[i] = NULL;
> > +		return -1;
> > +	}
> > +
> > +	for (i = 0; i < VFIO_MAX_GROUPS; i++) {
> > +		vfio_cfg->vfio_groups[i].group_no = -1;
> > +		vfio_cfg->vfio_groups[i].fd = -1;
> > +		vfio_cfg->vfio_groups[i].devices = 0;
> > +	}
> 
> <...>
> 
> > @@ -665,41 +931,80 @@ vfio_get_group_no(const char *sysfs_base,
> >   }
> >
> >   static int
> > -vfio_type1_dma_map(int vfio_container_fd)
> > +do_vfio_type1_dma_map(int vfio_container_fd, const struct rte_memseg *ms)
> 
> <...>
> 
> 
> > +static int
> > +do_vfio_type1_dma_unmap(int vfio_container_fd, const struct rte_memseg *ms)
> 
> APIs such as these two were recently added to DPDK.

Will check and rebase.

BRs,
Xiao

> 
> --
> Thanks,
> Anatoly
