From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mga06.intel.com (mga06.intel.com [134.134.136.31]) by dpdk.org (Postfix) with ESMTP id 6D1E71B173 for ; Mon, 16 Apr 2018 12:02:51 +0200 (CEST) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga001.jf.intel.com ([10.7.209.18]) by orsmga104.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 16 Apr 2018 03:02:50 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.48,459,1517904000"; d="scan'208";a="48222619" Received: from aburakov-mobl.ger.corp.intel.com (HELO [10.237.220.128]) ([10.237.220.128]) by orsmga001.jf.intel.com with ESMTP; 16 Apr 2018 03:02:46 -0700 To: Xiao Wang , ferruh.yigit@intel.com Cc: dev@dpdk.org, maxime.coquelin@redhat.com, zhihong.wang@intel.com, tiwei.bie@intel.com, jianfeng.tan@intel.com, cunming.liang@intel.com, dan.daly@intel.com, thomas@monjalon.net, Junjie Chen References: <20180412071956.66178-2-xiao.w.wang@intel.com> <20180415153349.62105-1-xiao.w.wang@intel.com> <20180415153349.62105-2-xiao.w.wang@intel.com> From: "Burakov, Anatoly" Message-ID: <9f41b581-ea27-51fb-dfbe-affd64d7996c@intel.com> Date: Mon, 16 Apr 2018 11:02:44 +0100 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:52.0) Gecko/20100101 Thunderbird/52.7.0 MIME-Version: 1.0 In-Reply-To: <20180415153349.62105-2-xiao.w.wang@intel.com> Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 7bit Subject: Re: [dpdk-dev] [PATCH v7 1/5] vfio: extend data structure for multi container X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 16 Apr 2018 10:02:52 -0000 On 15-Apr-18 4:33 PM, Xiao Wang wrote: > Currently eal vfio framework binds vfio group fd to the default > container fd during rte_vfio_setup_device, while in some cases, > e.g. vDPA (vhost data path acceleration), we want to put vfio group > to a separate container and program IOMMU via this container. > > This patch extends the vfio_config structure to contain per-container > user_mem_maps and defines an array of vfio_config. The next patch will > base on this to add container API. > > Signed-off-by: Junjie Chen > Signed-off-by: Xiao Wang > Reviewed-by: Maxime Coquelin > Reviewed-by: Ferruh Yigit > --- > config/common_base | 1 + > lib/librte_eal/linuxapp/eal/eal_vfio.c | 407 ++++++++++++++++++++++----------- > lib/librte_eal/linuxapp/eal/eal_vfio.h | 19 +- > 3 files changed, 275 insertions(+), 152 deletions(-) > > diff --git a/config/common_base b/config/common_base > index c4236fd1f..4a76d2f14 100644 > --- a/config/common_base > +++ b/config/common_base > @@ -87,6 +87,7 @@ CONFIG_RTE_EAL_ALWAYS_PANIC_ON_ERROR=n > CONFIG_RTE_EAL_IGB_UIO=n > CONFIG_RTE_EAL_VFIO=n > CONFIG_RTE_MAX_VFIO_GROUPS=64 > +CONFIG_RTE_MAX_VFIO_CONTAINERS=64 > CONFIG_RTE_MALLOC_DEBUG=n > CONFIG_RTE_EAL_NUMA_AWARE_HUGEPAGES=n > CONFIG_RTE_USE_LIBBSD=n > diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c b/lib/librte_eal/linuxapp/eal/eal_vfio.c > index 589d7d478..46fba2d8d 100644 > --- a/lib/librte_eal/linuxapp/eal/eal_vfio.c > +++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c > @@ -22,8 +22,46 @@ > > #define VFIO_MEM_EVENT_CLB_NAME "vfio_mem_event_clb" > > +/* > + * we don't need to store device fd's anywhere since they can be obtained from > + * the group fd via an ioctl() call. > + */ > +struct vfio_group { > + int group_no; > + int fd; > + int devices; > +}; What is the purpose of moving this into .c file? Seems like an unnecessary change. > + > +/* hot plug/unplug of VFIO groups may cause all DMA maps to be dropped. we can > + * recreate the mappings for DPDK segments, but we cannot do so for memory that > + * was registered by the user themselves, so we need to store the user mappings > + * somewhere, to recreate them later. > + */ > +#define VFIO_MAX_USER_MEM_MAPS 256 > +struct user_mem_map { > + uint64_t addr; > + uint64_t iova; > + uint64_t len; > +}; > + <...> > +static struct vfio_config * > +get_vfio_cfg_by_group_no(int iommu_group_no) > +{ > + struct vfio_config *vfio_cfg; > + int i, j; > + > + for (i = 0; i < VFIO_MAX_CONTAINERS; i++) { > + vfio_cfg = &vfio_cfgs[i]; > + for (j = 0; j < VFIO_MAX_GROUPS; j++) { > + if (vfio_cfg->vfio_groups[j].group_no == > + iommu_group_no) > + return vfio_cfg; > + } > + } > + > + return default_vfio_cfg; Here and in other places: i'm not sure returning default vfio config if group not found is such a good idea. It would be better if calling code explicitly handled case of group not existing yet. > +} > + > +static struct vfio_config * > +get_vfio_cfg_by_group_fd(int vfio_group_fd) > +{ > + struct vfio_config *vfio_cfg; > + int i, j; > + > + for (i = 0; i < VFIO_MAX_CONTAINERS; i++) { > + vfio_cfg = &vfio_cfgs[i]; > + for (j = 0; j < VFIO_MAX_GROUPS; j++) > + if (vfio_cfg->vfio_groups[j].fd == vfio_group_fd) > + return vfio_cfg; > + } > <...> > - for (i = 0; i < VFIO_MAX_GROUPS; i++) { > - vfio_cfg.vfio_groups[i].fd = -1; > - vfio_cfg.vfio_groups[i].group_no = -1; > - vfio_cfg.vfio_groups[i].devices = 0; > + rte_spinlock_recursive_t lock = RTE_SPINLOCK_RECURSIVE_INITIALIZER; > + > + for (i = 0; i < VFIO_MAX_CONTAINERS; i++) { > + vfio_cfgs[i].vfio_container_fd = -1; > + vfio_cfgs[i].vfio_active_groups = 0; > + vfio_cfgs[i].vfio_iommu_type = NULL; > + vfio_cfgs[i].mem_maps.lock = lock; Nitpick - why copy, instead of straight up initializing with RTE_SPINLOCK_RECURSIVE_INITIALIZER? > + > + for (j = 0; j < VFIO_MAX_GROUPS; j++) { > + vfio_cfgs[i].vfio_groups[j].fd = -1; > + vfio_cfgs[i].vfio_groups[j].group_no = -1; > + vfio_cfgs[i].vfio_groups[j].devices = 0; > + } > } > > /* inform the user that we are probing for VFIO */ > @@ -841,12 +971,12 @@ rte_vfio_enable(const char *modname) > return 0; > } <...> -- Thanks, Anatoly