From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by dpdk.org (Postfix) with ESMTP id 89192558C for ; Wed, 11 Jan 2017 04:13:12 +0100 (CET) Received: from orsmga004.jf.intel.com ([10.7.209.38]) by fmsmga104.fm.intel.com with ESMTP; 10 Jan 2017 19:13:11 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.33,345,1477983600"; d="scan'208";a="47663284" Received: from shwdeisgchi083.ccr.corp.intel.com (HELO [10.239.67.193]) ([10.239.67.193]) by orsmga004.jf.intel.com with ESMTP; 10 Jan 2017 19:13:10 -0800 To: Jason Wang , dev@dpdk.org References: <1480689075-66977-1-git-send-email-jianfeng.tan@intel.com> <1482477266-39199-1-git-send-email-jianfeng.tan@intel.com> <1482477266-39199-6-git-send-email-jianfeng.tan@intel.com> <46af618f-c01b-3571-78fc-12d10859a4a1@redhat.com> <4fa4d06c-d359-df12-a073-7c2c2540b634@intel.com> Cc: yuanhan.liu@linux.intel.com, ferruh.yigit@intel.com, cunming.liang@intel.com From: "Tan, Jianfeng" Message-ID: Date: Wed, 11 Jan 2017 11:13:09 +0800 User-Agent: Mozilla/5.0 (Windows NT 6.3; WOW64; rv:45.0) Gecko/20100101 Thunderbird/45.6.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 8bit Subject: Re: [dpdk-dev] [PATCH v2 5/7] net/virtio_user: add vhost kernel support X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 11 Jan 2017 03:13:13 -0000 On 1/11/2017 10:42 AM, Jason Wang wrote: > > > On 2017年01月10日 14:11, Tan, Jianfeng wrote: >> >> Hi Jason, >> >> >> On 1/9/2017 12:39 PM, Jason Wang wrote: >>> >>> >>> On 2016年12月23日 15:14, Jianfeng Tan wrote: >>>> This patch add support vhost kernel as the backend for virtio_user. >>>> Three main hook functions are added: >>>> - vhost_kernel_setup() to open char device, each vq pair needs one >>>> vhostfd; >>>> - vhost_kernel_ioctl() to communicate control messages with vhost >>>> kernel module; >>>> - vhost_kernel_enable_queue_pair() to open tap device and set it >>>> as the backend of corresonding vhost fd (that is to say, vq >>>> pair). >>>> >>>> Signed-off-by: Jianfeng Tan >>>> --- >>>> >> [...] >>>> +/* TUNSETIFF ifr flags */ >>>> +#define IFF_TAP 0x0002 >>>> +#define IFF_NO_PI 0x1000 >>>> +#define IFF_ONE_QUEUE 0x2000 >>>> +#define IFF_VNET_HDR 0x4000 >>>> +#define IFF_MULTI_QUEUE 0x0100 >>>> +#define IFF_ATTACH_QUEUE 0x0200 >>>> +#define IFF_DETACH_QUEUE 0x0400 >>> >>> Do we really want to duplicate those things which has been exposed >>> by uapi here? >> >> You mean those defined in ? Redefine those common >> macros, or include standard header file, with respective pros and >> cons. DPDK prefers the redefinition way as far as I understand, >> doesn't it? >> > > Well, if you really want to do this, you may want to use an > independent file. Then you can sync it with linux headers with a bash > script. Agreed! [...] >>>> +static struct vhost_memory_kernel * >>>> +prepare_vhost_memory_kernel(void) >>>> +{ >>>> + uint32_t i, j, k = 0; >>>> + struct rte_memseg *seg; >>>> + struct vhost_memory_region *mr; >>>> + struct vhost_memory_kernel *vm; >>>> + >>>> + vm = malloc(sizeof(struct vhost_memory_kernel) + >>>> + VHOST_KERNEL_MAX_REGIONS * >>>> + sizeof(struct vhost_memory_region)); >>>> + >>>> + for (i = 0; i < RTE_MAX_MEMSEG; ++i) { >>>> + seg = &rte_eal_get_configuration()->mem_config->memseg[i]; >>>> + if (!seg->addr) >>>> + break; >>> >>> If we're sure the number of regions is less than 64(or the module >>> parameter read from /sys), can we avoid the iteration here? >> >> The "if" statement under the "for" statement can save us from all >> RTE_MAX_MEMSEG iteration. > > But if we know the number of regions is short than the limit, there's > even no need for this? The problem is: we don't have a variable to know how many segments are there in DPDK. Anyway, we need to report error in the situation that # of segments > # of regions. Or can you give a more specific example? [...] >>>> + if (dev->ifname) >>>> + strncpy(ifr.ifr_name, dev->ifname, IFNAMSIZ); >>>> + else >>>> + strncpy(ifr.ifr_name, "tap%d", IFNAMSIZ); >>>> + if (ioctl(tapfd, TUNSETIFF, (void *)&ifr) == -1) { >>>> + PMD_DRV_LOG(ERR, "TUNSETIFF failed: %s", strerror(errno)); >>>> + goto error; >>>> + } >>> >>> This requires CAP_NET_ADMIN, so we should really consider to accept >>> a pre-created fd here. >> >> It sounds like a future work for me. So far, all DPDK apps are >> running in privileged mode (including CAP_NET_ADMIN?). >> > > That's not safe. Yes. > Accepting a per-created fd can solve this, and can even support macvtap So you are proposing a way to specify the virtio_user parameter like, --vdev=virtio_user,tapfd=fd1,xxx, right? As replied in your another comment, it's a great way to go. But I'm afraid we have no time to implement that in this cycle. Thanks, Jianfeng