From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Burakov, Anatoly"
To: Tiwei Bie
Cc: maxime.coquelin@redhat.com, zhihong.wang@intel.com, dev@dpdk.org,
 seanbh@gmail.com
Subject: Re: [dpdk-dev] [PATCH 2/3] net/virtio-user: avoid parsing process
 mappings
Date: Fri, 7 Sep 2018 13:21:35 +0100
Message-ID: <4b106cd0-6ccf-f80c-90fa-1421cf47f9a8@intel.com>
In-Reply-To: <20180907113559.GA22407@debian>
References: <20180905042852.6212-1-tiwei.bie@intel.com>
 <20180905042852.6212-3-tiwei.bie@intel.com>
 <5bf3a5ef-790c-40af-d4ca-1fc748dd6c04@intel.com>
 <20180907113559.GA22407@debian>
List-Id: DPDK patches and discussions <dev.dpdk.org>

On 07-Sep-18 12:35 PM, Tiwei Bie wrote:
> On Fri, Sep 07, 2018 at 10:39:16AM +0100, Burakov, Anatoly wrote:
>> On 05-Sep-18 5:28 AM, Tiwei Bie wrote:
>>> Recently some memory APIs were introduced to allow users to
>>> get the file descriptor and offset for each memory segment.
>>> We can leverage those APIs to get rid of the /proc magic on
>>> memory table preparation in vhost-user backend.
>>>
>>> Signed-off-by: Tiwei Bie
>>> ---
>>
>> <snip>
>>
>>> -	for (i = 0; i < num; ++i) {
>>> -		mr = &msg->payload.memory.regions[i];
>>> -		mr->guest_phys_addr = huges[i].addr; /* use vaddr! */
>>> -		mr->userspace_addr = huges[i].addr;
>>> -		mr->memory_size = huges[i].size;
>>> -		mr->mmap_offset = 0;
>>> -		fds[i] = open(huges[i].path, O_RDWR);
>>> +	if (rte_memseg_get_fd_offset_thread_unsafe(ms, &offset) < 0) {
>>> +		PMD_DRV_LOG(ERR, "Failed to get offset, ms=%p rte_errno=%d",
>>> +			ms, rte_errno);
>>> +		return -1;
>>> +	}
>>> +
>>> +	start_addr = (uint64_t)(uintptr_t)ms->addr;
>>> +	end_addr = start_addr + ms->len;
>>> +
>>> +	for (i = 0; i < wa->region_nr; i++) {
>>
>> There has to be a better way than to run this loop on every segment.
>> Maybe store the last-used region, and only do a region lookup if
>> there's a mismatch? Generally, in single-file segments mode, you'll
>> get lots of segments from one memseg list one after the other, so you
>> will only need to do a lookup when the memseg list changes, instead
>> of on each segment.
>
> We may have holes in one memseg list due to dynamic free.
> Do you mean we just need to do rte_memseg_contig_walk()
> and we can assume that the fds of the contiguous memsegs
> will be the same?

No, I didn't mean that.

Whether or not you are in single-file segments mode, you still need to
scan each segment. However, you lose your state when you exit this
function, and thus have to look up the appropriate memory region (the
one that matches your fd) again, over and over. It would be good if you
could store the last-used memory region somewhere, so that the next
time you come back into this function, you will not have to look it up
again if the memseg has the same fd. Something like this:

struct walk_arg {
	...
	int last_used;	/* index of the region we matched last time */
};

int walk_func(...)
{
	int cur_region = wa->last_used;

	/* check if it matches */
	if (cur_region < 0 || wa->fds[cur_region] != fd) {
		/* fd is different - we've changed the segment, so do
		 * the usual lookup and cache the result */
		cur_region = ...;
		wa->last_used = cur_region;
	}
	...
}

So, cache the last-used region to avoid having to look it up again,
because chances are, you won't have to. That way, you will still do
region lookups, but only when you have to - not every time.
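To make the idea more concrete, below is a rough, untested sketch of
what such a walk callback could look like (the struct layouts are
simplified stand-ins and the new-region path is left out - this is not
the actual patch code):

#include <stdint.h>
#include <stddef.h>
#include <rte_memory.h>

#define VHOST_MEMORY_MAX_NREGIONS 8

/* simplified stand-ins for the driver's vhost.h definitions */
struct vhost_memory_region {
	uint64_t guest_phys_addr;
	uint64_t userspace_addr;
	uint64_t memory_size;
	uint64_t mmap_offset;
};

struct vhost_memory {
	uint32_t nregions;
	struct vhost_memory_region regions[VHOST_MEMORY_MAX_NREGIONS];
};

struct walk_arg {
	struct vhost_memory *vm;
	int fds[VHOST_MEMORY_MAX_NREGIONS];
	int region_nr;
	int last_used;	/* index of region matched last time, -1 if none */
};

static int
update_memory_region(const struct rte_memseg *ms, void *arg)
{
	struct walk_arg *wa = arg;
	struct vhost_memory_region *mr;
	uint64_t start_addr, end_addr;
	size_t offset;
	int i, fd;

	fd = rte_memseg_get_fd_thread_unsafe(ms);
	if (fd < 0)
		return -1;

	/* offset is only needed when creating a new region, which this
	 * sketch leaves out */
	if (rte_memseg_get_fd_offset_thread_unsafe(ms, &offset) < 0)
		return -1;
	(void)offset;

	start_addr = (uint64_t)(uintptr_t)ms->addr;
	end_addr = start_addr + ms->len;

	/* fast path: same fd as the previous memseg - reuse the cached
	 * region index and skip the linear scan entirely */
	if (wa->last_used >= 0 && wa->fds[wa->last_used] == fd) {
		i = wa->last_used;
	} else {
		/* slow path: fd has changed, fall back to the lookup */
		for (i = 0; i < wa->region_nr; i++)
			if (wa->fds[i] == fd)
				break;
		if (i == wa->region_nr)
			return -1; /* new-region setup would go here */
		wa->last_used = i;
	}

	/* grow the region to cover this segment, as in the patch */
	mr = &wa->vm->regions[i];
	if (mr->userspace_addr + mr->memory_size < end_addr)
		mr->memory_size = end_addr - mr->userspace_addr;

	return 0;
}

With wa.last_used initialized to -1, this would be driven by
rte_memseg_walk_thread_unsafe(update_memory_region, &wa), and the
lookup loop only runs when the fd actually changes.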
>
>>
>>> +		if (wa->fds[i] != fd)
>>> +			continue;
>>> +
>>> +		mr = &wa->vm->regions[i];
>>> +
>>> +		if (mr->userspace_addr + mr->memory_size < end_addr)
>>> +			mr->memory_size = end_addr - mr->userspace_addr;
>>> +
>>
>> <snip>
>>
>>>  	int fds[VHOST_MEMORY_MAX_NREGIONS];
>>>  	int fd_num = 0;
>>> -	int i, len;
>>> +	int len;
>>>  	int vhostfd = dev->vhostfd;
>>>  	RTE_SET_USED(m);
>>> @@ -364,10 +337,6 @@ vhost_user_sock(struct virtio_user_dev *dev,
>>>  		return -1;
>>>  	}
>>> -	if (req == VHOST_USER_SET_MEM_TABLE)
>>> -		for (i = 0; i < fd_num; ++i)
>>> -			close(fds[i]);
>>> -
>>
>> You're sharing fds - presumably the other side of this is in a
>> different process, so is it safe to close these fds there?
>
> The above code belongs to virtio-user, and it will close the fds it
> got from rte_memseg_get_fd_thread_unsafe().
>
> Below is the code which will close these fds on the other side
> (i.e. in the vhost-user process):
>
> https://github.com/DPDK/dpdk/blob/3605968c2fa7/lib/librte_vhost/vhost_user.c#L805
> https://github.com/DPDK/dpdk/blob/3605968c2fa7/lib/librte_vhost/vhost_user.c#L97

OK, so not a problem then.

>
>>
>>>  	if (need_reply) {
>>>  		if (vhost_user_read(vhostfd, &msg) < 0) {
>>>  			PMD_DRV_LOG(ERR, "Received msg failed: %s",
>>>
>>
>> --
>> Thanks,
>> Anatoly
>

--
Thanks,
Anatoly