From: Linhaifeng <haifeng.lin@huawei.com>
To: Tetsuya Mukawa <mukawa@igel.co.jp>, "Xie, Huawei" <huawei.xie@intel.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>
Subject: Re: [dpdk-dev] vhost-user technical issues
Date: Fri, 14 Nov 2014 11:13:41 +0800
Message-ID: <54657365.7090504@huawei.com>
In-Reply-To: <54656950.1050204@igel.co.jp>

On 2014/11/14 10:30, Tetsuya Mukawa wrote:
> Hi Lin,
>
> (2014/11/13 15:30), Linhaifeng wrote:
>> On 2014/11/12 12:12, Tetsuya Mukawa wrote:
>>> Hi Xie,
>>>
>>> (2014/11/12 6:37), Xie, Huawei wrote:
>>>> Hi Tetsuya:
>>>> There are two major technical issues in my mind for vhost-user implementation.
>>>>
>>>> 1) memory region map
>>>> Vhost-user passes us a file fd and an offset for each memory region. Unfortunately the mmap offset is "very" wrong. I discovered this issue a long time ago, and also found
>>>> that I couldn't mmap the huge page file even with the correct offset (this needs a double check).
>>>> Just now I found that people reported this issue on Nov 3:
>>>> [Qemu-devel] [PULL 27/29] vhost-user: fix mmap offset calculation
>>>> Anyway, I turned to the same idea used in our DPDK vhost-cuse: only use the fd for region(0) to map the whole file.
>>>> I think we should do it this way temporarily to support qemu-2.1, as it has that bug.
>>> I agree with you.
>>> Also, we may have an issue with un-mapping a file on Linux hugetlbfs.
>>> When I checked munmap(), it seems 'size' needs to be aligned to the hugepage size.
>>> (I guess it may be a kernel bug. It might be fixed already.)
>>> Please add return value checking code for munmap().
>>> munmap() might still fail.
>>>
>> Are you munmapping region 0? Region 0 does not need to be mmapped, so it does not need to be munmapped either.
>>
>> I can munmap the other regions successfully.
> Could you please let me know what size you specify when you
> munmap region1?
>
2G (region->memory_size + region.memory->offset)
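A minimal sketch of that size calculation, in case it helps. The struct and helper names are hypothetical and not taken from the vhost code; only memory_size and mmap_offset correspond to the fields printed by your test patch. The round-up helper is there only to try the hugepage alignment you mentioned, and the 1GB hugepage size is an assumption from your patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mirror of two fields printed by the test patch. */
struct region_info {
    uint64_t memory_size;
    uint64_t mmap_offset;
};

/* Length used for both mmap() and munmap(): the region plus the
 * part of the file in front of it, since the file is mapped from 0. */
static uint64_t region_map_len(const struct region_info *r)
{
    return r->memory_size + r->mmap_offset;
}

/* Round len up to the next multiple of align (for example the
 * hugepage size). align must be non-zero. */
static uint64_t round_up(uint64_t len, uint64_t align)
{
    assert(align != 0);
    return (len + align - 1) / align * align;
}
```

With a 1GB hugepage size, round_up() maps 655360 to 1073741824 and 3070230528 to 3221225472, which are the two sizes for which munmap returned 0 in your log.
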
> I still fail to munmap region1.
> Here is a patch to vhost-user test of QEMU. Could you please check it?
>
> ----------------------------------
> diff --git a/tests/vhost-user-test.c b/tests/vhost-user-test.c
> index 75fedf0..4e17910 100644
> --- a/tests/vhost-user-test.c
> +++ b/tests/vhost-user-test.c
> @@ -37,7 +37,7 @@
> #endif
>
> #define QEMU_CMD_ACCEL " -machine accel=tcg"
> -#define QEMU_CMD_MEM " -m 512 -object memory-backend-file,id=mem,size=512M,"\
> +#define QEMU_CMD_MEM " -m 6000 -object memory-backend-file,id=mem,size=6000M,"\
> "mem-path=%s,share=on -numa node,memdev=mem"
> #define QEMU_CMD_CHR " -chardev socket,id=chr0,path=%s"
> #define QEMU_CMD_NETDEV " -netdev vhost-user,id=net0,chardev=chr0,vhostforce"
> @@ -221,14 +221,16 @@ static void read_guest_mem(void)
>
> /* check for sanity */
> g_assert_cmpint(fds_num, >, 0);
> - g_assert_cmpint(fds_num, ==, memory.nregions);
> + //g_assert_cmpint(fds_num, ==, memory.nregions);
>
> + fprintf(stderr, "%s(%d)\n", __func__, __LINE__);
> /* iterate all regions */
> for (i = 0; i < fds_num; i++) {
> + int ret = 0;
>
> /* We'll check only the region statring at 0x0*/
> if (memory.regions[i].guest_phys_addr != 0x0) {
> - continue;
> + //continue;
> }
I think the region at 0x0 should be skipped here instead:
    if (memory.regions[i].guest_phys_addr == 0x0) {
        close(fds[i]);
        continue;
    }
>
> g_assert_cmpint(memory.regions[i].memory_size, >, 1024);
> @@ -237,6 +239,13 @@ static void read_guest_mem(void)
>
> guest_mem = mmap(0, size, PROT_READ | PROT_WRITE,
> MAP_SHARED, fds[i], 0);
> + fprintf(stderr, "guest_phys_addr=%lu, memory_size=%lu, "
> + "userspace_addr=%lu, mmap_offset=%lu\n",
> + memory.regions[i].guest_phys_addr,
> + memory.regions[i].memory_size,
> + memory.regions[i].userspace_addr,
> + memory.regions[i].mmap_offset);
> + fprintf(stderr, "mmap=%p, size=%lu\n", guest_mem, size);
>
> g_assert(guest_mem != MAP_FAILED);
> guest_mem += (memory.regions[i].mmap_offset / sizeof(*guest_mem));
> @@ -248,7 +257,20 @@ static void read_guest_mem(void)
> g_assert_cmpint(a, ==, b);
> }
>
> - munmap(guest_mem, memory.regions[i].memory_size);
> + ret = munmap(guest_mem, memory.regions[i].memory_size);
> + fprintf(stderr, "munmap=%p, size=%lu, ret=%d\n",
> + guest_mem, memory.regions[i].memory_size, ret);
> + {
> + size_t hugepagesize;
> +
> + size = memory.regions[i].memory_size;
> + /* assume hugepage size is 1GB, try again */
> + hugepagesize = 1024 * 1024 * 1024;
> + size = (size + hugepagesize - 1) / hugepagesize * hugepagesize;
> + }
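The size should be the same as the one passed to mmap, and guest_mem must be moved back to the address returned by mmap before calling munmap:
    guest_mem -= (memory.regions[i].mmap_offset / sizeof(*guest_mem));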
> + ret = munmap(guest_mem, size);
> + fprintf(stderr, "munmap=%p, size=%lu, ret=%d\n",
> + guest_mem, size, ret);
> }
>
> g_assert_cmpint(1, ==, 1);
> ----------------------------------
> $ sudo QTEST_HUGETLBFS_PATH=/mnt/huge make check
> region=0, mmap=0x2aaac0000000, size=6291456000
> region=0, munmap=0x2aab80000000, size=3070230528, ret=-1 << failed
> region=0, munmap=0x2aab80000000, size=3221225472, ret=0
> region=1, mmap=0x2aab80000000, size=655360
> region=1, munmap=0x2aab80000000, size=655360, ret=-1 << failed
> region=1, munmap=0x2aab80000000, size=1073741824, ret=0
>
>
> Thanks,
> Tetsuya
>
> .
>
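In your log, the munmap address for region 0 (0x2aab80000000) is the mmap address (0x2aaac0000000) plus 0xC0000000, so it looks like the pointer had already been advanced by mmap_offset when munmap was called. Below is a minimal sketch of the pairing I mean, assuming the file is mapped from offset 0. The struct and function names are hypothetical; the idea is only to keep the base address and length from mmap() separately from the pointer used to access the region, and to check the return value of munmap().

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical bookkeeping for one mapped region. */
struct mapped_region {
    void    *base;  /* address returned by mmap() */
    size_t   len;   /* length passed to mmap() */
    uint8_t *data;  /* base + mmap_offset, start of the region itself */
};

/* Map the file from offset 0 so that the region at mmap_offset is covered. */
static int region_map(struct mapped_region *m, int fd,
                      size_t memory_size, size_t mmap_offset)
{
    assert(m != NULL);
    m->len = memory_size + mmap_offset;
    m->base = mmap(NULL, m->len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (m->base == MAP_FAILED) {
        fprintf(stderr, "mmap: %s\n", strerror(errno));
        return -1;
    }
    m->data = (uint8_t *)m->base + mmap_offset;
    return 0;
}

/* Unmap with the same base and length that mmap() used, not with data. */
static int region_unmap(struct mapped_region *m)
{
    assert(m != NULL);
    if (munmap(m->base, m->len) != 0) {
        fprintf(stderr, "munmap(%p, %zu): %s\n",
                m->base, m->len, strerror(errno));
        return -1;
    }
    m->base = NULL;
    m->data = NULL;
    m->len = 0;
    return 0;
}
```

On hugetlbfs the length may also have to be a multiple of the hugepage size, as you observed, so the value stored in len might need rounding up before it is used.
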
--
Regards,
Haifeng