Message-ID: <54657365.7090504@huawei.com>
Date: Fri, 14 Nov 2014 11:13:41 +0800
From: Linhaifeng
To: Tetsuya Mukawa, "Xie, Huawei"
Cc: "dev@dpdk.org"
Subject: Re: [dpdk-dev] vhost-user technical isssues
In-Reply-To: <54656950.1050204@igel.co.jp>
References: <5462DE39.1070006@igel.co.jp> <54645007.3010301@huawei.com> <54656950.1050204@igel.co.jp>
List-Id: patches and discussions about DPDK

On 2014/11/14 10:30, Tetsuya Mukawa wrote:
> Hi Lin,
>
> (2014/11/13 15:30), Linhaifeng wrote:
>> On 2014/11/12 12:12, Tetsuya Mukawa wrote:
>>> Hi Xie,
>>>
>>> (2014/11/12 6:37), Xie, Huawei wrote:
>>>> Hi Tetsuya:
>>>> There are two major technical issues in my mind for vhost-user implementation.
>>>>
>>>> 1) memory region map
>>>> Vhost-user passes us file fd and offset for each memory region.
>>>> Unfortunately the mmap offset is "very" wrong. I discovered this issue
>>>> a long time ago, and also found that I couldn't mmap the huge page file
>>>> even with the correct offset (need to double check).
>>>> Just now I found that people reported this issue on Nov 3.
>>>> [Qemu-devel] [PULL 27/29] vhost-user: fix mmap offset calculation
>>>> Anyway, I turned to the same idea used in our DPDK vhost-cuse: only use
>>>> the fd for region(0) to map the whole file.
>>>> I think we should use this approach temporarily to support qemu-2.1, as
>>>> it has that bug.
>>> I agree with you.
>>> Also, we may have an issue with un-mapping a file on hugetlbfs on Linux.
>>> When I check munmap(), it seems 'size' needs to be aligned to the
>>> hugepage size.
>>> (I guess it may be a kernel bug. It might be fixed already.)
>>> Please add return value checking code for munmap().
>>> munmap() might still fail.
>>>
>> Are you munmapping region 0? Region 0 does not need to be mmapped, so it
>> does not need to be munmapped either.
>>
>> I can munmap the other regions successfully.
> Could you please let me know what size you specify when you
> munmap region1?
>

2G (region->memory_size + region.memory->offset)

> I still fail to munmap region1.
> Here is a patch to the vhost-user test of QEMU. Could you please check it?
>
> ----------------------------------
> diff --git a/tests/vhost-user-test.c b/tests/vhost-user-test.c
> index 75fedf0..4e17910 100644
> --- a/tests/vhost-user-test.c
> +++ b/tests/vhost-user-test.c
> @@ -37,7 +37,7 @@
>  #endif
>
>  #define QEMU_CMD_ACCEL  " -machine accel=tcg"
> -#define QEMU_CMD_MEM    " -m 512 -object memory-backend-file,id=mem,size=512M,"\
> +#define QEMU_CMD_MEM    " -m 6000 -object memory-backend-file,id=mem,size=6000M,"\
>                          "mem-path=%s,share=on -numa node,memdev=mem"
>  #define QEMU_CMD_CHR    " -chardev socket,id=chr0,path=%s"
>  #define QEMU_CMD_NETDEV " -netdev vhost-user,id=net0,chardev=chr0,vhostforce"
> @@ -221,14 +221,16 @@ static void read_guest_mem(void)
>
>      /* check for sanity */
>      g_assert_cmpint(fds_num, >, 0);
> -    g_assert_cmpint(fds_num, ==, memory.nregions);
> +    //g_assert_cmpint(fds_num, ==, memory.nregions);
>
> +    fprintf(stderr, "%s(%d)\n", __func__, __LINE__);
>      /* iterate all regions */
>      for (i = 0; i < fds_num; i++) {
> +        int ret = 0;
>
>          /* We'll check only the region statring at 0x0*/
>          if (memory.regions[i].guest_phys_addr != 0x0) {
> -            continue;
> +            //continue;
>          }

if (memory.regions[i].guest_phys_addr == 0x0) {
    close(fd);
    continue;
}

>
>          g_assert_cmpint(memory.regions[i].memory_size, >, 1024);
> @@ -237,6 +239,13 @@ static void read_guest_mem(void)
>
>          guest_mem = mmap(0, size, PROT_READ | PROT_WRITE,
>                           MAP_SHARED, fds[i], 0);
> +        fprintf(stderr, "guest_phys_addr=%lu, memory_size=%lu, "
> +                "userspace_addr=%lu, mmap_offset=%lu\n",
> +                memory.regions[i].guest_phys_addr,
> +                memory.regions[i].memory_size,
> +                memory.regions[i].userspace_addr,
> +                memory.regions[i].mmap_offset);
> +        fprintf(stderr, "mmap=%p, size=%lu\n", guest_mem, size);
>
>          g_assert(guest_mem != MAP_FAILED);
>          guest_mem += (memory.regions[i].mmap_offset / sizeof(*guest_mem));
> @@ -248,7 +257,20 @@ static void read_guest_mem(void)
>              g_assert_cmpint(a, ==, b);
>          }
>
> -        munmap(guest_mem, memory.regions[i].memory_size);
> +        ret = munmap(guest_mem,
> +                     memory.regions[i].memory_size);
> +        fprintf(stderr, "munmap=%p, size=%lu, ret=%d\n",
> +                guest_mem, memory.regions[i].memory_size, ret);
> +        {
> +            size_t hugepagesize;
> +
> +            size = memory.regions[i].memory_size;
> +            /* assume hugepage size is 1GB, try again */
> +            hugepagesize = 1024 * 1024 * 1024;
> +            size = (size + hugepagesize - 1) / hugepagesize * hugepagesize;
> +        }

The size should be the same as the one passed to mmap(), and guest_mem
needs to be moved back to the address mmap() returned first:

guest_mem -= (memory.regions[i].mmap_offset / sizeof(*guest_mem));

> +        ret = munmap(guest_mem, size);
> +        fprintf(stderr, "munmap=%p, size=%lu, ret=%d\n",
> +                guest_mem, size, ret);
>      }
>
>      g_assert_cmpint(1, ==, 1);
> ----------------------------------
> $ sudo QTEST_HUGETLBFS_PATH=/mnt/huge make check
> region=0, mmap=0x2aaac0000000, size=6291456000
> region=0, munmap=0x2aab80000000, size=3070230528, ret=-1 << failed
> region=0, munmap=0x2aab80000000, size=3221225472, ret=0
> region=1, mmap=0x2aab80000000, size=655360
> region=1, munmap=0x2aab80000000, size=655360, ret=-1 << failed
> region=1, munmap=0x2aab80000000, size=1073741824, ret=0
>
>
> Thanks,
> Tetsuya
>
> .
>

-- 
Regards,
Haifeng
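
P.S. To make the inline comments concrete, here is a minimal sketch of the idea: remember the address and length that mmap() returned, hand out a separate pointer advanced by mmap_offset, and give munmap() the remembered base and length again. The names region_map, region_map_init, region_map_fini and align_up are hypothetical and are not taken from QEMU or DPDK. Rounding the length up to the hugepage size is an assumption based on the behaviour seen in the log above, and the page size is passed in by the caller rather than detected.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS when testing without hugetlbfs */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Hypothetical bookkeeping for one mapped region (not a QEMU/DPDK type). */
struct region_map {
    void    *base; /* address returned by mmap() */
    size_t   len;  /* length passed to mmap(), reused for munmap() */
    uint8_t *data; /* base + mmap_offset: start of the region's data */
};

/* Round len up to the next multiple of page_size. */
static size_t align_up(size_t len, size_t page_size)
{
    assert(page_size != 0);
    return (len + page_size - 1) / page_size * page_size;
}

/*
 * Map the file from offset 0 so that the range
 * [mmap_offset, mmap_offset + memory_size) is covered.
 * Returns 0 on success, -1 if mmap() fails.
 */
static int region_map_init(struct region_map *m, int fd, int flags,
                           size_t memory_size, size_t mmap_offset,
                           size_t page_size)
{
    size_t len = align_up(memory_size + mmap_offset, page_size);
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, flags, fd, 0);

    if (p == MAP_FAILED)
        return -1;

    m->base = p;
    m->len = len;
    m->data = (uint8_t *)p + mmap_offset;
    return 0;
}

/* Unmap with the original base and length, never with m->data. */
static int region_map_fini(struct region_map *m)
{
    int ret = munmap(m->base, m->len);

    if (ret == 0) {
        m->base = NULL;
        m->data = NULL;
        m->len = 0;
    }
    return ret;
}
```

For a region backed by hugetlbfs one would call something like region_map_init(&m, fds[i], MAP_SHARED, memory_size, mmap_offset, hugepagesize) and check the return value of region_map_fini(&m), as Tetsuya asked for munmap().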