From: "Xie, Huawei" <huawei.xie@intel.com>
To: Zhuangyanying <ann.zhuangyanying@huawei.com>,
"dev@dpdk.org" <dev@dpdk.org>
Cc: "gaoxiaoqiu@huawei.com" <gaoxiaoqiu@huawei.com>,
"oscar.zhangbo@huawei.com" <oscar.zhangbo@huawei.com>,
"zhbzg@huawei.com" <zhbzg@huawei.com>,
"guohongzhen@huawei.com" <guohongzhen@huawei.com>,
"zhoujingbin@huawei.com" <zhoujingbin@huawei.com>
Subject: Re: [dpdk-dev] vhost compliant virtio based networking interface in container
Date: Thu, 20 Aug 2015 10:14:55 +0000 [thread overview]
Message-ID: <C37D651A908B024F974696C65296B57B2BDA19E3@SHSMSX101.ccr.corp.intel.com> (raw)
In-Reply-To: <C37D651A908B024F974696C65296B57B2BD9F976@SHSMSX101.ccr.corp.intel.com>
Added dev@dpdk.org
On 8/20/2015 6:04 PM, Xie, Huawei wrote:
> Yanping:
> I read your mail; it seems what we did is quite similar. Here is a
> quick mail describing our design. Let me know if it is the same thing.
>
> Problem Statement:
> We don't have a high-performance networking interface in containers for
> NFV. The current veth-pair-based interface cannot easily be accelerated.
>
> The key components involved:
> 1. DPDK-based virtio PMD driver in the container.
> 2. Device simulation framework in the container.
> 3. DPDK (or kernel) vhost running on the host.
>
> How is the virtio device created?
> A: There is no "real" virtio-pci device in the container environment.
> 1) The host maintains pools of memory and shares memory with the
> container. This can be accomplished by the host sharing a hugepage file
> with the container (see the mapping sketch after this list).
> 2) The container creates the virtio rings on the shared memory.
> 3) The container creates mbuf memory pools on the shared memory.
> 4) The container sends the memory and vring information to vhost through
> vhost messages. This can be done either through ioctl calls or through
> vhost-user messages.
>
> How are the vhost messages sent?
> A: There are two alternative ways to do this.
> 1) The customized virtio PMD is responsible for all the vring creation
> and vhost message sending.
> 2) We could do this through a lightweight device simulation framework.
> The device simulation creates a simple PCI bus. On that PCI bus,
> virtio-net PCI devices are created. The device simulation provides an
> IO API for MMIO/IO access (sketched below).
> 2.1 The virtio PMD configures the pseudo virtio device just as it does
> in a KVM guest environment.
> 2.2 Rather than using IO instructions, the virtio PMD uses the IO API
> for IO operations on the virtio-net PCI device.
> 2.3 The device simulation is responsible for the device state machine
> simulation.
> 2.4 The device simulation is responsible for talking to vhost.
> With this approach, we can minimize the virtio PMD modifications.
> To the virtio PMD, it is like configuring a real virtio-net PCI device.
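>
> To make 2.1-2.4 concrete, here is a rough sketch of what such an IO API
> could look like; the names (sim_pci_dev, sim_io_read, sim_io_write) are
> hypothetical and not an existing DPDK or kernel interface:
>
> #include <stdint.h>
>
> /* Hypothetical IO API exposed by the device simulation framework.
>  * Instead of in/out instructions, the virtio PMD calls these to
>  * access the emulated virtio-net PCI device's IO/MMIO space. */
> struct sim_pci_dev;                 /* opaque emulated device handle */
>
> uint64_t sim_io_read(struct sim_pci_dev *dev, uint64_t offset, int len);
> void     sim_io_write(struct sim_pci_dev *dev, uint64_t offset,
>                       uint64_t value, int len);
>
> /* For example, the PMD could select a queue and program its page
>  * frame number much as it would on real legacy virtio-pci hardware:
>  *
>  *   sim_io_write(dev, VIRTIO_PCI_QUEUE_SEL, queue_id, 2);
>  *   sim_io_write(dev, VIRTIO_PCI_QUEUE_PFN, pfn, 4);
>  *
>  * The simulation updates its device state machine and, when needed,
>  * emits the corresponding vhost messages. */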
>
> Memory mapping?
> A: QEMU can access the whole guest memory in a KVM environment. We need
> to fill that gap here.
> The container maps the shared memory into the container's virtual
> address space and the host maps it into the host's virtual address
> space, so there is a fixed-offset mapping between the two.
> The container creates the shared vrings on this memory. The container
> also creates the mbuf memory pool on the shared memory.
> In the VHOST_SET_MEM_TABLE message, we send the memory mapping
> information for the shared memory (sketched below). Because we require
> the mbuf pool to be created on the shared memory, and buffers are
> allocated from that mbuf pool, DPDK vhost can translate the GPA in a
> vring descriptor to a host virtual address.
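>
> As a sketch of the kernel-vhost (ioctl) variant of this message, with
> the fd and addresses as placeholders, the single shared region could be
> described roughly like this:
>
> #include <stdint.h>
> #include <stdlib.h>
> #include <sys/ioctl.h>
> #include <linux/vhost.h>
>
> /* Sketch: describe the single shared region to kernel vhost via
>  * VHOST_SET_MEM_TABLE.  The vhost-user path would carry the same
>  * information in a VHOST_USER_SET_MEM_TABLE message instead. */
> static int set_mem_table(int vhost_fd, void *shared_va, uint64_t size)
> {
>         struct vhost_memory *mem;
>         int ret;
>
>         mem = calloc(1, sizeof(*mem) +
>                         sizeof(struct vhost_memory_region));
>         if (mem == NULL)
>                 return -1;
>         mem->nregions = 1;
>         /* With the CVA scheme described below, "guest physical" is
>          * simply the container virtual address of the shared region. */
>         mem->regions[0].guest_phys_addr = (uint64_t)(uintptr_t)shared_va;
>         mem->regions[0].memory_size     = size;
>         mem->regions[0].userspace_addr  = (uint64_t)(uintptr_t)shared_va;
>
>         ret = ioctl(vhost_fd, VHOST_SET_MEM_TABLE, mem);
>         free(mem);
>         return ret;
> }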
>
>
> GPA or CVA in the vring descriptors?
> To ease the memory translation, rather than using a GPA, we use the
> CVA (container virtual address). This is the tricky part:
> 1) The virtio PMD writes the vring's VFN rather than its PFN to the PFN
> register through the IO API.
> 2) The device simulation framework uses the VFN as the PFN.
> 3) The device simulation sends SET_VRING_ADDR with the CVA.
> 4) The virtio PMD fills the vring descriptors with the CVA of the mbuf
> data pointer rather than the GPA.
> So when the host sees a CVA, it can translate it to an HVA (host
> virtual address), as sketched below.
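>
> Since the container and the host map the same file, the translation on
> the host side reduces to adding a fixed offset; a minimal sketch with
> illustrative names:
>
> #include <stdint.h>
>
> /* Sketch: with a single shared region mapped in both processes,
>  * translating a container virtual address (CVA) found in a vring
>  * descriptor to a host virtual address (HVA) is one addition. */
> struct shared_region {
>         uint64_t cva_base;      /* base address in the container */
>         uint64_t hva_base;      /* base address in the host      */
>         uint64_t size;
> };
>
> static inline uint64_t
> cva_to_hva(const struct shared_region *r, uint64_t cva)
> {
>         if (cva < r->cva_base || cva >= r->cva_base + r->size)
>                 return 0;       /* outside the shared pool */
>         return cva - r->cva_base + r->hva_base;
> }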
>
> Worth noting:
> The virtio interface in the container follows the vhost message format
> and is compliant with the DPDK vhost implementation, i.e. no DPDK vhost
> modification is needed.
> vhost is not aware of whether the incoming virtio device comes from a
> KVM guest or from a container.
>
> That pretty much covers the high-level design. There are quite a few
> low-level issues. For example, a 32-bit PFN is enough for a KVM guest,
> but since we use a 64-bit VFN (virtual page frame number), a trick is
> needed here, done through a special IO API (see the sketch below).
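>
> One hypothetical shape for that special IO API, building on the
> sim_io_write sketch above (the 64-bit write is understood only by the
> simulation framework, not by real virtio-pci hardware):
>
> #include <stdint.h>
> #include <linux/virtio_pci.h>  /* legacy VIRTIO_PCI_QUEUE_SEL/_PFN */
>
> struct sim_pci_dev;                        /* from the earlier sketch */
> void sim_io_write(struct sim_pci_dev *dev, uint64_t offset,
>                   uint64_t value, int len); /* hypothetical */
>
> /* Hypothetical: the legacy 32-bit QUEUE_PFN register cannot hold a
>  * 64-bit VFN, so the special IO API lets this one write carry the
>  * full 64-bit value and the simulation stores it as-is. */
> static void set_queue_vfn(struct sim_pci_dev *dev, uint16_t queue_id,
>                           uint64_t vfn)
> {
>         sim_io_write(dev, VIRTIO_PCI_QUEUE_SEL, queue_id, 2);
>         sim_io_write(dev, VIRTIO_PCI_QUEUE_PFN, vfn, 8);
> }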
>
> /huawei