DPDK patches and discussions
 help / color / mirror / Atom feed
* Re: [dpdk-dev] [PATCH] vhost: change the vhost library to a common framework which can support more VIRTIO devices
  2016-09-14 12:15 [dpdk-dev] [PATCH] vhost: change the vhost library to a common framework which can support more VIRTIO devices Changpeng Liu
@ 2016-09-13 12:58 ` Yuanhan Liu
  2016-09-13 13:24   ` Thomas Monjalon
  2016-09-15  0:28 ` [dpdk-dev] [PATCH v2 1/2] " Changpeng Liu
  1 sibling, 1 reply; 9+ messages in thread
From: Yuanhan Liu @ 2016-09-13 12:58 UTC (permalink / raw)
  To: Changpeng Liu; +Cc: dev, james.r.harris, Thomas Monjalon

On Wed, Sep 14, 2016 at 08:15:00PM +0800, Changpeng Liu wrote:
> For storage virtualization use cases, vhost-scsi becomes a more popular
> solution to support VMs. However a user space vhost-scsi-user solution
> does not exist currently. SPDK(Storage Performance Development Kit,
> https://github.com/spdk/spdk) will provide a user space vhost-scsi target
> to support multiple VMs through Qemu. Originally SPDK is built on top
> of DPDK libraries, so we would like to use DPDK vhost library as the
> communication channel between Qemu and vhost-scsi target application.
> 
> Currently DPDK vhost library can only support VIRTIO_ID_NET device type,
> we would like to extend the library to support VIRTIO_ID_SCSI and
> VIRTIO_ID_BLK. Most of DPDK vhost library can be reused only several
> differences:
> 1. VIRTIO SCSI device has different vring queues compared with VIRTIO NET
> device, at least 3 vring queues needed for SCSI device type;
> 2. VIRTIO SCSI will need several extra message operation code, such as
> SCSI_SET_ENDPIONT/SCSI_CLEAR_ENDPOINT;
> 
> First, we would like to extend DPDK vhost library as a common framework

I don't see how common it becomes with this patch applied.

> which be friendly to add other VIRTIO device types, to implement this feature,
> we add a new data structure virtio_dev, which can deliver socket messages
> to different VIRTIO devices, each specific VIRTIO device will register
> callback to virtio_dev.
> 
> Secondly, we would to upstream a patch to Qemu community to add vhost-scsi
> specific operation command such as SCSI_SET_ENDPOINT and SCSI_CLEAR_ENDOINT,
> and user space feature bits.
> 
> Finally, after the Qemu patch set was merged, we will add VIRTIO_ID_SCSI
> support to DPDK vhost library

You actually should send this part out with this patchset. You are making
changes for adding the vhost-scsi support, however, you don't show us how
the code to support vhost-scsi looks like. That means, it's hard for us to
understand why you are doing those changes.

What I said is DPDK will not consider merging vhost-scsi patches unless
QEMU have merged the vhost-scsi part. This doesn't mean you can't send
out the DPDK vhost-scsi patches before that.

> and an example vhost-scsi target which can
> add a SCSI device to VM through this example application.
> 
> This patch set changed the vhost library as a common framework which
> can add other VIRTIO device type in future.
> 
> Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
> ---
>  lib/librte_vhost/Makefile         |   4 +-
>  lib/librte_vhost/rte_virtio_dev.h | 140 ++++++++
>  lib/librte_vhost/rte_virtio_net.h |  97 +-----

rte_virtio_net.h is the header file will be exported for applications.
Every change there would mean either an API or ABI breakage. Thus, we
should try to avoid touching it. Don't even to say you added yet another
header file, rte_virtio_dev.h.

I confess that the rte_virtio_net.h filename isn't that right: it sticks
to virtio-net so tightly. We may could rename it to rte_vhost.h, but I
doubt it's worthwhile: as said, it breaks the API.

The fortunate thing about this file is that, the context is actually not
sticking to virtio-net too much. I mean, all the APIs are using the "vid",
which is just a number. Well, except the virtio_net_device_ops() structure,
which also should be renamed to vhost_device_ops(). Besides that, the
three ops, "new_device", "destroy_device" and "vring_state_changed", are
actually not limited to virtio-net device.

That is to say, we could have two options here:

- rename the header file and the structure properly, to not limited to
  virtio-net

- live with it, let it be a legacy issue, and document it at somewhere,
  say, "due to history reason, that virtio-net is the first one supported
  in DPDK, we kept the header filename as rte_virtio_net.h, but not ..."

I personally would prefer the later one, which saves us from breaking
applications again. I don't have strong objection to the first one though.

Thomas, any comments?

>  lib/librte_vhost/socket.c         |   6 +-
>  lib/librte_vhost/vhost.c          | 421 ------------------------
>  lib/librte_vhost/vhost.h          | 288 -----------------
>  lib/librte_vhost/vhost_device.h   | 230 +++++++++++++
>  lib/librte_vhost/vhost_net.c      | 659 ++++++++++++++++++++++++++++++++++++++
>  lib/librte_vhost/vhost_net.h      | 126 ++++++++
>  lib/librte_vhost/vhost_user.c     | 451 +++++++++++++-------------
>  lib/librte_vhost/vhost_user.h     |  17 +-
>  lib/librte_vhost/virtio_net.c     |  37 ++-

That basically means you are heading the wrong way. For example,

> +struct virtio_dev_table {
> +	int (*vhost_dev_ready)(struct virtio_dev *dev);
> +	struct vhost_virtqueue* (*vhost_dev_get_queues)(struct virtio_dev *dev, uint16_t queue_id);
> +	void (*vhost_dev_cleanup)(struct virtio_dev *dev, int destroy);
> +	void (*vhost_dev_free)(struct virtio_dev *dev);
> +	void (*vhost_dev_reset)(struct virtio_dev *dev);
> +	uint64_t (*vhost_dev_get_features)(struct virtio_dev *dev);
> +	int (*vhost_dev_set_features)(struct virtio_dev *dev, uint64_t features);
> +	uint64_t (*vhost_dev_get_protocol_features)(struct virtio_dev *dev);
> +	int (*vhost_dev_set_protocol_features)(struct virtio_dev *dev, uint64_t features);
> +	uint32_t (*vhost_dev_get_default_queue_num)(struct virtio_dev *dev);
> +	uint32_t (*vhost_dev_get_queue_num)(struct virtio_dev *dev);
> +	uint16_t (*vhost_dev_get_avail_entries)(struct virtio_dev *dev, uint16_t queue_id);
> +	int (*vhost_dev_get_vring_base)(struct virtio_dev *dev, struct vhost_virtqueue *vq);
> +	int (*vhost_dev_set_vring_num)(struct virtio_dev *dev, struct vhost_virtqueue *vq);
> +	int (*vhost_dev_set_vring_call)(struct virtio_dev *dev, struct vhost_vring_file *file);
> +	int (*vhost_dev_set_log_base)(struct virtio_dev *dev, int fd, uint64_t size, uint64_t off);
> +};

This looks wrong. Most of them (if not all) should be same, regardless
it's for virtio-net, or virtio-scsi. I don't understand why you should
even touch this. Those are for handling *vhost-user* messages, but not
virtio-net, nor virtio-scsi. They should be same no matter which virtio
device we are dealing with. Well, virtio-scsi may just have few more
messages than virtio-net.

	--yliu

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [dpdk-dev] [PATCH] vhost: change the vhost library to a common framework which can support more VIRTIO devices
  2016-09-13 12:58 ` Yuanhan Liu
@ 2016-09-13 13:24   ` Thomas Monjalon
  2016-09-13 13:49     ` Yuanhan Liu
  0 siblings, 1 reply; 9+ messages in thread
From: Thomas Monjalon @ 2016-09-13 13:24 UTC (permalink / raw)
  To: Yuanhan Liu; +Cc: Changpeng Liu, dev, james.r.harris

2016-09-13 20:58, Yuanhan Liu:
> rte_virtio_net.h is the header file will be exported for applications.
> Every change there would mean either an API or ABI breakage. Thus, we
> should try to avoid touching it. Don't even to say you added yet another
> header file, rte_virtio_dev.h.
> 
> I confess that the rte_virtio_net.h filename isn't that right: it sticks
> to virtio-net so tightly. We may could rename it to rte_vhost.h, but I
> doubt it's worthwhile: as said, it breaks the API.
> 
> The fortunate thing about this file is that, the context is actually not
> sticking to virtio-net too much. I mean, all the APIs are using the "vid",
> which is just a number. Well, except the virtio_net_device_ops() structure,
> which also should be renamed to vhost_device_ops(). Besides that, the
> three ops, "new_device", "destroy_device" and "vring_state_changed", are
> actually not limited to virtio-net device.
> 
> That is to say, we could have two options here:
> 
> - rename the header file and the structure properly, to not limited to
>   virtio-net
> 
> - live with it, let it be a legacy issue, and document it at somewhere,
>   say, "due to history reason, that virtio-net is the first one supported
>   in DPDK, we kept the header filename as rte_virtio_net.h, but not ..."
> 
> I personally would prefer the later one, which saves us from breaking
> applications again. I don't have strong objection to the first one though.
> 
> Thomas, any comments?

I don't think keeping broken names for historical reasons is a good
long term maintenance.
It could be a FIXME comment that we would fix when we have other reasons
to break the API.
However, in this case, it is easy to keep the compatibility, I think,
by including rte_virtio.h in rte_virtio_net.h.
Note: renames can also generally be managed with symlinks.

I also don't really understand why this file name is rte_virtio_net.h and
not rte_vhost_net.h.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [dpdk-dev] [PATCH] vhost: change the vhost library to a common framework which can support more VIRTIO devices
  2016-09-13 13:24   ` Thomas Monjalon
@ 2016-09-13 13:49     ` Yuanhan Liu
  0 siblings, 0 replies; 9+ messages in thread
From: Yuanhan Liu @ 2016-09-13 13:49 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: Changpeng Liu, dev, james.r.harris

On Tue, Sep 13, 2016 at 03:24:53PM +0200, Thomas Monjalon wrote:
> 2016-09-13 20:58, Yuanhan Liu:
> > rte_virtio_net.h is the header file will be exported for applications.
> > Every change there would mean either an API or ABI breakage. Thus, we
> > should try to avoid touching it. Don't even to say you added yet another
> > header file, rte_virtio_dev.h.
> > 
> > I confess that the rte_virtio_net.h filename isn't that right: it sticks
> > to virtio-net so tightly. We may could rename it to rte_vhost.h, but I
> > doubt it's worthwhile: as said, it breaks the API.
> > 
> > The fortunate thing about this file is that, the context is actually not
> > sticking to virtio-net too much. I mean, all the APIs are using the "vid",
> > which is just a number. Well, except the virtio_net_device_ops() structure,
> > which also should be renamed to vhost_device_ops(). Besides that, the
> > three ops, "new_device", "destroy_device" and "vring_state_changed", are
> > actually not limited to virtio-net device.
> > 
> > That is to say, we could have two options here:
> > 
> > - rename the header file and the structure properly, to not limited to
> >   virtio-net
> > 
> > - live with it, let it be a legacy issue, and document it at somewhere,
> >   say, "due to history reason, that virtio-net is the first one supported
> >   in DPDK, we kept the header filename as rte_virtio_net.h, but not ..."
> > 
> > I personally would prefer the later one, which saves us from breaking
> > applications again. I don't have strong objection to the first one though.
> > 
> > Thomas, any comments?
> 
> I don't think keeping broken names for historical reasons is a good
> long term maintenance.

Good point.

> It could be a FIXME comment that we would fix when we have other reasons
> to break the API.
> However, in this case, it is easy to keep the compatibility, I think,
> by including rte_virtio.h in rte_virtio_net.h.

Nice trick!

> Note: renames can also generally be managed with symlinks.
> 
> I also don't really understand why this file name is rte_virtio_net.h and
> not rte_vhost_net.h.

No idea. It's just named so since the beginning. Also I missed this file
while doing the ABI/API refactoring, otherwise, it would be no pain at
this stage.

	--yliu

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/2] vhost: add vhost-scsi support to vhost library
  2016-09-15  0:28   ` [dpdk-dev] [PATCH v2 2/2] vhost: add vhost-scsi support to vhost library Changpeng Liu
@ 2016-09-14  3:28     ` Yuanhan Liu
  2016-09-14  4:46       ` Liu, Changpeng
  0 siblings, 1 reply; 9+ messages in thread
From: Yuanhan Liu @ 2016-09-14  3:28 UTC (permalink / raw)
  To: Changpeng Liu; +Cc: dev, james.r.harris

On Thu, Sep 15, 2016 at 08:28:18AM +0800, Changpeng Liu wrote:
> Since we changed the vhost library as a common framework to add other

As I said in my earlier email, I don't see how common it becomes after
your refactoring. __Another__ for example, I just saw a bunch of
duplicated code below that should not even be there (vhost-scsi.c).

Assuming we may add vhost-crypto in future, don't we have to duplicate
again in vhost-crypto.c in your way? The answer is obviously NO.

> +static void
> +cleanup_vq(struct vhost_virtqueue *vq, int destroy)
> +{
> +	if ((vq->callfd >= 0) && (destroy != 0))
> +		close(vq->callfd);
> +	if (vq->kickfd >= 0)
> +		close(vq->kickfd);
> +}
> +
> +/*
> + * Unmap any memory, close any file descriptors and
> + * free any memory owned by a device.
> + */
> +static void
> +cleanup_device(struct virtio_dev *device, int destroy)
> +{
> +	struct virtio_scsi *dev = get_scsi_device(device);
> +	uint32_t i;
> +
> +	dev->features = 0;
> +	dev->protocol_features = 0;
> +
> +	for (i = 0; i < dev->virt_q_nb; i++) {
> +		cleanup_vq(dev->virtqueue[i], destroy);
> +	}
> +}
> +
> +static void
> +init_vring_queue(struct vhost_virtqueue *vq)
> +{
> +	memset(vq, 0, sizeof(struct vhost_virtqueue));
> +
> +	vq->kickfd = VIRTIO_UNINITIALIZED_EVENTFD;
> +	vq->callfd = VIRTIO_UNINITIALIZED_EVENTFD;
> +
> +	/* Backends are set to -1 indicating an inactive device. */
> +	vq->backend = -1;
> +	vq->enabled = 1;
> +}
> +

[... snipped a bunch of duplicated code ...]

> +int
> +rte_vhost_scsi_pop_request(int vid, uint16_t queue_id,
> +	struct virtio_scsi_cmd_req **request, struct virtio_scsi_cmd_resp **response, struct iovec *iovs, int *iov_cnt, uint32_t *desc_idx, uint32_t *xfer_direction)

We definitely don't want to introduce a new such API for each vhost device.
The proposal I gave is something like rte_vhost_vring_dequeue_burst(),
which, as the name explains, just dequeues some vring entries and let the
application to consume it. The application then could be a virtio-scsi
device, virtio-crypto device, and even, a virtio-net device.

Few more generic comments:

- you touched way more code than necessary.

- you should split your patches into some small patches: one patch just
  does one tiny logic. Doing one bunch of stuff in one patch is really
  hard for review. For example, in patch 1, you did:

  * move bunch of code from here and there
  * besides that, you even modified the code you moved.
  * introduce virtio_dev_table
  * split virtio_net_dev and introduce virtio_dev
  * change some vhost user message handler, say VHOST_USER_GET_QUEUE_NUM.
  * ...

  That's way too much for a single patch!

  If you think some functions are not well placed, that you want to move
  them to somewhere else, fine, just move it. And if you want to modify
  few of them, that's fine, too. But you should make the changes in another
  patch. 

  This helps review, and what's more importantly, it helps us to locate
  buggy code if any. Just assume you introduced a bug in patch 1, it's
  so big a patch that it's hard for human to spot it. Later, someone
  reported that something is broken and he make a bisect and show this
  patch is the culprit. However, it's so big a patch, that even we know
  there is a bug there, it may take a lot of time to figure out which
  change breaks it.

  If you're splitting patches properly, the bug code could even be spotted
  in review time.

  That are some generic comments about making patches to introduce something
  big.


Besides, I'd like to state again, it seems you are heading the wrong
direction: again, you touched way too much code than necessary to add
vhost-scsi support. In a rough thinking, it could be simple as:

- handle vring queues correctly for vhost-scsi; currently, it sticks to
  virtio-net queue pairs.

- add vring operation functions, such as dequeue/enqueue vrings, update
  used rings, ...

- add vhost-scsi messages

- may need change they way to trigger new_device() callback for
  vhost-scsi device.

Above should be enough (I guess). And again, please make one patch for each
item.  Besides the 2nd item may introduce some code, others should be small
changes.

And, let us forget about the names so far, just reuse what we have. Say,
don't bother to introduce virtio_dev, just use virtio_net (well, I don't
object to make the change now, only if you can do it elegantly). Also, let's
stick to the rte_virtio_net.h as well: let's make it right later.

So far, just let us focus on what's need be done to make vhost-scsi work.
Okay to you guys?

	--yliu

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/2] vhost: add vhost-scsi support to vhost library
  2016-09-14  3:28     ` Yuanhan Liu
@ 2016-09-14  4:46       ` Liu, Changpeng
  2016-09-14  5:48         ` Yuanhan Liu
  0 siblings, 1 reply; 9+ messages in thread
From: Liu, Changpeng @ 2016-09-14  4:46 UTC (permalink / raw)
  To: Yuanhan Liu; +Cc: dev, Harris, James R



> -----Original Message-----
> From: Yuanhan Liu [mailto:yuanhan.liu@linux.intel.com]
> Sent: Wednesday, September 14, 2016 11:29 AM
> To: Liu, Changpeng <changpeng.liu@intel.com>
> Cc: dev@dpdk.org; Harris, James R <james.r.harris@intel.com>
> Subject: Re: [PATCH v2 2/2] vhost: add vhost-scsi support to vhost library
> 
> On Thu, Sep 15, 2016 at 08:28:18AM +0800, Changpeng Liu wrote:
> > Since we changed the vhost library as a common framework to add other
> 
> As I said in my earlier email, I don't see how common it becomes after
> your refactoring. __Another__ for example, I just saw a bunch of
> duplicated code below that should not even be there (vhost-scsi.c).
> 
> Assuming we may add vhost-crypto in future, don't we have to duplicate
> again in vhost-crypto.c in your way? The answer is obviously NO.
> 
> > +static void
> > +cleanup_vq(struct vhost_virtqueue *vq, int destroy)
> > +{
> > +	if ((vq->callfd >= 0) && (destroy != 0))
> > +		close(vq->callfd);
> > +	if (vq->kickfd >= 0)
> > +		close(vq->kickfd);
> > +}
> > +
> > +/*
> > + * Unmap any memory, close any file descriptors and
> > + * free any memory owned by a device.
> > + */
> > +static void
> > +cleanup_device(struct virtio_dev *device, int destroy)
> > +{
> > +	struct virtio_scsi *dev = get_scsi_device(device);
> > +	uint32_t i;
> > +
> > +	dev->features = 0;
> > +	dev->protocol_features = 0;
> > +
> > +	for (i = 0; i < dev->virt_q_nb; i++) {
> > +		cleanup_vq(dev->virtqueue[i], destroy);
> > +	}
> > +}
> > +
> > +static void
> > +init_vring_queue(struct vhost_virtqueue *vq)
> > +{
> > +	memset(vq, 0, sizeof(struct vhost_virtqueue));
> > +
> > +	vq->kickfd = VIRTIO_UNINITIALIZED_EVENTFD;
> > +	vq->callfd = VIRTIO_UNINITIALIZED_EVENTFD;
> > +
> > +	/* Backends are set to -1 indicating an inactive device. */
> > +	vq->backend = -1;
> > +	vq->enabled = 1;
> > +}
> > +

Agreed, cleanup_vq and init_vring_queue can be eliminated as duplicated code here.

> 
> [... snipped a bunch of duplicated code ...]
> 
> > +int
> > +rte_vhost_scsi_pop_request(int vid, uint16_t queue_id,
> > +	struct virtio_scsi_cmd_req **request, struct virtio_scsi_cmd_resp
> **response, struct iovec *iovs, int *iov_cnt, uint32_t *desc_idx, uint32_t
> *xfer_direction)
> 
> We definitely don't want to introduce a new such API for each vhost device.
> The proposal I gave is something like rte_vhost_vring_dequeue_burst(),
> which, as the name explains, just dequeues some vring entries and let the
> application to consume it. The application then could be a virtio-scsi
> device, virtio-crypto device, and even, a virtio-net device.
> 
> Few more generic comments:
> 
> - you touched way more code than necessary.
> 
> - you should split your patches into some small patches: one patch just
>   does one tiny logic. Doing one bunch of stuff in one patch is really
>   hard for review. For example, in patch 1, you did:
> 
>   * move bunch of code from here and there
>   * besides that, you even modified the code you moved.
>   * introduce virtio_dev_table
>   * split virtio_net_dev and introduce virtio_dev
>   * change some vhost user message handler, say
> VHOST_USER_GET_QUEUE_NUM.
>   * ...
> 
>   That's way too much for a single patch!

Agreed, the 2 patch set I sent as RFC purpose, I will break it into small patches at last. 

> 
>   If you think some functions are not well placed, that you want to move
>   them to somewhere else, fine, just move it. And if you want to modify
>   few of them, that's fine, too. But you should make the changes in another
>   patch.
> 
>   This helps review, and what's more importantly, it helps us to locate
>   buggy code if any. Just assume you introduced a bug in patch 1, it's
>   so big a patch that it's hard for human to spot it. Later, someone
>   reported that something is broken and he make a bisect and show this
>   patch is the culprit. However, it's so big a patch, that even we know
>   there is a bug there, it may take a lot of time to figure out which
>   change breaks it.
> 
>   If you're splitting patches properly, the bug code could even be spotted
>   in review time.
> 
>   That are some generic comments about making patches to introduce something
>   big.
> 
> 
> Besides, I'd like to state again, it seems you are heading the wrong
> direction: again, you touched way too much code than necessary to add
> vhost-scsi support. In a rough thinking, it could be simple as:
> 
> - handle vring queues correctly for vhost-scsi; currently, it sticks to
>   virtio-net queue pairs.
> 
> - add vring operation functions, such as dequeue/enqueue vrings, update
>   used rings, ...
> 
> - add vhost-scsi messages
> 
> - may need change they way to trigger new_device() callback for
>   vhost-scsi device.
> 
> Above should be enough (I guess). And again, please make one patch for each
> item.  Besides the 2nd item may introduce some code, others should be small
> changes.
> 
> And, let us forget about the names so far, just reuse what we have. Say,
> don't bother to introduce virtio_dev, just use virtio_net (well, I don't
> object to make the change now, only if you can do it elegantly). Also, let's
> stick to the rte_virtio_net.h as well: let's make it right later.
> 
> So far, just let us focus on what's need be done to make vhost-scsi work.
> Okay to you guys?

Cannot agree with this comments, as you already know that virtio_net and virtio_scsi
are different devices, why should add SCSI related logic into virtio_net file, just because
it's easy for code review?

> 
> 	--yliu

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/2] vhost: add vhost-scsi support to vhost library
  2016-09-14  4:46       ` Liu, Changpeng
@ 2016-09-14  5:48         ` Yuanhan Liu
  0 siblings, 0 replies; 9+ messages in thread
From: Yuanhan Liu @ 2016-09-14  5:48 UTC (permalink / raw)
  To: Liu, Changpeng; +Cc: dev, Harris, James R

On Wed, Sep 14, 2016 at 04:46:21AM +0000, Liu, Changpeng wrote:
> > Few more generic comments:
> > 
> > - you touched way more code than necessary.
> > 
> > - you should split your patches into some small patches: one patch just
> >   does one tiny logic. Doing one bunch of stuff in one patch is really
> >   hard for review. For example, in patch 1, you did:
> > 
> >   * move bunch of code from here and there
> >   * besides that, you even modified the code you moved.
> >   * introduce virtio_dev_table
> >   * split virtio_net_dev and introduce virtio_dev
> >   * change some vhost user message handler, say
> > VHOST_USER_GET_QUEUE_NUM.
> >   * ...
> > 
> >   That's way too much for a single patch!
> 
> Agreed, the 2 patch set I sent as RFC purpose, I will break it into small patches at last. 

If you want to let others to get your point easily, you should breat it
in the beginning, even for RFC.

> 
> > 
> >   If you think some functions are not well placed, that you want to move
> >   them to somewhere else, fine, just move it. And if you want to modify
> >   few of them, that's fine, too. But you should make the changes in another
> >   patch.
> > 
> >   This helps review, and what's more importantly, it helps us to locate
> >   buggy code if any. Just assume you introduced a bug in patch 1, it's
> >   so big a patch that it's hard for human to spot it. Later, someone
> >   reported that something is broken and he make a bisect and show this
> >   patch is the culprit. However, it's so big a patch, that even we know
> >   there is a bug there, it may take a lot of time to figure out which
> >   change breaks it.
> > 
> >   If you're splitting patches properly, the bug code could even be spotted
> >   in review time.
> > 
> >   That are some generic comments about making patches to introduce something
> >   big.
> > 
> > 
> > Besides, I'd like to state again, it seems you are heading the wrong
> > direction: again, you touched way too much code than necessary to add
> > vhost-scsi support. In a rough thinking, it could be simple as:
> > 
> > - handle vring queues correctly for vhost-scsi; currently, it sticks to
> >   virtio-net queue pairs.
> > 
> > - add vring operation functions, such as dequeue/enqueue vrings, update
> >   used rings, ...
> > 
> > - add vhost-scsi messages
> > 
> > - may need change they way to trigger new_device() callback for
> >   vhost-scsi device.
> > 
> > Above should be enough (I guess). And again, please make one patch for each
> > item.  Besides the 2nd item may introduce some code, others should be small
> > changes.
> > 
> > And, let us forget about the names so far, just reuse what we have. Say,
> > don't bother to introduce virtio_dev, just use virtio_net (well, I don't
> > object to make the change now, only if you can do it elegantly). Also, let's
> > stick to the rte_virtio_net.h as well: let's make it right later.
> > 
> > So far, just let us focus on what's need be done to make vhost-scsi work.
> > Okay to you guys?
> 
> Cannot agree with this comments, as you already know that virtio_net and virtio_scsi
> are different devices, why should add SCSI related logic into virtio_net file,

Not really, I'd think most of them are common. Looking at your implemention,
you just added "struct vhost_scsi_target scsi_target" for vhost-scsi device,
and changed virt_qp_nb to virt_q_nb. You may say, I hid few more fields
from virtio_net for vhost_scsi. Well, you are using 'union', I see no big
difference.

> just because
> it's easy for code review?

No, and I said, "I don't object to make the change now, only if you can
do it elegantly". And unfortunately, you were not heading that way.

	--yliu

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [dpdk-dev] [PATCH] vhost: change the vhost library to a common framework which can support more VIRTIO devices
@ 2016-09-14 12:15 Changpeng Liu
  2016-09-13 12:58 ` Yuanhan Liu
  2016-09-15  0:28 ` [dpdk-dev] [PATCH v2 1/2] " Changpeng Liu
  0 siblings, 2 replies; 9+ messages in thread
From: Changpeng Liu @ 2016-09-14 12:15 UTC (permalink / raw)
  To: dev; +Cc: yuanhan.liu, james.r.harris, changpeng.liu

For storage virtualization use cases, vhost-scsi becomes a more popular
solution to support VMs. However a user space vhost-scsi-user solution
does not exist currently. SPDK(Storage Performance Development Kit,
https://github.com/spdk/spdk) will provide a user space vhost-scsi target
to support multiple VMs through Qemu. Originally SPDK is built on top
of DPDK libraries, so we would like to use DPDK vhost library as the
communication channel between Qemu and vhost-scsi target application.

Currently DPDK vhost library can only support VIRTIO_ID_NET device type,
we would like to extend the library to support VIRTIO_ID_SCSI and
VIRTIO_ID_BLK. Most of DPDK vhost library can be reused only several
differences:
1. VIRTIO SCSI device has different vring queues compared with VIRTIO NET
device, at least 3 vring queues needed for SCSI device type;
2. VIRTIO SCSI will need several extra message operation code, such as
SCSI_SET_ENDPIONT/SCSI_CLEAR_ENDPOINT;

First, we would like to extend DPDK vhost library as a common framework
which be friendly to add other VIRTIO device types, to implement this feature,
we add a new data structure virtio_dev, which can deliver socket messages
to different VIRTIO devices, each specific VIRTIO device will register
callback to virtio_dev.

Secondly, we would to upstream a patch to Qemu community to add vhost-scsi
specific operation command such as SCSI_SET_ENDPOINT and SCSI_CLEAR_ENDOINT,
and user space feature bits.

Finally, after the Qemu patch set was merged, we will add VIRTIO_ID_SCSI
support to DPDK vhost library and an example vhost-scsi target which can
add a SCSI device to VM through this example application.

This patch set changed the vhost library as a common framework which
can add other VIRTIO device type in future.

Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
---
 lib/librte_vhost/Makefile         |   4 +-
 lib/librte_vhost/rte_virtio_dev.h | 140 ++++++++
 lib/librte_vhost/rte_virtio_net.h |  97 +-----
 lib/librte_vhost/socket.c         |   6 +-
 lib/librte_vhost/vhost.c          | 421 ------------------------
 lib/librte_vhost/vhost.h          | 288 -----------------
 lib/librte_vhost/vhost_device.h   | 230 +++++++++++++
 lib/librte_vhost/vhost_net.c      | 659 ++++++++++++++++++++++++++++++++++++++
 lib/librte_vhost/vhost_net.h      | 126 ++++++++
 lib/librte_vhost/vhost_user.c     | 451 +++++++++++++-------------
 lib/librte_vhost/vhost_user.h     |  17 +-
 lib/librte_vhost/virtio_net.c     |  37 ++-
 12 files changed, 1426 insertions(+), 1050 deletions(-)
 create mode 100644 lib/librte_vhost/rte_virtio_dev.h
 delete mode 100644 lib/librte_vhost/vhost.c
 delete mode 100644 lib/librte_vhost/vhost.h
 create mode 100644 lib/librte_vhost/vhost_device.h
 create mode 100644 lib/librte_vhost/vhost_net.c
 create mode 100644 lib/librte_vhost/vhost_net.h

diff --git a/lib/librte_vhost/Makefile b/lib/librte_vhost/Makefile
index 415ffc6..af30491 100644
--- a/lib/librte_vhost/Makefile
+++ b/lib/librte_vhost/Makefile
@@ -47,11 +47,11 @@ LDLIBS += -lnuma
 endif
 
 # all source are stored in SRCS-y
-SRCS-$(CONFIG_RTE_LIBRTE_VHOST) := fd_man.c socket.c vhost.c vhost_user.c \
+SRCS-$(CONFIG_RTE_LIBRTE_VHOST) := fd_man.c socket.c vhost_net.c vhost_user.c \
 				   virtio_net.c
 
 # install includes
-SYMLINK-$(CONFIG_RTE_LIBRTE_VHOST)-include += rte_virtio_net.h
+SYMLINK-$(CONFIG_RTE_LIBRTE_VHOST)-include += rte_virtio_net.h rte_virtio_dev.h
 
 # dependencies
 DEPDIRS-$(CONFIG_RTE_LIBRTE_VHOST) += lib/librte_eal
diff --git a/lib/librte_vhost/rte_virtio_dev.h b/lib/librte_vhost/rte_virtio_dev.h
new file mode 100644
index 0000000..e3c857a
--- /dev/null
+++ b/lib/librte_vhost/rte_virtio_dev.h
@@ -0,0 +1,140 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _VIRTIO_DEV_H_
+#define _VIRTIO_DEV_H_
+
+/* Device types and capabilities flag */
+#define RTE_VHOST_USER_CLIENT		(1ULL << 0)
+#define RTE_VHOST_USER_NO_RECONNECT	(1ULL << 1)
+#define RTE_VHOST_USER_TX_ZERO_COPY	(1ULL << 2)
+
+#define RTE_VHOST_USER_DEV_NET		(1ULL << 32)
+
+/**
+ * Device and vring operations.
+ *
+ * Make sure to set VIRTIO_DEV_RUNNING to the device flags in new_device and
+ * remove it in destroy_device.
+ *
+ */
+struct virtio_net_device_ops {
+	int (*new_device)(int vid);		/**< Add device. */
+	void (*destroy_device)(int vid);	/**< Remove device. */
+
+	int (*vring_state_changed)(int vid, uint16_t queue_id, int enable);	/**< triggered when a vring is enabled or disabled */
+
+	void *reserved[5]; /**< Reserved for future extension */
+};
+
+/**
+ *  Disable features in feature_mask. Returns 0 on success.
+ */
+int rte_vhost_feature_disable(uint64_t feature_mask);
+
+/**
+ *  Enable features in feature_mask. Returns 0 on success.
+ */
+int rte_vhost_feature_enable(uint64_t feature_mask);
+
+/* Returns currently supported vhost features */
+uint64_t rte_vhost_feature_get(void);
+
+int rte_vhost_enable_guest_notification(int vid, uint16_t queue_id, int enable);
+
+/**
+ * Register vhost driver. path could be different for multiple
+ * instance support.
+ */
+int rte_vhost_driver_register(const char *path, uint64_t flags);
+
+/* Unregister vhost driver. This is only meaningful to vhost user. */
+int rte_vhost_driver_unregister(const char *path);
+
+/* Start vhost driver session blocking loop. */
+int rte_vhost_driver_session_start(void);
+
+/**
+ * Get the numa node from which the virtio net device's memory
+ * is allocated.
+ *
+ * @param vid
+ *  virtio-net device ID
+ *
+ * @return
+ *  The numa node, -1 on failure
+ */
+int rte_vhost_get_numa_node(int vid);
+
+/**
+ * Get the number of queues the device supports.
+ *
+ * @param vid
+ *  virtio-net device ID
+ *
+ * @return
+ *  The number of queues, 0 on failure
+ */
+uint32_t rte_vhost_get_queue_num(int vid);
+
+/**
+ * Get how many avail entries are left in the queue
+ *
+ * @param vid
+ *  virtio-net device ID
+ * @param queue_id
+ *  virtio queue index
+ *
+ * @return
+ *  num of avail entires left
+ */
+uint16_t rte_vhost_avail_entries(int vid, uint16_t queue_id);
+
+/**
+ * Get the virtio net device's ifname. For vhost-cuse, ifname is the
+ * path of the char device. For vhost-user, ifname is the vhost-user
+ * socket file path.
+ *
+ * @param vid
+ *  virtio-net device ID
+ * @param buf
+ *  The buffer to stored the queried ifname
+ * @param len
+ *  The length of buf
+ *
+ * @return
+ *  0 on success, -1 on failure
+ */
+int rte_vhost_get_ifname(int vid, char *buf, size_t len);
+
+#endif /* _VIRTIO_DEV_H_ */
\ No newline at end of file
diff --git a/lib/librte_vhost/rte_virtio_net.h b/lib/librte_vhost/rte_virtio_net.h
index 3ddc9ca..86ede8a 100644
--- a/lib/librte_vhost/rte_virtio_net.h
+++ b/lib/librte_vhost/rte_virtio_net.h
@@ -50,107 +50,14 @@
 #include <rte_memory.h>
 #include <rte_mempool.h>
 #include <rte_ether.h>
-
-#define RTE_VHOST_USER_CLIENT		(1ULL << 0)
-#define RTE_VHOST_USER_NO_RECONNECT	(1ULL << 1)
-#define RTE_VHOST_USER_TX_ZERO_COPY	(1ULL << 2)
+#include <rte_virtio_dev.h>
 
 /* Enum for virtqueue management. */
 enum {VIRTIO_RXQ, VIRTIO_TXQ, VIRTIO_QNUM};
 
-/**
- * Device and vring operations.
- */
-struct virtio_net_device_ops {
-	int (*new_device)(int vid);		/**< Add device. */
-	void (*destroy_device)(int vid);	/**< Remove device. */
-
-	int (*vring_state_changed)(int vid, uint16_t queue_id, int enable);	/**< triggered when a vring is enabled or disabled */
-
-	void *reserved[5]; /**< Reserved for future extension */
-};
-
-/**
- *  Disable features in feature_mask. Returns 0 on success.
- */
-int rte_vhost_feature_disable(uint64_t feature_mask);
-
-/**
- *  Enable features in feature_mask. Returns 0 on success.
- */
-int rte_vhost_feature_enable(uint64_t feature_mask);
-
-/* Returns currently supported vhost features */
-uint64_t rte_vhost_feature_get(void);
-
-int rte_vhost_enable_guest_notification(int vid, uint16_t queue_id, int enable);
-
-/**
- * Register vhost driver. path could be different for multiple
- * instance support.
- */
-int rte_vhost_driver_register(const char *path, uint64_t flags);
-
-/* Unregister vhost driver. This is only meaningful to vhost user. */
-int rte_vhost_driver_unregister(const char *path);
-
 /* Register callbacks. */
 int rte_vhost_driver_callback_register(struct virtio_net_device_ops const * const);
-/* Start vhost driver session blocking loop. */
-int rte_vhost_driver_session_start(void);
-
-/**
- * Get the numa node from which the virtio net device's memory
- * is allocated.
- *
- * @param vid
- *  virtio-net device ID
- *
- * @return
- *  The numa node, -1 on failure
- */
-int rte_vhost_get_numa_node(int vid);
 
-/**
- * Get the number of queues the device supports.
- *
- * @param vid
- *  virtio-net device ID
- *
- * @return
- *  The number of queues, 0 on failure
- */
-uint32_t rte_vhost_get_queue_num(int vid);
-
-/**
- * Get the virtio net device's ifname. For vhost-cuse, ifname is the
- * path of the char device. For vhost-user, ifname is the vhost-user
- * socket file path.
- *
- * @param vid
- *  virtio-net device ID
- * @param buf
- *  The buffer to stored the queried ifname
- * @param len
- *  The length of buf
- *
- * @return
- *  0 on success, -1 on failure
- */
-int rte_vhost_get_ifname(int vid, char *buf, size_t len);
-
-/**
- * Get how many avail entries are left in the queue
- *
- * @param vid
- *  virtio-net device ID
- * @param queue_id
- *  virtio queue index
- *
- * @return
- *  num of avail entires left
- */
-uint16_t rte_vhost_avail_entries(int vid, uint16_t queue_id);
 
 /**
  * This function adds buffers to the virtio devices RX virtqueue. Buffers can
@@ -191,4 +98,4 @@ uint16_t rte_vhost_enqueue_burst(int vid, uint16_t queue_id,
 uint16_t rte_vhost_dequeue_burst(int vid, uint16_t queue_id,
 	struct rte_mempool *mbuf_pool, struct rte_mbuf **pkts, uint16_t count);
 
-#endif /* _VIRTIO_NET_H_ */
+#endif /* _VIRTIO_NET_H_ */
\ No newline at end of file
diff --git a/lib/librte_vhost/socket.c b/lib/librte_vhost/socket.c
index 5c3962d..1474c98 100644
--- a/lib/librte_vhost/socket.c
+++ b/lib/librte_vhost/socket.c
@@ -49,7 +49,7 @@
 #include <rte_log.h>
 
 #include "fd_man.h"
-#include "vhost.h"
+#include "vhost_device.h"
 #include "vhost_user.h"
 
 /*
@@ -62,6 +62,7 @@ struct vhost_user_socket {
 	int connfd;
 	bool is_server;
 	bool reconnect;
+	int type;
 	bool tx_zero_copy;
 };
 
@@ -194,7 +195,7 @@ vhost_user_add_connection(int fd, struct vhost_user_socket *vsocket)
 		return;
 	}
 
-	vid = vhost_new_device();
+	vid = vhost_new_device(vsocket->type);
 	if (vid == -1) {
 		close(fd);
 		free(conn);
@@ -525,6 +526,7 @@ rte_vhost_driver_register(const char *path, uint64_t flags)
 		goto out;
 	}
 
+	vsocket->type = VIRTIO_ID_NET;
 	vhost_user.vsockets[vhost_user.vsocket_cnt++] = vsocket;
 
 out:
diff --git a/lib/librte_vhost/vhost.c b/lib/librte_vhost/vhost.c
deleted file mode 100644
index 5461e5b..0000000
--- a/lib/librte_vhost/vhost.c
+++ /dev/null
@@ -1,421 +0,0 @@
-/*-
- *   BSD LICENSE
- *
- *   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
- *   All rights reserved.
- *
- *   Redistribution and use in source and binary forms, with or without
- *   modification, are permitted provided that the following conditions
- *   are met:
- *
- *     * Redistributions of source code must retain the above copyright
- *       notice, this list of conditions and the following disclaimer.
- *     * Redistributions in binary form must reproduce the above copyright
- *       notice, this list of conditions and the following disclaimer in
- *       the documentation and/or other materials provided with the
- *       distribution.
- *     * Neither the name of Intel Corporation nor the names of its
- *       contributors may be used to endorse or promote products derived
- *       from this software without specific prior written permission.
- *
- *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
- *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
- *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
- *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
- *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
- *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
- *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
- *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
- *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
- *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- */
-
-#include <linux/vhost.h>
-#include <linux/virtio_net.h>
-#include <stddef.h>
-#include <stdint.h>
-#include <stdlib.h>
-#ifdef RTE_LIBRTE_VHOST_NUMA
-#include <numaif.h>
-#endif
-
-#include <rte_ethdev.h>
-#include <rte_log.h>
-#include <rte_string_fns.h>
-#include <rte_memory.h>
-#include <rte_malloc.h>
-#include <rte_virtio_net.h>
-
-#include "vhost.h"
-
-#define VHOST_USER_F_PROTOCOL_FEATURES	30
-
-/* Features supported by this lib. */
-#define VHOST_SUPPORTED_FEATURES ((1ULL << VIRTIO_NET_F_MRG_RXBUF) | \
-				(1ULL << VIRTIO_NET_F_CTRL_VQ) | \
-				(1ULL << VIRTIO_NET_F_CTRL_RX) | \
-				(1ULL << VIRTIO_NET_F_GUEST_ANNOUNCE) | \
-				(VHOST_SUPPORTS_MQ)            | \
-				(1ULL << VIRTIO_F_VERSION_1)   | \
-				(1ULL << VHOST_F_LOG_ALL)      | \
-				(1ULL << VHOST_USER_F_PROTOCOL_FEATURES) | \
-				(1ULL << VIRTIO_NET_F_HOST_TSO4) | \
-				(1ULL << VIRTIO_NET_F_HOST_TSO6) | \
-				(1ULL << VIRTIO_NET_F_CSUM)    | \
-				(1ULL << VIRTIO_NET_F_GUEST_CSUM) | \
-				(1ULL << VIRTIO_NET_F_GUEST_TSO4) | \
-				(1ULL << VIRTIO_NET_F_GUEST_TSO6))
-
-uint64_t VHOST_FEATURES = VHOST_SUPPORTED_FEATURES;
-
-struct virtio_net *vhost_devices[MAX_VHOST_DEVICE];
-
-/* device ops to add/remove device to/from data core. */
-struct virtio_net_device_ops const *notify_ops;
-
-struct virtio_net *
-get_device(int vid)
-{
-	struct virtio_net *dev = vhost_devices[vid];
-
-	if (unlikely(!dev)) {
-		RTE_LOG(ERR, VHOST_CONFIG,
-			"(%d) device not found.\n", vid);
-	}
-
-	return dev;
-}
-
-static void
-cleanup_vq(struct vhost_virtqueue *vq, int destroy)
-{
-	if ((vq->callfd >= 0) && (destroy != 0))
-		close(vq->callfd);
-	if (vq->kickfd >= 0)
-		close(vq->kickfd);
-}
-
-/*
- * Unmap any memory, close any file descriptors and
- * free any memory owned by a device.
- */
-void
-cleanup_device(struct virtio_net *dev, int destroy)
-{
-	uint32_t i;
-
-	vhost_backend_cleanup(dev);
-
-	for (i = 0; i < dev->virt_qp_nb; i++) {
-		cleanup_vq(dev->virtqueue[i * VIRTIO_QNUM + VIRTIO_RXQ], destroy);
-		cleanup_vq(dev->virtqueue[i * VIRTIO_QNUM + VIRTIO_TXQ], destroy);
-	}
-}
-
-/*
- * Release virtqueues and device memory.
- */
-static void
-free_device(struct virtio_net *dev)
-{
-	uint32_t i;
-
-	for (i = 0; i < dev->virt_qp_nb; i++)
-		rte_free(dev->virtqueue[i * VIRTIO_QNUM]);
-
-	rte_free(dev);
-}
-
-static void
-init_vring_queue(struct vhost_virtqueue *vq, int qp_idx)
-{
-	memset(vq, 0, sizeof(struct vhost_virtqueue));
-
-	vq->kickfd = VIRTIO_UNINITIALIZED_EVENTFD;
-	vq->callfd = VIRTIO_UNINITIALIZED_EVENTFD;
-
-	/* Backends are set to -1 indicating an inactive device. */
-	vq->backend = -1;
-
-	/* always set the default vq pair to enabled */
-	if (qp_idx == 0)
-		vq->enabled = 1;
-
-	TAILQ_INIT(&vq->zmbuf_list);
-}
-
-static void
-init_vring_queue_pair(struct virtio_net *dev, uint32_t qp_idx)
-{
-	uint32_t base_idx = qp_idx * VIRTIO_QNUM;
-
-	init_vring_queue(dev->virtqueue[base_idx + VIRTIO_RXQ], qp_idx);
-	init_vring_queue(dev->virtqueue[base_idx + VIRTIO_TXQ], qp_idx);
-}
-
-static void
-reset_vring_queue(struct vhost_virtqueue *vq, int qp_idx)
-{
-	int callfd;
-
-	callfd = vq->callfd;
-	init_vring_queue(vq, qp_idx);
-	vq->callfd = callfd;
-}
-
-static void
-reset_vring_queue_pair(struct virtio_net *dev, uint32_t qp_idx)
-{
-	uint32_t base_idx = qp_idx * VIRTIO_QNUM;
-
-	reset_vring_queue(dev->virtqueue[base_idx + VIRTIO_RXQ], qp_idx);
-	reset_vring_queue(dev->virtqueue[base_idx + VIRTIO_TXQ], qp_idx);
-}
-
-int
-alloc_vring_queue_pair(struct virtio_net *dev, uint32_t qp_idx)
-{
-	struct vhost_virtqueue *virtqueue = NULL;
-	uint32_t virt_rx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_RXQ;
-	uint32_t virt_tx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_TXQ;
-
-	virtqueue = rte_malloc(NULL,
-			       sizeof(struct vhost_virtqueue) * VIRTIO_QNUM, 0);
-	if (virtqueue == NULL) {
-		RTE_LOG(ERR, VHOST_CONFIG,
-			"Failed to allocate memory for virt qp:%d.\n", qp_idx);
-		return -1;
-	}
-
-	dev->virtqueue[virt_rx_q_idx] = virtqueue;
-	dev->virtqueue[virt_tx_q_idx] = virtqueue + VIRTIO_TXQ;
-
-	init_vring_queue_pair(dev, qp_idx);
-
-	dev->virt_qp_nb += 1;
-
-	return 0;
-}
-
-/*
- * Reset some variables in device structure, while keeping few
- * others untouched, such as vid, ifname, virt_qp_nb: they
- * should be same unless the device is removed.
- */
-void
-reset_device(struct virtio_net *dev)
-{
-	uint32_t i;
-
-	dev->features = 0;
-	dev->protocol_features = 0;
-	dev->flags = 0;
-
-	for (i = 0; i < dev->virt_qp_nb; i++)
-		reset_vring_queue_pair(dev, i);
-}
-
-/*
- * Function is called from the CUSE open function. The device structure is
- * initialised and a new entry is added to the device configuration linked
- * list.
- */
-int
-vhost_new_device(void)
-{
-	struct virtio_net *dev;
-	int i;
-
-	dev = rte_zmalloc(NULL, sizeof(struct virtio_net), 0);
-	if (dev == NULL) {
-		RTE_LOG(ERR, VHOST_CONFIG,
-			"Failed to allocate memory for new dev.\n");
-		return -1;
-	}
-
-	for (i = 0; i < MAX_VHOST_DEVICE; i++) {
-		if (vhost_devices[i] == NULL)
-			break;
-	}
-	if (i == MAX_VHOST_DEVICE) {
-		RTE_LOG(ERR, VHOST_CONFIG,
-			"Failed to find a free slot for new device.\n");
-		return -1;
-	}
-
-	vhost_devices[i] = dev;
-	dev->vid = i;
-
-	return i;
-}
-
-/*
- * Function is called from the CUSE release function. This function will
- * cleanup the device and remove it from device configuration linked list.
- */
-void
-vhost_destroy_device(int vid)
-{
-	struct virtio_net *dev = get_device(vid);
-
-	if (dev == NULL)
-		return;
-
-	if (dev->flags & VIRTIO_DEV_RUNNING) {
-		dev->flags &= ~VIRTIO_DEV_RUNNING;
-		notify_ops->destroy_device(vid);
-	}
-
-	cleanup_device(dev, 1);
-	free_device(dev);
-
-	vhost_devices[vid] = NULL;
-}
-
-void
-vhost_set_ifname(int vid, const char *if_name, unsigned int if_len)
-{
-	struct virtio_net *dev;
-	unsigned int len;
-
-	dev = get_device(vid);
-	if (dev == NULL)
-		return;
-
-	len = if_len > sizeof(dev->ifname) ?
-		sizeof(dev->ifname) : if_len;
-
-	strncpy(dev->ifname, if_name, len);
-	dev->ifname[sizeof(dev->ifname) - 1] = '\0';
-}
-
-void
-vhost_enable_tx_zero_copy(int vid)
-{
-	struct virtio_net *dev = get_device(vid);
-
-	if (dev == NULL)
-		return;
-
-	dev->tx_zero_copy = 1;
-}
-
-int
-rte_vhost_get_numa_node(int vid)
-{
-#ifdef RTE_LIBRTE_VHOST_NUMA
-	struct virtio_net *dev = get_device(vid);
-	int numa_node;
-	int ret;
-
-	if (dev == NULL)
-		return -1;
-
-	ret = get_mempolicy(&numa_node, NULL, 0, dev,
-			    MPOL_F_NODE | MPOL_F_ADDR);
-	if (ret < 0) {
-		RTE_LOG(ERR, VHOST_CONFIG,
-			"(%d) failed to query numa node: %d\n", vid, ret);
-		return -1;
-	}
-
-	return numa_node;
-#else
-	RTE_SET_USED(vid);
-	return -1;
-#endif
-}
-
-uint32_t
-rte_vhost_get_queue_num(int vid)
-{
-	struct virtio_net *dev = get_device(vid);
-
-	if (dev == NULL)
-		return 0;
-
-	return dev->virt_qp_nb;
-}
-
-int
-rte_vhost_get_ifname(int vid, char *buf, size_t len)
-{
-	struct virtio_net *dev = get_device(vid);
-
-	if (dev == NULL)
-		return -1;
-
-	len = RTE_MIN(len, sizeof(dev->ifname));
-
-	strncpy(buf, dev->ifname, len);
-	buf[len - 1] = '\0';
-
-	return 0;
-}
-
-uint16_t
-rte_vhost_avail_entries(int vid, uint16_t queue_id)
-{
-	struct virtio_net *dev;
-	struct vhost_virtqueue *vq;
-
-	dev = get_device(vid);
-	if (!dev)
-		return 0;
-
-	vq = dev->virtqueue[queue_id];
-	if (!vq->enabled)
-		return 0;
-
-	return *(volatile uint16_t *)&vq->avail->idx - vq->last_used_idx;
-}
-
-int
-rte_vhost_enable_guest_notification(int vid, uint16_t queue_id, int enable)
-{
-	struct virtio_net *dev = get_device(vid);
-
-	if (dev == NULL)
-		return -1;
-
-	if (enable) {
-		RTE_LOG(ERR, VHOST_CONFIG,
-			"guest notification isn't supported.\n");
-		return -1;
-	}
-
-	dev->virtqueue[queue_id]->used->flags = VRING_USED_F_NO_NOTIFY;
-	return 0;
-}
-
-uint64_t rte_vhost_feature_get(void)
-{
-	return VHOST_FEATURES;
-}
-
-int rte_vhost_feature_disable(uint64_t feature_mask)
-{
-	VHOST_FEATURES = VHOST_FEATURES & ~feature_mask;
-	return 0;
-}
-
-int rte_vhost_feature_enable(uint64_t feature_mask)
-{
-	if ((feature_mask & VHOST_SUPPORTED_FEATURES) == feature_mask) {
-		VHOST_FEATURES = VHOST_FEATURES | feature_mask;
-		return 0;
-	}
-	return -1;
-}
-
-/*
- * Register ops so that we can add/remove device to data core.
- */
-int
-rte_vhost_driver_callback_register(struct virtio_net_device_ops const * const ops)
-{
-	notify_ops = ops;
-
-	return 0;
-}
diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h
deleted file mode 100644
index 7e4a15e..0000000
--- a/lib/librte_vhost/vhost.h
+++ /dev/null
@@ -1,288 +0,0 @@
-/*-
- *   BSD LICENSE
- *
- *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
- *   All rights reserved.
- *
- *   Redistribution and use in source and binary forms, with or without
- *   modification, are permitted provided that the following conditions
- *   are met:
- *
- *     * Redistributions of source code must retain the above copyright
- *       notice, this list of conditions and the following disclaimer.
- *     * Redistributions in binary form must reproduce the above copyright
- *       notice, this list of conditions and the following disclaimer in
- *       the documentation and/or other materials provided with the
- *       distribution.
- *     * Neither the name of Intel Corporation nor the names of its
- *       contributors may be used to endorse or promote products derived
- *       from this software without specific prior written permission.
- *
- *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
- *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
- *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
- *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
- *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
- *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
- *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
- *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
- *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
- *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- */
-
-#ifndef _VHOST_NET_CDEV_H_
-#define _VHOST_NET_CDEV_H_
-#include <stdint.h>
-#include <stdio.h>
-#include <sys/types.h>
-#include <sys/queue.h>
-#include <unistd.h>
-#include <linux/vhost.h>
-
-#include <rte_log.h>
-
-#include "rte_virtio_net.h"
-
-/* Used to indicate that the device is running on a data core */
-#define VIRTIO_DEV_RUNNING 1
-
-/* Backend value set by guest. */
-#define VIRTIO_DEV_STOPPED -1
-
-#define BUF_VECTOR_MAX 256
-
-/**
- * Structure contains buffer address, length and descriptor index
- * from vring to do scatter RX.
- */
-struct buf_vector {
-	uint64_t buf_addr;
-	uint32_t buf_len;
-	uint32_t desc_idx;
-};
-
-/*
- * A structure to hold some fields needed in zero copy code path,
- * mainly for associating an mbuf with the right desc_idx.
- */
-struct zcopy_mbuf {
-	struct rte_mbuf *mbuf;
-	uint32_t desc_idx;
-	uint16_t in_use;
-
-	TAILQ_ENTRY(zcopy_mbuf) next;
-};
-TAILQ_HEAD(zcopy_mbuf_list, zcopy_mbuf);
-
-/**
- * Structure contains variables relevant to RX/TX virtqueues.
- */
-struct vhost_virtqueue {
-	struct vring_desc	*desc;
-	struct vring_avail	*avail;
-	struct vring_used	*used;
-	uint32_t		size;
-
-	uint16_t		last_avail_idx;
-	volatile uint16_t	last_used_idx;
-#define VIRTIO_INVALID_EVENTFD		(-1)
-#define VIRTIO_UNINITIALIZED_EVENTFD	(-2)
-
-	/* Backend value to determine if device should started/stopped */
-	int			backend;
-	/* Used to notify the guest (trigger interrupt) */
-	int			callfd;
-	/* Currently unused as polling mode is enabled */
-	int			kickfd;
-	int			enabled;
-
-	/* Physical address of used ring, for logging */
-	uint64_t		log_guest_addr;
-
-	uint16_t		nr_zmbuf;
-	uint16_t		zmbuf_size;
-	uint16_t		last_zmbuf_idx;
-	struct zcopy_mbuf	*zmbufs;
-	struct zcopy_mbuf_list	zmbuf_list;
-} __rte_cache_aligned;
-
-/* Old kernels have no such macro defined */
-#ifndef VIRTIO_NET_F_GUEST_ANNOUNCE
- #define VIRTIO_NET_F_GUEST_ANNOUNCE 21
-#endif
-
-
-/*
- * Make an extra wrapper for VIRTIO_NET_F_MQ and
- * VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MAX as they are
- * introduced since kernel v3.8. This makes our
- * code buildable for older kernel.
- */
-#ifdef VIRTIO_NET_F_MQ
- #define VHOST_MAX_QUEUE_PAIRS	VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MAX
- #define VHOST_SUPPORTS_MQ	(1ULL << VIRTIO_NET_F_MQ)
-#else
- #define VHOST_MAX_QUEUE_PAIRS	1
- #define VHOST_SUPPORTS_MQ	0
-#endif
-
-/*
- * Define virtio 1.0 for older kernels
- */
-#ifndef VIRTIO_F_VERSION_1
- #define VIRTIO_F_VERSION_1 32
-#endif
-
-struct guest_page {
-	uint64_t guest_phys_addr;
-	uint64_t host_phys_addr;
-	uint64_t size;
-};
-
-/**
- * Device structure contains all configuration information relating
- * to the device.
- */
-struct virtio_net {
-	/* Frontend (QEMU) memory and memory region information */
-	struct virtio_memory	*mem;
-	uint64_t		features;
-	uint64_t		protocol_features;
-	int			vid;
-	uint32_t		flags;
-	uint16_t		vhost_hlen;
-	/* to tell if we need broadcast rarp packet */
-	rte_atomic16_t		broadcast_rarp;
-	uint32_t		virt_qp_nb;
-	int			tx_zero_copy;
-	struct vhost_virtqueue	*virtqueue[VHOST_MAX_QUEUE_PAIRS * 2];
-#define IF_NAME_SZ (PATH_MAX > IFNAMSIZ ? PATH_MAX : IFNAMSIZ)
-	char			ifname[IF_NAME_SZ];
-	uint64_t		log_size;
-	uint64_t		log_base;
-	uint64_t		log_addr;
-	struct ether_addr	mac;
-
-	uint32_t		nr_guest_pages;
-	uint32_t		max_guest_pages;
-	struct guest_page       *guest_pages;
-} __rte_cache_aligned;
-
-/**
- * Information relating to memory regions including offsets to
- * addresses in QEMUs memory file.
- */
-struct virtio_memory_region {
-	uint64_t guest_phys_addr;
-	uint64_t guest_user_addr;
-	uint64_t host_user_addr;
-	uint64_t size;
-	void	 *mmap_addr;
-	uint64_t mmap_size;
-	int fd;
-};
-
-
-/**
- * Memory structure includes region and mapping information.
- */
-struct virtio_memory {
-	uint32_t nregions;
-	struct virtio_memory_region regions[0];
-};
-
-
-/* Macros for printing using RTE_LOG */
-#define RTE_LOGTYPE_VHOST_CONFIG RTE_LOGTYPE_USER1
-#define RTE_LOGTYPE_VHOST_DATA   RTE_LOGTYPE_USER1
-
-#ifdef RTE_LIBRTE_VHOST_DEBUG
-#define VHOST_MAX_PRINT_BUFF 6072
-#define LOG_LEVEL RTE_LOG_DEBUG
-#define LOG_DEBUG(log_type, fmt, args...) RTE_LOG(DEBUG, log_type, fmt, ##args)
-#define PRINT_PACKET(device, addr, size, header) do { \
-	char *pkt_addr = (char *)(addr); \
-	unsigned int index; \
-	char packet[VHOST_MAX_PRINT_BUFF]; \
-	\
-	if ((header)) \
-		snprintf(packet, VHOST_MAX_PRINT_BUFF, "(%d) Header size %d: ", (device->vid), (size)); \
-	else \
-		snprintf(packet, VHOST_MAX_PRINT_BUFF, "(%d) Packet size %d: ", (device->vid), (size)); \
-	for (index = 0; index < (size); index++) { \
-		snprintf(packet + strnlen(packet, VHOST_MAX_PRINT_BUFF), VHOST_MAX_PRINT_BUFF - strnlen(packet, VHOST_MAX_PRINT_BUFF), \
-			"%02hhx ", pkt_addr[index]); \
-	} \
-	snprintf(packet + strnlen(packet, VHOST_MAX_PRINT_BUFF), VHOST_MAX_PRINT_BUFF - strnlen(packet, VHOST_MAX_PRINT_BUFF), "\n"); \
-	\
-	LOG_DEBUG(VHOST_DATA, "%s", packet); \
-} while (0)
-#else
-#define LOG_LEVEL RTE_LOG_INFO
-#define LOG_DEBUG(log_type, fmt, args...) do {} while (0)
-#define PRINT_PACKET(device, addr, size, header) do {} while (0)
-#endif
-
-extern uint64_t VHOST_FEATURES;
-#define MAX_VHOST_DEVICE	1024
-extern struct virtio_net *vhost_devices[MAX_VHOST_DEVICE];
-
-/* Convert guest physical Address to host virtual address */
-static inline uint64_t __attribute__((always_inline))
-gpa_to_vva(struct virtio_net *dev, uint64_t gpa)
-{
-	struct virtio_memory_region *reg;
-	uint32_t i;
-
-	for (i = 0; i < dev->mem->nregions; i++) {
-		reg = &dev->mem->regions[i];
-		if (gpa >= reg->guest_phys_addr &&
-		    gpa <  reg->guest_phys_addr + reg->size) {
-			return gpa - reg->guest_phys_addr +
-			       reg->host_user_addr;
-		}
-	}
-
-	return 0;
-}
-
-/* Convert guest physical address to host physical address */
-static inline phys_addr_t __attribute__((always_inline))
-gpa_to_hpa(struct virtio_net *dev, uint64_t gpa, uint64_t size)
-{
-	uint32_t i;
-	struct guest_page *page;
-
-	for (i = 0; i < dev->nr_guest_pages; i++) {
-		page = &dev->guest_pages[i];
-
-		if (gpa >= page->guest_phys_addr &&
-		    gpa + size < page->guest_phys_addr + page->size) {
-			return gpa - page->guest_phys_addr +
-			       page->host_phys_addr;
-		}
-	}
-
-	return 0;
-}
-
-struct virtio_net_device_ops const *notify_ops;
-struct virtio_net *get_device(int vid);
-
-int vhost_new_device(void);
-void cleanup_device(struct virtio_net *dev, int destroy);
-void reset_device(struct virtio_net *dev);
-void vhost_destroy_device(int);
-
-int alloc_vring_queue_pair(struct virtio_net *dev, uint32_t qp_idx);
-
-void vhost_set_ifname(int, const char *if_name, unsigned int if_len);
-void vhost_enable_tx_zero_copy(int vid);
-
-/*
- * Backend-specific cleanup. Defined by vhost-cuse and vhost-user.
- */
-void vhost_backend_cleanup(struct virtio_net *dev);
-
-#endif /* _VHOST_NET_CDEV_H_ */
diff --git a/lib/librte_vhost/vhost_device.h b/lib/librte_vhost/vhost_device.h
new file mode 100644
index 0000000..7101bb0
--- /dev/null
+++ b/lib/librte_vhost/vhost_device.h
@@ -0,0 +1,230 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _VHOST_DEVICE_H_
+#define _VHOST_DEVICE_H_
+
+#include <linux/virtio_ids.h>
+
+#include "vhost_net.h"
+#include "vhost_user.h"
+
+/* Used to indicate that the device is running on a data core */
+#define VIRTIO_DEV_RUNNING 1
+
+/* Backend value set by guest. */
+#define VIRTIO_DEV_STOPPED -1
+
+/**
+ * Structure contains variables relevant to RX/TX virtqueues.
+ */
+struct vhost_virtqueue {
+	struct vring_desc	*desc;
+	struct vring_avail	*avail;
+	struct vring_used	*used;
+	uint32_t		size;
+
+	uint16_t		last_avail_idx;
+	volatile uint16_t	last_used_idx;
+#define VIRTIO_INVALID_EVENTFD		(-1)
+#define VIRTIO_UNINITIALIZED_EVENTFD	(-2)
+
+	/* Backend value to determine if device should started/stopped */
+	int			backend;
+	/* Used to notify the guest (trigger interrupt) */
+	int			callfd;
+	/* Currently unused as polling mode is enabled */
+	int			kickfd;
+	int			enabled;
+
+	/* Physical address of used ring, for logging */
+	uint64_t		log_guest_addr;
+
+	uint16_t		nr_zmbuf;
+	uint16_t		zmbuf_size;
+	uint16_t		last_zmbuf_idx;
+	struct zcopy_mbuf	*zmbufs;
+	struct zcopy_mbuf_list	zmbuf_list;
+} __rte_cache_aligned;
+
+struct virtio_dev;
+
+struct virtio_dev_table {
+	int (*vhost_dev_ready)(struct virtio_dev *dev);
+	struct vhost_virtqueue* (*vhost_dev_get_queues)(struct virtio_dev *dev, uint16_t queue_id);
+	void (*vhost_dev_cleanup)(struct virtio_dev *dev, int destroy);
+	void (*vhost_dev_free)(struct virtio_dev *dev);
+	void (*vhost_dev_reset)(struct virtio_dev *dev);
+	uint64_t (*vhost_dev_get_features)(struct virtio_dev *dev);
+	int (*vhost_dev_set_features)(struct virtio_dev *dev, uint64_t features);
+	uint64_t (*vhost_dev_get_protocol_features)(struct virtio_dev *dev);
+	int (*vhost_dev_set_protocol_features)(struct virtio_dev *dev, uint64_t features);
+	uint32_t (*vhost_dev_get_default_queue_num)(struct virtio_dev *dev);
+	uint32_t (*vhost_dev_get_queue_num)(struct virtio_dev *dev);
+	uint16_t (*vhost_dev_get_avail_entries)(struct virtio_dev *dev, uint16_t queue_id);
+	int (*vhost_dev_get_vring_base)(struct virtio_dev *dev, struct vhost_virtqueue *vq);
+	int (*vhost_dev_set_vring_num)(struct virtio_dev *dev, struct vhost_virtqueue *vq);
+	int (*vhost_dev_set_vring_call)(struct virtio_dev *dev, struct vhost_vring_file *file);
+	int (*vhost_dev_set_log_base)(struct virtio_dev *dev, int fd, uint64_t size, uint64_t off);
+};
+
+struct virtio_dev {
+	/* Frontend (QEMU) memory and memory region information */
+	struct virtio_memory	*mem;
+	int			vid;
+	uint32_t		flags;
+	#define IF_NAME_SZ (PATH_MAX > IFNAMSIZ ? PATH_MAX : IFNAMSIZ)
+	char			ifname[IF_NAME_SZ];
+
+	uint32_t		dev_type;
+	union {
+		struct virtio_net	net_dev;
+	} dev;
+
+	uint32_t		nr_guest_pages;
+	uint32_t		max_guest_pages;
+	struct guest_page       *guest_pages;
+
+	const struct virtio_net_device_ops	*notify_ops;
+	struct virtio_dev_table fn_table;
+} __rte_cache_aligned;
+
+extern struct virtio_net_device_ops const *notify_ops;
+
+/*
+ * Define virtio 1.0 for older kernels
+ */
+#ifndef VIRTIO_F_VERSION_1
+ #define VIRTIO_F_VERSION_1 32
+#endif
+
+struct guest_page {
+	uint64_t guest_phys_addr;
+	uint64_t host_phys_addr;
+	uint64_t size;
+};
+
+/**
+ * Information relating to memory regions including offsets to
+ * addresses in QEMUs memory file.
+ */
+struct virtio_memory_region {
+	uint64_t guest_phys_addr;
+	uint64_t guest_user_addr;
+	uint64_t host_user_addr;
+	uint64_t size;
+	void	 *mmap_addr;
+	uint64_t mmap_size;
+	int fd;
+};
+
+/**
+ * Memory structure includes region and mapping information.
+ */
+struct virtio_memory {
+	uint32_t nregions;
+	struct virtio_memory_region regions[0];
+};
+
+
+/* Macros for printing using RTE_LOG */
+#define RTE_LOGTYPE_VHOST_CONFIG RTE_LOGTYPE_USER1
+#define RTE_LOGTYPE_VHOST_DATA   RTE_LOGTYPE_USER1
+
+#ifdef RTE_LIBRTE_VHOST_DEBUG
+#define VHOST_MAX_PRINT_BUFF 6072
+#define LOG_LEVEL RTE_LOG_DEBUG
+#define LOG_DEBUG(log_type, fmt, args...) RTE_LOG(DEBUG, log_type, fmt, ##args)
+#define PRINT_PACKET(device, addr, size, header) do { \
+	char *pkt_addr = (char *)(addr); \
+	unsigned int index; \
+	char packet[VHOST_MAX_PRINT_BUFF]; \
+	\
+	if ((header)) \
+		snprintf(packet, VHOST_MAX_PRINT_BUFF, "(%d) Header size %d: ", (device->vid), (size)); \
+	else \
+		snprintf(packet, VHOST_MAX_PRINT_BUFF, "(%d) Packet size %d: ", (device->vid), (size)); \
+	for (index = 0; index < (size); index++) { \
+		snprintf(packet + strnlen(packet, VHOST_MAX_PRINT_BUFF), VHOST_MAX_PRINT_BUFF - strnlen(packet, VHOST_MAX_PRINT_BUFF), \
+			"%02hhx ", pkt_addr[index]); \
+	} \
+	snprintf(packet + strnlen(packet, VHOST_MAX_PRINT_BUFF), VHOST_MAX_PRINT_BUFF - strnlen(packet, VHOST_MAX_PRINT_BUFF), "\n"); \
+	\
+	LOG_DEBUG(VHOST_DATA, "%s", packet); \
+} while (0)
+#else
+#define LOG_LEVEL RTE_LOG_INFO
+#define LOG_DEBUG(log_type, fmt, args...) do {} while (0)
+#define PRINT_PACKET(device, addr, size, header) do {} while (0)
+#endif
+
+/* Convert guest physical Address to host virtual address */
+static inline uint64_t __attribute__((always_inline))
+gpa_to_vva(struct virtio_dev *dev, uint64_t gpa)
+{
+	struct virtio_memory_region *reg;
+	uint32_t i;
+
+	for (i = 0; i < dev->mem->nregions; i++) {
+		reg = &dev->mem->regions[i];
+		if (gpa >= reg->guest_phys_addr &&
+		    gpa <  reg->guest_phys_addr + reg->size) {
+			return gpa - reg->guest_phys_addr +
+			       reg->host_user_addr;
+		}
+	}
+
+	return 0;
+}
+
+/* Convert guest physical address to host physical address */
+static inline phys_addr_t __attribute__((always_inline))
+gpa_to_hpa(struct virtio_dev *dev, uint64_t gpa, uint64_t size)
+{
+	uint32_t i;
+	struct guest_page *page;
+
+	for (i = 0; i < dev->nr_guest_pages; i++) {
+		page = &dev->guest_pages[i];
+
+		if (gpa >= page->guest_phys_addr &&
+		    gpa + size < page->guest_phys_addr + page->size) {
+			return gpa - page->guest_phys_addr +
+			       page->host_phys_addr;
+		}
+	}
+
+	return 0;
+}
+
+#endif
\ No newline at end of file
diff --git a/lib/librte_vhost/vhost_net.c b/lib/librte_vhost/vhost_net.c
new file mode 100644
index 0000000..f141b32
--- /dev/null
+++ b/lib/librte_vhost/vhost_net.c
@@ -0,0 +1,659 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <linux/vhost.h>
+#include <linux/virtio_net.h>
+#include <stddef.h>
+#include <stdint.h>
+#include <stdlib.h>
+#include <assert.h>
+#ifdef RTE_LIBRTE_VHOST_NUMA
+#include <numaif.h>
+#endif
+#include <sys/mman.h>
+
+#include <rte_ethdev.h>
+#include <rte_log.h>
+#include <rte_string_fns.h>
+#include <rte_memory.h>
+#include <rte_malloc.h>
+#include <rte_virtio_net.h>
+
+#include "vhost_net.h"
+#include "vhost_device.h"
+
+#define VHOST_USER_F_PROTOCOL_FEATURES	30
+
+/* Features supported by this lib. */
+#define VHOST_SUPPORTED_FEATURES ((1ULL << VIRTIO_NET_F_MRG_RXBUF) | \
+				(1ULL << VIRTIO_NET_F_CTRL_VQ) | \
+				(1ULL << VIRTIO_NET_F_CTRL_RX) | \
+				(1ULL << VIRTIO_NET_F_GUEST_ANNOUNCE) | \
+				(VHOST_SUPPORTS_MQ)            | \
+				(1ULL << VIRTIO_F_VERSION_1)   | \
+				(1ULL << VHOST_F_LOG_ALL)      | \
+				(1ULL << VHOST_USER_F_PROTOCOL_FEATURES) | \
+				(1ULL << VIRTIO_NET_F_HOST_TSO4) | \
+				(1ULL << VIRTIO_NET_F_HOST_TSO6) | \
+				(1ULL << VIRTIO_NET_F_CSUM)    | \
+				(1ULL << VIRTIO_NET_F_GUEST_CSUM) | \
+				(1ULL << VIRTIO_NET_F_GUEST_TSO4) | \
+				(1ULL << VIRTIO_NET_F_GUEST_TSO6))
+
+uint64_t VHOST_FEATURES = VHOST_SUPPORTED_FEATURES;
+
+/* device ops to add/remove device to/from data core. */
+struct virtio_net_device_ops const *notify_ops = NULL;
+
+struct virtio_net *
+get_net_device(struct virtio_dev *dev)
+{
+	if (!dev)
+		return NULL;
+ 
+	return &dev->dev.net_dev;
+}
+
+static void
+cleanup_vq(struct vhost_virtqueue *vq, int destroy)
+{
+	if ((vq->callfd >= 0) && (destroy != 0))
+		close(vq->callfd);
+	if (vq->kickfd >= 0)
+		close(vq->kickfd);
+}
+
+/*
+ * Unmap any memory, close any file descriptors and
+ * free any memory owned by a device.
+ */
+static void
+cleanup_device(struct virtio_dev *device, int destroy)
+{
+	struct virtio_net *dev = get_net_device(device);
+	uint32_t i;
+
+	dev->features = 0;
+	dev->protocol_features = 0;
+
+	for (i = 0; i < dev->virt_qp_nb; i++) {
+		cleanup_vq(dev->virtqueue[i * VIRTIO_QNUM + VIRTIO_RXQ], destroy);
+		cleanup_vq(dev->virtqueue[i * VIRTIO_QNUM + VIRTIO_TXQ], destroy);
+	}
+
+	if (dev->log_addr) {
+		munmap((void *)(uintptr_t)dev->log_addr, dev->log_size);
+		dev->log_addr = 0;
+	}
+}
+
+/*
+ * Release virtqueues and device memory.
+ */
+static void
+free_device(struct virtio_dev *device)
+{
+	struct virtio_net *dev = get_net_device(device);
+	uint32_t i;
+
+	for (i = 0; i < dev->virt_qp_nb; i++)
+		rte_free(dev->virtqueue[i * VIRTIO_QNUM]);
+
+	rte_free(dev);
+}
+
+static void
+init_vring_queue(struct vhost_virtqueue *vq, int qp_idx)
+{
+	memset(vq, 0, sizeof(struct vhost_virtqueue));
+
+	vq->kickfd = VIRTIO_UNINITIALIZED_EVENTFD;
+	vq->callfd = VIRTIO_UNINITIALIZED_EVENTFD;
+
+	/* Backends are set to -1 indicating an inactive device. */
+	vq->backend = -1;
+
+	/* always set the default vq pair to enabled */
+	if (qp_idx == 0)
+		vq->enabled = 1;
+
+	TAILQ_INIT(&vq->zmbuf_list);
+}
+
+static void
+init_vring_queue_pair(struct virtio_net *dev, uint32_t qp_idx)
+{
+	uint32_t base_idx = qp_idx * VIRTIO_QNUM;
+
+	init_vring_queue(dev->virtqueue[base_idx + VIRTIO_RXQ], qp_idx);
+	init_vring_queue(dev->virtqueue[base_idx + VIRTIO_TXQ], qp_idx);
+}
+
+static void
+reset_vring_queue(struct vhost_virtqueue *vq, int qp_idx)
+{
+	int callfd;
+
+	callfd = vq->callfd;
+	init_vring_queue(vq, qp_idx);
+	vq->callfd = callfd;
+}
+
+static void
+reset_vring_queue_pair(struct virtio_net *dev, uint32_t qp_idx)
+{
+	uint32_t base_idx = qp_idx * VIRTIO_QNUM;
+
+	reset_vring_queue(dev->virtqueue[base_idx + VIRTIO_RXQ], qp_idx);
+	reset_vring_queue(dev->virtqueue[base_idx + VIRTIO_TXQ], qp_idx);
+}
+
+static int
+alloc_vring_queue_pair(struct virtio_net *dev, uint32_t qp_idx)
+{
+	struct vhost_virtqueue *virtqueue = NULL;
+	uint32_t virt_rx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_RXQ;
+	uint32_t virt_tx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_TXQ;
+
+	virtqueue = rte_malloc(NULL,
+			       sizeof(struct vhost_virtqueue) * VIRTIO_QNUM, 0);
+	if (virtqueue == NULL) {
+		RTE_LOG(ERR, VHOST_CONFIG,
+			"Failed to allocate memory for virt qp:%d.\n", qp_idx);
+		return -1;
+	}
+
+	dev->virtqueue[virt_rx_q_idx] = virtqueue;
+	dev->virtqueue[virt_tx_q_idx] = virtqueue + VIRTIO_TXQ;
+
+	init_vring_queue_pair(dev, qp_idx);
+
+	dev->virt_qp_nb += 1;
+
+	return 0;
+}
+
+/*
+ * Reset some variables in device structure, while keeping few
+ * others untouched, such as vid, ifname, virt_qp_nb: they
+ * should be same unless the device is removed.
+ */
+static void
+reset_device(struct virtio_dev *device)
+{
+	struct virtio_net *dev = get_net_device(device);
+	uint32_t i;
+
+	for (i = 0; i < dev->virt_qp_nb; i++)
+		reset_vring_queue_pair(dev, i);
+}
+
+static uint64_t
+vhost_dev_get_features(struct virtio_dev *dev)
+{
+	if (dev == NULL)
+		return 0;
+
+	return VHOST_FEATURES;	
+}
+
+static int
+vhost_dev_set_features(struct virtio_dev *device, uint64_t features)
+{
+	struct virtio_net *dev = get_net_device(device);
+
+	if (features & ~VHOST_FEATURES)
+		return -1;
+
+	dev->features = features;
+	if (dev->features &
+		((1 << VIRTIO_NET_F_MRG_RXBUF) | (1ULL << VIRTIO_F_VERSION_1))) {
+		dev->vhost_hlen = sizeof(struct virtio_net_hdr_mrg_rxbuf);
+	} else {
+		dev->vhost_hlen = sizeof(struct virtio_net_hdr);
+	}
+	LOG_DEBUG(VHOST_CONFIG,
+		"(%d) mergeable RX buffers %s, virtio 1 %s\n",
+		device->vid,
+		(dev->features & (1 << VIRTIO_NET_F_MRG_RXBUF)) ? "on" : "off",
+		(dev->features & (1ULL << VIRTIO_F_VERSION_1)) ? "on" : "off");
+
+	return 0;
+}
+
+static int
+vhost_dev_set_vring_num(struct virtio_dev *device,
+			 struct vhost_virtqueue *vq)
+{
+	struct virtio_net *dev = get_net_device(device);
+
+	if (dev->tx_zero_copy) {
+		vq->nr_zmbuf = 0;
+		vq->last_zmbuf_idx = 0;
+		vq->zmbuf_size = vq->size;
+		vq->zmbufs = rte_zmalloc(NULL, vq->zmbuf_size *
+					 sizeof(struct zcopy_mbuf), 0);
+		if (vq->zmbufs == NULL) {
+			RTE_LOG(WARNING, VHOST_CONFIG,
+				"failed to allocate mem for zero copy; "
+				"zero copy is force disabled\n");
+			dev->tx_zero_copy = 0;
+		}
+	}
+
+	return 0;
+}
+
+static int
+vq_is_ready(struct vhost_virtqueue *vq)
+{
+	return vq && vq->desc   &&
+	       vq->kickfd != VIRTIO_UNINITIALIZED_EVENTFD &&
+	       vq->callfd != VIRTIO_UNINITIALIZED_EVENTFD;
+}
+
+static int
+vhost_dev_is_ready(struct virtio_dev *device)
+{
+	struct virtio_net *dev = get_net_device(device);
+	struct vhost_virtqueue *rvq, *tvq;
+	uint32_t i;
+
+	for (i = 0; i < dev->virt_qp_nb; i++) {
+		rvq = dev->virtqueue[i * VIRTIO_QNUM + VIRTIO_RXQ];
+		tvq = dev->virtqueue[i * VIRTIO_QNUM + VIRTIO_TXQ];
+
+		if (!vq_is_ready(rvq) || !vq_is_ready(tvq)) {
+			RTE_LOG(INFO, VHOST_CONFIG,
+				"virtio is not ready for processing.\n");
+			return 0;
+		}
+	}
+
+	RTE_LOG(INFO, VHOST_CONFIG,
+		"virtio is now ready for processing.\n");
+	return 1;
+}
+
+static int
+vhost_dev_set_vring_call(struct virtio_dev *device, struct vhost_vring_file *file)
+{
+	struct virtio_net *dev = get_net_device(device);
+	struct vhost_virtqueue *vq;
+	uint32_t cur_qp_idx;
+
+	/*
+	 * FIXME: VHOST_SET_VRING_CALL is the first per-vring message
+	 * we get, so we do vring queue pair allocation here.
+	 */
+	cur_qp_idx = file->index / VIRTIO_QNUM;
+	if (cur_qp_idx + 1 > dev->virt_qp_nb) {
+		if (alloc_vring_queue_pair(dev, cur_qp_idx) < 0)
+			return -1;
+	}
+
+	vq = dev->virtqueue[file->index];
+	assert(vq != NULL);
+
+	if (vq->callfd >= 0)
+		close(vq->callfd);
+
+	vq->callfd = file->fd;
+	return 0;
+}
+
+static int
+vhost_dev_set_protocol_features(struct virtio_dev *device,
+				 uint64_t protocol_features)
+{
+	struct virtio_net *dev = get_net_device(device);
+
+	if (protocol_features & ~VHOST_USER_PROTOCOL_FEATURES)
+		return -1;
+
+	dev->protocol_features = protocol_features;
+	return 0;
+}
+
+static uint64_t
+vhost_dev_get_protocol_features(struct virtio_dev *dev)
+{
+	if (dev == NULL)
+		return 0;
+
+	return VHOST_USER_PROTOCOL_FEATURES;
+}
+
+static uint32_t
+vhost_dev_get_default_queue_num(struct virtio_dev *dev)
+{
+	if (dev == NULL)
+		return 0;
+
+	return VHOST_MAX_QUEUE_PAIRS;
+}
+
+static uint32_t
+vhost_dev_get_queue_num(struct virtio_dev *device)
+{
+	struct virtio_net *dev;
+	if (device == NULL)
+		return 0;
+
+	dev = get_net_device(device);
+	return dev->virt_qp_nb;
+}
+
+static uint16_t
+vhost_dev_get_avail_entries(struct virtio_dev *device, uint16_t queue_id)
+{
+	struct virtio_net *dev = get_net_device(device);
+	struct vhost_virtqueue *vq;
+
+	vq = dev->virtqueue[queue_id];
+	if (!vq->enabled)
+		return 0;
+
+	return *(volatile uint16_t *)&vq->avail->idx - vq->last_used_idx;
+}
+
+void
+vhost_enable_tx_zero_copy(int vid)
+{
+	struct virtio_dev *device = get_device(vid);
+	struct virtio_net *dev;
+
+	if (device == NULL)
+		return;
+
+	dev = get_net_device(device);
+	dev->tx_zero_copy = 1;
+}
+
+static void
+free_zmbufs(struct vhost_virtqueue *vq)
+{
+	struct zcopy_mbuf *zmbuf, *next;
+
+	for (zmbuf = TAILQ_FIRST(&vq->zmbuf_list);
+	     zmbuf != NULL; zmbuf = next) {
+		next = TAILQ_NEXT(zmbuf, next);
+
+		rte_pktmbuf_free(zmbuf->mbuf);
+		TAILQ_REMOVE(&vq->zmbuf_list, zmbuf, next);
+	}
+
+	rte_free(vq->zmbufs);
+}
+
+static int
+vhost_dev_get_vring_base(struct virtio_dev *device, struct vhost_virtqueue *vq)
+{
+	struct virtio_net *dev = get_net_device(device);
+
+	/*
+	 * Based on current qemu vhost-user implementation, this message is
+	 * sent and only sent in vhost_vring_stop.
+	 * TODO: cleanup the vring, it isn't usable since here.
+	 */
+	if (vq->kickfd >= 0)
+		close(vq->kickfd);
+
+	vq->kickfd = VIRTIO_UNINITIALIZED_EVENTFD;
+
+	if (dev->tx_zero_copy)
+		free_zmbufs(vq);
+
+	return 0;
+}
+
+static int
+vhost_dev_set_log_base(struct virtio_dev *device, int fd, uint64_t size, uint64_t off)
+{
+	void *addr;
+	struct virtio_net *dev = get_net_device(device);
+
+	/*
+	 * mmap from 0 to workaround a hugepage mmap bug: mmap will
+	 * fail when offset is not page size aligned.
+	 */
+	addr = mmap(0, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
+	close(fd);
+	if (addr == MAP_FAILED) {
+		RTE_LOG(ERR, VHOST_CONFIG, "mmap log base failed!\n");
+		return -1;
+	}
+
+	/*
+	 * Free previously mapped log memory on occasionally
+	 * multiple VHOST_USER_SET_LOG_BASE.
+	 */
+	if (dev->log_addr) {
+		munmap((void *)(uintptr_t)dev->log_addr, dev->log_size);
+	}
+	dev->log_addr = (uint64_t)(uintptr_t)addr;
+	dev->log_base = dev->log_addr + off;
+	dev->log_size = size;
+
+	return 0;
+}
+
+/*
+ * An rarp packet is constructed and broadcasted to notify switches about
+ * the new location of the migrated VM, so that packets from outside will
+ * not be lost after migration.
+ *
+ * However, we don't actually "send" a rarp packet here, instead, we set
+ * a flag 'broadcast_rarp' to let rte_vhost_dequeue_burst() inject it.
+ */
+int
+vhost_user_send_rarp(struct virtio_dev *device, struct VhostUserMsg *msg)
+{
+	struct virtio_net *dev = get_net_device(device);
+	uint8_t *mac = (uint8_t *)&msg->payload.u64;
+
+	RTE_LOG(DEBUG, VHOST_CONFIG,
+		":: mac: %02x:%02x:%02x:%02x:%02x:%02x\n",
+		mac[0], mac[1], mac[2], mac[3], mac[4], mac[5]);
+	memcpy(dev->mac.addr_bytes, mac, 6);
+
+	/*
+	 * Set the flag to inject a RARP broadcast packet at
+	 * rte_vhost_dequeue_burst().
+	 *
+	 * rte_smp_wmb() is for making sure the mac is copied
+	 * before the flag is set.
+	 */
+	rte_smp_wmb();
+	rte_atomic16_set(&dev->broadcast_rarp, 1);
+
+	return 0;
+}
+
+static struct vhost_virtqueue *
+vhost_dev_get_queues(struct virtio_dev *device, uint16_t queue_id)
+{
+	struct virtio_net *dev = get_net_device(device);
+	struct vhost_virtqueue *vq;
+
+	vq = dev->virtqueue[queue_id];
+
+	return vq;
+}
+
+void
+vhost_net_device_init(struct virtio_dev *device)
+{
+	struct virtio_net *dev = get_net_device(device);
+
+	device->fn_table.vhost_dev_ready  = vhost_dev_is_ready;
+	device->fn_table.vhost_dev_get_queues  = vhost_dev_get_queues;
+	device->fn_table.vhost_dev_cleanup = cleanup_device;
+	device->fn_table.vhost_dev_free  = free_device;
+	device->fn_table.vhost_dev_reset  = reset_device;
+	device->fn_table.vhost_dev_get_features  = vhost_dev_get_features;
+	device->fn_table.vhost_dev_set_features  = vhost_dev_set_features;
+	device->fn_table.vhost_dev_get_protocol_features  = vhost_dev_get_protocol_features;
+	device->fn_table.vhost_dev_set_protocol_features  = vhost_dev_set_protocol_features;
+	device->fn_table.vhost_dev_get_default_queue_num  = vhost_dev_get_default_queue_num;
+	device->fn_table.vhost_dev_get_queue_num  = vhost_dev_get_queue_num;
+	device->fn_table.vhost_dev_get_avail_entries  = vhost_dev_get_avail_entries;
+	device->fn_table.vhost_dev_get_vring_base  = vhost_dev_get_vring_base;
+	device->fn_table.vhost_dev_set_vring_num = vhost_dev_set_vring_num;
+	device->fn_table.vhost_dev_set_vring_call  = vhost_dev_set_vring_call;
+	device->fn_table.vhost_dev_set_log_base = vhost_dev_set_log_base;
+
+	dev->device = device;
+}
+
+uint64_t rte_vhost_feature_get(void)
+{
+	return VHOST_FEATURES;
+}
+
+int rte_vhost_feature_disable(uint64_t feature_mask)
+{
+	VHOST_FEATURES = VHOST_FEATURES & ~feature_mask;
+	return 0;
+}
+
+int rte_vhost_feature_enable(uint64_t feature_mask)
+{
+	if ((feature_mask & VHOST_SUPPORTED_FEATURES) == feature_mask) {
+		VHOST_FEATURES = VHOST_FEATURES | feature_mask;
+		return 0;
+	}
+	return -1;
+}
+
+int
+rte_vhost_get_numa_node(int vid)
+{
+#ifdef RTE_LIBRTE_VHOST_NUMA
+	struct virtio_dev *dev = get_device(vid);
+	int numa_node;
+	int ret;
+
+	if (dev == NULL)
+		return -1;
+
+	ret = get_mempolicy(&numa_node, NULL, 0, dev,
+			    MPOL_F_NODE | MPOL_F_ADDR);
+	if (ret < 0) {
+		RTE_LOG(ERR, VHOST_CONFIG,
+			"(%d) failed to query numa node: %d\n", vid, ret);
+		return -1;
+	}
+
+	return numa_node;
+#else
+	RTE_SET_USED(vid);
+	return -1;
+#endif
+}
+
+uint32_t
+rte_vhost_get_queue_num(int vid)
+{
+	struct virtio_dev *device = get_device(vid);
+
+	if (device == NULL)
+		return 0;
+
+	if (device->fn_table.vhost_dev_get_queue_num)
+		return device->fn_table.vhost_dev_get_queue_num(device);
+
+	return 0;
+}
+
+int
+rte_vhost_get_ifname(int vid, char *buf, size_t len)
+{
+	struct virtio_dev *dev = get_device(vid);
+
+	if (dev == NULL)
+		return -1;
+
+	len = RTE_MIN(len, sizeof(dev->ifname));
+
+	strncpy(buf, dev->ifname, len);
+	buf[len - 1] = '\0';
+
+	return 0;
+}
+
+uint16_t
+rte_vhost_avail_entries(int vid, uint16_t queue_id)
+{
+	struct virtio_dev *device;
+
+	device = get_device(vid);
+	if (!device)
+		return 0;
+
+	if (device->fn_table.vhost_dev_get_avail_entries)
+		return device->fn_table.vhost_dev_get_avail_entries(device, queue_id);
+
+	return 0;
+}
+
+int
+rte_vhost_enable_guest_notification(int vid, uint16_t queue_id, int enable)
+{
+	struct virtio_dev *device = get_device(vid);
+	struct vhost_virtqueue *vq;
+
+	if (device == NULL)
+		return -1;
+
+	vq = device->fn_table.vhost_dev_get_queues(device, queue_id);
+	if (enable) {
+		RTE_LOG(ERR, VHOST_CONFIG,
+			"guest notification isn't supported.\n");
+		return -1;
+	}
+
+	vq->used->flags = VRING_USED_F_NO_NOTIFY;
+	return 0;
+}
+
+/*
+ * Register ops so that we can add/remove device to data core.
+ */
+int
+rte_vhost_driver_callback_register(struct virtio_net_device_ops const * const ops)
+{
+	notify_ops = ops;
+
+	return 0;
+}
\ No newline at end of file
diff --git a/lib/librte_vhost/vhost_net.h b/lib/librte_vhost/vhost_net.h
new file mode 100644
index 0000000..53b6b16
--- /dev/null
+++ b/lib/librte_vhost/vhost_net.h
@@ -0,0 +1,126 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _VHOST_NET_H_
+#define _VHOST_NET_H_
+#include <stdint.h>
+#include <stdio.h>
+#include <sys/types.h>
+#include <sys/queue.h>
+#include <unistd.h>
+#include <linux/vhost.h>
+
+#include <rte_log.h>
+
+#include "rte_virtio_net.h"
+#include "vhost_user.h"
+
+#define VHOST_USER_PROTOCOL_F_MQ	0
+#define VHOST_USER_PROTOCOL_F_LOG_SHMFD	1
+#define VHOST_USER_PROTOCOL_F_RARP	2
+
+#define VHOST_USER_PROTOCOL_FEATURES	((1ULL << VHOST_USER_PROTOCOL_F_MQ) | \
+					 (1ULL << VHOST_USER_PROTOCOL_F_LOG_SHMFD) |\
+					 (1ULL << VHOST_USER_PROTOCOL_F_RARP))
+
+#define BUF_VECTOR_MAX 256
+
+/**
+ * Structure contains buffer address, length and descriptor index
+ * from vring to do scatter RX.
+ */
+struct buf_vector {
+	uint64_t buf_addr;
+	uint32_t buf_len;
+	uint32_t desc_idx;
+};
+
+/*
+ * A structure to hold some fields needed in zero copy code path,
+ * mainly for associating an mbuf with the right desc_idx.
+ */
+struct zcopy_mbuf {
+	struct rte_mbuf *mbuf;
+	uint32_t desc_idx;
+	uint16_t in_use;
+
+	TAILQ_ENTRY(zcopy_mbuf) next;
+};
+TAILQ_HEAD(zcopy_mbuf_list, zcopy_mbuf);
+
+/* Old kernels have no such macro defined */
+#ifndef VIRTIO_NET_F_GUEST_ANNOUNCE
+ #define VIRTIO_NET_F_GUEST_ANNOUNCE 21
+#endif
+
+/*
+ * Make an extra wrapper for VIRTIO_NET_F_MQ and
+ * VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MAX as they are
+ * introduced since kernel v3.8. This makes our
+ * code buildable for older kernel.
+ */
+#ifdef VIRTIO_NET_F_MQ
+ #define VHOST_MAX_QUEUE_PAIRS	VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MAX
+ #define VHOST_SUPPORTS_MQ	(1ULL << VIRTIO_NET_F_MQ)
+#else
+ #define VHOST_MAX_QUEUE_PAIRS	1
+ #define VHOST_SUPPORTS_MQ	0
+#endif
+
+/**
+ * Device structure contains all configuration information relating
+ * to the device.
+ */
+struct virtio_net {
+	uint64_t		features;
+	uint64_t		protocol_features;
+	uint16_t		vhost_hlen;
+	uint64_t		log_size;
+	uint64_t		log_base;
+	uint64_t		log_addr;
+	/* to tell if we need broadcast rarp packet */
+	rte_atomic16_t		broadcast_rarp;
+	uint32_t		virt_qp_nb;
+	int			tx_zero_copy;
+	struct vhost_virtqueue	*virtqueue[VHOST_MAX_QUEUE_PAIRS * 2];
+	struct ether_addr	mac;
+	/* transport layer device context */
+	struct virtio_dev	*device;
+} __rte_cache_aligned;
+
+void vhost_enable_tx_zero_copy(int vid);
+int vhost_user_send_rarp(struct virtio_dev *device, struct VhostUserMsg *msg);
+void vhost_net_device_init(struct virtio_dev *device);
+struct virtio_net *get_net_device(struct virtio_dev *dev);
+
+#endif /* _VHOST_NET_H_ */
\ No newline at end of file
diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
index ff995d5..90c4b03 100644
--- a/lib/librte_vhost/vhost_user.c
+++ b/lib/librte_vhost/vhost_user.c
@@ -48,9 +48,13 @@
 #include <rte_malloc.h>
 #include <rte_log.h>
 
-#include "vhost.h"
+#include "vhost_device.h"
+#include "vhost_net.h"
 #include "vhost_user.h"
 
+#define MAX_VHOST_DEVICE        1024
+struct virtio_dev *vhost_devices[MAX_VHOST_DEVICE];
+
 static const char *vhost_message_str[VHOST_USER_MAX] = {
 	[VHOST_USER_NONE] = "VHOST_USER_NONE",
 	[VHOST_USER_GET_FEATURES] = "VHOST_USER_GET_FEATURES",
@@ -85,7 +89,7 @@ get_blk_size(int fd)
 }
 
 static void
-free_mem_region(struct virtio_net *dev)
+free_mem_region(struct virtio_dev *dev)
 {
 	uint32_t i;
 	struct virtio_memory_region *reg;
@@ -102,18 +106,99 @@ free_mem_region(struct virtio_net *dev)
 	}
 }
 
-void
-vhost_backend_cleanup(struct virtio_net *dev)
+static void
+vhost_backend_cleanup(struct virtio_dev *dev)
 {
 	if (dev->mem) {
 		free_mem_region(dev);
-		rte_free(dev->mem);
+		free(dev->mem);
 		dev->mem = NULL;
 	}
-	if (dev->log_addr) {
-		munmap((void *)(uintptr_t)dev->log_addr, dev->log_size);
-		dev->log_addr = 0;
+}
+
+struct virtio_dev *
+get_device(int vid)
+{
+	struct virtio_dev *dev = vhost_devices[vid];
+
+	if (unlikely(!dev)) {
+		RTE_LOG(ERR, VHOST_CONFIG,
+			"(%d) device not found.\n", vid);
+	}
+
+	return dev;
+}
+
+/*
+ * Function is called from the CUSE open function. The device structure is
+ * initialised and a new entry is added to the device configuration linked
+ * list.
+ */
+int
+vhost_new_device(int type)
+{
+	struct virtio_dev *dev;
+	int i;
+
+	dev = rte_zmalloc(NULL, sizeof(struct virtio_dev), 0);
+	if (dev == NULL) {
+		RTE_LOG(ERR, VHOST_CONFIG,
+			"Failed to allocate memory for new dev.\n");
+		return -1;
+	}
+
+	for (i = 0; i < MAX_VHOST_DEVICE; i++) {
+		if (vhost_devices[i] == NULL)
+			break;
+	}
+	if (i == MAX_VHOST_DEVICE) {
+		RTE_LOG(ERR, VHOST_CONFIG,
+			"Failed to find a free slot for new device.\n");
+		return -1;
+	}
+
+	switch(type) {
+		case VIRTIO_ID_NET:
+			dev->notify_ops = notify_ops;
+			vhost_net_device_init(dev);
+			assert(notify_ops != NULL);
+			break;
+		default:
+			return -1;
+	}
+
+	vhost_devices[i] = dev;
+	dev->vid = i;
+	dev->dev_type = type;
+	assert(dev->fn_table.vhost_dev_get_queues != NULL);
+
+	return i;
+}
+
+/*
+ * Function is called from the CUSE release function. This function will
+ * cleanup the device and remove it from device configuration linked list.
+ */
+void
+vhost_destroy_device(int vid)
+{
+	struct virtio_dev *dev = get_device(vid);
+
+	if (dev == NULL)
+		return;
+
+	if (dev->flags & VIRTIO_DEV_RUNNING) {
+		dev->flags &= ~VIRTIO_DEV_RUNNING;
+		dev->notify_ops->destroy_device(vid);
 	}
+
+	vhost_backend_cleanup(dev);
+	if (dev->fn_table.vhost_dev_cleanup)
+		dev->fn_table.vhost_dev_cleanup(dev, 1);
+	if (dev->fn_table.vhost_dev_free)
+		dev->fn_table.vhost_dev_free(dev);
+
+	vhost_devices[vid] = NULL;
 }
 
 /*
@@ -126,16 +211,28 @@ vhost_user_set_owner(void)
 	return 0;
 }
 
+/*
+ * Called from CUSE IOCTL: VHOST_RESET_OWNER
+ */
 static int
-vhost_user_reset_owner(struct virtio_net *dev)
+vhost_user_reset_owner(struct virtio_dev *dev)
 {
+	if (dev == NULL)
+		return -1;
+
 	if (dev->flags & VIRTIO_DEV_RUNNING) {
 		dev->flags &= ~VIRTIO_DEV_RUNNING;
-		notify_ops->destroy_device(dev->vid);
+		dev->notify_ops->destroy_device(dev->vid);
 	}
 
-	cleanup_device(dev, 0);
-	reset_device(dev);
+	dev->flags = 0;
+
+	vhost_backend_cleanup(dev);
+	if (dev->fn_table.vhost_dev_cleanup)
+		dev->fn_table.vhost_dev_cleanup(dev, 0);
+	if (dev->fn_table.vhost_dev_reset)
+		dev->fn_table.vhost_dev_reset(dev);
+
 	return 0;
 }
 
@@ -143,61 +240,61 @@ vhost_user_reset_owner(struct virtio_net *dev)
  * The features that we support are requested.
  */
 static uint64_t
-vhost_user_get_features(void)
+vhost_user_get_features(struct virtio_dev *dev)
 {
-	return VHOST_FEATURES;
+	if (dev == NULL)
+		return 0;
+
+	if (dev->fn_table.vhost_dev_get_features)
+		return dev->fn_table.vhost_dev_get_features(dev);
+
+	return 0;
 }
 
 /*
  * We receive the negotiated features supported by us and the virtio device.
  */
 static int
-vhost_user_set_features(struct virtio_net *dev, uint64_t features)
+vhost_user_set_features(struct virtio_dev *dev, uint64_t pu)
 {
-	if (features & ~VHOST_FEATURES)
-		return -1;
+	int ret = 0;
 
-	dev->features = features;
-	if (dev->features &
-		((1 << VIRTIO_NET_F_MRG_RXBUF) | (1ULL << VIRTIO_F_VERSION_1))) {
-		dev->vhost_hlen = sizeof(struct virtio_net_hdr_mrg_rxbuf);
-	} else {
-		dev->vhost_hlen = sizeof(struct virtio_net_hdr);
-	}
-	LOG_DEBUG(VHOST_CONFIG,
-		"(%d) mergeable RX buffers %s, virtio 1 %s\n",
-		dev->vid,
-		(dev->features & (1 << VIRTIO_NET_F_MRG_RXBUF)) ? "on" : "off",
-		(dev->features & (1ULL << VIRTIO_F_VERSION_1)) ? "on" : "off");
+	if (dev->fn_table.vhost_dev_set_features)
+		ret = dev->fn_table.vhost_dev_set_features(dev, pu);
 
-	return 0;
+	return ret;
+}
+
+void
+vhost_set_ifname(int vid, const char *if_name, unsigned int if_len)
+{
+	struct virtio_dev *dev;
+	unsigned int len;
+
+	dev = get_device(vid);
+	if (dev == NULL)
+		return;
+
+	len = if_len > sizeof(dev->ifname) ?
+		sizeof(dev->ifname) : if_len;
+
+	strncpy(dev->ifname, if_name, len);
+	dev->ifname[sizeof(dev->ifname) - 1] = '\0';
 }
 
 /*
  * The virtio device sends us the size of the descriptor ring.
  */
 static int
-vhost_user_set_vring_num(struct virtio_net *dev,
-			 struct vhost_vring_state *state)
+vhost_user_set_vring_num(struct virtio_dev *dev, struct vhost_vring_state *state)
 {
-	struct vhost_virtqueue *vq = dev->virtqueue[state->index];
+	struct vhost_virtqueue *vq;
 
+	vq = dev->fn_table.vhost_dev_get_queues(dev, state->index);
 	vq->size = state->num;
 
-	if (dev->tx_zero_copy) {
-		vq->nr_zmbuf = 0;
-		vq->last_zmbuf_idx = 0;
-		vq->zmbuf_size = vq->size;
-		vq->zmbufs = rte_zmalloc(NULL, vq->zmbuf_size *
-					 sizeof(struct zcopy_mbuf), 0);
-		if (vq->zmbufs == NULL) {
-			RTE_LOG(WARNING, VHOST_CONFIG,
-				"failed to allocate mem for zero copy; "
-				"zero copy is force disabled\n");
-			dev->tx_zero_copy = 0;
-		}
-	}
-
+	if (dev->fn_table.vhost_dev_set_vring_num)
+		dev->fn_table.vhost_dev_set_vring_num(dev, vq);
 	return 0;
 }
 
@@ -206,11 +303,11 @@ vhost_user_set_vring_num(struct virtio_net *dev,
  * same numa node as the memory of vring descriptor.
  */
 #ifdef RTE_LIBRTE_VHOST_NUMA
-static struct virtio_net*
-numa_realloc(struct virtio_net *dev, int index)
+static struct virtio_dev*
+numa_realloc(struct virtio_dev *dev, int index)
 {
 	int oldnode, newnode;
-	struct virtio_net *old_dev;
+	struct virtio_dev *old_dev;
 	struct vhost_virtqueue *old_vq, *vq;
 	int ret;
 
@@ -222,7 +319,7 @@ numa_realloc(struct virtio_net *dev, int index)
 		return dev;
 
 	old_dev = dev;
-	vq = old_vq = dev->virtqueue[index];
+	vq = old_vq = dev->fn_table.virtio_dev_get_queues(dev, index);
 
 	ret = get_mempolicy(&newnode, NULL, 0, old_vq->desc,
 			    MPOL_F_NODE | MPOL_F_ADDR);
@@ -277,8 +374,8 @@ out:
 	return dev;
 }
 #else
-static struct virtio_net*
-numa_realloc(struct virtio_net *dev, int index __rte_unused)
+static struct virtio_dev*
+numa_realloc(struct virtio_dev *dev, int index __rte_unused)
 {
 	return dev;
 }
@@ -289,7 +386,7 @@ numa_realloc(struct virtio_net *dev, int index __rte_unused)
  * used to convert the ring addresses to our address space.
  */
 static uint64_t
-qva_to_vva(struct virtio_net *dev, uint64_t qva)
+qva_to_vva(struct virtio_dev *dev, uint64_t qva)
 {
 	struct virtio_memory_region *reg;
 	uint32_t i;
@@ -313,15 +410,14 @@ qva_to_vva(struct virtio_net *dev, uint64_t qva)
  * This function then converts these to our address space.
  */
 static int
-vhost_user_set_vring_addr(struct virtio_net *dev, struct vhost_vring_addr *addr)
+vhost_user_set_vring_addr(struct virtio_dev *dev, struct vhost_vring_addr *addr)
 {
 	struct vhost_virtqueue *vq;
 
 	if (dev->mem == NULL)
 		return -1;
 
-	/* addr->index refers to the queue index. The txq 1, rxq is 0. */
-	vq = dev->virtqueue[addr->index];
+	vq = dev->fn_table.vhost_dev_get_queues(dev, addr->index);
 
 	/* The addresses are converted from QEMU virtual to Vhost virtual. */
 	vq->desc = (struct vring_desc *)(uintptr_t)qva_to_vva(dev,
@@ -334,7 +430,7 @@ vhost_user_set_vring_addr(struct virtio_net *dev, struct vhost_vring_addr *addr)
 	}
 
 	dev = numa_realloc(dev, addr->index);
-	vq = dev->virtqueue[addr->index];
+	vq = dev->fn_table.vhost_dev_get_queues(dev, addr->index);
 
 	vq->avail = (struct vring_avail *)(uintptr_t)qva_to_vva(dev,
 			addr->avail_user_addr);
@@ -381,17 +477,19 @@ vhost_user_set_vring_addr(struct virtio_net *dev, struct vhost_vring_addr *addr)
  * The virtio device sends us the available ring last used index.
  */
 static int
-vhost_user_set_vring_base(struct virtio_net *dev,
-			  struct vhost_vring_state *state)
+vhost_user_set_vring_base(struct virtio_dev *dev, struct vhost_vring_state *state)
 {
-	dev->virtqueue[state->index]->last_used_idx  = state->num;
-	dev->virtqueue[state->index]->last_avail_idx = state->num;
+	struct vhost_virtqueue *vq;
+
+	vq = dev->fn_table.vhost_dev_get_queues(dev, state->index);
+	vq->last_used_idx = state->num;
+	vq->last_avail_idx = state->num;
 
 	return 0;
 }
 
 static void
-add_one_guest_page(struct virtio_net *dev, uint64_t guest_phys_addr,
+add_one_guest_page(struct virtio_dev *dev, uint64_t guest_phys_addr,
 		   uint64_t host_phys_addr, uint64_t size)
 {
 	struct guest_page *page, *last_page;
@@ -419,7 +517,7 @@ add_one_guest_page(struct virtio_net *dev, uint64_t guest_phys_addr,
 }
 
 static void
-add_guest_pages(struct virtio_net *dev, struct virtio_memory_region *reg,
+add_guest_pages(struct virtio_dev *dev, struct virtio_memory_region *reg,
 		uint64_t page_size)
 {
 	uint64_t reg_size = reg->size;
@@ -450,7 +548,7 @@ add_guest_pages(struct virtio_net *dev, struct virtio_memory_region *reg,
 #ifdef RTE_LIBRTE_VHOST_DEBUG
 /* TODO: enable it only in debug mode? */
 static void
-dump_guest_pages(struct virtio_net *dev)
+dump_guest_pages(struct virtio_dev *dev)
 {
 	uint32_t i;
 	struct guest_page *page;
@@ -474,7 +572,7 @@ dump_guest_pages(struct virtio_net *dev)
 #endif
 
 static int
-vhost_user_set_mem_table(struct virtio_net *dev, struct VhostUserMsg *pmsg)
+vhost_user_set_mem_table(struct virtio_dev *dev, struct VhostUserMsg *pmsg)
 {
 	struct VhostUserMemory memory = pmsg->payload.memory;
 	struct virtio_memory_region *reg;
@@ -488,7 +586,7 @@ vhost_user_set_mem_table(struct virtio_net *dev, struct VhostUserMsg *pmsg)
 	/* Remove from the data plane. */
 	if (dev->flags & VIRTIO_DEV_RUNNING) {
 		dev->flags &= ~VIRTIO_DEV_RUNNING;
-		notify_ops->destroy_device(dev->vid);
+		dev->notify_ops->destroy_device(dev->vid);
 	}
 
 	if (dev->mem) {
@@ -588,41 +686,22 @@ err_mmap:
 }
 
 static int
-vq_is_ready(struct vhost_virtqueue *vq)
-{
-	return vq && vq->desc   &&
-	       vq->kickfd != VIRTIO_UNINITIALIZED_EVENTFD &&
-	       vq->callfd != VIRTIO_UNINITIALIZED_EVENTFD;
-}
-
-static int
-virtio_is_ready(struct virtio_net *dev)
+virtio_is_ready(struct virtio_dev *dev)
 {
-	struct vhost_virtqueue *rvq, *tvq;
-	uint32_t i;
-
-	for (i = 0; i < dev->virt_qp_nb; i++) {
-		rvq = dev->virtqueue[i * VIRTIO_QNUM + VIRTIO_RXQ];
-		tvq = dev->virtqueue[i * VIRTIO_QNUM + VIRTIO_TXQ];
-
-		if (!vq_is_ready(rvq) || !vq_is_ready(tvq)) {
-			RTE_LOG(INFO, VHOST_CONFIG,
-				"virtio is not ready for processing.\n");
-			return 0;
-		}
-	}
+	if (dev->fn_table.vhost_dev_ready)
+		return dev->fn_table.vhost_dev_ready(dev);
 
-	RTE_LOG(INFO, VHOST_CONFIG,
-		"virtio is now ready for processing.\n");
-	return 1;
+	return -1;
 }
 
+/*
+ *  In vhost-user, when we receive kick message, will test whether virtio
+ *  device is ready for packet processing.
+ */
 static void
-vhost_user_set_vring_call(struct virtio_net *dev, struct VhostUserMsg *pmsg)
+vhost_user_set_vring_call(struct virtio_dev *dev, struct VhostUserMsg *pmsg)
 {
 	struct vhost_vring_file file;
-	struct vhost_virtqueue *vq;
-	uint32_t cur_qp_idx;
 
 	file.index = pmsg->payload.u64 & VHOST_USER_VRING_IDX_MASK;
 	if (pmsg->payload.u64 & VHOST_USER_VRING_NOFD_MASK)
@@ -632,23 +711,8 @@ vhost_user_set_vring_call(struct virtio_net *dev, struct VhostUserMsg *pmsg)
 	RTE_LOG(INFO, VHOST_CONFIG,
 		"vring call idx:%d file:%d\n", file.index, file.fd);
 
-	/*
-	 * FIXME: VHOST_SET_VRING_CALL is the first per-vring message
-	 * we get, so we do vring queue pair allocation here.
-	 */
-	cur_qp_idx = file.index / VIRTIO_QNUM;
-	if (cur_qp_idx + 1 > dev->virt_qp_nb) {
-		if (alloc_vring_queue_pair(dev, cur_qp_idx) < 0)
-			return;
-	}
-
-	vq = dev->virtqueue[file.index];
-	assert(vq != NULL);
-
-	if (vq->callfd >= 0)
-		close(vq->callfd);
-
-	vq->callfd = file.fd;
+	if (dev->fn_table.vhost_dev_set_vring_call)
+		dev->fn_table.vhost_dev_set_vring_call(dev, &file);
 }
 
 /*
@@ -656,11 +720,14 @@ vhost_user_set_vring_call(struct virtio_net *dev, struct VhostUserMsg *pmsg)
  *  device is ready for packet processing.
  */
 static void
-vhost_user_set_vring_kick(struct virtio_net *dev, struct VhostUserMsg *pmsg)
+vhost_user_set_vring_kick(struct virtio_dev *dev, struct VhostUserMsg *pmsg)
 {
 	struct vhost_vring_file file;
 	struct vhost_virtqueue *vq;
 
+	if (!dev)
+		return;
+
 	file.index = pmsg->payload.u64 & VHOST_USER_VRING_IDX_MASK;
 	if (pmsg->payload.u64 & VHOST_USER_VRING_NOFD_MASK)
 		file.fd = VIRTIO_INVALID_EVENTFD;
@@ -668,69 +735,44 @@ vhost_user_set_vring_kick(struct virtio_net *dev, struct VhostUserMsg *pmsg)
 		file.fd = pmsg->fds[0];
 	RTE_LOG(INFO, VHOST_CONFIG,
 		"vring kick idx:%d file:%d\n", file.index, file.fd);
-
-	vq = dev->virtqueue[file.index];
+	vq = dev->fn_table.vhost_dev_get_queues(dev, file.index);
 	if (vq->kickfd >= 0)
 		close(vq->kickfd);
+
 	vq->kickfd = file.fd;
 
 	if (virtio_is_ready(dev) && !(dev->flags & VIRTIO_DEV_RUNNING)) {
-		if (dev->tx_zero_copy) {
-			RTE_LOG(INFO, VHOST_CONFIG,
-				"Tx zero copy is enabled\n");
-		}
-
-		if (notify_ops->new_device(dev->vid) == 0)
+		if (dev->notify_ops->new_device(dev->vid) == 0)
 			dev->flags |= VIRTIO_DEV_RUNNING;
 	}
 }
 
-static void
-free_zmbufs(struct vhost_virtqueue *vq)
-{
-	struct zcopy_mbuf *zmbuf, *next;
-
-	for (zmbuf = TAILQ_FIRST(&vq->zmbuf_list);
-	     zmbuf != NULL; zmbuf = next) {
-		next = TAILQ_NEXT(zmbuf, next);
-
-		rte_pktmbuf_free(zmbuf->mbuf);
-		TAILQ_REMOVE(&vq->zmbuf_list, zmbuf, next);
-	}
-
-	rte_free(vq->zmbufs);
-}
-
 /*
  * when virtio is stopped, qemu will send us the GET_VRING_BASE message.
  */
 static int
-vhost_user_get_vring_base(struct virtio_net *dev,
-			  struct vhost_vring_state *state)
+vhost_user_get_vring_base(struct virtio_dev *dev, struct vhost_vring_state *state)
 {
+	struct vhost_virtqueue *vq;
+	if (dev == NULL)
+		return -1;
+
 	/* We have to stop the queue (virtio) if it is running. */
 	if (dev->flags & VIRTIO_DEV_RUNNING) {
 		dev->flags &= ~VIRTIO_DEV_RUNNING;
-		notify_ops->destroy_device(dev->vid);
+		dev->notify_ops->destroy_device(dev->vid);
 	}
 
+	vq = dev->fn_table.vhost_dev_get_queues(dev, state->index);
+	/* Here we are safe to get the last used index */
+	state->num = vq->last_used_idx;
+
 	/* Here we are safe to get the last used index */
-	state->num = dev->virtqueue[state->index]->last_used_idx;
+	if (dev->fn_table.vhost_dev_get_vring_base)
+		dev->fn_table.vhost_dev_get_vring_base(dev, vq);
 
 	RTE_LOG(INFO, VHOST_CONFIG,
 		"vring base idx:%d file:%d\n", state->index, state->num);
-	/*
-	 * Based on current qemu vhost-user implementation, this message is
-	 * sent and only sent in vhost_vring_stop.
-	 * TODO: cleanup the vring, it isn't usable since here.
-	 */
-	if (dev->virtqueue[state->index]->kickfd >= 0)
-		close(dev->virtqueue[state->index]->kickfd);
-
-	dev->virtqueue[state->index]->kickfd = VIRTIO_UNINITIALIZED_EVENTFD;
-
-	if (dev->tx_zero_copy)
-		free_zmbufs(dev->virtqueue[state->index]);
 
 	return 0;
 }
@@ -740,39 +782,54 @@ vhost_user_get_vring_base(struct virtio_net *dev,
  * enable the virtio queue pair.
  */
 static int
-vhost_user_set_vring_enable(struct virtio_net *dev,
-			    struct vhost_vring_state *state)
+vhost_user_set_vring_enable(struct virtio_dev *dev, struct vhost_vring_state *state)
 {
+	struct vhost_virtqueue *vq;
 	int enable = (int)state->num;
 
+	if (dev == NULL)
+		return -1;
+
 	RTE_LOG(INFO, VHOST_CONFIG,
 		"set queue enable: %d to qp idx: %d\n",
 		enable, state->index);
 
-	if (notify_ops->vring_state_changed)
-		notify_ops->vring_state_changed(dev->vid, state->index, enable);
-
-	dev->virtqueue[state->index]->enabled = enable;
+	if (dev->notify_ops->vring_state_changed)
+		dev->notify_ops->vring_state_changed(dev->vid, state->index, enable);
+	
+	vq = dev->fn_table.vhost_dev_get_queues(dev, state->index);
+	vq->enabled = enable;
 
 	return 0;
 }
 
 static void
-vhost_user_set_protocol_features(struct virtio_net *dev,
-				 uint64_t protocol_features)
+vhost_user_set_protocol_features(struct virtio_dev *dev, uint64_t protocol_features)
 {
-	if (protocol_features & ~VHOST_USER_PROTOCOL_FEATURES)
+	if (dev == NULL)
 		return;
 
-	dev->protocol_features = protocol_features;
+	if (dev->fn_table.vhost_dev_set_protocol_features)
+		dev->fn_table.vhost_dev_set_protocol_features(dev, protocol_features);
+}
+
+static uint64_t
+vhost_user_get_protocol_features(struct virtio_dev *dev)
+{
+	if (dev == NULL)
+		return 0;
+
+	if (dev->fn_table.vhost_dev_get_protocol_features)
+		return dev->fn_table.vhost_dev_get_protocol_features(dev);
+
+	return 0;
 }
 
 static int
-vhost_user_set_log_base(struct virtio_net *dev, struct VhostUserMsg *msg)
+vhost_user_set_log_base(struct virtio_dev *dev, struct VhostUserMsg *msg)
 {
 	int fd = msg->fds[0];
 	uint64_t size, off;
-	void *addr;
 
 	if (fd < 0) {
 		RTE_LOG(ERR, VHOST_CONFIG, "invalid log fd: %d\n", fd);
@@ -792,58 +849,20 @@ vhost_user_set_log_base(struct virtio_net *dev, struct VhostUserMsg *msg)
 		"log mmap size: %"PRId64", offset: %"PRId64"\n",
 		size, off);
 
-	/*
-	 * mmap from 0 to workaround a hugepage mmap bug: mmap will
-	 * fail when offset is not page size aligned.
-	 */
-	addr = mmap(0, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
-	close(fd);
-	if (addr == MAP_FAILED) {
-		RTE_LOG(ERR, VHOST_CONFIG, "mmap log base failed!\n");
-		return -1;
-	}
-
-	/*
-	 * Free previously mapped log memory on occasionally
-	 * multiple VHOST_USER_SET_LOG_BASE.
-	 */
-	if (dev->log_addr) {
-		munmap((void *)(uintptr_t)dev->log_addr, dev->log_size);
-	}
-	dev->log_addr = (uint64_t)(uintptr_t)addr;
-	dev->log_base = dev->log_addr + off;
-	dev->log_size = size;
+	if (dev->fn_table.vhost_dev_set_log_base)
+		return dev->fn_table.vhost_dev_set_log_base(dev, fd, size, off);
 
 	return 0;
 }
 
-/*
- * An rarp packet is constructed and broadcasted to notify switches about
- * the new location of the migrated VM, so that packets from outside will
- * not be lost after migration.
- *
- * However, we don't actually "send" a rarp packet here, instead, we set
- * a flag 'broadcast_rarp' to let rte_vhost_dequeue_burst() inject it.
- */
-static int
-vhost_user_send_rarp(struct virtio_net *dev, struct VhostUserMsg *msg)
+static uint32_t
+vhost_user_get_queue_num(struct virtio_dev *dev)
 {
-	uint8_t *mac = (uint8_t *)&msg->payload.u64;
-
-	RTE_LOG(DEBUG, VHOST_CONFIG,
-		":: mac: %02x:%02x:%02x:%02x:%02x:%02x\n",
-		mac[0], mac[1], mac[2], mac[3], mac[4], mac[5]);
-	memcpy(dev->mac.addr_bytes, mac, 6);
+	if (dev == NULL)
+		return 0;
 
-	/*
-	 * Set the flag to inject a RARP broadcast packet at
-	 * rte_vhost_dequeue_burst().
-	 *
-	 * rte_smp_wmb() is for making sure the mac is copied
-	 * before the flag is set.
-	 */
-	rte_smp_wmb();
-	rte_atomic16_set(&dev->broadcast_rarp, 1);
+	if (dev->fn_table.vhost_dev_get_queue_num)
+		return dev->fn_table.vhost_dev_get_queue_num(dev);
 
 	return 0;
 }
@@ -899,7 +918,7 @@ send_vhost_message(int sockfd, struct VhostUserMsg *msg)
 int
 vhost_user_msg_handler(int vid, int fd)
 {
-	struct virtio_net *dev;
+	struct virtio_dev *dev;
 	struct VhostUserMsg msg;
 	int ret;
 
@@ -926,7 +945,7 @@ vhost_user_msg_handler(int vid, int fd)
 		vhost_message_str[msg.request]);
 	switch (msg.request) {
 	case VHOST_USER_GET_FEATURES:
-		msg.payload.u64 = vhost_user_get_features();
+		msg.payload.u64 = vhost_user_get_features(dev);
 		msg.size = sizeof(msg.payload.u64);
 		send_vhost_message(fd, &msg);
 		break;
@@ -935,7 +954,7 @@ vhost_user_msg_handler(int vid, int fd)
 		break;
 
 	case VHOST_USER_GET_PROTOCOL_FEATURES:
-		msg.payload.u64 = VHOST_USER_PROTOCOL_FEATURES;
+		msg.payload.u64 = vhost_user_get_protocol_features(dev);
 		msg.size = sizeof(msg.payload.u64);
 		send_vhost_message(fd, &msg);
 		break;
@@ -996,7 +1015,7 @@ vhost_user_msg_handler(int vid, int fd)
 		break;
 
 	case VHOST_USER_GET_QUEUE_NUM:
-		msg.payload.u64 = VHOST_MAX_QUEUE_PAIRS;
+		msg.payload.u64 = vhost_user_get_queue_num(dev);
 		msg.size = sizeof(msg.payload.u64);
 		send_vhost_message(fd, &msg);
 		break;
@@ -1014,4 +1033,4 @@ vhost_user_msg_handler(int vid, int fd)
 	}
 
 	return 0;
-}
+}
\ No newline at end of file
diff --git a/lib/librte_vhost/vhost_user.h b/lib/librte_vhost/vhost_user.h
index ba78d32..59f80f2 100644
--- a/lib/librte_vhost/vhost_user.h
+++ b/lib/librte_vhost/vhost_user.h
@@ -38,19 +38,12 @@
 #include <linux/vhost.h>
 
 #include "rte_virtio_net.h"
+#include "rte_virtio_dev.h"
 
 /* refer to hw/virtio/vhost-user.c */
 
 #define VHOST_MEMORY_MAX_NREGIONS 8
 
-#define VHOST_USER_PROTOCOL_F_MQ	0
-#define VHOST_USER_PROTOCOL_F_LOG_SHMFD	1
-#define VHOST_USER_PROTOCOL_F_RARP	2
-
-#define VHOST_USER_PROTOCOL_FEATURES	((1ULL << VHOST_USER_PROTOCOL_F_MQ) | \
-					 (1ULL << VHOST_USER_PROTOCOL_F_LOG_SHMFD) |\
-					 (1ULL << VHOST_USER_PROTOCOL_F_RARP))
-
 typedef enum VhostUserRequest {
 	VHOST_USER_NONE = 0,
 	VHOST_USER_GET_FEATURES = 1,
@@ -117,12 +110,16 @@ typedef struct VhostUserMsg {
 /* The version of the protocol we support */
 #define VHOST_USER_VERSION    0x1
 
-
 /* vhost_user.c */
 int vhost_user_msg_handler(int vid, int fd);
 
 /* socket.c */
 int read_fd_message(int sockfd, char *buf, int buflen, int *fds, int fd_num);
 int send_fd_message(int sockfd, char *buf, int buflen, int *fds, int fd_num);
+void vhost_set_ifname(int vid, const char *if_name, unsigned int if_len);
+int vhost_new_device(int type);
+void vhost_destroy_device(int vid);
+
+struct virtio_dev *get_device(int vid);
 
-#endif
+#endif
\ No newline at end of file
diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
index 277b150..c11e9b2 100644
--- a/lib/librte_vhost/virtio_net.c
+++ b/lib/librte_vhost/virtio_net.c
@@ -45,7 +45,8 @@
 #include <rte_sctp.h>
 #include <rte_arp.h>
 
-#include "vhost.h"
+#include "vhost_net.h"
+#include "vhost_device.h"
 
 #define MAX_PKT_BURST 32
 #define VHOST_LOG_PAGE	4096
@@ -147,7 +148,7 @@ copy_mbuf_to_desc(struct virtio_net *dev, struct vhost_virtqueue *vq,
 	struct virtio_net_hdr_mrg_rxbuf virtio_hdr = {{0, 0, 0, 0, 0, 0}, 0};
 
 	desc = &vq->desc[desc_idx];
-	desc_addr = gpa_to_vva(dev, desc->addr);
+	desc_addr = gpa_to_vva(dev->device, desc->addr);
 	/*
 	 * Checking of 'desc_addr' placed outside of 'unlikely' macro to avoid
 	 * performance issue with some versions of gcc (4.8.4 and 5.3.0) which
@@ -187,7 +188,7 @@ copy_mbuf_to_desc(struct virtio_net *dev, struct vhost_virtqueue *vq,
 				return -1;
 
 			desc = &vq->desc[desc->next];
-			desc_addr = gpa_to_vva(dev, desc->addr);
+			desc_addr = gpa_to_vva(dev->device, desc->addr);
 			if (unlikely(!desc_addr))
 				return -1;
 
@@ -232,7 +233,7 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,
 	LOG_DEBUG(VHOST_DATA, "(%d) %s\n", dev->vid, __func__);
 	if (unlikely(!is_valid_virt_queue_idx(queue_id, 0, dev->virt_qp_nb))) {
 		RTE_LOG(ERR, VHOST_DATA, "(%d) %s: invalid virtqueue idx %d.\n",
-			dev->vid, __func__, queue_id);
+			dev->device->vid, __func__, queue_id);
 		return 0;
 	}
 
@@ -395,7 +396,7 @@ copy_mbuf_to_desc_mergeable(struct virtio_net *dev, struct vhost_virtqueue *vq,
 	LOG_DEBUG(VHOST_DATA, "(%d) current index %d | end index %d\n",
 		dev->vid, cur_idx, end_idx);
 
-	desc_addr = gpa_to_vva(dev, buf_vec[vec_idx].buf_addr);
+	desc_addr = gpa_to_vva(dev->device, buf_vec[vec_idx].buf_addr);
 	if (buf_vec[vec_idx].buf_len < dev->vhost_hlen || !desc_addr)
 		return 0;
 
@@ -432,7 +433,7 @@ copy_mbuf_to_desc_mergeable(struct virtio_net *dev, struct vhost_virtqueue *vq,
 			}
 
 			vec_idx++;
-			desc_addr = gpa_to_vva(dev, buf_vec[vec_idx].buf_addr);
+			desc_addr = gpa_to_vva(dev->device, buf_vec[vec_idx].buf_addr);
 			if (unlikely(!desc_addr))
 				return 0;
 
@@ -487,7 +488,7 @@ virtio_dev_merge_rx(struct virtio_net *dev, uint16_t queue_id,
 	LOG_DEBUG(VHOST_DATA, "(%d) %s\n", dev->vid, __func__);
 	if (unlikely(!is_valid_virt_queue_idx(queue_id, 0, dev->virt_qp_nb))) {
 		RTE_LOG(ERR, VHOST_DATA, "(%d) %s: invalid virtqueue idx %d.\n",
-			dev->vid, __func__, queue_id);
+			dev->device->vid, __func__, queue_id);
 		return 0;
 	}
 
@@ -537,10 +538,12 @@ uint16_t
 rte_vhost_enqueue_burst(int vid, uint16_t queue_id,
 	struct rte_mbuf **pkts, uint16_t count)
 {
-	struct virtio_net *dev = get_device(vid);
+	struct virtio_dev *device = get_device(vid);
+	struct virtio_net *dev;
 
-	if (!dev)
+	if (!device)
 		return 0;
+	dev = get_net_device(device);
 
 	if (dev->features & (1 << VIRTIO_NET_F_MRG_RXBUF))
 		return virtio_dev_merge_rx(dev, queue_id, pkts, count);
@@ -734,7 +737,7 @@ copy_desc_to_mbuf(struct virtio_net *dev, struct vhost_virtqueue *vq,
 	if (unlikely(desc->len < dev->vhost_hlen))
 		return -1;
 
-	desc_addr = gpa_to_vva(dev, desc->addr);
+	desc_addr = gpa_to_vva(dev->device, desc->addr);
 	if (unlikely(!desc_addr))
 		return -1;
 
@@ -771,7 +774,7 @@ copy_desc_to_mbuf(struct virtio_net *dev, struct vhost_virtqueue *vq,
 		   (desc->flags & VRING_DESC_F_NEXT) != 0)) {
 		desc = &vq->desc[desc->next];
 
-		desc_addr = gpa_to_vva(dev, desc->addr);
+		desc_addr = gpa_to_vva(dev->device, desc->addr);
 		if (unlikely(!desc_addr))
 			return -1;
 
@@ -800,7 +803,7 @@ copy_desc_to_mbuf(struct virtio_net *dev, struct vhost_virtqueue *vq,
 		 * copy is enabled.
 		 */
 		if (dev->tx_zero_copy &&
-		    (hpa = gpa_to_hpa(dev, desc->addr + desc_offset, cpy_len))) {
+		    (hpa = gpa_to_hpa(dev->device, desc->addr + desc_offset, cpy_len))) {
 			cur->data_len = cpy_len;
 			cur->data_off = 0;
 			cur->buf_addr = (void *)(uintptr_t)desc_addr;
@@ -833,7 +836,7 @@ copy_desc_to_mbuf(struct virtio_net *dev, struct vhost_virtqueue *vq,
 				return -1;
 			desc = &vq->desc[desc->next];
 
-			desc_addr = gpa_to_vva(dev, desc->addr);
+			desc_addr = gpa_to_vva(dev->device, desc->addr);
 			if (unlikely(!desc_addr))
 				return -1;
 
@@ -924,6 +927,7 @@ uint16_t
 rte_vhost_dequeue_burst(int vid, uint16_t queue_id,
 	struct rte_mempool *mbuf_pool, struct rte_mbuf **pkts, uint16_t count)
 {
+	struct virtio_dev *device;
 	struct virtio_net *dev;
 	struct rte_mbuf *rarp_mbuf = NULL;
 	struct vhost_virtqueue *vq;
@@ -933,13 +937,14 @@ rte_vhost_dequeue_burst(int vid, uint16_t queue_id,
 	uint16_t free_entries;
 	uint16_t avail_idx;
 
-	dev = get_device(vid);
-	if (!dev)
+	device = get_device(vid);
+	if (!device)
 		return 0;
+	dev = get_net_device(device);
 
 	if (unlikely(!is_valid_virt_queue_idx(queue_id, 1, dev->virt_qp_nb))) {
 		RTE_LOG(ERR, VHOST_DATA, "(%d) %s: invalid virtqueue idx %d.\n",
-			dev->vid, __func__, queue_id);
+			dev->device->vid, __func__, queue_id);
 		return 0;
 	}
 
-- 
1.9.3

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [dpdk-dev] [PATCH v2 1/2] vhost: change the vhost library to a common framework which can support more VIRTIO devices
  2016-09-14 12:15 [dpdk-dev] [PATCH] vhost: change the vhost library to a common framework which can support more VIRTIO devices Changpeng Liu
  2016-09-13 12:58 ` Yuanhan Liu
@ 2016-09-15  0:28 ` Changpeng Liu
  2016-09-15  0:28   ` [dpdk-dev] [PATCH v2 2/2] vhost: add vhost-scsi support to vhost library Changpeng Liu
  1 sibling, 1 reply; 9+ messages in thread
From: Changpeng Liu @ 2016-09-15  0:28 UTC (permalink / raw)
  To: dev; +Cc: yuanhan.liu, james.r.harris

For storage virtualization use cases, vhost-scsi becomes a more popular
solution to support VMs. However a user space vhost-scsi-user solution
does not exist currently. SPDK(Storage Performance Development Kit,
https://github.com/spdk/spdk) will provide a user space vhost-scsi target
to support multiple VMs through Qemu. Originally SPDK is built on top
of DPDK libraries, so we would like to use DPDK vhost library as the
communication channel between Qemu and vhost-scsi target application.

Currently DPDK vhost library can only support VIRTIO_ID_NET device type,
we would like to extend the library to support VIRTIO_ID_SCSI and
VIRTIO_ID_BLK. Most of DPDK vhost library can be reused only several
differences:
1. VIRTIO SCSI device has different vring queues compared with VIRTIO NET
device, at least 3 vring queues needed for SCSI device type;
2. VIRTIO SCSI will need several extra message operation code, such as
SCSI_SET_ENDPIONT/SCSI_CLEAR_ENDPOINT;

First, we would like to extend DPDK vhost library as a common framework
which be friendly to add other VIRTIO device types, to implement this feature,
we add a new data structure virtio_dev, which can deliver socket messages
to different VIRTIO devices, each specific VIRTIO device will register
callback to virtio_dev.

Secondly, we would to upstream a patch to Qemu community to add vhost-scsi
specific operation command such as SCSI_SET_ENDPOINT and SCSI_CLEAR_ENDOINT,
and user space feature bits.

Finally, after the Qemu patch set was merged, we will add VIRTIO_ID_SCSI
support to DPDK vhost library and an example vhost-scsi target which can
add a SCSI device to VM through this example application.

This patch set changed the vhost library as a common framework which
can add other VIRTIO device type in future.

Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
---
 lib/librte_vhost/Makefile         |   4 +-
 lib/librte_vhost/rte_virtio_dev.h | 140 ++++++++
 lib/librte_vhost/rte_virtio_net.h |  97 +-----
 lib/librte_vhost/socket.c         |   6 +-
 lib/librte_vhost/vhost.c          | 421 ------------------------
 lib/librte_vhost/vhost.h          | 288 -----------------
 lib/librte_vhost/vhost_device.h   | 230 +++++++++++++
 lib/librte_vhost/vhost_net.c      | 659 ++++++++++++++++++++++++++++++++++++++
 lib/librte_vhost/vhost_net.h      | 126 ++++++++
 lib/librte_vhost/vhost_user.c     | 451 +++++++++++++-------------
 lib/librte_vhost/vhost_user.h     |  17 +-
 lib/librte_vhost/virtio_net.c     |  37 ++-
 12 files changed, 1426 insertions(+), 1050 deletions(-)
 create mode 100644 lib/librte_vhost/rte_virtio_dev.h
 delete mode 100644 lib/librte_vhost/vhost.c
 delete mode 100644 lib/librte_vhost/vhost.h
 create mode 100644 lib/librte_vhost/vhost_device.h
 create mode 100644 lib/librte_vhost/vhost_net.c
 create mode 100644 lib/librte_vhost/vhost_net.h

diff --git a/lib/librte_vhost/Makefile b/lib/librte_vhost/Makefile
index 415ffc6..af30491 100644
--- a/lib/librte_vhost/Makefile
+++ b/lib/librte_vhost/Makefile
@@ -47,11 +47,11 @@ LDLIBS += -lnuma
 endif
 
 # all source are stored in SRCS-y
-SRCS-$(CONFIG_RTE_LIBRTE_VHOST) := fd_man.c socket.c vhost.c vhost_user.c \
+SRCS-$(CONFIG_RTE_LIBRTE_VHOST) := fd_man.c socket.c vhost_net.c vhost_user.c \
 				   virtio_net.c
 
 # install includes
-SYMLINK-$(CONFIG_RTE_LIBRTE_VHOST)-include += rte_virtio_net.h
+SYMLINK-$(CONFIG_RTE_LIBRTE_VHOST)-include += rte_virtio_net.h rte_virtio_dev.h
 
 # dependencies
 DEPDIRS-$(CONFIG_RTE_LIBRTE_VHOST) += lib/librte_eal
diff --git a/lib/librte_vhost/rte_virtio_dev.h b/lib/librte_vhost/rte_virtio_dev.h
new file mode 100644
index 0000000..e3c857a
--- /dev/null
+++ b/lib/librte_vhost/rte_virtio_dev.h
@@ -0,0 +1,140 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _VIRTIO_DEV_H_
+#define _VIRTIO_DEV_H_
+
+/* Device types and capabilities flag */
+#define RTE_VHOST_USER_CLIENT		(1ULL << 0)
+#define RTE_VHOST_USER_NO_RECONNECT	(1ULL << 1)
+#define RTE_VHOST_USER_TX_ZERO_COPY	(1ULL << 2)
+
+#define RTE_VHOST_USER_DEV_NET		(1ULL << 32)
+
+/**
+ * Device and vring operations.
+ *
+ * Make sure to set VIRTIO_DEV_RUNNING to the device flags in new_device and
+ * remove it in destroy_device.
+ *
+ */
+struct virtio_net_device_ops {
+	int (*new_device)(int vid);		/**< Add device. */
+	void (*destroy_device)(int vid);	/**< Remove device. */
+
+	int (*vring_state_changed)(int vid, uint16_t queue_id, int enable);	/**< triggered when a vring is enabled or disabled */
+
+	void *reserved[5]; /**< Reserved for future extension */
+};
+
+/**
+ *  Disable features in feature_mask. Returns 0 on success.
+ */
+int rte_vhost_feature_disable(uint64_t feature_mask);
+
+/**
+ *  Enable features in feature_mask. Returns 0 on success.
+ */
+int rte_vhost_feature_enable(uint64_t feature_mask);
+
+/* Returns currently supported vhost features */
+uint64_t rte_vhost_feature_get(void);
+
+int rte_vhost_enable_guest_notification(int vid, uint16_t queue_id, int enable);
+
+/**
+ * Register vhost driver. path could be different for multiple
+ * instance support.
+ */
+int rte_vhost_driver_register(const char *path, uint64_t flags);
+
+/* Unregister vhost driver. This is only meaningful to vhost user. */
+int rte_vhost_driver_unregister(const char *path);
+
+/* Start vhost driver session blocking loop. */
+int rte_vhost_driver_session_start(void);
+
+/**
+ * Get the numa node from which the virtio net device's memory
+ * is allocated.
+ *
+ * @param vid
+ *  virtio-net device ID
+ *
+ * @return
+ *  The numa node, -1 on failure
+ */
+int rte_vhost_get_numa_node(int vid);
+
+/**
+ * Get the number of queues the device supports.
+ *
+ * @param vid
+ *  virtio-net device ID
+ *
+ * @return
+ *  The number of queues, 0 on failure
+ */
+uint32_t rte_vhost_get_queue_num(int vid);
+
+/**
+ * Get how many avail entries are left in the queue
+ *
+ * @param vid
+ *  virtio-net device ID
+ * @param queue_id
+ *  virtio queue index
+ *
+ * @return
+ *  num of avail entires left
+ */
+uint16_t rte_vhost_avail_entries(int vid, uint16_t queue_id);
+
+/**
+ * Get the virtio net device's ifname. For vhost-cuse, ifname is the
+ * path of the char device. For vhost-user, ifname is the vhost-user
+ * socket file path.
+ *
+ * @param vid
+ *  virtio-net device ID
+ * @param buf
+ *  The buffer to stored the queried ifname
+ * @param len
+ *  The length of buf
+ *
+ * @return
+ *  0 on success, -1 on failure
+ */
+int rte_vhost_get_ifname(int vid, char *buf, size_t len);
+
+#endif /* _VIRTIO_DEV_H_ */
\ No newline at end of file
diff --git a/lib/librte_vhost/rte_virtio_net.h b/lib/librte_vhost/rte_virtio_net.h
index 3ddc9ca..86ede8a 100644
--- a/lib/librte_vhost/rte_virtio_net.h
+++ b/lib/librte_vhost/rte_virtio_net.h
@@ -50,107 +50,14 @@
 #include <rte_memory.h>
 #include <rte_mempool.h>
 #include <rte_ether.h>
-
-#define RTE_VHOST_USER_CLIENT		(1ULL << 0)
-#define RTE_VHOST_USER_NO_RECONNECT	(1ULL << 1)
-#define RTE_VHOST_USER_TX_ZERO_COPY	(1ULL << 2)
+#include <rte_virtio_dev.h>
 
 /* Enum for virtqueue management. */
 enum {VIRTIO_RXQ, VIRTIO_TXQ, VIRTIO_QNUM};
 
-/**
- * Device and vring operations.
- */
-struct virtio_net_device_ops {
-	int (*new_device)(int vid);		/**< Add device. */
-	void (*destroy_device)(int vid);	/**< Remove device. */
-
-	int (*vring_state_changed)(int vid, uint16_t queue_id, int enable);	/**< triggered when a vring is enabled or disabled */
-
-	void *reserved[5]; /**< Reserved for future extension */
-};
-
-/**
- *  Disable features in feature_mask. Returns 0 on success.
- */
-int rte_vhost_feature_disable(uint64_t feature_mask);
-
-/**
- *  Enable features in feature_mask. Returns 0 on success.
- */
-int rte_vhost_feature_enable(uint64_t feature_mask);
-
-/* Returns currently supported vhost features */
-uint64_t rte_vhost_feature_get(void);
-
-int rte_vhost_enable_guest_notification(int vid, uint16_t queue_id, int enable);
-
-/**
- * Register vhost driver. path could be different for multiple
- * instance support.
- */
-int rte_vhost_driver_register(const char *path, uint64_t flags);
-
-/* Unregister vhost driver. This is only meaningful to vhost user. */
-int rte_vhost_driver_unregister(const char *path);
-
 /* Register callbacks. */
 int rte_vhost_driver_callback_register(struct virtio_net_device_ops const * const);
-/* Start vhost driver session blocking loop. */
-int rte_vhost_driver_session_start(void);
-
-/**
- * Get the numa node from which the virtio net device's memory
- * is allocated.
- *
- * @param vid
- *  virtio-net device ID
- *
- * @return
- *  The numa node, -1 on failure
- */
-int rte_vhost_get_numa_node(int vid);
 
-/**
- * Get the number of queues the device supports.
- *
- * @param vid
- *  virtio-net device ID
- *
- * @return
- *  The number of queues, 0 on failure
- */
-uint32_t rte_vhost_get_queue_num(int vid);
-
-/**
- * Get the virtio net device's ifname. For vhost-cuse, ifname is the
- * path of the char device. For vhost-user, ifname is the vhost-user
- * socket file path.
- *
- * @param vid
- *  virtio-net device ID
- * @param buf
- *  The buffer to stored the queried ifname
- * @param len
- *  The length of buf
- *
- * @return
- *  0 on success, -1 on failure
- */
-int rte_vhost_get_ifname(int vid, char *buf, size_t len);
-
-/**
- * Get how many avail entries are left in the queue
- *
- * @param vid
- *  virtio-net device ID
- * @param queue_id
- *  virtio queue index
- *
- * @return
- *  num of avail entires left
- */
-uint16_t rte_vhost_avail_entries(int vid, uint16_t queue_id);
 
 /**
  * This function adds buffers to the virtio devices RX virtqueue. Buffers can
@@ -191,4 +98,4 @@ uint16_t rte_vhost_enqueue_burst(int vid, uint16_t queue_id,
 uint16_t rte_vhost_dequeue_burst(int vid, uint16_t queue_id,
 	struct rte_mempool *mbuf_pool, struct rte_mbuf **pkts, uint16_t count);
 
-#endif /* _VIRTIO_NET_H_ */
+#endif /* _VIRTIO_NET_H_ */
\ No newline at end of file
diff --git a/lib/librte_vhost/socket.c b/lib/librte_vhost/socket.c
index 5c3962d..1474c98 100644
--- a/lib/librte_vhost/socket.c
+++ b/lib/librte_vhost/socket.c
@@ -49,7 +49,7 @@
 #include <rte_log.h>
 
 #include "fd_man.h"
-#include "vhost.h"
+#include "vhost_device.h"
 #include "vhost_user.h"
 
 /*
@@ -62,6 +62,7 @@ struct vhost_user_socket {
 	int connfd;
 	bool is_server;
 	bool reconnect;
+	int type;
 	bool tx_zero_copy;
 };
 
@@ -194,7 +195,7 @@ vhost_user_add_connection(int fd, struct vhost_user_socket *vsocket)
 		return;
 	}
 
-	vid = vhost_new_device();
+	vid = vhost_new_device(vsocket->type);
 	if (vid == -1) {
 		close(fd);
 		free(conn);
@@ -525,6 +526,7 @@ rte_vhost_driver_register(const char *path, uint64_t flags)
 		goto out;
 	}
 
+	vsocket->type = VIRTIO_ID_NET;
 	vhost_user.vsockets[vhost_user.vsocket_cnt++] = vsocket;
 
 out:
diff --git a/lib/librte_vhost/vhost.c b/lib/librte_vhost/vhost.c
deleted file mode 100644
index 5461e5b..0000000
--- a/lib/librte_vhost/vhost.c
+++ /dev/null
@@ -1,421 +0,0 @@
-/*-
- *   BSD LICENSE
- *
- *   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
- *   All rights reserved.
- *
- *   Redistribution and use in source and binary forms, with or without
- *   modification, are permitted provided that the following conditions
- *   are met:
- *
- *     * Redistributions of source code must retain the above copyright
- *       notice, this list of conditions and the following disclaimer.
- *     * Redistributions in binary form must reproduce the above copyright
- *       notice, this list of conditions and the following disclaimer in
- *       the documentation and/or other materials provided with the
- *       distribution.
- *     * Neither the name of Intel Corporation nor the names of its
- *       contributors may be used to endorse or promote products derived
- *       from this software without specific prior written permission.
- *
- *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
- *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
- *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
- *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
- *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
- *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
- *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
- *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
- *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
- *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- */
-
-#include <linux/vhost.h>
-#include <linux/virtio_net.h>
-#include <stddef.h>
-#include <stdint.h>
-#include <stdlib.h>
-#ifdef RTE_LIBRTE_VHOST_NUMA
-#include <numaif.h>
-#endif
-
-#include <rte_ethdev.h>
-#include <rte_log.h>
-#include <rte_string_fns.h>
-#include <rte_memory.h>
-#include <rte_malloc.h>
-#include <rte_virtio_net.h>
-
-#include "vhost.h"
-
-#define VHOST_USER_F_PROTOCOL_FEATURES	30
-
-/* Features supported by this lib. */
-#define VHOST_SUPPORTED_FEATURES ((1ULL << VIRTIO_NET_F_MRG_RXBUF) | \
-				(1ULL << VIRTIO_NET_F_CTRL_VQ) | \
-				(1ULL << VIRTIO_NET_F_CTRL_RX) | \
-				(1ULL << VIRTIO_NET_F_GUEST_ANNOUNCE) | \
-				(VHOST_SUPPORTS_MQ)            | \
-				(1ULL << VIRTIO_F_VERSION_1)   | \
-				(1ULL << VHOST_F_LOG_ALL)      | \
-				(1ULL << VHOST_USER_F_PROTOCOL_FEATURES) | \
-				(1ULL << VIRTIO_NET_F_HOST_TSO4) | \
-				(1ULL << VIRTIO_NET_F_HOST_TSO6) | \
-				(1ULL << VIRTIO_NET_F_CSUM)    | \
-				(1ULL << VIRTIO_NET_F_GUEST_CSUM) | \
-				(1ULL << VIRTIO_NET_F_GUEST_TSO4) | \
-				(1ULL << VIRTIO_NET_F_GUEST_TSO6))
-
-uint64_t VHOST_FEATURES = VHOST_SUPPORTED_FEATURES;
-
-struct virtio_net *vhost_devices[MAX_VHOST_DEVICE];
-
-/* device ops to add/remove device to/from data core. */
-struct virtio_net_device_ops const *notify_ops;
-
-struct virtio_net *
-get_device(int vid)
-{
-	struct virtio_net *dev = vhost_devices[vid];
-
-	if (unlikely(!dev)) {
-		RTE_LOG(ERR, VHOST_CONFIG,
-			"(%d) device not found.\n", vid);
-	}
-
-	return dev;
-}
-
-static void
-cleanup_vq(struct vhost_virtqueue *vq, int destroy)
-{
-	if ((vq->callfd >= 0) && (destroy != 0))
-		close(vq->callfd);
-	if (vq->kickfd >= 0)
-		close(vq->kickfd);
-}
-
-/*
- * Unmap any memory, close any file descriptors and
- * free any memory owned by a device.
- */
-void
-cleanup_device(struct virtio_net *dev, int destroy)
-{
-	uint32_t i;
-
-	vhost_backend_cleanup(dev);
-
-	for (i = 0; i < dev->virt_qp_nb; i++) {
-		cleanup_vq(dev->virtqueue[i * VIRTIO_QNUM + VIRTIO_RXQ], destroy);
-		cleanup_vq(dev->virtqueue[i * VIRTIO_QNUM + VIRTIO_TXQ], destroy);
-	}
-}
-
-/*
- * Release virtqueues and device memory.
- */
-static void
-free_device(struct virtio_net *dev)
-{
-	uint32_t i;
-
-	for (i = 0; i < dev->virt_qp_nb; i++)
-		rte_free(dev->virtqueue[i * VIRTIO_QNUM]);
-
-	rte_free(dev);
-}
-
-static void
-init_vring_queue(struct vhost_virtqueue *vq, int qp_idx)
-{
-	memset(vq, 0, sizeof(struct vhost_virtqueue));
-
-	vq->kickfd = VIRTIO_UNINITIALIZED_EVENTFD;
-	vq->callfd = VIRTIO_UNINITIALIZED_EVENTFD;
-
-	/* Backends are set to -1 indicating an inactive device. */
-	vq->backend = -1;
-
-	/* always set the default vq pair to enabled */
-	if (qp_idx == 0)
-		vq->enabled = 1;
-
-	TAILQ_INIT(&vq->zmbuf_list);
-}
-
-static void
-init_vring_queue_pair(struct virtio_net *dev, uint32_t qp_idx)
-{
-	uint32_t base_idx = qp_idx * VIRTIO_QNUM;
-
-	init_vring_queue(dev->virtqueue[base_idx + VIRTIO_RXQ], qp_idx);
-	init_vring_queue(dev->virtqueue[base_idx + VIRTIO_TXQ], qp_idx);
-}
-
-static void
-reset_vring_queue(struct vhost_virtqueue *vq, int qp_idx)
-{
-	int callfd;
-
-	callfd = vq->callfd;
-	init_vring_queue(vq, qp_idx);
-	vq->callfd = callfd;
-}
-
-static void
-reset_vring_queue_pair(struct virtio_net *dev, uint32_t qp_idx)
-{
-	uint32_t base_idx = qp_idx * VIRTIO_QNUM;
-
-	reset_vring_queue(dev->virtqueue[base_idx + VIRTIO_RXQ], qp_idx);
-	reset_vring_queue(dev->virtqueue[base_idx + VIRTIO_TXQ], qp_idx);
-}
-
-int
-alloc_vring_queue_pair(struct virtio_net *dev, uint32_t qp_idx)
-{
-	struct vhost_virtqueue *virtqueue = NULL;
-	uint32_t virt_rx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_RXQ;
-	uint32_t virt_tx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_TXQ;
-
-	virtqueue = rte_malloc(NULL,
-			       sizeof(struct vhost_virtqueue) * VIRTIO_QNUM, 0);
-	if (virtqueue == NULL) {
-		RTE_LOG(ERR, VHOST_CONFIG,
-			"Failed to allocate memory for virt qp:%d.\n", qp_idx);
-		return -1;
-	}
-
-	dev->virtqueue[virt_rx_q_idx] = virtqueue;
-	dev->virtqueue[virt_tx_q_idx] = virtqueue + VIRTIO_TXQ;
-
-	init_vring_queue_pair(dev, qp_idx);
-
-	dev->virt_qp_nb += 1;
-
-	return 0;
-}
-
-/*
- * Reset some variables in device structure, while keeping few
- * others untouched, such as vid, ifname, virt_qp_nb: they
- * should be same unless the device is removed.
- */
-void
-reset_device(struct virtio_net *dev)
-{
-	uint32_t i;
-
-	dev->features = 0;
-	dev->protocol_features = 0;
-	dev->flags = 0;
-
-	for (i = 0; i < dev->virt_qp_nb; i++)
-		reset_vring_queue_pair(dev, i);
-}
-
-/*
- * Function is called from the CUSE open function. The device structure is
- * initialised and a new entry is added to the device configuration linked
- * list.
- */
-int
-vhost_new_device(void)
-{
-	struct virtio_net *dev;
-	int i;
-
-	dev = rte_zmalloc(NULL, sizeof(struct virtio_net), 0);
-	if (dev == NULL) {
-		RTE_LOG(ERR, VHOST_CONFIG,
-			"Failed to allocate memory for new dev.\n");
-		return -1;
-	}
-
-	for (i = 0; i < MAX_VHOST_DEVICE; i++) {
-		if (vhost_devices[i] == NULL)
-			break;
-	}
-	if (i == MAX_VHOST_DEVICE) {
-		RTE_LOG(ERR, VHOST_CONFIG,
-			"Failed to find a free slot for new device.\n");
-		return -1;
-	}
-
-	vhost_devices[i] = dev;
-	dev->vid = i;
-
-	return i;
-}
-
-/*
- * Function is called from the CUSE release function. This function will
- * cleanup the device and remove it from device configuration linked list.
- */
-void
-vhost_destroy_device(int vid)
-{
-	struct virtio_net *dev = get_device(vid);
-
-	if (dev == NULL)
-		return;
-
-	if (dev->flags & VIRTIO_DEV_RUNNING) {
-		dev->flags &= ~VIRTIO_DEV_RUNNING;
-		notify_ops->destroy_device(vid);
-	}
-
-	cleanup_device(dev, 1);
-	free_device(dev);
-
-	vhost_devices[vid] = NULL;
-}
-
-void
-vhost_set_ifname(int vid, const char *if_name, unsigned int if_len)
-{
-	struct virtio_net *dev;
-	unsigned int len;
-
-	dev = get_device(vid);
-	if (dev == NULL)
-		return;
-
-	len = if_len > sizeof(dev->ifname) ?
-		sizeof(dev->ifname) : if_len;
-
-	strncpy(dev->ifname, if_name, len);
-	dev->ifname[sizeof(dev->ifname) - 1] = '\0';
-}
-
-void
-vhost_enable_tx_zero_copy(int vid)
-{
-	struct virtio_net *dev = get_device(vid);
-
-	if (dev == NULL)
-		return;
-
-	dev->tx_zero_copy = 1;
-}
-
-int
-rte_vhost_get_numa_node(int vid)
-{
-#ifdef RTE_LIBRTE_VHOST_NUMA
-	struct virtio_net *dev = get_device(vid);
-	int numa_node;
-	int ret;
-
-	if (dev == NULL)
-		return -1;
-
-	ret = get_mempolicy(&numa_node, NULL, 0, dev,
-			    MPOL_F_NODE | MPOL_F_ADDR);
-	if (ret < 0) {
-		RTE_LOG(ERR, VHOST_CONFIG,
-			"(%d) failed to query numa node: %d\n", vid, ret);
-		return -1;
-	}
-
-	return numa_node;
-#else
-	RTE_SET_USED(vid);
-	return -1;
-#endif
-}
-
-uint32_t
-rte_vhost_get_queue_num(int vid)
-{
-	struct virtio_net *dev = get_device(vid);
-
-	if (dev == NULL)
-		return 0;
-
-	return dev->virt_qp_nb;
-}
-
-int
-rte_vhost_get_ifname(int vid, char *buf, size_t len)
-{
-	struct virtio_net *dev = get_device(vid);
-
-	if (dev == NULL)
-		return -1;
-
-	len = RTE_MIN(len, sizeof(dev->ifname));
-
-	strncpy(buf, dev->ifname, len);
-	buf[len - 1] = '\0';
-
-	return 0;
-}
-
-uint16_t
-rte_vhost_avail_entries(int vid, uint16_t queue_id)
-{
-	struct virtio_net *dev;
-	struct vhost_virtqueue *vq;
-
-	dev = get_device(vid);
-	if (!dev)
-		return 0;
-
-	vq = dev->virtqueue[queue_id];
-	if (!vq->enabled)
-		return 0;
-
-	return *(volatile uint16_t *)&vq->avail->idx - vq->last_used_idx;
-}
-
-int
-rte_vhost_enable_guest_notification(int vid, uint16_t queue_id, int enable)
-{
-	struct virtio_net *dev = get_device(vid);
-
-	if (dev == NULL)
-		return -1;
-
-	if (enable) {
-		RTE_LOG(ERR, VHOST_CONFIG,
-			"guest notification isn't supported.\n");
-		return -1;
-	}
-
-	dev->virtqueue[queue_id]->used->flags = VRING_USED_F_NO_NOTIFY;
-	return 0;
-}
-
-uint64_t rte_vhost_feature_get(void)
-{
-	return VHOST_FEATURES;
-}
-
-int rte_vhost_feature_disable(uint64_t feature_mask)
-{
-	VHOST_FEATURES = VHOST_FEATURES & ~feature_mask;
-	return 0;
-}
-
-int rte_vhost_feature_enable(uint64_t feature_mask)
-{
-	if ((feature_mask & VHOST_SUPPORTED_FEATURES) == feature_mask) {
-		VHOST_FEATURES = VHOST_FEATURES | feature_mask;
-		return 0;
-	}
-	return -1;
-}
-
-/*
- * Register ops so that we can add/remove device to data core.
- */
-int
-rte_vhost_driver_callback_register(struct virtio_net_device_ops const * const ops)
-{
-	notify_ops = ops;
-
-	return 0;
-}
diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h
deleted file mode 100644
index 7e4a15e..0000000
--- a/lib/librte_vhost/vhost.h
+++ /dev/null
@@ -1,288 +0,0 @@
-/*-
- *   BSD LICENSE
- *
- *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
- *   All rights reserved.
- *
- *   Redistribution and use in source and binary forms, with or without
- *   modification, are permitted provided that the following conditions
- *   are met:
- *
- *     * Redistributions of source code must retain the above copyright
- *       notice, this list of conditions and the following disclaimer.
- *     * Redistributions in binary form must reproduce the above copyright
- *       notice, this list of conditions and the following disclaimer in
- *       the documentation and/or other materials provided with the
- *       distribution.
- *     * Neither the name of Intel Corporation nor the names of its
- *       contributors may be used to endorse or promote products derived
- *       from this software without specific prior written permission.
- *
- *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
- *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
- *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
- *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
- *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
- *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
- *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
- *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
- *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
- *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- */
-
-#ifndef _VHOST_NET_CDEV_H_
-#define _VHOST_NET_CDEV_H_
-#include <stdint.h>
-#include <stdio.h>
-#include <sys/types.h>
-#include <sys/queue.h>
-#include <unistd.h>
-#include <linux/vhost.h>
-
-#include <rte_log.h>
-
-#include "rte_virtio_net.h"
-
-/* Used to indicate that the device is running on a data core */
-#define VIRTIO_DEV_RUNNING 1
-
-/* Backend value set by guest. */
-#define VIRTIO_DEV_STOPPED -1
-
-#define BUF_VECTOR_MAX 256
-
-/**
- * Structure contains buffer address, length and descriptor index
- * from vring to do scatter RX.
- */
-struct buf_vector {
-	uint64_t buf_addr;
-	uint32_t buf_len;
-	uint32_t desc_idx;
-};
-
-/*
- * A structure to hold some fields needed in zero copy code path,
- * mainly for associating an mbuf with the right desc_idx.
- */
-struct zcopy_mbuf {
-	struct rte_mbuf *mbuf;
-	uint32_t desc_idx;
-	uint16_t in_use;
-
-	TAILQ_ENTRY(zcopy_mbuf) next;
-};
-TAILQ_HEAD(zcopy_mbuf_list, zcopy_mbuf);
-
-/**
- * Structure contains variables relevant to RX/TX virtqueues.
- */
-struct vhost_virtqueue {
-	struct vring_desc	*desc;
-	struct vring_avail	*avail;
-	struct vring_used	*used;
-	uint32_t		size;
-
-	uint16_t		last_avail_idx;
-	volatile uint16_t	last_used_idx;
-#define VIRTIO_INVALID_EVENTFD		(-1)
-#define VIRTIO_UNINITIALIZED_EVENTFD	(-2)
-
-	/* Backend value to determine if device should started/stopped */
-	int			backend;
-	/* Used to notify the guest (trigger interrupt) */
-	int			callfd;
-	/* Currently unused as polling mode is enabled */
-	int			kickfd;
-	int			enabled;
-
-	/* Physical address of used ring, for logging */
-	uint64_t		log_guest_addr;
-
-	uint16_t		nr_zmbuf;
-	uint16_t		zmbuf_size;
-	uint16_t		last_zmbuf_idx;
-	struct zcopy_mbuf	*zmbufs;
-	struct zcopy_mbuf_list	zmbuf_list;
-} __rte_cache_aligned;
-
-/* Old kernels have no such macro defined */
-#ifndef VIRTIO_NET_F_GUEST_ANNOUNCE
- #define VIRTIO_NET_F_GUEST_ANNOUNCE 21
-#endif
-
-
-/*
- * Make an extra wrapper for VIRTIO_NET_F_MQ and
- * VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MAX as they are
- * introduced since kernel v3.8. This makes our
- * code buildable for older kernel.
- */
-#ifdef VIRTIO_NET_F_MQ
- #define VHOST_MAX_QUEUE_PAIRS	VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MAX
- #define VHOST_SUPPORTS_MQ	(1ULL << VIRTIO_NET_F_MQ)
-#else
- #define VHOST_MAX_QUEUE_PAIRS	1
- #define VHOST_SUPPORTS_MQ	0
-#endif
-
-/*
- * Define virtio 1.0 for older kernels
- */
-#ifndef VIRTIO_F_VERSION_1
- #define VIRTIO_F_VERSION_1 32
-#endif
-
-struct guest_page {
-	uint64_t guest_phys_addr;
-	uint64_t host_phys_addr;
-	uint64_t size;
-};
-
-/**
- * Device structure contains all configuration information relating
- * to the device.
- */
-struct virtio_net {
-	/* Frontend (QEMU) memory and memory region information */
-	struct virtio_memory	*mem;
-	uint64_t		features;
-	uint64_t		protocol_features;
-	int			vid;
-	uint32_t		flags;
-	uint16_t		vhost_hlen;
-	/* to tell if we need broadcast rarp packet */
-	rte_atomic16_t		broadcast_rarp;
-	uint32_t		virt_qp_nb;
-	int			tx_zero_copy;
-	struct vhost_virtqueue	*virtqueue[VHOST_MAX_QUEUE_PAIRS * 2];
-#define IF_NAME_SZ (PATH_MAX > IFNAMSIZ ? PATH_MAX : IFNAMSIZ)
-	char			ifname[IF_NAME_SZ];
-	uint64_t		log_size;
-	uint64_t		log_base;
-	uint64_t		log_addr;
-	struct ether_addr	mac;
-
-	uint32_t		nr_guest_pages;
-	uint32_t		max_guest_pages;
-	struct guest_page       *guest_pages;
-} __rte_cache_aligned;
-
-/**
- * Information relating to memory regions including offsets to
- * addresses in QEMUs memory file.
- */
-struct virtio_memory_region {
-	uint64_t guest_phys_addr;
-	uint64_t guest_user_addr;
-	uint64_t host_user_addr;
-	uint64_t size;
-	void	 *mmap_addr;
-	uint64_t mmap_size;
-	int fd;
-};
-
-
-/**
- * Memory structure includes region and mapping information.
- */
-struct virtio_memory {
-	uint32_t nregions;
-	struct virtio_memory_region regions[0];
-};
-
-
-/* Macros for printing using RTE_LOG */
-#define RTE_LOGTYPE_VHOST_CONFIG RTE_LOGTYPE_USER1
-#define RTE_LOGTYPE_VHOST_DATA   RTE_LOGTYPE_USER1
-
-#ifdef RTE_LIBRTE_VHOST_DEBUG
-#define VHOST_MAX_PRINT_BUFF 6072
-#define LOG_LEVEL RTE_LOG_DEBUG
-#define LOG_DEBUG(log_type, fmt, args...) RTE_LOG(DEBUG, log_type, fmt, ##args)
-#define PRINT_PACKET(device, addr, size, header) do { \
-	char *pkt_addr = (char *)(addr); \
-	unsigned int index; \
-	char packet[VHOST_MAX_PRINT_BUFF]; \
-	\
-	if ((header)) \
-		snprintf(packet, VHOST_MAX_PRINT_BUFF, "(%d) Header size %d: ", (device->vid), (size)); \
-	else \
-		snprintf(packet, VHOST_MAX_PRINT_BUFF, "(%d) Packet size %d: ", (device->vid), (size)); \
-	for (index = 0; index < (size); index++) { \
-		snprintf(packet + strnlen(packet, VHOST_MAX_PRINT_BUFF), VHOST_MAX_PRINT_BUFF - strnlen(packet, VHOST_MAX_PRINT_BUFF), \
-			"%02hhx ", pkt_addr[index]); \
-	} \
-	snprintf(packet + strnlen(packet, VHOST_MAX_PRINT_BUFF), VHOST_MAX_PRINT_BUFF - strnlen(packet, VHOST_MAX_PRINT_BUFF), "\n"); \
-	\
-	LOG_DEBUG(VHOST_DATA, "%s", packet); \
-} while (0)
-#else
-#define LOG_LEVEL RTE_LOG_INFO
-#define LOG_DEBUG(log_type, fmt, args...) do {} while (0)
-#define PRINT_PACKET(device, addr, size, header) do {} while (0)
-#endif
-
-extern uint64_t VHOST_FEATURES;
-#define MAX_VHOST_DEVICE	1024
-extern struct virtio_net *vhost_devices[MAX_VHOST_DEVICE];
-
-/* Convert guest physical Address to host virtual address */
-static inline uint64_t __attribute__((always_inline))
-gpa_to_vva(struct virtio_net *dev, uint64_t gpa)
-{
-	struct virtio_memory_region *reg;
-	uint32_t i;
-
-	for (i = 0; i < dev->mem->nregions; i++) {
-		reg = &dev->mem->regions[i];
-		if (gpa >= reg->guest_phys_addr &&
-		    gpa <  reg->guest_phys_addr + reg->size) {
-			return gpa - reg->guest_phys_addr +
-			       reg->host_user_addr;
-		}
-	}
-
-	return 0;
-}
-
-/* Convert guest physical address to host physical address */
-static inline phys_addr_t __attribute__((always_inline))
-gpa_to_hpa(struct virtio_net *dev, uint64_t gpa, uint64_t size)
-{
-	uint32_t i;
-	struct guest_page *page;
-
-	for (i = 0; i < dev->nr_guest_pages; i++) {
-		page = &dev->guest_pages[i];
-
-		if (gpa >= page->guest_phys_addr &&
-		    gpa + size < page->guest_phys_addr + page->size) {
-			return gpa - page->guest_phys_addr +
-			       page->host_phys_addr;
-		}
-	}
-
-	return 0;
-}
-
-struct virtio_net_device_ops const *notify_ops;
-struct virtio_net *get_device(int vid);
-
-int vhost_new_device(void);
-void cleanup_device(struct virtio_net *dev, int destroy);
-void reset_device(struct virtio_net *dev);
-void vhost_destroy_device(int);
-
-int alloc_vring_queue_pair(struct virtio_net *dev, uint32_t qp_idx);
-
-void vhost_set_ifname(int, const char *if_name, unsigned int if_len);
-void vhost_enable_tx_zero_copy(int vid);
-
-/*
- * Backend-specific cleanup. Defined by vhost-cuse and vhost-user.
- */
-void vhost_backend_cleanup(struct virtio_net *dev);
-
-#endif /* _VHOST_NET_CDEV_H_ */
diff --git a/lib/librte_vhost/vhost_device.h b/lib/librte_vhost/vhost_device.h
new file mode 100644
index 0000000..7101bb0
--- /dev/null
+++ b/lib/librte_vhost/vhost_device.h
@@ -0,0 +1,230 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _VHOST_DEVICE_H_
+#define _VHOST_DEVICE_H_
+
+#include <linux/virtio_ids.h>
+
+#include "vhost_net.h"
+#include "vhost_user.h"
+
+/* Used to indicate that the device is running on a data core */
+#define VIRTIO_DEV_RUNNING 1
+
+/* Backend value set by guest. */
+#define VIRTIO_DEV_STOPPED -1
+
+/**
+ * Structure contains variables relevant to RX/TX virtqueues.
+ */
+struct vhost_virtqueue {
+	struct vring_desc	*desc;
+	struct vring_avail	*avail;
+	struct vring_used	*used;
+	uint32_t		size;
+
+	uint16_t		last_avail_idx;
+	volatile uint16_t	last_used_idx;
+#define VIRTIO_INVALID_EVENTFD		(-1)
+#define VIRTIO_UNINITIALIZED_EVENTFD	(-2)
+
+	/* Backend value to determine if device should started/stopped */
+	int			backend;
+	/* Used to notify the guest (trigger interrupt) */
+	int			callfd;
+	/* Currently unused as polling mode is enabled */
+	int			kickfd;
+	int			enabled;
+
+	/* Physical address of used ring, for logging */
+	uint64_t		log_guest_addr;
+
+	uint16_t		nr_zmbuf;
+	uint16_t		zmbuf_size;
+	uint16_t		last_zmbuf_idx;
+	struct zcopy_mbuf	*zmbufs;
+	struct zcopy_mbuf_list	zmbuf_list;
+} __rte_cache_aligned;
+
+struct virtio_dev;
+
+struct virtio_dev_table {
+	int (*vhost_dev_ready)(struct virtio_dev *dev);
+	struct vhost_virtqueue* (*vhost_dev_get_queues)(struct virtio_dev *dev, uint16_t queue_id);
+	void (*vhost_dev_cleanup)(struct virtio_dev *dev, int destroy);
+	void (*vhost_dev_free)(struct virtio_dev *dev);
+	void (*vhost_dev_reset)(struct virtio_dev *dev);
+	uint64_t (*vhost_dev_get_features)(struct virtio_dev *dev);
+	int (*vhost_dev_set_features)(struct virtio_dev *dev, uint64_t features);
+	uint64_t (*vhost_dev_get_protocol_features)(struct virtio_dev *dev);
+	int (*vhost_dev_set_protocol_features)(struct virtio_dev *dev, uint64_t features);
+	uint32_t (*vhost_dev_get_default_queue_num)(struct virtio_dev *dev);
+	uint32_t (*vhost_dev_get_queue_num)(struct virtio_dev *dev);
+	uint16_t (*vhost_dev_get_avail_entries)(struct virtio_dev *dev, uint16_t queue_id);
+	int (*vhost_dev_get_vring_base)(struct virtio_dev *dev, struct vhost_virtqueue *vq);
+	int (*vhost_dev_set_vring_num)(struct virtio_dev *dev, struct vhost_virtqueue *vq);
+	int (*vhost_dev_set_vring_call)(struct virtio_dev *dev, struct vhost_vring_file *file);
+	int (*vhost_dev_set_log_base)(struct virtio_dev *dev, int fd, uint64_t size, uint64_t off);
+};
+
+struct virtio_dev {
+	/* Frontend (QEMU) memory and memory region information */
+	struct virtio_memory	*mem;
+	int			vid;
+	uint32_t		flags;
+	#define IF_NAME_SZ (PATH_MAX > IFNAMSIZ ? PATH_MAX : IFNAMSIZ)
+	char			ifname[IF_NAME_SZ];
+
+	uint32_t		dev_type;
+	union {
+		struct virtio_net	net_dev;
+	} dev;
+
+	uint32_t		nr_guest_pages;
+	uint32_t		max_guest_pages;
+	struct guest_page       *guest_pages;
+
+	const struct virtio_net_device_ops	*notify_ops;
+	struct virtio_dev_table fn_table;
+} __rte_cache_aligned;
+
+extern struct virtio_net_device_ops const *notify_ops;
+
+/*
+ * Define virtio 1.0 for older kernels
+ */
+#ifndef VIRTIO_F_VERSION_1
+ #define VIRTIO_F_VERSION_1 32
+#endif
+
+struct guest_page {
+	uint64_t guest_phys_addr;
+	uint64_t host_phys_addr;
+	uint64_t size;
+};
+
+/**
+ * Information relating to memory regions including offsets to
+ * addresses in QEMUs memory file.
+ */
+struct virtio_memory_region {
+	uint64_t guest_phys_addr;
+	uint64_t guest_user_addr;
+	uint64_t host_user_addr;
+	uint64_t size;
+	void	 *mmap_addr;
+	uint64_t mmap_size;
+	int fd;
+};
+
+/**
+ * Memory structure includes region and mapping information.
+ */
+struct virtio_memory {
+	uint32_t nregions;
+	struct virtio_memory_region regions[0];
+};
+
+
+/* Macros for printing using RTE_LOG */
+#define RTE_LOGTYPE_VHOST_CONFIG RTE_LOGTYPE_USER1
+#define RTE_LOGTYPE_VHOST_DATA   RTE_LOGTYPE_USER1
+
+#ifdef RTE_LIBRTE_VHOST_DEBUG
+#define VHOST_MAX_PRINT_BUFF 6072
+#define LOG_LEVEL RTE_LOG_DEBUG
+#define LOG_DEBUG(log_type, fmt, args...) RTE_LOG(DEBUG, log_type, fmt, ##args)
+#define PRINT_PACKET(device, addr, size, header) do { \
+	char *pkt_addr = (char *)(addr); \
+	unsigned int index; \
+	char packet[VHOST_MAX_PRINT_BUFF]; \
+	\
+	if ((header)) \
+		snprintf(packet, VHOST_MAX_PRINT_BUFF, "(%d) Header size %d: ", (device->vid), (size)); \
+	else \
+		snprintf(packet, VHOST_MAX_PRINT_BUFF, "(%d) Packet size %d: ", (device->vid), (size)); \
+	for (index = 0; index < (size); index++) { \
+		snprintf(packet + strnlen(packet, VHOST_MAX_PRINT_BUFF), VHOST_MAX_PRINT_BUFF - strnlen(packet, VHOST_MAX_PRINT_BUFF), \
+			"%02hhx ", pkt_addr[index]); \
+	} \
+	snprintf(packet + strnlen(packet, VHOST_MAX_PRINT_BUFF), VHOST_MAX_PRINT_BUFF - strnlen(packet, VHOST_MAX_PRINT_BUFF), "\n"); \
+	\
+	LOG_DEBUG(VHOST_DATA, "%s", packet); \
+} while (0)
+#else
+#define LOG_LEVEL RTE_LOG_INFO
+#define LOG_DEBUG(log_type, fmt, args...) do {} while (0)
+#define PRINT_PACKET(device, addr, size, header) do {} while (0)
+#endif
+
+/* Convert guest physical Address to host virtual address */
+static inline uint64_t __attribute__((always_inline))
+gpa_to_vva(struct virtio_dev *dev, uint64_t gpa)
+{
+	struct virtio_memory_region *reg;
+	uint32_t i;
+
+	for (i = 0; i < dev->mem->nregions; i++) {
+		reg = &dev->mem->regions[i];
+		if (gpa >= reg->guest_phys_addr &&
+		    gpa <  reg->guest_phys_addr + reg->size) {
+			return gpa - reg->guest_phys_addr +
+			       reg->host_user_addr;
+		}
+	}
+
+	return 0;
+}
+
+/* Convert guest physical address to host physical address */
+static inline phys_addr_t __attribute__((always_inline))
+gpa_to_hpa(struct virtio_dev *dev, uint64_t gpa, uint64_t size)
+{
+	uint32_t i;
+	struct guest_page *page;
+
+	for (i = 0; i < dev->nr_guest_pages; i++) {
+		page = &dev->guest_pages[i];
+
+		if (gpa >= page->guest_phys_addr &&
+		    gpa + size < page->guest_phys_addr + page->size) {
+			return gpa - page->guest_phys_addr +
+			       page->host_phys_addr;
+		}
+	}
+
+	return 0;
+}
+
+#endif
\ No newline at end of file
diff --git a/lib/librte_vhost/vhost_net.c b/lib/librte_vhost/vhost_net.c
new file mode 100644
index 0000000..f141b32
--- /dev/null
+++ b/lib/librte_vhost/vhost_net.c
@@ -0,0 +1,659 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <linux/vhost.h>
+#include <linux/virtio_net.h>
+#include <stddef.h>
+#include <stdint.h>
+#include <stdlib.h>
+#include <assert.h>
+#ifdef RTE_LIBRTE_VHOST_NUMA
+#include <numaif.h>
+#endif
+#include <sys/mman.h>
+
+#include <rte_ethdev.h>
+#include <rte_log.h>
+#include <rte_string_fns.h>
+#include <rte_memory.h>
+#include <rte_malloc.h>
+#include <rte_virtio_net.h>
+
+#include "vhost_net.h"
+#include "vhost_device.h"
+
+#define VHOST_USER_F_PROTOCOL_FEATURES	30
+
+/* Features supported by this lib. */
+#define VHOST_SUPPORTED_FEATURES ((1ULL << VIRTIO_NET_F_MRG_RXBUF) | \
+				(1ULL << VIRTIO_NET_F_CTRL_VQ) | \
+				(1ULL << VIRTIO_NET_F_CTRL_RX) | \
+				(1ULL << VIRTIO_NET_F_GUEST_ANNOUNCE) | \
+				(VHOST_SUPPORTS_MQ)            | \
+				(1ULL << VIRTIO_F_VERSION_1)   | \
+				(1ULL << VHOST_F_LOG_ALL)      | \
+				(1ULL << VHOST_USER_F_PROTOCOL_FEATURES) | \
+				(1ULL << VIRTIO_NET_F_HOST_TSO4) | \
+				(1ULL << VIRTIO_NET_F_HOST_TSO6) | \
+				(1ULL << VIRTIO_NET_F_CSUM)    | \
+				(1ULL << VIRTIO_NET_F_GUEST_CSUM) | \
+				(1ULL << VIRTIO_NET_F_GUEST_TSO4) | \
+				(1ULL << VIRTIO_NET_F_GUEST_TSO6))
+
+uint64_t VHOST_FEATURES = VHOST_SUPPORTED_FEATURES;
+
+/* device ops to add/remove device to/from data core. */
+struct virtio_net_device_ops const *notify_ops = NULL;
+
+struct virtio_net *
+get_net_device(struct virtio_dev *dev)
+{
+	if (!dev)
+		return NULL;
+ 
+	return &dev->dev.net_dev;
+}
+
+static void
+cleanup_vq(struct vhost_virtqueue *vq, int destroy)
+{
+	if ((vq->callfd >= 0) && (destroy != 0))
+		close(vq->callfd);
+	if (vq->kickfd >= 0)
+		close(vq->kickfd);
+}
+
+/*
+ * Unmap any memory, close any file descriptors and
+ * free any memory owned by a device.
+ */
+static void
+cleanup_device(struct virtio_dev *device, int destroy)
+{
+	struct virtio_net *dev = get_net_device(device);
+	uint32_t i;
+
+	dev->features = 0;
+	dev->protocol_features = 0;
+
+	for (i = 0; i < dev->virt_qp_nb; i++) {
+		cleanup_vq(dev->virtqueue[i * VIRTIO_QNUM + VIRTIO_RXQ], destroy);
+		cleanup_vq(dev->virtqueue[i * VIRTIO_QNUM + VIRTIO_TXQ], destroy);
+	}
+
+	if (dev->log_addr) {
+		munmap((void *)(uintptr_t)dev->log_addr, dev->log_size);
+		dev->log_addr = 0;
+	}
+}
+
+/*
+ * Release virtqueues and device memory.
+ */
+static void
+free_device(struct virtio_dev *device)
+{
+	struct virtio_net *dev = get_net_device(device);
+	uint32_t i;
+
+	for (i = 0; i < dev->virt_qp_nb; i++)
+		rte_free(dev->virtqueue[i * VIRTIO_QNUM]);
+
+	rte_free(dev);
+}
+
+static void
+init_vring_queue(struct vhost_virtqueue *vq, int qp_idx)
+{
+	memset(vq, 0, sizeof(struct vhost_virtqueue));
+
+	vq->kickfd = VIRTIO_UNINITIALIZED_EVENTFD;
+	vq->callfd = VIRTIO_UNINITIALIZED_EVENTFD;
+
+	/* Backends are set to -1 indicating an inactive device. */
+	vq->backend = -1;
+
+	/* always set the default vq pair to enabled */
+	if (qp_idx == 0)
+		vq->enabled = 1;
+
+	TAILQ_INIT(&vq->zmbuf_list);
+}
+
+static void
+init_vring_queue_pair(struct virtio_net *dev, uint32_t qp_idx)
+{
+	uint32_t base_idx = qp_idx * VIRTIO_QNUM;
+
+	init_vring_queue(dev->virtqueue[base_idx + VIRTIO_RXQ], qp_idx);
+	init_vring_queue(dev->virtqueue[base_idx + VIRTIO_TXQ], qp_idx);
+}
+
+static void
+reset_vring_queue(struct vhost_virtqueue *vq, int qp_idx)
+{
+	int callfd;
+
+	callfd = vq->callfd;
+	init_vring_queue(vq, qp_idx);
+	vq->callfd = callfd;
+}
+
+static void
+reset_vring_queue_pair(struct virtio_net *dev, uint32_t qp_idx)
+{
+	uint32_t base_idx = qp_idx * VIRTIO_QNUM;
+
+	reset_vring_queue(dev->virtqueue[base_idx + VIRTIO_RXQ], qp_idx);
+	reset_vring_queue(dev->virtqueue[base_idx + VIRTIO_TXQ], qp_idx);
+}
+
+static int
+alloc_vring_queue_pair(struct virtio_net *dev, uint32_t qp_idx)
+{
+	struct vhost_virtqueue *virtqueue = NULL;
+	uint32_t virt_rx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_RXQ;
+	uint32_t virt_tx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_TXQ;
+
+	virtqueue = rte_malloc(NULL,
+			       sizeof(struct vhost_virtqueue) * VIRTIO_QNUM, 0);
+	if (virtqueue == NULL) {
+		RTE_LOG(ERR, VHOST_CONFIG,
+			"Failed to allocate memory for virt qp:%d.\n", qp_idx);
+		return -1;
+	}
+
+	dev->virtqueue[virt_rx_q_idx] = virtqueue;
+	dev->virtqueue[virt_tx_q_idx] = virtqueue + VIRTIO_TXQ;
+
+	init_vring_queue_pair(dev, qp_idx);
+
+	dev->virt_qp_nb += 1;
+
+	return 0;
+}
+
+/*
+ * Reset some variables in device structure, while keeping few
+ * others untouched, such as vid, ifname, virt_qp_nb: they
+ * should be same unless the device is removed.
+ */
+static void
+reset_device(struct virtio_dev *device)
+{
+	struct virtio_net *dev = get_net_device(device);
+	uint32_t i;
+
+	for (i = 0; i < dev->virt_qp_nb; i++)
+		reset_vring_queue_pair(dev, i);
+}
+
+static uint64_t
+vhost_dev_get_features(struct virtio_dev *dev)
+{
+	if (dev == NULL)
+		return 0;
+
+	return VHOST_FEATURES;	
+}
+
+static int
+vhost_dev_set_features(struct virtio_dev *device, uint64_t features)
+{
+	struct virtio_net *dev = get_net_device(device);
+
+	if (features & ~VHOST_FEATURES)
+		return -1;
+
+	dev->features = features;
+	if (dev->features &
+		((1 << VIRTIO_NET_F_MRG_RXBUF) | (1ULL << VIRTIO_F_VERSION_1))) {
+		dev->vhost_hlen = sizeof(struct virtio_net_hdr_mrg_rxbuf);
+	} else {
+		dev->vhost_hlen = sizeof(struct virtio_net_hdr);
+	}
+	LOG_DEBUG(VHOST_CONFIG,
+		"(%d) mergeable RX buffers %s, virtio 1 %s\n",
+		device->vid,
+		(dev->features & (1 << VIRTIO_NET_F_MRG_RXBUF)) ? "on" : "off",
+		(dev->features & (1ULL << VIRTIO_F_VERSION_1)) ? "on" : "off");
+
+	return 0;
+}
+
+static int
+vhost_dev_set_vring_num(struct virtio_dev *device,
+			 struct vhost_virtqueue *vq)
+{
+	struct virtio_net *dev = get_net_device(device);
+
+	if (dev->tx_zero_copy) {
+		vq->nr_zmbuf = 0;
+		vq->last_zmbuf_idx = 0;
+		vq->zmbuf_size = vq->size;
+		vq->zmbufs = rte_zmalloc(NULL, vq->zmbuf_size *
+					 sizeof(struct zcopy_mbuf), 0);
+		if (vq->zmbufs == NULL) {
+			RTE_LOG(WARNING, VHOST_CONFIG,
+				"failed to allocate mem for zero copy; "
+				"zero copy is force disabled\n");
+			dev->tx_zero_copy = 0;
+		}
+	}
+
+	return 0;
+}
+
+static int
+vq_is_ready(struct vhost_virtqueue *vq)
+{
+	return vq && vq->desc   &&
+	       vq->kickfd != VIRTIO_UNINITIALIZED_EVENTFD &&
+	       vq->callfd != VIRTIO_UNINITIALIZED_EVENTFD;
+}
+
+static int
+vhost_dev_is_ready(struct virtio_dev *device)
+{
+	struct virtio_net *dev = get_net_device(device);
+	struct vhost_virtqueue *rvq, *tvq;
+	uint32_t i;
+
+	for (i = 0; i < dev->virt_qp_nb; i++) {
+		rvq = dev->virtqueue[i * VIRTIO_QNUM + VIRTIO_RXQ];
+		tvq = dev->virtqueue[i * VIRTIO_QNUM + VIRTIO_TXQ];
+
+		if (!vq_is_ready(rvq) || !vq_is_ready(tvq)) {
+			RTE_LOG(INFO, VHOST_CONFIG,
+				"virtio is not ready for processing.\n");
+			return 0;
+		}
+	}
+
+	RTE_LOG(INFO, VHOST_CONFIG,
+		"virtio is now ready for processing.\n");
+	return 1;
+}
+
+static int
+vhost_dev_set_vring_call(struct virtio_dev *device, struct vhost_vring_file *file)
+{
+	struct virtio_net *dev = get_net_device(device);
+	struct vhost_virtqueue *vq;
+	uint32_t cur_qp_idx;
+
+	/*
+	 * FIXME: VHOST_SET_VRING_CALL is the first per-vring message
+	 * we get, so we do vring queue pair allocation here.
+	 */
+	cur_qp_idx = file->index / VIRTIO_QNUM;
+	if (cur_qp_idx + 1 > dev->virt_qp_nb) {
+		if (alloc_vring_queue_pair(dev, cur_qp_idx) < 0)
+			return -1;
+	}
+
+	vq = dev->virtqueue[file->index];
+	assert(vq != NULL);
+
+	if (vq->callfd >= 0)
+		close(vq->callfd);
+
+	vq->callfd = file->fd;
+	return 0;
+}
+
+static int
+vhost_dev_set_protocol_features(struct virtio_dev *device,
+				 uint64_t protocol_features)
+{
+	struct virtio_net *dev = get_net_device(device);
+
+	if (protocol_features & ~VHOST_USER_PROTOCOL_FEATURES)
+		return -1;
+
+	dev->protocol_features = protocol_features;
+	return 0;
+}
+
+static uint64_t
+vhost_dev_get_protocol_features(struct virtio_dev *dev)
+{
+	if (dev == NULL)
+		return 0;
+
+	return VHOST_USER_PROTOCOL_FEATURES;
+}
+
+static uint32_t
+vhost_dev_get_default_queue_num(struct virtio_dev *dev)
+{
+	if (dev == NULL)
+		return 0;
+
+	return VHOST_MAX_QUEUE_PAIRS;
+}
+
+static uint32_t
+vhost_dev_get_queue_num(struct virtio_dev *device)
+{
+	struct virtio_net *dev;
+	if (device == NULL)
+		return 0;
+
+	dev = get_net_device(device);
+	return dev->virt_qp_nb;
+}
+
+static uint16_t
+vhost_dev_get_avail_entries(struct virtio_dev *device, uint16_t queue_id)
+{
+	struct virtio_net *dev = get_net_device(device);
+	struct vhost_virtqueue *vq;
+
+	vq = dev->virtqueue[queue_id];
+	if (!vq->enabled)
+		return 0;
+
+	return *(volatile uint16_t *)&vq->avail->idx - vq->last_used_idx;
+}
+
+void
+vhost_enable_tx_zero_copy(int vid)
+{
+	struct virtio_dev *device = get_device(vid);
+	struct virtio_net *dev;
+
+	if (device == NULL)
+		return;
+
+	dev = get_net_device(device);
+	dev->tx_zero_copy = 1;
+}
+
+static void
+free_zmbufs(struct vhost_virtqueue *vq)
+{
+	struct zcopy_mbuf *zmbuf, *next;
+
+	for (zmbuf = TAILQ_FIRST(&vq->zmbuf_list);
+	     zmbuf != NULL; zmbuf = next) {
+		next = TAILQ_NEXT(zmbuf, next);
+
+		rte_pktmbuf_free(zmbuf->mbuf);
+		TAILQ_REMOVE(&vq->zmbuf_list, zmbuf, next);
+	}
+
+	rte_free(vq->zmbufs);
+}
+
+static int
+vhost_dev_get_vring_base(struct virtio_dev *device, struct vhost_virtqueue *vq)
+{
+	struct virtio_net *dev = get_net_device(device);
+
+	/*
+	 * Based on current qemu vhost-user implementation, this message is
+	 * sent and only sent in vhost_vring_stop.
+	 * TODO: cleanup the vring, it isn't usable since here.
+	 */
+	if (vq->kickfd >= 0)
+		close(vq->kickfd);
+
+	vq->kickfd = VIRTIO_UNINITIALIZED_EVENTFD;
+
+	if (dev->tx_zero_copy)
+		free_zmbufs(vq);
+
+	return 0;
+}
+
+static int
+vhost_dev_set_log_base(struct virtio_dev *device, int fd, uint64_t size, uint64_t off)
+{
+	void *addr;
+	struct virtio_net *dev = get_net_device(device);
+
+	/*
+	 * mmap from 0 to workaround a hugepage mmap bug: mmap will
+	 * fail when offset is not page size aligned.
+	 */
+	addr = mmap(0, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
+	close(fd);
+	if (addr == MAP_FAILED) {
+		RTE_LOG(ERR, VHOST_CONFIG, "mmap log base failed!\n");
+		return -1;
+	}
+
+	/*
+	 * Free previously mapped log memory on occasionally
+	 * multiple VHOST_USER_SET_LOG_BASE.
+	 */
+	if (dev->log_addr) {
+		munmap((void *)(uintptr_t)dev->log_addr, dev->log_size);
+	}
+	dev->log_addr = (uint64_t)(uintptr_t)addr;
+	dev->log_base = dev->log_addr + off;
+	dev->log_size = size;
+
+	return 0;
+}
+
+/*
+ * An rarp packet is constructed and broadcasted to notify switches about
+ * the new location of the migrated VM, so that packets from outside will
+ * not be lost after migration.
+ *
+ * However, we don't actually "send" a rarp packet here, instead, we set
+ * a flag 'broadcast_rarp' to let rte_vhost_dequeue_burst() inject it.
+ */
+int
+vhost_user_send_rarp(struct virtio_dev *device, struct VhostUserMsg *msg)
+{
+	struct virtio_net *dev = get_net_device(device);
+	uint8_t *mac = (uint8_t *)&msg->payload.u64;
+
+	RTE_LOG(DEBUG, VHOST_CONFIG,
+		":: mac: %02x:%02x:%02x:%02x:%02x:%02x\n",
+		mac[0], mac[1], mac[2], mac[3], mac[4], mac[5]);
+	memcpy(dev->mac.addr_bytes, mac, 6);
+
+	/*
+	 * Set the flag to inject a RARP broadcast packet at
+	 * rte_vhost_dequeue_burst().
+	 *
+	 * rte_smp_wmb() is for making sure the mac is copied
+	 * before the flag is set.
+	 */
+	rte_smp_wmb();
+	rte_atomic16_set(&dev->broadcast_rarp, 1);
+
+	return 0;
+}
+
+static struct vhost_virtqueue *
+vhost_dev_get_queues(struct virtio_dev *device, uint16_t queue_id)
+{
+	struct virtio_net *dev = get_net_device(device);
+	struct vhost_virtqueue *vq;
+
+	vq = dev->virtqueue[queue_id];
+
+	return vq;
+}
+
+void
+vhost_net_device_init(struct virtio_dev *device)
+{
+	struct virtio_net *dev = get_net_device(device);
+
+	device->fn_table.vhost_dev_ready  = vhost_dev_is_ready;
+	device->fn_table.vhost_dev_get_queues  = vhost_dev_get_queues;
+	device->fn_table.vhost_dev_cleanup = cleanup_device;
+	device->fn_table.vhost_dev_free  = free_device;
+	device->fn_table.vhost_dev_reset  = reset_device;
+	device->fn_table.vhost_dev_get_features  = vhost_dev_get_features;
+	device->fn_table.vhost_dev_set_features  = vhost_dev_set_features;
+	device->fn_table.vhost_dev_get_protocol_features  = vhost_dev_get_protocol_features;
+	device->fn_table.vhost_dev_set_protocol_features  = vhost_dev_set_protocol_features;
+	device->fn_table.vhost_dev_get_default_queue_num  = vhost_dev_get_default_queue_num;
+	device->fn_table.vhost_dev_get_queue_num  = vhost_dev_get_queue_num;
+	device->fn_table.vhost_dev_get_avail_entries  = vhost_dev_get_avail_entries;
+	device->fn_table.vhost_dev_get_vring_base  = vhost_dev_get_vring_base;
+	device->fn_table.vhost_dev_set_vring_num = vhost_dev_set_vring_num;
+	device->fn_table.vhost_dev_set_vring_call  = vhost_dev_set_vring_call;
+	device->fn_table.vhost_dev_set_log_base = vhost_dev_set_log_base;
+
+	dev->device = device;
+}
+
+uint64_t rte_vhost_feature_get(void)
+{
+	return VHOST_FEATURES;
+}
+
+int rte_vhost_feature_disable(uint64_t feature_mask)
+{
+	VHOST_FEATURES = VHOST_FEATURES & ~feature_mask;
+	return 0;
+}
+
+int rte_vhost_feature_enable(uint64_t feature_mask)
+{
+	if ((feature_mask & VHOST_SUPPORTED_FEATURES) == feature_mask) {
+		VHOST_FEATURES = VHOST_FEATURES | feature_mask;
+		return 0;
+	}
+	return -1;
+}
+
+int
+rte_vhost_get_numa_node(int vid)
+{
+#ifdef RTE_LIBRTE_VHOST_NUMA
+	struct virtio_dev *dev = get_device(vid);
+	int numa_node;
+	int ret;
+
+	if (dev == NULL)
+		return -1;
+
+	ret = get_mempolicy(&numa_node, NULL, 0, dev,
+			    MPOL_F_NODE | MPOL_F_ADDR);
+	if (ret < 0) {
+		RTE_LOG(ERR, VHOST_CONFIG,
+			"(%d) failed to query numa node: %d\n", vid, ret);
+		return -1;
+	}
+
+	return numa_node;
+#else
+	RTE_SET_USED(vid);
+	return -1;
+#endif
+}
+
+uint32_t
+rte_vhost_get_queue_num(int vid)
+{
+	struct virtio_dev *device = get_device(vid);
+
+	if (device == NULL)
+		return 0;
+
+	if (device->fn_table.vhost_dev_get_queue_num)
+		return device->fn_table.vhost_dev_get_queue_num(device);
+
+	return 0;
+}
+
+int
+rte_vhost_get_ifname(int vid, char *buf, size_t len)
+{
+	struct virtio_dev *dev = get_device(vid);
+
+	if (dev == NULL)
+		return -1;
+
+	len = RTE_MIN(len, sizeof(dev->ifname));
+
+	strncpy(buf, dev->ifname, len);
+	buf[len - 1] = '\0';
+
+	return 0;
+}
+
+uint16_t
+rte_vhost_avail_entries(int vid, uint16_t queue_id)
+{
+	struct virtio_dev *device;
+
+	device = get_device(vid);
+	if (!device)
+		return 0;
+
+	if (device->fn_table.vhost_dev_get_avail_entries)
+		return device->fn_table.vhost_dev_get_avail_entries(device, queue_id);
+
+	return 0;
+}
+
+int
+rte_vhost_enable_guest_notification(int vid, uint16_t queue_id, int enable)
+{
+	struct virtio_dev *device = get_device(vid);
+	struct vhost_virtqueue *vq;
+
+	if (device == NULL)
+		return -1;
+
+	vq = device->fn_table.vhost_dev_get_queues(device, queue_id);
+	if (enable) {
+		RTE_LOG(ERR, VHOST_CONFIG,
+			"guest notification isn't supported.\n");
+		return -1;
+	}
+
+	vq->used->flags = VRING_USED_F_NO_NOTIFY;
+	return 0;
+}
+
+/*
+ * Register ops so that we can add/remove device to data core.
+ */
+int
+rte_vhost_driver_callback_register(struct virtio_net_device_ops const * const ops)
+{
+	notify_ops = ops;
+
+	return 0;
+}
\ No newline at end of file
diff --git a/lib/librte_vhost/vhost_net.h b/lib/librte_vhost/vhost_net.h
new file mode 100644
index 0000000..53b6b16
--- /dev/null
+++ b/lib/librte_vhost/vhost_net.h
@@ -0,0 +1,126 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _VHOST_NET_H_
+#define _VHOST_NET_H_
+#include <stdint.h>
+#include <stdio.h>
+#include <sys/types.h>
+#include <sys/queue.h>
+#include <unistd.h>
+#include <linux/vhost.h>
+
+#include <rte_log.h>
+
+#include "rte_virtio_net.h"
+#include "vhost_user.h"
+
+#define VHOST_USER_PROTOCOL_F_MQ	0
+#define VHOST_USER_PROTOCOL_F_LOG_SHMFD	1
+#define VHOST_USER_PROTOCOL_F_RARP	2
+
+#define VHOST_USER_PROTOCOL_FEATURES	((1ULL << VHOST_USER_PROTOCOL_F_MQ) | \
+					 (1ULL << VHOST_USER_PROTOCOL_F_LOG_SHMFD) |\
+					 (1ULL << VHOST_USER_PROTOCOL_F_RARP))
+
+#define BUF_VECTOR_MAX 256
+
+/**
+ * Structure contains buffer address, length and descriptor index
+ * from vring to do scatter RX.
+ */
+struct buf_vector {
+	uint64_t buf_addr;
+	uint32_t buf_len;
+	uint32_t desc_idx;
+};
+
+/*
+ * A structure to hold some fields needed in zero copy code path,
+ * mainly for associating an mbuf with the right desc_idx.
+ */
+struct zcopy_mbuf {
+	struct rte_mbuf *mbuf;
+	uint32_t desc_idx;
+	uint16_t in_use;
+
+	TAILQ_ENTRY(zcopy_mbuf) next;
+};
+TAILQ_HEAD(zcopy_mbuf_list, zcopy_mbuf);
+
+/* Old kernels have no such macro defined */
+#ifndef VIRTIO_NET_F_GUEST_ANNOUNCE
+ #define VIRTIO_NET_F_GUEST_ANNOUNCE 21
+#endif
+
+/*
+ * Make an extra wrapper for VIRTIO_NET_F_MQ and
+ * VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MAX as they are
+ * introduced since kernel v3.8. This makes our
+ * code buildable for older kernel.
+ */
+#ifdef VIRTIO_NET_F_MQ
+ #define VHOST_MAX_QUEUE_PAIRS	VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MAX
+ #define VHOST_SUPPORTS_MQ	(1ULL << VIRTIO_NET_F_MQ)
+#else
+ #define VHOST_MAX_QUEUE_PAIRS	1
+ #define VHOST_SUPPORTS_MQ	0
+#endif
+
+/**
+ * Device structure contains all configuration information relating
+ * to the device.
+ */
+struct virtio_net {
+	uint64_t		features;
+	uint64_t		protocol_features;
+	uint16_t		vhost_hlen;
+	uint64_t		log_size;
+	uint64_t		log_base;
+	uint64_t		log_addr;
+	/* to tell if we need broadcast rarp packet */
+	rte_atomic16_t		broadcast_rarp;
+	uint32_t		virt_qp_nb;
+	int			tx_zero_copy;
+	struct vhost_virtqueue	*virtqueue[VHOST_MAX_QUEUE_PAIRS * 2];
+	struct ether_addr	mac;
+	/* transport layer device context */
+	struct virtio_dev	*device;
+} __rte_cache_aligned;
+
+void vhost_enable_tx_zero_copy(int vid);
+int vhost_user_send_rarp(struct virtio_dev *device, struct VhostUserMsg *msg);
+void vhost_net_device_init(struct virtio_dev *device);
+struct virtio_net *get_net_device(struct virtio_dev *dev);
+
+#endif /* _VHOST_NET_H_ */
\ No newline at end of file
diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
index ff995d5..90c4b03 100644
--- a/lib/librte_vhost/vhost_user.c
+++ b/lib/librte_vhost/vhost_user.c
@@ -48,9 +48,13 @@
 #include <rte_malloc.h>
 #include <rte_log.h>
 
-#include "vhost.h"
+#include "vhost_device.h"
+#include "vhost_net.h"
 #include "vhost_user.h"
 
+#define MAX_VHOST_DEVICE        1024
+struct virtio_dev *vhost_devices[MAX_VHOST_DEVICE];
+
 static const char *vhost_message_str[VHOST_USER_MAX] = {
 	[VHOST_USER_NONE] = "VHOST_USER_NONE",
 	[VHOST_USER_GET_FEATURES] = "VHOST_USER_GET_FEATURES",
@@ -85,7 +89,7 @@ get_blk_size(int fd)
 }
 
 static void
-free_mem_region(struct virtio_net *dev)
+free_mem_region(struct virtio_dev *dev)
 {
 	uint32_t i;
 	struct virtio_memory_region *reg;
@@ -102,18 +106,99 @@ free_mem_region(struct virtio_net *dev)
 	}
 }
 
-void
-vhost_backend_cleanup(struct virtio_net *dev)
+static void
+vhost_backend_cleanup(struct virtio_dev *dev)
 {
 	if (dev->mem) {
 		free_mem_region(dev);
-		rte_free(dev->mem);
+		free(dev->mem);
 		dev->mem = NULL;
 	}
-	if (dev->log_addr) {
-		munmap((void *)(uintptr_t)dev->log_addr, dev->log_size);
-		dev->log_addr = 0;
+}
+
+struct virtio_dev *
+get_device(int vid)
+{
+	struct virtio_dev *dev = vhost_devices[vid];
+
+	if (unlikely(!dev)) {
+		RTE_LOG(ERR, VHOST_CONFIG,
+			"(%d) device not found.\n", vid);
+	}
+
+	return dev;
+}
+
+/*
+ * Function is called from the CUSE open function. The device structure is
+ * initialised and a new entry is added to the device configuration linked
+ * list.
+ */
+int
+vhost_new_device(int type)
+{
+	struct virtio_dev *dev;
+	int i;
+
+	dev = rte_zmalloc(NULL, sizeof(struct virtio_dev), 0);
+	if (dev == NULL) {
+		RTE_LOG(ERR, VHOST_CONFIG,
+			"Failed to allocate memory for new dev.\n");
+		return -1;
+	}
+
+	for (i = 0; i < MAX_VHOST_DEVICE; i++) {
+		if (vhost_devices[i] == NULL)
+			break;
+	}
+	if (i == MAX_VHOST_DEVICE) {
+		RTE_LOG(ERR, VHOST_CONFIG,
+			"Failed to find a free slot for new device.\n");
+		return -1;
+	}
+
+	switch(type) {
+		case VIRTIO_ID_NET:
+			dev->notify_ops = notify_ops;
+			vhost_net_device_init(dev);
+			assert(notify_ops != NULL);
+			break;
+		default:
+			return -1;
+	}
+
+	vhost_devices[i] = dev;
+	dev->vid = i;
+	dev->dev_type = type;
+	assert(dev->fn_table.vhost_dev_get_queues != NULL);
+
+	return i;
+}
+
+/*
+ * Function is called from the CUSE release function. This function will
+ * cleanup the device and remove it from device configuration linked list.
+ */
+void
+vhost_destroy_device(int vid)
+{
+	struct virtio_dev *dev = get_device(vid);
+
+	if (dev == NULL)
+		return;
+
+	if (dev->flags & VIRTIO_DEV_RUNNING) {
+		dev->flags &= ~VIRTIO_DEV_RUNNING;
+		dev->notify_ops->destroy_device(vid);
 	}
+
+	vhost_backend_cleanup(dev);
+	if (dev->fn_table.vhost_dev_cleanup)
+		dev->fn_table.vhost_dev_cleanup(dev, 1);
+	if (dev->fn_table.vhost_dev_free)
+		dev->fn_table.vhost_dev_free(dev);
+
+	vhost_devices[vid] = NULL;
 }
 
 /*
@@ -126,16 +211,28 @@ vhost_user_set_owner(void)
 	return 0;
 }
 
+/*
+ * Called from CUSE IOCTL: VHOST_RESET_OWNER
+ */
 static int
-vhost_user_reset_owner(struct virtio_net *dev)
+vhost_user_reset_owner(struct virtio_dev *dev)
 {
+	if (dev == NULL)
+		return -1;
+
 	if (dev->flags & VIRTIO_DEV_RUNNING) {
 		dev->flags &= ~VIRTIO_DEV_RUNNING;
-		notify_ops->destroy_device(dev->vid);
+		dev->notify_ops->destroy_device(dev->vid);
 	}
 
-	cleanup_device(dev, 0);
-	reset_device(dev);
+	dev->flags = 0;
+
+	vhost_backend_cleanup(dev);
+	if (dev->fn_table.vhost_dev_cleanup)
+		dev->fn_table.vhost_dev_cleanup(dev, 0);
+	if (dev->fn_table.vhost_dev_reset)
+		dev->fn_table.vhost_dev_reset(dev);
+
 	return 0;
 }
 
@@ -143,61 +240,61 @@ vhost_user_reset_owner(struct virtio_net *dev)
  * The features that we support are requested.
  */
 static uint64_t
-vhost_user_get_features(void)
+vhost_user_get_features(struct virtio_dev *dev)
 {
-	return VHOST_FEATURES;
+	if (dev == NULL)
+		return 0;
+
+	if (dev->fn_table.vhost_dev_get_features)
+		return dev->fn_table.vhost_dev_get_features(dev);
+
+	return 0;
 }
 
 /*
  * We receive the negotiated features supported by us and the virtio device.
  */
 static int
-vhost_user_set_features(struct virtio_net *dev, uint64_t features)
+vhost_user_set_features(struct virtio_dev *dev, uint64_t pu)
 {
-	if (features & ~VHOST_FEATURES)
-		return -1;
+	int ret = 0;
 
-	dev->features = features;
-	if (dev->features &
-		((1 << VIRTIO_NET_F_MRG_RXBUF) | (1ULL << VIRTIO_F_VERSION_1))) {
-		dev->vhost_hlen = sizeof(struct virtio_net_hdr_mrg_rxbuf);
-	} else {
-		dev->vhost_hlen = sizeof(struct virtio_net_hdr);
-	}
-	LOG_DEBUG(VHOST_CONFIG,
-		"(%d) mergeable RX buffers %s, virtio 1 %s\n",
-		dev->vid,
-		(dev->features & (1 << VIRTIO_NET_F_MRG_RXBUF)) ? "on" : "off",
-		(dev->features & (1ULL << VIRTIO_F_VERSION_1)) ? "on" : "off");
+	if (dev->fn_table.vhost_dev_set_features)
+		ret = dev->fn_table.vhost_dev_set_features(dev, pu);
 
-	return 0;
+	return ret;
+}
+
+void
+vhost_set_ifname(int vid, const char *if_name, unsigned int if_len)
+{
+	struct virtio_dev *dev;
+	unsigned int len;
+
+	dev = get_device(vid);
+	if (dev == NULL)
+		return;
+
+	len = if_len > sizeof(dev->ifname) ?
+		sizeof(dev->ifname) : if_len;
+
+	strncpy(dev->ifname, if_name, len);
+	dev->ifname[sizeof(dev->ifname) - 1] = '\0';
 }
 
 /*
  * The virtio device sends us the size of the descriptor ring.
  */
 static int
-vhost_user_set_vring_num(struct virtio_net *dev,
-			 struct vhost_vring_state *state)
+vhost_user_set_vring_num(struct virtio_dev *dev, struct vhost_vring_state *state)
 {
-	struct vhost_virtqueue *vq = dev->virtqueue[state->index];
+	struct vhost_virtqueue *vq;
 
+	vq = dev->fn_table.vhost_dev_get_queues(dev, state->index);
 	vq->size = state->num;
 
-	if (dev->tx_zero_copy) {
-		vq->nr_zmbuf = 0;
-		vq->last_zmbuf_idx = 0;
-		vq->zmbuf_size = vq->size;
-		vq->zmbufs = rte_zmalloc(NULL, vq->zmbuf_size *
-					 sizeof(struct zcopy_mbuf), 0);
-		if (vq->zmbufs == NULL) {
-			RTE_LOG(WARNING, VHOST_CONFIG,
-				"failed to allocate mem for zero copy; "
-				"zero copy is force disabled\n");
-			dev->tx_zero_copy = 0;
-		}
-	}
-
+	if (dev->fn_table.vhost_dev_set_vring_num)
+		dev->fn_table.vhost_dev_set_vring_num(dev, vq);
 	return 0;
 }
 
@@ -206,11 +303,11 @@ vhost_user_set_vring_num(struct virtio_net *dev,
  * same numa node as the memory of vring descriptor.
  */
 #ifdef RTE_LIBRTE_VHOST_NUMA
-static struct virtio_net*
-numa_realloc(struct virtio_net *dev, int index)
+static struct virtio_dev*
+numa_realloc(struct virtio_dev *dev, int index)
 {
 	int oldnode, newnode;
-	struct virtio_net *old_dev;
+	struct virtio_dev *old_dev;
 	struct vhost_virtqueue *old_vq, *vq;
 	int ret;
 
@@ -222,7 +319,7 @@ numa_realloc(struct virtio_net *dev, int index)
 		return dev;
 
 	old_dev = dev;
-	vq = old_vq = dev->virtqueue[index];
+	vq = old_vq = dev->fn_table.virtio_dev_get_queues(dev, index);
 
 	ret = get_mempolicy(&newnode, NULL, 0, old_vq->desc,
 			    MPOL_F_NODE | MPOL_F_ADDR);
@@ -277,8 +374,8 @@ out:
 	return dev;
 }
 #else
-static struct virtio_net*
-numa_realloc(struct virtio_net *dev, int index __rte_unused)
+static struct virtio_dev*
+numa_realloc(struct virtio_dev *dev, int index __rte_unused)
 {
 	return dev;
 }
@@ -289,7 +386,7 @@ numa_realloc(struct virtio_net *dev, int index __rte_unused)
  * used to convert the ring addresses to our address space.
  */
 static uint64_t
-qva_to_vva(struct virtio_net *dev, uint64_t qva)
+qva_to_vva(struct virtio_dev *dev, uint64_t qva)
 {
 	struct virtio_memory_region *reg;
 	uint32_t i;
@@ -313,15 +410,14 @@ qva_to_vva(struct virtio_net *dev, uint64_t qva)
  * This function then converts these to our address space.
  */
 static int
-vhost_user_set_vring_addr(struct virtio_net *dev, struct vhost_vring_addr *addr)
+vhost_user_set_vring_addr(struct virtio_dev *dev, struct vhost_vring_addr *addr)
 {
 	struct vhost_virtqueue *vq;
 
 	if (dev->mem == NULL)
 		return -1;
 
-	/* addr->index refers to the queue index. The txq 1, rxq is 0. */
-	vq = dev->virtqueue[addr->index];
+	vq = dev->fn_table.vhost_dev_get_queues(dev, addr->index);
 
 	/* The addresses are converted from QEMU virtual to Vhost virtual. */
 	vq->desc = (struct vring_desc *)(uintptr_t)qva_to_vva(dev,
@@ -334,7 +430,7 @@ vhost_user_set_vring_addr(struct virtio_net *dev, struct vhost_vring_addr *addr)
 	}
 
 	dev = numa_realloc(dev, addr->index);
-	vq = dev->virtqueue[addr->index];
+	vq = dev->fn_table.vhost_dev_get_queues(dev, addr->index);
 
 	vq->avail = (struct vring_avail *)(uintptr_t)qva_to_vva(dev,
 			addr->avail_user_addr);
@@ -381,17 +477,19 @@ vhost_user_set_vring_addr(struct virtio_net *dev, struct vhost_vring_addr *addr)
  * The virtio device sends us the available ring last used index.
  */
 static int
-vhost_user_set_vring_base(struct virtio_net *dev,
-			  struct vhost_vring_state *state)
+vhost_user_set_vring_base(struct virtio_dev *dev, struct vhost_vring_state *state)
 {
-	dev->virtqueue[state->index]->last_used_idx  = state->num;
-	dev->virtqueue[state->index]->last_avail_idx = state->num;
+	struct vhost_virtqueue *vq;
+
+	vq = dev->fn_table.vhost_dev_get_queues(dev, state->index);
+	vq->last_used_idx = state->num;
+	vq->last_avail_idx = state->num;
 
 	return 0;
 }
 
 static void
-add_one_guest_page(struct virtio_net *dev, uint64_t guest_phys_addr,
+add_one_guest_page(struct virtio_dev *dev, uint64_t guest_phys_addr,
 		   uint64_t host_phys_addr, uint64_t size)
 {
 	struct guest_page *page, *last_page;
@@ -419,7 +517,7 @@ add_one_guest_page(struct virtio_net *dev, uint64_t guest_phys_addr,
 }
 
 static void
-add_guest_pages(struct virtio_net *dev, struct virtio_memory_region *reg,
+add_guest_pages(struct virtio_dev *dev, struct virtio_memory_region *reg,
 		uint64_t page_size)
 {
 	uint64_t reg_size = reg->size;
@@ -450,7 +548,7 @@ add_guest_pages(struct virtio_net *dev, struct virtio_memory_region *reg,
 #ifdef RTE_LIBRTE_VHOST_DEBUG
 /* TODO: enable it only in debug mode? */
 static void
-dump_guest_pages(struct virtio_net *dev)
+dump_guest_pages(struct virtio_dev *dev)
 {
 	uint32_t i;
 	struct guest_page *page;
@@ -474,7 +572,7 @@ dump_guest_pages(struct virtio_net *dev)
 #endif
 
 static int
-vhost_user_set_mem_table(struct virtio_net *dev, struct VhostUserMsg *pmsg)
+vhost_user_set_mem_table(struct virtio_dev *dev, struct VhostUserMsg *pmsg)
 {
 	struct VhostUserMemory memory = pmsg->payload.memory;
 	struct virtio_memory_region *reg;
@@ -488,7 +586,7 @@ vhost_user_set_mem_table(struct virtio_net *dev, struct VhostUserMsg *pmsg)
 	/* Remove from the data plane. */
 	if (dev->flags & VIRTIO_DEV_RUNNING) {
 		dev->flags &= ~VIRTIO_DEV_RUNNING;
-		notify_ops->destroy_device(dev->vid);
+		dev->notify_ops->destroy_device(dev->vid);
 	}
 
 	if (dev->mem) {
@@ -588,41 +686,22 @@ err_mmap:
 }
 
 static int
-vq_is_ready(struct vhost_virtqueue *vq)
-{
-	return vq && vq->desc   &&
-	       vq->kickfd != VIRTIO_UNINITIALIZED_EVENTFD &&
-	       vq->callfd != VIRTIO_UNINITIALIZED_EVENTFD;
-}
-
-static int
-virtio_is_ready(struct virtio_net *dev)
+virtio_is_ready(struct virtio_dev *dev)
 {
-	struct vhost_virtqueue *rvq, *tvq;
-	uint32_t i;
-
-	for (i = 0; i < dev->virt_qp_nb; i++) {
-		rvq = dev->virtqueue[i * VIRTIO_QNUM + VIRTIO_RXQ];
-		tvq = dev->virtqueue[i * VIRTIO_QNUM + VIRTIO_TXQ];
-
-		if (!vq_is_ready(rvq) || !vq_is_ready(tvq)) {
-			RTE_LOG(INFO, VHOST_CONFIG,
-				"virtio is not ready for processing.\n");
-			return 0;
-		}
-	}
+	if (dev->fn_table.vhost_dev_ready)
+		return dev->fn_table.vhost_dev_ready(dev);
 
-	RTE_LOG(INFO, VHOST_CONFIG,
-		"virtio is now ready for processing.\n");
-	return 1;
+	return -1;
 }
 
+/*
+ *  In vhost-user, when we receive kick message, will test whether virtio
+ *  device is ready for packet processing.
+ */
 static void
-vhost_user_set_vring_call(struct virtio_net *dev, struct VhostUserMsg *pmsg)
+vhost_user_set_vring_call(struct virtio_dev *dev, struct VhostUserMsg *pmsg)
 {
 	struct vhost_vring_file file;
-	struct vhost_virtqueue *vq;
-	uint32_t cur_qp_idx;
 
 	file.index = pmsg->payload.u64 & VHOST_USER_VRING_IDX_MASK;
 	if (pmsg->payload.u64 & VHOST_USER_VRING_NOFD_MASK)
@@ -632,23 +711,8 @@ vhost_user_set_vring_call(struct virtio_net *dev, struct VhostUserMsg *pmsg)
 	RTE_LOG(INFO, VHOST_CONFIG,
 		"vring call idx:%d file:%d\n", file.index, file.fd);
 
-	/*
-	 * FIXME: VHOST_SET_VRING_CALL is the first per-vring message
-	 * we get, so we do vring queue pair allocation here.
-	 */
-	cur_qp_idx = file.index / VIRTIO_QNUM;
-	if (cur_qp_idx + 1 > dev->virt_qp_nb) {
-		if (alloc_vring_queue_pair(dev, cur_qp_idx) < 0)
-			return;
-	}
-
-	vq = dev->virtqueue[file.index];
-	assert(vq != NULL);
-
-	if (vq->callfd >= 0)
-		close(vq->callfd);
-
-	vq->callfd = file.fd;
+	if (dev->fn_table.vhost_dev_set_vring_call)
+		dev->fn_table.vhost_dev_set_vring_call(dev, &file);
 }
 
 /*
@@ -656,11 +720,14 @@ vhost_user_set_vring_call(struct virtio_net *dev, struct VhostUserMsg *pmsg)
  *  device is ready for packet processing.
  */
 static void
-vhost_user_set_vring_kick(struct virtio_net *dev, struct VhostUserMsg *pmsg)
+vhost_user_set_vring_kick(struct virtio_dev *dev, struct VhostUserMsg *pmsg)
 {
 	struct vhost_vring_file file;
 	struct vhost_virtqueue *vq;
 
+	if (!dev)
+		return;
+
 	file.index = pmsg->payload.u64 & VHOST_USER_VRING_IDX_MASK;
 	if (pmsg->payload.u64 & VHOST_USER_VRING_NOFD_MASK)
 		file.fd = VIRTIO_INVALID_EVENTFD;
@@ -668,69 +735,44 @@ vhost_user_set_vring_kick(struct virtio_net *dev, struct VhostUserMsg *pmsg)
 		file.fd = pmsg->fds[0];
 	RTE_LOG(INFO, VHOST_CONFIG,
 		"vring kick idx:%d file:%d\n", file.index, file.fd);
-
-	vq = dev->virtqueue[file.index];
+	vq = dev->fn_table.vhost_dev_get_queues(dev, file.index);
 	if (vq->kickfd >= 0)
 		close(vq->kickfd);
+
 	vq->kickfd = file.fd;
 
 	if (virtio_is_ready(dev) && !(dev->flags & VIRTIO_DEV_RUNNING)) {
-		if (dev->tx_zero_copy) {
-			RTE_LOG(INFO, VHOST_CONFIG,
-				"Tx zero copy is enabled\n");
-		}
-
-		if (notify_ops->new_device(dev->vid) == 0)
+		if (dev->notify_ops->new_device(dev->vid) == 0)
 			dev->flags |= VIRTIO_DEV_RUNNING;
 	}
 }
 
-static void
-free_zmbufs(struct vhost_virtqueue *vq)
-{
-	struct zcopy_mbuf *zmbuf, *next;
-
-	for (zmbuf = TAILQ_FIRST(&vq->zmbuf_list);
-	     zmbuf != NULL; zmbuf = next) {
-		next = TAILQ_NEXT(zmbuf, next);
-
-		rte_pktmbuf_free(zmbuf->mbuf);
-		TAILQ_REMOVE(&vq->zmbuf_list, zmbuf, next);
-	}
-
-	rte_free(vq->zmbufs);
-}
-
 /*
  * when virtio is stopped, qemu will send us the GET_VRING_BASE message.
  */
 static int
-vhost_user_get_vring_base(struct virtio_net *dev,
-			  struct vhost_vring_state *state)
+vhost_user_get_vring_base(struct virtio_dev *dev, struct vhost_vring_state *state)
 {
+	struct vhost_virtqueue *vq;
+	if (dev == NULL)
+		return -1;
+
 	/* We have to stop the queue (virtio) if it is running. */
 	if (dev->flags & VIRTIO_DEV_RUNNING) {
 		dev->flags &= ~VIRTIO_DEV_RUNNING;
-		notify_ops->destroy_device(dev->vid);
+		dev->notify_ops->destroy_device(dev->vid);
 	}
 
+	vq = dev->fn_table.vhost_dev_get_queues(dev, state->index);
+	/* Here we are safe to get the last used index */
+	state->num = vq->last_used_idx;
+
 	/* Here we are safe to get the last used index */
-	state->num = dev->virtqueue[state->index]->last_used_idx;
+	if (dev->fn_table.vhost_dev_get_vring_base)
+		dev->fn_table.vhost_dev_get_vring_base(dev, vq);
 
 	RTE_LOG(INFO, VHOST_CONFIG,
 		"vring base idx:%d file:%d\n", state->index, state->num);
-	/*
-	 * Based on current qemu vhost-user implementation, this message is
-	 * sent and only sent in vhost_vring_stop.
-	 * TODO: cleanup the vring, it isn't usable since here.
-	 */
-	if (dev->virtqueue[state->index]->kickfd >= 0)
-		close(dev->virtqueue[state->index]->kickfd);
-
-	dev->virtqueue[state->index]->kickfd = VIRTIO_UNINITIALIZED_EVENTFD;
-
-	if (dev->tx_zero_copy)
-		free_zmbufs(dev->virtqueue[state->index]);
 
 	return 0;
 }
@@ -740,39 +782,54 @@ vhost_user_get_vring_base(struct virtio_net *dev,
  * enable the virtio queue pair.
  */
 static int
-vhost_user_set_vring_enable(struct virtio_net *dev,
-			    struct vhost_vring_state *state)
+vhost_user_set_vring_enable(struct virtio_dev *dev, struct vhost_vring_state *state)
 {
+	struct vhost_virtqueue *vq;
 	int enable = (int)state->num;
 
+	if (dev == NULL)
+		return -1;
+
 	RTE_LOG(INFO, VHOST_CONFIG,
 		"set queue enable: %d to qp idx: %d\n",
 		enable, state->index);
 
-	if (notify_ops->vring_state_changed)
-		notify_ops->vring_state_changed(dev->vid, state->index, enable);
-
-	dev->virtqueue[state->index]->enabled = enable;
+	if (dev->notify_ops->vring_state_changed)
+		dev->notify_ops->vring_state_changed(dev->vid, state->index, enable);
+	
+	vq = dev->fn_table.vhost_dev_get_queues(dev, state->index);
+	vq->enabled = enable;
 
 	return 0;
 }
 
 static void
-vhost_user_set_protocol_features(struct virtio_net *dev,
-				 uint64_t protocol_features)
+vhost_user_set_protocol_features(struct virtio_dev *dev, uint64_t protocol_features)
 {
-	if (protocol_features & ~VHOST_USER_PROTOCOL_FEATURES)
+	if (dev == NULL)
 		return;
 
-	dev->protocol_features = protocol_features;
+	if (dev->fn_table.vhost_dev_set_protocol_features)
+		dev->fn_table.vhost_dev_set_protocol_features(dev, protocol_features);
+}
+
+static uint64_t
+vhost_user_get_protocol_features(struct virtio_dev *dev)
+{
+	if (dev == NULL)
+		return 0;
+
+	if (dev->fn_table.vhost_dev_get_protocol_features)
+		return dev->fn_table.vhost_dev_get_protocol_features(dev);
+
+	return 0;
 }
 
 static int
-vhost_user_set_log_base(struct virtio_net *dev, struct VhostUserMsg *msg)
+vhost_user_set_log_base(struct virtio_dev *dev, struct VhostUserMsg *msg)
 {
 	int fd = msg->fds[0];
 	uint64_t size, off;
-	void *addr;
 
 	if (fd < 0) {
 		RTE_LOG(ERR, VHOST_CONFIG, "invalid log fd: %d\n", fd);
@@ -792,58 +849,20 @@ vhost_user_set_log_base(struct virtio_net *dev, struct VhostUserMsg *msg)
 		"log mmap size: %"PRId64", offset: %"PRId64"\n",
 		size, off);
 
-	/*
-	 * mmap from 0 to workaround a hugepage mmap bug: mmap will
-	 * fail when offset is not page size aligned.
-	 */
-	addr = mmap(0, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
-	close(fd);
-	if (addr == MAP_FAILED) {
-		RTE_LOG(ERR, VHOST_CONFIG, "mmap log base failed!\n");
-		return -1;
-	}
-
-	/*
-	 * Free previously mapped log memory on occasionally
-	 * multiple VHOST_USER_SET_LOG_BASE.
-	 */
-	if (dev->log_addr) {
-		munmap((void *)(uintptr_t)dev->log_addr, dev->log_size);
-	}
-	dev->log_addr = (uint64_t)(uintptr_t)addr;
-	dev->log_base = dev->log_addr + off;
-	dev->log_size = size;
+	if (dev->fn_table.vhost_dev_set_log_base)
+		return dev->fn_table.vhost_dev_set_log_base(dev, fd, size, off);
 
 	return 0;
 }
 
-/*
- * An rarp packet is constructed and broadcasted to notify switches about
- * the new location of the migrated VM, so that packets from outside will
- * not be lost after migration.
- *
- * However, we don't actually "send" a rarp packet here, instead, we set
- * a flag 'broadcast_rarp' to let rte_vhost_dequeue_burst() inject it.
- */
-static int
-vhost_user_send_rarp(struct virtio_net *dev, struct VhostUserMsg *msg)
+static uint32_t
+vhost_user_get_queue_num(struct virtio_dev *dev)
 {
-	uint8_t *mac = (uint8_t *)&msg->payload.u64;
-
-	RTE_LOG(DEBUG, VHOST_CONFIG,
-		":: mac: %02x:%02x:%02x:%02x:%02x:%02x\n",
-		mac[0], mac[1], mac[2], mac[3], mac[4], mac[5]);
-	memcpy(dev->mac.addr_bytes, mac, 6);
+	if (dev == NULL)
+		return 0;
 
-	/*
-	 * Set the flag to inject a RARP broadcast packet at
-	 * rte_vhost_dequeue_burst().
-	 *
-	 * rte_smp_wmb() is for making sure the mac is copied
-	 * before the flag is set.
-	 */
-	rte_smp_wmb();
-	rte_atomic16_set(&dev->broadcast_rarp, 1);
+	if (dev->fn_table.vhost_dev_get_queue_num)
+		return dev->fn_table.vhost_dev_get_queue_num(dev);
 
 	return 0;
 }
@@ -899,7 +918,7 @@ send_vhost_message(int sockfd, struct VhostUserMsg *msg)
 int
 vhost_user_msg_handler(int vid, int fd)
 {
-	struct virtio_net *dev;
+	struct virtio_dev *dev;
 	struct VhostUserMsg msg;
 	int ret;
 
@@ -926,7 +945,7 @@ vhost_user_msg_handler(int vid, int fd)
 		vhost_message_str[msg.request]);
 	switch (msg.request) {
 	case VHOST_USER_GET_FEATURES:
-		msg.payload.u64 = vhost_user_get_features();
+		msg.payload.u64 = vhost_user_get_features(dev);
 		msg.size = sizeof(msg.payload.u64);
 		send_vhost_message(fd, &msg);
 		break;
@@ -935,7 +954,7 @@ vhost_user_msg_handler(int vid, int fd)
 		break;
 
 	case VHOST_USER_GET_PROTOCOL_FEATURES:
-		msg.payload.u64 = VHOST_USER_PROTOCOL_FEATURES;
+		msg.payload.u64 = vhost_user_get_protocol_features(dev);
 		msg.size = sizeof(msg.payload.u64);
 		send_vhost_message(fd, &msg);
 		break;
@@ -996,7 +1015,7 @@ vhost_user_msg_handler(int vid, int fd)
 		break;
 
 	case VHOST_USER_GET_QUEUE_NUM:
-		msg.payload.u64 = VHOST_MAX_QUEUE_PAIRS;
+		msg.payload.u64 = vhost_user_get_queue_num(dev);
 		msg.size = sizeof(msg.payload.u64);
 		send_vhost_message(fd, &msg);
 		break;
@@ -1014,4 +1033,4 @@ vhost_user_msg_handler(int vid, int fd)
 	}
 
 	return 0;
-}
+}
\ No newline at end of file
diff --git a/lib/librte_vhost/vhost_user.h b/lib/librte_vhost/vhost_user.h
index ba78d32..59f80f2 100644
--- a/lib/librte_vhost/vhost_user.h
+++ b/lib/librte_vhost/vhost_user.h
@@ -38,19 +38,12 @@
 #include <linux/vhost.h>
 
 #include "rte_virtio_net.h"
+#include "rte_virtio_dev.h"
 
 /* refer to hw/virtio/vhost-user.c */
 
 #define VHOST_MEMORY_MAX_NREGIONS 8
 
-#define VHOST_USER_PROTOCOL_F_MQ	0
-#define VHOST_USER_PROTOCOL_F_LOG_SHMFD	1
-#define VHOST_USER_PROTOCOL_F_RARP	2
-
-#define VHOST_USER_PROTOCOL_FEATURES	((1ULL << VHOST_USER_PROTOCOL_F_MQ) | \
-					 (1ULL << VHOST_USER_PROTOCOL_F_LOG_SHMFD) |\
-					 (1ULL << VHOST_USER_PROTOCOL_F_RARP))
-
 typedef enum VhostUserRequest {
 	VHOST_USER_NONE = 0,
 	VHOST_USER_GET_FEATURES = 1,
@@ -117,12 +110,16 @@ typedef struct VhostUserMsg {
 /* The version of the protocol we support */
 #define VHOST_USER_VERSION    0x1
 
-
 /* vhost_user.c */
 int vhost_user_msg_handler(int vid, int fd);
 
 /* socket.c */
 int read_fd_message(int sockfd, char *buf, int buflen, int *fds, int fd_num);
 int send_fd_message(int sockfd, char *buf, int buflen, int *fds, int fd_num);
+void vhost_set_ifname(int vid, const char *if_name, unsigned int if_len);
+int vhost_new_device(int type);
+void vhost_destroy_device(int vid);
+
+struct virtio_dev *get_device(int vid);
 
-#endif
+#endif
\ No newline at end of file
diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
index 277b150..c11e9b2 100644
--- a/lib/librte_vhost/virtio_net.c
+++ b/lib/librte_vhost/virtio_net.c
@@ -45,7 +45,8 @@
 #include <rte_sctp.h>
 #include <rte_arp.h>
 
-#include "vhost.h"
+#include "vhost_net.h"
+#include "vhost_device.h"
 
 #define MAX_PKT_BURST 32
 #define VHOST_LOG_PAGE	4096
@@ -147,7 +148,7 @@ copy_mbuf_to_desc(struct virtio_net *dev, struct vhost_virtqueue *vq,
 	struct virtio_net_hdr_mrg_rxbuf virtio_hdr = {{0, 0, 0, 0, 0, 0}, 0};
 
 	desc = &vq->desc[desc_idx];
-	desc_addr = gpa_to_vva(dev, desc->addr);
+	desc_addr = gpa_to_vva(dev->device, desc->addr);
 	/*
 	 * Checking of 'desc_addr' placed outside of 'unlikely' macro to avoid
 	 * performance issue with some versions of gcc (4.8.4 and 5.3.0) which
@@ -187,7 +188,7 @@ copy_mbuf_to_desc(struct virtio_net *dev, struct vhost_virtqueue *vq,
 				return -1;
 
 			desc = &vq->desc[desc->next];
-			desc_addr = gpa_to_vva(dev, desc->addr);
+			desc_addr = gpa_to_vva(dev->device, desc->addr);
 			if (unlikely(!desc_addr))
 				return -1;
 
@@ -232,7 +233,7 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,
 	LOG_DEBUG(VHOST_DATA, "(%d) %s\n", dev->vid, __func__);
 	if (unlikely(!is_valid_virt_queue_idx(queue_id, 0, dev->virt_qp_nb))) {
 		RTE_LOG(ERR, VHOST_DATA, "(%d) %s: invalid virtqueue idx %d.\n",
-			dev->vid, __func__, queue_id);
+			dev->device->vid, __func__, queue_id);
 		return 0;
 	}
 
@@ -395,7 +396,7 @@ copy_mbuf_to_desc_mergeable(struct virtio_net *dev, struct vhost_virtqueue *vq,
 	LOG_DEBUG(VHOST_DATA, "(%d) current index %d | end index %d\n",
 		dev->vid, cur_idx, end_idx);
 
-	desc_addr = gpa_to_vva(dev, buf_vec[vec_idx].buf_addr);
+	desc_addr = gpa_to_vva(dev->device, buf_vec[vec_idx].buf_addr);
 	if (buf_vec[vec_idx].buf_len < dev->vhost_hlen || !desc_addr)
 		return 0;
 
@@ -432,7 +433,7 @@ copy_mbuf_to_desc_mergeable(struct virtio_net *dev, struct vhost_virtqueue *vq,
 			}
 
 			vec_idx++;
-			desc_addr = gpa_to_vva(dev, buf_vec[vec_idx].buf_addr);
+			desc_addr = gpa_to_vva(dev->device, buf_vec[vec_idx].buf_addr);
 			if (unlikely(!desc_addr))
 				return 0;
 
@@ -487,7 +488,7 @@ virtio_dev_merge_rx(struct virtio_net *dev, uint16_t queue_id,
 	LOG_DEBUG(VHOST_DATA, "(%d) %s\n", dev->vid, __func__);
 	if (unlikely(!is_valid_virt_queue_idx(queue_id, 0, dev->virt_qp_nb))) {
 		RTE_LOG(ERR, VHOST_DATA, "(%d) %s: invalid virtqueue idx %d.\n",
-			dev->vid, __func__, queue_id);
+			dev->device->vid, __func__, queue_id);
 		return 0;
 	}
 
@@ -537,10 +538,12 @@ uint16_t
 rte_vhost_enqueue_burst(int vid, uint16_t queue_id,
 	struct rte_mbuf **pkts, uint16_t count)
 {
-	struct virtio_net *dev = get_device(vid);
+	struct virtio_dev *device = get_device(vid);
+	struct virtio_net *dev;
 
-	if (!dev)
+	if (!device)
 		return 0;
+	dev = get_net_device(device);
 
 	if (dev->features & (1 << VIRTIO_NET_F_MRG_RXBUF))
 		return virtio_dev_merge_rx(dev, queue_id, pkts, count);
@@ -734,7 +737,7 @@ copy_desc_to_mbuf(struct virtio_net *dev, struct vhost_virtqueue *vq,
 	if (unlikely(desc->len < dev->vhost_hlen))
 		return -1;
 
-	desc_addr = gpa_to_vva(dev, desc->addr);
+	desc_addr = gpa_to_vva(dev->device, desc->addr);
 	if (unlikely(!desc_addr))
 		return -1;
 
@@ -771,7 +774,7 @@ copy_desc_to_mbuf(struct virtio_net *dev, struct vhost_virtqueue *vq,
 		   (desc->flags & VRING_DESC_F_NEXT) != 0)) {
 		desc = &vq->desc[desc->next];
 
-		desc_addr = gpa_to_vva(dev, desc->addr);
+		desc_addr = gpa_to_vva(dev->device, desc->addr);
 		if (unlikely(!desc_addr))
 			return -1;
 
@@ -800,7 +803,7 @@ copy_desc_to_mbuf(struct virtio_net *dev, struct vhost_virtqueue *vq,
 		 * copy is enabled.
 		 */
 		if (dev->tx_zero_copy &&
-		    (hpa = gpa_to_hpa(dev, desc->addr + desc_offset, cpy_len))) {
+		    (hpa = gpa_to_hpa(dev->device, desc->addr + desc_offset, cpy_len))) {
 			cur->data_len = cpy_len;
 			cur->data_off = 0;
 			cur->buf_addr = (void *)(uintptr_t)desc_addr;
@@ -833,7 +836,7 @@ copy_desc_to_mbuf(struct virtio_net *dev, struct vhost_virtqueue *vq,
 				return -1;
 			desc = &vq->desc[desc->next];
 
-			desc_addr = gpa_to_vva(dev, desc->addr);
+			desc_addr = gpa_to_vva(dev->device, desc->addr);
 			if (unlikely(!desc_addr))
 				return -1;
 
@@ -924,6 +927,7 @@ uint16_t
 rte_vhost_dequeue_burst(int vid, uint16_t queue_id,
 	struct rte_mempool *mbuf_pool, struct rte_mbuf **pkts, uint16_t count)
 {
+	struct virtio_dev *device;
 	struct virtio_net *dev;
 	struct rte_mbuf *rarp_mbuf = NULL;
 	struct vhost_virtqueue *vq;
@@ -933,13 +937,14 @@ rte_vhost_dequeue_burst(int vid, uint16_t queue_id,
 	uint16_t free_entries;
 	uint16_t avail_idx;
 
-	dev = get_device(vid);
-	if (!dev)
+	device = get_device(vid);
+	if (!device)
 		return 0;
+	dev = get_net_device(device);
 
 	if (unlikely(!is_valid_virt_queue_idx(queue_id, 1, dev->virt_qp_nb))) {
 		RTE_LOG(ERR, VHOST_DATA, "(%d) %s: invalid virtqueue idx %d.\n",
-			dev->vid, __func__, queue_id);
+			dev->device->vid, __func__, queue_id);
 		return 0;
 	}
 
-- 
1.9.3

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [dpdk-dev] [PATCH v2 2/2] vhost: add vhost-scsi support to vhost library
  2016-09-15  0:28 ` [dpdk-dev] [PATCH v2 1/2] " Changpeng Liu
@ 2016-09-15  0:28   ` Changpeng Liu
  2016-09-14  3:28     ` Yuanhan Liu
  0 siblings, 1 reply; 9+ messages in thread
From: Changpeng Liu @ 2016-09-15  0:28 UTC (permalink / raw)
  To: dev; +Cc: yuanhan.liu, james.r.harris

Since we changed the vhost library as a common framework to add other
VIRTIO device type, here we add VIRTIO_ID_SCSI device type to vhost
library to support vhost-scsi target.

Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
---
 lib/librte_vhost/Makefile          |   4 +-
 lib/librte_vhost/rte_virtio_dev.h  |   1 +
 lib/librte_vhost/rte_virtio_scsi.h |  68 +++++++
 lib/librte_vhost/socket.c          |   2 +
 lib/librte_vhost/vhost_device.h    |   3 +
 lib/librte_vhost/vhost_net.c       |   2 +-
 lib/librte_vhost/vhost_scsi.c      | 354 +++++++++++++++++++++++++++++++++++++
 lib/librte_vhost/vhost_scsi.h      |  68 +++++++
 lib/librte_vhost/vhost_user.c      |  31 +++-
 lib/librte_vhost/vhost_user.h      |   5 +
 lib/librte_vhost/virtio_scsi.c     | 145 +++++++++++++++
 11 files changed, 675 insertions(+), 8 deletions(-)
 create mode 100644 lib/librte_vhost/rte_virtio_scsi.h
 create mode 100644 lib/librte_vhost/vhost_scsi.c
 create mode 100644 lib/librte_vhost/vhost_scsi.h
 create mode 100644 lib/librte_vhost/virtio_scsi.c

diff --git a/lib/librte_vhost/Makefile b/lib/librte_vhost/Makefile
index af30491..e8fca35 100644
--- a/lib/librte_vhost/Makefile
+++ b/lib/librte_vhost/Makefile
@@ -48,10 +48,10 @@ endif
 
 # all source are stored in SRCS-y
 SRCS-$(CONFIG_RTE_LIBRTE_VHOST) := fd_man.c socket.c vhost_net.c vhost_user.c \
-				   virtio_net.c
+				   virtio_net.c vhost_scsi.c virtio_scsi.c
 
 # install includes
-SYMLINK-$(CONFIG_RTE_LIBRTE_VHOST)-include += rte_virtio_net.h rte_virtio_dev.h
+SYMLINK-$(CONFIG_RTE_LIBRTE_VHOST)-include += rte_virtio_net.h rte_virtio_scsi.h rte_virtio_dev.h
 
 # dependencies
 DEPDIRS-$(CONFIG_RTE_LIBRTE_VHOST) += lib/librte_eal
diff --git a/lib/librte_vhost/rte_virtio_dev.h b/lib/librte_vhost/rte_virtio_dev.h
index e3c857a..325a208 100644
--- a/lib/librte_vhost/rte_virtio_dev.h
+++ b/lib/librte_vhost/rte_virtio_dev.h
@@ -40,6 +40,7 @@
 #define RTE_VHOST_USER_TX_ZERO_COPY	(1ULL << 2)
 
 #define RTE_VHOST_USER_DEV_NET		(1ULL << 32)
+#define RTE_VHOST_USER_DEV_SCSI		(1ULL << 33)
 
 /**
  * Device and vring operations.
diff --git a/lib/librte_vhost/rte_virtio_scsi.h b/lib/librte_vhost/rte_virtio_scsi.h
new file mode 100644
index 0000000..4e4cec5
--- /dev/null
+++ b/lib/librte_vhost/rte_virtio_scsi.h
@@ -0,0 +1,68 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _VIRTIO_SCSI_H_
+#define _VIRTIO_SCSI_H_
+
+/**
+ * @file
+ * Interface to vhost net
+ */
+
+#include <stdint.h>
+#include <linux/vhost.h>
+#include <linux/virtio_ring.h>
+#include <linux/virtio_scsi.h>
+#include <sys/eventfd.h>
+#include <sys/socket.h>
+
+#include <rte_memory.h>
+#include <rte_mempool.h>
+#include <rte_virtio_dev.h>
+
+enum {DIR_DMA_NONE, DIR_DMA_FROM_DEV, DIR_DMA_TO_DEV};
+
+/* Register callbacks. */
+int rte_vhost_scsi_driver_callback_register(struct virtio_net_device_ops const * const);
+
+int rte_vhost_scsi_pop_request(int vid, uint16_t queue_id,
+	struct virtio_scsi_cmd_req **request,
+	struct virtio_scsi_cmd_resp **response,
+	struct iovec *iovs, int *iov_cnt, uint32_t *desc_idx,
+	uint32_t *xfer_direction);
+
+int rte_vhost_scsi_push_response(int vid, uint16_t queue_id,
+				 uint32_t req_idx, uint32_t len);
+
+
+#endif /* _VIRTIO_SCSI_H_ */
\ No newline at end of file
diff --git a/lib/librte_vhost/socket.c b/lib/librte_vhost/socket.c
index 1474c98..2b3f854 100644
--- a/lib/librte_vhost/socket.c
+++ b/lib/librte_vhost/socket.c
@@ -527,6 +527,8 @@ rte_vhost_driver_register(const char *path, uint64_t flags)
 	}
 
 	vsocket->type = VIRTIO_ID_NET;
+	if (flags & RTE_VHOST_USER_DEV_SCSI)
+		vsocket->type = VIRTIO_ID_SCSI;
 	vhost_user.vsockets[vhost_user.vsocket_cnt++] = vsocket;
 
 out:
diff --git a/lib/librte_vhost/vhost_device.h b/lib/librte_vhost/vhost_device.h
index 7101bb0..f1125e1 100644
--- a/lib/librte_vhost/vhost_device.h
+++ b/lib/librte_vhost/vhost_device.h
@@ -37,6 +37,7 @@
 #include <linux/virtio_ids.h>
 
 #include "vhost_net.h"
+#include "vhost_scsi.h"
 #include "vhost_user.h"
 
 /* Used to indicate that the device is running on a data core */
@@ -109,6 +110,7 @@ struct virtio_dev {
 	uint32_t		dev_type;
 	union {
 		struct virtio_net	net_dev;
+		struct virtio_scsi	scsi_dev;
 	} dev;
 
 	uint32_t		nr_guest_pages;
@@ -120,6 +122,7 @@ struct virtio_dev {
 } __rte_cache_aligned;
 
 extern struct virtio_net_device_ops const *notify_ops;
+extern struct virtio_net_device_ops const *scsi_notify_ops;
 
 /*
  * Define virtio 1.0 for older kernels
diff --git a/lib/librte_vhost/vhost_net.c b/lib/librte_vhost/vhost_net.c
index f141b32..e38e3ac 100644
--- a/lib/librte_vhost/vhost_net.c
+++ b/lib/librte_vhost/vhost_net.c
@@ -518,7 +518,7 @@ vhost_net_device_init(struct virtio_dev *device)
 
 	device->fn_table.vhost_dev_ready  = vhost_dev_is_ready;
 	device->fn_table.vhost_dev_get_queues  = vhost_dev_get_queues;
-	device->fn_table.vhost_dev_cleanup = cleanup_device;
+	device->fn_table.vhost_dev_cleanup  = cleanup_device;
 	device->fn_table.vhost_dev_free  = free_device;
 	device->fn_table.vhost_dev_reset  = reset_device;
 	device->fn_table.vhost_dev_get_features  = vhost_dev_get_features;
diff --git a/lib/librte_vhost/vhost_scsi.c b/lib/librte_vhost/vhost_scsi.c
new file mode 100644
index 0000000..0f14f77
--- /dev/null
+++ b/lib/librte_vhost/vhost_scsi.c
@@ -0,0 +1,354 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <linux/vhost.h>
+#include <linux/virtio_scsi.h>
+#include <stddef.h>
+#include <stdint.h>
+#include <stdlib.h>
+#include <assert.h>
+#include <sys/mman.h>
+#include <unistd.h>
+
+#include <rte_log.h>
+#include <rte_string_fns.h>
+#include <rte_memory.h>
+#include <rte_malloc.h>
+
+#include "vhost_scsi.h"
+#include "vhost_device.h"
+
+/* device ops to add/remove device to/from data core. */
+struct virtio_net_device_ops const *scsi_notify_ops;
+
+/* Features supported by this lib. */
+#define VHOST_SCSI_SUPPORTED_FEATURES ((1ULL << VIRTIO_SCSI_F_INOUT) | \
+				(1ULL << VIRTIO_SCSI_F_HOTPLUG) | \
+				(1ULL << VIRTIO_SCSI_F_CHANGE))
+
+static uint64_t VHOST_SCSI_FEATURES = VHOST_SCSI_SUPPORTED_FEATURES;
+
+struct virtio_scsi *
+get_scsi_device(struct virtio_dev *dev)
+{
+	if (!dev)
+		return NULL;
+ 
+	return &dev->dev.scsi_dev;
+}
+
+static void
+cleanup_vq(struct vhost_virtqueue *vq, int destroy)
+{
+	if ((vq->callfd >= 0) && (destroy != 0))
+		close(vq->callfd);
+	if (vq->kickfd >= 0)
+		close(vq->kickfd);
+}
+
+/*
+ * Unmap any memory, close any file descriptors and
+ * free any memory owned by a device.
+ */
+static void
+cleanup_device(struct virtio_dev *device, int destroy)
+{
+	struct virtio_scsi *dev = get_scsi_device(device);
+	uint32_t i;
+
+	dev->features = 0;
+	dev->protocol_features = 0;
+
+	for (i = 0; i < dev->virt_q_nb; i++) {
+		cleanup_vq(dev->virtqueue[i], destroy);
+	}
+}
+
+static void
+init_vring_queue(struct vhost_virtqueue *vq)
+{
+	memset(vq, 0, sizeof(struct vhost_virtqueue));
+
+	vq->kickfd = VIRTIO_UNINITIALIZED_EVENTFD;
+	vq->callfd = VIRTIO_UNINITIALIZED_EVENTFD;
+
+	/* Backends are set to -1 indicating an inactive device. */
+	vq->backend = -1;
+	vq->enabled = 1;
+}
+
+static void
+reset_vring_queue(struct vhost_virtqueue *vq)
+{
+	int callfd;
+
+	callfd = vq->callfd;
+	init_vring_queue(vq);
+	vq->callfd = callfd;
+}
+
+static int
+alloc_vring_queue(struct virtio_scsi *dev, uint32_t q_idx)
+{
+	struct vhost_virtqueue *virtqueue = NULL;
+
+	virtqueue = rte_malloc(NULL,
+			       sizeof(struct vhost_virtqueue), 0);
+	if (virtqueue == NULL) {
+		RTE_LOG(ERR, VHOST_CONFIG,
+			"Failed to allocate memory for virt qp:%d.\n", q_idx);
+		return -1;
+	}
+
+	dev->virtqueue[q_idx] = virtqueue;
+
+	init_vring_queue(virtqueue);
+
+	dev->virt_q_nb += 1;
+
+	return 0;
+}
+
+/*
+ * Reset some variables in device structure, while keeping few
+ * others untouched, such as vid, ifname, virt_qp_nb: they
+ * should be same unless the device is removed.
+ */
+static void
+reset_device(struct virtio_dev *device)
+{
+	struct virtio_scsi *dev = get_scsi_device(device);
+	uint32_t i;
+
+	for (i = 0; i < dev->virt_q_nb; i++)
+		reset_vring_queue(dev->virtqueue[i]);
+}
+
+/*
+ * Release virtqueues and device memory.
+ */
+static void
+free_device(struct virtio_dev *device)
+{
+	struct virtio_scsi *dev = get_scsi_device(device);
+	uint32_t i;
+
+	for (i = 0; i < dev->virt_q_nb; i++)
+		rte_free(dev->virtqueue[i]);
+
+	rte_free(dev);
+}
+
+static uint64_t
+vhost_dev_get_features(struct virtio_dev *dev)
+{
+	if (dev == NULL)
+		return 0;
+
+	return VHOST_SCSI_FEATURES;	
+}
+
+static int
+vhost_dev_set_features(struct virtio_dev *device, uint64_t features)
+{
+	struct virtio_scsi *dev = get_scsi_device(device);
+
+	if (features & ~VHOST_SCSI_FEATURES)
+		return -1;
+
+	dev->features = features;
+	LOG_DEBUG(VHOST_CONFIG,
+		"(%d) RW %s, Hotplug %s, Status Change %s\n",
+		device->vid,
+		(dev->features & (1ULL << VIRTIO_SCSI_F_INOUT)) ? "on" : "off",
+		(dev->features & (1ULL << VIRTIO_SCSI_F_HOTPLUG)) ? "on" : "off",
+		(dev->features & (1ULL << VIRTIO_SCSI_F_CHANGE)) ? "on" : "off");
+
+	return 0;
+}
+
+static int
+vq_is_ready(struct vhost_virtqueue *vq)
+{
+	return vq && vq->desc   &&
+	       vq->kickfd != VIRTIO_UNINITIALIZED_EVENTFD &&
+	       vq->callfd != VIRTIO_UNINITIALIZED_EVENTFD;
+}
+
+static int
+vhost_dev_is_ready(struct virtio_dev *device)
+{
+	struct virtio_scsi *dev = get_scsi_device(device);
+	struct vhost_virtqueue *vq;
+	uint32_t i;
+
+	for (i = 0; i < dev->virt_q_nb; i++) {
+		vq = dev->virtqueue[i];
+
+		if (!vq_is_ready(vq)) {
+			RTE_LOG(INFO, VHOST_CONFIG,
+				"virtio is not ready for processing.\n");
+			return 0;
+		}
+	}
+
+	RTE_LOG(INFO, VHOST_CONFIG,
+		"virtio is now ready for processing.\n");
+	return 1;
+}
+
+static int
+vhost_dev_set_vring_call(struct virtio_dev *device, struct vhost_vring_file *file)
+{
+	struct virtio_scsi *dev = get_scsi_device(device);
+	struct vhost_virtqueue *vq;
+	uint32_t cur_q_idx;
+
+	/*
+	 * FIXME: VHOST_SET_VRING_CALL is the first per-vring message
+	 * we get, so we do vring queue pair allocation here.
+	 */
+	cur_q_idx = file->index;
+	if (cur_q_idx + 1 > dev->virt_q_nb) {
+		if (alloc_vring_queue(dev, cur_q_idx) < 0)
+			return -1;
+	}
+
+	vq = dev->virtqueue[file->index];
+	assert(vq != NULL);
+
+	if (vq->callfd >= 0)
+		close(vq->callfd);
+
+	vq->callfd = file->fd;
+	return 0;
+}
+
+static uint32_t
+vhost_dev_get_default_queue_num(struct virtio_dev *dev)
+{
+	if (dev == NULL)
+		return 0;
+
+	return VHOST_MAX_SCSI_QUEUES;
+}
+
+static uint32_t
+vhost_dev_get_queue_num(struct virtio_dev *device)
+{
+	struct virtio_scsi *dev;
+	if (device == NULL)
+		return 0;
+
+	dev = get_scsi_device(device);
+	return dev->virt_q_nb;
+}
+
+static int
+vhost_dev_get_vring_base(struct virtio_dev *device, struct vhost_virtqueue *vq)
+{
+	struct virtio_scsi *dev = get_scsi_device(device);
+
+	if (dev == NULL)
+		return 0;
+	/*
+	 * Based on current qemu vhost-user implementation, this message is
+	 * sent and only sent in vhost_vring_stop.
+	 * TODO: cleanup the vring, it isn't usable since here.
+	 */
+	if (vq->kickfd >= 0)
+		close(vq->kickfd);
+	vq->kickfd = VIRTIO_UNINITIALIZED_EVENTFD;
+
+	if (vq->callfd >= 0)
+		close(vq->callfd);
+	vq->callfd = VIRTIO_UNINITIALIZED_EVENTFD;
+
+	return 0;
+}
+
+static struct vhost_virtqueue *
+vhost_dev_get_queues(struct virtio_dev *device, uint16_t queue_id)
+{
+	struct virtio_scsi *dev = get_scsi_device(device);
+	struct vhost_virtqueue *vq;
+
+	vq = dev->virtqueue[queue_id];
+
+	return vq;
+}
+
+int
+vhost_user_scsi_set_endpoint(struct virtio_dev *device, struct VhostUserMsg *pmsg)
+{
+       struct virtio_scsi *dev = get_scsi_device(device);
+
+       if (!dev || pmsg->size != sizeof(dev->scsi_target))
+               return -1;
+
+       memcpy(&dev->scsi_target, &pmsg->payload.scsi_target, sizeof(dev->scsi_target));
+
+       return 0;
+}
+
+void
+vhost_scsi_device_init(struct virtio_dev *device)
+{
+	struct virtio_scsi *dev = get_scsi_device(device);
+
+	device->fn_table.vhost_dev_ready  = vhost_dev_is_ready;
+	device->fn_table.vhost_dev_get_queues  = vhost_dev_get_queues;
+	device->fn_table.vhost_dev_cleanup  = cleanup_device;
+	device->fn_table.vhost_dev_free  = free_device;
+	device->fn_table.vhost_dev_reset  = reset_device;
+	device->fn_table.vhost_dev_get_features  = vhost_dev_get_features;
+	device->fn_table.vhost_dev_set_features  = vhost_dev_set_features;
+	device->fn_table.vhost_dev_get_default_queue_num  = vhost_dev_get_default_queue_num;
+	device->fn_table.vhost_dev_get_queue_num  = vhost_dev_get_queue_num;
+	device->fn_table.vhost_dev_get_vring_base  = vhost_dev_get_vring_base;
+	device->fn_table.vhost_dev_set_vring_call  = vhost_dev_set_vring_call;
+
+	dev->device = device;
+}
+
+
+/*
+ * Register ops so that we can add/remove device to data core.
+ */
+int
+rte_vhost_scsi_driver_callback_register(struct virtio_net_device_ops const * const ops)
+{
+	scsi_notify_ops = ops;
+
+	return 0;
+}
\ No newline at end of file
diff --git a/lib/librte_vhost/vhost_scsi.h b/lib/librte_vhost/vhost_scsi.h
new file mode 100644
index 0000000..4aba9b4
--- /dev/null
+++ b/lib/librte_vhost/vhost_scsi.h
@@ -0,0 +1,68 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _VHOST_SCSI_H_
+#define _VHOST_SCSI_H_
+#include <stdint.h>
+#include <stdio.h>
+#include <sys/types.h>
+#include <unistd.h>
+#include <linux/vhost.h>
+
+#include <rte_log.h>
+
+#include "rte_virtio_scsi.h"
+#include "vhost_user.h"
+
+#define VHOST_MAX_SCSI_QUEUES		0x1
+
+#define VHOST_USER_SCSI_ABI_VERSION	0x1
+
+/**
+ * Device structure contains all configuration information relating
+ * to the device.
+ */
+struct virtio_scsi {
+	uint64_t		features;
+	uint64_t		protocol_features;
+	uint32_t		virt_q_nb;
+	struct vhost_scsi_target scsi_target;
+	struct vhost_virtqueue	*virtqueue[VHOST_MAX_SCSI_QUEUES + 2];
+	struct virtio_dev	*device;
+} __rte_cache_aligned;
+
+struct virtio_scsi *get_scsi_device(struct virtio_dev *dev);
+void vhost_scsi_device_init(struct virtio_dev *device);
+int vhost_user_scsi_set_endpoint(struct virtio_dev *device, struct VhostUserMsg *pmsg);
+
+#endif
\ No newline at end of file
diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
index 90c4b03..320f86b 100644
--- a/lib/librte_vhost/vhost_user.c
+++ b/lib/librte_vhost/vhost_user.c
@@ -50,6 +50,7 @@
 
 #include "vhost_device.h"
 #include "vhost_net.h"
+#include "vhost_scsi.h"
 #include "vhost_user.h"
 
 #define MAX_VHOST_DEVICE        1024
@@ -76,6 +77,9 @@ static const char *vhost_message_str[VHOST_USER_MAX] = {
 	[VHOST_USER_GET_QUEUE_NUM]  = "VHOST_USER_GET_QUEUE_NUM",
 	[VHOST_USER_SET_VRING_ENABLE]  = "VHOST_USER_SET_VRING_ENABLE",
 	[VHOST_USER_SEND_RARP]  = "VHOST_USER_SEND_RARP",
+	[VHOST_USER_SCSI_SET_ENDPOINT] = "VHOST_USER_SCSI_SET_ENDPOINT",
+	[VHOST_USER_SCSI_CLEAR_ENDPOINT] = "VHOST_USER_SCSI_CLEAR_ENDPOINT",
+	[VHOST_USER_SCSI_GET_ABI_VERSION] = "VHOST_USER_SCSI_GET_ABI_VERSION",
 };
 
 static uint64_t
@@ -163,6 +167,12 @@ vhost_new_device(int type)
 			vhost_net_device_init(dev);
 			assert(notify_ops != NULL);
 			break;
+
+		case VIRTIO_ID_SCSI:
+			dev->notify_ops = scsi_notify_ops;
+			vhost_scsi_device_init(dev);
+			assert(scsi_notify_ops != NULL);
+			break;
 		default:
 			return -1;
 	}
@@ -713,6 +723,11 @@ vhost_user_set_vring_call(struct virtio_dev *dev, struct VhostUserMsg *pmsg)
 
 	if (dev->fn_table.vhost_dev_set_vring_call)
 		dev->fn_table.vhost_dev_set_vring_call(dev, &file);
+
+	if (virtio_is_ready(dev) && !(dev->flags & VIRTIO_DEV_RUNNING)) {
+		if (dev->notify_ops->new_device(dev->vid) == 0)
+			dev->flags |= VIRTIO_DEV_RUNNING;
+	}
 }
 
 /*
@@ -740,11 +755,6 @@ vhost_user_set_vring_kick(struct virtio_dev *dev, struct VhostUserMsg *pmsg)
 		close(vq->kickfd);
 
 	vq->kickfd = file.fd;
-
-	if (virtio_is_ready(dev) && !(dev->flags & VIRTIO_DEV_RUNNING)) {
-		if (dev->notify_ops->new_device(dev->vid) == 0)
-			dev->flags |= VIRTIO_DEV_RUNNING;
-	}
 }
 
 /*
@@ -1027,6 +1037,17 @@ vhost_user_msg_handler(int vid, int fd)
 		vhost_user_send_rarp(dev, &msg);
 		break;
 
+	case VHOST_USER_SCSI_SET_ENDPOINT:
+		vhost_user_scsi_set_endpoint(dev, &msg);
+		break;
+	case VHOST_USER_SCSI_CLEAR_ENDPOINT:
+		break;
+	case VHOST_USER_SCSI_GET_ABI_VERSION:
+		msg.payload.s32 = VHOST_USER_SCSI_ABI_VERSION;
+		msg.size = sizeof(msg.payload.s32);
+		send_vhost_message(fd, &msg);
+		break;
+
 	default:
 		break;
 
diff --git a/lib/librte_vhost/vhost_user.h b/lib/librte_vhost/vhost_user.h
index 59f80f2..dfeb27f 100644
--- a/lib/librte_vhost/vhost_user.h
+++ b/lib/librte_vhost/vhost_user.h
@@ -65,6 +65,9 @@ typedef enum VhostUserRequest {
 	VHOST_USER_GET_QUEUE_NUM = 17,
 	VHOST_USER_SET_VRING_ENABLE = 18,
 	VHOST_USER_SEND_RARP = 19,
+	VHOST_USER_SCSI_SET_ENDPOINT = 20,
+	VHOST_USER_SCSI_CLEAR_ENDPOINT = 21,
+	VHOST_USER_SCSI_GET_ABI_VERSION = 22,
 	VHOST_USER_MAX
 } VhostUserRequest;
 
@@ -97,10 +100,12 @@ typedef struct VhostUserMsg {
 #define VHOST_USER_VRING_IDX_MASK   0xff
 #define VHOST_USER_VRING_NOFD_MASK  (0x1<<8)
 		uint64_t u64;
+		int32_t s32;
 		struct vhost_vring_state state;
 		struct vhost_vring_addr addr;
 		VhostUserMemory memory;
 		VhostUserLog    log;
+		struct vhost_scsi_target scsi_target;
 	} payload;
 	int fds[VHOST_MEMORY_MAX_NREGIONS];
 } __attribute((packed)) VhostUserMsg;
diff --git a/lib/librte_vhost/virtio_scsi.c b/lib/librte_vhost/virtio_scsi.c
new file mode 100644
index 0000000..a49f9ce
--- /dev/null
+++ b/lib/librte_vhost/virtio_scsi.c
@@ -0,0 +1,145 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdint.h>
+#include <stdbool.h>
+#include <linux/virtio_ring.h>
+#include <linux/virtio_scsi.h>
+#include <sys/uio.h>
+
+#include <rte_mbuf.h>
+#include <rte_memcpy.h>
+#include <rte_virtio_scsi.h>
+
+#include "vhost_scsi.h"
+#include "vhost_device.h"
+
+int
+rte_vhost_scsi_pop_request(int vid, uint16_t queue_id,
+	struct virtio_scsi_cmd_req **request, struct virtio_scsi_cmd_resp **response, struct iovec *iovs, int *iov_cnt, uint32_t *desc_idx, uint32_t *xfer_direction)
+{
+	struct virtio_dev *device;
+	struct virtio_scsi *dev;
+	struct vhost_virtqueue *vq;
+	struct vring_desc *desc;
+	uint16_t avail_idx;
+	uint32_t resp_idx, data_idx;
+	uint32_t free_entries;
+
+	device = get_device(vid);
+	if (!device)
+		return 0;
+
+	dev = get_scsi_device(device);
+
+	if (queue_id >= dev->virt_q_nb) {
+		return -1;
+	}
+
+	vq = dev->virtqueue[queue_id];
+	if (unlikely(vq->enabled == 0))
+		return 0;
+
+	free_entries = *((volatile uint16_t *)&vq->avail->idx) -
+			vq->last_avail_idx;
+	if (free_entries == 0)
+		return 0;
+
+	LOG_DEBUG(VHOST_DATA, "(%d) %s\n", dev->device->vid, __func__);
+
+#define FLAGS_NEXT 0x1
+#define FLAGS_WRITE 0x2
+
+	/* Prefetch available and used ring */
+	avail_idx = vq->last_avail_idx & (vq->size - 1);
+	rte_prefetch0(&vq->avail->ring[avail_idx]);
+
+	*desc_idx = vq->avail->ring[avail_idx];
+
+	/* 1st descriptor */
+	desc = &vq->desc[*desc_idx];
+	*request = (void *)gpa_to_vva(device, desc->addr);
+	/* 2st descriptor */
+	resp_idx = desc->next;
+	desc = &vq->desc[resp_idx];
+	*response = (void *)gpa_to_vva(device, desc->addr);
+
+	if (desc->flags & FLAGS_NEXT) {
+		data_idx = desc->next;
+		desc = &vq->desc[data_idx];
+		iovs[0].iov_base = (void *)gpa_to_vva(device, desc->addr);
+		iovs[0].iov_len = desc->len;
+		if (desc->flags & FLAGS_WRITE)
+			*xfer_direction = DIR_DMA_FROM_DEV;
+		*iov_cnt = 1;
+	}
+
+	rte_smp_wmb();
+	rte_smp_rmb();
+	vq->last_avail_idx += 1;
+
+	return 1;
+}
+
+int
+rte_vhost_scsi_push_response(int vid, uint16_t queue_id, uint32_t req_idx, uint32_t len)
+{
+	struct virtio_dev *device;
+	struct virtio_scsi *dev;
+	struct vhost_virtqueue *vq;
+
+	device = get_device(vid);
+	if (!device)
+		return 0;
+
+	dev = get_scsi_device(device);
+
+	if (queue_id >= dev->virt_q_nb) {
+		return -1;
+	}
+
+	vq = dev->virtqueue[queue_id];
+	if (unlikely(vq->enabled == 0))
+		return 0;
+
+	vq->used->ring[vq->used->idx & (vq->size - 1)].id = req_idx;
+	vq->used->ring[vq->used->idx & (vq->size - 1)].len = len;
+
+	rte_smp_wmb();
+	rte_smp_rmb();
+	vq->used->idx++;
+
+	eventfd_write(vq->callfd, (eventfd_t)1);
+
+	return 0;
+}
-- 
1.9.3

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2016-09-14  5:48 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-09-14 12:15 [dpdk-dev] [PATCH] vhost: change the vhost library to a common framework which can support more VIRTIO devices Changpeng Liu
2016-09-13 12:58 ` Yuanhan Liu
2016-09-13 13:24   ` Thomas Monjalon
2016-09-13 13:49     ` Yuanhan Liu
2016-09-15  0:28 ` [dpdk-dev] [PATCH v2 1/2] " Changpeng Liu
2016-09-15  0:28   ` [dpdk-dev] [PATCH v2 2/2] vhost: add vhost-scsi support to vhost library Changpeng Liu
2016-09-14  3:28     ` Yuanhan Liu
2016-09-14  4:46       ` Liu, Changpeng
2016-09-14  5:48         ` Yuanhan Liu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).