DPDK patches and discussions
* [dpdk-dev] [PATCH 0/6] Support multiple queues in vhost
@ 2015-05-21  7:49 Ouyang Changchun
  2015-05-21  7:49 ` [dpdk-dev] [PATCH 1/6] ixgbe: Support VMDq RSS in non-SRIOV environment Ouyang Changchun
                   ` (7 more replies)
  0 siblings, 8 replies; 65+ messages in thread
From: Ouyang Changchun @ 2015-05-21  7:49 UTC (permalink / raw)
  To: dev

This patch set supports multiple queues for each virtio device in vhost.
Vhost-user is used to enable the multiple queues feature; it is not yet ready for vhost-cuse.

One prerequisite for enabling this feature is a QEMU patch plus a fix, which must be applied
on QEMU 2.2/2.3; please refer to this link for the details of the patch and the fix:
http://lists.nongnu.org/archive/html/qemu-devel/2015-04/msg00917.html

A formal v3 patch for the code change and the fix will be sent to the QEMU community soon.
 
Basically, the vhost sample leverages VMDq+RSS in HW to receive packets and distribute them
into different queues in the pool according to their 5-tuple.
 
On the other hand, it enables multiple queues mode in the vhost/virtio layer by setting the
queue number to a value larger than 1.
 
The number of HW queues in each pool must exactly match the queue number in each virtio
device, e.g. rxq = 4 means there are 4 HW queues in each VMDq pool and 4 queues in each
virtio device/port, with every queue in the pool mapping to one queue in the virtio device.
 
=========================================
==================|   |==================|
       vport0     |   |      vport1      |
---  ---  ---  ---|   |---  ---  ---  ---|
q0 | q1 | q2 | q3 |   |q0 | q1 | q2 | q3 |
/\= =/\= =/\= =/\=|   |/\= =/\= =/\= =/\=|
||   ||   ||   ||      ||   ||   ||   ||
||   ||   ||   ||      ||   ||   ||   ||
||= =||= =||= =||=|   =||== ||== ||== ||=|
q0 | q1 | q2 | q3 |   |q0 | q1 | q2 | q3 |
 
------------------|   |------------------|
     VMDq pool0   |   |    VMDq pool1    |
==================|   |==================|
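 
A sketch of the indexing convention used throughout this series: with VIRTIO_QNUM == 2
(one RX and one TX virtqueue per queue pair), queue pair q_idx of a device occupies two
slots in the virtqueue array:

    /* Same convention as in the patches below. */
    uint32_t virt_rx_q_idx = q_idx * VIRTIO_QNUM + VIRTIO_RXQ;
    uint32_t virt_tx_q_idx = q_idx * VIRTIO_QNUM + VIRTIO_TXQ;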
 
On the RX side, the sample first polls each queue of the pool, gets the packets from it,
and enqueues them into the corresponding queue in the virtio device/port. On the TX side,
it dequeues packets from each queue of the virtio device/port and sends them to either a
physical port or another virtio device according to the destination MAC address.
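 
As a minimal sketch of that RX-side loop (names such as vdev->vmdq_rx_q and the error
handling are illustrative; the sample's actual worker loop is more involved):

    uint32_t q;
    uint16_t rx_count, enq;
    struct rte_mbuf *pkts[MAX_PKT_BURST];

    for (q = 0; q < rxq; q++) {
        /* Poll HW queue q of this device's VMDq pool... */
        rx_count = rte_eth_rx_burst(ports[0], vdev->vmdq_rx_q + q,
                                    pkts, MAX_PKT_BURST);
        if (rx_count == 0)
            continue;
        /* ...and enqueue into the matching virtio RX virtqueue. */
        enq = rte_vhost_enqueue_burst(dev, q * VIRTIO_QNUM + VIRTIO_RXQ,
                                      pkts, rx_count);
        while (enq < rx_count)            /* drop what did not fit */
            rte_pktmbuf_free(pkts[enq++]);
    }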
 
This series includes a workaround in virtio, as the control queue does not work with
vhost-user multiple queues. The root cause needs further investigation; hopefully it can
be addressed in the next version.

Here is some test guidance.
1. On the host, first mount hugepages, load the uio and igb_uio modules, and bind one NIC
to igb_uio; then run the vhost sample. Key steps as follows:
sudo mount -t hugetlbfs nodev /mnt/huge
sudo modprobe uio
sudo insmod $RTE_SDK/$RTE_TARGET/kmod/igb_uio.ko
 
$RTE_SDK/tools/dpdk_nic_bind.py --bind igb_uio 0000:08:00.0
sudo $RTE_SDK/examples/vhost/build/vhost-switch -c 0xf0 -n 4 --huge-dir /mnt/huge --socket-mem 1024,0 -- -p 1 --vm2vm 0 --dev-basename usvhost --rxq 2

2. After step 1, on the host, modprobe kvm and kvm_intel, and use the QEMU command line to start one guest:
modprobe kvm
modprobe kvm_intel
sudo mount -t hugetlbfs nodev /dev/hugepages -o pagesize=1G

$QEMU_PATH/qemu-system-x86_64 -enable-kvm -m 4096 -object memory-backend-file,id=mem,size=4096M,mem-path=/dev/hugepages,share=on -numa node,memdev=mem -mem-prealloc -smp 10 -cpu core2duo,+sse3,+sse4.1,+sse4.2 -name <vm-name> -drive file=<img-path>/vm.img -chardev socket,id=char0,path=<usvhost-path>/usvhost -netdev type=vhost-user,id=hostnet2,chardev=char0,vhostforce=on,queues=2 -device virtio-net-pci,mq=on,vectors=6,netdev=hostnet2,id=net2,mac=52:54:00:12:34:56,csum=off,gso=off,guest_tso4=off,guest_tso6=off,guest_ecn=off -chardev socket,id=char1,path=<usvhost-path>/usvhost -netdev type=vhost-user,id=hostnet3,chardev=char1,vhostforce=on,queues=2 -device virtio-net-pci,mq=on,vectors=6,netdev=hostnet3,id=net3,mac=52:54:00:12:34:57,csum=off,gso=off,guest_tso4=off,guest_tso6=off,guest_ecn=off

3. Log in to the guest and use testpmd (DPDK-based) to test, using multiple virtio queues to receive and transmit packets.
modprobe uio
insmod $RTE_SDK/$RTE_TARGET/kmod/igb_uio.ko
echo 1024 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages
./tools/dpdk_nic_bind.py --bind igb_uio 00:03.0 00:04.0 
 
$RTE_SDK/$RTE_TARGET/app/testpmd -c 1f -n 4 -- --rxq=2 --txq=2 --nb-cores=4 --rx-queue-stats-mapping="(0,0,0),(0,1,1),(1,0,2),(1,1,3)" --tx-queue-stats-mapping="(0,0,0),(0,1,1),(1,0,2),(1,1,3)" -i --disable-hw-vlan --txqflags 0xf00

4. Use a packet generator to send packets with destination MAC 52:54:00:12:34:57 and
VLAN tag 1001; select IPv4 as the protocol and continuously incrementing IP addresses.

5. Testpmd on the guest should display packets received/transmitted on both queues of each virtio port.

Changchun Ouyang (6):
  ixgbe: Support VMDq RSS in non-SRIOV environment
  lib_vhost: Support multiple queues in virtio dev
  lib_vhost: Set memory layout for multiple queues mode
  vhost: Add new command line option: rxq
  vhost: Support multiple queues
  virtio: Resolve for control queue

 examples/vhost/main.c                         | 199 +++++++++++++++++---------
 lib/librte_ether/rte_ethdev.c                 |  40 ++++++
 lib/librte_pmd_ixgbe/ixgbe_rxtx.c             |  82 +++++++++--
 lib/librte_pmd_virtio/virtio_ethdev.c         |   6 +
 lib/librte_vhost/rte_virtio_net.h             |  25 +++-
 lib/librte_vhost/vhost_cuse/virtio-net-cdev.c |  57 ++++----
 lib/librte_vhost/vhost_rxtx.c                 |  53 +++----
 lib/librte_vhost/vhost_user/vhost-net-user.c  |   4 +-
 lib/librte_vhost/vhost_user/virtio-net-user.c | 156 ++++++++++++++------
 lib/librte_vhost/vhost_user/virtio-net-user.h |   2 +
 lib/librte_vhost/virtio-net.c                 | 158 ++++++++++++--------
 11 files changed, 545 insertions(+), 237 deletions(-)

-- 
1.8.4.2


* [dpdk-dev] [PATCH 1/6] ixgbe: Support VMDq RSS in non-SRIOV environment
  2015-05-21  7:49 [dpdk-dev] [PATCH 0/6] Support multiple queues in vhost Ouyang Changchun
@ 2015-05-21  7:49 ` Ouyang Changchun
  2015-08-24 10:41   ` Qiu, Michael
  2015-05-21  7:49 ` [dpdk-dev] [PATCH 2/6] lib_vhost: Support multiple queues in virtio dev Ouyang Changchun
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 65+ messages in thread
From: Ouyang Changchun @ 2015-05-21  7:49 UTC (permalink / raw)
  To: dev

In a non-SRIOV environment, VMDq RSS can be enabled via the MRQC register.
In theory, the queue number per pool could be 2 or 4, but only 2 queues are
available due to a HW limitation; the same limit also exists in the Linux ixgbe driver.
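 
For reference, a hedged sketch of how an application requests this mode through the ethdev
API (the pool count is illustrative; nb_q_per_pool is then derived from
max_rx_queues / nb_queue_pools, as the hunk below shows):

    struct rte_eth_conf conf;

    memset(&conf, 0, sizeof(conf));
    conf.rxmode.mq_mode = ETH_MQ_RX_VMDQ_RSS;
    /* e.g. 32 pools; per-pool queue count follows from the device's max_rx_queues. */
    conf.rx_adv_conf.vmdq_rx_conf.nb_queue_pools = ETH_32_POOLS;
    conf.rx_adv_conf.rss_conf.rss_hf = ETH_RSS_IP | ETH_RSS_UDP | ETH_RSS_TCP;
    ret = rte_eth_dev_configure(port_id, nb_rx_q, nb_tx_q, &conf);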

Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
---
 lib/librte_ether/rte_ethdev.c     | 40 +++++++++++++++++++
 lib/librte_pmd_ixgbe/ixgbe_rxtx.c | 82 +++++++++++++++++++++++++++++++++------
 2 files changed, 111 insertions(+), 11 deletions(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 024fe8b..6535715 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -933,6 +933,16 @@ rte_eth_dev_check_vf_rss_rxq_num(uint8_t port_id, uint16_t nb_rx_q)
 	return 0;
 }
 
+#define VMDQ_RSS_RX_QUEUE_NUM_MAX 4
+
+static int
+rte_eth_dev_check_vmdq_rss_rxq_num(__rte_unused uint8_t port_id, uint16_t nb_rx_q)
+{
+	if (nb_rx_q > VMDQ_RSS_RX_QUEUE_NUM_MAX)
+		return -EINVAL;
+	return 0;
+}
+
 static int
 rte_eth_dev_check_mq_mode(uint8_t port_id, uint16_t nb_rx_q, uint16_t nb_tx_q,
 		      const struct rte_eth_conf *dev_conf)
@@ -1093,6 +1103,36 @@ rte_eth_dev_check_mq_mode(uint8_t port_id, uint16_t nb_rx_q, uint16_t nb_tx_q,
 				return -EINVAL;
 			}
 		}
+
+		if (dev_conf->rxmode.mq_mode == ETH_MQ_RX_VMDQ_RSS) {
+			uint32_t nb_queue_pools =
+				dev_conf->rx_adv_conf.vmdq_rx_conf.nb_queue_pools;
+			struct rte_eth_dev_info dev_info;
+
+			rte_eth_dev_info_get(port_id, &dev_info);
+			dev->data->dev_conf.rxmode.mq_mode = ETH_MQ_RX_VMDQ_RSS;
+			if (nb_queue_pools == ETH_32_POOLS || nb_queue_pools == ETH_64_POOLS)
+				RTE_ETH_DEV_SRIOV(dev).nb_q_per_pool =
+					dev_info.max_rx_queues/nb_queue_pools;
+			else {
+				PMD_DEBUG_TRACE("ethdev port_id=%d VMDQ "
+						"nb_queue_pools=%d invalid "
+						"in VMDQ RSS\n",
+						port_id,
+						nb_queue_pools);
+				return -EINVAL;
+			}
+
+			if (rte_eth_dev_check_vmdq_rss_rxq_num(port_id,
+				RTE_ETH_DEV_SRIOV(dev).nb_q_per_pool) != 0) {
+				PMD_DEBUG_TRACE("ethdev port_id=%d"
+					" VMDQ RSS active, invalid queue"
+					" number, allowed values"
+					" are 1, 2 or 4\n",
+					port_id);
+				return -EINVAL;
+			}
+		}
 	}
 	return 0;
 }
diff --git a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
index 57c9430..8eb0151 100644
--- a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
+++ b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
@@ -3312,15 +3312,15 @@ void ixgbe_configure_dcb(struct rte_eth_dev *dev)
 }
 
 /*
- * VMDq only support for 10 GbE NIC.
+ * Config pool for VMDq on 10 GbE NIC.
  */
 static void
-ixgbe_vmdq_rx_hw_configure(struct rte_eth_dev *dev)
+ixgbe_vmdq_pool_configure(struct rte_eth_dev *dev)
 {
 	struct rte_eth_vmdq_rx_conf *cfg;
 	struct ixgbe_hw *hw;
 	enum rte_eth_nb_pools num_pools;
-	uint32_t mrqc, vt_ctl, vlanctrl;
+	uint32_t vt_ctl, vlanctrl;
 	uint32_t vmolr = 0;
 	int i;
 
@@ -3329,12 +3329,6 @@ ixgbe_vmdq_rx_hw_configure(struct rte_eth_dev *dev)
 	cfg = &dev->data->dev_conf.rx_adv_conf.vmdq_rx_conf;
 	num_pools = cfg->nb_queue_pools;
 
-	ixgbe_rss_disable(dev);
-
-	/* MRQC: enable vmdq */
-	mrqc = IXGBE_MRQC_VMDQEN;
-	IXGBE_WRITE_REG(hw, IXGBE_MRQC, mrqc);
-
 	/* PFVTCTL: turn on virtualisation and set the default pool */
 	vt_ctl = IXGBE_VT_CTL_VT_ENABLE | IXGBE_VT_CTL_REPLEN;
 	if (cfg->enable_default_pool)
@@ -3401,6 +3395,28 @@ ixgbe_vmdq_rx_hw_configure(struct rte_eth_dev *dev)
 }
 
 /*
+ * VMDq only support for 10 GbE NIC.
+ */
+static void
+ixgbe_vmdq_rx_hw_configure(struct rte_eth_dev *dev)
+{
+	struct ixgbe_hw *hw;
+	uint32_t mrqc;
+
+	PMD_INIT_FUNC_TRACE();
+	hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+	ixgbe_rss_disable(dev);
+
+	/* MRQC: enable vmdq */
+	mrqc = IXGBE_MRQC_VMDQEN;
+	IXGBE_WRITE_REG(hw, IXGBE_MRQC, mrqc);
+	IXGBE_WRITE_FLUSH(hw);
+
+	ixgbe_vmdq_pool_configure(dev);
+}
+
+/*
  * ixgbe_dcb_config_tx_hw_config - Configure general VMDq TX parameters
  * @hw: pointer to hardware structure
  */
@@ -3505,6 +3521,41 @@ ixgbe_config_vf_rss(struct rte_eth_dev *dev)
 }
 
 static int
+ixgbe_config_vmdq_rss(struct rte_eth_dev *dev)
+{
+	struct ixgbe_hw *hw;
+	uint32_t mrqc;
+
+	ixgbe_rss_configure(dev);
+
+	hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+	/* MRQC: enable VMDQ RSS */
+	mrqc = IXGBE_READ_REG(hw, IXGBE_MRQC);
+	mrqc &= ~IXGBE_MRQC_MRQE_MASK;
+
+	switch (RTE_ETH_DEV_SRIOV(dev).nb_q_per_pool) {
+	case 2:
+		mrqc |= IXGBE_MRQC_VMDQRSS64EN;
+		break;
+
+	case 4:
+		mrqc |= IXGBE_MRQC_VMDQRSS32EN;
+		break;
+
+	default:
+		PMD_INIT_LOG(ERR, "Invalid pool number in non-IOV mode with VMDQ RSS");
+		return -EINVAL;
+	}
+
+	IXGBE_WRITE_REG(hw, IXGBE_MRQC, mrqc);
+
+	ixgbe_vmdq_pool_configure(dev);
+
+	return 0;
+}
+
+static int
 ixgbe_config_vf_default(struct rte_eth_dev *dev)
 {
 	struct ixgbe_hw *hw =
@@ -3560,6 +3611,10 @@ ixgbe_dev_mq_rx_configure(struct rte_eth_dev *dev)
 				ixgbe_vmdq_rx_hw_configure(dev);
 				break;
 
+			case ETH_MQ_RX_VMDQ_RSS:
+				ixgbe_config_vmdq_rss(dev);
+				break;
+
 			case ETH_MQ_RX_NONE:
 				/* if mq_mode is none, disable rss mode.*/
 			default: ixgbe_rss_disable(dev);
@@ -4038,6 +4093,8 @@ ixgbe_dev_rx_init(struct rte_eth_dev *dev)
 
 	/* Setup RX queues */
 	for (i = 0; i < dev->data->nb_rx_queues; i++) {
+		uint32_t psrtype = 0;
+
 		rxq = dev->data->rx_queues[i];
 
 		/*
@@ -4065,12 +4122,10 @@ ixgbe_dev_rx_init(struct rte_eth_dev *dev)
 		if (rx_conf->header_split) {
 			if (hw->mac.type == ixgbe_mac_82599EB) {
 				/* Must setup the PSRTYPE register */
-				uint32_t psrtype;
 				psrtype = IXGBE_PSRTYPE_TCPHDR |
 					IXGBE_PSRTYPE_UDPHDR   |
 					IXGBE_PSRTYPE_IPV4HDR  |
 					IXGBE_PSRTYPE_IPV6HDR;
-				IXGBE_WRITE_REG(hw, IXGBE_PSRTYPE(rxq->reg_idx), psrtype);
 			}
 			srrctl = ((rx_conf->split_hdr_size <<
 				IXGBE_SRRCTL_BSIZEHDRSIZE_SHIFT) &
@@ -4080,6 +4135,11 @@ ixgbe_dev_rx_init(struct rte_eth_dev *dev)
 #endif
 			srrctl = IXGBE_SRRCTL_DESCTYPE_ADV_ONEBUF;
 
+		/* Set RQPL for VMDQ RSS according to max Rx queue */
+		psrtype |= (RTE_ETH_DEV_SRIOV(dev).nb_q_per_pool >> 1) <<
+			IXGBE_PSRTYPE_RQPL_SHIFT;
+		IXGBE_WRITE_REG(hw, IXGBE_PSRTYPE(rxq->reg_idx), psrtype);
+
 		/* Set if packets are dropped when no descriptors available */
 		if (rxq->drop_en)
 			srrctl |= IXGBE_SRRCTL_DROP_EN;
-- 
1.8.4.2


* [dpdk-dev] [PATCH 2/6] lib_vhost: Support multiple queues in virtio dev
  2015-05-21  7:49 [dpdk-dev] [PATCH 0/6] Support multiple queues in vhost Ouyang Changchun
  2015-05-21  7:49 ` [dpdk-dev] [PATCH 1/6] ixgbe: Support VMDq RSS in non-SRIOV environment Ouyang Changchun
@ 2015-05-21  7:49 ` Ouyang Changchun
  2015-06-03  2:47   ` Xie, Huawei
  2015-05-21  7:49 ` [dpdk-dev] [PATCH 3/6] lib_vhost: Set memory layout for multiple queues mode Ouyang Changchun
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 65+ messages in thread
From: Ouyang Changchun @ 2015-05-21  7:49 UTC (permalink / raw)
  To: dev

Each virtio device can have multiple queues, say 2 or 4, and at most 8. Enabling this
feature allows the virtio device/port on the guest to use different vCPUs to
receive/transmit packets from/to each queue.

In multiple queues mode, virtio device readiness means that all queues of the virtio
device are ready; likewise, cleaning up/destroying a virtio device requires clearing
all queues belonging to it.
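 
The new rte_vhost_q_num_set() API added below is how an application opts in; a minimal
sketch, assuming it is called before the vhost driver is registered (the socket path is
illustrative):

    /* Ask for 2 queue pairs per virtio device (max is VIRTIO_MAX_VIRTQUEUES). */
    if (rte_vhost_q_num_set(2) != 0)
        rte_exit(EXIT_FAILURE, "invalid vhost queue number\n");
    rte_vhost_driver_register("/tmp/usvhost");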

Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
---
 lib/librte_vhost/rte_virtio_net.h             |  15 ++-
 lib/librte_vhost/vhost_rxtx.c                 |  32 ++++---
 lib/librte_vhost/vhost_user/vhost-net-user.c  |   4 +-
 lib/librte_vhost/vhost_user/virtio-net-user.c |  97 +++++++++++++++----
 lib/librte_vhost/vhost_user/virtio-net-user.h |   2 +
 lib/librte_vhost/virtio-net.c                 | 132 +++++++++++++++++---------
 6 files changed, 201 insertions(+), 81 deletions(-)

diff --git a/lib/librte_vhost/rte_virtio_net.h b/lib/librte_vhost/rte_virtio_net.h
index 5d38185..3e82bef 100644
--- a/lib/librte_vhost/rte_virtio_net.h
+++ b/lib/librte_vhost/rte_virtio_net.h
@@ -59,6 +59,10 @@ struct rte_mbuf;
 /* Backend value set by guest. */
 #define VIRTIO_DEV_STOPPED -1
 
+/**
+ * Maximum number of virtqueues per device.
+ */
+#define VIRTIO_MAX_VIRTQUEUES 8
 
 /* Enum for virtqueue management. */
 enum {VIRTIO_RXQ, VIRTIO_TXQ, VIRTIO_QNUM};
@@ -96,13 +100,14 @@ struct vhost_virtqueue {
  * Device structure contains all configuration information relating to the device.
  */
 struct virtio_net {
-	struct vhost_virtqueue	*virtqueue[VIRTIO_QNUM];	/**< Contains all virtqueue information. */
 	struct virtio_memory	*mem;		/**< QEMU memory and memory region information. */
+	struct vhost_virtqueue	*virtqueue[VIRTIO_QNUM * VIRTIO_MAX_VIRTQUEUES]; /**< Contains all virtqueue information. */
 	uint64_t		features;	/**< Negotiated feature set. */
 	uint64_t		device_fh;	/**< device identifier. */
 	uint32_t		flags;		/**< Device flags. Only used to check if device is running on data core. */
 #define IF_NAME_SZ (PATH_MAX > IFNAMSIZ ? PATH_MAX : IFNAMSIZ)
 	char			ifname[IF_NAME_SZ];	/**< Name of the tap device or socket path. */
+	uint32_t                num_virt_queues;
 	void			*priv;		/**< private context */
 } __rte_cache_aligned;
 
@@ -220,4 +225,12 @@ uint16_t rte_vhost_enqueue_burst(struct virtio_net *dev, uint16_t queue_id,
 uint16_t rte_vhost_dequeue_burst(struct virtio_net *dev, uint16_t queue_id,
 	struct rte_mempool *mbuf_pool, struct rte_mbuf **pkts, uint16_t count);
 
+/**
+ * This function sets the queue number for one vhost device.
+ * @param q_number
+ *  queue number for one vhost device.
+ * @return
+ *  0 on success, -1 if q_number exceeds the max.
+ */
+int rte_vhost_q_num_set(uint32_t q_number);
 #endif /* _VIRTIO_NET_H_ */
diff --git a/lib/librte_vhost/vhost_rxtx.c b/lib/librte_vhost/vhost_rxtx.c
index 4809d32..19f9518 100644
--- a/lib/librte_vhost/vhost_rxtx.c
+++ b/lib/librte_vhost/vhost_rxtx.c
@@ -67,12 +67,12 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,
 	uint8_t success = 0;
 
 	LOG_DEBUG(VHOST_DATA, "(%"PRIu64") virtio_dev_rx()\n", dev->device_fh);
-	if (unlikely(queue_id != VIRTIO_RXQ)) {
-		LOG_DEBUG(VHOST_DATA, "mq isn't supported in this version.\n");
-		return 0;
+	if (unlikely(queue_id >= VIRTIO_QNUM * dev->num_virt_queues)) {
+		LOG_DEBUG(VHOST_DATA, "queue id: %d invalid.\n", queue_id);
+		return -1;
 	}
 
-	vq = dev->virtqueue[VIRTIO_RXQ];
+	vq = dev->virtqueue[queue_id];
 	count = (count > MAX_PKT_BURST) ? MAX_PKT_BURST : count;
 
 	/*
@@ -188,8 +188,9 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,
 }
 
 static inline uint32_t __attribute__((always_inline))
-copy_from_mbuf_to_vring(struct virtio_net *dev, uint16_t res_base_idx,
-	uint16_t res_end_idx, struct rte_mbuf *pkt)
+copy_from_mbuf_to_vring(struct virtio_net *dev, uint16_t queue_id,
+	uint16_t res_base_idx, uint16_t res_end_idx,
+	struct rte_mbuf *pkt)
 {
 	uint32_t vec_idx = 0;
 	uint32_t entry_success = 0;
@@ -217,9 +218,9 @@ copy_from_mbuf_to_vring(struct virtio_net *dev, uint16_t res_base_idx,
 	 * Convert from gpa to vva
 	 * (guest physical addr -> vhost virtual addr)
 	 */
-	vq = dev->virtqueue[VIRTIO_RXQ];
 	vb_addr =
 		gpa_to_vva(dev, vq->buf_vec[vec_idx].buf_addr);
+	vq = dev->virtqueue[queue_id];
 	vb_hdr_addr = vb_addr;
 
 	/* Prefetch buffer address. */
@@ -407,11 +408,12 @@ virtio_dev_merge_rx(struct virtio_net *dev, uint16_t queue_id,
 
 	LOG_DEBUG(VHOST_DATA, "(%"PRIu64") virtio_dev_merge_rx()\n",
 		dev->device_fh);
-	if (unlikely(queue_id != VIRTIO_RXQ)) {
-		LOG_DEBUG(VHOST_DATA, "mq isn't supported in this version.\n");
+	if (unlikely(queue_id >= VIRTIO_QNUM * dev->num_virt_queues)) {
+		LOG_DEBUG(VHOST_DATA, "queue id: %d invalid.\n", queue_id);
+		return -1;
 	}
 
-	vq = dev->virtqueue[VIRTIO_RXQ];
+	vq = dev->virtqueue[queue_id];
 	count = RTE_MIN((uint32_t)MAX_PKT_BURST, count);
 
 	if (count == 0)
@@ -493,7 +495,7 @@ virtio_dev_merge_rx(struct virtio_net *dev, uint16_t queue_id,
 
 		res_end_idx = res_cur_idx;
 
-		entry_success = copy_from_mbuf_to_vring(dev, res_base_idx,
+		entry_success = copy_from_mbuf_to_vring(dev, queue_id, res_base_idx,
 			res_end_idx, pkts[pkt_idx]);
 
 		rte_compiler_barrier();
@@ -543,12 +545,12 @@ rte_vhost_dequeue_burst(struct virtio_net *dev, uint16_t queue_id,
 	uint16_t free_entries, entry_success = 0;
 	uint16_t avail_idx;
 
-	if (unlikely(queue_id != VIRTIO_TXQ)) {
-		LOG_DEBUG(VHOST_DATA, "mq isn't supported in this version.\n");
-		return 0;
+	if (unlikely(queue_id >= VIRTIO_QNUM * dev->num_virt_queues)) {
+		LOG_DEBUG(VHOST_DATA, "queue id:%d invalid.\n", queue_id);
+		return -1;
 	}
 
-	vq = dev->virtqueue[VIRTIO_TXQ];
+	vq = dev->virtqueue[queue_id];
 	avail_idx =  *((volatile uint16_t *)&vq->avail->idx);
 
 	/* If there are no available buffers then return. */
diff --git a/lib/librte_vhost/vhost_user/vhost-net-user.c b/lib/librte_vhost/vhost_user/vhost-net-user.c
index 31f1215..b66a653 100644
--- a/lib/librte_vhost/vhost_user/vhost-net-user.c
+++ b/lib/librte_vhost/vhost_user/vhost-net-user.c
@@ -378,7 +378,9 @@ vserver_message_handler(int connfd, void *dat, int *remove)
 		ops->set_owner(ctx);
 		break;
 	case VHOST_USER_RESET_OWNER:
-		ops->reset_owner(ctx);
+		RTE_LOG(INFO, VHOST_CONFIG,
+			"(%"PRIu64") VHOST_NET_RESET_OWNER\n", ctx.fh);
+		user_reset_owner(ctx, &msg.payload.state);
 		break;
 
 	case VHOST_USER_SET_MEM_TABLE:
diff --git a/lib/librte_vhost/vhost_user/virtio-net-user.c b/lib/librte_vhost/vhost_user/virtio-net-user.c
index c1ffc38..bdb2d40 100644
--- a/lib/librte_vhost/vhost_user/virtio-net-user.c
+++ b/lib/librte_vhost/vhost_user/virtio-net-user.c
@@ -209,22 +209,56 @@ static int
 virtio_is_ready(struct virtio_net *dev)
 {
 	struct vhost_virtqueue *rvq, *tvq;
+	uint32_t q_idx;
 
 	/* mq support in future.*/
-	rvq = dev->virtqueue[VIRTIO_RXQ];
-	tvq = dev->virtqueue[VIRTIO_TXQ];
-	if (rvq && tvq && rvq->desc && tvq->desc &&
-		(rvq->kickfd != (eventfd_t)-1) &&
-		(rvq->callfd != (eventfd_t)-1) &&
-		(tvq->kickfd != (eventfd_t)-1) &&
-		(tvq->callfd != (eventfd_t)-1)) {
-		RTE_LOG(INFO, VHOST_CONFIG,
-			"virtio is now ready for processing.\n");
-		return 1;
+	for (q_idx = 0; q_idx < dev->num_virt_queues; q_idx++) {
+		uint32_t virt_rx_q_idx = q_idx * VIRTIO_QNUM + VIRTIO_RXQ;
+		uint32_t virt_tx_q_idx = q_idx * VIRTIO_QNUM + VIRTIO_TXQ;
+
+		rvq = dev->virtqueue[virt_rx_q_idx];
+		tvq = dev->virtqueue[virt_tx_q_idx];
+		if ((rvq == NULL) || (tvq == NULL) ||
+			(rvq->desc == NULL) || (tvq->desc == NULL) ||
+			(rvq->kickfd == (eventfd_t)-1) ||
+			(rvq->callfd == (eventfd_t)-1) ||
+			(tvq->kickfd == (eventfd_t)-1) ||
+			(tvq->callfd == (eventfd_t)-1)) {
+			RTE_LOG(INFO, VHOST_CONFIG,
+				"virtio isn't ready for processing.\n");
+			return 0;
+		}
 	}
 	RTE_LOG(INFO, VHOST_CONFIG,
-		"virtio isn't ready for processing.\n");
-	return 0;
+		"virtio is now ready for processing.\n");
+	return 1;
+}
+
+static int
+virtio_is_ready_for_reset(struct virtio_net *dev)
+{
+	struct vhost_virtqueue *rvq, *tvq;
+	uint32_t q_idx;
+
+	/* mq support in future.*/
+	for (q_idx = 0; q_idx < dev->num_virt_queues; q_idx++) {
+		uint32_t virt_rx_q_idx = q_idx * VIRTIO_QNUM + VIRTIO_RXQ;
+		uint32_t virt_tx_q_idx = q_idx * VIRTIO_QNUM + VIRTIO_TXQ;
+
+		rvq = dev->virtqueue[virt_rx_q_idx];
+		tvq = dev->virtqueue[virt_tx_q_idx];
+		if ((rvq == NULL) || (tvq == NULL) ||
+			(rvq->kickfd != (eventfd_t)-1) ||
+			(tvq->kickfd != (eventfd_t)-1)) {
+			RTE_LOG(INFO, VHOST_CONFIG,
+				"virtio isn't ready for reset.\n");
+			return 0;
+		}
+	}
+
+	RTE_LOG(INFO, VHOST_CONFIG,
+		"virtio is now ready for reset.\n");
+	return 1;
 }
 
 void
@@ -290,15 +324,42 @@ user_get_vring_base(struct vhost_device_ctx ctx,
 	 * sent and only sent in vhost_vring_stop.
 	 * TODO: cleanup the vring, it isn't usable since here.
 	 */
-	if (((int)dev->virtqueue[VIRTIO_RXQ]->kickfd) >= 0) {
-		close(dev->virtqueue[VIRTIO_RXQ]->kickfd);
-		dev->virtqueue[VIRTIO_RXQ]->kickfd = (eventfd_t)-1;
+	if (((int)dev->virtqueue[state->index]->kickfd) >= 0) {
+		close(dev->virtqueue[state->index]->kickfd);
+		dev->virtqueue[state->index]->kickfd = (eventfd_t)-1;
 	}
-	if (((int)dev->virtqueue[VIRTIO_TXQ]->kickfd) >= 0) {
-		close(dev->virtqueue[VIRTIO_TXQ]->kickfd);
-		dev->virtqueue[VIRTIO_TXQ]->kickfd = (eventfd_t)-1;
+
+	return 0;
+}
+
+/*
+ * when virtio is stopped, qemu will send us the RESET_OWNER message.
+ */
+int
+user_reset_owner(struct vhost_device_ctx ctx,
+	struct vhost_vring_state *state)
+{
+	struct virtio_net *dev = get_device(ctx);
+
+	/* We have to stop the queue (virtio) if it is running. */
+	if (dev->flags & VIRTIO_DEV_RUNNING)
+		notify_ops->destroy_device(dev);
+
+	RTE_LOG(INFO, VHOST_CONFIG,
+		"reset owner --- state idx:%d state num:%d\n", state->index, state->num);
+	/*
+	 * Based on current qemu vhost-user implementation, this message is
+	 * sent and only sent in vhost_net_stop_one.
+	 * TODO: cleanup the vring, it isn't usable since here.
+	 */
+	if (((int)dev->virtqueue[state->index]->kickfd) >= 0) {
+		close(dev->virtqueue[state->index]->kickfd);
+		dev->virtqueue[state->index]->kickfd = (eventfd_t)-1;
 	}
 
+	if (virtio_is_ready_for_reset(dev))
+		ops->reset_owner(ctx);
+
 	return 0;
 }
 
diff --git a/lib/librte_vhost/vhost_user/virtio-net-user.h b/lib/librte_vhost/vhost_user/virtio-net-user.h
index df24860..2429836 100644
--- a/lib/librte_vhost/vhost_user/virtio-net-user.h
+++ b/lib/librte_vhost/vhost_user/virtio-net-user.h
@@ -46,4 +46,6 @@ void user_set_vring_kick(struct vhost_device_ctx, struct VhostUserMsg *);
 int user_get_vring_base(struct vhost_device_ctx, struct vhost_vring_state *);
 
 void user_destroy_device(struct vhost_device_ctx);
+
+int user_reset_owner(struct vhost_device_ctx ctx, struct vhost_vring_state *state);
 #endif
diff --git a/lib/librte_vhost/virtio-net.c b/lib/librte_vhost/virtio-net.c
index 4672e67..680f1b8 100644
--- a/lib/librte_vhost/virtio-net.c
+++ b/lib/librte_vhost/virtio-net.c
@@ -66,9 +66,11 @@ static struct virtio_net_config_ll *ll_root;
 /* Features supported by this lib. */
 #define VHOST_SUPPORTED_FEATURES ((1ULL << VIRTIO_NET_F_MRG_RXBUF) | \
 				(1ULL << VIRTIO_NET_F_CTRL_VQ) | \
-				(1ULL << VIRTIO_NET_F_CTRL_RX))
+				(1ULL << VIRTIO_NET_F_CTRL_RX) | \
+				(1ULL << VIRTIO_NET_F_MQ))
 static uint64_t VHOST_FEATURES = VHOST_SUPPORTED_FEATURES;
 
+static uint32_t q_num = 1;
 
 /*
  * Converts QEMU virtual address to Vhost virtual address. This function is
@@ -177,6 +179,8 @@ add_config_ll_entry(struct virtio_net_config_ll *new_ll_dev)
 static void
 cleanup_device(struct virtio_net *dev)
 {
+	uint32_t q_idx;
+
 	/* Unmap QEMU memory file if mapped. */
 	if (dev->mem) {
 		munmap((void *)(uintptr_t)dev->mem->mapped_address,
@@ -185,14 +189,18 @@ cleanup_device(struct virtio_net *dev)
 	}
 
 	/* Close any event notifiers opened by device. */
-	if ((int)dev->virtqueue[VIRTIO_RXQ]->callfd >= 0)
-		close((int)dev->virtqueue[VIRTIO_RXQ]->callfd);
-	if ((int)dev->virtqueue[VIRTIO_RXQ]->kickfd >= 0)
-		close((int)dev->virtqueue[VIRTIO_RXQ]->kickfd);
-	if ((int)dev->virtqueue[VIRTIO_TXQ]->callfd >= 0)
-		close((int)dev->virtqueue[VIRTIO_TXQ]->callfd);
-	if ((int)dev->virtqueue[VIRTIO_TXQ]->kickfd >= 0)
-		close((int)dev->virtqueue[VIRTIO_TXQ]->kickfd);
+	for (q_idx = 0; q_idx < dev->num_virt_queues; q_idx++) {
+		uint32_t virt_rx_q_idx = q_idx * VIRTIO_QNUM + VIRTIO_RXQ;
+		uint32_t virt_tx_q_idx = q_idx * VIRTIO_QNUM + VIRTIO_TXQ;
+		if ((int)dev->virtqueue[virt_rx_q_idx]->callfd >= 0)
+			close((int)dev->virtqueue[virt_rx_q_idx]->callfd);
+		if ((int)dev->virtqueue[virt_rx_q_idx]->kickfd >= 0)
+			close((int)dev->virtqueue[virt_rx_q_idx]->kickfd);
+		if ((int)dev->virtqueue[virt_tx_q_idx]->callfd >= 0)
+			close((int)dev->virtqueue[virt_tx_q_idx]->callfd);
+		if ((int)dev->virtqueue[virt_tx_q_idx]->kickfd >= 0)
+			close((int)dev->virtqueue[virt_tx_q_idx]->kickfd);
+	}
 }
 
 /*
@@ -201,7 +209,10 @@ cleanup_device(struct virtio_net *dev)
 static void
 free_device(struct virtio_net_config_ll *ll_dev)
 {
-	/* Free any malloc'd memory */
+	/*
+	 * Free any malloc'd memory, just need free once even in multi Q case
+	 * as they are malloc'd once.
+	 */
 	free(ll_dev->dev.virtqueue[VIRTIO_RXQ]);
 	free(ll_dev->dev.virtqueue[VIRTIO_TXQ]);
 	free(ll_dev);
@@ -240,9 +251,10 @@ rm_config_ll_entry(struct virtio_net_config_ll *ll_dev,
  *  Initialise all variables in device structure.
  */
 static void
-init_device(struct virtio_net *dev)
+init_device(struct virtio_net *dev, uint8_t reset_owner)
 {
 	uint64_t vq_offset;
+	uint32_t q_idx;
 
 	/*
 	 * Virtqueues have already been malloced so
@@ -251,19 +263,27 @@ init_device(struct virtio_net *dev)
 	vq_offset = offsetof(struct virtio_net, mem);
 
 	/* Set everything to 0. */
-	memset((void *)(uintptr_t)((uint64_t)(uintptr_t)dev + vq_offset), 0,
-		(sizeof(struct virtio_net) - (size_t)vq_offset));
-	memset(dev->virtqueue[VIRTIO_RXQ], 0, sizeof(struct vhost_virtqueue));
-	memset(dev->virtqueue[VIRTIO_TXQ], 0, sizeof(struct vhost_virtqueue));
-
-	dev->virtqueue[VIRTIO_RXQ]->kickfd = (eventfd_t)-1;
-	dev->virtqueue[VIRTIO_RXQ]->callfd = (eventfd_t)-1;
-	dev->virtqueue[VIRTIO_TXQ]->kickfd = (eventfd_t)-1;
-	dev->virtqueue[VIRTIO_TXQ]->callfd = (eventfd_t)-1;
-
-	/* Backends are set to -1 indicating an inactive device. */
-	dev->virtqueue[VIRTIO_RXQ]->backend = VIRTIO_DEV_STOPPED;
-	dev->virtqueue[VIRTIO_TXQ]->backend = VIRTIO_DEV_STOPPED;
+	if (!reset_owner)
+		memset((void *)(uintptr_t)((uint64_t)(uintptr_t)dev + vq_offset), 0,
+			(sizeof(struct virtio_net) - (size_t)vq_offset));
+
+	dev->num_virt_queues = q_num;
+
+	for (q_idx = 0; q_idx < dev->num_virt_queues; q_idx++) {
+		uint32_t virt_rx_q_idx = q_idx * VIRTIO_QNUM + VIRTIO_RXQ;
+		uint32_t virt_tx_q_idx = q_idx * VIRTIO_QNUM + VIRTIO_TXQ;
+		memset(dev->virtqueue[virt_rx_q_idx], 0, sizeof(struct vhost_virtqueue));
+		memset(dev->virtqueue[virt_tx_q_idx], 0, sizeof(struct vhost_virtqueue));
+
+		dev->virtqueue[virt_rx_q_idx]->kickfd = (eventfd_t)-1;
+		dev->virtqueue[virt_rx_q_idx]->callfd = (eventfd_t)-1;
+		dev->virtqueue[virt_tx_q_idx]->kickfd = (eventfd_t)-1;
+		dev->virtqueue[virt_tx_q_idx]->callfd = (eventfd_t)-1;
+
+		/* Backends are set to -1 indicating an inactive device. */
+		dev->virtqueue[virt_rx_q_idx]->backend = VIRTIO_DEV_STOPPED;
+		dev->virtqueue[virt_tx_q_idx]->backend = VIRTIO_DEV_STOPPED;
+	}
 }
 
 /*
@@ -276,6 +296,7 @@ new_device(struct vhost_device_ctx ctx)
 {
 	struct virtio_net_config_ll *new_ll_dev;
 	struct vhost_virtqueue *virtqueue_rx, *virtqueue_tx;
+	uint32_t q_idx;
 
 	/* Setup device and virtqueues. */
 	new_ll_dev = malloc(sizeof(struct virtio_net_config_ll));
@@ -286,7 +307,7 @@ new_device(struct vhost_device_ctx ctx)
 		return -1;
 	}
 
-	virtqueue_rx = malloc(sizeof(struct vhost_virtqueue));
+	virtqueue_rx = malloc(sizeof(struct vhost_virtqueue) * q_num);
 	if (virtqueue_rx == NULL) {
 		free(new_ll_dev);
 		RTE_LOG(ERR, VHOST_CONFIG,
@@ -295,7 +316,7 @@ new_device(struct vhost_device_ctx ctx)
 		return -1;
 	}
 
-	virtqueue_tx = malloc(sizeof(struct vhost_virtqueue));
+	virtqueue_tx = malloc(sizeof(struct vhost_virtqueue) * q_num);
 	if (virtqueue_tx == NULL) {
 		free(virtqueue_rx);
 		free(new_ll_dev);
@@ -305,11 +326,16 @@ new_device(struct vhost_device_ctx ctx)
 		return -1;
 	}
 
-	new_ll_dev->dev.virtqueue[VIRTIO_RXQ] = virtqueue_rx;
-	new_ll_dev->dev.virtqueue[VIRTIO_TXQ] = virtqueue_tx;
+	memset(new_ll_dev->dev.virtqueue, 0, sizeof(new_ll_dev->dev.virtqueue));
+	for (q_idx = 0; q_idx < q_num; q_idx++) {
+		uint32_t virt_rx_q_idx = q_idx * VIRTIO_QNUM + VIRTIO_RXQ;
+		uint32_t virt_tx_q_idx = q_idx * VIRTIO_QNUM + VIRTIO_TXQ;
+		new_ll_dev->dev.virtqueue[virt_rx_q_idx] = virtqueue_rx + q_idx;
+		new_ll_dev->dev.virtqueue[virt_tx_q_idx] = virtqueue_tx + q_idx;
+	}
 
 	/* Initialise device and virtqueues. */
-	init_device(&new_ll_dev->dev);
+	init_device(&new_ll_dev->dev, 0);
 
 	new_ll_dev->next = NULL;
 
@@ -398,7 +424,7 @@ reset_owner(struct vhost_device_ctx ctx)
 	ll_dev = get_config_ll_entry(ctx);
 
 	cleanup_device(&ll_dev->dev);
-	init_device(&ll_dev->dev);
+	init_device(&ll_dev->dev, 1);
 
 	return 0;
 }
@@ -429,6 +455,7 @@ static int
 set_features(struct vhost_device_ctx ctx, uint64_t *pu)
 {
 	struct virtio_net *dev;
+	uint32_t q_idx;
 
 	dev = get_device(ctx);
 	if (dev == NULL)
@@ -440,22 +467,26 @@ set_features(struct vhost_device_ctx ctx, uint64_t *pu)
 	dev->features = *pu;
 
 	/* Set the vhost_hlen depending on if VIRTIO_NET_F_MRG_RXBUF is set. */
-	if (dev->features & (1 << VIRTIO_NET_F_MRG_RXBUF)) {
-		LOG_DEBUG(VHOST_CONFIG,
-			"(%"PRIu64") Mergeable RX buffers enabled\n",
-			dev->device_fh);
-		dev->virtqueue[VIRTIO_RXQ]->vhost_hlen =
-			sizeof(struct virtio_net_hdr_mrg_rxbuf);
-		dev->virtqueue[VIRTIO_TXQ]->vhost_hlen =
-			sizeof(struct virtio_net_hdr_mrg_rxbuf);
-	} else {
-		LOG_DEBUG(VHOST_CONFIG,
-			"(%"PRIu64") Mergeable RX buffers disabled\n",
-			dev->device_fh);
-		dev->virtqueue[VIRTIO_RXQ]->vhost_hlen =
-			sizeof(struct virtio_net_hdr);
-		dev->virtqueue[VIRTIO_TXQ]->vhost_hlen =
-			sizeof(struct virtio_net_hdr);
+	for (q_idx = 0; q_idx < dev->num_virt_queues; q_idx++) {
+		uint32_t virt_rx_q_idx = q_idx * VIRTIO_QNUM + VIRTIO_RXQ;
+		uint32_t virt_tx_q_idx = q_idx * VIRTIO_QNUM + VIRTIO_TXQ;
+		if (dev->features & (1 << VIRTIO_NET_F_MRG_RXBUF)) {
+			LOG_DEBUG(VHOST_CONFIG,
+				"(%"PRIu64") Mergeable RX buffers enabled\n",
+				dev->device_fh);
+			dev->virtqueue[virt_rx_q_idx]->vhost_hlen =
+				sizeof(struct virtio_net_hdr_mrg_rxbuf);
+			dev->virtqueue[virt_tx_q_idx]->vhost_hlen =
+				sizeof(struct virtio_net_hdr_mrg_rxbuf);
+		} else {
+			LOG_DEBUG(VHOST_CONFIG,
+				"(%"PRIu64") Mergeable RX buffers disabled\n",
+				dev->device_fh);
+			dev->virtqueue[virt_rx_q_idx]->vhost_hlen =
+				sizeof(struct virtio_net_hdr);
+			dev->virtqueue[virt_tx_q_idx]->vhost_hlen =
+				sizeof(struct virtio_net_hdr);
+		}
 	}
 	return 0;
 }
@@ -736,6 +767,15 @@ int rte_vhost_feature_enable(uint64_t feature_mask)
 	return -1;
 }
 
+int rte_vhost_q_num_set(uint32_t q_number)
+{
+	if (q_number > VIRTIO_MAX_VIRTQUEUES)
+		return -1;
+
+	q_num = q_number;
+	return 0;
+}
+
 /*
  * Register ops so that we can add/remove device to data core.
  */
-- 
1.8.4.2


* [dpdk-dev] [PATCH 3/6] lib_vhost: Set memory layout for multiple queues mode
  2015-05-21  7:49 [dpdk-dev] [PATCH 0/6] Support multiple queues in vhost Ouyang Changchun
  2015-05-21  7:49 ` [dpdk-dev] [PATCH 1/6] ixgbe: Support VMDq RSS in non-SRIOV environment Ouyang Changchun
  2015-05-21  7:49 ` [dpdk-dev] [PATCH 2/6] lib_vhost: Support multiple queues in virtio dev Ouyang Changchun
@ 2015-05-21  7:49 ` Ouyang Changchun
  2015-06-02  3:33   ` Xie, Huawei
  2015-05-21  7:49 ` [dpdk-dev] [PATCH 4/6] vhost: Add new command line option: rxq Ouyang Changchun
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 65+ messages in thread
From: Ouyang Changchun @ 2015-05-21  7:49 UTC (permalink / raw)
  To: dev

QEMU sends separate commands, in order, to set the memory layout for each queue in one
virtio device; accordingly, vhost needs to keep the memory layout information for each
queue of the virtio device.

This also requires a small interface change to the gpa_to_vva function: a queue index now
specifies which queue of the device to consult when looking up the vhost virtual address
for an incoming guest physical address.
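 
Callers in the data path now pass the queue-pair index along with the guest physical
address, as the hunks below show; for a virtqueue identified by queue_id, the layout of
its queue pair lives at dev->mem_arr[queue_id / VIRTIO_QNUM]:

    /* Before: buff_addr = gpa_to_vva(dev, desc->addr);
     * After:  the queue-pair index selects the per-queue memory layout. */
    buff_addr = gpa_to_vva(dev, queue_id / VIRTIO_QNUM, desc->addr);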

Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
---
 examples/vhost/main.c                         | 21 +++++-----
 lib/librte_vhost/rte_virtio_net.h             | 10 +++--
 lib/librte_vhost/vhost_cuse/virtio-net-cdev.c | 57 ++++++++++++++------------
 lib/librte_vhost/vhost_rxtx.c                 | 21 +++++-----
 lib/librte_vhost/vhost_user/virtio-net-user.c | 59 ++++++++++++++-------------
 lib/librte_vhost/virtio-net.c                 | 26 +++++++-----
 6 files changed, 106 insertions(+), 88 deletions(-)

diff --git a/examples/vhost/main.c b/examples/vhost/main.c
index 509e9d8..408eb3f 100644
--- a/examples/vhost/main.c
+++ b/examples/vhost/main.c
@@ -1466,11 +1466,11 @@ attach_rxmbuf_zcp(struct virtio_net *dev)
 		desc = &vq->desc[desc_idx];
 		if (desc->flags & VRING_DESC_F_NEXT) {
 			desc = &vq->desc[desc->next];
-			buff_addr = gpa_to_vva(dev, desc->addr);
+			buff_addr = gpa_to_vva(dev, 0, desc->addr);
 			phys_addr = gpa_to_hpa(vdev, desc->addr, desc->len,
 					&addr_type);
 		} else {
-			buff_addr = gpa_to_vva(dev,
+			buff_addr = gpa_to_vva(dev, 0,
 					desc->addr + vq->vhost_hlen);
 			phys_addr = gpa_to_hpa(vdev,
 					desc->addr + vq->vhost_hlen,
@@ -1722,7 +1722,7 @@ virtio_dev_rx_zcp(struct virtio_net *dev, struct rte_mbuf **pkts,
 			rte_pktmbuf_data_len(buff), 0);
 
 		/* Buffer address translation for virtio header. */
-		buff_hdr_addr = gpa_to_vva(dev, desc->addr);
+		buff_hdr_addr = gpa_to_vva(dev, 0, desc->addr);
 		packet_len = rte_pktmbuf_data_len(buff) + vq->vhost_hlen;
 
 		/*
@@ -1946,7 +1946,7 @@ virtio_dev_tx_zcp(struct virtio_net *dev)
 		desc = &vq->desc[desc->next];
 
 		/* Buffer address translation. */
-		buff_addr = gpa_to_vva(dev, desc->addr);
+		buff_addr = gpa_to_vva(dev, 0, desc->addr);
 		/* Need check extra VLAN_HLEN size for inserting VLAN tag */
 		phys_addr = gpa_to_hpa(vdev, desc->addr, desc->len + VLAN_HLEN,
 			&addr_type);
@@ -2604,13 +2604,14 @@ new_device (struct virtio_net *dev)
 	dev->priv = vdev;
 
 	if (zero_copy) {
-		vdev->nregions_hpa = dev->mem->nregions;
-		for (regionidx = 0; regionidx < dev->mem->nregions; regionidx++) {
+		struct virtio_memory *dev_mem = dev->mem_arr[0];
+		vdev->nregions_hpa = dev_mem->nregions;
+		for (regionidx = 0; regionidx < dev_mem->nregions; regionidx++) {
 			vdev->nregions_hpa
 				+= check_hpa_regions(
-					dev->mem->regions[regionidx].guest_phys_address
-					+ dev->mem->regions[regionidx].address_offset,
-					dev->mem->regions[regionidx].memory_size);
+					dev_mem->regions[regionidx].guest_phys_address
+					+ dev_mem->regions[regionidx].address_offset,
+					dev_mem->regions[regionidx].memory_size);
 
 		}
 
@@ -2626,7 +2627,7 @@ new_device (struct virtio_net *dev)
 
 
 		if (fill_hpa_memory_regions(
-			vdev->regions_hpa, dev->mem
+			vdev->regions_hpa, dev_mem
 			) != vdev->nregions_hpa) {
 
 			RTE_LOG(ERR, VHOST_CONFIG,
diff --git a/lib/librte_vhost/rte_virtio_net.h b/lib/librte_vhost/rte_virtio_net.h
index 3e82bef..95cfe18 100644
--- a/lib/librte_vhost/rte_virtio_net.h
+++ b/lib/librte_vhost/rte_virtio_net.h
@@ -100,14 +100,15 @@ struct vhost_virtqueue {
  * Device structure contains all configuration information relating to the device.
  */
 struct virtio_net {
-	struct virtio_memory	*mem;		/**< QEMU memory and memory region information. */
 	struct vhost_virtqueue	*virtqueue[VIRTIO_QNUM * VIRTIO_MAX_VIRTQUEUES]; /**< Contains all virtqueue information. */
+	struct virtio_memory    *mem_arr[VIRTIO_MAX_VIRTQUEUES];        /**< Array for QEMU memory and memory region information. */
 	uint64_t		features;	/**< Negotiated feature set. */
 	uint64_t		device_fh;	/**< device identifier. */
 	uint32_t		flags;		/**< Device flags. Only used to check if device is running on data core. */
 #define IF_NAME_SZ (PATH_MAX > IFNAMSIZ ? PATH_MAX : IFNAMSIZ)
 	char			ifname[IF_NAME_SZ];	/**< Name of the tap device or socket path. */
 	uint32_t                num_virt_queues;
+	uint32_t                mem_idx;        /**< Used when setting the memory layout; unique for each queue within the virtio device. */
 	void			*priv;		/**< private context */
 } __rte_cache_aligned;
 
@@ -158,14 +159,15 @@ rte_vring_available_entries(struct virtio_net *dev, uint16_t queue_id)
  * This is used to convert guest virtio buffer addresses.
  */
 static inline uint64_t __attribute__((always_inline))
-gpa_to_vva(struct virtio_net *dev, uint64_t guest_pa)
+gpa_to_vva(struct virtio_net *dev, uint32_t q_idx, uint64_t guest_pa)
 {
 	struct virtio_memory_regions *region;
+	struct virtio_memory * dev_mem = dev->mem_arr[q_idx];
 	uint32_t regionidx;
 	uint64_t vhost_va = 0;
 
-	for (regionidx = 0; regionidx < dev->mem->nregions; regionidx++) {
-		region = &dev->mem->regions[regionidx];
+	for (regionidx = 0; regionidx < dev_mem->nregions; regionidx++) {
+		region = &dev_mem->regions[regionidx];
 		if ((guest_pa >= region->guest_phys_address) &&
 			(guest_pa <= region->guest_phys_address_end)) {
 			vhost_va = region->address_offset + guest_pa;
diff --git a/lib/librte_vhost/vhost_cuse/virtio-net-cdev.c b/lib/librte_vhost/vhost_cuse/virtio-net-cdev.c
index ae2c3fa..623ed53 100644
--- a/lib/librte_vhost/vhost_cuse/virtio-net-cdev.c
+++ b/lib/librte_vhost/vhost_cuse/virtio-net-cdev.c
@@ -273,28 +273,32 @@ cuse_set_mem_table(struct vhost_device_ctx ctx,
 		((uint64_t)(uintptr_t)mem_regions_addr + size);
 	uint64_t base_address = 0, mapped_address, mapped_size;
 	struct virtio_net *dev;
+	struct virtio_memory * dev_mem = NULL;
 
 	dev = get_device(ctx);
 	if (dev == NULL)
-		return -1;
-
-	if (dev->mem && dev->mem->mapped_address) {
-		munmap((void *)(uintptr_t)dev->mem->mapped_address,
-			(size_t)dev->mem->mapped_size);
-		free(dev->mem);
-		dev->mem = NULL;
+		goto error;
+
+	dev_mem = dev->mem_arr[dev->mem_idx];
+	if (dev_mem && dev_mem->mapped_address) {
+		munmap((void *)(uintptr_t)dev_mem->mapped_address,
+			(size_t)dev_mem->mapped_size);
+		free(dev_mem);
+		dev->mem_arr[dev->mem_idx] = NULL;
 	}
 
-	dev->mem = calloc(1, sizeof(struct virtio_memory) +
+	dev->mem_arr[dev->mem_idx] = calloc(1, sizeof(struct virtio_memory) +
 		sizeof(struct virtio_memory_regions) * nregions);
-	if (dev->mem == NULL) {
+	dev_mem = dev->mem_arr[dev->mem_idx];
+
+	if (dev_mem == NULL) {
 		RTE_LOG(ERR, VHOST_CONFIG,
-			"(%"PRIu64") Failed to allocate memory for dev->mem\n",
-			dev->device_fh);
-		return -1;
+			"(%"PRIu64") Failed to allocate memory for dev->mem_arr[%d]\n",
+			dev->device_fh, dev->mem_idx);
+		goto error;
 	}
 
-	pregion = &dev->mem->regions[0];
+	pregion = &dev_mem->regions[0];
 
 	for (idx = 0; idx < nregions; idx++) {
 		pregion[idx].guest_phys_address =
@@ -320,14 +324,12 @@ cuse_set_mem_table(struct vhost_device_ctx ctx,
 				pregion[idx].userspace_address;
 			/* Map VM memory file */
 			if (host_memory_map(ctx.pid, base_address,
-				&mapped_address, &mapped_size) != 0) {
-				free(dev->mem);
-				dev->mem = NULL;
-				return -1;
-			}
-			dev->mem->mapped_address = mapped_address;
-			dev->mem->base_address = base_address;
-			dev->mem->mapped_size = mapped_size;
+				&mapped_address, &mapped_size) != 0)
+				goto free;
+
+			dev_mem->mapped_address = mapped_address;
+			dev_mem->base_address = base_address;
+			dev_mem->mapped_size = mapped_size;
 		}
 	}
 
@@ -335,9 +337,7 @@ cuse_set_mem_table(struct vhost_device_ctx ctx,
 	if (base_address == 0) {
 		RTE_LOG(ERR, VHOST_CONFIG,
 			"Failed to find base address of qemu memory file.\n");
-		free(dev->mem);
-		dev->mem = NULL;
-		return -1;
+		goto free;
 	}
 
 	valid_regions = nregions;
@@ -369,9 +369,16 @@ cuse_set_mem_table(struct vhost_device_ctx ctx,
 			pregion[idx].userspace_address -
 			pregion[idx].guest_phys_address;
 	}
-	dev->mem->nregions = valid_regions;
 
+	dev_mem->nregions = valid_regions;
+	dev->mem_idx++;
 	return 0;
+
+free:
+	free(dev_mem);
+	dev->mem_arr[dev->mem_idx] = NULL;
+error:
+	return -1;
 }
 
 /*
diff --git a/lib/librte_vhost/vhost_rxtx.c b/lib/librte_vhost/vhost_rxtx.c
index 19f9518..3ed1ae3 100644
--- a/lib/librte_vhost/vhost_rxtx.c
+++ b/lib/librte_vhost/vhost_rxtx.c
@@ -119,7 +119,7 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,
 		buff = pkts[packet_success];
 
 		/* Convert from gpa to vva (guest physical addr -> vhost virtual addr) */
-		buff_addr = gpa_to_vva(dev, desc->addr);
+		buff_addr = gpa_to_vva(dev, queue_id / VIRTIO_QNUM, desc->addr);
 		/* Prefetch buffer address. */
 		rte_prefetch0((void *)(uintptr_t)buff_addr);
 
@@ -135,7 +135,7 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,
 			desc->len = vq->vhost_hlen;
 			desc = &vq->desc[desc->next];
 			/* Buffer address translation. */
-			buff_addr = gpa_to_vva(dev, desc->addr);
+			buff_addr = gpa_to_vva(dev, queue_id / VIRTIO_QNUM, desc->addr);
 			desc->len = rte_pktmbuf_data_len(buff);
 		} else {
 			buff_addr += vq->vhost_hlen;
@@ -218,9 +218,9 @@ copy_from_mbuf_to_vring(struct virtio_net *dev, uint16_t queue_id,
 	 * Convert from gpa to vva
 	 * (guest physical addr -> vhost virtual addr)
 	 */
-	vb_addr =
-		gpa_to_vva(dev, vq->buf_vec[vec_idx].buf_addr);
 	vq = dev->virtqueue[queue_id];
+	vb_addr = gpa_to_vva(dev, queue_id / VIRTIO_QNUM,
+			vq->buf_vec[vec_idx].buf_addr);
 	vb_hdr_addr = vb_addr;
 
 	/* Prefetch buffer address. */
@@ -262,8 +262,8 @@ copy_from_mbuf_to_vring(struct virtio_net *dev, uint16_t queue_id,
 		}
 
 		vec_idx++;
-		vb_addr =
-			gpa_to_vva(dev, vq->buf_vec[vec_idx].buf_addr);
+		vb_addr = gpa_to_vva(dev, queue_id / VIRTIO_QNUM,
+			vq->buf_vec[vec_idx].buf_addr);
 
 		/* Prefetch buffer address. */
 		rte_prefetch0((void *)(uintptr_t)vb_addr);
@@ -308,7 +308,7 @@ copy_from_mbuf_to_vring(struct virtio_net *dev, uint16_t queue_id,
 			}
 
 			vec_idx++;
-			vb_addr = gpa_to_vva(dev,
+			vb_addr = gpa_to_vva(dev, queue_id / VIRTIO_QNUM,
 				vq->buf_vec[vec_idx].buf_addr);
 			vb_offset = 0;
 			vb_avail = vq->buf_vec[vec_idx].buf_len;
@@ -352,7 +352,7 @@ copy_from_mbuf_to_vring(struct virtio_net *dev, uint16_t queue_id,
 
 					/* Get next buffer from buf_vec. */
 					vec_idx++;
-					vb_addr = gpa_to_vva(dev,
+					vb_addr = gpa_to_vva(dev, queue_id / VIRTIO_QNUM,
 						vq->buf_vec[vec_idx].buf_addr);
 					vb_avail =
 						vq->buf_vec[vec_idx].buf_len;
@@ -594,7 +594,7 @@ rte_vhost_dequeue_burst(struct virtio_net *dev, uint16_t queue_id,
 		desc = &vq->desc[desc->next];
 
 		/* Buffer address translation. */
-		vb_addr = gpa_to_vva(dev, desc->addr);
+		vb_addr = gpa_to_vva(dev, queue_id / VIRTIO_QNUM, desc->addr);
 		/* Prefetch buffer address. */
 		rte_prefetch0((void *)(uintptr_t)vb_addr);
 
@@ -700,7 +700,8 @@ rte_vhost_dequeue_burst(struct virtio_net *dev, uint16_t queue_id,
 					desc = &vq->desc[desc->next];
 
 					/* Buffer address translation. */
-					vb_addr = gpa_to_vva(dev, desc->addr);
+					vb_addr = gpa_to_vva(dev,
+						queue_id / VIRTIO_QNUM, desc->addr);
 					/* Prefetch buffer address. */
 					rte_prefetch0((void *)(uintptr_t)vb_addr);
 					vb_offset = 0;
diff --git a/lib/librte_vhost/vhost_user/virtio-net-user.c b/lib/librte_vhost/vhost_user/virtio-net-user.c
index bdb2d40..ffb1dce 100644
--- a/lib/librte_vhost/vhost_user/virtio-net-user.c
+++ b/lib/librte_vhost/vhost_user/virtio-net-user.c
@@ -70,17 +70,17 @@ get_blk_size(int fd)
 }
 
 static void
-free_mem_region(struct virtio_net *dev)
+free_mem_region(struct virtio_memory *dev_mem)
 {
 	struct orig_region_map *region;
 	unsigned int idx;
 	uint64_t alignment;
 
-	if (!dev || !dev->mem)
+	if (!dev_mem)
 		return;
 
-	region = orig_region(dev->mem, dev->mem->nregions);
-	for (idx = 0; idx < dev->mem->nregions; idx++) {
+	region = orig_region(dev_mem, dev_mem->nregions);
+	for (idx = 0; idx < dev_mem->nregions; idx++) {
 		if (region[idx].mapped_address) {
 			alignment = region[idx].blksz;
 			munmap((void *)(uintptr_t)
@@ -103,37 +103,37 @@ user_set_mem_table(struct vhost_device_ctx ctx, struct VhostUserMsg *pmsg)
 	unsigned int idx = 0;
 	struct orig_region_map *pregion_orig;
 	uint64_t alignment;
+	struct virtio_memory *dev_mem = NULL;
 
 	/* unmap old memory regions one by one*/
 	dev = get_device(ctx);
 	if (dev == NULL)
 		return -1;
 
-	/* Remove from the data plane. */
-	if (dev->flags & VIRTIO_DEV_RUNNING)
-		notify_ops->destroy_device(dev);
-
-	if (dev->mem) {
-		free_mem_region(dev);
-		free(dev->mem);
-		dev->mem = NULL;
+	dev_mem = dev->mem_arr[dev->mem_idx];
+	if (dev_mem) {
+		free_mem_region(dev_mem);
+		free(dev_mem);
+		dev->mem_arr[dev->mem_idx] = NULL;
 	}
 
-	dev->mem = calloc(1,
+	dev->mem_arr[dev->mem_idx] = calloc(1,
 		sizeof(struct virtio_memory) +
 		sizeof(struct virtio_memory_regions) * memory.nregions +
 		sizeof(struct orig_region_map) * memory.nregions);
-	if (dev->mem == NULL) {
+
+	dev_mem = dev->mem_arr[dev->mem_idx];
+	if (dev_mem == NULL) {
 		RTE_LOG(ERR, VHOST_CONFIG,
-			"(%"PRIu64") Failed to allocate memory for dev->mem\n",
-			dev->device_fh);
+			"(%"PRIu64") Failed to allocate memory for dev->mem_arr[%d]\n",
+			dev->device_fh, dev->mem_idx);
 		return -1;
 	}
-	dev->mem->nregions = memory.nregions;
+	dev_mem->nregions = memory.nregions;
 
-	pregion_orig = orig_region(dev->mem, memory.nregions);
+	pregion_orig = orig_region(dev_mem, memory.nregions);
 	for (idx = 0; idx < memory.nregions; idx++) {
-		pregion = &dev->mem->regions[idx];
+		pregion = &dev_mem->regions[idx];
 		pregion->guest_phys_address =
 			memory.regions[idx].guest_phys_addr;
 		pregion->guest_phys_address_end =
@@ -175,9 +175,9 @@ user_set_mem_table(struct vhost_device_ctx ctx, struct VhostUserMsg *pmsg)
 			pregion->guest_phys_address;
 
 		if (memory.regions[idx].guest_phys_addr == 0) {
-			dev->mem->base_address =
+			dev_mem->base_address =
 				memory.regions[idx].userspace_addr;
-			dev->mem->mapped_address =
+			dev_mem->mapped_address =
 				pregion->address_offset;
 		}
 
@@ -189,6 +189,7 @@ user_set_mem_table(struct vhost_device_ctx ctx, struct VhostUserMsg *pmsg)
 			 pregion->memory_size);
 	}
 
+	dev->mem_idx++;
 	return 0;
 
 err_mmap:
@@ -200,8 +201,8 @@ err_mmap:
 					alignment));
 		close(pregion_orig[idx].fd);
 	}
-	free(dev->mem);
-	dev->mem = NULL;
+	free(dev_mem);
+	dev->mem_arr[dev->mem_idx] = NULL;
 	return -1;
 }
 
@@ -367,13 +368,15 @@ void
 user_destroy_device(struct vhost_device_ctx ctx)
 {
 	struct virtio_net *dev = get_device(ctx);
+	uint32_t i;
 
 	if (dev && (dev->flags & VIRTIO_DEV_RUNNING))
 		notify_ops->destroy_device(dev);
 
-	if (dev && dev->mem) {
-		free_mem_region(dev);
-		free(dev->mem);
-		dev->mem = NULL;
-	}
+	for (i = 0; i < dev->num_virt_queues; i++)
+		if (dev && dev->mem_arr[i]) {
+			free_mem_region(dev->mem_arr[i]);
+			free(dev->mem_arr[i]);
+			dev->mem_arr[i] = NULL;
+		}
 }
diff --git a/lib/librte_vhost/virtio-net.c b/lib/librte_vhost/virtio-net.c
index 680f1b8..e853ba2 100644
--- a/lib/librte_vhost/virtio-net.c
+++ b/lib/librte_vhost/virtio-net.c
@@ -77,15 +77,16 @@ static uint32_t q_num = 1;
  * used to convert the ring addresses to our address space.
  */
 static uint64_t
-qva_to_vva(struct virtio_net *dev, uint64_t qemu_va)
+qva_to_vva(struct virtio_net *dev, uint32_t q_idx, uint64_t qemu_va)
 {
 	struct virtio_memory_regions *region;
 	uint64_t vhost_va = 0;
 	uint32_t regionidx = 0;
+	struct virtio_memory *dev_mem = dev->mem_arr[q_idx];
 
 	/* Find the region where the address lives. */
-	for (regionidx = 0; regionidx < dev->mem->nregions; regionidx++) {
-		region = &dev->mem->regions[regionidx];
+	for (regionidx = 0; regionidx < dev_mem->nregions; regionidx++) {
+		region = &dev_mem->regions[regionidx];
 		if ((qemu_va >= region->userspace_address) &&
 			(qemu_va <= region->userspace_address +
 			region->memory_size)) {
@@ -182,10 +183,13 @@ cleanup_device(struct virtio_net *dev)
 	uint32_t q_idx;
 
 	/* Unmap QEMU memory file if mapped. */
-	if (dev->mem) {
-		munmap((void *)(uintptr_t)dev->mem->mapped_address,
-			(size_t)dev->mem->mapped_size);
-		free(dev->mem);
+	for (q_idx = 0; q_idx < dev->num_virt_queues; q_idx++) {
+		struct virtio_memory * dev_mem = dev->mem_arr[q_idx];
+		if (dev_mem) {
+			munmap((void *)(uintptr_t)dev_mem->mapped_address,
+				(size_t)dev_mem->mapped_size);
+			free(dev_mem);
+		}
 	}
 
 	/* Close any event notifiers opened by device. */
@@ -260,7 +264,7 @@ init_device(struct virtio_net *dev, uint8_t reset_owner)
 	 * Virtqueues have already been malloced so
 	 * we don't want to set them to NULL.
 	 */
-	vq_offset = offsetof(struct virtio_net, mem);
+	vq_offset = offsetof(struct virtio_net, mem_arr);
 
 	/* Set everything to 0. */
 	if (!reset_owner)
@@ -530,7 +534,7 @@ set_vring_addr(struct vhost_device_ctx ctx, struct vhost_vring_addr *addr)
 
 	/* The addresses are converted from QEMU virtual to Vhost virtual. */
 	vq->desc = (struct vring_desc *)(uintptr_t)qva_to_vva(dev,
-			addr->desc_user_addr);
+			addr->index / VIRTIO_QNUM, addr->desc_user_addr);
 	if (vq->desc == 0) {
 		RTE_LOG(ERR, VHOST_CONFIG,
 			"(%"PRIu64") Failed to find desc ring address.\n",
@@ -539,7 +543,7 @@ set_vring_addr(struct vhost_device_ctx ctx, struct vhost_vring_addr *addr)
 	}
 
 	vq->avail = (struct vring_avail *)(uintptr_t)qva_to_vva(dev,
-			addr->avail_user_addr);
+			addr->index / VIRTIO_QNUM, addr->avail_user_addr);
 	if (vq->avail == 0) {
 		RTE_LOG(ERR, VHOST_CONFIG,
 			"(%"PRIu64") Failed to find avail ring address.\n",
@@ -548,7 +552,7 @@ set_vring_addr(struct vhost_device_ctx ctx, struct vhost_vring_addr *addr)
 	}
 
 	vq->used = (struct vring_used *)(uintptr_t)qva_to_vva(dev,
-			addr->used_user_addr);
+			addr->index / VIRTIO_QNUM, addr->used_user_addr);
 	if (vq->used == 0) {
 		RTE_LOG(ERR, VHOST_CONFIG,
 			"(%"PRIu64") Failed to find used ring address.\n",
-- 
1.8.4.2


* [dpdk-dev] [PATCH 4/6] vhost: Add new command line option: rxq
  2015-05-21  7:49 [dpdk-dev] [PATCH 0/6] Support multiple queues in vhost Ouyang Changchun
                   ` (2 preceding siblings ...)
  2015-05-21  7:49 ` [dpdk-dev] [PATCH 3/6] lib_vhost: Set memory layout for multiple queues mode Ouyang Changchun
@ 2015-05-21  7:49 ` Ouyang Changchun
  2015-05-22  1:39   ` Thomas F Herbert
  2015-05-21  7:49 ` [dpdk-dev] [PATCH 5/6] vhost: Support multiple queues Ouyang Changchun
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 65+ messages in thread
From: Ouyang Changchun @ 2015-05-21  7:49 UTC (permalink / raw)
  To: dev

The vhost sample needs to know the queue number the user wants to enable for each virtio
device, so add the new '--rxq' option for it.
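 
For example, the (slightly abridged) invocation from the cover letter's test guidance
enables two RX queues per vhost device:

    vhost-switch -c 0xf0 -n 4 --socket-mem 1024,0 -- \
        -p 1 --vm2vm 0 --dev-basename usvhost --rxq 2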

Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
---
 examples/vhost/main.c | 46 ++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 42 insertions(+), 4 deletions(-)

diff --git a/examples/vhost/main.c b/examples/vhost/main.c
index 408eb3f..16d4463 100644
--- a/examples/vhost/main.c
+++ b/examples/vhost/main.c
@@ -163,6 +163,9 @@ static int mergeable;
 /* Do vlan strip on host, enabled on default */
 static uint32_t vlan_strip = 1;
 
+/* Rx queue number per virtio device */
+static uint32_t rxq = 1;
+
 /* number of descriptors to apply*/
 static uint32_t num_rx_descriptor = RTE_TEST_RX_DESC_DEFAULT_ZCP;
 static uint32_t num_tx_descriptor = RTE_TEST_TX_DESC_DEFAULT_ZCP;
@@ -408,8 +411,14 @@ port_init(uint8_t port)
 		txconf->tx_deferred_start = 1;
 	}
 
-	/*configure the number of supported virtio devices based on VMDQ limits */
-	num_devices = dev_info.max_vmdq_pools;
+	/* Configure the virtio devices num based on VMDQ limits */
+	switch (rxq) {
+	case 1:
+	case 2: num_devices = dev_info.max_vmdq_pools;
+		break;
+	case 4: num_devices = dev_info.max_vmdq_pools / 2;
+		break;
+	}
 
 	if (zero_copy) {
 		rx_ring_size = num_rx_descriptor;
@@ -431,7 +440,7 @@ port_init(uint8_t port)
 		return retval;
 	/* NIC queues are divided into pf queues and vmdq queues.  */
 	num_pf_queues = dev_info.max_rx_queues - dev_info.vmdq_queue_num;
-	queues_per_pool = dev_info.vmdq_queue_num / dev_info.max_vmdq_pools;
+	queues_per_pool = dev_info.vmdq_queue_num / num_devices;
 	num_vmdq_queues = num_devices * queues_per_pool;
 	num_queues = num_pf_queues + num_vmdq_queues;
 	vmdq_queue_base = dev_info.vmdq_queue_base;
@@ -576,7 +585,8 @@ us_vhost_usage(const char *prgname)
 	"		--rx-desc-num [0-N]: the number of descriptors on rx, "
 			"used only when zero copy is enabled.\n"
 	"		--tx-desc-num [0-N]: the number of descriptors on tx, "
-			"used only when zero copy is enabled.\n",
+			"used only when zero copy is enabled.\n"
+	"		--rxq [1,2,4]: rx queue number for each vhost device\n",
 	       prgname);
 }
 
@@ -602,6 +612,7 @@ us_vhost_parse_args(int argc, char **argv)
 		{"zero-copy", required_argument, NULL, 0},
 		{"rx-desc-num", required_argument, NULL, 0},
 		{"tx-desc-num", required_argument, NULL, 0},
+		{"rxq", required_argument, NULL, 0},
 		{NULL, 0, 0, 0},
 	};
 
@@ -778,6 +789,20 @@ us_vhost_parse_args(int argc, char **argv)
 				}
 			}
 
+			/* Specify the Rx queue number for each vhost dev. */
+			if (!strncmp(long_option[option_index].name,
+				"rxq", MAX_LONG_OPT_SZ)) {
+				ret = parse_num_opt(optarg, 4);
+				if ((ret == -1) || (!POWEROF2(ret))) {
+					RTE_LOG(INFO, VHOST_CONFIG,
+					"Invalid argument for rxq [1,2,4], "
+					"power of 2 required.\n");
+					us_vhost_usage(prgname);
+					return -1;
+				} else {
+					rxq = ret;
+				}
+			}
 			break;
 
 			/* Invalid option - print options. */
@@ -813,6 +838,19 @@ us_vhost_parse_args(int argc, char **argv)
 		return -1;
 	}
 
+	if (rxq > 1) {
+		vmdq_conf_default.rxmode.mq_mode = ETH_MQ_RX_VMDQ_RSS;
+		vmdq_conf_default.rx_adv_conf.rss_conf.rss_hf = ETH_RSS_IP |
+				ETH_RSS_UDP | ETH_RSS_TCP | ETH_RSS_SCTP;
+	}
+
+	if ((zero_copy == 1) && (rxq > 1)) {
+		RTE_LOG(INFO, VHOST_PORT,
+			"Vhost zero copy doesn't support mq mode, "
+			"please specify '--rxq 1' to disable it.\n");
+		return -1;
+	}
+
 	return 0;
 }
 
-- 
1.8.4.2


* [dpdk-dev] [PATCH 5/6] vhost: Support multiple queues
  2015-05-21  7:49 [dpdk-dev] [PATCH 0/6] Support multiple queues in vhost Ouyang Changchun
                   ` (3 preceding siblings ...)
  2015-05-21  7:49 ` [dpdk-dev] [PATCH 4/6] vhost: Add new command line option: rxq Ouyang Changchun
@ 2015-05-21  7:49 ` Ouyang Changchun
  2015-05-21  7:49 ` [dpdk-dev] [PATCH 6/6] virtio: Resolve for control queue Ouyang Changchun
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 65+ messages in thread
From: Ouyang Changchun @ 2015-05-21  7:49 UTC (permalink / raw)
  To: dev

The vhost sample leverages VMDq+RSS in HW to receive packets and distribute them into
different queues in the pool according to the 5-tuple.

It also enables multiple queues mode in the vhost/virtio layer.

The number of HW queues in each pool exactly matches the queue number in each virtio
device, e.g. rxq = 4 means 4 HW queues in each VMDq pool and 4 queues in each virtio
device/port, one mapping to each.

=========================================
==================|   |==================|
       vport0     |   |      vport1      |
---  ---  ---  ---|   |---  ---  ---  ---|
q0 | q1 | q2 | q3 |   |q0 | q1 | q2 | q3 |
/\= =/\= =/\= =/\=|   |/\= =/\= =/\= =/\=|
||   ||   ||   ||      ||   ||   ||   ||
||   ||   ||   ||      ||   ||   ||   ||
||= =||= =||= =||=|   =||== ||== ||== ||=|
q0 | q1 | q2 | q3 |   |q0 | q1 | q2 | q3 |

------------------|   |------------------|
     VMDq pool0   |   |    VMDq pool1    |
==================|   |==================|

On the RX side, it polls each queue of the pool, gets the packets from it,
and enqueues them into the corresponding queue in the virtio device/port.
On the TX side, it dequeues packets from each queue of the virtio device/port
and sends them to either a physical port or another virtio device according
to the destination MAC address.
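
To make the mapping concrete, here is a minimal sketch of the index
arithmetic the patch relies on (illustrative helpers only, not part of
the patch; they assume the vmdq_rx_q field and the VIRTIO_RXQ/VIRTIO_TXQ/
VIRTIO_QNUM definitions used in examples/vhost/main.c):

    /* Pool queue i of a vhost device maps to this NIC queue ... */
    static inline uint16_t
    nic_rx_queue(struct vhost_dev *vdev, uint32_t i)
    {
        /* vmdq_rx_q is the first HW queue of this device's VMDq pool */
        return (uint16_t)(vdev->vmdq_rx_q + i);
    }

    /* ... and to these virtio ring indices: each queue pair occupies
     * VIRTIO_QNUM (2) consecutive ring slots. */
    static inline uint16_t
    virtio_rx_index(uint32_t i)
    {
        return (uint16_t)(VIRTIO_RXQ + i * VIRTIO_QNUM);
    }

    static inline uint16_t
    virtio_tx_index(uint32_t i)
    {
        return (uint16_t)(VIRTIO_TXQ + i * VIRTIO_QNUM);
    }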

Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
---
 examples/vhost/main.c | 132 ++++++++++++++++++++++++++++++--------------------
 1 file changed, 79 insertions(+), 53 deletions(-)

diff --git a/examples/vhost/main.c b/examples/vhost/main.c
index 16d4463..0a33e57 100644
--- a/examples/vhost/main.c
+++ b/examples/vhost/main.c
@@ -998,8 +998,9 @@ link_vmdq(struct vhost_dev *vdev, struct rte_mbuf *m)
 
 	/* Enable stripping of the vlan tag as we handle routing. */
 	if (vlan_strip)
-		rte_eth_dev_set_vlan_strip_on_queue(ports[0],
-			(uint16_t)vdev->vmdq_rx_q, 1);
+		for (i = 0; i < (int)rxq; i++)
+			rte_eth_dev_set_vlan_strip_on_queue(ports[0],
+				(uint16_t)(vdev->vmdq_rx_q + i), 1);
 
 	/* Set device as ready for RX. */
 	vdev->ready = DEVICE_RX;
@@ -1014,7 +1015,7 @@ link_vmdq(struct vhost_dev *vdev, struct rte_mbuf *m)
 static inline void
 unlink_vmdq(struct vhost_dev *vdev)
 {
-	unsigned i = 0;
+	unsigned i = 0, j = 0;
 	unsigned rx_count;
 	struct rte_mbuf *pkts_burst[MAX_PKT_BURST];
 
@@ -1027,15 +1028,19 @@ unlink_vmdq(struct vhost_dev *vdev)
 		vdev->vlan_tag = 0;
 
 		/*Clear out the receive buffers*/
-		rx_count = rte_eth_rx_burst(ports[0],
-					(uint16_t)vdev->vmdq_rx_q, pkts_burst, MAX_PKT_BURST);
+		for (i = 0; i < rxq; i++) {
+			rx_count = rte_eth_rx_burst(ports[0],
+					(uint16_t)(vdev->vmdq_rx_q + i),
+					pkts_burst, MAX_PKT_BURST);
 
-		while (rx_count) {
-			for (i = 0; i < rx_count; i++)
-				rte_pktmbuf_free(pkts_burst[i]);
+			while (rx_count) {
+				for (j = 0; j < rx_count; j++)
+					rte_pktmbuf_free(pkts_burst[j]);
 
-			rx_count = rte_eth_rx_burst(ports[0],
-					(uint16_t)vdev->vmdq_rx_q, pkts_burst, MAX_PKT_BURST);
+				rx_count = rte_eth_rx_burst(ports[0],
+					(uint16_t)(vdev->vmdq_rx_q + i),
+					pkts_burst, MAX_PKT_BURST);
+			}
 		}
 
 		vdev->ready = DEVICE_MAC_LEARNING;
@@ -1047,7 +1052,7 @@ unlink_vmdq(struct vhost_dev *vdev)
  * the packet on that devices RX queue. If not then return.
  */
 static inline int __attribute__((always_inline))
-virtio_tx_local(struct vhost_dev *vdev, struct rte_mbuf *m)
+virtio_tx_local(struct vhost_dev *vdev, struct rte_mbuf *m, uint32_t q_idx)
 {
 	struct virtio_net_data_ll *dev_ll;
 	struct ether_hdr *pkt_hdr;
@@ -1062,7 +1067,7 @@ virtio_tx_local(struct vhost_dev *vdev, struct rte_mbuf *m)
 
 	while (dev_ll != NULL) {
 		if ((dev_ll->vdev->ready == DEVICE_RX) && ether_addr_cmp(&(pkt_hdr->d_addr),
-				          &dev_ll->vdev->mac_address)) {
+					&dev_ll->vdev->mac_address)) {
 
 			/* Drop the packet if the TX packet is destined for the TX device. */
 			if (dev_ll->vdev->dev->device_fh == dev->device_fh) {
@@ -1080,7 +1085,9 @@ virtio_tx_local(struct vhost_dev *vdev, struct rte_mbuf *m)
 				LOG_DEBUG(VHOST_DATA, "(%"PRIu64") Device is marked for removal\n", tdev->device_fh);
 			} else {
 				/*send the packet to the local virtio device*/
-				ret = rte_vhost_enqueue_burst(tdev, VIRTIO_RXQ, &m, 1);
+				ret = rte_vhost_enqueue_burst(tdev,
+					VIRTIO_RXQ + q_idx * VIRTIO_QNUM,
+					&m, 1);
 				if (enable_stats) {
 					rte_atomic64_add(
 					&dev_statistics[tdev->device_fh].rx_total_atomic,
@@ -1157,7 +1164,8 @@ find_local_dest(struct virtio_net *dev, struct rte_mbuf *m,
  * or the physical port.
  */
 static inline void __attribute__((always_inline))
-virtio_tx_route(struct vhost_dev *vdev, struct rte_mbuf *m, uint16_t vlan_tag)
+virtio_tx_route(struct vhost_dev *vdev, struct rte_mbuf *m,
+		uint16_t vlan_tag, uint32_t q_idx)
 {
 	struct mbuf_table *tx_q;
 	struct rte_mbuf **m_table;
@@ -1167,7 +1175,8 @@ virtio_tx_route(struct vhost_dev *vdev, struct rte_mbuf *m, uint16_t vlan_tag)
 	struct ether_hdr *nh;
 
 	/*check if destination is local VM*/
-	if ((vm2vm_mode == VM2VM_SOFTWARE) && (virtio_tx_local(vdev, m) == 0)) {
+	if ((vm2vm_mode == VM2VM_SOFTWARE) &&
+		(virtio_tx_local(vdev, m, q_idx) == 0)) {
 		rte_pktmbuf_free(m);
 		return;
 	}
@@ -1331,49 +1340,60 @@ switch_worker(__attribute__((unused)) void *arg)
 			}
 			if (likely(vdev->ready == DEVICE_RX)) {
 				/*Handle guest RX*/
-				rx_count = rte_eth_rx_burst(ports[0],
-					vdev->vmdq_rx_q, pkts_burst, MAX_PKT_BURST);
+				for (i = 0; i < rxq; i ++) {
+					rx_count = rte_eth_rx_burst(ports[0],
+						vdev->vmdq_rx_q + i, pkts_burst, MAX_PKT_BURST);
 
-				if (rx_count) {
-					/*
-					* Retry is enabled and the queue is full then we wait and retry to avoid packet loss
-					* Here MAX_PKT_BURST must be less than virtio queue size
-					*/
-					if (enable_retry && unlikely(rx_count > rte_vring_available_entries(dev, VIRTIO_RXQ))) {
-						for (retry = 0; retry < burst_rx_retry_num; retry++) {
-							rte_delay_us(burst_rx_delay_time);
-							if (rx_count <= rte_vring_available_entries(dev, VIRTIO_RXQ))
-								break;
+					if (rx_count) {
+						/*
+						* If retry is enabled and the queue is full, then we wait and retry to avoid packet loss
+						* Here MAX_PKT_BURST must be less than virtio queue size
+						*/
+						if (enable_retry && unlikely(rx_count > rte_vring_available_entries(dev,
+											VIRTIO_RXQ + i * VIRTIO_QNUM))) {
+							for (retry = 0; retry < burst_rx_retry_num; retry++) {
+								rte_delay_us(burst_rx_delay_time);
+								if (rx_count <= rte_vring_available_entries(dev,
+											VIRTIO_RXQ + i * VIRTIO_QNUM))
+									break;
+							}
+						}
+						ret_count = rte_vhost_enqueue_burst(dev, VIRTIO_RXQ + i * VIRTIO_QNUM,
+											pkts_burst, rx_count);
+						if (enable_stats) {
+							rte_atomic64_add(
+							&dev_statistics[dev_ll->vdev->dev->device_fh].rx_total_atomic,
+							rx_count);
+							rte_atomic64_add(
+							&dev_statistics[dev_ll->vdev->dev->device_fh].rx_atomic, ret_count);
+						}
+						while (likely(rx_count)) {
+							rx_count--;
+							rte_pktmbuf_free(pkts_burst[rx_count]);
 						}
 					}
-					ret_count = rte_vhost_enqueue_burst(dev, VIRTIO_RXQ, pkts_burst, rx_count);
-					if (enable_stats) {
-						rte_atomic64_add(
-						&dev_statistics[dev_ll->vdev->dev->device_fh].rx_total_atomic,
-						rx_count);
-						rte_atomic64_add(
-						&dev_statistics[dev_ll->vdev->dev->device_fh].rx_atomic, ret_count);
-					}
-					while (likely(rx_count)) {
-						rx_count--;
-						rte_pktmbuf_free(pkts_burst[rx_count]);
-					}
-
 				}
 			}
 
 			if (likely(!vdev->remove)) {
 				/* Handle guest TX*/
-				tx_count = rte_vhost_dequeue_burst(dev, VIRTIO_TXQ, mbuf_pool, pkts_burst, MAX_PKT_BURST);
-				/* If this is the first received packet we need to learn the MAC and setup VMDQ */
-				if (unlikely(vdev->ready == DEVICE_MAC_LEARNING) && tx_count) {
-					if (vdev->remove || (link_vmdq(vdev, pkts_burst[0]) == -1)) {
-						while (tx_count)
-							rte_pktmbuf_free(pkts_burst[--tx_count]);
+				for (i = 0; i < rxq; i++) {
+					tx_count = rte_vhost_dequeue_burst(dev, VIRTIO_TXQ + i * VIRTIO_QNUM,
+							mbuf_pool, pkts_burst, MAX_PKT_BURST);
+					/*
+					 * If this is the first received packet we need to learn
+					 * the MAC and setup VMDQ
+					 */
+					if (unlikely(vdev->ready == DEVICE_MAC_LEARNING) && tx_count) {
+						if (vdev->remove || (link_vmdq(vdev, pkts_burst[0]) == -1)) {
+							while (tx_count)
+								rte_pktmbuf_free(pkts_burst[--tx_count]);
+						}
 					}
+					while (tx_count)
+						virtio_tx_route(vdev, pkts_burst[--tx_count],
+								(uint16_t)dev->device_fh, i);
 				}
-				while (tx_count)
-					virtio_tx_route(vdev, pkts_burst[--tx_count], (uint16_t)dev->device_fh);
 			}
 
 			/*move to the next device in the list*/
@@ -2677,12 +2697,12 @@ new_device (struct virtio_net *dev)
 		}
 	}
 
-
 	/* Add device to main ll */
 	ll_dev = get_data_ll_free_entry(&ll_root_free);
 	if (ll_dev == NULL) {
-		RTE_LOG(INFO, VHOST_DATA, "(%"PRIu64") No free entry found in linked list. Device limit "
-			"of %d devices per core has been reached\n",
+		RTE_LOG(INFO, VHOST_DATA,
+			"(%"PRIu64") No free entry found in linked list."
+			"Device limit of %d devices per core has been reached\n",
 			dev->device_fh, num_devices);
 		if (vdev->regions_hpa)
 			rte_free(vdev->regions_hpa);
@@ -2691,8 +2711,12 @@ new_device (struct virtio_net *dev)
 	}
 	ll_dev->vdev = vdev;
 	add_data_ll_entry(&ll_root_used, ll_dev);
-	vdev->vmdq_rx_q
-		= dev->device_fh * queues_per_pool + vmdq_queue_base;
+	vdev->vmdq_rx_q	= dev->device_fh * rxq + vmdq_queue_base;
+
+	if ((rxq > 1) && (queues_per_pool != rxq)) {
+		RTE_LOG(INFO, VHOST_DATA, "(%"PRIu64") rxq: %u != queues_per_pool: %u\n",
+			dev->device_fh, rxq, queues_per_pool);
+	}
 
 	if (zero_copy) {
 		uint32_t index = vdev->vmdq_rx_q;
@@ -2938,6 +2962,8 @@ main(int argc, char *argv[])
 	if (ret < 0)
 		rte_exit(EXIT_FAILURE, "Invalid argument\n");
 
+	rte_vhost_q_num_set(rxq);
+
 	for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id ++)
 		if (rte_lcore_is_enabled(lcore_id))
 			lcore_ids[core_id ++] = lcore_id;
-- 
1.8.4.2

^ permalink raw reply	[flat|nested] 65+ messages in thread

* [dpdk-dev] [PATCH 6/6] virtio: Resolve for control queue
  2015-05-21  7:49 [dpdk-dev] [PATCH 0/6] Support multiple queues in vhost Ouyang Changchun
                   ` (4 preceding siblings ...)
  2015-05-21  7:49 ` [dpdk-dev] [PATCH 5/6] vhost: Support multiple queues Ouyang Changchun
@ 2015-05-21  7:49 ` Ouyang Changchun
  2015-05-22  1:13 ` [dpdk-dev] [PATCH 0/6] Support multiple queues in vhost Thomas F Herbert
  2015-06-10  5:52 ` [dpdk-dev] [PATCH v2 0/7] " Ouyang Changchun
  7 siblings, 0 replies; 65+ messages in thread
From: Ouyang Changchun @ 2015-05-21  7:49 UTC (permalink / raw)
  To: dev

The control queue can't work in vhost-user multiple-queue mode,
so work around it by returning a value directly in the send_command function.

Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
---
 lib/librte_pmd_virtio/virtio_ethdev.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c b/lib/librte_pmd_virtio/virtio_ethdev.c
index f74e413..2a5d282 100644
--- a/lib/librte_pmd_virtio/virtio_ethdev.c
+++ b/lib/librte_pmd_virtio/virtio_ethdev.c
@@ -128,6 +128,12 @@ virtio_send_command(struct virtqueue *vq, struct virtio_pmd_ctrl *ctrl,
 		return -1;
 	}
 
+	/*
+	 * FIXME: The control queue doesn't work for vhost-user
+	 * multiple queues; work around it by returning directly.
+	 */
+	return 0;
+
 	PMD_INIT_LOG(DEBUG, "vq->vq_desc_head_idx = %d, status = %d, "
 		"vq->hw->cvq = %p vq = %p",
 		vq->vq_desc_head_idx, status, vq->hw->cvq, vq);
-- 
1.8.4.2

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [dpdk-dev] [PATCH 0/6] Support multiple queues in vhost
  2015-05-21  7:49 [dpdk-dev] [PATCH 0/6] Support multiple queues in vhost Ouyang Changchun
                   ` (5 preceding siblings ...)
  2015-05-21  7:49 ` [dpdk-dev] [PATCH 6/6] virtio: Resolve for control queue Ouyang Changchun
@ 2015-05-22  1:13 ` Thomas F Herbert
  2015-05-22  6:08   ` Ouyang, Changchun
  2015-06-10  5:52 ` [dpdk-dev] [PATCH v2 0/7] " Ouyang Changchun
  7 siblings, 1 reply; 65+ messages in thread
From: Thomas F Herbert @ 2015-05-22  1:13 UTC (permalink / raw)
  To: Ouyang Changchun, dev



On 5/21/15 3:49 AM, Ouyang Changchun wrote:
> This patch set supports the multiple queues for each virtio device in vhost.
> The vhost-user is used to enable the multiple queues feature, It's not ready for vhost-cuse.
Thanks. I tried it and verified that this patch applies cleanly to
master. Could you also notify the list when the QEMU patch is available?
Thanks again!
>
> One prerequisite to enable this feature is that a QEMU patch plus a fix is required to apply
> on QEMU2.2/2.3, pls refer to this link for the details of the patch and the fix:
> http://lists.nongnu.org/archive/html/qemu-devel/2015-04/msg00917.html
>
> A formal v3 patch for the code change and the fix will be sent to qemu community soon.
>
> Basicaly vhost sample leverages the VMDq+RSS in HW to receive packets and distribute them
> into different queue in the pool according to their 5 tuples.
>
> On the other hand, it enables multiple queues mode in vhost/virtio layer by setting the queue
> number as the value larger than 1.
>
> HW queue numbers in pool is required to be exactly same with the queue number in each virtio
> device, e.g. rxq = 4, the queue number is 4, it means there are 4 HW queues in each VMDq pool,
> and 4 queues in each virtio device/port, every queue in pool maps to one qeueu in virtio device.
>
> =========================================
> ==================|   |==================|
>         vport0     |   |      vport1      |
> ---  ---  ---  ---|   |---  ---  ---  ---|
> q0 | q1 | q2 | q3 |   |q0 | q1 | q2 | q3 |
> /\= =/\= =/\= =/\=|   |/\= =/\= =/\= =/\=|
> ||   ||   ||   ||      ||   ||   ||   ||
> ||   ||   ||   ||      ||   ||   ||   ||
> ||= =||= =||= =||=|   =||== ||== ||== ||=|
> q0 | q1 | q2 | q3 |   |q0 | q1 | q2 | q3 |
>
> ------------------|   |------------------|
>       VMDq pool0   |   |    VMDq pool1    |
> ==================|   |==================|
>
> In RX side, it firstly polls each queue of the pool and gets the packets from
> it and enqueue them into its corresponding queue in virtio device/port.
> In TX side, it dequeue packets from each queue of virtio device/port and send
> to either physical port or another virtio device according to its destination
> MAC address.
>
> It includes a workaround here in virtio as control queue not work for vhost-user
> multiple queues. It needs further investigate to root the cause, hopefully it could
> be addressed in next version.
>
> Here is some test guidance.
> 1. On host, firstly mount hugepage, and insmod uio, igb_uio, bind one nic on igb_uio;
> and then run vhost sample, key steps as follows:
> sudo mount -t hugetlbfs nodev /mnt/huge
> sudo modprobe uio
> sudo insmod $RTE_SDK/$RTE_TARGET/kmod/igb_uio.ko
>
> $RTE_SDK/tools/dpdk_nic_bind.py --bind igb_uio 0000:08:00.0
> sudo $RTE_SDK/examples/vhost/build/vhost-switch -c 0xf0 -n 4 --huge-dir /mnt/huge --socket-mem 1024,0 -- -p 1 --vm2vm 0 --dev-basename usvhost --rxq 2
>
> 2. After step 1, on host, modprobe kvm and kvm_intel, and use qemu command line to start one guest:
> modprobe kvm
> modprobe kvm_intel
> sudo mount -t hugetlbfs nodev /dev/hugepages -o pagesize=1G
>
> $QEMU_PATH/qemu-system-x86_64 -enable-kvm -m 4096 -object memory-backend-file,id=mem,size=4096M,mem-path=/dev/hugepages,share=on -numa node,memdev=mem -mem-prealloc -smp 10 -cpu core2duo,+sse3,+sse4.1,+sse4.2 -name <vm-name> -drive file=<img-path>/vm.img -chardev socket,id=char0,path=<usvhost-path>/usvhost -netdev type=vhost-user,id=hostnet2,chardev=char0,vhostforce=on,queues=2 -device virtio-net-pci,mq=on,vectors=6,netdev=hostnet2,id=net2,mac=52:54:00:12:34:56,csum=off,gso=off,guest_tso4=off,guest_tso6=off,guest_ecn=off -chardev socket,id=char1,path=<usvhost-path>/usvhost -netdev type=vhost-user,id=hostnet3,chardev=char1,vhostforce=on,queues=2 -device virtio-net-pci,mq=on,vectors=6,netdev=hostnet3,id=net3,mac=52:54:00:12:34:57,csum=off,gso=off,guest_tso4=off,guest_tso6=off,guest_ecn=off
>
> 3. Log on guest, use testpmd(dpdk based) to test, use multiple virtio queues to rx and tx packets.
> modprobe uio
> insmod $RTE_SDK/$RTE_TARGET/kmod/igb_uio.ko
> echo 1024 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages
> ./tools/dpdk_nic_bind.py --bind igb_uio 00:03.0 00:04.0
>
> $RTE_SDK/$RTE_TARGET/app/testpmd -c 1f -n 4 -- --rxq=2 --txq=2 --nb-cores=4 --rx-queue-stats-mapping="(0,0,0),(0,1,1),(1,0,2),(1,1,3)" --tx-queue-stats-mapping="(0,0,0),(0,1,1),(1,0,2),(1,1,3)" -i --disable-hw-vlan --txqflags 0xf00
>
> 4. Use packet generator to send packets with dest MAC:52 54 00 12 34 57  VLAN tag:1001,
> select IPv4 as protocols and continuous incremental IP address.
>
> 5. Testpmd on guest can display packets received/transmitted in both queues of each virtio port.
>
> Changchun Ouyang (6):
>    ixgbe: Support VMDq RSS in non-SRIOV environment
>    lib_vhost: Support multiple queues in virtio dev
>    lib_vhost: Set memory layout for multiple queues mode
>    vhost: Add new command line option: rxq
>    vhost: Support multiple queues
>    virtio: Resolve for control queue
>
>   examples/vhost/main.c                         | 199 +++++++++++++++++---------
>   lib/librte_ether/rte_ethdev.c                 |  40 ++++++
>   lib/librte_pmd_ixgbe/ixgbe_rxtx.c             |  82 +++++++++--
>   lib/librte_pmd_virtio/virtio_ethdev.c         |   6 +
>   lib/librte_vhost/rte_virtio_net.h             |  25 +++-
>   lib/librte_vhost/vhost_cuse/virtio-net-cdev.c |  57 ++++----
>   lib/librte_vhost/vhost_rxtx.c                 |  53 +++----
>   lib/librte_vhost/vhost_user/vhost-net-user.c  |   4 +-
>   lib/librte_vhost/vhost_user/virtio-net-user.c | 156 ++++++++++++++------
>   lib/librte_vhost/vhost_user/virtio-net-user.h |   2 +
>   lib/librte_vhost/virtio-net.c                 | 158 ++++++++++++--------
>   11 files changed, 545 insertions(+), 237 deletions(-)
>

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [dpdk-dev] [PATCH 4/6] vhost: Add new command line option: rxq
  2015-05-21  7:49 ` [dpdk-dev] [PATCH 4/6] vhost: Add new command line option: rxq Ouyang Changchun
@ 2015-05-22  1:39   ` Thomas F Herbert
  2015-05-22  6:05     ` Ouyang, Changchun
  0 siblings, 1 reply; 65+ messages in thread
From: Thomas F Herbert @ 2015-05-22  1:39 UTC (permalink / raw)
  To: dpdk >> dev@dpdk.org



On 5/21/15 3:49 AM, Ouyang Changchun wrote:
> Sample vhost need know the queue number user want to enable for each virtio device,
> so add the new option '--rxq' into it.
Could you also add the new --rxq option description to us_vhost_usage()?
>
> Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
> ---
>   examples/vhost/main.c | 46 ++++++++++++++++++++++++++++++++++++++++++----
>   1 file changed, 42 insertions(+), 4 deletions(-)
>
> diff --git a/examples/vhost/main.c b/examples/vhost/main.c
> index 408eb3f..16d4463 100644
> --- a/examples/vhost/main.c
> +++ b/examples/vhost/main.c
> @@ -163,6 +163,9 @@ static int mergeable;
>   /* Do vlan strip on host, enabled on default */
>   static uint32_t vlan_strip = 1;
>
> +/* Rx queue number per virtio device */
> +static uint32_t rxq = 1;
> +
>   /* number of descriptors to apply*/
>   static uint32_t num_rx_descriptor = RTE_TEST_RX_DESC_DEFAULT_ZCP;
>   static uint32_t num_tx_descriptor = RTE_TEST_TX_DESC_DEFAULT_ZCP;
> @@ -408,8 +411,14 @@ port_init(uint8_t port)
>   		txconf->tx_deferred_start = 1;
>   	}
>
> -	/*configure the number of supported virtio devices based on VMDQ limits */
> -	num_devices = dev_info.max_vmdq_pools;
> +	/* Configure the virtio devices num based on VMDQ limits */
> +	switch (rxq) {
> +	case 1:
> +	case 2: num_devices = dev_info.max_vmdq_pools;
> +		break;
> +	case 4: num_devices = dev_info.max_vmdq_pools / 2;
> +		break;
> +	}
>
>   	if (zero_copy) {
>   		rx_ring_size = num_rx_descriptor;
> @@ -431,7 +440,7 @@ port_init(uint8_t port)
>   		return retval;
>   	/* NIC queues are divided into pf queues and vmdq queues.  */
>   	num_pf_queues = dev_info.max_rx_queues - dev_info.vmdq_queue_num;
> -	queues_per_pool = dev_info.vmdq_queue_num / dev_info.max_vmdq_pools;
> +	queues_per_pool = dev_info.vmdq_queue_num / num_devices;
>   	num_vmdq_queues = num_devices * queues_per_pool;
>   	num_queues = num_pf_queues + num_vmdq_queues;
>   	vmdq_queue_base = dev_info.vmdq_queue_base;
> @@ -576,7 +585,8 @@ us_vhost_usage(const char *prgname)
>   	"		--rx-desc-num [0-N]: the number of descriptors on rx, "
>   			"used only when zero copy is enabled.\n"
>   	"		--tx-desc-num [0-N]: the number of descriptors on tx, "
> -			"used only when zero copy is enabled.\n",
> +			"used only when zero copy is enabled.\n"
> +	"		--rxq [1,2,4]: rx queue number for each vhost device\n",
>   	       prgname);
>   }
>
> @@ -602,6 +612,7 @@ us_vhost_parse_args(int argc, char **argv)
>   		{"zero-copy", required_argument, NULL, 0},
>   		{"rx-desc-num", required_argument, NULL, 0},
>   		{"tx-desc-num", required_argument, NULL, 0},
> +		{"rxq", required_argument, NULL, 0},
>   		{NULL, 0, 0, 0},
>   	};
>
> @@ -778,6 +789,20 @@ us_vhost_parse_args(int argc, char **argv)
>   				}
>   			}
>
> +			/* Specify the Rx queue number for each vhost dev. */
> +			if (!strncmp(long_option[option_index].name,
> +				"rxq", MAX_LONG_OPT_SZ)) {
> +				ret = parse_num_opt(optarg, 4);
> +				if ((ret == -1) || (!POWEROF2(ret))) {
> +					RTE_LOG(INFO, VHOST_CONFIG,
> +					"Invalid argument for rxq [1,2,4],"
> +					"power of 2 required.\n");
> +					us_vhost_usage(prgname);
> +					return -1;
> +				} else {
> +					rxq = ret;
> +				}
> +			}
>   			break;
>
>   			/* Invalid option - print options. */
> @@ -813,6 +838,19 @@ us_vhost_parse_args(int argc, char **argv)
>   		return -1;
>   	}
>
> +	if (rxq > 1) {
> +		vmdq_conf_default.rxmode.mq_mode = ETH_MQ_RX_VMDQ_RSS;
> +		vmdq_conf_default.rx_adv_conf.rss_conf.rss_hf = ETH_RSS_IP |
> +				ETH_RSS_UDP | ETH_RSS_TCP | ETH_RSS_SCTP;
> +	}
> +
> +	if ((zero_copy == 1) && (rxq > 1)) {
> +		RTE_LOG(INFO, VHOST_PORT,
> +			"Vhost zero copy doesn't support mq mode,"
> +			"please specify '--rxq 1' to disable it.\n");
> +		return -1;
> +	}
> +
>   	return 0;
>   }
>
>

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [dpdk-dev] [PATCH 4/6] vhost: Add new command line option: rxq
  2015-05-22  1:39   ` Thomas F Herbert
@ 2015-05-22  6:05     ` Ouyang, Changchun
  2015-05-22 12:51       ` Thomas F Herbert
  0 siblings, 1 reply; 65+ messages in thread
From: Ouyang, Changchun @ 2015-05-22  6:05 UTC (permalink / raw)
  To: Thomas F Herbert, dpdk >> dev@dpdk.org

Hi Thomas,

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Thomas F Herbert
> Sent: Friday, May 22, 2015 9:39 AM
> To: dpdk >> dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 4/6] vhost: Add new command line option:
> rxq
> 
> 
> 
> On 5/21/15 3:49 AM, Ouyang Changchun wrote:
> > Sample vhost need know the queue number user want to enable for each
> > virtio device, so add the new option '--rxq' into it.
> Could you also add the new --rxq option description to us_vhost_usage()?

Actually we have, please see below

> +585,8 @@
> > us_vhost_usage(const char *prgname)
> >   	"		--rx-desc-num [0-N]: the number of descriptors on rx,
> "
> >   			"used only when zero copy is enabled.\n"
> >   	"		--tx-desc-num [0-N]: the number of descriptors on tx,
> "
> > -			"used only when zero copy is enabled.\n",
> > +			"used only when zero copy is enabled.\n"
> > +	"		--rxq [1,2,4]: rx queue number for each vhost
> device\n",
> >   	       prgname);
> >   }

Thanks
Changchun

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [dpdk-dev] [PATCH 0/6] Support multiple queues in vhost
  2015-05-22  1:13 ` [dpdk-dev] [PATCH 0/6] Support multiple queues in vhost Thomas F Herbert
@ 2015-05-22  6:08   ` Ouyang, Changchun
  0 siblings, 0 replies; 65+ messages in thread
From: Ouyang, Changchun @ 2015-05-22  6:08 UTC (permalink / raw)
  To: Thomas F Herbert, dev

Hi Thomas,

> -----Original Message-----
> From: Thomas F Herbert [mailto:therbert@redhat.com]
> Sent: Friday, May 22, 2015 9:13 AM
> To: Ouyang, Changchun; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 0/6] Support multiple queues in vhost
> 
> 
> 
> On 5/21/15 3:49 AM, Ouyang Changchun wrote:
> > This patch set supports the multiple queues for each virtio device in vhost.
> > The vhost-user is used to enable the multiple queues feature, It's not
> ready for vhost-cuse.
> Thanks. I tried it and verified that this patch applies cleanly to master. Could
> you also notify the list when qemu patch is available.

I have sent out the QEMU patch to the QEMU community last night; please see this link:
http://patchwork.ozlabs.org/patch/475055/

Thanks for your verification.
Changchun

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [dpdk-dev] [PATCH 4/6] vhost: Add new command line option: rxq
  2015-05-22  6:05     ` Ouyang, Changchun
@ 2015-05-22 12:51       ` Thomas F Herbert
  2015-05-23  1:25         ` Ouyang, Changchun
  0 siblings, 1 reply; 65+ messages in thread
From: Thomas F Herbert @ 2015-05-22 12:51 UTC (permalink / raw)
  To: Ouyang, Changchun, dpdk >> dev@dpdk.org



On 5/22/15 2:05 AM, Ouyang, Changchun wrote:
> Hi Thomas,
>
>> -----Original Message-----
>> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Thomas F Herbert
>> Sent: Friday, May 22, 2015 9:39 AM
>> To: dpdk >> dev@dpdk.org
>> Subject: Re: [dpdk-dev] [PATCH 4/6] vhost: Add new command line option:
>> rxq
>>
>>
>>
>> On 5/21/15 3:49 AM, Ouyang Changchun wrote:
>>> Sample vhost need know the queue number user want to enable for each
>>> virtio device, so add the new option '--rxq' into it.
>> Could you also add the new --rxq option description to us_vhost_usage()?
>
> Actually we have, please see below
True enough. However, the code calls rte_eal_init() before parsing args,
and therefore calls rte_exit() before printing the usage of the non-EAL options.
>
>> +585,8 @@
>>> us_vhost_usage(const char *prgname)
>>>    	"		--rx-desc-num [0-N]: the number of descriptors on rx,
>> "
>>>    			"used only when zero copy is enabled.\n"
>>>    	"		--tx-desc-num [0-N]: the number of descriptors on tx,
>> "
>>> -			"used only when zero copy is enabled.\n",
>>> +			"used only when zero copy is enabled.\n"
>>> +	"		--rxq [1,2,4]: rx queue number for each vhost
>> device\n",
>>>    	       prgname);
>>>    }
>
> Thanks
> Changchun
>

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [dpdk-dev] [PATCH 4/6] vhost: Add new command line option: rxq
  2015-05-22 12:51       ` Thomas F Herbert
@ 2015-05-23  1:25         ` Ouyang, Changchun
  2015-05-26  7:21           ` Ouyang, Changchun
  0 siblings, 1 reply; 65+ messages in thread
From: Ouyang, Changchun @ 2015-05-23  1:25 UTC (permalink / raw)
  To: Thomas F Herbert, dpdk >> dev@dpdk.org

Hi Thomas,

> -----Original Message-----
> From: Thomas F Herbert [mailto:therbert@redhat.com]
> Sent: Friday, May 22, 2015 8:51 PM
> To: Ouyang, Changchun; dpdk >> dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 4/6] vhost: Add new command line option:
> rxq
> 
> 
> 
> On 5/22/15 2:05 AM, Ouyang, Changchun wrote:
> > Hi Thomas,
> >
> >> -----Original Message-----
> >> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Thomas F
> Herbert
> >> Sent: Friday, May 22, 2015 9:39 AM
> >> To: dpdk >> dev@dpdk.org
> >> Subject: Re: [dpdk-dev] [PATCH 4/6] vhost: Add new command line
> option:
> >> rxq
> >>
> >>
> >>
> >> On 5/21/15 3:49 AM, Ouyang Changchun wrote:
> >>> Sample vhost need know the queue number user want to enable for
> each
> >>> virtio device, so add the new option '--rxq' into it.
> >> Could you also add the new --rxq option description to us_vhost_usage()?
> >
> > Actually we have, please see below
> True enough. However, the code calls rte_eal_init() before parsing args and
> therefore takes rte_exit before printing usage of the non-eal options.

Yes, it is really the case for every other option, and for every other sample in DPDK.
Anyway, I can't put --rxq and its description into the EAL layer, as it belongs to the vhost sample here.
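
For reference, here is a minimal sketch of the call order in the sample's
main() (abridged; the error strings are illustrative, not verbatim):

    int
    main(int argc, char *argv[])
    {
        int ret;

        /* EAL parses and consumes its own arguments first; on bad EAL
         * input rte_exit() is called before any sample option is seen. */
        ret = rte_eal_init(argc, argv);
        if (ret < 0)
            rte_exit(EXIT_FAILURE, "Error with EAL initialization\n");
        argc -= ret;
        argv += ret;

        /* Only now are the non-EAL options (e.g. --rxq) parsed, and
         * us_vhost_usage() printed on invalid input. */
        ret = us_vhost_parse_args(argc, argv);
        if (ret < 0)
            rte_exit(EXIT_FAILURE, "Invalid argument\n");

        /* ... rest of the sample ... */
        return 0;
    }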

Thanks
Changchun

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [dpdk-dev] [PATCH 4/6] vhost: Add new command line option: rxq
  2015-05-23  1:25         ` Ouyang, Changchun
@ 2015-05-26  7:21           ` Ouyang, Changchun
  0 siblings, 0 replies; 65+ messages in thread
From: Ouyang, Changchun @ 2015-05-26  7:21 UTC (permalink / raw)
  To: Thomas F Herbert, dpdk >> dev@dpdk.org

Hi Thomas,

> -----Original Message-----
> From: Ouyang, Changchun
> Sent: Saturday, May 23, 2015 9:25 AM
> To: Thomas F Herbert; dpdk >> dev@dpdk.org
> Cc: Ouyang, Changchun
> Subject: RE: [dpdk-dev] [PATCH 4/6] vhost: Add new command line option:
> rxq
> 
> Hi Thomas,
> 
> > -----Original Message-----
> > From: Thomas F Herbert [mailto:therbert@redhat.com]
> > Sent: Friday, May 22, 2015 8:51 PM
> > To: Ouyang, Changchun; dpdk >> dev@dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH 4/6] vhost: Add new command line option:
> > rxq
> >
> >
> >
> > On 5/22/15 2:05 AM, Ouyang, Changchun wrote:
> > > Hi Thomas,
> > >
> > >> -----Original Message-----
> > >> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Thomas F
> > Herbert
> > >> Sent: Friday, May 22, 2015 9:39 AM
> > >> To: dpdk >> dev@dpdk.org
> > >> Subject: Re: [dpdk-dev] [PATCH 4/6] vhost: Add new command line
> > option:
> > >> rxq
> > >>
> > >>
> > >>
> > >> On 5/21/15 3:49 AM, Ouyang Changchun wrote:
> > >>> Sample vhost need know the queue number user want to enable for
> > each
> > >>> virtio device, so add the new option '--rxq' into it.
> > >> Could you also add the new --rxq option description to
> us_vhost_usage()?
> > >
> > > Actually we have, please see below
> > True enough. However, the code calls rte_eal_init() before parsing
> > args and therefore takes rte_exit before printing usage of the non-eal
> options.

Using this command line could address your question:
vhost-switch -c 0xf -n 4 -- -help
It will go through EAL and reach the helper function in vhost-switch; then you can see the usage info for vhost.

Thanks
Changchun

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [dpdk-dev] [PATCH 3/6] lib_vhost: Set memory layout for multiple queues mode
  2015-05-21  7:49 ` [dpdk-dev] [PATCH 3/6] lib_vhost: Set memory layout for multiple queues mode Ouyang Changchun
@ 2015-06-02  3:33   ` Xie, Huawei
  0 siblings, 0 replies; 65+ messages in thread
From: Xie, Huawei @ 2015-06-02  3:33 UTC (permalink / raw)
  To: Ouyang, Changchun, dev

Is there any possibility that different queues have different memory
translations?
How about using the memory region of the first queue discovered?


On 5/21/2015 3:50 PM, Ouyang Changchun wrote:
> QEMU sends separate commands orderly to set the memory layout for each queue
> in one virtio device, accordingly vhost need keep memory layout information
> for each queue of the virtio device.
>
> This also need adjust the interface a bit for function gpa_to_vva by
> introducing the queue index to specify queue of device to look up its
> virtual vhost address for the incoming guest physical address.
>
> Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
> ---
>  examples/vhost/main.c                         | 21 +++++-----
>  lib/librte_vhost/rte_virtio_net.h             | 10 +++--
>  lib/librte_vhost/vhost_cuse/virtio-net-cdev.c | 57 ++++++++++++++------------
>  lib/librte_vhost/vhost_rxtx.c                 | 21 +++++-----
>  lib/librte_vhost/vhost_user/virtio-net-user.c | 59 ++++++++++++++-------------
>  lib/librte_vhost/virtio-net.c                 | 26 +++++++-----
>  6 files changed, 106 insertions(+), 88 deletions(-)
>
>


^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [dpdk-dev] [PATCH 2/6] lib_vhost: Support multiple queues in virtio dev
  2015-05-21  7:49 ` [dpdk-dev] [PATCH 2/6] lib_vhost: Support multiple queues in virtio dev Ouyang Changchun
@ 2015-06-03  2:47   ` Xie, Huawei
  0 siblings, 0 replies; 65+ messages in thread
From: Xie, Huawei @ 2015-06-03  2:47 UTC (permalink / raw)
  To: dev


On 5/21/2015 3:51 PM, Ouyang Changchun wrote:
> Each virtio device could have multiple queues, say 2 or 4, at most 8.
> Enabling this feature allows virtio device/port on guest has the ability to
> use different vCPU to receive/transmit packets from/to each queue.
>
> In multiple queues mode, virtio device readiness means all queues of
> this virtio device are ready, cleanup/destroy a virtio device also
> requires clearing all queues belong to it.
>
> Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
> ---
>  lib/librte_vhost/rte_virtio_net.h             |  15 ++-
>  lib/librte_vhost/vhost_rxtx.c                 |  32 ++++---
>  lib/librte_vhost/vhost_user/vhost-net-user.c  |   4 +-
>  lib/librte_vhost/vhost_user/virtio-net-user.c |  97 +++++++++++++++----
>  lib/librte_vhost/vhost_user/virtio-net-user.h |   2 +
>  lib/librte_vhost/virtio-net.c                 | 132 +++++++++++++++++---------
>  6 files changed, 201 insertions(+), 81 deletions(-)
>
> diff --git a/lib/librte_vhost/rte_virtio_net.h b/lib/librte_vhost/rte_virtio_net.h
> index 5d38185..3e82bef 100644
> --- a/lib/librte_vhost/rte_virtio_net.h
> +++ b/lib/librte_vhost/rte_virtio_net.h
> @@ -59,6 +59,10 @@ struct rte_mbuf;

A basic question:
Does vhost have no way to know how many queues each virtio device has?
rte_vhost_q_num_set would set the same number of queues for all virtio
devices, so different virtio devices couldn't have different numbers of
queues.



^ permalink raw reply	[flat|nested] 65+ messages in thread

* [dpdk-dev] [PATCH v2 0/7] Support multiple queues in vhost
  2015-05-21  7:49 [dpdk-dev] [PATCH 0/6] Support multiple queues in vhost Ouyang Changchun
                   ` (6 preceding siblings ...)
  2015-05-22  1:13 ` [dpdk-dev] [PATCH 0/6] Support multiple queues in vhost Thomas F Herbert
@ 2015-06-10  5:52 ` Ouyang Changchun
  2015-06-10  5:52   ` [dpdk-dev] [PATCH v2 1/7] ixgbe: Support VMDq RSS in non-SRIOV environment Ouyang Changchun
                     ` (7 more replies)
  7 siblings, 8 replies; 65+ messages in thread
From: Ouyang Changchun @ 2015-06-10  5:52 UTC (permalink / raw)
  To: dev

This patch set supports multiple queues for each virtio device in vhost.
Vhost-user is used to enable the multiple queues feature; it's not ready for vhost-cuse.
 
The QEMU patch enabling vhost-user multiple queues has already been merged into the upstream
sub-tree in the QEMU community and will be included in QEMU 2.4. If using QEMU 2.3, apply the
same patch onto QEMU 2.3 and rebuild QEMU before running vhost multiple queues:
http://patchwork.ozlabs.org/patch/477461/
 
Basically the vhost sample leverages VMDq+RSS in HW to receive packets and distribute them
into different queues in the pool according to their 5-tuples.
 
On the other hand, vhost gets the queue pair number based on the communication messages with
QEMU.
 
The number of HW queues per pool is strongly recommended to be identical to the queue number used
to start the QEMU guest and to the queue number of the virtio port on the guest.
E.g. use '--rxq 4' to set the queue number to 4; it means there are 4 HW queues in each VMDq pool,
and 4 queues in each vhost device/port, with every queue in the pool mapping to one queue in the
vhost device.
 
=========================================
==================|   |==================|
       vport0     |   |      vport1      |
---  ---  ---  ---|   |---  ---  ---  ---|
q0 | q1 | q2 | q3 |   |q0 | q1 | q2 | q3 |
/\= =/\= =/\= =/\=|   |/\= =/\= =/\= =/\=|
||   ||   ||   ||      ||   ||   ||   ||
||   ||   ||   ||      ||   ||   ||   ||
||= =||= =||= =||=|   =||== ||== ||== ||=|
q0 | q1 | q2 | q3 |   |q0 | q1 | q2 | q3 |
------------------|   |------------------|
     VMDq pool0   |   |    VMDq pool1    |
==================|   |==================|
 
On the RX side, it polls each queue of the pool, gets the packets from it,
and enqueues them into the corresponding virtqueue in the virtio device/port.
On the TX side, it dequeues packets from each virtqueue of the virtio device/port
and sends them to either a physical port or another virtio device according to
the destination MAC address.
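
As a rough illustration (not the exact sample code) of how the queue pair
number reported by QEMU can drive the polling loop, using the
rte_vhost_qp_num_get() API added in patch 2/7 (mbuf freeing and stats
omitted):

    uint32_t qp, qp_num = rte_vhost_qp_num_get(dev);
    struct rte_mbuf *pkts[MAX_PKT_BURST];
    uint16_t rx_count, tx_count;

    for (qp = 0; qp < qp_num; qp++) {
        /* NIC -> guest: pool queue qp feeds the RX ring of pair qp. */
        rx_count = rte_eth_rx_burst(ports[0], vdev->vmdq_rx_q + qp,
                                    pkts, MAX_PKT_BURST);
        if (rx_count)
            rte_vhost_enqueue_burst(dev, VIRTIO_RXQ + qp * VIRTIO_QNUM,
                                    pkts, rx_count);

        /* Guest -> NIC/VM: drain the TX ring of pair qp and route it. */
        tx_count = rte_vhost_dequeue_burst(dev,
                                           VIRTIO_TXQ + qp * VIRTIO_QNUM,
                                           mbuf_pool, pkts, MAX_PKT_BURST);
        while (tx_count)
            virtio_tx_route(vdev, pkts[--tx_count],
                            (uint16_t)dev->device_fh, qp);
    }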
 
Here is some test guidance.
1. On the host, first mount hugepages, insmod uio and igb_uio, and bind one NIC to igb_uio;
then run the vhost sample. Key steps as follows:
sudo mount -t hugetlbfs nodev /mnt/huge
sudo modprobe uio
sudo insmod $RTE_SDK/$RTE_TARGET/kmod/igb_uio.ko

$RTE_SDK/tools/dpdk_nic_bind.py --bind igb_uio 0000:08:00.0
sudo $RTE_SDK/examples/vhost/build/vhost-switch -c 0xf0 -n 4 --huge-dir /mnt/huge --socket-mem 1024,0 -- -p 1 --vm2vm 0 --dev-basename usvhost --rxq 2

Use '--stats 1' to enable stats dumping on screen for vhost.
 
2. After step 1, on host, modprobe kvm and kvm_intel, and use qemu command line to start one guest:
modprobe kvm
modprobe kvm_intel
sudo mount -t hugetlbfs nodev /dev/hugepages -o pagesize=1G
 
$QEMU_PATH/qemu-system-x86_64 -enable-kvm -m 4096 -object memory-backend-file,id=mem,size=4096M,mem-path=/dev/hugepages,share=on -numa node,memdev=mem -mem-prealloc -smp 10 -cpu core2duo,+sse3,+sse4.1,+sse4.2 -name <vm-name> -drive file=<img-path>/vm.img -chardev socket,id=char0,path=<usvhost-path>/usvhost -netdev type=vhost-user,id=hostnet2,chardev=char0,vhostforce=on,queues=2 -device virtio-net-pci,mq=on,vectors=6,netdev=hostnet2,id=net2,mac=52:54:00:12:34:56,csum=off,gso=off,guest_tso4=off,guest_tso6=off,guest_ecn=off -chardev socket,id=char1,path=<usvhost-path>/usvhost -netdev type=vhost-user,id=hostnet3,chardev=char1,vhostforce=on,queues=2 -device virtio-net-pci,mq=on,vectors=6,netdev=hostnet3,id=net3,mac=52:54:00:12:34:57,csum=off,gso=off,guest_tso4=off,guest_tso6=off,guest_ecn=off
 
3. Log on to the guest, and use testpmd (DPDK-based) with multiple virtio queues to rx and tx packets.
modprobe uio
insmod $RTE_SDK/$RTE_TARGET/kmod/igb_uio.ko
echo 1024 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages
./tools/dpdk_nic_bind.py --bind igb_uio 00:03.0 00:04.0
 
$RTE_SDK/$RTE_TARGET/app/testpmd -c 1f -n 4 -- --rxq=2 --txq=2 --nb-cores=4 --rx-queue-stats-mapping="(0,0,0),(0,1,1),(1,0,2),(1,1,3)" --tx-queue-stats-mapping="(0,0,0),(0,1,1),(1,0,2),(1,1,3)" -i --disable-hw-vlan --txqflags 0xf00

set fwd mac
start tx_first
 
4. Use a packet generator to send packets with dest MAC 52:54:00:12:34:57 and VLAN tag 1001;
select IPv4 as the protocol and continuously incrementing IP addresses.
 
5. Testpmd on guest can display packets received/transmitted in both queues of each virtio port.

Changchun Ouyang (7):
  ixgbe: Support VMDq RSS in non-SRIOV environment
  lib_vhost: Support multiple queues in virtio dev
  lib_vhost: Set memory layout for multiple queues mode
  vhost: Add new command line option: rxq
  vhost: Support multiple queues
  virtio: Resolve for control queue
  vhost: Add per queue stats info

 drivers/net/ixgbe/ixgbe_rxtx.c                |  86 +++++--
 drivers/net/virtio/virtio_ethdev.c            |  15 +-
 examples/vhost/main.c                         | 324 +++++++++++++++++---------
 lib/librte_ether/rte_ethdev.c                 |  40 ++++
 lib/librte_vhost/rte_virtio_net.h             |  20 +-
 lib/librte_vhost/vhost-net.h                  |   1 +
 lib/librte_vhost/vhost_cuse/virtio-net-cdev.c |  57 +++--
 lib/librte_vhost/vhost_rxtx.c                 |  53 +++--
 lib/librte_vhost/vhost_user/vhost-net-user.c  |   4 +-
 lib/librte_vhost/vhost_user/virtio-net-user.c | 135 +++++++----
 lib/librte_vhost/vhost_user/virtio-net-user.h |   2 +
 lib/librte_vhost/virtio-net.c                 | 195 +++++++++++-----
 12 files changed, 640 insertions(+), 292 deletions(-)

-- 
1.8.4.2

^ permalink raw reply	[flat|nested] 65+ messages in thread

* [dpdk-dev] [PATCH v2 1/7] ixgbe: Support VMDq RSS in non-SRIOV environment
  2015-06-10  5:52 ` [dpdk-dev] [PATCH v2 0/7] " Ouyang Changchun
@ 2015-06-10  5:52   ` Ouyang Changchun
  2015-06-10  5:52   ` [dpdk-dev] [PATCH v2 2/7] lib_vhost: Support multiple queues in virtio dev Ouyang Changchun
                     ` (6 subsequent siblings)
  7 siblings, 0 replies; 65+ messages in thread
From: Ouyang Changchun @ 2015-06-10  5:52 UTC (permalink / raw)
  To: dev

In a non-SRIOV environment, VMDq RSS can be enabled via the MRQC register.
In theory, the queue number per pool could be 2 or 4, but only 2 queues are
available due to a HW limitation; the same limit also exists in the Linux ixgbe driver.
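
For context, here is a minimal sketch of the ethdev configuration an
application uses to request this mode (it mirrors what the vhost sample
sets up in patch 4/7; the pool count shown is illustrative):

    static const struct rte_eth_conf vmdq_rss_conf = {
        .rxmode = {
            /* Pools are selected by MAC/VLAN; RSS then spreads packets
             * across the queues inside each pool. */
            .mq_mode = ETH_MQ_RX_VMDQ_RSS,
        },
        .rx_adv_conf = {
            .vmdq_rx_conf = {
                /* 64 pools -> 2 queues per pool on an 82599,
                 * matching the HW limit mentioned above. */
                .nb_queue_pools = ETH_64_POOLS,
            },
            .rss_conf = {
                .rss_hf = ETH_RSS_IP | ETH_RSS_UDP |
                          ETH_RSS_TCP | ETH_RSS_SCTP,
            },
        },
    };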

Changes in v2:
  - fix checkpatch errors

Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
---
 drivers/net/ixgbe/ixgbe_rxtx.c | 86 +++++++++++++++++++++++++++++++++++-------
 lib/librte_ether/rte_ethdev.c  | 40 ++++++++++++++++++++
 2 files changed, 113 insertions(+), 13 deletions(-)

diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c b/drivers/net/ixgbe/ixgbe_rxtx.c
index 4f9ab22..13e661f 100644
--- a/drivers/net/ixgbe/ixgbe_rxtx.c
+++ b/drivers/net/ixgbe/ixgbe_rxtx.c
@@ -3311,16 +3311,16 @@ void ixgbe_configure_dcb(struct rte_eth_dev *dev)
 	return;
 }
 
-/*
- * VMDq only support for 10 GbE NIC.
+/**
+ * Config pool for VMDq on 10 GbE NIC.
  */
 static void
-ixgbe_vmdq_rx_hw_configure(struct rte_eth_dev *dev)
+ixgbe_vmdq_pool_configure(struct rte_eth_dev *dev)
 {
 	struct rte_eth_vmdq_rx_conf *cfg;
 	struct ixgbe_hw *hw;
 	enum rte_eth_nb_pools num_pools;
-	uint32_t mrqc, vt_ctl, vlanctrl;
+	uint32_t vt_ctl, vlanctrl;
 	uint32_t vmolr = 0;
 	int i;
 
@@ -3329,12 +3329,6 @@ ixgbe_vmdq_rx_hw_configure(struct rte_eth_dev *dev)
 	cfg = &dev->data->dev_conf.rx_adv_conf.vmdq_rx_conf;
 	num_pools = cfg->nb_queue_pools;
 
-	ixgbe_rss_disable(dev);
-
-	/* MRQC: enable vmdq */
-	mrqc = IXGBE_MRQC_VMDQEN;
-	IXGBE_WRITE_REG(hw, IXGBE_MRQC, mrqc);
-
 	/* PFVTCTL: turn on virtualisation and set the default pool */
 	vt_ctl = IXGBE_VT_CTL_VT_ENABLE | IXGBE_VT_CTL_REPLEN;
 	if (cfg->enable_default_pool)
@@ -3400,7 +3394,29 @@ ixgbe_vmdq_rx_hw_configure(struct rte_eth_dev *dev)
 	IXGBE_WRITE_FLUSH(hw);
 }
 
-/*
+/**
+ * VMDq only support for 10 GbE NIC.
+ */
+static void
+ixgbe_vmdq_rx_hw_configure(struct rte_eth_dev *dev)
+{
+	struct ixgbe_hw *hw;
+	uint32_t mrqc;
+
+	PMD_INIT_FUNC_TRACE();
+	hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+	ixgbe_rss_disable(dev);
+
+	/* MRQC: enable vmdq */
+	mrqc = IXGBE_MRQC_VMDQEN;
+	IXGBE_WRITE_REG(hw, IXGBE_MRQC, mrqc);
+	IXGBE_WRITE_FLUSH(hw);
+
+	ixgbe_vmdq_pool_configure(dev);
+}
+
+/**
  * ixgbe_dcb_config_tx_hw_config - Configure general VMDq TX parameters
  * @hw: pointer to hardware structure
  */
@@ -3505,6 +3521,41 @@ ixgbe_config_vf_rss(struct rte_eth_dev *dev)
 }
 
 static int
+ixgbe_config_vmdq_rss(struct rte_eth_dev *dev)
+{
+	struct ixgbe_hw *hw;
+	uint32_t mrqc;
+
+	ixgbe_rss_configure(dev);
+
+	hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+	/* MRQC: enable VMDQ RSS */
+	mrqc = IXGBE_READ_REG(hw, IXGBE_MRQC);
+	mrqc &= ~IXGBE_MRQC_MRQE_MASK;
+
+	switch (RTE_ETH_DEV_SRIOV(dev).nb_q_per_pool) {
+	case 2:
+		mrqc |= IXGBE_MRQC_VMDQRSS64EN;
+		break;
+
+	case 4:
+		mrqc |= IXGBE_MRQC_VMDQRSS32EN;
+		break;
+
+	default:
+		PMD_INIT_LOG(ERR, "Invalid pool number in non-IOV mode with VMDQ RSS");
+		return -EINVAL;
+	}
+
+	IXGBE_WRITE_REG(hw, IXGBE_MRQC, mrqc);
+
+	ixgbe_vmdq_pool_configure(dev);
+
+	return 0;
+}
+
+static int
 ixgbe_config_vf_default(struct rte_eth_dev *dev)
 {
 	struct ixgbe_hw *hw =
@@ -3560,6 +3611,10 @@ ixgbe_dev_mq_rx_configure(struct rte_eth_dev *dev)
 				ixgbe_vmdq_rx_hw_configure(dev);
 				break;
 
+			case ETH_MQ_RX_VMDQ_RSS:
+				ixgbe_config_vmdq_rss(dev);
+				break;
+
 			case ETH_MQ_RX_NONE:
 				/* if mq_mode is none, disable rss mode.*/
 			default: ixgbe_rss_disable(dev);
@@ -4038,6 +4093,8 @@ ixgbe_dev_rx_init(struct rte_eth_dev *dev)
 
 	/* Setup RX queues */
 	for (i = 0; i < dev->data->nb_rx_queues; i++) {
+		uint32_t psrtype = 0;
+
 		rxq = dev->data->rx_queues[i];
 
 		/*
@@ -4065,12 +4122,10 @@ ixgbe_dev_rx_init(struct rte_eth_dev *dev)
 		if (rx_conf->header_split) {
 			if (hw->mac.type == ixgbe_mac_82599EB) {
 				/* Must setup the PSRTYPE register */
-				uint32_t psrtype;
 				psrtype = IXGBE_PSRTYPE_TCPHDR |
 					IXGBE_PSRTYPE_UDPHDR   |
 					IXGBE_PSRTYPE_IPV4HDR  |
 					IXGBE_PSRTYPE_IPV6HDR;
-				IXGBE_WRITE_REG(hw, IXGBE_PSRTYPE(rxq->reg_idx), psrtype);
 			}
 			srrctl = ((rx_conf->split_hdr_size <<
 				IXGBE_SRRCTL_BSIZEHDRSIZE_SHIFT) &
@@ -4080,6 +4135,11 @@ ixgbe_dev_rx_init(struct rte_eth_dev *dev)
 #endif
 			srrctl = IXGBE_SRRCTL_DESCTYPE_ADV_ONEBUF;
 
+		/* Set RQPL for VMDQ RSS according to max Rx queue */
+		psrtype |= (RTE_ETH_DEV_SRIOV(dev).nb_q_per_pool >> 1) <<
+			IXGBE_PSRTYPE_RQPL_SHIFT;
+		IXGBE_WRITE_REG(hw, IXGBE_PSRTYPE(rxq->reg_idx), psrtype);
+
 		/* Set if packets are dropped when no descriptors available */
 		if (rxq->drop_en)
 			srrctl |= IXGBE_SRRCTL_DROP_EN;
diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 5a94654..190c529 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -933,6 +933,16 @@ rte_eth_dev_check_vf_rss_rxq_num(uint8_t port_id, uint16_t nb_rx_q)
 	return 0;
 }
 
+#define VMDQ_RSS_RX_QUEUE_NUM_MAX 4
+
+static int
+rte_eth_dev_check_vmdq_rss_rxq_num(__rte_unused uint8_t port_id, uint16_t nb_rx_q)
+{
+	if (nb_rx_q > VMDQ_RSS_RX_QUEUE_NUM_MAX)
+		return -EINVAL;
+	return 0;
+}
+
 static int
 rte_eth_dev_check_mq_mode(uint8_t port_id, uint16_t nb_rx_q, uint16_t nb_tx_q,
 		      const struct rte_eth_conf *dev_conf)
@@ -1093,6 +1103,36 @@ rte_eth_dev_check_mq_mode(uint8_t port_id, uint16_t nb_rx_q, uint16_t nb_tx_q,
 				return -EINVAL;
 			}
 		}
+
+		if (dev_conf->rxmode.mq_mode == ETH_MQ_RX_VMDQ_RSS) {
+			uint32_t nb_queue_pools =
+				dev_conf->rx_adv_conf.vmdq_rx_conf.nb_queue_pools;
+			struct rte_eth_dev_info dev_info;
+
+			rte_eth_dev_info_get(port_id, &dev_info);
+			dev->data->dev_conf.rxmode.mq_mode = ETH_MQ_RX_VMDQ_RSS;
+			if (nb_queue_pools == ETH_32_POOLS || nb_queue_pools == ETH_64_POOLS)
+				RTE_ETH_DEV_SRIOV(dev).nb_q_per_pool =
+					dev_info.max_rx_queues/nb_queue_pools;
+			else {
+				PMD_DEBUG_TRACE("ethdev port_id=%d VMDQ "
+						"nb_queue_pools=%d invalid "
+						"in VMDQ RSS\n"
+						port_id,
+						nb_queue_pools);
+				return -EINVAL;
+			}
+
+			if (rte_eth_dev_check_vmdq_rss_rxq_num(port_id,
+				RTE_ETH_DEV_SRIOV(dev).nb_q_per_pool) != 0) {
+				PMD_DEBUG_TRACE("ethdev port_id=%d"
+					" SRIOV active, invalid queue"
+					" number for VMDQ RSS, allowed"
+					" value are 1, 2 or 4\n",
+					port_id);
+				return -EINVAL;
+			}
+		}
 	}
 	return 0;
 }
-- 
1.8.4.2

^ permalink raw reply	[flat|nested] 65+ messages in thread

* [dpdk-dev] [PATCH v2 2/7] lib_vhost: Support multiple queues in virtio dev
  2015-06-10  5:52 ` [dpdk-dev] [PATCH v2 0/7] " Ouyang Changchun
  2015-06-10  5:52   ` [dpdk-dev] [PATCH v2 1/7] ixgbe: Support VMDq RSS in non-SRIOV environment Ouyang Changchun
@ 2015-06-10  5:52   ` Ouyang Changchun
  2015-06-11  9:54     ` Panu Matilainen
  2015-06-10  5:52   ` [dpdk-dev] [PATCH v2 3/7] lib_vhost: Set memory layout for multiple queues mode Ouyang Changchun
                     ` (5 subsequent siblings)
  7 siblings, 1 reply; 65+ messages in thread
From: Ouyang Changchun @ 2015-06-10  5:52 UTC (permalink / raw)
  To: dev

Each virtio device could have multiple queues, say 2 or 4, at most 8.
Enabling this feature allows the virtio device/port on the guest to use
different vCPUs to receive/transmit packets from/to each queue.

In multiple-queue mode, virtio device readiness means all queues of
this virtio device are ready; cleaning up/destroying a virtio device also
requires clearing all queues belonging to it.

Changes in v2:
  - remove the q_num_set api
  - add the qp_num_get api
  - determine the queue pair num from qemu message
  - rework for reset owner message handler
  - dynamically alloc mem for dev virtqueue
  - queue pair num could be 0x8000
  - fix checkpatch errors

Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
---
 lib/librte_vhost/rte_virtio_net.h             |  10 +-
 lib/librte_vhost/vhost-net.h                  |   1 +
 lib/librte_vhost/vhost_rxtx.c                 |  32 ++---
 lib/librte_vhost/vhost_user/vhost-net-user.c  |   4 +-
 lib/librte_vhost/vhost_user/virtio-net-user.c |  76 +++++++++---
 lib/librte_vhost/vhost_user/virtio-net-user.h |   2 +
 lib/librte_vhost/virtio-net.c                 | 161 +++++++++++++++++---------
 7 files changed, 197 insertions(+), 89 deletions(-)

diff --git a/lib/librte_vhost/rte_virtio_net.h b/lib/librte_vhost/rte_virtio_net.h
index 5d38185..92b4bfa 100644
--- a/lib/librte_vhost/rte_virtio_net.h
+++ b/lib/librte_vhost/rte_virtio_net.h
@@ -59,7 +59,6 @@ struct rte_mbuf;
 /* Backend value set by guest. */
 #define VIRTIO_DEV_STOPPED -1
 
-
 /* Enum for virtqueue management. */
 enum {VIRTIO_RXQ, VIRTIO_TXQ, VIRTIO_QNUM};
 
@@ -96,13 +95,14 @@ struct vhost_virtqueue {
  * Device structure contains all configuration information relating to the device.
  */
 struct virtio_net {
-	struct vhost_virtqueue	*virtqueue[VIRTIO_QNUM];	/**< Contains all virtqueue information. */
 	struct virtio_memory	*mem;		/**< QEMU memory and memory region information. */
+	struct vhost_virtqueue	**virtqueue;    /**< Contains all virtqueue information. */
 	uint64_t		features;	/**< Negotiated feature set. */
 	uint64_t		device_fh;	/**< device identifier. */
 	uint32_t		flags;		/**< Device flags. Only used to check if device is running on data core. */
 #define IF_NAME_SZ (PATH_MAX > IFNAMSIZ ? PATH_MAX : IFNAMSIZ)
 	char			ifname[IF_NAME_SZ];	/**< Name of the tap device or socket path. */
+	uint32_t                num_virt_queues;
 	void			*priv;		/**< private context */
 } __rte_cache_aligned;
 
@@ -220,4 +220,10 @@ uint16_t rte_vhost_enqueue_burst(struct virtio_net *dev, uint16_t queue_id,
 uint16_t rte_vhost_dequeue_burst(struct virtio_net *dev, uint16_t queue_id,
 	struct rte_mempool *mbuf_pool, struct rte_mbuf **pkts, uint16_t count);
 
+/**
+ * This function gets the queue pair number of one vhost device.
+ * @return
+ *  number of queue pairs of the specified virtio device.
+ */
+uint16_t rte_vhost_qp_num_get(struct virtio_net *dev);
 #endif /* _VIRTIO_NET_H_ */
diff --git a/lib/librte_vhost/vhost-net.h b/lib/librte_vhost/vhost-net.h
index c69b60b..7dff14d 100644
--- a/lib/librte_vhost/vhost-net.h
+++ b/lib/librte_vhost/vhost-net.h
@@ -115,4 +115,5 @@ struct vhost_net_device_ops {
 
 
 struct vhost_net_device_ops const *get_virtio_net_callbacks(void);
+int alloc_vring_queue_pair(struct virtio_net *dev, uint16_t qp_idx);
 #endif /* _VHOST_NET_CDEV_H_ */
diff --git a/lib/librte_vhost/vhost_rxtx.c b/lib/librte_vhost/vhost_rxtx.c
index 4809d32..19f9518 100644
--- a/lib/librte_vhost/vhost_rxtx.c
+++ b/lib/librte_vhost/vhost_rxtx.c
@@ -67,12 +67,12 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,
 	uint8_t success = 0;
 
 	LOG_DEBUG(VHOST_DATA, "(%"PRIu64") virtio_dev_rx()\n", dev->device_fh);
-	if (unlikely(queue_id != VIRTIO_RXQ)) {
-		LOG_DEBUG(VHOST_DATA, "mq isn't supported in this version.\n");
-		return 0;
+	if (unlikely(queue_id >= VIRTIO_QNUM * dev->num_virt_queues)) {
+		LOG_DEBUG(VHOST_DATA, "queue id: %d invalid.\n", queue_id);
+		return 0;
 	}
 
-	vq = dev->virtqueue[VIRTIO_RXQ];
+	vq = dev->virtqueue[queue_id];
 	count = (count > MAX_PKT_BURST) ? MAX_PKT_BURST : count;
 
 	/*
@@ -188,8 +188,9 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,
 }
 
 static inline uint32_t __attribute__((always_inline))
-copy_from_mbuf_to_vring(struct virtio_net *dev, uint16_t res_base_idx,
-	uint16_t res_end_idx, struct rte_mbuf *pkt)
+copy_from_mbuf_to_vring(struct virtio_net *dev, uint16_t queue_id,
+	uint16_t res_base_idx, uint16_t res_end_idx,
+	struct rte_mbuf *pkt)
 {
 	uint32_t vec_idx = 0;
 	uint32_t entry_success = 0;
@@ -217,9 +218,9 @@ copy_from_mbuf_to_vring(struct virtio_net *dev, uint16_t res_base_idx,
 	 * Convert from gpa to vva
 	 * (guest physical addr -> vhost virtual addr)
 	 */
-	vq = dev->virtqueue[VIRTIO_RXQ];
 	vb_addr =
 		gpa_to_vva(dev, vq->buf_vec[vec_idx].buf_addr);
+	vq = dev->virtqueue[queue_id];
 	vb_hdr_addr = vb_addr;
 
 	/* Prefetch buffer address. */
@@ -407,11 +408,12 @@ virtio_dev_merge_rx(struct virtio_net *dev, uint16_t queue_id,
 
 	LOG_DEBUG(VHOST_DATA, "(%"PRIu64") virtio_dev_merge_rx()\n",
 		dev->device_fh);
-	if (unlikely(queue_id != VIRTIO_RXQ)) {
-		LOG_DEBUG(VHOST_DATA, "mq isn't supported in this version.\n");
+	if (unlikely(queue_id >= VIRTIO_QNUM * dev->num_virt_queues)) {
+		LOG_DEBUG(VHOST_DATA, "queue id: %d invalid.\n", queue_id);
+		return 0;
 	}
 
-	vq = dev->virtqueue[VIRTIO_RXQ];
+	vq = dev->virtqueue[queue_id];
 	count = RTE_MIN((uint32_t)MAX_PKT_BURST, count);
 
 	if (count == 0)
@@ -493,7 +495,7 @@ virtio_dev_merge_rx(struct virtio_net *dev, uint16_t queue_id,
 
 		res_end_idx = res_cur_idx;
 
-		entry_success = copy_from_mbuf_to_vring(dev, res_base_idx,
+		entry_success = copy_from_mbuf_to_vring(dev, queue_id, res_base_idx,
 			res_end_idx, pkts[pkt_idx]);
 
 		rte_compiler_barrier();
@@ -543,12 +545,12 @@ rte_vhost_dequeue_burst(struct virtio_net *dev, uint16_t queue_id,
 	uint16_t free_entries, entry_success = 0;
 	uint16_t avail_idx;
 
-	if (unlikely(queue_id != VIRTIO_TXQ)) {
-		LOG_DEBUG(VHOST_DATA, "mq isn't supported in this version.\n");
-		return 0;
+	if (unlikely(queue_id >= VIRTIO_QNUM * dev->num_virt_queues)) {
+		LOG_DEBUG(VHOST_DATA, "queue id:%d invalid.\n", queue_id);
+		return -1;
 	}
 
-	vq = dev->virtqueue[VIRTIO_TXQ];
+	vq = dev->virtqueue[queue_id];
 	avail_idx =  *((volatile uint16_t *)&vq->avail->idx);
 
 	/* If there are no available buffers then return. */
diff --git a/lib/librte_vhost/vhost_user/vhost-net-user.c b/lib/librte_vhost/vhost_user/vhost-net-user.c
index 31f1215..b66a653 100644
--- a/lib/librte_vhost/vhost_user/vhost-net-user.c
+++ b/lib/librte_vhost/vhost_user/vhost-net-user.c
@@ -378,7 +378,9 @@ vserver_message_handler(int connfd, void *dat, int *remove)
 		ops->set_owner(ctx);
 		break;
 	case VHOST_USER_RESET_OWNER:
-		ops->reset_owner(ctx);
+		RTE_LOG(INFO, VHOST_CONFIG,
+			"(%"PRIu64") VHOST_NET_RESET_OWNER\n", ctx.fh);
+		user_reset_owner(ctx, &msg.payload.state);
 		break;
 
 	case VHOST_USER_SET_MEM_TABLE:
diff --git a/lib/librte_vhost/vhost_user/virtio-net-user.c b/lib/librte_vhost/vhost_user/virtio-net-user.c
index c1ffc38..b4de86d 100644
--- a/lib/librte_vhost/vhost_user/virtio-net-user.c
+++ b/lib/librte_vhost/vhost_user/virtio-net-user.c
@@ -209,30 +209,46 @@ static int
 virtio_is_ready(struct virtio_net *dev)
 {
 	struct vhost_virtqueue *rvq, *tvq;
+	uint32_t q_idx;
 
 	/* mq support in future.*/
-	rvq = dev->virtqueue[VIRTIO_RXQ];
-	tvq = dev->virtqueue[VIRTIO_TXQ];
-	if (rvq && tvq && rvq->desc && tvq->desc &&
-		(rvq->kickfd != (eventfd_t)-1) &&
-		(rvq->callfd != (eventfd_t)-1) &&
-		(tvq->kickfd != (eventfd_t)-1) &&
-		(tvq->callfd != (eventfd_t)-1)) {
-		RTE_LOG(INFO, VHOST_CONFIG,
-			"virtio is now ready for processing.\n");
-		return 1;
+	for (q_idx = 0; q_idx < dev->num_virt_queues; q_idx++) {
+		uint32_t virt_rx_q_idx = q_idx * VIRTIO_QNUM + VIRTIO_RXQ;
+		uint32_t virt_tx_q_idx = q_idx * VIRTIO_QNUM + VIRTIO_TXQ;
+
+		rvq = dev->virtqueue[virt_rx_q_idx];
+		tvq = dev->virtqueue[virt_tx_q_idx];
+		if ((rvq == NULL) || (tvq == NULL) ||
+			(rvq->desc == NULL) || (tvq->desc == NULL) ||
+			(rvq->kickfd == (eventfd_t)-1) ||
+			(rvq->callfd == (eventfd_t)-1) ||
+			(tvq->kickfd == (eventfd_t)-1) ||
+			(tvq->callfd == (eventfd_t)-1)) {
+			RTE_LOG(INFO, VHOST_CONFIG,
+				"virtio isn't ready for processing.\n");
+			return 0;
+		}
 	}
 	RTE_LOG(INFO, VHOST_CONFIG,
-		"virtio isn't ready for processing.\n");
-	return 0;
+		"virtio is now ready for processing.\n");
+	return 1;
 }
 
 void
 user_set_vring_call(struct vhost_device_ctx ctx, struct VhostUserMsg *pmsg)
 {
 	struct vhost_vring_file file;
+	struct virtio_net *dev = get_device(ctx);
+	uint32_t cur_qp_idx;
 
 	file.index = pmsg->payload.u64 & VHOST_USER_VRING_IDX_MASK;
+	cur_qp_idx = (file.index & (~0x1)) >> 1;
+
+	if (dev->num_virt_queues < cur_qp_idx + 1) {
+		if (alloc_vring_queue_pair(dev, cur_qp_idx) == 0)
+			dev->num_virt_queues = cur_qp_idx + 1;
+	}
+
 	if (pmsg->payload.u64 & VHOST_USER_VRING_NOFD_MASK)
 		file.fd = -1;
 	else
@@ -290,13 +306,37 @@ user_get_vring_base(struct vhost_device_ctx ctx,
 	 * sent and only sent in vhost_vring_stop.
 	 * TODO: cleanup the vring, it isn't usable since here.
 	 */
-	if (((int)dev->virtqueue[VIRTIO_RXQ]->kickfd) >= 0) {
-		close(dev->virtqueue[VIRTIO_RXQ]->kickfd);
-		dev->virtqueue[VIRTIO_RXQ]->kickfd = (eventfd_t)-1;
+	if (((int)dev->virtqueue[state->index]->kickfd) >= 0) {
+		close(dev->virtqueue[state->index]->kickfd);
+		dev->virtqueue[state->index]->kickfd = (eventfd_t)-1;
 	}
-	if (((int)dev->virtqueue[VIRTIO_TXQ]->kickfd) >= 0) {
-		close(dev->virtqueue[VIRTIO_TXQ]->kickfd);
-		dev->virtqueue[VIRTIO_TXQ]->kickfd = (eventfd_t)-1;
+
+	return 0;
+}
+
+/*
+ * When virtio is stopped, QEMU will send us the RESET_OWNER message.
+ */
+int
+user_reset_owner(struct vhost_device_ctx ctx,
+	struct vhost_vring_state *state)
+{
+	struct virtio_net *dev = get_device(ctx);
+
+	/* We have to stop the queue (virtio) if it is running. */
+	if (dev->flags & VIRTIO_DEV_RUNNING)
+		notify_ops->destroy_device(dev);
+
+	RTE_LOG(INFO, VHOST_CONFIG,
+		"reset owner --- state idx:%d state num:%d\n", state->index, state->num);
+	/*
+	 * Based on current qemu vhost-user implementation, this message is
+	 * sent and only sent in vhost_net_stop_one.
+	 * TODO: cleanup the vring, it isn't usable since here.
+	 */
+	if (((int)dev->virtqueue[state->index]->kickfd) >= 0) {
+		close(dev->virtqueue[state->index]->kickfd);
+		dev->virtqueue[state->index]->kickfd = (eventfd_t)-1;
 	}
 
 	return 0;
diff --git a/lib/librte_vhost/vhost_user/virtio-net-user.h b/lib/librte_vhost/vhost_user/virtio-net-user.h
index df24860..2429836 100644
--- a/lib/librte_vhost/vhost_user/virtio-net-user.h
+++ b/lib/librte_vhost/vhost_user/virtio-net-user.h
@@ -46,4 +46,6 @@ void user_set_vring_kick(struct vhost_device_ctx, struct VhostUserMsg *);
 int user_get_vring_base(struct vhost_device_ctx, struct vhost_vring_state *);
 
 void user_destroy_device(struct vhost_device_ctx);
+
+int user_reset_owner(struct vhost_device_ctx ctx, struct vhost_vring_state *state);
 #endif
diff --git a/lib/librte_vhost/virtio-net.c b/lib/librte_vhost/virtio-net.c
index 4672e67..58d21a8 100644
--- a/lib/librte_vhost/virtio-net.c
+++ b/lib/librte_vhost/virtio-net.c
@@ -66,10 +66,10 @@ static struct virtio_net_config_ll *ll_root;
 /* Features supported by this lib. */
 #define VHOST_SUPPORTED_FEATURES ((1ULL << VIRTIO_NET_F_MRG_RXBUF) | \
 				(1ULL << VIRTIO_NET_F_CTRL_VQ) | \
-				(1ULL << VIRTIO_NET_F_CTRL_RX))
+				(1ULL << VIRTIO_NET_F_CTRL_RX) | \
+				(1ULL << VIRTIO_NET_F_MQ))
 static uint64_t VHOST_FEATURES = VHOST_SUPPORTED_FEATURES;
 
-
 /*
  * Converts QEMU virtual address to Vhost virtual address. This function is
  * used to convert the ring addresses to our address space.
@@ -177,6 +177,8 @@ add_config_ll_entry(struct virtio_net_config_ll *new_ll_dev)
 static void
 cleanup_device(struct virtio_net *dev)
 {
+	uint32_t qp_idx;
+
 	/* Unmap QEMU memory file if mapped. */
 	if (dev->mem) {
 		munmap((void *)(uintptr_t)dev->mem->mapped_address,
@@ -185,14 +187,18 @@ cleanup_device(struct virtio_net *dev)
 	}
 
 	/* Close any event notifiers opened by device. */
-	if ((int)dev->virtqueue[VIRTIO_RXQ]->callfd >= 0)
-		close((int)dev->virtqueue[VIRTIO_RXQ]->callfd);
-	if ((int)dev->virtqueue[VIRTIO_RXQ]->kickfd >= 0)
-		close((int)dev->virtqueue[VIRTIO_RXQ]->kickfd);
-	if ((int)dev->virtqueue[VIRTIO_TXQ]->callfd >= 0)
-		close((int)dev->virtqueue[VIRTIO_TXQ]->callfd);
-	if ((int)dev->virtqueue[VIRTIO_TXQ]->kickfd >= 0)
-		close((int)dev->virtqueue[VIRTIO_TXQ]->kickfd);
+	for (qp_idx = 0; qp_idx < dev->num_virt_queues; qp_idx++) {
+		uint32_t virt_rx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_RXQ;
+		uint32_t virt_tx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_TXQ;
+		if ((int)dev->virtqueue[virt_rx_q_idx]->callfd >= 0)
+			close((int)dev->virtqueue[virt_rx_q_idx]->callfd);
+		if ((int)dev->virtqueue[virt_rx_q_idx]->kickfd >= 0)
+			close((int)dev->virtqueue[virt_rx_q_idx]->kickfd);
+		if ((int)dev->virtqueue[virt_tx_q_idx]->callfd >= 0)
+			close((int)dev->virtqueue[virt_tx_q_idx]->callfd);
+		if ((int)dev->virtqueue[virt_tx_q_idx]->kickfd >= 0)
+			close((int)dev->virtqueue[virt_tx_q_idx]->kickfd);
+	}
 }
 
 /*
@@ -201,9 +207,17 @@ cleanup_device(struct virtio_net *dev)
 static void
 free_device(struct virtio_net_config_ll *ll_dev)
 {
-	/* Free any malloc'd memory */
-	free(ll_dev->dev.virtqueue[VIRTIO_RXQ]);
-	free(ll_dev->dev.virtqueue[VIRTIO_TXQ]);
+	uint32_t qp_idx;
+
+	/*
+	 * Free any malloc'd memory.
+	 */
+	/* Free every queue pair. */
+	for (qp_idx = 0; qp_idx < ll_dev->dev.num_virt_queues; qp_idx++) {
+		uint32_t virt_rx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_RXQ;
+		free(ll_dev->dev.virtqueue[virt_rx_q_idx]);
+	}
+	free(ll_dev->dev.virtqueue);
 	free(ll_dev);
 }
 
@@ -237,6 +251,27 @@ rm_config_ll_entry(struct virtio_net_config_ll *ll_dev,
 }
 
 /*
+ *  Initialise all variables in vring queue pair.
+ */
+static void
+init_vring_queue_pair(struct virtio_net *dev, uint16_t qp_idx)
+{
+	uint32_t virt_rx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_RXQ;
+	uint32_t virt_tx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_TXQ;
+	memset(dev->virtqueue[virt_rx_q_idx], 0, sizeof(struct vhost_virtqueue));
+	memset(dev->virtqueue[virt_tx_q_idx], 0, sizeof(struct vhost_virtqueue));
+
+	dev->virtqueue[virt_rx_q_idx]->kickfd = (eventfd_t)-1;
+	dev->virtqueue[virt_rx_q_idx]->callfd = (eventfd_t)-1;
+	dev->virtqueue[virt_tx_q_idx]->kickfd = (eventfd_t)-1;
+	dev->virtqueue[virt_tx_q_idx]->callfd = (eventfd_t)-1;
+
+	/* Backends are set to -1 indicating an inactive device. */
+	dev->virtqueue[virt_rx_q_idx]->backend = VIRTIO_DEV_STOPPED;
+	dev->virtqueue[virt_tx_q_idx]->backend = VIRTIO_DEV_STOPPED;
+}
+
+/*
  *  Initialise all variables in device structure.
  */
 static void
@@ -253,17 +288,31 @@ init_device(struct virtio_net *dev)
 	/* Set everything to 0. */
 	memset((void *)(uintptr_t)((uint64_t)(uintptr_t)dev + vq_offset), 0,
 		(sizeof(struct virtio_net) - (size_t)vq_offset));
-	memset(dev->virtqueue[VIRTIO_RXQ], 0, sizeof(struct vhost_virtqueue));
-	memset(dev->virtqueue[VIRTIO_TXQ], 0, sizeof(struct vhost_virtqueue));
 
-	dev->virtqueue[VIRTIO_RXQ]->kickfd = (eventfd_t)-1;
-	dev->virtqueue[VIRTIO_RXQ]->callfd = (eventfd_t)-1;
-	dev->virtqueue[VIRTIO_TXQ]->kickfd = (eventfd_t)-1;
-	dev->virtqueue[VIRTIO_TXQ]->callfd = (eventfd_t)-1;
+	init_vring_queue_pair(dev, 0);
+	dev->num_virt_queues = 1;
+}
 
-	/* Backends are set to -1 indicating an inactive device. */
-	dev->virtqueue[VIRTIO_RXQ]->backend = VIRTIO_DEV_STOPPED;
-	dev->virtqueue[VIRTIO_TXQ]->backend = VIRTIO_DEV_STOPPED;
+/*
+ *  Alloc mem for vring queue pair.
+ */
+int
+alloc_vring_queue_pair(struct virtio_net *dev, uint16_t qp_idx)
+{
+	struct vhost_virtqueue *virtqueue = NULL;
+	uint32_t virt_rx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_RXQ;
+	uint32_t virt_tx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_TXQ;
+
+	virtqueue = malloc(sizeof(struct vhost_virtqueue) * VIRTIO_QNUM);
+	if (virtqueue == NULL) {
+		RTE_LOG(ERR, VHOST_CONFIG,
+			"Failed to allocate memory for virt qp:%d.\n", qp_idx);
+		return -1;
+	}
+
+	dev->virtqueue[virt_rx_q_idx] = virtqueue;
+	dev->virtqueue[virt_tx_q_idx] = virtqueue + VIRTIO_TXQ;
+	return 0;
 }
 
 /*
@@ -275,7 +324,6 @@ static int
 new_device(struct vhost_device_ctx ctx)
 {
 	struct virtio_net_config_ll *new_ll_dev;
-	struct vhost_virtqueue *virtqueue_rx, *virtqueue_tx;
 
 	/* Setup device and virtqueues. */
 	new_ll_dev = malloc(sizeof(struct virtio_net_config_ll));
@@ -286,28 +334,22 @@ new_device(struct vhost_device_ctx ctx)
 		return -1;
 	}
 
-	virtqueue_rx = malloc(sizeof(struct vhost_virtqueue));
-	if (virtqueue_rx == NULL) {
-		free(new_ll_dev);
+	new_ll_dev->dev.virtqueue =
+		malloc(VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MAX * sizeof(struct vhost_virtqueue *));
+	if (new_ll_dev->dev.virtqueue == NULL) {
 		RTE_LOG(ERR, VHOST_CONFIG,
-			"(%"PRIu64") Failed to allocate memory for rxq.\n",
+			"(%"PRIu64") Failed to allocate memory for dev.virtqueue.\n",
 			ctx.fh);
+		free(new_ll_dev);
 		return -1;
 	}
 
-	virtqueue_tx = malloc(sizeof(struct vhost_virtqueue));
-	if (virtqueue_tx == NULL) {
-		free(virtqueue_rx);
+	if (alloc_vring_queue_pair(&new_ll_dev->dev, 0) == -1) {
+		free(new_ll_dev->dev.virtqueue);
 		free(new_ll_dev);
-		RTE_LOG(ERR, VHOST_CONFIG,
-			"(%"PRIu64") Failed to allocate memory for txq.\n",
-			ctx.fh);
 		return -1;
 	}
 
-	new_ll_dev->dev.virtqueue[VIRTIO_RXQ] = virtqueue_rx;
-	new_ll_dev->dev.virtqueue[VIRTIO_TXQ] = virtqueue_tx;
-
 	/* Initialise device and virtqueues. */
 	init_device(&new_ll_dev->dev);
 
@@ -391,7 +433,7 @@ set_owner(struct vhost_device_ctx ctx)
  * Called from CUSE IOCTL: VHOST_RESET_OWNER
  */
 static int
-reset_owner(struct vhost_device_ctx ctx)
+reset_owner(__rte_unused struct vhost_device_ctx ctx)
 {
 	struct virtio_net_config_ll *ll_dev;
 
@@ -429,6 +471,7 @@ static int
 set_features(struct vhost_device_ctx ctx, uint64_t *pu)
 {
 	struct virtio_net *dev;
+	uint32_t q_idx;
 
 	dev = get_device(ctx);
 	if (dev == NULL)
@@ -440,22 +483,26 @@ set_features(struct vhost_device_ctx ctx, uint64_t *pu)
 	dev->features = *pu;
 
 	/* Set the vhost_hlen depending on if VIRTIO_NET_F_MRG_RXBUF is set. */
-	if (dev->features & (1 << VIRTIO_NET_F_MRG_RXBUF)) {
-		LOG_DEBUG(VHOST_CONFIG,
-			"(%"PRIu64") Mergeable RX buffers enabled\n",
-			dev->device_fh);
-		dev->virtqueue[VIRTIO_RXQ]->vhost_hlen =
-			sizeof(struct virtio_net_hdr_mrg_rxbuf);
-		dev->virtqueue[VIRTIO_TXQ]->vhost_hlen =
-			sizeof(struct virtio_net_hdr_mrg_rxbuf);
-	} else {
-		LOG_DEBUG(VHOST_CONFIG,
-			"(%"PRIu64") Mergeable RX buffers disabled\n",
-			dev->device_fh);
-		dev->virtqueue[VIRTIO_RXQ]->vhost_hlen =
-			sizeof(struct virtio_net_hdr);
-		dev->virtqueue[VIRTIO_TXQ]->vhost_hlen =
-			sizeof(struct virtio_net_hdr);
+	for (q_idx = 0; q_idx < dev->num_virt_queues; q_idx++) {
+		uint32_t virt_rx_q_idx = q_idx * VIRTIO_QNUM + VIRTIO_RXQ;
+		uint32_t virt_tx_q_idx = q_idx * VIRTIO_QNUM + VIRTIO_TXQ;
+		if (dev->features & (1 << VIRTIO_NET_F_MRG_RXBUF)) {
+			LOG_DEBUG(VHOST_CONFIG,
+				"(%"PRIu64") Mergeable RX buffers enabled\n",
+				dev->device_fh);
+			dev->virtqueue[virt_rx_q_idx]->vhost_hlen =
+				sizeof(struct virtio_net_hdr_mrg_rxbuf);
+			dev->virtqueue[virt_tx_q_idx]->vhost_hlen =
+				sizeof(struct virtio_net_hdr_mrg_rxbuf);
+		} else {
+			LOG_DEBUG(VHOST_CONFIG,
+				"(%"PRIu64") Mergeable RX buffers disabled\n",
+				dev->device_fh);
+			dev->virtqueue[virt_rx_q_idx]->vhost_hlen =
+				sizeof(struct virtio_net_hdr);
+			dev->virtqueue[virt_tx_q_idx]->vhost_hlen =
+				sizeof(struct virtio_net_hdr);
+		}
 	}
 	return 0;
 }
@@ -736,6 +783,14 @@ int rte_vhost_feature_enable(uint64_t feature_mask)
 	return -1;
 }
 
+uint16_t rte_vhost_qp_num_get(struct virtio_net *dev)
+{
+	if (dev == NULL)
+		return 0;
+
+	return dev->num_virt_queues;
+}
+
 /*
  * Register ops so that we can add/remove device to data core.
  */
-- 
1.8.4.2

^ permalink raw reply	[flat|nested] 65+ messages in thread

* [dpdk-dev] [PATCH v2 3/7] lib_vhost: Set memory layout for multiple queues mode
  2015-06-10  5:52 ` [dpdk-dev] [PATCH v2 0/7] " Ouyang Changchun
  2015-06-10  5:52   ` [dpdk-dev] [PATCH v2 1/7] ixgbe: Support VMDq RSS in non-SRIOV environment Ouyang Changchun
  2015-06-10  5:52   ` [dpdk-dev] [PATCH v2 2/7] lib_vhost: Support multiple queues in virtio dev Ouyang Changchun
@ 2015-06-10  5:52   ` Ouyang Changchun
  2015-06-10  5:52   ` [dpdk-dev] [PATCH v2 4/7] vhost: Add new command line option: rxq Ouyang Changchun
                     ` (4 subsequent siblings)
  7 siblings, 0 replies; 65+ messages in thread
From: Ouyang Changchun @ 2015-06-10  5:52 UTC (permalink / raw)
  To: dev

QEMU sends a separate command, in order, to set the memory layout for each
queue in a virtio device, so vhost needs to keep memory layout information
for each queue of the virtio device.

This also requires a small adjustment to the gpa_to_vva interface: a queue
index is introduced to specify which queue of the device to use when looking
up the vhost virtual address for an incoming guest physical address.
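
For illustration, a minimal sketch of how a data-path caller picks the
memory slot for a given ring (the queue_id / VIRTIO_QNUM mapping comes from
the diff below; desc_to_vva itself is a hypothetical helper, not part of
the patch):

/*
 * The RX/TX rings of queue pair N are ring indexes 2N and 2N+1, and both
 * rings of a pair share one memory layout slot, so the slot index is
 * queue_id / VIRTIO_QNUM (VIRTIO_QNUM == 2).
 */
static uint64_t
desc_to_vva(struct virtio_net *dev, uint16_t queue_id,
	const struct vring_desc *desc)
{
	return gpa_to_vva(dev, queue_id / VIRTIO_QNUM, desc->addr);
}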

Changes in v2
  - q_idx is changed into qp_idx
  - dynamically alloc mem for dev mem_arr
  - fix checkpatch errors

Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
---
 examples/vhost/main.c                         | 21 +++++-----
 lib/librte_vhost/rte_virtio_net.h             | 10 +++--
 lib/librte_vhost/vhost_cuse/virtio-net-cdev.c | 57 ++++++++++++++------------
 lib/librte_vhost/vhost_rxtx.c                 | 21 +++++-----
 lib/librte_vhost/vhost_user/virtio-net-user.c | 59 ++++++++++++++-------------
 lib/librte_vhost/virtio-net.c                 | 38 ++++++++++++-----
 6 files changed, 118 insertions(+), 88 deletions(-)

diff --git a/examples/vhost/main.c b/examples/vhost/main.c
index 509e9d8..408eb3f 100644
--- a/examples/vhost/main.c
+++ b/examples/vhost/main.c
@@ -1466,11 +1466,11 @@ attach_rxmbuf_zcp(struct virtio_net *dev)
 		desc = &vq->desc[desc_idx];
 		if (desc->flags & VRING_DESC_F_NEXT) {
 			desc = &vq->desc[desc->next];
-			buff_addr = gpa_to_vva(dev, desc->addr);
+			buff_addr = gpa_to_vva(dev, 0, desc->addr);
 			phys_addr = gpa_to_hpa(vdev, desc->addr, desc->len,
 					&addr_type);
 		} else {
-			buff_addr = gpa_to_vva(dev,
+			buff_addr = gpa_to_vva(dev, 0,
 					desc->addr + vq->vhost_hlen);
 			phys_addr = gpa_to_hpa(vdev,
 					desc->addr + vq->vhost_hlen,
@@ -1722,7 +1722,7 @@ virtio_dev_rx_zcp(struct virtio_net *dev, struct rte_mbuf **pkts,
 			rte_pktmbuf_data_len(buff), 0);
 
 		/* Buffer address translation for virtio header. */
-		buff_hdr_addr = gpa_to_vva(dev, desc->addr);
+		buff_hdr_addr = gpa_to_vva(dev, 0, desc->addr);
 		packet_len = rte_pktmbuf_data_len(buff) + vq->vhost_hlen;
 
 		/*
@@ -1946,7 +1946,7 @@ virtio_dev_tx_zcp(struct virtio_net *dev)
 		desc = &vq->desc[desc->next];
 
 		/* Buffer address translation. */
-		buff_addr = gpa_to_vva(dev, desc->addr);
+		buff_addr = gpa_to_vva(dev, 0, desc->addr);
 		/* Need check extra VLAN_HLEN size for inserting VLAN tag */
 		phys_addr = gpa_to_hpa(vdev, desc->addr, desc->len + VLAN_HLEN,
 			&addr_type);
@@ -2604,13 +2604,14 @@ new_device (struct virtio_net *dev)
 	dev->priv = vdev;
 
 	if (zero_copy) {
-		vdev->nregions_hpa = dev->mem->nregions;
-		for (regionidx = 0; regionidx < dev->mem->nregions; regionidx++) {
+		struct virtio_memory *dev_mem = dev->mem_arr[0];
+		vdev->nregions_hpa = dev_mem->nregions;
+		for (regionidx = 0; regionidx < dev_mem->nregions; regionidx++) {
 			vdev->nregions_hpa
 				+= check_hpa_regions(
-					dev->mem->regions[regionidx].guest_phys_address
-					+ dev->mem->regions[regionidx].address_offset,
-					dev->mem->regions[regionidx].memory_size);
+					dev_mem->regions[regionidx].guest_phys_address
+					+ dev_mem->regions[regionidx].address_offset,
+					dev_mem->regions[regionidx].memory_size);
 
 		}
 
@@ -2626,7 +2627,7 @@ new_device (struct virtio_net *dev)
 
 
 		if (fill_hpa_memory_regions(
-			vdev->regions_hpa, dev->mem
+			vdev->regions_hpa, dev_mem
 			) != vdev->nregions_hpa) {
 
 			RTE_LOG(ERR, VHOST_CONFIG,
diff --git a/lib/librte_vhost/rte_virtio_net.h b/lib/librte_vhost/rte_virtio_net.h
index 92b4bfa..7a1126e 100644
--- a/lib/librte_vhost/rte_virtio_net.h
+++ b/lib/librte_vhost/rte_virtio_net.h
@@ -95,14 +95,15 @@ struct vhost_virtqueue {
  * Device structure contains all configuration information relating to the device.
  */
 struct virtio_net {
-	struct virtio_memory	*mem;		/**< QEMU memory and memory region information. */
 	struct vhost_virtqueue	**virtqueue;    /**< Contains all virtqueue information. */
+	struct virtio_memory    **mem_arr;      /**< Array for QEMU memory and memory region information. */
 	uint64_t		features;	/**< Negotiated feature set. */
 	uint64_t		device_fh;	/**< device identifier. */
 	uint32_t		flags;		/**< Device flags. Only used to check if device is running on data core. */
 #define IF_NAME_SZ (PATH_MAX > IFNAMSIZ ? PATH_MAX : IFNAMSIZ)
 	char			ifname[IF_NAME_SZ];	/**< Name of the tap device or socket path. */
 	uint32_t                num_virt_queues;
+	uint32_t                mem_idx;        /**< Index used when setting the memory layout; unique for each queue within the virtio device. */
 	void			*priv;		/**< private context */
 } __rte_cache_aligned;
 
@@ -153,14 +154,15 @@ rte_vring_available_entries(struct virtio_net *dev, uint16_t queue_id)
  * This is used to convert guest virtio buffer addresses.
  */
 static inline uint64_t __attribute__((always_inline))
-gpa_to_vva(struct virtio_net *dev, uint64_t guest_pa)
+gpa_to_vva(struct virtio_net *dev, uint32_t q_idx, uint64_t guest_pa)
 {
 	struct virtio_memory_regions *region;
+	struct virtio_memory *dev_mem = dev->mem_arr[q_idx];
 	uint32_t regionidx;
 	uint64_t vhost_va = 0;
 
-	for (regionidx = 0; regionidx < dev->mem->nregions; regionidx++) {
-		region = &dev->mem->regions[regionidx];
+	for (regionidx = 0; regionidx < dev_mem->nregions; regionidx++) {
+		region = &dev_mem->regions[regionidx];
 		if ((guest_pa >= region->guest_phys_address) &&
 			(guest_pa <= region->guest_phys_address_end)) {
 			vhost_va = region->address_offset + guest_pa;
diff --git a/lib/librte_vhost/vhost_cuse/virtio-net-cdev.c b/lib/librte_vhost/vhost_cuse/virtio-net-cdev.c
index ae2c3fa..7a4733c 100644
--- a/lib/librte_vhost/vhost_cuse/virtio-net-cdev.c
+++ b/lib/librte_vhost/vhost_cuse/virtio-net-cdev.c
@@ -273,28 +273,32 @@ cuse_set_mem_table(struct vhost_device_ctx ctx,
 		((uint64_t)(uintptr_t)mem_regions_addr + size);
 	uint64_t base_address = 0, mapped_address, mapped_size;
 	struct virtio_net *dev;
+	struct virtio_memory *dev_mem = NULL;
 
 	dev = get_device(ctx);
 	if (dev == NULL)
-		return -1;
-
-	if (dev->mem && dev->mem->mapped_address) {
-		munmap((void *)(uintptr_t)dev->mem->mapped_address,
-			(size_t)dev->mem->mapped_size);
-		free(dev->mem);
-		dev->mem = NULL;
+		goto error;
+
+	dev_mem = dev->mem_arr[dev->mem_idx];
+	if (dev_mem && dev_mem->mapped_address) {
+		munmap((void *)(uintptr_t)dev_mem->mapped_address,
+			(size_t)dev_mem->mapped_size);
+		free(dev_mem);
+		dev->mem_arr[dev->mem_idx] = NULL;
 	}
 
-	dev->mem = calloc(1, sizeof(struct virtio_memory) +
+	dev->mem_arr[dev->mem_idx] = calloc(1, sizeof(struct virtio_memory) +
 		sizeof(struct virtio_memory_regions) * nregions);
-	if (dev->mem == NULL) {
+	dev_mem = dev->mem_arr[dev->mem_idx];
+
+	if (dev_mem == NULL) {
 		RTE_LOG(ERR, VHOST_CONFIG,
-			"(%"PRIu64") Failed to allocate memory for dev->mem\n",
-			dev->device_fh);
-		return -1;
+			"(%"PRIu64") Failed to allocate memory for dev->mem_arr[%d]\n",
+			dev->device_fh, dev->mem_idx);
+		goto error;
 	}
 
-	pregion = &dev->mem->regions[0];
+	pregion = &dev_mem->regions[0];
 
 	for (idx = 0; idx < nregions; idx++) {
 		pregion[idx].guest_phys_address =
@@ -320,14 +324,12 @@ cuse_set_mem_table(struct vhost_device_ctx ctx,
 				pregion[idx].userspace_address;
 			/* Map VM memory file */
 			if (host_memory_map(ctx.pid, base_address,
-				&mapped_address, &mapped_size) != 0) {
-				free(dev->mem);
-				dev->mem = NULL;
-				return -1;
-			}
-			dev->mem->mapped_address = mapped_address;
-			dev->mem->base_address = base_address;
-			dev->mem->mapped_size = mapped_size;
+				&mapped_address, &mapped_size) != 0)
+				goto free;
+
+			dev_mem->mapped_address = mapped_address;
+			dev_mem->base_address = base_address;
+			dev_mem->mapped_size = mapped_size;
 		}
 	}
 
@@ -335,9 +337,7 @@ cuse_set_mem_table(struct vhost_device_ctx ctx,
 	if (base_address == 0) {
 		RTE_LOG(ERR, VHOST_CONFIG,
 			"Failed to find base address of qemu memory file.\n");
-		free(dev->mem);
-		dev->mem = NULL;
-		return -1;
+		goto free;
 	}
 
 	valid_regions = nregions;
@@ -369,9 +369,16 @@ cuse_set_mem_table(struct vhost_device_ctx ctx,
 			pregion[idx].userspace_address -
 			pregion[idx].guest_phys_address;
 	}
-	dev->mem->nregions = valid_regions;
 
+	dev_mem->nregions = valid_regions;
+	dev->mem_idx = (dev->mem_idx + 1) % (dev->num_virt_queues * VIRTIO_QNUM);
 	return 0;
+
+free:
+	free(dev_mem);
+	dev->mem_arr[dev->mem_idx] = NULL;
+error:
+	return -1;
 }
 
 /*
diff --git a/lib/librte_vhost/vhost_rxtx.c b/lib/librte_vhost/vhost_rxtx.c
index 19f9518..3ed1ae3 100644
--- a/lib/librte_vhost/vhost_rxtx.c
+++ b/lib/librte_vhost/vhost_rxtx.c
@@ -119,7 +119,7 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,
 		buff = pkts[packet_success];
 
 		/* Convert from gpa to vva (guest physical addr -> vhost virtual addr) */
-		buff_addr = gpa_to_vva(dev, desc->addr);
+		buff_addr = gpa_to_vva(dev, queue_id / VIRTIO_QNUM, desc->addr);
 		/* Prefetch buffer address. */
 		rte_prefetch0((void *)(uintptr_t)buff_addr);
 
@@ -135,7 +135,7 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,
 			desc->len = vq->vhost_hlen;
 			desc = &vq->desc[desc->next];
 			/* Buffer address translation. */
-			buff_addr = gpa_to_vva(dev, desc->addr);
+			buff_addr = gpa_to_vva(dev, queue_id / VIRTIO_QNUM, desc->addr);
 			desc->len = rte_pktmbuf_data_len(buff);
 		} else {
 			buff_addr += vq->vhost_hlen;
@@ -218,9 +218,9 @@ copy_from_mbuf_to_vring(struct virtio_net *dev, uint16_t queue_id,
 	 * Convert from gpa to vva
 	 * (guest physical addr -> vhost virtual addr)
 	 */
-	vb_addr =
-		gpa_to_vva(dev, vq->buf_vec[vec_idx].buf_addr);
 	vq = dev->virtqueue[queue_id];
+	vb_addr = gpa_to_vva(dev, queue_id / VIRTIO_QNUM,
+			vq->buf_vec[vec_idx].buf_addr);
 	vb_hdr_addr = vb_addr;
 
 	/* Prefetch buffer address. */
@@ -262,8 +262,8 @@ copy_from_mbuf_to_vring(struct virtio_net *dev, uint16_t queue_id,
 		}
 
 		vec_idx++;
-		vb_addr =
-			gpa_to_vva(dev, vq->buf_vec[vec_idx].buf_addr);
+		vb_addr = gpa_to_vva(dev, queue_id / VIRTIO_QNUM,
+			vq->buf_vec[vec_idx].buf_addr);
 
 		/* Prefetch buffer address. */
 		rte_prefetch0((void *)(uintptr_t)vb_addr);
@@ -308,7 +308,7 @@ copy_from_mbuf_to_vring(struct virtio_net *dev, uint16_t queue_id,
 			}
 
 			vec_idx++;
-			vb_addr = gpa_to_vva(dev,
+			vb_addr = gpa_to_vva(dev, queue_id / VIRTIO_QNUM,
 				vq->buf_vec[vec_idx].buf_addr);
 			vb_offset = 0;
 			vb_avail = vq->buf_vec[vec_idx].buf_len;
@@ -352,7 +352,7 @@ copy_from_mbuf_to_vring(struct virtio_net *dev, uint16_t queue_id,
 
 					/* Get next buffer from buf_vec. */
 					vec_idx++;
-					vb_addr = gpa_to_vva(dev,
+					vb_addr = gpa_to_vva(dev, queue_id / VIRTIO_QNUM,
 						vq->buf_vec[vec_idx].buf_addr);
 					vb_avail =
 						vq->buf_vec[vec_idx].buf_len;
@@ -594,7 +594,7 @@ rte_vhost_dequeue_burst(struct virtio_net *dev, uint16_t queue_id,
 		desc = &vq->desc[desc->next];
 
 		/* Buffer address translation. */
-		vb_addr = gpa_to_vva(dev, desc->addr);
+		vb_addr = gpa_to_vva(dev, queue_id / VIRTIO_QNUM, desc->addr);
 		/* Prefetch buffer address. */
 		rte_prefetch0((void *)(uintptr_t)vb_addr);
 
@@ -700,7 +700,8 @@ rte_vhost_dequeue_burst(struct virtio_net *dev, uint16_t queue_id,
 					desc = &vq->desc[desc->next];
 
 					/* Buffer address translation. */
-					vb_addr = gpa_to_vva(dev, desc->addr);
+					vb_addr = gpa_to_vva(dev,
+						queue_id / VIRTIO_QNUM, desc->addr);
 					/* Prefetch buffer address. */
 					rte_prefetch0((void *)(uintptr_t)vb_addr);
 					vb_offset = 0;
diff --git a/lib/librte_vhost/vhost_user/virtio-net-user.c b/lib/librte_vhost/vhost_user/virtio-net-user.c
index b4de86d..337e7e4 100644
--- a/lib/librte_vhost/vhost_user/virtio-net-user.c
+++ b/lib/librte_vhost/vhost_user/virtio-net-user.c
@@ -70,17 +70,17 @@ get_blk_size(int fd)
 }
 
 static void
-free_mem_region(struct virtio_net *dev)
+free_mem_region(struct virtio_memory *dev_mem)
 {
 	struct orig_region_map *region;
 	unsigned int idx;
 	uint64_t alignment;
 
-	if (!dev || !dev->mem)
+	if (!dev_mem)
 		return;
 
-	region = orig_region(dev->mem, dev->mem->nregions);
-	for (idx = 0; idx < dev->mem->nregions; idx++) {
+	region = orig_region(dev_mem, dev_mem->nregions);
+	for (idx = 0; idx < dev_mem->nregions; idx++) {
 		if (region[idx].mapped_address) {
 			alignment = region[idx].blksz;
 			munmap((void *)(uintptr_t)
@@ -103,37 +103,37 @@ user_set_mem_table(struct vhost_device_ctx ctx, struct VhostUserMsg *pmsg)
 	unsigned int idx = 0;
 	struct orig_region_map *pregion_orig;
 	uint64_t alignment;
+	struct virtio_memory *dev_mem = NULL;
 
 	/* unmap old memory regions one by one*/
 	dev = get_device(ctx);
 	if (dev == NULL)
 		return -1;
 
-	/* Remove from the data plane. */
-	if (dev->flags & VIRTIO_DEV_RUNNING)
-		notify_ops->destroy_device(dev);
-
-	if (dev->mem) {
-		free_mem_region(dev);
-		free(dev->mem);
-		dev->mem = NULL;
+	dev_mem = dev->mem_arr[dev->mem_idx];
+	if (dev_mem) {
+		free_mem_region(dev_mem);
+		free(dev_mem);
+		dev->mem_arr[dev->mem_idx] = NULL;
 	}
 
-	dev->mem = calloc(1,
+	dev->mem_arr[dev->mem_idx] = calloc(1,
 		sizeof(struct virtio_memory) +
 		sizeof(struct virtio_memory_regions) * memory.nregions +
 		sizeof(struct orig_region_map) * memory.nregions);
-	if (dev->mem == NULL) {
+
+	dev_mem = dev->mem_arr[dev->mem_idx];
+	if (dev_mem == NULL) {
 		RTE_LOG(ERR, VHOST_CONFIG,
-			"(%"PRIu64") Failed to allocate memory for dev->mem\n",
-			dev->device_fh);
+			"(%"PRIu64") Failed to allocate memory for dev->mem_arr[%d]\n",
+			dev->device_fh, dev->mem_idx);
 		return -1;
 	}
-	dev->mem->nregions = memory.nregions;
+	dev_mem->nregions = memory.nregions;
 
-	pregion_orig = orig_region(dev->mem, memory.nregions);
+	pregion_orig = orig_region(dev_mem, memory.nregions);
 	for (idx = 0; idx < memory.nregions; idx++) {
-		pregion = &dev->mem->regions[idx];
+		pregion = &dev_mem->regions[idx];
 		pregion->guest_phys_address =
 			memory.regions[idx].guest_phys_addr;
 		pregion->guest_phys_address_end =
@@ -175,9 +175,9 @@ user_set_mem_table(struct vhost_device_ctx ctx, struct VhostUserMsg *pmsg)
 			pregion->guest_phys_address;
 
 		if (memory.regions[idx].guest_phys_addr == 0) {
-			dev->mem->base_address =
+			dev_mem->base_address =
 				memory.regions[idx].userspace_addr;
-			dev->mem->mapped_address =
+			dev_mem->mapped_address =
 				pregion->address_offset;
 		}
 
@@ -189,6 +189,7 @@ user_set_mem_table(struct vhost_device_ctx ctx, struct VhostUserMsg *pmsg)
 			 pregion->memory_size);
 	}
 
+	dev->mem_idx = (dev->mem_idx + 1) % (dev->num_virt_queues * VIRTIO_QNUM);
 	return 0;
 
 err_mmap:
@@ -200,8 +201,8 @@ err_mmap:
 					alignment));
 		close(pregion_orig[idx].fd);
 	}
-	free(dev->mem);
-	dev->mem = NULL;
+	free(dev_mem);
+	dev->mem_arr[dev->mem_idx] = NULL;
 	return -1;
 }
 
@@ -346,13 +347,15 @@ void
 user_destroy_device(struct vhost_device_ctx ctx)
 {
 	struct virtio_net *dev = get_device(ctx);
+	uint32_t i;
 
 	if (dev && (dev->flags & VIRTIO_DEV_RUNNING))
 		notify_ops->destroy_device(dev);
 
-	if (dev && dev->mem) {
-		free_mem_region(dev);
-		free(dev->mem);
-		dev->mem = NULL;
-	}
+	for (i = 0; i < dev->num_virt_queues; i++)
+		if (dev && dev->mem_arr[i]) {
+			free_mem_region(dev->mem_arr[i]);
+			free(dev->mem_arr[i]);
+			dev->mem_arr[i] = NULL;
+		}
 }
diff --git a/lib/librte_vhost/virtio-net.c b/lib/librte_vhost/virtio-net.c
index 58d21a8..91c3caa 100644
--- a/lib/librte_vhost/virtio-net.c
+++ b/lib/librte_vhost/virtio-net.c
@@ -75,15 +75,16 @@ static uint64_t VHOST_FEATURES = VHOST_SUPPORTED_FEATURES;
  * used to convert the ring addresses to our address space.
  */
 static uint64_t
-qva_to_vva(struct virtio_net *dev, uint64_t qemu_va)
+qva_to_vva(struct virtio_net *dev, uint32_t q_idx, uint64_t qemu_va)
 {
 	struct virtio_memory_regions *region;
 	uint64_t vhost_va = 0;
 	uint32_t regionidx = 0;
+	struct virtio_memory *dev_mem = dev->mem_arr[q_idx];
 
 	/* Find the region where the address lives. */
-	for (regionidx = 0; regionidx < dev->mem->nregions; regionidx++) {
-		region = &dev->mem->regions[regionidx];
+	for (regionidx = 0; regionidx < dev_mem->nregions; regionidx++) {
+		region = &dev_mem->regions[regionidx];
 		if ((qemu_va >= region->userspace_address) &&
 			(qemu_va <= region->userspace_address +
 			region->memory_size)) {
@@ -180,10 +181,13 @@ cleanup_device(struct virtio_net *dev)
 	uint32_t qp_idx;
 
 	/* Unmap QEMU memory file if mapped. */
-	if (dev->mem) {
-		munmap((void *)(uintptr_t)dev->mem->mapped_address,
-			(size_t)dev->mem->mapped_size);
-		free(dev->mem);
+	for (qp_idx = 0; qp_idx < dev->num_virt_queues; qp_idx++) {
+		struct virtio_memory *dev_mem = dev->mem_arr[qp_idx];
+		if (dev_mem) {
+			munmap((void *)(uintptr_t)dev_mem->mapped_address,
+				(size_t)dev_mem->mapped_size);
+			free(dev_mem);
+		}
 	}
 
 	/* Close any event notifiers opened by device. */
@@ -212,6 +216,8 @@ free_device(struct virtio_net_config_ll *ll_dev)
 	/*
 	 * Free any malloc'd memory.
 	 */
+	free(ll_dev->dev.mem_arr);
+
 	/* Free every queue pair. */
 	for (qp_idx = 0; qp_idx < ll_dev->dev.num_virt_queues; qp_idx++) {
 		uint32_t virt_rx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_RXQ;
@@ -283,7 +289,7 @@ init_device(struct virtio_net *dev)
 	 * Virtqueues have already been malloced so
 	 * we don't want to set them to NULL.
 	 */
-	vq_offset = offsetof(struct virtio_net, mem);
+	vq_offset = offsetof(struct virtio_net, features);
 
 	/* Set everything to 0. */
 	memset((void *)(uintptr_t)((uint64_t)(uintptr_t)dev + vq_offset), 0,
@@ -350,6 +356,16 @@ new_device(struct vhost_device_ctx ctx)
 		return -1;
 	}
 
+	new_ll_dev->dev.mem_arr =
+		malloc(VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MAX * sizeof(struct virtio_memory *));
+	if (new_ll_dev->dev.mem_arr == NULL) {
+		RTE_LOG(ERR, VHOST_CONFIG,
+			"(%"PRIu64") Failed to allocate memory for dev.mem_arr.\n",
+			ctx.fh);
+		free_device(new_ll_dev);
+		return -1;
+	}
+
 	/* Initialise device and virtqueues. */
 	init_device(&new_ll_dev->dev);
 
@@ -546,7 +562,7 @@ set_vring_addr(struct vhost_device_ctx ctx, struct vhost_vring_addr *addr)
 
 	/* The addresses are converted from QEMU virtual to Vhost virtual. */
 	vq->desc = (struct vring_desc *)(uintptr_t)qva_to_vva(dev,
-			addr->desc_user_addr);
+			addr->index / VIRTIO_QNUM, addr->desc_user_addr);
 	if (vq->desc == 0) {
 		RTE_LOG(ERR, VHOST_CONFIG,
 			"(%"PRIu64") Failed to find desc ring address.\n",
@@ -555,7 +571,7 @@ set_vring_addr(struct vhost_device_ctx ctx, struct vhost_vring_addr *addr)
 	}
 
 	vq->avail = (struct vring_avail *)(uintptr_t)qva_to_vva(dev,
-			addr->avail_user_addr);
+			addr->index / VIRTIO_QNUM, addr->avail_user_addr);
 	if (vq->avail == 0) {
 		RTE_LOG(ERR, VHOST_CONFIG,
 			"(%"PRIu64") Failed to find avail ring address.\n",
@@ -564,7 +580,7 @@ set_vring_addr(struct vhost_device_ctx ctx, struct vhost_vring_addr *addr)
 	}
 
 	vq->used = (struct vring_used *)(uintptr_t)qva_to_vva(dev,
-			addr->used_user_addr);
+			addr->index / VIRTIO_QNUM, addr->used_user_addr);
 	if (vq->used == 0) {
 		RTE_LOG(ERR, VHOST_CONFIG,
 			"(%"PRIu64") Failed to find used ring address.\n",
-- 
1.8.4.2

^ permalink raw reply	[flat|nested] 65+ messages in thread

* [dpdk-dev] [PATCH v2 4/7] vhost: Add new command line option: rxq
  2015-06-10  5:52 ` [dpdk-dev] [PATCH v2 0/7] " Ouyang Changchun
                     ` (2 preceding siblings ...)
  2015-06-10  5:52   ` [dpdk-dev] [PATCH v2 3/7] lib_vhost: Set memory layout for multiple queues mode Ouyang Changchun
@ 2015-06-10  5:52   ` Ouyang Changchun
  2015-06-10  5:52   ` [dpdk-dev] [PATCH v2 5/7] vhost: Support multiple queues Ouyang Changchun
                     ` (3 subsequent siblings)
  7 siblings, 0 replies; 65+ messages in thread
From: Ouyang Changchun @ 2015-06-10  5:52 UTC (permalink / raw)
  To: dev

The vhost sample needs to know how many queues the user wants to enable for
each virtio device, so add the new option '--rxq' for it.
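
As a sketch, the constraint the new option enforces boils down to the
following check (parse_num_opt and the option plumbing are the sample's own;
this standalone helper is only illustrative):

/* rxq must be a power of two in the range [1, 4], i.e. 1, 2 or 4. */
static int
rxq_is_valid(uint32_t rxq)
{
	return rxq >= 1 && rxq <= 4 && (rxq & (rxq - 1)) == 0;
}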

Changes in v2
  - refine help info
  - check if rxq = 0
  - fix checkpatch errors

Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
---
 examples/vhost/main.c | 50 ++++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 46 insertions(+), 4 deletions(-)

diff --git a/examples/vhost/main.c b/examples/vhost/main.c
index 408eb3f..09ed0ca 100644
--- a/examples/vhost/main.c
+++ b/examples/vhost/main.c
@@ -163,6 +163,9 @@ static int mergeable;
 /* Do vlan strip on host, enabled on default */
 static uint32_t vlan_strip = 1;
 
+/* Rx queue number per virtio device */
+static uint32_t rxq = 1;
+
 /* number of descriptors to apply*/
 static uint32_t num_rx_descriptor = RTE_TEST_RX_DESC_DEFAULT_ZCP;
 static uint32_t num_tx_descriptor = RTE_TEST_TX_DESC_DEFAULT_ZCP;
@@ -408,8 +411,19 @@ port_init(uint8_t port)
 		txconf->tx_deferred_start = 1;
 	}
 
-	/*configure the number of supported virtio devices based on VMDQ limits */
-	num_devices = dev_info.max_vmdq_pools;
+	/* Configure the virtio devices num based on VMDQ limits */
+	switch (rxq) {
+	case 1:
+	case 2:
+		num_devices = dev_info.max_vmdq_pools;
+		break;
+	case 4:
+		num_devices = dev_info.max_vmdq_pools / 2;
+		break;
+	default:
+		RTE_LOG(ERR, VHOST_CONFIG, "rxq invalid for VMDq.\n");
+		return -1;
+	}
 
 	if (zero_copy) {
 		rx_ring_size = num_rx_descriptor;
@@ -431,7 +445,7 @@ port_init(uint8_t port)
 		return retval;
 	/* NIC queues are divided into pf queues and vmdq queues.  */
 	num_pf_queues = dev_info.max_rx_queues - dev_info.vmdq_queue_num;
-	queues_per_pool = dev_info.vmdq_queue_num / dev_info.max_vmdq_pools;
+	queues_per_pool = dev_info.vmdq_queue_num / num_devices;
 	num_vmdq_queues = num_devices * queues_per_pool;
 	num_queues = num_pf_queues + num_vmdq_queues;
 	vmdq_queue_base = dev_info.vmdq_queue_base;
@@ -576,7 +590,8 @@ us_vhost_usage(const char *prgname)
 	"		--rx-desc-num [0-N]: the number of descriptors on rx, "
 			"used only when zero copy is enabled.\n"
 	"		--tx-desc-num [0-N]: the number of descriptors on tx, "
-			"used only when zero copy is enabled.\n",
+			"used only when zero copy is enabled.\n"
+	"		--rxq [1,2,4]: rx queue number for each vhost device\n",
 	       prgname);
 }
 
@@ -602,6 +617,7 @@ us_vhost_parse_args(int argc, char **argv)
 		{"zero-copy", required_argument, NULL, 0},
 		{"rx-desc-num", required_argument, NULL, 0},
 		{"tx-desc-num", required_argument, NULL, 0},
+		{"rxq", required_argument, NULL, 0},
 		{NULL, 0, 0, 0},
 	};
 
@@ -778,6 +794,19 @@ us_vhost_parse_args(int argc, char **argv)
 				}
 			}
 
+			/* Specify the Rx queue number for each vhost dev. */
+			if (!strncmp(long_option[option_index].name,
+				"rxq", MAX_LONG_OPT_SZ)) {
+				ret = parse_num_opt(optarg, 4);
+				if ((ret == -1) || (ret == 0) || (!POWEROF2(ret))) {
+					RTE_LOG(INFO, VHOST_CONFIG,
+					"Valid value for rxq is [1,2,4]\n");
+					us_vhost_usage(prgname);
+					return -1;
+				} else {
+					rxq = ret;
+				}
+			}
 			break;
 
 			/* Invalid option - print options. */
@@ -813,6 +842,19 @@ us_vhost_parse_args(int argc, char **argv)
 		return -1;
 	}
 
+	if (rxq > 1) {
+		vmdq_conf_default.rxmode.mq_mode = ETH_MQ_RX_VMDQ_RSS;
+		vmdq_conf_default.rx_adv_conf.rss_conf.rss_hf = ETH_RSS_IP |
+				ETH_RSS_UDP | ETH_RSS_TCP | ETH_RSS_SCTP;
+	}
+
+	if ((zero_copy == 1) && (rxq > 1)) {
+		RTE_LOG(INFO, VHOST_PORT,
+			"Vhost zero copy doesn't support mq mode,"
+			"please specify '--rxq 1' to disable it.\n");
+		return -1;
+	}
+
 	return 0;
 }
 
-- 
1.8.4.2

^ permalink raw reply	[flat|nested] 65+ messages in thread

* [dpdk-dev] [PATCH v2 5/7] vhost: Support multiple queues
  2015-06-10  5:52 ` [dpdk-dev] [PATCH v2 0/7] " Ouyang Changchun
                     ` (3 preceding siblings ...)
  2015-06-10  5:52   ` [dpdk-dev] [PATCH v2 4/7] vhost: Add new command line option: rxq Ouyang Changchun
@ 2015-06-10  5:52   ` Ouyang Changchun
  2015-06-10  5:52   ` [dpdk-dev] [PATCH v2 6/7] virtio: Resolve for control queue Ouyang Changchun
                     ` (2 subsequent siblings)
  7 siblings, 0 replies; 65+ messages in thread
From: Ouyang Changchun @ 2015-06-10  5:52 UTC (permalink / raw)
  To: dev

The vhost sample leverages VMDq+RSS in HW to receive packets and distribute
them into different queues in the pool according to their 5-tuples.

It also enables multiple queues mode in the vhost/virtio layer.

The number of HW queues per pool must be exactly the same as the queue
number in each virtio device, e.g. rxq = 4 means 4 HW queues in each VMDq
pool and 4 queues in each virtio device/port, with each pool queue mapping
to one queue of the virtio device.

=========================================
==================|   |==================|
       vport0     |   |      vport1      |
---  ---  ---  ---|   |---  ---  ---  ---|
q0 | q1 | q2 | q3 |   |q0 | q1 | q2 | q3 |
/\= =/\= =/\= =/\=|   |/\= =/\= =/\= =/\=|
||   ||   ||   ||      ||   ||   ||   ||
||   ||   ||   ||      ||   ||   ||   ||
||= =||= =||= =||=|   =||== ||== ||== ||=|
q0 | q1 | q2 | q3 |   |q0 | q1 | q2 | q3 |

------------------|   |------------------|
     VMDq pool0   |   |    VMDq pool1    |
==================|   |==================|

On the RX side, it first polls each queue of the pool, gets the packets from
it and enqueues them into the corresponding queue of the virtio device/port.
On the TX side, it dequeues packets from each queue of the virtio device/port
and sends them to either a physical port or another virtio device according
to their destination MAC address.
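
A condensed sketch of the per-queue RX path described above (DPDK headers
and the sample's globals ports[] and MAX_PKT_BURST are assumed; the retry
and stats logic in the real diff is omitted):

static void
drain_vmdq_pool(struct vhost_dev *vdev, struct virtio_net *dev,
	uint32_t rxq)
{
	struct rte_mbuf *pkts[MAX_PKT_BURST];
	uint32_t i;
	uint16_t n;

	for (i = 0; i < rxq; i++) {
		n = rte_eth_rx_burst(ports[0],
			(uint16_t)(vdev->vmdq_rx_q + i),
			pkts, MAX_PKT_BURST);
		if (n == 0)
			continue;
		/* Pool queue i feeds RX ring i of the same virtio device:
		 * ring index = VIRTIO_RXQ + i * VIRTIO_QNUM.
		 */
		rte_vhost_enqueue_burst(dev, VIRTIO_RXQ + i * VIRTIO_QNUM,
			pkts, n);
		/* The enqueue copies packet data, so all mbufs are freed
		 * here regardless of how many were accepted. */
		while (n)
			rte_pktmbuf_free(pkts[--n]);
	}
}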

Changes in v2:
  - check queue num per pool in VMDq and queue pair number per vhost device
  - remove the unnecessary calling q_num_set api
  - fix checkpatch errors

Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
---
 examples/vhost/main.c | 132 ++++++++++++++++++++++++++++++--------------------
 1 file changed, 79 insertions(+), 53 deletions(-)

diff --git a/examples/vhost/main.c b/examples/vhost/main.c
index 09ed0ca..76b6ae7 100644
--- a/examples/vhost/main.c
+++ b/examples/vhost/main.c
@@ -1002,8 +1002,9 @@ link_vmdq(struct vhost_dev *vdev, struct rte_mbuf *m)
 
 	/* Enable stripping of the vlan tag as we handle routing. */
 	if (vlan_strip)
-		rte_eth_dev_set_vlan_strip_on_queue(ports[0],
-			(uint16_t)vdev->vmdq_rx_q, 1);
+		for (i = 0; i < (int)rxq; i++)
+			rte_eth_dev_set_vlan_strip_on_queue(ports[0],
+				(uint16_t)(vdev->vmdq_rx_q + i), 1);
 
 	/* Set device as ready for RX. */
 	vdev->ready = DEVICE_RX;
@@ -1018,7 +1019,7 @@ link_vmdq(struct vhost_dev *vdev, struct rte_mbuf *m)
 static inline void
 unlink_vmdq(struct vhost_dev *vdev)
 {
-	unsigned i = 0;
+	unsigned i = 0, j = 0;
 	unsigned rx_count;
 	struct rte_mbuf *pkts_burst[MAX_PKT_BURST];
 
@@ -1031,15 +1032,19 @@ unlink_vmdq(struct vhost_dev *vdev)
 		vdev->vlan_tag = 0;
 
 		/*Clear out the receive buffers*/
-		rx_count = rte_eth_rx_burst(ports[0],
-					(uint16_t)vdev->vmdq_rx_q, pkts_burst, MAX_PKT_BURST);
+		for (i = 0; i < rxq; i++) {
+			rx_count = rte_eth_rx_burst(ports[0],
+					(uint16_t)vdev->vmdq_rx_q + i,
+					pkts_burst, MAX_PKT_BURST);
 
-		while (rx_count) {
-			for (i = 0; i < rx_count; i++)
-				rte_pktmbuf_free(pkts_burst[i]);
+			while (rx_count) {
+				for (j = 0; j < rx_count; j++)
+					rte_pktmbuf_free(pkts_burst[j]);
 
-			rx_count = rte_eth_rx_burst(ports[0],
-					(uint16_t)vdev->vmdq_rx_q, pkts_burst, MAX_PKT_BURST);
+				rx_count = rte_eth_rx_burst(ports[0],
+					(uint16_t)vdev->vmdq_rx_q + i,
+					pkts_burst, MAX_PKT_BURST);
+			}
 		}
 
 		vdev->ready = DEVICE_MAC_LEARNING;
@@ -1051,7 +1056,7 @@ unlink_vmdq(struct vhost_dev *vdev)
  * the packet on that devices RX queue. If not then return.
  */
 static inline int __attribute__((always_inline))
-virtio_tx_local(struct vhost_dev *vdev, struct rte_mbuf *m)
+virtio_tx_local(struct vhost_dev *vdev, struct rte_mbuf *m, uint32_t q_idx)
 {
 	struct virtio_net_data_ll *dev_ll;
 	struct ether_hdr *pkt_hdr;
@@ -1066,7 +1071,7 @@ virtio_tx_local(struct vhost_dev *vdev, struct rte_mbuf *m)
 
 	while (dev_ll != NULL) {
 		if ((dev_ll->vdev->ready == DEVICE_RX) && ether_addr_cmp(&(pkt_hdr->d_addr),
-				          &dev_ll->vdev->mac_address)) {
+					&dev_ll->vdev->mac_address)) {
 
 			/* Drop the packet if the TX packet is destined for the TX device. */
 			if (dev_ll->vdev->dev->device_fh == dev->device_fh) {
@@ -1084,7 +1089,9 @@ virtio_tx_local(struct vhost_dev *vdev, struct rte_mbuf *m)
 				LOG_DEBUG(VHOST_DATA, "(%"PRIu64") Device is marked for removal\n", tdev->device_fh);
 			} else {
 				/*send the packet to the local virtio device*/
-				ret = rte_vhost_enqueue_burst(tdev, VIRTIO_RXQ, &m, 1);
+				ret = rte_vhost_enqueue_burst(tdev,
+					VIRTIO_RXQ + q_idx * VIRTIO_QNUM,
+					&m, 1);
 				if (enable_stats) {
 					rte_atomic64_add(
 					&dev_statistics[tdev->device_fh].rx_total_atomic,
@@ -1161,7 +1168,8 @@ find_local_dest(struct virtio_net *dev, struct rte_mbuf *m,
  * or the physical port.
  */
 static inline void __attribute__((always_inline))
-virtio_tx_route(struct vhost_dev *vdev, struct rte_mbuf *m, uint16_t vlan_tag)
+virtio_tx_route(struct vhost_dev *vdev, struct rte_mbuf *m,
+		uint16_t vlan_tag, uint32_t q_idx)
 {
 	struct mbuf_table *tx_q;
 	struct rte_mbuf **m_table;
@@ -1171,7 +1179,8 @@ virtio_tx_route(struct vhost_dev *vdev, struct rte_mbuf *m, uint16_t vlan_tag)
 	struct ether_hdr *nh;
 
 	/*check if destination is local VM*/
-	if ((vm2vm_mode == VM2VM_SOFTWARE) && (virtio_tx_local(vdev, m) == 0)) {
+	if ((vm2vm_mode == VM2VM_SOFTWARE) &&
+		(virtio_tx_local(vdev, m, q_idx) == 0)) {
 		rte_pktmbuf_free(m);
 		return;
 	}
@@ -1335,49 +1344,60 @@ switch_worker(__attribute__((unused)) void *arg)
 			}
 			if (likely(vdev->ready == DEVICE_RX)) {
 				/*Handle guest RX*/
-				rx_count = rte_eth_rx_burst(ports[0],
-					vdev->vmdq_rx_q, pkts_burst, MAX_PKT_BURST);
+				for (i = 0; i < rxq; i++) {
+					rx_count = rte_eth_rx_burst(ports[0],
+						vdev->vmdq_rx_q + i, pkts_burst, MAX_PKT_BURST);
 
-				if (rx_count) {
-					/*
-					* Retry is enabled and the queue is full then we wait and retry to avoid packet loss
-					* Here MAX_PKT_BURST must be less than virtio queue size
-					*/
-					if (enable_retry && unlikely(rx_count > rte_vring_available_entries(dev, VIRTIO_RXQ))) {
-						for (retry = 0; retry < burst_rx_retry_num; retry++) {
-							rte_delay_us(burst_rx_delay_time);
-							if (rx_count <= rte_vring_available_entries(dev, VIRTIO_RXQ))
-								break;
+					if (rx_count) {
+						/*
+						* Retry is enabled and the queue is full then we wait and retry to avoid packet loss
+						* Here MAX_PKT_BURST must be less than virtio queue size
+						*/
+						if (enable_retry && unlikely(rx_count > rte_vring_available_entries(dev,
+											VIRTIO_RXQ + i * VIRTIO_QNUM))) {
+							for (retry = 0; retry < burst_rx_retry_num; retry++) {
+								rte_delay_us(burst_rx_delay_time);
+								if (rx_count <= rte_vring_available_entries(dev,
+											VIRTIO_RXQ + i * VIRTIO_QNUM))
+									break;
+							}
+						}
+						ret_count = rte_vhost_enqueue_burst(dev, VIRTIO_RXQ + i * VIRTIO_QNUM,
+											pkts_burst, rx_count);
+						if (enable_stats) {
+							rte_atomic64_add(
+							&dev_statistics[dev_ll->vdev->dev->device_fh].rx_total_atomic,
+							rx_count);
+							rte_atomic64_add(
+							&dev_statistics[dev_ll->vdev->dev->device_fh].rx_atomic, ret_count);
+						}
+						while (likely(rx_count)) {
+							rx_count--;
+							rte_pktmbuf_free(pkts_burst[rx_count]);
 						}
 					}
-					ret_count = rte_vhost_enqueue_burst(dev, VIRTIO_RXQ, pkts_burst, rx_count);
-					if (enable_stats) {
-						rte_atomic64_add(
-						&dev_statistics[dev_ll->vdev->dev->device_fh].rx_total_atomic,
-						rx_count);
-						rte_atomic64_add(
-						&dev_statistics[dev_ll->vdev->dev->device_fh].rx_atomic, ret_count);
-					}
-					while (likely(rx_count)) {
-						rx_count--;
-						rte_pktmbuf_free(pkts_burst[rx_count]);
-					}
-
 				}
 			}
 
 			if (likely(!vdev->remove)) {
 				/* Handle guest TX*/
-				tx_count = rte_vhost_dequeue_burst(dev, VIRTIO_TXQ, mbuf_pool, pkts_burst, MAX_PKT_BURST);
-				/* If this is the first received packet we need to learn the MAC and setup VMDQ */
-				if (unlikely(vdev->ready == DEVICE_MAC_LEARNING) && tx_count) {
-					if (vdev->remove || (link_vmdq(vdev, pkts_burst[0]) == -1)) {
-						while (tx_count)
-							rte_pktmbuf_free(pkts_burst[--tx_count]);
+				for (i = 0; i < rxq; i++) {
+					tx_count = rte_vhost_dequeue_burst(dev, VIRTIO_TXQ + i * 2,
+							mbuf_pool, pkts_burst, MAX_PKT_BURST);
+					/*
+					 * If this is the first received packet we need to learn
+					 * the MAC and setup VMDQ
+					 */
+					if (unlikely(vdev->ready == DEVICE_MAC_LEARNING) && tx_count) {
+						if (vdev->remove || (link_vmdq(vdev, pkts_burst[0]) == -1)) {
+							while (tx_count)
+								rte_pktmbuf_free(pkts_burst[--tx_count]);
+						}
 					}
+					while (tx_count)
+						virtio_tx_route(vdev, pkts_burst[--tx_count],
+								(uint16_t)dev->device_fh, i);
 				}
-				while (tx_count)
-					virtio_tx_route(vdev, pkts_burst[--tx_count], (uint16_t)dev->device_fh);
 			}
 
 			/*move to the next device in the list*/
@@ -2636,6 +2656,13 @@ new_device (struct virtio_net *dev)
 	struct vhost_dev *vdev;
 	uint32_t regionidx;
 
+	if ((rxq > 1) && (dev->num_virt_queues != rxq)) {
+		RTE_LOG(ERR, VHOST_DATA, "(%"PRIu64") queue num in VMDq pool:"
+			"%d != queue pair num in vhost dev:%d\n",
+			dev->device_fh, rxq, dev->num_virt_queues);
+		return -1;
+	}
+
 	vdev = rte_zmalloc("vhost device", sizeof(*vdev), RTE_CACHE_LINE_SIZE);
 	if (vdev == NULL) {
 		RTE_LOG(INFO, VHOST_DATA, "(%"PRIu64") Couldn't allocate memory for vhost dev\n",
@@ -2681,12 +2708,12 @@ new_device (struct virtio_net *dev)
 		}
 	}
 
-
 	/* Add device to main ll */
 	ll_dev = get_data_ll_free_entry(&ll_root_free);
 	if (ll_dev == NULL) {
-		RTE_LOG(INFO, VHOST_DATA, "(%"PRIu64") No free entry found in linked list. Device limit "
-			"of %d devices per core has been reached\n",
+		RTE_LOG(INFO, VHOST_DATA,
+			"(%"PRIu64") No free entry found in linked list."
+			"Device limit of %d devices per core has been reached\n",
 			dev->device_fh, num_devices);
 		if (vdev->regions_hpa)
 			rte_free(vdev->regions_hpa);
@@ -2695,8 +2722,7 @@ new_device (struct virtio_net *dev)
 	}
 	ll_dev->vdev = vdev;
 	add_data_ll_entry(&ll_root_used, ll_dev);
-	vdev->vmdq_rx_q
-		= dev->device_fh * queues_per_pool + vmdq_queue_base;
+	vdev->vmdq_rx_q	= dev->device_fh * rxq + vmdq_queue_base;
 
 	if (zero_copy) {
 		uint32_t index = vdev->vmdq_rx_q;
-- 
1.8.4.2

^ permalink raw reply	[flat|nested] 65+ messages in thread

* [dpdk-dev] [PATCH v2 6/7] virtio: Resolve for control queue
  2015-06-10  5:52 ` [dpdk-dev] [PATCH v2 0/7] " Ouyang Changchun
                     ` (4 preceding siblings ...)
  2015-06-10  5:52   ` [dpdk-dev] [PATCH v2 5/7] vhost: Support multiple queues Ouyang Changchun
@ 2015-06-10  5:52   ` Ouyang Changchun
  2015-06-10  5:52   ` [dpdk-dev] [PATCH v2 7/7] vhost: Add per queue stats info Ouyang Changchun
  2015-06-15  7:56   ` [dpdk-dev] [PATCH v3 0/9] Support multiple queues in vhost Ouyang Changchun
  7 siblings, 0 replies; 65+ messages in thread
From: Ouyang Changchun @ 2015-06-10  5:52 UTC (permalink / raw)
  To: dev

The control queue can't work in vhost-user multiple queue mode, so introduce
a counter to avoid a dead loop when polling the control queue.
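
In outline, the bounded poll inside virtio_send_command looks like this
(CQ_POLL_COUNTER and the 100 us sleep are from the diff below; the
arithmetic is just the implied upper bound):

	uint32_t cq_poll = CQ_POLL_COUNTER;	/* 500 iterations */

	/* Each iteration sleeps 100 us, so an unresponsive backend stalls
	 * the caller for at most 500 * 100 us = 50 ms instead of forever.
	 */
	while ((vq->vq_used_cons_idx == vq->vq_ring.used->idx) &&
		(cq_poll != 0)) {
		rte_rmb();
		usleep(100);
		cq_poll--;
	}
	if (cq_poll == 0)
		result.status = 0;	/* timed out: use a fixed status
					 * instead of the absent reply */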

Changes in v2:
  - fix checkpatch errors

Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
---
 drivers/net/virtio/virtio_ethdev.c | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c
index f74e413..9618f4f 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -61,6 +61,7 @@
 #include "virtio_logs.h"
 #include "virtqueue.h"
 
+#define CQ_POLL_COUNTER 500 /* Avoid dead loop when polling control queue */
 
 static int eth_virtio_dev_init(struct rte_eth_dev *eth_dev);
 static int  virtio_dev_configure(struct rte_eth_dev *dev);
@@ -118,6 +119,7 @@ virtio_send_command(struct virtqueue *vq, struct virtio_pmd_ctrl *ctrl,
 	int k, sum = 0;
 	virtio_net_ctrl_ack status = ~0;
 	struct virtio_pmd_ctrl result;
+	uint32_t cq_poll = CQ_POLL_COUNTER;
 
 	ctrl->status = status;
 
@@ -177,9 +179,15 @@ virtio_send_command(struct virtqueue *vq, struct virtio_pmd_ctrl *ctrl,
 	virtqueue_notify(vq);
 
 	rte_rmb();
-	while (vq->vq_used_cons_idx == vq->vq_ring.used->idx) {
+
+	/**
+	 * FIXME: The control queue doesn't work for vhost-user
+	 * multiple queue mode; introduce cq_poll to avoid the dead loop.
+	 */
+	while ((vq->vq_used_cons_idx == vq->vq_ring.used->idx) && (cq_poll != 0)) {
 		rte_rmb();
 		usleep(100);
+		cq_poll--;
 	}
 
 	while (vq->vq_used_cons_idx != vq->vq_ring.used->idx) {
@@ -207,7 +215,10 @@ virtio_send_command(struct virtqueue *vq, struct virtio_pmd_ctrl *ctrl,
 	PMD_INIT_LOG(DEBUG, "vq->vq_free_cnt=%d\nvq->vq_desc_head_idx=%d",
 			vq->vq_free_cnt, vq->vq_desc_head_idx);
 
-	memcpy(&result, vq->virtio_net_hdr_mz->addr,
+	if (cq_poll == 0)
+		result.status = 0;
+	else
+		memcpy(&result, vq->virtio_net_hdr_mz->addr,
 			sizeof(struct virtio_pmd_ctrl));
 
 	return result.status;
-- 
1.8.4.2

^ permalink raw reply	[flat|nested] 65+ messages in thread

* [dpdk-dev] [PATCH v2 7/7] vhost: Add per queue stats info
  2015-06-10  5:52 ` [dpdk-dev] [PATCH v2 0/7] " Ouyang Changchun
                     ` (5 preceding siblings ...)
  2015-06-10  5:52   ` [dpdk-dev] [PATCH v2 6/7] virtio: Resolve for control queue Ouyang Changchun
@ 2015-06-10  5:52   ` Ouyang Changchun
  2015-06-15  7:56   ` [dpdk-dev] [PATCH v3 0/9] Support multiple queues in vhost Ouyang Changchun
  7 siblings, 0 replies; 65+ messages in thread
From: Ouyang Changchun @ 2015-06-10  5:52 UTC (permalink / raw)
  To: dev

Add per-queue stats info to the vhost sample, so each queue of a virtio
device is accounted separately.
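
For example, a caller could recover the old per-device totals by summing the
new per-queue counters (device_tx_total is a hypothetical helper, not part
of the patch; rxq is the number of queue pairs in use):

static uint64_t
device_tx_total(const struct device_statistics *ds, uint32_t rxq)
{
	uint64_t total = 0;
	uint32_t i;

	for (i = 0; i < rxq; i++)
		total += ds->qp_stats[i].tx_total;
	return total;
}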

Changes in v2
  - fix the stats issue in tx_local
  - dynamically alloc mem for queue pair stats info
  - fix checkpatch errors

Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
---
 examples/vhost/main.c | 125 +++++++++++++++++++++++++++++++-------------------
 1 file changed, 78 insertions(+), 47 deletions(-)

diff --git a/examples/vhost/main.c b/examples/vhost/main.c
index 76b6ae7..e4202dd 100644
--- a/examples/vhost/main.c
+++ b/examples/vhost/main.c
@@ -314,7 +314,7 @@ struct ipv4_hdr {
 #define VLAN_ETH_HLEN   18
 
 /* Per-device statistics struct */
-struct device_statistics {
+struct qp_statistics {
 	uint64_t tx_total;
 	rte_atomic64_t rx_total_atomic;
 	uint64_t rx_total;
@@ -322,6 +322,10 @@ struct device_statistics {
 	rte_atomic64_t rx_atomic;
 	uint64_t rx;
 } __rte_cache_aligned;
+
+struct device_statistics {
+	struct qp_statistics *qp_stats;
+};
 struct device_statistics dev_statistics[MAX_DEVICES];
 
 /*
@@ -738,6 +742,16 @@ us_vhost_parse_args(int argc, char **argv)
 					return -1;
 				} else {
 					enable_stats = ret;
+					for (i = 0; i < MAX_DEVICES; i++) {
+						dev_statistics[i].qp_stats =
+							malloc(VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MAX * sizeof(struct qp_statistics));
+						if (dev_statistics[i].qp_stats == NULL) {
+							RTE_LOG(ERR, VHOST_CONFIG, "Failed to allocate memory for qp stats.\n");
+							return -1;
+						}
+						memset(dev_statistics[i].qp_stats, 0,
+							VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MAX * sizeof(struct qp_statistics));
+					}
 				}
 			}
 
@@ -1094,13 +1108,13 @@ virtio_tx_local(struct vhost_dev *vdev, struct rte_mbuf *m, uint32_t q_idx)
 					&m, 1);
 				if (enable_stats) {
 					rte_atomic64_add(
-					&dev_statistics[tdev->device_fh].rx_total_atomic,
+					&dev_statistics[tdev->device_fh].qp_stats[q_idx].rx_total_atomic,
 					1);
 					rte_atomic64_add(
-					&dev_statistics[tdev->device_fh].rx_atomic,
+					&dev_statistics[tdev->device_fh].qp_stats[q_idx].rx_atomic,
 					ret);
-					dev_statistics[tdev->device_fh].tx_total++;
-					dev_statistics[tdev->device_fh].tx += ret;
+					dev_statistics[dev->device_fh].qp_stats[q_idx].tx_total++;
+					dev_statistics[dev->device_fh].qp_stats[q_idx].tx += ret;
 				}
 			}
 
@@ -1234,8 +1248,8 @@ virtio_tx_route(struct vhost_dev *vdev, struct rte_mbuf *m,
 	tx_q->m_table[len] = m;
 	len++;
 	if (enable_stats) {
-		dev_statistics[dev->device_fh].tx_total++;
-		dev_statistics[dev->device_fh].tx++;
+		dev_statistics[dev->device_fh].qp_stats[q_idx].tx_total++;
+		dev_statistics[dev->device_fh].qp_stats[q_idx].tx++;
 	}
 
 	if (unlikely(len == MAX_PKT_BURST)) {
@@ -1366,10 +1380,10 @@ switch_worker(__attribute__((unused)) void *arg)
 											pkts_burst, rx_count);
 						if (enable_stats) {
 							rte_atomic64_add(
-							&dev_statistics[dev_ll->vdev->dev->device_fh].rx_total_atomic,
+							&dev_statistics[dev_ll->vdev->dev->device_fh].qp_stats[i].rx_total_atomic,
 							rx_count);
 							rte_atomic64_add(
-							&dev_statistics[dev_ll->vdev->dev->device_fh].rx_atomic, ret_count);
+							&dev_statistics[dev_ll->vdev->dev->device_fh].qp_stats[i].rx_atomic, ret_count);
 						}
 						while (likely(rx_count)) {
 							rx_count--;
@@ -1919,8 +1933,8 @@ virtio_tx_route_zcp(struct virtio_net *dev, struct rte_mbuf *m,
 		(mbuf->next == NULL) ? "null" : "non-null");
 
 	if (enable_stats) {
-		dev_statistics[dev->device_fh].tx_total++;
-		dev_statistics[dev->device_fh].tx++;
+		dev_statistics[dev->device_fh].qp_stats[0].tx_total++;
+		dev_statistics[dev->device_fh].qp_stats[0].tx++;
 	}
 
 	if (unlikely(len == MAX_PKT_BURST)) {
@@ -2203,9 +2217,9 @@ switch_worker_zcp(__attribute__((unused)) void *arg)
 					ret_count = virtio_dev_rx_zcp(dev,
 							pkts_burst, rx_count);
 					if (enable_stats) {
-						dev_statistics[dev->device_fh].rx_total
+						dev_statistics[dev->device_fh].qp_stats[0].rx_total
 							+= rx_count;
-						dev_statistics[dev->device_fh].rx
+						dev_statistics[dev->device_fh].qp_stats[0].rx
 							+= ret_count;
 					}
 					while (likely(rx_count)) {
@@ -2825,7 +2839,9 @@ new_device (struct virtio_net *dev)
 	add_data_ll_entry(&lcore_info[vdev->coreid].lcore_ll->ll_root_used, ll_dev);
 
 	/* Initialize device stats */
-	memset(&dev_statistics[dev->device_fh], 0, sizeof(struct device_statistics));
+	if (enable_stats)
+		memset(dev_statistics[dev->device_fh].qp_stats, 0,
+			VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MAX * sizeof(struct qp_statistics));
 
 	/* Disable notifications. */
 	rte_vhost_enable_guest_notification(dev, VIRTIO_RXQ, 0);
@@ -2858,7 +2874,7 @@ print_stats(void)
 	struct virtio_net_data_ll *dev_ll;
 	uint64_t tx_dropped, rx_dropped;
 	uint64_t tx, tx_total, rx, rx_total;
-	uint32_t device_fh;
+	uint32_t device_fh, i;
 	const char clr[] = { 27, '[', '2', 'J', '\0' };
 	const char top_left[] = { 27, '[', '1', ';', '1', 'H','\0' };
 
@@ -2873,35 +2889,53 @@ print_stats(void)
 		dev_ll = ll_root_used;
 		while (dev_ll != NULL) {
 			device_fh = (uint32_t)dev_ll->vdev->dev->device_fh;
-			tx_total = dev_statistics[device_fh].tx_total;
-			tx = dev_statistics[device_fh].tx;
-			tx_dropped = tx_total - tx;
-			if (zero_copy == 0) {
-				rx_total = rte_atomic64_read(
-					&dev_statistics[device_fh].rx_total_atomic);
-				rx = rte_atomic64_read(
-					&dev_statistics[device_fh].rx_atomic);
-			} else {
-				rx_total = dev_statistics[device_fh].rx_total;
-				rx = dev_statistics[device_fh].rx;
-			}
-			rx_dropped = rx_total - rx;
-
-			printf("\nStatistics for device %"PRIu32" ------------------------------"
-					"\nTX total: 		%"PRIu64""
-					"\nTX dropped: 		%"PRIu64""
-					"\nTX successful: 		%"PRIu64""
-					"\nRX total: 		%"PRIu64""
-					"\nRX dropped: 		%"PRIu64""
-					"\nRX successful: 		%"PRIu64"",
-					device_fh,
-					tx_total,
-					tx_dropped,
-					tx,
-					rx_total,
-					rx_dropped,
-					rx);
-
+			for (i = 0; i < rxq; i++) {
+				tx_total = dev_statistics[device_fh].qp_stats[i].tx_total;
+				tx = dev_statistics[device_fh].qp_stats[i].tx;
+				tx_dropped = tx_total - tx;
+				if (zero_copy == 0) {
+					rx_total = rte_atomic64_read(
+						&dev_statistics[device_fh].qp_stats[i].rx_total_atomic);
+					rx = rte_atomic64_read(
+						&dev_statistics[device_fh].qp_stats[i].rx_atomic);
+				} else {
+					rx_total = dev_statistics[device_fh].qp_stats[0].rx_total;
+					rx = dev_statistics[device_fh].qp_stats[0].rx;
+				}
+				rx_dropped = rx_total - rx;
+
+				if (rxq > 1)
+					printf("\nStatistics for device %"PRIu32" queue id: %d------------------"
+						"\nTX total:		%"PRIu64""
+						"\nTX dropped:		%"PRIu64""
+						"\nTX successful:	%"PRIu64""
+						"\nRX total:		%"PRIu64""
+						"\nRX dropped:		%"PRIu64""
+						"\nRX successful:	%"PRIu64"",
+						device_fh,
+						i,
+						tx_total,
+						tx_dropped,
+						tx,
+						rx_total,
+						rx_dropped,
+						rx);
+				else
+					printf("\nStatistics for device %"PRIu32" ------------------------------"
+						"\nTX total:		%"PRIu64""
+						"\nTX dropped:		%"PRIu64""
+						"\nTX successful:	%"PRIu64""
+						"\nRX total:		%"PRIu64""
+						"\nRX dropped:		%"PRIu64""
+						"\nRX successful:	%"PRIu64"",
+						device_fh,
+						tx_total,
+						tx_dropped,
+						tx,
+						rx_total,
+						rx_dropped,
+						rx);
+				}
 			dev_ll = dev_ll->next;
 		}
 		printf("\n======================================================\n");
@@ -3071,9 +3105,6 @@ main(int argc, char *argv[])
 	if (init_data_ll() == -1)
 		rte_exit(EXIT_FAILURE, "Failed to initialize linked list\n");
 
-	/* Initialize device stats */
-	memset(&dev_statistics, 0, sizeof(dev_statistics));
-
 	/* Enable stats if the user option is set. */
 	if (enable_stats)
 		pthread_create(&tid, NULL, (void*)print_stats, NULL );
-- 
1.8.4.2

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/7] lib_vhost: Support multiple queues in virtio dev
  2015-06-10  5:52   ` [dpdk-dev] [PATCH v2 2/7] lib_vhost: Support multiple queues in virtio dev Ouyang Changchun
@ 2015-06-11  9:54     ` Panu Matilainen
  0 siblings, 0 replies; 65+ messages in thread
From: Panu Matilainen @ 2015-06-11  9:54 UTC (permalink / raw)
  To: Ouyang Changchun, dev

On 06/10/2015 08:52 AM, Ouyang Changchun wrote:
> Each virtio device can have multiple queues, say 2 or 4, at most 8.
> Enabling this feature allows a virtio device/port on the guest to use
> different vCPUs to receive/transmit packets from/to each queue.
>
> In multiple queues mode, virtio device readiness means all queues of
> the virtio device are ready; cleaning up/destroying a virtio device
> also requires clearing all queues belonging to it.
>
> Changes in v2:
>    - remove the q_num_set api
>    - add the qp_num_get api
>    - determine the queue pair num from qemu message
>    - rework for reset owner message handler
>    - dynamically alloc mem for dev virtqueue
>    - queue pair num could be 0x8000
>    - fix checkpatch errors
>
> Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
> ---
>   lib/librte_vhost/rte_virtio_net.h             |  10 +-
>   lib/librte_vhost/vhost-net.h                  |   1 +
>   lib/librte_vhost/vhost_rxtx.c                 |  32 ++---
>   lib/librte_vhost/vhost_user/vhost-net-user.c  |   4 +-
>   lib/librte_vhost/vhost_user/virtio-net-user.c |  76 +++++++++---
>   lib/librte_vhost/vhost_user/virtio-net-user.h |   2 +
>   lib/librte_vhost/virtio-net.c                 | 161 +++++++++++++++++---------
>   7 files changed, 197 insertions(+), 89 deletions(-)
>
> diff --git a/lib/librte_vhost/rte_virtio_net.h b/lib/librte_vhost/rte_virtio_net.h
> index 5d38185..92b4bfa 100644
> --- a/lib/librte_vhost/rte_virtio_net.h
> +++ b/lib/librte_vhost/rte_virtio_net.h
> @@ -59,7 +59,6 @@ struct rte_mbuf;
>   /* Backend value set by guest. */
>   #define VIRTIO_DEV_STOPPED -1
>
> -
>   /* Enum for virtqueue management. */
>   enum {VIRTIO_RXQ, VIRTIO_TXQ, VIRTIO_QNUM};
>
> @@ -96,13 +95,14 @@ struct vhost_virtqueue {
>    * Device structure contains all configuration information relating to the device.
>    */
>   struct virtio_net {
> -	struct vhost_virtqueue	*virtqueue[VIRTIO_QNUM];	/**< Contains all virtqueue information. */
>   	struct virtio_memory	*mem;		/**< QEMU memory and memory region information. */
> +	struct vhost_virtqueue	**virtqueue;    /**< Contains all virtqueue information. */
>   	uint64_t		features;	/**< Negotiated feature set. */
>   	uint64_t		device_fh;	/**< device identifier. */
>   	uint32_t		flags;		/**< Device flags. Only used to check if device is running on data core. */
>   #define IF_NAME_SZ (PATH_MAX > IFNAMSIZ ? PATH_MAX : IFNAMSIZ)
>   	char			ifname[IF_NAME_SZ];	/**< Name of the tap device or socket path. */
> +	uint32_t                num_virt_queues;
>   	void			*priv;		/**< private context */
>   } __rte_cache_aligned;
>
> @@ -220,4 +220,10 @@ uint16_t rte_vhost_enqueue_burst(struct virtio_net *dev, uint16_t queue_id,
>   uint16_t rte_vhost_dequeue_burst(struct virtio_net *dev, uint16_t queue_id,
>   	struct rte_mempool *mbuf_pool, struct rte_mbuf **pkts, uint16_t count);
>

Unfortunately this is an ABI break, NAK. Ditto for other changes to 
struct virtio_net in patch 3/7 in this series. See 
http://dpdk.org/browse/dpdk/tree/doc/guides/rel_notes/abi.rst for the 
ABI policy.

There's plenty of discussion around the ABI going on at the moment, 
including this thread: http://dpdk.org/ml/archives/dev/2015-June/018456.html
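
As a rough illustration of why reordering and replacing fields breaks the
ABI (field names are from the patch above; the struct tags and the offsets,
which assume a 64-bit build, are only illustrative):

/* Old layout: apps compiled against it read 'mem' at offset 16. */
struct virtio_net_old {
	struct vhost_virtqueue *virtqueue[2];	/* offset 0, 16 bytes */
	struct virtio_memory *mem;		/* offset 16 */
	uint64_t features;			/* offset 24 */
};

/* New layout: 'virtqueue' shrinks to one pointer and swaps places with
 * 'mem', so every later field shifts and old binaries read garbage. */
struct virtio_net_new {
	struct virtio_memory *mem;		/* offset 0 */
	struct vhost_virtqueue **virtqueue;	/* offset 8 */
	uint64_t features;			/* offset 16 */
};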

	- Panu -

^ permalink raw reply	[flat|nested] 65+ messages in thread

* [dpdk-dev] [PATCH v3 0/9] Support multiple queues in vhost
  2015-06-10  5:52 ` [dpdk-dev] [PATCH v2 0/7] " Ouyang Changchun
                     ` (6 preceding siblings ...)
  2015-06-10  5:52   ` [dpdk-dev] [PATCH v2 7/7] vhost: Add per queue stats info Ouyang Changchun
@ 2015-06-15  7:56   ` Ouyang Changchun
  2015-06-15  7:56     ` [dpdk-dev] [PATCH v3 1/9] ixgbe: Support VMDq RSS in non-SRIOV environment Ouyang Changchun
                       ` (9 more replies)
  7 siblings, 10 replies; 65+ messages in thread
From: Ouyang Changchun @ 2015-06-15  7:56 UTC (permalink / raw)
  To: dev

This patch set supports multiple queues for each virtio device in vhost.
The vhost-user backend is used to enable the multiple queues feature; it's not ready for vhost-cuse.
 
The QEMU patch enabling vhost-user multiple queues has already been merged into the upstream
sub-tree in the QEMU community and will be included in QEMU 2.4. If using QEMU 2.3, apply the
same patch onto QEMU 2.3 and rebuild QEMU before running vhost multiple queues:
http://patchwork.ozlabs.org/patch/477461/
 
Basically the vhost sample leverages VMDq+RSS in HW to receive packets and distribute them
into different queues in the pool according to their 5-tuples.
 
On the other hand, vhost gets the queue pair number from the messages exchanged with
QEMU.
 
The number of HW queues per pool is strongly recommended to be identical to the queue number
used to start the QEMU guest and to the queue number of the virtio port on the guest.
E.g. using '--rxq 4' sets the queue number to 4: there are 4 HW queues in each VMDq pool,
4 queues in each vhost device/port, and every queue in a pool maps to one queue in the vhost device.
 
=========================================
==================|   |==================|
       vport0     |   |      vport1      |
---  ---  ---  ---|   |---  ---  ---  ---|
q0 | q1 | q2 | q3 |   |q0 | q1 | q2 | q3 |
/\= =/\= =/\= =/\=|   |/\= =/\= =/\= =/\=|
||   ||   ||   ||      ||   ||   ||   ||
||   ||   ||   ||      ||   ||   ||   ||
||= =||= =||= =||=|   =||== ||== ||== ||=|
q0 | q1 | q2 | q3 |   |q0 | q1 | q2 | q3 |
------------------|   |------------------|
     VMDq pool0   |   |    VMDq pool1    |
==================|   |==================|
 
On the RX side, it first polls each queue of the pool, gets the packets from
it, and enqueues them into the corresponding virtqueue of the virtio device/port.
On the TX side, it dequeues packets from each virtqueue of the virtio device/port
and sends them to either a physical port or another virtio device according to
the destination MAC address.
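
A minimal sketch of that RX-side loop (the helper name poll_pool_into_vhost
and the pool_base_queue variable are made up for illustration; the real code
lives in examples/vhost/main.c):

static void poll_pool_into_vhost(struct virtio_net *dev, uint8_t port,
		uint16_t pool_base_queue, uint16_t rxq)
{
	struct rte_mbuf *pkts[MAX_PKT_BURST];
	uint16_t q, nb;

	for (q = 0; q < rxq; q++) {
		/* Poll HW queue q of this device's VMDq pool. */
		nb = rte_eth_rx_burst(port, pool_base_queue + q,
				pkts, MAX_PKT_BURST);
		if (nb == 0)
			continue;
		/* Virtqueues come in rx/tx pairs, so queue pair q maps to
		 * virtqueue index q * VIRTIO_QNUM + VIRTIO_RXQ. */
		rte_vhost_enqueue_burst(dev, q * VIRTIO_QNUM + VIRTIO_RXQ,
				pkts, nb);
	}
}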

Here is some test guidance.
1. On the host, first mount hugepages, insmod uio and igb_uio, and bind one NIC to igb_uio;
then run the vhost sample. Key steps:
sudo mount -t hugetlbfs nodev /mnt/huge
sudo modprobe uio
sudo insmod $RTE_SDK/$RTE_TARGET/kmod/igb_uio.ko
 
$RTE_SDK/tools/dpdk_nic_bind.py --bind igb_uio 0000:08:00.0
sudo $RTE_SDK/examples/vhost/build/vhost-switch -c 0xf0 -n 4 --huge-dir /mnt/huge --socket-mem 1024,0 -- -p 1 --vm2vm 0 --dev-basename usvhost --rxq 2
 
Use '--stats 1' to enable on-screen stats dumping for vhost.
 
2. After step 1, on the host, modprobe kvm and kvm_intel, and use the qemu command line to start one guest:
modprobe kvm
modprobe kvm_intel
sudo mount -t hugetlbfs nodev /dev/hugepages -o pagesize=1G
 
$QEMU_PATH/qemu-system-x86_64 -enable-kvm -m 4096 -object memory-backend-file,id=mem,size=4096M,mem-path=/dev/hugepages,share=on -numa node,memdev=mem -mem-prealloc -smp 10 -cpu core2duo,+sse3,+sse4.1,+sse4.2 -name <vm-name> -drive file=<img-path>/vm.img -chardev socket,id=char0,path=<usvhost-path>/usvhost -netdev type=vhost-user,id=hostnet2,chardev=char0,vhostforce=on,queues=2 -device virtio-net-pci,mq=on,vectors=6,netdev=hostnet2,id=net2,mac=52:54:00:12:34:56,csum=off,gso=off,guest_tso4=off,guest_tso6=off,guest_ecn=off -chardev socket,id=char1,path=<usvhost-path>/usvhost -netdev type=vhost-user,id=hostnet3,chardev=char1,vhostforce=on,queues=2 -device virtio-net-pci,mq=on,vectors=6,netdev=hostnet3,id=net3,mac=52:54:00:12:34:57,csum=off,gso=off,guest_tso4=off,guest_tso6=off,guest_ecn=off
 
3. Log in to the guest and test with testpmd (DPDK-based), using multiple virtio queues to rx and tx packets.
modprobe uio
insmod $RTE_SDK/$RTE_TARGET/kmod/igb_uio.ko
echo 1024 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages
./tools/dpdk_nic_bind.py --bind igb_uio 00:03.0 00:04.0
 
$RTE_SDK/$RTE_TARGET/app/testpmd -c 1f -n 4 -- --rxq=2 --txq=2 --nb-cores=4 --rx-queue-stats-mapping="(0,0,0),(0,1,1),(1,0,2),(1,1,3)" --tx-queue-stats-mapping="(0,0,0),(0,1,1),(1,0,2),(1,1,3)" -i --disable-hw-vlan --txqflags 0xf00
 
set fwd mac
start tx_first
 
4. Use a packet generator to send packets with dest MAC 52:54:00:12:34:57 and VLAN tag 1001;
select IPv4 as the protocol and continuously incrementing IP addresses.
 
5. Testpmd on the guest can display the packets received/transmitted in both queues of each virtio port.

Changchun Ouyang (9):
  ixgbe: Support VMDq RSS in non-SRIOV environment
  lib_vhost: Support multiple queues in virtio dev
  lib_vhost: Set memory layout for multiple queues mode
  lib_vhost: Check the virtqueue address's validity
  vhost: Add new command line option: rxq
  vhost: Support multiple queues
  virtio: Resolve for control queue
  vhost: Add per queue stats info
  doc: Update doc for vhost multiple queues

 doc/guides/prog_guide/vhost_lib.rst           |  35 +++
 doc/guides/sample_app_ug/vhost.rst            | 110 +++++++++
 drivers/net/ixgbe/ixgbe_rxtx.c                |  86 +++++--
 drivers/net/virtio/virtio_ethdev.c            |  15 +-
 examples/vhost/main.c                         | 324 +++++++++++++++++---------
 lib/librte_ether/rte_ethdev.c                 |  40 ++++
 lib/librte_vhost/rte_virtio_net.h             |  20 +-
 lib/librte_vhost/vhost-net.h                  |   1 +
 lib/librte_vhost/vhost_cuse/virtio-net-cdev.c |  57 +++--
 lib/librte_vhost/vhost_rxtx.c                 |  70 ++++--
 lib/librte_vhost/vhost_user/vhost-net-user.c  |  15 +-
 lib/librte_vhost/vhost_user/virtio-net-user.c | 135 +++++++----
 lib/librte_vhost/vhost_user/virtio-net-user.h |   2 +
 lib/librte_vhost/virtio-net.c                 | 205 +++++++++++-----
 14 files changed, 824 insertions(+), 291 deletions(-)

-- 
1.8.4.2

^ permalink raw reply	[flat|nested] 65+ messages in thread

* [dpdk-dev] [PATCH v3 1/9] ixgbe: Support VMDq RSS in non-SRIOV environment
  2015-06-15  7:56   ` [dpdk-dev] [PATCH v3 0/9] Support multiple queues in vhost Ouyang Changchun
@ 2015-06-15  7:56     ` Ouyang Changchun
  2015-06-15  7:56     ` [dpdk-dev] [PATCH v3 2/9] lib_vhost: Support multiple queues in virtio dev Ouyang Changchun
                       ` (8 subsequent siblings)
  9 siblings, 0 replies; 65+ messages in thread
From: Ouyang Changchun @ 2015-06-15  7:56 UTC (permalink / raw)
  To: dev

In a non-SRIOV environment, VMDq RSS can be enabled via the MRQC register.
In theory, the queue number per pool could be 2 or 4, but only 2 queues are
available due to a HW limitation; the same limit also exists in the Linux ixgbe driver.
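
For context, an application requests this mode through the normal ethdev
configuration; a minimal sketch (port_id, nb_rxq and nb_txq are assumed
variables, and the pool/queue counts are illustrative):

struct rte_eth_conf conf;

memset(&conf, 0, sizeof(conf));
conf.rxmode.mq_mode = ETH_MQ_RX_VMDQ_RSS;
/* 32 pools x 2 queues per pool, per the HW limit described above. */
conf.rx_adv_conf.vmdq_rx_conf.nb_queue_pools = ETH_32_POOLS;
conf.rx_adv_conf.rss_conf.rss_hf = ETH_RSS_IP;

if (rte_eth_dev_configure(port_id, nb_rxq, nb_txq, &conf) != 0)
	rte_exit(EXIT_FAILURE, "Failed to configure port %u\n", port_id);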

Changes in v2:
  - fix checkpatch errors

Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
---
 drivers/net/ixgbe/ixgbe_rxtx.c | 86 +++++++++++++++++++++++++++++++++++-------
 lib/librte_ether/rte_ethdev.c  | 40 ++++++++++++++++++++
 2 files changed, 113 insertions(+), 13 deletions(-)

diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c b/drivers/net/ixgbe/ixgbe_rxtx.c
index 4f9ab22..13e661f 100644
--- a/drivers/net/ixgbe/ixgbe_rxtx.c
+++ b/drivers/net/ixgbe/ixgbe_rxtx.c
@@ -3311,16 +3311,16 @@ void ixgbe_configure_dcb(struct rte_eth_dev *dev)
 	return;
 }
 
-/*
- * VMDq only support for 10 GbE NIC.
+/**
+ * Configure the pool for VMDq on a 10 GbE NIC.
  */
 static void
-ixgbe_vmdq_rx_hw_configure(struct rte_eth_dev *dev)
+ixgbe_vmdq_pool_configure(struct rte_eth_dev *dev)
 {
 	struct rte_eth_vmdq_rx_conf *cfg;
 	struct ixgbe_hw *hw;
 	enum rte_eth_nb_pools num_pools;
-	uint32_t mrqc, vt_ctl, vlanctrl;
+	uint32_t vt_ctl, vlanctrl;
 	uint32_t vmolr = 0;
 	int i;
 
@@ -3329,12 +3329,6 @@ ixgbe_vmdq_rx_hw_configure(struct rte_eth_dev *dev)
 	cfg = &dev->data->dev_conf.rx_adv_conf.vmdq_rx_conf;
 	num_pools = cfg->nb_queue_pools;
 
-	ixgbe_rss_disable(dev);
-
-	/* MRQC: enable vmdq */
-	mrqc = IXGBE_MRQC_VMDQEN;
-	IXGBE_WRITE_REG(hw, IXGBE_MRQC, mrqc);
-
 	/* PFVTCTL: turn on virtualisation and set the default pool */
 	vt_ctl = IXGBE_VT_CTL_VT_ENABLE | IXGBE_VT_CTL_REPLEN;
 	if (cfg->enable_default_pool)
@@ -3400,7 +3394,29 @@ ixgbe_vmdq_rx_hw_configure(struct rte_eth_dev *dev)
 	IXGBE_WRITE_FLUSH(hw);
 }
 
-/*
+/**
+ * VMDq is only supported on 10 GbE NICs.
+ */
+static void
+ixgbe_vmdq_rx_hw_configure(struct rte_eth_dev *dev)
+{
+	struct ixgbe_hw *hw;
+	uint32_t mrqc;
+
+	PMD_INIT_FUNC_TRACE();
+	hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+	ixgbe_rss_disable(dev);
+
+	/* MRQC: enable vmdq */
+	mrqc = IXGBE_MRQC_VMDQEN;
+	IXGBE_WRITE_REG(hw, IXGBE_MRQC, mrqc);
+	IXGBE_WRITE_FLUSH(hw);
+
+	ixgbe_vmdq_pool_configure(dev);
+}
+
+/**
  * ixgbe_dcb_config_tx_hw_config - Configure general VMDq TX parameters
  * @hw: pointer to hardware structure
  */
@@ -3505,6 +3521,41 @@ ixgbe_config_vf_rss(struct rte_eth_dev *dev)
 }
 
 static int
+ixgbe_config_vmdq_rss(struct rte_eth_dev *dev)
+{
+	struct ixgbe_hw *hw;
+	uint32_t mrqc;
+
+	ixgbe_rss_configure(dev);
+
+	hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+	/* MRQC: enable VMDQ RSS */
+	mrqc = IXGBE_READ_REG(hw, IXGBE_MRQC);
+	mrqc &= ~IXGBE_MRQC_MRQE_MASK;
+
+	switch (RTE_ETH_DEV_SRIOV(dev).nb_q_per_pool) {
+	case 2:
+		mrqc |= IXGBE_MRQC_VMDQRSS64EN;
+		break;
+
+	case 4:
+		mrqc |= IXGBE_MRQC_VMDQRSS32EN;
+		break;
+
+	default:
+		PMD_INIT_LOG(ERR, "Invalid pool number in non-IOV mode with VMDQ RSS");
+		return -EINVAL;
+	}
+
+	IXGBE_WRITE_REG(hw, IXGBE_MRQC, mrqc);
+
+	ixgbe_vmdq_pool_configure(dev);
+
+	return 0;
+}
+
+static int
 ixgbe_config_vf_default(struct rte_eth_dev *dev)
 {
 	struct ixgbe_hw *hw =
@@ -3560,6 +3611,10 @@ ixgbe_dev_mq_rx_configure(struct rte_eth_dev *dev)
 				ixgbe_vmdq_rx_hw_configure(dev);
 				break;
 
+			case ETH_MQ_RX_VMDQ_RSS:
+				ixgbe_config_vmdq_rss(dev);
+				break;
+
 			case ETH_MQ_RX_NONE:
 				/* if mq_mode is none, disable rss mode.*/
 			default: ixgbe_rss_disable(dev);
@@ -4038,6 +4093,8 @@ ixgbe_dev_rx_init(struct rte_eth_dev *dev)
 
 	/* Setup RX queues */
 	for (i = 0; i < dev->data->nb_rx_queues; i++) {
+		uint32_t psrtype = 0;
+
 		rxq = dev->data->rx_queues[i];
 
 		/*
@@ -4065,12 +4122,10 @@ ixgbe_dev_rx_init(struct rte_eth_dev *dev)
 		if (rx_conf->header_split) {
 			if (hw->mac.type == ixgbe_mac_82599EB) {
 				/* Must setup the PSRTYPE register */
-				uint32_t psrtype;
 				psrtype = IXGBE_PSRTYPE_TCPHDR |
 					IXGBE_PSRTYPE_UDPHDR   |
 					IXGBE_PSRTYPE_IPV4HDR  |
 					IXGBE_PSRTYPE_IPV6HDR;
-				IXGBE_WRITE_REG(hw, IXGBE_PSRTYPE(rxq->reg_idx), psrtype);
 			}
 			srrctl = ((rx_conf->split_hdr_size <<
 				IXGBE_SRRCTL_BSIZEHDRSIZE_SHIFT) &
@@ -4080,6 +4135,11 @@ ixgbe_dev_rx_init(struct rte_eth_dev *dev)
 #endif
 			srrctl = IXGBE_SRRCTL_DESCTYPE_ADV_ONEBUF;
 
+		/* Set RQPL for VMDQ RSS according to max Rx queue */
+		psrtype |= (RTE_ETH_DEV_SRIOV(dev).nb_q_per_pool >> 1) <<
+			IXGBE_PSRTYPE_RQPL_SHIFT;
+		IXGBE_WRITE_REG(hw, IXGBE_PSRTYPE(rxq->reg_idx), psrtype);
+
 		/* Set if packets are dropped when no descriptors available */
 		if (rxq->drop_en)
 			srrctl |= IXGBE_SRRCTL_DROP_EN;
diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index e13fde5..6048b0f 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -933,6 +933,16 @@ rte_eth_dev_check_vf_rss_rxq_num(uint8_t port_id, uint16_t nb_rx_q)
 	return 0;
 }
 
+#define VMDQ_RSS_RX_QUEUE_NUM_MAX 4
+
+static int
+rte_eth_dev_check_vmdq_rss_rxq_num(__rte_unused uint8_t port_id, uint16_t nb_rx_q)
+{
+	if (nb_rx_q > VMDQ_RSS_RX_QUEUE_NUM_MAX)
+		return -EINVAL;
+	return 0;
+}
+
 static int
 rte_eth_dev_check_mq_mode(uint8_t port_id, uint16_t nb_rx_q, uint16_t nb_tx_q,
 		      const struct rte_eth_conf *dev_conf)
@@ -1093,6 +1103,36 @@ rte_eth_dev_check_mq_mode(uint8_t port_id, uint16_t nb_rx_q, uint16_t nb_tx_q,
 				return -EINVAL;
 			}
 		}
+
+		if (dev_conf->rxmode.mq_mode == ETH_MQ_RX_VMDQ_RSS) {
+			uint32_t nb_queue_pools =
+				dev_conf->rx_adv_conf.vmdq_rx_conf.nb_queue_pools;
+			struct rte_eth_dev_info dev_info;
+
+			rte_eth_dev_info_get(port_id, &dev_info);
+			dev->data->dev_conf.rxmode.mq_mode = ETH_MQ_RX_VMDQ_RSS;
+			if (nb_queue_pools == ETH_32_POOLS || nb_queue_pools == ETH_64_POOLS)
+				RTE_ETH_DEV_SRIOV(dev).nb_q_per_pool =
+					dev_info.max_rx_queues/nb_queue_pools;
+			else {
+				PMD_DEBUG_TRACE("ethdev port_id=%d VMDQ "
+						"nb_queue_pools=%d invalid "
+						"in VMDQ RSS\n",
+						port_id,
+						nb_queue_pools);
+				return -EINVAL;
+			}
+
+			if (rte_eth_dev_check_vmdq_rss_rxq_num(port_id,
+				RTE_ETH_DEV_SRIOV(dev).nb_q_per_pool) != 0) {
+				PMD_DEBUG_TRACE("ethdev port_id=%d"
+					" SRIOV active, invalid queue"
+					" number for VMDQ RSS, allowed"
+					" value are 1, 2 or 4\n",
+					port_id);
+				return -EINVAL;
+			}
+		}
 	}
 	return 0;
 }
-- 
1.8.4.2

^ permalink raw reply	[flat|nested] 65+ messages in thread

* [dpdk-dev] [PATCH v3 2/9] lib_vhost: Support multiple queues in virtio dev
  2015-06-15  7:56   ` [dpdk-dev] [PATCH v3 0/9] Support multiple queues in vhost Ouyang Changchun
  2015-06-15  7:56     ` [dpdk-dev] [PATCH v3 1/9] ixgbe: Support VMDq RSS in non-SRIOV environment Ouyang Changchun
@ 2015-06-15  7:56     ` Ouyang Changchun
  2015-06-18 13:16       ` Flavio Leitner
  2015-06-18 13:34       ` Flavio Leitner
  2015-06-15  7:56     ` [dpdk-dev] [PATCH v3 3/9] lib_vhost: Set memory layout for multiple queues mode Ouyang Changchun
                       ` (7 subsequent siblings)
  9 siblings, 2 replies; 65+ messages in thread
From: Ouyang Changchun @ 2015-06-15  7:56 UTC (permalink / raw)
  To: dev

Each virtio device can have multiple queues, say 2 or 4, at most 8.
Enabling this feature allows a virtio device/port on the guest to use
different vCPUs to receive/transmit packets from/to each queue.

In multiple queues mode, virtio device readiness means all queues of
the virtio device are ready; cleaning up/destroying a virtio device
also requires clearing all queues belonging to it.
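
The virtqueue array layout this implies is simply two adjacent slots per
queue pair, as used throughout the patch:

/* Queue pair qp_idx owns two adjacent virtqueue slots: */
uint32_t rx_idx = qp_idx * VIRTIO_QNUM + VIRTIO_RXQ;	/* even index */
uint32_t tx_idx = qp_idx * VIRTIO_QNUM + VIRTIO_TXQ;	/* odd index */

struct vhost_virtqueue *rvq = dev->virtqueue[rx_idx];
struct vhost_virtqueue *tvq = dev->virtqueue[tx_idx];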

Changes in v3:
  - fix coding style
  - check virtqueue idx validity

Changes in v2:
  - remove the q_num_set api
  - add the qp_num_get api
  - determine the queue pair num from qemu message
  - rework for reset owner message handler
  - dynamically alloc mem for dev virtqueue
  - queue pair num could be 0x8000
  - fix checkpatch errors

Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
---
 lib/librte_vhost/rte_virtio_net.h             |  10 +-
 lib/librte_vhost/vhost-net.h                  |   1 +
 lib/librte_vhost/vhost_rxtx.c                 |  49 +++++---
 lib/librte_vhost/vhost_user/vhost-net-user.c  |   4 +-
 lib/librte_vhost/vhost_user/virtio-net-user.c |  76 +++++++++---
 lib/librte_vhost/vhost_user/virtio-net-user.h |   2 +
 lib/librte_vhost/virtio-net.c                 | 161 +++++++++++++++++---------
 7 files changed, 216 insertions(+), 87 deletions(-)

diff --git a/lib/librte_vhost/rte_virtio_net.h b/lib/librte_vhost/rte_virtio_net.h
index 5d38185..873be3e 100644
--- a/lib/librte_vhost/rte_virtio_net.h
+++ b/lib/librte_vhost/rte_virtio_net.h
@@ -59,7 +59,6 @@ struct rte_mbuf;
 /* Backend value set by guest. */
 #define VIRTIO_DEV_STOPPED -1
 
-
 /* Enum for virtqueue management. */
 enum {VIRTIO_RXQ, VIRTIO_TXQ, VIRTIO_QNUM};
 
@@ -96,13 +95,14 @@ struct vhost_virtqueue {
  * Device structure contains all configuration information relating to the device.
  */
 struct virtio_net {
-	struct vhost_virtqueue	*virtqueue[VIRTIO_QNUM];	/**< Contains all virtqueue information. */
 	struct virtio_memory	*mem;		/**< QEMU memory and memory region information. */
+	struct vhost_virtqueue	**virtqueue;    /**< Contains all virtqueue information. */
 	uint64_t		features;	/**< Negotiated feature set. */
 	uint64_t		device_fh;	/**< device identifier. */
 	uint32_t		flags;		/**< Device flags. Only used to check if device is running on data core. */
 #define IF_NAME_SZ (PATH_MAX > IFNAMSIZ ? PATH_MAX : IFNAMSIZ)
 	char			ifname[IF_NAME_SZ];	/**< Name of the tap device or socket path. */
+	uint32_t		num_virt_queues;
 	void			*priv;		/**< private context */
 } __rte_cache_aligned;
 
@@ -220,4 +220,10 @@ uint16_t rte_vhost_enqueue_burst(struct virtio_net *dev, uint16_t queue_id,
 uint16_t rte_vhost_dequeue_burst(struct virtio_net *dev, uint16_t queue_id,
 	struct rte_mempool *mbuf_pool, struct rte_mbuf **pkts, uint16_t count);
 
+/**
+ * This function gets the queue pair number of one vhost device.
+ * @return
+ *  number of queue pairs of the specified virtio device.
+ */
+uint16_t rte_vhost_qp_num_get(struct virtio_net *dev);
 #endif /* _VIRTIO_NET_H_ */
diff --git a/lib/librte_vhost/vhost-net.h b/lib/librte_vhost/vhost-net.h
index c69b60b..7dff14d 100644
--- a/lib/librte_vhost/vhost-net.h
+++ b/lib/librte_vhost/vhost-net.h
@@ -115,4 +115,5 @@ struct vhost_net_device_ops {
 
 
 struct vhost_net_device_ops const *get_virtio_net_callbacks(void);
+int alloc_vring_queue_pair(struct virtio_net *dev, uint16_t qp_idx);
 #endif /* _VHOST_NET_CDEV_H_ */
diff --git a/lib/librte_vhost/vhost_rxtx.c b/lib/librte_vhost/vhost_rxtx.c
index 2da4a02..d2a7143 100644
--- a/lib/librte_vhost/vhost_rxtx.c
+++ b/lib/librte_vhost/vhost_rxtx.c
@@ -43,6 +43,18 @@
 #define MAX_PKT_BURST 32
 
 /**
+ * Check the virtqueue idx validity;
+ * return 1 if it passes, otherwise 0.
+ */
+static inline uint8_t __attribute__((always_inline))
+check_virtqueue_idx(uint16_t virtq_idx, uint8_t is_tx, uint32_t virtq_num)
+{
+	if ((is_tx ^ (virtq_idx & 0x1)) || (virtq_idx >= virtq_num))
+		return 0;
+	return 1;
+}
+
+/**
  * This function adds buffers to the virtio devices RX virtqueue. Buffers can
  * be received from the physical port or from another virtio device. A packet
 * count is returned to indicate the number of packets that are successfully
@@ -67,12 +79,15 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,
 	uint8_t success = 0;
 
 	LOG_DEBUG(VHOST_DATA, "(%"PRIu64") virtio_dev_rx()\n", dev->device_fh);
-	if (unlikely(queue_id != VIRTIO_RXQ)) {
-		LOG_DEBUG(VHOST_DATA, "mq isn't supported in this version.\n");
+	if (unlikely(check_virtqueue_idx(queue_id, 0,
+		VIRTIO_QNUM * dev->num_virt_queues) == 0)) {
+		RTE_LOG(ERR, VHOST_DATA,
+			"%s (%"PRIu64"): virtqueue idx:%d invalid.\n",
+			 __func__, dev->device_fh, queue_id);
 		return 0;
 	}
 
-	vq = dev->virtqueue[VIRTIO_RXQ];
+	vq = dev->virtqueue[queue_id];
 	count = (count > MAX_PKT_BURST) ? MAX_PKT_BURST : count;
 
 	/*
@@ -188,8 +203,9 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,
 }
 
 static inline uint32_t __attribute__((always_inline))
-copy_from_mbuf_to_vring(struct virtio_net *dev, uint16_t res_base_idx,
-	uint16_t res_end_idx, struct rte_mbuf *pkt)
+copy_from_mbuf_to_vring(struct virtio_net *dev, uint16_t queue_id,
+	uint16_t res_base_idx, uint16_t res_end_idx,
+	struct rte_mbuf *pkt)
 {
 	uint32_t vec_idx = 0;
 	uint32_t entry_success = 0;
@@ -217,9 +233,9 @@ copy_from_mbuf_to_vring(struct virtio_net *dev, uint16_t res_base_idx,
 	 * Convert from gpa to vva
 	 * (guest physical addr -> vhost virtual addr)
 	 */
-	vq = dev->virtqueue[VIRTIO_RXQ];
+	vq = dev->virtqueue[queue_id];
 	vb_addr =
 		gpa_to_vva(dev, vq->buf_vec[vec_idx].buf_addr);
 	vb_hdr_addr = vb_addr;
 
 	/* Prefetch buffer address. */
@@ -407,11 +423,15 @@ virtio_dev_merge_rx(struct virtio_net *dev, uint16_t queue_id,
 
 	LOG_DEBUG(VHOST_DATA, "(%"PRIu64") virtio_dev_merge_rx()\n",
 		dev->device_fh);
-	if (unlikely(queue_id != VIRTIO_RXQ)) {
-		LOG_DEBUG(VHOST_DATA, "mq isn't supported in this version.\n");
+	if (unlikely(check_virtqueue_idx(queue_id, 0,
+		VIRTIO_QNUM * dev->num_virt_queues) == 0)) {
+		RTE_LOG(ERR, VHOST_DATA,
+			"%s (%"PRIu64"): virtqueue idx:%d invalid.\n",
+			 __func__, dev->device_fh, queue_id);
+		return 0;
 	}
 
-	vq = dev->virtqueue[VIRTIO_RXQ];
+	vq = dev->virtqueue[queue_id];
 	count = RTE_MIN((uint32_t)MAX_PKT_BURST, count);
 
 	if (count == 0)
@@ -493,7 +513,7 @@ virtio_dev_merge_rx(struct virtio_net *dev, uint16_t queue_id,
 
 		res_end_idx = res_cur_idx;
 
-		entry_success = copy_from_mbuf_to_vring(dev, res_base_idx,
+		entry_success = copy_from_mbuf_to_vring(dev, queue_id, res_base_idx,
 			res_end_idx, pkts[pkt_idx]);
 
 		rte_compiler_barrier();
@@ -543,12 +563,15 @@ rte_vhost_dequeue_burst(struct virtio_net *dev, uint16_t queue_id,
 	uint16_t free_entries, entry_success = 0;
 	uint16_t avail_idx;
 
-	if (unlikely(queue_id != VIRTIO_TXQ)) {
-		LOG_DEBUG(VHOST_DATA, "mq isn't supported in this version.\n");
+	if (unlikely(check_virtqueue_idx(queue_id, 1,
+		VIRTIO_QNUM * dev->num_virt_queues) == 0)) {
+		RTE_LOG(ERR, VHOST_DATA,
+			"%s (%"PRIu64"): virtqueue idx:%d invalid.\n",
+			 __func__, dev->device_fh, queue_id);
 		return 0;
 	}
 
-	vq = dev->virtqueue[VIRTIO_TXQ];
+	vq = dev->virtqueue[queue_id];
 	avail_idx =  *((volatile uint16_t *)&vq->avail->idx);
 
 	/* If there are no available buffers then return. */
diff --git a/lib/librte_vhost/vhost_user/vhost-net-user.c b/lib/librte_vhost/vhost_user/vhost-net-user.c
index 31f1215..b66a653 100644
--- a/lib/librte_vhost/vhost_user/vhost-net-user.c
+++ b/lib/librte_vhost/vhost_user/vhost-net-user.c
@@ -378,7 +378,9 @@ vserver_message_handler(int connfd, void *dat, int *remove)
 		ops->set_owner(ctx);
 		break;
 	case VHOST_USER_RESET_OWNER:
-		ops->reset_owner(ctx);
+		RTE_LOG(INFO, VHOST_CONFIG,
+			"(%"PRIu64") VHOST_NET_RESET_OWNER\n", ctx.fh);
+		user_reset_owner(ctx, &msg.payload.state);
 		break;
 
 	case VHOST_USER_SET_MEM_TABLE:
diff --git a/lib/librte_vhost/vhost_user/virtio-net-user.c b/lib/librte_vhost/vhost_user/virtio-net-user.c
index c1ffc38..b4de86d 100644
--- a/lib/librte_vhost/vhost_user/virtio-net-user.c
+++ b/lib/librte_vhost/vhost_user/virtio-net-user.c
@@ -209,30 +209,46 @@ static int
 virtio_is_ready(struct virtio_net *dev)
 {
 	struct vhost_virtqueue *rvq, *tvq;
+	uint32_t q_idx;
 
 	/* mq support in future.*/
-	rvq = dev->virtqueue[VIRTIO_RXQ];
-	tvq = dev->virtqueue[VIRTIO_TXQ];
-	if (rvq && tvq && rvq->desc && tvq->desc &&
-		(rvq->kickfd != (eventfd_t)-1) &&
-		(rvq->callfd != (eventfd_t)-1) &&
-		(tvq->kickfd != (eventfd_t)-1) &&
-		(tvq->callfd != (eventfd_t)-1)) {
-		RTE_LOG(INFO, VHOST_CONFIG,
-			"virtio is now ready for processing.\n");
-		return 1;
+	for (q_idx = 0; q_idx < dev->num_virt_queues; q_idx++) {
+		uint32_t virt_rx_q_idx = q_idx * VIRTIO_QNUM + VIRTIO_RXQ;
+		uint32_t virt_tx_q_idx = q_idx * VIRTIO_QNUM + VIRTIO_TXQ;
+
+		rvq = dev->virtqueue[virt_rx_q_idx];
+		tvq = dev->virtqueue[virt_tx_q_idx];
+		if ((rvq == NULL) || (tvq == NULL) ||
+			(rvq->desc == NULL) || (tvq->desc == NULL) ||
+			(rvq->kickfd == (eventfd_t)-1) ||
+			(rvq->callfd == (eventfd_t)-1) ||
+			(tvq->kickfd == (eventfd_t)-1) ||
+			(tvq->callfd == (eventfd_t)-1)) {
+			RTE_LOG(INFO, VHOST_CONFIG,
+				"virtio isn't ready for processing.\n");
+			return 0;
+		}
 	}
 	RTE_LOG(INFO, VHOST_CONFIG,
-		"virtio isn't ready for processing.\n");
-	return 0;
+		"virtio is now ready for processing.\n");
+	return 1;
 }
 
 void
 user_set_vring_call(struct vhost_device_ctx ctx, struct VhostUserMsg *pmsg)
 {
 	struct vhost_vring_file file;
+	struct virtio_net *dev = get_device(ctx);
+	uint32_t cur_qp_idx;
 
 	file.index = pmsg->payload.u64 & VHOST_USER_VRING_IDX_MASK;
+	cur_qp_idx = (file.index & (~0x1)) >> 1;
+
+	if (dev->num_virt_queues < cur_qp_idx + 1) {
+		if (alloc_vring_queue_pair(dev, cur_qp_idx) == 0)
+			dev->num_virt_queues = cur_qp_idx + 1;
+	}
+
 	if (pmsg->payload.u64 & VHOST_USER_VRING_NOFD_MASK)
 		file.fd = -1;
 	else
@@ -290,13 +306,37 @@ user_get_vring_base(struct vhost_device_ctx ctx,
 	 * sent and only sent in vhost_vring_stop.
 	 * TODO: cleanup the vring, it isn't usable since here.
 	 */
-	if (((int)dev->virtqueue[VIRTIO_RXQ]->kickfd) >= 0) {
-		close(dev->virtqueue[VIRTIO_RXQ]->kickfd);
-		dev->virtqueue[VIRTIO_RXQ]->kickfd = (eventfd_t)-1;
+	if (((int)dev->virtqueue[state->index]->kickfd) >= 0) {
+		close(dev->virtqueue[state->index]->kickfd);
+		dev->virtqueue[state->index]->kickfd = (eventfd_t)-1;
 	}
-	if (((int)dev->virtqueue[VIRTIO_TXQ]->kickfd) >= 0) {
-		close(dev->virtqueue[VIRTIO_TXQ]->kickfd);
-		dev->virtqueue[VIRTIO_TXQ]->kickfd = (eventfd_t)-1;
+
+	return 0;
+}
+
+/*
+ * when virtio is stopped, qemu will send us the RESET_OWNER message.
+ */
+int
+user_reset_owner(struct vhost_device_ctx ctx,
+	struct vhost_vring_state *state)
+{
+	struct virtio_net *dev = get_device(ctx);
+
+	/* We have to stop the queue (virtio) if it is running. */
+	if (dev->flags & VIRTIO_DEV_RUNNING)
+		notify_ops->destroy_device(dev);
+
+	RTE_LOG(INFO, VHOST_CONFIG,
+		"reset owner --- state idx:%d state num:%d\n", state->index, state->num);
+	/*
+	 * Based on current qemu vhost-user implementation, this message is
+	 * sent and only sent in vhost_net_stop_one.
+	 * TODO: cleanup the vring, it isn't usable since here.
+	 */
+	if (((int)dev->virtqueue[state->index]->kickfd) >= 0) {
+		close(dev->virtqueue[state->index]->kickfd);
+		dev->virtqueue[state->index]->kickfd = (eventfd_t)-1;
 	}
 
 	return 0;
diff --git a/lib/librte_vhost/vhost_user/virtio-net-user.h b/lib/librte_vhost/vhost_user/virtio-net-user.h
index df24860..2429836 100644
--- a/lib/librte_vhost/vhost_user/virtio-net-user.h
+++ b/lib/librte_vhost/vhost_user/virtio-net-user.h
@@ -46,4 +46,6 @@ void user_set_vring_kick(struct vhost_device_ctx, struct VhostUserMsg *);
 int user_get_vring_base(struct vhost_device_ctx, struct vhost_vring_state *);
 
 void user_destroy_device(struct vhost_device_ctx);
+
+int user_reset_owner(struct vhost_device_ctx ctx, struct vhost_vring_state *state);
 #endif
diff --git a/lib/librte_vhost/virtio-net.c b/lib/librte_vhost/virtio-net.c
index fced2ab..aaea7d5 100644
--- a/lib/librte_vhost/virtio-net.c
+++ b/lib/librte_vhost/virtio-net.c
@@ -67,10 +67,10 @@ static struct virtio_net_config_ll *ll_root;
 #define VHOST_SUPPORTED_FEATURES ((1ULL << VIRTIO_NET_F_MRG_RXBUF) | \
 				(1ULL << VIRTIO_NET_F_CTRL_VQ) | \
 				(1ULL << VIRTIO_NET_F_CTRL_RX) | \
-				(1ULL << VHOST_F_LOG_ALL))
+				(1ULL << VHOST_F_LOG_ALL) | \
+				(1ULL << VIRTIO_NET_F_MQ))
 static uint64_t VHOST_FEATURES = VHOST_SUPPORTED_FEATURES;
 
-
 /*
  * Converts QEMU virtual address to Vhost virtual address. This function is
  * used to convert the ring addresses to our address space.
@@ -178,6 +178,8 @@ add_config_ll_entry(struct virtio_net_config_ll *new_ll_dev)
 static void
 cleanup_device(struct virtio_net *dev)
 {
+	uint32_t qp_idx;
+
 	/* Unmap QEMU memory file if mapped. */
 	if (dev->mem) {
 		munmap((void *)(uintptr_t)dev->mem->mapped_address,
@@ -186,14 +188,18 @@ cleanup_device(struct virtio_net *dev)
 	}
 
 	/* Close any event notifiers opened by device. */
-	if ((int)dev->virtqueue[VIRTIO_RXQ]->callfd >= 0)
-		close((int)dev->virtqueue[VIRTIO_RXQ]->callfd);
-	if ((int)dev->virtqueue[VIRTIO_RXQ]->kickfd >= 0)
-		close((int)dev->virtqueue[VIRTIO_RXQ]->kickfd);
-	if ((int)dev->virtqueue[VIRTIO_TXQ]->callfd >= 0)
-		close((int)dev->virtqueue[VIRTIO_TXQ]->callfd);
-	if ((int)dev->virtqueue[VIRTIO_TXQ]->kickfd >= 0)
-		close((int)dev->virtqueue[VIRTIO_TXQ]->kickfd);
+	for (qp_idx = 0; qp_idx < dev->num_virt_queues; qp_idx++) {
+		uint32_t virt_rx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_RXQ;
+		uint32_t virt_tx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_TXQ;
+		if ((int)dev->virtqueue[virt_rx_q_idx]->callfd >= 0)
+			close((int)dev->virtqueue[virt_rx_q_idx]->callfd);
+		if ((int)dev->virtqueue[virt_rx_q_idx]->kickfd >= 0)
+			close((int)dev->virtqueue[virt_rx_q_idx]->kickfd);
+		if ((int)dev->virtqueue[virt_tx_q_idx]->callfd >= 0)
+			close((int)dev->virtqueue[virt_tx_q_idx]->callfd);
+		if ((int)dev->virtqueue[virt_tx_q_idx]->kickfd >= 0)
+			close((int)dev->virtqueue[virt_tx_q_idx]->kickfd);
+	}
 }
 
 /*
@@ -202,9 +208,17 @@ cleanup_device(struct virtio_net *dev)
 static void
 free_device(struct virtio_net_config_ll *ll_dev)
 {
-	/* Free any malloc'd memory */
-	free(ll_dev->dev.virtqueue[VIRTIO_RXQ]);
-	free(ll_dev->dev.virtqueue[VIRTIO_TXQ]);
+	uint32_t qp_idx;
+
+	/*
+	 * Free any malloc'd memory.
+	 */
+	/* Free every queue pair. */
+	for (qp_idx = 0; qp_idx < ll_dev->dev.num_virt_queues; qp_idx++) {
+		uint32_t virt_rx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_RXQ;
+		free(ll_dev->dev.virtqueue[virt_rx_q_idx]);
+	}
+	free(ll_dev->dev.virtqueue);
 	free(ll_dev);
 }
 
@@ -238,6 +252,27 @@ rm_config_ll_entry(struct virtio_net_config_ll *ll_dev,
 }
 
 /*
+ *  Initialise all variables in vring queue pair.
+ */
+static void
+init_vring_queue_pair(struct virtio_net *dev, uint16_t qp_idx)
+{
+	uint32_t virt_rx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_RXQ;
+	uint32_t virt_tx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_TXQ;
+	memset(dev->virtqueue[virt_rx_q_idx], 0, sizeof(struct vhost_virtqueue));
+	memset(dev->virtqueue[virt_tx_q_idx], 0, sizeof(struct vhost_virtqueue));
+
+	dev->virtqueue[virt_rx_q_idx]->kickfd = (eventfd_t)-1;
+	dev->virtqueue[virt_rx_q_idx]->callfd = (eventfd_t)-1;
+	dev->virtqueue[virt_tx_q_idx]->kickfd = (eventfd_t)-1;
+	dev->virtqueue[virt_tx_q_idx]->callfd = (eventfd_t)-1;
+
+	/* Backends are set to -1 indicating an inactive device. */
+	dev->virtqueue[virt_rx_q_idx]->backend = VIRTIO_DEV_STOPPED;
+	dev->virtqueue[virt_tx_q_idx]->backend = VIRTIO_DEV_STOPPED;
+}
+
+/*
  *  Initialise all variables in device structure.
  */
 static void
@@ -254,17 +289,31 @@ init_device(struct virtio_net *dev)
 	/* Set everything to 0. */
 	memset((void *)(uintptr_t)((uint64_t)(uintptr_t)dev + vq_offset), 0,
 		(sizeof(struct virtio_net) - (size_t)vq_offset));
-	memset(dev->virtqueue[VIRTIO_RXQ], 0, sizeof(struct vhost_virtqueue));
-	memset(dev->virtqueue[VIRTIO_TXQ], 0, sizeof(struct vhost_virtqueue));
 
-	dev->virtqueue[VIRTIO_RXQ]->kickfd = (eventfd_t)-1;
-	dev->virtqueue[VIRTIO_RXQ]->callfd = (eventfd_t)-1;
-	dev->virtqueue[VIRTIO_TXQ]->kickfd = (eventfd_t)-1;
-	dev->virtqueue[VIRTIO_TXQ]->callfd = (eventfd_t)-1;
+	init_vring_queue_pair(dev, 0);
+	dev->num_virt_queues = 1;
+}
 
-	/* Backends are set to -1 indicating an inactive device. */
-	dev->virtqueue[VIRTIO_RXQ]->backend = VIRTIO_DEV_STOPPED;
-	dev->virtqueue[VIRTIO_TXQ]->backend = VIRTIO_DEV_STOPPED;
+/*
+ *  Alloc mem for vring queue pair.
+ */
+int
+alloc_vring_queue_pair(struct virtio_net *dev, uint16_t qp_idx)
+{
+	struct vhost_virtqueue *virtqueue = NULL;
+	uint32_t virt_rx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_RXQ;
+	uint32_t virt_tx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_TXQ;
+
+	virtqueue = malloc(sizeof(struct vhost_virtqueue) * VIRTIO_QNUM);
+	if (virtqueue == NULL) {
+		RTE_LOG(ERR, VHOST_CONFIG,
+			"Failed to allocate memory for virt qp:%d.\n", qp_idx);
+		return -1;
+	}
+
+	dev->virtqueue[virt_rx_q_idx] = virtqueue;
+	dev->virtqueue[virt_tx_q_idx] = virtqueue + VIRTIO_TXQ;
+	return 0;
 }
 
 /*
@@ -276,7 +325,6 @@ static int
 new_device(struct vhost_device_ctx ctx)
 {
 	struct virtio_net_config_ll *new_ll_dev;
-	struct vhost_virtqueue *virtqueue_rx, *virtqueue_tx;
 
 	/* Setup device and virtqueues. */
 	new_ll_dev = malloc(sizeof(struct virtio_net_config_ll));
@@ -287,28 +335,22 @@ new_device(struct vhost_device_ctx ctx)
 		return -1;
 	}
 
-	virtqueue_rx = malloc(sizeof(struct vhost_virtqueue));
-	if (virtqueue_rx == NULL) {
-		free(new_ll_dev);
+	new_ll_dev->dev.virtqueue =
+		malloc(VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MAX * sizeof(struct vhost_virtqueue *));
+	if (new_ll_dev->dev.virtqueue == NULL) {
 		RTE_LOG(ERR, VHOST_CONFIG,
-			"(%"PRIu64") Failed to allocate memory for rxq.\n",
+			"(%"PRIu64") Failed to allocate memory for dev.virtqueue.\n",
 			ctx.fh);
+		free(new_ll_dev);
 		return -1;
 	}
 
-	virtqueue_tx = malloc(sizeof(struct vhost_virtqueue));
-	if (virtqueue_tx == NULL) {
-		free(virtqueue_rx);
+	if (alloc_vring_queue_pair(&new_ll_dev->dev, 0) == -1) {
+		free(new_ll_dev->dev.virtqueue);
 		free(new_ll_dev);
-		RTE_LOG(ERR, VHOST_CONFIG,
-			"(%"PRIu64") Failed to allocate memory for txq.\n",
-			ctx.fh);
 		return -1;
 	}
 
-	new_ll_dev->dev.virtqueue[VIRTIO_RXQ] = virtqueue_rx;
-	new_ll_dev->dev.virtqueue[VIRTIO_TXQ] = virtqueue_tx;
-
 	/* Initialise device and virtqueues. */
 	init_device(&new_ll_dev->dev);
 
@@ -392,7 +434,7 @@ set_owner(struct vhost_device_ctx ctx)
  * Called from CUSE IOCTL: VHOST_RESET_OWNER
  */
 static int
-reset_owner(struct vhost_device_ctx ctx)
+reset_owner(__rte_unused struct vhost_device_ctx ctx)
 {
 	struct virtio_net_config_ll *ll_dev;
 
@@ -430,6 +472,7 @@ static int
 set_features(struct vhost_device_ctx ctx, uint64_t *pu)
 {
 	struct virtio_net *dev;
+	uint32_t q_idx;
 
 	dev = get_device(ctx);
 	if (dev == NULL)
@@ -441,22 +484,26 @@ set_features(struct vhost_device_ctx ctx, uint64_t *pu)
 	dev->features = *pu;
 
 	/* Set the vhost_hlen depending on if VIRTIO_NET_F_MRG_RXBUF is set. */
-	if (dev->features & (1 << VIRTIO_NET_F_MRG_RXBUF)) {
-		LOG_DEBUG(VHOST_CONFIG,
-			"(%"PRIu64") Mergeable RX buffers enabled\n",
-			dev->device_fh);
-		dev->virtqueue[VIRTIO_RXQ]->vhost_hlen =
-			sizeof(struct virtio_net_hdr_mrg_rxbuf);
-		dev->virtqueue[VIRTIO_TXQ]->vhost_hlen =
-			sizeof(struct virtio_net_hdr_mrg_rxbuf);
-	} else {
-		LOG_DEBUG(VHOST_CONFIG,
-			"(%"PRIu64") Mergeable RX buffers disabled\n",
-			dev->device_fh);
-		dev->virtqueue[VIRTIO_RXQ]->vhost_hlen =
-			sizeof(struct virtio_net_hdr);
-		dev->virtqueue[VIRTIO_TXQ]->vhost_hlen =
-			sizeof(struct virtio_net_hdr);
+	for (q_idx = 0; q_idx < dev->num_virt_queues; q_idx++) {
+		uint32_t virt_rx_q_idx = q_idx * VIRTIO_QNUM + VIRTIO_RXQ;
+		uint32_t virt_tx_q_idx = q_idx * VIRTIO_QNUM + VIRTIO_TXQ;
+		if (dev->features & (1 << VIRTIO_NET_F_MRG_RXBUF)) {
+			LOG_DEBUG(VHOST_CONFIG,
+				"(%"PRIu64") Mergeable RX buffers enabled\n",
+				dev->device_fh);
+			dev->virtqueue[virt_rx_q_idx]->vhost_hlen =
+				sizeof(struct virtio_net_hdr_mrg_rxbuf);
+			dev->virtqueue[virt_tx_q_idx]->vhost_hlen =
+				sizeof(struct virtio_net_hdr_mrg_rxbuf);
+		} else {
+			LOG_DEBUG(VHOST_CONFIG,
+				"(%"PRIu64") Mergeable RX buffers disabled\n",
+				dev->device_fh);
+			dev->virtqueue[virt_rx_q_idx]->vhost_hlen =
+				sizeof(struct virtio_net_hdr);
+			dev->virtqueue[virt_tx_q_idx]->vhost_hlen =
+				sizeof(struct virtio_net_hdr);
+		}
 	}
 	return 0;
 }
@@ -737,6 +784,14 @@ int rte_vhost_feature_enable(uint64_t feature_mask)
 	return -1;
 }
 
+uint16_t rte_vhost_qp_num_get(struct virtio_net *dev)
+{
+	if (dev == NULL)
+		return 0;
+
+	return dev->num_virt_queues;
+}
+
 /*
  * Register ops so that we can add/remove device to data core.
  */
-- 
1.8.4.2

^ permalink raw reply	[flat|nested] 65+ messages in thread

* [dpdk-dev] [PATCH v3 3/9] lib_vhost: Set memory layout for multiple queues mode
  2015-06-15  7:56   ` [dpdk-dev] [PATCH v3 0/9] Support multiple queues in vhost Ouyang Changchun
  2015-06-15  7:56     ` [dpdk-dev] [PATCH v3 1/9] ixgbe: Support VMDq RSS in non-SRIOV environment Ouyang Changchun
  2015-06-15  7:56     ` [dpdk-dev] [PATCH v3 2/9] lib_vhost: Support multiple queues in virtio dev Ouyang Changchun
@ 2015-06-15  7:56     ` Ouyang Changchun
  2015-06-15  7:56     ` [dpdk-dev] [PATCH v3 4/9] lib_vhost: Check the virtqueue address's validity Ouyang Changchun
                       ` (6 subsequent siblings)
  9 siblings, 0 replies; 65+ messages in thread
From: Ouyang Changchun @ 2015-06-15  7:56 UTC (permalink / raw)
  To: dev

QEMU sends separate commands, in order, to set the memory layout for each queue
in one virtio device; accordingly, vhost needs to keep memory layout information
for each queue of the virtio device.

This also requires adjusting the interface of the gpa_to_vva function a bit by
introducing a queue index that specifies which queue of the device to use when
looking up the vhost virtual address for an incoming guest physical address.
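
In practice a caller derives the memory-array index from the virtqueue id,
as the hunks below do:

/* Each queue pair carries its own memory map, so translation takes the
 * pair index; queue_id / VIRTIO_QNUM selects the rx/tx pair. */
uint64_t vva = gpa_to_vva(dev, queue_id / VIRTIO_QNUM, desc->addr);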

Changes in v3
  - fix coding style

Changes in v2
  - q_idx is changed into qp_idx
  - dynamically alloc mem for dev mem_arr
  - fix checkpatch errors

Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
---
 examples/vhost/main.c                         | 21 +++++-----
 lib/librte_vhost/rte_virtio_net.h             | 10 +++--
 lib/librte_vhost/vhost_cuse/virtio-net-cdev.c | 57 ++++++++++++++------------
 lib/librte_vhost/vhost_rxtx.c                 | 21 +++++-----
 lib/librte_vhost/vhost_user/virtio-net-user.c | 59 ++++++++++++++-------------
 lib/librte_vhost/virtio-net.c                 | 38 ++++++++++++-----
 6 files changed, 118 insertions(+), 88 deletions(-)

diff --git a/examples/vhost/main.c b/examples/vhost/main.c
index 7863dcf..aba287a 100644
--- a/examples/vhost/main.c
+++ b/examples/vhost/main.c
@@ -1466,11 +1466,11 @@ attach_rxmbuf_zcp(struct virtio_net *dev)
 		desc = &vq->desc[desc_idx];
 		if (desc->flags & VRING_DESC_F_NEXT) {
 			desc = &vq->desc[desc->next];
-			buff_addr = gpa_to_vva(dev, desc->addr);
+			buff_addr = gpa_to_vva(dev, 0, desc->addr);
 			phys_addr = gpa_to_hpa(vdev, desc->addr, desc->len,
 					&addr_type);
 		} else {
-			buff_addr = gpa_to_vva(dev,
+			buff_addr = gpa_to_vva(dev, 0,
 					desc->addr + vq->vhost_hlen);
 			phys_addr = gpa_to_hpa(vdev,
 					desc->addr + vq->vhost_hlen,
@@ -1722,7 +1722,7 @@ virtio_dev_rx_zcp(struct virtio_net *dev, struct rte_mbuf **pkts,
 			rte_pktmbuf_data_len(buff), 0);
 
 		/* Buffer address translation for virtio header. */
-		buff_hdr_addr = gpa_to_vva(dev, desc->addr);
+		buff_hdr_addr = gpa_to_vva(dev, 0, desc->addr);
 		packet_len = rte_pktmbuf_data_len(buff) + vq->vhost_hlen;
 
 		/*
@@ -1946,7 +1946,7 @@ virtio_dev_tx_zcp(struct virtio_net *dev)
 		desc = &vq->desc[desc->next];
 
 		/* Buffer address translation. */
-		buff_addr = gpa_to_vva(dev, desc->addr);
+		buff_addr = gpa_to_vva(dev, 0, desc->addr);
 		/* Need check extra VLAN_HLEN size for inserting VLAN tag */
 		phys_addr = gpa_to_hpa(vdev, desc->addr, desc->len + VLAN_HLEN,
 			&addr_type);
@@ -2604,13 +2604,14 @@ new_device (struct virtio_net *dev)
 	dev->priv = vdev;
 
 	if (zero_copy) {
-		vdev->nregions_hpa = dev->mem->nregions;
-		for (regionidx = 0; regionidx < dev->mem->nregions; regionidx++) {
+		struct virtio_memory *dev_mem = dev->mem_arr[0];
+		vdev->nregions_hpa = dev_mem->nregions;
+		for (regionidx = 0; regionidx < dev_mem->nregions; regionidx++) {
 			vdev->nregions_hpa
 				+= check_hpa_regions(
-					dev->mem->regions[regionidx].guest_phys_address
-					+ dev->mem->regions[regionidx].address_offset,
-					dev->mem->regions[regionidx].memory_size);
+					dev_mem->regions[regionidx].guest_phys_address
+					+ dev_mem->regions[regionidx].address_offset,
+					dev_mem->regions[regionidx].memory_size);
 
 		}
 
@@ -2626,7 +2627,7 @@ new_device (struct virtio_net *dev)
 
 
 		if (fill_hpa_memory_regions(
-			vdev->regions_hpa, dev->mem
+			vdev->regions_hpa, dev_mem
 			) != vdev->nregions_hpa) {
 
 			RTE_LOG(ERR, VHOST_CONFIG,
diff --git a/lib/librte_vhost/rte_virtio_net.h b/lib/librte_vhost/rte_virtio_net.h
index 873be3e..1b75f45 100644
--- a/lib/librte_vhost/rte_virtio_net.h
+++ b/lib/librte_vhost/rte_virtio_net.h
@@ -95,14 +95,15 @@ struct vhost_virtqueue {
  * Device structure contains all configuration information relating to the device.
  */
 struct virtio_net {
-	struct virtio_memory	*mem;		/**< QEMU memory and memory region information. */
 	struct vhost_virtqueue	**virtqueue;    /**< Contains all virtqueue information. */
+	struct virtio_memory    **mem_arr;      /**< Array for QEMU memory and memory region information. */
 	uint64_t		features;	/**< Negotiated feature set. */
 	uint64_t		device_fh;	/**< device identifier. */
 	uint32_t		flags;		/**< Device flags. Only used to check if device is running on data core. */
 #define IF_NAME_SZ (PATH_MAX > IFNAMSIZ ? PATH_MAX : IFNAMSIZ)
 	char			ifname[IF_NAME_SZ];	/**< Name of the tap device or socket path. */
 	uint32_t		num_virt_queues;
+	uint32_t		mem_idx;	/**< Used when setting the memory layout; unique for each queue within the virtio device. */
 	void			*priv;		/**< private context */
 } __rte_cache_aligned;
 
@@ -153,14 +154,15 @@ rte_vring_available_entries(struct virtio_net *dev, uint16_t queue_id)
  * This is used to convert guest virtio buffer addresses.
  */
 static inline uint64_t __attribute__((always_inline))
-gpa_to_vva(struct virtio_net *dev, uint64_t guest_pa)
+gpa_to_vva(struct virtio_net *dev, uint32_t q_idx, uint64_t guest_pa)
 {
 	struct virtio_memory_regions *region;
+	struct virtio_memory *dev_mem = dev->mem_arr[q_idx];
 	uint32_t regionidx;
 	uint64_t vhost_va = 0;
 
-	for (regionidx = 0; regionidx < dev->mem->nregions; regionidx++) {
-		region = &dev->mem->regions[regionidx];
+	for (regionidx = 0; regionidx < dev_mem->nregions; regionidx++) {
+		region = &dev_mem->regions[regionidx];
 		if ((guest_pa >= region->guest_phys_address) &&
 			(guest_pa <= region->guest_phys_address_end)) {
 			vhost_va = region->address_offset + guest_pa;
diff --git a/lib/librte_vhost/vhost_cuse/virtio-net-cdev.c b/lib/librte_vhost/vhost_cuse/virtio-net-cdev.c
index ae2c3fa..7a4733c 100644
--- a/lib/librte_vhost/vhost_cuse/virtio-net-cdev.c
+++ b/lib/librte_vhost/vhost_cuse/virtio-net-cdev.c
@@ -273,28 +273,32 @@ cuse_set_mem_table(struct vhost_device_ctx ctx,
 		((uint64_t)(uintptr_t)mem_regions_addr + size);
 	uint64_t base_address = 0, mapped_address, mapped_size;
 	struct virtio_net *dev;
+	struct virtio_memory *dev_mem = NULL;
 
 	dev = get_device(ctx);
 	if (dev == NULL)
-		return -1;
-
-	if (dev->mem && dev->mem->mapped_address) {
-		munmap((void *)(uintptr_t)dev->mem->mapped_address,
-			(size_t)dev->mem->mapped_size);
-		free(dev->mem);
-		dev->mem = NULL;
+		goto error;
+
+	dev_mem = dev->mem_arr[dev->mem_idx];
+	if (dev_mem && dev_mem->mapped_address) {
+		munmap((void *)(uintptr_t)dev_mem->mapped_address,
+			(size_t)dev_mem->mapped_size);
+		free(dev_mem);
+		dev->mem_arr[dev->mem_idx] = NULL;
 	}
 
-	dev->mem = calloc(1, sizeof(struct virtio_memory) +
+	dev->mem_arr[dev->mem_idx] = calloc(1, sizeof(struct virtio_memory) +
 		sizeof(struct virtio_memory_regions) * nregions);
-	if (dev->mem == NULL) {
+	dev_mem = dev->mem_arr[dev->mem_idx];
+
+	if (dev_mem == NULL) {
 		RTE_LOG(ERR, VHOST_CONFIG,
-			"(%"PRIu64") Failed to allocate memory for dev->mem\n",
-			dev->device_fh);
-		return -1;
+			"(%"PRIu64") Failed to allocate memory for dev->mem_arr[%d]\n",
+			dev->device_fh, dev->mem_idx);
+		goto error;
 	}
 
-	pregion = &dev->mem->regions[0];
+	pregion = &dev_mem->regions[0];
 
 	for (idx = 0; idx < nregions; idx++) {
 		pregion[idx].guest_phys_address =
@@ -320,14 +324,12 @@ cuse_set_mem_table(struct vhost_device_ctx ctx,
 				pregion[idx].userspace_address;
 			/* Map VM memory file */
 			if (host_memory_map(ctx.pid, base_address,
-				&mapped_address, &mapped_size) != 0) {
-				free(dev->mem);
-				dev->mem = NULL;
-				return -1;
-			}
-			dev->mem->mapped_address = mapped_address;
-			dev->mem->base_address = base_address;
-			dev->mem->mapped_size = mapped_size;
+				&mapped_address, &mapped_size) != 0)
+				goto free;
+
+			dev_mem->mapped_address = mapped_address;
+			dev_mem->base_address = base_address;
+			dev_mem->mapped_size = mapped_size;
 		}
 	}
 
@@ -335,9 +337,7 @@ cuse_set_mem_table(struct vhost_device_ctx ctx,
 	if (base_address == 0) {
 		RTE_LOG(ERR, VHOST_CONFIG,
 			"Failed to find base address of qemu memory file.\n");
-		free(dev->mem);
-		dev->mem = NULL;
-		return -1;
+		goto free;
 	}
 
 	valid_regions = nregions;
@@ -369,9 +369,16 @@ cuse_set_mem_table(struct vhost_device_ctx ctx,
 			pregion[idx].userspace_address -
 			pregion[idx].guest_phys_address;
 	}
-	dev->mem->nregions = valid_regions;
 
+	dev_mem->nregions = valid_regions;
+	dev->mem_idx = (dev->mem_idx + 1) % (dev->num_virt_queues * VIRTIO_QNUM);
 	return 0;
+
+free:
+	free(dev_mem);
+	dev->mem_arr[dev->mem_idx] = NULL;
+error:
+	return -1;
 }
 
 /*
diff --git a/lib/librte_vhost/vhost_rxtx.c b/lib/librte_vhost/vhost_rxtx.c
index d2a7143..6c8fe70 100644
--- a/lib/librte_vhost/vhost_rxtx.c
+++ b/lib/librte_vhost/vhost_rxtx.c
@@ -134,7 +134,7 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,
 		buff = pkts[packet_success];
 
 		/* Convert from gpa to vva (guest physical addr -> vhost virtual addr) */
-		buff_addr = gpa_to_vva(dev, desc->addr);
+		buff_addr = gpa_to_vva(dev, queue_id / VIRTIO_QNUM, desc->addr);
 		/* Prefetch buffer address. */
 		rte_prefetch0((void *)(uintptr_t)buff_addr);
 
@@ -150,7 +150,7 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,
 			desc->len = vq->vhost_hlen;
 			desc = &vq->desc[desc->next];
 			/* Buffer address translation. */
-			buff_addr = gpa_to_vva(dev, desc->addr);
+			buff_addr = gpa_to_vva(dev, queue_id / VIRTIO_QNUM, desc->addr);
 			desc->len = rte_pktmbuf_data_len(buff);
 		} else {
 			buff_addr += vq->vhost_hlen;
@@ -233,9 +233,9 @@ copy_from_mbuf_to_vring(struct virtio_net *dev, uint16_t queue_id,
 	 * Convert from gpa to vva
 	 * (guest physical addr -> vhost virtual addr)
 	 */
 	vq = dev->virtqueue[queue_id];
-	vb_addr =
-		gpa_to_vva(dev, vq->buf_vec[vec_idx].buf_addr);
+	vb_addr = gpa_to_vva(dev, queue_id / VIRTIO_QNUM,
+			vq->buf_vec[vec_idx].buf_addr);
 	vb_hdr_addr = vb_addr;
 
 	/* Prefetch buffer address. */
@@ -277,8 +277,8 @@ copy_from_mbuf_to_vring(struct virtio_net *dev, uint16_t queue_id,
 		}
 
 		vec_idx++;
-		vb_addr =
-			gpa_to_vva(dev, vq->buf_vec[vec_idx].buf_addr);
+		vb_addr = gpa_to_vva(dev, queue_id / VIRTIO_QNUM,
+			vq->buf_vec[vec_idx].buf_addr);
 
 		/* Prefetch buffer address. */
 		rte_prefetch0((void *)(uintptr_t)vb_addr);
@@ -323,7 +323,7 @@ copy_from_mbuf_to_vring(struct virtio_net *dev, uint16_t queue_id,
 			}
 
 			vec_idx++;
-			vb_addr = gpa_to_vva(dev,
+			vb_addr = gpa_to_vva(dev, queue_id / VIRTIO_QNUM,
 				vq->buf_vec[vec_idx].buf_addr);
 			vb_offset = 0;
 			vb_avail = vq->buf_vec[vec_idx].buf_len;
@@ -367,7 +367,7 @@ copy_from_mbuf_to_vring(struct virtio_net *dev, uint16_t queue_id,
 
 					/* Get next buffer from buf_vec. */
 					vec_idx++;
-					vb_addr = gpa_to_vva(dev,
+					vb_addr = gpa_to_vva(dev, queue_id / VIRTIO_QNUM,
 						vq->buf_vec[vec_idx].buf_addr);
 					vb_avail =
 						vq->buf_vec[vec_idx].buf_len;
@@ -615,7 +615,7 @@ rte_vhost_dequeue_burst(struct virtio_net *dev, uint16_t queue_id,
 		desc = &vq->desc[desc->next];
 
 		/* Buffer address translation. */
-		vb_addr = gpa_to_vva(dev, desc->addr);
+		vb_addr = gpa_to_vva(dev, queue_id / VIRTIO_QNUM, desc->addr);
 		/* Prefetch buffer address. */
 		rte_prefetch0((void *)(uintptr_t)vb_addr);
 
@@ -721,7 +721,8 @@ rte_vhost_dequeue_burst(struct virtio_net *dev, uint16_t queue_id,
 					desc = &vq->desc[desc->next];
 
 					/* Buffer address translation. */
-					vb_addr = gpa_to_vva(dev, desc->addr);
+					vb_addr = gpa_to_vva(dev,
+						queue_id / VIRTIO_QNUM, desc->addr);
 					/* Prefetch buffer address. */
 					rte_prefetch0((void *)(uintptr_t)vb_addr);
 					vb_offset = 0;
diff --git a/lib/librte_vhost/vhost_user/virtio-net-user.c b/lib/librte_vhost/vhost_user/virtio-net-user.c
index b4de86d..337e7e4 100644
--- a/lib/librte_vhost/vhost_user/virtio-net-user.c
+++ b/lib/librte_vhost/vhost_user/virtio-net-user.c
@@ -70,17 +70,17 @@ get_blk_size(int fd)
 }
 
 static void
-free_mem_region(struct virtio_net *dev)
+free_mem_region(struct virtio_memory *dev_mem)
 {
 	struct orig_region_map *region;
 	unsigned int idx;
 	uint64_t alignment;
 
-	if (!dev || !dev->mem)
+	if (!dev_mem)
 		return;
 
-	region = orig_region(dev->mem, dev->mem->nregions);
-	for (idx = 0; idx < dev->mem->nregions; idx++) {
+	region = orig_region(dev_mem, dev_mem->nregions);
+	for (idx = 0; idx < dev_mem->nregions; idx++) {
 		if (region[idx].mapped_address) {
 			alignment = region[idx].blksz;
 			munmap((void *)(uintptr_t)
@@ -103,37 +103,37 @@ user_set_mem_table(struct vhost_device_ctx ctx, struct VhostUserMsg *pmsg)
 	unsigned int idx = 0;
 	struct orig_region_map *pregion_orig;
 	uint64_t alignment;
+	struct virtio_memory *dev_mem = NULL;
 
 	/* unmap old memory regions one by one*/
 	dev = get_device(ctx);
 	if (dev == NULL)
 		return -1;
 
-	/* Remove from the data plane. */
-	if (dev->flags & VIRTIO_DEV_RUNNING)
-		notify_ops->destroy_device(dev);
-
-	if (dev->mem) {
-		free_mem_region(dev);
-		free(dev->mem);
-		dev->mem = NULL;
+	dev_mem = dev->mem_arr[dev->mem_idx];
+	if (dev_mem) {
+		free_mem_region(dev_mem);
+		free(dev_mem);
+		dev->mem_arr[dev->mem_idx] = NULL;
 	}
 
-	dev->mem = calloc(1,
+	dev->mem_arr[dev->mem_idx] = calloc(1,
 		sizeof(struct virtio_memory) +
 		sizeof(struct virtio_memory_regions) * memory.nregions +
 		sizeof(struct orig_region_map) * memory.nregions);
-	if (dev->mem == NULL) {
+
+	dev_mem = dev->mem_arr[dev->mem_idx];
+	if (dev_mem == NULL) {
 		RTE_LOG(ERR, VHOST_CONFIG,
-			"(%"PRIu64") Failed to allocate memory for dev->mem\n",
-			dev->device_fh);
+			"(%"PRIu64") Failed to allocate memory for dev->mem_arr[%d]\n",
+			dev->device_fh, dev->mem_idx);
 		return -1;
 	}
-	dev->mem->nregions = memory.nregions;
+	dev_mem->nregions = memory.nregions;
 
-	pregion_orig = orig_region(dev->mem, memory.nregions);
+	pregion_orig = orig_region(dev_mem, memory.nregions);
 	for (idx = 0; idx < memory.nregions; idx++) {
-		pregion = &dev->mem->regions[idx];
+		pregion = &dev_mem->regions[idx];
 		pregion->guest_phys_address =
 			memory.regions[idx].guest_phys_addr;
 		pregion->guest_phys_address_end =
@@ -175,9 +175,9 @@ user_set_mem_table(struct vhost_device_ctx ctx, struct VhostUserMsg *pmsg)
 			pregion->guest_phys_address;
 
 		if (memory.regions[idx].guest_phys_addr == 0) {
-			dev->mem->base_address =
+			dev_mem->base_address =
 				memory.regions[idx].userspace_addr;
-			dev->mem->mapped_address =
+			dev_mem->mapped_address =
 				pregion->address_offset;
 		}
 
@@ -189,6 +189,7 @@ user_set_mem_table(struct vhost_device_ctx ctx, struct VhostUserMsg *pmsg)
 			 pregion->memory_size);
 	}
 
+	dev->mem_idx = (dev->mem_idx + 1) % (dev->num_virt_queues * VIRTIO_QNUM);
 	return 0;
 
 err_mmap:
@@ -200,8 +201,8 @@ err_mmap:
 					alignment));
 		close(pregion_orig[idx].fd);
 	}
-	free(dev->mem);
-	dev->mem = NULL;
+	free(dev_mem);
+	dev->mem_arr[dev->mem_idx] = NULL;
 	return -1;
 }
 
@@ -346,13 +347,15 @@ void
 user_destroy_device(struct vhost_device_ctx ctx)
 {
 	struct virtio_net *dev = get_device(ctx);
+	uint32_t i;
 
 	if (dev && (dev->flags & VIRTIO_DEV_RUNNING))
 		notify_ops->destroy_device(dev);
 
-	if (dev && dev->mem) {
-		free_mem_region(dev);
-		free(dev->mem);
-		dev->mem = NULL;
-	}
+	for (i = 0; dev && i < dev->num_virt_queues; i++)
+		if (dev->mem_arr[i]) {
+			free_mem_region(dev->mem_arr[i]);
+			free(dev->mem_arr[i]);
+			dev->mem_arr[i] = NULL;
+		}
 }
diff --git a/lib/librte_vhost/virtio-net.c b/lib/librte_vhost/virtio-net.c
index aaea7d5..3e24841 100644
--- a/lib/librte_vhost/virtio-net.c
+++ b/lib/librte_vhost/virtio-net.c
@@ -76,15 +76,16 @@ static uint64_t VHOST_FEATURES = VHOST_SUPPORTED_FEATURES;
  * used to convert the ring addresses to our address space.
  */
 static uint64_t
-qva_to_vva(struct virtio_net *dev, uint64_t qemu_va)
+qva_to_vva(struct virtio_net *dev, uint32_t q_idx, uint64_t qemu_va)
 {
 	struct virtio_memory_regions *region;
 	uint64_t vhost_va = 0;
 	uint32_t regionidx = 0;
+	struct virtio_memory *dev_mem = dev->mem_arr[q_idx];
 
 	/* Find the region where the address lives. */
-	for (regionidx = 0; regionidx < dev->mem->nregions; regionidx++) {
-		region = &dev->mem->regions[regionidx];
+	for (regionidx = 0; regionidx < dev_mem->nregions; regionidx++) {
+		region = &dev_mem->regions[regionidx];
 		if ((qemu_va >= region->userspace_address) &&
 			(qemu_va <= region->userspace_address +
 			region->memory_size)) {
@@ -181,10 +182,13 @@ cleanup_device(struct virtio_net *dev)
 	uint32_t qp_idx;
 
 	/* Unmap QEMU memory file if mapped. */
-	if (dev->mem) {
-		munmap((void *)(uintptr_t)dev->mem->mapped_address,
-			(size_t)dev->mem->mapped_size);
-		free(dev->mem);
+	for (qp_idx = 0; qp_idx < dev->num_virt_queues; qp_idx++) {
+		struct virtio_memory *dev_mem = dev->mem_arr[qp_idx];
+		if (dev_mem) {
+			munmap((void *)(uintptr_t)dev_mem->mapped_address,
+				(size_t)dev_mem->mapped_size);
+			free(dev_mem);
+		}
 	}
 
 	/* Close any event notifiers opened by device. */
@@ -213,6 +217,8 @@ free_device(struct virtio_net_config_ll *ll_dev)
 	/*
 	 * Free any malloc'd memory.
 	 */
+	free(ll_dev->dev.mem_arr);
+
 	/* Free every queue pair. */
 	for (qp_idx = 0; qp_idx < ll_dev->dev.num_virt_queues; qp_idx++) {
 		uint32_t virt_rx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_RXQ;
@@ -284,7 +290,7 @@ init_device(struct virtio_net *dev)
 	 * Virtqueues have already been malloced so
 	 * we don't want to set them to NULL.
 	 */
-	vq_offset = offsetof(struct virtio_net, mem);
+	vq_offset = offsetof(struct virtio_net, features);
 
 	/* Set everything to 0. */
 	memset((void *)(uintptr_t)((uint64_t)(uintptr_t)dev + vq_offset), 0,
@@ -351,6 +357,16 @@ new_device(struct vhost_device_ctx ctx)
 		return -1;
 	}
 
+	new_ll_dev->dev.mem_arr =
+		malloc(VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MAX * sizeof(struct virtio_memory *));
+	if (new_ll_dev->dev.mem_arr == NULL) {
+		RTE_LOG(ERR, VHOST_CONFIG,
+			"(%"PRIu64") Failed to allocate memory for dev.mem_arr.\n",
+			ctx.fh);
+		free_device(new_ll_dev);
+		return -1;
+	}
+
 	/* Initialise device and virtqueues. */
 	init_device(&new_ll_dev->dev);
 
@@ -547,7 +563,7 @@ set_vring_addr(struct vhost_device_ctx ctx, struct vhost_vring_addr *addr)
 
 	/* The addresses are converted from QEMU virtual to Vhost virtual. */
 	vq->desc = (struct vring_desc *)(uintptr_t)qva_to_vva(dev,
-			addr->desc_user_addr);
+			addr->index / VIRTIO_QNUM, addr->desc_user_addr);
 	if (vq->desc == 0) {
 		RTE_LOG(ERR, VHOST_CONFIG,
 			"(%"PRIu64") Failed to find desc ring address.\n",
@@ -556,7 +572,7 @@ set_vring_addr(struct vhost_device_ctx ctx, struct vhost_vring_addr *addr)
 	}
 
 	vq->avail = (struct vring_avail *)(uintptr_t)qva_to_vva(dev,
-			addr->avail_user_addr);
+			addr->index / VIRTIO_QNUM, addr->avail_user_addr);
 	if (vq->avail == 0) {
 		RTE_LOG(ERR, VHOST_CONFIG,
 			"(%"PRIu64") Failed to find avail ring address.\n",
@@ -565,7 +581,7 @@ set_vring_addr(struct vhost_device_ctx ctx, struct vhost_vring_addr *addr)
 	}
 
 	vq->used = (struct vring_used *)(uintptr_t)qva_to_vva(dev,
-			addr->used_user_addr);
+			addr->index / VIRTIO_QNUM, addr->used_user_addr);
 	if (vq->used == 0) {
 		RTE_LOG(ERR, VHOST_CONFIG,
 			"(%"PRIu64") Failed to find used ring address.\n",
-- 
1.8.4.2

^ permalink raw reply	[flat|nested] 65+ messages in thread

* [dpdk-dev] [PATCH v3 4/9] lib_vhost: Check the virtqueue address's validity
  2015-06-15  7:56   ` [dpdk-dev] [PATCH v3 0/9] Support multiple queues in vhost Ouyang Changchun
                       ` (2 preceding siblings ...)
  2015-06-15  7:56     ` [dpdk-dev] [PATCH v3 3/9] lib_vhost: Set memory layout for multiple queues mode Ouyang Changchun
@ 2015-06-15  7:56     ` Ouyang Changchun
  2015-06-15  7:56     ` [dpdk-dev] [PATCH v3 5/9] vhost: Add new command line option: rxq Ouyang Changchun
                       ` (5 subsequent siblings)
  9 siblings, 0 replies; 65+ messages in thread
From: Ouyang Changchun @ 2015-06-15  7:56 UTC (permalink / raw)
  To: dev

This patch is added since v3.
It checks the validity of the virtqueue addresses before they are used.
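
In essence, every entry the guest has published on the avail ring must index a
descriptor inside the ring. A minimal sketch of the check (the helper name is
illustrative; the patch below inlines this loop in set_vring_addr):

/*
 * Sketch: reject the vring if any new avail entry points outside the
 * descriptor table; such an entry usually means the queue pair was not
 * enabled correctly on the guest.
 */
static int
avail_entries_valid(struct vhost_virtqueue *vq)
{
	uint32_t i;

	for (i = vq->last_used_idx; i < vq->avail->idx; i++)
		if (vq->avail->ring[i] >= vq->size)
			return 0; /* invalid entry */
	return 1;
}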

Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
---
 lib/librte_vhost/vhost_user/vhost-net-user.c | 11 ++++++++++-
 lib/librte_vhost/virtio-net.c                | 10 ++++++++++
 2 files changed, 20 insertions(+), 1 deletion(-)

diff --git a/lib/librte_vhost/vhost_user/vhost-net-user.c b/lib/librte_vhost/vhost_user/vhost-net-user.c
index b66a653..552b501 100644
--- a/lib/librte_vhost/vhost_user/vhost-net-user.c
+++ b/lib/librte_vhost/vhost_user/vhost-net-user.c
@@ -398,7 +398,16 @@ vserver_message_handler(int connfd, void *dat, int *remove)
 		ops->set_vring_num(ctx, &msg.payload.state);
 		break;
 	case VHOST_USER_SET_VRING_ADDR:
-		ops->set_vring_addr(ctx, &msg.payload.addr);
+		if (ops->set_vring_addr(ctx, &msg.payload.addr) != 0) {
+			RTE_LOG(ERR, VHOST_CONFIG,
+				"error found in vhost set vring,"
+				"the vhost device will destroy\n");
+			close(connfd);
+			*remove = 1;
+			free(cfd_ctx);
+			user_destroy_device(ctx);
+			ops->destroy_device(ctx);
+		}
 		break;
 	case VHOST_USER_SET_VRING_BASE:
 		ops->set_vring_base(ctx, &msg.payload.state);
diff --git a/lib/librte_vhost/virtio-net.c b/lib/librte_vhost/virtio-net.c
index 3e24841..80df0ec 100644
--- a/lib/librte_vhost/virtio-net.c
+++ b/lib/librte_vhost/virtio-net.c
@@ -553,6 +553,7 @@ set_vring_addr(struct vhost_device_ctx ctx, struct vhost_vring_addr *addr)
 {
 	struct virtio_net *dev;
 	struct vhost_virtqueue *vq;
+	uint32_t i;
 
 	dev = get_device(ctx);
 	if (dev == NULL)
@@ -580,6 +581,15 @@ set_vring_addr(struct vhost_device_ctx ctx, struct vhost_vring_addr *addr)
 		return -1;
 	}
 
+	for (i = vq->last_used_idx; i < vq->avail->idx; i++)
+		if (vq->avail->ring[i] >= vq->size) {
+			RTE_LOG(ERR, VHOST_CONFIG, "%s (%"PRIu64"):"
+				"Please check virt queue pair idx:%d is "
+				"enalbed correctly on guest.\n", __func__,
+				dev->device_fh, addr->index / VIRTIO_QNUM);
+			return -1;
+		}
+
 	vq->used = (struct vring_used *)(uintptr_t)qva_to_vva(dev,
 			addr->index / VIRTIO_QNUM, addr->used_user_addr);
 	if (vq->used == 0) {
-- 
1.8.4.2

^ permalink raw reply	[flat|nested] 65+ messages in thread

* [dpdk-dev] [PATCH v3 5/9] vhost: Add new command line option: rxq
  2015-06-15  7:56   ` [dpdk-dev] [PATCH v3 0/9] Support multiple queues in vhost Ouyang Changchun
                       ` (3 preceding siblings ...)
  2015-06-15  7:56     ` [dpdk-dev] [PATCH v3 4/9] lib_vhost: Check the virtqueue address's validity Ouyang Changchun
@ 2015-06-15  7:56     ` Ouyang Changchun
  2015-06-15  7:56     ` [dpdk-dev] [PATCH v3 6/9] vhost: Support multiple queues Ouyang Changchun
                       ` (4 subsequent siblings)
  9 siblings, 0 replies; 65+ messages in thread
From: Ouyang Changchun @ 2015-06-15  7:56 UTC (permalink / raw)
  To: dev

The vhost sample needs to know the queue number the user wants to enable for each virtio device,
so add the new option '--rxq' to it.
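
For reference, the relation between the rxq value and the number of usable VMDq
pools is distilled below (the helper name is illustrative; the patch inlines
this switch in port_init):

/*
 * Sketch: with 4 rx queues per pool only half of the VMDq pools
 * remain usable, so the supported device count shrinks accordingly.
 */
static int
vmdq_pools_for_rxq(uint32_t rxq, uint32_t max_vmdq_pools,
		uint32_t *num_devices)
{
	switch (rxq) {
	case 1:
	case 2:
		*num_devices = max_vmdq_pools;
		return 0;
	case 4:
		*num_devices = max_vmdq_pools / 2;
		return 0;
	default:
		return -1; /* only 1, 2 and 4 are valid */
	}
}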

Changes in v3
  - fix coding style

Changes in v2
  - refine help info
  - check if rxq = 0
  - fix checkpatch errors

Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
---
 examples/vhost/main.c | 49 +++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 45 insertions(+), 4 deletions(-)

diff --git a/examples/vhost/main.c b/examples/vhost/main.c
index aba287a..cd9640e 100644
--- a/examples/vhost/main.c
+++ b/examples/vhost/main.c
@@ -163,6 +163,9 @@ static int mergeable;
 /* Do vlan strip on host, enabled on default */
 static uint32_t vlan_strip = 1;
 
+/* Rx queue number per virtio device */
+static uint32_t rxq = 1;
+
 /* number of descriptors to apply*/
 static uint32_t num_rx_descriptor = RTE_TEST_RX_DESC_DEFAULT_ZCP;
 static uint32_t num_tx_descriptor = RTE_TEST_TX_DESC_DEFAULT_ZCP;
@@ -408,8 +411,19 @@ port_init(uint8_t port)
 		txconf->tx_deferred_start = 1;
 	}
 
-	/*configure the number of supported virtio devices based on VMDQ limits */
-	num_devices = dev_info.max_vmdq_pools;
+	/* Configure the virtio devices num based on VMDQ limits */
+	switch (rxq) {
+	case 1:
+	case 2:
+		num_devices = dev_info.max_vmdq_pools;
+		break;
+	case 4:
+		num_devices = dev_info.max_vmdq_pools / 2;
+		break;
+	default:
+		RTE_LOG(ERR, VHOST_CONFIG, "rxq invalid for VMDq.\n");
+		return -1;
+	}
 
 	if (zero_copy) {
 		rx_ring_size = num_rx_descriptor;
@@ -431,7 +445,7 @@ port_init(uint8_t port)
 		return retval;
 	/* NIC queues are divided into pf queues and vmdq queues.  */
 	num_pf_queues = dev_info.max_rx_queues - dev_info.vmdq_queue_num;
-	queues_per_pool = dev_info.vmdq_queue_num / dev_info.max_vmdq_pools;
+	queues_per_pool = dev_info.vmdq_queue_num / num_devices;
 	num_vmdq_queues = num_devices * queues_per_pool;
 	num_queues = num_pf_queues + num_vmdq_queues;
 	vmdq_queue_base = dev_info.vmdq_queue_base;
@@ -576,7 +590,8 @@ us_vhost_usage(const char *prgname)
 	"		--rx-desc-num [0-N]: the number of descriptors on rx, "
 			"used only when zero copy is enabled.\n"
 	"		--tx-desc-num [0-N]: the number of descriptors on tx, "
-			"used only when zero copy is enabled.\n",
+			"used only when zero copy is enabled.\n"
+	"		--rxq [1,2,4]: rx queue number for each vhost device\n",
 	       prgname);
 }
 
@@ -602,6 +617,7 @@ us_vhost_parse_args(int argc, char **argv)
 		{"zero-copy", required_argument, NULL, 0},
 		{"rx-desc-num", required_argument, NULL, 0},
 		{"tx-desc-num", required_argument, NULL, 0},
+		{"rxq", required_argument, NULL, 0},
 		{NULL, 0, 0, 0},
 	};
 
@@ -778,6 +794,18 @@ us_vhost_parse_args(int argc, char **argv)
 				}
 			}
 
+			/* Specify the Rx queue number for each vhost dev. */
+			if (!strncmp(long_option[option_index].name,
+				"rxq", MAX_LONG_OPT_SZ)) {
+				ret = parse_num_opt(optarg, 4);
+				if ((ret == -1) || (ret == 0) || (!POWEROF2(ret))) {
+					RTE_LOG(INFO, VHOST_CONFIG,
+					"Valid value for rxq is [1,2,4]\n");
+					us_vhost_usage(prgname);
+					return -1;
+				} else
+					rxq = ret;
+			}
 			break;
 
 			/* Invalid option - print options. */
@@ -813,6 +841,19 @@ us_vhost_parse_args(int argc, char **argv)
 		return -1;
 	}
 
+	if (rxq > 1) {
+		vmdq_conf_default.rxmode.mq_mode = ETH_MQ_RX_VMDQ_RSS;
+		vmdq_conf_default.rx_adv_conf.rss_conf.rss_hf = ETH_RSS_IP |
+				ETH_RSS_UDP | ETH_RSS_TCP | ETH_RSS_SCTP;
+	}
+
+	if ((zero_copy == 1) && (rxq > 1)) {
+		RTE_LOG(INFO, VHOST_PORT,
+			"Vhost zero copy doesn't support mq mode,"
+			"please specify '--rxq 1' to disable it.\n");
+		return -1;
+	}
+
 	return 0;
 }
 
-- 
1.8.4.2

^ permalink raw reply	[flat|nested] 65+ messages in thread

* [dpdk-dev] [PATCH v3 6/9] vhost: Support multiple queues
  2015-06-15  7:56   ` [dpdk-dev] [PATCH v3 0/9] Support multiple queues in vhost Ouyang Changchun
                       ` (4 preceding siblings ...)
  2015-06-15  7:56     ` [dpdk-dev] [PATCH v3 5/9] vhost: Add new command line option: rxq Ouyang Changchun
@ 2015-06-15  7:56     ` Ouyang Changchun
  2015-06-15  7:56     ` [dpdk-dev] [PATCH v3 7/9] virtio: Resolve for control queue Ouyang Changchun
                       ` (3 subsequent siblings)
  9 siblings, 0 replies; 65+ messages in thread
From: Ouyang Changchun @ 2015-06-15  7:56 UTC (permalink / raw)
  To: dev

The vhost sample leverages VMDq+RSS in HW to receive packets and distribute them
into different queues in the pool according to their 5-tuples.

It also enables multiple queues mode in the vhost/virtio layer.

The number of HW queues per pool must be exactly the same as the queue number in the virtio device;
e.g. rxq = 4 means there are 4 HW queues in each VMDq pool and 4 queues in each
virtio device/port, mapped one to one.

=========================================
==================|   |==================|
       vport0     |   |      vport1      |
---  ---  ---  ---|   |---  ---  ---  ---|
q0 | q1 | q2 | q3 |   |q0 | q1 | q2 | q3 |
/\= =/\= =/\= =/\=|   |/\= =/\= =/\= =/\=|
||   ||   ||   ||      ||   ||   ||   ||
||   ||   ||   ||      ||   ||   ||   ||
||= =||= =||= =||=|   =||== ||== ||== ||=|
q0 | q1 | q2 | q3 |   |q0 | q1 | q2 | q3 |

------------------|   |------------------|
     VMDq pool0   |   |    VMDq pool1    |
==================|   |==================|

On the RX side, it first polls each queue of the pool, gets the packets from
it, and enqueues them into the corresponding queue of the virtio device/port.
On the TX side, it dequeues packets from each queue of the virtio device/port
and sends them to either the physical port or another virtio device according
to their destination MAC address.
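
A condensed sketch of the per-queue RX path this patch introduces (names as in
the diff below; VIRTIO_QNUM is 2, one rx/tx ring per queue pair):

/*
 * Sketch: HW queue i of the device's VMDq pool feeds virtqueue
 * VIRTIO_RXQ + i * VIRTIO_QNUM of the virtio device; vhost copies the
 * packets, so the mbufs are freed right after the enqueue.
 */
for (i = 0; i < rxq; i++) {
	rx_count = rte_eth_rx_burst(ports[0],
			(uint16_t)(vdev->vmdq_rx_q + i),
			pkts_burst, MAX_PKT_BURST);
	if (rx_count) {
		rte_vhost_enqueue_burst(dev,
				VIRTIO_RXQ + i * VIRTIO_QNUM,
				pkts_burst, rx_count);
		while (rx_count)
			rte_pktmbuf_free(pkts_burst[--rx_count]);
	}
}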

Changes in v2:
  - check queue num per pool in VMDq and queue pair number per vhost device
  - remove the unnecessary calling q_num_set api
  - fix checkpatch errors

Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
---
 examples/vhost/main.c | 132 ++++++++++++++++++++++++++++++--------------------
 1 file changed, 79 insertions(+), 53 deletions(-)

diff --git a/examples/vhost/main.c b/examples/vhost/main.c
index cd9640e..d40cb11 100644
--- a/examples/vhost/main.c
+++ b/examples/vhost/main.c
@@ -1001,8 +1001,9 @@ link_vmdq(struct vhost_dev *vdev, struct rte_mbuf *m)
 
 	/* Enable stripping of the vlan tag as we handle routing. */
 	if (vlan_strip)
-		rte_eth_dev_set_vlan_strip_on_queue(ports[0],
-			(uint16_t)vdev->vmdq_rx_q, 1);
+		for (i = 0; i < (int)rxq; i++)
+			rte_eth_dev_set_vlan_strip_on_queue(ports[0],
+				(uint16_t)(vdev->vmdq_rx_q + i), 1);
 
 	/* Set device as ready for RX. */
 	vdev->ready = DEVICE_RX;
@@ -1017,7 +1018,7 @@ link_vmdq(struct vhost_dev *vdev, struct rte_mbuf *m)
 static inline void
 unlink_vmdq(struct vhost_dev *vdev)
 {
-	unsigned i = 0;
+	unsigned i = 0, j = 0;
 	unsigned rx_count;
 	struct rte_mbuf *pkts_burst[MAX_PKT_BURST];
 
@@ -1030,15 +1031,19 @@ unlink_vmdq(struct vhost_dev *vdev)
 		vdev->vlan_tag = 0;
 
 		/*Clear out the receive buffers*/
-		rx_count = rte_eth_rx_burst(ports[0],
-					(uint16_t)vdev->vmdq_rx_q, pkts_burst, MAX_PKT_BURST);
+		for (i = 0; i < rxq; i++) {
+			rx_count = rte_eth_rx_burst(ports[0],
+					(uint16_t)vdev->vmdq_rx_q + i,
+					pkts_burst, MAX_PKT_BURST);
 
-		while (rx_count) {
-			for (i = 0; i < rx_count; i++)
-				rte_pktmbuf_free(pkts_burst[i]);
+			while (rx_count) {
+				for (j = 0; j < rx_count; j++)
+					rte_pktmbuf_free(pkts_burst[j]);
 
-			rx_count = rte_eth_rx_burst(ports[0],
-					(uint16_t)vdev->vmdq_rx_q, pkts_burst, MAX_PKT_BURST);
+				rx_count = rte_eth_rx_burst(ports[0],
+					(uint16_t)vdev->vmdq_rx_q + i,
+					pkts_burst, MAX_PKT_BURST);
+			}
 		}
 
 		vdev->ready = DEVICE_MAC_LEARNING;
@@ -1050,7 +1055,7 @@ unlink_vmdq(struct vhost_dev *vdev)
  * the packet on that devices RX queue. If not then return.
  */
 static inline int __attribute__((always_inline))
-virtio_tx_local(struct vhost_dev *vdev, struct rte_mbuf *m)
+virtio_tx_local(struct vhost_dev *vdev, struct rte_mbuf *m, uint32_t q_idx)
 {
 	struct virtio_net_data_ll *dev_ll;
 	struct ether_hdr *pkt_hdr;
@@ -1065,7 +1070,7 @@ virtio_tx_local(struct vhost_dev *vdev, struct rte_mbuf *m)
 
 	while (dev_ll != NULL) {
 		if ((dev_ll->vdev->ready == DEVICE_RX) && ether_addr_cmp(&(pkt_hdr->d_addr),
-				          &dev_ll->vdev->mac_address)) {
+					&dev_ll->vdev->mac_address)) {
 
 			/* Drop the packet if the TX packet is destined for the TX device. */
 			if (dev_ll->vdev->dev->device_fh == dev->device_fh) {
@@ -1083,7 +1088,9 @@ virtio_tx_local(struct vhost_dev *vdev, struct rte_mbuf *m)
 				LOG_DEBUG(VHOST_DATA, "(%"PRIu64") Device is marked for removal\n", tdev->device_fh);
 			} else {
 				/*send the packet to the local virtio device*/
-				ret = rte_vhost_enqueue_burst(tdev, VIRTIO_RXQ, &m, 1);
+				ret = rte_vhost_enqueue_burst(tdev,
+					VIRTIO_RXQ + q_idx * VIRTIO_QNUM,
+					&m, 1);
 				if (enable_stats) {
 					rte_atomic64_add(
 					&dev_statistics[tdev->device_fh].rx_total_atomic,
@@ -1160,7 +1167,8 @@ find_local_dest(struct virtio_net *dev, struct rte_mbuf *m,
  * or the physical port.
  */
 static inline void __attribute__((always_inline))
-virtio_tx_route(struct vhost_dev *vdev, struct rte_mbuf *m, uint16_t vlan_tag)
+virtio_tx_route(struct vhost_dev *vdev, struct rte_mbuf *m,
+		uint16_t vlan_tag, uint32_t q_idx)
 {
 	struct mbuf_table *tx_q;
 	struct rte_mbuf **m_table;
@@ -1170,7 +1178,8 @@ virtio_tx_route(struct vhost_dev *vdev, struct rte_mbuf *m, uint16_t vlan_tag)
 	struct ether_hdr *nh;
 
 	/*check if destination is local VM*/
-	if ((vm2vm_mode == VM2VM_SOFTWARE) && (virtio_tx_local(vdev, m) == 0)) {
+	if ((vm2vm_mode == VM2VM_SOFTWARE) &&
+		(virtio_tx_local(vdev, m, q_idx) == 0)) {
 		rte_pktmbuf_free(m);
 		return;
 	}
@@ -1334,49 +1343,60 @@ switch_worker(__attribute__((unused)) void *arg)
 			}
 			if (likely(vdev->ready == DEVICE_RX)) {
 				/*Handle guest RX*/
-				rx_count = rte_eth_rx_burst(ports[0],
-					vdev->vmdq_rx_q, pkts_burst, MAX_PKT_BURST);
+				for (i = 0; i < rxq; i++) {
+					rx_count = rte_eth_rx_burst(ports[0],
+						vdev->vmdq_rx_q + i, pkts_burst, MAX_PKT_BURST);
 
-				if (rx_count) {
-					/*
-					* Retry is enabled and the queue is full then we wait and retry to avoid packet loss
-					* Here MAX_PKT_BURST must be less than virtio queue size
-					*/
-					if (enable_retry && unlikely(rx_count > rte_vring_available_entries(dev, VIRTIO_RXQ))) {
-						for (retry = 0; retry < burst_rx_retry_num; retry++) {
-							rte_delay_us(burst_rx_delay_time);
-							if (rx_count <= rte_vring_available_entries(dev, VIRTIO_RXQ))
-								break;
+					if (rx_count) {
+						/*
+						* Retry is enabled and the queue is full then we wait and retry to avoid packet loss
+						* Here MAX_PKT_BURST must be less than virtio queue size
+						*/
+						if (enable_retry && unlikely(rx_count > rte_vring_available_entries(dev,
+											VIRTIO_RXQ + i * VIRTIO_QNUM))) {
+							for (retry = 0; retry < burst_rx_retry_num; retry++) {
+								rte_delay_us(burst_rx_delay_time);
+								if (rx_count <= rte_vring_available_entries(dev,
+											VIRTIO_RXQ + i * VIRTIO_QNUM))
+									break;
+							}
+						}
+						ret_count = rte_vhost_enqueue_burst(dev, VIRTIO_RXQ + i * VIRTIO_QNUM,
+											pkts_burst, rx_count);
+						if (enable_stats) {
+							rte_atomic64_add(
+							&dev_statistics[dev_ll->vdev->dev->device_fh].rx_total_atomic,
+							rx_count);
+							rte_atomic64_add(
+							&dev_statistics[dev_ll->vdev->dev->device_fh].rx_atomic, ret_count);
+						}
+						while (likely(rx_count)) {
+							rx_count--;
+							rte_pktmbuf_free(pkts_burst[rx_count]);
 						}
 					}
-					ret_count = rte_vhost_enqueue_burst(dev, VIRTIO_RXQ, pkts_burst, rx_count);
-					if (enable_stats) {
-						rte_atomic64_add(
-						&dev_statistics[dev_ll->vdev->dev->device_fh].rx_total_atomic,
-						rx_count);
-						rte_atomic64_add(
-						&dev_statistics[dev_ll->vdev->dev->device_fh].rx_atomic, ret_count);
-					}
-					while (likely(rx_count)) {
-						rx_count--;
-						rte_pktmbuf_free(pkts_burst[rx_count]);
-					}
-
 				}
 			}
 
 			if (likely(!vdev->remove)) {
 				/* Handle guest TX*/
-				tx_count = rte_vhost_dequeue_burst(dev, VIRTIO_TXQ, mbuf_pool, pkts_burst, MAX_PKT_BURST);
-				/* If this is the first received packet we need to learn the MAC and setup VMDQ */
-				if (unlikely(vdev->ready == DEVICE_MAC_LEARNING) && tx_count) {
-					if (vdev->remove || (link_vmdq(vdev, pkts_burst[0]) == -1)) {
-						while (tx_count)
-							rte_pktmbuf_free(pkts_burst[--tx_count]);
+				for (i = 0; i < rxq; i++) {
+					tx_count = rte_vhost_dequeue_burst(dev, VIRTIO_TXQ + i * 2,
+							mbuf_pool, pkts_burst, MAX_PKT_BURST);
+					/*
+					 * If this is the first received packet we need to learn
+					 * the MAC and setup VMDQ
+					 */
+					if (unlikely(vdev->ready == DEVICE_MAC_LEARNING) && tx_count) {
+						if (vdev->remove || (link_vmdq(vdev, pkts_burst[0]) == -1)) {
+							while (tx_count)
+								rte_pktmbuf_free(pkts_burst[--tx_count]);
+						}
 					}
+					while (tx_count)
+						virtio_tx_route(vdev, pkts_burst[--tx_count],
+								(uint16_t)dev->device_fh, i);
 				}
-				while (tx_count)
-					virtio_tx_route(vdev, pkts_burst[--tx_count], (uint16_t)dev->device_fh);
 			}
 
 			/*move to the next device in the list*/
@@ -2635,6 +2655,13 @@ new_device (struct virtio_net *dev)
 	struct vhost_dev *vdev;
 	uint32_t regionidx;
 
+	if ((rxq > 1) && (dev->num_virt_queues != rxq)) {
+		RTE_LOG(ERR, VHOST_DATA, "(%"PRIu64") queue num in VMDq pool:"
+			"%d != queue pair num in vhost dev:%d\n",
+			dev->device_fh, rxq, dev->num_virt_queues);
+		return -1;
+	}
+
 	vdev = rte_zmalloc("vhost device", sizeof(*vdev), RTE_CACHE_LINE_SIZE);
 	if (vdev == NULL) {
 		RTE_LOG(INFO, VHOST_DATA, "(%"PRIu64") Couldn't allocate memory for vhost dev\n",
@@ -2680,12 +2707,12 @@ new_device (struct virtio_net *dev)
 		}
 	}
 
-
 	/* Add device to main ll */
 	ll_dev = get_data_ll_free_entry(&ll_root_free);
 	if (ll_dev == NULL) {
-		RTE_LOG(INFO, VHOST_DATA, "(%"PRIu64") No free entry found in linked list. Device limit "
-			"of %d devices per core has been reached\n",
+		RTE_LOG(INFO, VHOST_DATA,
+			"(%"PRIu64") No free entry found in linked list."
+			"Device limit of %d devices per core has been reached\n",
 			dev->device_fh, num_devices);
 		if (vdev->regions_hpa)
 			rte_free(vdev->regions_hpa);
@@ -2694,8 +2721,7 @@ new_device (struct virtio_net *dev)
 	}
 	ll_dev->vdev = vdev;
 	add_data_ll_entry(&ll_root_used, ll_dev);
-	vdev->vmdq_rx_q
-		= dev->device_fh * queues_per_pool + vmdq_queue_base;
+	vdev->vmdq_rx_q	= dev->device_fh * rxq + vmdq_queue_base;
 
 	if (zero_copy) {
 		uint32_t index = vdev->vmdq_rx_q;
-- 
1.8.4.2

^ permalink raw reply	[flat|nested] 65+ messages in thread

* [dpdk-dev] [PATCH v3 7/9] virtio: Resolve for control queue
  2015-06-15  7:56   ` [dpdk-dev] [PATCH v3 0/9] Support multiple queues in vhost Ouyang Changchun
                       ` (5 preceding siblings ...)
  2015-06-15  7:56     ` [dpdk-dev] [PATCH v3 6/9] vhost: Support multiple queues Ouyang Changchun
@ 2015-06-15  7:56     ` Ouyang Changchun
  2015-06-15  7:56     ` [dpdk-dev] [PATCH v3 8/9] vhost: Add per queue stats info Ouyang Changchun
                       ` (2 subsequent siblings)
  9 siblings, 0 replies; 65+ messages in thread
From: Ouyang Changchun @ 2015-06-15  7:56 UTC (permalink / raw)
  To: dev

The control queue doesn't work in vhost-user multiple queue mode,
so introduce a counter to avoid the dead loop when polling the control queue.
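
In essence, the blocking wait on the used ring becomes a bounded poll; a
condensed sketch of the change below:

/*
 * Sketch: give up after CQ_POLL_COUNTER iterations instead of spinning
 * forever when the backend never services the control queue.
 */
uint32_t cq_poll = CQ_POLL_COUNTER;

while ((vq->vq_used_cons_idx == vq->vq_ring.used->idx) && (cq_poll != 0)) {
	rte_rmb();
	usleep(100);
	cq_poll--;
}
if (cq_poll == 0)
	result.status = 0; /* timed out: report success as a workaround */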

Changes in v2:
  - fix checkpatch errors

Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
---
 drivers/net/virtio/virtio_ethdev.c | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c
index fe5f9a1..e4bedbd 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -61,6 +61,7 @@
 #include "virtio_logs.h"
 #include "virtqueue.h"
 
+#define CQ_POLL_COUNTER 500 /* Avoid dead loop when polling control queue */
 
 static int eth_virtio_dev_init(struct rte_eth_dev *eth_dev);
 static int  virtio_dev_configure(struct rte_eth_dev *dev);
@@ -118,6 +119,7 @@ virtio_send_command(struct virtqueue *vq, struct virtio_pmd_ctrl *ctrl,
 	int k, sum = 0;
 	virtio_net_ctrl_ack status = ~0;
 	struct virtio_pmd_ctrl result;
+	uint32_t cq_poll = CQ_POLL_COUNTER;
 
 	ctrl->status = status;
 
@@ -178,9 +180,15 @@ virtio_send_command(struct virtqueue *vq, struct virtio_pmd_ctrl *ctrl,
 	virtqueue_notify(vq);
 
 	rte_rmb();
-	while (vq->vq_used_cons_idx == vq->vq_ring.used->idx) {
+
+	/**
+	 * FIXME: The control queue doesn't work for vhost-user
+	 * multiple queue, introduce cq_poll to avoid the dead loop.
+	 */
+	while ((vq->vq_used_cons_idx == vq->vq_ring.used->idx) && (cq_poll != 0)) {
 		rte_rmb();
 		usleep(100);
+		cq_poll--;
 	}
 
 	while (vq->vq_used_cons_idx != vq->vq_ring.used->idx) {
@@ -208,7 +216,10 @@ virtio_send_command(struct virtqueue *vq, struct virtio_pmd_ctrl *ctrl,
 	PMD_INIT_LOG(DEBUG, "vq->vq_free_cnt=%d\nvq->vq_desc_head_idx=%d",
 			vq->vq_free_cnt, vq->vq_desc_head_idx);
 
-	memcpy(&result, vq->virtio_net_hdr_mz->addr,
+	if (cq_poll == 0)
+		result.status = 0;
+	else
+		memcpy(&result, vq->virtio_net_hdr_mz->addr,
 			sizeof(struct virtio_pmd_ctrl));
 
 	return result.status;
-- 
1.8.4.2

^ permalink raw reply	[flat|nested] 65+ messages in thread

* [dpdk-dev] [PATCH v3 8/9] vhost: Add per queue stats info
  2015-06-15  7:56   ` [dpdk-dev] [PATCH v3 0/9] Support multiple queues in vhost Ouyang Changchun
                       ` (6 preceding siblings ...)
  2015-06-15  7:56     ` [dpdk-dev] [PATCH v3 7/9] virtio: Resolve for control queue Ouyang Changchun
@ 2015-06-15  7:56     ` Ouyang Changchun
  2015-06-15  7:56     ` [dpdk-dev] [PATCH v3 9/9] doc: Update doc for vhost multiple queues Ouyang Changchun
  2015-08-12  8:02     ` [dpdk-dev] [PATCH v4 00/12] Support multiple queues in vhost Ouyang Changchun
  9 siblings, 0 replies; 65+ messages in thread
From: Ouyang Changchun @ 2015-06-15  7:56 UTC (permalink / raw)
  To: dev

Add per queue stats info
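
The flat per-device counters become a per-queue-pair array. A minimal sketch of
the allocation and addressing (the helper name is illustrative; the patch uses
malloc plus memset):

/*
 * Sketch: one qp_statistics slot per possible queue pair, allocated
 * only when stats are enabled; fh is the device handle, q_idx the
 * queue pair index.
 */
static int
alloc_qp_stats(struct device_statistics *stats)
{
	stats->qp_stats = calloc(VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MAX,
			sizeof(struct qp_statistics));
	return (stats->qp_stats == NULL) ? -1 : 0;
}

/*
 * Counters are then addressed per queue, e.g.:
 *   dev_statistics[fh].qp_stats[q_idx].tx_total++;
 *   rte_atomic64_add(&dev_statistics[fh].qp_stats[q_idx].rx_atomic, n);
 */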

Changes in v3
  - fix coding style and displaying format
  - check stats_enable to alloc mem for queue pair

Changes in v2
  - fix the stats issue in tx_local
  - dynamically alloc mem for queue pair stats info
  - fix checkpatch errors

Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
---
 examples/vhost/main.c | 126 +++++++++++++++++++++++++++++++-------------------
 1 file changed, 79 insertions(+), 47 deletions(-)

diff --git a/examples/vhost/main.c b/examples/vhost/main.c
index d40cb11..76f645f 100644
--- a/examples/vhost/main.c
+++ b/examples/vhost/main.c
@@ -314,7 +314,7 @@ struct ipv4_hdr {
 #define VLAN_ETH_HLEN   18
 
 /* Per-device statistics struct */
-struct device_statistics {
+struct qp_statistics {
 	uint64_t tx_total;
 	rte_atomic64_t rx_total_atomic;
 	uint64_t rx_total;
@@ -322,6 +322,10 @@ struct device_statistics {
 	rte_atomic64_t rx_atomic;
 	uint64_t rx;
 } __rte_cache_aligned;
+
+struct device_statistics {
+	struct qp_statistics *qp_stats;
+};
 struct device_statistics dev_statistics[MAX_DEVICES];
 
 /*
@@ -738,6 +742,17 @@ us_vhost_parse_args(int argc, char **argv)
 					return -1;
 				} else {
 					enable_stats = ret;
+					if (enable_stats)
+						for (i = 0; i < MAX_DEVICES; i++) {
+							dev_statistics[i].qp_stats =
+								malloc(VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MAX * sizeof(struct qp_statistics));
+							if (dev_statistics[i].qp_stats == NULL) {
+								RTE_LOG(ERR, VHOST_CONFIG, "Failed to allocate memory for qp stats.\n");
+								return -1;
+							}
+							memset(dev_statistics[i].qp_stats, 0,
+								VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MAX * sizeof(struct qp_statistics));
+						}
 				}
 			}
 
@@ -1093,13 +1108,13 @@ virtio_tx_local(struct vhost_dev *vdev, struct rte_mbuf *m, uint32_t q_idx)
 					&m, 1);
 				if (enable_stats) {
 					rte_atomic64_add(
-					&dev_statistics[tdev->device_fh].rx_total_atomic,
+					&dev_statistics[tdev->device_fh].qp_stats[q_idx].rx_total_atomic,
 					1);
 					rte_atomic64_add(
-					&dev_statistics[tdev->device_fh].rx_atomic,
+					&dev_statistics[tdev->device_fh].qp_stats[q_idx].rx_atomic,
 					ret);
-					dev_statistics[tdev->device_fh].tx_total++;
-					dev_statistics[tdev->device_fh].tx += ret;
+					dev_statistics[dev->device_fh].qp_stats[q_idx].tx_total++;
+					dev_statistics[dev->device_fh].qp_stats[q_idx].tx += ret;
 				}
 			}
 
@@ -1233,8 +1248,8 @@ virtio_tx_route(struct vhost_dev *vdev, struct rte_mbuf *m,
 	tx_q->m_table[len] = m;
 	len++;
 	if (enable_stats) {
-		dev_statistics[dev->device_fh].tx_total++;
-		dev_statistics[dev->device_fh].tx++;
+		dev_statistics[dev->device_fh].qp_stats[q_idx].tx_total++;
+		dev_statistics[dev->device_fh].qp_stats[q_idx].tx++;
 	}
 
 	if (unlikely(len == MAX_PKT_BURST)) {
@@ -1365,10 +1380,10 @@ switch_worker(__attribute__((unused)) void *arg)
 											pkts_burst, rx_count);
 						if (enable_stats) {
 							rte_atomic64_add(
-							&dev_statistics[dev_ll->vdev->dev->device_fh].rx_total_atomic,
+							&dev_statistics[dev_ll->vdev->dev->device_fh].qp_stats[i].rx_total_atomic,
 							rx_count);
 							rte_atomic64_add(
-							&dev_statistics[dev_ll->vdev->dev->device_fh].rx_atomic, ret_count);
+							&dev_statistics[dev_ll->vdev->dev->device_fh].qp_stats[i].rx_atomic, ret_count);
 						}
 						while (likely(rx_count)) {
 							rx_count--;
@@ -1918,8 +1933,8 @@ virtio_tx_route_zcp(struct virtio_net *dev, struct rte_mbuf *m,
 		(mbuf->next == NULL) ? "null" : "non-null");
 
 	if (enable_stats) {
-		dev_statistics[dev->device_fh].tx_total++;
-		dev_statistics[dev->device_fh].tx++;
+		dev_statistics[dev->device_fh].qp_stats[0].tx_total++;
+		dev_statistics[dev->device_fh].qp_stats[0].tx++;
 	}
 
 	if (unlikely(len == MAX_PKT_BURST)) {
@@ -2202,9 +2217,9 @@ switch_worker_zcp(__attribute__((unused)) void *arg)
 					ret_count = virtio_dev_rx_zcp(dev,
 							pkts_burst, rx_count);
 					if (enable_stats) {
-						dev_statistics[dev->device_fh].rx_total
+						dev_statistics[dev->device_fh].qp_stats[0].rx_total
 							+= rx_count;
-						dev_statistics[dev->device_fh].rx
+						dev_statistics[dev->device_fh].qp_stats[0].rx
 							+= ret_count;
 					}
 					while (likely(rx_count)) {
@@ -2824,7 +2839,9 @@ new_device (struct virtio_net *dev)
 	add_data_ll_entry(&lcore_info[vdev->coreid].lcore_ll->ll_root_used, ll_dev);
 
 	/* Initialize device stats */
-	memset(&dev_statistics[dev->device_fh], 0, sizeof(struct device_statistics));
+	if (enable_stats)
+		memset(dev_statistics[dev->device_fh].qp_stats, 0,
+			VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MAX * sizeof(struct qp_statistics));
 
 	/* Disable notifications. */
 	rte_vhost_enable_guest_notification(dev, VIRTIO_RXQ, 0);
@@ -2857,7 +2874,7 @@ print_stats(void)
 	struct virtio_net_data_ll *dev_ll;
 	uint64_t tx_dropped, rx_dropped;
 	uint64_t tx, tx_total, rx, rx_total;
-	uint32_t device_fh;
+	uint32_t device_fh, i;
 	const char clr[] = { 27, '[', '2', 'J', '\0' };
 	const char top_left[] = { 27, '[', '1', ';', '1', 'H','\0' };
 
@@ -2872,35 +2889,53 @@ print_stats(void)
 		dev_ll = ll_root_used;
 		while (dev_ll != NULL) {
 			device_fh = (uint32_t)dev_ll->vdev->dev->device_fh;
-			tx_total = dev_statistics[device_fh].tx_total;
-			tx = dev_statistics[device_fh].tx;
-			tx_dropped = tx_total - tx;
-			if (zero_copy == 0) {
-				rx_total = rte_atomic64_read(
-					&dev_statistics[device_fh].rx_total_atomic);
-				rx = rte_atomic64_read(
-					&dev_statistics[device_fh].rx_atomic);
-			} else {
-				rx_total = dev_statistics[device_fh].rx_total;
-				rx = dev_statistics[device_fh].rx;
-			}
-			rx_dropped = rx_total - rx;
-
-			printf("\nStatistics for device %"PRIu32" ------------------------------"
-					"\nTX total: 		%"PRIu64""
-					"\nTX dropped: 		%"PRIu64""
-					"\nTX successful: 		%"PRIu64""
-					"\nRX total: 		%"PRIu64""
-					"\nRX dropped: 		%"PRIu64""
-					"\nRX successful: 		%"PRIu64"",
-					device_fh,
-					tx_total,
-					tx_dropped,
-					tx,
-					rx_total,
-					rx_dropped,
-					rx);
-
+			for (i = 0; i < rxq; i++) {
+				tx_total = dev_statistics[device_fh].qp_stats[i].tx_total;
+				tx = dev_statistics[device_fh].qp_stats[i].tx;
+				tx_dropped = tx_total - tx;
+				if (zero_copy == 0) {
+					rx_total = rte_atomic64_read(
+						&dev_statistics[device_fh].qp_stats[i].rx_total_atomic);
+					rx = rte_atomic64_read(
+						&dev_statistics[device_fh].qp_stats[i].rx_atomic);
+				} else {
+					rx_total = dev_statistics[device_fh].qp_stats[0].rx_total;
+					rx = dev_statistics[device_fh].qp_stats[0].rx;
+				}
+				rx_dropped = rx_total - rx;
+
+				if (rxq > 1)
+					printf("\nStatistics for device %"PRIu32" queue id: %d------------------"
+						"\nTX total:		%"PRIu64""
+						"\nTX dropped:		%"PRIu64""
+						"\nTX success:		%"PRIu64""
+						"\nRX total:		%"PRIu64""
+						"\nRX dropped:		%"PRIu64""
+						"\nRX success:		%"PRIu64"",
+						device_fh,
+						i,
+						tx_total,
+						tx_dropped,
+						tx,
+						rx_total,
+						rx_dropped,
+						rx);
+				else
+					printf("\nStatistics for device %"PRIu32" ------------------------------"
+						"\nTX total:		%"PRIu64""
+						"\nTX dropped:		%"PRIu64""
+						"\nTX success:		%"PRIu64""
+						"\nRX total:		%"PRIu64""
+						"\nRX dropped:		%"PRIu64""
+						"\nRX success:		%"PRIu64"",
+						device_fh,
+						tx_total,
+						tx_dropped,
+						tx,
+						rx_total,
+						rx_dropped,
+						rx);
+				}
 			dev_ll = dev_ll->next;
 		}
 		printf("\n======================================================\n");
@@ -3070,9 +3105,6 @@ main(int argc, char *argv[])
 	if (init_data_ll() == -1)
 		rte_exit(EXIT_FAILURE, "Failed to initialize linked list\n");
 
-	/* Initialize device stats */
-	memset(&dev_statistics, 0, sizeof(dev_statistics));
-
 	/* Enable stats if the user option is set. */
 	if (enable_stats)
 		pthread_create(&tid, NULL, (void*)print_stats, NULL );
-- 
1.8.4.2

^ permalink raw reply	[flat|nested] 65+ messages in thread

* [dpdk-dev] [PATCH v3 9/9] doc: Update doc for vhost multiple queues
  2015-06-15  7:56   ` [dpdk-dev] [PATCH v3 0/9] Support multiple queues in vhost Ouyang Changchun
                       ` (7 preceding siblings ...)
  2015-06-15  7:56     ` [dpdk-dev] [PATCH v3 8/9] vhost: Add per queue stats info Ouyang Changchun
@ 2015-06-15  7:56     ` Ouyang Changchun
  2015-08-12  8:02     ` [dpdk-dev] [PATCH v4 00/12] Support multiple queues in vhost Ouyang Changchun
  9 siblings, 0 replies; 65+ messages in thread
From: Ouyang Changchun @ 2015-06-15  7:56 UTC (permalink / raw)
  To: dev

Update the sample app guide doc for vhost multiple queues;
update the prog guide doc for the vhost lib multiple queues feature.

This patch is added since v3.

Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
---
 doc/guides/prog_guide/vhost_lib.rst |  35 ++++++++++++
 doc/guides/sample_app_ug/vhost.rst  | 110 ++++++++++++++++++++++++++++++++++++
 2 files changed, 145 insertions(+)

diff --git a/doc/guides/prog_guide/vhost_lib.rst b/doc/guides/prog_guide/vhost_lib.rst
index 48e1fff..e444681 100644
--- a/doc/guides/prog_guide/vhost_lib.rst
+++ b/doc/guides/prog_guide/vhost_lib.rst
@@ -128,6 +128,41 @@ VHOST_GET_VRING_BASE is used as the signal to remove vhost device from data plan
 
 When the socket connection is closed, vhost will destroy the device.
 
+Vhost multiple queues feature
+-----------------------------
+This feature supports multiple queues for each virtio device in vhost.
+The vhost-user backend is used to enable the multiple queues feature; it's not ready for vhost-cuse.
+
+The QEMU patch enabling vhost-user multiple queues has already been merged into the upstream
+sub-tree in the QEMU community and will be included in QEMU 2.4. If using QEMU 2.3, it requires
+applying the same patch onto QEMU 2.3 and rebuilding QEMU before running vhost multiple queues:
+http://patchwork.ozlabs.org/patch/477461/
+
+The vhost will get the queue pair number based on the communication message with QEMU.
+
+It is strongly recommended to set the HW queue number per pool identical to the queue number
+used to start the QEMU guest and to the queue number of the virtio port on the guest.
+
+=========================================
+==================|   |==================|
+       vport0     |   |      vport1      |
+---  ---  ---  ---|   |---  ---  ---  ---|
+q0 | q1 | q2 | q3 |   |q0 | q1 | q2 | q3 |
+/\= =/\= =/\= =/\=|   |/\= =/\= =/\= =/\=|
+||   ||   ||   ||      ||   ||   ||   ||
+||   ||   ||   ||      ||   ||   ||   ||
+||= =||= =||= =||=|   =||== ||== ||== ||=|
+q0 | q1 | q2 | q3 |   |q0 | q1 | q2 | q3 |
+------------------|   |------------------|
+     VMDq pool0   |   |    VMDq pool1    |
+==================|   |==================|
+
+On the RX side, it first polls each queue of the pool, gets the packets from
+it, and enqueues them into the corresponding virtqueue of the virtio device/port.
+On the TX side, it dequeues packets from each virtqueue of the virtio device/port
+and sends them to either the physical port or another virtio device according to
+their destination MAC address.
+
 Vhost supported vSwitch reference
 ---------------------------------
 
diff --git a/doc/guides/sample_app_ug/vhost.rst b/doc/guides/sample_app_ug/vhost.rst
index 730b9da..9a57d19 100644
--- a/doc/guides/sample_app_ug/vhost.rst
+++ b/doc/guides/sample_app_ug/vhost.rst
@@ -514,6 +514,13 @@ It is enabled by default.
 
     user@target:~$ ./build/app/vhost-switch -c f -n 4 --huge-dir /mnt/huge -- --vlan-strip [0, 1]
 
+**rxq.**
+The rxq option specifies the rx queue number per VMDq pool; the default is 1.
+
+.. code-block:: console
+
+    user@target:~$ ./build/app/vhost-switch -c f -n 4 --huge-dir /mnt/huge -- --rxq [1, 2, 4]
+
 Running the Virtual Machine (QEMU)
 ----------------------------------
 
@@ -833,3 +840,106 @@ For example:
 The above message indicates that device 0 has been registered with MAC address cc:bb:bb:bb:bb:bb and VLAN tag 1000.
 Any packets received on the NIC with these values is placed on the devices receive queue.
 When a virtio-net device transmits packets, the VLAN tag is added to the packet by the DPDK vhost sample code.
+
+Vhost multiple queues
+---------------------
+
+This feature supports multiple queues for each virtio device in vhost.
+The vhost-user backend is used to enable the multiple queues feature; it's not ready for vhost-cuse.
+
+The QEMU patch enabling vhost-user multiple queues has already been merged into the upstream
+sub-tree in the QEMU community and will be included in QEMU 2.4. If using QEMU 2.3, it requires
+applying the same patch onto QEMU 2.3 and rebuilding QEMU before running vhost multiple queues:
+http://patchwork.ozlabs.org/patch/477461/
+
+Basically the vhost sample leverages VMDq+RSS in HW to receive packets and distribute them
+into different queues in the pool according to their 5-tuples.
+
+On the other hand, the vhost will get the queue pair number based on the communication message with
+QEMU.
+
+It is strongly recommended to set the HW queue number per pool identical to the queue number
+used to start the QEMU guest and to the queue number of the virtio port on the guest.
+E.g. use '--rxq 4' to set the queue number to 4; it means there are 4 HW queues in each VMDq pool
+and 4 queues in each vhost device/port, and every queue in the pool maps to one queue in the vhost device.
+
+=========================================
+==================|   |==================|
+       vport0     |   |      vport1      |
+---  ---  ---  ---|   |---  ---  ---  ---|
+q0 | q1 | q2 | q3 |   |q0 | q1 | q2 | q3 |
+/\= =/\= =/\= =/\=|   |/\= =/\= =/\= =/\=|
+||   ||   ||   ||      ||   ||   ||   ||
+||   ||   ||   ||      ||   ||   ||   ||
+||= =||= =||= =||=|   =||== ||== ||== ||=|
+q0 | q1 | q2 | q3 |   |q0 | q1 | q2 | q3 |
+------------------|   |------------------|
+     VMDq pool0   |   |    VMDq pool1    |
+==================|   |==================|
+
+On the RX side, it first polls each queue of the pool, gets the packets from
+it, and enqueues them into the corresponding virtqueue of the virtio device/port.
+On the TX side, it dequeues packets from each virtqueue of the virtio device/port
+and sends them to either the physical port or another virtio device according to
+their destination MAC address.
+
+
+Test guidance
+~~~~~~~~~~~~~
+
+#.  On the host, first mount hugepages, insmod uio and igb_uio, and bind one NIC to igb_uio;
+    then run the vhost sample, key steps as follows:
+
+.. code-block:: console
+
+    sudo mount -t hugetlbfs nodev /mnt/huge
+    sudo modprobe uio
+    sudo insmod $RTE_SDK/$RTE_TARGET/kmod/igb_uio.ko
+
+    $RTE_SDK/tools/dpdk_nic_bind.py --bind igb_uio 0000:08:00.0
+    sudo $RTE_SDK/examples/vhost/build/vhost-switch -c 0xf0 -n 4 --huge-dir \
+    /mnt/huge --socket-mem 1024,0 -- -p 1 --vm2vm 0 --dev-basename usvhost --rxq 2
+
+.. note::
+
    Use '--stats 1' to enable stats dumping on screen for vhost.
+
+#.  After step 1, on host, modprobe kvm and kvm_intel, and use qemu command line to start one guest:
+
+.. code-block:: console
+
+    modprobe kvm
+    modprobe kvm_intel
+    sudo mount -t hugetlbfs nodev /dev/hugepages -o pagesize=1G
+
+    $QEMU_PATH/qemu-system-x86_64 -enable-kvm -m 4096 \
+    -object memory-backend-file,id=mem,size=4096M,mem-path=/dev/hugepages,share=on \
+    -numa node,memdev=mem -mem-prealloc -smp 10 -cpu core2duo,+sse3,+sse4.1,+sse4.2 \
+    -name <vm-name> -drive file=<img-path>/vm.img \
+    -chardev socket,id=char0,path=<usvhost-path>/usvhost \
+    -netdev type=vhost-user,id=hostnet2,chardev=char0,vhostforce=on,queues=2 \
+    -device virtio-net-pci,mq=on,vectors=6,netdev=hostnet2,id=net2,mac=52:54:00:12:34:56,csum=off,gso=off,guest_tso4=off,guest_tso6=off,guest_ecn=off \
+    -chardev socket,id=char1,path=<usvhost-path>/usvhost \
+    -netdev type=vhost-user,id=hostnet3,chardev=char1,vhostforce=on,queues=2 \
+    -device virtio-net-pci,mq=on,vectors=6,netdev=hostnet3,id=net3,mac=52:54:00:12:34:57,csum=off,gso=off,guest_tso4=off,guest_tso6=off,guest_ecn=off
+
+#.  Log on to the guest and use testpmd (DPDK based) to test, using multiple virtio queues to rx and tx packets.
+
+.. code-block:: console
+
+    modprobe uio
+    insmod $RTE_SDK/$RTE_TARGET/kmod/igb_uio.ko
+    echo 1024 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages
+    ./tools/dpdk_nic_bind.py --bind igb_uio 00:03.0 00:04.0
+
+    $RTE_SDK/$RTE_TARGET/app/testpmd -c 1f -n 4 -- --rxq=2 --txq=2 --nb-cores=4 \
+    --rx-queue-stats-mapping="(0,0,0),(0,1,1),(1,0,2),(1,1,3)" \
+    --tx-queue-stats-mapping="(0,0,0),(0,1,1),(1,0,2),(1,1,3)" -i --disable-hw-vlan --txqflags 0xf00
+
+    set fwd mac
+    start tx_first
+
+#.  Use a packet generator to send packets with dest MAC 52:54:00:12:34:57 and VLAN tag 1001;
+    select IPv4 as the protocol and continuously incrementing IP addresses.
+
+#.  Testpmd on guest can display packets received/transmitted in both queues of each virtio port.
-- 
1.8.4.2

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [dpdk-dev] [PATCH v3 2/9] lib_vhost: Support multiple queues in virtio dev
  2015-06-15  7:56     ` [dpdk-dev] [PATCH v3 2/9] lib_vhost: Support multiple queues in virtio dev Ouyang Changchun
@ 2015-06-18 13:16       ` Flavio Leitner
  2015-06-19  1:06         ` Ouyang, Changchun
  2015-06-18 13:34       ` Flavio Leitner
  1 sibling, 1 reply; 65+ messages in thread
From: Flavio Leitner @ 2015-06-18 13:16 UTC (permalink / raw)
  To: Ouyang Changchun; +Cc: dev

On Mon, Jun 15, 2015 at 03:56:39PM +0800, Ouyang Changchun wrote:
> Each virtio device could have multiple queues, say 2 or 4, at most 8.
> Enabling this feature allows virtio device/port on guest has the ability to
> use different vCPU to receive/transmit packets from/to each queue.
> 
> In multiple queues mode, virtio device readiness means all queues of
> this virtio device are ready, cleanup/destroy a virtio device also
> requires clearing all queues belong to it.
> 
> Changes in v3:
>   - fix coding style
>   - check virtqueue idx validity
> 
> Changes in v2:
>   - remove the q_num_set api
>   - add the qp_num_get api
>   - determine the queue pair num from qemu message
>   - rework for reset owner message handler
>   - dynamically alloc mem for dev virtqueue
>   - queue pair num could be 0x8000
>   - fix checkpatch errors
> 
> Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
[...]

> diff --git a/lib/librte_vhost/virtio-net.c b/lib/librte_vhost/virtio-net.c
> index fced2ab..aaea7d5 100644
> --- a/lib/librte_vhost/virtio-net.c
> +++ b/lib/librte_vhost/virtio-net.c
> @@ -67,10 +67,10 @@ static struct virtio_net_config_ll *ll_root;
>  #define VHOST_SUPPORTED_FEATURES ((1ULL << VIRTIO_NET_F_MRG_RXBUF) | \
>  				(1ULL << VIRTIO_NET_F_CTRL_VQ) | \
>  				(1ULL << VIRTIO_NET_F_CTRL_RX) | \
> -				(1ULL << VHOST_F_LOG_ALL))
> +				(1ULL << VHOST_F_LOG_ALL)) | \
> +				(1ULL << VIRTIO_NET_F_MQ))

One extra parenthesis after VHOST_F_LOG_ALL
BTW, this series needs a rebase against the latest dpdk.
fbl

>  static uint64_t VHOST_FEATURES = VHOST_SUPPORTED_FEATURES;

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [dpdk-dev] [PATCH v3 2/9] lib_vhost: Support multiple queues in virtio dev
  2015-06-15  7:56     ` [dpdk-dev] [PATCH v3 2/9] lib_vhost: Support multiple queues in virtio dev Ouyang Changchun
  2015-06-18 13:16       ` Flavio Leitner
@ 2015-06-18 13:34       ` Flavio Leitner
  2015-06-19  1:17         ` Ouyang, Changchun
  1 sibling, 1 reply; 65+ messages in thread
From: Flavio Leitner @ 2015-06-18 13:34 UTC (permalink / raw)
  To: Ouyang Changchun; +Cc: dev

On Mon, Jun 15, 2015 at 03:56:39PM +0800, Ouyang Changchun wrote:
> Each virtio device could have multiple queues, say 2 or 4, at most 8.
> Enabling this feature allows virtio device/port on guest has the ability to
> use different vCPU to receive/transmit packets from/to each queue.
> 
> In multiple queues mode, virtio device readiness means all queues of
> this virtio device are ready, cleanup/destroy a virtio device also
> requires clearing all queues belong to it.
> 
> Changes in v3:
>   - fix coding style
>   - check virtqueue idx validity
> 
> Changes in v2:
>   - remove the q_num_set api
>   - add the qp_num_get api
>   - determine the queue pair num from qemu message
>   - rework for reset owner message handler
>   - dynamically alloc mem for dev virtqueue
>   - queue pair num could be 0x8000
>   - fix checkpatch errors
> 
> Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
> ---
>  lib/librte_vhost/rte_virtio_net.h             |  10 +-
>  lib/librte_vhost/vhost-net.h                  |   1 +
>  lib/librte_vhost/vhost_rxtx.c                 |  49 +++++---
>  lib/librte_vhost/vhost_user/vhost-net-user.c  |   4 +-
>  lib/librte_vhost/vhost_user/virtio-net-user.c |  76 +++++++++---
>  lib/librte_vhost/vhost_user/virtio-net-user.h |   2 +
>  lib/librte_vhost/virtio-net.c                 | 161 +++++++++++++++++---------
>  7 files changed, 216 insertions(+), 87 deletions(-)
> 
> diff --git a/lib/librte_vhost/rte_virtio_net.h b/lib/librte_vhost/rte_virtio_net.h
> index 5d38185..873be3e 100644
> --- a/lib/librte_vhost/rte_virtio_net.h
> +++ b/lib/librte_vhost/rte_virtio_net.h
> @@ -59,7 +59,6 @@ struct rte_mbuf;
>  /* Backend value set by guest. */
>  #define VIRTIO_DEV_STOPPED -1
>  
> -
>  /* Enum for virtqueue management. */
>  enum {VIRTIO_RXQ, VIRTIO_TXQ, VIRTIO_QNUM};
>  
> @@ -96,13 +95,14 @@ struct vhost_virtqueue {
>   * Device structure contains all configuration information relating to the device.
>   */
>  struct virtio_net {
> -	struct vhost_virtqueue	*virtqueue[VIRTIO_QNUM];	/**< Contains all virtqueue information. */
>  	struct virtio_memory	*mem;		/**< QEMU memory and memory region information. */
> +	struct vhost_virtqueue	**virtqueue;    /**< Contains all virtqueue information. */
>  	uint64_t		features;	/**< Negotiated feature set. */
>  	uint64_t		device_fh;	/**< device identifier. */
>  	uint32_t		flags;		/**< Device flags. Only used to check if device is running on data core. */
>  #define IF_NAME_SZ (PATH_MAX > IFNAMSIZ ? PATH_MAX : IFNAMSIZ)
>  	char			ifname[IF_NAME_SZ];	/**< Name of the tap device or socket path. */
> +	uint32_t		num_virt_queues;
>  	void			*priv;		/**< private context */
>  } __rte_cache_aligned;


As already pointed out, this breaks ABI.
Do you have a plan for that or are you pushing this for dpdk 2.2?


> @@ -220,4 +220,10 @@ uint16_t rte_vhost_enqueue_burst(struct virtio_net *dev, uint16_t queue_id,
>  uint16_t rte_vhost_dequeue_burst(struct virtio_net *dev, uint16_t queue_id,
>  	struct rte_mempool *mbuf_pool, struct rte_mbuf **pkts, uint16_t count);
>  
> +/**
> + * This function get the queue pair number of one vhost device.
> + * @return
> + *  num of queue pair of specified virtio device.
> + */
> +uint16_t rte_vhost_qp_num_get(struct virtio_net *dev);

This needs to go to rte_vhost_version.map too.

fbl

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [dpdk-dev] [PATCH v3 2/9] lib_vhost: Support multiple queues in virtio dev
  2015-06-18 13:16       ` Flavio Leitner
@ 2015-06-19  1:06         ` Ouyang, Changchun
  0 siblings, 0 replies; 65+ messages in thread
From: Ouyang, Changchun @ 2015-06-19  1:06 UTC (permalink / raw)
  To: Flavio Leitner; +Cc: dev

Hi Flavio,

> -----Original Message-----
> From: Flavio Leitner [mailto:fbl@sysclose.org]
> Sent: Thursday, June 18, 2015 9:17 PM
> To: Ouyang, Changchun
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v3 2/9] lib_vhost: Support multiple queues in
> virtio dev
> 
> On Mon, Jun 15, 2015 at 03:56:39PM +0800, Ouyang Changchun wrote:
> > Each virtio device could have multiple queues, say 2 or 4, at most 8.
> > Enabling this feature allows virtio device/port on guest has the
> > ability to use different vCPU to receive/transmit packets from/to each
> queue.
> >
> > In multiple queues mode, virtio device readiness means all queues of
> > this virtio device are ready, cleanup/destroy a virtio device also
> > requires clearing all queues belong to it.
> >
> > Changes in v3:
> >   - fix coding style
> >   - check virtqueue idx validity
> >
> > Changes in v2:
> >   - remove the q_num_set api
> >   - add the qp_num_get api
> >   - determine the queue pair num from qemu message
> >   - rework for reset owner message handler
> >   - dynamically alloc mem for dev virtqueue
> >   - queue pair num could be 0x8000
> >   - fix checkpatch errors
> >
> > Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
> [...]
> 
> > diff --git a/lib/librte_vhost/virtio-net.c
> > b/lib/librte_vhost/virtio-net.c index fced2ab..aaea7d5 100644
> > --- a/lib/librte_vhost/virtio-net.c
> > +++ b/lib/librte_vhost/virtio-net.c
> > @@ -67,10 +67,10 @@ static struct virtio_net_config_ll *ll_root;
> > #define VHOST_SUPPORTED_FEATURES ((1ULL <<
> VIRTIO_NET_F_MRG_RXBUF) | \
> >  				(1ULL << VIRTIO_NET_F_CTRL_VQ) | \
> >  				(1ULL << VIRTIO_NET_F_CTRL_RX) | \
> > -				(1ULL << VHOST_F_LOG_ALL))
> > +				(1ULL << VHOST_F_LOG_ALL)) | \
> > +				(1ULL << VIRTIO_NET_F_MQ))
> 
> One extra parenthesis after VHOST_F_LOG_ALL BTW, this series need
> rebase with latest dpdk.
> fbl
> 
Yes, will update it.
Thanks
Changchun

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [dpdk-dev] [PATCH v3 2/9] lib_vhost: Support multiple queues in virtio dev
  2015-06-18 13:34       ` Flavio Leitner
@ 2015-06-19  1:17         ` Ouyang, Changchun
  0 siblings, 0 replies; 65+ messages in thread
From: Ouyang, Changchun @ 2015-06-19  1:17 UTC (permalink / raw)
  To: Flavio Leitner; +Cc: dev



> -----Original Message-----
> From: Flavio Leitner [mailto:fbl@sysclose.org]
> Sent: Thursday, June 18, 2015 9:34 PM
> To: Ouyang, Changchun
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v3 2/9] lib_vhost: Support multiple queues in
> virtio dev
> 
> On Mon, Jun 15, 2015 at 03:56:39PM +0800, Ouyang Changchun wrote:
> > Each virtio device could have multiple queues, say 2 or 4, at most 8.
> > Enabling this feature allows virtio device/port on guest has the
> > ability to use different vCPU to receive/transmit packets from/to each
> queue.
> >
> > In multiple queues mode, virtio device readiness means all queues of
> > this virtio device are ready, cleanup/destroy a virtio device also
> > requires clearing all queues belong to it.
> >
> > Changes in v3:
> >   - fix coding style
> >   - check virtqueue idx validity
> >
> > Changes in v2:
> >   - remove the q_num_set api
> >   - add the qp_num_get api
> >   - determine the queue pair num from qemu message
> >   - rework for reset owner message handler
> >   - dynamically alloc mem for dev virtqueue
> >   - queue pair num could be 0x8000
> >   - fix checkpatch errors
> >
> > Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
> > ---
> >  lib/librte_vhost/rte_virtio_net.h             |  10 +-
> >  lib/librte_vhost/vhost-net.h                  |   1 +
> >  lib/librte_vhost/vhost_rxtx.c                 |  49 +++++---
> >  lib/librte_vhost/vhost_user/vhost-net-user.c  |   4 +-
> >  lib/librte_vhost/vhost_user/virtio-net-user.c |  76 +++++++++---
> >  lib/librte_vhost/vhost_user/virtio-net-user.h |   2 +
> >  lib/librte_vhost/virtio-net.c                 | 161 +++++++++++++++++---------
> >  7 files changed, 216 insertions(+), 87 deletions(-)
> >
> > diff --git a/lib/librte_vhost/rte_virtio_net.h
> > b/lib/librte_vhost/rte_virtio_net.h
> > index 5d38185..873be3e 100644
> > --- a/lib/librte_vhost/rte_virtio_net.h
> > +++ b/lib/librte_vhost/rte_virtio_net.h
> > @@ -59,7 +59,6 @@ struct rte_mbuf;
> >  /* Backend value set by guest. */
> >  #define VIRTIO_DEV_STOPPED -1
> >
> > -
> >  /* Enum for virtqueue management. */
> >  enum {VIRTIO_RXQ, VIRTIO_TXQ, VIRTIO_QNUM};
> >
> > @@ -96,13 +95,14 @@ struct vhost_virtqueue {
> >   * Device structure contains all configuration information relating to the
> device.
> >   */
> >  struct virtio_net {
> > -	struct vhost_virtqueue	*virtqueue[VIRTIO_QNUM];	/**< Contains
> all virtqueue information. */
> >  	struct virtio_memory	*mem;		/**< QEMU memory and
> memory region information. */
> > +	struct vhost_virtqueue	**virtqueue;    /**< Contains all virtqueue
> information. */
> >  	uint64_t		features;	/**< Negotiated feature set.
> */
> >  	uint64_t		device_fh;	/**< device identifier. */
> >  	uint32_t		flags;		/**< Device flags. Only used
> to check if device is running on data core. */
> >  #define IF_NAME_SZ (PATH_MAX > IFNAMSIZ ? PATH_MAX : IFNAMSIZ)
> >  	char			ifname[IF_NAME_SZ];	/**< Name of the tap
> device or socket path. */
> > +	uint32_t		num_virt_queues;
> >  	void			*priv;		/**< private context */
> >  } __rte_cache_aligned;
> 
> 
> As already pointed out, this breaks ABI.
> Do you have a plan for that or are you pushing this for dpdk 2.2?

Yes, I think it will be enabled in 2.2.
I have already sent out the ABI announcement a few days ago.
> 
> 
> > @@ -220,4 +220,10 @@ uint16_t rte_vhost_enqueue_burst(struct
> > virtio_net *dev, uint16_t queue_id,  uint16_t
> rte_vhost_dequeue_burst(struct virtio_net *dev, uint16_t queue_id,
> >  	struct rte_mempool *mbuf_pool, struct rte_mbuf **pkts, uint16_t
> > count);
> >
> > +/**
> > + * This function get the queue pair number of one vhost device.
> > + * @return
> > + *  num of queue pair of specified virtio device.
> > + */
> > +uint16_t rte_vhost_qp_num_get(struct virtio_net *dev);
> 
> This needs to go to rte_vhost_version.map too.
Will update it.

Thanks
Changchun

^ permalink raw reply	[flat|nested] 65+ messages in thread

* [dpdk-dev] [PATCH v4 00/12] Support multiple queues in vhost
  2015-06-15  7:56   ` [dpdk-dev] [PATCH v3 0/9] Support multiple queues in vhost Ouyang Changchun
                       ` (8 preceding siblings ...)
  2015-06-15  7:56     ` [dpdk-dev] [PATCH v3 9/9] doc: Update doc for vhost multiple queues Ouyang Changchun
@ 2015-08-12  8:02     ` Ouyang Changchun
  2015-08-12  8:02       ` [dpdk-dev] [PATCH v4 01/12] ixgbe: support VMDq RSS in non-SRIOV environment Ouyang Changchun
                         ` (11 more replies)
  9 siblings, 12 replies; 65+ messages in thread
From: Ouyang Changchun @ 2015-08-12  8:02 UTC (permalink / raw)
  To: dev

This patch set targets R2.2; please ignore it for R2.1.
It is sent out a bit early just to seek more comments.

This patch set supports multiple queues for each virtio device in vhost.
Currently the multiple queues feature is supported only for vhost-user, not yet for vhost-cuse.

The new QEMU patch version (v6) enabling vhost-user multiple queues has already been sent out to
the QEMU community and is in its comment-collecting stage. It requires applying the patch set onto
QEMU and rebuilding QEMU before running vhost multiple queues:
    http://patchwork.ozlabs.org/patch/506333/
    http://patchwork.ozlabs.org/patch/506334/

Note: the QEMU patch is based on top of 2 other patches; see the patch description for more details.
 
Basically the vhost sample leverages VMDq+RSS in HW to receive packets and distribute them
into different queues in the pool according to their 5-tuples.
 
On the other hand, the vhost will get the queue pair number based on the communication message with
QEMU.
 
It is strongly recommended to set the HW queue number per pool identical to the queue number
used to start the QEMU guest and to the queue number of the virtio port on the guest.
E.g. use '--rxq 4' to set the queue number to 4; it means there are 4 HW queues in each VMDq pool
and 4 queues in each vhost device/port, and every queue in the pool maps to one queue in the vhost device.
 
=========================================
==================|   |==================|
       vport0     |   |      vport1      |
---  ---  ---  ---|   |---  ---  ---  ---|
q0 | q1 | q2 | q3 |   |q0 | q1 | q2 | q3 |
/\= =/\= =/\= =/\=|   |/\= =/\= =/\= =/\=|
||   ||   ||   ||      ||   ||   ||   ||
||   ||   ||   ||      ||   ||   ||   ||
||= =||= =||= =||=|   =||== ||== ||== ||=|
q0 | q1 | q2 | q3 |   |q0 | q1 | q2 | q3 |
------------------|   |------------------|
     VMDq pool0   |   |    VMDq pool1    |
==================|   |==================|
 
On the RX side, it first polls each queue of the pool, gets the packets from
it, and enqueues them into the corresponding virtqueue of the virtio device/port.
On the TX side, it dequeues packets from each virtqueue of the virtio device/port
and sends them to either the physical port or another virtio device according to
their destination MAC address.
 
Here is some test guidance.
1. On the host, first mount hugepages, insmod uio and igb_uio, and bind one NIC to igb_uio;
then run the vhost sample, key steps as follows:
sudo mount -t hugetlbfs nodev /mnt/huge
sudo modprobe uio
sudo insmod $RTE_SDK/$RTE_TARGET/kmod/igb_uio.ko
 
$RTE_SDK/tools/dpdk_nic_bind.py --bind igb_uio 0000:08:00.0
sudo $RTE_SDK/examples/vhost/build/vhost-switch -c 0xf0 -n 4 --huge-dir /mnt/huge --socket-mem 1024,0 -- -p 1 --vm2vm 0 --dev-basename usvhost --rxq 2
 
Use '--stats 1' to enable stats dumping on screen for vhost.

2. After step 1, on host, modprobe kvm and kvm_intel, and use qemu command line to start one guest:
modprobe kvm
modprobe kvm_intel
sudo mount -t hugetlbfs nodev /dev/hugepages -o pagesize=1G
 
$QEMU_PATH/qemu-system-x86_64 -enable-kvm -m 4096 -object memory-backend-file,id=mem,size=4096M,mem-path=/dev/hugepages,share=on -numa node,memdev=mem -mem-prealloc -smp 10 -cpu core2duo,+sse3,+sse4.1,+sse4.2 -name <vm-name> -drive file=<img-path>/vm.img -chardev socket,id=char0,path=<usvhost-path>/usvhost -netdev type=vhost-user,id=hostnet2,chardev=char0,vhostforce=on,queues=2 -device virtio-net-pci,mq=on,vectors=6,netdev=hostnet2,id=net2,mac=52:54:00:12:34:56,csum=off,gso=off,guest_tso4=off,guest_tso6=off,guest_ecn=off -chardev socket,id=char1,path=<usvhost-path>/usvhost -netdev type=vhost-user,id=hostnet3,chardev=char1,vhostforce=on,queues=2 -device virtio-net-pci,mq=on,vectors=6,netdev=hostnet3,id=net3,mac=52:54:00:12:34:57,csum=off,gso=off,guest_tso4=off,guest_tso6=off,guest_ecn=off
 
3. Log on to the guest and use testpmd (DPDK based) to test, using multiple virtio queues to rx and tx packets.
modprobe uio
insmod $RTE_SDK/$RTE_TARGET/kmod/igb_uio.ko
echo 1024 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages
./tools/dpdk_nic_bind.py --bind igb_uio 00:03.0 00:04.0
 
$RTE_SDK/$RTE_TARGET/app/testpmd -c 1f -n 4 -- --rxq=2 --txq=2 --nb-cores=4 --rx-queue-stats-mapping="(0,0,0),(0,1,1),(1,0,2),(1,1,3)" --tx-queue-stats-mapping="(0,0,0),(0,1,1),(1,0,2),(1,1,3)" -i --disable-hw-vlan --txqflags 0xf00
 
set fwd mac
start tx_first
 
4. Use a packet generator to send packets with dest MAC 52:54:00:12:34:57 and VLAN tag 1001;
select IPv4 as the protocol and continuously incrementing IP addresses.
 
5. Testpmd on the guest can display the packets received/transmitted on both queues of each virtio port.
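
For example, within the interactive testpmd session the queue-stats mapping set in step 3
lets you check the per-queue counters (standard testpmd commands; 'stop' also prints the
forwarding statistics):

    show port stats all
    stop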

Changchun Ouyang (12):
  ixgbe: support VMDq RSS in non-SRIOV environment
  vhost: support multiple queues in virtio dev
  vhost: update version map file
  vhost: set memory layout for multiple queues mode
  vhost: check the virtqueue address's validity
  vhost: support protocol feature
  vhost: add new command line option: rxq
  vhost: support multiple queues
  virtio: resolve for control queue
  vhost: add per queue stats info
  vhost: alloc core to virtq
  doc: update doc for vhost multiple queues

 doc/guides/prog_guide/vhost_lib.rst           |  38 +++
 doc/guides/sample_app_ug/vhost.rst            | 113 +++++++
 drivers/net/ixgbe/ixgbe_rxtx.c                |  86 ++++-
 drivers/net/virtio/virtio_ethdev.c            |   9 +-
 examples/vhost/Makefile                       |   4 +-
 examples/vhost/main.c                         | 459 +++++++++++++++++---------
 examples/vhost/main.h                         |   3 +-
 lib/librte_ether/rte_ethdev.c                 |  31 ++
 lib/librte_vhost/rte_vhost_version.map        |   2 +-
 lib/librte_vhost/rte_virtio_net.h             |  47 ++-
 lib/librte_vhost/vhost-net.h                  |   4 +
 lib/librte_vhost/vhost_cuse/virtio-net-cdev.c |  57 ++--
 lib/librte_vhost/vhost_rxtx.c                 |  91 +++--
 lib/librte_vhost/vhost_user/vhost-net-user.c  |  29 +-
 lib/librte_vhost/vhost_user/vhost-net-user.h  |   4 +
 lib/librte_vhost/vhost_user/virtio-net-user.c | 164 ++++++---
 lib/librte_vhost/vhost_user/virtio-net-user.h |   4 +
 lib/librte_vhost/virtio-net.c                 | 283 ++++++++++++----
 lib/librte_vhost/virtio-net.h                 |   2 +
 19 files changed, 1087 insertions(+), 343 deletions(-)

-- 
1.8.4.2

^ permalink raw reply	[flat|nested] 65+ messages in thread

* [dpdk-dev] [PATCH v4 01/12] ixgbe: support VMDq RSS in non-SRIOV environment
  2015-08-12  8:02     ` [dpdk-dev] [PATCH v4 00/12] Support multiple queues in vhost Ouyang Changchun
@ 2015-08-12  8:02       ` Ouyang Changchun
  2015-08-12  8:22         ` Vincent JARDIN
  2015-08-12  8:02       ` [dpdk-dev] [PATCH v4 02/12] vhost: support multiple queues in virtio dev Ouyang Changchun
                         ` (10 subsequent siblings)
  11 siblings, 1 reply; 65+ messages in thread
From: Ouyang Changchun @ 2015-08-12  8:02 UTC (permalink / raw)
  To: dev

In a non-SRIOV environment, VMDq RSS can be enabled via the MRQC register.
In theory the queue number per pool could be 2 or 4, but only 2 queues are
available due to a HW limitation; the same limit also exists in the Linux
ixgbe driver.
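
For context, an application selects this mode through the standard ethdev configuration.
Below is a minimal sketch under assumed values (the pool and queue counts passed in are
illustrative, not taken from this patch):

    #include <string.h>
    #include <rte_ethdev.h>

    static int
    configure_vmdq_rss(uint8_t port_id, uint16_t nb_pools, uint16_t q_per_pool)
    {
            struct rte_eth_conf conf;

            memset(&conf, 0, sizeof(conf));
            conf.rxmode.mq_mode = ETH_MQ_RX_VMDQ_RSS;
            conf.rx_adv_conf.vmdq_rx_conf.nb_queue_pools =
                    (enum rte_eth_nb_pools)nb_pools;

            /* Total RX/TX queue count: pools * queues-per-pool; with the
             * HW limit above, q_per_pool is effectively 2. */
            return rte_eth_dev_configure(port_id, nb_pools * q_per_pool,
                            nb_pools * q_per_pool, &conf);
    }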

Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
---
Changes in v2:
  - fix checkpatch errors

Changes in v4:
  - use vmdq_queue_num to calculate queue number per pool

 drivers/net/ixgbe/ixgbe_rxtx.c | 86 +++++++++++++++++++++++++++++++++++-------
 lib/librte_ether/rte_ethdev.c  | 31 +++++++++++++++
 2 files changed, 104 insertions(+), 13 deletions(-)

diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c b/drivers/net/ixgbe/ixgbe_rxtx.c
index 91023b9..d063e12 100644
--- a/drivers/net/ixgbe/ixgbe_rxtx.c
+++ b/drivers/net/ixgbe/ixgbe_rxtx.c
@@ -3513,16 +3513,16 @@ void ixgbe_configure_dcb(struct rte_eth_dev *dev)
 	return;
 }
 
-/*
- * VMDq only support for 10 GbE NIC.
+/**
+ * Config pool for VMDq on 10 GbE NIC.
  */
 static void
-ixgbe_vmdq_rx_hw_configure(struct rte_eth_dev *dev)
+ixgbe_vmdq_pool_configure(struct rte_eth_dev *dev)
 {
 	struct rte_eth_vmdq_rx_conf *cfg;
 	struct ixgbe_hw *hw;
 	enum rte_eth_nb_pools num_pools;
-	uint32_t mrqc, vt_ctl, vlanctrl;
+	uint32_t vt_ctl, vlanctrl;
 	uint32_t vmolr = 0;
 	int i;
 
@@ -3531,12 +3531,6 @@ ixgbe_vmdq_rx_hw_configure(struct rte_eth_dev *dev)
 	cfg = &dev->data->dev_conf.rx_adv_conf.vmdq_rx_conf;
 	num_pools = cfg->nb_queue_pools;
 
-	ixgbe_rss_disable(dev);
-
-	/* MRQC: enable vmdq */
-	mrqc = IXGBE_MRQC_VMDQEN;
-	IXGBE_WRITE_REG(hw, IXGBE_MRQC, mrqc);
-
 	/* PFVTCTL: turn on virtualisation and set the default pool */
 	vt_ctl = IXGBE_VT_CTL_VT_ENABLE | IXGBE_VT_CTL_REPLEN;
 	if (cfg->enable_default_pool)
@@ -3602,7 +3596,29 @@ ixgbe_vmdq_rx_hw_configure(struct rte_eth_dev *dev)
 	IXGBE_WRITE_FLUSH(hw);
 }
 
-/*
+/**
+ * VMDq only support for 10 GbE NIC.
+ */
+static void
+ixgbe_vmdq_rx_hw_configure(struct rte_eth_dev *dev)
+{
+	struct ixgbe_hw *hw;
+	uint32_t mrqc;
+
+	PMD_INIT_FUNC_TRACE();
+	hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+	ixgbe_rss_disable(dev);
+
+	/* MRQC: enable vmdq */
+	mrqc = IXGBE_MRQC_VMDQEN;
+	IXGBE_WRITE_REG(hw, IXGBE_MRQC, mrqc);
+	IXGBE_WRITE_FLUSH(hw);
+
+	ixgbe_vmdq_pool_configure(dev);
+}
+
+/**
  * ixgbe_dcb_config_tx_hw_config - Configure general VMDq TX parameters
  * @hw: pointer to hardware structure
  */
@@ -3707,6 +3723,41 @@ ixgbe_config_vf_rss(struct rte_eth_dev *dev)
 }
 
 static int
+ixgbe_config_vmdq_rss(struct rte_eth_dev *dev)
+{
+	struct ixgbe_hw *hw;
+	uint32_t mrqc;
+
+	ixgbe_rss_configure(dev);
+
+	hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+	/* MRQC: enable VMDQ RSS */
+	mrqc = IXGBE_READ_REG(hw, IXGBE_MRQC);
+	mrqc &= ~IXGBE_MRQC_MRQE_MASK;
+
+	switch (RTE_ETH_DEV_SRIOV(dev).nb_q_per_pool) {
+	case 2:
+		mrqc |= IXGBE_MRQC_VMDQRSS64EN;
+		break;
+
+	case 4:
+		mrqc |= IXGBE_MRQC_VMDQRSS32EN;
+		break;
+
+	default:
+		PMD_INIT_LOG(ERR, "Invalid pool number in non-IOV mode with VMDQ RSS");
+		return -EINVAL;
+	}
+
+	IXGBE_WRITE_REG(hw, IXGBE_MRQC, mrqc);
+
+	ixgbe_vmdq_pool_configure(dev);
+
+	return 0;
+}
+
+static int
 ixgbe_config_vf_default(struct rte_eth_dev *dev)
 {
 	struct ixgbe_hw *hw =
@@ -3762,6 +3813,10 @@ ixgbe_dev_mq_rx_configure(struct rte_eth_dev *dev)
 				ixgbe_vmdq_rx_hw_configure(dev);
 				break;
 
+			case ETH_MQ_RX_VMDQ_RSS:
+				ixgbe_config_vmdq_rss(dev);
+				break;
+
 			case ETH_MQ_RX_NONE:
 				/* if mq_mode is none, disable rss mode.*/
 			default: ixgbe_rss_disable(dev);
@@ -4252,6 +4307,8 @@ ixgbe_dev_rx_init(struct rte_eth_dev *dev)
 
 	/* Setup RX queues */
 	for (i = 0; i < dev->data->nb_rx_queues; i++) {
+		uint32_t psrtype = 0;
+
 		rxq = dev->data->rx_queues[i];
 
 		/*
@@ -4279,12 +4336,10 @@ ixgbe_dev_rx_init(struct rte_eth_dev *dev)
 		if (rx_conf->header_split) {
 			if (hw->mac.type == ixgbe_mac_82599EB) {
 				/* Must setup the PSRTYPE register */
-				uint32_t psrtype;
 				psrtype = IXGBE_PSRTYPE_TCPHDR |
 					IXGBE_PSRTYPE_UDPHDR   |
 					IXGBE_PSRTYPE_IPV4HDR  |
 					IXGBE_PSRTYPE_IPV6HDR;
-				IXGBE_WRITE_REG(hw, IXGBE_PSRTYPE(rxq->reg_idx), psrtype);
 			}
 			srrctl = ((rx_conf->split_hdr_size <<
 				IXGBE_SRRCTL_BSIZEHDRSIZE_SHIFT) &
@@ -4294,6 +4349,11 @@ ixgbe_dev_rx_init(struct rte_eth_dev *dev)
 #endif
 			srrctl = IXGBE_SRRCTL_DESCTYPE_ADV_ONEBUF;
 
+		/* Set RQPL for VMDQ RSS according to max Rx queue */
+		psrtype |= (RTE_ETH_DEV_SRIOV(dev).nb_q_per_pool >> 1) <<
+			IXGBE_PSRTYPE_RQPL_SHIFT;
+		IXGBE_WRITE_REG(hw, IXGBE_PSRTYPE(rxq->reg_idx), psrtype);
+
 		/* Set if packets are dropped when no descriptors available */
 		if (rxq->drop_en)
 			srrctl |= IXGBE_SRRCTL_DROP_EN;
diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 6b2400c..cadbd76 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -906,6 +906,16 @@ rte_eth_dev_check_vf_rss_rxq_num(uint8_t port_id, uint16_t nb_rx_q)
 	return 0;
 }
 
+#define VMDQ_RSS_RX_QUEUE_NUM_MAX 4
+
+static int
+rte_eth_dev_check_vmdq_rss_rxq_num(__rte_unused uint8_t port_id, uint16_t nb_rx_q)
+{
+	if (nb_rx_q > VMDQ_RSS_RX_QUEUE_NUM_MAX)
+		return -EINVAL;
+	return 0;
+}
+
 static int
 rte_eth_dev_check_mq_mode(uint8_t port_id, uint16_t nb_rx_q, uint16_t nb_tx_q,
 			  const struct rte_eth_conf *dev_conf)
@@ -1067,6 +1077,27 @@ rte_eth_dev_check_mq_mode(uint8_t port_id, uint16_t nb_rx_q, uint16_t nb_tx_q,
 				return -EINVAL;
 			}
 		}
+
+		if (dev_conf->rxmode.mq_mode == ETH_MQ_RX_VMDQ_RSS) {
+			uint32_t nb_queue_pools =
+				dev_conf->rx_adv_conf.vmdq_rx_conf.nb_queue_pools;
+			struct rte_eth_dev_info dev_info;
+
+			rte_eth_dev_info_get(port_id, &dev_info);
+			dev->data->dev_conf.rxmode.mq_mode = ETH_MQ_RX_VMDQ_RSS;
+			RTE_ETH_DEV_SRIOV(dev).nb_q_per_pool =
+				dev_info.vmdq_queue_num/nb_queue_pools;
+
+			if (rte_eth_dev_check_vmdq_rss_rxq_num(port_id,
+				RTE_ETH_DEV_SRIOV(dev).nb_q_per_pool) != 0) {
+				PMD_DEBUG_TRACE("ethdev port_id=%d"
+					" SRIOV active, invalid queue"
+					" number for VMDQ RSS, allowed"
+					" values are 1, 2 or 4\n",
+					port_id);
+				return -EINVAL;
+			}
+		}
 	}
 	return 0;
 }
-- 
1.8.4.2

^ permalink raw reply	[flat|nested] 65+ messages in thread

* [dpdk-dev] [PATCH v4 02/12] vhost: support multiple queues in virtio dev
  2015-08-12  8:02     ` [dpdk-dev] [PATCH v4 00/12] Support multiple queues in vhost Ouyang Changchun
  2015-08-12  8:02       ` [dpdk-dev] [PATCH v4 01/12] ixgbe: support VMDq RSS in non-SRIOV environment Ouyang Changchun
@ 2015-08-12  8:02       ` Ouyang Changchun
  2015-08-13 12:52         ` Flavio Leitner
                           ` (2 more replies)
  2015-08-12  8:02       ` [dpdk-dev] [PATCH v4 03/12] vhost: update version map file Ouyang Changchun
                         ` (9 subsequent siblings)
  11 siblings, 3 replies; 65+ messages in thread
From: Ouyang Changchun @ 2015-08-12  8:02 UTC (permalink / raw)
  To: dev

Each virtio device can have multiple queues, say 2 or 4, and at most 8.
Enabling this feature allows a virtio device/port on the guest to use a
different vCPU to receive/transmit packets from/to each queue.

In multiple queues mode, virtio device readiness means that all queues of
this virtio device are ready; cleaning up/destroying a virtio device
likewise requires clearing all queues belonging to it.
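
The virtqueue indexing convention this patch relies on is worth spelling out. The helper
below is illustrative only (it is not part of the patch) and assumes the librte_vhost
definitions VIRTIO_QNUM/VIRTIO_RXQ/VIRTIO_TXQ plus the rte_vhost_qp_num_get() accessor
added in this patch:

    #include <rte_virtio_net.h>

    /* Queue pair qp_idx owns two consecutive slots in dev->virtqueue[]:
     * RX at qp_idx * VIRTIO_QNUM + VIRTIO_RXQ (even index) and
     * TX at qp_idx * VIRTIO_QNUM + VIRTIO_TXQ (odd index). */
    static inline struct vhost_virtqueue *
    get_vq(struct virtio_net *dev, uint16_t qp_idx, int is_tx)
    {
            if (qp_idx >= rte_vhost_qp_num_get(dev))
                    return NULL; /* queue pair not allocated yet */
            return dev->virtqueue[qp_idx * VIRTIO_QNUM +
                            (is_tx ? VIRTIO_TXQ : VIRTIO_RXQ)];
    }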

Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
---
Changes in v4:
  - rebase and fix conflicts
  - resolve comments
  - init each virtq pair if mq is on

Changes in v3:
  - fix coding style
  - check virtqueue idx validity

Changes in v2:
  - remove the q_num_set api
  - add the qp_num_get api
  - determine the queue pair num from qemu message
  - rework for reset owner message handler
  - dynamically alloc mem for dev virtqueue
  - queue pair num could be 0x8000
  - fix checkpatch errors

 lib/librte_vhost/rte_virtio_net.h             |  10 +-
 lib/librte_vhost/vhost-net.h                  |   1 +
 lib/librte_vhost/vhost_rxtx.c                 |  52 +++++---
 lib/librte_vhost/vhost_user/vhost-net-user.c  |   4 +-
 lib/librte_vhost/vhost_user/virtio-net-user.c |  76 +++++++++---
 lib/librte_vhost/vhost_user/virtio-net-user.h |   2 +
 lib/librte_vhost/virtio-net.c                 | 165 +++++++++++++++++---------
 7 files changed, 222 insertions(+), 88 deletions(-)

diff --git a/lib/librte_vhost/rte_virtio_net.h b/lib/librte_vhost/rte_virtio_net.h
index b9bf320..d9e887f 100644
--- a/lib/librte_vhost/rte_virtio_net.h
+++ b/lib/librte_vhost/rte_virtio_net.h
@@ -59,7 +59,6 @@ struct rte_mbuf;
 /* Backend value set by guest. */
 #define VIRTIO_DEV_STOPPED -1
 
-
 /* Enum for virtqueue management. */
 enum {VIRTIO_RXQ, VIRTIO_TXQ, VIRTIO_QNUM};
 
@@ -96,13 +95,14 @@ struct vhost_virtqueue {
  * Device structure contains all configuration information relating to the device.
  */
 struct virtio_net {
-	struct vhost_virtqueue	*virtqueue[VIRTIO_QNUM];	/**< Contains all virtqueue information. */
 	struct virtio_memory	*mem;		/**< QEMU memory and memory region information. */
+	struct vhost_virtqueue	**virtqueue;    /**< Contains all virtqueue information. */
 	uint64_t		features;	/**< Negotiated feature set. */
 	uint64_t		device_fh;	/**< device identifier. */
 	uint32_t		flags;		/**< Device flags. Only used to check if device is running on data core. */
 #define IF_NAME_SZ (PATH_MAX > IFNAMSIZ ? PATH_MAX : IFNAMSIZ)
 	char			ifname[IF_NAME_SZ];	/**< Name of the tap device or socket path. */
+	uint32_t		virt_qp_nb;
 	void			*priv;		/**< private context */
 } __rte_cache_aligned;
 
@@ -235,4 +235,10 @@ uint16_t rte_vhost_enqueue_burst(struct virtio_net *dev, uint16_t queue_id,
 uint16_t rte_vhost_dequeue_burst(struct virtio_net *dev, uint16_t queue_id,
 	struct rte_mempool *mbuf_pool, struct rte_mbuf **pkts, uint16_t count);
 
+/**
+ * This function gets the queue pair number of one vhost device.
+ * @return
+ *  the number of queue pairs of the specified virtio device.
+ */
+uint16_t rte_vhost_qp_num_get(struct virtio_net *dev);
 #endif /* _VIRTIO_NET_H_ */
diff --git a/lib/librte_vhost/vhost-net.h b/lib/librte_vhost/vhost-net.h
index c69b60b..7dff14d 100644
--- a/lib/librte_vhost/vhost-net.h
+++ b/lib/librte_vhost/vhost-net.h
@@ -115,4 +115,5 @@ struct vhost_net_device_ops {
 
 
 struct vhost_net_device_ops const *get_virtio_net_callbacks(void);
+int alloc_vring_queue_pair(struct virtio_net *dev, uint16_t qp_idx);
 #endif /* _VHOST_NET_CDEV_H_ */
diff --git a/lib/librte_vhost/vhost_rxtx.c b/lib/librte_vhost/vhost_rxtx.c
index 0d07338..db4ad88 100644
--- a/lib/librte_vhost/vhost_rxtx.c
+++ b/lib/librte_vhost/vhost_rxtx.c
@@ -43,6 +43,18 @@
 #define MAX_PKT_BURST 32
 
 /**
+ * Check the virtqueue idx validity,
+ * return 1 if pass, otherwise 0.
+ */
+static inline uint8_t __attribute__((always_inline))
+check_virtqueue_idx(uint16_t virtq_idx, uint8_t is_tx, uint32_t virtq_num)
+{
+	if ((is_tx ^ (virtq_idx & 0x1)) || (virtq_idx >= virtq_num))
+		return 0;
+	return 1;
+}
+
+/**
  * This function adds buffers to the virtio devices RX virtqueue. Buffers can
  * be received from the physical port or from another virtio device. A packet
 * count is returned to indicate the number of packets that are successfully
@@ -68,12 +80,15 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,
 	uint8_t success = 0;
 
 	LOG_DEBUG(VHOST_DATA, "(%"PRIu64") virtio_dev_rx()\n", dev->device_fh);
-	if (unlikely(queue_id != VIRTIO_RXQ)) {
-		LOG_DEBUG(VHOST_DATA, "mq isn't supported in this version.\n");
+	if (unlikely(check_virtqueue_idx(queue_id, 0,
+		VIRTIO_QNUM * dev->virt_qp_nb) == 0)) {
+		RTE_LOG(ERR, VHOST_DATA,
+			"%s (%"PRIu64"): virtqueue idx:%d invalid.\n",
+			 __func__, dev->device_fh, queue_id);
 		return 0;
 	}
 
-	vq = dev->virtqueue[VIRTIO_RXQ];
+	vq = dev->virtqueue[queue_id];
 	count = (count > MAX_PKT_BURST) ? MAX_PKT_BURST : count;
 
 	/*
@@ -235,8 +250,9 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,
 }
 
 static inline uint32_t __attribute__((always_inline))
-copy_from_mbuf_to_vring(struct virtio_net *dev, uint16_t res_base_idx,
-	uint16_t res_end_idx, struct rte_mbuf *pkt)
+copy_from_mbuf_to_vring(struct virtio_net *dev, uint16_t queue_id,
+	uint16_t res_base_idx, uint16_t res_end_idx,
+	struct rte_mbuf *pkt)
 {
 	uint32_t vec_idx = 0;
 	uint32_t entry_success = 0;
@@ -264,8 +280,9 @@ copy_from_mbuf_to_vring(struct virtio_net *dev, uint16_t res_base_idx,
 	 * Convert from gpa to vva
 	 * (guest physical addr -> vhost virtual addr)
 	 */
-	vq = dev->virtqueue[VIRTIO_RXQ];
-	vb_addr = gpa_to_vva(dev, vq->buf_vec[vec_idx].buf_addr);
+	vq = dev->virtqueue[queue_id];
+	vb_addr =
+		gpa_to_vva(dev, vq->buf_vec[vec_idx].buf_addr);
 	vb_hdr_addr = vb_addr;
 
 	/* Prefetch buffer address. */
@@ -464,11 +481,15 @@ virtio_dev_merge_rx(struct virtio_net *dev, uint16_t queue_id,
 
 	LOG_DEBUG(VHOST_DATA, "(%"PRIu64") virtio_dev_merge_rx()\n",
 		dev->device_fh);
-	if (unlikely(queue_id != VIRTIO_RXQ)) {
-		LOG_DEBUG(VHOST_DATA, "mq isn't supported in this version.\n");
+	if (unlikely(check_virtqueue_idx(queue_id, 0,
+		VIRTIO_QNUM * dev->virt_qp_nb) == 0)) {
+		RTE_LOG(ERR, VHOST_DATA,
+			"%s (%"PRIu64"): virtqueue idx:%d invalid.\n",
+			 __func__, dev->device_fh, queue_id);
+		return 0;
 	}
 
-	vq = dev->virtqueue[VIRTIO_RXQ];
+	vq = dev->virtqueue[queue_id];
 	count = RTE_MIN((uint32_t)MAX_PKT_BURST, count);
 
 	if (count == 0)
@@ -509,7 +530,7 @@ virtio_dev_merge_rx(struct virtio_net *dev, uint16_t queue_id,
 							res_cur_idx);
 		} while (success == 0);
 
-		entry_success = copy_from_mbuf_to_vring(dev, res_base_idx,
+		entry_success = copy_from_mbuf_to_vring(dev, queue_id, res_base_idx,
 			res_cur_idx, pkts[pkt_idx]);
 
 		rte_compiler_barrier();
@@ -559,12 +580,15 @@ rte_vhost_dequeue_burst(struct virtio_net *dev, uint16_t queue_id,
 	uint16_t free_entries, entry_success = 0;
 	uint16_t avail_idx;
 
-	if (unlikely(queue_id != VIRTIO_TXQ)) {
-		LOG_DEBUG(VHOST_DATA, "mq isn't supported in this version.\n");
+	if (unlikely(check_virtqueue_idx(queue_id, 1,
+		VIRTIO_QNUM * dev->virt_qp_nb) == 0)) {
+		RTE_LOG(ERR, VHOST_DATA,
+			"%s (%"PRIu64"): virtqueue idx:%d invalid.\n",
+			 __func__, dev->device_fh, queue_id);
 		return 0;
 	}
 
-	vq = dev->virtqueue[VIRTIO_TXQ];
+	vq = dev->virtqueue[queue_id];
 	avail_idx =  *((volatile uint16_t *)&vq->avail->idx);
 
 	/* If there are no available buffers then return. */
diff --git a/lib/librte_vhost/vhost_user/vhost-net-user.c b/lib/librte_vhost/vhost_user/vhost-net-user.c
index f406a94..3d7c373 100644
--- a/lib/librte_vhost/vhost_user/vhost-net-user.c
+++ b/lib/librte_vhost/vhost_user/vhost-net-user.c
@@ -383,7 +383,9 @@ vserver_message_handler(int connfd, void *dat, int *remove)
 		ops->set_owner(ctx);
 		break;
 	case VHOST_USER_RESET_OWNER:
-		ops->reset_owner(ctx);
+		RTE_LOG(INFO, VHOST_CONFIG,
+			"(%"PRIu64") VHOST_NET_RESET_OWNER\n", ctx.fh);
+		user_reset_owner(ctx, &msg.payload.state);
 		break;
 
 	case VHOST_USER_SET_MEM_TABLE:
diff --git a/lib/librte_vhost/vhost_user/virtio-net-user.c b/lib/librte_vhost/vhost_user/virtio-net-user.c
index c1ffc38..4c1d4df 100644
--- a/lib/librte_vhost/vhost_user/virtio-net-user.c
+++ b/lib/librte_vhost/vhost_user/virtio-net-user.c
@@ -209,30 +209,46 @@ static int
 virtio_is_ready(struct virtio_net *dev)
 {
 	struct vhost_virtqueue *rvq, *tvq;
+	uint32_t q_idx;
 
 	/* mq support in future.*/
-	rvq = dev->virtqueue[VIRTIO_RXQ];
-	tvq = dev->virtqueue[VIRTIO_TXQ];
-	if (rvq && tvq && rvq->desc && tvq->desc &&
-		(rvq->kickfd != (eventfd_t)-1) &&
-		(rvq->callfd != (eventfd_t)-1) &&
-		(tvq->kickfd != (eventfd_t)-1) &&
-		(tvq->callfd != (eventfd_t)-1)) {
-		RTE_LOG(INFO, VHOST_CONFIG,
-			"virtio is now ready for processing.\n");
-		return 1;
+	for (q_idx = 0; q_idx < dev->virt_qp_nb; q_idx++) {
+		uint32_t virt_rx_q_idx = q_idx * VIRTIO_QNUM + VIRTIO_RXQ;
+		uint32_t virt_tx_q_idx = q_idx * VIRTIO_QNUM + VIRTIO_TXQ;
+
+		rvq = dev->virtqueue[virt_rx_q_idx];
+		tvq = dev->virtqueue[virt_tx_q_idx];
+		if ((rvq == NULL) || (tvq == NULL) ||
+			(rvq->desc == NULL) || (tvq->desc == NULL) ||
+			(rvq->kickfd == (eventfd_t)-1) ||
+			(rvq->callfd == (eventfd_t)-1) ||
+			(tvq->kickfd == (eventfd_t)-1) ||
+			(tvq->callfd == (eventfd_t)-1)) {
+			RTE_LOG(INFO, VHOST_CONFIG,
+				"virtio isn't ready for processing.\n");
+			return 0;
+		}
 	}
 	RTE_LOG(INFO, VHOST_CONFIG,
-		"virtio isn't ready for processing.\n");
-	return 0;
+		"virtio is now ready for processing.\n");
+	return 1;
 }
 
 void
 user_set_vring_call(struct vhost_device_ctx ctx, struct VhostUserMsg *pmsg)
 {
 	struct vhost_vring_file file;
+	struct virtio_net *dev = get_device(ctx);
+	uint32_t cur_qp_idx;
 
 	file.index = pmsg->payload.u64 & VHOST_USER_VRING_IDX_MASK;
+	cur_qp_idx = file.index >> 1;
+
+	if (dev->virt_qp_nb < cur_qp_idx + 1) {
+		if (alloc_vring_queue_pair(dev, cur_qp_idx) == 0)
+			dev->virt_qp_nb = cur_qp_idx + 1;
+	}
+
 	if (pmsg->payload.u64 & VHOST_USER_VRING_NOFD_MASK)
 		file.fd = -1;
 	else
@@ -290,13 +306,37 @@ user_get_vring_base(struct vhost_device_ctx ctx,
 	 * sent and only sent in vhost_vring_stop.
 	 * TODO: cleanup the vring, it isn't usable since here.
 	 */
-	if (((int)dev->virtqueue[VIRTIO_RXQ]->kickfd) >= 0) {
-		close(dev->virtqueue[VIRTIO_RXQ]->kickfd);
-		dev->virtqueue[VIRTIO_RXQ]->kickfd = (eventfd_t)-1;
+	if (((int)dev->virtqueue[state->index]->kickfd) >= 0) {
+		close(dev->virtqueue[state->index]->kickfd);
+		dev->virtqueue[state->index]->kickfd = (eventfd_t)-1;
 	}
-	if (((int)dev->virtqueue[VIRTIO_TXQ]->kickfd) >= 0) {
-		close(dev->virtqueue[VIRTIO_TXQ]->kickfd);
-		dev->virtqueue[VIRTIO_TXQ]->kickfd = (eventfd_t)-1;
+
+	return 0;
+}
+
+/*
+ * when virtio is stopped, qemu will send us the RESET_OWNER message.
+ */
+int
+user_reset_owner(struct vhost_device_ctx ctx,
+	struct vhost_vring_state *state)
+{
+	struct virtio_net *dev = get_device(ctx);
+
+	/* We have to stop the queue (virtio) if it is running. */
+	if (dev->flags & VIRTIO_DEV_RUNNING)
+		notify_ops->destroy_device(dev);
+
+	RTE_LOG(INFO, VHOST_CONFIG,
+		"reset owner --- state idx:%d state num:%d\n", state->index, state->num);
+	/*
+	 * Based on current qemu vhost-user implementation, this message is
+	 * sent and only sent in vhost_net_stop_one.
+	 * TODO: cleanup the vring, it isn't usable since here.
+	 */
+	if (((int)dev->virtqueue[state->index]->kickfd) >= 0) {
+		close(dev->virtqueue[state->index]->kickfd);
+		dev->virtqueue[state->index]->kickfd = (eventfd_t)-1;
 	}
 
 	return 0;
diff --git a/lib/librte_vhost/vhost_user/virtio-net-user.h b/lib/librte_vhost/vhost_user/virtio-net-user.h
index df24860..2429836 100644
--- a/lib/librte_vhost/vhost_user/virtio-net-user.h
+++ b/lib/librte_vhost/vhost_user/virtio-net-user.h
@@ -46,4 +46,6 @@ void user_set_vring_kick(struct vhost_device_ctx, struct VhostUserMsg *);
 int user_get_vring_base(struct vhost_device_ctx, struct vhost_vring_state *);
 
 void user_destroy_device(struct vhost_device_ctx);
+
+int user_reset_owner(struct vhost_device_ctx ctx, struct vhost_vring_state *state);
 #endif
diff --git a/lib/librte_vhost/virtio-net.c b/lib/librte_vhost/virtio-net.c
index b520ec5..2a4b791 100644
--- a/lib/librte_vhost/virtio-net.c
+++ b/lib/librte_vhost/virtio-net.c
@@ -71,9 +71,10 @@ static struct virtio_net_config_ll *ll_root;
 #define VHOST_SUPPORTED_FEATURES ((1ULL << VIRTIO_NET_F_MRG_RXBUF) | \
 				(1ULL << VIRTIO_NET_F_CTRL_VQ) | \
 				(1ULL << VIRTIO_NET_F_CTRL_RX) | \
-				(1ULL << VHOST_F_LOG_ALL))
-static uint64_t VHOST_FEATURES = VHOST_SUPPORTED_FEATURES;
+				(1ULL << VHOST_F_LOG_ALL) | \
+				(1ULL << VIRTIO_NET_F_MQ))
 
+static uint64_t VHOST_FEATURES = VHOST_SUPPORTED_FEATURES;
 
 /*
  * Converts QEMU virtual address to Vhost virtual address. This function is
@@ -182,6 +183,8 @@ add_config_ll_entry(struct virtio_net_config_ll *new_ll_dev)
 static void
 cleanup_device(struct virtio_net *dev)
 {
+	uint32_t qp_idx;
+
 	/* Unmap QEMU memory file if mapped. */
 	if (dev->mem) {
 		munmap((void *)(uintptr_t)dev->mem->mapped_address,
@@ -190,14 +193,18 @@ cleanup_device(struct virtio_net *dev)
 	}
 
 	/* Close any event notifiers opened by device. */
-	if ((int)dev->virtqueue[VIRTIO_RXQ]->callfd >= 0)
-		close((int)dev->virtqueue[VIRTIO_RXQ]->callfd);
-	if ((int)dev->virtqueue[VIRTIO_RXQ]->kickfd >= 0)
-		close((int)dev->virtqueue[VIRTIO_RXQ]->kickfd);
-	if ((int)dev->virtqueue[VIRTIO_TXQ]->callfd >= 0)
-		close((int)dev->virtqueue[VIRTIO_TXQ]->callfd);
-	if ((int)dev->virtqueue[VIRTIO_TXQ]->kickfd >= 0)
-		close((int)dev->virtqueue[VIRTIO_TXQ]->kickfd);
+	for (qp_idx = 0; qp_idx < dev->virt_qp_nb; qp_idx++) {
+		uint32_t virt_rx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_RXQ;
+		uint32_t virt_tx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_TXQ;
+		if ((int)dev->virtqueue[virt_rx_q_idx]->callfd >= 0)
+			close((int)dev->virtqueue[virt_rx_q_idx]->callfd);
+		if ((int)dev->virtqueue[virt_rx_q_idx]->kickfd >= 0)
+			close((int)dev->virtqueue[virt_rx_q_idx]->kickfd);
+		if ((int)dev->virtqueue[virt_tx_q_idx]->callfd >= 0)
+			close((int)dev->virtqueue[virt_tx_q_idx]->callfd);
+		if ((int)dev->virtqueue[virt_tx_q_idx]->kickfd >= 0)
+			close((int)dev->virtqueue[virt_tx_q_idx]->kickfd);
+	}
 }
 
 /*
@@ -206,9 +213,17 @@ cleanup_device(struct virtio_net *dev)
 static void
 free_device(struct virtio_net_config_ll *ll_dev)
 {
-	/* Free any malloc'd memory */
-	rte_free(ll_dev->dev.virtqueue[VIRTIO_RXQ]);
-	rte_free(ll_dev->dev.virtqueue[VIRTIO_TXQ]);
+	uint32_t qp_idx;
+
+	/*
+	 * Free any malloc'd memory.
+	 */
+	/* Free every queue pair. */
+	for (qp_idx = 0; qp_idx < ll_dev->dev.virt_qp_nb; qp_idx++) {
+		uint32_t virt_rx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_RXQ;
+		rte_free(ll_dev->dev.virtqueue[virt_rx_q_idx]);
+	}
+	rte_free(ll_dev->dev.virtqueue);
 	rte_free(ll_dev);
 }
 
@@ -242,6 +257,27 @@ rm_config_ll_entry(struct virtio_net_config_ll *ll_dev,
 }
 
 /*
+ *  Initialise all variables in vring queue pair.
+ */
+static void
+init_vring_queue_pair(struct virtio_net *dev, uint16_t qp_idx)
+{
+	uint32_t virt_rx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_RXQ;
+	uint32_t virt_tx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_TXQ;
+	memset(dev->virtqueue[virt_rx_q_idx], 0, sizeof(struct vhost_virtqueue));
+	memset(dev->virtqueue[virt_tx_q_idx], 0, sizeof(struct vhost_virtqueue));
+
+	dev->virtqueue[virt_rx_q_idx]->kickfd = (eventfd_t)-1;
+	dev->virtqueue[virt_rx_q_idx]->callfd = (eventfd_t)-1;
+	dev->virtqueue[virt_tx_q_idx]->kickfd = (eventfd_t)-1;
+	dev->virtqueue[virt_tx_q_idx]->callfd = (eventfd_t)-1;
+
+	/* Backends are set to -1 indicating an inactive device. */
+	dev->virtqueue[virt_rx_q_idx]->backend = VIRTIO_DEV_STOPPED;
+	dev->virtqueue[virt_tx_q_idx]->backend = VIRTIO_DEV_STOPPED;
+}
+
+/*
  *  Initialise all variables in device structure.
  */
 static void
@@ -258,17 +294,34 @@ init_device(struct virtio_net *dev)
 	/* Set everything to 0. */
 	memset((void *)(uintptr_t)((uint64_t)(uintptr_t)dev + vq_offset), 0,
 		(sizeof(struct virtio_net) - (size_t)vq_offset));
-	memset(dev->virtqueue[VIRTIO_RXQ], 0, sizeof(struct vhost_virtqueue));
-	memset(dev->virtqueue[VIRTIO_TXQ], 0, sizeof(struct vhost_virtqueue));
 
-	dev->virtqueue[VIRTIO_RXQ]->kickfd = (eventfd_t)-1;
-	dev->virtqueue[VIRTIO_RXQ]->callfd = (eventfd_t)-1;
-	dev->virtqueue[VIRTIO_TXQ]->kickfd = (eventfd_t)-1;
-	dev->virtqueue[VIRTIO_TXQ]->callfd = (eventfd_t)-1;
+	init_vring_queue_pair(dev, 0);
+	dev->virt_qp_nb = 1;
+}
+
+/*
+ *  Alloc mem for vring queue pair.
+ */
+int
+alloc_vring_queue_pair(struct virtio_net *dev, uint16_t qp_idx)
+{
+	struct vhost_virtqueue *virtqueue = NULL;
+	uint32_t virt_rx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_RXQ;
+	uint32_t virt_tx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_TXQ;
 
-	/* Backends are set to -1 indicating an inactive device. */
-	dev->virtqueue[VIRTIO_RXQ]->backend = VIRTIO_DEV_STOPPED;
-	dev->virtqueue[VIRTIO_TXQ]->backend = VIRTIO_DEV_STOPPED;
+	virtqueue = rte_malloc(NULL, sizeof(struct vhost_virtqueue) * VIRTIO_QNUM, 0);
+	if (virtqueue == NULL) {
+		RTE_LOG(ERR, VHOST_CONFIG,
+			"Failed to allocate memory for virt qp:%d.\n", qp_idx);
+		return -1;
+	}
+
+	dev->virtqueue[virt_rx_q_idx] = virtqueue;
+	dev->virtqueue[virt_tx_q_idx] = virtqueue + VIRTIO_TXQ;
+
+	init_vring_queue_pair(dev, qp_idx);
+
+	return 0;
 }
 
 /*
@@ -280,7 +333,6 @@ static int
 new_device(struct vhost_device_ctx ctx)
 {
 	struct virtio_net_config_ll *new_ll_dev;
-	struct vhost_virtqueue *virtqueue_rx, *virtqueue_tx;
 
 	/* Setup device and virtqueues. */
 	new_ll_dev = rte_malloc(NULL, sizeof(struct virtio_net_config_ll), 0);
@@ -291,28 +343,22 @@ new_device(struct vhost_device_ctx ctx)
 		return -1;
 	}
 
-	virtqueue_rx = rte_malloc(NULL, sizeof(struct vhost_virtqueue), 0);
-	if (virtqueue_rx == NULL) {
-		rte_free(new_ll_dev);
+	new_ll_dev->dev.virtqueue =
+		rte_malloc(NULL, VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MAX * sizeof(struct vhost_virtqueue *), 0);
+	if (new_ll_dev->dev.virtqueue == NULL) {
 		RTE_LOG(ERR, VHOST_CONFIG,
-			"(%"PRIu64") Failed to allocate memory for rxq.\n",
+			"(%"PRIu64") Failed to allocate memory for dev.virtqueue.\n",
 			ctx.fh);
+		rte_free(new_ll_dev);
 		return -1;
 	}
 
-	virtqueue_tx = rte_malloc(NULL, sizeof(struct vhost_virtqueue), 0);
-	if (virtqueue_tx == NULL) {
-		rte_free(virtqueue_rx);
+	if (alloc_vring_queue_pair(&new_ll_dev->dev, 0) == -1) {
+		rte_free(new_ll_dev->dev.virtqueue);
 		rte_free(new_ll_dev);
-		RTE_LOG(ERR, VHOST_CONFIG,
-			"(%"PRIu64") Failed to allocate memory for txq.\n",
-			ctx.fh);
 		return -1;
 	}
 
-	new_ll_dev->dev.virtqueue[VIRTIO_RXQ] = virtqueue_rx;
-	new_ll_dev->dev.virtqueue[VIRTIO_TXQ] = virtqueue_tx;
-
 	/* Initialise device and virtqueues. */
 	init_device(&new_ll_dev->dev);
 
@@ -396,7 +442,7 @@ set_owner(struct vhost_device_ctx ctx)
  * Called from CUSE IOCTL: VHOST_RESET_OWNER
  */
 static int
-reset_owner(struct vhost_device_ctx ctx)
+reset_owner(__rte_unused struct vhost_device_ctx ctx)
 {
 	struct virtio_net_config_ll *ll_dev;
 
@@ -434,6 +480,7 @@ static int
 set_features(struct vhost_device_ctx ctx, uint64_t *pu)
 {
 	struct virtio_net *dev;
+	uint32_t q_idx;
 
 	dev = get_device(ctx);
 	if (dev == NULL)
@@ -445,22 +492,26 @@ set_features(struct vhost_device_ctx ctx, uint64_t *pu)
 	dev->features = *pu;
 
 	/* Set the vhost_hlen depending on if VIRTIO_NET_F_MRG_RXBUF is set. */
-	if (dev->features & (1 << VIRTIO_NET_F_MRG_RXBUF)) {
-		LOG_DEBUG(VHOST_CONFIG,
-			"(%"PRIu64") Mergeable RX buffers enabled\n",
-			dev->device_fh);
-		dev->virtqueue[VIRTIO_RXQ]->vhost_hlen =
-			sizeof(struct virtio_net_hdr_mrg_rxbuf);
-		dev->virtqueue[VIRTIO_TXQ]->vhost_hlen =
-			sizeof(struct virtio_net_hdr_mrg_rxbuf);
-	} else {
-		LOG_DEBUG(VHOST_CONFIG,
-			"(%"PRIu64") Mergeable RX buffers disabled\n",
-			dev->device_fh);
-		dev->virtqueue[VIRTIO_RXQ]->vhost_hlen =
-			sizeof(struct virtio_net_hdr);
-		dev->virtqueue[VIRTIO_TXQ]->vhost_hlen =
-			sizeof(struct virtio_net_hdr);
+	for (q_idx = 0; q_idx < dev->virt_qp_nb; q_idx++) {
+		uint32_t virt_rx_q_idx = q_idx * VIRTIO_QNUM + VIRTIO_RXQ;
+		uint32_t virt_tx_q_idx = q_idx * VIRTIO_QNUM + VIRTIO_TXQ;
+		if (dev->features & (1 << VIRTIO_NET_F_MRG_RXBUF)) {
+			LOG_DEBUG(VHOST_CONFIG,
+				"(%"PRIu64") Mergeable RX buffers enabled\n",
+				dev->device_fh);
+			dev->virtqueue[virt_rx_q_idx]->vhost_hlen =
+				sizeof(struct virtio_net_hdr_mrg_rxbuf);
+			dev->virtqueue[virt_tx_q_idx]->vhost_hlen =
+				sizeof(struct virtio_net_hdr_mrg_rxbuf);
+		} else {
+			LOG_DEBUG(VHOST_CONFIG,
+				"(%"PRIu64") Mergeable RX buffers disabled\n",
+				dev->device_fh);
+			dev->virtqueue[virt_rx_q_idx]->vhost_hlen =
+				sizeof(struct virtio_net_hdr);
+			dev->virtqueue[virt_tx_q_idx]->vhost_hlen =
+				sizeof(struct virtio_net_hdr);
+		}
 	}
 	return 0;
 }
@@ -826,6 +877,14 @@ int rte_vhost_feature_enable(uint64_t feature_mask)
 	return -1;
 }
 
+uint16_t rte_vhost_qp_num_get(struct virtio_net *dev)
+{
+	if (dev == NULL)
+		return 0;
+
+	return dev->virt_qp_nb;
+}
+
 /*
  * Register ops so that we can add/remove device to data core.
  */
-- 
1.8.4.2

^ permalink raw reply	[flat|nested] 65+ messages in thread

* [dpdk-dev] [PATCH v4 03/12] vhost: update version map file
  2015-08-12  8:02     ` [dpdk-dev] [PATCH v4 00/12] Support multiple queues in vhost Ouyang Changchun
  2015-08-12  8:02       ` [dpdk-dev] [PATCH v4 01/12] ixgbe: support VMDq RSS in non-SRIOV environment Ouyang Changchun
  2015-08-12  8:02       ` [dpdk-dev] [PATCH v4 02/12] vhost: support multiple queues in virtio dev Ouyang Changchun
@ 2015-08-12  8:02       ` Ouyang Changchun
  2015-08-12  8:24         ` Panu Matilainen
  2015-08-12  8:02       ` [dpdk-dev] [PATCH v4 04/12] vhost: set memory layout for multiple queues mode Ouyang Changchun
                         ` (8 subsequent siblings)
  11 siblings, 1 reply; 65+ messages in thread
From: Ouyang Changchun @ 2015-08-12  8:02 UTC (permalink / raw)
  To: dev

From: Changchun Ouyang <changchun.ouyang@intel.com>

The rte_vhost_qp_num_get export is added in v4.

Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
---
 lib/librte_vhost/rte_vhost_version.map | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/librte_vhost/rte_vhost_version.map b/lib/librte_vhost/rte_vhost_version.map
index 3d8709e..0bb1c0f 100644
--- a/lib/librte_vhost/rte_vhost_version.map
+++ b/lib/librte_vhost/rte_vhost_version.map
@@ -18,5 +18,5 @@ DPDK_2.1 {
 	global:
 
 	rte_vhost_driver_unregister;
-
+	rte_vhost_qp_num_get;
 } DPDK_2.0;
-- 
1.8.4.2

^ permalink raw reply	[flat|nested] 65+ messages in thread

* [dpdk-dev] [PATCH v4 04/12] vhost: set memory layout for multiple queues mode
  2015-08-12  8:02     ` [dpdk-dev] [PATCH v4 00/12] Support multiple queues in vhost Ouyang Changchun
                         ` (2 preceding siblings ...)
  2015-08-12  8:02       ` [dpdk-dev] [PATCH v4 03/12] vhost: update version map file Ouyang Changchun
@ 2015-08-12  8:02       ` Ouyang Changchun
  2015-08-12  8:02       ` [dpdk-dev] [PATCH v4 05/12] vhost: check the virtqueue address's validity Ouyang Changchun
                         ` (7 subsequent siblings)
  11 siblings, 0 replies; 65+ messages in thread
From: Ouyang Changchun @ 2015-08-12  8:02 UTC (permalink / raw)
  To: dev

QEMU sends separate commands, in order, to set the memory layout for each
queue in one virtio device; accordingly, vhost needs to keep memory layout
information for each queue of the virtio device.

This also requires adjusting the interface of the function gpa_to_vva a bit,
introducing a queue index to specify which queue of the device to use when
looking up the vhost virtual address for an incoming guest physical address.
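
As a usage sketch (mirroring the hunks below; dev, queue_id and desc are assumed from
the calling context), a caller now derives the queue pair index from the virtqueue id
before translating:

    /* The queue pair index selects the per-queue memory layout. */
    uint32_t qp_idx = queue_id / VIRTIO_QNUM;
    uint64_t buff_addr = gpa_to_vva(dev, qp_idx, desc->addr);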

Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
---
Changes in v4
  - rebase and fix conflicts
  - call calloc for dev.mem_arr

Changes in v3
  - fix coding style

Changes in v2
  - q_idx is changed into qp_idx
  - dynamically alloc mem for dev mem_arr
  - fix checkpatch errors

 examples/vhost/main.c                         | 21 +++++-----
 lib/librte_vhost/rte_virtio_net.h             | 10 +++--
 lib/librte_vhost/vhost_cuse/virtio-net-cdev.c | 57 ++++++++++++++------------
 lib/librte_vhost/vhost_rxtx.c                 | 22 +++++-----
 lib/librte_vhost/vhost_user/virtio-net-user.c | 59 ++++++++++++++-------------
 lib/librte_vhost/virtio-net.c                 | 38 ++++++++++++-----
 6 files changed, 119 insertions(+), 88 deletions(-)

diff --git a/examples/vhost/main.c b/examples/vhost/main.c
index 1b137b9..d3c45dd 100644
--- a/examples/vhost/main.c
+++ b/examples/vhost/main.c
@@ -1466,11 +1466,11 @@ attach_rxmbuf_zcp(struct virtio_net *dev)
 		desc = &vq->desc[desc_idx];
 		if (desc->flags & VRING_DESC_F_NEXT) {
 			desc = &vq->desc[desc->next];
-			buff_addr = gpa_to_vva(dev, desc->addr);
+			buff_addr = gpa_to_vva(dev, 0, desc->addr);
 			phys_addr = gpa_to_hpa(vdev, desc->addr, desc->len,
 					&addr_type);
 		} else {
-			buff_addr = gpa_to_vva(dev,
+			buff_addr = gpa_to_vva(dev, 0,
 					desc->addr + vq->vhost_hlen);
 			phys_addr = gpa_to_hpa(vdev,
 					desc->addr + vq->vhost_hlen,
@@ -1722,7 +1722,7 @@ virtio_dev_rx_zcp(struct virtio_net *dev, struct rte_mbuf **pkts,
 			rte_pktmbuf_data_len(buff), 0);
 
 		/* Buffer address translation for virtio header. */
-		buff_hdr_addr = gpa_to_vva(dev, desc->addr);
+		buff_hdr_addr = gpa_to_vva(dev, 0, desc->addr);
 		packet_len = rte_pktmbuf_data_len(buff) + vq->vhost_hlen;
 
 		/*
@@ -1946,7 +1946,7 @@ virtio_dev_tx_zcp(struct virtio_net *dev)
 		desc = &vq->desc[desc->next];
 
 		/* Buffer address translation. */
-		buff_addr = gpa_to_vva(dev, desc->addr);
+		buff_addr = gpa_to_vva(dev, 0, desc->addr);
 		/* Need check extra VLAN_HLEN size for inserting VLAN tag */
 		phys_addr = gpa_to_hpa(vdev, desc->addr, desc->len + VLAN_HLEN,
 			&addr_type);
@@ -2604,13 +2604,14 @@ new_device (struct virtio_net *dev)
 	dev->priv = vdev;
 
 	if (zero_copy) {
-		vdev->nregions_hpa = dev->mem->nregions;
-		for (regionidx = 0; regionidx < dev->mem->nregions; regionidx++) {
+		struct virtio_memory *dev_mem = dev->mem_arr[0];
+		vdev->nregions_hpa = dev_mem->nregions;
+		for (regionidx = 0; regionidx < dev_mem->nregions; regionidx++) {
 			vdev->nregions_hpa
 				+= check_hpa_regions(
-					dev->mem->regions[regionidx].guest_phys_address
-					+ dev->mem->regions[regionidx].address_offset,
-					dev->mem->regions[regionidx].memory_size);
+					dev_mem->regions[regionidx].guest_phys_address
+					+ dev_mem->regions[regionidx].address_offset,
+					dev_mem->regions[regionidx].memory_size);
 
 		}
 
@@ -2626,7 +2627,7 @@ new_device (struct virtio_net *dev)
 
 
 		if (fill_hpa_memory_regions(
-			vdev->regions_hpa, dev->mem
+			vdev->regions_hpa, dev_mem
 			) != vdev->nregions_hpa) {
 
 			RTE_LOG(ERR, VHOST_CONFIG,
diff --git a/lib/librte_vhost/rte_virtio_net.h b/lib/librte_vhost/rte_virtio_net.h
index d9e887f..8520d96 100644
--- a/lib/librte_vhost/rte_virtio_net.h
+++ b/lib/librte_vhost/rte_virtio_net.h
@@ -95,14 +95,15 @@ struct vhost_virtqueue {
  * Device structure contains all configuration information relating to the device.
  */
 struct virtio_net {
-	struct virtio_memory	*mem;		/**< QEMU memory and memory region information. */
 	struct vhost_virtqueue	**virtqueue;    /**< Contains all virtqueue information. */
+	struct virtio_memory    **mem_arr;      /**< Array for QEMU memory and memory region information. */
 	uint64_t		features;	/**< Negotiated feature set. */
 	uint64_t		device_fh;	/**< device identifier. */
 	uint32_t		flags;		/**< Device flags. Only used to check if device is running on data core. */
 #define IF_NAME_SZ (PATH_MAX > IFNAMSIZ ? PATH_MAX : IFNAMSIZ)
 	char			ifname[IF_NAME_SZ];	/**< Name of the tap device or socket path. */
 	uint32_t		virt_qp_nb;
+	uint32_t		mem_idx;	/** Used in set memory layout, unique for each queue within virtio device. */
 	void			*priv;		/**< private context */
 } __rte_cache_aligned;
 
@@ -153,14 +154,15 @@ rte_vring_available_entries(struct virtio_net *dev, uint16_t queue_id)
  * This is used to convert guest virtio buffer addresses.
  */
 static inline uint64_t __attribute__((always_inline))
-gpa_to_vva(struct virtio_net *dev, uint64_t guest_pa)
+gpa_to_vva(struct virtio_net *dev, uint32_t q_idx, uint64_t guest_pa)
 {
 	struct virtio_memory_regions *region;
+	struct virtio_memory *dev_mem = dev->mem_arr[q_idx];
 	uint32_t regionidx;
 	uint64_t vhost_va = 0;
 
-	for (regionidx = 0; regionidx < dev->mem->nregions; regionidx++) {
-		region = &dev->mem->regions[regionidx];
+	for (regionidx = 0; regionidx < dev_mem->nregions; regionidx++) {
+		region = &dev_mem->regions[regionidx];
 		if ((guest_pa >= region->guest_phys_address) &&
 			(guest_pa <= region->guest_phys_address_end)) {
 			vhost_va = region->address_offset + guest_pa;
diff --git a/lib/librte_vhost/vhost_cuse/virtio-net-cdev.c b/lib/librte_vhost/vhost_cuse/virtio-net-cdev.c
index ae2c3fa..34648f6 100644
--- a/lib/librte_vhost/vhost_cuse/virtio-net-cdev.c
+++ b/lib/librte_vhost/vhost_cuse/virtio-net-cdev.c
@@ -273,28 +273,32 @@ cuse_set_mem_table(struct vhost_device_ctx ctx,
 		((uint64_t)(uintptr_t)mem_regions_addr + size);
 	uint64_t base_address = 0, mapped_address, mapped_size;
 	struct virtio_net *dev;
+	struct virtio_memory *dev_mem = NULL;
 
 	dev = get_device(ctx);
 	if (dev == NULL)
-		return -1;
-
-	if (dev->mem && dev->mem->mapped_address) {
-		munmap((void *)(uintptr_t)dev->mem->mapped_address,
-			(size_t)dev->mem->mapped_size);
-		free(dev->mem);
-		dev->mem = NULL;
+		goto error;
+
+	dev_mem = dev->mem_arr[dev->mem_idx];
+	if (dev_mem && dev_mem->mapped_address) {
+		munmap((void *)(uintptr_t)dev_mem->mapped_address,
+			(size_t)dev_mem->mapped_size);
+		free(dev_mem);
+		dev->mem_arr[dev->mem_idx] = NULL;
 	}
 
-	dev->mem = calloc(1, sizeof(struct virtio_memory) +
+	dev->mem_arr[dev->mem_idx] = calloc(1, sizeof(struct virtio_memory) +
 		sizeof(struct virtio_memory_regions) * nregions);
-	if (dev->mem == NULL) {
+	dev_mem = dev->mem_arr[dev->mem_idx];
+
+	if (dev_mem == NULL) {
 		RTE_LOG(ERR, VHOST_CONFIG,
-			"(%"PRIu64") Failed to allocate memory for dev->mem\n",
-			dev->device_fh);
-		return -1;
+			"(%"PRIu64") Failed to allocate memory for dev->mem_arr[%d]\n",
+			dev->device_fh, dev->mem_idx);
+		goto error;
 	}
 
-	pregion = &dev->mem->regions[0];
+	pregion = &dev_mem->regions[0];
 
 	for (idx = 0; idx < nregions; idx++) {
 		pregion[idx].guest_phys_address =
@@ -320,14 +324,12 @@ cuse_set_mem_table(struct vhost_device_ctx ctx,
 				pregion[idx].userspace_address;
 			/* Map VM memory file */
 			if (host_memory_map(ctx.pid, base_address,
-				&mapped_address, &mapped_size) != 0) {
-				free(dev->mem);
-				dev->mem = NULL;
-				return -1;
-			}
-			dev->mem->mapped_address = mapped_address;
-			dev->mem->base_address = base_address;
-			dev->mem->mapped_size = mapped_size;
+				&mapped_address, &mapped_size) != 0)
+				goto free;
+
+			dev_mem->mapped_address = mapped_address;
+			dev_mem->base_address = base_address;
+			dev_mem->mapped_size = mapped_size;
 		}
 	}
 
@@ -335,9 +337,7 @@ cuse_set_mem_table(struct vhost_device_ctx ctx,
 	if (base_address == 0) {
 		RTE_LOG(ERR, VHOST_CONFIG,
 			"Failed to find base address of qemu memory file.\n");
-		free(dev->mem);
-		dev->mem = NULL;
-		return -1;
+		goto free;
 	}
 
 	valid_regions = nregions;
@@ -369,9 +369,16 @@ cuse_set_mem_table(struct vhost_device_ctx ctx,
 			pregion[idx].userspace_address -
 			pregion[idx].guest_phys_address;
 	}
-	dev->mem->nregions = valid_regions;
 
+	dev_mem->nregions = valid_regions;
+	dev->mem_idx = (dev->mem_idx + 1) % (dev->virt_qp_nb * VIRTIO_QNUM);
 	return 0;
+
+free:
+	free(dev_mem);
+	dev->mem_arr[dev->mem_idx] = NULL;
+error:
+	return -1;
 }
 
 /*
diff --git a/lib/librte_vhost/vhost_rxtx.c b/lib/librte_vhost/vhost_rxtx.c
index db4ad88..a60b542 100644
--- a/lib/librte_vhost/vhost_rxtx.c
+++ b/lib/librte_vhost/vhost_rxtx.c
@@ -139,7 +139,7 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,
 		buff = pkts[packet_success];
 
 		/* Convert from gpa to vva (guest physical addr -> vhost virtual addr) */
-		buff_addr = gpa_to_vva(dev, desc->addr);
+		buff_addr = gpa_to_vva(dev, queue_id / VIRTIO_QNUM, desc->addr);
 		/* Prefetch buffer address. */
 		rte_prefetch0((void *)(uintptr_t)buff_addr);
 
@@ -154,7 +154,7 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,
 			(desc->len == vq->vhost_hlen)) {
 			desc = &vq->desc[desc->next];
 			/* Buffer address translation. */
-			buff_addr = gpa_to_vva(dev, desc->addr);
+			buff_addr = gpa_to_vva(dev, queue_id / VIRTIO_QNUM, desc->addr);
 		} else {
 			vb_offset += vq->vhost_hlen;
 			hdr = 1;
@@ -191,7 +191,7 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,
 			if (vb_offset == desc->len) {
 				if (desc->flags & VRING_DESC_F_NEXT) {
 					desc = &vq->desc[desc->next];
-					buff_addr = gpa_to_vva(dev, desc->addr);
+					buff_addr = gpa_to_vva(dev, queue_id, desc->addr);
 					vb_offset = 0;
 				} else {
 					/* Room in vring buffer is not enough */
@@ -281,8 +281,8 @@ copy_from_mbuf_to_vring(struct virtio_net *dev, uint16_t queue_id,
 	 * (guest physical addr -> vhost virtual addr)
 	 */
 	vq = dev->virtqueue[queue_id];
-	vb_addr =
-		gpa_to_vva(dev, vq->buf_vec[vec_idx].buf_addr);
+	vb_addr = gpa_to_vva(dev, queue_id / VIRTIO_QNUM,
+			vq->buf_vec[vec_idx].buf_addr);
 	vb_hdr_addr = vb_addr;
 
 	/* Prefetch buffer address. */
@@ -322,7 +322,8 @@ copy_from_mbuf_to_vring(struct virtio_net *dev, uint16_t queue_id,
 		}
 
 		vec_idx++;
-		vb_addr = gpa_to_vva(dev, vq->buf_vec[vec_idx].buf_addr);
+		vb_addr = gpa_to_vva(dev, queue_id / VIRTIO_QNUM,
+			vq->buf_vec[vec_idx].buf_addr);
 
 		/* Prefetch buffer address. */
 		rte_prefetch0((void *)(uintptr_t)vb_addr);
@@ -367,7 +368,7 @@ copy_from_mbuf_to_vring(struct virtio_net *dev, uint16_t queue_id,
 			}
 
 			vec_idx++;
-			vb_addr = gpa_to_vva(dev,
+			vb_addr = gpa_to_vva(dev, queue_id / VIRTIO_QNUM,
 				vq->buf_vec[vec_idx].buf_addr);
 			vb_offset = 0;
 			vb_avail = vq->buf_vec[vec_idx].buf_len;
@@ -410,7 +411,7 @@ copy_from_mbuf_to_vring(struct virtio_net *dev, uint16_t queue_id,
 
 					/* Get next buffer from buf_vec. */
 					vec_idx++;
-					vb_addr = gpa_to_vva(dev,
+					vb_addr = gpa_to_vva(dev, queue_id / VIRTIO_QNUM,
 						vq->buf_vec[vec_idx].buf_addr);
 					vb_avail =
 						vq->buf_vec[vec_idx].buf_len;
@@ -639,7 +640,7 @@ rte_vhost_dequeue_burst(struct virtio_net *dev, uint16_t queue_id,
 		}
 
 		/* Buffer address translation. */
-		vb_addr = gpa_to_vva(dev, desc->addr);
+		vb_addr = gpa_to_vva(dev, queue_id / VIRTIO_QNUM, desc->addr);
 		/* Prefetch buffer address. */
 		rte_prefetch0((void *)(uintptr_t)vb_addr);
 
@@ -743,7 +744,8 @@ rte_vhost_dequeue_burst(struct virtio_net *dev, uint16_t queue_id,
 					desc = &vq->desc[desc->next];
 
 					/* Buffer address translation. */
-					vb_addr = gpa_to_vva(dev, desc->addr);
+					vb_addr = gpa_to_vva(dev,
+						queue_id / VIRTIO_QNUM, desc->addr);
 					/* Prefetch buffer address. */
 					rte_prefetch0((void *)(uintptr_t)vb_addr);
 					vb_offset = 0;
diff --git a/lib/librte_vhost/vhost_user/virtio-net-user.c b/lib/librte_vhost/vhost_user/virtio-net-user.c
index 4c1d4df..d749f27 100644
--- a/lib/librte_vhost/vhost_user/virtio-net-user.c
+++ b/lib/librte_vhost/vhost_user/virtio-net-user.c
@@ -70,17 +70,17 @@ get_blk_size(int fd)
 }
 
 static void
-free_mem_region(struct virtio_net *dev)
+free_mem_region(struct virtio_memory *dev_mem)
 {
 	struct orig_region_map *region;
 	unsigned int idx;
 	uint64_t alignment;
 
-	if (!dev || !dev->mem)
+	if (!dev_mem)
 		return;
 
-	region = orig_region(dev->mem, dev->mem->nregions);
-	for (idx = 0; idx < dev->mem->nregions; idx++) {
+	region = orig_region(dev_mem, dev_mem->nregions);
+	for (idx = 0; idx < dev_mem->nregions; idx++) {
 		if (region[idx].mapped_address) {
 			alignment = region[idx].blksz;
 			munmap((void *)(uintptr_t)
@@ -103,37 +103,37 @@ user_set_mem_table(struct vhost_device_ctx ctx, struct VhostUserMsg *pmsg)
 	unsigned int idx = 0;
 	struct orig_region_map *pregion_orig;
 	uint64_t alignment;
+	struct virtio_memory *dev_mem = NULL;
 
 	/* unmap old memory regions one by one*/
 	dev = get_device(ctx);
 	if (dev == NULL)
 		return -1;
 
-	/* Remove from the data plane. */
-	if (dev->flags & VIRTIO_DEV_RUNNING)
-		notify_ops->destroy_device(dev);
-
-	if (dev->mem) {
-		free_mem_region(dev);
-		free(dev->mem);
-		dev->mem = NULL;
+	dev_mem = dev->mem_arr[dev->mem_idx];
+	if (dev_mem) {
+		free_mem_region(dev_mem);
+		free(dev_mem);
+		dev->mem_arr[dev->mem_idx] = NULL;
 	}
 
-	dev->mem = calloc(1,
+	dev->mem_arr[dev->mem_idx] = calloc(1,
 		sizeof(struct virtio_memory) +
 		sizeof(struct virtio_memory_regions) * memory.nregions +
 		sizeof(struct orig_region_map) * memory.nregions);
-	if (dev->mem == NULL) {
+
+	dev_mem = dev->mem_arr[dev->mem_idx];
+	if (dev_mem == NULL) {
 		RTE_LOG(ERR, VHOST_CONFIG,
-			"(%"PRIu64") Failed to allocate memory for dev->mem\n",
-			dev->device_fh);
+			"(%"PRIu64") Failed to allocate memory for dev->mem_arr[%d]\n",
+			dev->device_fh, dev->mem_idx);
 		return -1;
 	}
-	dev->mem->nregions = memory.nregions;
+	dev_mem->nregions = memory.nregions;
 
-	pregion_orig = orig_region(dev->mem, memory.nregions);
+	pregion_orig = orig_region(dev_mem, memory.nregions);
 	for (idx = 0; idx < memory.nregions; idx++) {
-		pregion = &dev->mem->regions[idx];
+		pregion = &dev_mem->regions[idx];
 		pregion->guest_phys_address =
 			memory.regions[idx].guest_phys_addr;
 		pregion->guest_phys_address_end =
@@ -175,9 +175,9 @@ user_set_mem_table(struct vhost_device_ctx ctx, struct VhostUserMsg *pmsg)
 			pregion->guest_phys_address;
 
 		if (memory.regions[idx].guest_phys_addr == 0) {
-			dev->mem->base_address =
+			dev_mem->base_address =
 				memory.regions[idx].userspace_addr;
-			dev->mem->mapped_address =
+			dev_mem->mapped_address =
 				pregion->address_offset;
 		}
 
@@ -189,6 +189,7 @@ user_set_mem_table(struct vhost_device_ctx ctx, struct VhostUserMsg *pmsg)
 			 pregion->memory_size);
 	}
 
+	dev->mem_idx = (dev->mem_idx + 1) % (dev->virt_qp_nb * VIRTIO_QNUM);
 	return 0;
 
 err_mmap:
@@ -200,8 +201,8 @@ err_mmap:
 					alignment));
 		close(pregion_orig[idx].fd);
 	}
-	free(dev->mem);
-	dev->mem = NULL;
+	free(dev_mem);
+	dev->mem_arr[dev->mem_idx] = NULL;
 	return -1;
 }
 
@@ -346,13 +347,15 @@ void
 user_destroy_device(struct vhost_device_ctx ctx)
 {
 	struct virtio_net *dev = get_device(ctx);
+	uint32_t i;
 
 	if (dev && (dev->flags & VIRTIO_DEV_RUNNING))
 		notify_ops->destroy_device(dev);
 
-	if (dev && dev->mem) {
-		free_mem_region(dev);
-		free(dev->mem);
-		dev->mem = NULL;
-	}
+	for (i = 0; i < dev->virt_qp_nb; i++)
+		if (dev && dev->mem_arr[i]) {
+			free_mem_region(dev->mem_arr[i]);
+			free(dev->mem_arr[i]);
+			dev->mem_arr[i] = NULL;
+		}
 }
diff --git a/lib/librte_vhost/virtio-net.c b/lib/librte_vhost/virtio-net.c
index 2a4b791..fd66a06 100644
--- a/lib/librte_vhost/virtio-net.c
+++ b/lib/librte_vhost/virtio-net.c
@@ -81,15 +81,16 @@ static uint64_t VHOST_FEATURES = VHOST_SUPPORTED_FEATURES;
  * used to convert the ring addresses to our address space.
  */
 static uint64_t
-qva_to_vva(struct virtio_net *dev, uint64_t qemu_va)
+qva_to_vva(struct virtio_net *dev, uint32_t q_idx, uint64_t qemu_va)
 {
 	struct virtio_memory_regions *region;
 	uint64_t vhost_va = 0;
 	uint32_t regionidx = 0;
+	struct virtio_memory *dev_mem = dev->mem_arr[q_idx];
 
 	/* Find the region where the address lives. */
-	for (regionidx = 0; regionidx < dev->mem->nregions; regionidx++) {
-		region = &dev->mem->regions[regionidx];
+	for (regionidx = 0; regionidx < dev_mem->nregions; regionidx++) {
+		region = &dev_mem->regions[regionidx];
 		if ((qemu_va >= region->userspace_address) &&
 			(qemu_va <= region->userspace_address +
 			region->memory_size)) {
@@ -186,10 +187,13 @@ cleanup_device(struct virtio_net *dev)
 	uint32_t qp_idx;
 
 	/* Unmap QEMU memory file if mapped. */
-	if (dev->mem) {
-		munmap((void *)(uintptr_t)dev->mem->mapped_address,
-			(size_t)dev->mem->mapped_size);
-		free(dev->mem);
+	for (qp_idx = 0; qp_idx < dev->virt_qp_nb; qp_idx++) {
+		struct virtio_memory *dev_mem = dev->mem_arr[qp_idx];
+		if (dev_mem) {
+			munmap((void *)(uintptr_t)dev_mem->mapped_address,
+				(size_t)dev_mem->mapped_size);
+			free(dev_mem);
+		}
 	}
 
 	/* Close any event notifiers opened by device. */
@@ -218,6 +222,8 @@ free_device(struct virtio_net_config_ll *ll_dev)
 	/*
 	 * Free any malloc'd memory.
 	 */
+	free(ll_dev->dev.mem_arr);
+
 	/* Free every queue pair. */
 	for (qp_idx = 0; qp_idx < ll_dev->dev.virt_qp_nb; qp_idx++) {
 		uint32_t virt_rx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_RXQ;
@@ -289,7 +295,7 @@ init_device(struct virtio_net *dev)
 	 * Virtqueues have already been malloced so
 	 * we don't want to set them to NULL.
 	 */
-	vq_offset = offsetof(struct virtio_net, mem);
+	vq_offset = offsetof(struct virtio_net, features);
 
 	/* Set everything to 0. */
 	memset((void *)(uintptr_t)((uint64_t)(uintptr_t)dev + vq_offset), 0,
@@ -359,6 +365,16 @@ new_device(struct vhost_device_ctx ctx)
 		return -1;
 	}
 
+	new_ll_dev->dev.mem_arr =
+		calloc(VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MAX, sizeof(struct virtio_memory *));
+	if (new_ll_dev->dev.mem_arr == NULL) {
+		RTE_LOG(ERR, VHOST_CONFIG,
+			"(%"PRIu64") Failed to allocate memory for dev.mem_arr.\n",
+			ctx.fh);
+		free_device(new_ll_dev);
+		return -1;
+	}
+
 	/* Initialise device and virtqueues. */
 	init_device(&new_ll_dev->dev);
 
@@ -637,7 +653,7 @@ set_vring_addr(struct vhost_device_ctx ctx, struct vhost_vring_addr *addr)
 
 	/* The addresses are converted from QEMU virtual to Vhost virtual. */
 	vq->desc = (struct vring_desc *)(uintptr_t)qva_to_vva(dev,
-			addr->desc_user_addr);
+			addr->index / VIRTIO_QNUM, addr->desc_user_addr);
 	if (vq->desc == 0) {
 		RTE_LOG(ERR, VHOST_CONFIG,
 			"(%"PRIu64") Failed to find desc ring address.\n",
@@ -649,7 +665,7 @@ set_vring_addr(struct vhost_device_ctx ctx, struct vhost_vring_addr *addr)
 	vq = dev->virtqueue[addr->index];
 
 	vq->avail = (struct vring_avail *)(uintptr_t)qva_to_vva(dev,
-			addr->avail_user_addr);
+			addr->index / VIRTIO_QNUM, addr->avail_user_addr);
 	if (vq->avail == 0) {
 		RTE_LOG(ERR, VHOST_CONFIG,
 			"(%"PRIu64") Failed to find avail ring address.\n",
@@ -658,7 +674,7 @@ set_vring_addr(struct vhost_device_ctx ctx, struct vhost_vring_addr *addr)
 	}
 
 	vq->used = (struct vring_used *)(uintptr_t)qva_to_vva(dev,
-			addr->used_user_addr);
+			addr->index / VIRTIO_QNUM, addr->used_user_addr);
 	if (vq->used == 0) {
 		RTE_LOG(ERR, VHOST_CONFIG,
 			"(%"PRIu64") Failed to find used ring address.\n",
-- 
1.8.4.2

^ permalink raw reply	[flat|nested] 65+ messages in thread

* [dpdk-dev] [PATCH v4 05/12] vhost: check the virtqueue address's validity
  2015-08-12  8:02     ` [dpdk-dev] [PATCH v4 00/12] Support multiple queues in vhost Ouyang Changchun
                         ` (3 preceding siblings ...)
  2015-08-12  8:02       ` [dpdk-dev] [PATCH v4 04/12] vhost: set memory layout for multiple queues mode Ouyang Changchun
@ 2015-08-12  8:02       ` Ouyang Changchun
  2015-08-12  8:02       ` [dpdk-dev] [PATCH v4 06/12] vhost: support protocol feature Ouyang Changchun
                         ` (6 subsequent siblings)
  11 siblings, 0 replies; 65+ messages in thread
From: Ouyang Changchun @ 2015-08-12  8:02 UTC (permalink / raw)
  To: dev

This is added since v3.
Check the virtqueue address's validity.

Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
---
Changes in v4:
  - remove unnecessary code

 lib/librte_vhost/vhost_user/vhost-net-user.c |  4 +++-
 lib/librte_vhost/virtio-net.c                | 10 ++++++++++
 2 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/lib/librte_vhost/vhost_user/vhost-net-user.c b/lib/librte_vhost/vhost_user/vhost-net-user.c
index 3d7c373..e926ed7 100644
--- a/lib/librte_vhost/vhost_user/vhost-net-user.c
+++ b/lib/librte_vhost/vhost_user/vhost-net-user.c
@@ -403,7 +403,9 @@ vserver_message_handler(int connfd, void *dat, int *remove)
 		ops->set_vring_num(ctx, &msg.payload.state);
 		break;
 	case VHOST_USER_SET_VRING_ADDR:
-		ops->set_vring_addr(ctx, &msg.payload.addr);
+		if (ops->set_vring_addr(ctx, &msg.payload.addr) != 0)
+			RTE_LOG(INFO, VHOST_CONFIG,
+				"vring address incorrect.\n");
 		break;
 	case VHOST_USER_SET_VRING_BASE:
 		ops->set_vring_base(ctx, &msg.payload.state);
diff --git a/lib/librte_vhost/virtio-net.c b/lib/librte_vhost/virtio-net.c
index fd66a06..8901aa5 100644
--- a/lib/librte_vhost/virtio-net.c
+++ b/lib/librte_vhost/virtio-net.c
@@ -643,6 +643,7 @@ set_vring_addr(struct vhost_device_ctx ctx, struct vhost_vring_addr *addr)
 {
 	struct virtio_net *dev;
 	struct vhost_virtqueue *vq;
+	uint32_t i;
 
 	dev = get_device(ctx);
 	if (dev == NULL)
@@ -673,6 +674,15 @@ set_vring_addr(struct vhost_device_ctx ctx, struct vhost_vring_addr *addr)
 		return -1;
 	}
 
+	for (i = vq->last_used_idx; i < vq->avail->idx; i++)
+		if (vq->avail->ring[i] >= vq->size) {
+			RTE_LOG(ERR, VHOST_CONFIG, "%s (%"PRIu64"):"
+				"Please check virt queue pair idx:%d is "
+				"enabled correctly on guest.\n", __func__,
+				dev->device_fh, addr->index / VIRTIO_QNUM);
+			return -1;
+		}
+
 	vq->used = (struct vring_used *)(uintptr_t)qva_to_vva(dev,
 			addr->index / VIRTIO_QNUM, addr->used_user_addr);
 	if (vq->used == 0) {
-- 
1.8.4.2

^ permalink raw reply	[flat|nested] 65+ messages in thread

* [dpdk-dev] [PATCH v4 06/12] vhost: support protocol feature
  2015-08-12  8:02     ` [dpdk-dev] [PATCH v4 00/12] Support multiple queues in vhost Ouyang Changchun
                         ` (4 preceding siblings ...)
  2015-08-12  8:02       ` [dpdk-dev] [PATCH v4 05/12] vhost: check the virtqueue address's validity Ouyang Changchun
@ 2015-08-12  8:02       ` Ouyang Changchun
  2015-08-12  8:02       ` [dpdk-dev] [PATCH v4 07/12] vhost: add new command line option: rxq Ouyang Changchun
                         ` (5 subsequent siblings)
  11 siblings, 0 replies; 65+ messages in thread
From: Ouyang Changchun @ 2015-08-12  8:02 UTC (permalink / raw)
  To: dev

Support a new protocol feature to communicate with qemu:
Add set and get protocol feature bits;
Add VRING_FLAG for the mq feature to set the vring flag, which
indicates whether the vq is enabled or disabled.

Reserve values as follows:
VHOST_USER_SEND_RARP = 17 (merged from the qemu community)
VHOST_USER_SET_VRING_FLAG = 18 (reserve for vhost mq)

These reservations need to be synced with the qemu community before being finalized.
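
For reference, the reserved request ids could be written as below; the enum placement
and naming are illustrative only, pending that sync:

    /* Reserved vhost-user requests (values from the note above). */
    enum {
            VHOST_USER_SEND_RARP      = 17, /* merged from qemu community */
            VHOST_USER_SET_VRING_FLAG = 18, /* vhost mq: enable/disable one vring */
    };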

Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
---
This is added since v4.

 lib/librte_vhost/rte_virtio_net.h             |  2 +
 lib/librte_vhost/vhost-net.h                  |  3 ++
 lib/librte_vhost/vhost_rxtx.c                 | 21 ++++++++++
 lib/librte_vhost/vhost_user/vhost-net-user.c  | 21 +++++++++-
 lib/librte_vhost/vhost_user/vhost-net-user.h  |  4 ++
 lib/librte_vhost/vhost_user/virtio-net-user.c | 29 ++++++++++++++
 lib/librte_vhost/vhost_user/virtio-net-user.h |  2 +
 lib/librte_vhost/virtio-net.c                 | 56 ++++++++++++++++++++++++++-
 lib/librte_vhost/virtio-net.h                 |  2 +
 9 files changed, 138 insertions(+), 2 deletions(-)

diff --git a/lib/librte_vhost/rte_virtio_net.h b/lib/librte_vhost/rte_virtio_net.h
index 8520d96..e16ad3a 100644
--- a/lib/librte_vhost/rte_virtio_net.h
+++ b/lib/librte_vhost/rte_virtio_net.h
@@ -88,6 +88,7 @@ struct vhost_virtqueue {
 	volatile uint16_t	last_used_idx_res;	/**< Used for multiple devices reserving buffers. */
 	eventfd_t		callfd;			/**< Used to notify the guest (trigger interrupt). */
 	eventfd_t		kickfd;			/**< Currently unused as polling mode is enabled. */
+	uint32_t		enabled;		/**< Indicate the queue is enabled or not. */
 	struct buf_vector	buf_vec[BUF_VECTOR_MAX];	/**< for scatter RX. */
 } __rte_cache_aligned;
 
@@ -98,6 +99,7 @@ struct virtio_net {
 	struct vhost_virtqueue	**virtqueue;    /**< Contains all virtqueue information. */
 	struct virtio_memory    **mem_arr;      /**< Array for QEMU memory and memory region information. */
 	uint64_t		features;	/**< Negotiated feature set. */
+	uint64_t		protocol_features;	/**< Negotiated protocol feature set. */
 	uint64_t		device_fh;	/**< device identifier. */
 	uint32_t		flags;		/**< Device flags. Only used to check if device is running on data core. */
 #define IF_NAME_SZ (PATH_MAX > IFNAMSIZ ? PATH_MAX : IFNAMSIZ)
diff --git a/lib/librte_vhost/vhost-net.h b/lib/librte_vhost/vhost-net.h
index 7dff14d..bc88bad 100644
--- a/lib/librte_vhost/vhost-net.h
+++ b/lib/librte_vhost/vhost-net.h
@@ -99,6 +99,9 @@ struct vhost_net_device_ops {
 	int (*get_features)(struct vhost_device_ctx, uint64_t *);
 	int (*set_features)(struct vhost_device_ctx, uint64_t *);
 
+	int (*get_protocol_features)(struct vhost_device_ctx, uint64_t *);
+	int (*set_protocol_features)(struct vhost_device_ctx, uint64_t *);
+
 	int (*set_vring_num)(struct vhost_device_ctx, struct vhost_vring_state *);
 	int (*set_vring_addr)(struct vhost_device_ctx, struct vhost_vring_addr *);
 	int (*set_vring_base)(struct vhost_device_ctx, struct vhost_vring_state *);
diff --git a/lib/librte_vhost/vhost_rxtx.c b/lib/librte_vhost/vhost_rxtx.c
index a60b542..3af0326 100644
--- a/lib/librte_vhost/vhost_rxtx.c
+++ b/lib/librte_vhost/vhost_rxtx.c
@@ -89,6 +89,14 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,
 	}
 
 	vq = dev->virtqueue[queue_id];
+
+	if (unlikely(vq->enabled == 0)) {
+		RTE_LOG(ERR, VHOST_DATA,
+			"%s (%"PRIu64"): virtqueue idx:%d not enabled.\n",
+			 __func__, dev->device_fh, queue_id);
+		return 0;
+	}
+
 	count = (count > MAX_PKT_BURST) ? MAX_PKT_BURST : count;
 
 	/*
@@ -281,6 +289,7 @@ copy_from_mbuf_to_vring(struct virtio_net *dev, uint16_t queue_id,
 	 * (guest physical addr -> vhost virtual addr)
 	 */
 	vq = dev->virtqueue[queue_id];
+
 	vb_addr = gpa_to_vva(dev, queue_id / VIRTIO_QNUM,
 			vq->buf_vec[vec_idx].buf_addr);
 	vb_hdr_addr = vb_addr;
@@ -491,6 +500,14 @@ virtio_dev_merge_rx(struct virtio_net *dev, uint16_t queue_id,
 	}
 
 	vq = dev->virtqueue[queue_id];
+
+	if (unlikely(vq->enabled == 0)) {
+		RTE_LOG(ERR, VHOST_DATA,
+			"%s (%"PRIu64"): virtqueue idx:%d not enabled.\n",
+			 __func__, dev->device_fh, queue_id);
+		return 0;
+	}
+
 	count = RTE_MIN((uint32_t)MAX_PKT_BURST, count);
 
 	if (count == 0)
@@ -590,6 +607,10 @@ rte_vhost_dequeue_burst(struct virtio_net *dev, uint16_t queue_id,
 	}
 
 	vq = dev->virtqueue[queue_id];
+
+	if (unlikely(vq->enabled == 0))
+		return 0;
+
 	avail_idx =  *((volatile uint16_t *)&vq->avail->idx);
 
 	/* If there are no available buffers then return. */
diff --git a/lib/librte_vhost/vhost_user/vhost-net-user.c b/lib/librte_vhost/vhost_user/vhost-net-user.c
index e926ed7..f7a24e9 100644
--- a/lib/librte_vhost/vhost_user/vhost-net-user.c
+++ b/lib/librte_vhost/vhost_user/vhost-net-user.c
@@ -95,7 +95,11 @@ static const char *vhost_message_str[VHOST_USER_MAX] = {
 	[VHOST_USER_GET_VRING_BASE] = "VHOST_USER_GET_VRING_BASE",
 	[VHOST_USER_SET_VRING_KICK] = "VHOST_USER_SET_VRING_KICK",
 	[VHOST_USER_SET_VRING_CALL] = "VHOST_USER_SET_VRING_CALL",
-	[VHOST_USER_SET_VRING_ERR]  = "VHOST_USER_SET_VRING_ERR"
+	[VHOST_USER_SET_VRING_ERR]  = "VHOST_USER_SET_VRING_ERR",
+	[VHOST_USER_GET_PROTOCOL_FEATURES]  = "VHOST_USER_GET_PROTOCOL_FEATURES",
+	[VHOST_USER_SET_PROTOCOL_FEATURES]  = "VHOST_USER_SET_PROTOCOL_FEATURES",
+	[VHOST_USER_SEND_RARP]  = "VHOST_USER_SEND_RARP",
+	[VHOST_USER_SET_VRING_FLAG]  = "VHOST_USER_SET_VRING_FLAG"
 };
 
 /**
@@ -379,6 +383,17 @@ vserver_message_handler(int connfd, void *dat, int *remove)
 		ops->set_features(ctx, &features);
 		break;
 
+	case VHOST_USER_GET_PROTOCOL_FEATURES:
+		ret = ops->get_protocol_features(ctx, &features);
+		msg.payload.u64 = features;
+		msg.size = sizeof(msg.payload.u64);
+		send_vhost_message(connfd, &msg);
+		break;
+	case VHOST_USER_SET_PROTOCOL_FEATURES:
+		features = msg.payload.u64;
+		ops->set_protocol_features(ctx, &features);
+		break;
+
 	case VHOST_USER_SET_OWNER:
 		ops->set_owner(ctx);
 		break;
@@ -424,6 +439,10 @@ vserver_message_handler(int connfd, void *dat, int *remove)
 		user_set_vring_call(ctx, &msg);
 		break;
 
+	case VHOST_USER_SET_VRING_FLAG:
+		user_set_vring_flag(ctx, &msg.payload.state);
+		break;
+
 	case VHOST_USER_SET_VRING_ERR:
 		if (!(msg.payload.u64 & VHOST_USER_VRING_NOFD_MASK))
 			close(msg.fds[0]);
diff --git a/lib/librte_vhost/vhost_user/vhost-net-user.h b/lib/librte_vhost/vhost_user/vhost-net-user.h
index 2e72f3c..54e95aa 100644
--- a/lib/librte_vhost/vhost_user/vhost-net-user.h
+++ b/lib/librte_vhost/vhost_user/vhost-net-user.h
@@ -63,6 +63,10 @@ typedef enum VhostUserRequest {
 	VHOST_USER_SET_VRING_KICK = 12,
 	VHOST_USER_SET_VRING_CALL = 13,
 	VHOST_USER_SET_VRING_ERR = 14,
+	VHOST_USER_GET_PROTOCOL_FEATURES = 15,
+	VHOST_USER_SET_PROTOCOL_FEATURES = 16,
+	VHOST_USER_SEND_RARP = 17,
+	VHOST_USER_SET_VRING_FLAG = 18,
 	VHOST_USER_MAX
 } VhostUserRequest;
 
diff --git a/lib/librte_vhost/vhost_user/virtio-net-user.c b/lib/librte_vhost/vhost_user/virtio-net-user.c
index d749f27..6a12d96 100644
--- a/lib/librte_vhost/vhost_user/virtio-net-user.c
+++ b/lib/librte_vhost/vhost_user/virtio-net-user.c
@@ -229,6 +229,13 @@ virtio_is_ready(struct virtio_net *dev)
 				"virtio isn't ready for processing.\n");
 			return 0;
 		}
+		if ((dev->protocol_features & (1ULL << VHOST_USER_PROTOCOL_F_VRING_FLAG)) == 0) {
+			/* Without VRING_FLAG feature, only 1 vq pair is supported */
+			if (q_idx == 0) {
+				rvq->enabled = 1;
+				tvq->enabled = 1;
+			}
+		}
 	}
 	RTE_LOG(INFO, VHOST_CONFIG,
 		"virtio is now ready for processing.\n");
@@ -343,6 +350,28 @@ user_reset_owner(struct vhost_device_ctx ctx,
 	return 0;
 }
 
+/*
+ * When the virtio queues are ready to work, qemu sends this message to enable the virtio queue pair.
+ */
+int
+user_set_vring_flag(struct vhost_device_ctx ctx,
+	struct vhost_vring_state *state)
+{
+	struct virtio_net *dev = get_device(ctx);
+
+	RTE_LOG(INFO, VHOST_CONFIG,
+		"set queue enable --- state idx:%d state num:%d\n", state->index, state->num);
+
+	/*
+	 * The state->index indicates the queue pair index;
+	 * it needs to be set for both Rx and Tx.
+	 */
+	dev->virtqueue[state->index * VIRTIO_QNUM + VIRTIO_RXQ]->enabled = state->num;
+	dev->virtqueue[state->index * VIRTIO_QNUM + VIRTIO_TXQ]->enabled = state->num;
+
+	return 0;
+}
+
 void
 user_destroy_device(struct vhost_device_ctx ctx)
 {
diff --git a/lib/librte_vhost/vhost_user/virtio-net-user.h b/lib/librte_vhost/vhost_user/virtio-net-user.h
index 2429836..10a3fff 100644
--- a/lib/librte_vhost/vhost_user/virtio-net-user.h
+++ b/lib/librte_vhost/vhost_user/virtio-net-user.h
@@ -45,6 +45,8 @@ void user_set_vring_kick(struct vhost_device_ctx, struct VhostUserMsg *);
 
 int user_get_vring_base(struct vhost_device_ctx, struct vhost_vring_state *);
 
+int user_set_vring_flag(struct vhost_device_ctx ctx, struct vhost_vring_state *state);
+
 void user_destroy_device(struct vhost_device_ctx);
 
 int user_reset_owner(struct vhost_device_ctx ctx, struct vhost_vring_state *state);
diff --git a/lib/librte_vhost/virtio-net.c b/lib/librte_vhost/virtio-net.c
index 8901aa5..24d0c53 100644
--- a/lib/librte_vhost/virtio-net.c
+++ b/lib/librte_vhost/virtio-net.c
@@ -67,15 +67,23 @@ struct virtio_net_device_ops const *notify_ops;
 /* root address of the linked list of managed virtio devices */
 static struct virtio_net_config_ll *ll_root;
 
+#define VHOST_USER_F_PROTOCOL_FEATURES 30
+
 /* Features supported by this lib. */
 #define VHOST_SUPPORTED_FEATURES ((1ULL << VIRTIO_NET_F_MRG_RXBUF) | \
 				(1ULL << VIRTIO_NET_F_CTRL_VQ) | \
 				(1ULL << VIRTIO_NET_F_CTRL_RX) | \
 				(1ULL << VHOST_F_LOG_ALL) | \
-				(1ULL << VIRTIO_NET_F_MQ))
+				(1ULL << VIRTIO_NET_F_MQ) | \
+				(1ULL << VHOST_USER_F_PROTOCOL_FEATURES))
 
 static uint64_t VHOST_FEATURES = VHOST_SUPPORTED_FEATURES;
 
+/* Protocol features supported by this lib. */
+#define VHOST_SUPPORTED_PROTOCOL_FEATURES ((1ULL << VHOST_USER_PROTOCOL_F_VRING_FLAG))
+
+static uint64_t VHOST_PROTOCOL_FEATURES = VHOST_SUPPORTED_PROTOCOL_FEATURES;
+
 /*
  * Converts QEMU virtual address to Vhost virtual address. This function is
  * used to convert the ring addresses to our address space.
@@ -533,6 +541,45 @@ set_features(struct vhost_device_ctx ctx, uint64_t *pu)
 }
 
 /*
+ * Called from VHOST-USER SOCKET: VHOST_GET_PROTOCOL_FEATURES
+ * The features that we support are requested.
+ */
+static int
+get_protocol_features(struct vhost_device_ctx ctx, uint64_t *pu)
+{
+	struct virtio_net *dev;
+
+	dev = get_device(ctx);
+	if (dev == NULL)
+		return -1;
+
+	/* Send our supported features. */
+	*pu = VHOST_PROTOCOL_FEATURES;
+	return 0;
+}
+
+/*
+ * Called from VHOST-USER SOCKET: VHOST_SET_PROTOCOL_FEATURES
+ * We receive the negotiated features supported by us and the virtio device.
+ */
+static int
+set_protocol_features(struct vhost_device_ctx ctx, uint64_t *pu)
+{
+	struct virtio_net *dev;
+
+	dev = get_device(ctx);
+	if (dev == NULL)
+		return -1;
+	if (*pu & ~VHOST_PROTOCOL_FEATURES)
+		return -1;
+
+	/* Store the negotiated feature list for the device. */
+	dev->protocol_features = *pu;
+
+	return 0;
+}
+
+/*
  * Called from CUSE IOCTL: VHOST_SET_VRING_NUM
  * The virtio device sends us the size of the descriptor ring.
  */
@@ -824,6 +871,10 @@ set_backend(struct vhost_device_ctx ctx, struct vhost_vring_file *file)
 	if (!(dev->flags & VIRTIO_DEV_RUNNING)) {
 		if (((int)dev->virtqueue[VIRTIO_TXQ]->backend != VIRTIO_DEV_STOPPED) &&
 			((int)dev->virtqueue[VIRTIO_RXQ]->backend != VIRTIO_DEV_STOPPED)) {
+			if ((dev->protocol_features & (1ULL << VHOST_USER_PROTOCOL_F_VRING_FLAG)) == 0) {
+				dev->virtqueue[VIRTIO_RXQ]->enabled = 1;
+				dev->virtqueue[VIRTIO_TXQ]->enabled = 1;
+			}
 			return notify_ops->new_device(dev);
 		}
 	/* Otherwise we remove it. */
@@ -846,6 +897,9 @@ static const struct vhost_net_device_ops vhost_device_ops = {
 	.get_features = get_features,
 	.set_features = set_features,
 
+	.get_protocol_features = get_protocol_features,
+	.set_protocol_features = set_protocol_features,
+
 	.set_vring_num = set_vring_num,
 	.set_vring_addr = set_vring_addr,
 	.set_vring_base = set_vring_base,
diff --git a/lib/librte_vhost/virtio-net.h b/lib/librte_vhost/virtio-net.h
index 75fb57e..ef6efae 100644
--- a/lib/librte_vhost/virtio-net.h
+++ b/lib/librte_vhost/virtio-net.h
@@ -37,6 +37,8 @@
 #include "vhost-net.h"
 #include "rte_virtio_net.h"
 
+#define VHOST_USER_PROTOCOL_F_VRING_FLAG 2
+
 struct virtio_net_device_ops const *notify_ops;
 struct virtio_net *get_device(struct vhost_device_ctx ctx);
 
-- 
1.8.4.2

^ permalink raw reply	[flat|nested] 65+ messages in thread

* [dpdk-dev] [PATCH v4 07/12] vhost: add new command line option: rxq
  2015-08-12  8:02     ` [dpdk-dev] [PATCH v4 00/12] Support multiple queues in vhost Ouyang Changchun
                         ` (5 preceding siblings ...)
  2015-08-12  8:02       ` [dpdk-dev] [PATCH v4 06/12] vhost: support protocol feature Ouyang Changchun
@ 2015-08-12  8:02       ` Ouyang Changchun
  2015-08-12  8:02       ` [dpdk-dev] [PATCH v4 08/12] vhost: support multiple queues Ouyang Changchun
                         ` (4 subsequent siblings)
  11 siblings, 0 replies; 65+ messages in thread
From: Ouyang Changchun @ 2015-08-12  8:02 UTC (permalink / raw)
  To: dev

The vhost sample needs to know the queue number the user wants to enable for each virtio device,
so add the new option '--rxq' for it.
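
For a feel of how rxq constrains the device count, here is a minimal
sketch of the pool arithmetic applied below (assuming an 82599-style
NIC that reports 64 VMDq pools; the helper name is illustrative):

/* With 4 RX queues per pool the 82599 must run in 32-pool mode,
 * so the usable vhost device count is halved. */
static int
devices_for_rxq(int max_vmdq_pools, int rxq)
{
	switch (rxq) {
	case 1:
	case 2:
		return max_vmdq_pools;		/* e.g. 64 vhost devices */
	case 4:
		return max_vmdq_pools / 2;	/* e.g. 32 vhost devices */
	default:
		return -1;			/* only 1, 2 and 4 are valid */
	}
}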

Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
---
Changes in v3
  - fix coding style

Changes in v2
  - refine help info
  - check if rxq = 0
  - fix checkpatch errors

 examples/vhost/main.c | 49 +++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 45 insertions(+), 4 deletions(-)

diff --git a/examples/vhost/main.c b/examples/vhost/main.c
index d3c45dd..5b811af 100644
--- a/examples/vhost/main.c
+++ b/examples/vhost/main.c
@@ -163,6 +163,9 @@ static int mergeable;
 /* Do vlan strip on host, enabled on default */
 static uint32_t vlan_strip = 1;
 
+/* Rx queue number per virtio device */
+static uint32_t rxq = 1;
+
 /* number of descriptors to apply*/
 static uint32_t num_rx_descriptor = RTE_TEST_RX_DESC_DEFAULT_ZCP;
 static uint32_t num_tx_descriptor = RTE_TEST_TX_DESC_DEFAULT_ZCP;
@@ -408,8 +411,19 @@ port_init(uint8_t port)
 		txconf->tx_deferred_start = 1;
 	}
 
-	/*configure the number of supported virtio devices based on VMDQ limits */
-	num_devices = dev_info.max_vmdq_pools;
+	/* Configure the virtio devices num based on VMDQ limits */
+	switch (rxq) {
+	case 1:
+	case 2:
+		num_devices = dev_info.max_vmdq_pools;
+		break;
+	case 4:
+		num_devices = dev_info.max_vmdq_pools / 2;
+		break;
+	default:
+		RTE_LOG(ERR, VHOST_CONFIG, "rxq invalid for VMDq.\n");
+		return -1;
+	}
 
 	if (zero_copy) {
 		rx_ring_size = num_rx_descriptor;
@@ -431,7 +445,7 @@ port_init(uint8_t port)
 		return retval;
 	/* NIC queues are divided into pf queues and vmdq queues.  */
 	num_pf_queues = dev_info.max_rx_queues - dev_info.vmdq_queue_num;
-	queues_per_pool = dev_info.vmdq_queue_num / dev_info.max_vmdq_pools;
+	queues_per_pool = dev_info.vmdq_queue_num / num_devices;
 	num_vmdq_queues = num_devices * queues_per_pool;
 	num_queues = num_pf_queues + num_vmdq_queues;
 	vmdq_queue_base = dev_info.vmdq_queue_base;
@@ -576,7 +590,8 @@ us_vhost_usage(const char *prgname)
 	"		--rx-desc-num [0-N]: the number of descriptors on rx, "
 			"used only when zero copy is enabled.\n"
 	"		--tx-desc-num [0-N]: the number of descriptors on tx, "
-			"used only when zero copy is enabled.\n",
+			"used only when zero copy is enabled.\n"
+	"		--rxq [1,2,4]: rx queue number for each vhost device\n",
 	       prgname);
 }
 
@@ -602,6 +617,7 @@ us_vhost_parse_args(int argc, char **argv)
 		{"zero-copy", required_argument, NULL, 0},
 		{"rx-desc-num", required_argument, NULL, 0},
 		{"tx-desc-num", required_argument, NULL, 0},
+		{"rxq", required_argument, NULL, 0},
 		{NULL, 0, 0, 0},
 	};
 
@@ -778,6 +794,18 @@ us_vhost_parse_args(int argc, char **argv)
 				}
 			}
 
+			/* Specify the Rx queue number for each vhost dev. */
+			if (!strncmp(long_option[option_index].name,
+				"rxq", MAX_LONG_OPT_SZ)) {
+				ret = parse_num_opt(optarg, 4);
+				if ((ret == -1) || (ret == 0) || (!POWEROF2(ret))) {
+					RTE_LOG(INFO, VHOST_CONFIG,
+					"Valid value for rxq is [1,2,4]\n");
+					us_vhost_usage(prgname);
+					return -1;
+				} else
+					rxq = ret;
+			}
 			break;
 
 			/* Invalid option - print options. */
@@ -813,6 +841,19 @@ us_vhost_parse_args(int argc, char **argv)
 		return -1;
 	}
 
+	if (rxq > 1) {
+		vmdq_conf_default.rxmode.mq_mode = ETH_MQ_RX_VMDQ_RSS;
+		vmdq_conf_default.rx_adv_conf.rss_conf.rss_hf = ETH_RSS_IP |
+				ETH_RSS_UDP | ETH_RSS_TCP | ETH_RSS_SCTP;
+	}
+
+	if ((zero_copy == 1) && (rxq > 1)) {
+		RTE_LOG(INFO, VHOST_PORT,
+			"Vhost zero copy doesn't support mq mode,"
+			"please specify '--rxq 1' to disable it.\n");
+		return -1;
+	}
+
 	return 0;
 }
 
-- 
1.8.4.2

^ permalink raw reply	[flat|nested] 65+ messages in thread

* [dpdk-dev] [PATCH v4 08/12] vhost: support multiple queues
  2015-08-12  8:02     ` [dpdk-dev] [PATCH v4 00/12] Support multiple queues in vhost Ouyang Changchun
                         ` (6 preceding siblings ...)
  2015-08-12  8:02       ` [dpdk-dev] [PATCH v4 07/12] vhost: add new command line option: rxq Ouyang Changchun
@ 2015-08-12  8:02       ` Ouyang Changchun
  2015-08-12  8:02       ` [dpdk-dev] [PATCH v4 09/12] virtio: resolve for control queue Ouyang Changchun
                         ` (3 subsequent siblings)
  11 siblings, 0 replies; 65+ messages in thread
From: Ouyang Changchun @ 2015-08-12  8:02 UTC (permalink / raw)
  To: dev

The vhost sample leverages VMDq+RSS in HW to receive packets and distribute them
into different queues in the pool according to their 5-tuples.

It also enables multiple queues mode in the vhost/virtio layer.

The HW queue number in the pool must be exactly the same as the queue number in the virtio
device, e.g. rxq = 4 means 4 HW queues in each VMDq pool
and 4 queues in each virtio device/port, with a one-to-one mapping between them.

=========================================
==================|   |==================|
       vport0     |   |      vport1      |
---  ---  ---  ---|   |---  ---  ---  ---|
q0 | q1 | q2 | q3 |   |q0 | q1 | q2 | q3 |
/\= =/\= =/\= =/\=|   |/\= =/\= =/\= =/\=|
||   ||   ||   ||      ||   ||   ||   ||
||   ||   ||   ||      ||   ||   ||   ||
||= =||= =||= =||=|   =||== ||== ||== ||=|
q0 | q1 | q2 | q3 |   |q0 | q1 | q2 | q3 |

------------------|   |------------------|
     VMDq pool0   |   |    VMDq pool1    |
==================|   |==================|

On the RX side, it first polls each queue of the pool, gets the packets from
it, and enqueues them into the corresponding queue of the virtio device/port.
On the TX side, it dequeues packets from each queue of the virtio device/port and sends
them to either a physical port or another virtio device according to the destination
MAC address.
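
Below is a condensed sketch of the per-queue guest-RX path this patch
introduces (the helper name is illustrative; it assumes the sample's
globals such as ports[], rxq and MAX_PKT_BURST, and the retry and
stats handling of the full diff is omitted):

/* HW queue i of the device's VMDq pool feeds queue pair i of the
 * same virtio device, i.e. vring index VIRTIO_RXQ + i * VIRTIO_QNUM. */
static void
drain_pool_to_guest(struct vhost_dev *vdev, struct virtio_net *dev)
{
	struct rte_mbuf *pkts_burst[MAX_PKT_BURST];
	uint16_t rx_count;
	uint32_t i;

	for (i = 0; i < rxq; i++) {
		rx_count = rte_eth_rx_burst(ports[0],
				(uint16_t)(vdev->vmdq_rx_q + i),
				pkts_burst, MAX_PKT_BURST);
		if (rx_count == 0)
			continue;
		rte_vhost_enqueue_burst(dev,
				VIRTIO_RXQ + i * VIRTIO_QNUM,
				pkts_burst, rx_count);
		/* The mbufs are copied into the vring; free them all. */
		while (rx_count)
			rte_pktmbuf_free(pkts_burst[--rx_count]);
	}
}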

Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
---
Changes in v4:
  - address comments and refine var name
  - support FVL nic
  - fix check patch issue

Changes in v2:
  - check queue num per pool in VMDq and queue pair number per vhost device
  - remove the unnecessary calling q_num_set api
  - fix checkpatch errors

 examples/vhost/main.c | 190 ++++++++++++++++++++++++++++++++------------------
 1 file changed, 124 insertions(+), 66 deletions(-)

diff --git a/examples/vhost/main.c b/examples/vhost/main.c
index 5b811af..683a300 100644
--- a/examples/vhost/main.c
+++ b/examples/vhost/main.c
@@ -368,6 +368,37 @@ validate_num_devices(uint32_t max_nb_devices)
 	return 0;
 }
 
+static int
+get_dev_nb_for_82599(struct rte_eth_dev_info dev_info)
+{
+	int dev_nb = -1;
+	switch (rxq) {
+	case 1:
+	case 2:
+		/*
+		 * For 82599, dev_info.max_vmdq_pools is always 64 despite the rx mode.
+		 */
+		dev_nb = (int)dev_info.max_vmdq_pools;
+		break;
+	case 4:
+		dev_nb = (int)dev_info.max_vmdq_pools / 2;
+		break;
+	default:
+		RTE_LOG(ERR, VHOST_CONFIG, "rxq invalid for VMDq.\n");
+	}
+	return dev_nb;
+}
+
+static int
+get_dev_nb_for_fvl(struct rte_eth_dev_info dev_info)
+{
+	/*
+	 * for FVL, dev_info.max_vmdq_pools is calculated according to
+	 * the configured value: CONFIG_RTE_LIBRTE_I40E_QUEUE_NUM_PER_VM.
+	 */
+	return (int)dev_info.max_vmdq_pools;
+}
+
 /*
  * Initialises a given port using global settings and with the rx buffers
  * coming from the mbuf_pool passed as parameter
@@ -412,17 +443,14 @@ port_init(uint8_t port)
 	}
 
 	/* Configure the virtio devices num based on VMDQ limits */
-	switch (rxq) {
-	case 1:
-	case 2:
-		num_devices = dev_info.max_vmdq_pools;
-		break;
-	case 4:
-		num_devices = dev_info.max_vmdq_pools / 2;
-		break;
-	default:
-		RTE_LOG(ERR, VHOST_CONFIG, "rxq invalid for VMDq.\n");
-		return -1;
+	if (dev_info.max_vmdq_pools == ETH_64_POOLS) {
+		num_devices = (uint32_t)get_dev_nb_for_82599(dev_info);
+		if (num_devices == (uint32_t)-1)
+			return -1;
+	} else {
+		num_devices = (uint32_t)get_dev_nb_for_fvl(dev_info);
+		if (num_devices == (uint32_t)-1)
+			return -1;
 	}
 
 	if (zero_copy) {
@@ -1001,8 +1029,9 @@ link_vmdq(struct vhost_dev *vdev, struct rte_mbuf *m)
 
 	/* Enable stripping of the vlan tag as we handle routing. */
 	if (vlan_strip)
-		rte_eth_dev_set_vlan_strip_on_queue(ports[0],
-			(uint16_t)vdev->vmdq_rx_q, 1);
+		for (i = 0; i < (int)rxq; i++)
+			rte_eth_dev_set_vlan_strip_on_queue(ports[0],
+				(uint16_t)(vdev->vmdq_rx_q + i), 1);
 
 	/* Set device as ready for RX. */
 	vdev->ready = DEVICE_RX;
@@ -1017,7 +1046,7 @@ link_vmdq(struct vhost_dev *vdev, struct rte_mbuf *m)
 static inline void
 unlink_vmdq(struct vhost_dev *vdev)
 {
-	unsigned i = 0;
+	unsigned i = 0, j = 0;
 	unsigned rx_count;
 	struct rte_mbuf *pkts_burst[MAX_PKT_BURST];
 
@@ -1030,15 +1059,19 @@ unlink_vmdq(struct vhost_dev *vdev)
 		vdev->vlan_tag = 0;
 
 		/*Clear out the receive buffers*/
-		rx_count = rte_eth_rx_burst(ports[0],
-					(uint16_t)vdev->vmdq_rx_q, pkts_burst, MAX_PKT_BURST);
+		for (i = 0; i < rxq; i++) {
+			rx_count = rte_eth_rx_burst(ports[0],
+					(uint16_t)vdev->vmdq_rx_q + i,
+					pkts_burst, MAX_PKT_BURST);
 
-		while (rx_count) {
-			for (i = 0; i < rx_count; i++)
-				rte_pktmbuf_free(pkts_burst[i]);
+			while (rx_count) {
+				for (j = 0; j < rx_count; j++)
+					rte_pktmbuf_free(pkts_burst[j]);
 
-			rx_count = rte_eth_rx_burst(ports[0],
-					(uint16_t)vdev->vmdq_rx_q, pkts_burst, MAX_PKT_BURST);
+				rx_count = rte_eth_rx_burst(ports[0],
+					(uint16_t)vdev->vmdq_rx_q + i,
+					pkts_burst, MAX_PKT_BURST);
+			}
 		}
 
 		vdev->ready = DEVICE_MAC_LEARNING;
@@ -1050,7 +1083,7 @@ unlink_vmdq(struct vhost_dev *vdev)
  * the packet on that devices RX queue. If not then return.
  */
 static inline int __attribute__((always_inline))
-virtio_tx_local(struct vhost_dev *vdev, struct rte_mbuf *m)
+virtio_tx_local(struct vhost_dev *vdev, struct rte_mbuf *m, uint32_t q_idx)
 {
 	struct virtio_net_data_ll *dev_ll;
 	struct ether_hdr *pkt_hdr;
@@ -1065,7 +1098,7 @@ virtio_tx_local(struct vhost_dev *vdev, struct rte_mbuf *m)
 
 	while (dev_ll != NULL) {
 		if ((dev_ll->vdev->ready == DEVICE_RX) && ether_addr_cmp(&(pkt_hdr->d_addr),
-				          &dev_ll->vdev->mac_address)) {
+					&dev_ll->vdev->mac_address)) {
 
 			/* Drop the packet if the TX packet is destined for the TX device. */
 			if (dev_ll->vdev->dev->device_fh == dev->device_fh) {
@@ -1083,7 +1116,9 @@ virtio_tx_local(struct vhost_dev *vdev, struct rte_mbuf *m)
 				LOG_DEBUG(VHOST_DATA, "(%"PRIu64") Device is marked for removal\n", tdev->device_fh);
 			} else {
 				/*send the packet to the local virtio device*/
-				ret = rte_vhost_enqueue_burst(tdev, VIRTIO_RXQ, &m, 1);
+				ret = rte_vhost_enqueue_burst(tdev,
+					VIRTIO_RXQ + q_idx * VIRTIO_QNUM,
+					&m, 1);
 				if (enable_stats) {
 					rte_atomic64_add(
 					&dev_statistics[tdev->device_fh].rx_total_atomic,
@@ -1160,7 +1195,8 @@ find_local_dest(struct virtio_net *dev, struct rte_mbuf *m,
  * or the physical port.
  */
 static inline void __attribute__((always_inline))
-virtio_tx_route(struct vhost_dev *vdev, struct rte_mbuf *m, uint16_t vlan_tag)
+virtio_tx_route(struct vhost_dev *vdev, struct rte_mbuf *m,
+		uint16_t vlan_tag, uint32_t q_idx)
 {
 	struct mbuf_table *tx_q;
 	struct rte_mbuf **m_table;
@@ -1170,7 +1206,8 @@ virtio_tx_route(struct vhost_dev *vdev, struct rte_mbuf *m, uint16_t vlan_tag)
 	struct ether_hdr *nh;
 
 	/*check if destination is local VM*/
-	if ((vm2vm_mode == VM2VM_SOFTWARE) && (virtio_tx_local(vdev, m) == 0)) {
+	if ((vm2vm_mode == VM2VM_SOFTWARE) &&
+		(virtio_tx_local(vdev, m, q_idx) == 0)) {
 		rte_pktmbuf_free(m);
 		return;
 	}
@@ -1334,49 +1371,60 @@ switch_worker(__attribute__((unused)) void *arg)
 			}
 			if (likely(vdev->ready == DEVICE_RX)) {
 				/*Handle guest RX*/
-				rx_count = rte_eth_rx_burst(ports[0],
-					vdev->vmdq_rx_q, pkts_burst, MAX_PKT_BURST);
+				for (i = 0; i < rxq; i++) {
+					rx_count = rte_eth_rx_burst(ports[0],
+						vdev->vmdq_rx_q + i, pkts_burst, MAX_PKT_BURST);
 
-				if (rx_count) {
-					/*
-					* Retry is enabled and the queue is full then we wait and retry to avoid packet loss
-					* Here MAX_PKT_BURST must be less than virtio queue size
-					*/
-					if (enable_retry && unlikely(rx_count > rte_vring_available_entries(dev, VIRTIO_RXQ))) {
-						for (retry = 0; retry < burst_rx_retry_num; retry++) {
-							rte_delay_us(burst_rx_delay_time);
-							if (rx_count <= rte_vring_available_entries(dev, VIRTIO_RXQ))
-								break;
+					if (rx_count) {
+						/*
+						* Retry is enabled and the queue is full then we wait and retry to avoid packet loss
+						* Here MAX_PKT_BURST must be less than virtio queue size
+						*/
+						if (enable_retry && unlikely(rx_count > rte_vring_available_entries(dev,
+											VIRTIO_RXQ + i * VIRTIO_QNUM))) {
+							for (retry = 0; retry < burst_rx_retry_num; retry++) {
+								rte_delay_us(burst_rx_delay_time);
+								if (rx_count <= rte_vring_available_entries(dev,
+											VIRTIO_RXQ + i * VIRTIO_QNUM))
+									break;
+							}
+						}
+						ret_count = rte_vhost_enqueue_burst(dev, VIRTIO_RXQ + i * VIRTIO_QNUM,
+											pkts_burst, rx_count);
+						if (enable_stats) {
+							rte_atomic64_add(
+							&dev_statistics[dev_ll->vdev->dev->device_fh].rx_total_atomic,
+							rx_count);
+							rte_atomic64_add(
+							&dev_statistics[dev_ll->vdev->dev->device_fh].rx_atomic, ret_count);
+						}
+						while (likely(rx_count)) {
+							rx_count--;
+							rte_pktmbuf_free(pkts_burst[rx_count]);
 						}
 					}
-					ret_count = rte_vhost_enqueue_burst(dev, VIRTIO_RXQ, pkts_burst, rx_count);
-					if (enable_stats) {
-						rte_atomic64_add(
-						&dev_statistics[dev_ll->vdev->dev->device_fh].rx_total_atomic,
-						rx_count);
-						rte_atomic64_add(
-						&dev_statistics[dev_ll->vdev->dev->device_fh].rx_atomic, ret_count);
-					}
-					while (likely(rx_count)) {
-						rx_count--;
-						rte_pktmbuf_free(pkts_burst[rx_count]);
-					}
-
 				}
 			}
 
 			if (likely(!vdev->remove)) {
 				/* Handle guest TX*/
-				tx_count = rte_vhost_dequeue_burst(dev, VIRTIO_TXQ, mbuf_pool, pkts_burst, MAX_PKT_BURST);
-				/* If this is the first received packet we need to learn the MAC and setup VMDQ */
-				if (unlikely(vdev->ready == DEVICE_MAC_LEARNING) && tx_count) {
-					if (vdev->remove || (link_vmdq(vdev, pkts_burst[0]) == -1)) {
-						while (tx_count)
-							rte_pktmbuf_free(pkts_burst[--tx_count]);
+				for (i = 0; i < rxq; i++) {
+					tx_count = rte_vhost_dequeue_burst(dev, VIRTIO_TXQ + i * VIRTIO_QNUM,
+							mbuf_pool, pkts_burst, MAX_PKT_BURST);
+					/*
+					 * If this is the first received packet we need to learn
+					 * the MAC and setup VMDQ
+					 */
+					if (unlikely(vdev->ready == DEVICE_MAC_LEARNING) && tx_count) {
+						if (vdev->remove || (link_vmdq(vdev, pkts_burst[0]) == -1)) {
+							while (tx_count)
+								rte_pktmbuf_free(pkts_burst[--tx_count]);
+						}
 					}
+					while (tx_count)
+						virtio_tx_route(vdev, pkts_burst[--tx_count],
+								(uint16_t)dev->device_fh, i);
 				}
-				while (tx_count)
-					virtio_tx_route(vdev, pkts_burst[--tx_count], (uint16_t)dev->device_fh);
 			}
 
 			/*move to the next device in the list*/
@@ -2634,6 +2682,14 @@ new_device (struct virtio_net *dev)
 	uint32_t device_num_min = num_devices;
 	struct vhost_dev *vdev;
 	uint32_t regionidx;
+	uint32_t i;
+
+	if ((rxq > 1) && (dev->virt_qp_nb != rxq)) {
+		RTE_LOG(ERR, VHOST_DATA, "(%"PRIu64") queue num in VMDq pool:"
+			"%d != queue pair num in vhost dev:%d\n",
+			dev->device_fh, rxq, dev->virt_qp_nb);
+		return -1;
+	}
 
 	vdev = rte_zmalloc("vhost device", sizeof(*vdev), RTE_CACHE_LINE_SIZE);
 	if (vdev == NULL) {
@@ -2680,12 +2736,12 @@ new_device (struct virtio_net *dev)
 		}
 	}
 
-
 	/* Add device to main ll */
 	ll_dev = get_data_ll_free_entry(&ll_root_free);
 	if (ll_dev == NULL) {
-		RTE_LOG(INFO, VHOST_DATA, "(%"PRIu64") No free entry found in linked list. Device limit "
-			"of %d devices per core has been reached\n",
+		RTE_LOG(INFO, VHOST_DATA,
+			"(%"PRIu64") No free entry found in linked list."
+			"Device limit of %d devices per core has been reached\n",
 			dev->device_fh, num_devices);
 		if (vdev->regions_hpa)
 			rte_free(vdev->regions_hpa);
@@ -2694,8 +2750,7 @@ new_device (struct virtio_net *dev)
 	}
 	ll_dev->vdev = vdev;
 	add_data_ll_entry(&ll_root_used, ll_dev);
-	vdev->vmdq_rx_q
-		= dev->device_fh * queues_per_pool + vmdq_queue_base;
+	vdev->vmdq_rx_q	= dev->device_fh * rxq + vmdq_queue_base;
 
 	if (zero_copy) {
 		uint32_t index = vdev->vmdq_rx_q;
@@ -2801,8 +2856,11 @@ new_device (struct virtio_net *dev)
 	memset(&dev_statistics[dev->device_fh], 0, sizeof(struct device_statistics));
 
 	/* Disable notifications. */
-	rte_vhost_enable_guest_notification(dev, VIRTIO_RXQ, 0);
-	rte_vhost_enable_guest_notification(dev, VIRTIO_TXQ, 0);
+	for (i = 0; i < rxq; i++) {
+		rte_vhost_enable_guest_notification(dev, i * VIRTIO_QNUM + VIRTIO_RXQ, 0);
+		rte_vhost_enable_guest_notification(dev, i * VIRTIO_QNUM + VIRTIO_TXQ, 0);
+	}
+
 	lcore_info[vdev->coreid].lcore_ll->device_num++;
 	dev->flags |= VIRTIO_DEV_RUNNING;
 
-- 
1.8.4.2

^ permalink raw reply	[flat|nested] 65+ messages in thread

* [dpdk-dev] [PATCH v4 09/12] virtio: resolve for control queue
  2015-08-12  8:02     ` [dpdk-dev] [PATCH v4 00/12] Support multiple queues in vhost Ouyang Changchun
                         ` (7 preceding siblings ...)
  2015-08-12  8:02       ` [dpdk-dev] [PATCH v4 08/12] vhost: support multiple queues Ouyang Changchun
@ 2015-08-12  8:02       ` Ouyang Changchun
  2015-08-12  8:02       ` [dpdk-dev] [PATCH v4 10/12] vhost: add per queue stats info Ouyang Changchun
                         ` (2 subsequent siblings)
  11 siblings, 0 replies; 65+ messages in thread
From: Ouyang Changchun @ 2015-08-12  8:02 UTC (permalink / raw)
  To: dev

Fix the max virtio queue pair read issue.

The control queue can't work in vhost-user multiple queue mode,
so a counter was introduced to avoid the dead loop when polling the control queue (removed in v4).
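
The core of the fix is that each optional config field is now read
individually at its structure offset, instead of one cumulative read
whose length depended on which feature bits had been negotiated; a
minimal sketch of the resulting pattern (hw and config as in
eth_virtio_dev_init below):

/* The read of max_virtqueue_pairs no longer depends on whether the
 * preceding status field was negotiated. */
if (vtpci_with_feature(hw, VIRTIO_NET_F_MQ))
	vtpci_read_dev_config(hw,
		offsetof(struct virtio_net_config, max_virtqueue_pairs),
		(uint8_t *)&config->max_virtqueue_pairs,
		sizeof(config->max_virtqueue_pairs));
else
	config->max_virtqueue_pairs = 1;	/* MQ not offered */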

Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
---
Changes in v4:
  - revert the workaround
  - fix the max virtio queue pair read issue

Changes in v2:
  - fix checkpatch errors

 drivers/net/virtio/virtio_ethdev.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c
index 465d3cd..3ce11f8 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -1162,7 +1162,6 @@ eth_virtio_dev_init(struct rte_eth_dev *eth_dev)
 	struct virtio_hw *hw = eth_dev->data->dev_private;
 	struct virtio_net_config *config;
 	struct virtio_net_config local_config;
-	uint32_t offset_conf = sizeof(config->mac);
 	struct rte_pci_device *pci_dev;
 
 	RTE_BUILD_BUG_ON(RTE_PKTMBUF_HEADROOM < sizeof(struct virtio_net_hdr));
@@ -1222,7 +1221,8 @@ eth_virtio_dev_init(struct rte_eth_dev *eth_dev)
 		config = &local_config;
 
 		if (vtpci_with_feature(hw, VIRTIO_NET_F_STATUS)) {
-			offset_conf += sizeof(config->status);
+			vtpci_read_dev_config(hw, offsetof(struct virtio_net_config, status),
+				(uint8_t *)&config->status, sizeof(config->status));
 		} else {
 			PMD_INIT_LOG(DEBUG,
 				     "VIRTIO_NET_F_STATUS is not supported");
@@ -1230,15 +1230,14 @@ eth_virtio_dev_init(struct rte_eth_dev *eth_dev)
 		}
 
 		if (vtpci_with_feature(hw, VIRTIO_NET_F_MQ)) {
-			offset_conf += sizeof(config->max_virtqueue_pairs);
+			vtpci_read_dev_config(hw, offsetof(struct virtio_net_config, max_virtqueue_pairs),
+				(uint8_t *)&config->max_virtqueue_pairs, sizeof(config->max_virtqueue_pairs));
 		} else {
 			PMD_INIT_LOG(DEBUG,
 				     "VIRTIO_NET_F_MQ is not supported");
 			config->max_virtqueue_pairs = 1;
 		}
 
-		vtpci_read_dev_config(hw, 0, (uint8_t *)config, offset_conf);
-
 		hw->max_rx_queues =
 			(VIRTIO_MAX_RX_QUEUES < config->max_virtqueue_pairs) ?
 			VIRTIO_MAX_RX_QUEUES : config->max_virtqueue_pairs;
-- 
1.8.4.2

^ permalink raw reply	[flat|nested] 65+ messages in thread

* [dpdk-dev] [PATCH v4 10/12] vhost: add per queue stats info
  2015-08-12  8:02     ` [dpdk-dev] [PATCH v4 00/12] Support multiple queues in vhost Ouyang Changchun
                         ` (8 preceding siblings ...)
  2015-08-12  8:02       ` [dpdk-dev] [PATCH v4 09/12] virtio: resolve for control queue Ouyang Changchun
@ 2015-08-12  8:02       ` Ouyang Changchun
  2015-08-12  8:02       ` [dpdk-dev] [PATCH v4 11/12] vhost: alloc core to virtq Ouyang Changchun
  2015-08-12  8:02       ` [dpdk-dev] [PATCH v4 12/12] doc: update doc for vhost multiple queues Ouyang Changchun
  11 siblings, 0 replies; 65+ messages in thread
From: Ouyang Changchun @ 2015-08-12  8:02 UTC (permalink / raw)
  To: dev

Add per-queue stats info
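
A minimal sketch of the resulting layout: the flat per-device counters
move behind a per-device pointer indexed by queue pair (condensed; the
plain rx/rx_total counters used by the zero-copy path are omitted, and
the helper name is illustrative):

struct qp_statistics {
	uint64_t tx_total;
	uint64_t tx;
	rte_atomic64_t rx_total_atomic;
	rte_atomic64_t rx_atomic;
};

struct device_statistics {
	/* One entry per queue pair, allocated up to
	 * VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MAX when stats are enabled. */
	struct qp_statistics *qp_stats;
};

/* Accounting a successful local TX on queue pair q_idx. */
static inline void
account_local_tx(struct device_statistics *stats,
		uint64_t device_fh, uint32_t q_idx)
{
	stats[device_fh].qp_stats[q_idx].tx_total++;
	stats[device_fh].qp_stats[q_idx].tx++;
}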

Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
---
Changes in v3
  - fix coding style and displaying format
  - check stats_enable to alloc mem for queue pair

Changes in v2
  - fix the stats issue in tx_local
  - dynamically alloc mem for queue pair stats info
  - fix checkpatch errors

 examples/vhost/main.c | 126 +++++++++++++++++++++++++++++++-------------------
 1 file changed, 79 insertions(+), 47 deletions(-)

diff --git a/examples/vhost/main.c b/examples/vhost/main.c
index 683a300..54f9648 100644
--- a/examples/vhost/main.c
+++ b/examples/vhost/main.c
@@ -314,7 +314,7 @@ struct ipv4_hdr {
 #define VLAN_ETH_HLEN   18
 
 /* Per-device statistics struct */
-struct device_statistics {
+struct qp_statistics {
 	uint64_t tx_total;
 	rte_atomic64_t rx_total_atomic;
 	uint64_t rx_total;
@@ -322,6 +322,10 @@ struct device_statistics {
 	rte_atomic64_t rx_atomic;
 	uint64_t rx;
 } __rte_cache_aligned;
+
+struct device_statistics {
+	struct qp_statistics *qp_stats;
+};
 struct device_statistics dev_statistics[MAX_DEVICES];
 
 /*
@@ -766,6 +770,17 @@ us_vhost_parse_args(int argc, char **argv)
 					return -1;
 				} else {
 					enable_stats = ret;
+					if (enable_stats)
+						for (i = 0; i < MAX_DEVICES; i++) {
+							dev_statistics[i].qp_stats =
+								malloc(VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MAX * sizeof(struct qp_statistics));
+							if (dev_statistics[i].qp_stats == NULL) {
+								RTE_LOG(ERR, VHOST_CONFIG, "Failed to allocate memory for qp stats.\n");
+								return -1;
+							}
+							memset(dev_statistics[i].qp_stats, 0,
+								VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MAX * sizeof(struct qp_statistics));
+						}
 				}
 			}
 
@@ -1121,13 +1136,13 @@ virtio_tx_local(struct vhost_dev *vdev, struct rte_mbuf *m, uint32_t q_idx)
 					&m, 1);
 				if (enable_stats) {
 					rte_atomic64_add(
-					&dev_statistics[tdev->device_fh].rx_total_atomic,
+					&dev_statistics[tdev->device_fh].qp_stats[q_idx].rx_total_atomic,
 					1);
 					rte_atomic64_add(
-					&dev_statistics[tdev->device_fh].rx_atomic,
+					&dev_statistics[tdev->device_fh].qp_stats[q_idx].rx_atomic,
 					ret);
-					dev_statistics[tdev->device_fh].tx_total++;
-					dev_statistics[tdev->device_fh].tx += ret;
+					dev_statistics[dev->device_fh].qp_stats[q_idx].tx_total++;
+					dev_statistics[dev->device_fh].qp_stats[q_idx].tx += ret;
 				}
 			}
 
@@ -1261,8 +1276,8 @@ virtio_tx_route(struct vhost_dev *vdev, struct rte_mbuf *m,
 	tx_q->m_table[len] = m;
 	len++;
 	if (enable_stats) {
-		dev_statistics[dev->device_fh].tx_total++;
-		dev_statistics[dev->device_fh].tx++;
+		dev_statistics[dev->device_fh].qp_stats[q_idx].tx_total++;
+		dev_statistics[dev->device_fh].qp_stats[q_idx].tx++;
 	}
 
 	if (unlikely(len == MAX_PKT_BURST)) {
@@ -1393,10 +1408,10 @@ switch_worker(__attribute__((unused)) void *arg)
 											pkts_burst, rx_count);
 						if (enable_stats) {
 							rte_atomic64_add(
-							&dev_statistics[dev_ll->vdev->dev->device_fh].rx_total_atomic,
+							&dev_statistics[dev_ll->vdev->dev->device_fh].qp_stats[i].rx_total_atomic,
 							rx_count);
 							rte_atomic64_add(
-							&dev_statistics[dev_ll->vdev->dev->device_fh].rx_atomic, ret_count);
+							&dev_statistics[dev_ll->vdev->dev->device_fh].qp_stats[i].rx_atomic, ret_count);
 						}
 						while (likely(rx_count)) {
 							rx_count--;
@@ -1946,8 +1961,8 @@ virtio_tx_route_zcp(struct virtio_net *dev, struct rte_mbuf *m,
 		(mbuf->next == NULL) ? "null" : "non-null");
 
 	if (enable_stats) {
-		dev_statistics[dev->device_fh].tx_total++;
-		dev_statistics[dev->device_fh].tx++;
+		dev_statistics[dev->device_fh].qp_stats[0].tx_total++;
+		dev_statistics[dev->device_fh].qp_stats[0].tx++;
 	}
 
 	if (unlikely(len == MAX_PKT_BURST)) {
@@ -2230,9 +2245,9 @@ switch_worker_zcp(__attribute__((unused)) void *arg)
 					ret_count = virtio_dev_rx_zcp(dev,
 							pkts_burst, rx_count);
 					if (enable_stats) {
-						dev_statistics[dev->device_fh].rx_total
+						dev_statistics[dev->device_fh].qp_stats[0].rx_total
 							+= rx_count;
-						dev_statistics[dev->device_fh].rx
+						dev_statistics[dev->device_fh].qp_stats[0].rx
 							+= ret_count;
 					}
 					while (likely(rx_count)) {
@@ -2853,7 +2868,9 @@ new_device (struct virtio_net *dev)
 	add_data_ll_entry(&lcore_info[vdev->coreid].lcore_ll->ll_root_used, ll_dev);
 
 	/* Initialize device stats */
-	memset(&dev_statistics[dev->device_fh], 0, sizeof(struct device_statistics));
+	if (enable_stats)
+		memset(dev_statistics[dev->device_fh].qp_stats, 0,
+			VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MAX * sizeof(struct qp_statistics));
 
 	/* Disable notifications. */
 	for (i = 0; i < rxq; i++) {
@@ -2889,7 +2906,7 @@ print_stats(void)
 	struct virtio_net_data_ll *dev_ll;
 	uint64_t tx_dropped, rx_dropped;
 	uint64_t tx, tx_total, rx, rx_total;
-	uint32_t device_fh;
+	uint32_t device_fh, i;
 	const char clr[] = { 27, '[', '2', 'J', '\0' };
 	const char top_left[] = { 27, '[', '1', ';', '1', 'H','\0' };
 
@@ -2904,35 +2921,53 @@ print_stats(void)
 		dev_ll = ll_root_used;
 		while (dev_ll != NULL) {
 			device_fh = (uint32_t)dev_ll->vdev->dev->device_fh;
-			tx_total = dev_statistics[device_fh].tx_total;
-			tx = dev_statistics[device_fh].tx;
-			tx_dropped = tx_total - tx;
-			if (zero_copy == 0) {
-				rx_total = rte_atomic64_read(
-					&dev_statistics[device_fh].rx_total_atomic);
-				rx = rte_atomic64_read(
-					&dev_statistics[device_fh].rx_atomic);
-			} else {
-				rx_total = dev_statistics[device_fh].rx_total;
-				rx = dev_statistics[device_fh].rx;
-			}
-			rx_dropped = rx_total - rx;
-
-			printf("\nStatistics for device %"PRIu32" ------------------------------"
-					"\nTX total: 		%"PRIu64""
-					"\nTX dropped: 		%"PRIu64""
-					"\nTX successful: 		%"PRIu64""
-					"\nRX total: 		%"PRIu64""
-					"\nRX dropped: 		%"PRIu64""
-					"\nRX successful: 		%"PRIu64"",
-					device_fh,
-					tx_total,
-					tx_dropped,
-					tx,
-					rx_total,
-					rx_dropped,
-					rx);
-
+			for (i = 0; i < rxq; i++) {
+				tx_total = dev_statistics[device_fh].qp_stats[i].tx_total;
+				tx = dev_statistics[device_fh].qp_stats[i].tx;
+				tx_dropped = tx_total - tx;
+				if (zero_copy == 0) {
+					rx_total = rte_atomic64_read(
+						&dev_statistics[device_fh].qp_stats[i].rx_total_atomic);
+					rx = rte_atomic64_read(
+						&dev_statistics[device_fh].qp_stats[i].rx_atomic);
+				} else {
+					rx_total = dev_statistics[device_fh].qp_stats[0].rx_total;
+					rx = dev_statistics[device_fh].qp_stats[0].rx;
+				}
+				rx_dropped = rx_total - rx;
+
+				if (rxq > 1)
+					printf("\nStatistics for device %"PRIu32" queue id: %d------------------"
+						"\nTX total:		%"PRIu64""
+						"\nTX dropped:		%"PRIu64""
+						"\nTX success:		%"PRIu64""
+						"\nRX total:		%"PRIu64""
+						"\nRX dropped:		%"PRIu64""
+						"\nRX success:		%"PRIu64"",
+						device_fh,
+						i,
+						tx_total,
+						tx_dropped,
+						tx,
+						rx_total,
+						rx_dropped,
+						rx);
+				else
+					printf("\nStatistics for device %"PRIu32" ------------------------------"
+						"\nTX total:		%"PRIu64""
+						"\nTX dropped:		%"PRIu64""
+						"\nTX success:		%"PRIu64""
+						"\nRX total:		%"PRIu64""
+						"\nRX dropped:		%"PRIu64""
+						"\nRX success:		%"PRIu64"",
+						device_fh,
+						tx_total,
+						tx_dropped,
+						tx,
+						rx_total,
+						rx_dropped,
+						rx);
+				}
 			dev_ll = dev_ll->next;
 		}
 		printf("\n======================================================\n");
@@ -3114,9 +3149,6 @@ main(int argc, char *argv[])
 	if (init_data_ll() == -1)
 		rte_exit(EXIT_FAILURE, "Failed to initialize linked list\n");
 
-	/* Initialize device stats */
-	memset(&dev_statistics, 0, sizeof(dev_statistics));
-
 	/* Enable stats if the user option is set. */
 	if (enable_stats)
 		pthread_create(&tid, NULL, (void*)print_stats, NULL );
-- 
1.8.4.2

^ permalink raw reply	[flat|nested] 65+ messages in thread

* [dpdk-dev] [PATCH v4 11/12] vhost: alloc core to virtq
  2015-08-12  8:02     ` [dpdk-dev] [PATCH v4 00/12] Support multiple queues in vhost Ouyang Changchun
                         ` (9 preceding siblings ...)
  2015-08-12  8:02       ` [dpdk-dev] [PATCH v4 10/12] vhost: add per queue stats info Ouyang Changchun
@ 2015-08-12  8:02       ` Ouyang Changchun
  2015-08-12  8:02       ` [dpdk-dev] [PATCH v4 12/12] doc: update doc for vhost multiple queues Ouyang Changchun
  11 siblings, 0 replies; 65+ messages in thread
From: Ouyang Changchun @ 2015-08-12  8:02 UTC (permalink / raw)
  To: dev

This patch allocates cores at the granularity of a virtq instead of a virtio device.
This gives vhost the capability of polling different virtqs with different cores,
which shows better performance on vhost/virtio ports with more cores.

Add 2 APIs: rte_vhost_core_id_get and rte_vhost_core_id_set.
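
Below is a condensed sketch of how the sample uses the two new APIs at
new_device() time (it assumes the sample's lcore_info, rxq and
num_devices globals; the linked-list bookkeeping of the full diff is
omitted):

/* Pin each queue pair of a new device to the least-loaded core and
 * remember the choice inside the vq, so destroy_device() can find
 * the right lcore list again via rte_vhost_core_id_get(dev, i). */
uint32_t i;
int lcore;

for (i = 0; i < rxq; i++) {
	uint16_t core_add = 0;
	uint32_t device_num_min = num_devices;

	RTE_LCORE_FOREACH_SLAVE(lcore) {
		if (lcore_info[lcore].lcore_ll->device_num < device_num_min) {
			device_num_min = lcore_info[lcore].lcore_ll->device_num;
			core_add = (uint16_t)lcore;
		}
	}
	rte_vhost_core_id_set(dev, i, core_add);
	lcore_info[core_add].lcore_ll->device_num++;
}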

Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
---
It is added since v4.

 examples/vhost/Makefile           |   4 +-
 examples/vhost/main.c             | 243 ++++++++++++++++++++------------------
 examples/vhost/main.h             |   3 +-
 lib/librte_vhost/rte_virtio_net.h |  25 ++++
 lib/librte_vhost/virtio-net.c     |  22 ++++
 5 files changed, 178 insertions(+), 119 deletions(-)

diff --git a/examples/vhost/Makefile b/examples/vhost/Makefile
index c269466..32a3dec 100644
--- a/examples/vhost/Makefile
+++ b/examples/vhost/Makefile
@@ -50,8 +50,8 @@ APP = vhost-switch
 # all source are stored in SRCS-y
 SRCS-y := main.c
 
-CFLAGS += -O2 -D_FILE_OFFSET_BITS=64
-CFLAGS += $(WERROR_FLAGS)
+CFLAGS += -O0 -g -D_FILE_OFFSET_BITS=64
+CFLAGS += $(WERROR_FLAGS) -Wno-maybe-uninitialized
 
 include $(RTE_SDK)/mk/rte.extapp.mk
 
diff --git a/examples/vhost/main.c b/examples/vhost/main.c
index 54f9648..0a36c61 100644
--- a/examples/vhost/main.c
+++ b/examples/vhost/main.c
@@ -1386,60 +1386,58 @@ switch_worker(__attribute__((unused)) void *arg)
 			}
 			if (likely(vdev->ready == DEVICE_RX)) {
 				/*Handle guest RX*/
-				for (i = 0; i < rxq; i++) {
-					rx_count = rte_eth_rx_burst(ports[0],
-						vdev->vmdq_rx_q + i, pkts_burst, MAX_PKT_BURST);
+				uint16_t q_idx = dev_ll->work_q_idx;
+				rx_count = rte_eth_rx_burst(ports[0],
+					vdev->vmdq_rx_q + q_idx, pkts_burst, MAX_PKT_BURST);
 
-					if (rx_count) {
-						/*
-						* Retry is enabled and the queue is full then we wait and retry to avoid packet loss
-						* Here MAX_PKT_BURST must be less than virtio queue size
-						*/
-						if (enable_retry && unlikely(rx_count > rte_vring_available_entries(dev,
-											VIRTIO_RXQ + i * VIRTIO_QNUM))) {
-							for (retry = 0; retry < burst_rx_retry_num; retry++) {
-								rte_delay_us(burst_rx_delay_time);
-								if (rx_count <= rte_vring_available_entries(dev,
-											VIRTIO_RXQ + i * VIRTIO_QNUM))
-									break;
-							}
-						}
-						ret_count = rte_vhost_enqueue_burst(dev, VIRTIO_RXQ + i * VIRTIO_QNUM,
-											pkts_burst, rx_count);
-						if (enable_stats) {
-							rte_atomic64_add(
-							&dev_statistics[dev_ll->vdev->dev->device_fh].qp_stats[i].rx_total_atomic,
-							rx_count);
-							rte_atomic64_add(
-							&dev_statistics[dev_ll->vdev->dev->device_fh].qp_stats[i].rx_atomic, ret_count);
-						}
-						while (likely(rx_count)) {
-							rx_count--;
-							rte_pktmbuf_free(pkts_burst[rx_count]);
+				if (rx_count) {
+					/*
+					* Retry is enabled and the queue is full then we wait and retry to avoid packet loss
+					* Here MAX_PKT_BURST must be less than virtio queue size
+					*/
+					if (enable_retry && unlikely(rx_count > rte_vring_available_entries(dev,
+										VIRTIO_RXQ + q_idx * VIRTIO_QNUM))) {
+						for (retry = 0; retry < burst_rx_retry_num; retry++) {
+							rte_delay_us(burst_rx_delay_time);
+							if (rx_count <= rte_vring_available_entries(dev,
+										VIRTIO_RXQ + q_idx * VIRTIO_QNUM))
+								break;
 						}
 					}
+					ret_count = rte_vhost_enqueue_burst(dev, VIRTIO_RXQ + q_idx * VIRTIO_QNUM,
+										pkts_burst, rx_count);
+					if (enable_stats) {
+						rte_atomic64_add(
+						&dev_statistics[dev_ll->vdev->dev->device_fh].qp_stats[q_idx].rx_total_atomic,
+						rx_count);
+						rte_atomic64_add(
+						&dev_statistics[dev_ll->vdev->dev->device_fh].qp_stats[q_idx].rx_atomic, ret_count);
+					}
+					while (likely(rx_count)) {
+						rx_count--;
+						rte_pktmbuf_free(pkts_burst[rx_count]);
+					}
 				}
 			}
 
 			if (likely(!vdev->remove)) {
 				/* Handle guest TX*/
-				for (i = 0; i < rxq; i++) {
-					tx_count = rte_vhost_dequeue_burst(dev, VIRTIO_TXQ + i * VIRTIO_QNUM,
-							mbuf_pool, pkts_burst, MAX_PKT_BURST);
-					/*
-					 * If this is the first received packet we need to learn
-					 * the MAC and setup VMDQ
-					 */
-					if (unlikely(vdev->ready == DEVICE_MAC_LEARNING) && tx_count) {
-						if (vdev->remove || (link_vmdq(vdev, pkts_burst[0]) == -1)) {
-							while (tx_count)
-								rte_pktmbuf_free(pkts_burst[--tx_count]);
-						}
+				uint16_t q_idx = dev_ll->work_q_idx;
+				tx_count = rte_vhost_dequeue_burst(dev, VIRTIO_TXQ + q_idx * VIRTIO_QNUM,
+						mbuf_pool, pkts_burst, MAX_PKT_BURST);
+				/*
+				 * If this is the first received packet we need to learn
+				 * the MAC and setup VMDQ
+				 */
+				if (unlikely(vdev->ready == DEVICE_MAC_LEARNING) && tx_count) {
+					if (vdev->remove || (link_vmdq(vdev, pkts_burst[0]) == -1)) {
+						while (tx_count)
+							rte_pktmbuf_free(pkts_burst[--tx_count]);
 					}
-					while (tx_count)
-						virtio_tx_route(vdev, pkts_burst[--tx_count],
-								(uint16_t)dev->device_fh, i);
 				}
+				while (tx_count)
+					virtio_tx_route(vdev, pkts_burst[--tx_count],
+						(uint16_t)dev->device_fh, q_idx);
 			}
 
 			/*move to the next device in the list*/
@@ -2427,6 +2425,7 @@ destroy_device (volatile struct virtio_net *dev)
 	struct virtio_net_data_ll *ll_main_dev_last = NULL;
 	struct vhost_dev *vdev;
 	int lcore;
+	uint32_t i;
 
 	dev->flags &= ~VIRTIO_DEV_RUNNING;
 
@@ -2438,61 +2437,73 @@ destroy_device (volatile struct virtio_net *dev)
 	}
 
 	/* Search for entry to be removed from lcore ll */
-	ll_lcore_dev_cur = lcore_info[vdev->coreid].lcore_ll->ll_root_used;
-	while (ll_lcore_dev_cur != NULL) {
-		if (ll_lcore_dev_cur->vdev == vdev) {
-			break;
-		} else {
-			ll_lcore_dev_last = ll_lcore_dev_cur;
-			ll_lcore_dev_cur = ll_lcore_dev_cur->next;
+	for (i = 0; i < rxq; i++) {
+		uint16_t core_id = rte_vhost_core_id_get(dev, i);
+
+		ll_lcore_dev_cur = lcore_info[core_id].lcore_ll->ll_root_used;
+
+		while (ll_lcore_dev_cur != NULL) {
+			if (ll_lcore_dev_cur->vdev == vdev) {
+				break;
+			} else {
+				ll_lcore_dev_last = ll_lcore_dev_cur;
+				ll_lcore_dev_cur = ll_lcore_dev_cur->next;
+			}
 		}
-	}
 
-	if (ll_lcore_dev_cur == NULL) {
-		RTE_LOG(ERR, VHOST_CONFIG,
-			"(%"PRIu64") Failed to find the dev to be destroy.\n",
-			dev->device_fh);
-		return;
-	}
+		if (ll_lcore_dev_cur == NULL) {
+			RTE_LOG(ERR, VHOST_CONFIG,
+				"(%"PRIu64") Failed to find the dev to be destroy.\n",
+				dev->device_fh);
+			if (i == 0)
+				return;
+			else
+				break;
+		}
 
-	/* Search for entry to be removed from main ll */
-	ll_main_dev_cur = ll_root_used;
-	ll_main_dev_last = NULL;
-	while (ll_main_dev_cur != NULL) {
-		if (ll_main_dev_cur->vdev == vdev) {
-			break;
-		} else {
-			ll_main_dev_last = ll_main_dev_cur;
-			ll_main_dev_cur = ll_main_dev_cur->next;
+		/* Search for entry to be removed from main ll */
+		if (i == 0) {
+			ll_main_dev_cur = ll_root_used;
+			ll_main_dev_last = NULL;
+			while (ll_main_dev_cur != NULL) {
+				if (ll_main_dev_cur->vdev == vdev) {
+					break;
+				} else {
+					ll_main_dev_last = ll_main_dev_cur;
+					ll_main_dev_cur = ll_main_dev_cur->next;
+				}
+			}
 		}
-	}
 
-	/* Remove entries from the lcore and main ll. */
-	rm_data_ll_entry(&lcore_info[vdev->coreid].lcore_ll->ll_root_used, ll_lcore_dev_cur, ll_lcore_dev_last);
-	rm_data_ll_entry(&ll_root_used, ll_main_dev_cur, ll_main_dev_last);
+		/* Remove entries from the lcore and main ll. */
+		rm_data_ll_entry(&lcore_info[core_id].lcore_ll->ll_root_used, ll_lcore_dev_cur, ll_lcore_dev_last);
+		if (i == 0)
+			rm_data_ll_entry(&ll_root_used, ll_main_dev_cur, ll_main_dev_last);
 
-	/* Set the dev_removal_flag on each lcore. */
-	RTE_LCORE_FOREACH_SLAVE(lcore) {
-		lcore_info[lcore].lcore_ll->dev_removal_flag = REQUEST_DEV_REMOVAL;
-	}
+		/* Set the dev_removal_flag on each lcore. */
+		RTE_LCORE_FOREACH_SLAVE(lcore) {
+			lcore_info[lcore].lcore_ll->dev_removal_flag = REQUEST_DEV_REMOVAL;
+		}
 
-	/*
-	 * Once each core has set the dev_removal_flag to ACK_DEV_REMOVAL we can be sure that
-	 * they can no longer access the device removed from the linked lists and that the devices
-	 * are no longer in use.
-	 */
-	RTE_LCORE_FOREACH_SLAVE(lcore) {
-		while (lcore_info[lcore].lcore_ll->dev_removal_flag != ACK_DEV_REMOVAL) {
-			rte_pause();
+		/*
+		 * Once each core has set the dev_removal_flag to ACK_DEV_REMOVAL we can be sure that
+		 * they can no longer access the device removed from the linked lists and that the devices
+		 * are no longer in use.
+		 */
+		RTE_LCORE_FOREACH_SLAVE(lcore) {
+			while (lcore_info[lcore].lcore_ll->dev_removal_flag != ACK_DEV_REMOVAL)
+				rte_pause();
 		}
-	}
 
-	/* Add the entries back to the lcore and main free ll.*/
-	put_data_ll_free_entry(&lcore_info[vdev->coreid].lcore_ll->ll_root_free, ll_lcore_dev_cur);
-	put_data_ll_free_entry(&ll_root_free, ll_main_dev_cur);
+		/* Add the entries back to the lcore and main free ll.*/
+		put_data_ll_free_entry(&lcore_info[core_id].lcore_ll->ll_root_free, ll_lcore_dev_cur);
 
-	/* Decrement number of device on the lcore. */
-	lcore_info[vdev->coreid].lcore_ll->device_num--;
+		if (i == 0)
+			put_data_ll_free_entry(&ll_root_free, ll_main_dev_cur);
+
+		/* Decrement number of device on the lcore. */
+		lcore_info[core_id].lcore_ll->device_num--;
+	}
 
 	RTE_LOG(INFO, VHOST_DATA, "(%"PRIu64") Device has been removed from data core\n", dev->device_fh);
 
@@ -2846,42 +2857,44 @@ new_device (struct virtio_net *dev)
 	vdev->remove = 0;
 
 	/* Find a suitable lcore to add the device. */
-	RTE_LCORE_FOREACH_SLAVE(lcore) {
-		if (lcore_info[lcore].lcore_ll->device_num < device_num_min) {
-			device_num_min = lcore_info[lcore].lcore_ll->device_num;
-			core_add = lcore;
+	for (i = 0; i < rxq; i++) {
+		device_num_min = num_devices;
+		RTE_LCORE_FOREACH_SLAVE(lcore) {
+			if (lcore_info[lcore].lcore_ll->device_num < device_num_min) {
+				device_num_min = lcore_info[lcore].lcore_ll->device_num;
+				core_add = lcore;
+			}
 		}
-	}
-	/* Add device to lcore ll */
-	ll_dev = get_data_ll_free_entry(&lcore_info[core_add].lcore_ll->ll_root_free);
-	if (ll_dev == NULL) {
-		RTE_LOG(INFO, VHOST_DATA, "(%"PRIu64") Failed to add device to data core\n", dev->device_fh);
-		vdev->ready = DEVICE_SAFE_REMOVE;
-		destroy_device(dev);
-		rte_free(vdev->regions_hpa);
-		rte_free(vdev);
-		return -1;
-	}
-	ll_dev->vdev = vdev;
-	vdev->coreid = core_add;
+		/* Add device to lcore ll */
+		ll_dev = get_data_ll_free_entry(&lcore_info[core_add].lcore_ll->ll_root_free);
+		if (ll_dev == NULL) {
+			RTE_LOG(INFO, VHOST_DATA, "(%"PRIu64") Failed to add device to data core\n", dev->device_fh);
+			vdev->ready = DEVICE_SAFE_REMOVE;
+			destroy_device(dev);
+			rte_free(vdev->regions_hpa);
+			rte_free(vdev);
+			return -1;
+		}
+		ll_dev->vdev = vdev;
+		ll_dev->work_q_idx = i;
+		rte_vhost_core_id_set(dev, i, core_add);
+		add_data_ll_entry(&lcore_info[core_add].lcore_ll->ll_root_used, ll_dev);
 
-	add_data_ll_entry(&lcore_info[vdev->coreid].lcore_ll->ll_root_used, ll_dev);
+		/* Disable notifications. */
+		rte_vhost_enable_guest_notification(dev, i * VIRTIO_QNUM + VIRTIO_RXQ, 0);
+		rte_vhost_enable_guest_notification(dev, i * VIRTIO_QNUM + VIRTIO_TXQ, 0);
+		lcore_info[core_add].lcore_ll->device_num++;
+		RTE_LOG(INFO, VHOST_DATA, "(%"PRIu64") Device has been added to data core %d for vq: %d\n",
+			dev->device_fh, core_add, i);
+	}
 
 	/* Initialize device stats */
 	if (enable_stats)
 		memset(dev_statistics[dev->device_fh].qp_stats, 0,
 			VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MAX * sizeof(struct qp_statistics));
 
-	/* Disable notifications. */
-	for (i = 0; i < rxq; i++) {
-		rte_vhost_enable_guest_notification(dev, i * VIRTIO_QNUM + VIRTIO_RXQ, 0);
-		rte_vhost_enable_guest_notification(dev, i * VIRTIO_QNUM + VIRTIO_TXQ, 0);
-	}
-
-	lcore_info[vdev->coreid].lcore_ll->device_num++;
 	dev->flags |= VIRTIO_DEV_RUNNING;
 
-	RTE_LOG(INFO, VHOST_DATA, "(%"PRIu64") Device has been added to data core %d\n", dev->device_fh, vdev->coreid);
 
 	return 0;
 }
diff --git a/examples/vhost/main.h b/examples/vhost/main.h
index d04e2be..42336bc 100644
--- a/examples/vhost/main.h
+++ b/examples/vhost/main.h
@@ -82,8 +82,6 @@ struct vhost_dev {
 	uint16_t vmdq_rx_q;
 	/**< Vlan tag assigned to the pool */
 	uint32_t vlan_tag;
-	/**< Data core that the device is added to. */
-	uint16_t coreid;
 	/**< A device is set as ready if the MAC address has been set. */
 	volatile uint8_t ready;
 	/**< Device is marked for removal from the data core. */
@@ -94,6 +92,7 @@ struct virtio_net_data_ll
 {
 	struct vhost_dev		*vdev;	/* Pointer to device created by configuration core. */
 	struct virtio_net_data_ll	*next;  /* Pointer to next device in linked list. */
+	uint32_t work_q_idx;
 };
 
 /*
diff --git a/lib/librte_vhost/rte_virtio_net.h b/lib/librte_vhost/rte_virtio_net.h
index e16ad3a..93d3e27 100644
--- a/lib/librte_vhost/rte_virtio_net.h
+++ b/lib/librte_vhost/rte_virtio_net.h
@@ -89,6 +89,7 @@ struct vhost_virtqueue {
 	eventfd_t		callfd;			/**< Used to notify the guest (trigger interrupt). */
 	eventfd_t		kickfd;			/**< Currently unused as polling mode is enabled. */
 	uint32_t		enabled;		/**< Indicate the queue is enabled or not. */
+	uint16_t		core_id;		/**< Data core that the vq is added to. */
 	struct buf_vector	buf_vec[BUF_VECTOR_MAX];	/**< for scatter RX. */
 } __rte_cache_aligned;
 
@@ -241,8 +242,32 @@ uint16_t rte_vhost_dequeue_burst(struct virtio_net *dev, uint16_t queue_id,
 
 /**
  * This function get the queue pair number of one vhost device.
+ * @param dev
+ *  virtio-net device
  * @return
  *  num of queue pair of specified virtio device.
  */
 uint16_t rte_vhost_qp_num_get(struct virtio_net *dev);
+
+/**
+ * This function get the data core id for queue pair in one vhost device.
+ * @param dev
+ *  virtio-net device
+ * @param queue_id
+ *  virtio queue index in mq case
+ * @return
+ *  core id of queue pair of specified virtio device.
+ */
+uint16_t rte_vhost_core_id_get(volatile struct virtio_net *dev, uint16_t queue_id);
+
+/**
+ * This function set the data core id for queue pair in one vhost device.
+ * @param dev
+ *  virtio-net device
+ * @param queue_id
+ *  virtio queue index in mq case
+ * @param core_id
+ *  data core id for virtio queue pair in mq case
+ */
+void rte_vhost_core_id_set(struct virtio_net *dev, uint16_t queue_id, uint16_t core_id);
 #endif /* _VIRTIO_NET_H_ */
diff --git a/lib/librte_vhost/virtio-net.c b/lib/librte_vhost/virtio-net.c
index 24d0c53..d4c55c6 100644
--- a/lib/librte_vhost/virtio-net.c
+++ b/lib/librte_vhost/virtio-net.c
@@ -965,6 +965,28 @@ uint16_t rte_vhost_qp_num_get(struct virtio_net *dev)
 	return dev->virt_qp_nb;
 }
 
+uint16_t rte_vhost_core_id_get(volatile struct virtio_net *dev, uint16_t queue_id)
+{
+	if (dev == NULL)
+		return 0;
+
+	if (dev->virtqueue == NULL || dev->virtqueue[queue_id] == NULL)
+		return 0;
+
+	return dev->virtqueue[queue_id]->core_id;
+}
+
+void rte_vhost_core_id_set(struct virtio_net *dev, uint16_t queue_id, uint16_t core_id)
+{
+	if (dev == NULL)
+		return;
+
+	if (dev->virtqueue == NULL || dev->virtqueue[queue_id] == NULL)
+		return;
+
+	dev->virtqueue[queue_id]->core_id = core_id;
+}
+
 /*
  * Register ops so that we can add/remove device to data core.
  */
-- 
1.8.4.2

^ permalink raw reply	[flat|nested] 65+ messages in thread

* [dpdk-dev] [PATCH v4 12/12] doc: update doc for vhost multiple queues
  2015-08-12  8:02     ` [dpdk-dev] [PATCH v4 00/12] Support multiple queues in vhost Ouyang Changchun
                         ` (10 preceding siblings ...)
  2015-08-12  8:02       ` [dpdk-dev] [PATCH v4 11/12] vhost: alloc core to virtq Ouyang Changchun
@ 2015-08-12  8:02       ` Ouyang Changchun
  11 siblings, 0 replies; 65+ messages in thread
From: Ouyang Changchun @ 2015-08-12  8:02 UTC (permalink / raw)
  To: dev

Update the sample guide doc for vhost multiple queues;
Update the prog guide doc for vhost lib multiple queues feature;

Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
---
It is added since v3

 doc/guides/prog_guide/vhost_lib.rst |  38 ++++++++++++
 doc/guides/sample_app_ug/vhost.rst  | 113 ++++++++++++++++++++++++++++++++++++
 2 files changed, 151 insertions(+)

diff --git a/doc/guides/prog_guide/vhost_lib.rst b/doc/guides/prog_guide/vhost_lib.rst
index 48e1fff..6f2315d 100644
--- a/doc/guides/prog_guide/vhost_lib.rst
+++ b/doc/guides/prog_guide/vhost_lib.rst
@@ -128,6 +128,44 @@ VHOST_GET_VRING_BASE is used as the signal to remove vhost device from data plan
 
 When the socket connection is closed, vhost will destroy the device.
 
+Vhost multiple queues feature
+-----------------------------
+This feature supports multiple queues for each virtio device in vhost.
+Currently the multiple queues feature is supported only for vhost-user, not for vhost-cuse.
+
+The new QEMU patch (v6) enabling vhost-user multiple queues has already been sent out to the
+QEMU community and is in its comment-collecting stage. The patch set must be applied to QEMU,
+and QEMU rebuilt, before running vhost multiple queues:
+    http://patchwork.ozlabs.org/patch/506333/
+    http://patchwork.ozlabs.org/patch/506334/
+
+Note: the QEMU patch is based on top of 2 other patches; see the patch description for details.
+
+Vhost determines the queue pair number from the messages it exchanges with QEMU.
+
+It is strongly recommended to set the number of HW queues per pool identical to the queue number
+used to start the QEMU guest and to the queue number of the virtio port on the guest.
+
+=========================================
+==================|   |==================|
+       vport0     |   |      vport1      |
+---  ---  ---  ---|   |---  ---  ---  ---|
+q0 | q1 | q2 | q3 |   |q0 | q1 | q2 | q3 |
+/\= =/\= =/\= =/\=|   |/\= =/\= =/\= =/\=|
+||   ||   ||   ||      ||   ||   ||   ||
+||   ||   ||   ||      ||   ||   ||   ||
+||= =||= =||= =||=|   =||== ||== ||== ||=|
+q0 | q1 | q2 | q3 |   |q0 | q1 | q2 | q3 |
+------------------|   |------------------|
+     VMDq pool0   |   |    VMDq pool1    |
+==================|   |==================|
+
+On the RX side, vhost first polls each queue of the pool, gets the packets from
+it, and enqueues them into the corresponding virtqueue of the virtio device/port.
+On the TX side, it dequeues packets from each virtqueue of the virtio device/port and sends
+them to either a physical port or another virtio device according to their destination
+MAC address.
+
 Vhost supported vSwitch reference
 ---------------------------------
 
diff --git a/doc/guides/sample_app_ug/vhost.rst b/doc/guides/sample_app_ug/vhost.rst
index 730b9da..e7dfe70 100644
--- a/doc/guides/sample_app_ug/vhost.rst
+++ b/doc/guides/sample_app_ug/vhost.rst
@@ -514,6 +514,13 @@ It is enabled by default.
 
     user@target:~$ ./build/app/vhost-switch -c f -n 4 --huge-dir /mnt/huge -- --vlan-strip [0, 1]
 
+**rxq.**
+The rxq option specifies the RX queue number per VMDq pool; the default is 1.
+
+.. code-block:: console
+
+    user@target:~$ ./build/app/vhost-switch -c f -n 4 --huge-dir /mnt/huge -- --rxq [1, 2, 4]
+
 Running the Virtual Machine (QEMU)
 ----------------------------------
 
@@ -833,3 +840,109 @@ For example:
 The above message indicates that device 0 has been registered with MAC address cc:bb:bb:bb:bb:bb and VLAN tag 1000.
 Any packets received on the NIC with these values are placed on the device's receive queue.
 When a virtio-net device transmits packets, the VLAN tag is added to the packet by the DPDK vhost sample code.
+
+Vhost multiple queues
+---------------------
+
+This feature supports multiple queues for each virtio device in vhost.
+Currently the multiple queues feature is supported only for vhost-user, not for vhost-cuse.
+
+The new QEMU patch (v6) enabling vhost-user multiple queues has already been sent out to the
+QEMU community and is in its comment-collecting stage. The patch set must be applied to QEMU,
+and QEMU rebuilt, before running vhost multiple queues:
+    http://patchwork.ozlabs.org/patch/506333/
+    http://patchwork.ozlabs.org/patch/506334/
+
+Note: the QEMU patch is based on top of 2 other patches; see the patch description for details.
+
+Basically the vhost sample leverages VMDq+RSS in HW to receive packets and distribute them
+into different queues in the pool according to their 5-tuples.
+
+On the other hand, vhost determines the queue pair number from the messages it exchanges with
+QEMU.
+
+It is strongly recommended to set the number of HW queues per pool identical to the queue number
+used to start the QEMU guest and to the queue number of the virtio port on the guest.
+E.g. '--rxq 4' sets the queue number to 4: there are 4 HW queues in each VMDq pool and 4 queues
+in each vhost device/port, and every queue in the pool maps to one queue in the vhost device.
+
+=========================================
+==================|   |==================|
+       vport0     |   |      vport1      |
+---  ---  ---  ---|   |---  ---  ---  ---|
+q0 | q1 | q2 | q3 |   |q0 | q1 | q2 | q3 |
+/\= =/\= =/\= =/\=|   |/\= =/\= =/\= =/\=|
+||   ||   ||   ||      ||   ||   ||   ||
+||   ||   ||   ||      ||   ||   ||   ||
+||= =||= =||= =||=|   =||== ||== ||== ||=|
+q0 | q1 | q2 | q3 |   |q0 | q1 | q2 | q3 |
+------------------|   |------------------|
+     VMDq pool0   |   |    VMDq pool1    |
+==================|   |==================|
+
+On the RX side, vhost first polls each queue of the pool, gets the packets from
+it, and enqueues them into the corresponding virtqueue of the virtio device/port.
+On the TX side, it dequeues packets from each virtqueue of the virtio device/port and sends
+them to either a physical port or another virtio device according to their destination
+MAC address.
+
+
+Test guidance
+~~~~~~~~~~~~~
+
+#.  On the host, first mount hugepages, load the uio and igb_uio modules, and bind one NIC to
+    igb_uio; then run the vhost sample. Key steps as follows:
+
+.. code-block:: console
+
+    sudo mount -t hugetlbfs nodev /mnt/huge
+    sudo modprobe uio
+    sudo insmod $RTE_SDK/$RTE_TARGET/kmod/igb_uio.ko
+
+    $RTE_SDK/tools/dpdk_nic_bind.py --bind igb_uio 0000:08:00.0
+    sudo $RTE_SDK/examples/vhost/build/vhost-switch -c 0xf0 -n 4 --huge-dir \
+    /mnt/huge --socket-mem 1024,0 -- -p 1 --vm2vm 0 --dev-basename usvhost --rxq 2
+
+.. note::
+
+    Use '--stats 1' to enable on-screen statistics dumping for vhost.
+
+#.  After step 1, on the host, modprobe kvm and kvm_intel, and use the QEMU command line to start one guest:
+
+.. code-block:: console
+
+    modprobe kvm
+    modprobe kvm_intel
+    sudo mount -t hugetlbfs nodev /dev/hugepages -o pagesize=1G
+
+    $QEMU_PATH/qemu-system-x86_64 -enable-kvm -m 4096 \
+    -object memory-backend-file,id=mem,size=4096M,mem-path=/dev/hugepages,share=on \
+    -numa node,memdev=mem -mem-prealloc -smp 10 -cpu core2duo,+sse3,+sse4.1,+sse4.2 \
+    -name <vm-name> -drive file=<img-path>/vm.img \
+    -chardev socket,id=char0,path=<usvhost-path>/usvhost \
+    -netdev type=vhost-user,id=hostnet2,chardev=char0,vhostforce=on,queues=2 \
+    -device virtio-net-pci,mq=on,vectors=6,netdev=hostnet2,id=net2,mac=52:54:00:12:34:56,csum=off,gso=off,guest_tso4=off,guest_tso6=off,guest_ecn=off \
+    -chardev socket,id=char1,path=<usvhost-path>/usvhost \
+    -netdev type=vhost-user,id=hostnet3,chardev=char1,vhostforce=on,queues=2 \
+    -device virtio-net-pci,mq=on,vectors=6,netdev=hostnet3,id=net3,mac=52:54:00:12:34:57,csum=off,gso=off,guest_tso4=off,guest_tso6=off,guest_ecn=off
+
+#.  Log in to the guest and use testpmd (DPDK-based) to test, using multiple virtio queues to RX and TX packets.
+
+.. code-block:: console
+
+    modprobe uio
+    insmod $RTE_SDK/$RTE_TARGET/kmod/igb_uio.ko
+    echo 1024 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages
+    ./tools/dpdk_nic_bind.py --bind igb_uio 00:03.0 00:04.0
+
+    $RTE_SDK/$RTE_TARGET/app/testpmd -c 1f -n 4 -- --rxq=2 --txq=2 --nb-cores=4 \
+    --rx-queue-stats-mapping="(0,0,0),(0,1,1),(1,0,2),(1,1,3)" \
+    --tx-queue-stats-mapping="(0,0,0),(0,1,1),(1,0,2),(1,1,3)" -i --disable-hw-vlan --txqflags 0xf00
+
+    set fwd mac
+    start tx_first
+
+#.  Use a packet generator to send packets with destination MAC 52:54:00:12:34:57 and VLAN
+    tag 1001; select IPv4 as the protocol and continuously incremented IP addresses.
+
+#.  Testpmd on the guest can display the packets received/transmitted on both queues of each virtio port.
-- 
1.8.4.2
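
To make the RX/TX description in the doc above concrete, here is a simplified sketch of one
polling iteration, assuming the 1:1 mapping between HW queue q of a VMDq pool and virtqueue
pair q of the matching vhost device. Names such as pool_base_queue are hypothetical, and the
real sample also performs MAC learning, VLAN handling and the actual forwarding:

    #include <rte_ethdev.h>
    #include <rte_mbuf.h>
    #include <rte_virtio_net.h>

    #define MAX_PKT_BURST 32

    static void
    switch_one_queue(struct virtio_net *vdev, uint8_t port,
                     uint16_t pool_base_queue, uint16_t q,
                     struct rte_mempool *mbuf_pool)
    {
            struct rte_mbuf *pkts[MAX_PKT_BURST];
            uint16_t nb_rx, nb_tx, i;

            /* RX side: poll HW queue q of the pool and enqueue the packets
             * into virtqueue 2*q (the RX virtqueue) of the vhost device. */
            nb_rx = rte_eth_rx_burst(port, pool_base_queue + q,
                            pkts, MAX_PKT_BURST);
            if (nb_rx) {
                    rte_vhost_enqueue_burst(vdev,
                                    q * VIRTIO_QNUM + VIRTIO_RXQ, pkts, nb_rx);
                    /* The enqueue copies into the guest ring, so the mbufs
                     * can be freed here (any dropped packets included). */
                    for (i = 0; i < nb_rx; i++)
                            rte_pktmbuf_free(pkts[i]);
            }

            /* TX side: dequeue from virtqueue 2*q + 1 (the TX virtqueue);
             * the destination MAC lookup and the send itself are elided. */
            nb_tx = rte_vhost_dequeue_burst(vdev,
                            q * VIRTIO_QNUM + VIRTIO_TXQ,
                            mbuf_pool, pkts, MAX_PKT_BURST);
            (void)nb_tx;
    }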

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [dpdk-dev] [PATCH v4 01/12] ixgbe: support VMDq RSS in non-SRIOV environment
  2015-08-12  8:02       ` [dpdk-dev] [PATCH v4 01/12] ixgbe: support VMDq RSS in non-SRIOV environment Ouyang Changchun
@ 2015-08-12  8:22         ` Vincent JARDIN
  0 siblings, 0 replies; 65+ messages in thread
From: Vincent JARDIN @ 2015-08-12  8:22 UTC (permalink / raw)
  To: Ouyang Changchun; +Cc: dev

On 12/08/2015 10:02, Ouyang Changchun wrote:
> +#define VMDQ_RSS_RX_QUEUE_NUM_MAX 4
> +
> +static int
> +rte_eth_dev_check_vmdq_rss_rxq_num(__rte_unused uint8_t port_id, uint16_t nb_rx_q)
> +{
> +	if (nb_rx_q > VMDQ_RSS_RX_QUEUE_NUM_MAX)
> +		return -EINVAL;
> +	return 0;
> +}
> +

it is an ixgbe limitation, so it should not be included into
librte_ether/rte_ethdev.c
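
For reference, a hypothetical sketch of what keeping that limit inside the ixgbe PMD could
look like, e.g. called from its dev_configure path; the names are illustrative, not actual
driver code:

    #include <errno.h>
    #include <stdint.h>

    /* ixgbe supports at most 4 RX queues per pool in VMDq+RSS mode. */
    #define IXGBE_VMDQ_RSS_RX_QUEUE_NUM_MAX 4

    static int
    ixgbe_check_vmdq_rss_rxq_num(uint16_t nb_rx_q)
    {
            if (nb_rx_q > IXGBE_VMDQ_RSS_RX_QUEUE_NUM_MAX)
                    return -EINVAL;
            return 0;
    }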

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [dpdk-dev] [PATCH v4 03/12] vhost: update version map file
  2015-08-12  8:02       ` [dpdk-dev] [PATCH v4 03/12] vhost: update version map file Ouyang Changchun
@ 2015-08-12  8:24         ` Panu Matilainen
  0 siblings, 0 replies; 65+ messages in thread
From: Panu Matilainen @ 2015-08-12  8:24 UTC (permalink / raw)
  To: Ouyang Changchun, dev

On 08/12/2015 11:02 AM, Ouyang Changchun wrote:
> From: Changchun Ouyang <changchun.ouyang@intel.com>
>
> it is added in v4.
>
> Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
> ---
>   lib/librte_vhost/rte_vhost_version.map | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/lib/librte_vhost/rte_vhost_version.map b/lib/librte_vhost/rte_vhost_version.map
> index 3d8709e..0bb1c0f 100644
> --- a/lib/librte_vhost/rte_vhost_version.map
> +++ b/lib/librte_vhost/rte_vhost_version.map
> @@ -18,5 +18,5 @@ DPDK_2.1 {
>   	global:
>
>   	rte_vhost_driver_unregister;
> -
> +	rte_vhost_qp_num_get;
>   } DPDK_2.0;
>

Version map needs to be updated along with the actual code (in this 
case, the function is added in the second patch of the series). 
Otherwise there will be at least one commit where shared library 
configuration will be incorrect and might not be buildable at all.

- Panu -

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [dpdk-dev] [PATCH v4 02/12] vhost: support multiple queues in virtio dev
  2015-08-12  8:02       ` [dpdk-dev] [PATCH v4 02/12] vhost: support multiple queues in virtio dev Ouyang Changchun
@ 2015-08-13 12:52         ` Flavio Leitner
  2015-08-14  2:29           ` Ouyang, Changchun
  2015-08-19  3:52         ` Yuanhan Liu
  2015-09-03  2:27         ` Tetsuya Mukawa
  2 siblings, 1 reply; 65+ messages in thread
From: Flavio Leitner @ 2015-08-13 12:52 UTC (permalink / raw)
  To: Ouyang Changchun; +Cc: dev

On Wed, Aug 12, 2015 at 04:02:37PM +0800, Ouyang Changchun wrote:
> Each virtio device could have multiple queues, say 2 or 4, at most 8.
> Enabling this feature allows a virtio device/port on the guest to use
> different vCPUs to receive/transmit packets from/to each queue.
> 
> In multiple queues mode, virtio device readiness means all queues of
> this virtio device are ready; cleanup/destroy of a virtio device also
> requires clearing all queues belonging to it.
> 
> Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
> ---
> Changes in v4:
>   - rebase and fix conflicts
>   - resolve comments
>   - init each virtq pair if mq is on
> 
> Changes in v3:
>   - fix coding style
>   - check virtqueue idx validity
> 
> Changes in v2:
>   - remove the q_num_set api
>   - add the qp_num_get api
>   - determine the queue pair num from qemu message
>   - rework for reset owner message handler
>   - dynamically alloc mem for dev virtqueue
>   - queue pair num could be 0x8000
>   - fix checkpatch errors
> 
>  lib/librte_vhost/rte_virtio_net.h             |  10 +-
>  lib/librte_vhost/vhost-net.h                  |   1 +
>  lib/librte_vhost/vhost_rxtx.c                 |  52 +++++---
>  lib/librte_vhost/vhost_user/vhost-net-user.c  |   4 +-
>  lib/librte_vhost/vhost_user/virtio-net-user.c |  76 +++++++++---
>  lib/librte_vhost/vhost_user/virtio-net-user.h |   2 +
>  lib/librte_vhost/virtio-net.c                 | 165 +++++++++++++++++---------
>  7 files changed, 222 insertions(+), 88 deletions(-)
> 
> diff --git a/lib/librte_vhost/rte_virtio_net.h b/lib/librte_vhost/rte_virtio_net.h
> index b9bf320..d9e887f 100644
> --- a/lib/librte_vhost/rte_virtio_net.h
> +++ b/lib/librte_vhost/rte_virtio_net.h
> @@ -59,7 +59,6 @@ struct rte_mbuf;
>  /* Backend value set by guest. */
>  #define VIRTIO_DEV_STOPPED -1
>  
> -
>  /* Enum for virtqueue management. */
>  enum {VIRTIO_RXQ, VIRTIO_TXQ, VIRTIO_QNUM};
>  
> @@ -96,13 +95,14 @@ struct vhost_virtqueue {
>   * Device structure contains all configuration information relating to the device.
>   */
>  struct virtio_net {
> -	struct vhost_virtqueue	*virtqueue[VIRTIO_QNUM];	/**< Contains all virtqueue information. */
>  	struct virtio_memory	*mem;		/**< QEMU memory and memory region information. */
> +	struct vhost_virtqueue	**virtqueue;    /**< Contains all virtqueue information. */
>  	uint64_t		features;	/**< Negotiated feature set. */
>  	uint64_t		device_fh;	/**< device identifier. */
>  	uint32_t		flags;		/**< Device flags. Only used to check if device is running on data core. */
>  #define IF_NAME_SZ (PATH_MAX > IFNAMSIZ ? PATH_MAX : IFNAMSIZ)
>  	char			ifname[IF_NAME_SZ];	/**< Name of the tap device or socket path. */
> +	uint32_t		virt_qp_nb;
>  	void			*priv;		/**< private context */
>  } __rte_cache_aligned;
>  
> @@ -235,4 +235,10 @@ uint16_t rte_vhost_enqueue_burst(struct virtio_net *dev, uint16_t queue_id,
>  uint16_t rte_vhost_dequeue_burst(struct virtio_net *dev, uint16_t queue_id,
>  	struct rte_mempool *mbuf_pool, struct rte_mbuf **pkts, uint16_t count);
>  
> +/**
> + * This function get the queue pair number of one vhost device.
> + * @return
> + *  num of queue pair of specified virtio device.
> + */
> +uint16_t rte_vhost_qp_num_get(struct virtio_net *dev);
>  #endif /* _VIRTIO_NET_H_ */
> diff --git a/lib/librte_vhost/vhost-net.h b/lib/librte_vhost/vhost-net.h
> index c69b60b..7dff14d 100644
> --- a/lib/librte_vhost/vhost-net.h
> +++ b/lib/librte_vhost/vhost-net.h
> @@ -115,4 +115,5 @@ struct vhost_net_device_ops {
>  
>  
>  struct vhost_net_device_ops const *get_virtio_net_callbacks(void);
> +int alloc_vring_queue_pair(struct virtio_net *dev, uint16_t qp_idx);
>  #endif /* _VHOST_NET_CDEV_H_ */
> diff --git a/lib/librte_vhost/vhost_rxtx.c b/lib/librte_vhost/vhost_rxtx.c
> index 0d07338..db4ad88 100644
> --- a/lib/librte_vhost/vhost_rxtx.c
> +++ b/lib/librte_vhost/vhost_rxtx.c
> @@ -43,6 +43,18 @@
>  #define MAX_PKT_BURST 32
>  
>  /**
> + * Check the virtqueue idx validity,
> + * return 1 if pass, otherwise 0.
> + */
> +static inline uint8_t __attribute__((always_inline))
> +check_virtqueue_idx(uint16_t virtq_idx, uint8_t is_tx, uint32_t virtq_num)
> +{
> +	if ((is_tx ^ (virtq_idx & 0x1)) || (virtq_idx >= virtq_num))
> +		return 0;
> +	return 1;
> +}
> +
> +/**
>   * This function adds buffers to the virtio devices RX virtqueue. Buffers can
>   * be received from the physical port or from another virtio device. A packet
>  * count is returned to indicate the number of packets that are successfully
> @@ -68,12 +80,15 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,
>  	uint8_t success = 0;
>  
>  	LOG_DEBUG(VHOST_DATA, "(%"PRIu64") virtio_dev_rx()\n", dev->device_fh);
> -	if (unlikely(queue_id != VIRTIO_RXQ)) {
> -		LOG_DEBUG(VHOST_DATA, "mq isn't supported in this version.\n");
> +	if (unlikely(check_virtqueue_idx(queue_id, 0,
> +		VIRTIO_QNUM * dev->virt_qp_nb) == 0)) {
> +		RTE_LOG(ERR, VHOST_DATA,
> +			"%s (%"PRIu64"): virtqueue idx:%d invalid.\n",
> +			 __func__, dev->device_fh, queue_id);
>  		return 0;
>  	}
>  
> -	vq = dev->virtqueue[VIRTIO_RXQ];
> +	vq = dev->virtqueue[queue_id];
>  	count = (count > MAX_PKT_BURST) ? MAX_PKT_BURST : count;
>  
>  	/*
> @@ -235,8 +250,9 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,
>  }
>  
>  static inline uint32_t __attribute__((always_inline))
> -copy_from_mbuf_to_vring(struct virtio_net *dev, uint16_t res_base_idx,
> -	uint16_t res_end_idx, struct rte_mbuf *pkt)
> +copy_from_mbuf_to_vring(struct virtio_net *dev, uint16_t queue_id,
> +	uint16_t res_base_idx, uint16_t res_end_idx,
> +	struct rte_mbuf *pkt)
>  {
>  	uint32_t vec_idx = 0;
>  	uint32_t entry_success = 0;
> @@ -264,8 +280,9 @@ copy_from_mbuf_to_vring(struct virtio_net *dev, uint16_t res_base_idx,
>  	 * Convert from gpa to vva
>  	 * (guest physical addr -> vhost virtual addr)
>  	 */
> -	vq = dev->virtqueue[VIRTIO_RXQ];
> -	vb_addr = gpa_to_vva(dev, vq->buf_vec[vec_idx].buf_addr);
> +	vq = dev->virtqueue[queue_id];
> +	vb_addr =
> +		gpa_to_vva(dev, vq->buf_vec[vec_idx].buf_addr);
>  	vb_hdr_addr = vb_addr;
>  
>  	/* Prefetch buffer address. */
> @@ -464,11 +481,15 @@ virtio_dev_merge_rx(struct virtio_net *dev, uint16_t queue_id,
>  
>  	LOG_DEBUG(VHOST_DATA, "(%"PRIu64") virtio_dev_merge_rx()\n",
>  		dev->device_fh);
> -	if (unlikely(queue_id != VIRTIO_RXQ)) {
> -		LOG_DEBUG(VHOST_DATA, "mq isn't supported in this version.\n");
> +	if (unlikely(check_virtqueue_idx(queue_id, 0,
> +		VIRTIO_QNUM * dev->virt_qp_nb) == 0)) {
> +		RTE_LOG(ERR, VHOST_DATA,
> +			"%s (%"PRIu64"): virtqueue idx:%d invalid.\n",
> +			 __func__, dev->device_fh, queue_id);
> +		return 0;
>  	}
>  
> -	vq = dev->virtqueue[VIRTIO_RXQ];
> +	vq = dev->virtqueue[queue_id];
>  	count = RTE_MIN((uint32_t)MAX_PKT_BURST, count);
>  
>  	if (count == 0)
> @@ -509,7 +530,7 @@ virtio_dev_merge_rx(struct virtio_net *dev, uint16_t queue_id,
>  							res_cur_idx);
>  		} while (success == 0);
>  
> -		entry_success = copy_from_mbuf_to_vring(dev, res_base_idx,
> +		entry_success = copy_from_mbuf_to_vring(dev, queue_id, res_base_idx,
>  			res_cur_idx, pkts[pkt_idx]);
>  
>  		rte_compiler_barrier();
> @@ -559,12 +580,15 @@ rte_vhost_dequeue_burst(struct virtio_net *dev, uint16_t queue_id,
>  	uint16_t free_entries, entry_success = 0;
>  	uint16_t avail_idx;
>  
> -	if (unlikely(queue_id != VIRTIO_TXQ)) {
> -		LOG_DEBUG(VHOST_DATA, "mq isn't supported in this version.\n");
> +	if (unlikely(check_virtqueue_idx(queue_id, 1,
> +		VIRTIO_QNUM * dev->virt_qp_nb) == 0)) {
> +		RTE_LOG(ERR, VHOST_DATA,
> +			"%s (%"PRIu64"): virtqueue idx:%d invalid.\n",
> +			 __func__, dev->device_fh, queue_id);
>  		return 0;
>  	}
>  
> -	vq = dev->virtqueue[VIRTIO_TXQ];
> +	vq = dev->virtqueue[queue_id];
>  	avail_idx =  *((volatile uint16_t *)&vq->avail->idx);
>  
>  	/* If there are no available buffers then return. */
> diff --git a/lib/librte_vhost/vhost_user/vhost-net-user.c b/lib/librte_vhost/vhost_user/vhost-net-user.c
> index f406a94..3d7c373 100644
> --- a/lib/librte_vhost/vhost_user/vhost-net-user.c
> +++ b/lib/librte_vhost/vhost_user/vhost-net-user.c
> @@ -383,7 +383,9 @@ vserver_message_handler(int connfd, void *dat, int *remove)
>  		ops->set_owner(ctx);
>  		break;
>  	case VHOST_USER_RESET_OWNER:
> -		ops->reset_owner(ctx);
> +		RTE_LOG(INFO, VHOST_CONFIG,
> +			"(%"PRIu64") VHOST_NET_RESET_OWNER\n", ctx.fh);
> +		user_reset_owner(ctx, &msg.payload.state);
>  		break;
>  
>  	case VHOST_USER_SET_MEM_TABLE:
> diff --git a/lib/librte_vhost/vhost_user/virtio-net-user.c b/lib/librte_vhost/vhost_user/virtio-net-user.c
> index c1ffc38..4c1d4df 100644
> --- a/lib/librte_vhost/vhost_user/virtio-net-user.c
> +++ b/lib/librte_vhost/vhost_user/virtio-net-user.c
> @@ -209,30 +209,46 @@ static int
>  virtio_is_ready(struct virtio_net *dev)
>  {
>  	struct vhost_virtqueue *rvq, *tvq;
> +	uint32_t q_idx;
>  
>  	/* mq support in future.*/
> -	rvq = dev->virtqueue[VIRTIO_RXQ];
> -	tvq = dev->virtqueue[VIRTIO_TXQ];
> -	if (rvq && tvq && rvq->desc && tvq->desc &&
> -		(rvq->kickfd != (eventfd_t)-1) &&
> -		(rvq->callfd != (eventfd_t)-1) &&
> -		(tvq->kickfd != (eventfd_t)-1) &&
> -		(tvq->callfd != (eventfd_t)-1)) {
> -		RTE_LOG(INFO, VHOST_CONFIG,
> -			"virtio is now ready for processing.\n");
> -		return 1;
> +	for (q_idx = 0; q_idx < dev->virt_qp_nb; q_idx++) {
> +		uint32_t virt_rx_q_idx = q_idx * VIRTIO_QNUM + VIRTIO_RXQ;
> +		uint32_t virt_tx_q_idx = q_idx * VIRTIO_QNUM + VIRTIO_TXQ;
> +
> +		rvq = dev->virtqueue[virt_rx_q_idx];
> +		tvq = dev->virtqueue[virt_tx_q_idx];
> +		if ((rvq == NULL) || (tvq == NULL) ||
> +			(rvq->desc == NULL) || (tvq->desc == NULL) ||
> +			(rvq->kickfd == (eventfd_t)-1) ||
> +			(rvq->callfd == (eventfd_t)-1) ||
> +			(tvq->kickfd == (eventfd_t)-1) ||
> +			(tvq->callfd == (eventfd_t)-1)) {
> +			RTE_LOG(INFO, VHOST_CONFIG,
> +				"virtio isn't ready for processing.\n");
> +			return 0;
> +		}
>  	}
>  	RTE_LOG(INFO, VHOST_CONFIG,
> -		"virtio isn't ready for processing.\n");
> -	return 0;
> +		"virtio is now ready for processing.\n");
> +	return 1;
>  }
>  
>  void
>  user_set_vring_call(struct vhost_device_ctx ctx, struct VhostUserMsg *pmsg)
>  {
>  	struct vhost_vring_file file;
> +	struct virtio_net *dev = get_device(ctx);
> +	uint32_t cur_qp_idx;
>  
>  	file.index = pmsg->payload.u64 & VHOST_USER_VRING_IDX_MASK;
> +	cur_qp_idx = file.index >> 1;
> +
> +	if (dev->virt_qp_nb < cur_qp_idx + 1) {
> +		if (alloc_vring_queue_pair(dev, cur_qp_idx) == 0)
> +			dev->virt_qp_nb = cur_qp_idx + 1;

Looks like it is missing vring initialization here.

	if (dev->virt_qp_nb < cur_qp_idx + 1) {
		if (alloc_vring_queue_pair(dev, cur_qp_idx) == 0) {
			dev->virt_qp_nb = cur_qp_idx + 1;
			init_vring_queue_pair(dev, cur_qp_idx);
		}
	}


fbl


> +	}
> +
>  	if (pmsg->payload.u64 & VHOST_USER_VRING_NOFD_MASK)
>  		file.fd = -1;
>  	else
> @@ -290,13 +306,37 @@ user_get_vring_base(struct vhost_device_ctx ctx,
>  	 * sent and only sent in vhost_vring_stop.
>  	 * TODO: cleanup the vring, it isn't usable since here.
>  	 */
> -	if (((int)dev->virtqueue[VIRTIO_RXQ]->kickfd) >= 0) {
> -		close(dev->virtqueue[VIRTIO_RXQ]->kickfd);
> -		dev->virtqueue[VIRTIO_RXQ]->kickfd = (eventfd_t)-1;
> +	if (((int)dev->virtqueue[state->index]->kickfd) >= 0) {
> +		close(dev->virtqueue[state->index]->kickfd);
> +		dev->virtqueue[state->index]->kickfd = (eventfd_t)-1;
>  	}
> -	if (((int)dev->virtqueue[VIRTIO_TXQ]->kickfd) >= 0) {
> -		close(dev->virtqueue[VIRTIO_TXQ]->kickfd);
> -		dev->virtqueue[VIRTIO_TXQ]->kickfd = (eventfd_t)-1;
> +
> +	return 0;
> +}
> +
> +/*
> + * when virtio is stopped, qemu will send us the RESET_OWNER message.
> + */
> +int
> +user_reset_owner(struct vhost_device_ctx ctx,
> +	struct vhost_vring_state *state)
> +{
> +	struct virtio_net *dev = get_device(ctx);
> +
> +	/* We have to stop the queue (virtio) if it is running. */
> +	if (dev->flags & VIRTIO_DEV_RUNNING)
> +		notify_ops->destroy_device(dev);
> +
> +	RTE_LOG(INFO, VHOST_CONFIG,
> +		"reset owner --- state idx:%d state num:%d\n", state->index, state->num);
> +	/*
> +	 * Based on current qemu vhost-user implementation, this message is
> +	 * sent and only sent in vhost_net_stop_one.
> +	 * TODO: cleanup the vring, it isn't usable since here.
> +	 */
> +	if (((int)dev->virtqueue[state->index]->kickfd) >= 0) {
> +		close(dev->virtqueue[state->index]->kickfd);
> +		dev->virtqueue[state->index]->kickfd = (eventfd_t)-1;
>  	}
>  
>  	return 0;
> diff --git a/lib/librte_vhost/vhost_user/virtio-net-user.h b/lib/librte_vhost/vhost_user/virtio-net-user.h
> index df24860..2429836 100644
> --- a/lib/librte_vhost/vhost_user/virtio-net-user.h
> +++ b/lib/librte_vhost/vhost_user/virtio-net-user.h
> @@ -46,4 +46,6 @@ void user_set_vring_kick(struct vhost_device_ctx, struct VhostUserMsg *);
>  int user_get_vring_base(struct vhost_device_ctx, struct vhost_vring_state *);
>  
>  void user_destroy_device(struct vhost_device_ctx);
> +
> +int user_reset_owner(struct vhost_device_ctx ctx, struct vhost_vring_state *state);
>  #endif
> diff --git a/lib/librte_vhost/virtio-net.c b/lib/librte_vhost/virtio-net.c
> index b520ec5..2a4b791 100644
> --- a/lib/librte_vhost/virtio-net.c
> +++ b/lib/librte_vhost/virtio-net.c
> @@ -71,9 +71,10 @@ static struct virtio_net_config_ll *ll_root;
>  #define VHOST_SUPPORTED_FEATURES ((1ULL << VIRTIO_NET_F_MRG_RXBUF) | \
>  				(1ULL << VIRTIO_NET_F_CTRL_VQ) | \
>  				(1ULL << VIRTIO_NET_F_CTRL_RX) | \
> -				(1ULL << VHOST_F_LOG_ALL))
> -static uint64_t VHOST_FEATURES = VHOST_SUPPORTED_FEATURES;
> +				(1ULL << VHOST_F_LOG_ALL) | \
> +				(1ULL << VIRTIO_NET_F_MQ))
>  
> +static uint64_t VHOST_FEATURES = VHOST_SUPPORTED_FEATURES;
>  
>  /*
>   * Converts QEMU virtual address to Vhost virtual address. This function is
> @@ -182,6 +183,8 @@ add_config_ll_entry(struct virtio_net_config_ll *new_ll_dev)
>  static void
>  cleanup_device(struct virtio_net *dev)
>  {
> +	uint32_t qp_idx;
> +
>  	/* Unmap QEMU memory file if mapped. */
>  	if (dev->mem) {
>  		munmap((void *)(uintptr_t)dev->mem->mapped_address,
> @@ -190,14 +193,18 @@ cleanup_device(struct virtio_net *dev)
>  	}
>  
>  	/* Close any event notifiers opened by device. */
> -	if ((int)dev->virtqueue[VIRTIO_RXQ]->callfd >= 0)
> -		close((int)dev->virtqueue[VIRTIO_RXQ]->callfd);
> -	if ((int)dev->virtqueue[VIRTIO_RXQ]->kickfd >= 0)
> -		close((int)dev->virtqueue[VIRTIO_RXQ]->kickfd);
> -	if ((int)dev->virtqueue[VIRTIO_TXQ]->callfd >= 0)
> -		close((int)dev->virtqueue[VIRTIO_TXQ]->callfd);
> -	if ((int)dev->virtqueue[VIRTIO_TXQ]->kickfd >= 0)
> -		close((int)dev->virtqueue[VIRTIO_TXQ]->kickfd);
> +	for (qp_idx = 0; qp_idx < dev->virt_qp_nb; qp_idx++) {
> +		uint32_t virt_rx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_RXQ;
> +		uint32_t virt_tx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_TXQ;
> +		if ((int)dev->virtqueue[virt_rx_q_idx]->callfd >= 0)
> +			close((int)dev->virtqueue[virt_rx_q_idx]->callfd);
> +		if ((int)dev->virtqueue[virt_rx_q_idx]->kickfd >= 0)
> +			close((int)dev->virtqueue[virt_rx_q_idx]->kickfd);
> +		if ((int)dev->virtqueue[virt_tx_q_idx]->callfd >= 0)
> +			close((int)dev->virtqueue[virt_tx_q_idx]->callfd);
> +		if ((int)dev->virtqueue[virt_tx_q_idx]->kickfd >= 0)
> +			close((int)dev->virtqueue[virt_tx_q_idx]->kickfd);
> +	}
>  }
>  
>  /*
> @@ -206,9 +213,17 @@ cleanup_device(struct virtio_net *dev)
>  static void
>  free_device(struct virtio_net_config_ll *ll_dev)
>  {
> -	/* Free any malloc'd memory */
> -	rte_free(ll_dev->dev.virtqueue[VIRTIO_RXQ]);
> -	rte_free(ll_dev->dev.virtqueue[VIRTIO_TXQ]);
> +	uint32_t qp_idx;
> +
> +	/*
> +	 * Free any malloc'd memory.
> +	 */
> +	/* Free every queue pair. */
> +	for (qp_idx = 0; qp_idx < ll_dev->dev.virt_qp_nb; qp_idx++) {
> +		uint32_t virt_rx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_RXQ;
> +		rte_free(ll_dev->dev.virtqueue[virt_rx_q_idx]);
> +	}
> +	rte_free(ll_dev->dev.virtqueue);
>  	rte_free(ll_dev);
>  }
>  
> @@ -242,6 +257,27 @@ rm_config_ll_entry(struct virtio_net_config_ll *ll_dev,
>  }
>  
>  /*
> + *  Initialise all variables in vring queue pair.
> + */
> +static void
> +init_vring_queue_pair(struct virtio_net *dev, uint16_t qp_idx)
> +{
> +	uint32_t virt_rx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_RXQ;
> +	uint32_t virt_tx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_TXQ;
> +	memset(dev->virtqueue[virt_rx_q_idx], 0, sizeof(struct vhost_virtqueue));
> +	memset(dev->virtqueue[virt_tx_q_idx], 0, sizeof(struct vhost_virtqueue));
> +
> +	dev->virtqueue[virt_rx_q_idx]->kickfd = (eventfd_t)-1;
> +	dev->virtqueue[virt_rx_q_idx]->callfd = (eventfd_t)-1;
> +	dev->virtqueue[virt_tx_q_idx]->kickfd = (eventfd_t)-1;
> +	dev->virtqueue[virt_tx_q_idx]->callfd = (eventfd_t)-1;
> +
> +	/* Backends are set to -1 indicating an inactive device. */
> +	dev->virtqueue[virt_rx_q_idx]->backend = VIRTIO_DEV_STOPPED;
> +	dev->virtqueue[virt_tx_q_idx]->backend = VIRTIO_DEV_STOPPED;
> +}
> +
> +/*
>   *  Initialise all variables in device structure.
>   */
>  static void
> @@ -258,17 +294,34 @@ init_device(struct virtio_net *dev)
>  	/* Set everything to 0. */
>  	memset((void *)(uintptr_t)((uint64_t)(uintptr_t)dev + vq_offset), 0,
>  		(sizeof(struct virtio_net) - (size_t)vq_offset));
> -	memset(dev->virtqueue[VIRTIO_RXQ], 0, sizeof(struct vhost_virtqueue));
> -	memset(dev->virtqueue[VIRTIO_TXQ], 0, sizeof(struct vhost_virtqueue));
>  
> -	dev->virtqueue[VIRTIO_RXQ]->kickfd = (eventfd_t)-1;
> -	dev->virtqueue[VIRTIO_RXQ]->callfd = (eventfd_t)-1;
> -	dev->virtqueue[VIRTIO_TXQ]->kickfd = (eventfd_t)-1;
> -	dev->virtqueue[VIRTIO_TXQ]->callfd = (eventfd_t)-1;
> +	init_vring_queue_pair(dev, 0);
> +	dev->virt_qp_nb = 1;
> +}
> +
> +/*
> + *  Alloc mem for vring queue pair.
> + */
> +int
> +alloc_vring_queue_pair(struct virtio_net *dev, uint16_t qp_idx)
> +{
> +	struct vhost_virtqueue *virtqueue = NULL;
> +	uint32_t virt_rx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_RXQ;
> +	uint32_t virt_tx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_TXQ;
>  
> -	/* Backends are set to -1 indicating an inactive device. */
> -	dev->virtqueue[VIRTIO_RXQ]->backend = VIRTIO_DEV_STOPPED;
> -	dev->virtqueue[VIRTIO_TXQ]->backend = VIRTIO_DEV_STOPPED;
> +	virtqueue = rte_malloc(NULL, sizeof(struct vhost_virtqueue) * VIRTIO_QNUM, 0);
> +	if (virtqueue == NULL) {
> +		RTE_LOG(ERR, VHOST_CONFIG,
> +			"Failed to allocate memory for virt qp:%d.\n", qp_idx);
> +		return -1;
> +	}
> +
> +	dev->virtqueue[virt_rx_q_idx] = virtqueue;
> +	dev->virtqueue[virt_tx_q_idx] = virtqueue + VIRTIO_TXQ;
> +
> +	init_vring_queue_pair(dev, qp_idx);
> +
> +	return 0;
>  }
>  
>  /*
> @@ -280,7 +333,6 @@ static int
>  new_device(struct vhost_device_ctx ctx)
>  {
>  	struct virtio_net_config_ll *new_ll_dev;
> -	struct vhost_virtqueue *virtqueue_rx, *virtqueue_tx;
>  
>  	/* Setup device and virtqueues. */
>  	new_ll_dev = rte_malloc(NULL, sizeof(struct virtio_net_config_ll), 0);
> @@ -291,28 +343,22 @@ new_device(struct vhost_device_ctx ctx)
>  		return -1;
>  	}
>  
> -	virtqueue_rx = rte_malloc(NULL, sizeof(struct vhost_virtqueue), 0);
> -	if (virtqueue_rx == NULL) {
> -		rte_free(new_ll_dev);
> +	new_ll_dev->dev.virtqueue =
> +		rte_malloc(NULL, VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MAX * sizeof(struct vhost_virtqueue *), 0);
> +	if (new_ll_dev->dev.virtqueue == NULL) {
>  		RTE_LOG(ERR, VHOST_CONFIG,
> -			"(%"PRIu64") Failed to allocate memory for rxq.\n",
> +			"(%"PRIu64") Failed to allocate memory for dev.virtqueue.\n",
>  			ctx.fh);
> +		rte_free(new_ll_dev);
>  		return -1;
>  	}
>  
> -	virtqueue_tx = rte_malloc(NULL, sizeof(struct vhost_virtqueue), 0);
> -	if (virtqueue_tx == NULL) {
> -		rte_free(virtqueue_rx);
> +	if (alloc_vring_queue_pair(&new_ll_dev->dev, 0) == -1) {
> +		rte_free(new_ll_dev->dev.virtqueue);
>  		rte_free(new_ll_dev);
> -		RTE_LOG(ERR, VHOST_CONFIG,
> -			"(%"PRIu64") Failed to allocate memory for txq.\n",
> -			ctx.fh);
>  		return -1;
>  	}
>  
> -	new_ll_dev->dev.virtqueue[VIRTIO_RXQ] = virtqueue_rx;
> -	new_ll_dev->dev.virtqueue[VIRTIO_TXQ] = virtqueue_tx;
> -
>  	/* Initialise device and virtqueues. */
>  	init_device(&new_ll_dev->dev);
>  
> @@ -396,7 +442,7 @@ set_owner(struct vhost_device_ctx ctx)
>   * Called from CUSE IOCTL: VHOST_RESET_OWNER
>   */
>  static int
> -reset_owner(struct vhost_device_ctx ctx)
> +reset_owner(__rte_unused struct vhost_device_ctx ctx)
>  {
>  	struct virtio_net_config_ll *ll_dev;
>  
> @@ -434,6 +480,7 @@ static int
>  set_features(struct vhost_device_ctx ctx, uint64_t *pu)
>  {
>  	struct virtio_net *dev;
> +	uint32_t q_idx;
>  
>  	dev = get_device(ctx);
>  	if (dev == NULL)
> @@ -445,22 +492,26 @@ set_features(struct vhost_device_ctx ctx, uint64_t *pu)
>  	dev->features = *pu;
>  
>  	/* Set the vhost_hlen depending on if VIRTIO_NET_F_MRG_RXBUF is set. */
> -	if (dev->features & (1 << VIRTIO_NET_F_MRG_RXBUF)) {
> -		LOG_DEBUG(VHOST_CONFIG,
> -			"(%"PRIu64") Mergeable RX buffers enabled\n",
> -			dev->device_fh);
> -		dev->virtqueue[VIRTIO_RXQ]->vhost_hlen =
> -			sizeof(struct virtio_net_hdr_mrg_rxbuf);
> -		dev->virtqueue[VIRTIO_TXQ]->vhost_hlen =
> -			sizeof(struct virtio_net_hdr_mrg_rxbuf);
> -	} else {
> -		LOG_DEBUG(VHOST_CONFIG,
> -			"(%"PRIu64") Mergeable RX buffers disabled\n",
> -			dev->device_fh);
> -		dev->virtqueue[VIRTIO_RXQ]->vhost_hlen =
> -			sizeof(struct virtio_net_hdr);
> -		dev->virtqueue[VIRTIO_TXQ]->vhost_hlen =
> -			sizeof(struct virtio_net_hdr);
> +	for (q_idx = 0; q_idx < dev->virt_qp_nb; q_idx++) {
> +		uint32_t virt_rx_q_idx = q_idx * VIRTIO_QNUM + VIRTIO_RXQ;
> +		uint32_t virt_tx_q_idx = q_idx * VIRTIO_QNUM + VIRTIO_TXQ;
> +		if (dev->features & (1 << VIRTIO_NET_F_MRG_RXBUF)) {
> +			LOG_DEBUG(VHOST_CONFIG,
> +				"(%"PRIu64") Mergeable RX buffers enabled\n",
> +				dev->device_fh);
> +			dev->virtqueue[virt_rx_q_idx]->vhost_hlen =
> +				sizeof(struct virtio_net_hdr_mrg_rxbuf);
> +			dev->virtqueue[virt_tx_q_idx]->vhost_hlen =
> +				sizeof(struct virtio_net_hdr_mrg_rxbuf);
> +		} else {
> +			LOG_DEBUG(VHOST_CONFIG,
> +				"(%"PRIu64") Mergeable RX buffers disabled\n",
> +				dev->device_fh);
> +			dev->virtqueue[virt_rx_q_idx]->vhost_hlen =
> +				sizeof(struct virtio_net_hdr);
> +			dev->virtqueue[virt_tx_q_idx]->vhost_hlen =
> +				sizeof(struct virtio_net_hdr);
> +		}
>  	}
>  	return 0;
>  }
> @@ -826,6 +877,14 @@ int rte_vhost_feature_enable(uint64_t feature_mask)
>  	return -1;
>  }
>  
> +uint16_t rte_vhost_qp_num_get(struct virtio_net *dev)
> +{
> +	if (dev == NULL)
> +		return 0;
> +
> +	return dev->virt_qp_nb;
> +}
> +
>  /*
>   * Register ops so that we can add/remove device to data core.
>   */
> -- 
> 1.8.4.2
> 

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: [dpdk-dev] [PATCH v4 02/12] vhost: support multiple queues in virtio dev
  2015-08-13 12:52         ` Flavio Leitner
@ 2015-08-14  2:29           ` Ouyang, Changchun
  2015-08-14 12:16             ` Flavio Leitner
  0 siblings, 1 reply; 65+ messages in thread
From: Ouyang, Changchun @ 2015-08-14  2:29 UTC (permalink / raw)
  To: Flavio Leitner; +Cc: dev

Hi Flavio,

Thanks for your comments, see my response below.

> -----Original Message-----
> From: Flavio Leitner [mailto:fbl@sysclose.org]
> Sent: Thursday, August 13, 2015 8:52 PM
> To: Ouyang, Changchun
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v4 02/12] vhost: support multiple queues in
> virtio dev
> 
> On Wed, Aug 12, 2015 at 04:02:37PM +0800, Ouyang Changchun wrote:
> > Each virtio device could have multiple queues, say 2 or 4, at most 8.
> > Enabling this feature allows virtio device/port on guest has the
> > ability to use different vCPU to receive/transmit packets from/to each
> queue.
> >
> > In multiple queues mode, virtio device readiness means all queues of
> > this virtio device are ready, cleanup/destroy a virtio device also
> > requires clearing all queues belong to it.
> >
> > Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
> > ---
> > Changes in v4:
> >   - rebase and fix conflicts
> >   - resolve comments
> >   - init each virtq pair if mq is on
> >
> > Changes in v3:
> >   - fix coding style
> >   - check virtqueue idx validity
> >
> > Changes in v2:
> >   - remove the q_num_set api
> >   - add the qp_num_get api
> >   - determine the queue pair num from qemu message
> >   - rework for reset owner message handler
> >   - dynamically alloc mem for dev virtqueue
> >   - queue pair num could be 0x8000
> >   - fix checkpatch errors
> >
> >  lib/librte_vhost/rte_virtio_net.h             |  10 +-
> >  lib/librte_vhost/vhost-net.h                  |   1 +
> >  lib/librte_vhost/vhost_rxtx.c                 |  52 +++++---
> >  lib/librte_vhost/vhost_user/vhost-net-user.c  |   4 +-
> >  lib/librte_vhost/vhost_user/virtio-net-user.c |  76 +++++++++---
> >  lib/librte_vhost/vhost_user/virtio-net-user.h |   2 +
> >  lib/librte_vhost/virtio-net.c                 | 165 +++++++++++++++++---------
> >  7 files changed, 222 insertions(+), 88 deletions(-)
> >
> > diff --git a/lib/librte_vhost/rte_virtio_net.h
> > b/lib/librte_vhost/rte_virtio_net.h
> > index b9bf320..d9e887f 100644
> > --- a/lib/librte_vhost/rte_virtio_net.h
> > +++ b/lib/librte_vhost/rte_virtio_net.h
> > @@ -59,7 +59,6 @@ struct rte_mbuf;
> >  /* Backend value set by guest. */
> >  #define VIRTIO_DEV_STOPPED -1
> >
> > -
> >  /* Enum for virtqueue management. */
> >  enum {VIRTIO_RXQ, VIRTIO_TXQ, VIRTIO_QNUM};
> >
> > @@ -96,13 +95,14 @@ struct vhost_virtqueue {
> >   * Device structure contains all configuration information relating to the
> device.
> >   */
> >  struct virtio_net {
> > -	struct vhost_virtqueue	*virtqueue[VIRTIO_QNUM];	/**< Contains
> all virtqueue information. */
> >  	struct virtio_memory	*mem;		/**< QEMU memory and
> memory region information. */
> > +	struct vhost_virtqueue	**virtqueue;    /**< Contains all virtqueue
> information. */
> >  	uint64_t		features;	/**< Negotiated feature set.
> */
> >  	uint64_t		device_fh;	/**< device identifier. */
> >  	uint32_t		flags;		/**< Device flags. Only used
> to check if device is running on data core. */
> >  #define IF_NAME_SZ (PATH_MAX > IFNAMSIZ ? PATH_MAX : IFNAMSIZ)
> >  	char			ifname[IF_NAME_SZ];	/**< Name of the tap
> device or socket path. */
> > +	uint32_t		virt_qp_nb;
> >  	void			*priv;		/**< private context */
> >  } __rte_cache_aligned;
> >
> > @@ -235,4 +235,10 @@ uint16_t rte_vhost_enqueue_burst(struct
> > virtio_net *dev, uint16_t queue_id,  uint16_t
> rte_vhost_dequeue_burst(struct virtio_net *dev, uint16_t queue_id,
> >  	struct rte_mempool *mbuf_pool, struct rte_mbuf **pkts, uint16_t
> > count);
> >
> > +/**
> > + * This function get the queue pair number of one vhost device.
> > + * @return
> > + *  num of queue pair of specified virtio device.
> > + */
> > +uint16_t rte_vhost_qp_num_get(struct virtio_net *dev);
> >  #endif /* _VIRTIO_NET_H_ */
> > diff --git a/lib/librte_vhost/vhost-net.h
> > b/lib/librte_vhost/vhost-net.h index c69b60b..7dff14d 100644
> > --- a/lib/librte_vhost/vhost-net.h
> > +++ b/lib/librte_vhost/vhost-net.h
> > @@ -115,4 +115,5 @@ struct vhost_net_device_ops {
> >
> >
> >  struct vhost_net_device_ops const *get_virtio_net_callbacks(void);
> > +int alloc_vring_queue_pair(struct virtio_net *dev, uint16_t qp_idx);
> >  #endif /* _VHOST_NET_CDEV_H_ */
> > diff --git a/lib/librte_vhost/vhost_rxtx.c
> > b/lib/librte_vhost/vhost_rxtx.c index 0d07338..db4ad88 100644
> > --- a/lib/librte_vhost/vhost_rxtx.c
> > +++ b/lib/librte_vhost/vhost_rxtx.c
> > @@ -43,6 +43,18 @@
> >  #define MAX_PKT_BURST 32
> >
> >  /**
> > + * Check the virtqueue idx validility,
> > + * return 1 if pass, otherwise 0.
> > + */
> > +static inline uint8_t __attribute__((always_inline))
> > +check_virtqueue_idx(uint16_t virtq_idx, uint8_t is_tx, uint32_t
> > +virtq_num) {
> > +	if ((is_tx ^ (virtq_idx & 0x1)) || (virtq_idx >= virtq_num))
> > +		return 0;
> > +	return 1;
> > +}
> > +
> > +/**
> >   * This function adds buffers to the virtio devices RX virtqueue. Buffers can
> >   * be received from the physical port or from another virtio device. A
> packet
> >   * count is returned to indicate the number of packets that are
> > succesfully @@ -68,12 +80,15 @@ virtio_dev_rx(struct virtio_net *dev,
> uint16_t queue_id,
> >  	uint8_t success = 0;
> >
> >  	LOG_DEBUG(VHOST_DATA, "(%"PRIu64") virtio_dev_rx()\n", dev-
> >device_fh);
> > -	if (unlikely(queue_id != VIRTIO_RXQ)) {
> > -		LOG_DEBUG(VHOST_DATA, "mq isn't supported in this
> version.\n");
> > +	if (unlikely(check_virtqueue_idx(queue_id, 0,
> > +		VIRTIO_QNUM * dev->virt_qp_nb) == 0)) {
> > +		RTE_LOG(ERR, VHOST_DATA,
> > +			"%s (%"PRIu64"): virtqueue idx:%d invalid.\n",
> > +			 __func__, dev->device_fh, queue_id);
> >  		return 0;
> >  	}
> >
> > -	vq = dev->virtqueue[VIRTIO_RXQ];
> > +	vq = dev->virtqueue[queue_id];
> >  	count = (count > MAX_PKT_BURST) ? MAX_PKT_BURST : count;
> >
> >  	/*
> > @@ -235,8 +250,9 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t
> > queue_id,  }
> >
> >  static inline uint32_t __attribute__((always_inline))
> > -copy_from_mbuf_to_vring(struct virtio_net *dev, uint16_t res_base_idx,
> > -	uint16_t res_end_idx, struct rte_mbuf *pkt)
> > +copy_from_mbuf_to_vring(struct virtio_net *dev, uint16_t queue_id,
> > +	uint16_t res_base_idx, uint16_t res_end_idx,
> > +	struct rte_mbuf *pkt)
> >  {
> >  	uint32_t vec_idx = 0;
> >  	uint32_t entry_success = 0;
> > @@ -264,8 +280,9 @@ copy_from_mbuf_to_vring(struct virtio_net *dev,
> uint16_t res_base_idx,
> >  	 * Convert from gpa to vva
> >  	 * (guest physical addr -> vhost virtual addr)
> >  	 */
> > -	vq = dev->virtqueue[VIRTIO_RXQ];
> > -	vb_addr = gpa_to_vva(dev, vq->buf_vec[vec_idx].buf_addr);
> > +	vq = dev->virtqueue[queue_id];
> > +	vb_addr =
> > +		gpa_to_vva(dev, vq->buf_vec[vec_idx].buf_addr);
> >  	vb_hdr_addr = vb_addr;
> >
> >  	/* Prefetch buffer address. */
> > @@ -464,11 +481,15 @@ virtio_dev_merge_rx(struct virtio_net *dev,
> > uint16_t queue_id,
> >
> >  	LOG_DEBUG(VHOST_DATA, "(%"PRIu64") virtio_dev_merge_rx()\n",
> >  		dev->device_fh);
> > -	if (unlikely(queue_id != VIRTIO_RXQ)) {
> > -		LOG_DEBUG(VHOST_DATA, "mq isn't supported in this
> version.\n");
> > +	if (unlikely(check_virtqueue_idx(queue_id, 0,
> > +		VIRTIO_QNUM * dev->virt_qp_nb) == 0)) {
> > +		RTE_LOG(ERR, VHOST_DATA,
> > +			"%s (%"PRIu64"): virtqueue idx:%d invalid.\n",
> > +			 __func__, dev->device_fh, queue_id);
> > +		return 0;
> >  	}
> >
> > -	vq = dev->virtqueue[VIRTIO_RXQ];
> > +	vq = dev->virtqueue[queue_id];
> >  	count = RTE_MIN((uint32_t)MAX_PKT_BURST, count);
> >
> >  	if (count == 0)
> > @@ -509,7 +530,7 @@ virtio_dev_merge_rx(struct virtio_net *dev,
> uint16_t queue_id,
> >  							res_cur_idx);
> >  		} while (success == 0);
> >
> > -		entry_success = copy_from_mbuf_to_vring(dev,
> res_base_idx,
> > +		entry_success = copy_from_mbuf_to_vring(dev, queue_id,
> > +res_base_idx,
> >  			res_cur_idx, pkts[pkt_idx]);
> >
> >  		rte_compiler_barrier();
> > @@ -559,12 +580,15 @@ rte_vhost_dequeue_burst(struct virtio_net *dev,
> uint16_t queue_id,
> >  	uint16_t free_entries, entry_success = 0;
> >  	uint16_t avail_idx;
> >
> > -	if (unlikely(queue_id != VIRTIO_TXQ)) {
> > -		LOG_DEBUG(VHOST_DATA, "mq isn't supported in this
> version.\n");
> > +	if (unlikely(check_virtqueue_idx(queue_id, 1,
> > +		VIRTIO_QNUM * dev->virt_qp_nb) == 0)) {
> > +		RTE_LOG(ERR, VHOST_DATA,
> > +			"%s (%"PRIu64"): virtqueue idx:%d invalid.\n",
> > +			 __func__, dev->device_fh, queue_id);
> >  		return 0;
> >  	}
> >
> > -	vq = dev->virtqueue[VIRTIO_TXQ];
> > +	vq = dev->virtqueue[queue_id];
> >  	avail_idx =  *((volatile uint16_t *)&vq->avail->idx);
> >
> >  	/* If there are no available buffers then return. */ diff --git
> > a/lib/librte_vhost/vhost_user/vhost-net-user.c
> > b/lib/librte_vhost/vhost_user/vhost-net-user.c
> > index f406a94..3d7c373 100644
> > --- a/lib/librte_vhost/vhost_user/vhost-net-user.c
> > +++ b/lib/librte_vhost/vhost_user/vhost-net-user.c
> > @@ -383,7 +383,9 @@ vserver_message_handler(int connfd, void *dat, int
> *remove)
> >  		ops->set_owner(ctx);
> >  		break;
> >  	case VHOST_USER_RESET_OWNER:
> > -		ops->reset_owner(ctx);
> > +		RTE_LOG(INFO, VHOST_CONFIG,
> > +			"(%"PRIu64") VHOST_NET_RESET_OWNER\n", ctx.fh);
> > +		user_reset_owner(ctx, &msg.payload.state);
> >  		break;
> >
> >  	case VHOST_USER_SET_MEM_TABLE:
> > diff --git a/lib/librte_vhost/vhost_user/virtio-net-user.c
> > b/lib/librte_vhost/vhost_user/virtio-net-user.c
> > index c1ffc38..4c1d4df 100644
> > --- a/lib/librte_vhost/vhost_user/virtio-net-user.c
> > +++ b/lib/librte_vhost/vhost_user/virtio-net-user.c
> > @@ -209,30 +209,46 @@ static int
> >  virtio_is_ready(struct virtio_net *dev)  {
> >  	struct vhost_virtqueue *rvq, *tvq;
> > +	uint32_t q_idx;
> >
> >  	/* mq support in future.*/
> > -	rvq = dev->virtqueue[VIRTIO_RXQ];
> > -	tvq = dev->virtqueue[VIRTIO_TXQ];
> > -	if (rvq && tvq && rvq->desc && tvq->desc &&
> > -		(rvq->kickfd != (eventfd_t)-1) &&
> > -		(rvq->callfd != (eventfd_t)-1) &&
> > -		(tvq->kickfd != (eventfd_t)-1) &&
> > -		(tvq->callfd != (eventfd_t)-1)) {
> > -		RTE_LOG(INFO, VHOST_CONFIG,
> > -			"virtio is now ready for processing.\n");
> > -		return 1;
> > +	for (q_idx = 0; q_idx < dev->virt_qp_nb; q_idx++) {
> > +		uint32_t virt_rx_q_idx = q_idx * VIRTIO_QNUM +
> VIRTIO_RXQ;
> > +		uint32_t virt_tx_q_idx = q_idx * VIRTIO_QNUM +
> VIRTIO_TXQ;
> > +
> > +		rvq = dev->virtqueue[virt_rx_q_idx];
> > +		tvq = dev->virtqueue[virt_tx_q_idx];
> > +		if ((rvq == NULL) || (tvq == NULL) ||
> > +			(rvq->desc == NULL) || (tvq->desc == NULL) ||
> > +			(rvq->kickfd == (eventfd_t)-1) ||
> > +			(rvq->callfd == (eventfd_t)-1) ||
> > +			(tvq->kickfd == (eventfd_t)-1) ||
> > +			(tvq->callfd == (eventfd_t)-1)) {
> > +			RTE_LOG(INFO, VHOST_CONFIG,
> > +				"virtio isn't ready for processing.\n");
> > +			return 0;
> > +		}
> >  	}
> >  	RTE_LOG(INFO, VHOST_CONFIG,
> > -		"virtio isn't ready for processing.\n");
> > -	return 0;
> > +		"virtio is now ready for processing.\n");
> > +	return 1;
> >  }
> >
> >  void
> >  user_set_vring_call(struct vhost_device_ctx ctx, struct VhostUserMsg
> > *pmsg)  {
> >  	struct vhost_vring_file file;
> > +	struct virtio_net *dev = get_device(ctx);
> > +	uint32_t cur_qp_idx;
> >
> >  	file.index = pmsg->payload.u64 & VHOST_USER_VRING_IDX_MASK;
> > +	cur_qp_idx = file.index >> 1;
> > +
> > +	if (dev->virt_qp_nb < cur_qp_idx + 1) {
> > +		if (alloc_vring_queue_pair(dev, cur_qp_idx) == 0)
> > +			dev->virt_qp_nb = cur_qp_idx + 1;
> 
> Looks like it is missing vring initialization here.
> 
> 	if (dev->virt_qp_nb < cur_qp_idx + 1) {
> 		if (alloc_vring_queue_pair(dev, cur_qp_idx) == 0) {
> 			dev->virt_qp_nb = cur_qp_idx + 1;
> 			init_vring_queue_pair(dev, cur_qp_idx);

I have called init_vring_queue_pair inside the function alloc_vring_queue_pair;
it has the same effect as your suggestion.

Thanks again
Changchun

> 		}
> 	}
> 
> 
> fbl
> 
> 
> > +	}
> > +
> >  	if (pmsg->payload.u64 & VHOST_USER_VRING_NOFD_MASK)
> >  		file.fd = -1;
> >  	else
> > @@ -290,13 +306,37 @@ user_get_vring_base(struct vhost_device_ctx ctx,
> >  	 * sent and only sent in vhost_vring_stop.
> >  	 * TODO: cleanup the vring, it isn't usable since here.
> >  	 */
> > -	if (((int)dev->virtqueue[VIRTIO_RXQ]->kickfd) >= 0) {
> > -		close(dev->virtqueue[VIRTIO_RXQ]->kickfd);
> > -		dev->virtqueue[VIRTIO_RXQ]->kickfd = (eventfd_t)-1;
> > +	if (((int)dev->virtqueue[state->index]->kickfd) >= 0) {
> > +		close(dev->virtqueue[state->index]->kickfd);
> > +		dev->virtqueue[state->index]->kickfd = (eventfd_t)-1;
> >  	}
> > -	if (((int)dev->virtqueue[VIRTIO_TXQ]->kickfd) >= 0) {
> > -		close(dev->virtqueue[VIRTIO_TXQ]->kickfd);
> > -		dev->virtqueue[VIRTIO_TXQ]->kickfd = (eventfd_t)-1;
> > +
> > +	return 0;
> > +}
> > +
> > +/*
> > + * when virtio is stopped, qemu will send us the RESET_OWNER message.
> > + */
> > +int
> > +user_reset_owner(struct vhost_device_ctx ctx,
> > +	struct vhost_vring_state *state)
> > +{
> > +	struct virtio_net *dev = get_device(ctx);
> > +
> > +	/* We have to stop the queue (virtio) if it is running. */
> > +	if (dev->flags & VIRTIO_DEV_RUNNING)
> > +		notify_ops->destroy_device(dev);
> > +
> > +	RTE_LOG(INFO, VHOST_CONFIG,
> > +		"reset owner --- state idx:%d state num:%d\n", state->index,
> state->num);
> > +	/*
> > +	 * Based on current qemu vhost-user implementation, this message
> is
> > +	 * sent and only sent in vhost_net_stop_one.
> > +	 * TODO: cleanup the vring, it isn't usable since here.
> > +	 */
> > +	if (((int)dev->virtqueue[state->index]->kickfd) >= 0) {
> > +		close(dev->virtqueue[state->index]->kickfd);
> > +		dev->virtqueue[state->index]->kickfd = (eventfd_t)-1;
> >  	}
> >
> >  	return 0;
> > diff --git a/lib/librte_vhost/vhost_user/virtio-net-user.h
> > b/lib/librte_vhost/vhost_user/virtio-net-user.h
> > index df24860..2429836 100644
> > --- a/lib/librte_vhost/vhost_user/virtio-net-user.h
> > +++ b/lib/librte_vhost/vhost_user/virtio-net-user.h
> > @@ -46,4 +46,6 @@ void user_set_vring_kick(struct vhost_device_ctx,
> > struct VhostUserMsg *);  int user_get_vring_base(struct
> > vhost_device_ctx, struct vhost_vring_state *);
> >
> >  void user_destroy_device(struct vhost_device_ctx);
> > +
> > +int user_reset_owner(struct vhost_device_ctx ctx, struct
> > +vhost_vring_state *state);
> >  #endif
> > diff --git a/lib/librte_vhost/virtio-net.c
> > b/lib/librte_vhost/virtio-net.c index b520ec5..2a4b791 100644
> > --- a/lib/librte_vhost/virtio-net.c
> > +++ b/lib/librte_vhost/virtio-net.c
> > @@ -71,9 +71,10 @@ static struct virtio_net_config_ll *ll_root;
> > #define VHOST_SUPPORTED_FEATURES ((1ULL <<
> VIRTIO_NET_F_MRG_RXBUF) | \
> >  				(1ULL << VIRTIO_NET_F_CTRL_VQ) | \
> >  				(1ULL << VIRTIO_NET_F_CTRL_RX) | \
> > -				(1ULL << VHOST_F_LOG_ALL))
> > -static uint64_t VHOST_FEATURES = VHOST_SUPPORTED_FEATURES;
> > +				(1ULL << VHOST_F_LOG_ALL) | \
> > +				(1ULL << VIRTIO_NET_F_MQ))
> >
> > +static uint64_t VHOST_FEATURES = VHOST_SUPPORTED_FEATURES;
> >
> >  /*
> >   * Converts QEMU virtual address to Vhost virtual address. This
> > function is @@ -182,6 +183,8 @@ add_config_ll_entry(struct
> > virtio_net_config_ll *new_ll_dev)  static void  cleanup_device(struct
> > virtio_net *dev)  {
> > +	uint32_t qp_idx;
> > +
> >  	/* Unmap QEMU memory file if mapped. */
> >  	if (dev->mem) {
> >  		munmap((void *)(uintptr_t)dev->mem->mapped_address,
> > @@ -190,14 +193,18 @@ cleanup_device(struct virtio_net *dev)
> >  	}
> >
> >  	/* Close any event notifiers opened by device. */
> > -	if ((int)dev->virtqueue[VIRTIO_RXQ]->callfd >= 0)
> > -		close((int)dev->virtqueue[VIRTIO_RXQ]->callfd);
> > -	if ((int)dev->virtqueue[VIRTIO_RXQ]->kickfd >= 0)
> > -		close((int)dev->virtqueue[VIRTIO_RXQ]->kickfd);
> > -	if ((int)dev->virtqueue[VIRTIO_TXQ]->callfd >= 0)
> > -		close((int)dev->virtqueue[VIRTIO_TXQ]->callfd);
> > -	if ((int)dev->virtqueue[VIRTIO_TXQ]->kickfd >= 0)
> > -		close((int)dev->virtqueue[VIRTIO_TXQ]->kickfd);
> > +	for (qp_idx = 0; qp_idx < dev->virt_qp_nb; qp_idx++) {
> > +		uint32_t virt_rx_q_idx = qp_idx * VIRTIO_QNUM +
> VIRTIO_RXQ;
> > +		uint32_t virt_tx_q_idx = qp_idx * VIRTIO_QNUM +
> VIRTIO_TXQ;
> > +		if ((int)dev->virtqueue[virt_rx_q_idx]->callfd >= 0)
> > +			close((int)dev->virtqueue[virt_rx_q_idx]->callfd);
> > +		if ((int)dev->virtqueue[virt_rx_q_idx]->kickfd >= 0)
> > +			close((int)dev->virtqueue[virt_rx_q_idx]->kickfd);
> > +		if ((int)dev->virtqueue[virt_tx_q_idx]->callfd >= 0)
> > +			close((int)dev->virtqueue[virt_tx_q_idx]->callfd);
> > +		if ((int)dev->virtqueue[virt_tx_q_idx]->kickfd >= 0)
> > +			close((int)dev->virtqueue[virt_tx_q_idx]->kickfd);
> > +	}
> >  }
> >
> >  /*
> > @@ -206,9 +213,17 @@ cleanup_device(struct virtio_net *dev)  static
> > void  free_device(struct virtio_net_config_ll *ll_dev)  {
> > -	/* Free any malloc'd memory */
> > -	rte_free(ll_dev->dev.virtqueue[VIRTIO_RXQ]);
> > -	rte_free(ll_dev->dev.virtqueue[VIRTIO_TXQ]);
> > +	uint32_t qp_idx;
> > +
> > +	/*
> > +	 * Free any malloc'd memory.
> > +	 */
> > +	/* Free every queue pair. */
> > +	for (qp_idx = 0; qp_idx < ll_dev->dev.virt_qp_nb; qp_idx++) {
> > +		uint32_t virt_rx_q_idx = qp_idx * VIRTIO_QNUM +
> VIRTIO_RXQ;
> > +		rte_free(ll_dev->dev.virtqueue[virt_rx_q_idx]);
> > +	}
> > +	rte_free(ll_dev->dev.virtqueue);
> >  	rte_free(ll_dev);
> >  }
> >
> > @@ -242,6 +257,27 @@ rm_config_ll_entry(struct virtio_net_config_ll
> > *ll_dev,  }
> >
> >  /*
> > + *  Initialise all variables in vring queue pair.
> > + */
> > +static void
> > +init_vring_queue_pair(struct virtio_net *dev, uint16_t qp_idx) {
> > +	uint32_t virt_rx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_RXQ;
> > +	uint32_t virt_tx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_TXQ;
> > +	memset(dev->virtqueue[virt_rx_q_idx], 0, sizeof(struct
> vhost_virtqueue));
> > +	memset(dev->virtqueue[virt_tx_q_idx], 0, sizeof(struct
> > +vhost_virtqueue));
> > +
> > +	dev->virtqueue[virt_rx_q_idx]->kickfd = (eventfd_t)-1;
> > +	dev->virtqueue[virt_rx_q_idx]->callfd = (eventfd_t)-1;
> > +	dev->virtqueue[virt_tx_q_idx]->kickfd = (eventfd_t)-1;
> > +	dev->virtqueue[virt_tx_q_idx]->callfd = (eventfd_t)-1;
> > +
> > +	/* Backends are set to -1 indicating an inactive device. */
> > +	dev->virtqueue[virt_rx_q_idx]->backend = VIRTIO_DEV_STOPPED;
> > +	dev->virtqueue[virt_tx_q_idx]->backend = VIRTIO_DEV_STOPPED; }
> > +
> > +/*
> >   *  Initialise all variables in device structure.
> >   */
> >  static void
> > @@ -258,17 +294,34 @@ init_device(struct virtio_net *dev)
> >  	/* Set everything to 0. */
> >  	memset((void *)(uintptr_t)((uint64_t)(uintptr_t)dev + vq_offset), 0,
> >  		(sizeof(struct virtio_net) - (size_t)vq_offset));
> > -	memset(dev->virtqueue[VIRTIO_RXQ], 0, sizeof(struct
> vhost_virtqueue));
> > -	memset(dev->virtqueue[VIRTIO_TXQ], 0, sizeof(struct
> vhost_virtqueue));
> >
> > -	dev->virtqueue[VIRTIO_RXQ]->kickfd = (eventfd_t)-1;
> > -	dev->virtqueue[VIRTIO_RXQ]->callfd = (eventfd_t)-1;
> > -	dev->virtqueue[VIRTIO_TXQ]->kickfd = (eventfd_t)-1;
> > -	dev->virtqueue[VIRTIO_TXQ]->callfd = (eventfd_t)-1;
> > +	init_vring_queue_pair(dev, 0);
> > +	dev->virt_qp_nb = 1;
> > +}
> > +
> > +/*
> > + *  Alloc mem for vring queue pair.
> > + */
> > +int
> > +alloc_vring_queue_pair(struct virtio_net *dev, uint16_t qp_idx) {
> > +	struct vhost_virtqueue *virtqueue = NULL;
> > +	uint32_t virt_rx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_RXQ;
> > +	uint32_t virt_tx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_TXQ;
> >
> > -	/* Backends are set to -1 indicating an inactive device. */
> > -	dev->virtqueue[VIRTIO_RXQ]->backend = VIRTIO_DEV_STOPPED;
> > -	dev->virtqueue[VIRTIO_TXQ]->backend = VIRTIO_DEV_STOPPED;
> > +	virtqueue = rte_malloc(NULL, sizeof(struct vhost_virtqueue) *
> VIRTIO_QNUM, 0);
> > +	if (virtqueue == NULL) {
> > +		RTE_LOG(ERR, VHOST_CONFIG,
> > +			"Failed to allocate memory for virt qp:%d.\n",
> qp_idx);
> > +		return -1;
> > +	}
> > +
> > +	dev->virtqueue[virt_rx_q_idx] = virtqueue;
> > +	dev->virtqueue[virt_tx_q_idx] = virtqueue + VIRTIO_TXQ;
> > +
> > +	init_vring_queue_pair(dev, qp_idx);
> > +
> > +	return 0;
> >  }
> >
> >  /*
> > @@ -280,7 +333,6 @@ static int
> >  new_device(struct vhost_device_ctx ctx)  {
> >  	struct virtio_net_config_ll *new_ll_dev;
> > -	struct vhost_virtqueue *virtqueue_rx, *virtqueue_tx;
> >
> >  	/* Setup device and virtqueues. */
> >  	new_ll_dev = rte_malloc(NULL, sizeof(struct virtio_net_config_ll),
> > 0); @@ -291,28 +343,22 @@ new_device(struct vhost_device_ctx ctx)
> >  		return -1;
> >  	}
> >
> > -	virtqueue_rx = rte_malloc(NULL, sizeof(struct vhost_virtqueue), 0);
> > -	if (virtqueue_rx == NULL) {
> > -		rte_free(new_ll_dev);
> > +	new_ll_dev->dev.virtqueue =
> > +		rte_malloc(NULL, VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MAX *
> sizeof(struct vhost_virtqueue *), 0);
> > +	if (new_ll_dev->dev.virtqueue == NULL) {
> >  		RTE_LOG(ERR, VHOST_CONFIG,
> > -			"(%"PRIu64") Failed to allocate memory for rxq.\n",
> > +			"(%"PRIu64") Failed to allocate memory for
> dev.virtqueue.\n",
> >  			ctx.fh);
> > +		rte_free(new_ll_dev);
> >  		return -1;
> >  	}
> >
> > -	virtqueue_tx = rte_malloc(NULL, sizeof(struct vhost_virtqueue), 0);
> > -	if (virtqueue_tx == NULL) {
> > -		rte_free(virtqueue_rx);
> > +	if (alloc_vring_queue_pair(&new_ll_dev->dev, 0) == -1) {
> > +		rte_free(new_ll_dev->dev.virtqueue);
> >  		rte_free(new_ll_dev);
> > -		RTE_LOG(ERR, VHOST_CONFIG,
> > -			"(%"PRIu64") Failed to allocate memory for txq.\n",
> > -			ctx.fh);
> >  		return -1;
> >  	}
> >
> > -	new_ll_dev->dev.virtqueue[VIRTIO_RXQ] = virtqueue_rx;
> > -	new_ll_dev->dev.virtqueue[VIRTIO_TXQ] = virtqueue_tx;
> > -
> >  	/* Initialise device and virtqueues. */
> >  	init_device(&new_ll_dev->dev);
> >
> > @@ -396,7 +442,7 @@ set_owner(struct vhost_device_ctx ctx)
> >   * Called from CUSE IOCTL: VHOST_RESET_OWNER
> >   */
> >  static int
> > -reset_owner(struct vhost_device_ctx ctx)
> > +reset_owner(__rte_unused struct vhost_device_ctx ctx)
> >  {
> >  	struct virtio_net_config_ll *ll_dev;
> >
> > @@ -434,6 +480,7 @@ static int
> >  set_features(struct vhost_device_ctx ctx, uint64_t *pu)  {
> >  	struct virtio_net *dev;
> > +	uint32_t q_idx;
> >
> >  	dev = get_device(ctx);
> >  	if (dev == NULL)
> > @@ -445,22 +492,26 @@ set_features(struct vhost_device_ctx ctx,
> uint64_t *pu)
> >  	dev->features = *pu;
> >
> >  	/* Set the vhost_hlen depending on if VIRTIO_NET_F_MRG_RXBUF
> is set. */
> > -	if (dev->features & (1 << VIRTIO_NET_F_MRG_RXBUF)) {
> > -		LOG_DEBUG(VHOST_CONFIG,
> > -			"(%"PRIu64") Mergeable RX buffers enabled\n",
> > -			dev->device_fh);
> > -		dev->virtqueue[VIRTIO_RXQ]->vhost_hlen =
> > -			sizeof(struct virtio_net_hdr_mrg_rxbuf);
> > -		dev->virtqueue[VIRTIO_TXQ]->vhost_hlen =
> > -			sizeof(struct virtio_net_hdr_mrg_rxbuf);
> > -	} else {
> > -		LOG_DEBUG(VHOST_CONFIG,
> > -			"(%"PRIu64") Mergeable RX buffers disabled\n",
> > -			dev->device_fh);
> > -		dev->virtqueue[VIRTIO_RXQ]->vhost_hlen =
> > -			sizeof(struct virtio_net_hdr);
> > -		dev->virtqueue[VIRTIO_TXQ]->vhost_hlen =
> > -			sizeof(struct virtio_net_hdr);
> > +	for (q_idx = 0; q_idx < dev->virt_qp_nb; q_idx++) {
> > +		uint32_t virt_rx_q_idx = q_idx * VIRTIO_QNUM +
> VIRTIO_RXQ;
> > +		uint32_t virt_tx_q_idx = q_idx * VIRTIO_QNUM +
> VIRTIO_TXQ;
> > +		if (dev->features & (1 << VIRTIO_NET_F_MRG_RXBUF)) {
> > +			LOG_DEBUG(VHOST_CONFIG,
> > +				"(%"PRIu64") Mergeable RX buffers
> enabled\n",
> > +				dev->device_fh);
> > +			dev->virtqueue[virt_rx_q_idx]->vhost_hlen =
> > +				sizeof(struct virtio_net_hdr_mrg_rxbuf);
> > +			dev->virtqueue[virt_tx_q_idx]->vhost_hlen =
> > +				sizeof(struct virtio_net_hdr_mrg_rxbuf);
> > +		} else {
> > +			LOG_DEBUG(VHOST_CONFIG,
> > +				"(%"PRIu64") Mergeable RX buffers
> disabled\n",
> > +				dev->device_fh);
> > +			dev->virtqueue[virt_rx_q_idx]->vhost_hlen =
> > +				sizeof(struct virtio_net_hdr);
> > +			dev->virtqueue[virt_tx_q_idx]->vhost_hlen =
> > +				sizeof(struct virtio_net_hdr);
> > +		}
> >  	}
> >  	return 0;
> >  }
> > @@ -826,6 +877,14 @@ int rte_vhost_feature_enable(uint64_t
> feature_mask)
> >  	return -1;
> >  }
> >
> > +uint16_t rte_vhost_qp_num_get(struct virtio_net *dev) {
> > +	if (dev == NULL)
> > +		return 0;
> > +
> > +	return dev->virt_qp_nb;
> > +}
> > +
> >  /*
> >   * Register ops so that we can add/remove device to data core.
> >   */
> > --
> > 1.8.4.2
> >


* Re: [dpdk-dev] [PATCH v4 02/12] vhost: support multiple queues in virtio dev
  2015-08-14  2:29           ` Ouyang, Changchun
@ 2015-08-14 12:16             ` Flavio Leitner
  0 siblings, 0 replies; 65+ messages in thread
From: Flavio Leitner @ 2015-08-14 12:16 UTC (permalink / raw)
  To: Ouyang, Changchun; +Cc: dev

On Fri, Aug 14, 2015 at 02:29:51AM +0000, Ouyang, Changchun wrote:
> > -----Original Message-----
> > From: Flavio Leitner [mailto:fbl@sysclose.org]
> > Sent: Thursday, August 13, 2015 8:52 PM
> > To: Ouyang, Changchun
> > Cc: dev@dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH v4 02/12] vhost: support multiple queues in
> > virtio dev
> > 
> > On Wed, Aug 12, 2015 at 04:02:37PM +0800, Ouyang Changchun wrote:
> > >  	file.index = pmsg->payload.u64 & VHOST_USER_VRING_IDX_MASK;
> > > +	cur_qp_idx = file.index >> 1;
> > > +
> > > +	if (dev->virt_qp_nb < cur_qp_idx + 1) {
> > > +		if (alloc_vring_queue_pair(dev, cur_qp_idx) == 0)
> > > +			dev->virt_qp_nb = cur_qp_idx + 1;
> > 
> > Looks like it is missing vring initialization here.
> > 
> > 	if (dev->virt_qp_nb < cur_qp_idx + 1) {
> > 		if (alloc_vring_queue_pair(dev, cur_qp_idx) == 0) {
> > 			dev->virt_qp_nb = cur_qp_idx + 1;
> > 			init_vring_queue_pair(dev, cur_qp_idx);
> 
> I have called init_vring_queue_pair inside alloc_vring_queue_pair,
> so it has the same effect as your suggestion.

Yup, I missed that.
Thanks!
fbl
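
For reference, the handler path under discussion condenses to the sketch
below; ensure_qp_allocated is a hypothetical helper name used only for
illustration, not code from the patch:

/*
 * Each queue pair holds VIRTIO_QNUM (= 2) vrings, so a vring index maps
 * to a queue pair index by dividing by 2.  alloc_vring_queue_pair()
 * initialises the pair internally, which is why no separate init call
 * is needed in the message handler.
 */
static int
ensure_qp_allocated(struct virtio_net *dev, uint32_t vring_idx)
{
	uint32_t cur_qp_idx = vring_idx / VIRTIO_QNUM;

	if (dev->virt_qp_nb < cur_qp_idx + 1) {
		/* allocation also runs init_vring_queue_pair() */
		if (alloc_vring_queue_pair(dev, cur_qp_idx) != 0)
			return -1;
		dev->virt_qp_nb = cur_qp_idx + 1;
	}
	return 0;
}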

 


* Re: [dpdk-dev] [PATCH v4 02/12] vhost: support multiple queues in virtio dev
  2015-08-12  8:02       ` [dpdk-dev] [PATCH v4 02/12] vhost: support multiple queues in virtio dev Ouyang Changchun
  2015-08-13 12:52         ` Flavio Leitner
@ 2015-08-19  3:52         ` Yuanhan Liu
  2015-08-19  5:54           ` Ouyang, Changchun
  2015-09-03  2:27         ` Tetsuya Mukawa
  2 siblings, 1 reply; 65+ messages in thread
From: Yuanhan Liu @ 2015-08-19  3:52 UTC (permalink / raw)
  To: Ouyang Changchun; +Cc: dev

Hi Changchun,

On Wed, Aug 12, 2015 at 04:02:37PM +0800, Ouyang Changchun wrote:
> Each virtio device could have multiple queues, say 2 or 4, at most 8.
> Enabling this feature allows a virtio device/port on the guest to use a
> different vCPU to receive/transmit packets from/to each queue.
> 
> In multiple queues mode, virtio device readiness means all queues of
> this virtio device are ready; cleanup/destroy of a virtio device also
> requires clearing all queues belonging to it.
> 
> Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
> ---
[snip ..]
>  /*
> + *  Initialise all variables in vring queue pair.
> + */
> +static void
> +init_vring_queue_pair(struct virtio_net *dev, uint16_t qp_idx)
> +{
> +	uint32_t virt_rx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_RXQ;
> +	uint32_t virt_tx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_TXQ;
> +	memset(dev->virtqueue[virt_rx_q_idx], 0, sizeof(struct vhost_virtqueue));
> +	memset(dev->virtqueue[virt_tx_q_idx], 0, sizeof(struct vhost_virtqueue));
> +
> +	dev->virtqueue[virt_rx_q_idx]->kickfd = (eventfd_t)-1;
> +	dev->virtqueue[virt_rx_q_idx]->callfd = (eventfd_t)-1;
> +	dev->virtqueue[virt_tx_q_idx]->kickfd = (eventfd_t)-1;
> +	dev->virtqueue[virt_tx_q_idx]->callfd = (eventfd_t)-1;
> +
> +	/* Backends are set to -1 indicating an inactive device. */
> +	dev->virtqueue[virt_rx_q_idx]->backend = VIRTIO_DEV_STOPPED;
> +	dev->virtqueue[virt_tx_q_idx]->backend = VIRTIO_DEV_STOPPED;
> +}
> +
> +/*
>   *  Initialise all variables in device structure.
>   */
>  static void
> @@ -258,17 +294,34 @@ init_device(struct virtio_net *dev)
>  	/* Set everything to 0. */

There is a trick here. Let me fill in the context first:

283 static void
284 init_device(struct virtio_net *dev)
285 {
286         uint64_t vq_offset;
287
288         /*
289          * Virtqueues have already been malloced so
290          * we don't want to set them to NULL.
291          */
292         vq_offset = offsetof(struct virtio_net, mem);
293
294         /* Set everything to 0. */
295         memset((void *)(uintptr_t)((uint64_t)(uintptr_t)dev + vq_offset), 0,
296                 (sizeof(struct virtio_net) - (size_t)vq_offset));
297
298         init_vring_queue_pair(dev, 0);

The intention of this code is to memset everything to zero except the
`virtqueue' field, since, as the comment states, the virtqueues have
already been allocated.

It works only when the `virtqueue' field comes before the `mem' field,
and it did before:

    struct virtio_net {
            struct vhost_virtqueue  *virtqueue[VIRTIO_QNUM];        /**< Contains all virtqueue information. */
            struct virtio_memory    *mem;           /**< QEMU memory and memory region information. */
            ...

After this patch, it becomes:

    struct virtio_net {
            struct virtio_memory    *mem;           /**< QEMU memory and memory region information. */
            struct vhost_virtqueue  **virtqueue;    /**< Contains all virtqueue information. */
            ...


Which actually wipes everything inside `struct virtio_net`, resulting
in `virtqueue' being set to NULL as well.

While reading the code (without your patch applied), I thought it was
error-prone, as it is very likely that someone besides the author
doesn't know such an undocumented rule. And you just gave me an example :)

Huawei, I'm proposing a fix to call rte_zmalloc() for allocating new_ll_dev
to get rid of this issue. What do you think?

	--yliu
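
For illustration, the idiom boils down to this minimal, self-contained
sketch; `example_dev' and its fields are stand-ins rather than the real
struct:

#include <stddef.h>
#include <string.h>

struct example_dev {
	void *virtqueue;	/* laid out before `mem': survives the wipe */
	void *mem;		/* offset boundary used by the memset */
	unsigned long features;	/* zeroed */
};

static void
init_example(struct example_dev *dev)
{
	size_t off = offsetof(struct example_dev, mem);

	/* Zero everything from `mem' onwards; reordering a pointer
	 * field to after `mem' silently gets it wiped to NULL. */
	memset((char *)dev + off, 0, sizeof(*dev) - off);
}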



>  	memset((void *)(uintptr_t)((uint64_t)(uintptr_t)dev + vq_offset), 0,
>  		(sizeof(struct virtio_net) - (size_t)vq_offset));
> -	memset(dev->virtqueue[VIRTIO_RXQ], 0, sizeof(struct vhost_virtqueue));
> -	memset(dev->virtqueue[VIRTIO_TXQ], 0, sizeof(struct vhost_virtqueue));
>  
> -	dev->virtqueue[VIRTIO_RXQ]->kickfd = (eventfd_t)-1;
> -	dev->virtqueue[VIRTIO_RXQ]->callfd = (eventfd_t)-1;
> -	dev->virtqueue[VIRTIO_TXQ]->kickfd = (eventfd_t)-1;
> -	dev->virtqueue[VIRTIO_TXQ]->callfd = (eventfd_t)-1;
> +	init_vring_queue_pair(dev, 0);
> +	dev->virt_qp_nb = 1;
> +}
> +
> +/*
> + *  Alloc mem for vring queue pair.
> + */
> +int
> +alloc_vring_queue_pair(struct virtio_net *dev, uint16_t qp_idx)
> +{
> +	struct vhost_virtqueue *virtqueue = NULL;
> +	uint32_t virt_rx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_RXQ;
> +	uint32_t virt_tx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_TXQ;
>  
> -	/* Backends are set to -1 indicating an inactive device. */
> -	dev->virtqueue[VIRTIO_RXQ]->backend = VIRTIO_DEV_STOPPED;
> -	dev->virtqueue[VIRTIO_TXQ]->backend = VIRTIO_DEV_STOPPED;
> +	virtqueue = rte_malloc(NULL, sizeof(struct vhost_virtqueue) * VIRTIO_QNUM, 0);
> +	if (virtqueue == NULL) {
> +		RTE_LOG(ERR, VHOST_CONFIG,
> +			"Failed to allocate memory for virt qp:%d.\n", qp_idx);
> +		return -1;
> +	}
> +
> +	dev->virtqueue[virt_rx_q_idx] = virtqueue;
> +	dev->virtqueue[virt_tx_q_idx] = virtqueue + VIRTIO_TXQ;
> +
> +	init_vring_queue_pair(dev, qp_idx);
> +
> +	return 0;
>  }
>  
>  /*
> @@ -280,7 +333,6 @@ static int
>  new_device(struct vhost_device_ctx ctx)
>  {
>  	struct virtio_net_config_ll *new_ll_dev;
> -	struct vhost_virtqueue *virtqueue_rx, *virtqueue_tx;
>  
>  	/* Setup device and virtqueues. */
>  	new_ll_dev = rte_malloc(NULL, sizeof(struct virtio_net_config_ll), 0);
> @@ -291,28 +343,22 @@ new_device(struct vhost_device_ctx ctx)
>  		return -1;
>  	}
>  
> -	virtqueue_rx = rte_malloc(NULL, sizeof(struct vhost_virtqueue), 0);
> -	if (virtqueue_rx == NULL) {
> -		rte_free(new_ll_dev);
> +	new_ll_dev->dev.virtqueue =
> +		rte_malloc(NULL, VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MAX * sizeof(struct vhost_virtqueue *), 0);
> +	if (new_ll_dev->dev.virtqueue == NULL) {
>  		RTE_LOG(ERR, VHOST_CONFIG,
> -			"(%"PRIu64") Failed to allocate memory for rxq.\n",
> +			"(%"PRIu64") Failed to allocate memory for dev.virtqueue.\n",
>  			ctx.fh);
> +		rte_free(new_ll_dev);
>  		return -1;
>  	}
>  
> -	virtqueue_tx = rte_malloc(NULL, sizeof(struct vhost_virtqueue), 0);
> -	if (virtqueue_tx == NULL) {
> -		rte_free(virtqueue_rx);
> +	if (alloc_vring_queue_pair(&new_ll_dev->dev, 0) == -1) {
> +		rte_free(new_ll_dev->dev.virtqueue);
>  		rte_free(new_ll_dev);
> -		RTE_LOG(ERR, VHOST_CONFIG,
> -			"(%"PRIu64") Failed to allocate memory for txq.\n",
> -			ctx.fh);
>  		return -1;
>  	}
>  
> -	new_ll_dev->dev.virtqueue[VIRTIO_RXQ] = virtqueue_rx;
> -	new_ll_dev->dev.virtqueue[VIRTIO_TXQ] = virtqueue_tx;
> -
>  	/* Initialise device and virtqueues. */
>  	init_device(&new_ll_dev->dev);
>  
> @@ -396,7 +442,7 @@ set_owner(struct vhost_device_ctx ctx)
>   * Called from CUSE IOCTL: VHOST_RESET_OWNER
>   */
>  static int
> -reset_owner(struct vhost_device_ctx ctx)
> +reset_owner(__rte_unused struct vhost_device_ctx ctx)
>  {
>  	struct virtio_net_config_ll *ll_dev;
>  
> @@ -434,6 +480,7 @@ static int
>  set_features(struct vhost_device_ctx ctx, uint64_t *pu)
>  {
>  	struct virtio_net *dev;
> +	uint32_t q_idx;
>  
>  	dev = get_device(ctx);
>  	if (dev == NULL)
> @@ -445,22 +492,26 @@ set_features(struct vhost_device_ctx ctx, uint64_t *pu)
>  	dev->features = *pu;
>  
>  	/* Set the vhost_hlen depending on if VIRTIO_NET_F_MRG_RXBUF is set. */
> -	if (dev->features & (1 << VIRTIO_NET_F_MRG_RXBUF)) {
> -		LOG_DEBUG(VHOST_CONFIG,
> -			"(%"PRIu64") Mergeable RX buffers enabled\n",
> -			dev->device_fh);
> -		dev->virtqueue[VIRTIO_RXQ]->vhost_hlen =
> -			sizeof(struct virtio_net_hdr_mrg_rxbuf);
> -		dev->virtqueue[VIRTIO_TXQ]->vhost_hlen =
> -			sizeof(struct virtio_net_hdr_mrg_rxbuf);
> -	} else {
> -		LOG_DEBUG(VHOST_CONFIG,
> -			"(%"PRIu64") Mergeable RX buffers disabled\n",
> -			dev->device_fh);
> -		dev->virtqueue[VIRTIO_RXQ]->vhost_hlen =
> -			sizeof(struct virtio_net_hdr);
> -		dev->virtqueue[VIRTIO_TXQ]->vhost_hlen =
> -			sizeof(struct virtio_net_hdr);
> +	for (q_idx = 0; q_idx < dev->virt_qp_nb; q_idx++) {
> +		uint32_t virt_rx_q_idx = q_idx * VIRTIO_QNUM + VIRTIO_RXQ;
> +		uint32_t virt_tx_q_idx = q_idx * VIRTIO_QNUM + VIRTIO_TXQ;
> +		if (dev->features & (1 << VIRTIO_NET_F_MRG_RXBUF)) {
> +			LOG_DEBUG(VHOST_CONFIG,
> +				"(%"PRIu64") Mergeable RX buffers enabled\n",
> +				dev->device_fh);
> +			dev->virtqueue[virt_rx_q_idx]->vhost_hlen =
> +				sizeof(struct virtio_net_hdr_mrg_rxbuf);
> +			dev->virtqueue[virt_tx_q_idx]->vhost_hlen =
> +				sizeof(struct virtio_net_hdr_mrg_rxbuf);
> +		} else {
> +			LOG_DEBUG(VHOST_CONFIG,
> +				"(%"PRIu64") Mergeable RX buffers disabled\n",
> +				dev->device_fh);
> +			dev->virtqueue[virt_rx_q_idx]->vhost_hlen =
> +				sizeof(struct virtio_net_hdr);
> +			dev->virtqueue[virt_tx_q_idx]->vhost_hlen =
> +				sizeof(struct virtio_net_hdr);
> +		}
>  	}
>  	return 0;
>  }
> @@ -826,6 +877,14 @@ int rte_vhost_feature_enable(uint64_t feature_mask)
>  	return -1;
>  }
>  
> +uint16_t rte_vhost_qp_num_get(struct virtio_net *dev)
> +{
> +	if (dev == NULL)
> +		return 0;
> +
> +	return dev->virt_qp_nb;
> +}
> +
>  /*
>   * Register ops so that we can add/remove device to data core.
>   */
> -- 
> 1.8.4.2
> 


* Re: [dpdk-dev] [PATCH v4 02/12] vhost: support multiple queues in virtio dev
  2015-08-19  3:52         ` Yuanhan Liu
@ 2015-08-19  5:54           ` Ouyang, Changchun
  2015-08-19  6:28             ` Yuanhan Liu
  0 siblings, 1 reply; 65+ messages in thread
From: Ouyang, Changchun @ 2015-08-19  5:54 UTC (permalink / raw)
  To: Yuanhan Liu; +Cc: dev

Hi Yuanhan,

> -----Original Message-----
> From: Yuanhan Liu [mailto:yuanhan.liu@linux.intel.com]
> Sent: Wednesday, August 19, 2015 11:53 AM
> To: Ouyang, Changchun
> Cc: dev@dpdk.org; Xie, Huawei
> Subject: Re: [dpdk-dev] [PATCH v4 02/12] vhost: support multiple queues in
> virtio dev
> 
> Hi Changchun,
> 
> On Wed, Aug 12, 2015 at 04:02:37PM +0800, Ouyang Changchun wrote:
> > Each virtio device could have multiple queues, say 2 or 4, at most 8.
> > Enabling this feature allows a virtio device/port on the guest to use
> > a different vCPU to receive/transmit packets from/to each queue.
> >
> > In multiple queues mode, virtio device readiness means all queues of
> > this virtio device are ready; cleanup/destroy of a virtio device also
> > requires clearing all queues belonging to it.
> >
> > Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
> > ---
> [snip ..]
> >  /*
> > + *  Initialise all variables in vring queue pair.
> > + */
> > +static void
> > +init_vring_queue_pair(struct virtio_net *dev, uint16_t qp_idx) {
> > +	uint32_t virt_rx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_RXQ;
> > +	uint32_t virt_tx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_TXQ;
> > +	memset(dev->virtqueue[virt_rx_q_idx], 0, sizeof(struct
> vhost_virtqueue));
> > +	memset(dev->virtqueue[virt_tx_q_idx], 0, sizeof(struct
> > +vhost_virtqueue));
> > +
> > +	dev->virtqueue[virt_rx_q_idx]->kickfd = (eventfd_t)-1;
> > +	dev->virtqueue[virt_rx_q_idx]->callfd = (eventfd_t)-1;
> > +	dev->virtqueue[virt_tx_q_idx]->kickfd = (eventfd_t)-1;
> > +	dev->virtqueue[virt_tx_q_idx]->callfd = (eventfd_t)-1;
> > +
> > +	/* Backends are set to -1 indicating an inactive device. */
> > +	dev->virtqueue[virt_rx_q_idx]->backend = VIRTIO_DEV_STOPPED;
> > +	dev->virtqueue[virt_tx_q_idx]->backend = VIRTIO_DEV_STOPPED; }
> > +
> > +/*
> >   *  Initialise all variables in device structure.
> >   */
> >  static void
> > @@ -258,17 +294,34 @@ init_device(struct virtio_net *dev)
> >  	/* Set everything to 0. */
> 
> There is a trick here. Let me fill the context first:
> 
> 283 static void
> 284 init_device(struct virtio_net *dev)
> 285 {
> 286         uint64_t vq_offset;
> 287
> 288         /*
> 289          * Virtqueues have already been malloced so
> 290          * we don't want to set them to NULL.
> 291          */
> 292         vq_offset = offsetof(struct virtio_net, mem);
> 293
> 294         /* Set everything to 0. */
> 295         memset((void *)(uintptr_t)((uint64_t)(uintptr_t)dev + vq_offset), 0,
> 296                 (sizeof(struct virtio_net) - (size_t)vq_offset));
> 297
> 298         init_vring_queue_pair(dev, 0);
> 
> This piece of code's intention is to memset everything to zero, except the
> `virtqueue' field, for, as the comment stated, we have already allocated
> virtqueue.
> 
> It works only when `virtqueue' field is before `mem' field, and it was
> before:
> 
>     struct virtio_net {
>             struct vhost_virtqueue  *virtqueue[VIRTIO_QNUM];        /**< Contains
> all virtqueue information. */
>             struct virtio_memory    *mem;           /**< QEMU memory and memory
> region information. */
>             ...
> 
> After this patch, it becomes:
> 
>     struct virtio_net {
>             struct virtio_memory    *mem;           /**< QEMU memory and memory
> region information. */
>             struct vhost_virtqueue  **virtqueue;    /**< Contains all virtqueue
> information. */
>             ...
> 
> 
> > Which actually wipes everything inside `struct virtio_net`, resulting
> > in `virtqueue' being set to NULL as well.
> > 
> > While reading the code (without your patch applied), I thought it was
> > error-prone, as it is very likely that someone besides the author
> > doesn't know such an undocumented rule. And you just gave me an example :)
> 
> Huawei, I'm proposing a fix to call rte_zmalloc() for allocating new_ll_dev to
> get rid of this issue. What do you think?
> 
> 	--yliu
> 
> 

I suggest you first review the later patch:
[PATCH v4 04/12] vhost: set memory layout for multiple queues mode.
After you finish reviewing that patch, I think you will change your mind :-)

That patch resolves your concern.

> 
> >  	memset((void *)(uintptr_t)((uint64_t)(uintptr_t)dev + vq_offset), 0,
> >  		(sizeof(struct virtio_net) - (size_t)vq_offset));
> > -	memset(dev->virtqueue[VIRTIO_RXQ], 0, sizeof(struct
> vhost_virtqueue));
> > -	memset(dev->virtqueue[VIRTIO_TXQ], 0, sizeof(struct
> vhost_virtqueue));
> >
> > -	dev->virtqueue[VIRTIO_RXQ]->kickfd = (eventfd_t)-1;
> > -	dev->virtqueue[VIRTIO_RXQ]->callfd = (eventfd_t)-1;
> > -	dev->virtqueue[VIRTIO_TXQ]->kickfd = (eventfd_t)-1;
> > -	dev->virtqueue[VIRTIO_TXQ]->callfd = (eventfd_t)-1;
> > +	init_vring_queue_pair(dev, 0);
> > +	dev->virt_qp_nb = 1;
> > +}
> > +
> > +/*
> > + *  Alloc mem for vring queue pair.
> > + */
> > +int
> > +alloc_vring_queue_pair(struct virtio_net *dev, uint16_t qp_idx) {
> > +	struct vhost_virtqueue *virtqueue = NULL;
> > +	uint32_t virt_rx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_RXQ;
> > +	uint32_t virt_tx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_TXQ;
> >
> > -	/* Backends are set to -1 indicating an inactive device. */
> > -	dev->virtqueue[VIRTIO_RXQ]->backend = VIRTIO_DEV_STOPPED;
> > -	dev->virtqueue[VIRTIO_TXQ]->backend = VIRTIO_DEV_STOPPED;
> > +	virtqueue = rte_malloc(NULL, sizeof(struct vhost_virtqueue) *
> VIRTIO_QNUM, 0);
> > +	if (virtqueue == NULL) {
> > +		RTE_LOG(ERR, VHOST_CONFIG,
> > +			"Failed to allocate memory for virt qp:%d.\n",
> qp_idx);
> > +		return -1;
> > +	}
> > +
> > +	dev->virtqueue[virt_rx_q_idx] = virtqueue;
> > +	dev->virtqueue[virt_tx_q_idx] = virtqueue + VIRTIO_TXQ;
> > +
> > +	init_vring_queue_pair(dev, qp_idx);
> > +
> > +	return 0;
> >  }
> >
> >  /*
> > @@ -280,7 +333,6 @@ static int
> >  new_device(struct vhost_device_ctx ctx)  {
> >  	struct virtio_net_config_ll *new_ll_dev;
> > -	struct vhost_virtqueue *virtqueue_rx, *virtqueue_tx;
> >
> >  	/* Setup device and virtqueues. */
> >  	new_ll_dev = rte_malloc(NULL, sizeof(struct virtio_net_config_ll),
> > 0); @@ -291,28 +343,22 @@ new_device(struct vhost_device_ctx ctx)
> >  		return -1;
> >  	}
> >
> > -	virtqueue_rx = rte_malloc(NULL, sizeof(struct vhost_virtqueue), 0);
> > -	if (virtqueue_rx == NULL) {
> > -		rte_free(new_ll_dev);
> > +	new_ll_dev->dev.virtqueue =
> > +		rte_malloc(NULL, VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MAX *
> sizeof(struct vhost_virtqueue *), 0);
> > +	if (new_ll_dev->dev.virtqueue == NULL) {
> >  		RTE_LOG(ERR, VHOST_CONFIG,
> > -			"(%"PRIu64") Failed to allocate memory for rxq.\n",
> > +			"(%"PRIu64") Failed to allocate memory for
> dev.virtqueue.\n",
> >  			ctx.fh);
> > +		rte_free(new_ll_dev);
> >  		return -1;
> >  	}
> >
> > -	virtqueue_tx = rte_malloc(NULL, sizeof(struct vhost_virtqueue), 0);
> > -	if (virtqueue_tx == NULL) {
> > -		rte_free(virtqueue_rx);
> > +	if (alloc_vring_queue_pair(&new_ll_dev->dev, 0) == -1) {
> > +		rte_free(new_ll_dev->dev.virtqueue);
> >  		rte_free(new_ll_dev);
> > -		RTE_LOG(ERR, VHOST_CONFIG,
> > -			"(%"PRIu64") Failed to allocate memory for txq.\n",
> > -			ctx.fh);
> >  		return -1;
> >  	}
> >
> > -	new_ll_dev->dev.virtqueue[VIRTIO_RXQ] = virtqueue_rx;
> > -	new_ll_dev->dev.virtqueue[VIRTIO_TXQ] = virtqueue_tx;
> > -
> >  	/* Initialise device and virtqueues. */
> >  	init_device(&new_ll_dev->dev);
> >
> > @@ -396,7 +442,7 @@ set_owner(struct vhost_device_ctx ctx)
> >   * Called from CUSE IOCTL: VHOST_RESET_OWNER
> >   */
> >  static int
> > -reset_owner(struct vhost_device_ctx ctx)
> > +reset_owner(__rte_unused struct vhost_device_ctx ctx)
> >  {
> >  	struct virtio_net_config_ll *ll_dev;
> >
> > @@ -434,6 +480,7 @@ static int
> >  set_features(struct vhost_device_ctx ctx, uint64_t *pu)  {
> >  	struct virtio_net *dev;
> > +	uint32_t q_idx;
> >
> >  	dev = get_device(ctx);
> >  	if (dev == NULL)
> > @@ -445,22 +492,26 @@ set_features(struct vhost_device_ctx ctx,
> uint64_t *pu)
> >  	dev->features = *pu;
> >
> >  	/* Set the vhost_hlen depending on if VIRTIO_NET_F_MRG_RXBUF
> is set. */
> > -	if (dev->features & (1 << VIRTIO_NET_F_MRG_RXBUF)) {
> > -		LOG_DEBUG(VHOST_CONFIG,
> > -			"(%"PRIu64") Mergeable RX buffers enabled\n",
> > -			dev->device_fh);
> > -		dev->virtqueue[VIRTIO_RXQ]->vhost_hlen =
> > -			sizeof(struct virtio_net_hdr_mrg_rxbuf);
> > -		dev->virtqueue[VIRTIO_TXQ]->vhost_hlen =
> > -			sizeof(struct virtio_net_hdr_mrg_rxbuf);
> > -	} else {
> > -		LOG_DEBUG(VHOST_CONFIG,
> > -			"(%"PRIu64") Mergeable RX buffers disabled\n",
> > -			dev->device_fh);
> > -		dev->virtqueue[VIRTIO_RXQ]->vhost_hlen =
> > -			sizeof(struct virtio_net_hdr);
> > -		dev->virtqueue[VIRTIO_TXQ]->vhost_hlen =
> > -			sizeof(struct virtio_net_hdr);
> > +	for (q_idx = 0; q_idx < dev->virt_qp_nb; q_idx++) {
> > +		uint32_t virt_rx_q_idx = q_idx * VIRTIO_QNUM +
> VIRTIO_RXQ;
> > +		uint32_t virt_tx_q_idx = q_idx * VIRTIO_QNUM +
> VIRTIO_TXQ;
> > +		if (dev->features & (1 << VIRTIO_NET_F_MRG_RXBUF)) {
> > +			LOG_DEBUG(VHOST_CONFIG,
> > +				"(%"PRIu64") Mergeable RX buffers
> enabled\n",
> > +				dev->device_fh);
> > +			dev->virtqueue[virt_rx_q_idx]->vhost_hlen =
> > +				sizeof(struct virtio_net_hdr_mrg_rxbuf);
> > +			dev->virtqueue[virt_tx_q_idx]->vhost_hlen =
> > +				sizeof(struct virtio_net_hdr_mrg_rxbuf);
> > +		} else {
> > +			LOG_DEBUG(VHOST_CONFIG,
> > +				"(%"PRIu64") Mergeable RX buffers
> disabled\n",
> > +				dev->device_fh);
> > +			dev->virtqueue[virt_rx_q_idx]->vhost_hlen =
> > +				sizeof(struct virtio_net_hdr);
> > +			dev->virtqueue[virt_tx_q_idx]->vhost_hlen =
> > +				sizeof(struct virtio_net_hdr);
> > +		}
> >  	}
> >  	return 0;
> >  }
> > @@ -826,6 +877,14 @@ int rte_vhost_feature_enable(uint64_t
> feature_mask)
> >  	return -1;
> >  }
> >
> > +uint16_t rte_vhost_qp_num_get(struct virtio_net *dev) {
> > +	if (dev == NULL)
> > +		return 0;
> > +
> > +	return dev->virt_qp_nb;
> > +}
> > +
> >  /*
> >   * Register ops so that we can add/remove device to data core.
> >   */
> > --
> > 1.8.4.2
> >


* Re: [dpdk-dev] [PATCH v4 02/12] vhost: support multiple queues in virtio dev
  2015-08-19  5:54           ` Ouyang, Changchun
@ 2015-08-19  6:28             ` Yuanhan Liu
  2015-08-19  6:39               ` Yuanhan Liu
  0 siblings, 1 reply; 65+ messages in thread
From: Yuanhan Liu @ 2015-08-19  6:28 UTC (permalink / raw)
  To: Ouyang, Changchun; +Cc: dev

On Wed, Aug 19, 2015 at 05:54:09AM +0000, Ouyang, Changchun wrote:
> Hi Yuanhan,
> 
> > -----Original Message-----
> > From: Yuanhan Liu [mailto:yuanhan.liu@linux.intel.com]
> > Sent: Wednesday, August 19, 2015 11:53 AM
> > To: Ouyang, Changchun
> > Cc: dev@dpdk.org; Xie, Huawei
> > Subject: Re: [dpdk-dev] [PATCH v4 02/12] vhost: support multiple queues in
> > virtio dev
> > 
> > Hi Changchun,
> > 
> > On Wed, Aug 12, 2015 at 04:02:37PM +0800, Ouyang Changchun wrote:
> > > Each virtio device could have multiple queues, say 2 or 4, at most 8.
> > > Enabling this feature allows a virtio device/port on the guest to use
> > > a different vCPU to receive/transmit packets from/to each queue.
> > >
> > > In multiple queues mode, virtio device readiness means all queues of
> > > this virtio device are ready; cleanup/destroy of a virtio device also
> > > requires clearing all queues belonging to it.
> > >
> > > Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
> > > ---
> > [snip ..]
> > >  /*
> > > + *  Initialise all variables in vring queue pair.
> > > + */
> > > +static void
> > > +init_vring_queue_pair(struct virtio_net *dev, uint16_t qp_idx) {
> > > +	uint32_t virt_rx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_RXQ;
> > > +	uint32_t virt_tx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_TXQ;
> > > +	memset(dev->virtqueue[virt_rx_q_idx], 0, sizeof(struct
> > vhost_virtqueue));
> > > +	memset(dev->virtqueue[virt_tx_q_idx], 0, sizeof(struct
> > > +vhost_virtqueue));
> > > +
> > > +	dev->virtqueue[virt_rx_q_idx]->kickfd = (eventfd_t)-1;
> > > +	dev->virtqueue[virt_rx_q_idx]->callfd = (eventfd_t)-1;
> > > +	dev->virtqueue[virt_tx_q_idx]->kickfd = (eventfd_t)-1;
> > > +	dev->virtqueue[virt_tx_q_idx]->callfd = (eventfd_t)-1;
> > > +
> > > +	/* Backends are set to -1 indicating an inactive device. */
> > > +	dev->virtqueue[virt_rx_q_idx]->backend = VIRTIO_DEV_STOPPED;
> > > +	dev->virtqueue[virt_tx_q_idx]->backend = VIRTIO_DEV_STOPPED; }
> > > +
> > > +/*
> > >   *  Initialise all variables in device structure.
> > >   */
> > >  static void
> > > @@ -258,17 +294,34 @@ init_device(struct virtio_net *dev)
> > >  	/* Set everything to 0. */
> > 
> > There is a trick here. Let me fill the context first:
> > 
> > 283 static void
> > 284 init_device(struct virtio_net *dev)
> > 285 {
> > 286         uint64_t vq_offset;
> > 287
> > 288         /*
> > 289          * Virtqueues have already been malloced so
> > 290          * we don't want to set them to NULL.
> > 291          */
> > 292         vq_offset = offsetof(struct virtio_net, mem);
> > 293
> > 294         /* Set everything to 0. */
> > 295         memset((void *)(uintptr_t)((uint64_t)(uintptr_t)dev + vq_offset), 0,
> > 296                 (sizeof(struct virtio_net) - (size_t)vq_offset));
> > 297
> > 298         init_vring_queue_pair(dev, 0);
> > 
> > This piece of code's intention is to memset everything to zero, except the
> > `virtqueue' field, for, as the comment stated, we have already allocated
> > virtqueue.
> > 
> > It works only when `virtqueue' field is before `mem' field, and it was
> > before:
> > 
> >     struct virtio_net {
> >             struct vhost_virtqueue  *virtqueue[VIRTIO_QNUM];        /**< Contains
> > all virtqueue information. */
> >             struct virtio_memory    *mem;           /**< QEMU memory and memory
> > region information. */
> >             ...
> > 
> > After this patch, it becomes:
> > 
> >     struct virtio_net {
> >             struct virtio_memory    *mem;           /**< QEMU memory and memory
> > region information. */
> >             struct vhost_virtqueue  **virtqueue;    /**< Contains all virtqueue
> > information. */
> >             ...
> > 
> > 
> > > Which actually wipes everything inside `struct virtio_net`, resulting
> > > in `virtqueue' being set to NULL as well.
> > > 
> > > While reading the code (without your patch applied), I thought it was
> > > error-prone, as it is very likely that someone besides the author
> > > doesn't know such an undocumented rule. And you just gave me an example :)
> > 
> > Huawei, I'm proposing a fix to call rte_zmalloc() for allocating new_ll_dev to
> > get rid of this issue. What do you think?
> > 
> > 	--yliu
> > 
> > 
> 
> I suggest you first review the later patch:
> [PATCH v4 04/12] vhost: set memory layout for multiple queues mode.
> After you finish reviewing that patch, I think you will change your mind :-)
> 
> That patch resolves your concern.

Sorry, I hadn't gone that far yet. And yes, I see. I found that you
moved the barrier to the `features' field, which puts the `virtqueue'
field back in the "do not set to zero" zone.

It's still an undocumented rule, and hence error-prone, IMO. But you
reminded me that init_device() is also invoked elsewhere (reset_owner()).
Hence, my solution won't work either.

I'm thinking of saving `virtqueue' (and `mem_arr' from patch 04/12)
explicitly before the memset() and restoring them afterwards, to get rid
of the undocumented rule. It may become uglier as more and more fields
need to be saved this way, but given that we have only two fields so far,
I'm kind of okay with that.

What do you think then? If that doesn't work, we should add some comments
inside the virtio_net struct at least, or even add a build time check.

	--yliu
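
A minimal sketch of that save/restore idea (names follow the patch;
`mem_arr' handling and error paths omitted; see the follow-up below,
where the idea is withdrawn):

static void
init_device_save_restore(struct virtio_net *dev)
{
	/* Save pointers that must survive the wipe... */
	struct vhost_virtqueue **vq = dev->virtqueue;

	/* ...wipe the whole struct, so no field-ordering rule is needed... */
	memset(dev, 0, sizeof(*dev));

	/* ...and restore them afterwards. */
	dev->virtqueue = vq;

	init_vring_queue_pair(dev, 0);
	dev->virt_qp_nb = 1;
}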

> > 
> > >  	memset((void *)(uintptr_t)((uint64_t)(uintptr_t)dev + vq_offset), 0,
> > >  		(sizeof(struct virtio_net) - (size_t)vq_offset));
> > > -	memset(dev->virtqueue[VIRTIO_RXQ], 0, sizeof(struct
> > vhost_virtqueue));
> > > -	memset(dev->virtqueue[VIRTIO_TXQ], 0, sizeof(struct
> > vhost_virtqueue));
> > >
> > > -	dev->virtqueue[VIRTIO_RXQ]->kickfd = (eventfd_t)-1;
> > > -	dev->virtqueue[VIRTIO_RXQ]->callfd = (eventfd_t)-1;
> > > -	dev->virtqueue[VIRTIO_TXQ]->kickfd = (eventfd_t)-1;
> > > -	dev->virtqueue[VIRTIO_TXQ]->callfd = (eventfd_t)-1;
> > > +	init_vring_queue_pair(dev, 0);
> > > +	dev->virt_qp_nb = 1;
> > > +}
> > > +
> > > +/*
> > > + *  Alloc mem for vring queue pair.
> > > + */
> > > +int
> > > +alloc_vring_queue_pair(struct virtio_net *dev, uint16_t qp_idx) {
> > > +	struct vhost_virtqueue *virtqueue = NULL;
> > > +	uint32_t virt_rx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_RXQ;
> > > +	uint32_t virt_tx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_TXQ;
> > >
> > > -	/* Backends are set to -1 indicating an inactive device. */
> > > -	dev->virtqueue[VIRTIO_RXQ]->backend = VIRTIO_DEV_STOPPED;
> > > -	dev->virtqueue[VIRTIO_TXQ]->backend = VIRTIO_DEV_STOPPED;
> > > +	virtqueue = rte_malloc(NULL, sizeof(struct vhost_virtqueue) *
> > VIRTIO_QNUM, 0);
> > > +	if (virtqueue == NULL) {
> > > +		RTE_LOG(ERR, VHOST_CONFIG,
> > > +			"Failed to allocate memory for virt qp:%d.\n",
> > qp_idx);
> > > +		return -1;
> > > +	}
> > > +
> > > +	dev->virtqueue[virt_rx_q_idx] = virtqueue;
> > > +	dev->virtqueue[virt_tx_q_idx] = virtqueue + VIRTIO_TXQ;
> > > +
> > > +	init_vring_queue_pair(dev, qp_idx);
> > > +
> > > +	return 0;
> > >  }
> > >
> > >  /*
> > > @@ -280,7 +333,6 @@ static int
> > >  new_device(struct vhost_device_ctx ctx)  {
> > >  	struct virtio_net_config_ll *new_ll_dev;
> > > -	struct vhost_virtqueue *virtqueue_rx, *virtqueue_tx;
> > >
> > >  	/* Setup device and virtqueues. */
> > >  	new_ll_dev = rte_malloc(NULL, sizeof(struct virtio_net_config_ll),
> > > 0); @@ -291,28 +343,22 @@ new_device(struct vhost_device_ctx ctx)
> > >  		return -1;
> > >  	}
> > >
> > > -	virtqueue_rx = rte_malloc(NULL, sizeof(struct vhost_virtqueue), 0);
> > > -	if (virtqueue_rx == NULL) {
> > > -		rte_free(new_ll_dev);
> > > +	new_ll_dev->dev.virtqueue =
> > > +		rte_malloc(NULL, VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MAX *
> > sizeof(struct vhost_virtqueue *), 0);
> > > +	if (new_ll_dev->dev.virtqueue == NULL) {
> > >  		RTE_LOG(ERR, VHOST_CONFIG,
> > > -			"(%"PRIu64") Failed to allocate memory for rxq.\n",
> > > +			"(%"PRIu64") Failed to allocate memory for
> > dev.virtqueue.\n",
> > >  			ctx.fh);
> > > +		rte_free(new_ll_dev);
> > >  		return -1;
> > >  	}
> > >
> > > -	virtqueue_tx = rte_malloc(NULL, sizeof(struct vhost_virtqueue), 0);
> > > -	if (virtqueue_tx == NULL) {
> > > -		rte_free(virtqueue_rx);
> > > +	if (alloc_vring_queue_pair(&new_ll_dev->dev, 0) == -1) {
> > > +		rte_free(new_ll_dev->dev.virtqueue);
> > >  		rte_free(new_ll_dev);
> > > -		RTE_LOG(ERR, VHOST_CONFIG,
> > > -			"(%"PRIu64") Failed to allocate memory for txq.\n",
> > > -			ctx.fh);
> > >  		return -1;
> > >  	}
> > >
> > > -	new_ll_dev->dev.virtqueue[VIRTIO_RXQ] = virtqueue_rx;
> > > -	new_ll_dev->dev.virtqueue[VIRTIO_TXQ] = virtqueue_tx;
> > > -
> > >  	/* Initialise device and virtqueues. */
> > >  	init_device(&new_ll_dev->dev);
> > >
> > > @@ -396,7 +442,7 @@ set_owner(struct vhost_device_ctx ctx)
> > >   * Called from CUSE IOCTL: VHOST_RESET_OWNER
> > >   */
> > >  static int
> > > -reset_owner(struct vhost_device_ctx ctx)
> > > +reset_owner(__rte_unused struct vhost_device_ctx ctx)
> > >  {
> > >  	struct virtio_net_config_ll *ll_dev;
> > >
> > > @@ -434,6 +480,7 @@ static int
> > >  set_features(struct vhost_device_ctx ctx, uint64_t *pu)  {
> > >  	struct virtio_net *dev;
> > > +	uint32_t q_idx;
> > >
> > >  	dev = get_device(ctx);
> > >  	if (dev == NULL)
> > > @@ -445,22 +492,26 @@ set_features(struct vhost_device_ctx ctx,
> > uint64_t *pu)
> > >  	dev->features = *pu;
> > >
> > >  	/* Set the vhost_hlen depending on if VIRTIO_NET_F_MRG_RXBUF
> > is set. */
> > > -	if (dev->features & (1 << VIRTIO_NET_F_MRG_RXBUF)) {
> > > -		LOG_DEBUG(VHOST_CONFIG,
> > > -			"(%"PRIu64") Mergeable RX buffers enabled\n",
> > > -			dev->device_fh);
> > > -		dev->virtqueue[VIRTIO_RXQ]->vhost_hlen =
> > > -			sizeof(struct virtio_net_hdr_mrg_rxbuf);
> > > -		dev->virtqueue[VIRTIO_TXQ]->vhost_hlen =
> > > -			sizeof(struct virtio_net_hdr_mrg_rxbuf);
> > > -	} else {
> > > -		LOG_DEBUG(VHOST_CONFIG,
> > > -			"(%"PRIu64") Mergeable RX buffers disabled\n",
> > > -			dev->device_fh);
> > > -		dev->virtqueue[VIRTIO_RXQ]->vhost_hlen =
> > > -			sizeof(struct virtio_net_hdr);
> > > -		dev->virtqueue[VIRTIO_TXQ]->vhost_hlen =
> > > -			sizeof(struct virtio_net_hdr);
> > > +	for (q_idx = 0; q_idx < dev->virt_qp_nb; q_idx++) {
> > > +		uint32_t virt_rx_q_idx = q_idx * VIRTIO_QNUM +
> > VIRTIO_RXQ;
> > > +		uint32_t virt_tx_q_idx = q_idx * VIRTIO_QNUM +
> > VIRTIO_TXQ;
> > > +		if (dev->features & (1 << VIRTIO_NET_F_MRG_RXBUF)) {
> > > +			LOG_DEBUG(VHOST_CONFIG,
> > > +				"(%"PRIu64") Mergeable RX buffers
> > enabled\n",
> > > +				dev->device_fh);
> > > +			dev->virtqueue[virt_rx_q_idx]->vhost_hlen =
> > > +				sizeof(struct virtio_net_hdr_mrg_rxbuf);
> > > +			dev->virtqueue[virt_tx_q_idx]->vhost_hlen =
> > > +				sizeof(struct virtio_net_hdr_mrg_rxbuf);
> > > +		} else {
> > > +			LOG_DEBUG(VHOST_CONFIG,
> > > +				"(%"PRIu64") Mergeable RX buffers
> > disabled\n",
> > > +				dev->device_fh);
> > > +			dev->virtqueue[virt_rx_q_idx]->vhost_hlen =
> > > +				sizeof(struct virtio_net_hdr);
> > > +			dev->virtqueue[virt_tx_q_idx]->vhost_hlen =
> > > +				sizeof(struct virtio_net_hdr);
> > > +		}
> > >  	}
> > >  	return 0;
> > >  }
> > > @@ -826,6 +877,14 @@ int rte_vhost_feature_enable(uint64_t
> > feature_mask)
> > >  	return -1;
> > >  }
> > >
> > > +uint16_t rte_vhost_qp_num_get(struct virtio_net *dev) {
> > > +	if (dev == NULL)
> > > +		return 0;
> > > +
> > > +	return dev->virt_qp_nb;
> > > +}
> > > +
> > >  /*
> > >   * Register ops so that we can add/remove device to data core.
> > >   */
> > > --
> > > 1.8.4.2
> > >


* Re: [dpdk-dev] [PATCH v4 02/12] vhost: support multiple queues in virtio dev
  2015-08-19  6:28             ` Yuanhan Liu
@ 2015-08-19  6:39               ` Yuanhan Liu
  0 siblings, 0 replies; 65+ messages in thread
From: Yuanhan Liu @ 2015-08-19  6:39 UTC (permalink / raw)
  To: Ouyang, Changchun; +Cc: dev

On Wed, Aug 19, 2015 at 02:28:51PM +0800, Yuanhan Liu wrote:
> On Wed, Aug 19, 2015 at 05:54:09AM +0000, Ouyang, Changchun wrote:
> > Hi Yuanhan,
> > 
> > > -----Original Message-----
> > > From: Yuanhan Liu [mailto:yuanhan.liu@linux.intel.com]
> > > Sent: Wednesday, August 19, 2015 11:53 AM
> > > To: Ouyang, Changchun
> > > Cc: dev@dpdk.org; Xie, Huawei
> > > Subject: Re: [dpdk-dev] [PATCH v4 02/12] vhost: support multiple queues in
> > > virtio dev
> > > 
> > > Hi Changchun,
> > > 
> > > On Wed, Aug 12, 2015 at 04:02:37PM +0800, Ouyang Changchun wrote:
> > > > Each virtio device could have multiple queues, say 2 or 4, at most 8.
> > > > Enabling this feature allows a virtio device/port on the guest to use
> > > > a different vCPU to receive/transmit packets from/to each queue.
> > > >
> > > > In multiple queues mode, virtio device readiness means all queues of
> > > > this virtio device are ready; cleanup/destroy of a virtio device also
> > > > requires clearing all queues belonging to it.
> > > >
> > > > Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
> > > > ---
> > > [snip ..]
> > > >  /*
> > > > + *  Initialise all variables in vring queue pair.
> > > > + */
> > > > +static void
> > > > +init_vring_queue_pair(struct virtio_net *dev, uint16_t qp_idx) {
> > > > +	uint32_t virt_rx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_RXQ;
> > > > +	uint32_t virt_tx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_TXQ;
> > > > +	memset(dev->virtqueue[virt_rx_q_idx], 0, sizeof(struct
> > > vhost_virtqueue));
> > > > +	memset(dev->virtqueue[virt_tx_q_idx], 0, sizeof(struct
> > > > +vhost_virtqueue));
> > > > +
> > > > +	dev->virtqueue[virt_rx_q_idx]->kickfd = (eventfd_t)-1;
> > > > +	dev->virtqueue[virt_rx_q_idx]->callfd = (eventfd_t)-1;
> > > > +	dev->virtqueue[virt_tx_q_idx]->kickfd = (eventfd_t)-1;
> > > > +	dev->virtqueue[virt_tx_q_idx]->callfd = (eventfd_t)-1;
> > > > +
> > > > +	/* Backends are set to -1 indicating an inactive device. */
> > > > +	dev->virtqueue[virt_rx_q_idx]->backend = VIRTIO_DEV_STOPPED;
> > > > +	dev->virtqueue[virt_tx_q_idx]->backend = VIRTIO_DEV_STOPPED; }
> > > > +
> > > > +/*
> > > >   *  Initialise all variables in device structure.
> > > >   */
> > > >  static void
> > > > @@ -258,17 +294,34 @@ init_device(struct virtio_net *dev)
> > > >  	/* Set everything to 0. */
> > > 
> > > There is a trick here. Let me fill the context first:
> > > 
> > > 283 static void
> > > 284 init_device(struct virtio_net *dev)
> > > 285 {
> > > 286         uint64_t vq_offset;
> > > 287
> > > 288         /*
> > > 289          * Virtqueues have already been malloced so
> > > 290          * we don't want to set them to NULL.
> > > 291          */
> > > 292         vq_offset = offsetof(struct virtio_net, mem);
> > > 293
> > > 294         /* Set everything to 0. */
> > > 295         memset((void *)(uintptr_t)((uint64_t)(uintptr_t)dev + vq_offset), 0,
> > > 296                 (sizeof(struct virtio_net) - (size_t)vq_offset));
> > > 297
> > > 298         init_vring_queue_pair(dev, 0);
> > > 
> > > This piece of code's intention is to memset everything to zero, except the
> > > `virtqueue' field, for, as the comment stated, we have already allocated
> > > virtqueue.
> > > 
> > > It works only when `virtqueue' field is before `mem' field, and it was
> > > before:
> > > 
> > >     struct virtio_net {
> > >             struct vhost_virtqueue  *virtqueue[VIRTIO_QNUM];        /**< Contains
> > > all virtqueue information. */
> > >             struct virtio_memory    *mem;           /**< QEMU memory and memory
> > > region information. */
> > >             ...
> > > 
> > > After this patch, it becomes:
> > > 
> > >     struct virtio_net {
> > >             struct virtio_memory    *mem;           /**< QEMU memory and memory
> > > region information. */
> > >             struct vhost_virtqueue  **virtqueue;    /**< Contains all virtqueue
> > > information. */
> > >             ...
> > > 
> > > 
> > > Which actually wipes everything inside `struct virtio_net`, resulting
> > > in `virtqueue' being set to NULL as well.
> > > 
> > > While reading the code (without your patch applied), I thought it was
> > > error-prone, as it is very likely that someone besides the author
> > > doesn't know such an undocumented rule. And you just gave me an example :)
> > > 
> > > Huawei, I'm proposing a fix to call rte_zmalloc() for allocating new_ll_dev to
> > > get rid of this issue. What do you think?
> > > 
> > > 	--yliu
> > > 
> > > 
> > 
> > I suggest you first review the later patch:
> > [PATCH v4 04/12] vhost: set memory layout for multiple queues mode.
> > After you finish reviewing that patch, I think you will change your mind :-)
> > 
> > That patch resolves your concern.
> 
> Sorry, I hadn't gone that far yet. And yes, I see. I found that you
> moved the barrier to the `features' field, which puts the `virtqueue'
> field back in the "do not set to zero" zone.
> 
> It's still an undocumented rule, and hence error-prone, IMO. But you
> reminded me that init_device() is also invoked elsewhere (reset_owner()).
> Hence, my solution won't work either.
> 
> I'm thinking of saving `virtqueue' (and `mem_arr' from patch 04/12)
> explicitly before the memset() and restoring them afterwards, to get rid
> of the undocumented rule. It may become uglier as more and more fields
> need to be saved this way, but given that we have only two fields so far,
> I'm kind of okay with that.

I thought about it again. Nah, let's forget that. It's not flexible for
array fields like "virtqueue[]".

	--yliu
> 
> What do you think then? If that doesn't work, we should add some comments
> inside the virtio_net struct at least, or even add a build time check.
> 
> 	--yliu
> 
> > > 
> > > >  	memset((void *)(uintptr_t)((uint64_t)(uintptr_t)dev + vq_offset), 0,
> > > >  		(sizeof(struct virtio_net) - (size_t)vq_offset));
> > > > -	memset(dev->virtqueue[VIRTIO_RXQ], 0, sizeof(struct
> > > vhost_virtqueue));
> > > > -	memset(dev->virtqueue[VIRTIO_TXQ], 0, sizeof(struct
> > > vhost_virtqueue));
> > > >
> > > > -	dev->virtqueue[VIRTIO_RXQ]->kickfd = (eventfd_t)-1;
> > > > -	dev->virtqueue[VIRTIO_RXQ]->callfd = (eventfd_t)-1;
> > > > -	dev->virtqueue[VIRTIO_TXQ]->kickfd = (eventfd_t)-1;
> > > > -	dev->virtqueue[VIRTIO_TXQ]->callfd = (eventfd_t)-1;
> > > > +	init_vring_queue_pair(dev, 0);
> > > > +	dev->virt_qp_nb = 1;
> > > > +}
> > > > +
> > > > +/*
> > > > + *  Alloc mem for vring queue pair.
> > > > + */
> > > > +int
> > > > +alloc_vring_queue_pair(struct virtio_net *dev, uint16_t qp_idx) {
> > > > +	struct vhost_virtqueue *virtqueue = NULL;
> > > > +	uint32_t virt_rx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_RXQ;
> > > > +	uint32_t virt_tx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_TXQ;
> > > >
> > > > -	/* Backends are set to -1 indicating an inactive device. */
> > > > -	dev->virtqueue[VIRTIO_RXQ]->backend = VIRTIO_DEV_STOPPED;
> > > > -	dev->virtqueue[VIRTIO_TXQ]->backend = VIRTIO_DEV_STOPPED;
> > > > +	virtqueue = rte_malloc(NULL, sizeof(struct vhost_virtqueue) *
> > > VIRTIO_QNUM, 0);
> > > > +	if (virtqueue == NULL) {
> > > > +		RTE_LOG(ERR, VHOST_CONFIG,
> > > > +			"Failed to allocate memory for virt qp:%d.\n",
> > > qp_idx);
> > > > +		return -1;
> > > > +	}
> > > > +
> > > > +	dev->virtqueue[virt_rx_q_idx] = virtqueue;
> > > > +	dev->virtqueue[virt_tx_q_idx] = virtqueue + VIRTIO_TXQ;
> > > > +
> > > > +	init_vring_queue_pair(dev, qp_idx);
> > > > +
> > > > +	return 0;
> > > >  }
> > > >
> > > >  /*
> > > > @@ -280,7 +333,6 @@ static int
> > > >  new_device(struct vhost_device_ctx ctx)  {
> > > >  	struct virtio_net_config_ll *new_ll_dev;
> > > > -	struct vhost_virtqueue *virtqueue_rx, *virtqueue_tx;
> > > >
> > > >  	/* Setup device and virtqueues. */
> > > >  	new_ll_dev = rte_malloc(NULL, sizeof(struct virtio_net_config_ll),
> > > > 0); @@ -291,28 +343,22 @@ new_device(struct vhost_device_ctx ctx)
> > > >  		return -1;
> > > >  	}
> > > >
> > > > -	virtqueue_rx = rte_malloc(NULL, sizeof(struct vhost_virtqueue), 0);
> > > > -	if (virtqueue_rx == NULL) {
> > > > -		rte_free(new_ll_dev);
> > > > +	new_ll_dev->dev.virtqueue =
> > > > +		rte_malloc(NULL, VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MAX *
> > > sizeof(struct vhost_virtqueue *), 0);
> > > > +	if (new_ll_dev->dev.virtqueue == NULL) {
> > > >  		RTE_LOG(ERR, VHOST_CONFIG,
> > > > -			"(%"PRIu64") Failed to allocate memory for rxq.\n",
> > > > +			"(%"PRIu64") Failed to allocate memory for
> > > dev.virtqueue.\n",
> > > >  			ctx.fh);
> > > > +		rte_free(new_ll_dev);
> > > >  		return -1;
> > > >  	}
> > > >
> > > > -	virtqueue_tx = rte_malloc(NULL, sizeof(struct vhost_virtqueue), 0);
> > > > -	if (virtqueue_tx == NULL) {
> > > > -		rte_free(virtqueue_rx);
> > > > +	if (alloc_vring_queue_pair(&new_ll_dev->dev, 0) == -1) {
> > > > +		rte_free(new_ll_dev->dev.virtqueue);
> > > >  		rte_free(new_ll_dev);
> > > > -		RTE_LOG(ERR, VHOST_CONFIG,
> > > > -			"(%"PRIu64") Failed to allocate memory for txq.\n",
> > > > -			ctx.fh);
> > > >  		return -1;
> > > >  	}
> > > >
> > > > -	new_ll_dev->dev.virtqueue[VIRTIO_RXQ] = virtqueue_rx;
> > > > -	new_ll_dev->dev.virtqueue[VIRTIO_TXQ] = virtqueue_tx;
> > > > -
> > > >  	/* Initialise device and virtqueues. */
> > > >  	init_device(&new_ll_dev->dev);
> > > >
> > > > @@ -396,7 +442,7 @@ set_owner(struct vhost_device_ctx ctx)
> > > >   * Called from CUSE IOCTL: VHOST_RESET_OWNER
> > > >   */
> > > >  static int
> > > > -reset_owner(struct vhost_device_ctx ctx)
> > > > +reset_owner(__rte_unused struct vhost_device_ctx ctx)
> > > >  {
> > > >  	struct virtio_net_config_ll *ll_dev;
> > > >
> > > > @@ -434,6 +480,7 @@ static int
> > > >  set_features(struct vhost_device_ctx ctx, uint64_t *pu)  {
> > > >  	struct virtio_net *dev;
> > > > +	uint32_t q_idx;
> > > >
> > > >  	dev = get_device(ctx);
> > > >  	if (dev == NULL)
> > > > @@ -445,22 +492,26 @@ set_features(struct vhost_device_ctx ctx,
> > > uint64_t *pu)
> > > >  	dev->features = *pu;
> > > >
> > > >  	/* Set the vhost_hlen depending on if VIRTIO_NET_F_MRG_RXBUF
> > > is set. */
> > > > -	if (dev->features & (1 << VIRTIO_NET_F_MRG_RXBUF)) {
> > > > -		LOG_DEBUG(VHOST_CONFIG,
> > > > -			"(%"PRIu64") Mergeable RX buffers enabled\n",
> > > > -			dev->device_fh);
> > > > -		dev->virtqueue[VIRTIO_RXQ]->vhost_hlen =
> > > > -			sizeof(struct virtio_net_hdr_mrg_rxbuf);
> > > > -		dev->virtqueue[VIRTIO_TXQ]->vhost_hlen =
> > > > -			sizeof(struct virtio_net_hdr_mrg_rxbuf);
> > > > -	} else {
> > > > -		LOG_DEBUG(VHOST_CONFIG,
> > > > -			"(%"PRIu64") Mergeable RX buffers disabled\n",
> > > > -			dev->device_fh);
> > > > -		dev->virtqueue[VIRTIO_RXQ]->vhost_hlen =
> > > > -			sizeof(struct virtio_net_hdr);
> > > > -		dev->virtqueue[VIRTIO_TXQ]->vhost_hlen =
> > > > -			sizeof(struct virtio_net_hdr);
> > > > +	for (q_idx = 0; q_idx < dev->virt_qp_nb; q_idx++) {
> > > > +		uint32_t virt_rx_q_idx = q_idx * VIRTIO_QNUM +
> > > VIRTIO_RXQ;
> > > > +		uint32_t virt_tx_q_idx = q_idx * VIRTIO_QNUM +
> > > VIRTIO_TXQ;
> > > > +		if (dev->features & (1 << VIRTIO_NET_F_MRG_RXBUF)) {
> > > > +			LOG_DEBUG(VHOST_CONFIG,
> > > > +				"(%"PRIu64") Mergeable RX buffers
> > > enabled\n",
> > > > +				dev->device_fh);
> > > > +			dev->virtqueue[virt_rx_q_idx]->vhost_hlen =
> > > > +				sizeof(struct virtio_net_hdr_mrg_rxbuf);
> > > > +			dev->virtqueue[virt_tx_q_idx]->vhost_hlen =
> > > > +				sizeof(struct virtio_net_hdr_mrg_rxbuf);
> > > > +		} else {
> > > > +			LOG_DEBUG(VHOST_CONFIG,
> > > > +				"(%"PRIu64") Mergeable RX buffers
> > > disabled\n",
> > > > +				dev->device_fh);
> > > > +			dev->virtqueue[virt_rx_q_idx]->vhost_hlen =
> > > > +				sizeof(struct virtio_net_hdr);
> > > > +			dev->virtqueue[virt_tx_q_idx]->vhost_hlen =
> > > > +				sizeof(struct virtio_net_hdr);
> > > > +		}
> > > >  	}
> > > >  	return 0;
> > > >  }
> > > > @@ -826,6 +877,14 @@ int rte_vhost_feature_enable(uint64_t
> > > feature_mask)
> > > >  	return -1;
> > > >  }
> > > >
> > > > +uint16_t rte_vhost_qp_num_get(struct virtio_net *dev) {
> > > > +	if (dev == NULL)
> > > > +		return 0;
> > > > +
> > > > +	return dev->virt_qp_nb;
> > > > +}
> > > > +
> > > >  /*
> > > >   * Register ops so that we can add/remove device to data core.
> > > >   */
> > > > --
> > > > 1.8.4.2
> > > >


* Re: [dpdk-dev] [PATCH 1/6] ixgbe: Support VMDq RSS in non-SRIOV environment
  2015-05-21  7:49 ` [dpdk-dev] [PATCH 1/6] ixgbe: Support VMDq RSS in non-SRIOV environment Ouyang Changchun
@ 2015-08-24 10:41   ` Qiu, Michael
  2015-08-25  0:38     ` Ouyang, Changchun
  0 siblings, 1 reply; 65+ messages in thread
From: Qiu, Michael @ 2015-08-24 10:41 UTC (permalink / raw)
  To: Ouyang, Changchun, dev

On 5/21/2015 3:50 PM, Ouyang Changchun wrote:
> In a non-SRIOV environment, VMDq RSS could be enabled by the MRQC register.
> In theory, the queue number per pool could be 2 or 4, but only 2 queues are
> available due to a HW limitation; the same limit also exists in the Linux
> ixgbe driver.
>
> Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
> ---
>  lib/librte_ether/rte_ethdev.c     | 40 +++++++++++++++++++
>  lib/librte_pmd_ixgbe/ixgbe_rxtx.c | 82 +++++++++++++++++++++++++++++++++------
>  2 files changed, 111 insertions(+), 11 deletions(-)
>
> diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
> index 024fe8b..6535715 100644
> --- a/lib/librte_ether/rte_ethdev.c
> +++ b/lib/librte_ether/rte_ethdev.c
> @@ -933,6 +933,16 @@ rte_eth_dev_check_vf_rss_rxq_num(uint8_t port_id, uint16_t nb_rx_q)
>  	return 0;
>  }
>  
> +#define VMDQ_RSS_RX_QUEUE_NUM_MAX 4
> +
> +static int
> +rte_eth_dev_check_vmdq_rss_rxq_num(__rte_unused uint8_t port_id, uint16_t nb_rx_q)
> +{
> +	if (nb_rx_q > VMDQ_RSS_RX_QUEUE_NUM_MAX)
> +		return -EINVAL;
> +	return 0;
> +}
> +
>  static int
>  rte_eth_dev_check_mq_mode(uint8_t port_id, uint16_t nb_rx_q, uint16_t nb_tx_q,
>  		      const struct rte_eth_conf *dev_conf)
> @@ -1093,6 +1103,36 @@ rte_eth_dev_check_mq_mode(uint8_t port_id, uint16_t nb_rx_q, uint16_t nb_tx_q,
>  				return -EINVAL;
>  			}
>  		}
> +
> +		if (dev_conf->rxmode.mq_mode == ETH_MQ_RX_VMDQ_RSS) {
> +			uint32_t nb_queue_pools =
> +				dev_conf->rx_adv_conf.vmdq_rx_conf.nb_queue_pools;
> +			struct rte_eth_dev_info dev_info;
> +
> +			rte_eth_dev_info_get(port_id, &dev_info);
> +			dev->data->dev_conf.rxmode.mq_mode = ETH_MQ_RX_VMDQ_RSS;
> +			if (nb_queue_pools == ETH_32_POOLS || nb_queue_pools == ETH_64_POOLS)
> +				RTE_ETH_DEV_SRIOV(dev).nb_q_per_pool =
> +					dev_info.max_rx_queues/nb_queue_pools;
> +			else {
> +				PMD_DEBUG_TRACE("ethdev port_id=%d VMDQ "
> +						"nb_queue_pools=%d invalid "
> +						"in VMDQ RSS\n"

Is a "," missing here?

Thanks,
Michael

> +						port_id,
> +						nb_queue_pools);
> +				return -EINVAL;
> +			}
> +
> +			if (rte_eth_dev_check_vmdq_rss_rxq_num(port_id,
> +				RTE_ETH_DEV_SRIOV(dev).nb_q_per_pool) != 0) {
> +				PMD_DEBUG_TRACE("ethdev port_id=%d"
> +					" SRIOV active, invalid queue"
> +					" number for VMDQ RSS, allowed"
> +					" value are 1, 2 or 4\n",
> +					port_id);
> +				return -EINVAL;
> +			}
> +		}
>  	}
>  	return 0;
>  }
>
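
For context, an application requests this mode roughly as in the
following minimal sketch; the field and constant names come from the
patch above, while the pool and queue counts are illustrative
assumptions:

#include <string.h>
#include <rte_ethdev.h>

/* Illustration only: VMDq RSS with 32 pools, 2 RX queues per pool. */
static int
configure_vmdq_rss(uint8_t port_id)
{
	struct rte_eth_conf port_conf;

	memset(&port_conf, 0, sizeof(port_conf));
	port_conf.rxmode.mq_mode = ETH_MQ_RX_VMDQ_RSS;
	port_conf.rx_adv_conf.vmdq_rx_conf.nb_queue_pools = ETH_32_POOLS;

	/* 32 pools * 2 queues per pool; the per-pool count must pass
	 * the rte_eth_dev_check_vmdq_rss_rxq_num() check added above. */
	return rte_eth_dev_configure(port_id, 64, 64, &port_conf);
}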



* Re: [dpdk-dev] [PATCH 1/6] ixgbe: Support VMDq RSS in non-SRIOV environment
  2015-08-24 10:41   ` Qiu, Michael
@ 2015-08-25  0:38     ` Ouyang, Changchun
  0 siblings, 0 replies; 65+ messages in thread
From: Ouyang, Changchun @ 2015-08-25  0:38 UTC (permalink / raw)
  To: Qiu, Michael, dev

Hi Michael,

Please review the latest version (v4).

Thanks for your effort
Changchun


> -----Original Message-----
> From: Qiu, Michael
> Sent: Monday, August 24, 2015 6:42 PM
> To: Ouyang, Changchun; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 1/6] ixgbe: Support VMDq RSS in non-SRIOV
> environment
> 
> On 5/21/2015 3:50 PM, Ouyang Changchun wrote:
> > In a non-SRIOV environment, VMDq RSS could be enabled by the MRQC register.
> > In theory, the queue number per pool could be 2 or 4, but only 2 queues
> > are available due to a HW limitation; the same limit also exists in the
> > Linux ixgbe driver.
> >
> > Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
> > ---
> >  lib/librte_ether/rte_ethdev.c     | 40 +++++++++++++++++++
> >  lib/librte_pmd_ixgbe/ixgbe_rxtx.c | 82
> > +++++++++++++++++++++++++++++++++------
> >  2 files changed, 111 insertions(+), 11 deletions(-)
> >
> > diff --git a/lib/librte_ether/rte_ethdev.c
> > b/lib/librte_ether/rte_ethdev.c index 024fe8b..6535715 100644
> > --- a/lib/librte_ether/rte_ethdev.c
> > +++ b/lib/librte_ether/rte_ethdev.c
> > @@ -933,6 +933,16 @@ rte_eth_dev_check_vf_rss_rxq_num(uint8_t
> port_id, uint16_t nb_rx_q)
> >  	return 0;
> >  }
> >
> > +#define VMDQ_RSS_RX_QUEUE_NUM_MAX 4
> > +
> > +static int
> > +rte_eth_dev_check_vmdq_rss_rxq_num(__rte_unused uint8_t port_id,
> > +uint16_t nb_rx_q) {
> > +	if (nb_rx_q > VMDQ_RSS_RX_QUEUE_NUM_MAX)
> > +		return -EINVAL;
> > +	return 0;
> > +}
> > +
> >  static int
> >  rte_eth_dev_check_mq_mode(uint8_t port_id, uint16_t nb_rx_q,
> uint16_t nb_tx_q,
> >  		      const struct rte_eth_conf *dev_conf) @@ -1093,6
> +1103,36 @@
> > rte_eth_dev_check_mq_mode(uint8_t port_id, uint16_t nb_rx_q,
> uint16_t nb_tx_q,
> >  				return -EINVAL;
> >  			}
> >  		}
> > +
> > +		if (dev_conf->rxmode.mq_mode ==
> ETH_MQ_RX_VMDQ_RSS) {
> > +			uint32_t nb_queue_pools =
> > +				dev_conf-
> >rx_adv_conf.vmdq_rx_conf.nb_queue_pools;
> > +			struct rte_eth_dev_info dev_info;
> > +
> > +			rte_eth_dev_info_get(port_id, &dev_info);
> > +			dev->data->dev_conf.rxmode.mq_mode =
> ETH_MQ_RX_VMDQ_RSS;
> > +			if (nb_queue_pools == ETH_32_POOLS ||
> nb_queue_pools == ETH_64_POOLS)
> > +				RTE_ETH_DEV_SRIOV(dev).nb_q_per_pool =
> > +
> 	dev_info.max_rx_queues/nb_queue_pools;
> > +			else {
> > +				PMD_DEBUG_TRACE("ethdev port_id=%d
> VMDQ "
> > +						"nb_queue_pools=%d invalid
> "
> > +						"in VMDQ RSS\n"
> 
> Is a "," missing here?

Yes, it is fixed in a later version.
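
For reference, with the comma in place the call reads:

	PMD_DEBUG_TRACE("ethdev port_id=%d VMDQ "
			"nb_queue_pools=%d invalid "
			"in VMDQ RSS\n",	/* the missing comma */
			port_id,
			nb_queue_pools);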

> 
> Thanks,
> Michael
> 
> > +						port_id,
> > +						nb_queue_pools);
> > +				return -EINVAL;
> > +			}
> > +
> > +			if (rte_eth_dev_check_vmdq_rss_rxq_num(port_id,
> > +
> 	RTE_ETH_DEV_SRIOV(dev).nb_q_per_pool) != 0) {
> > +				PMD_DEBUG_TRACE("ethdev port_id=%d"
> > +					" SRIOV active, invalid queue"
> > +					" number for VMDQ RSS, allowed"
> > +					" value are 1, 2 or 4\n",
> > +					port_id);
> > +				return -EINVAL;
> > +			}
> > +		}
> >  	}
> >  	return 0;
> >  }
> >


* Re: [dpdk-dev] [PATCH v4 02/12] vhost: support multiple queues in virtio dev
  2015-08-12  8:02       ` [dpdk-dev] [PATCH v4 02/12] vhost: support multiple queues in virtio dev Ouyang Changchun
  2015-08-13 12:52         ` Flavio Leitner
  2015-08-19  3:52         ` Yuanhan Liu
@ 2015-09-03  2:27         ` Tetsuya Mukawa
  2015-09-06  2:25           ` Ouyang, Changchun
  2 siblings, 1 reply; 65+ messages in thread
From: Tetsuya Mukawa @ 2015-09-03  2:27 UTC (permalink / raw)
  To: dev, Ouyang, Changchun

On 2015/08/12 17:02, Ouyang Changchun wrote:
> diff --git a/lib/librte_vhost/vhost_user/virtio-net-user.h b/lib/librte_vhost/vhost_user/virtio-net-user.h
> index df24860..2429836 100644
> --- a/lib/librte_vhost/vhost_user/virtio-net-user.h
> +++ b/lib/librte_vhost/vhost_user/virtio-net-user.h
> @@ -46,4 +46,6 @@ void user_set_vring_kick(struct vhost_device_ctx, struct VhostUserMsg *);
>
>  /*
> @@ -206,9 +213,17 @@ cleanup_device(struct virtio_net *dev)
>  static void
>  free_device(struct virtio_net_config_ll *ll_dev)
>  {
> -	/* Free any malloc'd memory */
> -	rte_free(ll_dev->dev.virtqueue[VIRTIO_RXQ]);
> -	rte_free(ll_dev->dev.virtqueue[VIRTIO_TXQ]);
> +	uint32_t qp_idx;
> +
> +	/*
> +	 * Free any malloc'd memory.
> +	 */
> +	/* Free every queue pair. */
> +	for (qp_idx = 0; qp_idx < ll_dev->dev.virt_qp_nb; qp_idx++) {
> +		uint32_t virt_rx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_RXQ;
> +		rte_free(ll_dev->dev.virtqueue[virt_rx_q_idx]);

Hi Changchun,

Should we also free the tx queue here?

Thanks,
Tetsuya

> +	}
> +	rte_free(ll_dev->dev.virtqueue);
>  	rte_free(ll_dev);
>  }
>  
>


* Re: [dpdk-dev] [PATCH v4 02/12] vhost: support multiple queues in virtio dev
  2015-09-03  2:27         ` Tetsuya Mukawa
@ 2015-09-06  2:25           ` Ouyang, Changchun
  0 siblings, 0 replies; 65+ messages in thread
From: Ouyang, Changchun @ 2015-09-06  2:25 UTC (permalink / raw)
  To: Tetsuya Mukawa, dev

Hi Tetsuya,

> -----Original Message-----
> From: Tetsuya Mukawa [mailto:mukawa@igel.co.jp]
> Sent: Thursday, September 3, 2015 10:27 AM
> To: dev@dpdk.org; Ouyang, Changchun
> Subject: Re: [dpdk-dev] [PATCH v4 02/12] vhost: support multiple queues in
> virtio dev
> 
> On 2015/08/12 17:02, Ouyang Changchun wrote:
> > diff --git a/lib/librte_vhost/vhost_user/virtio-net-user.h
> > b/lib/librte_vhost/vhost_user/virtio-net-user.h
> > index df24860..2429836 100644
> > --- a/lib/librte_vhost/vhost_user/virtio-net-user.h
> > +++ b/lib/librte_vhost/vhost_user/virtio-net-user.h
> > @@ -46,4 +46,6 @@ void user_set_vring_kick(struct vhost_device_ctx,
> > struct VhostUserMsg *);
> >
> >  /*
> > @@ -206,9 +213,17 @@ cleanup_device(struct virtio_net *dev)
> >  static void
> >  free_device(struct virtio_net_config_ll *ll_dev)
> >  {
> > -	/* Free any malloc'd memory */
> > -	rte_free(ll_dev->dev.virtqueue[VIRTIO_RXQ]);
> > -	rte_free(ll_dev->dev.virtqueue[VIRTIO_TXQ]);
> > +	uint32_t qp_idx;
> > +
> > +	/*
> > +	 * Free any malloc'd memory.
> > +	 */
> > +	/* Free every queue pair. */
> > +	for (qp_idx = 0; qp_idx < ll_dev->dev.virt_qp_nb; qp_idx++) {
> > +		uint32_t virt_rx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_RXQ;
> > +		rte_free(ll_dev->dev.virtqueue[virt_rx_q_idx]);
> 
> Hi Changchun,
> 
> Should we also free the tx queue here?
>

We don't need to do that, as we allocate the memory for both the rx and tx
queues in one allocation; thus, we allocate once and free once.
Please see the following code snippet (a standalone sketch of the same
pairing is given at the end of this message):

+ *  Alloc mem for vring queue pair.
+ */
+int
+alloc_vring_queue_pair(struct virtio_net *dev, uint16_t qp_idx)
+{
+	struct vhost_virtqueue *virtqueue = NULL;
+	uint32_t virt_rx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_RXQ;
+	uint32_t virt_tx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_TXQ;
 
-	/* Backends are set to -1 indicating an inactive device. */
-	dev->virtqueue[VIRTIO_RXQ]->backend = VIRTIO_DEV_STOPPED;
-	dev->virtqueue[VIRTIO_TXQ]->backend = VIRTIO_DEV_STOPPED;
+	virtqueue = rte_malloc(NULL, sizeof(struct vhost_virtqueue) * VIRTIO_QNUM, 0);
+	if (virtqueue == NULL) {
+		RTE_LOG(ERR, VHOST_CONFIG,
+			"Failed to allocate memory for virt qp:%d.\n", qp_idx);
+		return -1;
+	}
+
+	dev->virtqueue[virt_rx_q_idx] = virtqueue;
+	dev->virtqueue[virt_tx_q_idx] = virtqueue + VIRTIO_TXQ;
+
+	init_vring_queue_pair(dev, qp_idx);
+
+	return 0;
 }

Thanks
Changchun

> 
> > +	}
> > +	rte_free(ll_dev->dev.virtqueue);
> >  	rte_free(ll_dev);
> >  }
> >
> >
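
A minimal standalone sketch of the paired allocation and free described
above, assuming VIRTIO_QNUM == 2 with VIRTIO_RXQ == 0 and VIRTIO_TXQ == 1
as in the patch; the struct and helper names are illustrative stand-ins,
not the actual DPDK code:

#include <stdint.h>
#include <stdlib.h>

#define VIRTIO_QNUM 2
#define VIRTIO_RXQ  0
#define VIRTIO_TXQ  1

struct vq {
	int backend;	/* ring pointers, sizes, etc. omitted */
};

/* One allocation covers both queues of the pair. */
static int
alloc_pair(struct vq **virtqueue, uint16_t qp_idx)
{
	struct vq *pair = calloc(VIRTIO_QNUM, sizeof(*pair));

	if (pair == NULL)
		return -1;

	/* Rx points at the start of the block, tx at the second slot. */
	virtqueue[qp_idx * VIRTIO_QNUM + VIRTIO_RXQ] = pair;
	virtqueue[qp_idx * VIRTIO_QNUM + VIRTIO_TXQ] = pair + VIRTIO_TXQ;
	return 0;
}

/* Freeing only the rx pointer therefore releases the tx queue as well. */
static void
free_pair(struct vq **virtqueue, uint16_t qp_idx)
{
	free(virtqueue[qp_idx * VIRTIO_QNUM + VIRTIO_RXQ]);
}

int main(void)
{
	struct vq *virtqueue[2 * VIRTIO_QNUM] = { 0 };

	if (alloc_pair(virtqueue, 0) == 0)
		free_pair(virtqueue, 0);
	return 0;
}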

^ permalink raw reply	[flat|nested] 65+ messages in thread

end of thread, other threads:[~2015-09-06  2:26 UTC | newest]

Thread overview: 65+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-05-21  7:49 [dpdk-dev] [PATCH 0/6] Support multiple queues in vhost Ouyang Changchun
2015-05-21  7:49 ` [dpdk-dev] [PATCH 1/6] ixgbe: Support VMDq RSS in non-SRIOV environment Ouyang Changchun
2015-08-24 10:41   ` Qiu, Michael
2015-08-25  0:38     ` Ouyang, Changchun
2015-05-21  7:49 ` [dpdk-dev] [PATCH 2/6] lib_vhost: Support multiple queues in virtio dev Ouyang Changchun
2015-06-03  2:47   ` Xie, Huawei
2015-05-21  7:49 ` [dpdk-dev] [PATCH 3/6] lib_vhost: Set memory layout for multiple queues mode Ouyang Changchun
2015-06-02  3:33   ` Xie, Huawei
2015-05-21  7:49 ` [dpdk-dev] [PATCH 4/6] vhost: Add new command line option: rxq Ouyang Changchun
2015-05-22  1:39   ` Thomas F Herbert
2015-05-22  6:05     ` Ouyang, Changchun
2015-05-22 12:51       ` Thomas F Herbert
2015-05-23  1:25         ` Ouyang, Changchun
2015-05-26  7:21           ` Ouyang, Changchun
2015-05-21  7:49 ` [dpdk-dev] [PATCH 5/6] vhost: Support multiple queues Ouyang Changchun
2015-05-21  7:49 ` [dpdk-dev] [PATCH 6/6] virtio: Resolve for control queue Ouyang Changchun
2015-05-22  1:13 ` [dpdk-dev] [PATCH 0/6] Support multiple queues in vhost Thomas F Herbert
2015-05-22  6:08   ` Ouyang, Changchun
2015-06-10  5:52 ` [dpdk-dev] [PATCH v2 0/7] " Ouyang Changchun
2015-06-10  5:52   ` [dpdk-dev] [PATCH v2 1/7] ixgbe: Support VMDq RSS in non-SRIOV environment Ouyang Changchun
2015-06-10  5:52   ` [dpdk-dev] [PATCH v2 2/7] lib_vhost: Support multiple queues in virtio dev Ouyang Changchun
2015-06-11  9:54     ` Panu Matilainen
2015-06-10  5:52   ` [dpdk-dev] [PATCH v2 3/7] lib_vhost: Set memory layout for multiple queues mode Ouyang Changchun
2015-06-10  5:52   ` [dpdk-dev] [PATCH v2 4/7] vhost: Add new command line option: rxq Ouyang Changchun
2015-06-10  5:52   ` [dpdk-dev] [PATCH v2 5/7] vhost: Support multiple queues Ouyang Changchun
2015-06-10  5:52   ` [dpdk-dev] [PATCH v2 6/7] virtio: Resolve for control queue Ouyang Changchun
2015-06-10  5:52   ` [dpdk-dev] [PATCH v2 7/7] vhost: Add per queue stats info Ouyang Changchun
2015-06-15  7:56   ` [dpdk-dev] [PATCH v3 0/9] Support multiple queues in vhost Ouyang Changchun
2015-06-15  7:56     ` [dpdk-dev] [PATCH v3 1/9] ixgbe: Support VMDq RSS in non-SRIOV environment Ouyang Changchun
2015-06-15  7:56     ` [dpdk-dev] [PATCH v3 2/9] lib_vhost: Support multiple queues in virtio dev Ouyang Changchun
2015-06-18 13:16       ` Flavio Leitner
2015-06-19  1:06         ` Ouyang, Changchun
2015-06-18 13:34       ` Flavio Leitner
2015-06-19  1:17         ` Ouyang, Changchun
2015-06-15  7:56     ` [dpdk-dev] [PATCH v3 3/9] lib_vhost: Set memory layout for multiple queues mode Ouyang Changchun
2015-06-15  7:56     ` [dpdk-dev] [PATCH v3 4/9] lib_vhost: Check the virtqueue address's validity Ouyang Changchun
2015-06-15  7:56     ` [dpdk-dev] [PATCH v3 5/9] vhost: Add new command line option: rxq Ouyang Changchun
2015-06-15  7:56     ` [dpdk-dev] [PATCH v3 6/9] vhost: Support multiple queues Ouyang Changchun
2015-06-15  7:56     ` [dpdk-dev] [PATCH v3 7/9] virtio: Resolve for control queue Ouyang Changchun
2015-06-15  7:56     ` [dpdk-dev] [PATCH v3 8/9] vhost: Add per queue stats info Ouyang Changchun
2015-06-15  7:56     ` [dpdk-dev] [PATCH v3 9/9] doc: Update doc for vhost multiple queues Ouyang Changchun
2015-08-12  8:02     ` [dpdk-dev] [PATCH v4 00/12] Support multiple queues in vhost Ouyang Changchun
2015-08-12  8:02       ` [dpdk-dev] [PATCH v4 01/12] ixgbe: support VMDq RSS in non-SRIOV environment Ouyang Changchun
2015-08-12  8:22         ` Vincent JARDIN
2015-08-12  8:02       ` [dpdk-dev] [PATCH v4 02/12] vhost: support multiple queues in virtio dev Ouyang Changchun
2015-08-13 12:52         ` Flavio Leitner
2015-08-14  2:29           ` Ouyang, Changchun
2015-08-14 12:16             ` Flavio Leitner
2015-08-19  3:52         ` Yuanhan Liu
2015-08-19  5:54           ` Ouyang, Changchun
2015-08-19  6:28             ` Yuanhan Liu
2015-08-19  6:39               ` Yuanhan Liu
2015-09-03  2:27         ` Tetsuya Mukawa
2015-09-06  2:25           ` Ouyang, Changchun
2015-08-12  8:02       ` [dpdk-dev] [PATCH v4 03/12] vhost: update version map file Ouyang Changchun
2015-08-12  8:24         ` Panu Matilainen
2015-08-12  8:02       ` [dpdk-dev] [PATCH v4 04/12] vhost: set memory layout for multiple queues mode Ouyang Changchun
2015-08-12  8:02       ` [dpdk-dev] [PATCH v4 05/12] vhost: check the virtqueue address's validity Ouyang Changchun
2015-08-12  8:02       ` [dpdk-dev] [PATCH v4 06/12] vhost: support protocol feature Ouyang Changchun
2015-08-12  8:02       ` [dpdk-dev] [PATCH v4 07/12] vhost: add new command line option: rxq Ouyang Changchun
2015-08-12  8:02       ` [dpdk-dev] [PATCH v4 08/12] vhost: support multiple queues Ouyang Changchun
2015-08-12  8:02       ` [dpdk-dev] [PATCH v4 09/12] virtio: resolve for control queue Ouyang Changchun
2015-08-12  8:02       ` [dpdk-dev] [PATCH v4 10/12] vhost: add per queue stats info Ouyang Changchun
2015-08-12  8:02       ` [dpdk-dev] [PATCH v4 11/12] vhost: alloc core to virtq Ouyang Changchun
2015-08-12  8:02       ` [dpdk-dev] [PATCH v4 12/12] doc: update doc for vhost multiple queues Ouyang Changchun
