* [dpdk-dev] [PATCH v4 0/3] vhost: make virtqueue cache-friendly
@ 2021-03-23 9:02 Maxime Coquelin
2021-03-23 9:02 ` [dpdk-dev] [PATCH v4 1/3] vhost: remove unused Vhost virtqueue field Maxime Coquelin
` (4 more replies)
0 siblings, 5 replies; 8+ messages in thread
From: Maxime Coquelin @ 2021-03-23 9:02 UTC (permalink / raw)
To: dev, chenbo.xia, amorenoz, david.marchand, olivier.matz, bnemeth
Cc: Maxime Coquelin
As done for the Virtio PMD, this series improves cache utilization
of the vhost_virtqueue struct by removing an unused field,
by allocating the live-migration cache dynamically at
live-migration setup time, and by moving fields
around so that hot fields are on the first cachelines.
With this series, the struct vhost_virtqueue size goes
from 832B (13 cachelines) down to 320B (5 cachelines).
With this series and the virtio one, I measure a gain
of up to 8% in the IO loop micro-benchmark with packed
ring, and 5% with split ring.
I don't have a setup at hand to run PVP testing, but
it would be interesting to get the numbers, since I
suspect the cache pressure is higher in this test, as
it is in real use-cases.
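For reference, the cacheline counts quoted above follow from rounding
the struct size up to a whole number of cache lines. A minimal sketch,
assuming 64-byte cache lines; the helper name cache_lines() is only
illustrative and is not part of the vhost library:

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 64-byte cache lines (DPDK exposes the real value as
 * RTE_CACHE_LINE_SIZE; a plain constant keeps this sketch standalone). */
#define CACHE_LINE_SIZE 64u

/* Number of cache lines needed to hold 'bytes' bytes, rounded up. */
static size_t
cache_lines(size_t bytes)
{
	return (bytes + CACHE_LINE_SIZE - 1) / CACHE_LINE_SIZE;
}
```

Under that assumption, cache_lines(832) gives 13 and cache_lines(320)
gives 5, matching the numbers above.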
Changes in v4:
==============
- Fix missing changes to boolean (Chenbo)
Changes in v3:
==============
- Don't check pointer validity before freeing (David)
- Don't use deprecated rte_smp_wmb() (David, Checkpatch)
- Handle booleans properly (David)
- Prevent VQ size field overflow (David)
- Fix typo and indent (David)
Changes in v2:
==============
- Add log_cache freeing in free_vq (Chenbo)
Maxime Coquelin (3):
vhost: remove unused Vhost virtqueue field
vhost: move dirty logging cache out of the virtqueue
vhost: optimize vhost virtqueue struct
lib/librte_vhost/vhost.c | 21 +++++++++----
lib/librte_vhost/vhost.h | 56 +++++++++++++++++------------------
lib/librte_vhost/vhost_user.c | 44 +++++++++++++++++++--------
lib/librte_vhost/virtio_net.c | 12 ++++----
4 files changed, 82 insertions(+), 51 deletions(-)
--
2.30.2
* [dpdk-dev] [PATCH v4 1/3] vhost: remove unused Vhost virtqueue field
2021-03-23 9:02 [dpdk-dev] [PATCH v4 0/3] vhost: make virtqueue cache-friendly Maxime Coquelin
@ 2021-03-23 9:02 ` Maxime Coquelin
2021-03-23 9:02 ` [dpdk-dev] [PATCH v4 2/3] vhost: move dirty logging cache out of the virtqueue Maxime Coquelin
` (3 subsequent siblings)
4 siblings, 0 replies; 8+ messages in thread
From: Maxime Coquelin @ 2021-03-23 9:02 UTC (permalink / raw)
To: dev, chenbo.xia, amorenoz, david.marchand, olivier.matz, bnemeth
Cc: Maxime Coquelin
This patch removes the "backend" field of the
vhost_virtqueue struct, which is not used by the
library.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
---
lib/librte_vhost/vhost.c | 2 --
lib/librte_vhost/vhost.h | 2 --
2 files changed, 4 deletions(-)
diff --git a/lib/librte_vhost/vhost.c b/lib/librte_vhost/vhost.c
index 52ab93d1ec..5a7c0c6cff 100644
--- a/lib/librte_vhost/vhost.c
+++ b/lib/librte_vhost/vhost.c
@@ -558,8 +558,6 @@ init_vring_queue(struct virtio_net *dev, uint32_t vring_idx)
vq->notif_enable = VIRTIO_UNINITIALIZED_NOTIF;
vhost_user_iotlb_init(dev, vring_idx);
- /* Backends are set to -1 indicating an inactive device. */
- vq->backend = -1;
}
static void
diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h
index 658f6fc287..717f410548 100644
--- a/lib/librte_vhost/vhost.h
+++ b/lib/librte_vhost/vhost.h
@@ -143,8 +143,6 @@ struct vhost_virtqueue {
#define VIRTIO_INVALID_EVENTFD (-1)
#define VIRTIO_UNINITIALIZED_EVENTFD (-2)
- /* Backend value to determine if device should started/stopped */
- int backend;
int enabled;
int access_ok;
int ready;
--
2.30.2
* [dpdk-dev] [PATCH v4 2/3] vhost: move dirty logging cache out of the virtqueue
2021-03-23 9:02 [dpdk-dev] [PATCH v4 0/3] vhost: make virtqueue cache-friendly Maxime Coquelin
2021-03-23 9:02 ` [dpdk-dev] [PATCH v4 1/3] vhost: remove unused Vhost virtqueue field Maxime Coquelin
@ 2021-03-23 9:02 ` Maxime Coquelin
2021-03-23 9:02 ` [dpdk-dev] [PATCH v4 3/3] vhost: optimize vhost virtqueue struct Maxime Coquelin
` (2 subsequent siblings)
4 siblings, 0 replies; 8+ messages in thread
From: Maxime Coquelin @ 2021-03-23 9:02 UTC (permalink / raw)
To: dev, chenbo.xia, amorenoz, david.marchand, olivier.matz, bnemeth
Cc: Maxime Coquelin
This patch moves the per-virtqueue dirty logging cache
out of the virtqueue struct, by allocating it dynamically
only when live-migration is enabled.
It saves 8 cachelines in the vhost_virtqueue struct.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
---
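Note for readers: the idea can be sketched outside of DPDK as follows.
This is only an illustration under stated assumptions, not the library
code: the names (log_entry, vq_inline, vq_dynamic, LOG_CACHE_NR and the
two helpers) are hypothetical, the entry count of 32 and the entry
layout are assumed, and plain calloc()/free() stand in for
rte_zmalloc()/rte_free(). With 16-byte entries on a typical LP64
target, 32 inline entries occupy 512 bytes, i.e. 8 cache lines of
64 bytes, which is consistent with the saving stated above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define LOG_CACHE_NR 32 /* assumed number of cache entries */

/* Hypothetical cache entry: a word offset in the dirty bitmap and its bits. */
struct log_entry {
	uint32_t offset;
	unsigned long val;
};

/* Before: the whole cache is embedded in every virtqueue. */
struct vq_inline {
	uint16_t nb_elem;
	struct log_entry cache[LOG_CACHE_NR];
};

/* After: only a pointer is embedded; the array exists during migration. */
struct vq_dynamic {
	uint16_t nb_elem;
	struct log_entry *cache;
};

/* Allocate the cache on demand. Returns 0 on success, -1 on failure;
 * on failure the caller can still log dirty pages without caching. */
static int
vq_log_cache_setup(struct vq_dynamic *vq)
{
	free(vq->cache); /* free(NULL) is a no-op, so no pointer check */
	vq->nb_elem = 0;
	vq->cache = calloc(LOG_CACHE_NR, sizeof(*vq->cache));
	return vq->cache != NULL ? 0 : -1;
}

/* Release the cache and reset the pointer so later users see "no cache". */
static void
vq_log_cache_teardown(struct vq_dynamic *vq)
{
	free(vq->cache);
	vq->cache = NULL;
	vq->nb_elem = 0;
}
```

The NULL pointer doubles as the "caching disabled" state, which is why
the hot-path helpers in the diff below test it before touching the
cache.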
lib/librte_vhost/vhost.c | 13 +++++++++++++
lib/librte_vhost/vhost.h | 2 +-
lib/librte_vhost/vhost_user.c | 21 +++++++++++++++++++++
3 files changed, 35 insertions(+), 1 deletion(-)
diff --git a/lib/librte_vhost/vhost.c b/lib/librte_vhost/vhost.c
index 5a7c0c6cff..a8032e3ba1 100644
--- a/lib/librte_vhost/vhost.c
+++ b/lib/librte_vhost/vhost.c
@@ -145,6 +145,10 @@ __vhost_log_cache_sync(struct virtio_net *dev, struct vhost_virtqueue *vq)
if (unlikely(!dev->log_base))
return;
+ /* No cache, nothing to sync */
+ if (unlikely(!vq->log_cache))
+ return;
+
rte_atomic_thread_fence(__ATOMIC_RELEASE);
log_base = (unsigned long *)(uintptr_t)dev->log_base;
@@ -177,6 +181,14 @@ vhost_log_cache_page(struct virtio_net *dev, struct vhost_virtqueue *vq,
uint32_t offset = page / (sizeof(unsigned long) << 3);
int i;
+ if (unlikely(!vq->log_cache)) {
+ /* No logging cache allocated, write dirty log map directly */
+ rte_atomic_thread_fence(__ATOMIC_RELEASE);
+ vhost_log_page((uint8_t *)(uintptr_t)dev->log_base, page);
+
+ return;
+ }
+
for (i = 0; i < vq->log_cache_nb_elem; i++) {
struct log_cache_entry *elem = vq->log_cache + i;
@@ -354,6 +366,7 @@ free_vq(struct virtio_net *dev, struct vhost_virtqueue *vq)
}
rte_free(vq->batch_copy_elems);
rte_mempool_free(vq->iotlb_pool);
+ rte_free(vq->log_cache);
rte_free(vq);
}
diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h
index 717f410548..3a71dfeed9 100644
--- a/lib/librte_vhost/vhost.h
+++ b/lib/librte_vhost/vhost.h
@@ -183,7 +183,7 @@ struct vhost_virtqueue {
bool used_wrap_counter;
bool avail_wrap_counter;
- struct log_cache_entry log_cache[VHOST_LOG_CACHE_NR];
+ struct log_cache_entry *log_cache;
uint16_t log_cache_nb_elem;
rte_rwlock_t iotlb_lock;
diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
index a60bb945ad..4d9e76e49e 100644
--- a/lib/librte_vhost/vhost_user.c
+++ b/lib/librte_vhost/vhost_user.c
@@ -2022,6 +2022,9 @@ vhost_user_get_vring_base(struct virtio_net **pdev,
rte_free(vq->batch_copy_elems);
vq->batch_copy_elems = NULL;
+ rte_free(vq->log_cache);
+ vq->log_cache = NULL;
+
msg->size = sizeof(msg->payload.state);
msg->fd_num = 0;
@@ -2121,6 +2124,7 @@ vhost_user_set_log_base(struct virtio_net **pdev, struct VhostUserMsg *msg,
int fd = msg->fds[0];
uint64_t size, off;
void *addr;
+ uint32_t i;
if (validate_msg_fds(msg, 1) != 0)
return RTE_VHOST_MSG_RESULT_ERR;
@@ -2174,6 +2178,23 @@ vhost_user_set_log_base(struct virtio_net **pdev, struct VhostUserMsg *msg,
dev->log_base = dev->log_addr + off;
dev->log_size = size;
+ for (i = 0; i < dev->nr_vring; i++) {
+ struct vhost_virtqueue *vq = dev->virtqueue[i];
+
+ rte_free(vq->log_cache);
+ vq->log_cache = NULL;
+ vq->log_cache_nb_elem = 0;
+ vq->log_cache = rte_zmalloc("vq log cache",
+ sizeof(struct log_cache_entry) * VHOST_LOG_CACHE_NR,
+ 0);
+ /*
+ * If log cache alloc fail, don't fail migration, but no
+ * caching will be done, which will impact performance
+ */
+ if (!vq->log_cache)
+ VHOST_LOG_CONFIG(ERR, "Failed to allocate VQ logging cache\n");
+ }
+
/*
* The spec is not clear about it (yet), but QEMU doesn't expect
* any payload in the reply.
--
2.30.2
* [dpdk-dev] [PATCH v4 3/3] vhost: optimize vhost virtqueue struct
2021-03-23 9:02 [dpdk-dev] [PATCH v4 0/3] vhost: make virtqueue cache-friendly Maxime Coquelin
2021-03-23 9:02 ` [dpdk-dev] [PATCH v4 1/3] vhost: remove unused Vhost virtqueue field Maxime Coquelin
2021-03-23 9:02 ` [dpdk-dev] [PATCH v4 2/3] vhost: move dirty logging cache out of the virtqueue Maxime Coquelin
@ 2021-03-23 9:02 ` Maxime Coquelin
2021-03-25 2:30 ` Xia, Chenbo
2021-03-23 10:30 ` [dpdk-dev] [PATCH v4 0/3] vhost: make virtqueue cache-friendly David Marchand
2021-03-31 6:04 ` Xia, Chenbo
4 siblings, 1 reply; 8+ messages in thread
From: Maxime Coquelin @ 2021-03-23 9:02 UTC (permalink / raw)
To: dev, chenbo.xia, amorenoz, david.marchand, olivier.matz, bnemeth
Cc: Maxime Coquelin
This patch moves vhost_virtqueue struct fields in order
to both optimize packing and move hot fields onto the first
cachelines.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
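Note for readers: the packing effect can be reproduced with a small
standalone sketch. The two structs below are hypothetical reductions
(vq_loose, vq_tight and vq_set_size() are illustrative names, not the
real vhost_virtqueue layout). The sketch shows only the padding saved
by narrowing and regrouping fields, plus the range check that narrowing
the size field to 16 bits requires; which fields are "hot" depends on
the datapath and cannot be shown by sizeof alone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define VQ_MAX_SIZE 32768u /* upper bound used by the check in this patch */

/* Loose layout: int flags and interleaved 16-bit fields leave padding
 * holes (48 bytes on a typical LP64 target). */
struct vq_loose {
	uint32_t size;
	int enabled;
	int access_ok;
	int ready;
	void *elems;
	uint16_t nb_elems;
	void *log;
	uint16_t log_nb;
};

/* Tight layout: pointers first, then 16-bit fields, then bool flags
 * (32 bytes on a typical LP64 target). */
struct vq_tight {
	void *elems;
	void *log;
	uint16_t size;
	uint16_t nb_elems;
	uint16_t log_nb;
	bool enabled;
	bool access_ok;
	bool ready;
};

/* Validate the 32-bit value from the message before storing it in the
 * 16-bit field: 65536 would otherwise silently truncate to 0. */
static int
vq_set_size(struct vq_tight *vq, uint32_t num)
{
	if (num > VQ_MAX_SIZE)
		return -1;
	vq->size = (uint16_t)num;
	return 0;
}
```

This is why the size check in vhost_user_set_vring_num() moves ahead of
the assignment in the diff below: once vq->size is 16 bits wide,
checking the stored value would be too late.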
lib/librte_vhost/vhost.c | 6 ++--
lib/librte_vhost/vhost.h | 54 ++++++++++++++++++-----------------
lib/librte_vhost/vhost_user.c | 23 +++++++--------
lib/librte_vhost/virtio_net.c | 12 ++++----
4 files changed, 48 insertions(+), 47 deletions(-)
diff --git a/lib/librte_vhost/vhost.c b/lib/librte_vhost/vhost.c
index a8032e3ba1..04d63b2f02 100644
--- a/lib/librte_vhost/vhost.c
+++ b/lib/librte_vhost/vhost.c
@@ -524,7 +524,7 @@ vring_translate(struct virtio_net *dev, struct vhost_virtqueue *vq)
if (log_translate(dev, vq) < 0)
return -1;
- vq->access_ok = 1;
+ vq->access_ok = true;
return 0;
}
@@ -535,7 +535,7 @@ vring_invalidate(struct virtio_net *dev, struct vhost_virtqueue *vq)
if (dev->features & (1ULL << VIRTIO_F_IOMMU_PLATFORM))
vhost_user_iotlb_wr_lock(vq);
- vq->access_ok = 0;
+ vq->access_ok = false;
vq->desc = NULL;
vq->avail = NULL;
vq->used = NULL;
@@ -1451,7 +1451,7 @@ rte_vhost_rx_queue_count(int vid, uint16_t qid)
rte_spinlock_lock(&vq->access_lock);
- if (unlikely(vq->enabled == 0 || vq->avail == NULL))
+ if (unlikely(!vq->enabled || vq->avail == NULL))
goto out;
ret = *((volatile uint16_t *)&vq->avail->idx) - vq->last_avail_idx;
diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h
index 3a71dfeed9..f628714c24 100644
--- a/lib/librte_vhost/vhost.h
+++ b/lib/librte_vhost/vhost.h
@@ -133,7 +133,7 @@ struct vhost_virtqueue {
struct vring_used *used;
struct vring_packed_desc_event *device_event;
};
- uint32_t size;
+ uint16_t size;
uint16_t last_avail_idx;
uint16_t last_used_idx;
@@ -143,29 +143,12 @@ struct vhost_virtqueue {
#define VIRTIO_INVALID_EVENTFD (-1)
#define VIRTIO_UNINITIALIZED_EVENTFD (-2)
- int enabled;
- int access_ok;
- int ready;
- int notif_enable;
-#define VIRTIO_UNINITIALIZED_NOTIF (-1)
+ bool enabled;
+ bool access_ok;
+ bool ready;
rte_spinlock_t access_lock;
- /* Used to notify the guest (trigger interrupt) */
- int callfd;
- /* Currently unused as polling mode is enabled */
- int kickfd;
-
- /* Physical address of used ring, for logging */
- uint64_t log_guest_addr;
-
- /* inflight share memory info */
- union {
- struct rte_vhost_inflight_info_split *inflight_split;
- struct rte_vhost_inflight_info_packed *inflight_packed;
- };
- struct rte_vhost_resubmit_info *resubmit_inflight;
- uint64_t global_counter;
union {
struct vring_used_elem *shadow_used_split;
@@ -176,22 +159,36 @@ struct vhost_virtqueue {
uint16_t shadow_aligned_idx;
/* Record packed ring first dequeue desc index */
uint16_t shadow_last_used_idx;
- struct vhost_vring_addr ring_addrs;
- struct batch_copy_elem *batch_copy_elems;
uint16_t batch_copy_nb_elems;
+ struct batch_copy_elem *batch_copy_elems;
bool used_wrap_counter;
bool avail_wrap_counter;
- struct log_cache_entry *log_cache;
- uint16_t log_cache_nb_elem;
+ /* Physical address of used ring, for logging */
+ uint16_t log_cache_nb_elem;
+ uint64_t log_guest_addr;
+ struct log_cache_entry *log_cache;
rte_rwlock_t iotlb_lock;
rte_rwlock_t iotlb_pending_lock;
struct rte_mempool *iotlb_pool;
TAILQ_HEAD(, vhost_iotlb_entry) iotlb_list;
- int iotlb_cache_nr;
TAILQ_HEAD(, vhost_iotlb_entry) iotlb_pending_list;
+ int iotlb_cache_nr;
+
+ /* Used to notify the guest (trigger interrupt) */
+ int callfd;
+ /* Currently unused as polling mode is enabled */
+ int kickfd;
+
+ /* inflight share memory info */
+ union {
+ struct rte_vhost_inflight_info_split *inflight_split;
+ struct rte_vhost_inflight_info_packed *inflight_packed;
+ };
+ struct rte_vhost_resubmit_info *resubmit_inflight;
+ uint64_t global_counter;
/* operation callbacks for async dma */
struct rte_vhost_async_channel_ops async_ops;
@@ -212,6 +209,11 @@ struct vhost_virtqueue {
bool async_inorder;
bool async_registered;
uint16_t async_threshold;
+
+ int notif_enable;
+#define VIRTIO_UNINITIALIZED_NOTIF (-1)
+
+ struct vhost_vring_addr ring_addrs;
} __rte_cache_aligned;
/* Virtio device status as per Virtio specification */
diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
index 4d9e76e49e..2f4f89aeac 100644
--- a/lib/librte_vhost/vhost_user.c
+++ b/lib/librte_vhost/vhost_user.c
@@ -406,6 +406,11 @@ vhost_user_set_vring_num(struct virtio_net **pdev,
if (validate_msg_fds(msg, 0) != 0)
return RTE_VHOST_MSG_RESULT_ERR;
+ if (msg->payload.state.num > 32768) {
+ VHOST_LOG_CONFIG(ERR, "invalid virtqueue size %u\n", msg->payload.state.num);
+ return RTE_VHOST_MSG_RESULT_ERR;
+ }
+
vq->size = msg->payload.state.num;
/* VIRTIO 1.0, 2.4 Virtqueues says:
@@ -425,12 +430,6 @@ vhost_user_set_vring_num(struct virtio_net **pdev,
}
}
- if (vq->size > 32768) {
- VHOST_LOG_CONFIG(ERR,
- "invalid virtqueue size %u\n", vq->size);
- return RTE_VHOST_MSG_RESULT_ERR;
- }
-
if (vq_is_packed(dev)) {
if (vq->shadow_used_packed)
rte_free(vq->shadow_used_packed);
@@ -713,7 +712,7 @@ translate_ring_addresses(struct virtio_net *dev, int vq_index)
return dev;
}
- vq->access_ok = 1;
+ vq->access_ok = true;
return dev;
}
@@ -771,7 +770,7 @@ translate_ring_addresses(struct virtio_net *dev, int vq_index)
vq->last_avail_idx = vq->used->idx;
}
- vq->access_ok = 1;
+ vq->access_ok = true;
VHOST_LOG_CONFIG(DEBUG, "(%d) mapped address desc: %p\n",
dev->vid, vq->desc);
@@ -1658,7 +1657,7 @@ vhost_user_set_vring_call(struct virtio_net **pdev, struct VhostUserMsg *msg,
vq = dev->virtqueue[file.index];
if (vq->ready) {
- vq->ready = 0;
+ vq->ready = false;
vhost_user_notify_queue_state(dev, file.index, 0);
}
@@ -1918,14 +1917,14 @@ vhost_user_set_vring_kick(struct virtio_net **pdev, struct VhostUserMsg *msg,
* the SET_VRING_ENABLE message.
*/
if (!(dev->features & (1ULL << VHOST_USER_F_PROTOCOL_FEATURES))) {
- vq->enabled = 1;
+ vq->enabled = true;
if (dev->notify_ops->vring_state_changed)
dev->notify_ops->vring_state_changed(
dev->vid, file.index, 1);
}
if (vq->ready) {
- vq->ready = 0;
+ vq->ready = false;
vhost_user_notify_queue_state(dev, file.index, 0);
}
@@ -2043,7 +2042,7 @@ vhost_user_set_vring_enable(struct virtio_net **pdev,
int main_fd __rte_unused)
{
struct virtio_net *dev = *pdev;
- int enable = (int)msg->payload.state.num;
+ bool enable = !!msg->payload.state.num;
int index = (int)msg->payload.state.index;
if (validate_msg_fds(msg, 0) != 0)
diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
index 583bf379c6..3d8e29df09 100644
--- a/lib/librte_vhost/virtio_net.c
+++ b/lib/librte_vhost/virtio_net.c
@@ -1396,13 +1396,13 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,
rte_spinlock_lock(&vq->access_lock);
- if (unlikely(vq->enabled == 0))
+ if (unlikely(!vq->enabled))
goto out_access_unlock;
if (dev->features & (1ULL << VIRTIO_F_IOMMU_PLATFORM))
vhost_user_iotlb_rd_lock(vq);
- if (unlikely(vq->access_ok == 0))
+ if (unlikely(!vq->access_ok))
if (unlikely(vring_translate(dev, vq) < 0))
goto out;
@@ -1753,13 +1753,13 @@ virtio_dev_rx_async_submit(struct virtio_net *dev, uint16_t queue_id,
rte_spinlock_lock(&vq->access_lock);
- if (unlikely(vq->enabled == 0 || !vq->async_registered))
+ if (unlikely(!vq->enabled || !vq->async_registered))
goto out_access_unlock;
if (dev->features & (1ULL << VIRTIO_F_IOMMU_PLATFORM))
vhost_user_iotlb_rd_lock(vq);
- if (unlikely(vq->access_ok == 0))
+ if (unlikely(!vq->access_ok))
if (unlikely(vring_translate(dev, vq) < 0))
goto out;
@@ -2518,7 +2518,7 @@ rte_vhost_dequeue_burst(int vid, uint16_t queue_id,
if (unlikely(rte_spinlock_trylock(&vq->access_lock) == 0))
return 0;
- if (unlikely(vq->enabled == 0)) {
+ if (unlikely(!vq->enabled)) {
count = 0;
goto out_access_unlock;
}
@@ -2526,7 +2526,7 @@ rte_vhost_dequeue_burst(int vid, uint16_t queue_id,
if (dev->features & (1ULL << VIRTIO_F_IOMMU_PLATFORM))
vhost_user_iotlb_rd_lock(vq);
- if (unlikely(vq->access_ok == 0))
+ if (unlikely(!vq->access_ok))
if (unlikely(vring_translate(dev, vq) < 0)) {
count = 0;
goto out;
--
2.30.2
* Re: [dpdk-dev] [PATCH v4 0/3] vhost: make virtqueue cache-friendly
2021-03-23 9:02 [dpdk-dev] [PATCH v4 0/3] vhost: make virtqueue cache-friendly Maxime Coquelin
` (2 preceding siblings ...)
2021-03-23 9:02 ` [dpdk-dev] [PATCH v4 3/3] vhost: optimize vhost virtqueue struct Maxime Coquelin
@ 2021-03-23 10:30 ` David Marchand
2021-03-29 10:52 ` Balazs Nemeth
2021-03-31 6:04 ` Xia, Chenbo
4 siblings, 1 reply; 8+ messages in thread
From: David Marchand @ 2021-03-23 10:30 UTC (permalink / raw)
To: Maxime Coquelin
Cc: dev, Xia, Chenbo, Adrian Moreno Zapata, Olivier Matz, Balazs Nemeth
On Tue, Mar 23, 2021 at 10:02 AM Maxime Coquelin
<maxime.coquelin@redhat.com> wrote:
>
> As done for Virtio PMD, this series improves cache utilization
> of the vhost_virtqueue struct by removing unused field,
> make the live-migration cache dynamically allocated at
> live-migration setup time and by moving fields
> around so that hot fields are on the first cachelines.
>
> With this series, The struct vhost_virtqueue size goes
> from 832B (13 cachelines) down to 320B (5 cachelines).
>
> With this series and the virtio one, I measure a gain
> of up to 8% in IO loop micro-benchmark with packed
> ring, and 5% with split ring.
>
> I don't have a setup at hand to run PVP testing, but
> it might be interesting to get the numbers as I
> suspect the cache pressure is higher in this test as
> in real use-cases.
>
> Changes in v4:
> ==============
> - Fix missing changes to boolean (Chenbo)
>
For the series,
Reviewed-by: David Marchand <david.marchand@redhat.com>
Merci !
--
David Marchand
* Re: [dpdk-dev] [PATCH v4 3/3] vhost: optimize vhost virtqueue struct
2021-03-23 9:02 ` [dpdk-dev] [PATCH v4 3/3] vhost: optimize vhost virtqueue struct Maxime Coquelin
@ 2021-03-25 2:30 ` Xia, Chenbo
0 siblings, 0 replies; 8+ messages in thread
From: Xia, Chenbo @ 2021-03-25 2:30 UTC (permalink / raw)
To: Maxime Coquelin, dev, amorenoz, david.marchand, olivier.matz, bnemeth
> -----Original Message-----
> From: Maxime Coquelin <maxime.coquelin@redhat.com>
> Sent: Tuesday, March 23, 2021 5:02 PM
> To: dev@dpdk.org; Xia, Chenbo <chenbo.xia@intel.com>; amorenoz@redhat.com;
> david.marchand@redhat.com; olivier.matz@6wind.com; bnemeth@redhat.com
> Cc: Maxime Coquelin <maxime.coquelin@redhat.com>
> Subject: [PATCH v4 3/3] vhost: optimize vhost virtqueue struct
>
> This patch moves vhost_virtqueue struct fields in order
> to both optimize packing and move hot fields on the first
> cachelines.
>
> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> ---
> lib/librte_vhost/vhost.c | 6 ++--
> lib/librte_vhost/vhost.h | 54 ++++++++++++++++++-----------------
> lib/librte_vhost/vhost_user.c | 23 +++++++--------
> lib/librte_vhost/virtio_net.c | 12 ++++----
> 4 files changed, 48 insertions(+), 47 deletions(-)
>
> diff --git a/lib/librte_vhost/vhost.c b/lib/librte_vhost/vhost.c
> index a8032e3ba1..04d63b2f02 100644
> --- a/lib/librte_vhost/vhost.c
> +++ b/lib/librte_vhost/vhost.c
> @@ -524,7 +524,7 @@ vring_translate(struct virtio_net *dev, struct
> vhost_virtqueue *vq)
> if (log_translate(dev, vq) < 0)
> return -1;
>
> - vq->access_ok = 1;
> + vq->access_ok = true;
>
> return 0;
> }
> @@ -535,7 +535,7 @@ vring_invalidate(struct virtio_net *dev, struct
> vhost_virtqueue *vq)
> if (dev->features & (1ULL << VIRTIO_F_IOMMU_PLATFORM))
> vhost_user_iotlb_wr_lock(vq);
>
> - vq->access_ok = 0;
> + vq->access_ok = false;
> vq->desc = NULL;
> vq->avail = NULL;
> vq->used = NULL;
> @@ -1451,7 +1451,7 @@ rte_vhost_rx_queue_count(int vid, uint16_t qid)
>
> rte_spinlock_lock(&vq->access_lock);
>
> - if (unlikely(vq->enabled == 0 || vq->avail == NULL))
> + if (unlikely(!vq->enabled || vq->avail == NULL))
> goto out;
>
> ret = *((volatile uint16_t *)&vq->avail->idx) - vq->last_avail_idx;
> diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h
> index 3a71dfeed9..f628714c24 100644
> --- a/lib/librte_vhost/vhost.h
> +++ b/lib/librte_vhost/vhost.h
> @@ -133,7 +133,7 @@ struct vhost_virtqueue {
> struct vring_used *used;
> struct vring_packed_desc_event *device_event;
> };
> - uint32_t size;
> + uint16_t size;
>
> uint16_t last_avail_idx;
> uint16_t last_used_idx;
> @@ -143,29 +143,12 @@ struct vhost_virtqueue {
> #define VIRTIO_INVALID_EVENTFD (-1)
> #define VIRTIO_UNINITIALIZED_EVENTFD (-2)
>
> - int enabled;
> - int access_ok;
> - int ready;
> - int notif_enable;
> -#define VIRTIO_UNINITIALIZED_NOTIF (-1)
> + bool enabled;
> + bool access_ok;
> + bool ready;
>
> rte_spinlock_t access_lock;
>
> - /* Used to notify the guest (trigger interrupt) */
> - int callfd;
> - /* Currently unused as polling mode is enabled */
> - int kickfd;
> -
> - /* Physical address of used ring, for logging */
> - uint64_t log_guest_addr;
> -
> - /* inflight share memory info */
> - union {
> - struct rte_vhost_inflight_info_split *inflight_split;
> - struct rte_vhost_inflight_info_packed *inflight_packed;
> - };
> - struct rte_vhost_resubmit_info *resubmit_inflight;
> - uint64_t global_counter;
>
> union {
> struct vring_used_elem *shadow_used_split;
> @@ -176,22 +159,36 @@ struct vhost_virtqueue {
> uint16_t shadow_aligned_idx;
> /* Record packed ring first dequeue desc index */
> uint16_t shadow_last_used_idx;
> - struct vhost_vring_addr ring_addrs;
>
> - struct batch_copy_elem *batch_copy_elems;
> uint16_t batch_copy_nb_elems;
> + struct batch_copy_elem *batch_copy_elems;
> bool used_wrap_counter;
> bool avail_wrap_counter;
>
> - struct log_cache_entry *log_cache;
> - uint16_t log_cache_nb_elem;
> + /* Physical address of used ring, for logging */
> + uint16_t log_cache_nb_elem;
> + uint64_t log_guest_addr;
> + struct log_cache_entry *log_cache;
>
> rte_rwlock_t iotlb_lock;
> rte_rwlock_t iotlb_pending_lock;
> struct rte_mempool *iotlb_pool;
> TAILQ_HEAD(, vhost_iotlb_entry) iotlb_list;
> - int iotlb_cache_nr;
> TAILQ_HEAD(, vhost_iotlb_entry) iotlb_pending_list;
> + int iotlb_cache_nr;
> +
> + /* Used to notify the guest (trigger interrupt) */
> + int callfd;
> + /* Currently unused as polling mode is enabled */
> + int kickfd;
> +
> + /* inflight share memory info */
> + union {
> + struct rte_vhost_inflight_info_split *inflight_split;
> + struct rte_vhost_inflight_info_packed *inflight_packed;
> + };
> + struct rte_vhost_resubmit_info *resubmit_inflight;
> + uint64_t global_counter;
>
> /* operation callbacks for async dma */
> struct rte_vhost_async_channel_ops async_ops;
> @@ -212,6 +209,11 @@ struct vhost_virtqueue {
> bool async_inorder;
> bool async_registered;
> uint16_t async_threshold;
> +
> + int notif_enable;
> +#define VIRTIO_UNINITIALIZED_NOTIF (-1)
> +
> + struct vhost_vring_addr ring_addrs;
> } __rte_cache_aligned;
>
> /* Virtio device status as per Virtio specification */
> diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
> index 4d9e76e49e..2f4f89aeac 100644
> --- a/lib/librte_vhost/vhost_user.c
> +++ b/lib/librte_vhost/vhost_user.c
> @@ -406,6 +406,11 @@ vhost_user_set_vring_num(struct virtio_net **pdev,
> if (validate_msg_fds(msg, 0) != 0)
> return RTE_VHOST_MSG_RESULT_ERR;
>
> + if (msg->payload.state.num > 32768) {
> + VHOST_LOG_CONFIG(ERR, "invalid virtqueue size %u\n", msg-
> >payload.state.num);
> + return RTE_VHOST_MSG_RESULT_ERR;
> + }
> +
> vq->size = msg->payload.state.num;
>
> /* VIRTIO 1.0, 2.4 Virtqueues says:
> @@ -425,12 +430,6 @@ vhost_user_set_vring_num(struct virtio_net **pdev,
> }
> }
>
> - if (vq->size > 32768) {
> - VHOST_LOG_CONFIG(ERR,
> - "invalid virtqueue size %u\n", vq->size);
> - return RTE_VHOST_MSG_RESULT_ERR;
> - }
> -
> if (vq_is_packed(dev)) {
> if (vq->shadow_used_packed)
> rte_free(vq->shadow_used_packed);
> @@ -713,7 +712,7 @@ translate_ring_addresses(struct virtio_net *dev, int
> vq_index)
> return dev;
> }
>
> - vq->access_ok = 1;
> + vq->access_ok = true;
> return dev;
> }
>
> @@ -771,7 +770,7 @@ translate_ring_addresses(struct virtio_net *dev, int
> vq_index)
> vq->last_avail_idx = vq->used->idx;
> }
>
> - vq->access_ok = 1;
> + vq->access_ok = true;
>
> VHOST_LOG_CONFIG(DEBUG, "(%d) mapped address desc: %p\n",
> dev->vid, vq->desc);
> @@ -1658,7 +1657,7 @@ vhost_user_set_vring_call(struct virtio_net **pdev,
> struct VhostUserMsg *msg,
> vq = dev->virtqueue[file.index];
>
> if (vq->ready) {
> - vq->ready = 0;
> + vq->ready = false;
> vhost_user_notify_queue_state(dev, file.index, 0);
> }
>
> @@ -1918,14 +1917,14 @@ vhost_user_set_vring_kick(struct virtio_net **pdev,
> struct VhostUserMsg *msg,
> * the SET_VRING_ENABLE message.
> */
> if (!(dev->features & (1ULL << VHOST_USER_F_PROTOCOL_FEATURES))) {
> - vq->enabled = 1;
> + vq->enabled = true;
> if (dev->notify_ops->vring_state_changed)
> dev->notify_ops->vring_state_changed(
> dev->vid, file.index, 1);
> }
>
> if (vq->ready) {
> - vq->ready = 0;
> + vq->ready = false;
> vhost_user_notify_queue_state(dev, file.index, 0);
> }
>
> @@ -2043,7 +2042,7 @@ vhost_user_set_vring_enable(struct virtio_net **pdev,
> int main_fd __rte_unused)
> {
> struct virtio_net *dev = *pdev;
> - int enable = (int)msg->payload.state.num;
> + bool enable = !!msg->payload.state.num;
> int index = (int)msg->payload.state.index;
>
> if (validate_msg_fds(msg, 0) != 0)
> diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
> index 583bf379c6..3d8e29df09 100644
> --- a/lib/librte_vhost/virtio_net.c
> +++ b/lib/librte_vhost/virtio_net.c
> @@ -1396,13 +1396,13 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t
> queue_id,
>
> rte_spinlock_lock(&vq->access_lock);
>
> - if (unlikely(vq->enabled == 0))
> + if (unlikely(!vq->enabled))
> goto out_access_unlock;
>
> if (dev->features & (1ULL << VIRTIO_F_IOMMU_PLATFORM))
> vhost_user_iotlb_rd_lock(vq);
>
> - if (unlikely(vq->access_ok == 0))
> + if (unlikely(!vq->access_ok))
> if (unlikely(vring_translate(dev, vq) < 0))
> goto out;
>
> @@ -1753,13 +1753,13 @@ virtio_dev_rx_async_submit(struct virtio_net *dev,
> uint16_t queue_id,
>
> rte_spinlock_lock(&vq->access_lock);
>
> - if (unlikely(vq->enabled == 0 || !vq->async_registered))
> + if (unlikely(!vq->enabled || !vq->async_registered))
> goto out_access_unlock;
>
> if (dev->features & (1ULL << VIRTIO_F_IOMMU_PLATFORM))
> vhost_user_iotlb_rd_lock(vq);
>
> - if (unlikely(vq->access_ok == 0))
> + if (unlikely(!vq->access_ok))
> if (unlikely(vring_translate(dev, vq) < 0))
> goto out;
>
> @@ -2518,7 +2518,7 @@ rte_vhost_dequeue_burst(int vid, uint16_t queue_id,
> if (unlikely(rte_spinlock_trylock(&vq->access_lock) == 0))
> return 0;
>
> - if (unlikely(vq->enabled == 0)) {
> + if (unlikely(!vq->enabled)) {
> count = 0;
> goto out_access_unlock;
> }
> @@ -2526,7 +2526,7 @@ rte_vhost_dequeue_burst(int vid, uint16_t queue_id,
> if (dev->features & (1ULL << VIRTIO_F_IOMMU_PLATFORM))
> vhost_user_iotlb_rd_lock(vq);
>
> - if (unlikely(vq->access_ok == 0))
> + if (unlikely(!vq->access_ok))
> if (unlikely(vring_translate(dev, vq) < 0)) {
> count = 0;
> goto out;
> --
> 2.30.2
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
* Re: [dpdk-dev] [PATCH v4 0/3] vhost: make virtqueue cache-friendly
2021-03-23 10:30 ` [dpdk-dev] [PATCH v4 0/3] vhost: make virtqueue cache-friendly David Marchand
@ 2021-03-29 10:52 ` Balazs Nemeth
0 siblings, 0 replies; 8+ messages in thread
From: Balazs Nemeth @ 2021-03-29 10:52 UTC (permalink / raw)
To: David Marchand, Maxime Coquelin
Cc: dev, Xia, Chenbo, Adrian Moreno Zapata, Olivier Matz
On Tue, 2021-03-23 at 11:30 +0100, David Marchand wrote:
> On Tue, Mar 23, 2021 at 10:02 AM Maxime Coquelin
> <maxime.coquelin@redhat.com> wrote:
> >
> > As done for Virtio PMD, this series improves cache utilization
> > of the vhost_virtqueue struct by removing unused field,
> > make the live-migration cache dynamically allocated at
> > live-migration setup time and by moving fields
> > around so that hot fields are on the first cachelines.
> >
> > With this series, The struct vhost_virtqueue size goes
> > from 832B (13 cachelines) down to 320B (5 cachelines).
> >
> > With this series and the virtio one, I measure a gain
> > of up to 8% in IO loop micro-benchmark with packed
> > ring, and 5% with split ring.
> >
> > I don't have a setup at hand to run PVP testing, but
> > it might be interesting to get the numbers as I
> > suspect the cache pressure is higher in this test as
> > in real use-cases.
> >
> > Changes in v4:
> > ==============
> > - Fix missing changes to boolean (Chenbo)
> >
>
> For the series,
> Reviewed-by: David Marchand <david.marchand@redhat.com>
>
> Merci !
>
>
Tested this in a PVP setup on ARM, giving a slight improvement in
performance. For the series:
Tested-by: Balazs Nemeth <bnemeth@redhat.com>
* Re: [dpdk-dev] [PATCH v4 0/3] vhost: make virtqueue cache-friendly
2021-03-23 9:02 [dpdk-dev] [PATCH v4 0/3] vhost: make virtqueue cache-friendly Maxime Coquelin
` (3 preceding siblings ...)
2021-03-23 10:30 ` [dpdk-dev] [PATCH v4 0/3] vhost: make virtqueue cache-friendly David Marchand
@ 2021-03-31 6:04 ` Xia, Chenbo
4 siblings, 0 replies; 8+ messages in thread
From: Xia, Chenbo @ 2021-03-31 6:04 UTC (permalink / raw)
To: Maxime Coquelin, dev, amorenoz, david.marchand, olivier.matz, bnemeth
> -----Original Message-----
> From: Maxime Coquelin <maxime.coquelin@redhat.com>
> Sent: Tuesday, March 23, 2021 5:02 PM
> To: dev@dpdk.org; Xia, Chenbo <chenbo.xia@intel.com>; amorenoz@redhat.com;
> david.marchand@redhat.com; olivier.matz@6wind.com; bnemeth@redhat.com
> Cc: Maxime Coquelin <maxime.coquelin@redhat.com>
> Subject: [PATCH v4 0/3] vhost: make virtqueue cache-friendly
>
> As done for Virtio PMD, this series improves cache utilization
> of the vhost_virtqueue struct by removing unused field,
> make the live-migration cache dynamically allocated at
> live-migration setup time and by moving fields
> around so that hot fields are on the first cachelines.
>
> With this series, The struct vhost_virtqueue size goes
> from 832B (13 cachelines) down to 320B (5 cachelines).
>
> With this series and the virtio one, I measure a gain
> of up to 8% in IO loop micro-benchmark with packed
> ring, and 5% with split ring.
>
> I don't have a setup at hand to run PVP testing, but
> it might be interesting to get the numbers as I
> suspect the cache pressure is higher in this test as
> in real use-cases.
>
> Maxime Coquelin (3):
> vhost: remove unused Vhost virtqueue field
> vhost: move dirty logging cache out of the virtqueue
> vhost: optimize vhost virtqueue struct
>
> lib/librte_vhost/vhost.c | 21 +++++++++----
> lib/librte_vhost/vhost.h | 56 +++++++++++++++++------------------
> lib/librte_vhost/vhost_user.c | 44 +++++++++++++++++++--------
> lib/librte_vhost/virtio_net.c | 12 ++++----
> 4 files changed, 82 insertions(+), 51 deletions(-)
>
> --
> 2.30.2
Series applied to next-virtio/main, Thanks!