* [dpdk-dev] [PATCH v2 0/3] vhost: make virtqueue cache-friendly
@ 2021-03-16 12:41 Maxime Coquelin
2021-03-16 12:41 ` [dpdk-dev] [PATCH v2 1/3] vhost: remove unused Vhost virtqueue field Maxime Coquelin
` (2 more replies)
0 siblings, 3 replies; 8+ messages in thread
From: Maxime Coquelin @ 2021-03-16 12:41 UTC (permalink / raw)
To: dev, chenbo.xia, amorenoz, david.marchand, olivier.matz, bnemeth
Cc: Maxime Coquelin
As done for the Virtio PMD, this series improves cache utilization
of the vhost_virtqueue struct by removing an unused field,
allocating the live-migration (dirty logging) cache dynamically at
live-migration setup time, and moving fields
around so that hot fields are on the first cachelines.
With this series, the struct vhost_virtqueue size goes
from 832B (13 cachelines) down to 320B (5 cachelines).
With this series and the virtio one, I measure a gain
of up to 8% in an IO loop micro-benchmark with packed
ring, and 5% with split ring.
I don't have a setup at hand to run PVP testing, but
it might be interesting to get the numbers, as I
suspect the cache pressure is higher in this test, as
it is in real use-cases.
Changes in v2:
==============
- Add log_cache freeing in free_vq (Chenbo)
Maxime Coquelin (3):
vhost: remove unused Vhost virtqueue field
vhost: move dirty logging cache out of the virtqueue
vhost: optimize vhost virtqueue struct
lib/librte_vhost/vhost.c | 16 +++++++++--
lib/librte_vhost/vhost.h | 54 +++++++++++++++++------------------
lib/librte_vhost/vhost_user.c | 25 ++++++++++++++++
3 files changed, 66 insertions(+), 29 deletions(-)
--
2.29.2
* [dpdk-dev] [PATCH v2 1/3] vhost: remove unused Vhost virtqueue field
2021-03-16 12:41 [dpdk-dev] [PATCH v2 0/3] vhost: make virtqueue cache-friendly Maxime Coquelin
@ 2021-03-16 12:41 ` Maxime Coquelin
2021-03-16 12:41 ` [dpdk-dev] [PATCH v2 2/3] vhost: move dirty logging cache out of the virtqueue Maxime Coquelin
2021-03-16 12:41 ` [dpdk-dev] [PATCH v2 3/3] vhost: optimize vhost virtqueue struct Maxime Coquelin
2 siblings, 0 replies; 8+ messages in thread
From: Maxime Coquelin @ 2021-03-16 12:41 UTC (permalink / raw)
To: dev, chenbo.xia, amorenoz, david.marchand, olivier.matz, bnemeth
Cc: Maxime Coquelin
This patch removes the "backend" field of the
vhost_virtqueue struct, which is not used by the
library.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
---
lib/librte_vhost/vhost.c | 2 --
lib/librte_vhost/vhost.h | 2 --
2 files changed, 4 deletions(-)
diff --git a/lib/librte_vhost/vhost.c b/lib/librte_vhost/vhost.c
index 52ab93d1ec..5a7c0c6cff 100644
--- a/lib/librte_vhost/vhost.c
+++ b/lib/librte_vhost/vhost.c
@@ -558,8 +558,6 @@ init_vring_queue(struct virtio_net *dev, uint32_t vring_idx)
vq->notif_enable = VIRTIO_UNINITIALIZED_NOTIF;
vhost_user_iotlb_init(dev, vring_idx);
- /* Backends are set to -1 indicating an inactive device. */
- vq->backend = -1;
}
static void
diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h
index 658f6fc287..717f410548 100644
--- a/lib/librte_vhost/vhost.h
+++ b/lib/librte_vhost/vhost.h
@@ -143,8 +143,6 @@ struct vhost_virtqueue {
#define VIRTIO_INVALID_EVENTFD (-1)
#define VIRTIO_UNINITIALIZED_EVENTFD (-2)
- /* Backend value to determine if device should started/stopped */
- int backend;
int enabled;
int access_ok;
int ready;
--
2.29.2
* [dpdk-dev] [PATCH v2 2/3] vhost: move dirty logging cache out of the virtqueue
2021-03-16 12:41 [dpdk-dev] [PATCH v2 0/3] vhost: make virtqueue cache-friendly Maxime Coquelin
2021-03-16 12:41 ` [dpdk-dev] [PATCH v2 1/3] vhost: remove unused Vhost virtqueue field Maxime Coquelin
@ 2021-03-16 12:41 ` Maxime Coquelin
2021-03-16 13:13 ` David Marchand
2021-03-16 12:41 ` [dpdk-dev] [PATCH v2 3/3] vhost: optimize vhost virtqueue struct Maxime Coquelin
2 siblings, 1 reply; 8+ messages in thread
From: Maxime Coquelin @ 2021-03-16 12:41 UTC (permalink / raw)
To: dev, chenbo.xia, amorenoz, david.marchand, olivier.matz, bnemeth
Cc: Maxime Coquelin
This patch moves the per-virtqueue dirty logging cache
out of the virtqueue struct, by allocating it dynamically
only when live-migration is enabled.
It saves 8 cachelines in the vhost_virtqueue struct.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
lib/librte_vhost/vhost.c | 14 ++++++++++++++
lib/librte_vhost/vhost.h | 2 +-
lib/librte_vhost/vhost_user.c | 25 +++++++++++++++++++++++++
3 files changed, 40 insertions(+), 1 deletion(-)
diff --git a/lib/librte_vhost/vhost.c b/lib/librte_vhost/vhost.c
index 5a7c0c6cff..c3490ce897 100644
--- a/lib/librte_vhost/vhost.c
+++ b/lib/librte_vhost/vhost.c
@@ -145,6 +145,10 @@ __vhost_log_cache_sync(struct virtio_net *dev, struct vhost_virtqueue *vq)
if (unlikely(!dev->log_base))
return;
+ /* No cache, nothing to sync */
+ if (unlikely(!vq->log_cache))
+ return;
+
rte_atomic_thread_fence(__ATOMIC_RELEASE);
log_base = (unsigned long *)(uintptr_t)dev->log_base;
@@ -177,6 +181,14 @@ vhost_log_cache_page(struct virtio_net *dev, struct vhost_virtqueue *vq,
uint32_t offset = page / (sizeof(unsigned long) << 3);
int i;
+ if (unlikely(!vq->log_cache)) {
+ /* No logging cache allocated, write dirty log map directly */
+ rte_smp_wmb();
+ vhost_log_page((uint8_t *)(uintptr_t)dev->log_base, page);
+
+ return;
+ }
+
for (i = 0; i < vq->log_cache_nb_elem; i++) {
struct log_cache_entry *elem = vq->log_cache + i;
@@ -354,6 +366,8 @@ free_vq(struct virtio_net *dev, struct vhost_virtqueue *vq)
}
rte_free(vq->batch_copy_elems);
rte_mempool_free(vq->iotlb_pool);
+ if (vq->log_cache)
+ rte_free(vq->log_cache);
rte_free(vq);
}
diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h
index 717f410548..3a71dfeed9 100644
--- a/lib/librte_vhost/vhost.h
+++ b/lib/librte_vhost/vhost.h
@@ -183,7 +183,7 @@ struct vhost_virtqueue {
bool used_wrap_counter;
bool avail_wrap_counter;
- struct log_cache_entry log_cache[VHOST_LOG_CACHE_NR];
+ struct log_cache_entry *log_cache;
uint16_t log_cache_nb_elem;
rte_rwlock_t iotlb_lock;
diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
index a60bb945ad..0f452d6ff3 100644
--- a/lib/librte_vhost/vhost_user.c
+++ b/lib/librte_vhost/vhost_user.c
@@ -2022,6 +2022,11 @@ vhost_user_get_vring_base(struct virtio_net **pdev,
rte_free(vq->batch_copy_elems);
vq->batch_copy_elems = NULL;
+ if (vq->log_cache) {
+ rte_free(vq->log_cache);
+ vq->log_cache = NULL;
+ }
+
msg->size = sizeof(msg->payload.state);
msg->fd_num = 0;
@@ -2121,6 +2126,7 @@ vhost_user_set_log_base(struct virtio_net **pdev, struct VhostUserMsg *msg,
int fd = msg->fds[0];
uint64_t size, off;
void *addr;
+ uint32_t i;
if (validate_msg_fds(msg, 1) != 0)
return RTE_VHOST_MSG_RESULT_ERR;
@@ -2174,6 +2180,25 @@ vhost_user_set_log_base(struct virtio_net **pdev, struct VhostUserMsg *msg,
dev->log_base = dev->log_addr + off;
dev->log_size = size;
+ for (i = 0; i < dev->nr_vring; i++) {
+ struct vhost_virtqueue *vq = dev->virtqueue[i];
+
+ if (vq->log_cache) {
+ rte_free(vq->log_cache);
+ vq->log_cache = NULL;
+ }
+ vq->log_cache_nb_elem = 0;
+ vq->log_cache = rte_zmalloc("vq log cache",
+ sizeof(struct log_cache_entry) * VHOST_LOG_CACHE_NR,
+ 0);
+ /*
+ * If log cache alloc fail, don't fail migration, but no
+ * caching will be done, which will impact performance
+ */
+ if (!vq->log_cache)
+ VHOST_LOG_CONFIG(ERR, "Failed to allocate VQ logging cache\n");
+ }
+
/*
* The spec is not clear about it (yet), but QEMU doesn't expect
* any payload in the reply.
--
2.29.2
* [dpdk-dev] [PATCH v2 3/3] vhost: optimize vhost virtqueue struct
2021-03-16 12:41 [dpdk-dev] [PATCH v2 0/3] vhost: make virtqueue cache-friendly Maxime Coquelin
2021-03-16 12:41 ` [dpdk-dev] [PATCH v2 1/3] vhost: remove unused Vhost virtqueue field Maxime Coquelin
2021-03-16 12:41 ` [dpdk-dev] [PATCH v2 2/3] vhost: move dirty logging cache out of the virtqueue Maxime Coquelin
@ 2021-03-16 12:41 ` Maxime Coquelin
2021-03-16 13:38 ` David Marchand
2 siblings, 1 reply; 8+ messages in thread
From: Maxime Coquelin @ 2021-03-16 12:41 UTC (permalink / raw)
To: dev, chenbo.xia, amorenoz, david.marchand, olivier.matz, bnemeth
Cc: Maxime Coquelin
This patch moves vhost_virtuqueue struct fields in order
to both optimize packing and move hot fields on the first
cachelines.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
lib/librte_vhost/vhost.h | 52 +++++++++++++++++++++-------------------
1 file changed, 27 insertions(+), 25 deletions(-)
diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h
index 3a71dfeed9..ae45b05dd1 100644
--- a/lib/librte_vhost/vhost.h
+++ b/lib/librte_vhost/vhost.h
@@ -133,7 +133,7 @@ struct vhost_virtqueue {
struct vring_used *used;
struct vring_packed_desc_event *device_event;
};
- uint32_t size;
+ uint16_t size;
uint16_t last_avail_idx;
uint16_t last_used_idx;
@@ -143,29 +143,12 @@ struct vhost_virtqueue {
#define VIRTIO_INVALID_EVENTFD (-1)
#define VIRTIO_UNINITIALIZED_EVENTFD (-2)
- int enabled;
- int access_ok;
- int ready;
- int notif_enable;
-#define VIRTIO_UNINITIALIZED_NOTIF (-1)
+ bool enabled;
+ bool access_ok;
+ bool ready;
rte_spinlock_t access_lock;
- /* Used to notify the guest (trigger interrupt) */
- int callfd;
- /* Currently unused as polling mode is enabled */
- int kickfd;
-
- /* Physical address of used ring, for logging */
- uint64_t log_guest_addr;
-
- /* inflight share memory info */
- union {
- struct rte_vhost_inflight_info_split *inflight_split;
- struct rte_vhost_inflight_info_packed *inflight_packed;
- };
- struct rte_vhost_resubmit_info *resubmit_inflight;
- uint64_t global_counter;
union {
struct vring_used_elem *shadow_used_split;
@@ -176,22 +159,36 @@ struct vhost_virtqueue {
uint16_t shadow_aligned_idx;
/* Record packed ring first dequeue desc index */
uint16_t shadow_last_used_idx;
- struct vhost_vring_addr ring_addrs;
- struct batch_copy_elem *batch_copy_elems;
uint16_t batch_copy_nb_elems;
+ struct batch_copy_elem *batch_copy_elems;
bool used_wrap_counter;
bool avail_wrap_counter;
- struct log_cache_entry *log_cache;
+ /* Physical address of used ring, for logging */
uint16_t log_cache_nb_elem;
+ uint64_t log_guest_addr;
+ struct log_cache_entry *log_cache;
rte_rwlock_t iotlb_lock;
rte_rwlock_t iotlb_pending_lock;
struct rte_mempool *iotlb_pool;
TAILQ_HEAD(, vhost_iotlb_entry) iotlb_list;
- int iotlb_cache_nr;
TAILQ_HEAD(, vhost_iotlb_entry) iotlb_pending_list;
+ int iotlb_cache_nr;
+
+ /* Used to notify the guest (trigger interrupt) */
+ int callfd;
+ /* Currently unused as polling mode is enabled */
+ int kickfd;
+
+ /* inflight share memory info */
+ union {
+ struct rte_vhost_inflight_info_split *inflight_split;
+ struct rte_vhost_inflight_info_packed *inflight_packed;
+ };
+ struct rte_vhost_resubmit_info *resubmit_inflight;
+ uint64_t global_counter;
/* operation callbacks for async dma */
struct rte_vhost_async_channel_ops async_ops;
@@ -212,6 +209,11 @@ struct vhost_virtqueue {
bool async_inorder;
bool async_registered;
uint16_t async_threshold;
+
+ int notif_enable;
+#define VIRTIO_UNINITIALIZED_NOTIF (-1)
+
+ struct vhost_vring_addr ring_addrs;
} __rte_cache_aligned;
/* Virtio device status as per Virtio specification */
--
2.29.2
* Re: [dpdk-dev] [PATCH v2 2/3] vhost: move dirty logging cache out of the virtqueue
2021-03-16 12:41 ` [dpdk-dev] [PATCH v2 2/3] vhost: move dirty logging cache out of the virtqueue Maxime Coquelin
@ 2021-03-16 13:13 ` David Marchand
2021-03-17 10:20 ` Maxime Coquelin
0 siblings, 1 reply; 8+ messages in thread
From: David Marchand @ 2021-03-16 13:13 UTC (permalink / raw)
To: Maxime Coquelin
Cc: dev, Xia, Chenbo, Adrian Moreno Zapata, Olivier Matz, bnemeth
On Tue, Mar 16, 2021 at 1:42 PM Maxime Coquelin
<maxime.coquelin@redhat.com> wrote:
>
> This patch moves the per-virtqueue's dirty logging cache
> out of the virtqueue struct, by allocating it dynamically
> only when live-migration is enabled.
>
> It saves 8 cachelines in vhost_virtqueue struct.
>
> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> ---
> lib/librte_vhost/vhost.c | 14 ++++++++++++++
> lib/librte_vhost/vhost.h | 2 +-
> lib/librte_vhost/vhost_user.c | 25 +++++++++++++++++++++++++
> 3 files changed, 40 insertions(+), 1 deletion(-)
>
> diff --git a/lib/librte_vhost/vhost.c b/lib/librte_vhost/vhost.c
> index 5a7c0c6cff..c3490ce897 100644
> --- a/lib/librte_vhost/vhost.c
> +++ b/lib/librte_vhost/vhost.c
> @@ -145,6 +145,10 @@ __vhost_log_cache_sync(struct virtio_net *dev, struct vhost_virtqueue *vq)
> if (unlikely(!dev->log_base))
> return;
>
> + /* No cache, nothing to sync */
> + if (unlikely(!vq->log_cache))
> + return;
> +
> rte_atomic_thread_fence(__ATOMIC_RELEASE);
>
> log_base = (unsigned long *)(uintptr_t)dev->log_base;
> @@ -177,6 +181,14 @@ vhost_log_cache_page(struct virtio_net *dev, struct vhost_virtqueue *vq,
> uint32_t offset = page / (sizeof(unsigned long) << 3);
> int i;
>
> + if (unlikely(!vq->log_cache)) {
> + /* No logging cache allocated, write dirty log map directly */
> + rte_smp_wmb();
We try not to reintroduce full barriers (checkpatch caught this).
> + vhost_log_page((uint8_t *)(uintptr_t)dev->log_base, page);
> +
> + return;
> + }
> +
> for (i = 0; i < vq->log_cache_nb_elem; i++) {
> struct log_cache_entry *elem = vq->log_cache + i;
>
> @@ -354,6 +366,8 @@ free_vq(struct virtio_net *dev, struct vhost_virtqueue *vq)
> }
> rte_free(vq->batch_copy_elems);
> rte_mempool_free(vq->iotlb_pool);
> + if (vq->log_cache)
> + rte_free(vq->log_cache);
No if() needed.
> rte_free(vq);
> }
>
> diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h
> index 717f410548..3a71dfeed9 100644
> --- a/lib/librte_vhost/vhost.h
> +++ b/lib/librte_vhost/vhost.h
> @@ -183,7 +183,7 @@ struct vhost_virtqueue {
> bool used_wrap_counter;
> bool avail_wrap_counter;
>
> - struct log_cache_entry log_cache[VHOST_LOG_CACHE_NR];
> + struct log_cache_entry *log_cache;
> uint16_t log_cache_nb_elem;
>
> rte_rwlock_t iotlb_lock;
> diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
> index a60bb945ad..0f452d6ff3 100644
> --- a/lib/librte_vhost/vhost_user.c
> +++ b/lib/librte_vhost/vhost_user.c
> @@ -2022,6 +2022,11 @@ vhost_user_get_vring_base(struct virtio_net **pdev,
> rte_free(vq->batch_copy_elems);
> vq->batch_copy_elems = NULL;
>
> + if (vq->log_cache) {
> + rte_free(vq->log_cache);
> + vq->log_cache = NULL;
> + }
Idem.
> +
> msg->size = sizeof(msg->payload.state);
> msg->fd_num = 0;
>
> @@ -2121,6 +2126,7 @@ vhost_user_set_log_base(struct virtio_net **pdev, struct VhostUserMsg *msg,
> int fd = msg->fds[0];
> uint64_t size, off;
> void *addr;
> + uint32_t i;
>
> if (validate_msg_fds(msg, 1) != 0)
> return RTE_VHOST_MSG_RESULT_ERR;
> @@ -2174,6 +2180,25 @@ vhost_user_set_log_base(struct virtio_net **pdev, struct VhostUserMsg *msg,
> dev->log_base = dev->log_addr + off;
> dev->log_size = size;
>
> + for (i = 0; i < dev->nr_vring; i++) {
> + struct vhost_virtqueue *vq = dev->virtqueue[i];
> +
> + if (vq->log_cache) {
> + rte_free(vq->log_cache);
> + vq->log_cache = NULL;
> + }
Idem.
> + vq->log_cache_nb_elem = 0;
> + vq->log_cache = rte_zmalloc("vq log cache",
> + sizeof(struct log_cache_entry) * VHOST_LOG_CACHE_NR,
> + 0);
> + /*
> + * If log cache alloc fail, don't fail migration, but no
> + * caching will be done, which will impact performance
> + */
> + if (!vq->log_cache)
> + VHOST_LOG_CONFIG(ERR, "Failed to allocate VQ logging cache\n");
> + }
> +
> /*
> * The spec is not clear about it (yet), but QEMU doesn't expect
> * any payload in the reply.
> --
> 2.29.2
>
--
David Marchand
* Re: [dpdk-dev] [PATCH v2 3/3] vhost: optimize vhost virtqueue struct
2021-03-16 12:41 ` [dpdk-dev] [PATCH v2 3/3] vhost: optimize vhost virtqueue struct Maxime Coquelin
@ 2021-03-16 13:38 ` David Marchand
2021-03-17 10:26 ` Maxime Coquelin
0 siblings, 1 reply; 8+ messages in thread
From: David Marchand @ 2021-03-16 13:38 UTC (permalink / raw)
To: Maxime Coquelin
Cc: dev, Xia, Chenbo, Adrian Moreno Zapata, Olivier Matz, bnemeth
On Tue, Mar 16, 2021 at 1:42 PM Maxime Coquelin
<maxime.coquelin@redhat.com> wrote:
>
> This patch moves vhost_virtuqueue struct fields in order
virtqueue
You might want to add something in your local dictionary :-p.
> to both optimize packing and move hot fields on the first
> cachelines.
>
> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> ---
> lib/librte_vhost/vhost.h | 52 +++++++++++++++++++++-------------------
> 1 file changed, 27 insertions(+), 25 deletions(-)
>
> diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h
> index 3a71dfeed9..ae45b05dd1 100644
> --- a/lib/librte_vhost/vhost.h
> +++ b/lib/librte_vhost/vhost.h
> @@ -133,7 +133,7 @@ struct vhost_virtqueue {
> struct vring_used *used;
> struct vring_packed_desc_event *device_event;
> };
> - uint32_t size;
> + uint16_t size;
There is one site in the code that could be problematic:
vhost_user_set_vring_num()
...
vq->size = msg->payload.state.num;
...
if (vq->size > 32768) {
payload.state.num is an unsigned int, so we'd better check its value
as an unsigned int, before storing to vq->size.
>
> uint16_t last_avail_idx;
> uint16_t last_used_idx;
> @@ -143,29 +143,12 @@ struct vhost_virtqueue {
> #define VIRTIO_INVALID_EVENTFD (-1)
> #define VIRTIO_UNINITIALIZED_EVENTFD (-2)
>
> - int enabled;
> - int access_ok;
> - int ready;
> - int notif_enable;
> -#define VIRTIO_UNINITIALIZED_NOTIF (-1)
> + bool enabled;
> + bool access_ok;
> + bool ready;
Changing those types is fine, but it is a bit odd to still see
boolean_var = 0 or boolean_var = 1 in the rest of the code.
>
> rte_spinlock_t access_lock;
>
> - /* Used to notify the guest (trigger interrupt) */
> - int callfd;
> - /* Currently unused as polling mode is enabled */
> - int kickfd;
> -
> - /* Physical address of used ring, for logging */
> - uint64_t log_guest_addr;
> -
> - /* inflight share memory info */
> - union {
> - struct rte_vhost_inflight_info_split *inflight_split;
> - struct rte_vhost_inflight_info_packed *inflight_packed;
> - };
> - struct rte_vhost_resubmit_info *resubmit_inflight;
> - uint64_t global_counter;
>
> union {
> struct vring_used_elem *shadow_used_split;
> @@ -176,22 +159,36 @@ struct vhost_virtqueue {
> uint16_t shadow_aligned_idx;
> /* Record packed ring first dequeue desc index */
> uint16_t shadow_last_used_idx;
> - struct vhost_vring_addr ring_addrs;
>
> - struct batch_copy_elem *batch_copy_elems;
> uint16_t batch_copy_nb_elems;
> + struct batch_copy_elem *batch_copy_elems;
> bool used_wrap_counter;
> bool avail_wrap_counter;
>
> - struct log_cache_entry *log_cache;
> + /* Physical address of used ring, for logging */
> uint16_t log_cache_nb_elem;
Indent is broken / not consistent, probably because you cut/pasted lines.
> + uint64_t log_guest_addr;
> + struct log_cache_entry *log_cache;
>
> rte_rwlock_t iotlb_lock;
> rte_rwlock_t iotlb_pending_lock;
> struct rte_mempool *iotlb_pool;
> TAILQ_HEAD(, vhost_iotlb_entry) iotlb_list;
> - int iotlb_cache_nr;
> TAILQ_HEAD(, vhost_iotlb_entry) iotlb_pending_list;
> + int iotlb_cache_nr;
> +
> + /* Used to notify the guest (trigger interrupt) */
> + int callfd;
> + /* Currently unused as polling mode is enabled */
> + int kickfd;
> +
> + /* inflight share memory info */
> + union {
> + struct rte_vhost_inflight_info_split *inflight_split;
> + struct rte_vhost_inflight_info_packed *inflight_packed;
> + };
> + struct rte_vhost_resubmit_info *resubmit_inflight;
> + uint64_t global_counter;
>
> /* operation callbacks for async dma */
> struct rte_vhost_async_channel_ops async_ops;
> @@ -212,6 +209,11 @@ struct vhost_virtqueue {
> bool async_inorder;
> bool async_registered;
> uint16_t async_threshold;
> +
> + int notif_enable;
> +#define VIRTIO_UNINITIALIZED_NOTIF (-1)
> +
> + struct vhost_vring_addr ring_addrs;
> } __rte_cache_aligned;
>
> /* Virtio device status as per Virtio specification */
> --
> 2.29.2
>
--
David Marchand
* Re: [dpdk-dev] [PATCH v2 2/3] vhost: move dirty logging cache out of the virtqueue
2021-03-16 13:13 ` David Marchand
@ 2021-03-17 10:20 ` Maxime Coquelin
0 siblings, 0 replies; 8+ messages in thread
From: Maxime Coquelin @ 2021-03-17 10:20 UTC (permalink / raw)
To: David Marchand
Cc: dev, Xia, Chenbo, Adrian Moreno Zapata, Olivier Matz, bnemeth
On 3/16/21 2:13 PM, David Marchand wrote:
> On Tue, Mar 16, 2021 at 1:42 PM Maxime Coquelin
> <maxime.coquelin@redhat.com> wrote:
>>
>> This patch moves the per-virtqueue's dirty logging cache
>> out of the virtqueue struct, by allocating it dynamically
>> only when live-migration is enabled.
>>
>> It saves 8 cachelines in vhost_virtqueue struct.
>>
>> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
>> ---
>> lib/librte_vhost/vhost.c | 14 ++++++++++++++
>> lib/librte_vhost/vhost.h | 2 +-
>> lib/librte_vhost/vhost_user.c | 25 +++++++++++++++++++++++++
>> 3 files changed, 40 insertions(+), 1 deletion(-)
>>
>> diff --git a/lib/librte_vhost/vhost.c b/lib/librte_vhost/vhost.c
>> index 5a7c0c6cff..c3490ce897 100644
>> --- a/lib/librte_vhost/vhost.c
>> +++ b/lib/librte_vhost/vhost.c
>> @@ -145,6 +145,10 @@ __vhost_log_cache_sync(struct virtio_net *dev, struct vhost_virtqueue *vq)
>> if (unlikely(!dev->log_base))
>> return;
>>
>> + /* No cache, nothing to sync */
>> + if (unlikely(!vq->log_cache))
>> + return;
>> +
>> rte_atomic_thread_fence(__ATOMIC_RELEASE);
>>
>> log_base = (unsigned long *)(uintptr_t)dev->log_base;
>> @@ -177,6 +181,14 @@ vhost_log_cache_page(struct virtio_net *dev, struct vhost_virtqueue *vq,
>> uint32_t offset = page / (sizeof(unsigned long) << 3);
>> int i;
>>
>> + if (unlikely(!vq->log_cache)) {
>> + /* No logging cache allocated, write dirty log map directly */
>> + rte_smp_wmb();
>
> We try not to reintroduce full barriers (checkpatch caught this).
Agreed, the rebase silently reintroduced it.
>
>> + vhost_log_page((uint8_t *)(uintptr_t)dev->log_base, page);
>> +
>> + return;
>> + }
>> +
>> for (i = 0; i < vq->log_cache_nb_elem; i++) {
>> struct log_cache_entry *elem = vq->log_cache + i;
>>
>> @@ -354,6 +366,8 @@ free_vq(struct virtio_net *dev, struct vhost_virtqueue *vq)
>> }
>> rte_free(vq->batch_copy_elems);
>> rte_mempool_free(vq->iotlb_pool);
>> + if (vq->log_cache)
>> + rte_free(vq->log_cache);
>
> No if() needed.
Yes, will fix in next revision.
>
>> rte_free(vq);
>> }
>>
>> diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h
>> index 717f410548..3a71dfeed9 100644
>> --- a/lib/librte_vhost/vhost.h
>> +++ b/lib/librte_vhost/vhost.h
>> @@ -183,7 +183,7 @@ struct vhost_virtqueue {
>> bool used_wrap_counter;
>> bool avail_wrap_counter;
>>
>> - struct log_cache_entry log_cache[VHOST_LOG_CACHE_NR];
>> + struct log_cache_entry *log_cache;
>> uint16_t log_cache_nb_elem;
>>
>> rte_rwlock_t iotlb_lock;
>> diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
>> index a60bb945ad..0f452d6ff3 100644
>> --- a/lib/librte_vhost/vhost_user.c
>> +++ b/lib/librte_vhost/vhost_user.c
>> @@ -2022,6 +2022,11 @@ vhost_user_get_vring_base(struct virtio_net **pdev,
>> rte_free(vq->batch_copy_elems);
>> vq->batch_copy_elems = NULL;
>>
>> + if (vq->log_cache) {
>> + rte_free(vq->log_cache);
>> + vq->log_cache = NULL;
>> + }
>
> Idem.
>
>
Thanks,
Maxime
* Re: [dpdk-dev] [PATCH v2 3/3] vhost: optimize vhost virtqueue struct
2021-03-16 13:38 ` David Marchand
@ 2021-03-17 10:26 ` Maxime Coquelin
0 siblings, 0 replies; 8+ messages in thread
From: Maxime Coquelin @ 2021-03-17 10:26 UTC (permalink / raw)
To: David Marchand
Cc: dev, Xia, Chenbo, Adrian Moreno Zapata, Olivier Matz, bnemeth
On 3/16/21 2:38 PM, David Marchand wrote:
> On Tue, Mar 16, 2021 at 1:42 PM Maxime Coquelin
> <maxime.coquelin@redhat.com> wrote:
>>
>> This patch moves vhost_virtuqueue struct fields in order
>
> virtqueue
> You might want to add something in your local dictionary :-p.
>
>
>> to both optimize packing and move hot fields on the first
>> cachelines.
>>
>> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
>> ---
>> lib/librte_vhost/vhost.h | 52 +++++++++++++++++++++-------------------
>> 1 file changed, 27 insertions(+), 25 deletions(-)
>>
>> diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h
>> index 3a71dfeed9..ae45b05dd1 100644
>> --- a/lib/librte_vhost/vhost.h
>> +++ b/lib/librte_vhost/vhost.h
>> @@ -133,7 +133,7 @@ struct vhost_virtqueue {
>> struct vring_used *used;
>> struct vring_packed_desc_event *device_event;
>> };
>> - uint32_t size;
>> + uint16_t size;
>
> There is one site in the code that could be problematic:
>
> vhost_user_set_vring_num()
> ...
> vq->size = msg->payload.state.num;
> ...
> if (vq->size > 32768) {
>
> payload.state.num is an unsigned int, so we'd better check its value
> as an unsigned int, before storing to vq->size.
Indeed, I will rework the function so that it checks that
msg->payload.state.num is smaller than 32K to comply with the spec,
and then remove the later check on vq->size.
>
>>
>> uint16_t last_avail_idx;
>> uint16_t last_used_idx;
>> @@ -143,29 +143,12 @@ struct vhost_virtqueue {
>> #define VIRTIO_INVALID_EVENTFD (-1)
>> #define VIRTIO_UNINITIALIZED_EVENTFD (-2)
>>
>> - int enabled;
>> - int access_ok;
>> - int ready;
>> - int notif_enable;
>> -#define VIRTIO_UNINITIALIZED_NOTIF (-1)
>> + bool enabled;
>> + bool access_ok;
>> + bool ready;
>
> Changing those types is fine, but it is a bit odd to still see
> boolean_var = 0 or boolean_var = 1 in the rest of the code.
Yes, I will fix all the places that assign to these fields so that they use boolean values.
>
>>
>> rte_spinlock_t access_lock;
>>
>> - /* Used to notify the guest (trigger interrupt) */
>> - int callfd;
>> - /* Currently unused as polling mode is enabled */
>> - int kickfd;
>> -
>> - /* Physical address of used ring, for logging */
>> - uint64_t log_guest_addr;
>> -
>> - /* inflight share memory info */
>> - union {
>> - struct rte_vhost_inflight_info_split *inflight_split;
>> - struct rte_vhost_inflight_info_packed *inflight_packed;
>> - };
>> - struct rte_vhost_resubmit_info *resubmit_inflight;
>> - uint64_t global_counter;
>>
>> union {
>> struct vring_used_elem *shadow_used_split;
>> @@ -176,22 +159,36 @@ struct vhost_virtqueue {
>> uint16_t shadow_aligned_idx;
>> /* Record packed ring first dequeue desc index */
>> uint16_t shadow_last_used_idx;
>> - struct vhost_vring_addr ring_addrs;
>>
>> - struct batch_copy_elem *batch_copy_elems;
>> uint16_t batch_copy_nb_elems;
>> + struct batch_copy_elem *batch_copy_elems;
>> bool used_wrap_counter;
>> bool avail_wrap_counter;
>>
>> - struct log_cache_entry *log_cache;
>> + /* Physical address of used ring, for logging */
>> uint16_t log_cache_nb_elem;
>
> Indent is broken / not consistent, probably because you cut/pasted lines.
Will fix it.
Thanks,
Maxime