* [PATCH v3 0/5] Support add/remove memory region & get-max-slots
@ 2025-11-04 4:21 Pravin M Bathija
2025-11-04 4:21 ` [PATCH v3 1/5] vhost: add user to mailmap and define to vhost hdr Pravin M Bathija
` (4 more replies)
0 siblings, 5 replies; 11+ messages in thread
From: Pravin M Bathija @ 2025-11-04 4:21 UTC (permalink / raw)
To: dev; +Cc: pravin.bathija, pravin.m.bathija.dev
This patchset adds support for adding and removing memory regions,
querying the maximum number of memory slots, and other related changes
to the vhost-user message handling. These messages are sent from a
vhost-user front-end (qemu or libblkio) to a vhost-user back-end
(dpdk, spdk). Common support functions backing these message handlers
have been implemented to keep the code compact, and older parts of the
vhost-user back-end have also been reworked to use these newly defined
support functions. This implementation has been extensively tested by
doing read/write I/O from multiple instances of fio + libblkio
(front-end) against spdk/dpdk (back-end) based drives, and with a qemu
front-end talking to dpdk + testpmd (back-end) while performing
add/removal of memory regions. The last patch also increases the
number of memory regions from 8 to 128; this has likewise been
extensively tested using the above approaches.
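For reference, the payload used by the new add/remove messages can be
pictured with the small standalone sketch below. This is illustrative
only: the struct layouts mirror VhostUserMemoryRegion and
VhostUserSingleMemReg from the patches, the addresses and sizes are
made-up example values, and a real front-end (qemu or libblkio) sends
this payload together with the backing fd over the vhost-user socket.

#include <stdint.h>
#include <stdio.h>

struct mem_region {                /* mirrors VhostUserMemoryRegion */
        uint64_t guest_phys_addr;  /* guest physical address (GPA) */
        uint64_t memory_size;      /* region length in bytes */
        uint64_t userspace_addr;   /* front-end virtual address */
        uint64_t mmap_offset;      /* offset of the region within the fd */
};

struct single_mem_reg {            /* mirrors VhostUserSingleMemReg */
        uint64_t padding;
        struct mem_region region;
};

int main(void)
{
        struct single_mem_reg req = {
                .region = {
                        .guest_phys_addr = 0x100000000ULL,
                        .memory_size     = 64ULL << 20,  /* example: 64 MiB */
                        .userspace_addr  = 0x7f0000000000ULL,
                        .mmap_offset     = 0,
                },
        };

        /* A real front-end would send VHOST_USER_ADD_MEM_REG (37) with this
         * payload plus the backing fd as ancillary data; printed here only
         * to make the layout concrete. */
        printf("ADD_MEM_REG gpa=0x%llx size=0x%llx uva=0x%llx off=0x%llx\n",
               (unsigned long long)req.region.guest_phys_addr,
               (unsigned long long)req.region.memory_size,
               (unsigned long long)req.region.userspace_addr,
               (unsigned long long)req.region.mmap_offset);
        return 0;
}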
Pravin M Bathija (5):
vhost: add user to mailmap and define to vhost hdr
vhost_user: header defines for add/rem mem region
vhost_user: Function defs for add/rem mem regions
vhost_user: support function defines for back-end
vhost_user: Increase number of memory regions
.mailmap | 1 +
lib/vhost/rte_vhost.h | 4 +
lib/vhost/vhost_user.c | 333 +++++++++++++++++++++++++++++++++++------
lib/vhost/vhost_user.h | 12 +-
4 files changed, 305 insertions(+), 45 deletions(-)
--
2.43.0
* [PATCH v3 1/5] vhost: add user to mailmap and define to vhost hdr
2025-11-04 4:21 [PATCH v3 0/5] Support add/remove memory region & get-max-slots Pravin M Bathija
@ 2025-11-04 4:21 ` Pravin M Bathija
2025-11-04 7:15 ` fengchengwen
2025-11-04 4:21 ` [PATCH v3 2/5] vhost_user: header defines for add/rem mem region Pravin M Bathija
` (3 subsequent siblings)
4 siblings, 1 reply; 11+ messages in thread
From: Pravin M Bathija @ 2025-11-04 4:21 UTC (permalink / raw)
To: dev; +Cc: pravin.bathija, pravin.m.bathija.dev
- add user to mailmap fle
- define a bit-field called VHOST_USER_PROTOCOL_F_CONFIGURE_MEM_SLOTS
that depicts if the fature/capability to add/remove memory regions is
supported. This is a part of the overall support for add/remove
memory region feature in this patchset.
Signed-off-by: Pravin M Bathija <pravin.bathija@dell.com>
---
.mailmap | 1 +
lib/vhost/rte_vhost.h | 4 ++++
2 files changed, 5 insertions(+)
diff --git a/.mailmap b/.mailmap
index 0b043cb0c0..33b8ccf92a 100644
--- a/.mailmap
+++ b/.mailmap
@@ -1266,6 +1266,7 @@ Prateek Agarwal <prateekag@cse.iitb.ac.in>
Prathisna Padmasanan <prathisna.padmasanan@intel.com>
Praveen Kaligineedi <pkaligineedi@google.com>
Praveen Shetty <praveen.shetty@intel.com>
+Pravin M Bathija <pravin.bathija@dell.com>
Pravin Pathak <pravin.pathak.dev@gmail.com> <pravin.pathak@intel.com>
Prince Takkar <ptakkar@marvell.com>
Priyalee Kushwaha <priyalee.kushwaha@intel.com>
diff --git a/lib/vhost/rte_vhost.h b/lib/vhost/rte_vhost.h
index 2f7c4c0080..a7f9700538 100644
--- a/lib/vhost/rte_vhost.h
+++ b/lib/vhost/rte_vhost.h
@@ -109,6 +109,10 @@ extern "C" {
#define VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD 12
#endif
+#ifndef VHOST_USER_PROTOCOL_F_CONFIGURE_MEM_SLOTS
+#define VHOST_USER_PROTOCOL_F_CONFIGURE_MEM_SLOTS 15
+#endif
+
#ifndef VHOST_USER_PROTOCOL_F_STATUS
#define VHOST_USER_PROTOCOL_F_STATUS 16
#endif
--
2.43.0
* [PATCH v3 2/5] vhost_user: header defines for add/rem mem region
2025-11-04 4:21 [PATCH v3 0/5] Support add/remove memory region & get-max-slots Pravin M Bathija
2025-11-04 4:21 ` [PATCH v3 1/5] vhost: add user to mailmap and define to vhost hdr Pravin M Bathija
@ 2025-11-04 4:21 ` Pravin M Bathija
2025-11-04 7:18 ` fengchengwen
2025-11-04 4:21 ` [PATCH v3 3/5] vhost_user: Function defs for add/rem mem regions Pravin M Bathija
` (2 subsequent siblings)
4 siblings, 1 reply; 11+ messages in thread
From: Pravin M Bathija @ 2025-11-04 4:21 UTC (permalink / raw)
To: dev; +Cc: pravin.bathija, pravin.m.bathija.dev
The changes in this file add the message request enums needed to
support add/remove memory regions. The vhost-user front-end client
sends messages such as get max memory slots, add memory region and
remove memory region; these changes handle them on the vhost-user
back-end. The changes also add the data structure definition of the
memory region to be added/removed, and the VhostUserMsg structure
has been extended to carry it.
Signed-off-by: Pravin M Bathija <pravin.bathija@dell.com>
---
lib/vhost/vhost_user.h | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/lib/vhost/vhost_user.h b/lib/vhost/vhost_user.h
index ef486545ba..5a0e747b58 100644
--- a/lib/vhost/vhost_user.h
+++ b/lib/vhost/vhost_user.h
@@ -32,6 +32,7 @@
(1ULL << VHOST_USER_PROTOCOL_F_BACKEND_SEND_FD) | \
(1ULL << VHOST_USER_PROTOCOL_F_HOST_NOTIFIER) | \
(1ULL << VHOST_USER_PROTOCOL_F_PAGEFAULT) | \
+ (1ULL << VHOST_USER_PROTOCOL_F_CONFIGURE_MEM_SLOTS) | \
(1ULL << VHOST_USER_PROTOCOL_F_STATUS))
typedef enum VhostUserRequest {
@@ -67,6 +68,9 @@ typedef enum VhostUserRequest {
VHOST_USER_POSTCOPY_END = 30,
VHOST_USER_GET_INFLIGHT_FD = 31,
VHOST_USER_SET_INFLIGHT_FD = 32,
+ VHOST_USER_GET_MAX_MEM_SLOTS = 36,
+ VHOST_USER_ADD_MEM_REG = 37,
+ VHOST_USER_REM_MEM_REG = 38,
VHOST_USER_SET_STATUS = 39,
VHOST_USER_GET_STATUS = 40,
} VhostUserRequest;
@@ -91,6 +95,11 @@ typedef struct VhostUserMemory {
VhostUserMemoryRegion regions[VHOST_MEMORY_MAX_NREGIONS];
} VhostUserMemory;
+typedef struct VhostUserSingleMemReg {
+ uint64_t padding;
+ VhostUserMemoryRegion region;
+} VhostUserSingleMemReg;
+
typedef struct VhostUserLog {
uint64_t mmap_size;
uint64_t mmap_offset;
@@ -186,6 +195,7 @@ typedef struct __rte_packed_begin VhostUserMsg {
struct vhost_vring_state state;
struct vhost_vring_addr addr;
VhostUserMemory memory;
+ VhostUserSingleMemReg memory_single;
VhostUserLog log;
struct vhost_iotlb_msg iotlb;
VhostUserCryptoSessionParam crypto_session;
--
2.43.0
* [PATCH v3 3/5] vhost_user: Function defs for add/rem mem regions
2025-11-04 4:21 [PATCH v3 0/5] Support add/remove memory region & get-max-slots Pravin M Bathija
2025-11-04 4:21 ` [PATCH v3 1/5] vhost: add user to mailmap and define to vhost hdr Pravin M Bathija
2025-11-04 4:21 ` [PATCH v3 2/5] vhost_user: header defines for add/rem mem region Pravin M Bathija
@ 2025-11-04 4:21 ` Pravin M Bathija
2025-11-04 7:48 ` fengchengwen
2025-11-04 4:21 ` [PATCH v3 4/5] vhost_user: support function defines for back-end Pravin M Bathija
2025-11-04 4:21 ` [PATCH v3 5/5] vhost_user: Increase number of memory regions Pravin M Bathija
4 siblings, 1 reply; 11+ messages in thread
From: Pravin M Bathija @ 2025-11-04 4:21 UTC (permalink / raw)
To: dev; +Cc: pravin.bathija, pravin.m.bathija.dev
These changes cover the function definitions for the add/remove memory
region calls, which are invoked on receiving the corresponding vhost-user
message from the vhost-user front-end (e.g. qemu). The vhost-user
front-end software sends an add memory region message which includes the
guest physical address range (GPA), memory size, the front-end's virtual
address and the offset within the file-descriptor mapping. The back-end
(dpdk) uses this information to create a mapping within its own address
space using mmap on the fd + offset. This added memory region serves as
shared memory to pass data back and forth between the vhost-user front
and back ends. Similarly, in the case of remove memory region, the said
memory region is unmapped from the back-end (dpdk). In our case, in
addition to testing with a qemu front-end, testing has also been
performed with a libblkio front-end and an spdk/dpdk back-end; we did
I/O to spdk-based drives using a libblkio-based device driver.
There are also changes for set mem table and a new definition for get
memory slots. The set memory table message is how the vhost-user
front-end (qemu or libblkio) tells the vhost-user back-end (dpdk) about
all of its guest memory regions; this allows the back-end to translate
guest physical addresses to back-end virtual addresses and perform
direct I/O to guest memory. Our changes rework the set memory table
handling to use common support functions. The get memory slots message
is how the vhost-user front-end queries the vhost-user back-end for the
number of memory slots available to be registered.
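For illustration only (not part of the diff below), the back-end mapping
step amounts roughly to the following simplified sketch; the real
vhost_user_mmap_region() additionally aligns offset and size to the fd
block size and tracks guest pages:

#include <stdint.h>
#include <sys/mman.h>

struct backend_region {                /* simplified rte_vhost_mem_region */
        uint64_t guest_phys_addr;      /* GPA sent by the front-end */
        uint64_t guest_user_addr;      /* front-end virtual address */
        uint64_t host_user_addr;       /* back-end virtual address */
        uint64_t size;
        void *mmap_addr;
        uint64_t mmap_size;
        int fd;
};

static int map_region(struct backend_region *reg, uint64_t mmap_offset)
{
        /* map from offset 0, large enough to cover offset + region */
        reg->mmap_size = mmap_offset + reg->size;
        reg->mmap_addr = mmap(NULL, reg->mmap_size, PROT_READ | PROT_WRITE,
                              MAP_SHARED, reg->fd, 0);
        if (reg->mmap_addr == MAP_FAILED)
                return -1;

        /* guest addresses now translate into this back-end mapping */
        reg->host_user_addr = (uint64_t)(uintptr_t)reg->mmap_addr + mmap_offset;
        return 0;
}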
Signed-off-by: Pravin M Bathija <pravin.bathija@dell.com>
---
lib/vhost/vhost_user.c | 253 +++++++++++++++++++++++++++++++++++------
1 file changed, 221 insertions(+), 32 deletions(-)
diff --git a/lib/vhost/vhost_user.c b/lib/vhost/vhost_user.c
index 4bfb13fb98..168432e7d1 100644
--- a/lib/vhost/vhost_user.c
+++ b/lib/vhost/vhost_user.c
@@ -71,6 +71,9 @@ VHOST_MESSAGE_HANDLER(VHOST_USER_SET_FEATURES, vhost_user_set_features, false, t
VHOST_MESSAGE_HANDLER(VHOST_USER_SET_OWNER, vhost_user_set_owner, false, true) \
VHOST_MESSAGE_HANDLER(VHOST_USER_RESET_OWNER, vhost_user_reset_owner, false, false) \
VHOST_MESSAGE_HANDLER(VHOST_USER_SET_MEM_TABLE, vhost_user_set_mem_table, true, true) \
+VHOST_MESSAGE_HANDLER(VHOST_USER_GET_MAX_MEM_SLOTS, vhost_user_get_max_mem_slots, false, false) \
+VHOST_MESSAGE_HANDLER(VHOST_USER_ADD_MEM_REG, vhost_user_add_mem_reg, true, true) \
+VHOST_MESSAGE_HANDLER(VHOST_USER_REM_MEM_REG, vhost_user_rem_mem_reg, false, true) \
VHOST_MESSAGE_HANDLER(VHOST_USER_SET_LOG_BASE, vhost_user_set_log_base, true, true) \
VHOST_MESSAGE_HANDLER(VHOST_USER_SET_LOG_FD, vhost_user_set_log_fd, true, true) \
VHOST_MESSAGE_HANDLER(VHOST_USER_SET_VRING_NUM, vhost_user_set_vring_num, false, true) \
@@ -1390,7 +1393,6 @@ vhost_user_set_mem_table(struct virtio_net **pdev,
struct virtio_net *dev = *pdev;
struct VhostUserMemory *memory = &ctx->msg.payload.memory;
struct rte_vhost_mem_region *reg;
- int numa_node = SOCKET_ID_ANY;
uint64_t mmap_offset;
uint32_t i;
bool async_notify = false;
@@ -1435,39 +1437,13 @@ vhost_user_set_mem_table(struct virtio_net **pdev,
if (dev->features & (1ULL << VIRTIO_F_IOMMU_PLATFORM))
vhost_user_iotlb_flush_all(dev);
- free_mem_region(dev);
+ free_all_mem_regions(dev);
rte_free(dev->mem);
dev->mem = NULL;
}
- /*
- * If VQ 0 has already been allocated, try to allocate on the same
- * NUMA node. It can be reallocated later in numa_realloc().
- */
- if (dev->nr_vring > 0)
- numa_node = dev->virtqueue[0]->numa_node;
-
- dev->nr_guest_pages = 0;
- if (dev->guest_pages == NULL) {
- dev->max_guest_pages = 8;
- dev->guest_pages = rte_zmalloc_socket(NULL,
- dev->max_guest_pages *
- sizeof(struct guest_page),
- RTE_CACHE_LINE_SIZE,
- numa_node);
- if (dev->guest_pages == NULL) {
- VHOST_CONFIG_LOG(dev->ifname, ERR,
- "failed to allocate memory for dev->guest_pages");
- goto close_msg_fds;
- }
- }
-
- dev->mem = rte_zmalloc_socket("vhost-mem-table", sizeof(struct rte_vhost_memory) +
- sizeof(struct rte_vhost_mem_region) * memory->nregions, 0, numa_node);
- if (dev->mem == NULL) {
- VHOST_CONFIG_LOG(dev->ifname, ERR, "failed to allocate memory for dev->mem");
- goto free_guest_pages;
- }
+ if (vhost_user_initialize_memory(pdev) < 0)
+ goto close_msg_fds;
for (i = 0; i < memory->nregions; i++) {
reg = &dev->mem->regions[i];
@@ -1531,11 +1507,182 @@ vhost_user_set_mem_table(struct virtio_net **pdev,
return RTE_VHOST_MSG_RESULT_OK;
free_mem_table:
- free_mem_region(dev);
+ free_all_mem_regions(dev);
rte_free(dev->mem);
dev->mem = NULL;
+ rte_free(dev->guest_pages);
+ dev->guest_pages = NULL;
+close_msg_fds:
+ close_msg_fds(ctx);
+ return RTE_VHOST_MSG_RESULT_ERR;
+}
+
-free_guest_pages:
+static int
+vhost_user_get_max_mem_slots(struct virtio_net **pdev __rte_unused,
+ struct vhu_msg_context *ctx,
+ int main_fd __rte_unused)
+{
+ uint32_t max_mem_slots = VHOST_MEMORY_MAX_NREGIONS;
+
+ ctx->msg.payload.u64 = (uint64_t)max_mem_slots;
+ ctx->msg.size = sizeof(ctx->msg.payload.u64);
+ ctx->fd_num = 0;
+
+ return RTE_VHOST_MSG_RESULT_REPLY;
+}
+
+static int
+vhost_user_add_mem_reg(struct virtio_net **pdev,
+ struct vhu_msg_context *ctx,
+ int main_fd __rte_unused)
+{
+ struct virtio_net *dev = *pdev;
+ struct VhostUserMemoryRegion *region = &ctx->msg.payload.memory_single.region;
+ uint32_t i;
+
+ /* make sure new region will fit */
+ if (dev->mem != NULL && dev->mem->nregions >= VHOST_MEMORY_MAX_NREGIONS) {
+ VHOST_CONFIG_LOG(dev->ifname, ERR,
+ "too many memory regions already (%u)",
+ dev->mem->nregions);
+ goto close_msg_fds;
+ }
+
+ /* make sure supplied memory fd present */
+ if (ctx->fd_num != 1) {
+ VHOST_CONFIG_LOG(dev->ifname, ERR,
+ "fd count makes no sense (%u)",
+ ctx->fd_num);
+ goto close_msg_fds;
+ }
+
+ /* Make sure no overlap in guest virtual address space */
+ if (dev->mem != NULL && dev->mem->nregions > 0) {
+ for (uint32_t i = 0; i < VHOST_MEMORY_MAX_NREGIONS; i++) {
+ struct rte_vhost_mem_region *current_region = &dev->mem->regions[i];
+
+ if (current_region->mmap_size == 0)
+ continue;
+
+ uint64_t current_region_guest_start = current_region->guest_user_addr;
+ uint64_t current_region_guest_end = current_region_guest_start
+ + current_region->mmap_size - 1;
+ uint64_t proposed_region_guest_start = region->userspace_addr;
+ uint64_t proposed_region_guest_end = proposed_region_guest_start
+ + region->memory_size - 1;
+ bool overlap = false;
+
+ bool curent_region_guest_start_overlap =
+ current_region_guest_start >= proposed_region_guest_start
+ && current_region_guest_start <= proposed_region_guest_end;
+ bool curent_region_guest_end_overlap =
+ current_region_guest_end >= proposed_region_guest_start
+ && current_region_guest_end <= proposed_region_guest_end;
+ bool proposed_region_guest_start_overlap =
+ proposed_region_guest_start >= current_region_guest_start
+ && proposed_region_guest_start <= current_region_guest_end;
+ bool proposed_region_guest_end_overlap =
+ proposed_region_guest_end >= current_region_guest_start
+ && proposed_region_guest_end <= current_region_guest_end;
+
+ overlap = curent_region_guest_start_overlap
+ || curent_region_guest_end_overlap
+ || proposed_region_guest_start_overlap
+ || proposed_region_guest_end_overlap;
+
+ if (overlap) {
+ VHOST_CONFIG_LOG(dev->ifname, ERR,
+ "requested memory region overlaps with another region");
+ VHOST_CONFIG_LOG(dev->ifname, ERR,
+ "\tRequested region address:0x%" PRIx64,
+ region->userspace_addr);
+ VHOST_CONFIG_LOG(dev->ifname, ERR,
+ "\tRequested region size:0x%" PRIx64,
+ region->memory_size);
+ VHOST_CONFIG_LOG(dev->ifname, ERR,
+ "\tOverlapping region address:0x%" PRIx64,
+ current_region->guest_user_addr);
+ VHOST_CONFIG_LOG(dev->ifname, ERR,
+ "\tOverlapping region size:0x%" PRIx64,
+ current_region->mmap_size);
+ goto close_msg_fds;
+ }
+
+ }
+ }
+
+ /* convert first region add to normal memory table set */
+ if (dev->mem == NULL) {
+ if (vhost_user_initialize_memory(pdev) < 0)
+ goto close_msg_fds;
+ }
+
+ /* find a new region and set it like memory table set does */
+ struct rte_vhost_mem_region *reg = NULL;
+ uint64_t mmap_offset;
+
+ for (uint32_t i = 0; i < VHOST_MEMORY_MAX_NREGIONS; i++) {
+ if (dev->mem->regions[i].guest_user_addr == 0) {
+ reg = &dev->mem->regions[i];
+ break;
+ }
+ }
+ if (reg == NULL) {
+ VHOST_CONFIG_LOG(dev->ifname, ERR, "no free memory region");
+ goto close_msg_fds;
+ }
+
+ reg->guest_phys_addr = region->guest_phys_addr;
+ reg->guest_user_addr = region->userspace_addr;
+ reg->size = region->memory_size;
+ reg->fd = ctx->fds[0];
+
+ mmap_offset = region->mmap_offset;
+
+ if (vhost_user_mmap_region(dev, reg, mmap_offset) < 0) {
+ VHOST_CONFIG_LOG(dev->ifname, ERR, "failed to mmap region");
+ goto close_msg_fds;
+ }
+
+ dev->mem->nregions++;
+
+ if (dev->async_copy && rte_vfio_is_enabled("vfio"))
+ async_dma_map(dev, true);
+
+ if (vhost_user_postcopy_register(dev, main_fd, ctx) < 0)
+ goto free_mem_table;
+
+ for (i = 0; i < dev->nr_vring; i++) {
+ struct vhost_virtqueue *vq = dev->virtqueue[i];
+
+ if (!vq)
+ continue;
+
+ if (vq->desc || vq->avail || vq->used) {
+ /* vhost_user_lock_all_queue_pairs locked all qps */
+ VHOST_USER_ASSERT_LOCK(dev, vq, VHOST_USER_ADD_MEM_REG);
+
+ /*
+ * If the memory table got updated, the ring addresses
+ * need to be translated again as virtual addresses have
+ * changed.
+ */
+ vring_invalidate(dev, vq);
+
+ translate_ring_addresses(&dev, &vq);
+ *pdev = dev;
+ }
+ }
+
+ dump_guest_pages(dev);
+
+ return RTE_VHOST_MSG_RESULT_OK;
+
+free_mem_table:
+ free_all_mem_regions(dev);
+ rte_free(dev->mem);
+ dev->mem = NULL;
rte_free(dev->guest_pages);
dev->guest_pages = NULL;
close_msg_fds:
@@ -1543,6 +1690,48 @@ vhost_user_set_mem_table(struct virtio_net **pdev,
return RTE_VHOST_MSG_RESULT_ERR;
}
+static int
+vhost_user_rem_mem_reg(struct virtio_net **pdev __rte_unused,
+ struct vhu_msg_context *ctx __rte_unused,
+ int main_fd __rte_unused)
+{
+ struct virtio_net *dev = *pdev;
+ struct VhostUserMemoryRegion *region = &ctx->msg.payload.memory_single.region;
+
+ if ((dev->mem) && (dev->flags & VIRTIO_DEV_VDPA_CONFIGURED)) {
+ struct rte_vdpa_device *vdpa_dev = dev->vdpa_dev;
+
+ if (vdpa_dev && vdpa_dev->ops->dev_close)
+ vdpa_dev->ops->dev_close(dev->vid);
+ dev->flags &= ~VIRTIO_DEV_VDPA_CONFIGURED;
+ }
+
+ if (dev->mem != NULL && dev->mem->nregions > 0) {
+ for (uint32_t i = 0; i < VHOST_MEMORY_MAX_NREGIONS; i++) {
+ struct rte_vhost_mem_region *current_region = &dev->mem->regions[i];
+
+ if (current_region->guest_user_addr == 0)
+ continue;
+
+ /*
+ * According to the vhost-user specification:
+ * The memory region to be removed is identified by its guest address,
+ * user address and size. The mmap offset is ignored.
+ */
+ if (region->userspace_addr == current_region->guest_user_addr
+ && region->guest_phys_addr == current_region->guest_phys_addr
+ && region->memory_size == current_region->size) {
+ free_mem_region(current_region);
+ dev->mem->nregions--;
+ return RTE_VHOST_MSG_RESULT_OK;
+ }
+ }
+ }
+
+ VHOST_CONFIG_LOG(dev->ifname, ERR, "failed to find region");
+ return RTE_VHOST_MSG_RESULT_ERR;
+}
+
static bool
vq_is_ready(struct virtio_net *dev, struct vhost_virtqueue *vq)
{
--
2.43.0
* [PATCH v3 4/5] vhost_user: support function defines for back-end
2025-11-04 4:21 [PATCH v3 0/5] Support add/remove memory region & get-max-slots Pravin M Bathija
` (2 preceding siblings ...)
2025-11-04 4:21 ` [PATCH v3 3/5] vhost_user: Function defs for add/rem mem regions Pravin M Bathija
@ 2025-11-04 4:21 ` Pravin M Bathija
2025-11-04 8:05 ` fengchengwen
2025-11-04 4:21 ` [PATCH v3 5/5] vhost_user: Increase number of memory regions Pravin M Bathija
4 siblings, 1 reply; 11+ messages in thread
From: Pravin M Bathija @ 2025-11-04 4:21 UTC (permalink / raw)
To: dev; +Cc: pravin.bathija, pravin.m.bathija.dev
Here we define support functions which are called from the various
vhost-user back-end message functions like set memory table, get
memory slots, add memory region, remove memory region. These are
essentially common functions to initialize memory, unmap a set of
memory regions, perform register copy and align memory addresses.
Signed-off-by: Pravin M Bathija <pravin.bathija@dell.com>
---
lib/vhost/vhost_user.c | 80 +++++++++++++++++++++++++++++++++++-------
1 file changed, 68 insertions(+), 12 deletions(-)
diff --git a/lib/vhost/vhost_user.c b/lib/vhost/vhost_user.c
index 168432e7d1..9a85f2fc92 100644
--- a/lib/vhost/vhost_user.c
+++ b/lib/vhost/vhost_user.c
@@ -228,7 +228,17 @@ async_dma_map(struct virtio_net *dev, bool do_map)
}
static void
-free_mem_region(struct virtio_net *dev)
+free_mem_region(struct rte_vhost_mem_region *reg)
+{
+ if (reg != NULL && reg->host_user_addr) {
+ munmap(reg->mmap_addr, reg->mmap_size);
+ close(reg->fd);
+ memset(reg, 0, sizeof(struct rte_vhost_mem_region));
+ }
+}
+
+static void
+free_all_mem_regions(struct virtio_net *dev)
{
uint32_t i;
struct rte_vhost_mem_region *reg;
@@ -239,12 +249,10 @@ free_mem_region(struct virtio_net *dev)
if (dev->async_copy && rte_vfio_is_enabled("vfio"))
async_dma_map(dev, false);
- for (i = 0; i < dev->mem->nregions; i++) {
+ for (i = 0; i < VHOST_MEMORY_MAX_NREGIONS; i++) {
reg = &dev->mem->regions[i];
- if (reg->host_user_addr) {
- munmap(reg->mmap_addr, reg->mmap_size);
- close(reg->fd);
- }
+ if (reg->mmap_addr)
+ free_mem_region(reg);
}
}
@@ -258,7 +266,7 @@ vhost_backend_cleanup(struct virtio_net *dev)
vdpa_dev->ops->dev_cleanup(dev->vid);
if (dev->mem) {
- free_mem_region(dev);
+ free_all_mem_regions(dev);
rte_free(dev->mem);
dev->mem = NULL;
}
@@ -707,7 +715,7 @@ numa_realloc(struct virtio_net **pdev, struct vhost_virtqueue **pvq)
vhost_devices[dev->vid] = dev;
mem_size = sizeof(struct rte_vhost_memory) +
- sizeof(struct rte_vhost_mem_region) * dev->mem->nregions;
+ sizeof(struct rte_vhost_mem_region) * VHOST_MEMORY_MAX_NREGIONS;
mem = rte_realloc_socket(dev->mem, mem_size, 0, node);
if (!mem) {
VHOST_CONFIG_LOG(dev->ifname, ERR,
@@ -811,8 +819,10 @@ hua_to_alignment(struct rte_vhost_memory *mem, void *ptr)
uint32_t i;
uintptr_t hua = (uintptr_t)ptr;
- for (i = 0; i < mem->nregions; i++) {
+ for (i = 0; i < VHOST_MEMORY_MAX_NREGIONS; i++) {
r = &mem->regions[i];
+ if (r->host_user_addr == 0)
+ continue;
if (hua >= r->host_user_addr &&
hua < r->host_user_addr + r->size) {
return get_blk_size(r->fd);
@@ -1250,9 +1260,13 @@ vhost_user_postcopy_register(struct virtio_net *dev, int main_fd,
* retrieve the region offset when handling userfaults.
*/
memory = &ctx->msg.payload.memory;
- for (i = 0; i < memory->nregions; i++) {
+ for (i = 0; i < VHOST_MEMORY_MAX_NREGIONS; i++) {
+ int reg_msg_index = 0;
reg = &dev->mem->regions[i];
- memory->regions[i].userspace_addr = reg->host_user_addr;
+ if (reg->host_user_addr == 0)
+ continue;
+ memory->regions[reg_msg_index].userspace_addr = reg->host_user_addr;
+ reg_msg_index++;
}
/* Send the addresses back to qemu */
@@ -1279,8 +1293,10 @@ vhost_user_postcopy_register(struct virtio_net *dev, int main_fd,
}
/* Now userfault register and we can use the memory */
- for (i = 0; i < memory->nregions; i++) {
+ for (i = 0; i < VHOST_MEMORY_MAX_NREGIONS; i++) {
reg = &dev->mem->regions[i];
+ if (reg->host_user_addr == 0)
+ continue;
if (vhost_user_postcopy_region_register(dev, reg) < 0)
return -1;
}
@@ -1385,6 +1401,46 @@ vhost_user_mmap_region(struct virtio_net *dev,
return 0;
}
+static int
+vhost_user_initialize_memory(struct virtio_net **pdev)
+{
+ struct virtio_net *dev = *pdev;
+ int numa_node = SOCKET_ID_ANY;
+
+ /*
+ * If VQ 0 has already been allocated, try to allocate on the same
+ * NUMA node. It can be reallocated later in numa_realloc().
+ */
+ if (dev->nr_vring > 0)
+ numa_node = dev->virtqueue[0]->numa_node;
+
+ dev->nr_guest_pages = 0;
+ if (dev->guest_pages == NULL) {
+ dev->max_guest_pages = 8;
+ dev->guest_pages = rte_zmalloc_socket(NULL,
+ dev->max_guest_pages *
+ sizeof(struct guest_page),
+ RTE_CACHE_LINE_SIZE,
+ numa_node);
+ if (dev->guest_pages == NULL) {
+ VHOST_CONFIG_LOG(dev->ifname, ERR,
+ "failed to allocate memory for dev->guest_pages");
+ return -1;
+ }
+ }
+
+ dev->mem = rte_zmalloc_socket("vhost-mem-table", sizeof(struct rte_vhost_memory) +
+ sizeof(struct rte_vhost_mem_region) * VHOST_MEMORY_MAX_NREGIONS, 0, numa_node);
+ if (dev->mem == NULL) {
+ VHOST_CONFIG_LOG(dev->ifname, ERR, "failed to allocate memory for dev->mem");
+ rte_free(dev->guest_pages);
+ dev->guest_pages = NULL;
+ return -1;
+ }
+
+ return 0;
+}
+
static int
vhost_user_set_mem_table(struct virtio_net **pdev,
struct vhu_msg_context *ctx,
--
2.43.0
* [PATCH v3 5/5] vhost_user: Increase number of memory regions
2025-11-04 4:21 [PATCH v3 0/5] Support add/remove memory region & get-max-slots Pravin M Bathija
` (3 preceding siblings ...)
2025-11-04 4:21 ` [PATCH v3 4/5] vhost_user: support function defines for back-end Pravin M Bathija
@ 2025-11-04 4:21 ` Pravin M Bathija
2025-11-04 8:12 ` fengchengwen
4 siblings, 1 reply; 11+ messages in thread
From: Pravin M Bathija @ 2025-11-04 4:21 UTC (permalink / raw)
To: dev; +Cc: pravin.bathija, pravin.m.bathija.dev
In this patch the number of memory regions is increased from
8 to 128. When a vhost-user front-end such as qemu or libblkio
queries a back-end such as dpdk with the get max memory slots
message, the back-end now replies with 128 instead of the
previously defined 8. The back-end also allocates that many
slots in the memory table, where regions are added/removed as
requested by the vhost-user front-end. This also lets the
vhost-user front-end limit the number of memory regions when
sending the set mem table message or when adding memory regions.
Signed-off-by: Pravin M Bathija <pravin.bathija@dell.com>
---
lib/vhost/vhost_user.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/lib/vhost/vhost_user.h b/lib/vhost/vhost_user.h
index 5a0e747b58..c6ad5b76d6 100644
--- a/lib/vhost/vhost_user.h
+++ b/lib/vhost/vhost_user.h
@@ -11,7 +11,7 @@
/* refer to hw/virtio/vhost-user.c */
-#define VHOST_MEMORY_MAX_NREGIONS 8
+#define VHOST_MEMORY_MAX_NREGIONS 128
#define VHOST_USER_NET_SUPPORTED_FEATURES \
(VIRTIO_NET_SUPPORTED_FEATURES | \
--
2.43.0
* Re: [PATCH v3 1/5] vhost: add user to mailmap and define to vhost hdr
2025-11-04 4:21 ` [PATCH v3 1/5] vhost: add user to mailmap and define to vhost hdr Pravin M Bathija
@ 2025-11-04 7:15 ` fengchengwen
0 siblings, 0 replies; 11+ messages in thread
From: fengchengwen @ 2025-11-04 7:15 UTC (permalink / raw)
To: Pravin M Bathija, dev; +Cc: pravin.m.bathija.dev
Please fix a typo below
Acked-by: Chengwen Feng <fengchengwen@huawei.com>
On 11/4/2025 12:21 PM, Pravin M Bathija wrote:
> - add user to mailmap fle
> - define a bit-field called VHOST_USER_PROTOCOL_F_CONFIGURE_MEM_SLOTS
> that depicts if the fature/capability to add/remove memory regions is
fature -> feature
> supported. This is a part of the overall support for add/remove
> memory region feature in this patchset.
>
> Signed-off-by: Pravin M Bathija <pravin.bathija@dell.com>
* Re: [PATCH v3 2/5] vhost_user: header defines for add/rem mem region
2025-11-04 4:21 ` [PATCH v3 2/5] vhost_user: header defines for add/rem mem region Pravin M Bathija
@ 2025-11-04 7:18 ` fengchengwen
0 siblings, 0 replies; 11+ messages in thread
From: fengchengwen @ 2025-11-04 7:18 UTC (permalink / raw)
To: Pravin M Bathija, dev; +Cc: pravin.m.bathija.dev
Acked-by: Chengwen Feng <fengchengwen@huawei.com>
On 11/4/2025 12:21 PM, Pravin M Bathija wrote:
> The changes in this file add the message request enums needed to
> support add/remove memory regions. The vhost-user front-end client
> sends messages such as get max memory slots, add memory region and
> remove memory region; these changes handle them on the vhost-user
> back-end. The changes also add the data structure definition of the
> memory region to be added/removed, and the VhostUserMsg structure
> has been extended to carry it.
>
> Signed-off-by: Pravin M Bathija <pravin.bathija@dell.com>
* Re: [PATCH v3 3/5] vhost_user: Function defs for add/rem mem regions
2025-11-04 4:21 ` [PATCH v3 3/5] vhost_user: Function defs for add/rem mem regions Pravin M Bathija
@ 2025-11-04 7:48 ` fengchengwen
0 siblings, 0 replies; 11+ messages in thread
From: fengchengwen @ 2025-11-04 7:48 UTC (permalink / raw)
To: Pravin M Bathija, dev; +Cc: pravin.m.bathija.dev
On 11/4/2025 12:21 PM, Pravin M Bathija wrote:
> These changes cover the function definitions for the add/remove memory
> region calls, which are invoked on receiving the corresponding vhost-user
> message from the vhost-user front-end (e.g. qemu). The vhost-user
> front-end software sends an add memory region message which includes the
> guest physical address range (GPA), memory size, the front-end's virtual
> address and the offset within the file-descriptor mapping. The back-end
> (dpdk) uses this information to create a mapping within its own address
> space using mmap on the fd + offset. This added memory region serves as
> shared memory to pass data back and forth between the vhost-user front
> and back ends. Similarly, in the case of remove memory region, the said
> memory region is unmapped from the back-end (dpdk). In our case, in
> addition to testing with a qemu front-end, testing has also been
> performed with a libblkio front-end and an spdk/dpdk back-end; we did
> I/O to spdk-based drives using a libblkio-based device driver.
> There are also changes for set mem table and a new definition for get
> memory slots. The set memory table message is how the vhost-user
> front-end (qemu or libblkio) tells the vhost-user back-end (dpdk) about
> all of its guest memory regions; this allows the back-end to translate
> guest physical addresses to back-end virtual addresses and perform
> direct I/O to guest memory. Our changes rework the set memory table
> handling to use common support functions. The get memory slots message
> is how the vhost-user front-end queries the vhost-user back-end for the
> number of memory slots available to be registered.
There is too much extraneous information in the commit log; this feature is a de facto
standard and has been supported since early qemu versions (6.2 as far as I know), so I
think we could simplify the commit log.
>
> Signed-off-by: Pravin M Bathija <pravin.bathija@dell.com>
> ---
> lib/vhost/vhost_user.c | 253 +++++++++++++++++++++++++++++++++++------
> 1 file changed, 221 insertions(+), 32 deletions(-)
>
> diff --git a/lib/vhost/vhost_user.c b/lib/vhost/vhost_user.c
> index 4bfb13fb98..168432e7d1 100644
> --- a/lib/vhost/vhost_user.c
> +++ b/lib/vhost/vhost_user.c
> @@ -71,6 +71,9 @@ VHOST_MESSAGE_HANDLER(VHOST_USER_SET_FEATURES, vhost_user_set_features, false, t
> VHOST_MESSAGE_HANDLER(VHOST_USER_SET_OWNER, vhost_user_set_owner, false, true) \
> VHOST_MESSAGE_HANDLER(VHOST_USER_RESET_OWNER, vhost_user_reset_owner, false, false) \
> VHOST_MESSAGE_HANDLER(VHOST_USER_SET_MEM_TABLE, vhost_user_set_mem_table, true, true) \
> +VHOST_MESSAGE_HANDLER(VHOST_USER_GET_MAX_MEM_SLOTS, vhost_user_get_max_mem_slots, false, false) \
> +VHOST_MESSAGE_HANDLER(VHOST_USER_ADD_MEM_REG, vhost_user_add_mem_reg, true, true) \
> +VHOST_MESSAGE_HANDLER(VHOST_USER_REM_MEM_REG, vhost_user_rem_mem_reg, false, true) \
> VHOST_MESSAGE_HANDLER(VHOST_USER_SET_LOG_BASE, vhost_user_set_log_base, true, true) \
> VHOST_MESSAGE_HANDLER(VHOST_USER_SET_LOG_FD, vhost_user_set_log_fd, true, true) \
> VHOST_MESSAGE_HANDLER(VHOST_USER_SET_VRING_NUM, vhost_user_set_vring_num, false, true) \
> @@ -1390,7 +1393,6 @@ vhost_user_set_mem_table(struct virtio_net **pdev,
> struct virtio_net *dev = *pdev;
> struct VhostUserMemory *memory = &ctx->msg.payload.memory;
> struct rte_vhost_mem_region *reg;
> - int numa_node = SOCKET_ID_ANY;
> uint64_t mmap_offset;
> uint32_t i;
> bool async_notify = false;
> @@ -1435,39 +1437,13 @@ vhost_user_set_mem_table(struct virtio_net **pdev,
> if (dev->features & (1ULL << VIRTIO_F_IOMMU_PLATFORM))
> vhost_user_iotlb_flush_all(dev);
>
> - free_mem_region(dev);
> + free_all_mem_regions(dev);
> rte_free(dev->mem);
> dev->mem = NULL;
> }
>
> - /*
> - * If VQ 0 has already been allocated, try to allocate on the same
> - * NUMA node. It can be reallocated later in numa_realloc().
> - */
> - if (dev->nr_vring > 0)
> - numa_node = dev->virtqueue[0]->numa_node;
> -
> - dev->nr_guest_pages = 0;
> - if (dev->guest_pages == NULL) {
> - dev->max_guest_pages = 8;
> - dev->guest_pages = rte_zmalloc_socket(NULL,
> - dev->max_guest_pages *
> - sizeof(struct guest_page),
> - RTE_CACHE_LINE_SIZE,
> - numa_node);
> - if (dev->guest_pages == NULL) {
> - VHOST_CONFIG_LOG(dev->ifname, ERR,
> - "failed to allocate memory for dev->guest_pages");
> - goto close_msg_fds;
> - }
> - }
> -
> - dev->mem = rte_zmalloc_socket("vhost-mem-table", sizeof(struct rte_vhost_memory) +
> - sizeof(struct rte_vhost_mem_region) * memory->nregions, 0, numa_node);
> - if (dev->mem == NULL) {
> - VHOST_CONFIG_LOG(dev->ifname, ERR, "failed to allocate memory for dev->mem");
> - goto free_guest_pages;
> - }
> + if (vhost_user_initialize_memory(pdev) < 0)
> + goto close_msg_fds;
>
> for (i = 0; i < memory->nregions; i++) {
> reg = &dev->mem->regions[i];
> @@ -1531,11 +1507,182 @@ vhost_user_set_mem_table(struct virtio_net **pdev,
> return RTE_VHOST_MSG_RESULT_OK;
>
> free_mem_table:
> - free_mem_region(dev);
> + free_all_mem_regions(dev);
> rte_free(dev->mem);
> dev->mem = NULL;
> + rte_free(dev->guest_pages);
> + dev->guest_pages = NULL;
> +close_msg_fds:
> + close_msg_fds(ctx);
> + return RTE_VHOST_MSG_RESULT_ERR;
> +}
> +
>
> -free_guest_pages:
> +static int
> +vhost_user_get_max_mem_slots(struct virtio_net **pdev __rte_unused,
> + struct vhu_msg_context *ctx,
> + int main_fd __rte_unused)
> +{
> + uint32_t max_mem_slots = VHOST_MEMORY_MAX_NREGIONS;
> +
> + ctx->msg.payload.u64 = (uint64_t)max_mem_slots;
> + ctx->msg.size = sizeof(ctx->msg.payload.u64);
> + ctx->fd_num = 0;
> +
> + return RTE_VHOST_MSG_RESULT_REPLY;
> +}
> +
> +static int
> +vhost_user_add_mem_reg(struct virtio_net **pdev,
> + struct vhu_msg_context *ctx,
> + int main_fd __rte_unused)
> +{
> + struct virtio_net *dev = *pdev;
> + struct VhostUserMemoryRegion *region = &ctx->msg.payload.memory_single.region;
Please put the longer line in front.
> + uint32_t i;
> +
> + /* make sure new region will fit */
> + if (dev->mem != NULL && dev->mem->nregions >= VHOST_MEMORY_MAX_NREGIONS) {
> + VHOST_CONFIG_LOG(dev->ifname, ERR,
> + "too many memory regions already (%u)",
> + dev->mem->nregions);
Lines of up to 100 characters are currently allowed, so maybe we could put the log message on one line. The same applies to the logs below.
> + goto close_msg_fds;
> + }
> +
> + /* make sure supplied memory fd present */
> + if (ctx->fd_num != 1) {
> + VHOST_CONFIG_LOG(dev->ifname, ERR,
> + "fd count makes no sense (%u)",
> + ctx->fd_num);
> + goto close_msg_fds;
> + }
> +
> + /* Make sure no overlap in guest virtual address space */
> + if (dev->mem != NULL && dev->mem->nregions > 0) {
> + for (uint32_t i = 0; i < VHOST_MEMORY_MAX_NREGIONS; i++) {
> + struct rte_vhost_mem_region *current_region = &dev->mem->regions[i];
> +
> + if (current_region->mmap_size == 0)
> + continue;
> +
> + uint64_t current_region_guest_start = current_region->guest_user_addr;
> + uint64_t current_region_guest_end = current_region_guest_start
> + + current_region->mmap_size - 1;
> + uint64_t proposed_region_guest_start = region->userspace_addr;
> + uint64_t proposed_region_guest_end = proposed_region_guest_start
> + + region->memory_size - 1;
> + bool overlap = false;
> +
> + bool curent_region_guest_start_overlap =
> + current_region_guest_start >= proposed_region_guest_start
> + && current_region_guest_start <= proposed_region_guest_end;
The && should be placed at the end of the line, according to the DPDK coding style.
> + bool curent_region_guest_end_overlap =
> + current_region_guest_end >= proposed_region_guest_start
> + && current_region_guest_end <= proposed_region_guest_end;
> + bool proposed_region_guest_start_overlap =
> + proposed_region_guest_start >= current_region_guest_start
> + && proposed_region_guest_start <= current_region_guest_end;
> + bool proposed_region_guest_end_overlap =
> + proposed_region_guest_end >= current_region_guest_start
> + && proposed_region_guest_end <= current_region_guest_end;
> +
> + overlap = curent_region_guest_start_overlap
> + || curent_region_guest_end_overlap
> + || proposed_region_guest_start_overlap
> + || proposed_region_guest_end_overlap;
> +
> + if (overlap) {
> + VHOST_CONFIG_LOG(dev->ifname, ERR,
> + "requested memory region overlaps with another region");
> + VHOST_CONFIG_LOG(dev->ifname, ERR,
> + "\tRequested region address:0x%" PRIx64,
> + region->userspace_addr);
> + VHOST_CONFIG_LOG(dev->ifname, ERR,
> + "\tRequested region size:0x%" PRIx64,
> + region->memory_size);
> + VHOST_CONFIG_LOG(dev->ifname, ERR,
> + "\tOverlapping region address:0x%" PRIx64,
> + current_region->guest_user_addr);
> + VHOST_CONFIG_LOG(dev->ifname, ERR,
> + "\tOverlapping region size:0x%" PRIx64,
> + current_region->mmap_size);
> + goto close_msg_fds;
How about adding a sub-function to wrap this check?
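Something along these lines, for example (untested sketch, the helper name
is just illustrative; it uses the structs already present in this file):

static bool
mem_regions_overlap(const struct rte_vhost_mem_region *cur,
                    const struct VhostUserMemoryRegion *req)
{
        uint64_t cur_start = cur->guest_user_addr;
        uint64_t cur_end   = cur_start + cur->mmap_size - 1;
        uint64_t req_start = req->userspace_addr;
        uint64_t req_end   = req_start + req->memory_size - 1;

        /* two closed ranges overlap unless one ends before the other starts */
        return cur_start <= req_end && req_start <= cur_end;
}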
> + }
> +
> + }
> + }
> +
> + /* convert first region add to normal memory table set */
> + if (dev->mem == NULL) {
> + if (vhost_user_initialize_memory(pdev) < 0)
> + goto close_msg_fds;
> + }
I think we could put this check at the beginning of the function, so that we could
drop the repeated 'if (dev->mem == NULL)' checks.
> +
> + /* find a new region and set it like memory table set does */
> + struct rte_vhost_mem_region *reg = NULL;
> + uint64_t mmap_offset;
Suggest placing them at the beginning of the function.
> +
> + for (uint32_t i = 0; i < VHOST_MEMORY_MAX_NREGIONS; i++) {
> + if (dev->mem->regions[i].guest_user_addr == 0) {
> + reg = &dev->mem->regions[i];
> + break;
> + }
> + }
> + if (reg == NULL) {
> + VHOST_CONFIG_LOG(dev->ifname, ERR, "no free memory region");
> + goto close_msg_fds;
> + }
> +
> + reg->guest_phys_addr = region->guest_phys_addr;
> + reg->guest_user_addr = region->userspace_addr;
> + reg->size = region->memory_size;
> + reg->fd = ctx->fds[0];
> +
> + mmap_offset = region->mmap_offset;
No need to define mmap_offset, which is only used once.
> +
> + if (vhost_user_mmap_region(dev, reg, mmap_offset) < 0) {
> + VHOST_CONFIG_LOG(dev->ifname, ERR, "failed to mmap region");
> + goto close_msg_fds;
> + }
> +
> + dev->mem->nregions++;
> +
> + if (dev->async_copy && rte_vfio_is_enabled("vfio"))
> + async_dma_map(dev, true);
This will map all regions; we should map only this region.
> +
> + if (vhost_user_postcopy_register(dev, main_fd, ctx) < 0)
> + goto free_mem_table;
> +
> + for (i = 0; i < dev->nr_vring; i++) {
> + struct vhost_virtqueue *vq = dev->virtqueue[i];
> +
> + if (!vq)
> + continue;
> +
> + if (vq->desc || vq->avail || vq->used) {
> + /* vhost_user_lock_all_queue_pairs locked all qps */
> + VHOST_USER_ASSERT_LOCK(dev, vq, VHOST_USER_ADD_MEM_REG);
> +
> + /*
> + * If the memory table got updated, the ring addresses
> + * need to be translated again as virtual addresses have
> + * changed.
> + */
> + vring_invalidate(dev, vq);
> +
> + translate_ring_addresses(&dev, &vq);
> + *pdev = dev;
> + }
> + }
> +
> + dump_guest_pages(dev);
> +
> + return RTE_VHOST_MSG_RESULT_OK;
> +
> +free_mem_table:
> + free_all_mem_regions(dev);
> + rte_free(dev->mem);
> + dev->mem = NULL;
> rte_free(dev->guest_pages);
> dev->guest_pages = NULL;
> close_msg_fds:
> @@ -1543,6 +1690,48 @@ vhost_user_set_mem_table(struct virtio_net **pdev,
> return RTE_VHOST_MSG_RESULT_ERR;
> }
>
> +static int
> +vhost_user_rem_mem_reg(struct virtio_net **pdev __rte_unused,
> + struct vhu_msg_context *ctx __rte_unused,
> + int main_fd __rte_unused)
> +{
> + struct virtio_net *dev = *pdev;
> + struct VhostUserMemoryRegion *region = &ctx->msg.payload.memory_single.region;
> +
> + if ((dev->mem) && (dev->flags & VIRTIO_DEV_VDPA_CONFIGURED)) {
> + struct rte_vdpa_device *vdpa_dev = dev->vdpa_dev;
> +
> + if (vdpa_dev && vdpa_dev->ops->dev_close)
> + vdpa_dev->ops->dev_close(dev->vid);
> + dev->flags &= ~VIRTIO_DEV_VDPA_CONFIGURED;
DPDK's vdpa devices will not report the new feature, so I think this will be dead code.
> + }
> +
> + if (dev->mem != NULL && dev->mem->nregions > 0) {
> + for (uint32_t i = 0; i < VHOST_MEMORY_MAX_NREGIONS; i++) {
Suggest defining i at the beginning of the function.
> + struct rte_vhost_mem_region *current_region = &dev->mem->regions[i];
> +
> + if (current_region->guest_user_addr == 0)
> + continue;
> +
> + /*
> + * According to the vhost-user specification:
> + * The memory region to be removed is identified by its guest address,
> + * user address and size. The mmap offset is ignored.
> + */
> + if (region->userspace_addr == current_region->guest_user_addr
> + && region->guest_phys_addr == current_region->guest_phys_addr
> + && region->memory_size == current_region->size) {
> + free_mem_region(current_region);
I checked the 4/5 commit; we should do an async_dma unmap of this region here.
Besides, should we invalidate the vring here?
> + dev->mem->nregions--;
> + return RTE_VHOST_MSG_RESULT_OK;
> + }
> + }
> + }
> +
> + VHOST_CONFIG_LOG(dev->ifname, ERR, "failed to find region");
> + return RTE_VHOST_MSG_RESULT_ERR;
> +}
> +
> static bool
> vq_is_ready(struct virtio_net *dev, struct vhost_virtqueue *vq)
> {
* Re: [PATCH v3 4/5] vhost_user: support function defines for back-end
2025-11-04 4:21 ` [PATCH v3 4/5] vhost_user: support function defines for back-end Pravin M Bathija
@ 2025-11-04 8:05 ` fengchengwen
0 siblings, 0 replies; 11+ messages in thread
From: fengchengwen @ 2025-11-04 8:05 UTC (permalink / raw)
To: Pravin M Bathija, dev; +Cc: pravin.m.bathija.dev
On 11/4/2025 12:21 PM, Pravin M Bathija wrote:
> Here we define support functions which are called from the various
> vhost-user back-end message functions like set memory table, get
> memory slots, add memory region, remove memory region. These are
> essentially common functions to initialize memory, unmap a set of
> memory regions, perform register copy and align memory addresses.
>
> Signed-off-by: Pravin M Bathija <pravin.bathija@dell.com>
> ---
> lib/vhost/vhost_user.c | 80 +++++++++++++++++++++++++++++++++++-------
> 1 file changed, 68 insertions(+), 12 deletions(-)
>
> diff --git a/lib/vhost/vhost_user.c b/lib/vhost/vhost_user.c
> index 168432e7d1..9a85f2fc92 100644
> --- a/lib/vhost/vhost_user.c
> +++ b/lib/vhost/vhost_user.c
> @@ -228,7 +228,17 @@ async_dma_map(struct virtio_net *dev, bool do_map)
> }
>
> static void
> -free_mem_region(struct virtio_net *dev)
> +free_mem_region(struct rte_vhost_mem_region *reg)
> +{
> + if (reg != NULL && reg->host_user_addr) {
> + munmap(reg->mmap_addr, reg->mmap_size);
> + close(reg->fd);
> + memset(reg, 0, sizeof(struct rte_vhost_mem_region));
> + }
> +}
> +
> +static void
> +free_all_mem_regions(struct virtio_net *dev)
> {
> uint32_t i;
> struct rte_vhost_mem_region *reg;
> @@ -239,12 +249,10 @@ free_mem_region(struct virtio_net *dev)
> if (dev->async_copy && rte_vfio_is_enabled("vfio"))
> async_dma_map(dev, false);
>
> - for (i = 0; i < dev->mem->nregions; i++) {
> + for (i = 0; i < VHOST_MEMORY_MAX_NREGIONS; i++) {
> reg = &dev->mem->regions[i];
> - if (reg->host_user_addr) {
> - munmap(reg->mmap_addr, reg->mmap_size);
> - close(reg->fd);
> - }
> + if (reg->mmap_addr)
> + free_mem_region(reg);
> }
> }
>
> @@ -258,7 +266,7 @@ vhost_backend_cleanup(struct virtio_net *dev)
> vdpa_dev->ops->dev_cleanup(dev->vid);
>
> if (dev->mem) {
> - free_mem_region(dev);
> + free_all_mem_regions(dev);
> rte_free(dev->mem);
> dev->mem = NULL;
> }
> @@ -707,7 +715,7 @@ numa_realloc(struct virtio_net **pdev, struct vhost_virtqueue **pvq)
> vhost_devices[dev->vid] = dev;
>
> mem_size = sizeof(struct rte_vhost_memory) +
> - sizeof(struct rte_vhost_mem_region) * dev->mem->nregions;
> + sizeof(struct rte_vhost_mem_region) * VHOST_MEMORY_MAX_NREGIONS;
> mem = rte_realloc_socket(dev->mem, mem_size, 0, node);
> if (!mem) {
> VHOST_CONFIG_LOG(dev->ifname, ERR,
> @@ -811,8 +819,10 @@ hua_to_alignment(struct rte_vhost_memory *mem, void *ptr)
> uint32_t i;
> uintptr_t hua = (uintptr_t)ptr;
>
> - for (i = 0; i < mem->nregions; i++) {
> + for (i = 0; i < VHOST_MEMORY_MAX_NREGIONS; i++) {
> r = &mem->regions[i];
> + if (r->host_user_addr == 0)
> + continue;
> if (hua >= r->host_user_addr &&
> hua < r->host_user_addr + r->size) {
> return get_blk_size(r->fd);
> @@ -1250,9 +1260,13 @@ vhost_user_postcopy_register(struct virtio_net *dev, int main_fd,
> * retrieve the region offset when handling userfaults.
> */
> memory = &ctx->msg.payload.memory;
> - for (i = 0; i < memory->nregions; i++) {
> + for (i = 0; i < VHOST_MEMORY_MAX_NREGIONS; i++) {
I think using MAX_NREGIONS here is mostly for convenience, but it will impact performance,
because rte_vhost_va_from_guest_pa() would then have to iterate over the entire array.
I think we should keep the original implementation: make sure the first nregions entries
of the memory region array are always valid.
Besides, where is the corresponding modification for rte_vhost_va_from_guest_pa()?
> + int reg_msg_index = 0;
> reg = &dev->mem->regions[i];
> - memory->regions[i].userspace_addr = reg->host_user_addr;
> + if (reg->host_user_addr == 0)
> + continue;
> + memory->regions[reg_msg_index].userspace_addr = reg->host_user_addr;
> + reg_msg_index++;
> }
>
> /* Send the addresses back to qemu */
> @@ -1279,8 +1293,10 @@ vhost_user_postcopy_register(struct virtio_net *dev, int main_fd,
> }
>
> /* Now userfault register and we can use the memory */
> - for (i = 0; i < memory->nregions; i++) {
> + for (i = 0; i < VHOST_MEMORY_MAX_NREGIONS; i++) {
> reg = &dev->mem->regions[i];
> + if (reg->host_user_addr == 0)
> + continue;
> if (vhost_user_postcopy_region_register(dev, reg) < 0)
> return -1;
> }
> @@ -1385,6 +1401,46 @@ vhost_user_mmap_region(struct virtio_net *dev,
> return 0;
> }
>
> +static int
> +vhost_user_initialize_memory(struct virtio_net **pdev)
This function should be part of 3/5, otherwise 3/5 will fail to compile.
> +{
> + struct virtio_net *dev = *pdev;
> + int numa_node = SOCKET_ID_ANY;
> +
> + /*
> + * If VQ 0 has already been allocated, try to allocate on the same
> + * NUMA node. It can be reallocated later in numa_realloc().
> + */
> + if (dev->nr_vring > 0)
> + numa_node = dev->virtqueue[0]->numa_node;
> +
> + dev->nr_guest_pages = 0;
> + if (dev->guest_pages == NULL) {
> + dev->max_guest_pages = 8;
It should be VHOST_MEMORY_MAX_NREGIONS
> + dev->guest_pages = rte_zmalloc_socket(NULL,
> + dev->max_guest_pages *
> + sizeof(struct guest_page),
> + RTE_CACHE_LINE_SIZE,
> + numa_node);
> + if (dev->guest_pages == NULL) {
> + VHOST_CONFIG_LOG(dev->ifname, ERR,
> + "failed to allocate memory for dev->guest_pages");
> + return -1;
> + }
> + }
> +
> + dev->mem = rte_zmalloc_socket("vhost-mem-table", sizeof(struct rte_vhost_memory) +
> + sizeof(struct rte_vhost_mem_region) * VHOST_MEMORY_MAX_NREGIONS, 0, numa_node);
> + if (dev->mem == NULL) {
> + VHOST_CONFIG_LOG(dev->ifname, ERR, "failed to allocate memory for dev->mem");
> + rte_free(dev->guest_pages);
> + dev->guest_pages = NULL;
> + return -1;
> + }
> +
> + return 0;
> +}
> +
> static int
> vhost_user_set_mem_table(struct virtio_net **pdev,
> struct vhu_msg_context *ctx,
* Re: [PATCH v3 5/5] vhost_user: Increase number of memory regions
2025-11-04 4:21 ` [PATCH v3 5/5] vhost_user: Increase number of memory regions Pravin M Bathija
@ 2025-11-04 8:12 ` fengchengwen
0 siblings, 0 replies; 11+ messages in thread
From: fengchengwen @ 2025-11-04 8:12 UTC (permalink / raw)
To: Pravin M Bathija, dev; +Cc: pravin.m.bathija.dev
On 11/4/2025 12:21 PM, Pravin M Bathija wrote:
> In this patch the number of memory regions is increased from
> 8 to 128. When a vhost-user front-end such as qemu or libblkio
> queries a back-end such as dpdk with the get max memory slots
> message, the back-end now replies with 128 instead of the
> previously defined 8. The back-end also allocates that many
> slots in the memory table, where regions are added/removed as
> requested by the vhost-user front-end. This also lets the
> vhost-user front-end limit the number of memory regions when
> sending the set mem table message or when adding memory regions.
>
> Signed-off-by: Pravin M Bathija <pravin.bathija@dell.com>
> ---
> lib/vhost/vhost_user.h | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/lib/vhost/vhost_user.h b/lib/vhost/vhost_user.h
> index 5a0e747b58..c6ad5b76d6 100644
> --- a/lib/vhost/vhost_user.h
> +++ b/lib/vhost/vhost_user.h
> @@ -11,7 +11,7 @@
>
> /* refer to hw/virtio/vhost-user.c */
>
> -#define VHOST_MEMORY_MAX_NREGIONS 8
> +#define VHOST_MEMORY_MAX_NREGIONS 128
The address translation cost may increase a lot if there really are 128 regions.
Maybe we should add another patch to optimize it.
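For example (untested sketch, assuming the regions array is kept sorted by
guest_phys_addr), the GPA-to-VA lookup could then use a binary search
instead of the linear scan:

static struct rte_vhost_mem_region *
find_region(struct rte_vhost_memory *mem, uint64_t gpa)
{
        uint32_t lo = 0, hi = mem->nregions;

        while (lo < hi) {
                uint32_t mid = lo + (hi - lo) / 2;
                struct rte_vhost_mem_region *r = &mem->regions[mid];

                if (gpa < r->guest_phys_addr)
                        hi = mid;
                else if (gpa >= r->guest_phys_addr + r->size)
                        lo = mid + 1;
                else
                        return r;
        }
        return NULL;
}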
>
> #define VHOST_USER_NET_SUPPORTED_FEATURES \
> (VIRTIO_NET_SUPPORTED_FEATURES | \