* [PATCH v3 0/4] vhost: add device op to offload the interrupt kick
@ 2023-05-17 9:08 Eelco Chaudron
2023-05-17 9:08 ` [PATCH v3 1/4] vhost: change vhost_virtqueue access lock to a read/write one Eelco Chaudron
` (4 more replies)
0 siblings, 5 replies; 35+ messages in thread
From: Eelco Chaudron @ 2023-05-17 9:08 UTC (permalink / raw)
To: maxime.coquelin, chenbo.xia, david.marchand; +Cc: dev
This series adds an operation callback which gets called every time the
library wants to call eventfd_write(). This eventfd_write() call could
result in a system call, which could potentially block the PMD thread.
The callback function can decide whether it's ok to handle the
eventfd_write() now or have the newly introduced function,
rte_vhost_notify_guest(), called at a later time.
This can be used by 3rd party applications, like OVS, to avoid system
calls being called as part of the PMD threads.
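As a rough illustration, here is a minimal sketch of the application side; the deferral-ring helper is a hypothetical placeholder, and only guest_notify, rte_vhost_driver_callback_register() and rte_vhost_notify_guest() are actual vhost API (patch 4 sketches the thread that flushes the deferred kicks):
#include <stdbool.h>
#include <stdint.h>
#include <rte_vhost.h>
/* Hypothetical application helper: queue (vid, queue_id) for a
 * non-PMD thread; return false if the deferral ring is full. */
static bool
app_kick_ring_enqueue(int vid, uint16_t queue_id)
{
	(void)vid;
	(void)queue_id;
	return true;
}
/* Called by the vhost library instead of eventfd_write(). Returning
 * true means the application will call rte_vhost_notify_guest()
 * later; returning false makes the library do the write itself. */
static bool
app_guest_notify(int vid, uint16_t queue_id)
{
	return app_kick_ring_enqueue(vid, queue_id);
}
static const struct rte_vhost_device_ops app_vhost_ops = {
	/* new_device, destroy_device, ... omitted for brevity */
	.guest_notify = app_guest_notify,
};
/* Registration is unchanged, e.g.:
 *   rte_vhost_driver_callback_register("/tmp/vhost.sock", &app_vhost_ops);
 */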
v3:
- Changed ABI compatibility code to no longer use a boolean
to avoid having to disable specific GCC warnings.
- Moved the fd check fix to a separate patch (patch 3/4).
- Fixed some coding style issues.
v2: - Used vhost_virtqueue->index to find index for operation.
- Aligned function name to VDUSE RFC patchset.
- Added error and offload statistics counter.
- Mark new API as experimental.
- Change the virtual queue spin lock to read/write spin lock.
- Made shared counters atomic.
- Add versioned rte_vhost_driver_callback_register() for
ABI compliance.
Eelco Chaudron (4):
vhost: change vhost_virtqueue access lock to a read/write one
vhost: make the guest_notifications statistic counter atomic
vhost: fix invalid call FD handling
vhost: add device op to offload the interrupt kick
lib/eal/include/generic/rte_rwlock.h | 17 +++++
lib/vhost/meson.build | 2 +
lib/vhost/rte_vhost.h | 23 ++++++-
lib/vhost/socket.c | 63 +++++++++++++++++--
lib/vhost/version.map | 9 +++
lib/vhost/vhost.c | 92 +++++++++++++++++++++-------
lib/vhost/vhost.h | 69 ++++++++++++++-------
lib/vhost/vhost_user.c | 14 ++---
lib/vhost/virtio_net.c | 90 +++++++++++++--------------
9 files changed, 278 insertions(+), 101 deletions(-)
* [PATCH v3 1/4] vhost: change vhost_virtqueue access lock to a read/write one
2023-05-17 9:08 [PATCH v3 0/4] vhost: add device op to offload the interrupt kick Eelco Chaudron
@ 2023-05-17 9:08 ` Eelco Chaudron
2023-05-17 17:33 ` Maxime Coquelin
2023-05-31 6:37 ` Xia, Chenbo
2023-05-17 9:08 ` [PATCH v3 2/4] vhost: make the guest_notifications statistic counter atomic Eelco Chaudron
` (3 subsequent siblings)
4 siblings, 2 replies; 35+ messages in thread
From: Eelco Chaudron @ 2023-05-17 9:08 UTC (permalink / raw)
To: maxime.coquelin, chenbo.xia, david.marchand; +Cc: dev
This change will allow the vhost interrupt datapath handling to be split
between two processes without one of them holding an explicit lock.
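As a minimal standalone sketch (using only the generic rte_rwlock API, not the actual vhost structures), the idea is that the datapath and the interrupt path both take the lock for reading and can run concurrently, while control-path updates still get exclusive access:
#include <rte_rwlock.h>
static rte_rwlock_t access_lock = RTE_RWLOCK_INITIALIZER;
/* PMD thread: processes descriptors under the read lock. */
static void
datapath_burst(void)
{
	rte_rwlock_read_lock(&access_lock);
	/* enqueue/dequeue work; may run concurrently with kick_guest() */
	rte_rwlock_read_unlock(&access_lock);
}
/* Interrupt path: also only needs the read lock to kick the guest. */
static void
kick_guest(void)
{
	rte_rwlock_read_lock(&access_lock);
	/* eventfd_write() on the call fd would go here */
	rte_rwlock_read_unlock(&access_lock);
}
/* Control path (vhost-user messages): still takes the write lock. */
static void
control_path_update(void)
{
	rte_rwlock_write_lock(&access_lock);
	/* modify ring addresses, call fd, ... with exclusive access */
	rte_rwlock_write_unlock(&access_lock);
}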
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
---
lib/eal/include/generic/rte_rwlock.h | 17 ++++++
lib/vhost/vhost.c | 46 +++++++++--------
lib/vhost/vhost.h | 4 +-
lib/vhost/vhost_user.c | 14 +++--
lib/vhost/virtio_net.c | 90 +++++++++++++++++-----------------
5 files changed, 94 insertions(+), 77 deletions(-)
diff --git a/lib/eal/include/generic/rte_rwlock.h b/lib/eal/include/generic/rte_rwlock.h
index 71e2d8d5f4..9e083bbc61 100644
--- a/lib/eal/include/generic/rte_rwlock.h
+++ b/lib/eal/include/generic/rte_rwlock.h
@@ -236,6 +236,23 @@ rte_rwlock_write_unlock(rte_rwlock_t *rwl)
__atomic_fetch_sub(&rwl->cnt, RTE_RWLOCK_WRITE, __ATOMIC_RELEASE);
}
+/**
+ * Test if the write lock is taken.
+ *
+ * @param rwl
+ * A pointer to a rwlock structure.
+ * @return
+ * 1 if the write lock is currently taken; 0 otherwise.
+ */
+static inline int
+rte_rwlock_write_is_locked(rte_rwlock_t *rwl)
+{
+ if (__atomic_load_n(&rwl->cnt, __ATOMIC_RELAXED) & RTE_RWLOCK_WRITE)
+ return 1;
+
+ return 0;
+}
+
/**
* Try to execute critical section in a hardware memory transaction, if it
* fails or not available take a read lock
diff --git a/lib/vhost/vhost.c b/lib/vhost/vhost.c
index ef37943817..74bdbfd810 100644
--- a/lib/vhost/vhost.c
+++ b/lib/vhost/vhost.c
@@ -393,9 +393,9 @@ free_vq(struct virtio_net *dev, struct vhost_virtqueue *vq)
else
rte_free(vq->shadow_used_split);
- rte_spinlock_lock(&vq->access_lock);
+ rte_rwlock_write_lock(&vq->access_lock);
vhost_free_async_mem(vq);
- rte_spinlock_unlock(&vq->access_lock);
+ rte_rwlock_write_unlock(&vq->access_lock);
rte_free(vq->batch_copy_elems);
vhost_user_iotlb_destroy(vq);
rte_free(vq->log_cache);
@@ -630,7 +630,7 @@ alloc_vring_queue(struct virtio_net *dev, uint32_t vring_idx)
dev->virtqueue[i] = vq;
init_vring_queue(dev, vq, i);
- rte_spinlock_init(&vq->access_lock);
+ rte_rwlock_init(&vq->access_lock);
vq->avail_wrap_counter = 1;
vq->used_wrap_counter = 1;
vq->signalled_used_valid = false;
@@ -1305,14 +1305,14 @@ rte_vhost_vring_call(int vid, uint16_t vring_idx)
if (!vq)
return -1;
- rte_spinlock_lock(&vq->access_lock);
+ rte_rwlock_read_lock(&vq->access_lock);
if (vq_is_packed(dev))
vhost_vring_call_packed(dev, vq);
else
vhost_vring_call_split(dev, vq);
- rte_spinlock_unlock(&vq->access_lock);
+ rte_rwlock_read_unlock(&vq->access_lock);
return 0;
}
@@ -1334,7 +1334,7 @@ rte_vhost_vring_call_nonblock(int vid, uint16_t vring_idx)
if (!vq)
return -1;
- if (!rte_spinlock_trylock(&vq->access_lock))
+ if (rte_rwlock_read_trylock(&vq->access_lock))
return -EAGAIN;
if (vq_is_packed(dev))
@@ -1342,7 +1342,7 @@ rte_vhost_vring_call_nonblock(int vid, uint16_t vring_idx)
else
vhost_vring_call_split(dev, vq);
- rte_spinlock_unlock(&vq->access_lock);
+ rte_rwlock_read_unlock(&vq->access_lock);
return 0;
}
@@ -1365,7 +1365,7 @@ rte_vhost_avail_entries(int vid, uint16_t queue_id)
if (!vq)
return 0;
- rte_spinlock_lock(&vq->access_lock);
+ rte_rwlock_write_lock(&vq->access_lock);
if (unlikely(!vq->enabled || vq->avail == NULL))
goto out;
@@ -1373,7 +1373,7 @@ rte_vhost_avail_entries(int vid, uint16_t queue_id)
ret = *(volatile uint16_t *)&vq->avail->idx - vq->last_used_idx;
out:
- rte_spinlock_unlock(&vq->access_lock);
+ rte_rwlock_write_unlock(&vq->access_lock);
return ret;
}
@@ -1457,12 +1457,12 @@ rte_vhost_enable_guest_notification(int vid, uint16_t queue_id, int enable)
if (!vq)
return -1;
- rte_spinlock_lock(&vq->access_lock);
+ rte_rwlock_write_lock(&vq->access_lock);
vq->notif_enable = enable;
ret = vhost_enable_guest_notification(dev, vq, enable);
- rte_spinlock_unlock(&vq->access_lock);
+ rte_rwlock_write_unlock(&vq->access_lock);
return ret;
}
@@ -1520,7 +1520,7 @@ rte_vhost_rx_queue_count(int vid, uint16_t qid)
if (vq == NULL)
return 0;
- rte_spinlock_lock(&vq->access_lock);
+ rte_rwlock_write_lock(&vq->access_lock);
if (unlikely(!vq->enabled || vq->avail == NULL))
goto out;
@@ -1528,7 +1528,7 @@ rte_vhost_rx_queue_count(int vid, uint16_t qid)
ret = *((volatile uint16_t *)&vq->avail->idx) - vq->last_avail_idx;
out:
- rte_spinlock_unlock(&vq->access_lock);
+ rte_rwlock_write_unlock(&vq->access_lock);
return ret;
}
@@ -1757,9 +1757,9 @@ rte_vhost_async_channel_register(int vid, uint16_t queue_id)
if (unlikely(vq == NULL || !dev->async_copy || dev->vdpa_dev != NULL))
return -1;
- rte_spinlock_lock(&vq->access_lock);
+ rte_rwlock_write_lock(&vq->access_lock);
ret = async_channel_register(dev, vq);
- rte_spinlock_unlock(&vq->access_lock);
+ rte_rwlock_write_unlock(&vq->access_lock);
return ret;
}
@@ -1804,7 +1804,7 @@ rte_vhost_async_channel_unregister(int vid, uint16_t queue_id)
if (vq == NULL)
return ret;
- if (!rte_spinlock_trylock(&vq->access_lock)) {
+ if (rte_rwlock_write_trylock(&vq->access_lock)) {
VHOST_LOG_CONFIG(dev->ifname, ERR,
"failed to unregister async channel, virtqueue busy.\n");
return ret;
@@ -1821,7 +1821,7 @@ rte_vhost_async_channel_unregister(int vid, uint16_t queue_id)
ret = 0;
}
- rte_spinlock_unlock(&vq->access_lock);
+ rte_rwlock_write_unlock(&vq->access_lock);
return ret;
}
@@ -1954,7 +1954,7 @@ rte_vhost_async_get_inflight(int vid, uint16_t queue_id)
if (vq == NULL)
return ret;
- if (!rte_spinlock_trylock(&vq->access_lock)) {
+ if (rte_rwlock_write_trylock(&vq->access_lock)) {
VHOST_LOG_CONFIG(dev->ifname, DEBUG,
"failed to check in-flight packets. virtqueue busy.\n");
return ret;
@@ -1963,7 +1963,7 @@ rte_vhost_async_get_inflight(int vid, uint16_t queue_id)
if (vq->async)
ret = vq->async->pkts_inflight_n;
- rte_spinlock_unlock(&vq->access_lock);
+ rte_rwlock_write_unlock(&vq->access_lock);
return ret;
}
@@ -2084,13 +2084,13 @@ rte_vhost_vring_stats_get(int vid, uint16_t queue_id,
vq = dev->virtqueue[queue_id];
- rte_spinlock_lock(&vq->access_lock);
+ rte_rwlock_write_lock(&vq->access_lock);
for (i = 0; i < VHOST_NB_VQ_STATS; i++) {
stats[i].value =
*(uint64_t *)(((char *)vq) + vhost_vq_stat_strings[i].offset);
stats[i].id = i;
}
- rte_spinlock_unlock(&vq->access_lock);
+ rte_rwlock_write_unlock(&vq->access_lock);
return VHOST_NB_VQ_STATS;
}
@@ -2111,9 +2111,9 @@ int rte_vhost_vring_stats_reset(int vid, uint16_t queue_id)
vq = dev->virtqueue[queue_id];
- rte_spinlock_lock(&vq->access_lock);
+ rte_rwlock_write_lock(&vq->access_lock);
memset(&vq->stats, 0, sizeof(vq->stats));
- rte_spinlock_unlock(&vq->access_lock);
+ rte_rwlock_write_unlock(&vq->access_lock);
return 0;
}
diff --git a/lib/vhost/vhost.h b/lib/vhost/vhost.h
index 8fdab13c70..5c939ef06f 100644
--- a/lib/vhost/vhost.h
+++ b/lib/vhost/vhost.h
@@ -277,7 +277,7 @@ struct vhost_virtqueue {
bool access_ok;
bool ready;
- rte_spinlock_t access_lock;
+ rte_rwlock_t access_lock;
union {
@@ -517,7 +517,7 @@ static inline void
vq_assert_lock__(struct virtio_net *dev, struct vhost_virtqueue *vq, const char *func)
__rte_assert_exclusive_lock(&vq->access_lock)
{
- if (unlikely(!rte_spinlock_is_locked(&vq->access_lock)))
+ if (unlikely(!rte_rwlock_write_is_locked(&vq->access_lock)))
rte_panic("VHOST_CONFIG: (%s) %s() called without access lock taken.\n",
dev->ifname, func);
}
diff --git a/lib/vhost/vhost_user.c b/lib/vhost/vhost_user.c
index d60e39b6bc..c9454ce3d9 100644
--- a/lib/vhost/vhost_user.c
+++ b/lib/vhost/vhost_user.c
@@ -399,7 +399,7 @@ vhost_user_set_features(struct virtio_net **pdev,
cleanup_vq_inflight(dev, vq);
/* vhost_user_lock_all_queue_pairs locked all qps */
vq_assert_lock(dev, vq);
- rte_spinlock_unlock(&vq->access_lock);
+ rte_rwlock_write_unlock(&vq->access_lock);
free_vq(dev, vq);
}
}
@@ -2649,10 +2649,10 @@ vhost_user_iotlb_msg(struct virtio_net **pdev,
len, imsg->perm);
if (is_vring_iotlb(dev, vq, imsg)) {
- rte_spinlock_lock(&vq->access_lock);
+ rte_rwlock_write_lock(&vq->access_lock);
translate_ring_addresses(&dev, &vq);
*pdev = dev;
- rte_spinlock_unlock(&vq->access_lock);
+ rte_rwlock_write_unlock(&vq->access_lock);
}
}
break;
@@ -2667,9 +2667,9 @@ vhost_user_iotlb_msg(struct virtio_net **pdev,
imsg->size);
if (is_vring_iotlb(dev, vq, imsg)) {
- rte_spinlock_lock(&vq->access_lock);
+ rte_rwlock_write_lock(&vq->access_lock);
vring_invalidate(dev, vq);
- rte_spinlock_unlock(&vq->access_lock);
+ rte_rwlock_write_unlock(&vq->access_lock);
}
}
break;
@@ -3030,7 +3030,7 @@ vhost_user_lock_all_queue_pairs(struct virtio_net *dev)
struct vhost_virtqueue *vq = dev->virtqueue[i];
if (vq) {
- rte_spinlock_lock(&vq->access_lock);
+ rte_rwlock_write_lock(&vq->access_lock);
vq_num++;
}
i++;
@@ -3048,7 +3048,7 @@ vhost_user_unlock_all_queue_pairs(struct virtio_net *dev)
struct vhost_virtqueue *vq = dev->virtqueue[i];
if (vq) {
- rte_spinlock_unlock(&vq->access_lock);
+ rte_rwlock_write_unlock(&vq->access_lock);
vq_num++;
}
i++;
diff --git a/lib/vhost/virtio_net.c b/lib/vhost/virtio_net.c
index fe10a78586..d7624d18c8 100644
--- a/lib/vhost/virtio_net.c
+++ b/lib/vhost/virtio_net.c
@@ -55,7 +55,7 @@ is_valid_virt_queue_idx(uint32_t idx, int is_tx, uint32_t nr_vring)
static inline void
vhost_queue_stats_update(struct virtio_net *dev, struct vhost_virtqueue *vq,
struct rte_mbuf **pkts, uint16_t count)
- __rte_exclusive_locks_required(&vq->access_lock)
+ __rte_shared_locks_required(&vq->access_lock)
{
struct virtqueue_stats *stats = &vq->stats;
int i;
@@ -102,7 +102,7 @@ static __rte_always_inline int64_t
vhost_async_dma_transfer_one(struct virtio_net *dev, struct vhost_virtqueue *vq,
int16_t dma_id, uint16_t vchan_id, uint16_t flag_idx,
struct vhost_iov_iter *pkt)
- __rte_exclusive_locks_required(&vq->access_lock)
+ __rte_shared_locks_required(&vq->access_lock)
{
struct async_dma_vchan_info *dma_info = &dma_copy_track[dma_id].vchans[vchan_id];
uint16_t ring_mask = dma_info->ring_mask;
@@ -152,7 +152,7 @@ static __rte_always_inline uint16_t
vhost_async_dma_transfer(struct virtio_net *dev, struct vhost_virtqueue *vq,
int16_t dma_id, uint16_t vchan_id, uint16_t head_idx,
struct vhost_iov_iter *pkts, uint16_t nr_pkts)
- __rte_exclusive_locks_required(&vq->access_lock)
+ __rte_shared_locks_required(&vq->access_lock)
{
struct async_dma_vchan_info *dma_info = &dma_copy_track[dma_id].vchans[vchan_id];
int64_t ret, nr_copies = 0;
@@ -454,7 +454,7 @@ vhost_async_shadow_enqueue_packed_batch(struct vhost_virtqueue *vq,
static __rte_always_inline void
vhost_async_shadow_dequeue_packed_batch(struct vhost_virtqueue *vq, uint16_t *ids)
- __rte_exclusive_locks_required(&vq->access_lock)
+ __rte_shared_locks_required(&vq->access_lock)
{
uint16_t i;
struct vhost_async *async = vq->async;
@@ -1131,7 +1131,7 @@ static __rte_always_inline int
async_fill_seg(struct virtio_net *dev, struct vhost_virtqueue *vq,
struct rte_mbuf *m, uint32_t mbuf_offset,
uint64_t buf_iova, uint32_t cpy_len, bool to_desc)
- __rte_exclusive_locks_required(&vq->access_lock)
+ __rte_shared_locks_required(&vq->access_lock)
__rte_shared_locks_required(&vq->iotlb_lock)
{
struct vhost_async *async = vq->async;
@@ -1211,7 +1211,7 @@ static __rte_always_inline int
mbuf_to_desc(struct virtio_net *dev, struct vhost_virtqueue *vq,
struct rte_mbuf *m, struct buf_vector *buf_vec,
uint16_t nr_vec, uint16_t num_buffers, bool is_async)
- __rte_exclusive_locks_required(&vq->access_lock)
+ __rte_shared_locks_required(&vq->access_lock)
__rte_shared_locks_required(&vq->iotlb_lock)
{
uint32_t vec_idx = 0;
@@ -1341,7 +1341,7 @@ vhost_enqueue_single_packed(struct virtio_net *dev,
struct rte_mbuf *pkt,
struct buf_vector *buf_vec,
uint16_t *nr_descs)
- __rte_exclusive_locks_required(&vq->access_lock)
+ __rte_shared_locks_required(&vq->access_lock)
__rte_shared_locks_required(&vq->iotlb_lock)
{
uint16_t nr_vec = 0;
@@ -1403,7 +1403,7 @@ vhost_enqueue_single_packed(struct virtio_net *dev,
static __rte_noinline uint32_t
virtio_dev_rx_split(struct virtio_net *dev, struct vhost_virtqueue *vq,
struct rte_mbuf **pkts, uint32_t count)
- __rte_exclusive_locks_required(&vq->access_lock)
+ __rte_shared_locks_required(&vq->access_lock)
__rte_shared_locks_required(&vq->iotlb_lock)
{
uint32_t pkt_idx = 0;
@@ -1635,7 +1635,7 @@ static __rte_always_inline int16_t
virtio_dev_rx_single_packed(struct virtio_net *dev,
struct vhost_virtqueue *vq,
struct rte_mbuf *pkt)
- __rte_exclusive_locks_required(&vq->access_lock)
+ __rte_shared_locks_required(&vq->access_lock)
__rte_shared_locks_required(&vq->iotlb_lock)
{
struct buf_vector buf_vec[BUF_VECTOR_MAX];
@@ -1661,7 +1661,7 @@ virtio_dev_rx_packed(struct virtio_net *dev,
struct vhost_virtqueue *__rte_restrict vq,
struct rte_mbuf **__rte_restrict pkts,
uint32_t count)
- __rte_exclusive_locks_required(&vq->access_lock)
+ __rte_shared_locks_required(&vq->access_lock)
__rte_shared_locks_required(&vq->iotlb_lock)
{
uint32_t pkt_idx = 0;
@@ -1701,7 +1701,7 @@ virtio_dev_rx(struct virtio_net *dev, struct vhost_virtqueue *vq,
uint32_t nb_tx = 0;
VHOST_LOG_DATA(dev->ifname, DEBUG, "%s\n", __func__);
- rte_spinlock_lock(&vq->access_lock);
+ rte_rwlock_read_lock(&vq->access_lock);
if (unlikely(!vq->enabled))
goto out_access_unlock;
@@ -1727,7 +1727,7 @@ virtio_dev_rx(struct virtio_net *dev, struct vhost_virtqueue *vq,
vhost_user_iotlb_rd_unlock(vq);
out_access_unlock:
- rte_spinlock_unlock(&vq->access_lock);
+ rte_rwlock_read_unlock(&vq->access_lock);
return nb_tx;
}
@@ -1760,7 +1760,7 @@ rte_vhost_enqueue_burst(int vid, uint16_t queue_id,
static __rte_always_inline uint16_t
async_get_first_inflight_pkt_idx(struct vhost_virtqueue *vq)
- __rte_exclusive_locks_required(&vq->access_lock)
+ __rte_shared_locks_required(&vq->access_lock)
{
struct vhost_async *async = vq->async;
@@ -2165,7 +2165,7 @@ virtio_dev_rx_async_submit_packed(struct virtio_net *dev, struct vhost_virtqueue
static __rte_always_inline void
write_back_completed_descs_split(struct vhost_virtqueue *vq, uint16_t n_descs)
- __rte_exclusive_locks_required(&vq->access_lock)
+ __rte_shared_locks_required(&vq->access_lock)
{
struct vhost_async *async = vq->async;
uint16_t nr_left = n_descs;
@@ -2198,7 +2198,7 @@ write_back_completed_descs_split(struct vhost_virtqueue *vq, uint16_t n_descs)
static __rte_always_inline void
write_back_completed_descs_packed(struct vhost_virtqueue *vq,
uint16_t n_buffers)
- __rte_exclusive_locks_required(&vq->access_lock)
+ __rte_shared_locks_required(&vq->access_lock)
{
struct vhost_async *async = vq->async;
uint16_t from = async->last_buffer_idx_packed;
@@ -2263,7 +2263,7 @@ write_back_completed_descs_packed(struct vhost_virtqueue *vq,
static __rte_always_inline uint16_t
vhost_poll_enqueue_completed(struct virtio_net *dev, struct vhost_virtqueue *vq,
struct rte_mbuf **pkts, uint16_t count, int16_t dma_id, uint16_t vchan_id)
- __rte_exclusive_locks_required(&vq->access_lock)
+ __rte_shared_locks_required(&vq->access_lock)
{
struct vhost_async *async = vq->async;
struct async_inflight_info *pkts_info = async->pkts_info;
@@ -2357,7 +2357,7 @@ rte_vhost_poll_enqueue_completed(int vid, uint16_t queue_id,
vq = dev->virtqueue[queue_id];
- if (!rte_spinlock_trylock(&vq->access_lock)) {
+ if (rte_rwlock_read_trylock(&vq->access_lock)) {
VHOST_LOG_DATA(dev->ifname, DEBUG,
"%s: virtqueue %u is busy.\n",
__func__, queue_id);
@@ -2377,7 +2377,7 @@ rte_vhost_poll_enqueue_completed(int vid, uint16_t queue_id,
vq->stats.inflight_completed += n_pkts_cpl;
out:
- rte_spinlock_unlock(&vq->access_lock);
+ rte_rwlock_read_unlock(&vq->access_lock);
return n_pkts_cpl;
}
@@ -2465,7 +2465,7 @@ rte_vhost_clear_queue(int vid, uint16_t queue_id, struct rte_mbuf **pkts,
vq = dev->virtqueue[queue_id];
- if (!rte_spinlock_trylock(&vq->access_lock)) {
+ if (rte_rwlock_read_trylock(&vq->access_lock)) {
VHOST_LOG_DATA(dev->ifname, DEBUG, "%s: virtqueue %u is busy.\n",
__func__, queue_id);
return 0;
@@ -2495,7 +2495,7 @@ rte_vhost_clear_queue(int vid, uint16_t queue_id, struct rte_mbuf **pkts,
vq->stats.inflight_completed += n_pkts_cpl;
out_access_unlock:
- rte_spinlock_unlock(&vq->access_lock);
+ rte_rwlock_read_unlock(&vq->access_lock);
return n_pkts_cpl;
}
@@ -2516,7 +2516,7 @@ virtio_dev_rx_async_submit(struct virtio_net *dev, struct vhost_virtqueue *vq,
return 0;
}
- rte_spinlock_lock(&vq->access_lock);
+ rte_rwlock_write_lock(&vq->access_lock);
if (unlikely(!vq->enabled || !vq->async))
goto out_access_unlock;
@@ -2544,7 +2544,7 @@ virtio_dev_rx_async_submit(struct virtio_net *dev, struct vhost_virtqueue *vq,
vhost_user_iotlb_rd_unlock(vq);
out_access_unlock:
- rte_spinlock_unlock(&vq->access_lock);
+ rte_rwlock_write_unlock(&vq->access_lock);
return nb_tx;
}
@@ -2866,7 +2866,7 @@ desc_to_mbuf(struct virtio_net *dev, struct vhost_virtqueue *vq,
struct buf_vector *buf_vec, uint16_t nr_vec,
struct rte_mbuf *m, struct rte_mempool *mbuf_pool,
bool legacy_ol_flags, uint16_t slot_idx, bool is_async)
- __rte_exclusive_locks_required(&vq->access_lock)
+ __rte_shared_locks_required(&vq->access_lock)
__rte_shared_locks_required(&vq->iotlb_lock)
{
uint32_t buf_avail, buf_offset, buf_len;
@@ -3073,7 +3073,7 @@ static uint16_t
virtio_dev_tx_split(struct virtio_net *dev, struct vhost_virtqueue *vq,
struct rte_mempool *mbuf_pool, struct rte_mbuf **pkts, uint16_t count,
bool legacy_ol_flags)
- __rte_exclusive_locks_required(&vq->access_lock)
+ __rte_shared_locks_required(&vq->access_lock)
__rte_shared_locks_required(&vq->iotlb_lock)
{
uint16_t i;
@@ -3178,7 +3178,7 @@ static uint16_t
virtio_dev_tx_split_legacy(struct virtio_net *dev,
struct vhost_virtqueue *vq, struct rte_mempool *mbuf_pool,
struct rte_mbuf **pkts, uint16_t count)
- __rte_exclusive_locks_required(&vq->access_lock)
+ __rte_shared_locks_required(&vq->access_lock)
__rte_shared_locks_required(&vq->iotlb_lock)
{
return virtio_dev_tx_split(dev, vq, mbuf_pool, pkts, count, true);
@@ -3189,7 +3189,7 @@ static uint16_t
virtio_dev_tx_split_compliant(struct virtio_net *dev,
struct vhost_virtqueue *vq, struct rte_mempool *mbuf_pool,
struct rte_mbuf **pkts, uint16_t count)
- __rte_exclusive_locks_required(&vq->access_lock)
+ __rte_shared_locks_required(&vq->access_lock)
__rte_shared_locks_required(&vq->iotlb_lock)
{
return virtio_dev_tx_split(dev, vq, mbuf_pool, pkts, count, false);
@@ -3393,7 +3393,7 @@ vhost_dequeue_single_packed(struct virtio_net *dev,
uint16_t *buf_id,
uint16_t *desc_count,
bool legacy_ol_flags)
- __rte_exclusive_locks_required(&vq->access_lock)
+ __rte_shared_locks_required(&vq->access_lock)
__rte_shared_locks_required(&vq->iotlb_lock)
{
struct buf_vector buf_vec[BUF_VECTOR_MAX];
@@ -3443,7 +3443,7 @@ virtio_dev_tx_single_packed(struct virtio_net *dev,
struct rte_mempool *mbuf_pool,
struct rte_mbuf *pkts,
bool legacy_ol_flags)
- __rte_exclusive_locks_required(&vq->access_lock)
+ __rte_shared_locks_required(&vq->access_lock)
__rte_shared_locks_required(&vq->iotlb_lock)
{
@@ -3475,7 +3475,7 @@ virtio_dev_tx_packed(struct virtio_net *dev,
struct rte_mbuf **__rte_restrict pkts,
uint32_t count,
bool legacy_ol_flags)
- __rte_exclusive_locks_required(&vq->access_lock)
+ __rte_shared_locks_required(&vq->access_lock)
__rte_shared_locks_required(&vq->iotlb_lock)
{
uint32_t pkt_idx = 0;
@@ -3520,7 +3520,7 @@ static uint16_t
virtio_dev_tx_packed_legacy(struct virtio_net *dev,
struct vhost_virtqueue *__rte_restrict vq, struct rte_mempool *mbuf_pool,
struct rte_mbuf **__rte_restrict pkts, uint32_t count)
- __rte_exclusive_locks_required(&vq->access_lock)
+ __rte_shared_locks_required(&vq->access_lock)
__rte_shared_locks_required(&vq->iotlb_lock)
{
return virtio_dev_tx_packed(dev, vq, mbuf_pool, pkts, count, true);
@@ -3531,7 +3531,7 @@ static uint16_t
virtio_dev_tx_packed_compliant(struct virtio_net *dev,
struct vhost_virtqueue *__rte_restrict vq, struct rte_mempool *mbuf_pool,
struct rte_mbuf **__rte_restrict pkts, uint32_t count)
- __rte_exclusive_locks_required(&vq->access_lock)
+ __rte_shared_locks_required(&vq->access_lock)
__rte_shared_locks_required(&vq->iotlb_lock)
{
return virtio_dev_tx_packed(dev, vq, mbuf_pool, pkts, count, false);
@@ -3566,7 +3566,7 @@ rte_vhost_dequeue_burst(int vid, uint16_t queue_id,
vq = dev->virtqueue[queue_id];
- if (unlikely(rte_spinlock_trylock(&vq->access_lock) == 0))
+ if (unlikely(rte_rwlock_read_trylock(&vq->access_lock) != 0))
return 0;
if (unlikely(!vq->enabled)) {
@@ -3636,7 +3636,7 @@ rte_vhost_dequeue_burst(int vid, uint16_t queue_id,
vhost_user_iotlb_rd_unlock(vq);
out_access_unlock:
- rte_spinlock_unlock(&vq->access_lock);
+ rte_rwlock_read_unlock(&vq->access_lock);
if (unlikely(rarp_mbuf != NULL))
count += 1;
@@ -3648,7 +3648,7 @@ static __rte_always_inline uint16_t
async_poll_dequeue_completed(struct virtio_net *dev, struct vhost_virtqueue *vq,
struct rte_mbuf **pkts, uint16_t count, int16_t dma_id,
uint16_t vchan_id, bool legacy_ol_flags)
- __rte_exclusive_locks_required(&vq->access_lock)
+ __rte_shared_locks_required(&vq->access_lock)
{
uint16_t start_idx, from, i;
uint16_t nr_cpl_pkts = 0;
@@ -3695,7 +3695,7 @@ static __rte_always_inline uint16_t
virtio_dev_tx_async_split(struct virtio_net *dev, struct vhost_virtqueue *vq,
struct rte_mempool *mbuf_pool, struct rte_mbuf **pkts, uint16_t count,
int16_t dma_id, uint16_t vchan_id, bool legacy_ol_flags)
- __rte_exclusive_locks_required(&vq->access_lock)
+ __rte_shared_locks_required(&vq->access_lock)
__rte_shared_locks_required(&vq->iotlb_lock)
{
static bool allocerr_warned;
@@ -3844,7 +3844,7 @@ virtio_dev_tx_async_split_legacy(struct virtio_net *dev,
struct vhost_virtqueue *vq, struct rte_mempool *mbuf_pool,
struct rte_mbuf **pkts, uint16_t count,
int16_t dma_id, uint16_t vchan_id)
- __rte_exclusive_locks_required(&vq->access_lock)
+ __rte_shared_locks_required(&vq->access_lock)
__rte_shared_locks_required(&vq->iotlb_lock)
{
return virtio_dev_tx_async_split(dev, vq, mbuf_pool,
@@ -3857,7 +3857,7 @@ virtio_dev_tx_async_split_compliant(struct virtio_net *dev,
struct vhost_virtqueue *vq, struct rte_mempool *mbuf_pool,
struct rte_mbuf **pkts, uint16_t count,
int16_t dma_id, uint16_t vchan_id)
- __rte_exclusive_locks_required(&vq->access_lock)
+ __rte_shared_locks_required(&vq->access_lock)
__rte_shared_locks_required(&vq->iotlb_lock)
{
return virtio_dev_tx_async_split(dev, vq, mbuf_pool,
@@ -3867,7 +3867,7 @@ virtio_dev_tx_async_split_compliant(struct virtio_net *dev,
static __rte_always_inline void
vhost_async_shadow_dequeue_single_packed(struct vhost_virtqueue *vq,
uint16_t buf_id, uint16_t count)
- __rte_exclusive_locks_required(&vq->access_lock)
+ __rte_shared_locks_required(&vq->access_lock)
{
struct vhost_async *async = vq->async;
uint16_t idx = async->buffer_idx_packed;
@@ -3889,7 +3889,7 @@ virtio_dev_tx_async_single_packed(struct virtio_net *dev,
struct rte_mbuf *pkts,
uint16_t slot_idx,
bool legacy_ol_flags)
- __rte_exclusive_locks_required(&vq->access_lock)
+ __rte_shared_locks_required(&vq->access_lock)
__rte_shared_locks_required(&vq->iotlb_lock)
{
int err;
@@ -3942,7 +3942,7 @@ virtio_dev_tx_async_packed_batch(struct virtio_net *dev,
struct vhost_virtqueue *vq,
struct rte_mbuf **pkts, uint16_t slot_idx,
uint16_t dma_id, uint16_t vchan_id)
- __rte_exclusive_locks_required(&vq->access_lock)
+ __rte_shared_locks_required(&vq->access_lock)
__rte_shared_locks_required(&vq->iotlb_lock)
{
uint16_t avail_idx = vq->last_avail_idx;
@@ -4000,7 +4000,7 @@ static __rte_always_inline uint16_t
virtio_dev_tx_async_packed(struct virtio_net *dev, struct vhost_virtqueue *vq,
struct rte_mempool *mbuf_pool, struct rte_mbuf **pkts,
uint16_t count, uint16_t dma_id, uint16_t vchan_id, bool legacy_ol_flags)
- __rte_exclusive_locks_required(&vq->access_lock)
+ __rte_shared_locks_required(&vq->access_lock)
__rte_shared_locks_required(&vq->iotlb_lock)
{
uint32_t pkt_idx = 0;
@@ -4111,7 +4111,7 @@ static uint16_t
virtio_dev_tx_async_packed_legacy(struct virtio_net *dev, struct vhost_virtqueue *vq,
struct rte_mempool *mbuf_pool, struct rte_mbuf **pkts,
uint16_t count, uint16_t dma_id, uint16_t vchan_id)
- __rte_exclusive_locks_required(&vq->access_lock)
+ __rte_shared_locks_required(&vq->access_lock)
__rte_shared_locks_required(&vq->iotlb_lock)
{
return virtio_dev_tx_async_packed(dev, vq, mbuf_pool,
@@ -4123,7 +4123,7 @@ static uint16_t
virtio_dev_tx_async_packed_compliant(struct virtio_net *dev, struct vhost_virtqueue *vq,
struct rte_mempool *mbuf_pool, struct rte_mbuf **pkts,
uint16_t count, uint16_t dma_id, uint16_t vchan_id)
- __rte_exclusive_locks_required(&vq->access_lock)
+ __rte_shared_locks_required(&vq->access_lock)
__rte_shared_locks_required(&vq->iotlb_lock)
{
return virtio_dev_tx_async_packed(dev, vq, mbuf_pool,
@@ -4173,7 +4173,7 @@ rte_vhost_async_try_dequeue_burst(int vid, uint16_t queue_id,
vq = dev->virtqueue[queue_id];
- if (unlikely(rte_spinlock_trylock(&vq->access_lock) == 0))
+ if (unlikely(rte_rwlock_read_trylock(&vq->access_lock) != 0))
return 0;
if (unlikely(vq->enabled == 0)) {
@@ -4255,7 +4255,7 @@ rte_vhost_async_try_dequeue_burst(int vid, uint16_t queue_id,
vhost_user_iotlb_rd_unlock(vq);
out_access_unlock:
- rte_spinlock_unlock(&vq->access_lock);
+ rte_rwlock_read_unlock(&vq->access_lock);
if (unlikely(rarp_mbuf != NULL))
count += 1;
* [PATCH v3 2/4] vhost: make the guest_notifications statistic counter atomic
2023-05-17 9:08 [PATCH v3 0/4] vhost: add device op to offload the interrupt kick Eelco Chaudron
2023-05-17 9:08 ` [PATCH v3 1/4] vhost: change vhost_virtqueue access lock to a read/write one Eelco Chaudron
@ 2023-05-17 9:08 ` Eelco Chaudron
2023-05-30 12:52 ` Maxime Coquelin
2023-05-31 7:03 ` Xia, Chenbo
2023-05-17 9:09 ` [PATCH v3 3/4] vhost: fix invalid call FD handling Eelco Chaudron
` (2 subsequent siblings)
4 siblings, 2 replies; 35+ messages in thread
From: Eelco Chaudron @ 2023-05-17 9:08 UTC (permalink / raw)
To: maxime.coquelin, chenbo.xia, david.marchand; +Cc: dev
Making the guest_notifications statistic counter atomic allows
it to be safely incremented while holding the read access_lock.
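Roughly, the pattern is the following (standalone sketch with an assumed counter variable, not the actual vhost structures): threads holding the read lock bump the counter with a relaxed atomic add, while code holding the write lock can access it plainly because all updaters are excluded:
#include <stdint.h>
static uint64_t guest_notifications; /* per-virtqueue in the real code */
/* Called with the access_lock held for reading, possibly by several
 * threads at once, hence the atomic increment. */
static void
count_notification(void)
{
	__atomic_fetch_add(&guest_notifications, 1, __ATOMIC_RELAXED);
}
/* Called with the access_lock held for writing: no updater can run
 * concurrently, so a plain access is sufficient here. */
static void
reset_counter(void)
{
	guest_notifications = 0;
}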
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
---
lib/vhost/vhost.c | 8 ++++++++
lib/vhost/vhost.h | 9 ++++++---
2 files changed, 14 insertions(+), 3 deletions(-)
diff --git a/lib/vhost/vhost.c b/lib/vhost/vhost.c
index 74bdbfd810..8ff6434c93 100644
--- a/lib/vhost/vhost.c
+++ b/lib/vhost/vhost.c
@@ -2086,6 +2086,10 @@ rte_vhost_vring_stats_get(int vid, uint16_t queue_id,
rte_rwlock_write_lock(&vq->access_lock);
for (i = 0; i < VHOST_NB_VQ_STATS; i++) {
+ /*
+ * No need to read the counters atomically here, as the
+ * write access_lock taken above prevents them from being updated.
+ */
stats[i].value =
*(uint64_t *)(((char *)vq) + vhost_vq_stat_strings[i].offset);
stats[i].id = i;
@@ -2112,6 +2116,10 @@ int rte_vhost_vring_stats_reset(int vid, uint16_t queue_id)
vq = dev->virtqueue[queue_id];
rte_rwlock_write_lock(&vq->access_lock);
+ /*
+ * No need to reset the counters atomically here, as the
+ * write access_lock taken above prevents them from being updated.
+ */
memset(&vq->stats, 0, sizeof(vq->stats));
rte_rwlock_write_unlock(&vq->access_lock);
diff --git a/lib/vhost/vhost.h b/lib/vhost/vhost.h
index 5c939ef06f..37609c7c8d 100644
--- a/lib/vhost/vhost.h
+++ b/lib/vhost/vhost.h
@@ -135,11 +135,12 @@ struct virtqueue_stats {
uint64_t broadcast;
/* Size bins in array as RFC 2819, undersized [0], 64 [1], etc */
uint64_t size_bins[8];
- uint64_t guest_notifications;
uint64_t iotlb_hits;
uint64_t iotlb_misses;
uint64_t inflight_submitted;
uint64_t inflight_completed;
+ /* Counters below are atomic, and should be incremented as such. */
+ uint64_t guest_notifications;
};
/**
@@ -907,7 +908,8 @@ vhost_vring_call_split(struct virtio_net *dev, struct vhost_virtqueue *vq)
unlikely(!signalled_used_valid)) {
eventfd_write(vq->callfd, (eventfd_t) 1);
if (dev->flags & VIRTIO_DEV_STATS_ENABLED)
- vq->stats.guest_notifications++;
+ __atomic_fetch_add(&vq->stats.guest_notifications,
+ 1, __ATOMIC_RELAXED);
if (dev->notify_ops->guest_notified)
dev->notify_ops->guest_notified(dev->vid);
}
@@ -917,7 +919,8 @@ vhost_vring_call_split(struct virtio_net *dev, struct vhost_virtqueue *vq)
&& (vq->callfd >= 0)) {
eventfd_write(vq->callfd, (eventfd_t)1);
if (dev->flags & VIRTIO_DEV_STATS_ENABLED)
- vq->stats.guest_notifications++;
+ __atomic_fetch_add(&vq->stats.guest_notifications,
+ 1, __ATOMIC_RELAXED);
if (dev->notify_ops->guest_notified)
dev->notify_ops->guest_notified(dev->vid);
}
* [PATCH v3 3/4] vhost: fix invalid call FD handling
2023-05-17 9:08 [PATCH v3 0/4] vhost: add device op to offload the interrupt kick Eelco Chaudron
2023-05-17 9:08 ` [PATCH v3 1/4] vhost: change vhost_virtqueue access lock to a read/write one Eelco Chaudron
2023-05-17 9:08 ` [PATCH v3 2/4] vhost: make the guest_notifications statistic counter atomic Eelco Chaudron
@ 2023-05-17 9:09 ` Eelco Chaudron
2023-05-30 12:54 ` Maxime Coquelin
2023-05-17 9:09 ` [PATCH v3 4/4] vhost: add device op to offload the interrupt kick Eelco Chaudron
2023-06-01 20:00 ` [PATCH v3 0/4] " Maxime Coquelin
4 siblings, 1 reply; 35+ messages in thread
From: Eelco Chaudron @ 2023-05-17 9:09 UTC (permalink / raw)
To: maxime.coquelin, chenbo.xia, david.marchand; +Cc: dev
This patch fixes cases where IRQ injection is tried while
the call FD is not valid, which should not happen.
Fixes: b1cce26af1dc ("vhost: add notification for packed ring")
Fixes: e37ff954405a ("vhost: support virtqueue interrupt/notification suppression")
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
---
lib/vhost/vhost.h | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/lib/vhost/vhost.h b/lib/vhost/vhost.h
index 37609c7c8d..23a4e2b1a7 100644
--- a/lib/vhost/vhost.h
+++ b/lib/vhost/vhost.h
@@ -903,9 +903,9 @@ vhost_vring_call_split(struct virtio_net *dev, struct vhost_virtqueue *vq)
"%s: used_event_idx=%d, old=%d, new=%d\n",
__func__, vhost_used_event(vq), old, new);
- if ((vhost_need_event(vhost_used_event(vq), new, old) &&
- (vq->callfd >= 0)) ||
- unlikely(!signalled_used_valid)) {
+ if ((vhost_need_event(vhost_used_event(vq), new, old) ||
+ unlikely(!signalled_used_valid)) &&
+ vq->callfd >= 0) {
eventfd_write(vq->callfd, (eventfd_t) 1);
if (dev->flags & VIRTIO_DEV_STATS_ENABLED)
__atomic_fetch_add(&vq->stats.guest_notifications,
@@ -974,7 +974,7 @@ vhost_vring_call_packed(struct virtio_net *dev, struct vhost_virtqueue *vq)
if (vhost_need_event(off, new, old))
kick = true;
kick:
- if (kick) {
+ if (kick && vq->callfd >= 0) {
eventfd_write(vq->callfd, (eventfd_t)1);
if (dev->notify_ops->guest_notified)
dev->notify_ops->guest_notified(dev->vid);
* [PATCH v3 4/4] vhost: add device op to offload the interrupt kick
2023-05-17 9:08 [PATCH v3 0/4] vhost: add device op to offload the interrupt kick Eelco Chaudron
` (2 preceding siblings ...)
2023-05-17 9:09 ` [PATCH v3 3/4] vhost: fix invalid call FD handling Eelco Chaudron
@ 2023-05-17 9:09 ` Eelco Chaudron
2023-05-30 13:02 ` Maxime Coquelin
` (2 more replies)
2023-06-01 20:00 ` [PATCH v3 0/4] " Maxime Coquelin
4 siblings, 3 replies; 35+ messages in thread
From: Eelco Chaudron @ 2023-05-17 9:09 UTC (permalink / raw)
To: maxime.coquelin, chenbo.xia, david.marchand; +Cc: dev
This patch adds an operation callback which gets called every time the
library wants to call eventfd_write(). This eventfd_write() call could
result in a system call, which could potentially block the PMD thread.
The callback function can decide whether it's ok to handle the
eventfd_write() now or have the newly introduced function,
rte_vhost_notify_guest(), called at a later time.
This can be used by 3rd party applications, like OVS, to avoid system
calls being called as part of the PMD threads.
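As a rough illustration of the consumer side, here is a sketch of a deferred-kick flush running outside the PMD threads; the dequeue helper is a hypothetical placeholder, only rte_vhost_notify_guest() comes from this patch:
#include <stdbool.h>
#include <stdint.h>
#include <rte_vhost.h>
/* Hypothetical helper draining the deferred (vid, queue_id) pairs
 * queued by the guest_notify callback; false means nothing pending. */
static bool
app_kick_ring_dequeue(int *vid, uint16_t *queue_id)
{
	(void)vid;
	(void)queue_id;
	return false;
}
/* Runs from a non-datapath thread, so the eventfd_write() performed
 * by rte_vhost_notify_guest() can never block a PMD thread. */
static void
app_flush_deferred_kicks(void)
{
	int vid;
	uint16_t queue_id;

	while (app_kick_ring_dequeue(&vid, &queue_id))
		rte_vhost_notify_guest(vid, queue_id);
}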
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
---
lib/vhost/meson.build | 2 ++
lib/vhost/rte_vhost.h | 23 +++++++++++++++++-
lib/vhost/socket.c | 63 ++++++++++++++++++++++++++++++++++++++++++++++---
lib/vhost/version.map | 9 +++++++
lib/vhost/vhost.c | 38 ++++++++++++++++++++++++++++++
lib/vhost/vhost.h | 58 ++++++++++++++++++++++++++++++++-------------
6 files changed, 171 insertions(+), 22 deletions(-)
diff --git a/lib/vhost/meson.build b/lib/vhost/meson.build
index 0d1abf6283..05679447db 100644
--- a/lib/vhost/meson.build
+++ b/lib/vhost/meson.build
@@ -38,3 +38,5 @@ driver_sdk_headers = files(
'vdpa_driver.h',
)
deps += ['ethdev', 'cryptodev', 'hash', 'pci', 'dmadev']
+
+use_function_versioning = true
diff --git a/lib/vhost/rte_vhost.h b/lib/vhost/rte_vhost.h
index 58a5d4be92..7a10bc36cf 100644
--- a/lib/vhost/rte_vhost.h
+++ b/lib/vhost/rte_vhost.h
@@ -298,7 +298,13 @@ struct rte_vhost_device_ops {
*/
void (*guest_notified)(int vid);
- void *reserved[1]; /**< Reserved for future extension */
+ /**
+ * If this callback is registered, notification to the guest can
+ * be handled by the front-end calling rte_vhost_notify_guest().
+ * If it's not handled, 'false' should be returned. This can be used
+ * to remove the "slow" eventfd_write() syscall from the datapath.
+ */
+ bool (*guest_notify)(int vid, uint16_t queue_id);
};
/**
@@ -433,6 +439,21 @@ void rte_vhost_log_used_vring(int vid, uint16_t vring_idx,
int rte_vhost_enable_guest_notification(int vid, uint16_t queue_id, int enable);
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change, or be removed, without prior notice.
+ *
+ * Inject the offloaded interrupt into the vhost device's queue. For more
+ * details see the 'guest_notify' vhost device operation.
+ *
+ * @param vid
+ * vhost device ID
+ * @param queue_id
+ * virtio queue index
+ */
+__rte_experimental
+void rte_vhost_notify_guest(int vid, uint16_t queue_id);
+
/**
* Register vhost driver. path could be different for multiple
* instance support.
diff --git a/lib/vhost/socket.c b/lib/vhost/socket.c
index 669c322e12..f2c02075fe 100644
--- a/lib/vhost/socket.c
+++ b/lib/vhost/socket.c
@@ -15,6 +15,7 @@
#include <fcntl.h>
#include <pthread.h>
+#include <rte_function_versioning.h>
#include <rte_log.h>
#include "fd_man.h"
@@ -59,6 +60,7 @@ struct vhost_user_socket {
struct rte_vdpa_device *vdpa_dev;
struct rte_vhost_device_ops const *notify_ops;
+ struct rte_vhost_device_ops *malloc_notify_ops;
};
struct vhost_user_connection {
@@ -846,6 +848,11 @@ vhost_user_socket_mem_free(struct vhost_user_socket *vsocket)
vsocket->path = NULL;
}
+ if (vsocket && vsocket->malloc_notify_ops) {
+ free(vsocket->malloc_notify_ops);
+ vsocket->malloc_notify_ops = NULL;
+ }
+
if (vsocket) {
free(vsocket);
vsocket = NULL;
@@ -1099,21 +1106,69 @@ rte_vhost_driver_unregister(const char *path)
/*
* Register ops so that we can add/remove device to data core.
*/
-int
-rte_vhost_driver_callback_register(const char *path,
- struct rte_vhost_device_ops const * const ops)
+static int
+vhost_driver_callback_register(const char *path,
+ struct rte_vhost_device_ops const * const ops,
+ struct rte_vhost_device_ops *malloc_ops)
{
struct vhost_user_socket *vsocket;
pthread_mutex_lock(&vhost_user.mutex);
vsocket = find_vhost_user_socket(path);
- if (vsocket)
+ if (vsocket) {
vsocket->notify_ops = ops;
+ free(vsocket->malloc_notify_ops);
+ vsocket->malloc_notify_ops = malloc_ops;
+ }
pthread_mutex_unlock(&vhost_user.mutex);
return vsocket ? 0 : -1;
}
+int __vsym
+rte_vhost_driver_callback_register_v24(const char *path,
+ struct rte_vhost_device_ops const * const ops)
+{
+ return vhost_driver_callback_register(path, ops, NULL);
+}
+
+int __vsym
+rte_vhost_driver_callback_register_v23(const char *path,
+ struct rte_vhost_device_ops const * const ops)
+{
+ int ret;
+
+ /*
+ * Although the ops structure is a const structure, we do need to
+ * override the guest_notify operation. This is because with the
+ * previous APIs it was "reserved" and if any garbage value was passed,
+ * it could crash the application.
+ */
+ if (ops && !ops->guest_notify) {
+ struct rte_vhost_device_ops *new_ops;
+
+ new_ops = malloc(sizeof(*new_ops));
+ if (new_ops == NULL)
+ return -1;
+
+ memcpy(new_ops, ops, sizeof(*new_ops));
+ new_ops->guest_notify = NULL;
+
+ ret = vhost_driver_callback_register(path, new_ops, new_ops);
+ } else {
+ ret = vhost_driver_callback_register(path, ops, NULL);
+ }
+
+ return ret;
+}
+
+/* Mark the v23 function as the old version, and v24 as the default version. */
+VERSION_SYMBOL(rte_vhost_driver_callback_register, _v23, 23);
+BIND_DEFAULT_SYMBOL(rte_vhost_driver_callback_register, _v24, 24);
+MAP_STATIC_SYMBOL(int rte_vhost_driver_callback_register(const char *path,
+ struct rte_vhost_device_ops const * const ops),
+ rte_vhost_driver_callback_register_v24);
+
struct rte_vhost_device_ops const *
vhost_driver_callback_get(const char *path)
{
diff --git a/lib/vhost/version.map b/lib/vhost/version.map
index d322a4a888..7bcbfd12cf 100644
--- a/lib/vhost/version.map
+++ b/lib/vhost/version.map
@@ -64,6 +64,12 @@ DPDK_23 {
local: *;
};
+DPDK_24 {
+ global:
+
+ rte_vhost_driver_callback_register;
+} DPDK_23;
+
EXPERIMENTAL {
global:
@@ -98,6 +104,9 @@ EXPERIMENTAL {
# added in 22.11
rte_vhost_async_dma_unconfigure;
rte_vhost_vring_call_nonblock;
+
+ # added in 23.07
+ rte_vhost_notify_guest;
};
INTERNAL {
diff --git a/lib/vhost/vhost.c b/lib/vhost/vhost.c
index 8ff6434c93..79e88f986e 100644
--- a/lib/vhost/vhost.c
+++ b/lib/vhost/vhost.c
@@ -44,6 +44,10 @@ static const struct vhost_vq_stats_name_off vhost_vq_stat_strings[] = {
{"size_1024_1518_packets", offsetof(struct vhost_virtqueue, stats.size_bins[6])},
{"size_1519_max_packets", offsetof(struct vhost_virtqueue, stats.size_bins[7])},
{"guest_notifications", offsetof(struct vhost_virtqueue, stats.guest_notifications)},
+ {"guest_notifications_offloaded", offsetof(struct vhost_virtqueue,
+ stats.guest_notifications_offloaded)},
+ {"guest_notifications_error", offsetof(struct vhost_virtqueue,
+ stats.guest_notifications_error)},
{"iotlb_hits", offsetof(struct vhost_virtqueue, stats.iotlb_hits)},
{"iotlb_misses", offsetof(struct vhost_virtqueue, stats.iotlb_misses)},
{"inflight_submitted", offsetof(struct vhost_virtqueue, stats.inflight_submitted)},
@@ -1467,6 +1471,40 @@ rte_vhost_enable_guest_notification(int vid, uint16_t queue_id, int enable)
return ret;
}
+void
+rte_vhost_notify_guest(int vid, uint16_t queue_id)
+{
+ struct virtio_net *dev = get_device(vid);
+ struct vhost_virtqueue *vq;
+
+ if (!dev || queue_id >= VHOST_MAX_VRING)
+ return;
+
+ vq = dev->virtqueue[queue_id];
+ if (!vq)
+ return;
+
+ rte_rwlock_read_lock(&vq->access_lock);
+
+ if (vq->callfd >= 0) {
+ int ret = eventfd_write(vq->callfd, (eventfd_t)1);
+
+ if (ret) {
+ if (dev->flags & VIRTIO_DEV_STATS_ENABLED)
+ __atomic_fetch_add(&vq->stats.guest_notifications_error,
+ 1, __ATOMIC_RELAXED);
+ } else {
+ if (dev->flags & VIRTIO_DEV_STATS_ENABLED)
+ __atomic_fetch_add(&vq->stats.guest_notifications,
+ 1, __ATOMIC_RELAXED);
+ if (dev->notify_ops->guest_notified)
+ dev->notify_ops->guest_notified(dev->vid);
+ }
+ }
+
+ rte_rwlock_read_unlock(&vq->access_lock);
+}
+
void
rte_vhost_log_write(int vid, uint64_t addr, uint64_t len)
{
diff --git a/lib/vhost/vhost.h b/lib/vhost/vhost.h
index 23a4e2b1a7..8ad53e9bb5 100644
--- a/lib/vhost/vhost.h
+++ b/lib/vhost/vhost.h
@@ -141,6 +141,8 @@ struct virtqueue_stats {
uint64_t inflight_completed;
/* Counters below are atomic, and should be incremented as such. */
uint64_t guest_notifications;
+ uint64_t guest_notifications_offloaded;
+ uint64_t guest_notifications_error;
};
/**
@@ -884,6 +886,34 @@ vhost_need_event(uint16_t event_idx, uint16_t new_idx, uint16_t old)
return (uint16_t)(new_idx - event_idx - 1) < (uint16_t)(new_idx - old);
}
+static __rte_always_inline void
+vhost_vring_inject_irq(struct virtio_net *dev, struct vhost_virtqueue *vq)
+{
+ int ret;
+
+ if (dev->notify_ops->guest_notify &&
+ dev->notify_ops->guest_notify(dev->vid, vq->index)) {
+ if (dev->flags & VIRTIO_DEV_STATS_ENABLED)
+ __atomic_fetch_add(&vq->stats.guest_notifications_offloaded,
+ 1, __ATOMIC_RELAXED);
+ return;
+ }
+
+ ret = eventfd_write(vq->callfd, (eventfd_t) 1);
+ if (ret) {
+ if (dev->flags & VIRTIO_DEV_STATS_ENABLED)
+ __atomic_fetch_add(&vq->stats.guest_notifications_error,
+ 1, __ATOMIC_RELAXED);
+ return;
+ }
+
+ if (dev->flags & VIRTIO_DEV_STATS_ENABLED)
+ __atomic_fetch_add(&vq->stats.guest_notifications,
+ 1, __ATOMIC_RELAXED);
+ if (dev->notify_ops->guest_notified)
+ dev->notify_ops->guest_notified(dev->vid);
+}
+
static __rte_always_inline void
vhost_vring_call_split(struct virtio_net *dev, struct vhost_virtqueue *vq)
{
@@ -906,23 +936,13 @@ vhost_vring_call_split(struct virtio_net *dev, struct vhost_virtqueue *vq)
if ((vhost_need_event(vhost_used_event(vq), new, old) ||
unlikely(!signalled_used_valid)) &&
vq->callfd >= 0) {
- eventfd_write(vq->callfd, (eventfd_t) 1);
- if (dev->flags & VIRTIO_DEV_STATS_ENABLED)
- __atomic_fetch_add(&vq->stats.guest_notifications,
- 1, __ATOMIC_RELAXED);
- if (dev->notify_ops->guest_notified)
- dev->notify_ops->guest_notified(dev->vid);
+ vhost_vring_inject_irq(dev, vq);
}
} else {
/* Kick the guest if necessary. */
if (!(vq->avail->flags & VRING_AVAIL_F_NO_INTERRUPT)
&& (vq->callfd >= 0)) {
- eventfd_write(vq->callfd, (eventfd_t)1);
- if (dev->flags & VIRTIO_DEV_STATS_ENABLED)
- __atomic_fetch_add(&vq->stats.guest_notifications,
- 1, __ATOMIC_RELAXED);
- if (dev->notify_ops->guest_notified)
- dev->notify_ops->guest_notified(dev->vid);
+ vhost_vring_inject_irq(dev, vq);
}
}
}
@@ -974,11 +994,8 @@ vhost_vring_call_packed(struct virtio_net *dev, struct vhost_virtqueue *vq)
if (vhost_need_event(off, new, old))
kick = true;
kick:
- if (kick && vq->callfd >= 0) {
- eventfd_write(vq->callfd, (eventfd_t)1);
- if (dev->notify_ops->guest_notified)
- dev->notify_ops->guest_notified(dev->vid);
- }
+ if (kick && vq->callfd >= 0)
+ vhost_vring_inject_irq(dev, vq);
}
static __rte_always_inline void
@@ -1017,4 +1034,11 @@ mbuf_is_consumed(struct rte_mbuf *m)
uint64_t hua_to_alignment(struct rte_vhost_memory *mem, void *ptr);
void mem_set_dump(void *ptr, size_t size, bool enable, uint64_t alignment);
+
+/* Versioned functions */
+int rte_vhost_driver_callback_register_v23(const char *path,
+ struct rte_vhost_device_ops const * const ops);
+int rte_vhost_driver_callback_register_v24(const char *path,
+ struct rte_vhost_device_ops const * const ops);
+
#endif /* _VHOST_NET_CDEV_H_ */
* Re: [PATCH v3 1/4] vhost: change vhost_virtqueue access lock to a read/write one
2023-05-17 9:08 ` [PATCH v3 1/4] vhost: change vhost_virtqueue access lock to a read/write one Eelco Chaudron
@ 2023-05-17 17:33 ` Maxime Coquelin
2023-05-18 14:46 ` Eelco Chaudron
2023-05-31 6:37 ` Xia, Chenbo
1 sibling, 1 reply; 35+ messages in thread
From: Maxime Coquelin @ 2023-05-17 17:33 UTC (permalink / raw)
To: Eelco Chaudron, chenbo.xia, david.marchand; +Cc: dev
Hi Eelco,
On 5/17/23 11:08, Eelco Chaudron wrote:
> This change will allow the vhost interrupt datapath handling to be split
> between two processes without one of them holding an explicit lock.
>
As I had a tuned PVP benchmarking setup at hand, I ran a 0.02% loss
RFC2544 test with and without this patch to ensure moving to RX locks
would not introduce a performance regression.
I can confirm there is no performance regression introduced with this
patch applied:
Tested-by: Maxime Coquelin <maxime.coquelin@redhat.com>
And the patch looks good to me:
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Thanks,
Maxime
> Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
> ---
> lib/eal/include/generic/rte_rwlock.h | 17 ++++++
> lib/vhost/vhost.c | 46 +++++++++--------
> lib/vhost/vhost.h | 4 +-
> lib/vhost/vhost_user.c | 14 +++--
> lib/vhost/virtio_net.c | 90 +++++++++++++++++-----------------
> 5 files changed, 94 insertions(+), 77 deletions(-)
>
* Re: [PATCH v3 1/4] vhost: change vhost_virtqueue access lock to a read/write one
2023-05-17 17:33 ` Maxime Coquelin
@ 2023-05-18 14:46 ` Eelco Chaudron
0 siblings, 0 replies; 35+ messages in thread
From: Eelco Chaudron @ 2023-05-18 14:46 UTC (permalink / raw)
To: Maxime Coquelin; +Cc: chenbo.xia, david.marchand, dev
On 17 May 2023, at 19:33, Maxime Coquelin wrote:
> Hi Eelco,
>
> On 5/17/23 11:08, Eelco Chaudron wrote:
>> This change will allow the vhost interrupt datapath handling to be split
>> between two processes without one of them holding an explicit lock.
>>
>
> As I had a tuned PVP benchmarking setup at hand, I ran a 0.02% loss
> RFC2544 test with and without this patch to ensure moving to RX locks
> would not introduce a performance regression.
>
> I can confirm there is no performance regression introduced with this
> patch applied:
> Tested-by: Maxime Coquelin <maxime.coquelin@redhat.com>
>
> And the patch looks good to me:
> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Thanks for running the test and doing the review!
//Eelco
>> Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
>> ---
>> lib/eal/include/generic/rte_rwlock.h | 17 ++++++
>> lib/vhost/vhost.c | 46 +++++++++--------
>> lib/vhost/vhost.h | 4 +-
>> lib/vhost/vhost_user.c | 14 +++--
>> lib/vhost/virtio_net.c | 90 +++++++++++++++++-----------------
>> 5 files changed, 94 insertions(+), 77 deletions(-)
>>
* Re: [PATCH v3 2/4] vhost: make the guest_notifications statistic counter atomic
2023-05-17 9:08 ` [PATCH v3 2/4] vhost: make the guest_notifications statistic counter atomic Eelco Chaudron
@ 2023-05-30 12:52 ` Maxime Coquelin
2023-05-31 7:03 ` Xia, Chenbo
1 sibling, 0 replies; 35+ messages in thread
From: Maxime Coquelin @ 2023-05-30 12:52 UTC (permalink / raw)
To: Eelco Chaudron, chenbo.xia, david.marchand; +Cc: dev
On 5/17/23 11:08, Eelco Chaudron wrote:
> Making the guest_notifications statistic counter atomic allows
> it to be safely incremented while holding the read access_lock.
>
> Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
> ---
> lib/vhost/vhost.c | 8 ++++++++
> lib/vhost/vhost.h | 9 ++++++---
> 2 files changed, 14 insertions(+), 3 deletions(-)
>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Thanks,
Maxime
* Re: [PATCH v3 3/4] vhost: fix invalid call FD handling
2023-05-17 9:09 ` [PATCH v3 3/4] vhost: fix invalid call FD handling Eelco Chaudron
@ 2023-05-30 12:54 ` Maxime Coquelin
2023-05-31 6:12 ` Xia, Chenbo
0 siblings, 1 reply; 35+ messages in thread
From: Maxime Coquelin @ 2023-05-30 12:54 UTC (permalink / raw)
To: Eelco Chaudron, chenbo.xia, david.marchand; +Cc: dev
On 5/17/23 11:09, Eelco Chaudron wrote:
> This patch fixes cases where IRQ injection is tried while
> the call FD is not valid, which should not happen.
>
> Fixes: b1cce26af1dc ("vhost: add notification for packed ring")
> Fixes: e37ff954405a ("vhost: support virtqueue interrupt/notification suppression")
>
> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
> ---
> lib/vhost/vhost.h | 8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/lib/vhost/vhost.h b/lib/vhost/vhost.h
> index 37609c7c8d..23a4e2b1a7 100644
> --- a/lib/vhost/vhost.h
> +++ b/lib/vhost/vhost.h
> @@ -903,9 +903,9 @@ vhost_vring_call_split(struct virtio_net *dev, struct vhost_virtqueue *vq)
> "%s: used_event_idx=%d, old=%d, new=%d\n",
> __func__, vhost_used_event(vq), old, new);
>
> - if ((vhost_need_event(vhost_used_event(vq), new, old) &&
> - (vq->callfd >= 0)) ||
> - unlikely(!signalled_used_valid)) {
> + if ((vhost_need_event(vhost_used_event(vq), new, old) ||
> + unlikely(!signalled_used_valid)) &&
> + vq->callfd >= 0) {
> eventfd_write(vq->callfd, (eventfd_t) 1);
> if (dev->flags & VIRTIO_DEV_STATS_ENABLED)
> __atomic_fetch_add(&vq->stats.guest_notifications,
> @@ -974,7 +974,7 @@ vhost_vring_call_packed(struct virtio_net *dev, struct vhost_virtqueue *vq)
> if (vhost_need_event(off, new, old))
> kick = true;
> kick:
> - if (kick) {
> + if (kick && vq->callfd >= 0) {
> eventfd_write(vq->callfd, (eventfd_t)1);
> if (dev->notify_ops->guest_notified)
> dev->notify_ops->guest_notified(dev->vid);
>
Reporting Chenbo's R-by, from the VDUSE series RFC:
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
* Re: [PATCH v3 4/4] vhost: add device op to offload the interrupt kick
2023-05-17 9:09 ` [PATCH v3 4/4] vhost: add device op to offload the interrupt kick Eelco Chaudron
@ 2023-05-30 13:02 ` Maxime Coquelin
2023-05-30 13:16 ` Thomas Monjalon
2023-05-31 11:49 ` David Marchand
2023-05-31 12:01 ` David Marchand
2023-05-31 14:12 ` David Marchand
2 siblings, 2 replies; 35+ messages in thread
From: Maxime Coquelin @ 2023-05-30 13:02 UTC (permalink / raw)
To: Eelco Chaudron, chenbo.xia, david.marchand, Thomas Monjalon; +Cc: dev
On 5/17/23 11:09, Eelco Chaudron wrote:
> This patch adds an operation callback which gets called every time the
> library wants to call eventfd_write(). This eventfd_write() call could
> result in a system call, which could potentially block the PMD thread.
>
> The callback function can decide whether it's ok to handle the
> eventfd_write() now or have the newly introduced function,
> rte_vhost_notify_guest(), called at a later time.
>
> This can be used by 3rd party applications, like OVS, to avoid system
> calls being called as part of the PMD threads.
>
> Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
> ---
> lib/vhost/meson.build | 2 ++
> lib/vhost/rte_vhost.h | 23 +++++++++++++++++-
> lib/vhost/socket.c | 63 ++++++++++++++++++++++++++++++++++++++++++++++---
> lib/vhost/version.map | 9 +++++++
> lib/vhost/vhost.c | 38 ++++++++++++++++++++++++++++++
> lib/vhost/vhost.h | 58 ++++++++++++++++++++++++++++++++-------------
> 6 files changed, 171 insertions(+), 22 deletions(-)
>
The patch looks good to me, but that's the first time we use function
versioning in Vhost library, so I'd like another pair of eyes to be sure
I don't miss anything.
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Thomas, do we need to mention it somewhere in the release note?
Thanks,
Maxime
* Re: [PATCH v3 4/4] vhost: add device op to offload the interrupt kick
2023-05-30 13:02 ` Maxime Coquelin
@ 2023-05-30 13:16 ` Thomas Monjalon
2023-05-30 15:16 ` Maxime Coquelin
2023-05-31 11:49 ` David Marchand
1 sibling, 1 reply; 35+ messages in thread
From: Thomas Monjalon @ 2023-05-30 13:16 UTC (permalink / raw)
To: Eelco Chaudron, chenbo.xia, david.marchand, Maxime Coquelin; +Cc: dev
30/05/2023 15:02, Maxime Coquelin:
>
> On 5/17/23 11:09, Eelco Chaudron wrote:
> > This patch adds an operation callback which gets called every time the
> > library wants to call eventfd_write(). This eventfd_write() call could
> > result in a system call, which could potentially block the PMD thread.
> >
> > The callback function can decide whether it's ok to handle the
> > eventfd_write() now or have the newly introduced function,
> > rte_vhost_notify_guest(), called at a later time.
> >
> > This can be used by 3rd party applications, like OVS, to avoid system
> > calls being called as part of the PMD threads.
> >
> > Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
> > ---
> > lib/vhost/meson.build | 2 ++
> > lib/vhost/rte_vhost.h | 23 +++++++++++++++++-
> > lib/vhost/socket.c | 63 ++++++++++++++++++++++++++++++++++++++++++++++---
> > lib/vhost/version.map | 9 +++++++
> > lib/vhost/vhost.c | 38 ++++++++++++++++++++++++++++++
> > lib/vhost/vhost.h | 58 ++++++++++++++++++++++++++++++++-------------
> > 6 files changed, 171 insertions(+), 22 deletions(-)
> >
>
>
> The patch looks good to me, but that's the first time we use function
> versioning in Vhost library, so I'd like another pair of eyes to be sure
> I don't miss anything.
>
> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
>
> Thomas, do we need to mention it somewhere in the release note?
If compatibility is kept, I think we don't need to mention it.
* Re: [PATCH v3 4/4] vhost: add device op to offload the interrupt kick
2023-05-30 13:16 ` Thomas Monjalon
@ 2023-05-30 15:16 ` Maxime Coquelin
2023-05-31 6:19 ` Xia, Chenbo
0 siblings, 1 reply; 35+ messages in thread
From: Maxime Coquelin @ 2023-05-30 15:16 UTC (permalink / raw)
To: Thomas Monjalon, Eelco Chaudron, chenbo.xia, david.marchand; +Cc: dev
On 5/30/23 15:16, Thomas Monjalon wrote:
> 30/05/2023 15:02, Maxime Coquelin:
>>
>> On 5/17/23 11:09, Eelco Chaudron wrote:
>>> This patch adds an operation callback which gets called every time the
>>> library wants to call eventfd_write(). This eventfd_write() call could
>>> result in a system call, which could potentially block the PMD thread.
>>>
>>> The callback function can decide whether it's ok to handle the
>>> eventfd_write() now or have the newly introduced function,
>>> rte_vhost_notify_guest(), called at a later time.
>>>
>>> This can be used by 3rd party applications, like OVS, to avoid system
>>> calls being called as part of the PMD threads.
>>>
>>> Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
>>> ---
>>> lib/vhost/meson.build | 2 ++
>>> lib/vhost/rte_vhost.h | 23 +++++++++++++++++-
>>> lib/vhost/socket.c | 63 ++++++++++++++++++++++++++++++++++++++++++++++---
>>> lib/vhost/version.map | 9 +++++++
>>> lib/vhost/vhost.c | 38 ++++++++++++++++++++++++++++++
>>> lib/vhost/vhost.h | 58 ++++++++++++++++++++++++++++++++-------------
>>> 6 files changed, 171 insertions(+), 22 deletions(-)
>>>
>>
>>
>> The patch looks good to me, but that's the first time we use function
>> versioning in Vhost library, so I'd like another pair of eyes to be sure
>> I don't miss anything.
>>
>> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
>>
>> Thomas, do we need to mention it somewhere in the release note?
>
> If compatibility is kept, I think we don't need to mention it.
>
>
Thanks Thomas for the information.
Maxime
* RE: [PATCH v3 3/4] vhost: fix invalid call FD handling
2023-05-30 12:54 ` Maxime Coquelin
@ 2023-05-31 6:12 ` Xia, Chenbo
2023-05-31 9:30 ` Maxime Coquelin
0 siblings, 1 reply; 35+ messages in thread
From: Xia, Chenbo @ 2023-05-31 6:12 UTC (permalink / raw)
To: Maxime Coquelin, Eelco Chaudron, david.marchand; +Cc: dev
> -----Original Message-----
> From: Maxime Coquelin <maxime.coquelin@redhat.com>
> Sent: Tuesday, May 30, 2023 8:54 PM
> To: Eelco Chaudron <echaudro@redhat.com>; Xia, Chenbo
> <chenbo.xia@intel.com>; david.marchand@redhat.com
> Cc: dev@dpdk.org
> Subject: Re: [PATCH v3 3/4] vhost: fix invalid call FD handling
>
>
>
> On 5/17/23 11:09, Eelco Chaudron wrote:
> > This patch fixes cases where IRQ injection is tried while
> > the call FD is not valid, which should not happen.
> >
> > Fixes: b1cce26af1dc ("vhost: add notification for packed ring")
> > Fixes: e37ff954405a ("vhost: support virtqueue interrupt/notification
> suppression")
> >
> > Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> > Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
> > ---
> > lib/vhost/vhost.h | 8 ++++----
> > 1 file changed, 4 insertions(+), 4 deletions(-)
> >
> > diff --git a/lib/vhost/vhost.h b/lib/vhost/vhost.h
> > index 37609c7c8d..23a4e2b1a7 100644
> > --- a/lib/vhost/vhost.h
> > +++ b/lib/vhost/vhost.h
> > @@ -903,9 +903,9 @@ vhost_vring_call_split(struct virtio_net *dev,
> struct vhost_virtqueue *vq)
> > "%s: used_event_idx=%d, old=%d, new=%d\n",
> > __func__, vhost_used_event(vq), old, new);
> >
> > - if ((vhost_need_event(vhost_used_event(vq), new, old) &&
> > - (vq->callfd >= 0)) ||
> > - unlikely(!signalled_used_valid)) {
> > + if ((vhost_need_event(vhost_used_event(vq), new, old) ||
> > + unlikely(!signalled_used_valid)) &&
> > + vq->callfd >= 0) {
> > eventfd_write(vq->callfd, (eventfd_t) 1);
> > if (dev->flags & VIRTIO_DEV_STATS_ENABLED)
> > __atomic_fetch_add(&vq->stats.guest_notifications,
> > @@ -974,7 +974,7 @@ vhost_vring_call_packed(struct virtio_net *dev,
> struct vhost_virtqueue *vq)
> > if (vhost_need_event(off, new, old))
> > kick = true;
> > kick:
> > - if (kick) {
> > + if (kick && vq->callfd >= 0) {
> > eventfd_write(vq->callfd, (eventfd_t)1);
> > if (dev->notify_ops->guest_notified)
> > dev->notify_ops->guest_notified(dev->vid);
> >
>
> Reporting Chenbo's R-by, from the VDUSE series RFC:
>
> Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Thanks Maxime! Btw: what's your plan of the same fix in VDUSE series, do you plan
to drop it in VDUSE series or?
Thanks,
Chenbo
^ permalink raw reply [flat|nested] 35+ messages in thread
* RE: [PATCH v3 4/4] vhost: add device op to offload the interrupt kick
2023-05-30 15:16 ` Maxime Coquelin
@ 2023-05-31 6:19 ` Xia, Chenbo
2023-05-31 9:29 ` Maxime Coquelin
0 siblings, 1 reply; 35+ messages in thread
From: Xia, Chenbo @ 2023-05-31 6:19 UTC (permalink / raw)
To: Maxime Coquelin, Thomas Monjalon, Eelco Chaudron, david.marchand; +Cc: dev
> -----Original Message-----
> From: Maxime Coquelin <maxime.coquelin@redhat.com>
> Sent: Tuesday, May 30, 2023 11:17 PM
> To: Thomas Monjalon <thomas@monjalon.net>; Eelco Chaudron
> <echaudro@redhat.com>; Xia, Chenbo <chenbo.xia@intel.com>;
> david.marchand@redhat.com
> Cc: dev@dpdk.org
> Subject: Re: [PATCH v3 4/4] vhost: add device op to offload the interrupt
> kick
>
>
>
> On 5/30/23 15:16, Thomas Monjalon wrote:
> > 30/05/2023 15:02, Maxime Coquelin:
> >>
> >> On 5/17/23 11:09, Eelco Chaudron wrote:
> >>> This patch adds an operation callback which gets called every time the
> >>> library wants to call eventfd_write(). This eventfd_write() call could
> >>> result in a system call, which could potentially block the PMD thread.
> >>>
> >>> The callback function can decide whether it's ok to handle the
> >>> eventfd_write() now or have the newly introduced function,
> >>> rte_vhost_notify_guest(), called at a later time.
> >>>
> >>> This can be used by 3rd party applications, like OVS, to avoid system
> >>> calls being called as part of the PMD threads.
> >>>
> >>> Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
> >>> ---
> >>> lib/vhost/meson.build | 2 ++
> >>> lib/vhost/rte_vhost.h | 23 +++++++++++++++++-
> >>> lib/vhost/socket.c | 63
> ++++++++++++++++++++++++++++++++++++++++++++++---
> >>> lib/vhost/version.map | 9 +++++++
> >>> lib/vhost/vhost.c | 38 ++++++++++++++++++++++++++++++
> >>> lib/vhost/vhost.h | 58 ++++++++++++++++++++++++++++++++------
> -------
> >>> 6 files changed, 171 insertions(+), 22 deletions(-)
> >>>
> >>
> >>
> >> The patch looks good to me, but that's the first time we use function
> >> versioning in Vhost library, so I'd like another pair of eyes to be
> sure
> >> I don't miss anything.
> >>
> >> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> >>
> >> Thomas, do we need to mention it somewhere in the release note?
> >
> > If compatibility is kept, I think we don't need to mention it.
> >
> >
>
> Thanks Thomas for the information.
>
> Maxime
About release note, except the versioning, there is also one new API introduced
in this patch, so we still need to mention this in release note
Thanks,
Chenbo
^ permalink raw reply [flat|nested] 35+ messages in thread
* RE: [PATCH v3 1/4] vhost: change vhost_virtqueue access lock to a read/write one
2023-05-17 9:08 ` [PATCH v3 1/4] vhost: change vhost_virtqueue access lock to a read/write one Eelco Chaudron
2023-05-17 17:33 ` Maxime Coquelin
@ 2023-05-31 6:37 ` Xia, Chenbo
2023-05-31 9:27 ` Maxime Coquelin
1 sibling, 1 reply; 35+ messages in thread
From: Xia, Chenbo @ 2023-05-31 6:37 UTC (permalink / raw)
To: Eelco Chaudron, maxime.coquelin, david.marchand; +Cc: dev
Hi Eelco,
> -----Original Message-----
> From: Eelco Chaudron <echaudro@redhat.com>
> Sent: Wednesday, May 17, 2023 5:09 PM
> To: maxime.coquelin@redhat.com; Xia, Chenbo <chenbo.xia@intel.com>;
> david.marchand@redhat.com
> Cc: dev@dpdk.org
> Subject: [PATCH v3 1/4] vhost: change vhost_virtqueue access lock to a
> read/write one
>
> This change will allow the vhost interrupt datapath handling to be split
> between two processes without one of them holding an explicit lock.
>
> Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
> ---
> lib/eal/include/generic/rte_rwlock.h | 17 ++++++
> lib/vhost/vhost.c | 46 +++++++++--------
> lib/vhost/vhost.h | 4 +-
> lib/vhost/vhost_user.c | 14 +++--
> lib/vhost/virtio_net.c | 90 +++++++++++++++++------------
> -----
> 5 files changed, 94 insertions(+), 77 deletions(-)
>
> diff --git a/lib/eal/include/generic/rte_rwlock.h
> b/lib/eal/include/generic/rte_rwlock.h
> index 71e2d8d5f4..9e083bbc61 100644
> --- a/lib/eal/include/generic/rte_rwlock.h
> +++ b/lib/eal/include/generic/rte_rwlock.h
> @@ -236,6 +236,23 @@ rte_rwlock_write_unlock(rte_rwlock_t *rwl)
> __atomic_fetch_sub(&rwl->cnt, RTE_RWLOCK_WRITE, __ATOMIC_RELEASE);
> }
>
> +/**
> + * Test if the write lock is taken.
> + *
> + * @param rwl
> + * A pointer to a rwlock structure.
> + * @return
> + * 1 if the write lock is currently taken; 0 otherwise.
> + */
> +static inline int
> +rte_rwlock_write_is_locked(rte_rwlock_t *rwl)
> +{
> + if (__atomic_load_n(&rwl->cnt, __ATOMIC_RELAXED) & RTE_RWLOCK_WRITE)
> + return 1;
> +
> + return 0;
> +}
> +
Again we need to update release note as it's a new EAL API.
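As an aside, the typical use for a helper like this is a lock-held assertion
on the control path. A minimal sketch, assuming nothing beyond the existing
RTE_ASSERT() macro (which only expands when RTE_ENABLE_ASSERT is defined);
the function name is hypothetical:

#include <rte_debug.h>
#include <rte_rwlock.h>

/* Hypothetical control-path helper: verify the caller already holds the
 * lock for write before modifying state that readers access under the
 * read lock. */
static void
demo_update_metadata(rte_rwlock_t *access_lock)
{
	RTE_ASSERT(rte_rwlock_write_is_locked(access_lock));
	/* safe to touch the fields protected by this lock */
}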
> /**
> * Try to execute critical section in a hardware memory transaction, if
> it
> * fails or not available take a read lock
> diff --git a/lib/vhost/vhost.c b/lib/vhost/vhost.c
> index ef37943817..74bdbfd810 100644
> --- a/lib/vhost/vhost.c
> +++ b/lib/vhost/vhost.c
> @@ -393,9 +393,9 @@ free_vq(struct virtio_net *dev, struct vhost_virtqueue
> *vq)
> else
> rte_free(vq->shadow_used_split);
>
> - rte_spinlock_lock(&vq->access_lock);
> + rte_rwlock_write_lock(&vq->access_lock);
> vhost_free_async_mem(vq);
> - rte_spinlock_unlock(&vq->access_lock);
> + rte_rwlock_write_unlock(&vq->access_lock);
> rte_free(vq->batch_copy_elems);
> vhost_user_iotlb_destroy(vq);
> rte_free(vq->log_cache);
> @@ -630,7 +630,7 @@ alloc_vring_queue(struct virtio_net *dev, uint32_t
> vring_idx)
>
> dev->virtqueue[i] = vq;
> init_vring_queue(dev, vq, i);
> - rte_spinlock_init(&vq->access_lock);
> + rte_rwlock_init(&vq->access_lock);
> vq->avail_wrap_counter = 1;
> vq->used_wrap_counter = 1;
> vq->signalled_used_valid = false;
> @@ -1305,14 +1305,14 @@ rte_vhost_vring_call(int vid, uint16_t vring_idx)
> if (!vq)
> return -1;
>
> - rte_spinlock_lock(&vq->access_lock);
> + rte_rwlock_read_lock(&vq->access_lock);
>
> if (vq_is_packed(dev))
> vhost_vring_call_packed(dev, vq);
> else
> vhost_vring_call_split(dev, vq);
>
> - rte_spinlock_unlock(&vq->access_lock);
> + rte_rwlock_read_unlock(&vq->access_lock);
Not sure about this. vhost_ring_call_packed/split is changing some field in
Vq. Should we use write lock here?
Thanks,
Chenbo
^ permalink raw reply [flat|nested] 35+ messages in thread
* RE: [PATCH v3 2/4] vhost: make the guest_notifications statistic counter atomic
2023-05-17 9:08 ` [PATCH v3 2/4] vhost: make the guest_notifications statistic counter atomic Eelco Chaudron
2023-05-30 12:52 ` Maxime Coquelin
@ 2023-05-31 7:03 ` Xia, Chenbo
1 sibling, 0 replies; 35+ messages in thread
From: Xia, Chenbo @ 2023-05-31 7:03 UTC (permalink / raw)
To: Eelco Chaudron, maxime.coquelin, david.marchand; +Cc: dev
> -----Original Message-----
> From: Eelco Chaudron <echaudro@redhat.com>
> Sent: Wednesday, May 17, 2023 5:09 PM
> To: maxime.coquelin@redhat.com; Xia, Chenbo <chenbo.xia@intel.com>;
> david.marchand@redhat.com
> Cc: dev@dpdk.org
> Subject: [PATCH v3 2/4] vhost: make the guest_notifications statistic
> counter atomic
>
> Making the guest_notifications statistic counter atomic allows
> it to be safely incremented while holding the read access_lock.
>
> Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
> ---
> lib/vhost/vhost.c | 8 ++++++++
> lib/vhost/vhost.h | 9 ++++++---
> 2 files changed, 14 insertions(+), 3 deletions(-)
>
> diff --git a/lib/vhost/vhost.c b/lib/vhost/vhost.c
> index 74bdbfd810..8ff6434c93 100644
> --- a/lib/vhost/vhost.c
> +++ b/lib/vhost/vhost.c
> @@ -2086,6 +2086,10 @@ rte_vhost_vring_stats_get(int vid, uint16_t
> queue_id,
>
> rte_rwlock_write_lock(&vq->access_lock);
> for (i = 0; i < VHOST_NB_VQ_STATS; i++) {
> + /*
> + * No need to read the atomic counters atomically here, as the
> + * write access_lock taken above prevents them from being updated.
> + */
> stats[i].value =
> *(uint64_t *)(((char *)vq) +
> vhost_vq_stat_strings[i].offset);
> stats[i].id = i;
> @@ -2112,6 +2116,10 @@ int rte_vhost_vring_stats_reset(int vid, uint16_t
> queue_id)
> vq = dev->virtqueue[queue_id];
>
> rte_rwlock_write_lock(&vq->access_lock);
> + /*
> + * No need to reset the atomic counters atomically here, as the
> + * write access_lock taken above prevents them from being updated.
> + */
> memset(&vq->stats, 0, sizeof(vq->stats));
> rte_rwlock_write_unlock(&vq->access_lock);
>
> diff --git a/lib/vhost/vhost.h b/lib/vhost/vhost.h
> index 5c939ef06f..37609c7c8d 100644
> --- a/lib/vhost/vhost.h
> +++ b/lib/vhost/vhost.h
> @@ -135,11 +135,12 @@ struct virtqueue_stats {
> uint64_t broadcast;
> /* Size bins in array as RFC 2819, undersized [0], 64 [1], etc */
> uint64_t size_bins[8];
> - uint64_t guest_notifications;
> uint64_t iotlb_hits;
> uint64_t iotlb_misses;
> uint64_t inflight_submitted;
> uint64_t inflight_completed;
> + /* Counters below are atomic, and should be incremented as such. */
> + uint64_t guest_notifications;
> };
>
> /**
> @@ -907,7 +908,8 @@ vhost_vring_call_split(struct virtio_net *dev, struct
> vhost_virtqueue *vq)
> unlikely(!signalled_used_valid)) {
> eventfd_write(vq->callfd, (eventfd_t) 1);
> if (dev->flags & VIRTIO_DEV_STATS_ENABLED)
> - vq->stats.guest_notifications++;
> + __atomic_fetch_add(&vq->stats.guest_notifications,
> + 1, __ATOMIC_RELAXED);
> if (dev->notify_ops->guest_notified)
> dev->notify_ops->guest_notified(dev->vid);
> }
> @@ -917,7 +919,8 @@ vhost_vring_call_split(struct virtio_net *dev, struct
> vhost_virtqueue *vq)
> && (vq->callfd >= 0)) {
> eventfd_write(vq->callfd, (eventfd_t)1);
> if (dev->flags & VIRTIO_DEV_STATS_ENABLED)
> - vq->stats.guest_notifications++;
> + __atomic_fetch_add(&vq->stats.guest_notifications,
> + 1, __ATOMIC_RELAXED);
> if (dev->notify_ops->guest_notified)
> dev->notify_ops->guest_notified(dev->vid);
> }
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v3 1/4] vhost: change vhost_virtqueue access lock to a read/write one
2023-05-31 6:37 ` Xia, Chenbo
@ 2023-05-31 9:27 ` Maxime Coquelin
2023-05-31 11:13 ` Eelco Chaudron
0 siblings, 1 reply; 35+ messages in thread
From: Maxime Coquelin @ 2023-05-31 9:27 UTC (permalink / raw)
To: Xia, Chenbo, Eelco Chaudron, david.marchand; +Cc: dev
On 5/31/23 08:37, Xia, Chenbo wrote:
> Hi Eelco,
>
>> -----Original Message-----
>> From: Eelco Chaudron <echaudro@redhat.com>
>> Sent: Wednesday, May 17, 2023 5:09 PM
>> To: maxime.coquelin@redhat.com; Xia, Chenbo <chenbo.xia@intel.com>;
>> david.marchand@redhat.com
>> Cc: dev@dpdk.org
>> Subject: [PATCH v3 1/4] vhost: change vhost_virtqueue access lock to a
>> read/write one
>>
>> This change will allow the vhost interrupt datapath handling to be split
>> between two processes without one of them holding an explicit lock.
>>
>> Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
>> ---
>> lib/eal/include/generic/rte_rwlock.h | 17 ++++++
>> lib/vhost/vhost.c | 46 +++++++++--------
>> lib/vhost/vhost.h | 4 +-
>> lib/vhost/vhost_user.c | 14 +++--
>> lib/vhost/virtio_net.c | 90 +++++++++++++++++------------
>> -----
>> 5 files changed, 94 insertions(+), 77 deletions(-)
>>
>> diff --git a/lib/eal/include/generic/rte_rwlock.h
>> b/lib/eal/include/generic/rte_rwlock.h
>> index 71e2d8d5f4..9e083bbc61 100644
>> --- a/lib/eal/include/generic/rte_rwlock.h
>> +++ b/lib/eal/include/generic/rte_rwlock.h
>> @@ -236,6 +236,23 @@ rte_rwlock_write_unlock(rte_rwlock_t *rwl)
>> __atomic_fetch_sub(&rwl->cnt, RTE_RWLOCK_WRITE, __ATOMIC_RELEASE);
>> }
>>
>> +/**
>> + * Test if the write lock is taken.
>> + *
>> + * @param rwl
>> + * A pointer to a rwlock structure.
>> + * @return
>> + * 1 if the write lock is currently taken; 0 otherwise.
>> + */
>> +static inline int
>> +rte_rwlock_write_is_locked(rte_rwlock_t *rwl)
>> +{
>> + if (__atomic_load_n(&rwl->cnt, __ATOMIC_RELAXED) & RTE_RWLOCK_WRITE)
>> + return 1;
>> +
>> + return 0;
>> +}
>> +
>
> Again we need to update release note as it's a new EAL API.
>
>> /**
>> * Try to execute critical section in a hardware memory transaction, if
>> it
>> * fails or not available take a read lock
>> diff --git a/lib/vhost/vhost.c b/lib/vhost/vhost.c
>> index ef37943817..74bdbfd810 100644
>> --- a/lib/vhost/vhost.c
>> +++ b/lib/vhost/vhost.c
>> @@ -393,9 +393,9 @@ free_vq(struct virtio_net *dev, struct vhost_virtqueue
>> *vq)
>> else
>> rte_free(vq->shadow_used_split);
>>
>> - rte_spinlock_lock(&vq->access_lock);
>> + rte_rwlock_write_lock(&vq->access_lock);
>> vhost_free_async_mem(vq);
>> - rte_spinlock_unlock(&vq->access_lock);
>> + rte_rwlock_write_unlock(&vq->access_lock);
>> rte_free(vq->batch_copy_elems);
>> vhost_user_iotlb_destroy(vq);
>> rte_free(vq->log_cache);
>> @@ -630,7 +630,7 @@ alloc_vring_queue(struct virtio_net *dev, uint32_t
>> vring_idx)
>>
>> dev->virtqueue[i] = vq;
>> init_vring_queue(dev, vq, i);
>> - rte_spinlock_init(&vq->access_lock);
>> + rte_rwlock_init(&vq->access_lock);
>> vq->avail_wrap_counter = 1;
>> vq->used_wrap_counter = 1;
>> vq->signalled_used_valid = false;
>> @@ -1305,14 +1305,14 @@ rte_vhost_vring_call(int vid, uint16_t vring_idx)
>> if (!vq)
>> return -1;
>>
>> - rte_spinlock_lock(&vq->access_lock);
>> + rte_rwlock_read_lock(&vq->access_lock);
>>
>> if (vq_is_packed(dev))
>> vhost_vring_call_packed(dev, vq);
>> else
>> vhost_vring_call_split(dev, vq);
>>
>> - rte_spinlock_unlock(&vq->access_lock);
>> + rte_rwlock_read_unlock(&vq->access_lock);
>
> Not sure about this. vhost_ring_call_packed/split is changing some field in
> Vq. Should we use write lock here?
I don't think so; the purpose of the access_lock is not to make the
datapath thread-safe, but to protect the datapath from metadata changes
by the control path.
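A minimal sketch of that split, using only the rte_rwlock calls introduced by
this patch; the struct and function names below are illustrative and not the
actual vhost code:

#include <rte_rwlock.h>

struct demo_vq {			/* stand-in for struct vhost_virtqueue */
	rte_rwlock_t access_lock;
	/* ring pointers, stats, ... */
};

/* Datapath callers (PMD threads, rte_vhost_vring_call, ...) take the lock
 * for read, so several of them can run concurrently. */
static void
demo_datapath(struct demo_vq *vq)
{
	rte_rwlock_read_lock(&vq->access_lock);
	/* process the ring, inject interrupts, bump atomic counters */
	rte_rwlock_read_unlock(&vq->access_lock);
}

/* The control path takes the lock for write when changing virtqueue
 * metadata, which excludes every datapath reader for that period. */
static void
demo_control_path(struct demo_vq *vq)
{
	rte_rwlock_write_lock(&vq->access_lock);
	/* safe to modify virtqueue metadata here */
	rte_rwlock_write_unlock(&vq->access_lock);
}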
Thanks,
Maxime
>
> Thanks,
> Chenbo
>
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v3 4/4] vhost: add device op to offload the interrupt kick
2023-05-31 6:19 ` Xia, Chenbo
@ 2023-05-31 9:29 ` Maxime Coquelin
2023-05-31 11:21 ` Eelco Chaudron
2023-06-01 2:18 ` Xia, Chenbo
0 siblings, 2 replies; 35+ messages in thread
From: Maxime Coquelin @ 2023-05-31 9:29 UTC (permalink / raw)
To: Xia, Chenbo, Thomas Monjalon, Eelco Chaudron, david.marchand; +Cc: dev
On 5/31/23 08:19, Xia, Chenbo wrote:
>> -----Original Message-----
>> From: Maxime Coquelin <maxime.coquelin@redhat.com>
>> Sent: Tuesday, May 30, 2023 11:17 PM
>> To: Thomas Monjalon <thomas@monjalon.net>; Eelco Chaudron
>> <echaudro@redhat.com>; Xia, Chenbo <chenbo.xia@intel.com>;
>> david.marchand@redhat.com
>> Cc: dev@dpdk.org
>> Subject: Re: [PATCH v3 4/4] vhost: add device op to offload the interrupt
>> kick
>>
>>
>>
>> On 5/30/23 15:16, Thomas Monjalon wrote:
>>> 30/05/2023 15:02, Maxime Coquelin:
>>>>
>>>> On 5/17/23 11:09, Eelco Chaudron wrote:
>>>>> This patch adds an operation callback which gets called every time the
>>>>> library wants to call eventfd_write(). This eventfd_write() call could
>>>>> result in a system call, which could potentially block the PMD thread.
>>>>>
>>>>> The callback function can decide whether it's ok to handle the
>>>>> eventfd_write() now or have the newly introduced function,
>>>>> rte_vhost_notify_guest(), called at a later time.
>>>>>
>>>>> This can be used by 3rd party applications, like OVS, to avoid system
>>>>> calls being called as part of the PMD threads.
>>>>>
>>>>> Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
>>>>> ---
>>>>> lib/vhost/meson.build | 2 ++
>>>>> lib/vhost/rte_vhost.h | 23 +++++++++++++++++-
>>>>> lib/vhost/socket.c | 63
>> ++++++++++++++++++++++++++++++++++++++++++++++---
>>>>> lib/vhost/version.map | 9 +++++++
>>>>> lib/vhost/vhost.c | 38 ++++++++++++++++++++++++++++++
>>>>> lib/vhost/vhost.h | 58 ++++++++++++++++++++++++++++++++------
>> -------
>>>>> 6 files changed, 171 insertions(+), 22 deletions(-)
>>>>>
>>>>
>>>>
>>>> The patch looks good to me, but that's the first time we use function
>>>> versioning in Vhost library, so I'd like another pair of eyes to be
>> sure
>>>> I don't miss anything.
>>>>
>>>> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
>>>>
>>>> Thomas, do we need to mention it somewhere in the release note?
>>>
>>> If compatibility is kept, I think we don't need to mention it.
>>>
>>>
>>
>> Thanks Thomas for the information.
>>
>> Maxime
>
> About release note, except the versioning, there is also one new API introduced
> in this patch, so we still need to mention this in release note
Right, good catch.
Eelco, let me know what you would put, I'll add it while applying (No
need for a new revision).
Thanks,
Maxime
> Thanks,
> Chenbo
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v3 3/4] vhost: fix invalid call FD handling
2023-05-31 6:12 ` Xia, Chenbo
@ 2023-05-31 9:30 ` Maxime Coquelin
0 siblings, 0 replies; 35+ messages in thread
From: Maxime Coquelin @ 2023-05-31 9:30 UTC (permalink / raw)
To: Xia, Chenbo, Eelco Chaudron, david.marchand; +Cc: dev
On 5/31/23 08:12, Xia, Chenbo wrote:
>> -----Original Message-----
>> From: Maxime Coquelin <maxime.coquelin@redhat.com>
>> Sent: Tuesday, May 30, 2023 8:54 PM
>> To: Eelco Chaudron <echaudro@redhat.com>; Xia, Chenbo
>> <chenbo.xia@intel.com>; david.marchand@redhat.com
>> Cc: dev@dpdk.org
>> Subject: Re: [PATCH v3 3/4] vhost: fix invalid call FD handling
>>
>>
>>
>> On 5/17/23 11:09, Eelco Chaudron wrote:
>>> This patch fixes cases where IRQ injection is tried while
>>> the call FD is not valid, which should not happen.
>>>
>>> Fixes: b1cce26af1dc ("vhost: add notification for packed ring")
>>> Fixes: e37ff954405a ("vhost: support virtqueue interrupt/notification
>> suppression")
>>>
>>> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
>>> Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
>>> ---
>>> lib/vhost/vhost.h | 8 ++++----
>>> 1 file changed, 4 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/lib/vhost/vhost.h b/lib/vhost/vhost.h
>>> index 37609c7c8d..23a4e2b1a7 100644
>>> --- a/lib/vhost/vhost.h
>>> +++ b/lib/vhost/vhost.h
>>> @@ -903,9 +903,9 @@ vhost_vring_call_split(struct virtio_net *dev,
>> struct vhost_virtqueue *vq)
>>> "%s: used_event_idx=%d, old=%d, new=%d\n",
>>> __func__, vhost_used_event(vq), old, new);
>>>
>>> - if ((vhost_need_event(vhost_used_event(vq), new, old) &&
>>> - (vq->callfd >= 0)) ||
>>> - unlikely(!signalled_used_valid)) {
>>> + if ((vhost_need_event(vhost_used_event(vq), new, old) ||
>>> + unlikely(!signalled_used_valid)) &&
>>> + vq->callfd >= 0) {
>>> eventfd_write(vq->callfd, (eventfd_t) 1);
>>> if (dev->flags & VIRTIO_DEV_STATS_ENABLED)
>>> __atomic_fetch_add(&vq->stats.guest_notifications,
>>> @@ -974,7 +974,7 @@ vhost_vring_call_packed(struct virtio_net *dev,
>> struct vhost_virtqueue *vq)
>>> if (vhost_need_event(off, new, old))
>>> kick = true;
>>> kick:
>>> - if (kick) {
>>> + if (kick && vq->callfd >= 0) {
>>> eventfd_write(vq->callfd, (eventfd_t)1);
>>> if (dev->notify_ops->guest_notified)
>>> dev->notify_ops->guest_notified(dev->vid);
>>>
>>
>> Reporting Chenbo's R-by, from the VDUSE series RFC:
>>
>> Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
>
> Thanks Maxime! Btw: what's your plan of the same fix in VDUSE series, do you plan
> to drop it in VDUSE series or?
Yes, I'm rebasing my VDUSE series on top of Eelco's.
I just need the release note update for the new API, and I'll push Eelco's
series to next-virtio.
Maxime
> Thanks,
> Chenbo
>
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v3 1/4] vhost: change vhost_virtqueue access lock to a read/write one
2023-05-31 9:27 ` Maxime Coquelin
@ 2023-05-31 11:13 ` Eelco Chaudron
2023-06-01 1:45 ` Xia, Chenbo
0 siblings, 1 reply; 35+ messages in thread
From: Eelco Chaudron @ 2023-05-31 11:13 UTC (permalink / raw)
To: Maxime Coquelin; +Cc: Xia, Chenbo, david.marchand, dev
On 31 May 2023, at 11:27, Maxime Coquelin wrote:
> On 5/31/23 08:37, Xia, Chenbo wrote:
>> Hi Eelco,
>>
>>> -----Original Message-----
>>> From: Eelco Chaudron <echaudro@redhat.com>
>>> Sent: Wednesday, May 17, 2023 5:09 PM
>>> To: maxime.coquelin@redhat.com; Xia, Chenbo <chenbo.xia@intel.com>;
>>> david.marchand@redhat.com
>>> Cc: dev@dpdk.org
>>> Subject: [PATCH v3 1/4] vhost: change vhost_virtqueue access lock to a
>>> read/write one
>>>
>>> This change will allow the vhost interrupt datapath handling to be split
>>> between two processes without one of them holding an explicit lock.
>>>
>>> Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
>>> ---
>>> lib/eal/include/generic/rte_rwlock.h | 17 ++++++
>>> lib/vhost/vhost.c | 46 +++++++++--------
>>> lib/vhost/vhost.h | 4 +-
>>> lib/vhost/vhost_user.c | 14 +++--
>>> lib/vhost/virtio_net.c | 90 +++++++++++++++++------------
>>> -----
>>> 5 files changed, 94 insertions(+), 77 deletions(-)
>>>
>>> diff --git a/lib/eal/include/generic/rte_rwlock.h
>>> b/lib/eal/include/generic/rte_rwlock.h
>>> index 71e2d8d5f4..9e083bbc61 100644
>>> --- a/lib/eal/include/generic/rte_rwlock.h
>>> +++ b/lib/eal/include/generic/rte_rwlock.h
>>> @@ -236,6 +236,23 @@ rte_rwlock_write_unlock(rte_rwlock_t *rwl)
>>> __atomic_fetch_sub(&rwl->cnt, RTE_RWLOCK_WRITE, __ATOMIC_RELEASE);
>>> }
>>>
>>> +/**
>>> + * Test if the write lock is taken.
>>> + *
>>> + * @param rwl
>>> + * A pointer to a rwlock structure.
>>> + * @return
>>> + * 1 if the write lock is currently taken; 0 otherwise.
>>> + */
>>> +static inline int
>>> +rte_rwlock_write_is_locked(rte_rwlock_t *rwl)
>>> +{
>>> + if (__atomic_load_n(&rwl->cnt, __ATOMIC_RELAXED) & RTE_RWLOCK_WRITE)
>>> + return 1;
>>> +
>>> + return 0;
>>> +}
>>> +
>>
>> Again we need to update release note as it's a new EAL API.
>>
>>> /**
>>> * Try to execute critical section in a hardware memory transaction, if
>>> it
>>> * fails or not available take a read lock
>>> diff --git a/lib/vhost/vhost.c b/lib/vhost/vhost.c
>>> index ef37943817..74bdbfd810 100644
>>> --- a/lib/vhost/vhost.c
>>> +++ b/lib/vhost/vhost.c
>>> @@ -393,9 +393,9 @@ free_vq(struct virtio_net *dev, struct vhost_virtqueue
>>> *vq)
>>> else
>>> rte_free(vq->shadow_used_split);
>>>
>>> - rte_spinlock_lock(&vq->access_lock);
>>> + rte_rwlock_write_lock(&vq->access_lock);
>>> vhost_free_async_mem(vq);
>>> - rte_spinlock_unlock(&vq->access_lock);
>>> + rte_rwlock_write_unlock(&vq->access_lock);
>>> rte_free(vq->batch_copy_elems);
>>> vhost_user_iotlb_destroy(vq);
>>> rte_free(vq->log_cache);
>>> @@ -630,7 +630,7 @@ alloc_vring_queue(struct virtio_net *dev, uint32_t
>>> vring_idx)
>>>
>>> dev->virtqueue[i] = vq;
>>> init_vring_queue(dev, vq, i);
>>> - rte_spinlock_init(&vq->access_lock);
>>> + rte_rwlock_init(&vq->access_lock);
>>> vq->avail_wrap_counter = 1;
>>> vq->used_wrap_counter = 1;
>>> vq->signalled_used_valid = false;
>>> @@ -1305,14 +1305,14 @@ rte_vhost_vring_call(int vid, uint16_t vring_idx)
>>> if (!vq)
>>> return -1;
>>>
>>> - rte_spinlock_lock(&vq->access_lock);
>>> + rte_rwlock_read_lock(&vq->access_lock);
>>>
>>> if (vq_is_packed(dev))
>>> vhost_vring_call_packed(dev, vq);
>>> else
>>> vhost_vring_call_split(dev, vq);
>>>
>>> - rte_spinlock_unlock(&vq->access_lock);
>>> + rte_rwlock_read_unlock(&vq->access_lock);
>>
>> Not sure about this. vhost_ring_call_packed/split is changing some field in
>> Vq. Should we use write lock here?
>
> > I don't think so; the purpose of the access_lock is not to make the
> > datapath thread-safe, but to protect the datapath from metadata changes
> > by the control path.
Thanks Chenbo for the review, and see Maxime’s comment above. Does this clarify your concern/question?
>>
>> Thanks,
>> Chenbo
>>
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v3 4/4] vhost: add device op to offload the interrupt kick
2023-05-31 9:29 ` Maxime Coquelin
@ 2023-05-31 11:21 ` Eelco Chaudron
2023-06-01 2:18 ` Xia, Chenbo
1 sibling, 0 replies; 35+ messages in thread
From: Eelco Chaudron @ 2023-05-31 11:21 UTC (permalink / raw)
To: Maxime Coquelin; +Cc: Xia, Chenbo, Thomas Monjalon, david.marchand, dev
On 31 May 2023, at 11:29, Maxime Coquelin wrote:
> On 5/31/23 08:19, Xia, Chenbo wrote:
>>> -----Original Message-----
>>> From: Maxime Coquelin <maxime.coquelin@redhat.com>
>>> Sent: Tuesday, May 30, 2023 11:17 PM
>>> To: Thomas Monjalon <thomas@monjalon.net>; Eelco Chaudron
>>> <echaudro@redhat.com>; Xia, Chenbo <chenbo.xia@intel.com>;
>>> david.marchand@redhat.com
>>> Cc: dev@dpdk.org
>>> Subject: Re: [PATCH v3 4/4] vhost: add device op to offload the interrupt
>>> kick
>>>
>>>
>>>
>>> On 5/30/23 15:16, Thomas Monjalon wrote:
>>>> 30/05/2023 15:02, Maxime Coquelin:
>>>>>
>>>>> On 5/17/23 11:09, Eelco Chaudron wrote:
>>>>>> This patch adds an operation callback which gets called every time the
>>>>>> library wants to call eventfd_write(). This eventfd_write() call could
>>>>>> result in a system call, which could potentially block the PMD thread.
>>>>>>
>>>>>> The callback function can decide whether it's ok to handle the
>>>>>> eventfd_write() now or have the newly introduced function,
>>>>>> rte_vhost_notify_guest(), called at a later time.
>>>>>>
>>>>>> This can be used by 3rd party applications, like OVS, to avoid system
>>>>>> calls being called as part of the PMD threads.
>>>>>>
>>>>>> Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
>>>>>> ---
>>>>>> lib/vhost/meson.build | 2 ++
>>>>>> lib/vhost/rte_vhost.h | 23 +++++++++++++++++-
>>>>>> lib/vhost/socket.c | 63
>>> ++++++++++++++++++++++++++++++++++++++++++++++---
>>>>>> lib/vhost/version.map | 9 +++++++
>>>>>> lib/vhost/vhost.c | 38 ++++++++++++++++++++++++++++++
>>>>>> lib/vhost/vhost.h | 58 ++++++++++++++++++++++++++++++++------
>>> -------
>>>>>> 6 files changed, 171 insertions(+), 22 deletions(-)
>>>>>>
>>>>>
>>>>>
>>>>> The patch looks good to me, but that's the first time we use function
>>>>> versioning in Vhost library, so I'd like another pair of eyes to be
>>> sure
>>>>> I don't miss anything.
>>>>>
>>>>> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
>>>>>
>>>>> Thomas, do we need to mention it somewhere in the release note?
>>>>
>>>> If compatibility is kept, I think we don't need to mention it.
>>>>
>>>>
>>>
>>> Thanks Thomas for the information.
>>>
>>> Maxime
>>
>> About release note, except the versioning, there is also one new API introduced
>> in this patch, so we still need to mention this in release note
>
> Right, good catch.
> Eelco, let me know what you would put, I'll add it while applying (No
> need for a new revision).
Thanks guys! What about the following release note addition:
* **Added callback API support for interrupt handling in the vhost library.**

  A new callback, guest_notify, is introduced that can be used to handle the
  interrupt kick outside of the datapath fast path. In addition, a new API,
  rte_vhost_notify_guest(), is added to raise the interrupt outside of the
  past path.
Please change if you see fit.
Thanks,
Eelco
> Thanks,
> Maxime
>
>> Thanks,
>> Chenbo
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v3 4/4] vhost: add device op to offload the interrupt kick
2023-05-30 13:02 ` Maxime Coquelin
2023-05-30 13:16 ` Thomas Monjalon
@ 2023-05-31 11:49 ` David Marchand
1 sibling, 0 replies; 35+ messages in thread
From: David Marchand @ 2023-05-31 11:49 UTC (permalink / raw)
To: Maxime Coquelin; +Cc: Eelco Chaudron, chenbo.xia, Thomas Monjalon, dev
On Tue, May 30, 2023 at 3:02 PM Maxime Coquelin
<maxime.coquelin@redhat.com> wrote:
> On 5/17/23 11:09, Eelco Chaudron wrote:
> > This patch adds an operation callback which gets called every time the
> > library wants to call eventfd_write(). This eventfd_write() call could
> > result in a system call, which could potentially block the PMD thread.
> >
> > The callback function can decide whether it's ok to handle the
> > eventfd_write() now or have the newly introduced function,
> > rte_vhost_notify_guest(), called at a later time.
> >
> > This can be used by 3rd party applications, like OVS, to avoid system
> > calls being called as part of the PMD threads.
> >
> > Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
> > ---
> > lib/vhost/meson.build | 2 ++
> > lib/vhost/rte_vhost.h | 23 +++++++++++++++++-
> > lib/vhost/socket.c | 63 ++++++++++++++++++++++++++++++++++++++++++++++---
> > lib/vhost/version.map | 9 +++++++
> > lib/vhost/vhost.c | 38 ++++++++++++++++++++++++++++++
> > lib/vhost/vhost.h | 58 ++++++++++++++++++++++++++++++++-------------
> > 6 files changed, 171 insertions(+), 22 deletions(-)
> >
>
>
> The patch looks good to me, but that's the first time we use function
> versioning in Vhost library, so I'd like another pair of eyes to be sure
> I don't miss anything.
>
> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
There may be some enhancement to do in the symbol check (wrt the
warning http://mails.dpdk.org/archives/test-report/2023-May/394865.html)
but otherwise the versioning part lgtm.
--
David Marchand
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v3 4/4] vhost: add device op to offload the interrupt kick
2023-05-17 9:09 ` [PATCH v3 4/4] vhost: add device op to offload the interrupt kick Eelco Chaudron
2023-05-30 13:02 ` Maxime Coquelin
@ 2023-05-31 12:01 ` David Marchand
2023-05-31 12:48 ` Maxime Coquelin
2023-05-31 14:12 ` David Marchand
2 siblings, 1 reply; 35+ messages in thread
From: David Marchand @ 2023-05-31 12:01 UTC (permalink / raw)
To: Eelco Chaudron, maxime.coquelin; +Cc: chenbo.xia, dev
Eelco, Maxime,
On Wed, May 17, 2023 at 11:09 AM Eelco Chaudron <echaudro@redhat.com> wrote:
> @@ -846,6 +848,11 @@ vhost_user_socket_mem_free(struct vhost_user_socket *vsocket)
> vsocket->path = NULL;
> }
>
> + if (vsocket && vsocket->malloc_notify_ops) {
> + free(vsocket->malloc_notify_ops);
> + vsocket->malloc_notify_ops = NULL;
> + }
> +
> if (vsocket) {
> free(vsocket);
> vsocket = NULL;
Nit: we had several cleanups in the tree to remove patterns like if
(ptr) free(ptr).
Here, this helper could look for vsocket being NULL first thing, then
call free() unconditionally.
And resetting the fields to NULL is probably not that useful, since
the vsocket is freed at the end.
Wdyt of:
static void
vhost_user_socket_mem_free(struct vhost_user_socket *vsocket)
{
if (vsocket == NULL)
return;
free(vsocket->path);
free(vsocket->malloc_notify_ops);
free(vsocket);
}
--
David Marchand
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v3 4/4] vhost: add device op to offload the interrupt kick
2023-05-31 12:01 ` David Marchand
@ 2023-05-31 12:48 ` Maxime Coquelin
2023-05-31 13:13 ` Eelco Chaudron
0 siblings, 1 reply; 35+ messages in thread
From: Maxime Coquelin @ 2023-05-31 12:48 UTC (permalink / raw)
To: David Marchand, Eelco Chaudron; +Cc: chenbo.xia, dev
On 5/31/23 14:01, David Marchand wrote:
> Eelco, Maxime,
>
> On Wed, May 17, 2023 at 11:09 AM Eelco Chaudron <echaudro@redhat.com> wrote:
>> @@ -846,6 +848,11 @@ vhost_user_socket_mem_free(struct vhost_user_socket *vsocket)
>> vsocket->path = NULL;
>> }
>>
>> + if (vsocket && vsocket->malloc_notify_ops) {
>> + free(vsocket->malloc_notify_ops);
>> + vsocket->malloc_notify_ops = NULL;
>> + }
>> +
>> if (vsocket) {
>> free(vsocket);
>> vsocket = NULL;
>
> Nit: we had several cleanups in the tree to remove patterns like if
> (ptr) free(ptr).
> Here, this helper could look for vsocket being NULL first thing, then
> call free() unconditionally.
> And resetting the fields to NULL is probably not that useful, since
> the vsocket is freed at the end.
> Wdyt of:
>
> static void
> vhost_user_socket_mem_free(struct vhost_user_socket *vsocket)
> {
> if (vsocket == NULL)
> return;
>
> free(vsocket->path);
> free(vsocket->malloc_notify_ops);
> free(vsocket);
> }
>
>
It is indeed better, I can fix while applying if Eelco agrees.
Thanks,
Maxime
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v3 4/4] vhost: add device op to offload the interrupt kick
2023-05-31 12:48 ` Maxime Coquelin
@ 2023-05-31 13:13 ` Eelco Chaudron
0 siblings, 0 replies; 35+ messages in thread
From: Eelco Chaudron @ 2023-05-31 13:13 UTC (permalink / raw)
To: Maxime Coquelin; +Cc: David Marchand, chenbo.xia, dev
On 31 May 2023, at 14:48, Maxime Coquelin wrote:
> On 5/31/23 14:01, David Marchand wrote:
>> Eelco, Maxime,
>>
>> On Wed, May 17, 2023 at 11:09 AM Eelco Chaudron <echaudro@redhat.com> wrote:
>>> @@ -846,6 +848,11 @@ vhost_user_socket_mem_free(struct vhost_user_socket *vsocket)
>>> vsocket->path = NULL;
>>> }
>>>
>>> + if (vsocket && vsocket->malloc_notify_ops) {
>>> + free(vsocket->malloc_notify_ops);
>>> + vsocket->malloc_notify_ops = NULL;
>>> + }
>>> +
>>> if (vsocket) {
>>> free(vsocket);
>>> vsocket = NULL;
>>
>> Nit: we had several cleanups in the tree to remove patterns like if
>> (ptr) free(ptr).
>> Here, this helper could look for vsocket being NULL first thing, then
>> call free() unconditionally.
>> And resetting the fields to NULL is probably not that useful, since
>> the vsocket is freed at the end.
>> Wdyt of:
>>
>> static void
>> vhost_user_socket_mem_free(struct vhost_user_socket *vsocket)
>> {
>> if (vsocket == NULL)
>> return;
>>
>> free(vsocket->path);
>> free(vsocket->malloc_notify_ops);
>> free(vsocket);
>> }
>>
>>
>
> It is indeed better, I can fix while applying if Eelco agrees.
Looks good to me.
//Eelco
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v3 4/4] vhost: add device op to offload the interrupt kick
2023-05-17 9:09 ` [PATCH v3 4/4] vhost: add device op to offload the interrupt kick Eelco Chaudron
2023-05-30 13:02 ` Maxime Coquelin
2023-05-31 12:01 ` David Marchand
@ 2023-05-31 14:12 ` David Marchand
2023-05-31 14:18 ` Maxime Coquelin
2 siblings, 1 reply; 35+ messages in thread
From: David Marchand @ 2023-05-31 14:12 UTC (permalink / raw)
To: maxime.coquelin; +Cc: Eelco Chaudron, chenbo.xia, dev
Maxime,
On Wed, May 17, 2023 at 11:09 AM Eelco Chaudron <echaudro@redhat.com> wrote:
> @@ -974,11 +994,8 @@ vhost_vring_call_packed(struct virtio_net *dev, struct vhost_virtqueue *vq)
> if (vhost_need_event(off, new, old))
> kick = true;
> kick:
> - if (kick && vq->callfd >= 0) {
> - eventfd_write(vq->callfd, (eventfd_t)1);
> - if (dev->notify_ops->guest_notified)
> - dev->notify_ops->guest_notified(dev->vid);
> - }
> + if (kick && vq->callfd >= 0)
> + vhost_vring_inject_irq(dev, vq);
Thinking again about the VDUSE series overlap...
This hunk here is implicitly fixing an issue.
You addressed it in
https://patchwork.dpdk.org/project/dpdk/patch/20230525162551.70359-2-maxime.coquelin@redhat.com/
So I suggest you apply this patch of yours before Eelco patch 4.
This way, the fix backport will be trivial.
--
David Marchand
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v3 4/4] vhost: add device op to offload the interrupt kick
2023-05-31 14:12 ` David Marchand
@ 2023-05-31 14:18 ` Maxime Coquelin
0 siblings, 0 replies; 35+ messages in thread
From: Maxime Coquelin @ 2023-05-31 14:18 UTC (permalink / raw)
To: David Marchand; +Cc: Eelco Chaudron, chenbo.xia, dev
On 5/31/23 16:12, David Marchand wrote:
> Maxime,
>
> On Wed, May 17, 2023 at 11:09 AM Eelco Chaudron <echaudro@redhat.com> wrote:
>> @@ -974,11 +994,8 @@ vhost_vring_call_packed(struct virtio_net *dev, struct vhost_virtqueue *vq)
>> if (vhost_need_event(off, new, old))
>> kick = true;
>> kick:
>> - if (kick && vq->callfd >= 0) {
>> - eventfd_write(vq->callfd, (eventfd_t)1);
>> - if (dev->notify_ops->guest_notified)
>> - dev->notify_ops->guest_notified(dev->vid);
>> - }
>> + if (kick && vq->callfd >= 0)
>> + vhost_vring_inject_irq(dev, vq);
>
> Thinking again about the VDUSE series overlap...
>
> This hunk here is implicitly fixing an issue.
> You addressed it in
> https://patchwork.dpdk.org/project/dpdk/patch/20230525162551.70359-2-maxime.coquelin@redhat.com/
>
> So I suggest you apply this patch of yours before Eelco patch 4.
> This way, the fix backport will be trivial.
>
>
Thanks David, this is indeed the best thing to do to ease backporting
the fix to v22.11.
Maxime
^ permalink raw reply [flat|nested] 35+ messages in thread
* RE: [PATCH v3 1/4] vhost: change vhost_virtqueue access lock to a read/write one
2023-05-31 11:13 ` Eelco Chaudron
@ 2023-06-01 1:45 ` Xia, Chenbo
0 siblings, 0 replies; 35+ messages in thread
From: Xia, Chenbo @ 2023-06-01 1:45 UTC (permalink / raw)
To: Eelco Chaudron, Maxime Coquelin; +Cc: david.marchand, dev
> -----Original Message-----
> From: Eelco Chaudron <echaudro@redhat.com>
> Sent: Wednesday, May 31, 2023 7:14 PM
> To: Maxime Coquelin <maxime.coquelin@redhat.com>
> Cc: Xia, Chenbo <chenbo.xia@intel.com>; david.marchand@redhat.com;
> dev@dpdk.org
> Subject: Re: [PATCH v3 1/4] vhost: change vhost_virtqueue access lock to a
> read/write one
>
>
>
> On 31 May 2023, at 11:27, Maxime Coquelin wrote:
>
> > On 5/31/23 08:37, Xia, Chenbo wrote:
> >> Hi Eelco,
> >>
> >>> -----Original Message-----
> >>> From: Eelco Chaudron <echaudro@redhat.com>
> >>> Sent: Wednesday, May 17, 2023 5:09 PM
> >>> To: maxime.coquelin@redhat.com; Xia, Chenbo <chenbo.xia@intel.com>;
> >>> david.marchand@redhat.com
> >>> Cc: dev@dpdk.org
> >>> Subject: [PATCH v3 1/4] vhost: change vhost_virtqueue access lock to a
> >>> read/write one
> >>>
> >>> This change will allow the vhost interrupt datapath handling to be
> split
> >>> between two processes without one of them holding an explicit lock.
> >>>
> >>> Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
> >>> ---
> >>> lib/eal/include/generic/rte_rwlock.h | 17 ++++++
> >>> lib/vhost/vhost.c | 46 +++++++++--------
> >>> lib/vhost/vhost.h | 4 +-
> >>> lib/vhost/vhost_user.c | 14 +++--
> >>> lib/vhost/virtio_net.c | 90 +++++++++++++++++-------
> -----
> >>> -----
> >>> 5 files changed, 94 insertions(+), 77 deletions(-)
> >>>
> >>> diff --git a/lib/eal/include/generic/rte_rwlock.h
> >>> b/lib/eal/include/generic/rte_rwlock.h
> >>> index 71e2d8d5f4..9e083bbc61 100644
> >>> --- a/lib/eal/include/generic/rte_rwlock.h
> >>> +++ b/lib/eal/include/generic/rte_rwlock.h
> >>> @@ -236,6 +236,23 @@ rte_rwlock_write_unlock(rte_rwlock_t *rwl)
> >>> __atomic_fetch_sub(&rwl->cnt, RTE_RWLOCK_WRITE,
> __ATOMIC_RELEASE);
> >>> }
> >>>
> >>> +/**
> >>> + * Test if the write lock is taken.
> >>> + *
> >>> + * @param rwl
> >>> + * A pointer to a rwlock structure.
> >>> + * @return
> >>> + * 1 if the write lock is currently taken; 0 otherwise.
> >>> + */
> >>> +static inline int
> >>> +rte_rwlock_write_is_locked(rte_rwlock_t *rwl)
> >>> +{
> >>> + if (__atomic_load_n(&rwl->cnt, __ATOMIC_RELAXED) & RTE_RWLOCK_WRITE)
> >>> + return 1;
> >>> +
> >>> + return 0;
> >>> +}
> >>> +
> >>
> >> Again we need to update release note as it's a new EAL API.
> >>
> >>> /**
> >>> * Try to execute critical section in a hardware memory transaction,
> if
> >>> it
> >>> * fails or not available take a read lock
> >>> diff --git a/lib/vhost/vhost.c b/lib/vhost/vhost.c
> >>> index ef37943817..74bdbfd810 100644
> >>> --- a/lib/vhost/vhost.c
> >>> +++ b/lib/vhost/vhost.c
> >>> @@ -393,9 +393,9 @@ free_vq(struct virtio_net *dev, struct
> vhost_virtqueue
> >>> *vq)
> >>> else
> >>> rte_free(vq->shadow_used_split);
> >>>
> >>> - rte_spinlock_lock(&vq->access_lock);
> >>> + rte_rwlock_write_lock(&vq->access_lock);
> >>> vhost_free_async_mem(vq);
> >>> - rte_spinlock_unlock(&vq->access_lock);
> >>> + rte_rwlock_write_unlock(&vq->access_lock);
> >>> rte_free(vq->batch_copy_elems);
> >>> vhost_user_iotlb_destroy(vq);
> >>> rte_free(vq->log_cache);
> >>> @@ -630,7 +630,7 @@ alloc_vring_queue(struct virtio_net *dev, uint32_t
> >>> vring_idx)
> >>>
> >>> dev->virtqueue[i] = vq;
> >>> init_vring_queue(dev, vq, i);
> >>> - rte_spinlock_init(&vq->access_lock);
> >>> + rte_rwlock_init(&vq->access_lock);
> >>> vq->avail_wrap_counter = 1;
> >>> vq->used_wrap_counter = 1;
> >>> vq->signalled_used_valid = false;
> >>> @@ -1305,14 +1305,14 @@ rte_vhost_vring_call(int vid, uint16_t
> vring_idx)
> >>> if (!vq)
> >>> return -1;
> >>>
> >>> - rte_spinlock_lock(&vq->access_lock);
> >>> + rte_rwlock_read_lock(&vq->access_lock);
> >>>
> >>> if (vq_is_packed(dev))
> >>> vhost_vring_call_packed(dev, vq);
> >>> else
> >>> vhost_vring_call_split(dev, vq);
> >>>
> >>> - rte_spinlock_unlock(&vq->access_lock);
> >>> + rte_rwlock_read_unlock(&vq->access_lock);
> >>
> >> Not sure about this. vhost_ring_call_packed/split is changing some
> field in
> >> Vq. Should we use write lock here?
> >
> > I don't think so; the purpose of the access_lock is not to make the
> > datapath thread-safe, but to protect the datapath from metadata changes
> > by the control path.
>
> Thanks Chenbo for the review, and see Maxime’s comment above. Does this
> clarify your concern/question?
Makes sense to me. Thanks Eelco and Maxime!
With the release note added:
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
>
> >>
> >> Thanks,
> >> Chenbo
> >>
^ permalink raw reply [flat|nested] 35+ messages in thread
* RE: [PATCH v3 4/4] vhost: add device op to offload the interrupt kick
2023-05-31 9:29 ` Maxime Coquelin
2023-05-31 11:21 ` Eelco Chaudron
@ 2023-06-01 2:18 ` Xia, Chenbo
2023-06-01 8:15 ` Eelco Chaudron
1 sibling, 1 reply; 35+ messages in thread
From: Xia, Chenbo @ 2023-06-01 2:18 UTC (permalink / raw)
To: Maxime Coquelin, Thomas Monjalon, Eelco Chaudron, david.marchand; +Cc: dev
> -----Original Message-----
> From: Maxime Coquelin <maxime.coquelin@redhat.com>
> Sent: Wednesday, May 31, 2023 5:29 PM
> To: Xia, Chenbo <chenbo.xia@intel.com>; Thomas Monjalon
> <thomas@monjalon.net>; Eelco Chaudron <echaudro@redhat.com>;
> david.marchand@redhat.com
> Cc: dev@dpdk.org
> Subject: Re: [PATCH v3 4/4] vhost: add device op to offload the interrupt
> kick
>
>
>
> On 5/31/23 08:19, Xia, Chenbo wrote:
> >> -----Original Message-----
> >> From: Maxime Coquelin <maxime.coquelin@redhat.com>
> >> Sent: Tuesday, May 30, 2023 11:17 PM
> >> To: Thomas Monjalon <thomas@monjalon.net>; Eelco Chaudron
> >> <echaudro@redhat.com>; Xia, Chenbo <chenbo.xia@intel.com>;
> >> david.marchand@redhat.com
> >> Cc: dev@dpdk.org
> >> Subject: Re: [PATCH v3 4/4] vhost: add device op to offload the
> interrupt
> >> kick
> >>
> >>
> >>
> >> On 5/30/23 15:16, Thomas Monjalon wrote:
> >>> 30/05/2023 15:02, Maxime Coquelin:
> >>>>
> >>>> On 5/17/23 11:09, Eelco Chaudron wrote:
> >>>>> This patch adds an operation callback which gets called every time
> the
> >>>>> library wants to call eventfd_write(). This eventfd_write() call
> could
> >>>>> result in a system call, which could potentially block the PMD
> thread.
> >>>>>
> >>>>> The callback function can decide whether it's ok to handle the
> >>>>> eventfd_write() now or have the newly introduced function,
> >>>>> rte_vhost_notify_guest(), called at a later time.
> >>>>>
> >>>>> This can be used by 3rd party applications, like OVS, to avoid
> system
> >>>>> calls being called as part of the PMD threads.
> >>>>>
> >>>>> Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
> >>>>> ---
> >>>>> lib/vhost/meson.build | 2 ++
> >>>>> lib/vhost/rte_vhost.h | 23 +++++++++++++++++-
> >>>>> lib/vhost/socket.c | 63
> >> ++++++++++++++++++++++++++++++++++++++++++++++---
> >>>>> lib/vhost/version.map | 9 +++++++
> >>>>> lib/vhost/vhost.c | 38 ++++++++++++++++++++++++++++++
> >>>>> lib/vhost/vhost.h | 58 ++++++++++++++++++++++++++++++++---
> ---
> >> -------
> >>>>> 6 files changed, 171 insertions(+), 22 deletions(-)
> >>>>>
> >>>>
> >>>>
> >>>> The patch looks good to me, but that's the first time we use function
> >>>> versioning in Vhost library, so I'd like another pair of eyes to be
> >> sure
> >>>> I don't miss anything.
> >>>>
> >>>> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> >>>>
> >>>> Thomas, do we need to mention it somewhere in the release note?
> >>>
> >>> If compatibility is kept, I think we don't need to mention it.
> >>>
> >>>
> >>
> >> Thanks Thomas for the information.
> >>
> >> Maxime
> >
> > About release note, except the versioning, there is also one new API
> introduced
> > in this patch, so we still need to mention this in release note
>
> Right, good catch.
> Eelco, let me know what you would put, I'll add it while applying (No
> need for a new revision).
Btw, the vhost_lib.rst also needs a new item..
Thanks,
Chenbo
>
> Thanks,
> Maxime
>
> > Thanks,
> > Chenbo
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v3 4/4] vhost: add device op to offload the interrupt kick
2023-06-01 2:18 ` Xia, Chenbo
@ 2023-06-01 8:15 ` Eelco Chaudron
2023-06-01 8:29 ` Maxime Coquelin
0 siblings, 1 reply; 35+ messages in thread
From: Eelco Chaudron @ 2023-06-01 8:15 UTC (permalink / raw)
To: Xia, Chenbo; +Cc: Maxime Coquelin, Thomas Monjalon, david.marchand, dev
On 1 Jun 2023, at 4:18, Xia, Chenbo wrote:
>> -----Original Message-----
>> From: Maxime Coquelin <maxime.coquelin@redhat.com>
>> Sent: Wednesday, May 31, 2023 5:29 PM
>> To: Xia, Chenbo <chenbo.xia@intel.com>; Thomas Monjalon
>> <thomas@monjalon.net>; Eelco Chaudron <echaudro@redhat.com>;
>> david.marchand@redhat.com
>> Cc: dev@dpdk.org
>> Subject: Re: [PATCH v3 4/4] vhost: add device op to offload the interrupt
>> kick
>>
>>
>>
>> On 5/31/23 08:19, Xia, Chenbo wrote:
>>>> -----Original Message-----
>>>> From: Maxime Coquelin <maxime.coquelin@redhat.com>
>>>> Sent: Tuesday, May 30, 2023 11:17 PM
>>>> To: Thomas Monjalon <thomas@monjalon.net>; Eelco Chaudron
>>>> <echaudro@redhat.com>; Xia, Chenbo <chenbo.xia@intel.com>;
>>>> david.marchand@redhat.com
>>>> Cc: dev@dpdk.org
>>>> Subject: Re: [PATCH v3 4/4] vhost: add device op to offload the
>> interrupt
>>>> kick
>>>>
>>>>
>>>>
>>>> On 5/30/23 15:16, Thomas Monjalon wrote:
>>>>> 30/05/2023 15:02, Maxime Coquelin:
>>>>>>
>>>>>> On 5/17/23 11:09, Eelco Chaudron wrote:
>>>>>>> This patch adds an operation callback which gets called every time
>> the
>>>>>>> library wants to call eventfd_write(). This eventfd_write() call
>> could
>>>>>>> result in a system call, which could potentially block the PMD
>> thread.
>>>>>>>
>>>>>>> The callback function can decide whether it's ok to handle the
>>>>>>> eventfd_write() now or have the newly introduced function,
>>>>>>> rte_vhost_notify_guest(), called at a later time.
>>>>>>>
>>>>>>> This can be used by 3rd party applications, like OVS, to avoid
>> system
>>>>>>> calls being called as part of the PMD threads.
>>>>>>>
>>>>>>> Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
>>>>>>> ---
>>>>>>> lib/vhost/meson.build | 2 ++
>>>>>>> lib/vhost/rte_vhost.h | 23 +++++++++++++++++-
>>>>>>> lib/vhost/socket.c | 63
>>>> ++++++++++++++++++++++++++++++++++++++++++++++---
>>>>>>> lib/vhost/version.map | 9 +++++++
>>>>>>> lib/vhost/vhost.c | 38 ++++++++++++++++++++++++++++++
>>>>>>> lib/vhost/vhost.h | 58 ++++++++++++++++++++++++++++++++---
>> ---
>>>> -------
>>>>>>> 6 files changed, 171 insertions(+), 22 deletions(-)
>>>>>>>
>>>>>>
>>>>>>
>>>>>> The patch looks good to me, but that's the first time we use function
>>>>>> versioning in Vhost library, so I'd like another pair of eyes to be
>>>> sure
>>>>>> I don't miss anything.
>>>>>>
>>>>>> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
>>>>>>
>>>>>> Thomas, do we need to mention it somewhere in the release note?
>>>>>
>>>>> If compatibility is kept, I think we don't need to mention it.
>>>>>
>>>>>
>>>>
>>>> Thanks Thomas for the information.
>>>>
>>>> Maxime
>>>
>>> About release note, except the versioning, there is also one new API
>> introduced
>>> in this patch, so we still need to mention this in release note
>>
>> Right, good catch.
>> Eelco, let me know what you would put, I'll add it while applying (No
>> need for a new revision).
>
> Btw, the vhost_lib.rst also needs a new item..
What about the following?
diff --git a/doc/guides/prog_guide/vhost_lib.rst b/doc/guides/prog_guide/vhost_lib.rst
index e8bb8c9b7b..0f964d7a4a 100644
--- a/doc/guides/prog_guide/vhost_lib.rst
+++ b/doc/guides/prog_guide/vhost_lib.rst
@@ -334,6 +334,11 @@ The following is an overview of some key Vhost API functions:
Clean DMA vChannel finished to use. After this function is called,
the specified DMA vChannel should no longer be used by the Vhost library.
+* ``rte_vhost_notify_guest(int vid, uint16_t queue_id)``
+
+ Inject the offloaded interrupt received by the 'guest_notify' callback,
+ into the vhost device's queue.
+
Vhost-user Implementations
--------------------------
Maxime do you want to add it, or do you want a new rev?
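For reference, a sketch of how an application might wire this up. Only
rte_vhost_notify_guest(vid, queue_id) is taken from the documentation hunk
above; the guest_notify prototype, its return convention and the deferred-kick
queue below are assumptions made for the example, so check the applied patch
for the real contract:

#include <stdint.h>
#include <rte_vhost.h>

/* Hypothetical single-producer/single-consumer queue of deferred kicks,
 * filled by the PMD thread and drained by a dedicated helper thread. */
#define MAX_KICKS 1024
static struct { int vid; uint16_t qid; } kicks[MAX_KICKS];
static unsigned int head, tail;

/* Assumed callback shape: called by the vhost library from the PMD thread.
 * Returning non-zero is assumed to mean "the application will raise the
 * interrupt itself later". */
static int
app_guest_notify(int vid, uint16_t queue_id)
{
	unsigned int h = __atomic_load_n(&head, __ATOMIC_RELAXED);

	kicks[h % MAX_KICKS].vid = vid;	/* no overflow handling, sketch only */
	kicks[h % MAX_KICKS].qid = queue_id;
	__atomic_store_n(&head, h + 1, __ATOMIC_RELEASE);
	return 1;
}

/* Run from the non-PMD helper thread: issue the offloaded kicks. */
static void
app_flush_deferred_kicks(void)
{
	while (tail != __atomic_load_n(&head, __ATOMIC_ACQUIRE)) {
		rte_vhost_notify_guest(kicks[tail % MAX_KICKS].vid,
				       kicks[tail % MAX_KICKS].qid);
		tail++;
	}
}

/* The callback itself is installed through the 'guest_notify' member of the
 * device ops passed to rte_vhost_driver_callback_register(). */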
> Thanks,
> Chenbo
>
>>
>> Thanks,
>> Maxime
>>
>>> Thanks,
>>> Chenbo
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v3 4/4] vhost: add device op to offload the interrupt kick
2023-06-01 8:15 ` Eelco Chaudron
@ 2023-06-01 8:29 ` Maxime Coquelin
2023-06-01 8:49 ` Eelco Chaudron
0 siblings, 1 reply; 35+ messages in thread
From: Maxime Coquelin @ 2023-06-01 8:29 UTC (permalink / raw)
To: Eelco Chaudron, Xia, Chenbo; +Cc: Thomas Monjalon, david.marchand, dev
On 6/1/23 10:15, Eelco Chaudron wrote:
>
>
> On 1 Jun 2023, at 4:18, Xia, Chenbo wrote:
>
>>> -----Original Message-----
>>> From: Maxime Coquelin <maxime.coquelin@redhat.com>
>>> Sent: Wednesday, May 31, 2023 5:29 PM
>>> To: Xia, Chenbo <chenbo.xia@intel.com>; Thomas Monjalon
>>> <thomas@monjalon.net>; Eelco Chaudron <echaudro@redhat.com>;
>>> david.marchand@redhat.com
>>> Cc: dev@dpdk.org
>>> Subject: Re: [PATCH v3 4/4] vhost: add device op to offload the interrupt
>>> kick
>>>
>>>
>>>
>>> On 5/31/23 08:19, Xia, Chenbo wrote:
>>>>> -----Original Message-----
>>>>> From: Maxime Coquelin <maxime.coquelin@redhat.com>
>>>>> Sent: Tuesday, May 30, 2023 11:17 PM
>>>>> To: Thomas Monjalon <thomas@monjalon.net>; Eelco Chaudron
>>>>> <echaudro@redhat.com>; Xia, Chenbo <chenbo.xia@intel.com>;
>>>>> david.marchand@redhat.com
>>>>> Cc: dev@dpdk.org
>>>>> Subject: Re: [PATCH v3 4/4] vhost: add device op to offload the
>>> interrupt
>>>>> kick
>>>>>
>>>>>
>>>>>
>>>>> On 5/30/23 15:16, Thomas Monjalon wrote:
>>>>>> 30/05/2023 15:02, Maxime Coquelin:
>>>>>>>
>>>>>>> On 5/17/23 11:09, Eelco Chaudron wrote:
>>>>>>>> This patch adds an operation callback which gets called every time
>>> the
>>>>>>>> library wants to call eventfd_write(). This eventfd_write() call
>>> could
>>>>>>>> result in a system call, which could potentially block the PMD
>>> thread.
>>>>>>>>
>>>>>>>> The callback function can decide whether it's ok to handle the
>>>>>>>> eventfd_write() now or have the newly introduced function,
>>>>>>>> rte_vhost_notify_guest(), called at a later time.
>>>>>>>>
>>>>>>>> This can be used by 3rd party applications, like OVS, to avoid
>>> system
>>>>>>>> calls being called as part of the PMD threads.
>>>>>>>>
>>>>>>>> Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
>>>>>>>> ---
>>>>>>>> lib/vhost/meson.build | 2 ++
>>>>>>>> lib/vhost/rte_vhost.h | 23 +++++++++++++++++-
>>>>>>>> lib/vhost/socket.c | 63
>>>>> ++++++++++++++++++++++++++++++++++++++++++++++---
>>>>>>>> lib/vhost/version.map | 9 +++++++
>>>>>>>> lib/vhost/vhost.c | 38 ++++++++++++++++++++++++++++++
>>>>>>>> lib/vhost/vhost.h | 58 ++++++++++++++++++++++++++++++++---
>>> ---
>>>>> -------
>>>>>>>> 6 files changed, 171 insertions(+), 22 deletions(-)
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> The patch looks good to me, but that's the first time we use function
>>>>>>> versioning in Vhost library, so I'd like another pair of eyes to be
>>>>> sure
>>>>>>> I don't miss anything.
>>>>>>>
>>>>>>> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
>>>>>>>
>>>>>>> Thomas, do we need to mention it somewhere in the release note?
>>>>>>
>>>>>> If compatibility is kept, I think we don't need to mention it.
>>>>>>
>>>>>>
>>>>>
>>>>> Thanks Thomas for the information.
>>>>>
>>>>> Maxime
>>>>
>>>> About release note, except the versioning, there is also one new API
>>> introduced
>>>> in this patch, so we still need to mention this in release note
>>>
>>> Right, good catch.
>>> Eelco, let me know what you would put, I'll add it while applying (No
>>> need for a new revision).
>>
>> Btw, the vhost_lib.rst also needs a new item..
>
> What about the following?
>
> diff --git a/doc/guides/prog_guide/vhost_lib.rst b/doc/guides/prog_guide/vhost_lib.rst
> index e8bb8c9b7b..0f964d7a4a 100644
> --- a/doc/guides/prog_guide/vhost_lib.rst
> +++ b/doc/guides/prog_guide/vhost_lib.rst
> @@ -334,6 +334,11 @@ The following is an overview of some key Vhost API functions:
> Clean DMA vChannel finished to use. After this function is called,
> the specified DMA vChannel should no longer be used by the Vhost library.
>
> +* ``rte_vhost_notify_guest(int vid, uint16_t queue_id)``
> +
> + Inject the offloaded interrupt received by the 'guest_notify' callback,
> + into the vhost device's queue.
> +
> Vhost-user Implementations
> --------------------------
>
> Maxime do you want to add it, or do you want a new rev?
That's fine, I just added it now :) Can you check the next-virtio main
branch and confirm all is OK for your series?
https://git.dpdk.org/next/dpdk-next-virtio/log/
Thanks,
Maxime
>
>> Thanks,
>> Chenbo
>>
>>>
>>> Thanks,
>>> Maxime
>>>
>>>> Thanks,
>>>> Chenbo
>
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v3 4/4] vhost: add device op to offload the interrupt kick
2023-06-01 8:29 ` Maxime Coquelin
@ 2023-06-01 8:49 ` Eelco Chaudron
2023-06-01 8:53 ` Maxime Coquelin
0 siblings, 1 reply; 35+ messages in thread
From: Eelco Chaudron @ 2023-06-01 8:49 UTC (permalink / raw)
To: Maxime Coquelin; +Cc: Xia, Chenbo, Thomas Monjalon, david.marchand, dev
On 1 Jun 2023, at 10:29, Maxime Coquelin wrote:
> On 6/1/23 10:15, Eelco Chaudron wrote:
>>
>>
>> On 1 Jun 2023, at 4:18, Xia, Chenbo wrote:
>>
>>>> -----Original Message-----
>>>> From: Maxime Coquelin <maxime.coquelin@redhat.com>
>>>> Sent: Wednesday, May 31, 2023 5:29 PM
>>>> To: Xia, Chenbo <chenbo.xia@intel.com>; Thomas Monjalon
>>>> <thomas@monjalon.net>; Eelco Chaudron <echaudro@redhat.com>;
>>>> david.marchand@redhat.com
>>>> Cc: dev@dpdk.org
>>>> Subject: Re: [PATCH v3 4/4] vhost: add device op to offload the interrupt
>>>> kick
>>>>
>>>>
>>>>
>>>> On 5/31/23 08:19, Xia, Chenbo wrote:
>>>>>> -----Original Message-----
>>>>>> From: Maxime Coquelin <maxime.coquelin@redhat.com>
>>>>>> Sent: Tuesday, May 30, 2023 11:17 PM
>>>>>> To: Thomas Monjalon <thomas@monjalon.net>; Eelco Chaudron
>>>>>> <echaudro@redhat.com>; Xia, Chenbo <chenbo.xia@intel.com>;
>>>>>> david.marchand@redhat.com
>>>>>> Cc: dev@dpdk.org
>>>>>> Subject: Re: [PATCH v3 4/4] vhost: add device op to offload the
>>>> interrupt
>>>>>> kick
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 5/30/23 15:16, Thomas Monjalon wrote:
>>>>>>> 30/05/2023 15:02, Maxime Coquelin:
>>>>>>>>
>>>>>>>> On 5/17/23 11:09, Eelco Chaudron wrote:
>>>>>>>>> This patch adds an operation callback which gets called every time
>>>> the
>>>>>>>>> library wants to call eventfd_write(). This eventfd_write() call
>>>> could
>>>>>>>>> result in a system call, which could potentially block the PMD
>>>> thread.
>>>>>>>>>
>>>>>>>>> The callback function can decide whether it's ok to handle the
>>>>>>>>> eventfd_write() now or have the newly introduced function,
>>>>>>>>> rte_vhost_notify_guest(), called at a later time.
>>>>>>>>>
>>>>>>>>> This can be used by 3rd party applications, like OVS, to avoid
>>>> system
>>>>>>>>> calls being called as part of the PMD threads.
>>>>>>>>>
>>>>>>>>> Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
>>>>>>>>> ---
>>>>>>>>> lib/vhost/meson.build | 2 ++
>>>>>>>>> lib/vhost/rte_vhost.h | 23 +++++++++++++++++-
>>>>>>>>> lib/vhost/socket.c | 63
>>>>>> ++++++++++++++++++++++++++++++++++++++++++++++---
>>>>>>>>> lib/vhost/version.map | 9 +++++++
>>>>>>>>> lib/vhost/vhost.c | 38 ++++++++++++++++++++++++++++++
>>>>>>>>> lib/vhost/vhost.h | 58 ++++++++++++++++++++++++++++++++---
>>>> ---
>>>>>> -------
>>>>>>>>> 6 files changed, 171 insertions(+), 22 deletions(-)
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> The patch looks good to me, but that's the first time we use function
>>>>>>>> versioning in Vhost library, so I'd like another pair of eyes to be
>>>>>> sure
>>>>>>>> I don't miss anything.
>>>>>>>>
>>>>>>>> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
>>>>>>>>
>>>>>>>> Thomas, do we need to mention it somewhere in the release note?
>>>>>>>
>>>>>>> If compatibility is kept, I think we don't need to mention it.
>>>>>>>
>>>>>>>
>>>>>>
>>>>>> Thanks Thomas for the information.
>>>>>>
>>>>>> Maxime
>>>>>
>>>>> About release note, except the versioning, there is also one new API
>>>> introduced
>>>>> in this patch, so we still need to mention this in release note
>>>>
>>>> Right, good catch.
>>>> Eelco, let me know what you would put, I'll add it while applying (No
>>>> need for a new revision).
>>>
>>> Btw, the vhost_lib.rst also needs a new item..
>>
>> What about the following?
>>
>> diff --git a/doc/guides/prog_guide/vhost_lib.rst b/doc/guides/prog_guide/vhost_lib.rst
>> index e8bb8c9b7b..0f964d7a4a 100644
>> --- a/doc/guides/prog_guide/vhost_lib.rst
>> +++ b/doc/guides/prog_guide/vhost_lib.rst
>> @@ -334,6 +334,11 @@ The following is an overview of some key Vhost API functions:
>> Clean DMA vChannel finished to use. After this function is called,
>> the specified DMA vChannel should no longer be used by the Vhost library.
>>
>> +* ``rte_vhost_notify_guest(int vid, uint16_t queue_id)``
>> +
>> + Inject the offloaded interrupt received by the 'guest_notify' callback,
>> + into the vhost device's queue.
>> +
>> Vhost-user Implementations
>> --------------------------
>>
>> Maxime do you want to add it, or do you want a new rev?
>
> That's fine, I just added it now :) Can you check the next-virtio main
> branch and confirm all is OK for your series?
>
> https://git.dpdk.org/next/dpdk-next-virtio/log/
>
One small spelling issue, the rest looks good:
+* **Added callback API support for interrupt handling in the vhost library.**
+
+ A new callback, guest_notify, is introduced that can be used to handle the
+ interrupt kick outside of the datapath fast path. In addition, a new API,
+ rte_vhost_notify_guest(), is added to raise the interrupt outside of the
+ past path.
Last sentence spells ‘past path’ instead of ‘fast path’.
> Thanks,
> Maxime
>
>>
>>> Thanks,
>>> Chenbo
>>>
>>>>
>>>> Thanks,
>>>> Maxime
>>>>
>>>>> Thanks,
>>>>> Chenbo
>>
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v3 4/4] vhost: add device op to offload the interrupt kick
2023-06-01 8:49 ` Eelco Chaudron
@ 2023-06-01 8:53 ` Maxime Coquelin
0 siblings, 0 replies; 35+ messages in thread
From: Maxime Coquelin @ 2023-06-01 8:53 UTC (permalink / raw)
To: Eelco Chaudron; +Cc: Xia, Chenbo, Thomas Monjalon, david.marchand, dev
On 6/1/23 10:49, Eelco Chaudron wrote:
>
>
> On 1 Jun 2023, at 10:29, Maxime Coquelin wrote:
>
>> On 6/1/23 10:15, Eelco Chaudron wrote:
>>>
>>>
>>> On 1 Jun 2023, at 4:18, Xia, Chenbo wrote:
>>>
>>>>> -----Original Message-----
>>>>> From: Maxime Coquelin <maxime.coquelin@redhat.com>
>>>>> Sent: Wednesday, May 31, 2023 5:29 PM
>>>>> To: Xia, Chenbo <chenbo.xia@intel.com>; Thomas Monjalon
>>>>> <thomas@monjalon.net>; Eelco Chaudron <echaudro@redhat.com>;
>>>>> david.marchand@redhat.com
>>>>> Cc: dev@dpdk.org
>>>>> Subject: Re: [PATCH v3 4/4] vhost: add device op to offload the interrupt
>>>>> kick
>>>>>
>>>>>
>>>>>
>>>>> On 5/31/23 08:19, Xia, Chenbo wrote:
>>>>>>> -----Original Message-----
>>>>>>> From: Maxime Coquelin <maxime.coquelin@redhat.com>
>>>>>>> Sent: Tuesday, May 30, 2023 11:17 PM
>>>>>>> To: Thomas Monjalon <thomas@monjalon.net>; Eelco Chaudron
>>>>>>> <echaudro@redhat.com>; Xia, Chenbo <chenbo.xia@intel.com>;
>>>>>>> david.marchand@redhat.com
>>>>>>> Cc: dev@dpdk.org
>>>>>>> Subject: Re: [PATCH v3 4/4] vhost: add device op to offload the
>>>>> interrupt
>>>>>>> kick
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 5/30/23 15:16, Thomas Monjalon wrote:
>>>>>>>> 30/05/2023 15:02, Maxime Coquelin:
>>>>>>>>>
>>>>>>>>> On 5/17/23 11:09, Eelco Chaudron wrote:
>>>>>>>>>> This patch adds an operation callback which gets called every time
>>>>> the
>>>>>>>>>> library wants to call eventfd_write(). This eventfd_write() call
>>>>> could
>>>>>>>>>> result in a system call, which could potentially block the PMD
>>>>> thread.
>>>>>>>>>>
>>>>>>>>>> The callback function can decide whether it's ok to handle the
>>>>>>>>>> eventfd_write() now or have the newly introduced function,
>>>>>>>>>> rte_vhost_notify_guest(), called at a later time.
>>>>>>>>>>
>>>>>>>>>> This can be used by 3rd party applications, like OVS, to avoid
>>>>> system
>>>>>>>>>> calls being called as part of the PMD threads.
>>>>>>>>>>
>>>>>>>>>> Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
>>>>>>>>>> ---
>>>>>>>>>> lib/vhost/meson.build | 2 ++
>>>>>>>>>> lib/vhost/rte_vhost.h | 23 +++++++++++++++++-
>>>>>>>>>> lib/vhost/socket.c | 63
>>>>>>> ++++++++++++++++++++++++++++++++++++++++++++++---
>>>>>>>>>> lib/vhost/version.map | 9 +++++++
>>>>>>>>>> lib/vhost/vhost.c | 38 ++++++++++++++++++++++++++++++
>>>>>>>>>> lib/vhost/vhost.h | 58 ++++++++++++++++++++++++++++++++---
>>>>> ---
>>>>>>> -------
>>>>>>>>>> 6 files changed, 171 insertions(+), 22 deletions(-)
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> The patch looks good to me, but that's the first time we use function
>>>>>>>>> versioning in Vhost library, so I'd like another pair of eyes to be
>>>>>>> sure
>>>>>>>>> I don't miss anything.
>>>>>>>>>
>>>>>>>>> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
>>>>>>>>>
>>>>>>>>> Thomas, do we need to mention it somewhere in the release note?
>>>>>>>>
>>>>>>>> If compatibility is kept, I think we don't need to mention it.
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>> Thanks Thomas for the information.
>>>>>>>
>>>>>>> Maxime
>>>>>>
>>>>>> About release note, except the versioning, there is also one new API
>>>>> introduced
>>>>>> in this patch, so we still need to mention this in release note
>>>>>
>>>>> Right, good catch.
>>>>> Eelco, let me know what you would put, I'll add it while applying (No
>>>>> need for a new revision).
>>>>
>>>> Btw, the vhost_lib.rst also needs a new item..
>>>
>>> What about the following?
>>>
>>> diff --git a/doc/guides/prog_guide/vhost_lib.rst b/doc/guides/prog_guide/vhost_lib.rst
>>> index e8bb8c9b7b..0f964d7a4a 100644
>>> --- a/doc/guides/prog_guide/vhost_lib.rst
>>> +++ b/doc/guides/prog_guide/vhost_lib.rst
>>> @@ -334,6 +334,11 @@ The following is an overview of some key Vhost API functions:
>>> Clean DMA vChannel finished to use. After this function is called,
>>> the specified DMA vChannel should no longer be used by the Vhost library.
>>>
>>> +* ``rte_vhost_notify_guest(int vid, uint16_t queue_id)``
>>> +
>>> + Inject the offloaded interrupt received by the 'guest_notify' callback,
>>> + into the vhost device's queue.
>>> +
>>> Vhost-user Implementations
>>> --------------------------
>>>
>>> Maxime do you want to add it, or do you want a new rev?
>>
>> That's fine, I just added it now :) Can you check the next-virtio main
>> branch and confirm all is OK for your series?
>>
>> https://git.dpdk.org/next/dpdk-next-virtio/log/
>>
>
> One small spelling issue, the rest looks good:
>
> +* **Added callback API support for interrupt handling in the vhost library.**
> +
> + A new callback, guest_notify, is introduced that can be used to handle the
> + interrupt kick outside of the datapath fast path. In addition, a new API,
> + rte_vhost_notify_guest(), is added to raise the interrupt outside of the
> + past path.
>
> Last sentence spells ‘past path’ instead of ‘fast path’.
I edited the line but it didn't catch my eye!
Fixing it now.
Thanks,
Maxime
>
>> Thanks,
>> Maxime
>>
>>>
>>>> Thanks,
>>>> Chenbo
>>>>
>>>>>
>>>>> Thanks,
>>>>> Maxime
>>>>>
>>>>>> Thanks,
>>>>>> Chenbo
>>>
>
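Continuing the sketch from earlier in the thread, the deferred kicks can then be raised outside of the fast path with rte_vhost_notify_guest(), as the release note text above describes. The drain thread, the polling interval and the ring it consumes are the same illustrative assumptions as before; only the rte_vhost_notify_guest() call itself comes from this series.

    /* Sketch: drain loop run from a non-PMD thread. It consumes the ring
     * filled by app_guest_notify() in the earlier sketch and performs the
     * actual notification (possibly an eventfd_write() system call) here,
     * away from the PMD fast path. */
    #include <unistd.h>

    static void *
    kick_drain_thread(void *arg)
    {
            (void)arg;
            for (;;) {
                    while (kick_tail != kick_head) {
                            struct kick_req *r =
                                    &kick_ring[kick_tail % KICK_RING_SZ];

                            rte_vhost_notify_guest(r->vid, r->qid);
                            kick_tail++;
                    }
                    usleep(100);    /* or block on a condition variable */
            }
            return NULL;
    }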
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v3 0/4] vhost: add device op to offload the interrupt kick
2023-05-17 9:08 [PATCH v3 0/4] vhost: add device op to offload the interrupt kick Eelco Chaudron
` (3 preceding siblings ...)
2023-05-17 9:09 ` [PATCH v3 4/4] vhost: add device op to offload the interrupt kick Eelco Chaudron
@ 2023-06-01 20:00 ` Maxime Coquelin
2023-06-02 6:20 ` Eelco Chaudron
4 siblings, 1 reply; 35+ messages in thread
From: Maxime Coquelin @ 2023-06-01 20:00 UTC (permalink / raw)
To: Eelco Chaudron, chenbo.xia, david.marchand; +Cc: dev
On 5/17/23 11:08, Eelco Chaudron wrote:
> This series adds an operation callback which gets called every time the
> library wants to call eventfd_write(). This eventfd_write() call could
> result in a system call, which could potentially block the PMD thread.
>
> The callback function can decide whether it's ok to handle the
> eventfd_write() now or have the newly introduced function,
> rte_vhost_notify_guest(), called at a later time.
>
> This can be used by 3rd party applications, like OVS, to avoid system
> calls being called as part of the PMD threads.
>
> v3:
> - Changed ABI compatibility code to no longer use a boolean
> to avoid having to disable specific GCC warnings.
> - Moved the fd check fix to a separate patch (patch 3/4).
> - Fixed some coding style issues.
>
> v2: - Used vhost_virtqueue->index to find index for operation.
> - Aligned function name to VDUSE RFC patchset.
> - Added error and offload statistics counter.
> - Mark new API as experimental.
> - Change the virtual queue spin lock to read/write spin lock.
> - Made shared counters atomic.
> - Add versioned rte_vhost_driver_callback_register() for
> ABI compliance.
>
> Eelco Chaudron (4):
> vhost: change vhost_virtqueue access lock to a read/write one
> vhost: make the guest_notifications statistic counter atomic
> vhost: fix invalid call FD handling
> vhost: add device op to offload the interrupt kick
>
>
> lib/eal/include/generic/rte_rwlock.h | 17 +++++
> lib/vhost/meson.build | 2 +
> lib/vhost/rte_vhost.h | 23 ++++++-
> lib/vhost/socket.c | 63 +++++++++++++++++--
> lib/vhost/version.map | 9 +++
> lib/vhost/vhost.c | 92 +++++++++++++++++++++-------
> lib/vhost/vhost.h | 69 ++++++++++++++-------
> lib/vhost/vhost_user.c | 14 ++---
> lib/vhost/virtio_net.c | 90 +++++++++++++--------------
> 9 files changed, 278 insertions(+), 101 deletions(-)
>
Applied to dpdk-next-virtio/main.
Thanks,
Maxime
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v3 0/4] vhost: add device op to offload the interrupt kick
2023-06-01 20:00 ` [PATCH v3 0/4] " Maxime Coquelin
@ 2023-06-02 6:20 ` Eelco Chaudron
0 siblings, 0 replies; 35+ messages in thread
From: Eelco Chaudron @ 2023-06-02 6:20 UTC (permalink / raw)
To: Maxime Coquelin; +Cc: chenbo.xia, david.marchand, dev
On 1 Jun 2023, at 22:00, Maxime Coquelin wrote:
> On 5/17/23 11:08, Eelco Chaudron wrote:
>> This series adds an operation callback which gets called every time the
>> library wants to call eventfd_write(). This eventfd_write() call could
>> result in a system call, which could potentially block the PMD thread.
>>
>> The callback function can decide whether it's ok to handle the
>> eventfd_write() now or have the newly introduced function,
>> rte_vhost_notify_guest(), called at a later time.
>>
>> This can be used by 3rd party applications, like OVS, to avoid system
>> calls being called as part of the PMD threads.
>>
>> v3:
>> - Changed ABI compatibility code to no longer use a boolean
>> to avoid having to disable specific GCC warnings.
>> - Moved the fd check fix to a separate patch (patch 3/4).
>> - Fixed some coding style issues.
>>
>> v2: - Used vhost_virtqueue->index to find index for operation.
>> - Aligned function name to VDUSE RFC patchset.
>> - Added error and offload statistics counter.
>> - Mark new API as experimental.
>> - Change the virtual queue spin lock to read/write spin lock.
>> - Made shared counters atomic.
>> - Add versioned rte_vhost_driver_callback_register() for
>> ABI compliance.
>>
>> Eelco Chaudron (4):
>> vhost: change vhost_virtqueue access lock to a read/write one
>> vhost: make the guest_notifications statistic counter atomic
>> vhost: fix invalid call FD handling
>> vhost: add device op to offload the interrupt kick
>>
>>
>> lib/eal/include/generic/rte_rwlock.h | 17 +++++
>> lib/vhost/meson.build | 2 +
>> lib/vhost/rte_vhost.h | 23 ++++++-
>> lib/vhost/socket.c | 63 +++++++++++++++++--
>> lib/vhost/version.map | 9 +++
>> lib/vhost/vhost.c | 92 +++++++++++++++++++++-------
>> lib/vhost/vhost.h | 69 ++++++++++++++-------
>> lib/vhost/vhost_user.c | 14 ++---
>> lib/vhost/virtio_net.c | 90 +++++++++++++--------------
>> 9 files changed, 278 insertions(+), 101 deletions(-)
>>
>
>
> Applied to dpdk-next-virtio/main.
Thanks Maxime!
//Eelco
^ permalink raw reply [flat|nested] 35+ messages in thread
end of thread, other threads:[~2023-06-02 6:20 UTC | newest]
Thread overview: 35+ messages
2023-05-17 9:08 [PATCH v3 0/4] vhost: add device op to offload the interrupt kick Eelco Chaudron
2023-05-17 9:08 ` [PATCH v3 1/4] vhost: change vhost_virtqueue access lock to a read/write one Eelco Chaudron
2023-05-17 17:33 ` Maxime Coquelin
2023-05-18 14:46 ` Eelco Chaudron
2023-05-31 6:37 ` Xia, Chenbo
2023-05-31 9:27 ` Maxime Coquelin
2023-05-31 11:13 ` Eelco Chaudron
2023-06-01 1:45 ` Xia, Chenbo
2023-05-17 9:08 ` [PATCH v3 2/4] vhost: make the guest_notifications statistic counter atomic Eelco Chaudron
2023-05-30 12:52 ` Maxime Coquelin
2023-05-31 7:03 ` Xia, Chenbo
2023-05-17 9:09 ` [PATCH v3 3/4] vhost: fix invalid call FD handling Eelco Chaudron
2023-05-30 12:54 ` Maxime Coquelin
2023-05-31 6:12 ` Xia, Chenbo
2023-05-31 9:30 ` Maxime Coquelin
2023-05-17 9:09 ` [PATCH v3 4/4] vhost: add device op to offload the interrupt kick Eelco Chaudron
2023-05-30 13:02 ` Maxime Coquelin
2023-05-30 13:16 ` Thomas Monjalon
2023-05-30 15:16 ` Maxime Coquelin
2023-05-31 6:19 ` Xia, Chenbo
2023-05-31 9:29 ` Maxime Coquelin
2023-05-31 11:21 ` Eelco Chaudron
2023-06-01 2:18 ` Xia, Chenbo
2023-06-01 8:15 ` Eelco Chaudron
2023-06-01 8:29 ` Maxime Coquelin
2023-06-01 8:49 ` Eelco Chaudron
2023-06-01 8:53 ` Maxime Coquelin
2023-05-31 11:49 ` David Marchand
2023-05-31 12:01 ` David Marchand
2023-05-31 12:48 ` Maxime Coquelin
2023-05-31 13:13 ` Eelco Chaudron
2023-05-31 14:12 ` David Marchand
2023-05-31 14:18 ` Maxime Coquelin
2023-06-01 20:00 ` [PATCH v3 0/4] " Maxime Coquelin
2023-06-02 6:20 ` Eelco Chaudron