DPDK patches and discussions
From: Maxime Coquelin <maxime.coquelin@redhat.com>
To: Yuan Wang <yuanx.wang@intel.com>, chenbo.xia@intel.com
Cc: dev@dpdk.org, jiayu.hu@intel.com, xuan.ding@intel.com,
	wenwux.ma@intel.com, weix.ling@intel.com
Subject: Re: [PATCH] vhost: fix data-plane access to released vq
Date: Wed, 26 Jan 2022 15:02:37 +0100	[thread overview]
Message-ID: <63fdcab8-d692-c8c6-240d-a87b01ed1778@redhat.com> (raw)
In-Reply-To: <20211203163400.164545-1-yuanx.wang@intel.com>

Hi Yuan,

On 12/3/21 17:34, Yuan Wang wrote:
> From: Yuan Wang <yuanx.wang@intel.com>
> 
> When NUMA reallocation occurs, numa_realloc() on the control
> plane will free the old vq. If rte_vhost_dequeue_burst() on
> the data plane gets the vq just before it is released, it
> will access the freed vq. We need to move vq->access_lock
> into struct virtio_net so that this situation cannot happen.
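
If I understand the scenario correctly, the interleaving you have in
mind looks roughly like the sketch below (simplified, not the exact
code paths):

/*
 *  data plane                        control plane
 *  (rte_vhost_dequeue_burst)         (numa_realloc)
 *  --------------------------------  ----------------------------------
 *  vq = dev->virtqueue[queue_id];
 *                                    new = rte_realloc_socket(vq,
 *                                              sizeof(*vq), 0, node);
 *                                    ... old vq possibly freed here ...
 *                                    dev->virtqueue[index] = new;
 *  rte_spinlock_lock(&vq->access_lock);
 *  ... vq may point to freed memory,
 *      including the lock itself ...
 */

So moving the lock out of struct vhost_virtqueue is your way of keeping
the lock itself valid across the reallocation, if I read the patch
correctly.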


This patch is a fix, so the Fixes tag would be needed.

But are you really facing this issue, or is this just based on code
review?

Currently, NUMA reallocation is performed whenever
translate_ring_addresses() is called.

translate_ring_addresses() is primarily called at device initialization,
before the .new_device() callback is called. At that stage, there is no
risk of performing a NUMA reallocation, as the application is not
expected to use APIs requiring vq->access_lock acquisition.

But I agree it is possible that numa_realloc() gets called while the
device is in running state. Even if that happened, I don't think it is
possible that numa_realloc() ends up reallocating the virtqueue on a
different NUMA node (the vring should not have moved from a physical
memory standpoint). And even if it did happen, we should be safe
because we ensure the VQ was not ready (so not usable by the
application) before proceeding with the reallocation:

static struct virtio_net*
numa_realloc(struct virtio_net *dev, int index)
{
	int node, dev_node;
	struct virtio_net *old_dev;
	struct vhost_virtqueue *vq;
	struct batch_copy_elem *bce;
	struct guest_page *gp;
	struct rte_vhost_memory *mem;
	size_t mem_size;
	int ret;

	old_dev = dev;
	vq = dev->virtqueue[index];

	/*
	 * If VQ is ready, it is too late to reallocate, it certainly already
	 * happened anyway on VHOST_USER_SET_VRING_ADDR.
	 */
	if (vq->ready)
		return dev;

So, if this is fixing a real issue, I would need more details about it
in order to understand why vq->ready was not set when it should have
been.

On a side note, while trying to understand how you could face this
issue, I noticed that translate_ring_addresses() may be called from
vhost_user_iotlb_msg(). In that case, vq->access_lock is not held, as
this is the handler for VHOST_USER_IOTLB_MSG. We may want to protect
the translate_ring_addresses() calls by taking the VQ access locks. I
will post a fix for it.
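
For reference, the change I have in mind would look roughly like the
sketch below, against the VHOST_IOTLB_UPDATE handling in
vhost_user_iotlb_msg() (untested, and the actual fix may end up
different, e.g. it also has to cope with dev/vq being reallocated by
translate_ring_addresses()):

	case VHOST_IOTLB_UPDATE:
		...
		for (i = 0; i < dev->nr_vring; i++) {
			struct vhost_virtqueue *vq = dev->virtqueue[i];

			if (!vq)
				continue;

			if (is_vring_iotlb(dev, vq, imsg)) {
				/* Keep the data path out while the ring
				 * addresses are (re)translated. */
				rte_spinlock_lock(&vq->access_lock);
				*pdev = dev = translate_ring_addresses(dev, i);
				/* dev/vq may have been reallocated above */
				vq = dev->virtqueue[i];
				rte_spinlock_unlock(&vq->access_lock);
			}
		}
		break;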

> Signed-off-by: Yuan Wang <yuanx.wang@intel.com>
> ---
>   lib/vhost/vhost.c      | 26 +++++++++++++-------------
>   lib/vhost/vhost.h      |  4 +---
>   lib/vhost/vhost_user.c |  4 ++--
>   lib/vhost/virtio_net.c | 16 ++++++++--------
>   4 files changed, 24 insertions(+), 26 deletions(-)
> 

...

> diff --git a/lib/vhost/vhost.h b/lib/vhost/vhost.h
> index 7085e0885c..f85ce4fda5 100644
> --- a/lib/vhost/vhost.h
> +++ b/lib/vhost/vhost.h
> @@ -185,9 +185,6 @@ struct vhost_virtqueue {
>   	bool			access_ok;
>   	bool			ready;
>   
> -	rte_spinlock_t		access_lock;
> -
> -
>   	union {
>   		struct vring_used_elem  *shadow_used_split;
>   		struct vring_used_elem_packed *shadow_used_packed;
> @@ -384,6 +381,7 @@ struct virtio_net {
>   	int			extbuf;
>   	int			linearbuf;
>   	struct vhost_virtqueue	*virtqueue[VHOST_MAX_QUEUE_PAIRS * 2];
> +	rte_spinlock_t		vq_access_lock[VHOST_MAX_QUEUE_PAIRS * 2];

The problem here is that you'll be introducing false sharing, so I
expect performance to no longer scale with the number of queues.

It also consumes unnecessary memory.
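
To illustrate the false-sharing point: with the array above, several
4-byte spinlocks belonging to different queues get packed into the same
cache line, so cores serving independent queues keep bouncing that line
between them. If the lock really had to live outside the virtqueue,
each entry would at least need to be padded to a cache line, something
like the hypothetical layout below, which makes the memory overhead
even worse:

	/* Hypothetical, only to illustrate the padding that would be
	 * needed to avoid false sharing between per-queue locks. */
	struct vhost_vq_access_lock {
		rte_spinlock_t lock;
	} __rte_cache_aligned;

	struct virtio_net {
		...
		struct vhost_vq_access_lock vq_access_lock[VHOST_MAX_QUEUE_PAIRS * 2];
		...
	};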

>   	struct inflight_mem_info *inflight_info;
>   #define IF_NAME_SZ (PATH_MAX > IFNAMSIZ ? PATH_MAX : IFNAMSIZ)
>   	char			ifname[IF_NAME_SZ];

Thanks,
Maxime



Thread overview: 5+ messages
2021-12-03 16:34 Yuan Wang
2022-01-26 14:02 ` Maxime Coquelin [this message]
2022-01-27 10:30   ` Wang, YuanX
2022-01-27 10:46     ` Maxime Coquelin
2022-01-29  9:26       ` Wang, YuanX
