From: "Wang, YuanX" <yuanx.wang@intel.com>
To: Maxime Coquelin <maxime.coquelin@redhat.com>,
"Xia, Chenbo" <chenbo.xia@intel.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>, "Hu, Jiayu" <jiayu.hu@intel.com>,
"Ding, Xuan" <xuan.ding@intel.com>,
"Ma, WenwuX" <wenwux.ma@intel.com>,
"Ling, WeiX" <weix.ling@intel.com>
Subject: RE: [PATCH] vhost: fix data-plane access to released vq
Date: Thu, 27 Jan 2022 10:30:42 +0000 [thread overview]
Message-ID: <CO1PR11MB48971FCDCC2936AE9806569A85219@CO1PR11MB4897.namprd11.prod.outlook.com> (raw)
In-Reply-To: <63fdcab8-d692-c8c6-240d-a87b01ed1778@redhat.com>
Hi Maxime,
> -----Original Message-----
> From: Maxime Coquelin <maxime.coquelin@redhat.com>
> Sent: Wednesday, January 26, 2022 10:03 PM
> To: Wang, YuanX <yuanx.wang@intel.com>; Xia, Chenbo
> <chenbo.xia@intel.com>
> Cc: dev@dpdk.org; Hu, Jiayu <jiayu.hu@intel.com>; Ding, Xuan
> <xuan.ding@intel.com>; Ma, WenwuX <wenwux.ma@intel.com>; Ling,
> WeiX <weix.ling@intel.com>
> Subject: Re: [PATCH] vhost: fix data-plane access to released vq
>
> Hi Yuan,
>
> On 12/3/21 17:34, Yuan Wang wrote:
> > From: yuan wang <yuanx.wang@intel.com>
> >
> > When numa reallocation occurs, numa_realoc() on the control plane will
> > free the old vq. If rte_vhost_dequeue_burst() on the data plane get
> > the vq just before release, then it will access the released vq. We
> > need to put the
> > vq->access_lock into struct virtio_net to ensure that it
> > can prevents this situation.
>
>
> This patch is a fix, so the Fixes tag would be needed.
>
> But are you really facing this issue, or this is just based on code review?
This issue is run-time checked with AddressSanitizer which can be turned on by:
meson configure -Db_sanitize=address <build_dir>
>
> Currently NUMA reallocation is called whenever
> translate_ring_addresses() is called.
>
> translate_ring_addresses() is primarly called at device initialization, before
> the .new_device() callback is called. At that stage, there is no risk to
> performa NUMA reallocation as the application is not expected to use APIs
> requiring vq->access_lock acquisition.
>
> But I agree there are possibilities that numa_realloc() gets called while device
> is in running state. But even if that happened, I don't think it is possible that
> numa_realloc() ends-up reallocating the virtqueue on a different NUMA
> node (the vring should not have moved from a physical memory standpoint).
> And if even it happened, we should be safe because we ensure the VQ was
> not ready (so not usable by the
> application) before proceeding with reallocation:
Here is a scenario where VQ ready has not been set:
1. run the testpmd and then start the data plane process.
2. run the front-end.
3. new_device() gets called when the first two queues are ready, even if the later queues are not.
4. when processing messages from the later queues, it may go to numa_realloc(), the ready flag has not been set and therefore can be reallocated.
If all the queues are ready before call new_deivce(), this issue does not occur.
I think maybe it is another solution.
Thanks,
Yuan
>
> static struct virtio_net*
> numa_realloc(struct virtio_net *dev, int index) {
> int node, dev_node;
> struct virtio_net *old_dev;
> struct vhost_virtqueue *vq;
> struct batch_copy_elem *bce;
> struct guest_page *gp;
> struct rte_vhost_memory *mem;
> size_t mem_size;
> int ret;
>
> old_dev = dev;
> vq = dev->virtqueue[index];
>
> /*
> * If VQ is ready, it is too late to reallocate, it certainly already
> * happened anyway on VHOST_USER_SET_VRING_ADRR.
> */
> if (vq->ready)
> return dev;
>
> So, if this is fixing a real issue, I would need more details on the issue in order
> to understand why vq->ready was not set when it should have been.
>
> On a side note, while trying to understand how you could face an issue, I
> noticed that translate_ring_addresses() may be called by
> vhost_user_iotlb_msg(). In that case, vq->access_lock is not held as this is
> the handler for VHOST_USER_IOTLB_MSG. We may want to protect
> translate_ring_addresses() calls with locking the VQ locks. I will post a fix for
> it.
>
> > Signed-off-by: Yuan Wang <yuanx.wang@intel.com>
> > ---
> > lib/vhost/vhost.c | 26 +++++++++++++-------------
> > lib/vhost/vhost.h | 4 +---
> > lib/vhost/vhost_user.c | 4 ++--
> > lib/vhost/virtio_net.c | 16 ++++++++--------
> > 4 files changed, 24 insertions(+), 26 deletions(-)
> >
>
> ...
>
> > diff --git a/lib/vhost/vhost.h b/lib/vhost/vhost.h index
> > 7085e0885c..f85ce4fda5 100644
> > --- a/lib/vhost/vhost.h
> > +++ b/lib/vhost/vhost.h
> > @@ -185,9 +185,6 @@ struct vhost_virtqueue {
> > bool access_ok;
> > bool ready;
> >
> > - rte_spinlock_t access_lock;
> > -
> > -
> > union {
> > struct vring_used_elem *shadow_used_split;
> > struct vring_used_elem_packed *shadow_used_packed;
> @@ -384,6
> > +381,7 @@ struct virtio_net {
> > int extbuf;
> > int linearbuf;
> > struct vhost_virtqueue *virtqueue[VHOST_MAX_QUEUE_PAIRS * 2];
> > + rte_spinlock_t vq_access_lock[VHOST_MAX_QUEUE_PAIRS
> * 2];
>
> The problem here is that you'll be introducing false sharing, so I expect
> performance to no more scale with the number of queues.
>
> It also consumes unnecessary memory.
>
> > struct inflight_mem_info *inflight_info;
> > #define IF_NAME_SZ (PATH_MAX > IFNAMSIZ ? PATH_MAX : IFNAMSIZ)
> > char ifname[IF_NAME_SZ];
>
> Thanks,
> Maxime
next prev parent reply other threads:[~2022-01-27 10:30 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-12-03 16:34 Yuan Wang
2022-01-26 14:02 ` Maxime Coquelin
2022-01-27 10:30 ` Wang, YuanX [this message]
2022-01-27 10:46 ` Maxime Coquelin
2022-01-29 9:26 ` Wang, YuanX
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CO1PR11MB48971FCDCC2936AE9806569A85219@CO1PR11MB4897.namprd11.prod.outlook.com \
--to=yuanx.wang@intel.com \
--cc=chenbo.xia@intel.com \
--cc=dev@dpdk.org \
--cc=jiayu.hu@intel.com \
--cc=maxime.coquelin@redhat.com \
--cc=weix.ling@intel.com \
--cc=wenwux.ma@intel.com \
--cc=xuan.ding@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).