From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mga11.intel.com (mga11.intel.com [192.55.52.93]) by dpdk.org (Postfix) with ESMTP id 7A2B5293B; Thu, 7 Dec 2017 10:33:31 +0100 (CET) Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by fmsmga102.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 07 Dec 2017 01:33:30 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.45,372,1508828400"; d="scan'208";a="1252755105" Received: from fmsmsx106.amr.corp.intel.com ([10.18.124.204]) by fmsmga002.fm.intel.com with ESMTP; 07 Dec 2017 01:33:30 -0800 Received: from shsmsx151.ccr.corp.intel.com (10.239.6.50) by FMSMSX106.amr.corp.intel.com (10.18.124.204) with Microsoft SMTP Server (TLS) id 14.3.319.2; Thu, 7 Dec 2017 01:33:30 -0800 Received: from shsmsx103.ccr.corp.intel.com ([169.254.4.213]) by SHSMSX151.ccr.corp.intel.com ([169.254.3.218]) with mapi id 14.03.0319.002; Thu, 7 Dec 2017 17:33:28 +0800 From: "Tan, Jianfeng" To: Victor Kaplansky , "dev@dpdk.org" , "yliu@fridaylinux.org" , "Bie, Tiwei" CC: "stable@dpdk.org" , "jfreiman@redhat.com" , Maxime Coquelin Thread-Topic: [PATCH] vhost_user: protect active rings from async ring changes Thread-Index: AQHTbpnxJQ34Q0Trw06BgnNCg0sNi6M3lqyw Date: Thu, 7 Dec 2017 09:33:27 +0000 Message-ID: References: <20171206135329.652-1-vkaplans@redhat.com> In-Reply-To: <20171206135329.652-1-vkaplans@redhat.com> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-originating-ip: [10.239.127.40] Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Subject: Re: [dpdk-dev] [PATCH] vhost_user: protect active rings from async ring changes X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 07 Dec 2017 09:33:32 -0000 > -----Original Message----- > From: Victor Kaplansky [mailto:vkaplans@redhat.com] > Sent: Wednesday, December 6, 2017 9:56 PM > To: dev@dpdk.org; yliu@fridaylinux.org; Bie, Tiwei; Tan, Jianfeng; > vkaplans@redhat.com > Cc: stable@dpdk.org; jfreiman@redhat.com; Maxime Coquelin > Subject: [PATCH] vhost_user: protect active rings from async ring changes >=20 > When performing live migration or memory hot-plugging, > the changes to the device and vrings made by message handler > done independently from vring usage by PMD threads. >=20 > This causes for example segfauls during live-migration segfauls ->segfaults? > with MQ enable, but in general virtually any request > sent by qemu changing the state of device can cause > problems. >=20 > These patches fixes all above issues by adding a spinlock > to every vring and requiring message handler to start operation > only after ensuring that all PMD threads related to the divece Another typo: divece. > are out of critical section accessing the vring data. >=20 > Each vring has its own lock in order to not create contention > between PMD threads of different vrings and to prevent > performance degradation by scaling queue pair number. Also wonder how much overhead it brings. Instead of locking each vring, can we just, waiting a while (10us for examp= le) after call destroy_device() callback so that every PMD thread has enoug= h time to skip out the criterial area? > --- > lib/librte_vhost/vhost.h | 1 + > lib/librte_vhost/vhost_user.c | 44 > +++++++++++++++++++++++++++++++++++++++++++ > lib/librte_vhost/virtio_net.c | 8 ++++++++ > 3 files changed, 53 insertions(+) >=20 > diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h > index 1cc81c17..812aeccd 100644 > --- a/lib/librte_vhost/vhost.h > +++ b/lib/librte_vhost/vhost.h > @@ -137,6 +137,7 @@ struct vhost_virtqueue { > TAILQ_HEAD(, vhost_iotlb_entry) iotlb_list; > int iotlb_cache_nr; > TAILQ_HEAD(, vhost_iotlb_entry) iotlb_pending_list; > + rte_spinlock_t active_lock; > } __rte_cache_aligned; >=20 > /* Old kernels have no such macros defined */ > diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.= c > index f4c7ce46..02a3a7d3 100644 > --- a/lib/librte_vhost/vhost_user.c > +++ b/lib/librte_vhost/vhost_user.c > @@ -1190,6 +1190,46 @@ vhost_user_check_and_alloc_queue_pair(struct > virtio_net *dev, VhostUserMsg *msg) > return alloc_vring_queue(dev, vring_idx); > } >=20 > +static void > +vhost_user_lock_all_queue_pairs(struct virtio_net *dev) > +{ > + unsigned int i =3D 0; > + unsigned int vq_num =3D 0; > + > + while (vq_num < dev->nr_vring) { > + struct vhost_virtqueue *vq =3D dev->virtqueue[i]; > + > + if (!vq) { > + i++; > + continue; > + } > + > + rte_spinlock_lock(&vq->active_lock); > + vq_num++; > + i++; > + } > +} > + > +static void > +vhost_user_unlock_all_queue_pairs(struct virtio_net *dev) > +{ > + unsigned int i =3D 0; > + unsigned int vq_num =3D 0; > + > + while (vq_num < dev->nr_vring) { > + struct vhost_virtqueue *vq =3D dev->virtqueue[i]; > + > + if (!vq) { > + i++; > + continue; > + } > + > + rte_spinlock_unlock(&vq->active_lock); > + vq_num++; > + i++; > + } > +} > + > int > vhost_user_msg_handler(int vid, int fd) > { > @@ -1241,6 +1281,8 @@ vhost_user_msg_handler(int vid, int fd) > return -1; > } >=20 > + vhost_user_lock_all_queue_pairs(dev); > + > switch (msg.request.master) { > case VHOST_USER_GET_FEATURES: > msg.payload.u64 =3D vhost_user_get_features(dev); > @@ -1342,6 +1384,8 @@ vhost_user_msg_handler(int vid, int fd) >=20 > } >=20 > + vhost_user_unlock_all_queue_pairs(dev); > + > if (msg.flags & VHOST_USER_NEED_REPLY) { > msg.payload.u64 =3D !!ret; > msg.size =3D sizeof(msg.payload.u64); > diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.= c > index 6fee16e5..de7e38bb 100644 > --- a/lib/librte_vhost/virtio_net.c > +++ b/lib/librte_vhost/virtio_net.c > @@ -44,6 +44,7 @@ > #include > #include > #include > +#include >=20 > #include "iotlb.h" > #include "vhost.h" > @@ -1180,9 +1181,12 @@ rte_vhost_dequeue_burst(int vid, uint16_t > queue_id, > } >=20 > vq =3D dev->virtqueue[queue_id]; > + Remove above blank line. > if (unlikely(vq->enabled =3D=3D 0)) > return 0; >=20 > + rte_spinlock_lock(&vq->active_lock); > + > vq->batch_copy_nb_elems =3D 0; >=20 > if (dev->features & (1ULL << VIRTIO_F_IOMMU_PLATFORM)) > @@ -1240,6 +1244,7 @@ rte_vhost_dequeue_burst(int vid, uint16_t > queue_id, > if (rarp_mbuf =3D=3D NULL) { > RTE_LOG(ERR, VHOST_DATA, > "Failed to allocate memory for mbuf.\n"); > + rte_spinlock_unlock(&vq->active_lock); > return 0; > } >=20 > @@ -1356,6 +1361,8 @@ rte_vhost_dequeue_burst(int vid, uint16_t > queue_id, > if (dev->features & (1ULL << VIRTIO_F_IOMMU_PLATFORM)) > vhost_user_iotlb_rd_unlock(vq); >=20 > + rte_spinlock_unlock(&vq->active_lock); > + > if (unlikely(rarp_mbuf !=3D NULL)) { > /* > * Inject it to the head of "pkts" array, so that switch's mac > @@ -1366,5 +1373,6 @@ rte_vhost_dequeue_burst(int vid, uint16_t > queue_id, > i +=3D 1; > } >=20 > + Remove above blank line. > return i; > } > -- > Victor