Date: Wed, 20 Dec 2017 11:06:16 -0800
From: Stephen Hemminger
To: Victor Kaplansky
Cc: dev@dpdk.org, stable@dpdk.org, Jens Freimann, Maxime Coquelin,
 Yuanhan Liu, Tiwei Bie, "Tan, Jianfeng"
Message-ID: <20171220110616.21301e11@xeon-e3>
In-Reply-To: <20171220163752-mutt-send-email-victork@redhat.com>
References: <20171220163752-mutt-send-email-victork@redhat.com>
Subject: Re: [dpdk-dev] [PATCH v4] vhost_user: protect active rings from async ring changes

On Wed, 20 Dec 2017 16:37:52 +0200
Victor Kaplansky wrote:

> When performing live migration or memory hot-plugging,
> the changes to the device and vrings made by the message handler
> are done independently of vring usage by the PMD threads.
>
> This causes, for example, segfaults during live migration
> with MQ enabled, but in general virtually any request
> sent by QEMU that changes the state of the device can cause
> problems.
>
> These patches fix all of the above issues by adding a spinlock
> to every vring and requiring the message handler to start operating
> only after ensuring that all PMD threads related to the device
> are out of the critical section accessing the vring data.
>
> Each vring has its own lock in order to avoid contention
> between PMD threads of different vrings and to prevent
> performance degradation when scaling the number of queue pairs.
>
> See https://bugzilla.redhat.com/show_bug.cgi?id=1450680
>
> Signed-off-by: Victor Kaplansky
> ---
>
> v4:
>
> o moved access_unlock before accessing the enable flag and
>   access_unlock after iommu_unlock consistently.
> o cosmetics: removed blank line.
> o the access_lock variable moved to be in the same
>   cache line as the enabled and access_ok flags.
> o the dequeue path is now guarded with a trylock and returns
>   zero if unsuccessful.
> o the GET_VRING_BASE operation is not guarded by the access lock,
>   to avoid a deadlock with device_destroy. See the comment
>   in the code.
> o Fixed the error-path exits from enqueue and dequeue, carefully
>   unlocking the access and IOMMU locks as appropriate.
>
> v3:
> o Added locking to the enqueue flow.
> o The enqueue path is guarded as well as the dequeue path.
> o Changed the name of active_lock.
> o Added initialization of the guarding spinlock.
> o Reworked the functions that iterate over all virt-queues.
> o Performance measurements done by Maxime Coquelin show
>   no degradation in bandwidth or throughput.
> o Spelling.
> o Taking the lock only on set operations.
> o IOMMU messages are not guarded by the access lock.
>
> v2:
> o Fixed checkpatch complaints.
> o Added Signed-off-by.
> o Refined placement of the guard to exclude IOMMU messages.
> o TODO: performance degradation measurement.
>
>  lib/librte_vhost/vhost.h      | 25 +++++++++++++--
>  lib/librte_vhost/vhost.c      |  1 +
>  lib/librte_vhost/vhost_user.c | 71 +++++++++++++++++++++++++++++++++++++++++++
>  lib/librte_vhost/virtio_net.c | 28 ++++++++++++++---
>  4 files changed, 119 insertions(+), 6 deletions(-)
>
> diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h
> index 1cc81c17..f3e43e95 100644
> --- a/lib/librte_vhost/vhost.h
> +++ b/lib/librte_vhost/vhost.h
> @@ -108,12 +108,14 @@ struct vhost_virtqueue {
>
>  	/* Backend value to determine if device should started/stopped */
>  	int backend;
> +	int enabled;
> +	int access_ok;
> +	rte_spinlock_t access_lock;

Maybe these ints should be bool?

> +
>  	/* Used to notify the guest (trigger interrupt) */
>  	int callfd;
>  	/* Currently unused as polling mode is enabled */
>  	int kickfd;
> -	int enabled;
> -	int access_ok;
>
>  	/* Physical address of used ring, for logging */
>  	uint64_t log_guest_addr;
> @@ -302,6 +304,25 @@ vhost_log_used_vring(struct virtio_net *dev, struct vhost_virtqueue *vq,
>  	vhost_log_write(dev, vq->log_guest_addr + offset, len);
>  }
>
> +static __rte_always_inline int
> +vhost_user_access_trylock(struct vhost_virtqueue *vq)
> +{
> +	return rte_spinlock_trylock(&vq->access_lock);
> +}
> +
> +static __rte_always_inline void
> +vhost_user_access_lock(struct vhost_virtqueue *vq)
> +{
> +	rte_spinlock_lock(&vq->access_lock);
> +}
> +
> +static __rte_always_inline void
> +vhost_user_access_unlock(struct vhost_virtqueue *vq)
> +{
> +	rte_spinlock_unlock(&vq->access_lock);
> +}
> +
> +

Wrapping locking in inline functions adds nothing and makes life harder
for static analysis tools.

The bigger problem is that taking a lock on every enqueue/dequeue can
have a visible performance impact. Did you measure that?

Could you invent an RCU-ish mechanism using compiler barriers?
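
Just to sketch the kind of thing I mean (untested and purely
illustrative; the vring_guard struct and the in_use/handler_pending
fields are invented for this example, and it assumes the usual model of
a single PMD thread per vring). The datapath only flips a per-vring
flag with ordered stores, and the message handler waits for that flag
to drop; on x86 the rte_smp_* barriers reduce to compiler barriers, so
the fast path stays cheap:

#include <rte_atomic.h>
#include <rte_branch_prediction.h>
#include <rte_pause.h>

/* Hypothetical per-vring guard, not part of the patch. */
struct vring_guard {
	volatile int in_use;          /* written only by the PMD thread */
	volatile int handler_pending; /* written only by the msg handler */
};

/* Datapath side: no lock, just ordered flag updates. */
static inline int
vring_enter(struct vring_guard *g)
{
	if (unlikely(g->handler_pending))
		return 0;             /* handler wants the ring, skip this poll */
	g->in_use = 1;
	rte_smp_mb();                 /* publish in_use before touching the ring */
	if (unlikely(g->handler_pending)) {
		g->in_use = 0;        /* lost the race, back off and retry later */
		return 0;
	}
	return 1;
}

static inline void
vring_exit(struct vring_guard *g)
{
	rte_smp_wmb();                /* ring updates visible before release */
	g->in_use = 0;
}

/* Message-handler side: ask the PMD thread to drain, then wait. */
static inline void
vring_handler_begin(struct vring_guard *g)
{
	g->handler_pending = 1;
	rte_smp_mb();
	while (g->in_use)
		rte_pause();
}

static inline void
vring_handler_end(struct vring_guard *g)
{
	rte_smp_wmb();                /* handler's changes visible before reopening */
	g->handler_pending = 0;
}

The enqueue/dequeue paths would then bracket their ring access with
vring_enter()/vring_exit() and simply report zero packets when
vring_enter() fails, much like the trylock path in v4 does, but without
taking a lock on every burst.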