From: Maxime Coquelin
To: Tiwei Bie
Cc: dev@dpdk.org, jianfeng.tan@intel.com, mst@redhat.com, stable@dpdk.org
Date: Fri, 4 May 2018 17:48:05 +0200
Message-ID: <6125f044-d557-666a-8228-4930ead16089@redhat.com>
In-Reply-To: <20180503115634.feaimkzpnbodferd@debian>
Subject: Re: [dpdk-dev] [PATCH] vhost: improve dirty pages logging performance

Hi
Tiwei,

On 05/03/2018 01:56 PM, Tiwei Bie wrote:
> On Mon, Apr 30, 2018 at 05:59:54PM +0200, Maxime Coquelin wrote:
>> This patch caches all dirty pages logging until the used ring index
>> is updated. These dirty pages won't be accessed by the guest as
>> long as the host doesn't give them back to it by updating the
>> index.
>
> The sentence below from the above commit message isn't the reason why
> we can cache the dirty page logging, right?
>
> """
> These dirty pages won't be accessed by the guest as
> long as the host doesn't give them back to it by updating the
> index.
> """
>
>>
>> The goal of this optimization is to fix a performance regression
>> introduced when the vhost library started to use atomic operations
>> to set bits in the shared dirty log map. While the fix was valid,
>> as the previous implementation wasn't safe against concurrent
>> accesses, it induced contention.
>>
>> With this patch, during migration, we have:
>> 1. Fewer atomic operations, as only a single atomic OR operation
>> is needed per 32 pages.
>
> Why not do it per 64 pages?

I wasn't sure it would be supported on 32-bit CPUs, but [0] seems to
indicate that only 128-bit atomic operations aren't supported on all
architectures. I will change it to work per 64 pages.

[0]: https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html#g_t_005f_005fatomic-Builtins

>> 2. Fewer atomic operations, as during a burst, the same page will
>> be marked dirty only once.
>> 3. Fewer write memory barriers.
>>
>> Fixes: 897f13a1f726 ("vhost: make page logging atomic")
>>
>> Cc: stable@dpdk.org
>>
>> Suggested-by: Michael S. Tsirkin
>> Signed-off-by: Maxime Coquelin
>> ---
>>
>> Hi,
>>
>> This series was tested with migrating a guest while running PVP
>> benchmark at 1Mpps with both ovs-dpdk and testpmd as vswitch.
>
> If the throughput is higher (e.g. by adding more cores
> and queues), will the live migration fail due to the
> higher dirty page generating speed?
I haven't checked, but I agree that the higher the throughput, the
longer the migration will take, and the higher the risk that it never
converges. For this kind of scenario, postcopy live migration may be a
better fit, as the hot pages will be copied only once. Postcopy support
for vhost-user has been added to QEMU in v2.12, and I have a prototype
for DPDK that I plan to propose for the next release.

>
>>
>> With this patch we recover the packet drop regressions seen since
>> the use of atomic operations to log dirty pages.
> [...]
>>
>> +static __rte_always_inline void
>> +vhost_log_cache_sync(struct virtio_net *dev, struct vhost_virtqueue *vq)
>> +{
>> +	uint32_t *log_base;
>> +	int i;
>> +
>> +	if (likely(((dev->features & (1ULL << VHOST_F_LOG_ALL)) == 0) ||
>> +		   !dev->log_base))
>> +		return;
>> +
>> +	log_base = (uint32_t *)(uintptr_t)dev->log_base;
>> +
>> +	/* To make sure guest memory updates are committed before logging */
>> +	rte_smp_wmb();
>
> It seems that __sync_fetch_and_or() can be considered a full
> barrier [1]. So do we really need this rte_smp_wmb()?

That's a good point, thanks for the pointer.

> Besides, based on the same doc [1], it seems that the __sync_
> version is deprecated in favor of the __atomic_ one.

I will change to __atomic_. For the memory model, do you agree that I
should use __ATOMIC_SEQ_CST?

> [1] https://gcc.gnu.org/onlinedocs/gcc/_005f_005fsync-Builtins.html
>
>> +
>> +	for (i = 0; i < vq->log_cache_nb_elem; i++) {
>> +		struct log_cache_entry *elem = vq->log_cache + i;
>> +
>> +		__sync_fetch_and_or(log_base + elem->offset, elem->val);
>> +	}
>> +
>> +	vq->log_cache_nb_elem = 0;
>> +}
>> +
> [...]

Thanks,
Maxime
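For anyone following along, the caching scheme being discussed can be
sketched as standalone C roughly like this. The names (log_cache,
log_cache_page, log_cache_sync) are hypothetical stand-ins for the
vhost structures in the actual patch, and this sketch uses 32-bit words
as in the posted version (the real code would also flush when the cache
fills up, which is omitted here):

```c
#include <assert.h>
#include <stdint.h>

#define LOG_CACHE_NB_ELEM_MAX 64

/* One cache entry: a 32-bit word of the dirty log plus its offset. */
struct log_cache_entry {
	uint32_t offset;	/* word index in the shared log map */
	uint32_t val;		/* dirty bits accumulated for that word */
};

struct log_cache {
	struct log_cache_entry elem[LOG_CACHE_NB_ELEM_MAX];
	int nb_elem;
};

/* Mark one page dirty in the private cache; no shared-memory access,
 * so no atomic operation is needed here. */
static void
log_cache_page(struct log_cache *c, uint64_t page)
{
	uint32_t offset = page / 32;
	int i;

	/* During a burst the same word is typically hit repeatedly;
	 * merging the bits here is what removes the redundant atomics. */
	for (i = 0; i < c->nb_elem; i++) {
		if (c->elem[i].offset == offset) {
			c->elem[i].val |= 1u << (page % 32);
			return;
		}
	}

	c->elem[c->nb_elem].offset = offset;
	c->elem[c->nb_elem].val = 1u << (page % 32);
	c->nb_elem++;
}

/* Flush the cache into the shared log map: a single atomic OR per
 * touched word, i.e. per 32 pages. */
static void
log_cache_sync(struct log_cache *c, uint32_t *log_base)
{
	int i;

	for (i = 0; i < c->nb_elem; i++)
		__sync_fetch_and_or(log_base + c->elem[i].offset,
				    c->elem[i].val);

	c->nb_elem = 0;
}
```

Moving to 64-bit words, as suggested above, would halve the number of
atomic operations again at the cost of wider atomics.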
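Regarding the __sync_ to __atomic_ conversion: the __atomic_ built-ins
take an explicit memory-order argument, and __ATOMIC_SEQ_CST matches
the full-barrier behaviour of the deprecated __sync_ form. A minimal
sketch of the flush step rewritten this way (log_word_sync is a
hypothetical helper, not from the patch):

```c
#include <stdint.h>

/* OR one cached word of dirty bits into the shared log map.
 * __atomic_fetch_or() with __ATOMIC_SEQ_CST mirrors the full-barrier
 * semantics of __sync_fetch_and_or(); like the __sync_ built-in, it
 * returns the value the word held before the OR. */
static uint32_t
log_word_sync(uint32_t *log_word, uint32_t dirty_bits)
{
	return __atomic_fetch_or(log_word, dirty_bits, __ATOMIC_SEQ_CST);
}
```

A weaker order (e.g. __ATOMIC_RELAXED per word with one fence around
the whole flush) might also be possible, but __ATOMIC_SEQ_CST is the
conservative drop-in replacement.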