From: Maxime Coquelin <maxime.coquelin@redhat.com>
To: Tiwei Bie <tiwei.bie@intel.com>
Cc: dev@dpdk.org, jianfeng.tan@intel.com, mst@redhat.com, stable@dpdk.org
Subject: Re: [dpdk-dev] [PATCH] vhost: improve dirty pages logging performance
Date: Fri, 4 May 2018 17:48:05 +0200
Message-ID: <6125f044-d557-666a-8228-4930ead16089@redhat.com>
In-Reply-To: <20180503115634.feaimkzpnbodferd@debian>
Hi Tiwei,
On 05/03/2018 01:56 PM, Tiwei Bie wrote:
> On Mon, Apr 30, 2018 at 05:59:54PM +0200, Maxime Coquelin wrote:
>> This patch caches all dirty pages logging until the used ring index
>> is updated. These dirty pages won't be accessed by the guest as
>> long as the host doesn't give them back to it by updating the
>> index.
>
> Below sentence in above commit message isn't the reason why
> we can cache the dirty page logging. Right?
>
> """
> These dirty pages won't be accessed by the guest as
> long as the host doesn't give them back to it by updating the
> index.
> """
>
>>
>> The goal of this optimization is to fix a performance regression
>> introduced when the vhost library started to use atomic operations
>> to set bits in the shared dirty log map. While the fix was valid
>> as previous implementation wasn't safe against concurrent accesses,
>> contention was induced.
>>
>> With this patch, during migration, we have:
>> 1. Less atomic operations as only a single atomic OR operation
>> per 32 pages.
>
> Why not do it per 64 pages?
I wasn't sure it would be supported on 32-bit CPUs, but [0] seems to
indicate that only 128-bit atomic operations aren't supported on all
architectures.
I will change it to do it per 64 pages.
[0]:
https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html#g_t_005f_005fatomic-Builtins
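To illustrate the per-64-pages idea, here is a minimal sketch of how dirty
pages could be accumulated in a small per-virtqueue cache before any atomic
operation is issued. It is not the code from the patch: the cache size, the
struct log_cache type and the log_cache_page() helper are hypothetical names
chosen for this example, and the fallback taken when the cache is full is
left to the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LOG_CACHE_SIZE 32	/* hypothetical capacity, not from the patch */

struct log_cache_entry {
	uint32_t offset;	/* index of the 64-bit word in the dirty log */
	uint64_t val;		/* pending dirty bits for that word */
};

struct log_cache {
	struct log_cache_entry entries[LOG_CACHE_SIZE];
	uint16_t nb_elem;
};

/*
 * Record one dirty page in the cache (hypothetical helper).
 * Returns 0 on success, -1 if the cache is full and the page
 * was not recorded.
 */
static int
log_cache_page(struct log_cache *cache, uint64_t page)
{
	uint32_t offset = (uint32_t)(page / 64);
	uint64_t bit = 1ULL << (page % 64);
	uint16_t i;

	assert(cache != NULL);

	/* A page falling in an already cached word only costs a plain OR. */
	for (i = 0; i < cache->nb_elem; i++) {
		if (cache->entries[i].offset == offset) {
			cache->entries[i].val |= bit;
			return 0;
		}
	}

	if (cache->nb_elem >= LOG_CACHE_SIZE)
		return -1;

	cache->entries[i].offset = offset;
	cache->entries[i].val = bit;
	cache->nb_elem++;
	return 0;
}
```

With 64-bit entries, pages 0 to 63 share a single entry, so a burst touching
all of them would later need only one atomic OR on the shared log.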
>> 2. Less atomic operations as during a burst, the same page will
>> be marked dirty only once.
>> 3. Less write memory barriers.
>>
>> Fixes: 897f13a1f726 ("vhost: make page logging atomic")
>>
>> Cc: stable@dpdk.org
>>
>> Suggested-by: Michael S. Tsirkin <mst@redhat.com>
>> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
>> ---
>>
>> Hi,
>>
>> This series was tested with migrating a guest while running PVP
>> benchmark at 1Mpps with both ovs-dpdk and testpmd as vswitch.
>
> If the throughput is higher (e.g. by adding more cores
> and queues), will the live migration fail due to the
> higher dirty page generating speed?
I haven't checked, but I agree that the higher the throughput, the
longer the migration will take and the higher the risk that it never
converges.
In this kind of scenario, postcopy live migration may be a better fit,
as the hot pages will be copied only once. Postcopy support for
vhost-user has been added to QEMU in v2.12, and I have a prototype for
DPDK that I plan to propose for the next release.
>
>>
>> With this patch we recover the packet drops regressions seen since
>> the use of atomic operations to log dirty pages.
> [...]
>>
>> +static __rte_always_inline void
>> +vhost_log_cache_sync(struct virtio_net *dev, struct vhost_virtqueue *vq)
>> +{
>> + uint32_t *log_base;
>> + int i;
>> +
>> + if (likely(((dev->features & (1ULL << VHOST_F_LOG_ALL)) == 0) ||
>> + !dev->log_base))
>> + return;
>> +
>> + log_base = (uint32_t *)(uintptr_t)dev->log_base;
>> +
>> + /* To make sure guest memory updates are committed before logging */
>> + rte_smp_wmb();
>
> It seems that __sync_fetch_and_or() can be considered a full
> barrier [1]. So do we really need this rte_smp_wmb()?
That's a good point, thanks for the pointer.
> Besides, based on the same doc [1], it seems that the __sync_
> version is deprecated in favor of the __atomic_ one.
I will change to __atomic_. For the memory model, do you agree I should
use __ATOMIC_SEQ_CST?
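For reference, a rough sketch of what the flush loop could look like with the
__atomic_ builtin and __ATOMIC_SEQ_CST, using 64-bit log words. The
log_cache_flush() name and its parameters are made up for this example and
are not the final patch. The sketch also does not settle whether the explicit
rte_smp_wmb() can be dropped; it only shows the builtin replacing
__sync_fetch_and_or().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct log_cache_entry {
	uint32_t offset;	/* index of the 64-bit word in the dirty log */
	uint64_t val;		/* pending dirty bits for that word */
};

/*
 * Flush cached dirty bits into the shared log (hypothetical helper).
 * One atomic OR is issued per cached 64-bit word, then the cache
 * is emptied.
 */
static void
log_cache_flush(uint64_t *log_base, const struct log_cache_entry *cache,
		uint16_t *nb_elem)
{
	uint16_t i;

	assert(log_base != NULL && cache != NULL && nb_elem != NULL);

	for (i = 0; i < *nb_elem; i++)
		__atomic_fetch_or(&log_base[cache[i].offset], cache[i].val,
				__ATOMIC_SEQ_CST);

	*nb_elem = 0;
}
```

Bits already set in the log by another thread are preserved, since the OR
only ever adds bits.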
> [1] https://gcc.gnu.org/onlinedocs/gcc/_005f_005fsync-Builtins.html
>
>> +
>> + for (i = 0; i < vq->log_cache_nb_elem; i++) {
>> + struct log_cache_entry *elem = vq->log_cache + i;
>> +
>> + __sync_fetch_and_or(log_base + elem->offset, elem->val);
>> + }
>> +
>> + vq->log_cache_nb_elem = 0;
>> +}
>> +
> [...]
>
Thanks,
Maxime