Re: [dpdk-dev] [PATCH] vhost: improve dirty pages logging performance

From: Maxime Coquelin <maxime.coquelin@redhat.com>
To: Tiwei Bie <tiwei.bie@intel.com>
Cc: dev@dpdk.org, jianfeng.tan@intel.com, mst@redhat.com, stable@dpdk.org
Subject: Re: [dpdk-dev] [PATCH] vhost: improve dirty pages logging performance
Date: Fri, 4 May 2018 17:48:05 +0200
Message-ID: <6125f044-d557-666a-8228-4930ead16089@redhat.com>
In-Reply-To: <20180503115634.feaimkzpnbodferd@debian>

Hi Tiwei,

On 05/03/2018 01:56 PM, Tiwei Bie wrote:
> On Mon, Apr 30, 2018 at 05:59:54PM +0200, Maxime Coquelin wrote:
>> This patch caches all dirty pages logging until the used ring index
>> is updated. These dirty pages won't be accessed by the guest as
>> long as the host doesn't give them back to it by updating the
>> index.
> 
> The sentence below, from the commit message above, isn't the
> reason why we can cache the dirty page logging, right?
> 
> """
> These dirty pages won't be accessed by the guest as
> long as the host doesn't give them back to it by updating the
> index.
> """
> 
>>
>> The goal of this optimization is to fix a performance regression
>> introduced when the vhost library started to use atomic operations
>> to set bits in the shared dirty log map. While the fix was valid,
>> as the previous implementation wasn't safe against concurrent
>> accesses, contention was induced.
>>
>> With this patch, during migration, we have:
>> 1. Less atomic operations as only a single atomic OR operation
>> per 32 pages.
> 
> Why not do it per 64 pages?

I wasn't sure it would be supported on 32-bit CPUs, but [0] seems to
indicate that only 128-bit atomic operations are not supported on all
architectures.

I will change to do it per 64 pages.

[0]: https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html#g_t_005f_005fatomic-Builtins
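
To illustrate the per-64-pages idea, here is a minimal sketch of how a
dirty page could be folded into a cached 64-bit word before anything
touches the shared log. The names (log_cache, log_cache_page,
LOG_CACHE_MAX) and the field types are assumptions made for this
example; this is not the code from the patch.

```c
#include <stdint.h>

#define LOG_CACHE_MAX 32	/* hypothetical cache size */

struct log_cache_entry {
	uint32_t offset;	/* index of the 64-bit word in the dirty log */
	uint64_t val;		/* bits to OR into that word later */
};

struct log_cache {
	struct log_cache_entry elem[LOG_CACHE_MAX];
	uint16_t nb_elem;
};

/*
 * Record one dirty page number in the cache.
 * Returns 0 on success, -1 if the cache is full and the caller has to
 * fall back to some other path (not shown here).
 */
static int
log_cache_page(struct log_cache *c, uint64_t page)
{
	uint32_t offset = page / 64;
	uint64_t bit = 1ULL << (page % 64);
	uint16_t i;

	/* The same 64-page word was already touched: just merge the bit. */
	for (i = 0; i < c->nb_elem; i++) {
		if (c->elem[i].offset == offset) {
			c->elem[i].val |= bit;
			return 0;
		}
	}

	if (c->nb_elem >= LOG_CACHE_MAX)
		return -1;

	c->elem[c->nb_elem].offset = offset;
	c->elem[c->nb_elem].val = bit;
	c->nb_elem++;
	return 0;
}
```

With this layout, pages 0 and 63 share a single cache entry, so they
would cost a single atomic OR at flush time, while page 64 opens a
second entry.
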
>> 2. Less atomic operations as during a burst, the same page will
>> be marked dirty only once.
>> 3. Less write memory barriers.
>>
>> Fixes: 897f13a1f726 ("vhost: make page logging atomic")
>>
>> Cc: stable@dpdk.org
>>
>> Suggested-by: Michael S. Tsirkin <mst@redhat.com>
>> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
>> ---
>>
>> Hi,
>>
>> This series was tested with migrating a guest while running PVP
>> benchmark at 1Mpps with both ovs-dpdk and testpmd as vswitch.
> 
> If the throughput is higher (e.g. by adding more cores
> and queues), will the live migration fail due to the
> higher dirty page generating speed?

I haven't checked, but I agree that the higher the throughput, the
longer the migration will take and the higher the risk that it never
converges.

In this kind of scenario, postcopy live migration may be a better fit,
as the hot pages will be copied only once. Postcopy support for
vhost-user was added to QEMU in v2.12, and I have a prototype for
DPDK that I plan to propose for the next release.

> 
>>
>> With this patch we recover from the packet drop regression seen
>> since the use of atomic operations to log dirty pages.
> [...]
>>   
>> +static __rte_always_inline void
>> +vhost_log_cache_sync(struct virtio_net *dev, struct vhost_virtqueue *vq)
>> +{
>> +	uint32_t *log_base;
>> +	int i;
>> +
>> +	if (likely(((dev->features & (1ULL << VHOST_F_LOG_ALL)) == 0) ||
>> +		   !dev->log_base))
>> +		return;
>> +
>> +	log_base = (uint32_t *)(uintptr_t)dev->log_base;
>> +
>> +	/* To make sure guest memory updates are committed before logging */
>> +	rte_smp_wmb();
> 
> It seems that __sync_fetch_and_or() can be considered a full
> barrier [1]. So do we really need this rte_smp_wmb()?

That's a good point, thanks for the pointer.

> Besides, based on the same doc [1], it seems that the __sync_
> version is deprecated in favor of the __atomic_ one.

I will change to __atomic_. For the memory model, do you agree I should
use __ATOMIC_SEQ_CST?
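
For illustration, a minimal sketch of what the flush loop could look
like with the __atomic_ builtin and __ATOMIC_SEQ_CST, operating on
64-bit words. The struct and function names below are hypothetical and
only serve this example; whether the separate rte_smp_wmb() can then be
dropped is the open question above, and the sketch does not settle it.

```c
#include <stdint.h>

/* Hypothetical cache entry: one 64-bit word of pending dirty bits. */
struct cached_bits {
	uint32_t offset;	/* index of the 64-bit word in the dirty log */
	uint64_t val;		/* bits accumulated for that word */
};

/*
 * OR every cached word into the shared dirty log, then empty the cache.
 * __atomic_fetch_or() is the GCC/Clang builtin; __ATOMIC_SEQ_CST asks
 * for sequentially consistent ordering on each operation.
 */
static void
log_cache_flush(uint64_t *log_base, const struct cached_bits *cache,
		uint16_t *nb_elem)
{
	uint16_t i;

	for (i = 0; i < *nb_elem; i++)
		__atomic_fetch_or(log_base + cache[i].offset, cache[i].val,
				  __ATOMIC_SEQ_CST);

	*nb_elem = 0;
}
```

Bits already set in the log by another thread are preserved, since each
word is only ever ORed, never overwritten.
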

> [1] https://gcc.gnu.org/onlinedocs/gcc/_005f_005fsync-Builtins.html
> 
>> +
>> +	for (i = 0; i < vq->log_cache_nb_elem; i++) {
>> +		struct log_cache_entry *elem = vq->log_cache + i;
>> +
>> +		__sync_fetch_and_or(log_base + elem->offset, elem->val);
>> +	}
>> +
>> +	vq->log_cache_nb_elem = 0;
>> +}
>> +
> [...]
> 

Thanks,
Maxime
