From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <huawei.xie@intel.com>
Received: from mga02.intel.com (mga02.intel.com [134.134.136.20])
 by dpdk.org (Postfix) with ESMTP id 2C0DE567E
 for <dev@dpdk.org>; Wed,  9 Dec 2015 04:33:22 +0100 (CET)
Received: from orsmga001.jf.intel.com ([10.7.209.18])
 by orsmga101.jf.intel.com with ESMTP; 08 Dec 2015 19:33:21 -0800
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.20,401,1444719600"; d="scan'208";a="837295166"
Received: from fmsmsx108.amr.corp.intel.com ([10.18.124.206])
 by orsmga001.jf.intel.com with ESMTP; 08 Dec 2015 19:33:21 -0800
Received: from fmsmsx154.amr.corp.intel.com (10.18.116.70) by
 FMSMSX108.amr.corp.intel.com (10.18.124.206) with Microsoft SMTP Server (TLS)
 id 14.3.248.2; Tue, 8 Dec 2015 19:33:19 -0800
Received: from shsmsx101.ccr.corp.intel.com (10.239.4.153) by
 FMSMSX154.amr.corp.intel.com (10.18.116.70) with Microsoft SMTP Server (TLS)
 id 14.3.248.2; Tue, 8 Dec 2015 19:33:18 -0800
Received: from shsmsx103.ccr.corp.intel.com ([169.254.4.138]) by
 SHSMSX101.ccr.corp.intel.com ([169.254.1.83]) with mapi id 14.03.0248.002;
 Wed, 9 Dec 2015 11:33:17 +0800
From: "Xie, Huawei" <huawei.xie@intel.com>
To: Victor Kaplansky <victork@redhat.com>, Yuanhan Liu
 <yuanhan.liu@linux.intel.com>
Thread-Topic: [PATCH 2/4] vhost: introduce vhost_log_write
Thread-Index: AdEyMldEwGHDAPhjRAm0JwFAYWponA==
Date: Wed, 9 Dec 2015 03:33:16 +0000
Message-ID: <C37D651A908B024F974696C65296B57B4BB82C2D@SHSMSX103.ccr.corp.intel.com>
References: <1449027793-30975-1-git-send-email-yuanhan.liu@linux.intel.com>
 <1449027793-30975-3-git-send-email-yuanhan.liu@linux.intel.com>
 <20151202153050-mutt-send-email-victork@redhat.com>
Accept-Language: en-US
Content-Language: en-US
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
x-originating-ip: [10.239.127.40]
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Cc: "dev@dpdk.org" <dev@dpdk.org>, "Michael S. Tsirkin" <mst@redhat.com>
Subject: Re: [dpdk-dev] [PATCH 2/4] vhost: introduce vhost_log_write
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: patches and discussions about DPDK <dev.dpdk.org>
List-Unsubscribe: <http://dpdk.org/ml/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://dpdk.org/ml/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <http://dpdk.org/ml/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
X-List-Received-Date: Wed, 09 Dec 2015 03:33:23 -0000

On 12/2/2015 9:53 PM, Victor Kaplansky wrote:
> On Wed, Dec 02, 2015 at 11:43:11AM +0800, Yuanhan Liu wrote:
>> Introduce vhost_log_write() helper function to log the dirty pages we
>> touched. Page size is hard-coded to 4096 (VHOST_LOG_PAGE), and each
>> page is represented by 1 bit.
>>
>> Therefore, vhost_log_write() simply finds the right bit for the related
>> page we are going to change, and sets it to 1. dev->log_base denotes the
>> start of the dirty page bitmap.
>>
>> The page address is biased by log_guest_addr, which is derived from
>> the SET_VRING_ADDR request as part of the vring related addresses.
>>
>> Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
>> ---
>>  lib/librte_vhost/rte_virtio_net.h | 34 ++++++++++++++++++++++++++++++++++
>>  lib/librte_vhost/virtio-net.c     |  4 ++++
>>  2 files changed, 38 insertions(+)
>>
>> diff --git a/lib/librte_vhost/rte_virtio_net.h b/lib/librte_vhost/rte_virtio_net.h
>> index 416dac2..191c1be 100644
>> --- a/lib/librte_vhost/rte_virtio_net.h
>> +++ b/lib/librte_vhost/rte_virtio_net.h
>> @@ -40,6 +40,7 @@
>>   */
>>  
>>  #include <stdint.h>
>> +#include <linux/vhost.h>
>>  #include <linux/virtio_ring.h>
>>  #include <linux/virtio_net.h>
>>  #include <sys/eventfd.h>
>> @@ -59,6 +60,8 @@ struct rte_mbuf;
>>  /* Backend value set by guest. */
>>  #define VIRTIO_DEV_STOPPED -1
>>  
>> +#define VHOST_LOG_PAGE	4096
>> +
>>  
>>  /* Enum for virtqueue management. */
>>  enum {VIRTIO_RXQ, VIRTIO_TXQ, VIRTIO_QNUM};
>> @@ -82,6 +85,7 @@ struct vhost_virtqueue {
>>  	struct vring_desc	*desc;			/**< Virtqueue descriptor ring. */
>>  	struct vring_avail	*avail;			/**< Virtqueue available ring. */
>>  	struct vring_used	*used;			/**< Virtqueue used ring. */
>> +	uint64_t		log_guest_addr;		/**< Physical address of used ring, for logging */
>>  	uint32_t		size;			/**< Size of descriptor ring. */
>>  	uint32_t		backend;		/**< Backend value to determine if device should started/stopped. */
>>  	uint16_t		vhost_hlen;		/**< Vhost header length (varies depending on RX merge buffers. */
>> @@ -203,6 +207,36 @@ gpa_to_vva(struct virtio_net *dev, uint64_t guest_pa)
>>  	return vhost_va;
>>  }
>>  
>> +static inline void __attribute__((always_inline))
>> +vhost_log_page(uint8_t *log_base, uint64_t page)
>> +{
>> +	/* TODO: to make it atomic? */
>> +	log_base[page / 8] |= 1 << (page % 8);
> I think the atomic OR operation is necessary only if there can be
> more than one vhost-user back-end updating the guest's memory
> simultaneously. However, it is probably safe to perform a
> regular OR operation, since rings are not shared between
> back-ends. What about buffers pointed to by descriptors?  To be on
> the safe side, I would use the GCC built-in function
> __sync_fetch_and_or().
>
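For reference, the atomic variant suggested above could look roughly like
the sketch below. It is only an illustration under stated assumptions: it
relies on the GCC/Clang __sync builtins, and the names LOG_PAGE_SIZE,
log_page_atomic() and log_range() are hypothetical and not part of the
patch. LOG_PAGE_SIZE mirrors VHOST_LOG_PAGE, and the range helper shows
one way to mark every page overlapped by a write.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LOG_PAGE_SIZE 4096u /* assumption: mirrors VHOST_LOG_PAGE */

/* Hypothetical helper: atomically set the dirty bit for one page. */
static inline void
log_page_atomic(uint8_t *log_base, uint64_t page)
{
	assert(log_base != NULL);
	/* GCC/Clang builtin: atomic read-modify-write OR on one byte. */
	__sync_fetch_and_or(&log_base[page / 8], (uint8_t)(1u << (page % 8)));
}

/* Hypothetical helper: mark every page overlapped by [addr, addr + len). */
static inline void
log_range(uint8_t *log_base, uint64_t addr, uint64_t len)
{
	uint64_t page, last;

	if (len == 0)
		return;

	last = (addr + len - 1) / LOG_PAGE_SIZE;
	for (page = addr / LOG_PAGE_SIZE; page <= last; page++)
		log_page_atomic(log_base, page);
}
```

The atomic OR only matters if two writers can touch the same bitmap byte
at the same time; with a single writer it behaves like the plain OR in
the patch.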
>> +}
>> +
>> +static inline void __attribute__((always_inline))
>> +vhost_log_write(struct virtio_net *dev, struct vhost_virtqueue *vq,
>> +		uint64_t offset, uint64_t len)
>> +{
>> +	uint64_t addr = vq->log_guest_addr;
>> +	uint64_t page;
>> +
>> +	if (unlikely(((dev->features & (1ULL << VHOST_F_LOG_ALL)) == 0) ||
>> +		     !dev->log_base || !len))
>> +		return;
> Isn't "likely" more appropriate in the above, since the whole
> expression is expected to be true most of the time?
Victor:
So we are not always logging. What is the message that tells the backend
that migration has started?
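
To make the likely/unlikely point concrete, here is a minimal sketch. It
assumes logging is only enabled while a migration is in progress, so the
early return is the common path. The macros are the usual wrappers around
the GCC/Clang __builtin_expect builtin. LOG_ALL_BIT, log_disabled() and
try_log() are hypothetical names used only for illustration, with
LOG_ALL_BIT standing in for VHOST_F_LOG_ALL.

```c
#include <stddef.h>
#include <stdint.h>

/* Branch hints as commonly defined on top of the GCC/Clang builtin. */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

#define LOG_ALL_BIT 26 /* assumption: stands in for VHOST_F_LOG_ALL */

/* Hypothetical predicate: non-zero when there is nothing to log. */
static inline int
log_disabled(uint64_t features, const uint8_t *log_base, uint64_t len)
{
	return (features & (1ULL << LOG_ALL_BIT)) == 0 || !log_base || !len;
}

/* Hypothetical wrapper: returns 0 on the early exit, 1 if it would log. */
static inline int
try_log(uint64_t features, const uint8_t *log_base, uint64_t len)
{
	/* The hint affects code layout only, never the result. */
	if (likely(log_disabled(features, log_base, len)))
		return 0;

	return 1;
}
```

Either hint gives the same behaviour; the choice only tells the compiler
which branch to lay out as the fall-through path.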
[...]
