From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <huawei.xie@intel.com>
Received: from mga09.intel.com (mga09.intel.com [134.134.136.24])
 by dpdk.org (Postfix) with ESMTP id AA69E5B21
 for <dev@dpdk.org>; Tue, 27 Jan 2015 08:03:57 +0100 (CET)
Received: from fmsmga001.fm.intel.com ([10.253.24.23])
 by orsmga102.jf.intel.com with ESMTP; 26 Jan 2015 23:00:44 -0800
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.09,473,1418112000"; d="scan'208";a="657084403"
Received: from pgsmsx108.gar.corp.intel.com ([10.221.44.103])
 by fmsmga001.fm.intel.com with ESMTP; 26 Jan 2015 23:03:51 -0800
Received: from shsmsx103.ccr.corp.intel.com (10.239.4.69) by
 PGSMSX108.gar.corp.intel.com (10.221.44.103) with Microsoft SMTP Server (TLS)
 id 14.3.195.1; Tue, 27 Jan 2015 15:03:49 +0800
Received: from shsmsx101.ccr.corp.intel.com ([169.254.1.64]) by
 SHSMSX103.ccr.corp.intel.com ([169.254.4.192]) with mapi id 14.03.0195.001;
 Tue, 27 Jan 2015 15:03:47 +0800
From: "Xie, Huawei" <huawei.xie@intel.com>
To: "Ouyang, Changchun" <changchun.ouyang@intel.com>, "dev@dpdk.org"
 <dev@dpdk.org>
Thread-Topic: [dpdk-dev] [PATCH v2 02/24] virtio: Use weaker barriers
Thread-Index: AQHQOdplIBvPxnxbzU2OvnOFGg/hwpzTiENA
Date: Tue, 27 Jan 2015 07:03:47 +0000
Message-ID: <C37D651A908B024F974696C65296B57B0F361A10@SHSMSX101.ccr.corp.intel.com>
References: <1421298930-15210-1-git-send-email-changchun.ouyang@intel.com>
 <1422326164-13697-1-git-send-email-changchun.ouyang@intel.com>
 <1422326164-13697-3-git-send-email-changchun.ouyang@intel.com>
In-Reply-To: <1422326164-13697-3-git-send-email-changchun.ouyang@intel.com>
Accept-Language: en-US
Content-Language: en-US
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
x-originating-ip: [10.239.127.40]
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Subject: Re: [dpdk-dev] [PATCH v2 02/24] virtio: Use weaker barriers
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: patches and discussions about DPDK <dev.dpdk.org>
List-Unsubscribe: <http://dpdk.org/ml/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://dpdk.org/ml/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <http://dpdk.org/ml/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
X-List-Received-Date: Tue, 27 Jan 2015 07:03:58 -0000



> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Ouyang Changchun
> Sent: Tuesday, January 27, 2015 10:36 AM
> To: dev@dpdk.org
> Subject: [dpdk-dev] [PATCH v2 02/24] virtio: Use weaker barriers
>
> The DPDK driver only has to deal with the case of running on PCI
> and with SMP. In this case, the code can use the weaker barriers
> instead of using hard (fence) barriers. This will help performance.
> The rationale is explained in Linux kernel virtio_ring.h.
>
> To make it clearer that this is a virtio thing and not some generic
> barrier, prefix the barrier calls with virtio_.
>
> Add missing (and needed) barrier between updating ring data
> structure and notifying host.
>
> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
> ---
>  lib/librte_pmd_virtio/virtio_ethdev.c |  2 +-
>  lib/librte_pmd_virtio/virtio_rxtx.c   |  8 +++++---
>  lib/librte_pmd_virtio/virtqueue.h     | 19 ++++++++++++++-----
>  3 files changed, 20 insertions(+), 9 deletions(-)
>
> diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c
> b/lib/librte_pmd_virtio/virtio_ethdev.c
> index 662a49c..dc47e72 100644
> --- a/lib/librte_pmd_virtio/virtio_ethdev.c
> +++ b/lib/librte_pmd_virtio/virtio_ethdev.c
> @@ -175,7 +175,7 @@ virtio_send_command(struct virtqueue *vq, struct
> virtio_pmd_ctrl *ctrl,
>  		uint32_t idx, desc_idx, used_idx;
>  		struct vring_used_elem *uep;
>
> -		rmb();
> +		virtio_rmb();
>
>  		used_idx = (uint32_t)(vq->vq_used_cons_idx
>  				& (vq->vq_nentries - 1));
> diff --git a/lib/librte_pmd_virtio/virtio_rxtx.c
> b/lib/librte_pmd_virtio/virtio_rxtx.c
> index c013f97..78af334 100644
> --- a/lib/librte_pmd_virtio/virtio_rxtx.c
> +++ b/lib/librte_pmd_virtio/virtio_rxtx.c
> @@ -456,7 +456,7 @@ virtio_recv_pkts(void *rx_queue, struct rte_mbuf
> **rx_pkts, uint16_t nb_pkts)
>
>  	nb_used = VIRTQUEUE_NUSED(rxvq);
>
> -	rmb();
> +	virtio_rmb();
>
>  	num = (uint16_t)(likely(nb_used <= nb_pkts) ? nb_used : nb_pkts);
>  	num = (uint16_t)(likely(num <= VIRTIO_MBUF_BURST_SZ) ? num :
> VIRTIO_MBUF_BURST_SZ);
> @@ -516,6 +516,7 @@ virtio_recv_pkts(void *rx_queue, struct rte_mbuf
> **rx_pkts, uint16_t nb_pkts)
>  	}
>
>  	if (likely(nb_enqueued)) {
> +		virtio_wmb();
>  		if (unlikely(virtqueue_kick_prepare(rxvq))) {
>  			virtqueue_notify(rxvq);
>  			PMD_RX_LOG(DEBUG, "Notified\n");
> @@ -547,7 +548,7 @@ virtio_recv_mergeable_pkts(void *rx_queue,
>
>  	nb_used = VIRTQUEUE_NUSED(rxvq);
>
> -	rmb();
> +	virtio_rmb();
>
>  	if (nb_used == 0)
>  		return 0;
> @@ -694,7 +695,7 @@ virtio_xmit_pkts(void *tx_queue, struct rte_mbuf
> **tx_pkts, uint16_t nb_pkts)
>  	PMD_TX_LOG(DEBUG, "%d packets to xmit", nb_pkts);
>  	nb_used =3D VIRTQUEUE_NUSED(txvq);
>
> -	rmb();
> +	virtio_rmb();
>
>  	num = (uint16_t)(likely(nb_used < VIRTIO_MBUF_BURST_SZ) ? nb_used :
> VIRTIO_MBUF_BURST_SZ);
>
> @@ -735,6 +736,7 @@ virtio_xmit_pkts(void *tx_queue, struct rte_mbuf
> **tx_pkts, uint16_t nb_pkts)
>  		}
>  	}
>  	vq_update_avail_idx(txvq);
> +	virtio_wmb();
>
>  	txvq->packets += nb_tx;
>
> diff --git a/lib/librte_pmd_virtio/virtqueue.h b/lib/librte_pmd_virtio/virtqueue.h
> index fdee054..f6ad98d 100644
> --- a/lib/librte_pmd_virtio/virtqueue.h
> +++ b/lib/librte_pmd_virtio/virtqueue.h
> @@ -46,9 +46,18 @@
>  #include "virtio_ring.h"
>  #include "virtio_logs.h"
>
> -#define mb()  rte_mb()
> -#define wmb() rte_wmb()
> -#define rmb() rte_rmb()
> +/*
> + * Per virtio_config.h in Linux.
> + *     For virtio_pci on SMP, we don't need to order with respect to MMIO
> + *     accesses through relaxed memory I/O windows, so smp_mb() et al are
> + *     sufficient.
> + *
> + * This driver is for virtio_pci on SMP and therefore can assume
> + * weaker (compiler barriers)
> + */
> +#define virtio_mb()	rte_mb()
> +#define virtio_rmb()	rte_compiler_barrier()
> +#define virtio_wmb()	rte_compiler_barrier()
>
>  #ifdef RTE_PMD_PACKET_PREFETCH
>  #define rte_packet_prefetch(p)  rte_prefetch1(p)
> @@ -225,7 +234,7 @@ virtqueue_full(const struct virtqueue *vq)
>  static inline void
>  vq_update_avail_idx(struct virtqueue *vq)
>  {
> -	rte_compiler_barrier();
> +	virtio_rmb();

I recall our original code used virtio_wmb() here.
A store fence ensures that all updates to the ring entries are visible before
the index is updated.
Why do we need virtio_rmb() here, and why add virtio_wmb() after
vq_update_avail_idx()?
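
To make the question concrete, here is a minimal sketch of the ordering I
have in mind. It is not the driver's code: toy_ring, toy_publish(),
toy_consume() and compiler_barrier() are hypothetical names, and it assumes
GCC/Clang inline asm on a strongly ordered CPU such as x86, where a compiler
barrier is enough to keep stores (and loads) in program order.

```c
#include <stdint.h>

#define RING_SIZE 8 /* hypothetical size; must be a power of two */

/* Compiler-only barrier: prevents the compiler from reordering memory
 * accesses across it, but emits no fence instruction. */
#define compiler_barrier() __asm__ __volatile__("" ::: "memory")

struct toy_ring {
	uint16_t entries[RING_SIZE]; /* slots filled by the producer */
	volatile uint16_t idx;       /* published index, read by the consumer */
};

/* Producer: write the entry first, then publish the new index. */
static inline void
toy_publish(struct toy_ring *r, uint16_t shadow_idx, uint16_t value)
{
	r->entries[shadow_idx & (RING_SIZE - 1)] = value;
	compiler_barrier();	/* write side: entry store before index store */
	r->idx = (uint16_t)(shadow_idx + 1);
}

/* Consumer: read the index first, then the entry it covers.
 * Returns 1 and stores the entry in *out if something new was published. */
static inline int
toy_consume(struct toy_ring *r, uint16_t last_seen, uint16_t *out)
{
	uint16_t idx = r->idx;

	if (idx == last_seen)
		return 0;
	compiler_barrier();	/* read side: index load before entry load */
	*out = r->entries[last_seen & (RING_SIZE - 1)];
	return 1;
}
```

In this sketch the barrier before the index store plays the write-barrier
role, which is why I would expect virtio_wmb() rather than virtio_rmb()
inside vq_update_avail_idx().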

>  	vq->vq_ring.avail->idx = vq->vq_avail_idx;
>  }
>
> @@ -255,7 +264,7 @@ static inline void
>  virtqueue_notify(struct virtqueue *vq)
>  {
>  	/*
> -	 * Ensure updated avail->idx is visible to host. mb() necessary?
> +	 * Ensure updated avail->idx is visible to host.
>  	 * For virtio on IA, the notificaiton is through io port operation
>  	 * which is a serialization instruction itself.
>  	 */
> --
> 1.8.4.2