From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <zhihong.wang@intel.com>
Received: from mga11.intel.com (mga11.intel.com [192.55.52.93])
 by dpdk.org (Postfix) with ESMTP id 23CD46C93
 for <dev@dpdk.org>; Fri, 19 Aug 2016 14:51:59 +0200 (CEST)
Received: from orsmga002.jf.intel.com ([10.7.209.21])
 by fmsmga102.fm.intel.com with ESMTP; 19 Aug 2016 05:51:54 -0700
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.28,544,1464678000"; d="scan'208";a="1038480413"
Received: from unknown (HELO dpdk5.sh.intel.com) ([10.239.129.118])
 by orsmga002.jf.intel.com with ESMTP; 19 Aug 2016 05:51:53 -0700
From: Zhihong Wang <zhihong.wang@intel.com>
To: dev@dpdk.org
Cc: maxime.coquelin@redhat.com, yuanhan.liu@linux.intel.com,
 Zhihong Wang <zhihong.wang@intel.com>
Date: Fri, 19 Aug 2016 01:43:50 -0400
Message-Id: <1471585430-125925-6-git-send-email-zhihong.wang@intel.com>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1471585430-125925-1-git-send-email-zhihong.wang@intel.com>
References: <1471319402-112998-1-git-send-email-zhihong.wang@intel.com>
 <1471585430-125925-1-git-send-email-zhihong.wang@intel.com>
Subject: [dpdk-dev] [PATCH v3 5/5] vhost: optimize cache access

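This patch delays the virtio header write until just before the first mbuf
data copy, so that the header and the packet data are written to the
descriptor buffer back to back. This improves cache access efficiency when
the mrg_rxbuf feature is turned on and significantly reduces CPU pipeline
stall cycles.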

Signed-off-by: Zhihong Wang <zhihong.wang@intel.com>
---
Changes in v3:

 1. Remove unnecessary memset which causes frontend stall on SNB (Sandy
    Bridge) and IVB (Ivy Bridge).

 2. Rename variables to follow naming convention.

---
 lib/librte_vhost/vhost_rxtx.c | 19 ++++++++++++++-----
 1 file changed, 14 insertions(+), 5 deletions(-)
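
For readers unfamiliar with the pattern, the deferred header write can be
sketched outside of vhost roughly as follows. This is only an illustration:
demo_hdr, demo_enqueue, and the chunked copy loop are hypothetical stand-ins
for struct virtio_net_hdr_mrg_rxbuf, enqueue_packet(), and the descriptor
loop, and plain memcpy stands in for rte_memcpy.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct virtio_net_hdr_mrg_rxbuf. */
struct demo_hdr {
	uint8_t  flags;
	uint16_t num_buffers;
};

/*
 * Copy a payload into a buffer that starts with a header.
 * The header is only flagged as pending up front; the actual store
 * happens right before the first payload chunk is copied, so header
 * and payload stores hit the destination buffer back to back.
 * Returns the number of payload bytes copied. Note that in this
 * sketch an empty payload means the header is never written.
 */
static uint32_t
demo_enqueue(uint8_t *buf, uint32_t buf_len,
	     const uint8_t *payload, uint32_t payload_len,
	     uint32_t chunk_len)
{
	const uint32_t hdr_len = sizeof(struct demo_hdr);
	uint32_t offset = hdr_len;	/* payload goes after the header */
	uint32_t copied = 0;
	uint32_t copy_hdr = 1;		/* header write is pending */

	assert(chunk_len > 0);
	if (buf_len < hdr_len)
		return 0;

	while (copied < payload_len && offset < buf_len) {
		uint32_t copy_len = payload_len - copied;

		if (copy_len > chunk_len)
			copy_len = chunk_len;
		if (copy_len > buf_len - offset)
			copy_len = buf_len - offset;

		/* delayed header write, done once before the first copy */
		if (copy_hdr) {
			struct demo_hdr hdr = { .flags = 0, .num_buffers = 1 };

			copy_hdr = 0;
			memcpy(buf, &hdr, hdr_len);
		}

		memcpy(buf + offset, payload + copied, copy_len);
		offset += copy_len;
		copied += copy_len;
	}

	return copied;
}
```

The flag keeps the loop body branch-cheap after the first iteration, which
mirrors what copy_virtio_hdr does in the patch below.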

diff --git a/lib/librte_vhost/vhost_rxtx.c b/lib/librte_vhost/vhost_rxtx.c
index c4abaf1..e3ba4e0 100644
--- a/lib/librte_vhost/vhost_rxtx.c
+++ b/lib/librte_vhost/vhost_rxtx.c
@@ -154,6 +154,7 @@ enqueue_packet(struct virtio_net *dev, struct vhost_virtqueue *vq,
 	uint32_t mbuf_len = 0;
 	uint32_t mbuf_avail = 0;
 	uint32_t copy_len = 0;
+	uint32_t copy_virtio_hdr = 0;
 	uint32_t extra_buffers = 0;
 
 	/* start with the first mbuf of the packet */
@@ -168,15 +169,16 @@ enqueue_packet(struct virtio_net *dev, struct vhost_virtqueue *vq,
 	if (unlikely(!desc_addr))
 		goto error;
 
-	/* handle virtio header */
+	/*
+	 * handle virtio header, the actual write operation
+	 * is delayed for cache optimization.
+	 */
 	virtio_hdr = (struct virtio_net_hdr_mrg_rxbuf *)(uintptr_t)desc_addr;
-	virtio_enqueue_offload(mbuf, &(virtio_hdr->hdr));
+	copy_virtio_hdr = 1;
 	vhost_log_write(dev, desc->addr, dev->vhost_hlen);
 	desc_offset = dev->vhost_hlen;
 	desc_chain_len = desc_offset;
 	desc_addr += desc_offset;
-	if (is_mrg_rxbuf)
-		virtio_hdr->num_buffers = 1;
 
 	/* start copy from mbuf to desc */
 	while (1) {
@@ -228,8 +230,15 @@ enqueue_packet(struct virtio_net *dev, struct vhost_virtqueue *vq,
 				goto rollback;
 		}
 
-		/* copy mbuf data */
+		/* copy virtio header and mbuf data */
 		copy_len = RTE_MIN(desc->len - desc_offset, mbuf_avail);
+		if (copy_virtio_hdr) {
+			copy_virtio_hdr = 0;
+			virtio_enqueue_offload(mbuf, &(virtio_hdr->hdr));
+			if (is_mrg_rxbuf)
+				virtio_hdr->num_buffers = extra_buffers + 1;
+		}
+
 		rte_memcpy((void *)(uintptr_t)desc_addr,
 				rte_pktmbuf_mtod_offset(mbuf, void *,
 					mbuf_len - mbuf_avail),
-- 
2.7.4