DPDK patches and discussions
 help / color / mirror / Atom feed
From: Yuanhan Liu <yuanhan.liu@linux.intel.com>
To: "Wang, Zhihong" <zhihong.wang@intel.com>
Cc: Jianbo Liu <jianbo.liu@linaro.org>,
	Thomas Monjalon <thomas.monjalon@6wind.com>,
	Maxime Coquelin <maxime.coquelin@redhat.com>,
	"dev@dpdk.org" <dev@dpdk.org>
Subject: Re: [dpdk-dev] [PATCH v3 0/5] vhost: optimize enqueue
Date: Thu, 13 Oct 2016 13:33:24 +0800	[thread overview]
Message-ID: <20161013053324.GP16751@yliu-dev.sh.intel.com> (raw)
In-Reply-To: <8F6C2BD409508844A0EFC19955BE09414E7CD78E@SHSMSX103.ccr.corp.intel.com>

On Wed, Oct 12, 2016 at 12:22:08PM +0000, Wang, Zhihong wrote:
> > > >> >  3. How many percentage of drop are you seeing?
> > > The testing result:
> > > size (bytes)     improvement (%)
> > > 64                   3.92
> > > 128                 11.51
> > > 256                  24.16
> > > 512                  -13.79
> > > 1024                -22.51
> > > 1500                -12.22
> > > A correction is that performance is dropping if byte size is larger than 512.
> > 
> > I have thought of this twice. Unfortunately, I think I may need NACK this
> > series.
> > 
> > Merging two code path into one is really good: as you stated, it improves
> > the maintainability. But only if we see no performance regression on both
> > path after the refactor. Unfortunately, that's not the case here: it hurts
> > the performance for one code path (non-mrg Rx).
> > 
> > That makes me think we may should not do the code path merge at all. I think
> > that also aligns with what you have said before (internally): we could do the
> > merge if it gives comparable performance before and after that.
> > 
> > Besides that, I don't quite like the way you did in patch 2 (rewrite enqueue):
> > you made a lot of changes in one patch. That means if something wrong
> > happened,
> > it is hard to narrow down which change introduces that regression. Badly,
> > that's exactly what we met here. Weeks have been passed, I see no progress.
> > 
> > That's the reason we like the idea of "one patch only does one thing, an
> > atomic thing".
> 
> 
> Yuanhan, folks,
> 
> Thanks for the analysis. I disagree here though.
> 
> I analyze, develop, benchmark on x86 platforms, where this patch
> works great.

Yes, that's great effort! With your hardwork, we know what the bottleneck
is and how it could be improved.

However, you don't have to do code refactor (merge two code path to one)
to apply those improvements. From what I know, in this patchset, there
are two factors could improve the performance:

- copy hdr together with packet data

- shadow used ring update and update at once

The overall performance boost I got with your v6 patchset with mrg-Rx
code path is about 27% (in PVP case). And I have just applied the 1st
optimization, it yields about 20% boosts. The left could be covered if
we apply the 2nd optimization (I guess).

That would be a clean way to optimize vhost mergeable Rx path:

- you don't touch non-mrg Rx path (well, you may could apply the
  shadow_used_ring trick to it as wel)

  This would at least make sure we will have no such performance
  regression issue reported by ARM guys.

- you don't refactor the code

  The rewrite from scratch could introduce other issues, besides the
  performance regression. We may just don't know it yet.


Make sense to you? If you agree, I think we could still make it in
this release: they would be some small changes after all. For example,
below is the patch applies the 1st optimization tip on top of
dpdk-next-virtio

	--yliu

---------------------------------------------------------------
diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
index 8a151af..0ddb5af 100644
--- a/lib/librte_vhost/virtio_net.c
+++ b/lib/librte_vhost/virtio_net.c
@@ -379,7 +379,7 @@ copy_mbuf_to_desc_mergeable(struct virtio_net *dev, struct vhost_virtqueue *vq,
 			    uint16_t end_idx, struct rte_mbuf *m,
 			    struct buf_vector *buf_vec)
 {
-	struct virtio_net_hdr_mrg_rxbuf virtio_hdr = {{0, 0, 0, 0, 0, 0}, 0};
+	struct virtio_net_hdr_mrg_rxbuf *virtio_hdr;
 	uint32_t vec_idx = 0;
 	uint16_t start_idx = vq->last_used_idx;
 	uint16_t cur_idx = start_idx;
@@ -388,6 +388,8 @@ copy_mbuf_to_desc_mergeable(struct virtio_net *dev, struct vhost_virtqueue *vq,
 	uint32_t desc_offset, desc_avail;
 	uint32_t cpy_len;
 	uint16_t desc_idx, used_idx;
+	uint16_t num_buffers = end_idx - start_idx;
+	int hdr_copied = 0;
 
 	if (unlikely(m == NULL))
 		return 0;
@@ -399,16 +401,11 @@ copy_mbuf_to_desc_mergeable(struct virtio_net *dev, struct vhost_virtqueue *vq,
 	if (buf_vec[vec_idx].buf_len < dev->vhost_hlen || !desc_addr)
 		return 0;
 
-	rte_prefetch0((void *)(uintptr_t)desc_addr);
+	virtio_hdr = (struct virtio_net_hdr_mrg_rxbuf *)(uintptr_t)desc_addr;
+	rte_prefetch0(virtio_hdr);
 
-	virtio_hdr.num_buffers = end_idx - start_idx;
 	LOG_DEBUG(VHOST_DATA, "(%d) RX: num merge buffers %d\n",
-		dev->vid, virtio_hdr.num_buffers);
-
-	virtio_enqueue_offload(m, &virtio_hdr.hdr);
-	copy_virtio_net_hdr(dev, desc_addr, virtio_hdr);
-	vhost_log_write(dev, buf_vec[vec_idx].buf_addr, dev->vhost_hlen);
-	PRINT_PACKET(dev, (uintptr_t)desc_addr, dev->vhost_hlen, 0);
+		dev->vid, num_buffers);
 
 	desc_avail  = buf_vec[vec_idx].buf_len - dev->vhost_hlen;
 	desc_offset = dev->vhost_hlen;
@@ -450,6 +447,15 @@ copy_mbuf_to_desc_mergeable(struct virtio_net *dev, struct vhost_virtqueue *vq,
 			mbuf_avail  = rte_pktmbuf_data_len(m);
 		}
 
+		if (hdr_copied == 0) {
+			virtio_hdr->num_buffers = num_buffers;
+			virtio_enqueue_offload(m, &virtio_hdr->hdr);
+			vhost_log_write(dev, buf_vec[vec_idx].buf_addr, dev->vhost_hlen);
+			PRINT_PACKET(dev, (uintptr_t)desc_addr, dev->vhost_hlen, 0);
+
+			hdr_copied = 1;
+		}
+
 		cpy_len = RTE_MIN(desc_avail, mbuf_avail);
 		rte_memcpy((void *)((uintptr_t)(desc_addr + desc_offset)),
 			rte_pktmbuf_mtod_offset(m, void *, mbuf_offset),

  parent reply	other threads:[~2016-10-13  5:32 UTC|newest]

Thread overview: 140+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-08-16  3:50 [dpdk-dev] [PATCH] optimize vhost enqueue Zhihong Wang
2016-08-16 13:59 ` Maxime Coquelin
2016-08-17  1:45   ` Wang, Zhihong
2016-08-17  2:38     ` Yuanhan Liu
2016-08-17  6:41       ` Wang, Zhihong
2016-08-17  9:17         ` Maxime Coquelin
2016-08-17  9:51           ` Yuanhan Liu
2016-08-18 13:44             ` Wang, Zhihong
2016-08-17 10:07           ` Wang, Zhihong
2016-08-18  6:33 ` [dpdk-dev] [PATCH v2 0/6] vhost: optimize enqueue Zhihong Wang
2016-08-18  6:33   ` [dpdk-dev] [PATCH v2 1/6] vhost: rewrite enqueue Zhihong Wang
2016-08-19  2:39     ` Yuanhan Liu
2016-08-19  7:07       ` Wang, Zhihong
2016-08-18  6:33   ` [dpdk-dev] [PATCH v2 2/6] vhost: remove obsolete Zhihong Wang
2016-08-19  2:32     ` Yuanhan Liu
2016-08-19  7:08       ` Wang, Zhihong
2016-08-18  6:33   ` [dpdk-dev] [PATCH v2 3/6] vhost: remove useless volatile Zhihong Wang
2016-08-18  6:33   ` [dpdk-dev] [PATCH v2 4/6] vhost: add desc prefetch Zhihong Wang
2016-08-18  6:33   ` [dpdk-dev] [PATCH v2 5/6] vhost: batch update used ring Zhihong Wang
2016-08-18  6:33   ` [dpdk-dev] [PATCH v2 6/6] vhost: optimize cache access Zhihong Wang
2016-08-19  5:43 ` [dpdk-dev] [PATCH v3 0/5] vhost: optimize enqueue Zhihong Wang
2016-08-19  5:43   ` [dpdk-dev] [PATCH v3 1/5] vhost: rewrite enqueue Zhihong Wang
2016-08-22  9:35     ` Maxime Coquelin
2016-08-23  2:27       ` Wang, Zhihong
2016-08-25  4:00       ` Yuanhan Liu
2016-08-19  5:43   ` [dpdk-dev] [PATCH v3 2/5] vhost: remove useless volatile Zhihong Wang
2016-08-19  5:43   ` [dpdk-dev] [PATCH v3 3/5] vhost: add desc prefetch Zhihong Wang
2016-08-19  5:43   ` [dpdk-dev] [PATCH v3 4/5] vhost: batch update used ring Zhihong Wang
2016-08-25  3:48     ` Yuanhan Liu
2016-08-25  5:19       ` Wang, Zhihong
2016-08-19  5:43   ` [dpdk-dev] [PATCH v3 5/5] vhost: optimize cache access Zhihong Wang
2016-08-22  8:11   ` [dpdk-dev] [PATCH v3 0/5] vhost: optimize enqueue Maxime Coquelin
2016-08-22 10:01     ` Maxime Coquelin
2016-08-22 10:35       ` Thomas Monjalon
2016-08-24  3:37         ` Wang, Zhihong
2016-08-23  2:31       ` Wang, Zhihong
2016-08-23 10:43         ` Wang, Zhihong
2016-08-23 12:16           ` Maxime Coquelin
2016-08-23 12:22           ` Yuanhan Liu
2016-08-23  2:15     ` Wang, Zhihong
2016-09-21  8:50     ` Jianbo Liu
2016-09-21  9:27       ` Wang, Zhihong
2016-09-21 12:54         ` Jianbo Liu
2016-09-22  2:11           ` Wang, Zhihong
2016-09-22  2:29           ` Yuanhan Liu
2016-09-22  5:47             ` Jianbo Liu
2016-09-22  6:58               ` Wang, Zhihong
2016-09-22  9:01                 ` Jianbo Liu
2016-09-22 10:04                   ` Wang, Zhihong
2016-09-22 14:41                     ` Jianbo Liu
2016-09-23  2:56                       ` Wang, Zhihong
2016-09-23 10:41                         ` Jianbo Liu
2016-09-23 13:41                           ` Thomas Monjalon
2016-09-25  5:41                             ` Wang, Zhihong
2016-09-26  5:12                               ` Jianbo Liu
2016-09-26  5:25                                 ` Wang, Zhihong
2016-09-26  5:38                                   ` Jianbo Liu
2016-09-26  6:00                                     ` Wang, Zhihong
2016-09-26  4:24                             ` Jianbo Liu
2016-09-26  5:37                   ` Luke Gorrie
2016-09-26  5:40                     ` Jianbo Liu
2016-09-27 10:21                   ` Yuanhan Liu
2016-09-27 16:45                     ` Wang, Zhihong
2016-10-09 12:09                       ` Wang, Zhihong
2016-10-10  2:44                         ` Yuanhan Liu
2016-10-10  5:31                           ` Jianbo Liu
2016-10-10  6:22                             ` Wang, Zhihong
2016-10-10  6:57                               ` Jianbo Liu
2016-10-10  7:25                                 ` Wang, Zhihong
2016-10-12  2:53               ` Yuanhan Liu
2016-10-12 12:22                 ` Wang, Zhihong
2016-10-12 15:31                   ` Thomas Monjalon
2016-10-13  1:21                     ` Wang, Zhihong
2016-10-13  3:51                     ` Jianbo Liu
2016-10-13  5:33                   ` Yuanhan Liu [this message]
2016-10-13  5:35                     ` Yuanhan Liu
2016-10-13  6:02                     ` Wang, Zhihong
2016-10-13  7:54                       ` Maxime Coquelin
2016-10-13  9:23                         ` Maxime Coquelin
2016-10-14 10:11                           ` Yuanhan Liu
2016-08-30  3:35 ` [dpdk-dev] [PATCH v4 0/6] " Zhihong Wang
2016-08-30  3:35   ` [dpdk-dev] [PATCH v4 1/6] vhost: fix windows vm hang Zhihong Wang
2016-09-05  5:24     ` [dpdk-dev] [dpdk-stable] " Yuanhan Liu
2016-09-05  5:25       ` Wang, Zhihong
2016-09-05  5:40         ` Yuanhan Liu
2016-08-30  3:36   ` [dpdk-dev] [PATCH v4 2/6] vhost: rewrite enqueue Zhihong Wang
2016-09-05  6:39     ` Yuanhan Liu
2016-09-07  5:33       ` Yuanhan Liu
2016-09-07  5:39         ` Wang, Zhihong
2016-08-30  3:36   ` [dpdk-dev] [PATCH v4 3/6] vhost: remove useless volatile Zhihong Wang
2016-08-30  3:36   ` [dpdk-dev] [PATCH v4 4/6] vhost: add desc prefetch Zhihong Wang
2016-08-30  3:36   ` [dpdk-dev] [PATCH v4 5/6] vhost: batch update used ring Zhihong Wang
2016-08-30  3:36   ` [dpdk-dev] [PATCH v4 6/6] vhost: optimize cache access Zhihong Wang
2016-09-09  3:39 ` [dpdk-dev] [PATCH v5 0/6] vhost: optimize enqueue Zhihong Wang
2016-09-09  3:39   ` [dpdk-dev] [PATCH v5 1/6] vhost: fix windows vm hang Zhihong Wang
2016-09-09  3:39   ` [dpdk-dev] [PATCH v5 2/6] vhost: rewrite enqueue Zhihong Wang
2016-09-12 15:42     ` Maxime Coquelin
2016-09-14  8:20       ` Wang, Zhihong
2016-09-15 16:35         ` Maxime Coquelin
2016-09-12 16:26     ` Maxime Coquelin
2016-09-14  8:22       ` Wang, Zhihong
2016-09-18 14:19     ` Yuanhan Liu
2016-09-19  3:29       ` Wang, Zhihong
2016-09-09  3:39   ` [dpdk-dev] [PATCH v5 3/6] vhost: remove useless volatile Zhihong Wang
2016-09-09  3:39   ` [dpdk-dev] [PATCH v5 4/6] vhost: add desc prefetch Zhihong Wang
2016-09-09  3:39   ` [dpdk-dev] [PATCH v5 5/6] vhost: batch update used ring Zhihong Wang
2016-09-12 15:45     ` Maxime Coquelin
2016-09-14  8:43       ` Wang, Zhihong
2016-09-15 16:38         ` Maxime Coquelin
2016-09-18  2:55           ` Yuanhan Liu
2016-09-18  2:57             ` Wang, Zhihong
2016-09-09  3:39   ` [dpdk-dev] [PATCH v5 6/6] vhost: optimize cache access Zhihong Wang
2016-09-12 13:52   ` [dpdk-dev] [PATCH v5 0/6] vhost: optimize enqueue Maxime Coquelin
2016-09-12 13:56     ` Maxime Coquelin
2016-09-12 14:01     ` Yuanhan Liu
2016-09-20  2:00 ` [dpdk-dev] [PATCH v6 " Zhihong Wang
2016-09-20  2:00   ` [dpdk-dev] [PATCH v6 1/6] vhost: fix windows vm hang Zhihong Wang
2016-10-13  6:18     ` [dpdk-dev] [dpdk-stable] " Yuanhan Liu
2016-09-20  2:00   ` [dpdk-dev] [PATCH v6 2/6] vhost: rewrite enqueue Zhihong Wang
2016-09-22  9:58     ` Jianbo Liu
2016-09-22 10:13       ` Wang, Zhihong
2016-09-20  2:00   ` [dpdk-dev] [PATCH v6 3/6] vhost: remove useless volatile Zhihong Wang
2016-09-20  2:00   ` [dpdk-dev] [PATCH v6 4/6] vhost: add desc prefetch Zhihong Wang
2016-09-20  2:00   ` [dpdk-dev] [PATCH v6 5/6] vhost: batch update used ring Zhihong Wang
2016-09-20  2:00   ` [dpdk-dev] [PATCH v6 6/6] vhost: optimize cache access Zhihong Wang
2016-09-21  2:26   ` [dpdk-dev] [PATCH v6 0/6] vhost: optimize enqueue Yuanhan Liu
2016-09-21  4:39     ` Maxime Coquelin
2016-10-14  9:34   ` [dpdk-dev] [PATCH v7 0/7] vhost: optimize mergeable Rx path Yuanhan Liu
2016-10-14  9:34     ` [dpdk-dev] [PATCH v7 1/7] vhost: remove useless volatile Yuanhan Liu
2016-10-14  9:34     ` [dpdk-dev] [PATCH v7 2/7] vhost: optimize cache access Yuanhan Liu
2016-10-14  9:34     ` [dpdk-dev] [PATCH v7 3/7] vhost: simplify mergeable Rx vring reservation Yuanhan Liu
2016-10-25 22:08       ` Thomas Monjalon
2016-10-26  2:56         ` Yuanhan Liu
2016-10-14  9:34     ` [dpdk-dev] [PATCH v7 4/7] vhost: use last avail idx for avail ring reservation Yuanhan Liu
2016-10-14  9:34     ` [dpdk-dev] [PATCH v7 5/7] vhost: shadow used ring update Yuanhan Liu
2016-10-14  9:34     ` [dpdk-dev] [PATCH v7 6/7] vhost: prefetch avail ring Yuanhan Liu
2016-10-14  9:34     ` [dpdk-dev] [PATCH v7 7/7] vhost: retrieve avail head once Yuanhan Liu
2016-10-18  2:25     ` [dpdk-dev] [PATCH v7 0/7] vhost: optimize mergeable Rx path Jianbo Liu
2016-10-18 14:53     ` Maxime Coquelin
2016-10-21  7:51     ` Yuanhan Liu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20161013053324.GP16751@yliu-dev.sh.intel.com \
    --to=yuanhan.liu@linux.intel.com \
    --cc=dev@dpdk.org \
    --cc=jianbo.liu@linaro.org \
    --cc=maxime.coquelin@redhat.com \
    --cc=thomas.monjalon@6wind.com \
    --cc=zhihong.wang@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).