From: "Wang, Zhihong" <zhihong.wang@intel.com>
To: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Cc: Jianbo Liu <jianbo.liu@linaro.org>,
Thomas Monjalon <thomas.monjalon@6wind.com>,
Maxime Coquelin <maxime.coquelin@redhat.com>,
"dev@dpdk.org" <dev@dpdk.org>
Subject: Re: [dpdk-dev] [PATCH v3 0/5] vhost: optimize enqueue
Date: Thu, 13 Oct 2016 06:02:04 +0000 [thread overview]
Message-ID: <8F6C2BD409508844A0EFC19955BE09414E7CDDC7@SHSMSX103.ccr.corp.intel.com> (raw)
In-Reply-To: <20161013053324.GP16751@yliu-dev.sh.intel.com>
> -----Original Message-----
> From: Yuanhan Liu [mailto:yuanhan.liu@linux.intel.com]
> Sent: Thursday, October 13, 2016 1:33 PM
> To: Wang, Zhihong <zhihong.wang@intel.com>
> Cc: Jianbo Liu <jianbo.liu@linaro.org>; Thomas Monjalon
> <thomas.monjalon@6wind.com>; Maxime Coquelin
> <maxime.coquelin@redhat.com>; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v3 0/5] vhost: optimize enqueue
>
> On Wed, Oct 12, 2016 at 12:22:08PM +0000, Wang, Zhihong wrote:
> > > > >> > 3. How many percentage of drop are you seeing?
> > > > The testing result:
> > > > size (bytes) improvement (%)
> > > > 64 3.92
> > > > 128 11.51
> > > > 256 24.16
> > > > 512 -13.79
> > > > 1024 -22.51
> > > > 1500 -12.22
> > > > A correction is that performance is dropping if byte size is larger than
> 512.
> > >
> > > I have thought of this twice. Unfortunately, I think I may need NACK this
> > > series.
> > >
> > > Merging two code path into one is really good: as you stated, it improves
> > > the maintainability. But only if we see no performance regression on both
> > > path after the refactor. Unfortunately, that's not the case here: it hurts
> > > the performance for one code path (non-mrg Rx).
> > >
> > > That makes me think we may should not do the code path merge at all. I
> think
> > > that also aligns with what you have said before (internally): we could do
> the
> > > merge if it gives comparable performance before and after that.
> > >
> > > Besides that, I don't quite like the way you did in patch 2 (rewrite
> enqueue):
> > > you made a lot of changes in one patch. That means if something wrong
> > > happened,
> > > it is hard to narrow down which change introduces that regression. Badly,
> > > that's exactly what we met here. Weeks have been passed, I see no
> progress.
> > >
> > > That's the reason we like the idea of "one patch only does one thing, an
> > > atomic thing".
> >
> >
> > Yuanhan, folks,
> >
> > Thanks for the analysis. I disagree here though.
> >
> > I analyze, develop, benchmark on x86 platforms, where this patch
> > works great.
>
> Yes, that's great effort! With your hardwork, we know what the bottleneck
> is and how it could be improved.
>
> However, you don't have to do code refactor (merge two code path to one)
> to apply those improvements. From what I know, in this patchset, there
> are two factors could improve the performance:
>
> - copy hdr together with packet data
>
> - shadow used ring update and update at once
>
> The overall performance boost I got with your v6 patchset with mrg-Rx
> code path is about 27% (in PVP case). And I have just applied the 1st
> optimization, it yields about 20% boosts. The left could be covered if
> we apply the 2nd optimization (I guess).
>
> That would be a clean way to optimize vhost mergeable Rx path:
>
> - you don't touch non-mrg Rx path (well, you may could apply the
> shadow_used_ring trick to it as wel)
>
> This would at least make sure we will have no such performance
> regression issue reported by ARM guys.
>
> - you don't refactor the code
>
> The rewrite from scratch could introduce other issues, besides the
> performance regression. We may just don't know it yet.
>
>
> Make sense to you? If you agree, I think we could still make it in
> this release: they would be some small changes after all. For example,
> below is the patch applies the 1st optimization tip on top of
> dpdk-next-virtio
Thanks for this great idea. I think it's a better way to do it.
I'll start to make the patch then.
>
> --yliu
>
> ---------------------------------------------------------------
> diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
> index 8a151af..0ddb5af 100644
> --- a/lib/librte_vhost/virtio_net.c
> +++ b/lib/librte_vhost/virtio_net.c
> @@ -379,7 +379,7 @@ copy_mbuf_to_desc_mergeable(struct virtio_net
> *dev, struct vhost_virtqueue *vq,
> uint16_t end_idx, struct rte_mbuf *m,
> struct buf_vector *buf_vec)
> {
> - struct virtio_net_hdr_mrg_rxbuf virtio_hdr = {{0, 0, 0, 0, 0, 0}, 0};
> + struct virtio_net_hdr_mrg_rxbuf *virtio_hdr;
> uint32_t vec_idx = 0;
> uint16_t start_idx = vq->last_used_idx;
> uint16_t cur_idx = start_idx;
> @@ -388,6 +388,8 @@ copy_mbuf_to_desc_mergeable(struct virtio_net
> *dev, struct vhost_virtqueue *vq,
> uint32_t desc_offset, desc_avail;
> uint32_t cpy_len;
> uint16_t desc_idx, used_idx;
> + uint16_t num_buffers = end_idx - start_idx;
> + int hdr_copied = 0;
>
> if (unlikely(m == NULL))
> return 0;
> @@ -399,16 +401,11 @@ copy_mbuf_to_desc_mergeable(struct virtio_net
> *dev, struct vhost_virtqueue *vq,
> if (buf_vec[vec_idx].buf_len < dev->vhost_hlen || !desc_addr)
> return 0;
>
> - rte_prefetch0((void *)(uintptr_t)desc_addr);
> + virtio_hdr = (struct virtio_net_hdr_mrg_rxbuf
> *)(uintptr_t)desc_addr;
> + rte_prefetch0(virtio_hdr);
>
> - virtio_hdr.num_buffers = end_idx - start_idx;
> LOG_DEBUG(VHOST_DATA, "(%d) RX: num merge buffers %d\n",
> - dev->vid, virtio_hdr.num_buffers);
> -
> - virtio_enqueue_offload(m, &virtio_hdr.hdr);
> - copy_virtio_net_hdr(dev, desc_addr, virtio_hdr);
> - vhost_log_write(dev, buf_vec[vec_idx].buf_addr, dev->vhost_hlen);
> - PRINT_PACKET(dev, (uintptr_t)desc_addr, dev->vhost_hlen, 0);
> + dev->vid, num_buffers);
>
> desc_avail = buf_vec[vec_idx].buf_len - dev->vhost_hlen;
> desc_offset = dev->vhost_hlen;
> @@ -450,6 +447,15 @@ copy_mbuf_to_desc_mergeable(struct virtio_net
> *dev, struct vhost_virtqueue *vq,
> mbuf_avail = rte_pktmbuf_data_len(m);
> }
>
> + if (hdr_copied == 0) {
> + virtio_hdr->num_buffers = num_buffers;
> + virtio_enqueue_offload(m, &virtio_hdr->hdr);
> + vhost_log_write(dev, buf_vec[vec_idx].buf_addr,
> dev->vhost_hlen);
> + PRINT_PACKET(dev, (uintptr_t)desc_addr, dev-
> >vhost_hlen, 0);
> +
> + hdr_copied = 1;
> + }
> +
> cpy_len = RTE_MIN(desc_avail, mbuf_avail);
> rte_memcpy((void *)((uintptr_t)(desc_addr + desc_offset)),
> rte_pktmbuf_mtod_offset(m, void *, mbuf_offset),
next prev parent reply other threads:[~2016-10-13 6:02 UTC|newest]
Thread overview: 140+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-08-16 3:50 [dpdk-dev] [PATCH] optimize vhost enqueue Zhihong Wang
2016-08-16 13:59 ` Maxime Coquelin
2016-08-17 1:45 ` Wang, Zhihong
2016-08-17 2:38 ` Yuanhan Liu
2016-08-17 6:41 ` Wang, Zhihong
2016-08-17 9:17 ` Maxime Coquelin
2016-08-17 9:51 ` Yuanhan Liu
2016-08-18 13:44 ` Wang, Zhihong
2016-08-17 10:07 ` Wang, Zhihong
2016-08-18 6:33 ` [dpdk-dev] [PATCH v2 0/6] vhost: optimize enqueue Zhihong Wang
2016-08-18 6:33 ` [dpdk-dev] [PATCH v2 1/6] vhost: rewrite enqueue Zhihong Wang
2016-08-19 2:39 ` Yuanhan Liu
2016-08-19 7:07 ` Wang, Zhihong
2016-08-18 6:33 ` [dpdk-dev] [PATCH v2 2/6] vhost: remove obsolete Zhihong Wang
2016-08-19 2:32 ` Yuanhan Liu
2016-08-19 7:08 ` Wang, Zhihong
2016-08-18 6:33 ` [dpdk-dev] [PATCH v2 3/6] vhost: remove useless volatile Zhihong Wang
2016-08-18 6:33 ` [dpdk-dev] [PATCH v2 4/6] vhost: add desc prefetch Zhihong Wang
2016-08-18 6:33 ` [dpdk-dev] [PATCH v2 5/6] vhost: batch update used ring Zhihong Wang
2016-08-18 6:33 ` [dpdk-dev] [PATCH v2 6/6] vhost: optimize cache access Zhihong Wang
2016-08-19 5:43 ` [dpdk-dev] [PATCH v3 0/5] vhost: optimize enqueue Zhihong Wang
2016-08-19 5:43 ` [dpdk-dev] [PATCH v3 1/5] vhost: rewrite enqueue Zhihong Wang
2016-08-22 9:35 ` Maxime Coquelin
2016-08-23 2:27 ` Wang, Zhihong
2016-08-25 4:00 ` Yuanhan Liu
2016-08-19 5:43 ` [dpdk-dev] [PATCH v3 2/5] vhost: remove useless volatile Zhihong Wang
2016-08-19 5:43 ` [dpdk-dev] [PATCH v3 3/5] vhost: add desc prefetch Zhihong Wang
2016-08-19 5:43 ` [dpdk-dev] [PATCH v3 4/5] vhost: batch update used ring Zhihong Wang
2016-08-25 3:48 ` Yuanhan Liu
2016-08-25 5:19 ` Wang, Zhihong
2016-08-19 5:43 ` [dpdk-dev] [PATCH v3 5/5] vhost: optimize cache access Zhihong Wang
2016-08-22 8:11 ` [dpdk-dev] [PATCH v3 0/5] vhost: optimize enqueue Maxime Coquelin
2016-08-22 10:01 ` Maxime Coquelin
2016-08-22 10:35 ` Thomas Monjalon
2016-08-24 3:37 ` Wang, Zhihong
2016-08-23 2:31 ` Wang, Zhihong
2016-08-23 10:43 ` Wang, Zhihong
2016-08-23 12:16 ` Maxime Coquelin
2016-08-23 12:22 ` Yuanhan Liu
2016-08-23 2:15 ` Wang, Zhihong
2016-09-21 8:50 ` Jianbo Liu
2016-09-21 9:27 ` Wang, Zhihong
2016-09-21 12:54 ` Jianbo Liu
2016-09-22 2:11 ` Wang, Zhihong
2016-09-22 2:29 ` Yuanhan Liu
2016-09-22 5:47 ` Jianbo Liu
2016-09-22 6:58 ` Wang, Zhihong
2016-09-22 9:01 ` Jianbo Liu
2016-09-22 10:04 ` Wang, Zhihong
2016-09-22 14:41 ` Jianbo Liu
2016-09-23 2:56 ` Wang, Zhihong
2016-09-23 10:41 ` Jianbo Liu
2016-09-23 13:41 ` Thomas Monjalon
2016-09-25 5:41 ` Wang, Zhihong
2016-09-26 5:12 ` Jianbo Liu
2016-09-26 5:25 ` Wang, Zhihong
2016-09-26 5:38 ` Jianbo Liu
2016-09-26 6:00 ` Wang, Zhihong
2016-09-26 4:24 ` Jianbo Liu
2016-09-26 5:37 ` Luke Gorrie
2016-09-26 5:40 ` Jianbo Liu
2016-09-27 10:21 ` Yuanhan Liu
2016-09-27 16:45 ` Wang, Zhihong
2016-10-09 12:09 ` Wang, Zhihong
2016-10-10 2:44 ` Yuanhan Liu
2016-10-10 5:31 ` Jianbo Liu
2016-10-10 6:22 ` Wang, Zhihong
2016-10-10 6:57 ` Jianbo Liu
2016-10-10 7:25 ` Wang, Zhihong
2016-10-12 2:53 ` Yuanhan Liu
2016-10-12 12:22 ` Wang, Zhihong
2016-10-12 15:31 ` Thomas Monjalon
2016-10-13 1:21 ` Wang, Zhihong
2016-10-13 3:51 ` Jianbo Liu
2016-10-13 5:33 ` Yuanhan Liu
2016-10-13 5:35 ` Yuanhan Liu
2016-10-13 6:02 ` Wang, Zhihong [this message]
2016-10-13 7:54 ` Maxime Coquelin
2016-10-13 9:23 ` Maxime Coquelin
2016-10-14 10:11 ` Yuanhan Liu
2016-08-30 3:35 ` [dpdk-dev] [PATCH v4 0/6] " Zhihong Wang
2016-08-30 3:35 ` [dpdk-dev] [PATCH v4 1/6] vhost: fix windows vm hang Zhihong Wang
2016-09-05 5:24 ` [dpdk-dev] [dpdk-stable] " Yuanhan Liu
2016-09-05 5:25 ` Wang, Zhihong
2016-09-05 5:40 ` Yuanhan Liu
2016-08-30 3:36 ` [dpdk-dev] [PATCH v4 2/6] vhost: rewrite enqueue Zhihong Wang
2016-09-05 6:39 ` Yuanhan Liu
2016-09-07 5:33 ` Yuanhan Liu
2016-09-07 5:39 ` Wang, Zhihong
2016-08-30 3:36 ` [dpdk-dev] [PATCH v4 3/6] vhost: remove useless volatile Zhihong Wang
2016-08-30 3:36 ` [dpdk-dev] [PATCH v4 4/6] vhost: add desc prefetch Zhihong Wang
2016-08-30 3:36 ` [dpdk-dev] [PATCH v4 5/6] vhost: batch update used ring Zhihong Wang
2016-08-30 3:36 ` [dpdk-dev] [PATCH v4 6/6] vhost: optimize cache access Zhihong Wang
2016-09-09 3:39 ` [dpdk-dev] [PATCH v5 0/6] vhost: optimize enqueue Zhihong Wang
2016-09-09 3:39 ` [dpdk-dev] [PATCH v5 1/6] vhost: fix windows vm hang Zhihong Wang
2016-09-09 3:39 ` [dpdk-dev] [PATCH v5 2/6] vhost: rewrite enqueue Zhihong Wang
2016-09-12 15:42 ` Maxime Coquelin
2016-09-14 8:20 ` Wang, Zhihong
2016-09-15 16:35 ` Maxime Coquelin
2016-09-12 16:26 ` Maxime Coquelin
2016-09-14 8:22 ` Wang, Zhihong
2016-09-18 14:19 ` Yuanhan Liu
2016-09-19 3:29 ` Wang, Zhihong
2016-09-09 3:39 ` [dpdk-dev] [PATCH v5 3/6] vhost: remove useless volatile Zhihong Wang
2016-09-09 3:39 ` [dpdk-dev] [PATCH v5 4/6] vhost: add desc prefetch Zhihong Wang
2016-09-09 3:39 ` [dpdk-dev] [PATCH v5 5/6] vhost: batch update used ring Zhihong Wang
2016-09-12 15:45 ` Maxime Coquelin
2016-09-14 8:43 ` Wang, Zhihong
2016-09-15 16:38 ` Maxime Coquelin
2016-09-18 2:55 ` Yuanhan Liu
2016-09-18 2:57 ` Wang, Zhihong
2016-09-09 3:39 ` [dpdk-dev] [PATCH v5 6/6] vhost: optimize cache access Zhihong Wang
2016-09-12 13:52 ` [dpdk-dev] [PATCH v5 0/6] vhost: optimize enqueue Maxime Coquelin
2016-09-12 13:56 ` Maxime Coquelin
2016-09-12 14:01 ` Yuanhan Liu
2016-09-20 2:00 ` [dpdk-dev] [PATCH v6 " Zhihong Wang
2016-09-20 2:00 ` [dpdk-dev] [PATCH v6 1/6] vhost: fix windows vm hang Zhihong Wang
2016-10-13 6:18 ` [dpdk-dev] [dpdk-stable] " Yuanhan Liu
2016-09-20 2:00 ` [dpdk-dev] [PATCH v6 2/6] vhost: rewrite enqueue Zhihong Wang
2016-09-22 9:58 ` Jianbo Liu
2016-09-22 10:13 ` Wang, Zhihong
2016-09-20 2:00 ` [dpdk-dev] [PATCH v6 3/6] vhost: remove useless volatile Zhihong Wang
2016-09-20 2:00 ` [dpdk-dev] [PATCH v6 4/6] vhost: add desc prefetch Zhihong Wang
2016-09-20 2:00 ` [dpdk-dev] [PATCH v6 5/6] vhost: batch update used ring Zhihong Wang
2016-09-20 2:00 ` [dpdk-dev] [PATCH v6 6/6] vhost: optimize cache access Zhihong Wang
2016-09-21 2:26 ` [dpdk-dev] [PATCH v6 0/6] vhost: optimize enqueue Yuanhan Liu
2016-09-21 4:39 ` Maxime Coquelin
2016-10-14 9:34 ` [dpdk-dev] [PATCH v7 0/7] vhost: optimize mergeable Rx path Yuanhan Liu
2016-10-14 9:34 ` [dpdk-dev] [PATCH v7 1/7] vhost: remove useless volatile Yuanhan Liu
2016-10-14 9:34 ` [dpdk-dev] [PATCH v7 2/7] vhost: optimize cache access Yuanhan Liu
2016-10-14 9:34 ` [dpdk-dev] [PATCH v7 3/7] vhost: simplify mergeable Rx vring reservation Yuanhan Liu
2016-10-25 22:08 ` Thomas Monjalon
2016-10-26 2:56 ` Yuanhan Liu
2016-10-14 9:34 ` [dpdk-dev] [PATCH v7 4/7] vhost: use last avail idx for avail ring reservation Yuanhan Liu
2016-10-14 9:34 ` [dpdk-dev] [PATCH v7 5/7] vhost: shadow used ring update Yuanhan Liu
2016-10-14 9:34 ` [dpdk-dev] [PATCH v7 6/7] vhost: prefetch avail ring Yuanhan Liu
2016-10-14 9:34 ` [dpdk-dev] [PATCH v7 7/7] vhost: retrieve avail head once Yuanhan Liu
2016-10-18 2:25 ` [dpdk-dev] [PATCH v7 0/7] vhost: optimize mergeable Rx path Jianbo Liu
2016-10-18 14:53 ` Maxime Coquelin
2016-10-21 7:51 ` Yuanhan Liu
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=8F6C2BD409508844A0EFC19955BE09414E7CDDC7@SHSMSX103.ccr.corp.intel.com \
--to=zhihong.wang@intel.com \
--cc=dev@dpdk.org \
--cc=jianbo.liu@linaro.org \
--cc=maxime.coquelin@redhat.com \
--cc=thomas.monjalon@6wind.com \
--cc=yuanhan.liu@linux.intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).