DPDK patches and discussions
From: "Xie, Huawei" <huawei.xie@intel.com>
To: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>,
	"dev@dpdk.org" <dev@dpdk.org>,
	Victor Kaplansky <vkaplans@redhat.com>
Subject: Re: [dpdk-dev] [PATCH v2 1/7] vhost: refactor rte_vhost_dequeue_burst
Date: Mon, 7 Mar 2016 02:19:54 +0000
Message-ID: <C37D651A908B024F974696C65296B57B4C636CBD@SHSMSX101.ccr.corp.intel.com>
In-Reply-To: <20160304022118.GU14300@yliu-dev.sh.intel.com>

On 3/4/2016 10:19 AM, Yuanhan Liu wrote:
> On Thu, Mar 03, 2016 at 04:21:19PM +0000, Xie, Huawei wrote:
>> On 2/18/2016 9:48 PM, Yuanhan Liu wrote:
>>> The current rte_vhost_dequeue_burst() implementation is a bit messy
>>> and logic twisted. And you could see repeat code here and there: it
>>> invokes rte_pktmbuf_alloc() three times at three different places!
>>>
>>> However, rte_vhost_dequeue_burst() actually does a simple job: copy
>>> the packet data from vring desc to mbuf. What's tricky here is:
>>>
>>> - desc buff could be chained (by desc->next field), so that you need
>>>   fetch next one if current is wholly drained.
>>>
>>> - One mbuf could not be big enough to hold all desc buff, hence you
>>>   need to chain the mbuf as well, by the mbuf->next field.
>>>
>>> Even though, the logic could be simple. Here is the pseudo code.
>>>
>>> 	while (this_desc_is_not_drained_totally || has_next_desc) {
>>> 		if (this_desc_has_drained_totally) {
>>> 			this_desc = next_desc();
>>> 		}
>>>
>>> 		if (mbuf_has_no_room) {
>>> 			mbuf = allocate_a_new_mbuf();
>>> 		}
>>>
>>> 		COPY(mbuf, desc);
>>> 	}
>>>
>>> And this is how I refactored rte_vhost_dequeue_burst.
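
For readers following the quoted pseudo code, here is a minimal runnable
sketch of the same loop. It is not the patch's copy_desc_to_mbuf():
toy_desc, toy_mbuf, TOY_MBUF_ROOM and both functions are hypothetical
stand-ins for vring_desc, rte_mbuf and the mempool, and the virtio header
and guest address translation are left out.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins, NOT the real vring_desc / rte_mbuf layouts. */
struct toy_desc {
	const uint8_t *data;
	uint32_t len;
	const struct toy_desc *next;	/* NULL ends the desc chain */
};

#define TOY_MBUF_ROOM 8			/* tiny on purpose, to force chaining */

struct toy_mbuf {
	uint8_t data[TOY_MBUF_ROOM];
	uint32_t data_len;
	struct toy_mbuf *next;		/* NULL ends the mbuf chain */
};

static void
toy_free_chain(struct toy_mbuf *m)
{
	while (m != NULL) {
		struct toy_mbuf *next = m->next;
		free(m);
		m = next;
	}
}

/* Copy a whole desc chain into a freshly allocated mbuf chain.
 * Returns NULL on allocation failure or if the chain carries no data. */
static struct toy_mbuf *
toy_copy_desc_to_mbuf(const struct toy_desc *desc)
{
	struct toy_mbuf *head = NULL, *cur = NULL;
	uint32_t desc_avail = desc->len, desc_offset = 0;

	while (desc_avail != 0 || desc->next != NULL) {
		/* This desc is wholly drained: fetch the next one. */
		if (desc_avail == 0) {
			desc = desc->next;
			desc_offset = 0;
			desc_avail = desc->len;
			if (desc_avail == 0)
				continue;	/* skip zero-length descs */
		}

		/* The current mbuf has no room: chain a new one. */
		if (cur == NULL || cur->data_len == TOY_MBUF_ROOM) {
			struct toy_mbuf *m = calloc(1, sizeof(*m));
			if (m == NULL) {
				toy_free_chain(head);
				return NULL;
			}
			if (cur != NULL)
				cur->next = m;
			else
				head = m;
			cur = m;
		}

		/* COPY(mbuf, desc): as much as both sides allow. */
		uint32_t room = TOY_MBUF_ROOM - cur->data_len;
		uint32_t cpy_len = desc_avail < room ? desc_avail : room;

		memcpy(cur->data + cur->data_len, desc->data + desc_offset, cpy_len);
		cur->data_len += cpy_len;
		desc_offset += cpy_len;
		desc_avail -= cpy_len;
	}

	return head;
}
```

With two chained descs of 6 and 11 bytes and 8 bytes of room per mbuf,
this yields three mbufs holding 8, 8 and 1 bytes.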
>>>
>>> Note that the old patch does a special handling for skipping virtio
>>> header. However, that could be simply done by adjusting desc_avail
>>> and desc_offset var:
>>>
>>> 	desc_avail  = desc->len - vq->vhost_hlen;
>>> 	desc_offset = vq->vhost_hlen;
>>>
>>> This refactor makes the code much more readable (IMO), yet it reduces
>>> binary code size (nearly 2K).
>>>
>>> Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
>>> ---
>>>
>>> v2: - fix potential NULL dereference bug of var "prev" and "head"
>>> ---
>>>  lib/librte_vhost/vhost_rxtx.c | 297 +++++++++++++++++-------------------------
>>>  1 file changed, 116 insertions(+), 181 deletions(-)
>>>
>>> diff --git a/lib/librte_vhost/vhost_rxtx.c b/lib/librte_vhost/vhost_rxtx.c
>>> index 5e7e5b1..d5cd0fa 100644
>>> --- a/lib/librte_vhost/vhost_rxtx.c
>>> +++ b/lib/librte_vhost/vhost_rxtx.c
>>> @@ -702,21 +702,104 @@ vhost_dequeue_offload(struct virtio_net_hdr *hdr, struct rte_mbuf *m)
>>>  	}
>>>  }
>>>  
>>> +static inline struct rte_mbuf *
>>> +copy_desc_to_mbuf(struct virtio_net *dev, struct vhost_virtqueue *vq,
>>> +		  uint16_t desc_idx, struct rte_mempool *mbuf_pool)
>>> +{
>>> +	struct vring_desc *desc;
>>> +	uint64_t desc_addr;
>>> +	uint32_t desc_avail, desc_offset;
>>> +	uint32_t mbuf_avail, mbuf_offset;
>>> +	uint32_t cpy_len;
>>> +	struct rte_mbuf *head = NULL;
>>> +	struct rte_mbuf *cur = NULL, *prev = NULL;
>>> +	struct virtio_net_hdr *hdr;
>>> +
>>> +	desc = &vq->desc[desc_idx];
>>> +	desc_addr = gpa_to_vva(dev, desc->addr);
>>> +	rte_prefetch0((void *)(uintptr_t)desc_addr);
>>> +
>>> +	/* Retrieve virtio net header */
>>> +	hdr = (struct virtio_net_hdr *)((uintptr_t)desc_addr);
>>> +	desc_avail  = desc->len - vq->vhost_hlen;
>> There is a serious bug here: desc->len - vq->vhost_hlen could underflow.
>> A VM could easily create this case. Let us fix it here.
> Nope, this issue has been there since the beginning, and this patch
> is a refactor: we should not bring any functional changes. Therefore,
> we should not fix it here.

No, I don't mean fixing it in this exact patch, but somewhere in this series.
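
To make the concern concrete, here is a small sketch of the wrap-around
and of one possible guard. It assumes both lengths are 32-bit unsigned
values; the helper names are hypothetical and this is not necessarily how
patch 6/7 performs its check.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical: mirrors "desc_avail = desc->len - vq->vhost_hlen" with
 * no check. If the guest posts desc_len < hdr_len, the unsigned
 * subtraction wraps to a huge value instead of going negative. */
static uint32_t
toy_unchecked_avail(uint32_t desc_len, uint32_t hdr_len)
{
	return desc_len - hdr_len;
}

/* Hypothetical guard: reject descriptors too short to hold the header.
 * On success, *payload_len is the number of bytes after the header. */
static bool
toy_desc_payload_len(uint32_t desc_len, uint32_t hdr_len,
		     uint32_t *payload_len)
{
	if (desc_len < hdr_len)
		return false;

	*payload_len = desc_len - hdr_len;
	return true;
}
```

For example, a 4-byte descriptor with a 12-byte header gives
toy_unchecked_avail(4, 12) == 0xFFFFFFF8, which a copy loop would then
treat as the number of bytes available.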

Besides, from a refactoring perspective, we could actually make things
much simpler and more readable. Both the desc chain and the mbuf chain
could be converted into iovecs; then both dequeue (copy_desc_to_mbuf) and
enqueue (copy_mbuf_to_desc) could use the commonly used iovec copying
algorithms. However, as the data path is performance oriented, let us
stop here.
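
A rough sketch of what such a generic iovec copy might look like,
assuming the desc chain and the mbuf chain have already been flattened
into struct iovec arrays. toy_iov_copy is a hypothetical name, not an
existing DPDK or libc function.

```c
#include <stddef.h>
#include <string.h>
#include <sys/uio.h>

/* Hypothetical helper: copy bytes from the src segments into the dst
 * segments until either side is exhausted. Returns the bytes copied. */
static size_t
toy_iov_copy(const struct iovec *dst, int dst_cnt,
	     const struct iovec *src, int src_cnt)
{
	size_t total = 0, d_off = 0, s_off = 0;
	int d = 0, s = 0;

	while (d < dst_cnt && s < src_cnt) {
		size_t d_room = dst[d].iov_len - d_off;
		size_t s_left = src[s].iov_len - s_off;
		size_t n = d_room < s_left ? d_room : s_left;

		if (n > 0)
			memcpy((char *)dst[d].iov_base + d_off,
			       (const char *)src[s].iov_base + s_off, n);

		d_off += n;
		s_off += n;
		total += n;

		/* Advance whichever segment is now full or drained. */
		if (d_off == dst[d].iov_len) {
			d++;
			d_off = 0;
		}
		if (s_off == src[s].iov_len) {
			s++;
			s_off = 0;
		}
	}

	return total;
}
```

Dequeue would pass the desc segments as src and the mbuf segments as
dst, and enqueue the other way round. Building the two arrays first is
the extra per-packet work that makes this unattractive on the data path.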

>
> And actually, it's been fixed in the 6th patch in this series:

Will check that.

>
>     [PATCH v2 6/7] vhost: do sanity check for desc->len
>
> 	--yliu
>


