From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mga14.intel.com (mga14.intel.com [192.55.52.115]) by dpdk.org (Postfix) with ESMTP id 6A5192BDD for ; Mon, 7 Mar 2016 03:19:58 +0100 (CET) Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by fmsmga103.fm.intel.com with ESMTP; 06 Mar 2016 18:19:57 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.22,549,1449561600"; d="scan'208";a="664930271" Received: from fmsmsx105.amr.corp.intel.com ([10.18.124.203]) by FMSMGA003.fm.intel.com with ESMTP; 06 Mar 2016 18:19:57 -0800 Received: from fmsmsx116.amr.corp.intel.com (10.18.116.20) by FMSMSX105.amr.corp.intel.com (10.18.124.203) with Microsoft SMTP Server (TLS) id 14.3.248.2; Sun, 6 Mar 2016 18:19:57 -0800 Received: from shsmsx104.ccr.corp.intel.com (10.239.110.15) by fmsmsx116.amr.corp.intel.com (10.18.116.20) with Microsoft SMTP Server (TLS) id 14.3.248.2; Sun, 6 Mar 2016 18:19:56 -0800 Received: from shsmsx101.ccr.corp.intel.com ([169.254.1.136]) by SHSMSX104.ccr.corp.intel.com ([169.254.5.132]) with mapi id 14.03.0248.002; Mon, 7 Mar 2016 10:19:55 +0800 From: "Xie, Huawei" To: Yuanhan Liu Thread-Topic: [PATCH v2 1/7] vhost: refactor rte_vhost_dequeue_burst Thread-Index: AdF1aLd5clXhimYJTx2/yKoyh8rOiw== Date: Mon, 7 Mar 2016 02:19:54 +0000 Message-ID: References: <1449122773-25510-1-git-send-email-yuanhan.liu@linux.intel.com> <1455803352-5518-1-git-send-email-yuanhan.liu@linux.intel.com> <1455803352-5518-2-git-send-email-yuanhan.liu@linux.intel.com> <20160304022118.GU14300@yliu-dev.sh.intel.com> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-originating-ip: [10.239.127.40] Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Cc: "Michael S. Tsirkin" , "dev@dpdk.org" , Victor Kaplansky Subject: Re: [dpdk-dev] [PATCH v2 1/7] vhost: refactor rte_vhost_dequeue_burst X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: patches and discussions about DPDK List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 07 Mar 2016 02:19:59 -0000 On 3/4/2016 10:19 AM, Yuanhan Liu wrote:=0A= > On Thu, Mar 03, 2016 at 04:21:19PM +0000, Xie, Huawei wrote:=0A= >> On 2/18/2016 9:48 PM, Yuanhan Liu wrote:=0A= >>> The current rte_vhost_dequeue_burst() implementation is a bit messy=0A= >>> and logic twisted. And you could see repeat code here and there: it=0A= >>> invokes rte_pktmbuf_alloc() three times at three different places!=0A= >>>=0A= >>> However, rte_vhost_dequeue_burst() acutally does a simple job: copy=0A= >>> the packet data from vring desc to mbuf. What's tricky here is:=0A= >>>=0A= >>> - desc buff could be chained (by desc->next field), so that you need=0A= >>> fetch next one if current is wholly drained.=0A= >>>=0A= >>> - One mbuf could not be big enough to hold all desc buff, hence you=0A= >>> need to chain the mbuf as well, by the mbuf->next field.=0A= >>>=0A= >>> Even though, the logic could be simple. Here is the pseudo code.=0A= >>>=0A= >>> while (this_desc_is_not_drained_totally || has_next_desc) {=0A= >>> if (this_desc_has_drained_totally) {=0A= >>> this_desc =3D next_desc();=0A= >>> }=0A= >>>=0A= >>> if (mbuf_has_no_room) {=0A= >>> mbuf =3D allocate_a_new_mbuf();=0A= >>> }=0A= >>>=0A= >>> COPY(mbuf, desc);=0A= >>> }=0A= >>>=0A= >>> And this is how I refactored rte_vhost_dequeue_burst.=0A= >>>=0A= >>> Note that the old patch does a special handling for skipping virtio=0A= >>> header. However, that could be simply done by adjusting desc_avail=0A= >>> and desc_offset var:=0A= >>>=0A= >>> desc_avail =3D desc->len - vq->vhost_hlen;=0A= >>> desc_offset =3D vq->vhost_hlen;=0A= >>>=0A= >>> This refactor makes the code much more readable (IMO), yet it reduces= =0A= >>> binary code size (nearly 2K).=0A= >>>=0A= >>> Signed-off-by: Yuanhan Liu =0A= >>> ---=0A= >>>=0A= >>> v2: - fix potential NULL dereference bug of var "prev" and "head"=0A= >>> ---=0A= >>> lib/librte_vhost/vhost_rxtx.c | 297 +++++++++++++++++-----------------= --------=0A= >>> 1 file changed, 116 insertions(+), 181 deletions(-)=0A= >>>=0A= >>> diff --git a/lib/librte_vhost/vhost_rxtx.c b/lib/librte_vhost/vhost_rxt= x.c=0A= >>> index 5e7e5b1..d5cd0fa 100644=0A= >>> --- a/lib/librte_vhost/vhost_rxtx.c=0A= >>> +++ b/lib/librte_vhost/vhost_rxtx.c=0A= >>> @@ -702,21 +702,104 @@ vhost_dequeue_offload(struct virtio_net_hdr *hdr= , struct rte_mbuf *m)=0A= >>> }=0A= >>> }=0A= >>> =0A= >>> +static inline struct rte_mbuf *=0A= >>> +copy_desc_to_mbuf(struct virtio_net *dev, struct vhost_virtqueue *vq,= =0A= >>> + uint16_t desc_idx, struct rte_mempool *mbuf_pool)=0A= >>> +{=0A= >>> + struct vring_desc *desc;=0A= >>> + uint64_t desc_addr;=0A= >>> + uint32_t desc_avail, desc_offset;=0A= >>> + uint32_t mbuf_avail, mbuf_offset;=0A= >>> + uint32_t cpy_len;=0A= >>> + struct rte_mbuf *head =3D NULL;=0A= >>> + struct rte_mbuf *cur =3D NULL, *prev =3D NULL;=0A= >>> + struct virtio_net_hdr *hdr;=0A= >>> +=0A= >>> + desc =3D &vq->desc[desc_idx];=0A= >>> + desc_addr =3D gpa_to_vva(dev, desc->addr);=0A= >>> + rte_prefetch0((void *)(uintptr_t)desc_addr);=0A= >>> +=0A= >>> + /* Retrieve virtio net header */=0A= >>> + hdr =3D (struct virtio_net_hdr *)((uintptr_t)desc_addr);=0A= >>> + desc_avail =3D desc->len - vq->vhost_hlen;=0A= >> There is a serious bug here, desc->len - vq->vhost_len could overflow.= =0A= >> VM could easily create this case. Let us fix it here.=0A= > Nope, this issue has been there since the beginning, and this patch=0A= > is a refactor: we should not bring any functional changes. Therefore,=0A= > we should not fix it here.=0A= =0A= No, I don't mean exactly fixing in this patch but in this series.=0A= =0A= Besides, from refactoring's perspective, actually we could make things=0A= further much simpler and more readable. Both the desc chains and mbuf=0A= could be converted into iovec, then both dequeue(copy_desc_to_mbuf) and=0A= enqueue(copy_mbuf_to_desc) could use the commonly used iovec copying=0A= algorithms As data path are performance oriented, let us stop here.=0A= =0A= >=0A= > And actually, it's been fixed in the 6th patch in this series:=0A= =0A= Will check that.=0A= =0A= >=0A= > [PATCH v2 6/7] vhost: do sanity check for desc->len=0A= >=0A= > --yliu=0A= >=0A= =0A=