From: "Xie, Huawei"
To: "Ouyang, Changchun", "dev@dpdk.org"
Date: Mon, 18 May 2015 09:39:20 +0000
Subject: Re: [dpdk-dev] [PATCH] virtio: Fix enqueue/dequeue can't handle chained vring descriptors.
References: <1430720780-27525-1-git-send-email-changchun.ouyang@intel.com>

On 5/4/2015 2:27 PM, Ouyang Changchun wrote:
> Vring enqueue need consider the 2 cases:
> 1. Vring descriptors chained together, the first one is for virtio header, the rest are for real data;
> 2. Only one descriptor, virtio header and real data share one single descriptor;
>
> So does vring dequeue.
>
> Signed-off-by: Changchun Ouyang
> ---
>  lib/librte_vhost/vhost_rxtx.c | 60 +++++++++++++++++++++++++++++++------------
>  1 file changed, 44 insertions(+), 16 deletions(-)
>
> diff --git a/lib/librte_vhost/vhost_rxtx.c b/lib/librte_vhost/vhost_rxtx.c
> index 510ffe8..3135883 100644
> --- a/lib/librte_vhost/vhost_rxtx.c
> +++ b/lib/librte_vhost/vhost_rxtx.c
> @@ -59,7 +59,7 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,
>  	struct virtio_net_hdr_mrg_rxbuf virtio_hdr = {{0, 0, 0, 0, 0, 0}, 0};
>  	uint64_t buff_addr = 0;
>  	uint64_t buff_hdr_addr = 0;
> -	uint32_t head[MAX_PKT_BURST], packet_len = 0;
> +	uint32_t head[MAX_PKT_BURST];
>  	uint32_t head_idx, packet_success = 0;
>  	uint16_t avail_idx, res_cur_idx;
>  	uint16_t res_base_idx, res_end_idx;
> @@ -113,6 +113,10 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,
>  		rte_prefetch0(&vq->desc[head[packet_success]]);
>
>  	while (res_cur_idx != res_end_idx) {
> +		uint32_t offset = 0;
> +		uint32_t data_len, len_to_cpy;
> +		uint8_t plus_hdr = 0;
> +
>  		/* Get descriptor from available ring */
>  		desc = &vq->desc[head[packet_success]];
>
> @@ -125,7 +129,6 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,
>
>  		/* Copy virtio_hdr to packet and increment buffer address */
>  		buff_hdr_addr = buff_addr;
> -		packet_len = rte_pktmbuf_data_len(buff) + vq->vhost_hlen;
>
>  		/*
>  		 * If the descriptors are chained the header and data are
> @@ -136,24 +139,44 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,
>  			desc = &vq->desc[desc->next];
>  			/* Buffer address translation. */
>  			buff_addr = gpa_to_vva(dev, desc->addr);
> -			desc->len = rte_pktmbuf_data_len(buff);
>  		} else {
>  			buff_addr += vq->vhost_hlen;
> -			desc->len = packet_len;
> +			plus_hdr = 1;
>  		}
>
> +		data_len = rte_pktmbuf_data_len(buff);
> +		len_to_cpy = RTE_MIN(data_len, desc->len);
> +		do {
> +			if (len_to_cpy > 0) {
> +				/* Copy mbuf data to buffer */
> +				rte_memcpy((void *)(uintptr_t)buff_addr,
> +					(const void *)(rte_pktmbuf_mtod(buff, const char *) + offset),
> +					len_to_cpy);
> +				PRINT_PACKET(dev, (uintptr_t)buff_addr,
> +					len_to_cpy, 0);
> +
> +				desc->len = len_to_cpy + (plus_hdr ? vq->vhost_hlen : 0);

Do we really need to rewrite desc->len again and again? At most we would
only need to change the desc->len of the last descriptor.
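Just to illustrate what I mean, a rough, untested sketch of that do/while
loop with the per-descriptor desc->len writes dropped; last_desc and
last_written are names I made up, and the shared header+data descriptor
(plus_hdr) handling is left out for brevity:

	struct vring_desc *last_desc = desc;	/* last buffer that got data */
	uint32_t last_written = 0;

	do {
		if (len_to_cpy > 0) {
			/* Copy mbuf data to buffer */
			rte_memcpy((void *)(uintptr_t)buff_addr,
				rte_pktmbuf_mtod(buff, const char *) + offset,
				len_to_cpy);
			offset += len_to_cpy;
			last_desc = desc;
			last_written = len_to_cpy;
		}
		if (offset == data_len || !(desc->flags & VRING_DESC_F_NEXT))
			break;
		desc = &vq->desc[desc->next];
		buff_addr = gpa_to_vva(dev, desc->addr);
		len_to_cpy = RTE_MIN(data_len - offset, desc->len);
	} while (1);

	/*
	 * If desc->len has to be written at all, only the last descriptor
	 * that received data can need it, and only once, after the loop
	 * (the shared header+data case would add vq->vhost_hlen here).
	 */
	last_desc->len = last_written;

Every descriptor before the last one is filled completely, so its len is
already what was written into it; that is why a single write after the
loop is enough.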
> +				offset += len_to_cpy;
> +				if (desc->flags & VRING_DESC_F_NEXT) {
> +					desc = &vq->desc[desc->next];
> +					buff_addr = gpa_to_vva(dev, desc->addr);
> +					len_to_cpy = RTE_MIN(data_len - offset, desc->len);
> +				} else
> +					break;

Still there are two issues here:
a) If the data couldn't be fully copied to the chain of guest buffers, we
shouldn't do any copy at all.
b) Scattered (multi-segment) mbufs aren't considered.
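For reference, a rough, untested sketch of how both points could be
handled: check the room in the whole descriptor chain first, then walk
every mbuf segment. The helper name and locals are invented, and for
simplicity it assumes the virtio header sits in its own descriptor:

static inline uint32_t
copy_mbuf_to_desc_chain(struct virtio_net *dev, struct vhost_virtqueue *vq,
			struct vring_desc *desc, struct rte_mbuf *buff)
{
	struct vring_desc *d = desc;
	struct rte_mbuf *seg;
	uint64_t dst;
	uint32_t room, desc_left, copied = 0;

	/* (a) check the whole chain up front; copy nothing if the packet
	 * does not fit */
	room = d->len;
	while (d->flags & VRING_DESC_F_NEXT) {
		d = &vq->desc[d->next];
		room += d->len;
	}
	if (room < rte_pktmbuf_pkt_len(buff))
		return 0;

	/* (b) walk every mbuf segment, spilling into the next descriptor
	 * whenever the current guest buffer is full */
	d = desc;
	dst = gpa_to_vva(dev, d->addr);
	desc_left = d->len;
	for (seg = buff; seg != NULL; seg = seg->next) {
		const char *src = rte_pktmbuf_mtod(seg, const char *);
		uint32_t seg_left = rte_pktmbuf_data_len(seg);

		while (seg_left > 0) {
			uint32_t len;

			if (desc_left == 0) {	/* current guest buffer full */
				d = &vq->desc[d->next];
				dst = gpa_to_vva(dev, d->addr);
				desc_left = d->len;
			}
			len = RTE_MIN(seg_left, desc_left);
			rte_memcpy((void *)(uintptr_t)dst, src, len);
			src += len;
			dst += len;
			seg_left -= len;
			desc_left -= len;
			copied += len;
		}
	}

	return copied;
}

The caller would drop the packet when this returns 0, and otherwise put
copied + vq->vhost_hlen into the used ring entry.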
> +			} else {
> +				desc->len = 0;
> +				if (desc->flags & VRING_DESC_F_NEXT)
> +					desc = &vq->desc[desc->next];
> +				else
> +					break;
> +			}
> +		} while (1);
> +
>  		/* Update used ring with desc information */
>  		vq->used->ring[res_cur_idx & (vq->size - 1)].id =
>  						head[packet_success];
> -		vq->used->ring[res_cur_idx & (vq->size - 1)].len = packet_len;
> -
> -		/* Copy mbuf data to buffer */
> -		/* FIXME for sg mbuf and the case that desc couldn't hold the mbuf data */
> -		rte_memcpy((void *)(uintptr_t)buff_addr,
> -			rte_pktmbuf_mtod(buff, const void *),
> -			rte_pktmbuf_data_len(buff));
> -		PRINT_PACKET(dev, (uintptr_t)buff_addr,
> -			rte_pktmbuf_data_len(buff), 0);
> +		vq->used->ring[res_cur_idx & (vq->size - 1)].len =
> +						offset + vq->vhost_hlen;
>
>  		res_cur_idx++;
>  		packet_success++;
> @@ -583,7 +606,14 @@ rte_vhost_dequeue_burst(struct virtio_net *dev, uint16_t queue_id,
>  		desc = &vq->desc[head[entry_success]];
>
>  		/* Discard first buffer as it is the virtio header */
> -		desc = &vq->desc[desc->next];
> +		if (desc->flags & VRING_DESC_F_NEXT) {
> +			desc = &vq->desc[desc->next];
> +			vb_offset = 0;
> +			vb_avail = desc->len;
> +		} else {
> +			vb_offset = vq->vhost_hlen;
> +			vb_avail = desc->len - vb_offset;
> +		}
>
>  		/* Buffer address translation. */
>  		vb_addr = gpa_to_vva(dev, desc->addr);
> @@ -602,8 +632,6 @@ rte_vhost_dequeue_burst(struct virtio_net *dev, uint16_t queue_id,
>  		vq->used->ring[used_idx].id = head[entry_success];
>  		vq->used->ring[used_idx].len = 0;
>
> -		vb_offset = 0;
> -		vb_avail = desc->len;
>  		/* Allocate an mbuf and populate the structure. */
>  		m = rte_pktmbuf_alloc(mbuf_pool);
>  		if (unlikely(m == NULL)) {