From: "Ouyang, Changchun"
To: "Xie, Huawei", "dev@dpdk.org"
Date: Sun, 31 May 2015 13:22:50 +0000
Subject: Re: [dpdk-dev] [PATCH v2 1/5] lib_vhost: Fix enqueue/dequeue can't
 handle chained vring descriptors

> -----Original Message-----
> From: Ouyang, Changchun
> Sent: Sunday, May 31, 2015 9:00 PM
> To: Xie, Huawei; dev@dpdk.org
> Cc: Cao, Waterman; Ouyang, Changchun
> Subject: RE: [PATCH v2 1/5] lib_vhost: Fix enqueue/dequeue can't handle
> chained vring descriptors
>
>
>
> > -----Original Message-----
> > From: Xie, Huawei
> > Sent: Sunday, May 31, 2015 4:41 PM
> > To: Ouyang, Changchun; dev@dpdk.org
> > Cc: Cao, Waterman
> > Subject: Re: [PATCH v2 1/5] lib_vhost: Fix enqueue/dequeue can't
> > handle chained vring descriptors
> >
> > On 5/28/2015 11:17 PM, Ouyang, Changchun wrote:
> > > Vring enqueue needs to consider two cases:
> > > 1. Vring descriptors chained together: the first one is for the virtio
> > >    header, the rest are for the real data; the virtio driver in Linux
> > >    usually uses this scheme.
> > > 2. Only one descriptor: the virtio header and the real data share one
> > >    single descriptor; the virtio-net PMD uses such a scheme.
> > >
> > > The same applies to vring dequeue: it should not assume the vring
> > > descriptor is chained or not chained. Virtio in different Linux versions
> > > behaves differently, e.g. Fedora 20 uses chained vring descriptors,
> > > while Fedora 21 uses one single vring descriptor for tx.
> > >
> > > Changes in v2
> > >   - drop the uncompleted packet
> > >   - refine code logic
> > >
> > > Signed-off-by: Changchun Ouyang
> > > ---
> > >  lib/librte_vhost/vhost_rxtx.c | 65 +++++++++++++++++++++++++++++++++----------
> > >  1 file changed, 50 insertions(+), 15 deletions(-)
> > >
> > > diff --git a/lib/librte_vhost/vhost_rxtx.c b/lib/librte_vhost/vhost_rxtx.c
> > > index 4809d32..06ae2df 100644
> > > --- a/lib/librte_vhost/vhost_rxtx.c
> > > +++ b/lib/librte_vhost/vhost_rxtx.c
> > > @@ -59,7 +59,7 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,
> > >  	struct virtio_net_hdr_mrg_rxbuf virtio_hdr = {{0, 0, 0, 0, 0, 0}, 0};
> > >  	uint64_t buff_addr = 0;
> > >  	uint64_t buff_hdr_addr = 0;
> > > -	uint32_t head[MAX_PKT_BURST], packet_len = 0;
> > > +	uint32_t head[MAX_PKT_BURST];
> > >  	uint32_t head_idx, packet_success = 0;
> > >  	uint16_t avail_idx, res_cur_idx;
> > >  	uint16_t res_base_idx, res_end_idx;
> > > @@ -113,6 +113,10 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,
> > >  	rte_prefetch0(&vq->desc[head[packet_success]]);
> > >
> > >  	while (res_cur_idx != res_end_idx) {
> > > +		uint32_t offset = 0;
> > > +		uint32_t data_len, len_to_cpy;
> > > +		uint8_t hdr = 0, uncompleted_pkt = 0;
> > > +
> > >  		/* Get descriptor from available ring */
> > >  		desc = &vq->desc[head[packet_success]];
> > >
> > > @@ -125,7 +129,6 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,
> > >
> > >  		/* Copy virtio_hdr to packet and increment buffer address */
> > >  		buff_hdr_addr = buff_addr;
> > > -		packet_len = rte_pktmbuf_data_len(buff) + vq->vhost_hlen;
> > >
> > >  		/*
> > >  		 * If the descriptors are chained the header and data are
> > > @@ -136,28 +139,55 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,
> > >  			desc = &vq->desc[desc->next];
> > >  			/* Buffer address translation. */
> > >  			buff_addr = gpa_to_vva(dev, desc->addr);
> > > -			desc->len = rte_pktmbuf_data_len(buff);
> > Did we get confirmation from the virtio spec that it is OK to only update
> > used->len?
>
> The virtio spec doesn't require vhost to update desc->len.
>
>
> > >  		} else {
> > >  			buff_addr += vq->vhost_hlen;
> > > -			desc->len = packet_len;
> > > +			hdr = 1;
> > >  		}
> > >
> > > +		data_len = rte_pktmbuf_data_len(buff);
> > > +		len_to_cpy = RTE_MIN(data_len,
> > > +			hdr ? desc->len - vq->vhost_hlen : desc->len);
> > > +		while (len_to_cpy > 0) {
> > > +			/* Copy mbuf data to buffer */
> > > +			rte_memcpy((void *)(uintptr_t)buff_addr,
> > > +				(const void *)(rte_pktmbuf_mtod(buff, const char *) + offset),
> > > +				len_to_cpy);
> > > +			PRINT_PACKET(dev, (uintptr_t)buff_addr,
> > > +				len_to_cpy, 0);
> > > +
> > > +			offset += len_to_cpy;
> > > +
> > > +			if (offset == data_len)
> > > +				break;
> > OK, I see the scatter-gather case handling is in patch 5.
> > > +
> > > +			if (desc->flags & VRING_DESC_F_NEXT) {
> > > +				desc = &vq->desc[desc->next];
> > > +				buff_addr = gpa_to_vva(dev, desc->addr);
> > > +				len_to_cpy = RTE_MIN(data_len - offset, desc->len);
> > > +			} else {
> > > +				/* Room in vring buffer is not enough */
> > > +				uncompleted_pkt = 1;
> > > +				break;
> > > +			}
> > > +		};
> > > +
> > >  		/* Update used ring with desc information */
> > >  		vq->used->ring[res_cur_idx & (vq->size - 1)].id =
> > >  							head[packet_success];
> > > -		vq->used->ring[res_cur_idx & (vq->size - 1)].len = packet_len;
> > >
> > > -		/* Copy mbuf data to buffer */
> > > -		/* FIXME for sg mbuf and the case that desc couldn't hold the mbuf data */
> > > -		rte_memcpy((void *)(uintptr_t)buff_addr,
> > > -			rte_pktmbuf_mtod(buff, const void *),
> > > -			rte_pktmbuf_data_len(buff));
> > > -		PRINT_PACKET(dev, (uintptr_t)buff_addr,
> > > -			rte_pktmbuf_data_len(buff), 0);
> > > +		/* Drop the packet if it is uncompleted */
> > > +		if (unlikely(uncompleted_pkt == 1))
> > > +			vq->used->ring[res_cur_idx & (vq->size - 1)].len = 0;
> > Here things become complicated with the previous lockless reserve.
>
> Why does it become complicated? Len = 0 means it contain any meaningful
> data in the buffer.

Sorry, typo here: Len = 0 means it doesn't contain any meaningful data in
the buffer.

>
> > What is the consequence when the guest sees zero in used->len? At least,
> > do we check with the virtio-net implementation?
>
> >
> > > +		else
> > > +			vq->used->ring[res_cur_idx & (vq->size - 1)].len =
> > > +							offset + vq->vhost_hlen;
> > Two questions here:
> > 1. add virtio header len?
> > 2. Why not use packet_len rather than offset?
> >
> > >
> > >  		res_cur_idx++;
> > >  		packet_success++;
> > >
> > > +		if (unlikely(uncompleted_pkt == 1))
> > > +			continue;
> > > +
> > >  		rte_memcpy((void *)(uintptr_t)buff_hdr_addr,
> > >  			(const void *)&virtio_hdr, vq->vhost_hlen);
> > >
> > > @@ -589,7 +619,14 @@ rte_vhost_dequeue_burst(struct virtio_net *dev, uint16_t queue_id,
> > >  		desc = &vq->desc[head[entry_success]];
> > >
> > >  		/* Discard first buffer as it is the virtio header */
> > > -		desc = &vq->desc[desc->next];
> > > +		if (desc->flags & VRING_DESC_F_NEXT) {
> > > +			desc = &vq->desc[desc->next];
> > > +			vb_offset = 0;
> > > +			vb_avail = desc->len;
> > > +		} else {
> > > +			vb_offset = vq->vhost_hlen;
> > > +			vb_avail = desc->len - vb_offset;
> > > +		}
> > >
> > >  		/* Buffer address translation. */
> > >  		vb_addr = gpa_to_vva(dev, desc->addr);
> > > @@ -608,8 +645,6 @@ rte_vhost_dequeue_burst(struct virtio_net *dev, uint16_t queue_id,
> > >  		vq->used->ring[used_idx].id = head[entry_success];
> > >  		vq->used->ring[used_idx].len = 0;
> > >
> > > -		vb_offset = 0;
> > > -		vb_avail = desc->len;
> > >  		/* Allocate an mbuf and populate the structure. */
> > >  		m = rte_pktmbuf_alloc(mbuf_pool);
> > >  		if (unlikely(m == NULL)) {
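
For reference, the sketch below illustrates the two buffer layouts described
in the commit message and the copy/drop logic the review comments discuss.
It is not the librte_vhost code: the types, constants and helper name
(struct vring_desc_s, VRING_F_NEXT, VHOST_HLEN, copy_pkt_to_desc) are
simplified stand-ins invented for this example, and real descriptors carry
guest-physical addresses that must be translated before copying.

#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define VRING_F_NEXT  1u   /* descriptor chains into desc[next] */
#define VHOST_HLEN    12u  /* assumed virtio-net header size for this sketch */

struct vring_desc_s {
	uint8_t  *addr;    /* host-visible buffer (already translated) */
	uint32_t  len;     /* bytes available in this buffer */
	uint16_t  flags;
	uint16_t  next;
};

/* Copy data_len bytes of packet data into the guest buffers starting at
 * desc[idx]; return the number of bytes copied, or 0 if the chain is too
 * small (the "uncompleted packet" case that the patch drops).
 * Assumes a single shared descriptor is at least VHOST_HLEN bytes long. */
static uint32_t
copy_pkt_to_desc(struct vring_desc_s *desc, uint16_t idx,
		 const uint8_t *data, uint32_t data_len)
{
	struct vring_desc_s *d = &desc[idx];
	uint8_t *dst;
	uint32_t room, offset = 0, len_to_cpy;

	if (d->flags & VRING_F_NEXT) {
		/* Case 1: the first descriptor holds only the virtio header;
		 * packet data goes into the chained descriptors. */
		d = &desc[d->next];
		dst = d->addr;
		room = d->len;
	} else {
		/* Case 2: header and data share one single descriptor. */
		dst = d->addr + VHOST_HLEN;
		room = d->len - VHOST_HLEN;
	}

	len_to_cpy = data_len < room ? data_len : room;
	while (len_to_cpy > 0) {
		memcpy(dst, data + offset, len_to_cpy);
		offset += len_to_cpy;
		if (offset == data_len)
			return offset;          /* whole packet copied */
		if (!(d->flags & VRING_F_NEXT))
			return 0;               /* out of room: drop the packet */
		d = &desc[d->next];
		dst = d->addr;
		room = d->len;
		len_to_cpy = (data_len - offset) < room
			   ? (data_len - offset) : room;
	}
	return 0;
}

int main(void)
{
	uint8_t hdr_buf[VHOST_HLEN], data_buf[64], single_buf[VHOST_HLEN + 64];
	uint8_t pkt[100];

	/* Case 1: two-descriptor chain (header desc -> data desc). */
	struct vring_desc_s chain[2] = {
		{ hdr_buf,  sizeof(hdr_buf),  VRING_F_NEXT, 1 },
		{ data_buf, sizeof(data_buf), 0,            0 },
	};
	/* Case 2: one descriptor holding header + data. */
	struct vring_desc_s single[1] = {
		{ single_buf, sizeof(single_buf), 0, 0 },
	};

	memset(pkt, 0xab, sizeof(pkt));
	printf("chain:  copied %u of %u bytes\n",
	       copy_pkt_to_desc(chain, 0, pkt, 64), 64u);
	printf("single: copied %u of %u bytes\n",
	       copy_pkt_to_desc(single, 0, pkt, 64), 64u);
	return 0;
}

In the patch itself, the analogous byte count (plus vq->vhost_hlen) is what
gets written into vq->used->ring[].len, and a length of zero marks a packet
that was dropped because the guest buffers could not hold it.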