From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mga03.intel.com (mga03.intel.com [134.134.136.65]) by dpdk.org (Postfix) with ESMTP id 1906B5A9B for ; Wed, 20 May 2015 07:28:25 +0200 (CEST) Received: from orsmga001.jf.intel.com ([10.7.209.18]) by orsmga103.jf.intel.com with ESMTP; 19 May 2015 22:28:06 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.13,463,1427785200"; d="scan'208";a="697576995" Received: from pgsmsx101.gar.corp.intel.com ([10.221.44.78]) by orsmga001.jf.intel.com with ESMTP; 19 May 2015 22:28:05 -0700 Received: from shsmsx151.ccr.corp.intel.com (10.239.6.50) by PGSMSX101.gar.corp.intel.com (10.221.44.78) with Microsoft SMTP Server (TLS) id 14.3.224.2; Wed, 20 May 2015 13:26:22 +0800 Received: from shsmsx101.ccr.corp.intel.com ([169.254.1.120]) by SHSMSX151.ccr.corp.intel.com ([169.254.3.180]) with mapi id 14.03.0224.002; Wed, 20 May 2015 13:26:21 +0800 From: "Xie, Huawei" To: "Ouyang, Changchun" Thread-Topic: [dpdk-dev] [PATCH] virtio: Fix enqueue/dequeue can't handle chained vring descriptors. Thread-Index: AdCRToQkB1WqIIEaR8mGcLUDx/H8gA== Date: Wed, 20 May 2015 05:26:21 +0000 Message-ID: References: <1430720780-27525-1-git-send-email-changchun.ouyang@intel.com> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-originating-ip: [10.239.127.40] Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Cc: "dev@dpdk.org" Subject: Re: [dpdk-dev] [PATCH] virtio: Fix enqueue/dequeue can't handle chained vring descriptors. X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: patches and discussions about DPDK List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 20 May 2015 05:28:26 -0000 On 5/18/2015 9:23 PM, Ouyang, Changchun wrote:=0A= > Hi Huawei,=0A= >=0A= >> -----Original Message-----=0A= >> From: Xie, Huawei=0A= >> Sent: Monday, May 18, 2015 5:39 PM=0A= >> To: Ouyang, Changchun; dev@dpdk.org=0A= >> Subject: Re: [dpdk-dev] [PATCH] virtio: Fix enqueue/dequeue can't handle= =0A= >> chained vring descriptors.=0A= >>=0A= >> On 5/4/2015 2:27 PM, Ouyang Changchun wrote:=0A= >>> Vring enqueue need consider the 2 cases:=0A= >>> 1. Vring descriptors chained together, the first one is for virtio=0A= >>> header, the rest are for real data; 2. Only one descriptor, virtio=0A= >>> header and real data share one single descriptor;=0A= >>>=0A= >>> So does vring dequeue.=0A= >>>=0A= >>> Signed-off-by: Changchun Ouyang =0A= >>> ---=0A= >>> lib/librte_vhost/vhost_rxtx.c | 60=0A= >>> +++++++++++++++++++++++++++++++------------=0A= >>> 1 file changed, 44 insertions(+), 16 deletions(-)=0A= >>>=0A= >>> diff --git a/lib/librte_vhost/vhost_rxtx.c=0A= >>> b/lib/librte_vhost/vhost_rxtx.c index 510ffe8..3135883 100644=0A= >>> --- a/lib/librte_vhost/vhost_rxtx.c=0A= >>> +++ b/lib/librte_vhost/vhost_rxtx.c=0A= >>> @@ -59,7 +59,7 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t=0A= >> queue_id,=0A= >>> struct virtio_net_hdr_mrg_rxbuf virtio_hdr =3D {{0, 0, 0, 0, 0, 0}, 0= };=0A= >>> uint64_t buff_addr =3D 0;=0A= >>> uint64_t buff_hdr_addr =3D 0;=0A= >>> - uint32_t head[MAX_PKT_BURST], packet_len =3D 0;=0A= >>> + uint32_t head[MAX_PKT_BURST];=0A= >>> uint32_t head_idx, packet_success =3D 0;=0A= >>> uint16_t avail_idx, res_cur_idx;=0A= >>> uint16_t res_base_idx, res_end_idx;=0A= >>> @@ -113,6 +113,10 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t=0A= >> queue_id,=0A= >>> rte_prefetch0(&vq->desc[head[packet_success]]);=0A= >>>=0A= >>> while (res_cur_idx !=3D res_end_idx) {=0A= >>> + uint32_t offset =3D 0;=0A= >>> + uint32_t data_len, len_to_cpy;=0A= >>> + uint8_t plus_hdr =3D 0;=0A= >>> +=0A= >>> /* Get descriptor from available ring */=0A= >>> desc =3D &vq->desc[head[packet_success]];=0A= >>>=0A= >>> @@ -125,7 +129,6 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t=0A= >>> queue_id,=0A= >>>=0A= >>> /* Copy virtio_hdr to packet and increment buffer address */=0A= >>> buff_hdr_addr =3D buff_addr;=0A= >>> - packet_len =3D rte_pktmbuf_data_len(buff) + vq->vhost_hlen;=0A= >>>=0A= >>> /*=0A= >>> * If the descriptors are chained the header and data are @@=0A= >>> -136,24 +139,44 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t=0A= >> queue_id,=0A= >>> desc =3D &vq->desc[desc->next];=0A= >>> /* Buffer address translation. */=0A= >>> buff_addr =3D gpa_to_vva(dev, desc->addr);=0A= >>> - desc->len =3D rte_pktmbuf_data_len(buff);=0A= >>> } else {=0A= >>> buff_addr +=3D vq->vhost_hlen;=0A= >>> - desc->len =3D packet_len;=0A= >>> + plus_hdr =3D 1;=0A= >>> }=0A= >>>=0A= >>> + data_len =3D rte_pktmbuf_data_len(buff);=0A= >>> + len_to_cpy =3D RTE_MIN(data_len, desc->len);=0A= >>> + do {=0A= >>> + if (len_to_cpy > 0) {=0A= >>> + /* Copy mbuf data to buffer */=0A= >>> + rte_memcpy((void *)(uintptr_t)buff_addr,=0A= >>> + (const void=0A= >> *)(rte_pktmbuf_mtod(buff, const char *) + offset),=0A= >>> + len_to_cpy);=0A= >>> + PRINT_PACKET(dev, (uintptr_t)buff_addr,=0A= >>> + len_to_cpy, 0);=0A= >>> +=0A= >>> + desc->len =3D len_to_cpy + (plus_hdr ? vq-=0A= >>> vhost_hlen : 0);=0A= >> Do we really need to rewrite the desc->len again and again? At least we= only=0A= >> have the possibility to change the value of desc->len of the last descri= ptor.=0A= > Well, I think we need change each descriptor's len in the chain here,=0A= > If aggregate all len to the last descriptor's len, it is possibly the len= gth will exceed its original len,=0A= > e.g. use 8 descriptor(each has len of 1024) chained to recv a 8K packet, = then last descriptor's len=0A= > will be 8K, and all other descriptor is 0, I don't think this situation m= ake sense. =0A= Let me explain this way.=0A= We receive a packet with 350 bytes size, and we have descriptor chain of=0A= 10 descs, each with 100 byte size.=0A= At least we don't need to change the len field of first three descriptors.= =0A= Whether we need to change the 4th len field to 50, and the rest of them=0A= to zero is still a question(used->len is updated to 350).=0A= We need check the VIRTIO spec.=0A= >>> + offset +=3D len_to_cpy;=0A= >>> + if (desc->flags & VRING_DESC_F_NEXT) {=0A= >>> + desc =3D &vq->desc[desc->next];=0A= >>> + buff_addr =3D gpa_to_vva(dev, desc-=0A= >>> addr);=0A= >>> + len_to_cpy =3D RTE_MIN(data_len -=0A= >> offset, desc->len);=0A= >>> + } else=0A= >>> + break;=0A= >> Still there are two issues here.=0A= >> a) If the data couldn't be fully copied to chain of guest buffers, we sh= ouldn't=0A= >> do any copy.=0A= > Why don't copy any data is better than the current implementation?=0A= We don't need to pass part of a packet to guest.=0A= >=0A= >> b) scatter mbuf isn't considered.=0A= > If we also consider scatter mbuf here, then this function will have exact= ly same logic with mergeable_rx,=0A= > Do you want to totally remove this function, just keep the mergeable rx f= unction for all cases?=0A= >=0A= > Changchun=0A= >=0A= >=0A= >=0A= =0A=