From: "Hu, Jiayu"
To: "Jiang, Cheng1", "maxime.coquelin@redhat.com", "Xia, Chenbo"
CC: "dev@dpdk.org", "Yang, YvonneX", "Wang, Yinan", "Liu, Yong"
Date: Tue, 13 Apr 2021 02:44:07 +0000
References: <20210317085426.10119-1-Cheng1.jiang@intel.com>
 <20210412113430.17587-1-Cheng1.jiang@intel.com>
 <20210412113430.17587-2-Cheng1.jiang@intel.com>
In-Reply-To: <20210412113430.17587-2-Cheng1.jiang@intel.com>
Subject: Re: [dpdk-dev] [PATCH v5 1/4] vhost: abstract and reorganize async split ring code

Hi Cheng,

Some comments inline.

> -----Original Message-----
> From: Jiang, Cheng1
> Sent: Monday, April 12, 2021 7:34 PM
> To: maxime.coquelin@redhat.com; Xia, Chenbo
> Cc: dev@dpdk.org; Hu, Jiayu; Yang, YvonneX; Wang, Yinan; Liu, Yong;
> Jiang, Cheng1
> Subject: [PATCH v5 1/4] vhost: abstract and reorganize async split ring code
>
> In order to improve code efficiency and readability when async packed
> ring support is enabled.
> This patch abstract some functions like
> shadow_ring_store and write_back_completed_descs_split. And improve
> the efficiency of some pointer offset calculation.

The grammar of the commit log needs improvement; there is a typo and an
incomplete sentence.

>
> Signed-off-by: Cheng Jiang
> ---
>  lib/librte_vhost/virtio_net.c | 146 +++++++++++++++++++---------------
>  1 file changed, 84 insertions(+), 62 deletions(-)
>
> diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
> index ff3987860..c43ab0093 100644
> --- a/lib/librte_vhost/virtio_net.c
> +++ b/lib/librte_vhost/virtio_net.c
> @@ -1458,6 +1458,29 @@ virtio_dev_rx_async_get_info_idx(uint16_t pkts_idx,
>  	(vq_size - n_inflight + pkts_idx) & (vq_size - 1);
>  }
>
> +static __rte_always_inline void
> +shadow_ring_store(struct vhost_virtqueue *vq, void *shadow_ring, void *d_ring,
> +		uint16_t s_idx, uint16_t d_idx,
> +		uint16_t count, uint16_t elem_size)
> +{
> +	if (d_idx + count <= vq->size) {
> +		rte_memcpy((void *)((uintptr_t)d_ring + d_idx * elem_size),
> +			(void *)((uintptr_t)shadow_ring + s_idx * elem_size),
> +			count * elem_size);
> +	} else {
> +		uint16_t size = vq->size - d_idx;
> +
> +		rte_memcpy((void *)((uintptr_t)d_ring + d_idx * elem_size),
> +			(void *)((uintptr_t)shadow_ring + s_idx * elem_size),
> +			size * elem_size);
> +
> +		rte_memcpy((void *)((uintptr_t)d_ring),
> +			(void *)((uintptr_t)shadow_ring +
> +			(s_idx + size) * elem_size),
> +			(count - size) * elem_size);
> +	}
> +}
> +
>  static __rte_noinline uint32_t
>  virtio_dev_rx_async_submit_split(struct virtio_net *dev,
>  	struct vhost_virtqueue *vq, uint16_t queue_id,
> @@ -1478,6 +1501,7 @@ virtio_dev_rx_async_submit_split(struct virtio_net *dev,
>  	struct rte_vhost_iov_iter *dst_it = it_pool + 1;
>  	uint16_t slot_idx = 0;
>  	uint16_t segs_await = 0;
> +	uint16_t iovec_idx = 0, it_idx = 0;
>  	struct async_inflight_info *pkts_info = vq->async_pkts_info;
>  	uint32_t n_pkts = 0, pkt_err = 0;
>  	uint32_t num_async_pkts = 0, num_done_pkts = 0;
> @@ -1513,27 +1537,32 @@ virtio_dev_rx_async_submit_split(struct virtio_net *dev,
>
>  		if (async_mbuf_to_desc(dev, vq, pkts[pkt_idx],
>  				buf_vec, nr_vec, num_buffers,
> -				src_iovec, dst_iovec, src_it, dst_it) < 0) {
> +				&src_iovec[iovec_idx],
> +				&dst_iovec[iovec_idx],
> +				&src_it[it_idx],
> +				&dst_it[it_idx]) < 0) {

When using an index, it is strange to get the src and dst iov_iter from
dst_it and src_it respectively, as they are not the start addresses of two
separate iov_iter arrays but have overlapping elements. IMO, there is no
need for src_it/dst_it at all; the iterators can simply be indexed as
it_pool[it_idx] and it_pool[it_idx + 1].
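
Roughly what I have in mind is the untested sketch below, based only on the
code already in this patch, just to illustrate the indexing (src_it/dst_it
are dropped, only it_pool, it_idx and iovec_idx remain):

		if (async_mbuf_to_desc(dev, vq, pkts[pkt_idx],
				buf_vec, nr_vec, num_buffers,
				&src_iovec[iovec_idx],
				&dst_iovec[iovec_idx],
				&it_pool[it_idx],
				&it_pool[it_idx + 1]) < 0) {
			vq->shadow_used_idx -= num_buffers;
			break;
		}
		...
		if (it_pool[it_idx].count) {
			...
			async_fill_desc(&tdes[pkt_burst_idx++],
					&it_pool[it_idx],
					&it_pool[it_idx + 1]);
			...
			/* consume one src/dst iov_iter pair per packet */
			iovec_idx += it_pool[it_idx].nr_segs;
			it_idx += 2;
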
>  			vq->shadow_used_idx -= num_buffers;
>  			break;
>  		}
>
>  		slot_idx = (vq->async_pkts_idx + num_async_pkts) &
>  			(vq->size - 1);
> -		if (src_it->count) {
> +		if (src_it[it_idx].count) {
>  			uint16_t from, to;
>
> -			async_fill_desc(&tdes[pkt_burst_idx++], src_it, dst_it);
> +			async_fill_desc(&tdes[pkt_burst_idx++],
> +				&src_it[it_idx],
> +				&dst_it[it_idx]);
>  			pkts_info[slot_idx].descs = num_buffers;
>  			pkts_info[slot_idx].mbuf = pkts[pkt_idx];
>  			async_pkts_log[num_async_pkts].pkt_idx = pkt_idx;
>  			async_pkts_log[num_async_pkts++].last_avail_idx =
>  				vq->last_avail_idx;
> -			src_iovec += src_it->nr_segs;
> -			dst_iovec += dst_it->nr_segs;
> -			src_it += 2;
> -			dst_it += 2;
> -			segs_await += src_it->nr_segs;
> +
> +			iovec_idx += src_it[it_idx].nr_segs;
> +			it_idx += 2;
> +
> +			segs_await += src_it[it_idx].nr_segs;
>
>  			/**
>  			 * recover shadow used ring and keep DMA-occupied
> @@ -1541,23 +1570,12 @@ virtio_dev_rx_async_submit_split(struct virtio_net *dev,
>  			 */
>  			from = vq->shadow_used_idx - num_buffers;
>  			to = vq->async_desc_idx & (vq->size - 1);
> -			if (num_buffers + to <= vq->size) {
> -				rte_memcpy(&vq->async_descs_split[to],
> -						&vq->shadow_used_split[from],
> -						num_buffers *
> -						sizeof(struct vring_used_elem));
> -			} else {
> -				int size = vq->size - to;
> -
> -				rte_memcpy(&vq->async_descs_split[to],
> -						&vq->shadow_used_split[from],
> -						size *
> -						sizeof(struct vring_used_elem));
> -				rte_memcpy(vq->async_descs_split,
> -						&vq->shadow_used_split[from +
> -						size], (num_buffers - size) *
> -						sizeof(struct vring_used_elem));
> -			}
> +
> +			shadow_ring_store(vq, vq->shadow_used_split,
> +					vq->async_descs_split,
> +					from, to, num_buffers,
> +					sizeof(struct vring_used_elem));

This function stores DMA-occupied descriptors, so "shadow_ring_store" is
not a good name for it. In addition, I think there is no need to pass vq as
a parameter; all the function needs is the size of the shadow ring and the
async desc ring. Please see the sketch at the end of this mail.

Thanks,
Jiayu

> +
>  			vq->async_desc_idx += num_buffers;
>  			vq->shadow_used_idx -= num_buffers;
>  		} else
> @@ -1575,10 +1593,9 @@ virtio_dev_rx_async_submit_split(struct virtio_net *dev,
>  				BUF_VECTOR_MAX))) {
>  			n_pkts = vq->async_ops.transfer_data(dev->vid,
>  					queue_id, tdes, 0, pkt_burst_idx);
> -			src_iovec = vec_pool;
> -			dst_iovec = vec_pool + (VHOST_MAX_ASYNC_VEC >> 1);
> -			src_it = it_pool;
> -			dst_it = it_pool + 1;
> +			iovec_idx = 0;
> +			it_idx = 0;
> +
>  			segs_await = 0;
>  			vq->async_pkts_inflight_n += n_pkts;
>
> @@ -1639,6 +1656,43 @@ virtio_dev_rx_async_submit_split(struct virtio_net *dev,
>  	return pkt_idx;
>  }
>
> +static __rte_always_inline void
> +write_back_completed_descs_split(struct vhost_virtqueue *vq, uint16_t n_descs)
> +{
> +	uint16_t nr_left = n_descs;
> +	uint16_t nr_copy;
> +	uint16_t to, from;
> +
> +	do {
> +		from = vq->last_async_desc_idx & (vq->size - 1);
> +		nr_copy = nr_left + from <= vq->size ? nr_left :
> +			vq->size - from;
> +		to = vq->last_used_idx & (vq->size - 1);
> +
> +		if (to + nr_copy <= vq->size) {
> +			rte_memcpy(&vq->used->ring[to],
> +					&vq->async_descs_split[from],
> +					nr_copy *
> +					sizeof(struct vring_used_elem));
> +		} else {
> +			uint16_t size = vq->size - to;
> +
> +			rte_memcpy(&vq->used->ring[to],
> +					&vq->async_descs_split[from],
> +					size *
> +					sizeof(struct vring_used_elem));
> +			rte_memcpy(vq->used->ring,
> +					&vq->async_descs_split[from +
> +					size], (nr_copy - size) *
> +					sizeof(struct vring_used_elem));
> +		}
> +
> +		vq->last_async_desc_idx += nr_copy;
> +		vq->last_used_idx += nr_copy;
> +		nr_left -= nr_copy;
> +	} while (nr_left > 0);
> +}
> +
>  uint16_t rte_vhost_poll_enqueue_completed(int vid, uint16_t queue_id,
>  		struct rte_mbuf **pkts, uint16_t count)
>  {
> @@ -1695,39 +1749,7 @@ uint16_t rte_vhost_poll_enqueue_completed(int vid, uint16_t queue_id,
>  	vq->async_pkts_inflight_n -= n_pkts_put;
>
>  	if (likely(vq->enabled && vq->access_ok)) {
> -		uint16_t nr_left = n_descs;
> -		uint16_t nr_copy;
> -		uint16_t to;
> -
> -		/* write back completed descriptors to used ring */
> -		do {
> -			from = vq->last_async_desc_idx & (vq->size - 1);
> -			nr_copy = nr_left + from <= vq->size ? nr_left :
> -				vq->size - from;
> -			to = vq->last_used_idx & (vq->size - 1);
> -
> -			if (to + nr_copy <= vq->size) {
> -				rte_memcpy(&vq->used->ring[to],
> -					&vq->async_descs_split[from],
> -					nr_copy *
> -					sizeof(struct vring_used_elem));
> -			} else {
> -				uint16_t size = vq->size - to;
> -
> -				rte_memcpy(&vq->used->ring[to],
> -					&vq->async_descs_split[from],
> -					size *
> -					sizeof(struct vring_used_elem));
> -				rte_memcpy(vq->used->ring,
> -					&vq->async_descs_split[from +
> -					size], (nr_copy - size) *
> -					sizeof(struct vring_used_elem));
> -			}
> -
> -			vq->last_async_desc_idx += nr_copy;
> -			vq->last_used_idx += nr_copy;
> -			nr_left -= nr_copy;
> -		} while (nr_left > 0);
> +		write_back_completed_descs_split(vq, n_descs);
>
>  		__atomic_add_fetch(&vq->used->idx, n_descs,
>  				__ATOMIC_RELEASE);
>  		vhost_vring_call_split(dev, vq);
> --
> 2.29.2
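
For the helper I commented on above, something along the lines of the
untested sketch below is what I mean. The name is only a placeholder, and
it assumes the shadow ring and the async desc ring have the same number of
entries (vq->size here), so a single ring_size parameter is enough:

static __rte_always_inline void
store_dma_desc_info(void *s_ring, void *d_ring, uint16_t ring_size,
		uint16_t s_idx, uint16_t d_idx, uint16_t count,
		uint16_t elem_size)
{
	/* Store DMA-occupied desc info from the shadow ring into the
	 * async desc ring, wrapping the destination index if needed.
	 */
	if (d_idx + count <= ring_size) {
		rte_memcpy((void *)((uintptr_t)d_ring + d_idx * elem_size),
			(void *)((uintptr_t)s_ring + s_idx * elem_size),
			count * elem_size);
	} else {
		uint16_t size = ring_size - d_idx;

		rte_memcpy((void *)((uintptr_t)d_ring + d_idx * elem_size),
			(void *)((uintptr_t)s_ring + s_idx * elem_size),
			size * elem_size);
		rte_memcpy(d_ring,
			(void *)((uintptr_t)s_ring + (s_idx + size) * elem_size),
			(count - size) * elem_size);
	}
}

The caller would then pass the ring size explicitly, e.g.:

	store_dma_desc_info(vq->shadow_used_split, vq->async_descs_split,
			vq->size, from, to, num_buffers,
			sizeof(struct vring_used_elem));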