From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: 
Received: from mga01.intel.com (mga01.intel.com [192.55.52.88])
 by dpdk.org (Postfix) with ESMTP id B52E147CE
 for ; Wed,  1 Jun 2016 08:53:07 +0200 (CEST)
Received: from orsmga002.jf.intel.com ([10.7.209.21])
 by fmsmga101.fm.intel.com with ESMTP; 31 May 2016 23:53:07 -0700
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.26,400,1459839600"; d="scan'208";a="988643297"
Received: from yliu-dev.sh.intel.com (HELO yliu-dev) ([10.239.67.162])
 by orsmga002.jf.intel.com with ESMTP; 31 May 2016 23:53:05 -0700
Date: Wed, 1 Jun 2016 14:55:57 +0800
From: Yuanhan Liu 
To: "Xie, Huawei" 
Cc: "dev@dpdk.org" , "Michael S. Tsirkin" 
Message-ID: <20160601065557.GB10038@yliu-dev.sh.intel.com>
References: <1462236378-7604-1-git-send-email-yuanhan.liu@linux.intel.com>
 <1462236378-7604-2-git-send-email-yuanhan.liu@linux.intel.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: 
User-Agent: Mutt/1.5.23 (2014-03-12)
Subject: Re: [dpdk-dev] [PATCH 1/3] vhost: pre update used ring for Tx and Rx
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: patches and discussions about DPDK
List-Unsubscribe: ,
List-Archive: 
List-Post: 
List-Help: 
List-Subscribe: ,
X-List-Received-Date: Wed, 01 Jun 2016 06:53:08 -0000

On Wed, Jun 01, 2016 at 06:40:41AM +0000, Xie, Huawei wrote:
> > 	/* Retrieve all of the head indexes first to avoid caching issues. */
> > 	for (i = 0; i < count; i++) {
> > -		desc_indexes[i] = vq->avail->ring[(vq->last_used_idx + i) &
> > -					(vq->size - 1)];
> > +		used_idx = (vq->last_used_idx + i) & (vq->size - 1);
> > +		desc_indexes[i] = vq->avail->ring[used_idx];
> > +
> > +		vq->used->ring[used_idx].id = desc_indexes[i];
> > +		vq->used->ring[used_idx].len = 0;
> > +		vhost_log_used_vring(dev, vq,
> > +				offsetof(struct vring_used, ring[used_idx]),
> > +				sizeof(vq->used->ring[used_idx]));
> > 	}
> >
> > 	/* Prefetch descriptor index. */
> > 	rte_prefetch0(&vq->desc[desc_indexes[0]]);
> > -	rte_prefetch0(&vq->used->ring[vq->last_used_idx & (vq->size - 1)]);
> > -
> > 	for (i = 0; i < count; i++) {
> > 		int err;
> >
> > -		if (likely(i + 1 < count)) {
> > +		if (likely(i + 1 < count))
> > 			rte_prefetch0(&vq->desc[desc_indexes[i + 1]]);
> > -			rte_prefetch0(&vq->used->ring[(used_idx + 1) &
> > -						(vq->size - 1)]);
> > -		}
> >
> > 		pkts[i] = rte_pktmbuf_alloc(mbuf_pool);
> > 		if (unlikely(pkts[i] == NULL)) {
> > @@ -916,18 +920,12 @@ rte_vhost_dequeue_burst(int vid, uint16_t queue_id,
> > 			rte_pktmbuf_free(pkts[i]);
> > 			break;
> > 		}
> > -
> > -		used_idx = vq->last_used_idx++ & (vq->size - 1);
> > -		vq->used->ring[used_idx].id = desc_indexes[i];
> > -		vq->used->ring[used_idx].len = 0;
> > -		vhost_log_used_vring(dev, vq,
> > -				offsetof(struct vring_used, ring[used_idx]),
> > -				sizeof(vq->used->ring[used_idx]));
> > 	}
>
> Had tried post-updating the used ring in batch, but I forget the perf
> change.

I would assume pre-updating gives a better performance gain, as we are
fiddling with the avail and used rings together, which is more cache
friendly.

> One optimization would be on vhost_log_used_vring.
> I have two ideas:
>
> a) On the QEMU side, we always assume the used ring will be changed,
>    so that we don't need to log the used ring in vhost.
>
>    Michael: feasible in QEMU? Comments on this?
>
> b) We could always mark the whole used ring as modified, rather than
>    entry by entry.

I doubt it's worthwhile. One fact is that vhost_log_used_vring is a
no-op most of the time: it takes action only during the short window of
live migration. And FYI, I even tried with all the vhost_log_xxx calls
removed, and it showed no performance boost at all. Therefore, it's not
a factor that impacts performance.

	--yliu