From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28])
 by dpdk.org (Postfix) with ESMTP id 537131DBD
 for ; Tue, 19 Mar 2019 09:54:08 +0100 (CET)
Received: from smtp.corp.redhat.com
 (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by mx1.redhat.com (Postfix) with ESMTPS id AA3ED3082231;
 Tue, 19 Mar 2019 08:54:07 +0000 (UTC)
Received: from localhost (dhcp-192-225.str.redhat.com [10.33.192.225])
 by smtp.corp.redhat.com (Postfix) with ESMTPS id 5278A5C25A;
 Tue, 19 Mar 2019 08:54:05 +0000 (UTC)
Date: Tue, 19 Mar 2019 09:54:03 +0100
From: Jens Freimann
To: Tiwei Bie
Cc: maxime.coquelin@redhat.com, zhihong.wang@intel.com, dev@dpdk.org
Message-ID: <20190319085403.gnamedd5pz7jzwvj@jenstp.localdomain>
References: <20190319064312.13743-1-tiwei.bie@intel.com>
 <20190319064312.13743-5-tiwei.bie@intel.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Disposition: inline
In-Reply-To: <20190319064312.13743-5-tiwei.bie@intel.com>
User-Agent: NeoMutt/20180716
X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16
 (mx1.redhat.com [10.5.110.47]); Tue, 19 Mar 2019 08:54:07 +0000 (UTC)
Subject: Re: [dpdk-dev] [PATCH 04/10] net/virtio: optimize flags update for
 packed ring
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Tue, 19 Mar 2019 08:54:08 -0000

On Tue, Mar 19, 2019 at 02:43:06PM +0800, Tiwei Bie wrote:
>Cache the AVAIL, USED and WRITE bits to avoid calculating
>them as much as possible. Note that, the WRITE bit isn't
>cached for control queue.
>
>Signed-off-by: Tiwei Bie
>---
> drivers/net/virtio/virtio_ethdev.c | 35 ++++++++++++++----------------
> drivers/net/virtio/virtio_rxtx.c   | 31 ++++++++++----------------
> drivers/net/virtio/virtqueue.h     |  8 +++----
> 3 files changed, 32 insertions(+), 42 deletions(-)
>
>diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c
>index ff16fb63e..9060b6b33 100644
>--- a/drivers/net/virtio/virtio_ethdev.c
>+++ b/drivers/net/virtio/virtio_ethdev.c
>@@ -149,7 +149,7 @@ virtio_send_command_packed(struct virtnet_ctl *cvq,
> 	int head;
> 	struct vring_packed_desc *desc = vq->ring_packed.desc_packed;
> 	struct virtio_pmd_ctrl *result;
>-	bool avail_wrap_counter;
>+	uint16_t flags;
> 	int sum = 0;
> 	int nb_descs = 0;
> 	int k;
>@@ -161,14 +161,15 @@ virtio_send_command_packed(struct virtnet_ctl *cvq,
> 	 * One RX packet for ACK.
> 	 */
> 	head = vq->vq_avail_idx;
>-	avail_wrap_counter = vq->avail_wrap_counter;
>+	flags = vq->cached_flags;
> 	desc[head].addr = cvq->virtio_net_hdr_mem;
> 	desc[head].len = sizeof(struct virtio_net_ctrl_hdr);
> 	vq->vq_free_cnt--;
> 	nb_descs++;
> 	if (++vq->vq_avail_idx >= vq->vq_nentries) {
> 		vq->vq_avail_idx -= vq->vq_nentries;
>-		vq->avail_wrap_counter ^= 1;
>+		vq->cached_flags ^=
>+			VRING_DESC_F_AVAIL(1) | VRING_DESC_F_USED(1);

Maybe name it avail_used_flags instead of cached_flags. Also we could
use a constant value.
> 	}
>
> 	for (k = 0; k < pkt_num; k++) {
>@@ -177,34 +178,31 @@ virtio_send_command_packed(struct virtnet_ctl *cvq,
> 			+ sizeof(ctrl->status) + sizeof(uint8_t) * sum;
> 		desc[vq->vq_avail_idx].len = dlen[k];
> 		desc[vq->vq_avail_idx].flags = VRING_DESC_F_NEXT |
>-			VRING_DESC_F_AVAIL(vq->avail_wrap_counter) |
>-			VRING_DESC_F_USED(!vq->avail_wrap_counter);
>+			vq->cached_flags;
> 		sum += dlen[k];
> 		vq->vq_free_cnt--;
> 		nb_descs++;
> 		if (++vq->vq_avail_idx >= vq->vq_nentries) {
> 			vq->vq_avail_idx -= vq->vq_nentries;
>-			vq->avail_wrap_counter ^= 1;
>+			vq->cached_flags ^=
>+				VRING_DESC_F_AVAIL(1) | VRING_DESC_F_USED(1);
> 		}
> 	}
>
> 	desc[vq->vq_avail_idx].addr = cvq->virtio_net_hdr_mem
> 		+ sizeof(struct virtio_net_ctrl_hdr);
> 	desc[vq->vq_avail_idx].len = sizeof(ctrl->status);
>-	desc[vq->vq_avail_idx].flags = VRING_DESC_F_WRITE |
>-		VRING_DESC_F_AVAIL(vq->avail_wrap_counter) |
>-		VRING_DESC_F_USED(!vq->avail_wrap_counter);
>+	desc[vq->vq_avail_idx].flags = VRING_DESC_F_WRITE | vq->cached_flags;
> 	vq->vq_free_cnt--;
> 	nb_descs++;
> 	if (++vq->vq_avail_idx >= vq->vq_nentries) {
> 		vq->vq_avail_idx -= vq->vq_nentries;
>-		vq->avail_wrap_counter ^= 1;
>+		vq->cached_flags ^=
>+			VRING_DESC_F_AVAIL(1) | VRING_DESC_F_USED(1);
> 	}
>
> 	virtio_wmb(vq->hw->weak_barriers);
>-	desc[head].flags = VRING_DESC_F_NEXT |
>-		VRING_DESC_F_AVAIL(avail_wrap_counter) |
>-		VRING_DESC_F_USED(!avail_wrap_counter);
>+	desc[head].flags = VRING_DESC_F_NEXT | flags;
>
> 	virtio_wmb(vq->hw->weak_barriers);
> 	virtqueue_notify(vq);
>@@ -226,12 +224,12 @@ virtio_send_command_packed(struct virtnet_ctl *cvq,
> 	PMD_INIT_LOG(DEBUG, "vq->vq_free_cnt=%d\n"
> 			"vq->vq_avail_idx=%d\n"
> 			"vq->vq_used_cons_idx=%d\n"
>-			"vq->avail_wrap_counter=%d\n"
>+			"vq->cached_flags=0x%x\n"
> 			"vq->used_wrap_counter=%d\n",
> 			vq->vq_free_cnt,
> 			vq->vq_avail_idx,
> 			vq->vq_used_cons_idx,
>-			vq->avail_wrap_counter,
>+			vq->cached_flags,
> 			vq->used_wrap_counter);
>
> 	result = cvq->virtio_net_hdr_mz->addr;
>@@ -491,11 +489,10 @@ virtio_init_queue(struct rte_eth_dev *dev, uint16_t vtpci_queue_idx)
> 	vq->vq_nentries = vq_size;
> 	vq->event_flags_shadow = 0;
> 	if (vtpci_packed_queue(hw)) {
>-		vq->avail_wrap_counter = 1;
> 		vq->used_wrap_counter = 1;
>-		vq->avail_used_flags =
>-			VRING_DESC_F_AVAIL(vq->avail_wrap_counter) |
>-			VRING_DESC_F_USED(!vq->avail_wrap_counter);
>+		vq->cached_flags = VRING_DESC_F_AVAIL(1);
>+		if (queue_type == VTNET_RQ)
>+			vq->cached_flags |= VRING_DESC_F_WRITE;
> 	}
>
> 	/*
>diff --git a/drivers/net/virtio/virtio_rxtx.c b/drivers/net/virtio/virtio_rxtx.c
>index 771d3c3f6..3c354baef 100644
>--- a/drivers/net/virtio/virtio_rxtx.c
>+++ b/drivers/net/virtio/virtio_rxtx.c
>@@ -431,7 +431,7 @@ virtqueue_enqueue_recv_refill_packed(struct virtqueue *vq,
> 				struct rte_mbuf **cookie, uint16_t num)
> {
> 	struct vring_packed_desc *start_dp = vq->ring_packed.desc_packed;
>-	uint16_t flags = VRING_DESC_F_WRITE | vq->avail_used_flags;
>+	uint16_t flags = vq->cached_flags;
> 	struct virtio_hw *hw = vq->hw;
> 	struct vq_desc_extra *dxp;
> 	uint16_t idx;
>@@ -460,11 +460,9 @@ virtqueue_enqueue_recv_refill_packed(struct virtqueue *vq,
> 		start_dp[idx].flags = flags;
> 		if (++vq->vq_avail_idx >= vq->vq_nentries) {
> 			vq->vq_avail_idx -= vq->vq_nentries;
>-			vq->avail_wrap_counter ^= 1;
>-			vq->avail_used_flags =
>-				VRING_DESC_F_AVAIL(vq->avail_wrap_counter) |
>-				VRING_DESC_F_USED(!vq->avail_wrap_counter);
>-			flags = VRING_DESC_F_WRITE | vq->avail_used_flags;
>+			vq->cached_flags ^=
>+				VRING_DESC_F_AVAIL(1) | VRING_DESC_F_USED(1);
>+			flags = vq->cached_flags;

Same here. It's not really cached, it's pre-calculated. And here we
could also use a pre-calculated constant/define.

Otherwise looks good! Did you see any performance improvements?
regards,
Jens