From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28]) by dpdk.org (Postfix) with ESMTP id 4B67E2956 for ; Tue, 19 Mar 2019 11:11:59 +0100 (CET) Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.phx2.redhat.com [10.5.11.22]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id A015230001E2; Tue, 19 Mar 2019 10:11:58 +0000 (UTC) Received: from localhost (dhcp-192-225.str.redhat.com [10.33.192.225]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 051DD1001E61; Tue, 19 Mar 2019 10:11:55 +0000 (UTC) Date: Tue, 19 Mar 2019 11:11:54 +0100 From: Jens Freimann To: Tiwei Bie Cc: maxime.coquelin@redhat.com, zhihong.wang@intel.com, dev@dpdk.org Message-ID: <20190319101154.loy5cmwbgnbnkxyg@jenstp.localdomain> References: <20190319064312.13743-1-tiwei.bie@intel.com> <20190319064312.13743-5-tiwei.bie@intel.com> <20190319085403.gnamedd5pz7jzwvj@jenstp.localdomain> <20190319093734.GA24818@dpdk-tbie.sh.intel.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Disposition: inline In-Reply-To: <20190319093734.GA24818@dpdk-tbie.sh.intel.com> User-Agent: NeoMutt/20180716 X-Scanned-By: MIMEDefang 2.84 on 10.5.11.22 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.49]); Tue, 19 Mar 2019 10:11:58 +0000 (UTC) Subject: Re: [dpdk-dev] [PATCH 04/10] net/virtio: optimize flags update for packed ring X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 19 Mar 2019 10:11:59 -0000 On Tue, Mar 19, 2019 at 05:37:34PM +0800, Tiwei Bie wrote: >On Tue, Mar 19, 2019 at 09:54:03AM +0100, Jens Freimann wrote: >> On Tue, Mar 19, 2019 at 02:43:06PM +0800, Tiwei Bie wrote: >> > Cache the AVAIL, USED and WRITE bits to avoid calculating >> > them as much as possible. Note that, the WRITE bit isn't >> > cached for control queue. >> > >> > Signed-off-by: Tiwei Bie >> > --- >> > drivers/net/virtio/virtio_ethdev.c | 35 ++++++++++++++---------------- >> > drivers/net/virtio/virtio_rxtx.c | 31 ++++++++++---------------- >> > drivers/net/virtio/virtqueue.h | 8 +++---- >> > 3 files changed, 32 insertions(+), 42 deletions(-) >> > >> > diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c >> > index ff16fb63e..9060b6b33 100644 >> > --- a/drivers/net/virtio/virtio_ethdev.c >> > +++ b/drivers/net/virtio/virtio_ethdev.c >> > @@ -149,7 +149,7 @@ virtio_send_command_packed(struct virtnet_ctl *cvq, >> > int head; >> > struct vring_packed_desc *desc = vq->ring_packed.desc_packed; >> > struct virtio_pmd_ctrl *result; >> > - bool avail_wrap_counter; >> > + uint16_t flags; >> > int sum = 0; >> > int nb_descs = 0; >> > int k; >> > @@ -161,14 +161,15 @@ virtio_send_command_packed(struct virtnet_ctl *cvq, >> > * One RX packet for ACK. >> > */ >> > head = vq->vq_avail_idx; >> > - avail_wrap_counter = vq->avail_wrap_counter; >> > + flags = vq->cached_flags; >> > desc[head].addr = cvq->virtio_net_hdr_mem; >> > desc[head].len = sizeof(struct virtio_net_ctrl_hdr); >> > vq->vq_free_cnt--; >> > nb_descs++; >> > if (++vq->vq_avail_idx >= vq->vq_nentries) { >> > vq->vq_avail_idx -= vq->vq_nentries; >> > - vq->avail_wrap_counter ^= 1; >> > + vq->cached_flags ^= >> > + VRING_DESC_F_AVAIL(1) | VRING_DESC_F_USED(1); >> >> Maybe name it avail_used_flags instead of cached flags. Also we could >> use a constant value. > >It also contains the WRITE bit (not just AVAIL and USED bits) >for Rx path. That's why I didn't name it as avail_used_flags. ok >> >> > } >> > >> > for (k = 0; k < pkt_num; k++) { >> > @@ -177,34 +178,31 @@ virtio_send_command_packed(struct virtnet_ctl *cvq, >> > + sizeof(ctrl->status) + sizeof(uint8_t) * sum; >> > desc[vq->vq_avail_idx].len = dlen[k]; >> > desc[vq->vq_avail_idx].flags = VRING_DESC_F_NEXT | >> > - VRING_DESC_F_AVAIL(vq->avail_wrap_counter) | >> > - VRING_DESC_F_USED(!vq->avail_wrap_counter); >> > + vq->cached_flags; >> > sum += dlen[k]; >> > vq->vq_free_cnt--; >> > nb_descs++; >> > if (++vq->vq_avail_idx >= vq->vq_nentries) { >> > vq->vq_avail_idx -= vq->vq_nentries; >> > - vq->avail_wrap_counter ^= 1; >> > + vq->cached_flags ^= >> > + VRING_DESC_F_AVAIL(1) | VRING_DESC_F_USED(1); >> > } >> > } >> > >> > desc[vq->vq_avail_idx].addr = cvq->virtio_net_hdr_mem >> > + sizeof(struct virtio_net_ctrl_hdr); >> > desc[vq->vq_avail_idx].len = sizeof(ctrl->status); >> > - desc[vq->vq_avail_idx].flags = VRING_DESC_F_WRITE | >> > - VRING_DESC_F_AVAIL(vq->avail_wrap_counter) | >> > - VRING_DESC_F_USED(!vq->avail_wrap_counter); >> > + desc[vq->vq_avail_idx].flags = VRING_DESC_F_WRITE | vq->cached_flags; >> > vq->vq_free_cnt--; >> > nb_descs++; >> > if (++vq->vq_avail_idx >= vq->vq_nentries) { >> > vq->vq_avail_idx -= vq->vq_nentries; >> > - vq->avail_wrap_counter ^= 1; >> > + vq->cached_flags ^= >> > + VRING_DESC_F_AVAIL(1) | VRING_DESC_F_USED(1); >> > } >> > >> > virtio_wmb(vq->hw->weak_barriers); >> > - desc[head].flags = VRING_DESC_F_NEXT | >> > - VRING_DESC_F_AVAIL(avail_wrap_counter) | >> > - VRING_DESC_F_USED(!avail_wrap_counter); >> > + desc[head].flags = VRING_DESC_F_NEXT | flags; >> > >> > virtio_wmb(vq->hw->weak_barriers); >> > virtqueue_notify(vq); >> > @@ -226,12 +224,12 @@ virtio_send_command_packed(struct virtnet_ctl *cvq, >> > PMD_INIT_LOG(DEBUG, "vq->vq_free_cnt=%d\n" >> > "vq->vq_avail_idx=%d\n" >> > "vq->vq_used_cons_idx=%d\n" >> > - "vq->avail_wrap_counter=%d\n" >> > + "vq->cached_flags=0x%x\n" >> > "vq->used_wrap_counter=%d\n", >> > vq->vq_free_cnt, >> > vq->vq_avail_idx, >> > vq->vq_used_cons_idx, >> > - vq->avail_wrap_counter, >> > + vq->cached_flags, >> > vq->used_wrap_counter); >> > >> > result = cvq->virtio_net_hdr_mz->addr; >> > @@ -491,11 +489,10 @@ virtio_init_queue(struct rte_eth_dev *dev, uint16_t vtpci_queue_idx) >> > vq->vq_nentries = vq_size; >> > vq->event_flags_shadow = 0; >> > if (vtpci_packed_queue(hw)) { >> > - vq->avail_wrap_counter = 1; >> > vq->used_wrap_counter = 1; >> > - vq->avail_used_flags = >> > - VRING_DESC_F_AVAIL(vq->avail_wrap_counter) | >> > - VRING_DESC_F_USED(!vq->avail_wrap_counter); >> > + vq->cached_flags = VRING_DESC_F_AVAIL(1); >> > + if (queue_type == VTNET_RQ) >> > + vq->cached_flags |= VRING_DESC_F_WRITE; >> > } >> > >> > /* >> > diff --git a/drivers/net/virtio/virtio_rxtx.c b/drivers/net/virtio/virtio_rxtx.c >> > index 771d3c3f6..3c354baef 100644 >> > --- a/drivers/net/virtio/virtio_rxtx.c >> > +++ b/drivers/net/virtio/virtio_rxtx.c >> > @@ -431,7 +431,7 @@ virtqueue_enqueue_recv_refill_packed(struct virtqueue *vq, >> > struct rte_mbuf **cookie, uint16_t num) >> > { >> > struct vring_packed_desc *start_dp = vq->ring_packed.desc_packed; >> > - uint16_t flags = VRING_DESC_F_WRITE | vq->avail_used_flags; >> > + uint16_t flags = vq->cached_flags; >> > struct virtio_hw *hw = vq->hw; >> > struct vq_desc_extra *dxp; >> > uint16_t idx; >> > @@ -460,11 +460,9 @@ virtqueue_enqueue_recv_refill_packed(struct virtqueue *vq, >> > start_dp[idx].flags = flags; >> > if (++vq->vq_avail_idx >= vq->vq_nentries) { >> > vq->vq_avail_idx -= vq->vq_nentries; >> > - vq->avail_wrap_counter ^= 1; >> > - vq->avail_used_flags = >> > - VRING_DESC_F_AVAIL(vq->avail_wrap_counter) | >> > - VRING_DESC_F_USED(!vq->avail_wrap_counter); >> > - flags = VRING_DESC_F_WRITE | vq->avail_used_flags; >> > + vq->cached_flags ^= >> > + VRING_DESC_F_AVAIL(1) | VRING_DESC_F_USED(1); >> > + flags = vq->cached_flags; >> >> same here. it's not really cached, it's pre-calculated. And here we >> could also use a pre-calculated constand/define. > >For pre-calculated constant/define, do you mean VRING_DESC_F_AVAIL(1) >and VRING_DESC_F_USED(1)? There is still little code in virtio-user >using them without constantly passing 1. I planned to fully get rid >of them in a separate patch later (but I can do it in this series if >anyone wants). Yes, that's what I meant. And it's fine by me if you do it in a follow-up. > >> >> Otherwise looks good! Did you see any performance improvements? > >Yeah, I saw slightly better performance in a quick test. Nice. Reviewed-by: Jens Freimann regards, Jens > >> >> >> regards, >> Jens From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from dpdk.org (dpdk.org [92.243.14.124]) by dpdk.space (Postfix) with ESMTP id 595BBA00E6 for ; Tue, 19 Mar 2019 11:12:01 +0100 (CET) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id D5FB22C30; Tue, 19 Mar 2019 11:12:00 +0100 (CET) Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28]) by dpdk.org (Postfix) with ESMTP id 4B67E2956 for ; Tue, 19 Mar 2019 11:11:59 +0100 (CET) Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.phx2.redhat.com [10.5.11.22]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id A015230001E2; Tue, 19 Mar 2019 10:11:58 +0000 (UTC) Received: from localhost (dhcp-192-225.str.redhat.com [10.33.192.225]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 051DD1001E61; Tue, 19 Mar 2019 10:11:55 +0000 (UTC) Date: Tue, 19 Mar 2019 11:11:54 +0100 From: Jens Freimann To: Tiwei Bie Cc: maxime.coquelin@redhat.com, zhihong.wang@intel.com, dev@dpdk.org Message-ID: <20190319101154.loy5cmwbgnbnkxyg@jenstp.localdomain> References: <20190319064312.13743-1-tiwei.bie@intel.com> <20190319064312.13743-5-tiwei.bie@intel.com> <20190319085403.gnamedd5pz7jzwvj@jenstp.localdomain> <20190319093734.GA24818@dpdk-tbie.sh.intel.com> MIME-Version: 1.0 Content-Type: text/plain; charset="UTF-8"; format="flowed" Content-Disposition: inline In-Reply-To: <20190319093734.GA24818@dpdk-tbie.sh.intel.com> User-Agent: NeoMutt/20180716 X-Scanned-By: MIMEDefang 2.84 on 10.5.11.22 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.49]); Tue, 19 Mar 2019 10:11:58 +0000 (UTC) Subject: Re: [dpdk-dev] [PATCH 04/10] net/virtio: optimize flags update for packed ring X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Message-ID: <20190319101154.bZrvFuhjG-JJQIxFEFUgHSmJtJVfIhkK7Lu0KzSIeqI@z> On Tue, Mar 19, 2019 at 05:37:34PM +0800, Tiwei Bie wrote: >On Tue, Mar 19, 2019 at 09:54:03AM +0100, Jens Freimann wrote: >> On Tue, Mar 19, 2019 at 02:43:06PM +0800, Tiwei Bie wrote: >> > Cache the AVAIL, USED and WRITE bits to avoid calculating >> > them as much as possible. Note that, the WRITE bit isn't >> > cached for control queue. >> > >> > Signed-off-by: Tiwei Bie >> > --- >> > drivers/net/virtio/virtio_ethdev.c | 35 ++++++++++++++---------------- >> > drivers/net/virtio/virtio_rxtx.c | 31 ++++++++++---------------- >> > drivers/net/virtio/virtqueue.h | 8 +++---- >> > 3 files changed, 32 insertions(+), 42 deletions(-) >> > >> > diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c >> > index ff16fb63e..9060b6b33 100644 >> > --- a/drivers/net/virtio/virtio_ethdev.c >> > +++ b/drivers/net/virtio/virtio_ethdev.c >> > @@ -149,7 +149,7 @@ virtio_send_command_packed(struct virtnet_ctl *cvq, >> > int head; >> > struct vring_packed_desc *desc = vq->ring_packed.desc_packed; >> > struct virtio_pmd_ctrl *result; >> > - bool avail_wrap_counter; >> > + uint16_t flags; >> > int sum = 0; >> > int nb_descs = 0; >> > int k; >> > @@ -161,14 +161,15 @@ virtio_send_command_packed(struct virtnet_ctl *cvq, >> > * One RX packet for ACK. >> > */ >> > head = vq->vq_avail_idx; >> > - avail_wrap_counter = vq->avail_wrap_counter; >> > + flags = vq->cached_flags; >> > desc[head].addr = cvq->virtio_net_hdr_mem; >> > desc[head].len = sizeof(struct virtio_net_ctrl_hdr); >> > vq->vq_free_cnt--; >> > nb_descs++; >> > if (++vq->vq_avail_idx >= vq->vq_nentries) { >> > vq->vq_avail_idx -= vq->vq_nentries; >> > - vq->avail_wrap_counter ^= 1; >> > + vq->cached_flags ^= >> > + VRING_DESC_F_AVAIL(1) | VRING_DESC_F_USED(1); >> >> Maybe name it avail_used_flags instead of cached flags. Also we could >> use a constant value. > >It also contains the WRITE bit (not just AVAIL and USED bits) >for Rx path. That's why I didn't name it as avail_used_flags. ok >> >> > } >> > >> > for (k = 0; k < pkt_num; k++) { >> > @@ -177,34 +178,31 @@ virtio_send_command_packed(struct virtnet_ctl *cvq, >> > + sizeof(ctrl->status) + sizeof(uint8_t) * sum; >> > desc[vq->vq_avail_idx].len = dlen[k]; >> > desc[vq->vq_avail_idx].flags = VRING_DESC_F_NEXT | >> > - VRING_DESC_F_AVAIL(vq->avail_wrap_counter) | >> > - VRING_DESC_F_USED(!vq->avail_wrap_counter); >> > + vq->cached_flags; >> > sum += dlen[k]; >> > vq->vq_free_cnt--; >> > nb_descs++; >> > if (++vq->vq_avail_idx >= vq->vq_nentries) { >> > vq->vq_avail_idx -= vq->vq_nentries; >> > - vq->avail_wrap_counter ^= 1; >> > + vq->cached_flags ^= >> > + VRING_DESC_F_AVAIL(1) | VRING_DESC_F_USED(1); >> > } >> > } >> > >> > desc[vq->vq_avail_idx].addr = cvq->virtio_net_hdr_mem >> > + sizeof(struct virtio_net_ctrl_hdr); >> > desc[vq->vq_avail_idx].len = sizeof(ctrl->status); >> > - desc[vq->vq_avail_idx].flags = VRING_DESC_F_WRITE | >> > - VRING_DESC_F_AVAIL(vq->avail_wrap_counter) | >> > - VRING_DESC_F_USED(!vq->avail_wrap_counter); >> > + desc[vq->vq_avail_idx].flags = VRING_DESC_F_WRITE | vq->cached_flags; >> > vq->vq_free_cnt--; >> > nb_descs++; >> > if (++vq->vq_avail_idx >= vq->vq_nentries) { >> > vq->vq_avail_idx -= vq->vq_nentries; >> > - vq->avail_wrap_counter ^= 1; >> > + vq->cached_flags ^= >> > + VRING_DESC_F_AVAIL(1) | VRING_DESC_F_USED(1); >> > } >> > >> > virtio_wmb(vq->hw->weak_barriers); >> > - desc[head].flags = VRING_DESC_F_NEXT | >> > - VRING_DESC_F_AVAIL(avail_wrap_counter) | >> > - VRING_DESC_F_USED(!avail_wrap_counter); >> > + desc[head].flags = VRING_DESC_F_NEXT | flags; >> > >> > virtio_wmb(vq->hw->weak_barriers); >> > virtqueue_notify(vq); >> > @@ -226,12 +224,12 @@ virtio_send_command_packed(struct virtnet_ctl *cvq, >> > PMD_INIT_LOG(DEBUG, "vq->vq_free_cnt=%d\n" >> > "vq->vq_avail_idx=%d\n" >> > "vq->vq_used_cons_idx=%d\n" >> > - "vq->avail_wrap_counter=%d\n" >> > + "vq->cached_flags=0x%x\n" >> > "vq->used_wrap_counter=%d\n", >> > vq->vq_free_cnt, >> > vq->vq_avail_idx, >> > vq->vq_used_cons_idx, >> > - vq->avail_wrap_counter, >> > + vq->cached_flags, >> > vq->used_wrap_counter); >> > >> > result = cvq->virtio_net_hdr_mz->addr; >> > @@ -491,11 +489,10 @@ virtio_init_queue(struct rte_eth_dev *dev, uint16_t vtpci_queue_idx) >> > vq->vq_nentries = vq_size; >> > vq->event_flags_shadow = 0; >> > if (vtpci_packed_queue(hw)) { >> > - vq->avail_wrap_counter = 1; >> > vq->used_wrap_counter = 1; >> > - vq->avail_used_flags = >> > - VRING_DESC_F_AVAIL(vq->avail_wrap_counter) | >> > - VRING_DESC_F_USED(!vq->avail_wrap_counter); >> > + vq->cached_flags = VRING_DESC_F_AVAIL(1); >> > + if (queue_type == VTNET_RQ) >> > + vq->cached_flags |= VRING_DESC_F_WRITE; >> > } >> > >> > /* >> > diff --git a/drivers/net/virtio/virtio_rxtx.c b/drivers/net/virtio/virtio_rxtx.c >> > index 771d3c3f6..3c354baef 100644 >> > --- a/drivers/net/virtio/virtio_rxtx.c >> > +++ b/drivers/net/virtio/virtio_rxtx.c >> > @@ -431,7 +431,7 @@ virtqueue_enqueue_recv_refill_packed(struct virtqueue *vq, >> > struct rte_mbuf **cookie, uint16_t num) >> > { >> > struct vring_packed_desc *start_dp = vq->ring_packed.desc_packed; >> > - uint16_t flags = VRING_DESC_F_WRITE | vq->avail_used_flags; >> > + uint16_t flags = vq->cached_flags; >> > struct virtio_hw *hw = vq->hw; >> > struct vq_desc_extra *dxp; >> > uint16_t idx; >> > @@ -460,11 +460,9 @@ virtqueue_enqueue_recv_refill_packed(struct virtqueue *vq, >> > start_dp[idx].flags = flags; >> > if (++vq->vq_avail_idx >= vq->vq_nentries) { >> > vq->vq_avail_idx -= vq->vq_nentries; >> > - vq->avail_wrap_counter ^= 1; >> > - vq->avail_used_flags = >> > - VRING_DESC_F_AVAIL(vq->avail_wrap_counter) | >> > - VRING_DESC_F_USED(!vq->avail_wrap_counter); >> > - flags = VRING_DESC_F_WRITE | vq->avail_used_flags; >> > + vq->cached_flags ^= >> > + VRING_DESC_F_AVAIL(1) | VRING_DESC_F_USED(1); >> > + flags = vq->cached_flags; >> >> same here. it's not really cached, it's pre-calculated. And here we >> could also use a pre-calculated constand/define. > >For pre-calculated constant/define, do you mean VRING_DESC_F_AVAIL(1) >and VRING_DESC_F_USED(1)? There is still little code in virtio-user >using them without constantly passing 1. I planned to fully get rid >of them in a separate patch later (but I can do it in this series if >anyone wants). Yes, that's what I meant. And it's fine by me if you do it in a follow-up. > >> >> Otherwise looks good! Did you see any performance improvements? > >Yeah, I saw slightly better performance in a quick test. Nice. Reviewed-by: Jens Freimann regards, Jens > >> >> >> regards, >> Jens