From mboxrd@z Thu Jan 1 00:00:00 1970
From: Yuanhan Liu
To: dev@dpdk.org
Cc: Maxime Coquelin, Yuanhan Liu
Date: Fri, 14 Apr 2017 15:53:18 +0800
Message-Id: <1492156398-14405-1-git-send-email-yuanhan.liu@linux.intel.com>
X-Mailer: git-send-email 1.9.0
Subject: [dpdk-dev] [PATCH] vhost: avoid memory write on net header when necessary
List-Id: DPDK patches and discussions

As we did for the virtio PMD [0][1], we can apply the same trick to
vhost: skip the memory write to the net header when the field already
holds the value we are about to store.

[0]: c9ea670c1dc7 ("net/virtio: fix performance regression due to TSO")
[1]: 16994abee215 ("net/virtio: optimize header reset on any layout")

With this, the cache issue of the mergeable path is again greatly
reduced: even the write of "num_buffers" can be avoided.

A quick PVP test shows that the gap between mergeable Rx and
non-mergeable Rx is now pretty small: they are basically the same in
my test.

Signed-off-by: Yuanhan Liu
---

I still have no plan to make ASSIGN_UNLESS_EQUAL public; it is something
I will consider when there is a third user.
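
For readers who have not seen the trick before, here is a minimal
standalone sketch of it. Only the macro is taken from the diff; struct
demo_net_hdr and demo_clear_offload() are hypothetical stand-ins made up
for illustration, and the sketch assumes the header fields are usually
already zero when no offload is requested.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same macro as in the patch: store only when the value would change. */
#define ASSIGN_UNLESS_EQUAL(var, val) do {	\
	if ((var) != (val))			\
		(var) = (val);			\
} while (0)

/* Hypothetical stand-in for struct virtio_net_hdr (illustration only). */
struct demo_net_hdr {
	uint8_t  flags;
	uint8_t  gso_type;
	uint16_t hdr_len;
	uint16_t gso_size;
	uint16_t csum_start;
	uint16_t csum_offset;
};

/*
 * Hypothetical helper: reset all offload fields in place.
 * A field that is already zero is only read, never written.
 */
static inline void
demo_clear_offload(struct demo_net_hdr *hdr)
{
	assert(hdr != NULL);

	ASSIGN_UNLESS_EQUAL(hdr->csum_start, 0);
	ASSIGN_UNLESS_EQUAL(hdr->csum_offset, 0);
	ASSIGN_UNLESS_EQUAL(hdr->flags, 0);
	ASSIGN_UNLESS_EQUAL(hdr->gso_type, 0);
	ASSIGN_UNLESS_EQUAL(hdr->gso_size, 0);
	ASSIGN_UNLESS_EQUAL(hdr->hdr_len, 0);
}
```

Calling demo_clear_offload() on a header that is already all zeros
performs six compares and no stores; on a header with stale values it
writes only the fields that differ.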
---
 lib/librte_vhost/virtio_net.c | 38 +++++++++++++++++++++-----------------
 1 file changed, 21 insertions(+), 17 deletions(-)

diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
index b9f2168..cfdefe0 100644
--- a/lib/librte_vhost/virtio_net.c
+++ b/lib/librte_vhost/virtio_net.c
@@ -105,6 +105,12 @@ static inline void __attribute__((always_inline))
 	vq->shadow_used_ring[i].len = len;
 }
 
+/* avoid write operation when necessary, to lessen cache issues */
+#define ASSIGN_UNLESS_EQUAL(var, val) do {	\
+	if ((var) != (val))			\
+		(var) = (val);			\
+} while (0)
+
 static void
 virtio_enqueue_offload(struct rte_mbuf *m_buf, struct virtio_net_hdr *net_hdr)
 {
@@ -126,6 +132,10 @@ static inline void __attribute__((always_inline))
 						cksum));
 			break;
 		}
+	} else {
+		ASSIGN_UNLESS_EQUAL(net_hdr->csum_start, 0);
+		ASSIGN_UNLESS_EQUAL(net_hdr->csum_offset, 0);
+		ASSIGN_UNLESS_EQUAL(net_hdr->flags, 0);
 	}
 
 	if (m_buf->ol_flags & PKT_TX_TCP_SEG) {
@@ -136,19 +146,13 @@ static inline void __attribute__((always_inline))
 		net_hdr->gso_size = m_buf->tso_segsz;
 		net_hdr->hdr_len = m_buf->l2_len + m_buf->l3_len
 					+ m_buf->l4_len;
+	} else {
+		ASSIGN_UNLESS_EQUAL(net_hdr->gso_type, 0);
+		ASSIGN_UNLESS_EQUAL(net_hdr->gso_size, 0);
+		ASSIGN_UNLESS_EQUAL(net_hdr->hdr_len, 0);
 	}
 }
 
-static inline void
-copy_virtio_net_hdr(struct virtio_net *dev, uint64_t desc_addr,
-		    struct virtio_net_hdr_mrg_rxbuf hdr)
-{
-	if (dev->vhost_hlen == sizeof(struct virtio_net_hdr_mrg_rxbuf))
-		*(struct virtio_net_hdr_mrg_rxbuf *)(uintptr_t)desc_addr = hdr;
-	else
-		*(struct virtio_net_hdr *)(uintptr_t)desc_addr = hdr.hdr;
-}
-
 static inline int __attribute__((always_inline))
 copy_mbuf_to_desc(struct virtio_net *dev, struct vring_desc *descs,
 		  struct rte_mbuf *m, uint16_t desc_idx, uint32_t size)
@@ -158,7 +162,6 @@ static inline int __attribute__((always_inline))
 	uint32_t cpy_len;
 	struct vring_desc *desc;
 	uint64_t desc_addr;
-	struct virtio_net_hdr_mrg_rxbuf virtio_hdr = {{0, 0, 0, 0, 0, 0}, 0};
 	/* A counter to avoid desc dead loop chain */
 	uint16_t nr_desc = 1;
 
@@ -174,8 +177,7 @@ static inline int __attribute__((always_inline))
 
 	rte_prefetch0((void *)(uintptr_t)desc_addr);
 
-	virtio_enqueue_offload(m, &virtio_hdr.hdr);
-	copy_virtio_net_hdr(dev, desc_addr, virtio_hdr);
+	virtio_enqueue_offload(m, (struct virtio_net_hdr *)(uintptr_t)desc_addr);
 	vhost_log_write(dev, desc->addr, dev->vhost_hlen);
 	PRINT_PACKET(dev, (uintptr_t)desc_addr, dev->vhost_hlen, 0);
 
@@ -426,7 +428,6 @@ static inline int __attribute__((always_inline))
 copy_mbuf_to_desc_mergeable(struct virtio_net *dev, struct rte_mbuf *m,
 			    struct buf_vector *buf_vec, uint16_t num_buffers)
 {
-	struct virtio_net_hdr_mrg_rxbuf virtio_hdr = {{0, 0, 0, 0, 0, 0}, 0};
 	uint32_t vec_idx = 0;
 	uint64_t desc_addr;
 	uint32_t mbuf_offset, mbuf_avail;
@@ -447,7 +448,6 @@ static inline int __attribute__((always_inline))
 	hdr_phys_addr = buf_vec[vec_idx].buf_addr;
 	rte_prefetch0((void *)(uintptr_t)hdr_addr);
 
-	virtio_hdr.num_buffers = num_buffers;
 	LOG_DEBUG(VHOST_DATA, "(%d) RX: num merge buffers %d\n",
 		dev->vid, num_buffers);
 
@@ -480,8 +480,12 @@ static inline int __attribute__((always_inline))
 		}
 
 		if (hdr_addr) {
-			virtio_enqueue_offload(hdr_mbuf, &virtio_hdr.hdr);
-			copy_virtio_net_hdr(dev, hdr_addr, virtio_hdr);
+			struct virtio_net_hdr_mrg_rxbuf *hdr;
+
+			hdr = (struct virtio_net_hdr_mrg_rxbuf *)(uintptr_t)hdr_addr;
+			virtio_enqueue_offload(hdr_mbuf, &hdr->hdr);
+			ASSIGN_UNLESS_EQUAL(hdr->num_buffers, num_buffers);
+
 			vhost_log_write(dev, hdr_phys_addr, dev->vhost_hlen);
 			PRINT_PACKET(dev, (uintptr_t)hdr_addr,
 				     dev->vhost_hlen, 0);
-- 
1.9.0