From: Yuanhan Liu <yuanhan.liu@linux.intel.com>
To: dev@dpdk.org
Date: Thu, 10 Mar 2016 12:32:42 +0800
Message-Id: <1457584366-3036-5-git-send-email-yuanhan.liu@linux.intel.com>
X-Mailer: git-send-email 1.9.0
In-Reply-To: <1457584366-3036-1-git-send-email-yuanhan.liu@linux.intel.com>
References: <1455803352-5518-1-git-send-email-yuanhan.liu@linux.intel.com>
 <1457584366-3036-1-git-send-email-yuanhan.liu@linux.intel.com>
Subject: [dpdk-dev] [PATCH v3 4/8] vhost: do not use rte_memcpy for virtio_hdr copy

First of all, rte_memcpy() is mostly useful for copying big packets by
leveraging advanced hardware instructions like AVX. But for the virtio
net hdr, which is 12 bytes at most, invoking rte_memcpy() brings no
performance boost.

And, to my surprise, rte_memcpy() is VERY huge. Since rte_memcpy() is
inlined, it increases the binary code size linearly every time we call
it at a different place. Replacing the two rte_memcpy() calls with
direct copies saves nearly 12K bytes of code size!

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
---
 lib/librte_vhost/vhost_rxtx.c | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/lib/librte_vhost/vhost_rxtx.c b/lib/librte_vhost/vhost_rxtx.c
index 9be3593..bafcb52 100644
--- a/lib/librte_vhost/vhost_rxtx.c
+++ b/lib/librte_vhost/vhost_rxtx.c
@@ -129,6 +129,16 @@ virtio_enqueue_offload(struct rte_mbuf *m_buf, struct virtio_net_hdr *net_hdr)
 	return;
 }
 
+static inline void
+copy_virtio_net_hdr(struct vhost_virtqueue *vq, uint64_t desc_addr,
+		    struct virtio_net_hdr_mrg_rxbuf hdr)
+{
+	if (vq->vhost_hlen == sizeof(struct virtio_net_hdr_mrg_rxbuf))
+		*(struct virtio_net_hdr_mrg_rxbuf *)(uintptr_t)desc_addr = hdr;
+	else
+		*(struct virtio_net_hdr *)(uintptr_t)desc_addr = hdr.hdr;
+}
+
 static inline int __attribute__((always_inline))
 copy_mbuf_to_desc(struct virtio_net *dev, struct vhost_virtqueue *vq,
 		  struct rte_mbuf *m, uint16_t desc_idx, uint32_t *copied)
@@ -145,8 +155,7 @@ copy_mbuf_to_desc(struct virtio_net *dev, struct vhost_virtqueue *vq,
 	rte_prefetch0((void *)(uintptr_t)desc_addr);
 
 	virtio_enqueue_offload(m, &virtio_hdr.hdr);
-	rte_memcpy((void *)(uintptr_t)desc_addr,
-		   (const void *)&virtio_hdr, vq->vhost_hlen);
+	copy_virtio_net_hdr(vq, desc_addr, virtio_hdr);
 	vhost_log_write(dev, desc->addr, vq->vhost_hlen);
 	PRINT_PACKET(dev, (uintptr_t)desc_addr, vq->vhost_hlen, 0);
 
@@ -447,8 +456,7 @@ copy_mbuf_to_desc_mergeable(struct virtio_net *dev, struct vhost_virtqueue *vq,
 		dev->device_fh, virtio_hdr.num_buffers);
 
 	virtio_enqueue_offload(m, &virtio_hdr.hdr);
-	rte_memcpy((void *)(uintptr_t)desc_addr,
-		   (const void *)&virtio_hdr, vq->vhost_hlen);
+	copy_virtio_net_hdr(vq, desc_addr, virtio_hdr);
 	vhost_log_write(dev, vq->buf_vec[vec_idx].buf_addr, vq->vhost_hlen);
 	PRINT_PACKET(dev, (uintptr_t)desc_addr, vq->vhost_hlen, 0);
 
-- 
1.9.0
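
The code-size argument above hinges on one C detail: a struct assignment
has a compile-time-constant size, so the compiler emits a couple of mov
instructions inline, whereas an always-inline, length-parameterized
memcpy such as rte_memcpy() duplicates its whole size-dispatch body at
every call site. Below is a minimal standalone sketch of that technique,
outside the patch itself: the two struct layouts follow the virtio spec
(10 bytes, plus 2 for the mergeable-rx buffer count), but the function
names are invented for illustration and plain memcpy() stands in for
rte_memcpy().

#include <stdint.h>
#include <string.h>

struct virtio_net_hdr {			/* 10-byte base header (virtio spec) */
	uint8_t  flags;
	uint8_t  gso_type;
	uint16_t hdr_len;
	uint16_t gso_size;
	uint16_t csum_start;
	uint16_t csum_offset;
};

struct virtio_net_hdr_mrg_rxbuf {	/* 12 bytes with mergeable-rx count */
	struct virtio_net_hdr hdr;
	uint16_t num_buffers;
};

/*
 * Struct assignment: the copy size is known at compile time, so this
 * compiles to a few inline moves with no function body duplicated.
 */
static inline void
copy_hdr_by_assignment(void *dst, const struct virtio_net_hdr_mrg_rxbuf *src)
{
	*(struct virtio_net_hdr_mrg_rxbuf *)dst = *src;
}

/*
 * Length-parameterized copy: when the callee is a large inlined memcpy
 * (as rte_memcpy() is), its full dispatch body lands at every call
 * site, even though only the small-size branch ever runs for a 10- or
 * 12-byte header. memcpy() here is only a stand-in for rte_memcpy().
 */
static inline void
copy_hdr_by_memcpy(void *dst, const struct virtio_net_hdr_mrg_rxbuf *src,
		   size_t hlen)
{
	memcpy(dst, src, hlen);
}

This is also why the patch branches on vq->vhost_hlen inside
copy_virtio_net_hdr(): each branch assigns a fixed-size struct, so both
the 12-byte (mergeable) and 10-byte (non-mergeable) cases stay
constant-size copies.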