From mboxrd@z Thu Jan 1 00:00:00 1970
From: Ruifeng Wang
To: jerinj@marvell.com, hemant.agrawal@nxp.com, ferruh.yigit@intel.com,
 thomas@monjalon.net, david.marchand@redhat.com
Cc: dev@dpdk.org, nd@arm.com, honnappa.nagarahalli@arm.com, Ruifeng Wang
Date: Tue, 1 Jun 2021 07:56:51 +0000
Message-Id: <20210601075653.84927-2-ruifeng.wang@arm.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20210601075653.84927-1-ruifeng.wang@arm.com>
References: <20210318102550.59265-1-ruifeng.wang@arm.com>
 <20210601075653.84927-1-ruifeng.wang@arm.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Subject: [dpdk-dev] [PATCH v2 1/3] examples/l3fwd: reorganize code for better performance
List-Id: DPDK patches and discussions

Move the rfc1812_process() calls ahead of the NEON register stores.
On N1SDP, this reorganization mitigates CPU frontend and backend stalls
when forwarding.

On N1SDP with an MLX5 40G NIC, this change showed a 10.2% performance
gain in the single-port, single-core MRR test.
On ThunderX2, this change showed no performance degradation.

Signed-off-by: Ruifeng Wang
---
 examples/l3fwd/l3fwd_neon.h | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/examples/l3fwd/l3fwd_neon.h b/examples/l3fwd/l3fwd_neon.h
index 86ac5971d7..ea7fe22d00 100644
--- a/examples/l3fwd/l3fwd_neon.h
+++ b/examples/l3fwd/l3fwd_neon.h
@@ -43,11 +43,6 @@ processx4_step3(struct rte_mbuf *pkt[FWDSTEP], uint16_t dst_port[FWDSTEP])
 	ve[2] = vsetq_lane_u32(vgetq_lane_u32(te[2], 3), ve[2], 3);
 	ve[3] = vsetq_lane_u32(vgetq_lane_u32(te[3], 3), ve[3], 3);
 
-	vst1q_u32(p[0], ve[0]);
-	vst1q_u32(p[1], ve[1]);
-	vst1q_u32(p[2], ve[2]);
-	vst1q_u32(p[3], ve[3]);
-
 	rfc1812_process((struct rte_ipv4_hdr *)
 			((struct rte_ether_hdr *)p[0] + 1),
 			&dst_port[0], pkt[0]->packet_type);
@@ -60,6 +55,11 @@ processx4_step3(struct rte_mbuf *pkt[FWDSTEP], uint16_t dst_port[FWDSTEP])
 	rfc1812_process((struct rte_ipv4_hdr *)
 			((struct rte_ether_hdr *)p[3] + 1),
 			&dst_port[3], pkt[3]->packet_type);
+
+	vst1q_u32(p[0], ve[0]);
+	vst1q_u32(p[1], ve[1]);
+	vst1q_u32(p[2], ve[2]);
+	vst1q_u32(p[3], ve[3]);
 }
 
 /*
-- 
2.25.1
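
Note on why the reordering looks behavior-preserving. This is a minimal
portable sketch, not the l3fwd code itself. It assumes that each
vst1q_u32() writes 16 bytes (the new MAC addresses plus lane 3 copied
back unchanged from the packet, as the context lines of the diff show),
and that rfc1812_process() only reads the version/IHL byte and updates
the TTL and header checksum. The body of rfc1812_process() is not part
of this patch, so ip_check_sketch() below is a hypothetical stand-in, as
are all other names ending in _sketch. Plain memcpy() stands in for the
NEON load and store so the sketch builds on any host.

```c
#include <stdint.h>
#include <string.h>

#define ETH_HDR_LEN     14      /* Ethernet header size in bytes */
#define STORE_LEN       16      /* bytes written by one 128-bit store */
#define BAD_PORT_SKETCH 0xFFFF  /* hypothetical "drop" marker */

/* Hypothetical stand-in for rfc1812_process(): reads version/IHL,
 * decrements the TTL and bumps the checksum (assumed behavior). */
static void ip_check_sketch(uint8_t *frame, uint16_t *dst_port)
{
	uint8_t *ip = frame + ETH_HDR_LEN;
	uint8_t ihl = (uint8_t)(ip[0] - 0x45);
	uint16_t csum;

	ip[8]--;                            /* TTL at IPv4 offset 8 */
	memcpy(&csum, ip + 10, sizeof(csum));
	csum++;                             /* checksum at IPv4 offset 10 */
	memcpy(ip + 10, &csum, sizeof(csum));

	if (ihl > 0x0a)                     /* version/IHL outside 0x45..0x4f */
		*dst_port = BAD_PORT_SKETCH;
}

/* Rewrite the MACs and run the IP check, in either order. */
static uint16_t forward_sketch(uint8_t *frame, const uint8_t macs[12],
		int store_first)
{
	uint8_t ve[STORE_LEN];
	uint16_t port = 1;

	memcpy(ve, macs, 12);               /* new dst/src MAC addresses */
	memcpy(ve + 12, frame + 12, 4);     /* lane 3 kept from the packet */

	if (store_first) {
		memcpy(frame, ve, STORE_LEN);   /* old order: store, then check */
		ip_check_sketch(frame, &port);
	} else {
		ip_check_sketch(frame, &port);  /* new order: check, then store */
		memcpy(frame, ve, STORE_LEN);
	}
	return port;
}

/* Returns 1 when both orders leave the same bytes and the same port. */
static int reorder_is_equivalent_sketch(void)
{
	static const uint8_t macs[12] = {
		0x02, 0, 0, 0, 0, 0x01, 0x02, 0, 0, 0, 0, 0x02
	};
	uint8_t a[34] = {0}, b[34];
	uint16_t pa, pb;

	a[12] = 0x08;                       /* EtherType 0x0800 (IPv4) */
	a[14] = 0x45;                       /* version 4, IHL 5 */
	a[22] = 64;                         /* TTL */
	a[24] = 0x12;
	a[25] = 0x34;                       /* arbitrary checksum bytes */
	memcpy(b, a, sizeof(a));

	pa = forward_sketch(a, macs, 1);
	pb = forward_sketch(b, macs, 0);
	return pa == pb && memcmp(a, b, sizeof(a)) == 0;
}
```

Calling reorder_is_equivalent_sketch() from a small main() should return
1 under the assumptions above: the only byte both steps touch is
version/IHL, which the check only reads and the store writes back
unchanged. The sketch says nothing about the stall behavior measured on
N1SDP; it only illustrates why the two orders yield the same packet.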