From: Jerin Jacob
Date: Mon, 7 Jun 2021 00:04:56 +0530
To: Ruifeng Wang
Cc: Jerin Jacob, Hemant Agrawal, Ferruh Yigit, Thomas Monjalon, David Marchand, dpdk-dev, nd, Honnappa Nagarahalli
Subject: Re: [dpdk-dev] [PATCH v2 1/3] examples/l3fwd: reorganize code for better performance
References: <20210318102550.59265-1-ruifeng.wang@arm.com> <20210601075653.84927-1-ruifeng.wang@arm.com> <20210601075653.84927-2-ruifeng.wang@arm.com>
In-Reply-To: <20210601075653.84927-2-ruifeng.wang@arm.com>

On Tue, Jun 1, 2021 at 1:27 PM Ruifeng Wang wrote:
>
> Moved rfc1812 process prior to NEON registers store.
> On N1SDP, this reorganization mitigates CPU frontend stall and backend
> stall when forwarding.
>
> On N1SDP with MLX5 40G NIC, this change showed 10.2% performance gain
> in single port single core MRR test.

I don't think this has anything to do with N1SDP. It could just be the
timing of the prefetch window in the MLX5 driver, i.e. when the Tx mbuf
is touched in tx_burst(), combined with L1 cache pressure. I think
tuning the driver parameters could shift that window into some driver
code instead.

On Octeontx2, this change shows a 3.1% regression in the flow lookup
miss case, so NACK.

> On ThunderX2, this changed showed no performance degradation.
>
> Signed-off-by: Ruifeng Wang
> ---
>  examples/l3fwd/l3fwd_neon.h | 10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/examples/l3fwd/l3fwd_neon.h b/examples/l3fwd/l3fwd_neon.h
> index 86ac5971d7..ea7fe22d00 100644
> --- a/examples/l3fwd/l3fwd_neon.h
> +++ b/examples/l3fwd/l3fwd_neon.h
> @@ -43,11 +43,6 @@ processx4_step3(struct rte_mbuf *pkt[FWDSTEP], uint16_t dst_port[FWDSTEP])
>  	ve[2] = vsetq_lane_u32(vgetq_lane_u32(te[2], 3), ve[2], 3);
>  	ve[3] = vsetq_lane_u32(vgetq_lane_u32(te[3], 3), ve[3], 3);
>
> -	vst1q_u32(p[0], ve[0]);
> -	vst1q_u32(p[1], ve[1]);
> -	vst1q_u32(p[2], ve[2]);
> -	vst1q_u32(p[3], ve[3]);
> -
>  	rfc1812_process((struct rte_ipv4_hdr *)
>  			((struct rte_ether_hdr *)p[0] + 1),
>  			&dst_port[0], pkt[0]->packet_type);
> @@ -60,6 +55,11 @@ processx4_step3(struct rte_mbuf *pkt[FWDSTEP], uint16_t dst_port[FWDSTEP])
>  	rfc1812_process((struct rte_ipv4_hdr *)
>  			((struct rte_ether_hdr *)p[3] + 1),
>  			&dst_port[3], pkt[3]->packet_type);
> +
> +	vst1q_u32(p[0], ve[0]);
> +	vst1q_u32(p[1], ve[1]);
> +	vst1q_u32(p[2], ve[2]);
> +	vst1q_u32(p[3], ve[3]);
>  }
>
>  /*
> --
> 2.25.1
>
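
For anyone following the thread, here is a minimal portable sketch of
why the reordering in the quoted hunk should only change instruction
scheduling and not the resulting packet. It is not the l3fwd code: the
helper names, the 64-byte frame and the plain memcpy() in place of
vst1q_u32() are all hypothetical. It also assumes that the rfc1812 step
only touches the IPv4 TTL and checksum fields, which lie outside the
16 bytes written by the vector store.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define FRAME_LEN     64
#define ETH_HDR_LEN   14  /* dst MAC + src MAC + EtherType */
#define VEC_LEN       16  /* bytes written by one 128-bit store */
#define IPV4_TTL_OFF   8  /* TTL offset inside the IPv4 header */
#define IPV4_CSUM_OFF 10  /* checksum offset inside the IPv4 header */

/* Stand-in for building ve[]: new MAC addresses in bytes 0..11,
 * original bytes 12..15 carried over (like lane 3 taken from te[]). */
static void build_vec(const uint8_t *frame, const uint8_t macs[12],
		      uint8_t v[VEC_LEN])
{
	memcpy(v, frame, VEC_LEN);
	memcpy(v, macs, 12);
}

/* Stand-in for vst1q_u32(p, ve): one 16-byte store to the frame start. */
static void store_vec(uint8_t *frame, const uint8_t v[VEC_LEN])
{
	memcpy(frame, v, VEC_LEN);
}

/* Simplified, assumed model of the rfc1812 step: TTL-- and checksum++. */
static void ttl_csum_step(uint8_t *frame)
{
	uint8_t *ip = frame + ETH_HDR_LEN;
	uint16_t csum;

	/* Key assumption: the modified bytes are outside the vector store. */
	assert(ETH_HDR_LEN + IPV4_TTL_OFF >= VEC_LEN);

	ip[IPV4_TTL_OFF]--;
	memcpy(&csum, ip + IPV4_CSUM_OFF, sizeof(csum));
	csum++;
	memcpy(ip + IPV4_CSUM_OFF, &csum, sizeof(csum));
}

/* Order before the patch: store first, then the header update. */
void forward_store_first(uint8_t *frame, const uint8_t macs[12])
{
	uint8_t v[VEC_LEN];

	build_vec(frame, macs, v);
	store_vec(frame, v);
	ttl_csum_step(frame);
}

/* Order after the patch: header update first, then the store. */
void forward_store_last(uint8_t *frame, const uint8_t macs[12])
{
	uint8_t v[VEC_LEN];

	build_vec(frame, macs, v);
	ttl_csum_step(frame);
	store_vec(frame, v);
}

/* Returns 1 if both orders leave the frame byte-for-byte identical. */
int orders_agree(const uint8_t *frame_in, const uint8_t macs[12])
{
	uint8_t a[FRAME_LEN], b[FRAME_LEN];

	memcpy(a, frame_in, FRAME_LEN);
	memcpy(b, frame_in, FRAME_LEN);
	forward_store_first(a, macs);
	forward_store_last(b, macs);
	return memcmp(a, b, FRAME_LEN) == 0;
}
```

Calling orders_agree() on any 64-byte frame should return 1 under the
stated assumption, since the two steps write disjoint bytes. The sketch
says nothing about timing, so it cannot show which order is faster on a
given core or driver.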