From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from dpdk.org (dpdk.org [92.243.14.124])
	by inbox.dpdk.org (Postfix) with ESMTP id 1E7E3A0563
	for ; Mon, 16 Mar 2020 19:31:52 +0100 (CET)
Received: from [92.243.14.124] (localhost [127.0.0.1])
	by dpdk.org (Postfix) with ESMTP id 006461C0C7;
	Mon, 16 Mar 2020 19:31:52 +0100 (CET)
Received: from mellanox.co.il (mail-il-dmz.mellanox.com [193.47.165.129])
	by dpdk.org (Postfix) with ESMTP id 20F1C1C07B
	for ; Mon, 16 Mar 2020 19:31:49 +0100 (CET)
Received: from Internal Mail-Server by MTLPINE1 (envelope-from akozyrev@mellanox.com)
	with ESMTPS (AES256-SHA encrypted); 16 Mar 2020 20:31:48 +0200
Received: from pegasus02.mtr.labs.mlnx. (pegasus02.mtr.labs.mlnx [10.210.16.122])
	by labmailer.mlnx (8.13.8/8.13.8) with ESMTP id 02GIVlE4016010;
	Mon, 16 Mar 2020 20:31:47 +0200
From: Alexander Kozyrev
To: dev@dpdk.org
Cc: olivier.matz@6wind.com, viacheslavo@mellanox.com, matan@mellanox.com,
	thomas@monjalon.net, stable@dpdk.org
Date: Mon, 16 Mar 2020 18:31:40 +0000
Message-Id: <1584383500-27482-1-git-send-email-akozyrev@mellanox.com>
X-Mailer: git-send-email 1.8.3.1
Subject: [dpdk-stable] [PATCH] mbuf: optimize memory loads during mbuf freeing
X-BeenThere: stable@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: patches for DPDK stable branches
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: stable-bounces@dpdk.org
Sender: "stable"

The introduction of pinned external buffers doubled the memory loads in
the rte_pktmbuf_prefree_seg() function. Analysis of the generated
assembly code shows an unnecessary load of the pool field of the
rte_mbuf structure.

Here is the snippet of the assembly for "if (!RTE_MBUF_DIRECT(m))":

Before the change the code was:
	movq  0x18(%rbx), %rax // load the ol_flags field
	test  %r13, %rax       // test ol_flags against the 0x60...0 mask
	jz    0x9a8718         // jump out to "if (m->next != NULL)"

After the change the code became:
	movq  0x18(%rbx), %rax // load ol_flags
	test  %r14, %rax       // test ol_flags against the 0x60...0 mask
	jnz   0x9bea38         // jump in to "if (!RTE_MBUF_HAS_EXTBUF(m))"
	movq  0x48(%rbx), %rax // load the pool field
	jmp   0x9bea78         // jump out to "if (m->next != NULL)"

It looks like this absolutely unneeded memory load of the pool field is
an optimization for the external buffer case in GCC (4.8.5), since Clang
generates the same assembly for both the before and after versions of
the change. In addition, GCC favors the external buffer case over the
simple case. This assembly code layout causes a performance degradation
because the rte_pktmbuf_prefree_seg() function is part of a very hot
path.

Work around this compilation issue by moving the check for a pinned
buffer apart from the check for an external buffer, restoring the
initial code flow that favors the direct mbuf case over the external
one.

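For reviewers who want to convince themselves that the reordered checks
keep the same behavior, here is a minimal standalone sketch of the two
control flows. It is not the DPDK code: every toy_* name, the flag bit
positions and the struct layout are hypothetical. It also assumes that
detaching is a no-op for a pinned external buffer, and that an mbuf from
a pinned pool always carries the external-buffer flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the attachment flags; not DPDK's values. */
#define TOY_IND_ATTACHED (1ULL << 62)
#define TOY_EXT_ATTACHED (1ULL << 61)
#define TOY_ATTACHED     (TOY_IND_ATTACHED | TOY_EXT_ATTACHED)

struct toy_mbuf {
	uint64_t ol_flags;  /* attachment flags checked on the hot path */
	bool pool_pinned;   /* models the property read via the pool field */
	int extbuf_refcnt;  /* models the external buffer reference count */
	int detach_calls;   /* counts detach calls that did real work */
};

static bool toy_is_direct(const struct toy_mbuf *m)
{
	return (m->ol_flags & TOY_ATTACHED) == 0;
}

static bool toy_has_extbuf(const struct toy_mbuf *m)
{
	return (m->ol_flags & TOY_EXT_ATTACHED) != 0;
}

/* Assumption: a pinned external buffer is never detached. */
static void toy_detach(struct toy_mbuf *m)
{
	if (toy_has_extbuf(m) && m->pool_pinned)
		return;
	m->ol_flags &= ~TOY_ATTACHED;
	m->detach_calls++;
}

/* Assumption: returns nonzero while other references remain. */
static int toy_pinned_decref(struct toy_mbuf *m)
{
	return --m->extbuf_refcnt > 0;
}

/* Flow before the patch; returns true if the segment can be recycled. */
static bool toy_prefree_old(struct toy_mbuf *m)
{
	if (!toy_is_direct(m)) {
		if (!toy_has_extbuf(m) || !m->pool_pinned)
			toy_detach(m);
		else if (toy_pinned_decref(m))
			return false;
	}
	return true;
}

/* Flow after the patch: detach first, then the separate pinned check. */
static bool toy_prefree_new(struct toy_mbuf *m)
{
	if (!toy_is_direct(m)) {
		toy_detach(m);
		if (m->pool_pinned && toy_pinned_decref(m))
			return false;
	}
	return true;
}

/* Runs both flows on identical copies and compares result and state. */
static bool toy_variants_agree(uint64_t flags, bool pinned, int refcnt)
{
	struct toy_mbuf a = { .ol_flags = flags, .pool_pinned = pinned,
			      .extbuf_refcnt = refcnt, .detach_calls = 0 };
	struct toy_mbuf b = a;

	return toy_prefree_old(&a) == toy_prefree_new(&b) &&
	       a.ol_flags == b.ol_flags &&
	       a.extbuf_refcnt == b.extbuf_refcnt &&
	       a.detach_calls == b.detach_calls;
}

/* Covers the states allowed by the assumptions stated above. */
void toy_self_check(void)
{
	assert(toy_variants_agree(0, false, 1));                /* direct */
	assert(toy_variants_agree(TOY_IND_ATTACHED, false, 1)); /* indirect */
	assert(toy_variants_agree(TOY_EXT_ATTACHED, false, 1)); /* external */
	assert(toy_variants_agree(TOY_EXT_ATTACHED, true, 1));  /* pinned */
	assert(toy_variants_agree(TOY_EXT_ATTACHED, true, 2));  /* shared */
}
```

Calling toy_self_check() from a test program exercises the direct,
indirect, external and pinned cases. An indirect mbuf without the
external flag in a pinned pool is deliberately left out, because the
sketch assumes that state cannot occur.
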
Fixes: 6ef1107ad4c6 ("mbuf: detach mbuf with pinned external buffer")
Cc: stable@dpdk.org

Signed-off-by: Alexander Kozyrev
Acked-by: Viacheslav Ovsiienko
---
 lib/librte_mbuf/rte_mbuf.h | 14 ++++++--------
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index 34679e0..ab9d3f5 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -1335,10 +1335,9 @@ static inline int __rte_pktmbuf_pinned_extbuf_decref(struct rte_mbuf *m)
 	if (likely(rte_mbuf_refcnt_read(m) == 1)) {
 
 		if (!RTE_MBUF_DIRECT(m)) {
-			if (!RTE_MBUF_HAS_EXTBUF(m) ||
-			    !RTE_MBUF_HAS_PINNED_EXTBUF(m))
-				rte_pktmbuf_detach(m);
-			else if (__rte_pktmbuf_pinned_extbuf_decref(m))
+			rte_pktmbuf_detach(m);
+			if (RTE_MBUF_HAS_PINNED_EXTBUF(m) &&
+			    __rte_pktmbuf_pinned_extbuf_decref(m))
 				return NULL;
 		}
 
@@ -1352,10 +1351,9 @@ static inline int __rte_pktmbuf_pinned_extbuf_decref(struct rte_mbuf *m)
 	} else if (__rte_mbuf_refcnt_update(m, -1) == 0) {
 
 		if (!RTE_MBUF_DIRECT(m)) {
-			if (!RTE_MBUF_HAS_EXTBUF(m) ||
-			    !RTE_MBUF_HAS_PINNED_EXTBUF(m))
-				rte_pktmbuf_detach(m);
-			else if (__rte_pktmbuf_pinned_extbuf_decref(m))
+			rte_pktmbuf_detach(m);
+			if (RTE_MBUF_HAS_PINNED_EXTBUF(m) &&
+			    __rte_pktmbuf_pinned_extbuf_decref(m))
 				return NULL;
 		}
 
-- 
1.8.3.1