From mboxrd@z Thu Jan 1 00:00:00 1970
From: Steven Lariau
Cc: dev@dpdk.org, nd@arm.com, dharmik.thakkar@arm.com, Steven Lariau
Date: Fri, 11 Sep 2020 16:29:33 +0100
Message-Id: <20200911152938.8019-1-steven.lariau@arm.com>
X-Mailer: git-send-email 2.17.1
Subject: [dpdk-dev] [PATCH 0/5] lib/stack: improve lockfree C11 implementation
List-Id: DPDK patches and discussions

One implementation of the DPDK stack library is lock-free and is based
on the C11 memory model for atomics. Some of its atomic operations use
stronger memory orderings than necessary. This patch series relaxes
those orderings to improve the performance of the stack library.

The series was tested on several architectures, both to check that the
implementation remains correct and to measure performance.
Below are the results for a few architectures on the multithreaded
lock-free stack test.
The cycle count is the average number of cycles per item needed to
perform a bulk push / pop.

$ sudo ./builddir/app/dpdk-test
RTE>>stack_lf_perf_autotest

Cycles count on ThunderX2         difference compared to main
2 cores, bulk size = 8:           -15.85%
2 cores, bulk size = 32:          -04.56%
4 cores, bulk size = 8:           -05.00%
4 cores, bulk size = 32:          -04.35%
16 cores, bulk size = 8:          -02.38%
16 cores, bulk size = 32:         -01.88%

Cycles count on N1SDP             difference compared to main
2 cores, batch size = 8:          +00.77%
2 cores, batch size = 32:         -16.00%

Cycles count on Skylake           difference compared to main
2 cores, bulk size = 8:           -00.18%
2 cores, bulk size = 32:          -00.95%
4 cores, bulk size = 8:           -01.19%
4 cores, bulk size = 32:          +00.64%
16 cores, bulk size = 8:          +01.20%
16 cores, bulk size = 32:         +00.48%

Steven Lariau (5):
  lib/stack: fix inconsistent weak / strong cas
  lib/stack: remove push acquire fence
  lib/stack: remove redundant orderings for list->len
  lib/stack: reload head when pop fails
  lib/stack: remove pop cas release ordering

 lib/librte_stack/rte_stack_lf_c11.h | 18 +++++++-----------
 1 file changed, 7 insertions(+), 11 deletions(-)

-- 
2.17.1
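
For readers unfamiliar with the kind of relaxation the patch titles refer
to, here is a minimal sketch in plain C11 <stdatomic.h>. It is not the code
in rte_stack_lf_c11.h: the types (struct node, struct lf_stack) and
functions (lf_push, lf_pop) are hypothetical, the stack handles one element
at a time, and it has no ABA protection, so it is not safe for concurrent
pops when nodes are reused. It only illustrates three ideas: a weak CAS
inside a retry loop, a failed CAS reloading the expected head by itself,
and a length counter that needs no ordering when it is treated purely as a
hint (an assumption of this sketch).

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical, simplified types; not the rte_stack_lf structures. */
struct node {
	struct node *next;
	int value;
};

struct lf_stack {
	_Atomic(struct node *) head;
	atomic_size_t len; /* treated as a hint only in this sketch */
};

static void
lf_push(struct lf_stack *s, struct node *n)
{
	/* Nothing is dereferenced through 'old', so relaxed is enough. */
	struct node *old = atomic_load_explicit(&s->head,
						memory_order_relaxed);

	do {
		n->next = old;
		/*
		 * Weak CAS is sufficient in a retry loop. Release on
		 * success publishes n->next and n->value. On failure the
		 * CAS writes the current head into 'old', so no separate
		 * reload is needed.
		 */
	} while (!atomic_compare_exchange_weak_explicit(&s->head, &old, n,
			memory_order_release, memory_order_relaxed));

	atomic_fetch_add_explicit(&s->len, 1, memory_order_relaxed);
}

static struct node *
lf_pop(struct lf_stack *s)
{
	/* Acquire: old->next is read below, so the push must be visible. */
	struct node *old = atomic_load_explicit(&s->head,
						memory_order_acquire);

	while (old != NULL) {
		/*
		 * Pop publishes nothing, so the success ordering does not
		 * need release. The failure ordering is acquire because the
		 * reloaded 'old' is dereferenced on the next iteration.
		 */
		if (atomic_compare_exchange_weak_explicit(&s->head, &old,
				old->next, memory_order_acquire,
				memory_order_acquire)) {
			atomic_fetch_sub_explicit(&s->len, 1,
						  memory_order_relaxed);
			return old;
		}
	}
	return NULL; /* stack was empty */
}
```

Pushing two nodes and popping twice returns them in LIFO order, and a pop
on an empty stack returns NULL. Whether a given ordering can be relaxed in
the real library depends on how the surrounding code uses the loaded
values, which this sketch does not reproduce.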