From: Gage Eads
To: dev@dpdk.org
Cc: olivier.matz@6wind.com, arybchenko@solarflare.com,
 bruce.richardson@intel.com, konstantin.ananyev@intel.com,
 stephen@networkplumber.org, jerinj@marvell.com, mczekaj@marvell.com,
 nd@arm.com, Ola.Liljedahl@arm.com, gage.eads@intel.com
Date: Mon, 18 Mar 2019 20:20:04 -0500
Message-Id: <20190319012010.16793-1-gage.eads@intel.com>
In-Reply-To: <20190318213555.17345-1-gage.eads@intel.com>
References: <20190318213555.17345-1-gage.eads@intel.com>
Subject: [dpdk-dev] [PATCH v8 0/6] Add lock-free ring and mempool handler

For some users, the rte ring's "non-preemptive" constraint is not acceptable;
for example, if the application uses a mixture of pinned high-priority threads
and multiplexed low-priority threads that share a mempool.

This patchset introduces a lock-free ring and a mempool based on it. The
lock-free algorithm relies on a double-pointer compare-and-swap, so for 64-bit
architectures it is currently limited to x86_64.

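To illustrate why a compare-and-swap over a value plus a modification counter
is useful, here is a minimal sketch. It is not code from this patchset: it is
a scaled-down analogue that packs a 32-bit value and a 32-bit counter into one
64-bit word so that it builds with plain C11 atomics, and every name in it
(slot_t, slot_pack, slot_update, slot_aba_demo) is hypothetical. The point it
shows is that a stale update is rejected even when the value itself has been
restored in the meantime (the ABA case), because the counter has moved on.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical slot layout (an assumption, not the patchset's structure):
 * value in the low 32 bits, modification counter in the high 32 bits. */
typedef _Atomic uint64_t slot_t;

static uint64_t slot_pack(uint32_t val, uint32_t cnt)
{
	return ((uint64_t)cnt << 32) | val;
}

static uint32_t slot_val(uint64_t s) { return (uint32_t)s; }
static uint32_t slot_cnt(uint64_t s) { return (uint32_t)(s >> 32); }

/* Store new_val only if the slot still holds exactly the value AND the
 * counter captured in 'seen'. The counter is bumped on every success. */
static bool slot_update(slot_t *slot, uint64_t seen, uint32_t new_val)
{
	uint64_t desired = slot_pack(new_val, slot_cnt(seen) + 1);

	return atomic_compare_exchange_strong(slot, &seen, desired);
}

/* Single-threaded walk-through of the ABA case. Returns true if the
 * stale update was correctly rejected. */
static bool slot_aba_demo(void)
{
	slot_t slot = slot_pack(7, 0);
	uint64_t stale = atomic_load(&slot);	/* snapshot: val 7, cnt 0 */
	bool ok;

	/* Two later updates restore the original value: 7 -> 9 -> 7 */
	ok = slot_update(&slot, atomic_load(&slot), 9);
	ok = ok && slot_update(&slot, atomic_load(&slot), 7);

	/* The value matches the snapshot again, but the counter is now 2 */
	assert(slot_val(atomic_load(&slot)) == 7);
	assert(slot_cnt(atomic_load(&slot)) == 2);

	/* An update based on the stale snapshot must fail */
	return ok && !slot_update(&slot, stale, 1);
}
```

With full-width pointers the same idea needs a compare-and-swap twice the
pointer width, which is where the 128-bit requirement on 64-bit architectures
comes from.
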
The ring uses more compare-and-swap atomic operations than the regular rte ring:
with no contention, an enqueue of n pointers uses (1 + n) CAS operations and a
dequeue of n pointers uses 1. This algorithm has worse average-case performance
than the regular rte ring (particularly for a highly-contended ring with large
bulk accesses), however:
- For applications with preemptible pthreads, the regular rte ring's worst-case
  performance (i.e. one thread being preempted in the update_tail() critical
  section) is much worse than the lock-free ring's.
- Software caching can mitigate the average-case performance penalty for
  ring-based algorithms. For example, a lock-free ring based mempool (a likely
  use case for this ring) with per-thread caching.

The lock-free ring is enabled via a new flag, RING_F_LF. For ease-of-use,
existing ring enqueue/dequeue functions work with both standard and lock-free
rings. This is also an experimental API, so RING_F_LF users must build with the
ALLOW_EXPERIMENTAL_API flag.

This patchset also adds lock-free versions of ring_autotest and
ring_perf_autotest, and a lock-free ring based mempool.

This patchset makes one API change; a deprecation notice was posted in a
separate commit[1].

This patchset depends on the 128-bit compare-and-set patch[2].

[1] http://mails.dpdk.org/archives/dev/2019-February/124321.html
[2] http://mails.dpdk.org/archives/dev/2019-March/125751.html

v8:
 - Fixed two bugs in the generic implementation written as though the
   compare-and-swap would update the expected value on failure.

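For readers unfamiliar with the class of bug named in the v8 note, the sketch
below contrasts the two compare-and-swap conventions. It is an illustration
under stated assumptions, not the patchset's code: cmpset_u64() is a
hypothetical stand-in for a cmpset-style primitive that takes the expected
value by value, and the add_with_*() helpers and cas_failure_demo() are
likewise made up for this example. The C11 compare-exchange overwrites the
caller's expected value on failure, while a cmpset-style call does not, so a
retry loop built on the latter has to reload the current value itself.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical cmpset-style primitive: 'expected' is passed by value, so
 * on failure the caller learns nothing about the current contents. */
static bool cmpset_u64(_Atomic uint64_t *dst, uint64_t expected,
		       uint64_t desired)
{
	return atomic_compare_exchange_strong(dst, &expected, desired);
}

/* C11 style: a failed compare-exchange refreshes 'old' for us, so the
 * loop body can stay empty. Returns the value that was replaced. */
static uint64_t add_with_cmpxchg(_Atomic uint64_t *ctr, uint64_t n)
{
	uint64_t old = atomic_load(ctr);

	while (!atomic_compare_exchange_weak(ctr, &old, old + n))
		; /* 'old' now holds the value seen by the failed CAS */
	return old;
}

/* cmpset style: 'old' must be reloaded explicitly on every iteration;
 * omitting the reload would retry forever with a stale value. */
static uint64_t add_with_cmpset(_Atomic uint64_t *ctr, uint64_t n)
{
	uint64_t old;

	do {
		old = atomic_load(ctr);
	} while (!cmpset_u64(ctr, old, old + n));
	return old;
}

/* Shows what each convention leaves in 'expected' after a failure. */
static void cas_failure_demo(void)
{
	_Atomic uint64_t v = 10;
	uint64_t expected = 0;
	bool ok;

	ok = atomic_compare_exchange_strong(&v, &expected, 1);
	assert(!ok && expected == 10);	/* C11: expected was overwritten */

	expected = 0;
	ok = cmpset_u64(&v, expected, 1);
	assert(!ok && expected == 0);	/* cmpset: caller's copy untouched */
}
```

Both helpers return the value they replaced, so single-threaded they behave
identically; the difference only matters on the failure path.
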
v7:
 - Added ARM copyright to rte_ring_generic.h and rte_ring_c11_mem.h, since the
   lock-free algorithm is based on ARM's lfring (see v5 notes)
 - Rename __rte_ring_reload_tail() -> __rte_ring_lf_load_tail()
 - Remove the unused return value from __rte_ring_lf_load_tail()
 - Rename 'prev_tail' to 'next_tail' in the multi-producer lock-free enqueue

v6:
 - Rebase patchset onto master (test/test/ -> app/test/)

v5:
 - Incorporated lfring's enqueue and dequeue logic from
   http://mails.dpdk.org/archives/dev/2019-January/124242.html
 - Renamed non-blocking -> lock-free and NB -> LF to align with a similar
   change in the lock-free stack patchset:
   http://mails.dpdk.org/archives/dev/2019-March/125797.html
 - Added support for 32-bit architectures by using the full 32b of the
   modification counter and requiring LF rings on these architectures to be at
   least 1024 entries.
 - Updated to the latest rte_atomic128_cmp_exchange() interface.
 - Added ring start marker to struct rte_ring

v4:
 - Split out nb_enqueue and nb_dequeue functions in generic and C11 versions,
   with the necessary memory ordering behavior for weakly consistent machines.
 - Convert size_t variables (from v2) to uint64_t and remove the
   no-longer-applicable comment about variably-sized ring indexes.
 - Fix bug in nb_enqueue_mp that breaks the non-blocking guarantee.
 - Split the ring_ptr cast into two lines.
 - Change the dependent patchset from the non-blocking stack patch series to
   one only containing the 128b CAS commit

v3:
 - Avoid the ABI break by putting 64-bit head and tail values in the same
   cacheline as struct rte_ring's prod and cons members.
 - Don't attempt to compile rte_atomic128_cmpset without
   ALLOW_EXPERIMENTAL_API, as this would break a large number of libraries.
 - Add a helpful warning to __rte_ring_do_nb_enqueue_mp() in case someone tries
   to use RING_F_NB without the ALLOW_EXPERIMENTAL_API flag.
 - Update the ring mempool to use experimental APIs
 - Clarify that RING_F_NB is currently limited to x86_64 only; e.g.
   ARMv8 has the ISA support for 128-bit CAS to eventually support it.

v2:
 - Merge separate docs commit into patch #5
 - Convert uintptr_t to size_t
 - Add a compile-time check for the size of size_t
 - Fix a space-after-typecast issue
 - Fix an unnecessary-parentheses checkpatch warning
 - Bump librte_ring's library version

Gage Eads (6):
  ring: add a pointer-width headtail structure
  ring: add a ring start marker
  ring: add a lock-free implementation
  test_ring: add lock-free ring autotest
  test_ring_perf: add lock-free ring perf test
  mempool/ring: add lock-free ring handlers

 app/test/test_ring.c                            |  61 +--
 app/test/test_ring_perf.c                       |  19 +-
 doc/guides/prog_guide/env_abstraction_layer.rst |  10 +
 drivers/mempool/ring/Makefile                   |   1 +
 drivers/mempool/ring/meson.build                |   2 +
 drivers/mempool/ring/rte_mempool_ring.c         |  58 ++-
 lib/librte_ring/rte_ring.c                      |  92 ++++-
 lib/librte_ring/rte_ring.h                      | 334 ++++++++++++++--
 lib/librte_ring/rte_ring_c11_mem.h              | 501 ++++++++++++++++++++++++
 lib/librte_ring/rte_ring_generic.h              | 487 ++++++++++++++++++++++-
 lib/librte_ring/rte_ring_version.map            |   7 +
 11 files changed, 1494 insertions(+), 78 deletions(-)

-- 
2.13.6