From: Gage Eads <gage.eads@intel.com>
To: dev@dpdk.org
Cc: olivier.matz@6wind.com, arybchenko@solarflare.com,
 bruce.richardson@intel.com, konstantin.ananyev@intel.com, gavin.hu@arm.com
Date: Thu, 17 Jan 2019 09:36:57 -0600
Message-Id: <20190117153659.28477-1-gage.eads@intel.com>
In-Reply-To: <20190116151835.22424-1-gage.eads@intel.com>
References: <20190116151835.22424-1-gage.eads@intel.com>
Subject: [dpdk-dev] [PATCH v4 0/2] Add non-blocking stack mempool handler

For some users, the rte_ring's "non-preemptive" constraint is not
acceptable; for example, if the application uses a mixture of pinned
high-priority threads and multiplexed low-priority threads that share a
mempool.

This patchset introduces a non-blocking stack mempool handler. Note that
the non-blocking algorithm relies on a 128-bit compare-and-swap, so it is
limited to x86_64 machines.

In mempool_perf_autotest the lock-based stack outperforms the
non-blocking handler*, however:
- For applications with preemptible pthreads, a lock-based stack's
  worst-case performance (i.e. one thread being preempted while holding
  the spinlock) is much worse than the non-blocking stack's.
- Using per-thread mempool caches will largely mitigate the performance
  difference.

*Test setup: x86_64 build with default config, dual-socket Xeon E5-2699
v4, running on isolcpus cores with a tickless scheduler. The lock-based
stack's rate_persec was 1x-3.5x the non-blocking stack's.
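As a rough illustration of the approach (this sketch is not the patch #2
code: it uses GCC's __int128 builtins rather than the rte_atomic128_cmpset()
helper added in patch #1, and the struct and function names are made up;
build with gcc -mcx16, and -latomic if your toolchain requires it):

/* Hypothetical sketch of a non-blocking LIFO push: a 16-byte CAS updates
 * the head pointer and an ABA-protection counter in one atomic step.
 */
#include <stdint.h>
#include <string.h>

struct nb_elem {
	struct nb_elem *next;
};

struct nb_head {
	struct nb_elem *top; /* first element in the stack */
	uint64_t cnt;        /* bumped on every update to defeat ABA */
} __attribute__((aligned(16)));

static void
nb_push(struct nb_head *head, struct nb_elem *elem)
{
	struct nb_head old, new;
	__int128 old_raw, new_raw;

	do {
		old = *head;          /* racy snapshot; the CAS validates it */
		elem->next = old.top;
		new.top = elem;
		new.cnt = old.cnt + 1;
		memcpy(&old_raw, &old, sizeof(old_raw));
		memcpy(&new_raw, &new, sizeof(new_raw));
		/* retry if another thread changed head since the snapshot */
	} while (!__atomic_compare_exchange_n((__int128 *)head, &old_raw,
					      new_raw, 0, __ATOMIC_RELEASE,
					      __ATOMIC_RELAXED));
}

Pop works the same way, and is where the counter earns its keep: without
it, a head element that was popped and re-pushed between the snapshot and
the CAS would be indistinguishable from an unchanged head (the classic
ABA problem).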
v4:
- Simplified the meson.build x86_64 check

v3:
- Fix two more space-after-typecast issues
- Rework nb_stack's meson.build x86_64 check, borrowing from net/sfc/

v2:
- Merge separate docs commit into patch #2
- Fix two space-after-typecast issues
- Fix alphabetical sorting for build files
- Remove unnecessary include path from nb_stack/Makefile
- Add a comment to nb_lifo_len() justifying its approximate behavior
- Fix comparison with NULL
- Remove unnecessary void * cast
- Fix meson builds and limit them to x86_64
- Fix missing library error for non-x86_64 builds

Gage Eads (2):
  eal: add 128-bit cmpset (x86-64 only)
  mempool/nb_stack: add non-blocking stack mempool

 MAINTAINERS                                        |   4 +
 config/common_base                                 |   1 +
 doc/guides/prog_guide/env_abstraction_layer.rst    |   5 +
 drivers/mempool/Makefile                           |   3 +
 drivers/mempool/meson.build                        |   3 +-
 drivers/mempool/nb_stack/Makefile                  |  23 ++++
 drivers/mempool/nb_stack/meson.build               |   6 +
 drivers/mempool/nb_stack/nb_lifo.h                 | 147 +++++++++++++++++++++
 drivers/mempool/nb_stack/rte_mempool_nb_stack.c    | 125 ++++++++++++++++++
 .../nb_stack/rte_mempool_nb_stack_version.map      |   4 +
 .../common/include/arch/x86/rte_atomic_64.h        |  22 +++
 mk/rte.app.mk                                      |   7 +-
 12 files changed, 347 insertions(+), 3 deletions(-)
 create mode 100644 drivers/mempool/nb_stack/Makefile
 create mode 100644 drivers/mempool/nb_stack/meson.build
 create mode 100644 drivers/mempool/nb_stack/nb_lifo.h
 create mode 100644 drivers/mempool/nb_stack/rte_mempool_nb_stack.c
 create mode 100644 drivers/mempool/nb_stack/rte_mempool_nb_stack_version.map

-- 
2.13.6