From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gage.eads@intel.com>
Received: from mga02.intel.com (mga02.intel.com [134.134.136.20])
 by dpdk.org (Postfix) with ESMTP id 9DD831B90D
 for <dev@dpdk.org>; Thu, 10 Jan 2019 22:02:33 +0100 (CET)
X-Amp-Result: SKIPPED(no attachment in message)
X-Amp-File-Uploaded: False
Received: from fmsmga003.fm.intel.com ([10.253.24.29])
 by orsmga101.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384;
 10 Jan 2019 13:02:32 -0800
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.56,462,1539673200"; d="scan'208";a="124971945"
Received: from txasoft-yocto.an.intel.com (HELO txasoft-yocto.an.intel.com.)
 ([10.123.72.192])
 by FMSMGA003.fm.intel.com with ESMTP; 10 Jan 2019 13:02:31 -0800
From: Gage Eads <gage.eads@intel.com>
To: dev@dpdk.org
Cc: olivier.matz@6wind.com, arybchenko@solarflare.com,
 bruce.richardson@intel.com, konstantin.ananyev@intel.com
Date: Thu, 10 Jan 2019 15:01:16 -0600
Message-Id: <20190110210122.24889-1-gage.eads@intel.com>
X-Mailer: git-send-email 2.13.6
Subject: [dpdk-dev] [PATCH 0/6] Add non-blocking ring
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions <dev.dpdk.org>
List-Unsubscribe: <https://mails.dpdk.org/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://mails.dpdk.org/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <https://mails.dpdk.org/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
X-List-Received-Date: Thu, 10 Jan 2019 21:02:34 -0000

For some users, the rte ring's "non-preemptive" constraint is not acceptable,
for example when an application mixes pinned high-priority threads with
multiplexed low-priority threads that share a mempool.

This patchset introduces a non-blocking ring, on top of which a mempool can run.
Crucially, the non-blocking algorithm relies on a 128-bit compare-and-swap, so
it is limited to x86_64 machines.
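
The cover letter does not spell out what the 128 bits hold. A common reason a
lock-free structure needs a double-width compare-and-swap is to update a pointer
together with a modification counter, so that a stale snapshot is rejected even
if the pointer value happens to match again. Assuming that is the motivation
here, the idea can be sketched as follows. The sketch is scaled down to a
32-bit value plus a 32-bit counter packed into one 64-bit word so that it builds
with plain C11 atomics; struct slot and the slot_*() helpers are hypothetical
names for illustration, not part of this patchset.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical slot layout (illustration only): a 32-bit value and a
 * 32-bit modification counter share one atomically updated word. With a
 * full 64-bit pointer in place of the value, the pair would be 128 bits.
 */
struct slot {
	_Atomic uint64_t word;	/* high 32 bits: counter, low 32 bits: value */
};

static inline uint64_t slot_pack(uint32_t cnt, uint32_t val)
{
	return ((uint64_t)cnt << 32) | val;
}

static inline uint32_t slot_value(uint64_t w)
{
	return (uint32_t)w;
}

static inline uint32_t slot_count(uint64_t w)
{
	return (uint32_t)(w >> 32);
}

/*
 * Store new_val only if the slot still holds exactly the snapshot the
 * caller read earlier. Every successful update bumps the counter, so a
 * slot whose value was changed and later changed back no longer matches
 * an old snapshot.
 */
static inline bool slot_update(struct slot *s, uint64_t snapshot,
			       uint32_t new_val)
{
	uint64_t desired = slot_pack(slot_count(snapshot) + 1, new_val);

	return atomic_compare_exchange_strong(&s->word, &snapshot, desired);
}
```

A thread that reads a snapshot, is preempted, and resumes after other threads
have modified the slot will see slot_update() fail and can retry, rather than
silently overwriting newer state.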

The ring uses more compare-and-swap (CAS) atomic operations than the regular rte
ring. With no contention, an enqueue of n pointers uses (1 + 2n) CAS operations
and a dequeue of n pointers uses 2. This algorithm has worse average-case
performance than the regular rte ring (particularly for a highly contended ring
with large bulk accesses). However:
- For applications with preemptible pthreads, the regular rte ring's worst-case
  performance (i.e. one thread being preempted in the update_tail() critical
  section) is much worse than the non-blocking ring's.
- Software caching can mitigate the average-case performance penalty of
  ring-based algorithms; for example, a non-blocking ring based mempool (a
  likely use case for this ring) with per-thread caching.
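
To make the uncontended CAS counts above concrete, here is a small worked
calculation. It only encodes the two figures quoted in this letter, (1 + 2n)
per enqueue and 2 per dequeue of n pointers; the function names are
hypothetical and the burst size in the note below is an arbitrary example, not
a measured or recommended value.

```c
#include <assert.h>

/* Uncontended CAS count for an enqueue of n pointers, per the cover letter. */
static inline unsigned int nb_enqueue_cas(unsigned int n)
{
	return 1 + 2 * n;
}

/* Uncontended CAS count for a dequeue of n pointers: constant, per the letter. */
static inline unsigned int nb_dequeue_cas(unsigned int n)
{
	(void)n;
	return 2;
}

/* CAS operations per object when one bulk call moves n objects. */
static inline double cas_per_object(unsigned int cas_ops, unsigned int n)
{
	assert(n > 0);
	return (double)cas_ops / n;
}
```

For a burst of 32 objects this gives 65 CAS operations per enqueue (about 2.03
per object) and 2 per dequeue (0.0625 per object). A per-thread cache helps in
two ways under this model: it moves objects to and from the ring in bursts, and
any object that is freed and reallocated within the cache does not touch the
ring at all.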

The non-blocking ring is enabled via a new flag, RING_F_NB. For ease of use,
existing ring enqueue/dequeue functions work with both "regular" and
non-blocking rings.

This patchset also adds non-blocking versions of ring_autotest and
ring_perf_autotest, and a non-blocking ring based mempool.

This patchset makes ABI changes, and thus an ABI update announcement and
deprecation cycle are required.

This patchset depends on the non-blocking stack patchset[1].

[1] http://mails.dpdk.org/archives/dev/2019-January/122923.html

Gage Eads (6):
  ring: change head and tail to pointer-width size
  ring: add a non-blocking implementation
  test_ring: add non-blocking ring autotest
  test_ring_perf: add non-blocking ring perf test
  mempool/ring: add non-blocking ring handlers
  doc: add NB ring comment to EAL "known issues"

 doc/guides/prog_guide/env_abstraction_layer.rst |   2 +-
 drivers/mempool/ring/rte_mempool_ring.c         |  58 ++-
 lib/librte_eventdev/rte_event_ring.h            |   6 +-
 lib/librte_ring/rte_ring.c                      |  53 ++-
 lib/librte_ring/rte_ring.h                      | 555 ++++++++++++++++++++++--
 lib/librte_ring/rte_ring_generic.h              |  16 +-
 lib/librte_ring/rte_ring_version.map            |   7 +
 test/test/test_ring.c                           |  57 ++-
 test/test/test_ring_perf.c                      |  19 +-
 9 files changed, 689 insertions(+), 84 deletions(-)

-- 
2.13.6