From mboxrd@z Thu Jan 1 00:00:00 1970
From: Konstantin Ananyev
To: dev@dpdk.org
Cc: honnappa.nagarahalli@arm.com, david.marchand@redhat.com,
 jielong.zjl@antfin.com, Konstantin Ananyev
Date: Mon, 20 Apr 2020 13:11:13 +0100
Message-Id: <20200420121113.9327-11-konstantin.ananyev@intel.com>
X-Mailer: git-send-email 2.18.0
In-Reply-To: <20200420121113.9327-1-konstantin.ananyev@intel.com>
References: <20200418163225.17635-1-konstantin.ananyev@intel.com>
 <20200420121113.9327-1-konstantin.ananyev@intel.com>
Subject: [dpdk-dev] [PATCH v6 10/10] doc: update ring guide

Changed the rte_ring chapter in the programmer's guide to reflect the
addition of the new sync modes and peek style API.

Signed-off-by: Konstantin Ananyev
---
 doc/guides/prog_guide/ring_lib.rst | 95 ++++++++++++++++++++++++++++++
 1 file changed, 95 insertions(+)

diff --git a/doc/guides/prog_guide/ring_lib.rst b/doc/guides/prog_guide/ring_lib.rst
index 8cb2b2dd4..668e67ecb 100644
--- a/doc/guides/prog_guide/ring_lib.rst
+++ b/doc/guides/prog_guide/ring_lib.rst
@@ -349,6 +349,101 @@ even if only the first term of subtraction has overflowed:
     uint32_t entries = (prod_tail - cons_head);
     uint32_t free_entries = (mask + cons_tail -prod_head);
 
+Producer/consumer synchronization modes
+---------------------------------------
+
+rte_ring supports different synchronization modes for producers and consumers.
+These modes can be specified at ring creation/init time via the ``flags``
+parameter (see the creation example below).
+That should help users to configure the ring in the way most suitable for
+their specific usage scenarios.
+Currently supported modes:
+
+MP/MC (default)
+~~~~~~~~~~~~~~~
+
+Multi-producer (/multi-consumer) mode. This is the default enqueue (/dequeue)
+mode for the ring. In this mode multiple threads can enqueue (/dequeue)
+objects to (/from) the ring. For 'classic' DPDK deployments (with one thread
+per core) this is usually the most suitable and fastest synchronization mode.
+As a well-known limitation, it can perform quite poorly in overcommitted
+scenarios.
+
+SP/SC
+~~~~~
+Single-producer (/single-consumer) mode. In this mode only one thread at a
+time is allowed to enqueue (/dequeue) objects to (/from) the ring.
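+
+For illustration, a minimal sketch of selecting a sync mode at creation time
+(it assumes the existing ``rte_ring_create()`` API and the ``RING_F_SP_ENQ``/
+``RING_F_SC_DEQ`` creation flags; error handling is omitted):
+
+.. code-block:: c
+
+    #include <rte_lcore.h>
+    #include <rte_ring.h>
+
+    /* default ring: multi-producer/multi-consumer (MP/MC). */
+    struct rte_ring *mpmc = rte_ring_create("mpmc_ring", 1024,
+        rte_socket_id(), 0);
+
+    /* single-producer/single-consumer (SP/SC) ring. */
+    struct rte_ring *spsc = rte_ring_create("spsc_ring", 1024,
+        rte_socket_id(), RING_F_SP_ENQ | RING_F_SC_DEQ);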
+
+MP_RTS/MC_RTS
+~~~~~~~~~~~~~
+
+Multi-producer (/multi-consumer) with Relaxed Tail Sync (RTS) mode.
+The main difference from the original MP/MC algorithm is that the tail value
+is increased not by every thread that finished enqueue/dequeue,
+but only by the last one.
+That allows threads to avoid spinning on the ring tail value,
+leaving the actual tail value change to the last thread at a given instant.
+That technique helps to avoid the Lock-Waiter-Preemption (LWP) problem on
+tail update and improves average enqueue/dequeue times on overcommitted
+systems.
+To achieve that, RTS requires two 64-bit CAS operations for each
+enqueue(/dequeue): one for the head update, the second for the tail update.
+In comparison, the original MP/MC algorithm requires one 32-bit CAS
+for the head update and waiting/spinning on the tail value.
+
+MP_HTS/MC_HTS
+~~~~~~~~~~~~~
+
+Multi-producer (/multi-consumer) with Head/Tail Sync (HTS) mode.
+In this mode the enqueue/dequeue operation is fully serialized:
+at any given moment only one enqueue/dequeue operation can proceed.
+This is achieved by allowing a thread to proceed with changing ``head.value``
+only when ``head.value == tail.value``.
+Both head and tail values are updated atomically (as one 64-bit value).
+To achieve that, a 64-bit CAS is used by the head update routine.
+That technique also avoids the Lock-Waiter-Preemption (LWP) problem on tail
+update and helps to improve ring enqueue/dequeue behavior in overcommitted
+scenarios. Another advantage of the fully serialized producer/consumer is
+that it provides the ability to implement an MT-safe peek API for rte_ring.
+
+
+Ring Peek API
+-------------
+
+For rings with a serialized producer/consumer (HTS sync mode) it is possible
+to split the public enqueue/dequeue API into two phases:
+
+* enqueue/dequeue start
+
+* enqueue/dequeue finish
+
+That allows the user to inspect objects in the ring without removing them
+from it (aka MT-safe peek) and to reserve space for the objects in the ring
+before the actual enqueue.
+Note that this API is available only for two sync modes:
+
+* Single Producer/Single Consumer (SP/SC)
+
+* Multi-producer/Multi-consumer with Head/Tail Sync (HTS)
+
+It is the user's responsibility to create/init the ring with the appropriate
+sync mode selected. As an example of usage:
+
+.. code-block:: c
+
+    /* read 1 elem from the ring: */
+    uint32_t n = rte_ring_dequeue_bulk_start(ring, &obj, 1, NULL);
+    if (n != 0) {
+        /* examine object */
+        if (object_examine(obj) == KEEP)
+            /* decided to keep it in the ring. */
+            rte_ring_dequeue_finish(ring, 0);
+        else
+            /* decided to remove it from the ring. */
+            rte_ring_dequeue_finish(ring, n);
+    }
+
+Note that between ``_start_`` and ``_finish_`` no other thread can proceed
+with an enqueue(/dequeue) operation until ``_finish_`` completes.
+
 References
 ----------
-- 
2.17.1
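
A minimal sketch of the enqueue-side counterpart of the peek example above,
assuming the ``rte_ring_enqueue_bulk_start()``/``rte_ring_enqueue_finish()``
pair of the peek API and a hypothetical object_create() helper:

    /* reserve room for 1 elem in the ring: */
    uint32_t n = rte_ring_enqueue_bulk_start(ring, 1, NULL);
    if (n != 0) {
        /* space is reserved; prepare the object to store. */
        void *obj = object_create();
        /* copy the object into the ring and release the reserved slot. */
        rte_ring_enqueue_finish(ring, &obj, n);
    }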