Date: Thu, 7 Nov 2024 08:16:49 -0800
From: Stephen Hemminger
To: Konstantin Ananyev
Subject: Re: [PATCH v7 0/7] Stage-Ordered API and other extensions for ring library
Message-ID: <20241107081649.2c383cd0@hermes.local>
In-Reply-To: <20241030212304.104180-1-konstantin.ananyev@huawei.com>
References: <20241021174745.1843-1-konstantin.ananyev@huawei.com>
 <20241030212304.104180-1-konstantin.ananyev@huawei.com>

On Wed, 30 Oct 2024 17:22:57 -0400
Konstantin Ananyev wrote:

> Testing coverage (passed):
> x86_64, i686, PPC, ARM
>
> Would like to express my gratitude to all community members who helped me
> with testing it on different platforms, in particular:
> David Christensen
> Cody Cheng
> Patrick Robb
> Phanendra Vukkisala
> Chengwen Feng
>
> v6 -> v7
> - updated Programmer Guide (Jerin, Morten, Stephen)
> - fix some functions in public headers without comments (Morten)
> - update debug checks,
>   added new macro for that: RTE_SORING_DEBUG
>   (disabled by default).
>
> v5 -> v6
> - fix problem with ring_stress_autotest (Phanendra)
> - added more checks and debug output
>
> v4 -> v5
> - fix public API/doc comments from Jerin
> - update devtools/build-dict.sh (Stephen)
> - fix MSVC warnings
> - introduce new test-suite for meson (stress) with
>   ring_stress_autotest and soring_stress_autotest in it
> - enhance error report in tests
> - reorder some sync code in soring and add extra checks
>   (for better debuggability)
>
> v3 -> v4:
> - fix compilation/doxygen complaints (attempt #2)
> - updated release notes
>
> v2 -> v3:
> - fix compilation/doxygen complaints
> - dropped patch:
>   "examples/l3fwd: make ACL work in pipeline and eventdev modes": [2]
>   As was mentioned in the patch description, it was way too big,
>   controversial and incomplete. If the community is OK with introducing
>   a pipeline model into l3fwd, then it is probably worth
>   a separate patch series.
>
> v1 -> v2:
> - rename 'elmst/objst' to 'meta' (Morten)
> - introduce new data-path API sets: one with both meta{} and objs[],
>   second with just objs[] (Morten)
> - split data-path APIs into burst/bulk flavours (same as rte_ring)
> - added dump function for rte_soring and improved dump() for rte_ring.
> - dropped patch:
>   "ring: minimize reads of the counterpart cache-line"
>   - no performance gain observed
>   - actually it does change behavior of conventional rte_ring
>     enqueue/dequeue APIs:
>     it could return fewer available/free entries than actually exist
>     in the ring. As in some other libs we rely on that information,
>     it will introduce problems.
>
> The main aim of this series is to extend the ring library with
> a new API that allows the user to create/use a Staged-Ordered-Ring (SORING)
> abstraction. In addition to that, there are a few other patches that serve
> different purposes:
> - first two patches are just code reordering to de-duplicate
>   and generalize existing rte_ring code.
> - patch #3 extends rte_ring_dump() to correctly print head/tail metadata
>   for different sync modes.
> - next two patches introduce SORING API into the ring library and
>   provide UT for it.
>
> SORING overview
> ===============
> Staged-Ordered-Ring (SORING) provides a SW abstraction for 'ordered' queues
> with multiple processing 'stages'. It is based on conventional DPDK
> rte_ring, re-uses many of its concepts, and even a substantial part of
> its code.
> It can be viewed as an 'extension' of rte_ring functionality.
> In particular, main SORING properties:
> - circular ring buffer with fixed size objects
> - producer, consumer plus multiple processing stages in between.
> - allows splitting object processing into multiple stages.
> - objects remain in the same ring while moving from one stage to the other,
>   initial order is preserved, no extra copying needed.
> - preserves the ingress order of objects within the queue across multiple
>   stages
> - each stage (and producer/consumer) can be served by single and/or
>   multiple threads.
>
> - number of stages, size and number of objects in the ring are
>   configurable at ring initialization time.
>
> Data-path API provides four main operations:
> - enqueue/dequeue works in the same manner as for conventional rte_ring,
>   all rte_ring synchronization types are supported.
> - acquire/release - for each stage there is an acquire (start) and
>   release (finish) operation. After some objects are 'acquired',
>   a given thread can safely assume that it has exclusive ownership of
>   these objects until it invokes 'release' for them.
>   After 'release', objects can be 'acquired' by the next stage and/or
>   dequeued by the consumer (in case of the last stage).
>
> Expected use-case: applications that use a pipeline model
> (probably with multiple stages) for packet processing, when preserving
> incoming packet order is important.
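
For readers following the thread, the ordering rules of the four
data-path operations described above can be illustrated with a small
single-threaded toy model. This is only a sketch under stated
assumptions: the names soring_model, model_enqueue, model_acquire,
model_release and model_dequeue are hypothetical and are not the
rte_soring API, and the model uses plain counters with no atomics or
synchronization modes, so it says nothing about how the patches
implement multi-threaded access.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_SIZE   8u  /* ring capacity; power of two so masking works */
#define MODEL_STAGES 2u  /* processing stages between producer and consumer */

/* Hypothetical toy type for illustration, NOT struct rte_soring. */
struct soring_model {
	uintptr_t objs[MODEL_SIZE];
	uint32_t enq;               /* objects enqueued so far */
	uint32_t acq[MODEL_STAGES]; /* objects acquired by each stage so far */
	uint32_t rel[MODEL_STAGES]; /* objects released by each stage so far */
	uint32_t deq;               /* objects dequeued so far */
};

static uint32_t min_u32(uint32_t a, uint32_t b)
{
	return a < b ? a : b;
}

/* Producer: copy up to n objects into free slots, return count stored. */
static uint32_t model_enqueue(struct soring_model *m, const uintptr_t *objs,
			      uint32_t n)
{
	uint32_t free_slots = MODEL_SIZE - (m->enq - m->deq);

	n = min_u32(n, free_slots);
	for (uint32_t i = 0; i < n; i++)
		m->objs[(m->enq + i) & (MODEL_SIZE - 1)] = objs[i];
	m->enq += n;
	return n;
}

/*
 * Stage 'stage' may only acquire objects that the previous stage has
 * released (or, for stage 0, that the producer has enqueued).
 */
static uint32_t model_acquire(struct soring_model *m, uint32_t stage,
			      uintptr_t *objs, uint32_t n)
{
	assert(stage < MODEL_STAGES);
	uint32_t limit = (stage == 0) ? m->enq : m->rel[stage - 1];

	n = min_u32(n, limit - m->acq[stage]);
	for (uint32_t i = 0; i < n; i++)
		objs[i] = m->objs[(m->acq[stage] + i) & (MODEL_SIZE - 1)];
	m->acq[stage] += n;
	return n;
}

/* Hand n previously acquired objects over to the next stage/consumer. */
static void model_release(struct soring_model *m, uint32_t stage, uint32_t n)
{
	assert(stage < MODEL_STAGES);
	assert(n <= m->acq[stage] - m->rel[stage]);
	m->rel[stage] += n;
}

/* Consumer: only objects released by the last stage can be dequeued. */
static uint32_t model_dequeue(struct soring_model *m, uintptr_t *objs,
			      uint32_t n)
{
	n = min_u32(n, m->rel[MODEL_STAGES - 1] - m->deq);
	for (uint32_t i = 0; i < n; i++)
		objs[i] = m->objs[(m->deq + i) & (MODEL_SIZE - 1)];
	m->deq += n;
	return n;
}
```

In this model, a zero-initialized struct soring_model is an empty ring.
After enqueueing three objects, a dequeue or a stage-1 acquire yields
nothing until stage 0 has acquired and released some of them; objects
then leave the ring in the order they were enqueued.
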
>
> The concept of ‘ring with stages’ is similar to DPDK OPDL eventdev PMD [1],
> but the internals are different.
> In particular, SORING maintains an internal array of 'states' for each
> element in the ring that is shared by all threads/processes that access
> the ring.
> That allows 'release' to avoid excessive waits on the tail value and helps
> to improve performance and scalability.
> In terms of performance, with our measurements rte_soring and
> conventional rte_ring provide nearly identical numbers.
> As an example, on our SUT: Intel ICX CPU @ 2.00GHz,
> l3fwd (--lookup=acl) in pipeline mode [2] both
> rte_ring and rte_soring reach ~20Mpps for single I/O lcore and same
> number of worker lcores.
>
> [1] https://www.dpdk.org/wp-content/uploads/sites/35/2018/06/DPDK-China2017-Ma-OPDL.pdf
> [2] https://patchwork.dpdk.org/project/dpdk/patch/20240906131348.804-7-konstantin.v.ananyev@yandex.ru/

One future suggestion. What about having an example (l3fwd-soring?)
so that performance can be compared?

Assuming you get the other minor comments from Morten fixed.

Series-Acked-by: Stephen Hemminger