From: Honnappa Nagarahalli
To: Konstantin Ananyev, "dev@dpdk.org"
CC: "olivier.matz@6wind.com", nd, Honnappa Nagarahalli, nd
Date: Wed, 25 Mar 2020 20:43:33 +0000
Subject: Re: [dpdk-dev] [RFC 0/6] New sync modes for ring
References: <20200224113515.1744-1-konstantin.ananyev@intel.com>
In-Reply-To: <20200224113515.1744-1-konstantin.ananyev@intel.com>
List-Id: DPDK patches and discussions

> Subject: [dpdk-dev] [RFC 0/6] New sync modes for ring
>
> Upfront note - that RFC is not a complete patch.
> It introduces an ABI breakage, plus it doesn't update ring_elem code properly,
As per the current rules, these changes (in their current form) will be accepted only for the 20.11 release. How do we address this for immediate requirements like the RCU defer APIs?
I suggest that we move forward with my RFC (taking your feedback into consideration) to make progress on the RCU APIs.

> etc.
> I plan to deal with all these things in later versions.
> Right now I seek an initial feedback about proposed ideas.
> Would also ask people to repeat performance tests (see below) on their
> platforms to confirm the impact.
>
> More and more customers use(/try to use) DPDK based apps within
> overcommitted systems (multiple active threads over same physical cores):
> VM, container deployments, etc.
> One quite common problem they hit: Lock-Holder-Preemption with rte_ring.
> LHP is quite a common problem for spin-based sync primitives (spin-locks, etc.)
> on overcommitted systems.
> The situation gets much worse when some sort of fair-locking technique is
> used (ticket-lock, etc.).
> As now not only lock-owner but also lock-waiters scheduling order matters a
> lot.
> This is a well-known problem for kernel within VMs:
> http://www-archive.xenproject.org/files/xensummitboston08/LHP.pdf
> https://www.cs.hs-rm.de/~kaiser/events/wamos2017/Slides/selcuk.pdf
> The problem with rte_ring is that while head acquisition is sort of un-fair locking,
> waiting on tail is very similar to ticket lock schema - tail has to be updated in
> particular order.
> That makes current rte_ring implementation perform really poorly on some
> overcommitted scenarios.
> While it is probably not possible to completely resolve this problem in
> userspace only (without some kernel communication/intervention), removing
> fairness in tail update can mitigate it significantly.
> So this RFC proposes two new optional ring synchronization modes:
> 1) Head/Tail Sync (HTS) mode
> In that mode enqueue/dequeue operation is fully serialized:
> only one thread at a time is allowed to perform given op.
> As another enhancement provide ability to split enqueue/dequeue
> operation into two phases:
> - enqueue/dequeue start
> - enqueue/dequeue finish
> That allows user to inspect objects in the ring without removing
> them from it (aka MT safe peek).
IMO, this will not address the problem described above. For example, when a producer updates the head and gets scheduled out, the other producers have to spin. The problem is probably worse with HTS, since in the non-HTS case moving the head and copying the ring elements can happen in parallel between the producers (similarly for consumers).
IMO, HTS should not be a configurable flag. For the RCU requirement, an MP enqueue and an HTS dequeue are required.

> 2) Relaxed Tail Sync (RTS)
> The main difference from original MP/MC algorithm is that tail value is
> increased not by every thread that finished enqueue/dequeue, but only by the
> last one.
> That allows threads to avoid spinning on ring tail value, leaving actual tail value
> change to the last thread in the update queue.
This can be a configurable flag on the ring.
I am not sure how this solves the problem you have stated above completely. Updating the count from all intermediate threads is still required to update the value of the head. But yes, it reduces the severity of the problem by not enforcing the order in which the tail is updated.
I also think it introduces the problem on the other side of the ring because the tail is not updated soon enough (the other side has to wait longer for the elements to become available).
It also introduces another configuration parameter (HTD_MAX_DEF) which users have to deal with.
Users still have to implement the current hypervisor related solutions.
IMO, we should run the benchmark for this on an overcommitted setup to understand the benefits.

>
> Test results on IA (see below) show significant improvements for average
> enqueue/dequeue op times on overcommitted systems.
> For 'classic' DPDK deployments (one thread per core) original MP/MC
> algorithm still shows best numbers, though for 64-bit target RTS numbers are
> not that far away.
> Numbers were produced by ring_stress_*autotest (first patch in this series).
>
> X86_64 @ Intel(R) Xeon(R) Platinum 8160 CPU @ 2.10GHz
> DEQ+ENQ average cycles/obj
>
>                                                       MP/MC       HTS       RTS
> 1thread@1core(--lcores=6-7)                            8.00      8.15      8.99
> 2thread@2core(--lcores=6-8)                           19.14     19.61     20.35
> 4thread@4core(--lcores=6-10)                          29.43     29.79     31.82
> 8thread@8core(--lcores=6-14)                         110.59    192.81    119.50
> 16thread@16core(--lcores=6-22)                       461.03    813.12    495.59
> 32thread/@32core(--lcores='6-22,55-70')              982.90   1972.38   1160.51
>
> 2thread@1core(--lcores='6,(10-11)@7'               20140.50     23.58     25.14
> 4thread@2core(--lcores='6,(10-11)@7,(20-21)@8'    153680.60     76.88     80.05
> 8thread@2core(--lcores='6,(10-13)@7,(20-23)@8'    280314.32    294.72    318.79
> 16thread@2core(--lcores='6,(10-17)@7,(20-27)@8'   643176.59   1144.02   1175.14
> 32thread@2core(--lcores='6,(10-25)@7,(30-45)@8'  4264238.80   4627.48   4892.68
>
> 8thread@2core(--lcores='6,(10-17)@(7,8))'         321085.98    298.59    307.47
> 16thread@4core(--lcores='6,(20-35)@(7-10))'      1900705.61    575.35    678.29
> 32thread@4core(--lcores='6,(20-51)@(7-10))'      5510445.85   2164.36   2714.12
>
> i686 @ Intel(R) Xeon(R) Platinum 8160 CPU @ 2.10GHz
> DEQ+ENQ average cycles/obj
>
>                                                       MP/MC       HTS       RTS
> 1thread@1core(--lcores=6-7)                            7.85     12.13     11.31
> 2thread@2core(--lcores=6-8)                           17.89     24.52     21.86
> 8thread@8core(--lcores=6-14)                          32.58    354.20     54.58
> 32thread/@32core(--lcores='6-22,55-70')              813.77   6072.41   2169.91
>
> 2thread@1core(--lcores='6,(10-11)@7'               16095.00     36.06     34.74
> 8thread@2core(--lcores='6,(10-13)@7,(20-23)@8'   1140354.54    346.61    361.57
> 16thread@2core(--lcores='6,(10-17)@7,(20-27)@8'  1920417.86   1314.90   1416.65
>
> 8thread@2core(--lcores='6,(10-17)@(7,8))'         594358.61    332.70    357.74
> 32thread@4core(--lcores='6,(20-51)@(7-10))'      5319896.86   2836.44   3028.87
>
> Konstantin Ananyev (6):
>   test/ring: add contention stress test
>   ring: rework ring layout to allow new sync schemes
>   ring: introduce RTS ring mode
>   test/ring: add contention stress test for RTS ring
>   ring: introduce HTS ring mode
>   test/ring: add contention stress test for HTS ring
>
>  app/test/Makefile                      |   3 +
>  app/test/meson.build                   |   3 +
>  app/test/test_pdump.c                  |   6 +-
>  app/test/test_ring_hts_stress.c        |  28 ++
>  app/test/test_ring_rts_stress.c        |  28 ++
>  app/test/test_ring_stress.c            |  27 ++
>  app/test/test_ring_stress.h            | 477 +++++++++++++++++++
>  lib/librte_pdump/rte_pdump.c           |   2 +-
>  lib/librte_port/rte_port_ring.c        |  12 +-
>  lib/librte_ring/Makefile               |   4 +-
>  lib/librte_ring/meson.build            |   4 +-
>  lib/librte_ring/rte_ring.c             |  84 +++-
>  lib/librte_ring/rte_ring.h             | 619 +++++++++++++++++++++++--
>  lib/librte_ring/rte_ring_elem.h        |   8 +-
>  lib/librte_ring/rte_ring_hts_generic.h | 228 +++++++++
>  lib/librte_ring/rte_ring_rts_generic.h | 240 ++++++++++
>  16 files changed, 1721 insertions(+), 52 deletions(-)
>  create mode 100644 app/test/test_ring_hts_stress.c
>  create mode 100644 app/test/test_ring_rts_stress.c
>  create mode 100644 app/test/test_ring_stress.c
>  create mode 100644 app/test/test_ring_stress.h
>  create mode 100644 lib/librte_ring/rte_ring_hts_generic.h
>  create mode 100644 lib/librte_ring/rte_ring_rts_generic.h
>
> --
> 2.17.1
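
To make the RTS tail rule from the quoted cover letter concrete ("tail value is increased not by every thread that finished enqueue/dequeue, but only by the last one"), here is a minimal C11 sketch. It is an illustration under assumptions, not code from the RFC: the names sketch_poscnt, sketch_rts, sketch_rts_start, sketch_rts_finish and sketch_rts_demo are hypothetical, the free-space check, the element copy and the HTD_MAX_DEF limit are left out, and default sequentially consistent atomics are used instead of tuned memory orders.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical layout: a position and a counter packed into one 64-bit
 * word so that both can be changed by a single compare-and-swap. */
typedef union {
    uint64_t raw;
    struct { uint32_t cnt; uint32_t pos; } val;
} sketch_poscnt;

typedef struct {
    _Atomic uint64_t head; /* pos: next free slot, cnt: operations started  */
    _Atomic uint64_t tail; /* pos: published slots, cnt: operations finished */
} sketch_rts;

/* Reserve n slots: advance head.pos by n and count one more started
 * operation. Returns the first reserved position. No free-space check. */
static uint32_t sketch_rts_start(sketch_rts *r, uint32_t n)
{
    sketch_poscnt oh, nh;

    oh.raw = atomic_load(&r->head);
    do {
        nh.raw = oh.raw;
        nh.val.pos += n;
        nh.val.cnt += 1;
        /* On failure the CAS reloads oh.raw, so the loop retries. */
    } while (!atomic_compare_exchange_weak(&r->head, &oh.raw, nh.raw));

    return oh.val.pos;
}

/* Finish one operation: always bump tail.cnt, but move tail.pos only when
 * the finished count catches up with the started count, i.e. when the
 * caller is the last operation in flight. Nobody waits for a predecessor. */
static void sketch_rts_finish(sketch_rts *r)
{
    sketch_poscnt h, ot, nt;

    ot.raw = atomic_load(&r->tail);
    do {
        h.raw = atomic_load(&r->head);
        nt.raw = ot.raw;
        if (++nt.val.cnt == h.val.cnt)
            nt.val.pos = h.val.pos;
    } while (!atomic_compare_exchange_weak(&r->tail, &ot.raw, nt.raw));
}

static uint32_t sketch_rts_tail_pos(sketch_rts *r)
{
    sketch_poscnt t;

    t.raw = atomic_load(&r->tail);
    return t.val.pos;
}

/* Single-threaded walk-through of two overlapping enqueues.
 * Returns 1 if the tail behaved as described above. */
static int sketch_rts_demo(void)
{
    sketch_rts r;
    uint32_t a, b, mid;

    atomic_init(&r.head, 0);
    atomic_init(&r.tail, 0);

    a = sketch_rts_start(&r, 4);   /* first enqueue reserves slots 0..3  */
    b = sketch_rts_start(&r, 2);   /* second enqueue reserves slots 4..5 */

    sketch_rts_finish(&r);         /* one of them finishes: tail.pos stays */
    mid = sketch_rts_tail_pos(&r);
    sketch_rts_finish(&r);         /* the last one publishes all 6 slots   */

    assert(mid == 0);
    return a == 0 && b == 4 && mid == 0 && sketch_rts_tail_pos(&r) == 6;
}
```

Calling sketch_rts_demo() from a test harness returns 1. After the first finish the tail position is still 0, and only the last finish moves it to 6. That matches both points made in the reply: no thread spins waiting for its predecessor, but the other side of the ring sees no new elements until the last in-flight operation completes.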