From: David Marchand
Date: Tue, 21 Apr 2020 13:31:39 +0200
To: Konstantin Ananyev, Honnappa Nagarahalli
Cc: dev, jielong.zjl@antfin.com, Pavan Nikhilesh, Jerin Jacob Kollanukkaran, Thomas Monjalon
Subject: Re: [dpdk-dev] [PATCH v7 00/10] New sync modes for ring
List-Id: DPDK patches and discussions

On Mon, Apr 20, 2020 at 2:28 PM Konstantin Ananyev wrote:
> These days more and more customers use (or try to use) DPDK-based apps within
> overcommitted systems (multiple active threads over the same physical cores):
> VM, container deployments, etc.
> One quite common problem they hit:
> Lock-Holder-Preemption/Lock-Waiter-Preemption with rte_ring.
> LHP is quite a common problem for spin-based sync primitives
> (spin-locks, etc.) on overcommitted systems.
> The situation gets much worse when some sort of
> fair-locking technique is used (ticket-lock, etc.),
> as then not only the lock-owner's but also the lock-waiters'
> scheduling order matters a lot (LWP).
> These two problems are well known for kernels running within VMs:
> http://www-archive.xenproject.org/files/xensummitboston08/LHP.pdf
> https://www.cs.hs-rm.de/~kaiser/events/wamos2017/Slides/selcuk.pdf
> The problem with rte_ring is that while head acquisition is a sort of
> unfair locking, waiting on the tail is very similar to a ticket-lock scheme:
> the tail has to be updated in a particular order.
> That makes the current rte_ring implementation perform
> really poorly in some overcommitted scenarios.
> It is probably not possible to completely resolve the LHP problem in
> userspace alone (without some kernel communication/intervention),
> but removing fairness at tail update helps to avoid LWP and
> can mitigate the situation significantly.
> This patch series proposes two new optional ring synchronization modes:
> 1) Head/Tail Sync (HTS) mode
> In this mode each enqueue/dequeue operation is fully serialized:
> only one thread at a time is allowed to perform a given op.
> As another enhancement, it provides the ability to split an enqueue/dequeue
> operation into two phases:
> - enqueue/dequeue start
> - enqueue/dequeue finish
> That allows the user to inspect objects in the ring without removing
> them from it (aka MT-safe peek).
> 2) Relaxed Tail Sync (RTS)
> The main difference from the original MP/MC algorithm is that
> the tail value is increased not by every thread that finished enqueue/dequeue,
> but only by the last one.
> That allows threads to avoid spinning on the ring tail value,
> leaving the actual tail value change to the last thread in the update queue.
>
> Note that these new sync modes are optional.
> For current rte_ring users nothing should change
> (both in terms of API/ABI and performance).
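For illustration, the fully serialized HTS scheme described above can be sketched with C11 atomics. Everything here (`hts_ring`, `hts_enqueue`, `hts_dequeue`, the 8-slot size) is a hypothetical simplification for this mail, not DPDK's actual rte_ring code: the point is only that head and tail share one atomic word, so a new operation can start only when the previous one has finished (head == tail), and completion publishes head and tail together in a single store.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

#define HTS_RING_SZ 8u                    /* slot count, power of two */

/* head and tail packed into one 64-bit word so both can be
 * inspected and updated atomically -- the core of the HTS idea */
union hts_pos {
    uint64_t raw;
    struct { uint32_t head, tail; } p;
};

struct hts_ring {
    _Atomic uint64_t prod;                /* packed producer head/tail */
    _Atomic uint64_t cons;                /* packed consumer head/tail */
    void *slots[HTS_RING_SZ];
};

/* 0 on success, -1 if full or another enqueue is in flight */
static int hts_enqueue(struct hts_ring *r, void *obj)
{
    union hts_pos op, np, c;

    op.raw = atomic_load(&r->prod);
    if (op.p.head != op.p.tail)
        return -1;                        /* serialized: op in flight */
    c.raw = atomic_load(&r->cons);
    if (op.p.head - c.p.tail >= HTS_RING_SZ)
        return -1;                        /* ring full */
    np = op;
    np.p.head++;
    if (!atomic_compare_exchange_strong(&r->prod, &op.raw, np.raw))
        return -1;                        /* lost the race, caller retries */
    r->slots[op.p.head & (HTS_RING_SZ - 1)] = obj;
    np.p.tail = np.p.head;                /* publish head+tail together */
    atomic_store(&r->prod, np.raw);
    return 0;
}

/* 0 on success, -1 if empty or another dequeue is in flight */
static int hts_dequeue(struct hts_ring *r, void **obj)
{
    union hts_pos oc, nc, p;

    oc.raw = atomic_load(&r->cons);
    if (oc.p.head != oc.p.tail)
        return -1;
    p.raw = atomic_load(&r->prod);
    if (oc.p.head == p.p.tail)
        return -1;                        /* ring empty */
    nc = oc;
    nc.p.head++;
    if (!atomic_compare_exchange_strong(&r->cons, &oc.raw, nc.raw))
        return -1;
    *obj = r->slots[oc.p.head & (HTS_RING_SZ - 1)];
    nc.p.tail = nc.p.head;
    atomic_store(&r->cons, nc.raw);
    return 0;
}
```

Since a stalled thread leaves head != tail, other threads fail fast instead of spinning on a tail update -- which is exactly why the mode behaves better under preemption. The real implementation adds a wait/retry policy and bulk operations on top of this.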
> Existing sync modes MP/MC and SP/SC are kept untouched, set up in the same
> way (via flags and _init_), and MP/MC remains the default one.
> The only thing that changed:
> the format of prod/cons can now differ depending on the mode selected at _init_.
> So the user has to stick with one sync model through the whole ring lifetime.
> In other words, a user can't create a ring for, let's say, SP mode and then
> in the middle of the data-path change his mind and start using MP_RTS mode.
> For the existing modes (SP/MP, SC/MC) the format remains the same and
> the user can still use them interchangeably, though of course that is an
> error-prone practice.
>
> Test results on IA (see below) show significant improvements
> in average enqueue/dequeue op times on overcommitted systems.
> For 'classic' DPDK deployments (one thread per core) the original MP/MC
> algorithm still shows the best numbers, though for the 64-bit target
> the RTS numbers are not that far away.
> Numbers were produced by the new UT test-case: ring_stress_autotest, i.e.:
> echo ring_stress_autotest | ./dpdk-test -n 4 --lcores='...'
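The RTS rule described earlier in the cover letter -- every thread reserves and fills its slots, but only the last one to finish advances the tail -- can likewise be sketched with C11 atomics. The names `rts_prod`, `rts_reserve`, and `rts_commit` are hypothetical simplifications, not DPDK's code; the real implementation packs an update counter next to the tail position and bounds how far head may run ahead of tail, but the publishing rule is the same.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Producer-side positions for a relaxed-tail-sync sketch. */
struct rts_prod {
    _Atomic uint32_t head;       /* next free slot (reservations) */
    _Atomic uint32_t committed;  /* number of finished copies */
    _Atomic uint32_t tail;       /* slots visible to consumers */
};

/* reserve one slot; returns the slot index the caller may fill */
static uint32_t rts_reserve(struct rts_prod *p)
{
    return atomic_fetch_add(&p->head, 1);
}

/* called after the reserved slot has been filled */
static void rts_commit(struct rts_prod *p)
{
    uint32_t done = atomic_fetch_add(&p->committed, 1) + 1;
    uint32_t h = atomic_load(&p->head);

    /* Threads that finish early simply return -- no spinning on the
     * tail.  Only the last finisher (committed caught up with head)
     * drags the tail all the way forward in one step. */
    if (done == h) {
        uint32_t t = atomic_load(&p->tail);
        while (t < h &&
               !atomic_compare_exchange_weak(&p->tail, &t, h))
            ;                    /* t is reloaded by the failed CAS */
    }
}
```

Compared with the classic MP scheme, a preempted thread here delays only the visibility of in-flight objects, not the progress of every later enqueuer -- which is what mitigates the LWP effect on overcommitted cores.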
>
> X86_64 @ Intel(R) Xeon(R) Platinum 8160 CPU @ 2.10GHz
> DEQ+ENQ average cycles/obj
>
>                                                       MP/MC        HTS        RTS
> 1thread@1core(--lcores=6-7)                            8.00       8.15       8.99
> 2thread@2core(--lcores=6-8)                           19.14      19.61      20.35
> 4thread@4core(--lcores=6-10)                          29.43      29.79      31.82
> 8thread@8core(--lcores=6-14)                         110.59     192.81     119.50
> 16thread@16core(--lcores=6-22)                       461.03     813.12     495.59
> 32thread@32core(--lcores='6-22,55-70')               982.90    1972.38    1160.51
>
> 2thread@1core(--lcores='6,(10-11)@7')              20140.50      23.58      25.14
> 4thread@2core(--lcores='6,(10-11)@7,(20-21)@8')   153680.60      76.88      80.05
> 8thread@2core(--lcores='6,(10-13)@7,(20-23)@8')   280314.32     294.72     318.79
> 16thread@2core(--lcores='6,(10-17)@7,(20-27)@8')  643176.59    1144.02    1175.14
> 32thread@2core(--lcores='6,(10-25)@7,(30-45)@8') 4264238.80    4627.48    4892.68
>
> 8thread@2core(--lcores='6,(10-17)@(7,8)')          321085.98     298.59     307.47
> 16thread@4core(--lcores='6,(20-35)@(7-10)')       1900705.61     575.35     678.29
> 32thread@4core(--lcores='6,(20-51)@(7-10)')       5510445.85    2164.36    2714.12
>
> i686 @ Intel(R) Xeon(R) Platinum 8160 CPU @ 2.10GHz
> DEQ+ENQ average cycles/obj
>
>                                                       MP/MC        HTS        RTS
> 1thread@1core(--lcores=6-7)                            7.85      12.13      11.31
> 2thread@2core(--lcores=6-8)                           17.89      24.52      21.86
> 8thread@8core(--lcores=6-14)                          32.58     354.20      54.58
> 32thread@32core(--lcores='6-22,55-70')               813.77    6072.41    2169.91
>
> 2thread@1core(--lcores='6,(10-11)@7')              16095.00      36.06      34.74
> 8thread@2core(--lcores='6,(10-13)@7,(20-23)@8')  1140354.54     346.61     361.57
> 16thread@2core(--lcores='6,(10-17)@7,(20-27)@8') 1920417.86    1314.90    1416.65
>
> 8thread@2core(--lcores='6,(10-17)@(7,8)')          594358.61     332.70     357.74
> 32thread@4core(--lcores='6,(20-51)@(7-10)')       5319896.86    2836.44    3028.87

I fixed a couple of typos and split the doc updates.
Series applied with the patch from Pavan.

Thanks for the work Konstantin, Honnappa.

-- 
David Marchand