Date: Fri, 20 Mar 2020 10:54:35 -0700
From: Stephen Hemminger
To: Konstantin Ananyev
Cc: dev@dpdk.org, olivier.matz@6wind.com, honnappa.nagarahalli@arm.com,
 jerinj@marvell.com, drc@linux.vnet.ibm.com
Message-ID: <20200320105435.47681954@hermes.lan>
In-Reply-To: <20200320164138.8510-1-konstantin.ananyev@intel.com>
References: <20200320164138.8510-1-konstantin.ananyev@intel.com>
Subject: Re: [dpdk-dev] [RFC] ring: make ring implementation non-inlined

On Fri, 20 Mar 2020 16:41:38 +0000
Konstantin Ananyev wrote:

> As was discussed here:
> http://mails.dpdk.org/archives/dev/2020-February/158586.html
> this RFC aims to hide the ring internals in a .c file and make all
> ring functions non-inlined. In theory that might help to
> maintain ABI stability in the future.
> This is just a POC to measure the impact of the proposed idea;
> a proper implementation would definitely need some extra effort.
> On an IA box (SKX), ring_perf_autotest shows ~20-30 extra cycles per
> enqueue+dequeue pair. In more realistic code, I suspect
> the impact might be a bit higher.
> For MP/MC bulk transfers the degradation seems quite small,
> though for SP/SC and/or small transfers it is more than noticeable
> (see exact numbers below).
> From my perspective we'd probably keep it inlined for now
> to avoid any unanticipated performance degradation,
> though I'm interested to see perf results and opinions from
> other interested parties.
>
> Intel(R) Xeon(R) Platinum 8160 CPU @ 2.10GHz
> ring_perf_autotest (without patch/with patch)
>
> ### Testing single element enq/deq ###
> legacy APIs: SP/SC: single: 8.75/43.23
> legacy APIs: MP/MC: single: 56.18/80.44
>
> ### Testing burst enq/deq ###
> legacy APIs: SP/SC: burst (size: 8): 37.36/53.37
> legacy APIs: SP/SC: burst (size: 32): 93.97/117.30
> legacy APIs: MP/MC: burst (size: 8): 78.23/91.45
> legacy APIs: MP/MC: burst (size: 32): 131.59/152.49
>
> ### Testing bulk enq/deq ###
> legacy APIs: SP/SC: bulk (size: 8): 37.29/54.48
> legacy APIs: SP/SC: bulk (size: 32): 92.68/113.01
> legacy APIs: MP/MC: bulk (size: 8): 78.40/93.50
> legacy APIs: MP/MC: bulk (size: 32): 131.49/154.25
>
> ### Testing empty bulk deq ###
> legacy APIs: SP/SC: bulk (size: 8): 4.00/16.86
> legacy APIs: MP/MC: bulk (size: 8): 7.01/15.55
>
> ### Testing using two hyperthreads ###
> legacy APIs: SP/SC: bulk (size: 8): 10.64/17.56
> legacy APIs: MP/MC: bulk (size: 8): 15.30/16.69
> legacy APIs: SP/SC: bulk (size: 32): 5.84/7.09
> legacy APIs: MP/MC: bulk (size: 32): 6.34/7.54
>
> ### Testing using two physical cores ###
> legacy APIs: SP/SC: bulk (size: 8): 24.34/42.40
> legacy APIs: MP/MC: bulk (size: 8): 70.34/71.82
> legacy APIs: SP/SC: bulk (size: 32): 12.67/14.68
> legacy APIs: MP/MC: bulk (size: 32): 22.41/17.93
>
> ### Testing single element enq/deq ###
> elem APIs: element size 16B: SP/SC: single: 10.65/41.96
> elem APIs: element size 16B: MP/MC: single: 44.33/81.36
>
> ### Testing burst enq/deq ###
> elem APIs: element size 16B: SP/SC: burst (size: 8): 39.20/58.52
> elem APIs: element size 16B: SP/SC: burst (size: 32): 123.19/142.79
> elem APIs: element size 16B: MP/MC: burst (size: 8): 80.72/101.36
> elem APIs: element size 16B: MP/MC: burst (size: 32): 169.21/185.38
>
> ### Testing bulk enq/deq ###
> elem APIs: element size 16B: SP/SC: bulk (size: 8): 41.64/58.46
> elem APIs: element size 16B: SP/SC: bulk (size: 32): 122.74/142.52
> elem APIs: element size 16B: MP/MC: bulk (size: 8): 80.60/103.14
> elem APIs: element size 16B: MP/MC: bulk (size: 32): 169.39/186.67
>
> ### Testing empty bulk deq ###
> elem APIs: element size 16B: SP/SC: bulk (size: 8): 5.01/17.17
> elem APIs: element size 16B: MP/MC: bulk (size: 8): 6.01/14.80
>
> ### Testing using two hyperthreads ###
> elem APIs: element size 16B: SP/SC: bulk (size: 8): 12.02/17.18
> elem APIs: element size 16B: MP/MC: bulk (size: 8): 16.81/21.14
> elem APIs: element size 16B: SP/SC: bulk (size: 32): 7.87/9.01
> elem APIs: element size 16B: MP/MC: bulk (size: 32): 8.22/10.57
>
> ### Testing using two physical cores ###
> elem APIs: element size 16B: SP/SC: bulk (size: 8): 27.00/51.94
> elem APIs: element size 16B: MP/MC: bulk (size: 8): 78.24/74.48
> elem APIs: element size 16B: SP/SC: bulk (size: 32): 15.41/16.14
> elem APIs: element size 16B: MP/MC: bulk (size: 32): 18.72/21.64
>
> Signed-off-by: Konstantin Ananyev
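
To make the trade-off concrete, below is a minimal sketch of the kind of
split being discussed. The toy_ring names and layout are invented for
illustration only (the real API is rte_ring_enqueue_bulk() and the real
struct rte_ring is far more involved); the sketch just shows where the
function body lives in each approach.

/*
 * Variant 1 mimics today's layout: the structure and the fast path are
 * fully visible in the header, so the compiler can inline the call.
 * Variant 2 mimics the RFC direction: callers see only a declaration,
 * the body lives in a .c file, so the layout could change without
 * breaking the ABI, but every enqueue now pays a real function call.
 */
#include <stdint.h>
#include <stdio.h>

struct toy_ring {
    uint32_t head;        /* next free slot (single producer only) */
    uint32_t mask;        /* size - 1, size is a power of two      */
    void *objs[1024];
};

/* Variant 1: header-style static inline, layout visible to callers. */
static inline unsigned int
toy_ring_enqueue_inline(struct toy_ring *r, void * const *obj, unsigned int n)
{
    for (unsigned int i = 0; i < n; i++)
        r->objs[(r->head + i) & r->mask] = obj[i];
    r->head += n;
    return n;
}

/* Variant 2: only this declaration would sit in the header ...       */
unsigned int toy_ring_enqueue_outline(struct toy_ring *r, void * const *obj,
                                      unsigned int n);

/* ... and the definition would live in ring.c (noinline stands in for
 * the call crossing a translation-unit boundary).                    */
__attribute__((noinline)) unsigned int
toy_ring_enqueue_outline(struct toy_ring *r, void * const *obj, unsigned int n)
{
    return toy_ring_enqueue_inline(r, obj, n);
}

int main(void)
{
    struct toy_ring r = { .head = 0, .mask = 1023 };
    void *objs[8] = { 0 };

    printf("inlined:  %u\n", toy_ring_enqueue_inline(&r, objs, 8));
    printf("outlined: %u\n", toy_ring_enqueue_outline(&r, objs, 8));
    return 0;
}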

What is the impact with LTO? I suspect the compiler might have a chance
to get the speed back with LTO.
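
To make the LTO question concrete: with link-time optimization the
compiler defers final code generation to link time, so it can still see
the ring function bodies even though the source keeps them in a .c file,
and may re-inline them into callers. A hypothetical two-file experiment
(file names and the loop are invented; -flto and meson's generic b_lto
option are standard toolchain/build knobs, not something added by the RFC):

/*
 * ring_impl.c -- the body hidden in its own translation unit, as the
 * RFC would do for the real ring code.
 */
#include <stdint.h>

uint32_t ring_impl_count(uint32_t head, uint32_t tail, uint32_t mask)
{
    /* trivial stand-in for a ring fast-path helper */
    return (head - tail) & mask;
}

/*
 * caller.c -- sees only the declaration, which is exactly the situation
 * the patch creates for applications.
 */
#include <stdint.h>
#include <stdio.h>

uint32_t ring_impl_count(uint32_t head, uint32_t tail, uint32_t mask);

int main(void)
{
    uint32_t c = 0;

    /* Without LTO every iteration pays a cross-file call; with LTO the
     * link-time optimizer can inline the body back into this loop. */
    for (uint32_t i = 0; i < 1000000; i++)
        c += ring_impl_count(i, 0, 1023);

    printf("%u\n", c);
    return 0;
}

/*
 * Build both ways and compare (cc = gcc or clang):
 *     cc -O3 -c ring_impl.c caller.c && cc -O3 -o nolto ring_impl.o caller.o
 *     cc -O3 -flto -c ring_impl.c caller.c && cc -O3 -flto -o lto ring_impl.o caller.o
 * For DPDK itself the equivalent knob is meson's base option:
 *     meson setup build -Db_lto=true
 */

Note that even with LTO, calls into a shared libdpdk would not be
re-inlined; the comparison mainly applies to statically linked builds.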
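
Separately, to relate the cycle counts quoted above to code: they come
from ring_perf_autotest, which times enqueue+dequeue pairs at various
burst/bulk sizes. The rough single-core sketch below shows the kind of
loop such a number corresponds to; it is not the actual test code, the
real test normalizes its results differently, and error handling is
trimmed. It assumes a DPDK installation exposing libdpdk via pkg-config:
    cc $(pkg-config --cflags libdpdk) sketch.c $(pkg-config --libs libdpdk)

#include <stdio.h>
#include <rte_eal.h>
#include <rte_ring.h>
#include <rte_cycles.h>
#include <rte_lcore.h>

#define ITERATIONS (1 << 20)
#define BURST      8

int main(int argc, char **argv)
{
    if (rte_eal_init(argc, argv) < 0)
        return 1;

    /* SP/SC ring, like the "legacy APIs: SP/SC" rows above. */
    struct rte_ring *r = rte_ring_create("perf_sketch", 1024,
                                         rte_socket_id(),
                                         RING_F_SP_ENQ | RING_F_SC_DEQ);
    if (r == NULL)
        return 1;

    void *burst[BURST] = { NULL };
    uint64_t start = rte_rdtsc();

    for (unsigned int i = 0; i < ITERATIONS; i++) {
        rte_ring_enqueue_bulk(r, burst, BURST, NULL);
        rte_ring_dequeue_bulk(r, burst, BURST, NULL);
    }

    uint64_t cycles = rte_rdtsc() - start;
    printf("avg cycles per enqueue+dequeue iteration (burst of %d): %.2f\n",
           BURST, (double)cycles / ITERATIONS);

    rte_ring_free(r);
    return 0;
}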