From mboxrd@z Thu Jan 1 00:00:00 1970
To: Jerin Jacob, "Ananyev, Konstantin"
Cc: Stephen Hemminger, dpdk-dev, Olivier Matz
References: <20200224113515.1744-1-konstantin.ananyev@intel.com> <20200224085919.3e73fda7@hermes.lan> <20200224113529.4c1c94ab@hermes.lan>
From: David Christensen
Date: Thu, 27 Feb 2020 16:17:54 -0800
Subject: Re: [dpdk-dev] [RFC 0/6] New sync modes for ring
List-Id: DPDK patches and discussions

> On Tue, Feb 25, 2020 at 7:11 PM Ananyev, Konstantin wrote:
>
>> We do have a run-time check in our current enqueue()/dequeue implementation.
>> In fact we support both modes: we have generic rte_ring_enqueue(/dequeue)_bulk(/burst)
>> where sync behaviour is determined at runtime by value of prod(/cons).single.
>> Or user can call rte_ring_(mp/sp)_enqueue_* functions directly.
>> This RFC follows exactly the same paradigm:
>> rte_ring_enqueue(/dequeue)_bulk(/burst) kept generic and its
>> behaviour is determined at runtime, by value of prod(/cons).sync_type.
>> Or user can call enqueue/dequeue with particular sync mode directly:
>> rte_ring_(mp/sp/rts/hts)_enqueue_(bulk/burst)*.
>> The only thing that changed:
>> Format of prod/cons now could differ depending on mode selected at _init_.
>> So you can't create a ring for, let's say, SP mode and then in the middle of data-path
>> change your mind and start using MP_RTS mode.
>> For existing modes (SP/MP, SC/MC) format remains the same and user can still
>> use them interchangeably, though of course that is an error prone practice.
>
> Makes sense.
>
>
>>
>>>> But I agree with the problem statement that in the virtualization use
>>>> case, It may be possible to have N virtual cores runs on a physical
>>>> core.
>>>>
>>>> IMO, The best solution would be keeping the ring API same and have a
>>>> different flavor in "compile-time". Something like
>>>> liburcu did for accommodating different flavors.
>>>>
>>>> i.e urcu-qsbr.h and urcu-bp.h will identical definition of API. The
>>>> application can simply include ONE header file in a C file based on
>>>> the flavor.
>>
>> I don't think it is a flexible enough approach.
>> In one app user might need to have several rings with different sync modes.
>> Or even user might need a ring with different sync modes for enqueue/dequeue.
>
> Ack.
>
>
>> Yes, hiding rte_ring implementation inside .c would help a lot
>> in terms of ABI maintenance and would make our future life easier.
>> The question is what is the price for it in terms of performance,
>> and are we ready to pay it. Not to mention that it would cause
>> changes in many other libs/apps...
>> So I think it should be a subject for a separate discussion.
>> But, agree it would be good at least to measure the performance
>> impact of such change.
>> If I'll have some spare cycles, will give it a try.
>> Meanwhile, can I ask Jerin and other guys to repeat tests from this RFC
>> on their HW? Before continuing discussion would probably be good to know
>> does the suggested patch work as expected across different platforms.
>
>
> I tested on an arm64 HW. The former section is without the
> patch(20.02) and later one with this patch.
> I agree with Konstantin that getting more platform tests will be good
> early so that we can focus on the approach
> to avoid back and forth later.
>
>
> RTE>>ring_perf_autotest // without patch
>
> ### Testing single element enq/deq ###
> legacy APIs: SP/SC: single: 289.78
> legacy APIs: MP/MC: single: 516.20
>
> ### Testing burst enq/deq ###
> legacy APIs: SP/SC: burst (size: 8): 312.88
> legacy APIs: SP/SC: burst (size: 32): 426.72
> legacy APIs: MP/MC: burst (size: 8): 510.95
> legacy APIs: MP/MC: burst (size: 32): 702.01
>
> ### Testing bulk enq/deq ###
> legacy APIs: SP/SC: bulk (size: 8): 306.74
> legacy APIs: SP/SC: bulk (size: 32): 411.56
> legacy APIs: MP/MC: bulk (size: 8): 501.32
> legacy APIs: MP/MC: bulk (size: 32): 693.07
>
> ### Testing empty bulk deq ###
> legacy APIs: SP/SC: bulk (size: 8): 7.00
> legacy APIs: MP/MC: bulk (size: 8): 7.00
>
> ### Testing using two physical cores ###
> legacy APIs: SP/SC: bulk (size: 8): 74.36
> legacy APIs: MP/MC: bulk (size: 8): 110.18
> legacy APIs: SP/SC: bulk (size: 32): 23.04
> legacy APIs: MP/MC: bulk (size: 32): 32.29
>
> ### Testing using all slave nodes ##
> Bulk enq/dequeue count on size 8
> Core [8] count = 293741
> Core [9] count = 293741
> Total count (size: 8): 587482
>
> Bulk enq/dequeue count on size 32
> Core [8] count = 244909
> Core [9] count = 244909
> Total count (size: 32): 1077300
>
> ### Testing single element enq/deq ###
> elem APIs: element size 16B: SP/SC: single: 255.37
> elem APIs: element size 16B: MP/MC: single: 456.68
>
> ### Testing burst enq/deq ###
> elem APIs: element size 16B: SP/SC: burst (size: 8): 291.99
> elem APIs: element size 16B: SP/SC: burst (size: 32): 456.25
> elem APIs: element size 16B: MP/MC: burst (size: 8): 497.77
> elem APIs: element size 16B: MP/MC: burst (size: 32): 680.87
>
> ### Testing bulk enq/deq ###
> elem APIs: element size 16B: SP/SC: bulk (size: 8): 284.40
> elem APIs: element size 16B: SP/SC: bulk (size: 32): 453.17
> elem APIs: element size 16B: MP/MC: bulk (size: 8): 485.77
> elem APIs: element size 16B: MP/MC: bulk (size: 32): 675.08
>
> ### Testing empty bulk deq ###
> elem APIs: element size 16B: SP/SC: bulk (size: 8): 8.00
> elem APIs: element size 16B: MP/MC: bulk (size: 8): 7.00
>
> ### Testing using two physical cores ###
> elem APIs: element size 16B: SP/SC: bulk (size: 8): 74.45
> elem APIs: element size 16B: MP/MC: bulk (size: 8): 105.91
> elem APIs: element size 16B: SP/SC: bulk (size: 32): 22.92
> elem APIs: element size 16B: MP/MC: bulk (size: 32): 31.55
>
> ### Testing using all slave nodes ###
>
> Bulk enq/dequeue count on size 8
> Core [8] count = 308724
> Core [9] count = 308723
> Total count (size: 8): 617447
>
> Bulk enq/dequeue count on size 32
> Core [8] count = 214269
> Core [9] count = 214269
> Total count (size: 32): 1045985
>
> RTE>>ring_perf_autotest // with patch
>
> ### Testing single element enq/deq ###
> legacy APIs: SP/SC: single: 289.78
> legacy APIs: MP/MC: single: 475.76
>
> ### Testing burst enq/deq ###
> legacy APIs: SP/SC: burst (size: 8): 323.91
> legacy APIs: SP/SC: burst (size: 32): 424.60
> legacy APIs: MP/MC: burst (size: 8): 523.00
> legacy APIs: MP/MC: burst (size: 32): 717.09
>
> ### Testing bulk enq/deq ###
> legacy APIs: SP/SC: bulk (size: 8): 317.74
> legacy APIs: SP/SC: bulk (size: 32): 413.57
> legacy APIs: MP/MC: bulk (size: 8): 512.89
> legacy APIs: MP/MC: bulk (size: 32): 712.45
>
> ### Testing empty bulk deq ###
> legacy APIs: SP/SC: bulk (size: 8): 7.00
> legacy APIs: MP/MC: bulk (size: 8): 7.00
>
> ### Testing using two physical cores ###
> legacy APIs: SP/SC: bulk (size: 8): 74.82
> legacy APIs: MP/MC: bulk (size: 8): 96.45
> legacy APIs: SP/SC: bulk (size: 32): 22.97
> legacy APIs: MP/MC: bulk (size: 32): 32.52
>
> ### Testing using all slave nodes ###
>
> Bulk enq/dequeue count on size 8
> Core [8] count = 283928
> Core [9] count = 283927
> Total count (size: 8): 567855
>
> Bulk enq/dequeue count on size 32
> Core [8] count = 223916
> Core [9] count = 223915
> Total count (size: 32): 1015686
>
> ### Testing single element enq/deq ###
> elem APIs: element size 16B: SP/SC: single: 267.65
> elem APIs: element size 16B: MP/MC: single: 439.06
>
> ### Testing burst enq/deq ###
> elem APIs: element size 16B: SP/SC: burst (size: 8): 302.44
> elem APIs: element size 16B: SP/SC: burst (size: 32): 466.31
> elem APIs: element size 16B: MP/MC: burst (size: 8): 502.51
> elem APIs: element size 16B: MP/MC: burst (size: 32): 695.81
>
> ### Testing bulk enq/deq ###
> elem APIs: element size 16B: SP/SC: bulk (size: 8): 295.15
> elem APIs: element size 16B: SP/SC: bulk (size: 32): 462.77
> elem APIs: element size 16B: MP/MC: bulk (size: 8): 496.89
> elem APIs: element size 16B: MP/MC: bulk (size: 32): 690.46
>
> ### Testing empty bulk deq ###
> elem APIs: element size 16B: SP/SC: bulk (size: 8): 7.50
> elem APIs: element size 16B: MP/MC: bulk (size: 8): 7.44
>
> ### Testing using two physical cores ###
> elem APIs: element size 16B: SP/SC: bulk (size: 8): 65.85
> elem APIs: element size 16B: MP/MC: bulk (size: 8): 103.80
> elem APIs: element size 16B: SP/SC: bulk (size: 32): 23.27
> elem APIs: element size 16B: MP/MC: bulk (size: 32): 31.17
>
> ### Testing using all slave nodes ###
>
> Bulk enq/dequeue count on size 8
> Core [8] count = 304223
> Core [9] count = 304221
> Total count (size: 8): 608444
>
> Bulk enq/dequeue count on size 32
> Core [8] count = 214856
> Core [9] count = 214855
> Total count (size: 32): 1038155
> Test OK
> RTE>>quit
>

Encountered a couple of different build errors with these patches on my
Power 9 system:

In file included from ../lib/librte_ring/rte_ring.h:534,
                 from ../drivers/mempool/ring/rte_mempool_ring.c:9:
../lib/librte_ring/rte_ring_hts_generic.h: In function ‘__rte_ring_hts_update_tail’:
../lib/librte_ring/rte_ring_hts_generic.h:61:2: warning: implicit declaration of function ‘RTE_ASSERT’; did you mean ‘RTE_STR’? [-Wimplicit-function-declaration]
  RTE_ASSERT(n >= num);
  ^~~~~~~~~~
  RTE_STR

Fixed by adding "#include " to rte_ring.h.

Also encountered:

In file included from ../app/test/test_ring_hts_stress.c:5:
../app/test/test_ring_stress.h: In function ‘check_updt_elem’:
../app/test/test_ring_stress.h:162:9: error: unknown type name ‘rte_spinlock_t’
  static rte_spinlock_t dump_lock;
         ^~~~~~~~~~~~~~
../app/test/test_ring_stress.h:166:4: warning: implicit declaration of function ‘rte_spinlock_lock’; did you mean ‘rte_calloc_socket’? [-Wimplicit-function-declaration]
    rte_spinlock_lock(&dump_lock);
    ^~~~~~~~~~~~~~~~~
    rte_calloc_socket
../app/test/test_ring_stress.h:166:4: warning: nested extern declaration of ‘rte_spinlock_lock’ [-Wnested-externs]
../app/test/test_ring_stress.h:172:4: warning: implicit declaration of function ‘rte_spinlock_unlock’; did you mean ‘pthread_rwlock_unlock’? [-Wimplicit-function-declaration]
    rte_spinlock_unlock(&dump_lock);
    ^~~~~~~~~~~~~~~~~~~
    pthread_rwlock_unlock
../app/test/test_ring_stress.h:172:4: warning: nested extern declaration of ‘rte_spinlock_unlock’ [-Wnested-externs]

Fixed by adding "#include " to test_ring_stress.h.
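
For anyone wanting to see the shape of the first fix, here is a minimal
sketch under stated assumptions. The header names in the two "Fixed by"
notes did not survive in this copy of the mail, so the sketch supposes
rte_debug.h, which is where DPDK defines RTE_ASSERT. The __has_include
guard and the assert() fallback are additions of mine so the snippet builds
without a DPDK tree, and hypothetical_tail_advance() is an invented
stand-in, not code from the patch set.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Assumption: RTE_ASSERT comes from DPDK's rte_debug.h. Include it directly
 * when the compiler can find it, so this code does not depend on whichever
 * file happens to include it first.
 */
#ifdef __has_include
#  if __has_include(<rte_debug.h>)
#    include <rte_debug.h>
#  endif
#endif

/* Fallback for building outside a DPDK tree (not part of the patch). */
#ifndef RTE_ASSERT
#  define RTE_ASSERT(exp) assert(exp)
#endif

/*
 * Hypothetical stand-in for a helper such as __rte_ring_hts_update_tail():
 * checks that no more than n entries are committed, then returns the
 * advanced tail. If no RTE_ASSERT macro is visible at this point, the
 * compiler parses the check as a call to an undeclared function, which is
 * what produces the implicit-declaration warning quoted above.
 */
static inline uint32_t
hypothetical_tail_advance(uint32_t tail, uint32_t n, uint32_t num)
{
	RTE_ASSERT(n >= num);
	return tail + num;
}
```

The same reasoning would apply to test_ring_stress.h, presumably with
rte_spinlock.h, which declares rte_spinlock_t and the lock/unlock calls;
that header name is likewise an assumption here.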

Autoperf test results
---------------------
RTE>>ring_perf_autotest // DPDK 20.02, without patch

### Testing single element enq/deq ###
legacy APIs: SP/SC: single: 42.14
legacy APIs: MP/MC: single: 56.26

### Testing burst enq/deq ###
legacy APIs: SP/SC: burst (size: 8): 43.59
legacy APIs: SP/SC: burst (size: 32): 49.87
legacy APIs: MP/MC: burst (size: 8): 58.43
legacy APIs: MP/MC: burst (size: 32): 65.68

### Testing bulk enq/deq ###
legacy APIs: SP/SC: bulk (size: 8): 43.59
legacy APIs: SP/SC: bulk (size: 32): 49.85
legacy APIs: MP/MC: bulk (size: 8): 58.43
legacy APIs: MP/MC: bulk (size: 32): 65.60

### Testing empty bulk deq ###
legacy APIs: SP/SC: bulk (size: 8): 7.16
legacy APIs: MP/MC: bulk (size: 8): 7.16

### Testing using two hyperthreads ###
legacy APIs: SP/SC: bulk (size: 8): 12.46
legacy APIs: MP/MC: bulk (size: 8): 16.20
legacy APIs: SP/SC: bulk (size: 32): 3.21
legacy APIs: MP/MC: bulk (size: 32): 3.73

### Testing using two physical cores ###
legacy APIs: SP/SC: bulk (size: 8): 33.34
legacy APIs: MP/MC: bulk (size: 8): 37.99
legacy APIs: SP/SC: bulk (size: 32): 10.19
legacy APIs: MP/MC: bulk (size: 32): 11.90

### Testing using two NUMA nodes ###
legacy APIs: SP/SC: bulk (size: 8): 49.50
legacy APIs: MP/MC: bulk (size: 8): 63.65
legacy APIs: SP/SC: bulk (size: 32): 12.49
legacy APIs: MP/MC: bulk (size: 32): 23.53

### Testing using all slave nodes ###

Bulk enq/dequeue count on size 8
Core [4] count = 5604
Core [5] count = 5563
Core [6] count = 5576
Core [7] count = 5630
Core [8] count = 5643
Core [9] count = 5727
Core [10] count = 5698
Core [11] count = 5711
Core [64] count = 5259
Core [65] count = 5322
Core [66] count = 5321
Core [67] count = 5310
Core [68] count = 4350
Core [69] count = 4455
Core [70] count = 4546
Core [71] count = 4475
Total count (size: 8): 84190

Bulk enq/dequeue count on size 32
Core [4] count = 5543
Core [5] count = 5555
Core [6] count = 5596
Core [7] count = 5584
Core [8] count = 5613
Core [9] count = 5686
Core [10] count = 5689
Core [11] count = 5677
Core [64] count = 5228
Core [65] count = 5389
Core [66] count = 5406
Core [67] count = 5359
Core [68] count = 4554
Core [69] count = 4673
Core [70] count = 4675
Core [71] count = 4644
Total count (size: 32): 169061

### Testing single element enq/deq ###
elem APIs: element size 16B: SP/SC: single: 42.84
elem APIs: element size 16B: MP/MC: single: 56.77

### Testing burst enq/deq ###
elem APIs: element size 16B: SP/SC: burst (size: 8): 44.98
elem APIs: element size 16B: SP/SC: burst (size: 32): 59.16
elem APIs: element size 16B: MP/MC: burst (size: 8): 60.58
elem APIs: element size 16B: MP/MC: burst (size: 32): 74.75

### Testing bulk enq/deq ###
elem APIs: element size 16B: SP/SC: bulk (size: 8): 45.01
elem APIs: element size 16B: SP/SC: bulk (size: 32): 59.08
elem APIs: element size 16B: MP/MC: bulk (size: 8): 60.58
elem APIs: element size 16B: MP/MC: bulk (size: 32): 74.76

### Testing empty bulk deq ###
elem APIs: element size 16B: SP/SC: bulk (size: 8): 7.16
elem APIs: element size 16B: MP/MC: bulk (size: 8): 7.16

### Testing using two hyperthreads ###
elem APIs: element size 16B: SP/SC: bulk (size: 8): 12.18
elem APIs: element size 16B: MP/MC: bulk (size: 8): 15.44
elem APIs: element size 16B: SP/SC: bulk (size: 32): 3.22
elem APIs: element size 16B: MP/MC: bulk (size: 32): 3.97

### Testing using two physical cores ###
elem APIs: element size 16B: SP/SC: bulk (size: 8): 42.07
elem APIs: element size 16B: MP/MC: bulk (size: 8): 44.50
elem APIs: element size 16B: SP/SC: bulk (size: 32): 10.73
elem APIs: element size 16B: MP/MC: bulk (size: 32): 11.73

### Testing using two NUMA nodes ###
elem APIs: element size 16B: SP/SC: bulk (size: 8): 49.55
elem APIs: element size 16B: MP/MC: bulk (size: 8): 93.10
elem APIs: element size 16B: SP/SC: bulk (size: 32): 12.33
elem APIs: element size 16B: MP/MC: bulk (size: 32): 27.10

### Testing using all slave nodes ###

Bulk enq/dequeue count on size 8
Core [4] count = 5489
Core [5] count = 5559
Core [6] count = 5566
Core [7] count = 5577
Core [8] count = 5645
Core [9] count = 5699
Core [10] count = 5695
Core [11] count = 5733
Core [64] count = 5202
Core [65] count = 5284
Core [66] count = 5319
Core [67] count = 5349
Core [68] count = 4331
Core [69] count = 4465
Core [70] count = 4484
Core [71] count = 4439
Total count (size: 8): 83836

Bulk enq/dequeue count on size 32
Core [4] count = 5567
Core [5] count = 5492
Core [6] count = 5517
Core [7] count = 5515
Core [8] count = 5593
Core [9] count = 5650
Core [10] count = 5694
Core [11] count = 5665
Core [64] count = 5236
Core [65] count = 5319
Core [66] count = 5333
Core [67] count = 5304
Core [68] count = 4608
Core [69] count = 4669
Core [70] count = 4690
Core [71] count = 4654
Total count (size: 32): 168342
Test OK

---------------------
RTE>>ring_perf_autotest // DPDK 20.02, without patch

### Testing single element enq/deq ###
legacy APIs: SP/SC: single: 42.18
legacy APIs: MP/MC: single: 56.26

### Testing burst enq/deq ###
legacy APIs: SP/SC: burst (size: 8): 43.60
legacy APIs: SP/SC: burst (size: 32): 49.86
legacy APIs: MP/MC: burst (size: 8): 58.43
legacy APIs: MP/MC: burst (size: 32): 65.67

### Testing bulk enq/deq ###
legacy APIs: SP/SC: bulk (size: 8): 43.59
legacy APIs: SP/SC: bulk (size: 32): 49.86
legacy APIs: MP/MC: bulk (size: 8): 58.43
legacy APIs: MP/MC: bulk (size: 32): 65.63

### Testing empty bulk deq ###
legacy APIs: SP/SC: bulk (size: 8): 7.16
legacy APIs: MP/MC: bulk (size: 8): 7.16

### Testing using two hyperthreads ###
legacy APIs: SP/SC: bulk (size: 8): 12.07
legacy APIs: MP/MC: bulk (size: 8): 16.24
legacy APIs: SP/SC: bulk (size: 32): 3.20
legacy APIs: MP/MC: bulk (size: 32): 3.72

### Testing using two physical cores ###
legacy APIs: SP/SC: bulk (size: 8): 33.41
legacy APIs: MP/MC: bulk (size: 8): 38.01
legacy APIs: SP/SC: bulk (size: 32): 10.23
legacy APIs: MP/MC: bulk (size: 32): 11.90

### Testing using two NUMA nodes ###
legacy APIs: SP/SC: bulk (size: 8): 49.27
legacy APIs: MP/MC: bulk (size: 8): 64.80
legacy APIs: SP/SC: bulk (size: 32): 12.45
legacy APIs: MP/MC: bulk (size: 32): 23.11

### Testing using all slave nodes ###

Bulk enq/dequeue count on size 8
Core [4] count = 5637
Core [5] count = 5599
Core [6] count = 5623
Core [7] count = 5627
Core [8] count = 5723
Core [9] count = 5758
Core [10] count = 5714
Core [11] count = 5724
Core [64] count = 5310
Core [65] count = 5438
Core [66] count = 5448
Core [67] count = 5374
Core [68] count = 4441
Core [69] count = 4550
Core [70] count = 4550
Core [71] count = 4558
Total count (size: 8): 85074

Bulk enq/dequeue count on size 32
Core [4] count = 5608
Core [5] count = 5623
Core [6] count = 5590
Core [7] count = 5658
Core [8] count = 5680
Core [9] count = 5738
Core [10] count = 5692
Core [11] count = 5712
Core [64] count = 5273
Core [65] count = 5363
Core [66] count = 5341
Core [67] count = 5349
Core [68] count = 4591
Core [69] count = 4673
Core [70] count = 4698
Core [71] count = 4687
Total count (size: 32): 170350

### Testing single element enq/deq ###
elem APIs: element size 16B: SP/SC: single: 42.82
elem APIs: element size 16B: MP/MC: single: 56.79

### Testing burst enq/deq ###
elem APIs: element size 16B: SP/SC: burst (size: 8): 44.99
elem APIs: element size 16B: SP/SC: burst (size: 32): 59.00
elem APIs: element size 16B: MP/MC: burst (size: 8): 60.59
elem APIs: element size 16B: MP/MC: burst (size: 32): 74.78

### Testing bulk enq/deq ###
elem APIs: element size 16B: SP/SC: bulk (size: 8): 44.97
elem APIs: element size 16B: SP/SC: bulk (size: 32): 58.91
elem APIs: element size 16B: MP/MC: bulk (size: 8): 60.60
elem APIs: element size 16B: MP/MC: bulk (size: 32): 74.61

### Testing empty bulk deq ###
elem APIs: element size 16B: SP/SC: bulk (size: 8): 7.16
elem APIs: element size 16B: MP/MC: bulk (size: 8): 7.16

### Testing using two hyperthreads ###
elem APIs: element size 16B: SP/SC: bulk (size: 8): 12.18
elem APIs: element size 16B: MP/MC: bulk (size: 8): 15.41
elem APIs: element size 16B: SP/SC: bulk (size: 32): 3.19
elem APIs: element size 16B: MP/MC: bulk (size: 32): 4.06

### Testing using two physical cores ###
elem APIs: element size 16B: SP/SC: bulk (size: 8): 42.08
elem APIs: element size 16B: MP/MC: bulk (size: 8): 44.52
elem APIs: element size 16B: SP/SC: bulk (size: 32): 10.73
elem APIs: element size 16B: MP/MC: bulk (size: 32): 12.39

### Testing using two NUMA nodes ###
elem APIs: element size 16B: SP/SC: bulk (size: 8): 49.65
elem APIs: element size 16B: MP/MC: bulk (size: 8): 93.27
elem APIs: element size 16B: SP/SC: bulk (size: 32): 12.38
elem APIs: element size 16B: MP/MC: bulk (size: 32): 27.19

### Testing using all slave nodes ###

Bulk enq/dequeue count on size 8
Core [4] count = 5629
Core [5] count = 5585
Core [6] count = 5676
Core [7] count = 5604
Core [8] count = 5639
Core [9] count = 5731
Core [10] count = 5694
Core [11] count = 5707
Core [64] count = 5254
Core [65] count = 5331
Core [66] count = 5340
Core [67] count = 5355
Core [68] count = 4339
Core [69] count = 4481
Core [70] count = 4504
Core [71] count = 4507
Total count (size: 8): 84376

Bulk enq/dequeue count on size 32
Core [4] count = 5518
Core [5] count = 5493
Core [6] count = 5559
Core [7] count = 5484
Core [8] count = 5623
Core [9] count = 5669
Core [10] count = 5661
Core [11] count = 5658
Core [64] count = 5207
Core [65] count = 5305
Core [66] count = 5273
Core [67] count = 5303
Core [68] count = 4542
Core [69] count = 4682
Core [70] count = 4672
Core [71] count = 4609
Total count (size: 32): 168634
Test OK

Dave