From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from mga14.intel.com (mga14.intel.com [192.55.52.115])
 by dpdk.org (Postfix) with ESMTP id 81E921B937
 for ; Fri, 11 Jan 2019 11:25:28 +0100 (CET)
X-Amp-Result: SKIPPED(no attachment in message)
X-Amp-File-Uploaded: False
Received: from fmsmga005.fm.intel.com ([10.253.24.32])
 by fmsmga103.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384;
 11 Jan 2019 02:25:27 -0800
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.56,465,1539673200"; d="scan'208";a="310982381"
Received: from aburakov-mobl1.ger.corp.intel.com (HELO [10.251.85.183]) ([10.251.85.183])
 by fmsmga005.fm.intel.com with ESMTP; 11 Jan 2019 02:25:26 -0800
To: Gage Eads , dev@dpdk.org
Cc: olivier.matz@6wind.com, arybchenko@solarflare.com,
 bruce.richardson@intel.com, konstantin.ananyev@intel.com
References: <20190110210122.24889-1-gage.eads@intel.com>
 <20190110210122.24889-2-gage.eads@intel.com>
From: "Burakov, Anatoly"
Date: Fri, 11 Jan 2019 10:25:25 +0000
User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:60.0) Gecko/20100101 Thunderbird/60.4.0
MIME-Version: 1.0
In-Reply-To: <20190110210122.24889-2-gage.eads@intel.com>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Language: en-US
Content-Transfer-Encoding: 7bit
Subject: Re: [dpdk-dev] [PATCH 1/6] ring: change head and tail to pointer-width size
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions
X-List-Received-Date: Fri, 11 Jan 2019 10:25:28 -0000

On 10-Jan-19 9:01 PM, Gage Eads wrote:
> For 64-bit architectures, doubling the head and tail index widths greatly
> increases the time it takes for them to wrap-around (with current CPU
> speeds, it won't happen within the author's lifetime). This is important in
> avoiding the ABA problem -- in which a thread mistakes reading the same
> tail index in two accesses to mean that the ring was not modified in the
> intervening time -- in the upcoming non-blocking ring implementation. Using
> a 64-bit index makes the possibility of this occurring effectively zero.
>
> I tested this commit's performance impact with an x86_64 build on a
> dual-socket Xeon E5-2699 v4 using ring_perf_autotest, and the change made
> no significant difference -- the few differences appear to be system noise.
> (The test ran on isolcpus cores using a tickless scheduler, but some
> variation was still observed.) Each test was run three times and the results
> were averaged:
>
>                                   | 64b head/tail cycle cost minus
>              Test                 |      32b head/tail cycle cost
> ------------------------------------------------------------------
> SP/SC single enq/dequeue          | 0.33
> MP/MC single enq/dequeue          | 0.00
> SP/SC burst enq/dequeue (size 8)  | 0.00
> MP/MC burst enq/dequeue (size 8)  | 1.00
> SP/SC burst enq/dequeue (size 32) | 0.00
> MP/MC burst enq/dequeue (size 32) | -1.00
> SC empty dequeue                  | 0.01
> MC empty dequeue                  | 0.00
>
> Single lcore:
> SP/SC bulk enq/dequeue (size 8)   | -0.36
> MP/MC bulk enq/dequeue (size 8)   | 0.99
> SP/SC bulk enq/dequeue (size 32)  | -0.40
> MP/MC bulk enq/dequeue (size 32)  | -0.57
>
> Two physical cores:
> SP/SC bulk enq/dequeue (size 8)   | -0.49
> MP/MC bulk enq/dequeue (size 8)   | 0.19
> SP/SC bulk enq/dequeue (size 32)  | -0.28
> MP/MC bulk enq/dequeue (size 32)  | -0.62
>
> Two NUMA nodes:
> SP/SC bulk enq/dequeue (size 8)   | 3.25
> MP/MC bulk enq/dequeue (size 8)   | 1.87
> SP/SC bulk enq/dequeue (size 32)  | -0.44
> MP/MC bulk enq/dequeue (size 32)  | -1.10
>
> An earlier version of this patch changed the head and tail indexes to
> uint64_t, but that caused a performance drop on 32-bit builds. With
> uintptr_t, no performance difference is observed on an i686 build.
>
> Signed-off-by: Gage Eads
> ---

You're breaking the ABI - version bump for affected libraries is needed.

-- 
Thanks,
Anatoly
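
To illustrate why widening the indexes can change the ABI, here is a minimal sketch. The structs ht_u32 and ht_uptr and the helper layout_changed() are hypothetical, simplified stand-ins invented for this example; they are not the real rte_ring definitions, which also carry alignment attributes and other fields.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a head/tail pair with 32-bit indexes.
 * Simplified for illustration; not the real struct rte_ring layout. */
struct ht_u32 {
	volatile uint32_t head;
	volatile uint32_t tail;
	uint32_t single;
};

/* The same stand-in with pointer-width indexes, as the patch proposes. */
struct ht_uptr {
	volatile uintptr_t head;
	volatile uintptr_t tail;
	uint32_t single;
};

/* The first member sits at offset 0 in both layouts. */
static_assert(offsetof(struct ht_u32, head) == 0 &&
	      offsetof(struct ht_uptr, head) == 0,
	      "head is the first member in both layouts");

/* Returns 1 if code compiled against the 32-bit layout would see a
 * different struct size or field offset in the pointer-width layout,
 * 0 if the two layouts are identical on this target. */
static int layout_changed(void)
{
	return sizeof(struct ht_u32) != sizeof(struct ht_uptr) ||
	       offsetof(struct ht_u32, tail) != offsetof(struct ht_uptr, tail) ||
	       offsetof(struct ht_u32, single) != offsetof(struct ht_uptr, single);
}
```

On an LP64 target such as x86_64, uintptr_t is 8 bytes, so tail moves from offset 4 to offset 8 and layout_changed() returns 1. On an ILP32 target such as i686, both types are 4 bytes and it returns 0. An application built against the old layout and run with the new library would read the wrong offsets on 64-bit builds, which is the kind of change a library version bump is meant to signal.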