Date: Thu, 10 Jan 2019 20:38:33 -0800
From: Stephen Hemminger
To: Gage Eads
Cc: dev@dpdk.org, olivier.matz@6wind.com, arybchenko@solarflare.com, bruce.richardson@intel.com, konstantin.ananyev@intel.com
Message-ID: <20190110203833.3e6a6ea2@hermes.lan>
In-Reply-To: <20190110210122.24889-2-gage.eads@intel.com>
References: <20190110210122.24889-1-gage.eads@intel.com> <20190110210122.24889-2-gage.eads@intel.com>
Subject: Re: [dpdk-dev] [PATCH 1/6] ring: change head and tail to pointer-width size
List-Id: DPDK patches and discussions

On Thu, 10 Jan 2019 15:01:17 -0600
Gage Eads wrote:

> For 64-bit architectures, doubling the head and tail index widths greatly
> increases the time it takes for them to wrap-around (with current CPU
> speeds, it won't happen within the author's lifetime). This is important in
> avoiding the ABA problem -- in which a thread mistakes reading the same
> tail index in two accesses to mean that the ring was not modified in the
> intervening time -- in the upcoming non-blocking ring implementation. Using
> a 64-bit index makes the possibility of this occurring effectively zero.
>
> I tested this commit's performance impact with an x86_64 build on a
> dual-socket Xeon E5-2699 v4 using ring_perf_autotest, and the change made
> no significant difference -- the few differences appear to be system noise.
> (The test ran on isolcpus cores using a tickless scheduler, but some
> variation was still observed.)
> Each test was run three times and the results
> were averaged:
>
>                                   | 64b head/tail cycle cost minus
>              Test                 |      32b head/tail cycle cost
> ------------------------------------------------------------------
> SP/SC single enq/dequeue          | 0.33
> MP/MC single enq/dequeue          | 0.00
> SP/SC burst enq/dequeue (size 8)  | 0.00
> MP/MC burst enq/dequeue (size 8)  | 1.00
> SP/SC burst enq/dequeue (size 32) | 0.00
> MP/MC burst enq/dequeue (size 32) | -1.00
> SC empty dequeue                  | 0.01
> MC empty dequeue                  | 0.00
>
> Single lcore:
> SP/SC bulk enq/dequeue (size 8)   | -0.36
> MP/MC bulk enq/dequeue (size 8)   | 0.99
> SP/SC bulk enq/dequeue (size 32)  | -0.40
> MP/MC bulk enq/dequeue (size 32)  | -0.57
>
> Two physical cores:
> SP/SC bulk enq/dequeue (size 8)   | -0.49
> MP/MC bulk enq/dequeue (size 8)   | 0.19
> SP/SC bulk enq/dequeue (size 32)  | -0.28
> MP/MC bulk enq/dequeue (size 32)  | -0.62
>
> Two NUMA nodes:
> SP/SC bulk enq/dequeue (size 8)   | 3.25
> MP/MC bulk enq/dequeue (size 8)   | 1.87
> SP/SC bulk enq/dequeue (size 32)  | -0.44
> MP/MC bulk enq/dequeue (size 32)  | -1.10
>
> An earlier version of this patch changed the head and tail indexes to
> uint64_t, but that caused a performance drop on 32-bit builds. With
> uintptr_t, no performance difference is observed on an i686 build.
>
> Signed-off-by: Gage Eads
> ---
>  lib/librte_eventdev/rte_event_ring.h |  6 +++---
>  lib/librte_ring/rte_ring.c           | 10 +++++-----
>  lib/librte_ring/rte_ring.h           | 20 ++++++++++----------
>  lib/librte_ring/rte_ring_generic.h   | 16 +++++++++-------
>  4 files changed, 27 insertions(+), 25 deletions(-)
>
> diff --git a/lib/librte_eventdev/rte_event_ring.h b/lib/librte_eventdev/rte_event_ring.h
> index 827a3209e..eae70f904 100644
> --- a/lib/librte_eventdev/rte_event_ring.h
> +++ b/lib/librte_eventdev/rte_event_ring.h
> @@ -1,5 +1,5 @@
>  /* SPDX-License-Identifier: BSD-3-Clause
> - * Copyright(c) 2016-2017 Intel Corporation
> + * Copyright(c) 2016-2019 Intel Corporation
>   */
>
>  /**
> @@ -88,7 +88,7 @@ rte_event_ring_enqueue_burst(struct rte_event_ring *r,
>  		const struct rte_event *events,
>  		unsigned int n, uint16_t *free_space)
>  {
> -	uint32_t prod_head, prod_next;
> +	uintptr_t prod_head, prod_next;
>  	uint32_t free_entries;
>
>  	n = __rte_ring_move_prod_head(&r->r, r->r.prod.single, n,
> @@ -129,7 +129,7 @@ rte_event_ring_dequeue_burst(struct rte_event_ring *r,
>  		struct rte_event *events,
>  		unsigned int n, uint16_t *available)
>  {
> -	uint32_t cons_head, cons_next;
> +	uintptr_t cons_head, cons_next;
>  	uint32_t entries;
>
>  	n = __rte_ring_move_cons_head(&r->r, r->r.cons.single, n,
> diff --git a/lib/librte_ring/rte_ring.c b/lib/librte_ring/rte_ring.c
> index d215acecc..b15ee0eb3 100644
> --- a/lib/librte_ring/rte_ring.c
> +++ b/lib/librte_ring/rte_ring.c
> @@ -1,6 +1,6 @@
>  /* SPDX-License-Identifier: BSD-3-Clause
>   *
> - * Copyright (c) 2010-2015 Intel Corporation
> + * Copyright (c) 2010-2019 Intel Corporation
>   * Copyright (c) 2007,2008 Kip Macy kmacy@freebsd.org
>   * All rights reserved.
>   * Derived from FreeBSD's bufring.h
> @@ -227,10 +227,10 @@ rte_ring_dump(FILE *f, const struct rte_ring *r)
>  	fprintf(f, " flags=%x\n", r->flags);
>  	fprintf(f, " size=%"PRIu32"\n", r->size);
>  	fprintf(f, " capacity=%"PRIu32"\n", r->capacity);
> -	fprintf(f, " ct=%"PRIu32"\n", r->cons.tail);
> -	fprintf(f, " ch=%"PRIu32"\n", r->cons.head);
> -	fprintf(f, " pt=%"PRIu32"\n", r->prod.tail);
> -	fprintf(f, " ph=%"PRIu32"\n", r->prod.head);
> +	fprintf(f, " ct=%"PRIuPTR"\n", r->cons.tail);
> +	fprintf(f, " ch=%"PRIuPTR"\n", r->cons.head);
> +	fprintf(f, " pt=%"PRIuPTR"\n", r->prod.tail);
> +	fprintf(f, " ph=%"PRIuPTR"\n", r->prod.head);
>  	fprintf(f, " used=%u\n", rte_ring_count(r));
>  	fprintf(f, " avail=%u\n", rte_ring_free_count(r));
>  }
> diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h
> index af5444a9f..12af64e13 100644
> --- a/lib/librte_ring/rte_ring.h
> +++ b/lib/librte_ring/rte_ring.h
> @@ -1,6 +1,6 @@
>  /* SPDX-License-Identifier: BSD-3-Clause
>   *
> - * Copyright (c) 2010-2017 Intel Corporation
> + * Copyright (c) 2010-2019 Intel Corporation
>   * Copyright (c) 2007-2009 Kip Macy kmacy@freebsd.org
>   * All rights reserved.
>   * Derived from FreeBSD's bufring.h
> @@ -65,8 +65,8 @@ struct rte_memzone; /* forward declaration, so as not to require memzone.h */
>
>  /* structure to hold a pair of head/tail values and other metadata */
>  struct rte_ring_headtail {
> -	volatile uint32_t head; /**< Prod/consumer head. */
> -	volatile uint32_t tail; /**< Prod/consumer tail. */
> +	volatile uintptr_t head; /**< Prod/consumer head. */
> +	volatile uintptr_t tail; /**< Prod/consumer tail. */
>  	uint32_t single; /**< True if single prod/cons */
>  };

Isn't this a major ABI change which will break existing applications?