From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mga14.intel.com (mga14.intel.com [192.55.52.115]) by dpdk.org (Postfix) with ESMTP id 537725587 for ; Tue, 29 Mar 2016 18:06:43 +0200 (CEST) Received: from orsmga001.jf.intel.com ([10.7.209.18]) by fmsmga103.fm.intel.com with ESMTP; 29 Mar 2016 09:04:47 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.24,411,1455004800"; d="scan'208";a="921109822" Received: from bricha3-mobl3.ger.corp.intel.com ([10.237.221.48]) by orsmga001.jf.intel.com with SMTP; 29 Mar 2016 09:04:43 -0700 Received: by (sSMTP sendmail emulation); Tue, 29 Mar 2016 17:04:42 +0025 Date: Tue, 29 Mar 2016 17:04:42 +0100 From: Bruce Richardson To: Olivier MATZ Cc: Lazaros Koromilas , dev@dpdk.org, Thomas Monjalon Message-ID: <20160329160442.GB15912@bricha3-MOBL3> References: <1458229783-15547-1-git-send-email-l@nofutznetworks.com> <56F51DCB.5020502@6wind.com> <20160329085443.GA17800@bricha3-MOBL3> <56FA9F48.3030509@6wind.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <56FA9F48.3030509@6wind.com> Organization: Intel Shannon Ltd. User-Agent: Mutt/1.5.23 (2014-03-12) Subject: Re: [dpdk-dev] [PATCH v2] ring: check for zero objects mc dequeue / mp enqueue X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: patches and discussions about DPDK List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 29 Mar 2016 16:06:43 -0000 On Tue, Mar 29, 2016 at 05:29:12PM +0200, Olivier MATZ wrote: > Hi, > > > On 03/29/2016 10:54 AM, Bruce Richardson wrote: > >On Mon, Mar 28, 2016 at 06:48:07PM +0300, Lazaros Koromilas wrote: > >>Hi Olivier, > >> > >>We could have two threads (running on different cores in the general > >>case) that both succeed the cmpset operation. In the dequeue path, > >>when n == 0, then cons_next == cons_head, and cmpset will always > >>succeed. Now, if they both see an old r->cons.tail value from a > >>previous dequeue, they can get stuck in the while > > > >Hi, > > > >I don't see how threads reading an "old r->cons.tail" value is even possible. > >The head and tail pointers on the ring are marked in the code as volatile, so > >all reads and writes to those values are always done from memory and not cached > >in registers. No deadlock should be possible on that while loop, unless a > >process crashes in the middle of a ring operation. Each thread which updates > >the head pointer from x to y, is responsible for updating the tail pointer in > >a similar manner. The loop ensures the tail updates are in the same order as the > >head updates. > > > >If you believe deadlock is possible, can you outline the sequence of operations > >which would lead to such a state, because I cannot see how it could occur without > >a crash inside one of the threads. > > I think the deadlock Lazaros describes could occur in the following > condition: > > current ring state > r->prod.head = 0 > r->prod.tail = 0 > > core 0 core 1 > ==================================================================== > enqueue 0 object > cmpset(&r->prod.head, 0, 0) > core 0 is interrupted here > enqueue 1 object > cmpset(&r->prod.head, 0, 1) > copy the objects in box 0 > while (r->prod.tail != prod_head)) > r->prod.tail = prod_next > copy 0 object (-> nothing to do) > while (r->prod.tail != prod_head)) > > > > I think this issue is indeed fixed by Lazaros' patch (I missed it > in previous review). However, I don't think this deadlock could > happen once we avoided the (n == 0) case. > Yes, I agree with your scenario. However, I thought the issue was occuring with non-zero updates, but I may well be mistaken. If it's fixed now, all good... :-) /Bruce