From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 228054280D; Wed, 22 Mar 2023 19:06:12 +0100 (CET) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id AD9A040E09; Wed, 22 Mar 2023 19:06:11 +0100 (CET) Received: from linux.microsoft.com (linux.microsoft.com [13.77.154.182]) by mails.dpdk.org (Postfix) with ESMTP id 654F440A84 for ; Wed, 22 Mar 2023 19:06:09 +0100 (CET) Received: by linux.microsoft.com (Postfix, from userid 1086) id A217920FB6B2; Wed, 22 Mar 2023 11:06:08 -0700 (PDT) DKIM-Filter: OpenDKIM Filter v2.11.0 linux.microsoft.com A217920FB6B2 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.microsoft.com; s=default; t=1679508368; bh=BasAogkHDQucabBG5RZKE1XmgFV6KBeKA1PAyeCxVYY=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=eCF/z4jy78W0YYeTlVm345+rSgDDseQldCkbthVU5zeGHZHFnyQCv+7OkvO3nvub1 sQ5Roqd0ZSZxWWTkOL69NjzMFIFb2bZkZKBDWQzRcS4Bsy69+9sa0nsLk+mGc/3K7t +AkvosGCxpj9Qs0RPaucxHbE7fXKOYG3jlMj4q7U= Date: Wed, 22 Mar 2023 11:06:08 -0700 From: Tyler Retzlaff To: Honnappa Nagarahalli Cc: Morten =?iso-8859-1?Q?Br=F8rup?= , Stephen Hemminger , "dev@dpdk.org" , Ruifeng Wang , "thomas@monjalon.net" , nd Subject: Re: [PATCH 0/7] replace rte atomics with GCC builtin atomics Message-ID: <20230322180608.GA28785@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net> References: <20230317144226.2f26bad1@hermes.local> <20230317214910.GA31884@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net> <98CBD80474FA8B44BF855DF32C47DC35D877E1@smartserver.smartshare.dk> <20230322142134.GA29057@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net> <98CBD80474FA8B44BF855DF32C47DC35D877E6@smartserver.smartshare.dk> <20230322152932.GB29057@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net> <98CBD80474FA8B44BF855DF32C47DC35D877E8@smartserver.smartshare.dk> <98CBD80474FA8B44BF855DF32C47DC35D877E9@smartserver.smartshare.dk> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org On Wed, Mar 22, 2023 at 05:38:12PM +0000, Honnappa Nagarahalli wrote: > > > > -----Original Message----- > > From: Morten Brørup > > Sent: Wednesday, March 22, 2023 12:08 PM > > To: Honnappa Nagarahalli ; Tyler Retzlaff > > > > Cc: Stephen Hemminger ; dev@dpdk.org; > > Ruifeng Wang ; thomas@monjalon.net; nd > > ; nd > > Subject: RE: [PATCH 0/7] replace rte atomics with GCC builtin atomics > > > > > From: Honnappa Nagarahalli [mailto:Honnappa.Nagarahalli@arm.com] > > > Sent: Wednesday, 22 March 2023 17.40 > > > > > > > From: Morten Brørup > > > > Sent: Wednesday, March 22, 2023 11:14 AM > > > > > > > > > From: Tyler Retzlaff [mailto:roretzla@linux.microsoft.com] > > > > > Sent: Wednesday, 22 March 2023 16.30 > > > > > > > > > > On Wed, Mar 22, 2023 at 03:58:07PM +0100, Morten Brørup wrote: > > > > > > > From: Tyler Retzlaff [mailto:roretzla@linux.microsoft.com] > > > > > > > Sent: Wednesday, 22 March 2023 15.22 > > > > > > > > > > > > > > On Wed, Mar 22, 2023 at 12:28:44PM +0100, Morten Brørup wrote: > > > > > > > > > From: Tyler Retzlaff [mailto:roretzla@linux.microsoft.com] > > > > > > > > > Sent: Friday, 17 March 2023 22.49 > > > > > > > > > > > > > > > > > > On Fri, Mar 17, 2023 at 02:42:26PM -0700, Stephen > > > > > > > > > Hemminger > > > > wrote: > > > > > > > > > > On Fri, 17 Mar 2023 13:19:41 -0700 Tyler Retzlaff > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > Replace the use of rte_atomic.h types and functions, > > > > > > > > > > > instead use > > > > > GCC > > > > > > > > > > > supplied C++11 memory model builtins. > > > > > > > > > > > > > > > > > > > > > > This series covers the libraries and drivers that are > > > > > > > > > > > built on > > > > > > > Windows. > > > > > > > > > > > > > > > > > > > > > > The code has be converted to use the __atomic builtins > > > but > > > > > > > > > > > there > > > > > are > > > > > > > > > > > additional during conversion i notice that there may > > > > > > > > > > > be some > > > > > issues > > > > > > > > > > > that need to be addressed. > > > > > > > > > > > > > > > > > > > > I don't think all these cmpset need to use SEQ_CST. > > > > > > > > > > Especially for the places where it is used a loop, might > > > be > > > > > > > > > > more efficient with some of the other memory models. > > > > > > > > > > > > > > > > > > i agree. > > > > > > > > > > > > > > > > > > however, i'm not trying to improve the code with this > > > change, > > > > > > > > > just decouple it from rte_atomics.h so trying my best to > > > avoid > > > > > > > > > any unnecessary semantic change. > > > > > > > > > > > > > > > > > > certainly if the maintainers of this code wish to weaken > > > > > > > > > the ordering where appropriate after the change is merged > > > > > > > > > they > > > can > > > > > > > > > do so and > > > > > handily > > > > > > > > > this change has enabled them to do so easily allowing them > > > to > > > > > > > > > test > > > > > just > > > > > > > > > their change in isolation. > > > > > > > > > > > > > > > > I agree with the two-step approach, where this first step is > > > > > > > > a simple > > > > > > > search-and-replacement; but I insist that you add a FIXME or > > > > > > > similar note where you have blindly used SEQ_CST, indicating > > > that > > > > > > > the memory order > > > > > needs to > > > > > > > be reviewed and potentially corrected. > > > > > > > > > > > > > > i think the maintainers need to take some responsibility, if > > > they > > > > > > > see optimizations they missed when previously writing the code > > > > > > > they need to follow up with a patch themselves. i can't do > > > > > > > everything for them and marking things i'm not sure about will > > > > > > > only lead to me having to churn patch series to remove the > > > unwanted > > > > comments later. > > > > > > > > > > > > The previous atomic functions didn't have the "memory order" > > > > > > parameter, so > > > > > the maintainers didn't have to think about it - and thus they > > > > > didn't miss any optimizations when accepting the code. > > > > > > > > > > > > I also agree 100 % that it is not your responsibility to > > > > > > consider > > > or > > > > > determine which memory order is appropriate! > > > > > > > > > > > > But I think you should mark the locations where you are changing > > > > > > from the > > > > > old rte_atomic functions (where no memory order optimization was > > > > > available) to the new functions - to highlight where the option of > > > > > memory ordering has been introduced and knowingly ignored (by you). > > > > > > > > > > > > > > > > first, i have to apologize i confused myself about which of the > > > > > many patch series i have up right now that you were commenting on. > > > > > > > > No worries... you are rushing through quite an effort for this, so a > > > little > > > > confusion is perfectly understandable. Especially when I'm replying > > > > to > > > an ageing > > > > email. :-) > > > > > > > > > > > > > > let me ask for clarification in relation to this series. > > > > > > > > > > isn't that every single usage of the rte_atomic APIs? > > > > > > > > Probably, yes. > > > > > > > > > i mean are you > > > > > literally asking for the entire patch series to look like the > > > > > following patch snippet with the expectation that maintainers will > > > > > come along and clean up/review after this series is merged? > > > > > > > > > > -rte_atomic_add32(&o, v); > > > > > +//FIXME: opportunity for relaxing ordering constraint, please > > > review > > > > > +__atomic_fetch_add(&o, v, order); > > > > > > > > Exactly. And something similar for the rte_atomicXX_t variables > > > changed to > > > > intXX_t, such as the packet counters. > > > > > > > > Realistically, I don't expect the maintainers to clean them up > > > > anytime > > > soon. The > > > > purpose is to make the FIXMEs stick until someone eventually cleans > > > them up, so > > > > they are not forgotten as time passes. > > > Cleaning up the rte_atomic APIs is a different effort. There is > > > already lot of effort that has gone into this and there is more effort > > > happening (rte_ring being a painful one) > > > > > > Instead of having FIXME, why not just send a separate patch with > > > SEQ_CST (still a search and replace)? We can leave the tougher ones > > > like rte_ring as they are being worked on. > > > > The FIXME makes it possible in the future to differentiate between the instances > > that still need review and the instances that have been reviewed where > > SEQ_CST was the correct choice. (Similarly for the choice of type for variables > > previously rte_atomicNN_t.) > Apologies, relooked at the heading of this patch, got confused with other patches. yeah, i did the same thing this morning :) > > The changes Arm had done for rte_atomic_ to __atomic_xxx were not direct replacements. The algorithms were studied, relaxed where required, race conditions fixed, performance benchmarked. IMO, we need to go through the same steps here. > > I looked at the series, we should just review the patch and make suggested changes. Are we constrained by any deadlines for this work? i'm going to say yes but i'll qualify. the use of the rte_atomic_xxx APIs drags in extra work when creating a series that performs the actual conversions to the standard atomics. if i don't decouple ring from rte_atomic_xxx that means i have to go convert all the rte_atomic.h to standard atomics and working around some of the implementation detail to do it is very time consuming. which then has further flow on effects because then i have to go fix every single driver that is still using rte_atomic.h. incidentally i have a work in progress to decouple everything from rte_atomic.h (including all drivers) but it would really negatively impact getting standard atomics introduced if we had to serialize the introduction behind a total removal of rte_atomic or had to make changes to every consumer of the old rte_atomic APIs. if we can get by with a comment on the rte_atomic_xxx lines in this series it would be helpful. when we bring the next series for standard atomics i'm not adverse to introducing changes to the ordering in that series if requested so long as i can get the series up 'soon' so there is lots of review time runway for 23.11. > > I would suggest to drop 1/7. Arm is working on removing the non-C11 algorithm for rte_ring (not sure if we will be successful). I think it is better to explore this approach rather than the changes in patch 1/7. i think my answer here is timing. i'd rather take the work from arm but if it isn't coming for a while then it becomes a blocker. we're waiting for the 23.07 start before this series can be merged. how about we re-evaluate where arm is at when the merge window opens. we can then decide to drop 1/7 or not at that time? ty