From: "Burakov, Anatoly"
To: Ray Kinsella, Bruce Richardson
Cc: dev@dpdk.org, Kevin Traynor, "techboard@dpdk.org"
Date: Thu, 4 Apr 2019 15:07:12 +0100
Subject: Re: [dpdk-dev] [dpdk-techboard] DPDK ABI/API Stability
Message-ID: <455a61b4-891d-eaaf-d784-2be884bcacbd@intel.com>
References: <94df3cc4-de54-72d6-84c6-81bebd209a81@intel.com> <20190404105447.GA1351@bricha3-MOBL.ger.corp.intel.com>

On 04-Apr-19 1:52 PM, Ray Kinsella wrote:
>
>
> On 04/04/2019 11:54, Bruce Richardson wrote:
>> On Thu,
>> Apr 04, 2019 at 10:29:19AM +0100, Burakov, Anatoly wrote:
>>> On 03-Apr-19 4:42 PM, Ray Kinsella wrote:
>>>> Hi folks,
>>>>
> [SNIP]
>>>
>>> Hi Ray,
>>>
>>> My somewhat rambly 2 cents :)
>>>
>>> While i think some solution has to be found for the situation, we also have
>>> to balance this against speed of development and new features rollout.
>>>
>>> For example, let's consider what i am intimately familiar with - the memory
>>> rework. I have made enormous efforts to ensure that pre-18.05 and post-18.05
>>> remain as ABI/API compatible as possible, but there were a couple of API
>>> calls that were removed, and there couldn't have been any replacements
>>> (these APIs were exposing internal structures that shouldn't have been
>>> exposed in the first place), and 18.05 also broke the ABI compatibility,
>>> because there was no way to do it without it (shared internal structures
>>> needed to change in part to support multiprocess).
>>>
>>> So, if i understand your proposal correctly, assuming a 2-year waiting
>>> period for the deprecation of core APIs, you would essentially still be
>>> waiting for the memory rework to land for a year more. Moreover, even
>>> *after* it has landed, there was a continuous stream of improvements and
>>> bugfixes, some of which have broken ABI compatibility as well. Some of them
>>> were my fault (as in, i could've foreseen the need for those changes, but
>>> didn't), but others came as a result of people using these new features in
>>> the wild and reporting issues/problems/suggestions - i am but one man, after
>>> all. Plus, you know, there's only 24 hours in a day, and some stuff takes
>>> time to implement :)
>>>
>>> Since this rework goes right at the heart of DPDK (arguably there isn't a
>>> more "core" API than memory!), there is no (sane) way in the universe to 1)
>>> keep backwards compatibility for this, or 2) keep two parallel versions of
>>> it.
>>> We also need to test all that, and, to be honest, one validation cycle
>>> for a release wouldn't be enough to figure out all of the kinks and
>>> implications of such a case. It was really great that memory rework has
>>> landed in 18.05 and we had time to improve and prepare it for an 18.11 LTS -
>>> i think everyone can say that it's in much better shape in 18.11 than it was
>>> in 18.05, but if we couldn't do an ABI break here or there, this rate of
>>> improvements would have slowed down significantly.
>>>
>>> Now, i understand that this is probably a highly exceptional case, but i'm
>>> sure that maintainers of other parts of DPDK will have their own examples of
>>> similar things happening.
>>>
>>> I have no idea what a proper solution would look like. Any "splitting" of
>>> the trees into "experimental" vs. "stable" will end up causing the same
>>> issue - people choose to use stable over experimental because, well, it's
>>> more stable, and new/experimental features don't get tested as much because
>>> no one runs the thing in the first place.
>>>
>>> TL;DR we have to be careful not to constrain the pace of
>>> development/bugfixing just for the sake of having a stable API/ABI :)
>>>
>>
>> Actually, I think we *do* need to constrain the pace of development for the
>> sake of ABI stability. At this stage DPDK has been around for quite a
>> number of years and so should be considered a fairly mature project - it
>> should just start acting like it.
>
> I 100% agree.
>
> If you break your users' stuff regularly enough, they will eventually
> start looking around for an alternative that doesn't break their stuff
> quite so regularly.
>
> We often use the pace of innovation in DPDK as justification for ABI/API
> breakages, but that approach is a real rarity among the Open Source
> community. I can't think of any mature project off-hand that shares it.
>
> I would ask: is Linux any less innovative because they offer a stable API
> and have an absolute commitment to never breaking userspace? Would Linux
> have ever been as popular as it is today if they broke userspace every
> quarter?
>
> The reality is that they (Linux) find workarounds and compromise
> because there is an uber-maintainer, Linus, who had a strong ethos from
> the start not to break their users' stuff - we need the same ethos in DPDK.
>
>>
>> Now, in terms of features like the memory rework, that is indeed a case
>> that there was no alternative other than a massive ABI break. However, for
>> that rework there was a strong need for improvement in that area that we
>> can make the case for an ABI break to support it - and it is of a scale
>> that nothing other than an ABI change would do. For other areas and
>> examples, I doubt there are many in the last couple of years that are of
>> that scale.
>
> I would also be inclined to agree with Bruce's point that the memory rework
> was somewhat of an outlier; we don't see many like it.
>> My thoughts on the matter are:
>> 1. I think we really need to do work to start hiding more of our data
>> structures - like what Stephen's latest RFC does. This hiding should reduce
>> the scope for ABI breaks.
>> 2. Once done, I think we should commit to having an ABI break only in the
>> rarest of circumstances, and only with very large justification. I want us
>> to get to the point where DPDK releases can immediately be picked up by all
>> Linux distros and rolled out because they are ABI compatible.
>
> The work that Anatoly describes removing APIs that exposed internal
> structures and Stephen H's RFC similarly are good examples of the kind
> of work required to prepare for this change. We need to take a good look
> at the API and reduce the number of unnecessary internal structures
> exposed.
>
> I never expected it to be a big bang - but it is a definite
> direction we need to move towards over the next few releases.

...in this case, we have to think long and hard about the fabled EAL
rework/split, and in general *specifying* what it is that we want to
support, and the use cases that we want to target. Right now there is a
huge mountain of technical debt and kludges and workarounds that has
accumulated over the years, and it exists precisely because "every
change breaks someone's workflow".

For example, just in the memory subsystem alone, we have legacy mem,
because some use cases require huge amounts of contiguous memory, and
not everyone is using VFIO; there are all of the 32-bit related
workarounds and hacks; there's the single-file-segments stuff that could
have been the default if not for the fact that we support kernels that
don't support fallocate(); there are two different ways of doing
in-memory mode, because not all kernels support memfd's; there is a
gargantuan pile of workarounds (and "known issues", and just code in
general) all over the DPDK codebase just to support our multiprocess
model and all of the various warts that come with it.

In fact, i would even go as far as to say that *most* of EAL ABI breaks
have been due to the fact that we store data in shared memory because of
multiprocess - so there is simply no way we can change these internal
data structures without ABI breaks, because even if they're not exposed
through user-facing API, they are still exposed by virtue of secondary
processes basically having an ABI contract with primary process
instances.

So, if we are to cement our core API - we have to make a concrete effort
to specify what goes and what stays, if we want it to be maintainable.
The DPDK 1.0 specification, if you will :)

--
Thanks,
Anatoly