From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id 1D1E0A04A5; Tue, 16 Jun 2020 19:09:44 +0200 (CEST) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 7CECF1BF97; Tue, 16 Jun 2020 19:09:43 +0200 (CEST) Received: from mail-io1-f65.google.com (mail-io1-f65.google.com [209.85.166.65]) by dpdk.org (Postfix) with ESMTP id C60AD1BF95 for ; Tue, 16 Jun 2020 19:09:42 +0200 (CEST) Received: by mail-io1-f65.google.com with SMTP id w18so2689157iom.5 for ; Tue, 16 Jun 2020 10:09:42 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=i9L99CkbwHTM0uPeqQyxdLyJD04biMrzhylkdVE/kjw=; b=F0asregi7LID3kMf+EckXKHEG86dVfgzV9z/jlCsKn9w//kgrDHH74uC9aH6vJwF9/ fj3+Fo+9AMt0ctsiZcj4VnOAH+uYBkR9pB/ReEsyBJnOdpUkgVhgz9jTNnFfrbMYZQJX T4HjDciZzLktFKGuaSevTUNcOwYHQn+xoXoy73Vmvc3DeCKQf5SMJXjXo7KLBmC5jP8r iu0goTqrDnOH7TiEPMA6eosrllYUn8qSGppLdSaMvNMpMOQpE9VW2NpsZkDkG5giP6+8 KAZJori0OVTdhssmig8qQqk4ZfswRzZM+jRvmQ1Gogv+GGU2lS+2307nw0hI0BbHy+1m lJAg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=i9L99CkbwHTM0uPeqQyxdLyJD04biMrzhylkdVE/kjw=; b=ASOgdARFo4K491aAm7RjdQVY/5/8mwjE+i0u3MfJo5qYbRpSMUplE3AXnn9kPgZ/a1 E6jk8oYhZqHAtabkw8JdxxT94KhWOET32JW5K19HQTOk8S8jNHRa55iLlo90J+wq7JPs XDB3ADLFUJjCH5smZRZpXjO+Rg4DkL/w9xx6oW1zFZ6emJxUP1LRBgiBDN4qIve7033v RLlvDoI52zcEb4ZL14JHK/Q0V2wcaR6LRX/lMWXYsGWItVkMMJ4zIMMequL9nynExPDm DR+F/oYQ6WGgqUNKw8S0BjTnVmBpl4sYSDiiY7r5na3iix6GkBpbgvFF7BodHdWd8PJQ cuYg== X-Gm-Message-State: AOAM5323REbN2EQxXPD80OD3I+4Po8KAVKCAygwzt32JF0Ubo9cUjuL/ LZ3mfObZ5EYqbYSzaZx/qlUNX+Ye3KoA7AdwgS0= X-Google-Smtp-Source: ABdhPJzhpD5IIUWkgKclg+LhcrLBEJIPPmMA3LlCEZAxKdjCmD6M7NkpGNfebkrXWMuJUy+k2tKr6Zqc2ajJoTEwh1E= X-Received: by 2002:a05:6602:2001:: with SMTP id y1mr3974505iod.94.1592327381843; Tue, 16 Jun 2020 10:09:41 -0700 (PDT) MIME-Version: 1.0 References: <20200529131955.GA83122@outlook.office365.com> <660267ae-00a0-bb55-fbc3-9ac1473957ea@intel.com> <20200530100228.GA24648@outlook.office365.com> <20200602102259.GA54715@outlook.office365.com> <20200602121654.GA106816@outlook.office365.com> <0dff9790-c57f-7912-8ae3-7ee67f846d3d@intel.com> <93726fd8-b6ac-efa8-695e-214d24dfc561@intel.com> In-Reply-To: From: Jerin Jacob Date: Tue, 16 Jun 2020 22:39:25 +0530 Message-ID: To: "Burakov, Anatoly" Cc: Harman Kalra , dpdk-dev , David Hunt , Liang Ma , Reshma Pattan Content-Type: text/plain; charset="UTF-8" Subject: Re: [dpdk-dev] [EXT] Re: [PATCH 3/3] l3fwd-power: add interrupt-only mode X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" On Tue, Jun 16, 2020 at 3:01 PM Burakov, Anatoly wrote: > > On 15-Jun-20 5:29 PM, Jerin Jacob wrote: > > On Mon, Jun 15, 2020 at 9:15 PM Burakov, Anatoly > > wrote: > >> > >> On 15-Jun-20 4:21 PM, Jerin Jacob wrote: > >>> On Mon, Jun 15, 2020 at 8:35 PM Burakov, Anatoly > >>> wrote: > >>>> > >>>> On 15-Jun-20 12:43 PM, Jerin Jacob wrote: > >>>>> On Mon, Jun 15, 2020 at 5:02 PM Burakov, Anatoly > >>>>> wrote: > >>>>>> > >>>>>> On 02-Jun-20 1:16 PM, Harman Kalra wrote: > >>>>>>> On Tue, Jun 02, 2020 at 03:53:07PM +0530, Harman Kalra wrote: > >>>>>>>> On Mon, Jun 01, 2020 at 01:50:26PM +0100, Burakov, Anatoly wrote: > >>>>>>>>> On 30-May-20 11:02 AM, Harman Kalra wrote: > >>>>>>>>>> On Fri, May 29, 2020 at 03:19:45PM +0100, Burakov, Anatoly wrote: > >>>>>>>>>>> External Email > >>>>>>>>>>> > >>>>>>>>>>> ---------------------------------------------------------------------- > >>>>>>>>>>> On 29-May-20 2:19 PM, Harman Kalra wrote: > >>>>>>>>>>> > >>>>>>>>>>>>> if (ret < 0) > >>>>>>>>>>>>> rte_exit(EXIT_FAILURE, "Invalid L3FWD parameters\n"); > >>>>>>>>>>>>> - if (app_mode != APP_MODE_TELEMETRY && init_power_library()) > >>>>>>>>>>>>> + if (app_mode == APP_MODE_DEFAULT) > >>>>>>>>>>>>> + app_mode = APP_MODE_LEGACY; > >>>>>>>>>>>>> + > >>>>>>>>>>>>> + /* only legacy and empty poll mode rely on power library */ > >>>>>>>>>>>>> + if ((app_mode == APP_MODE_LEGACY || app_mode == APP_MODE_EMPTY_POLL) && > >>>>>>>>>>>>> + init_power_library()) > >>>>>>>>>>>>> rte_exit(EXIT_FAILURE, "init_power_library failed\n"); > >>>>>>>>>>>> Hi, > >>>>>>>>>>>> > >>>>>>>>>>>> Rather than just exiting from here can we have a else condition to > >>>>>>>>>>>> automatically enter into the "interrupt only" mode. > >>>>>>>>>>>> Please correct me if I am missing something. > >>>>>>>>>>> > >>>>>>>>>>> Hi, > >>>>>>>>>>> > >>>>>>>>>>> Thanks for your review. I don't think silently proceeding is a good idea. If > >>>>>>>>>>> the user wants interrupt-only mode, they should request it. Silently falling > >>>>>>>>>>> back to interrupt-only mode will create an illusion of successful > >>>>>>>>>>> initialization and set the wrong expectation for how the application will > >>>>>>>>>>> behave. > >>>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> Hi, > >>>>>>>>>> > >>>>>>>>>> Thanks for the explanation which even I also believe is logically perfect. > >>>>>>>>>> > >>>>>>>>>> But since l3fwd-power is an old application and has many users around > >>>>>>>>>> which are currently using this app in interrupt only mode without giving > >>>>>>>>>> an extra argument. But suddenly they will start getting failure messages with > >>>>>>>>>> the new patchset. > >>>>>>>>>> > >>>>>>>>>> My only intent with else condition was backward compatibility. > >>>>>>>>>> Or may be we can have more descriptive failure message, something like > >>>>>>>>>> "init_power_library failed, check manual for other modes". > >>>>>>>>>> > >>>>>>>>>> Thanks > >>>>>>>>>> Harman > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>> > >>>>>>>>> I think we can compormise on an informative log message suggesting to use > >>>>>>>>> interrupt mode. I'm not keen on reverting to previous buggy behavior :) > >>>>>>>>> > >>>>>>>> Hi > >>>>>>>> > >>>>>>>> I am not insisting to revert to previous behavior, I am just trying to > >>>>>>>> highlight some probable issues that many users might face as its an old > >>>>>>>> application. > >>>>>>>> Since many arm based soc might not be supporting frequency scaling, can > >>>>>>>> we add the following check as soon as the application starts, probe the > >>>>>>>> platform if it supports frequency scaling, if not automatically set the > >>>>>>>> mode to interrupt mode, something like: > >>>>>>>> if (access("/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor", > >>>>>>>> F_OK)) > >>>>>>>> app_mode = APP_MODE_INTERRUPT; > >>>>>>> > >>>>>>> Sorry, no direct check in application but we can introduce a new API in > >>>>>>> power library: > >>>>>>> bool rte_is_freq_scaling() { > >>>>>>> return access("/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor", > >>>>>>> F_OK); > >>>>>>> } > >>>>>>> > >>>>>>> and in the application we can use "rte_is_freq_scaling()" at the start. > >>>>>>> > >>>>>> > >>>>>> What you're suggesting here is effectively what you have already > >>>>>> suggested: silently fall back to interrupt-only mode if power lib init > >>>>>> failed. I already outlined why i don't think it's a good approach. > >>>>> > >>>>> Is probing "/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor" > >>>>> file presence, > >>>>> detects the power lib availability . Right? Not the failure. Right? > >>>>> IMO, it make sense to have following case: > >>>>> # first check, Is the system is capable of power lib if so use power lib > >>>>> # if the system is not capable of using power lib use interrupt mode. > >>>>> > >>>>> I think, there is difference between system capable of using power lib > >>>>> vs power lib not available or power lib failure. > >>>> > >>>> I am of the opinion that if a test sets up an unrealistic expectation of > >>>> how an application should behave, it's a problem with the test, not with > >>>> the application. > >>>> > >>>> If the system is not capable of running with power lib - the application > >>>> shouldn't be requested to run in such mode. > >>> > >>> But with this patch, default proposed mode is power lib without any > >>> explicit request. > >> > >> The default has always been "use the power library". That was always the > >> expected behavior, i believe since the very first version of this > >> application. In other words, even if the "requested" behavior is not > >> requested explicitly, it has always been that implicitly, and this patch > >> is *keeping existing behavior*. > > > > See below, > >> > >>> > >>>> > >>>> "The application behaved that way before" - yes, it did. It was a bug in > >>>> the application, that it allowed users to effectively misuse the > >>>> application and use it despite the fact that it was in a half-working > >>>> state. This problem has been addressed by 1) not allowing the > >>>> application to run in half-working state, and 2) adding a new mode where > >>>> the old "expected" behavior is *actually* expected and is "full working > >>>> state" now. > >>>> > >>>> Therefore, all users who were previously misusing the application to do > >>>> something it was not designed to do because of a bug in the > >>>> implementation, should now fix their usage and use the correct mode - > >>>> and such breakage is IMO necessary to call attention to earlier misuse > >>>> in the tools, and to correct this usage. > >>>> > >>>> What bothers me about your suggestion is that it is impossible to fail > >>>> the test if the wrong mode was requested (as in, if we request the > >>>> power-lib mode on a system that doesn't have freq scaling) - it instead > >>>> silently falls back to a mode that is almost guaranteed to work. > >>> > >>> I agree that it should fail, i.e someone explicitly request, > >>> power-lib mode or any mode > >>> and it should fail application if the platform we can not do that. > >>> > >>> My suggestion is all about, what is the default, IMO, if no argument > >>> is specified, > >>> the application should _probe_ the capability of the platfrom and > >>> choose the mode. One can override > >>> the probe to select the desired one. In such mode, fail to configure > >>> the mode should result in > >>> an error. > >> > >> This would change the default behavior that has always existed with this > >> application, and would still be subject to silent failure issue > >> *because* older tests may not account for this implied assumption of > >> "the application will run no matter what", leading to a possible false > >> positive test result. > >> > >> Now, if the default was "not to run and ask explicitly what mode should > >> the user use" - i can see how we could both agree on that. It's not > >> unprecedented - l3fwd itself won't run unless you explicitly specify > >> core config, so we *could* request additional parameters by default. I > >> would've also agreed with the "probe" thing *if it was a new application > >> without any pre-existing usages* - but it isn't, and in this case IMO > >> this is doubly important because there may be tests out there that *rely > >> on a bug* and thus should be explicitly broken and addressed (like the > >> internal test we had that uncovered the issue in the first place). > >> > >> In other words, in the perfect world, i would agree with you :) As it > >> stands though, i think that intentionally breaking tests that are > >> themselves broken and rely on wrong assumptions is more important than > >> keeping them working. > > > > OK. Let's enumerate the pros and cons of both approaches? > > Approach a: Auto probe > > Approach b: Current patch > > > > Approach a: > > + Application picks up the mode based on platform capability > > + No change in behavior wrt existing l3fwd-power where the platform > > has only interrupt support. > > (otherwise, It will fail to boot up, the CI etc we need to patch based > > on the DPDK version) > > > > I am not sure approach b has pros wrt approach a. > > The power library has other ways of freq scaling, not just through the > pstate/ACPI. Now, granted, the third way (KVM) isn't supposed to work on > l3fwd-power, and is in fact explicitly stopped from working in this > patch, but still - relying on the scaling governor alone is not good enough. > > Perhaps we should add a "check available" API that would probe each of > the supported modes and return 1 for (potentially) supported and 0 for > (definitely) unsupported? +1 for adding "check available" API for the specific mode. > As far as i know, all three are reliant on > file access at the end of the day, so we can check if they exist before > trying to access them. That would effectively be "autodetect" without > being too specific to ACPI/pstate scaling checks. +1 > > > > > I.e On the x86 platform where freq scaling is present then SHOULD NOT > > have any difference in the approach a vs > > approach b. ie. Auto probe finds the system is capable of freq scaling > > and picks the powerlib. its is win-win case, > > I am not sure, What I am missing? > > > > > > I was worried that an x86 VM would still have this file without being > capable of frequency scaling, but i just checked and it seems that VM's > indeed don't expose the cpufreq filesystem. Looks good then!! > > -- > Thanks, > Anatoly