From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wm0-f67.google.com (mail-wm0-f67.google.com [74.125.82.67]) by dpdk.org (Postfix) with ESMTP id 4C0C3322C for ; Tue, 23 Jan 2018 13:56:51 +0100 (CET) Received: by mail-wm0-f67.google.com with SMTP id t74so1663841wme.3 for ; Tue, 23 Jan 2018 04:56:51 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=6wind-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:content-transfer-encoding:in-reply-to :user-agent; bh=Xr9airnEo/RTuWjVyi8pJALuiQIoDk9sI05uyz6pfvU=; b=TFJXk/lcmhwfOdUFjEZwO7ul6zAJuUgHUkOiKOfGjfWqpw7NDDF0dofX9X1vlJfnWb m0f8nyF3fq2C5g6NjdqXzF3Xb003mGhL2/fNRSDR5aegCFl302mFPxt2DO4O6gj6lKbW HS6WihFfRbdoNzAZiR7FHJscExmQP6T/lhJIZmYOSANsCp2mhAdo/eMEEpUDtnOfoFJj r3etPGifppk9uXfrWqc/hXvIW8jF9XmZGr1STN+Jz0ZeDpl1DOoykAd0taSfL/VsG5sd XG076jIheGTltGnBM3Ne883wDRVzP4N/3uJsLFhZKsHrP1GrSqvV4pVISJ0b19EVeuYm kLiw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:content-transfer-encoding :in-reply-to:user-agent; bh=Xr9airnEo/RTuWjVyi8pJALuiQIoDk9sI05uyz6pfvU=; b=ICjqj69N7JLF1zl3RA/FNGinqBic2DHOEVOSeI83CsuFZxmPGaJ04NkP8rFW34Pntm k8M9KBx+jvqGREAIxG58fehvMTt63xhE1cjMapbnI5SLWetgscqshq0Ou2nqdiLSgeqv +jrBg/plsSVUI4ek03RYPl0307Z0CUDhSsjSIITqOrPnRkoqjJgSZT2ECnkHm/7Xe+6F U2b+VB0Aai166S9sUy/9jnMYv10GaFbi/P7Z17Gef9x5vcrJIWiyr9HVRPNSJ7I/i5XJ jw+SDvfpYwcDrtBxWiHTr5J1peHvpKlXEMDx/qRQzLaFcFlNvUectySj1Wp2kPMiAaGv bjpQ== X-Gm-Message-State: AKwxytdKeKfa0AgnCTb44k6Y79GjRiiw9IXA9y2Ji/SNkQq1itmfUhTA iWBfDo6I0LdtZLYfKt9RROTqXw== X-Google-Smtp-Source: AH8x227ltYecxwsNJWDLa04EdhyBETGC5jEX5jDFI5U7WxjT4cwsGcGSEFwUn9azwyG8JdcqwRSP2g== X-Received: by 10.28.142.74 with SMTP id q71mr1893159wmd.125.1516712211333; Tue, 23 Jan 2018 04:56:51 -0800 (PST) Received: from bidouze.vm.6wind.com (host.78.145.23.62.rev.coltfrance.com. [62.23.145.78]) by smtp.gmail.com with ESMTPSA id 59sm418853wrs.85.2018.01.23.04.56.49 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Tue, 23 Jan 2018 04:56:49 -0800 (PST) Date: Tue, 23 Jan 2018 13:56:37 +0100 From: =?iso-8859-1?Q?Ga=EBtan?= Rivet To: Matan Azrad Cc: "Ananyev, Konstantin" , Thomas Monjalon , "Wu, Jingjing" , "dev@dpdk.org" , Neil Horman , "Richardson, Bruce" Message-ID: <20180123125637.p2kufd6n2erpiar5@bidouze.vm.6wind.com> References: <1516293317-30748-8-git-send-email-matan@mellanox.com> <2601191342CEEE43887BDE71AB97725886280A68@irsmsx105.ger.corp.intel.com> <2601191342CEEE43887BDE71AB97725886280AE8@irsmsx105.ger.corp.intel.com> <20180119150017.mljpcdmldqx32mkq@bidouze.vm.6wind.com> <2601191342CEEE43887BDE71AB97725886281B1D@irsmsx105.ger.corp.intel.com> <2601191342CEEE43887BDE71AB97725886281E73@irsmsx105.ger.corp.intel.com> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: User-Agent: NeoMutt/20170113 (1.7.2) Subject: Re: [dpdk-dev] [PATCH v3 7/7] app/testpmd: adjust ethdev port ownership X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 23 Jan 2018 12:56:52 -0000 Hi all, On Tue, Jan 23, 2018 at 08:54:27AM +0000, Matan Azrad wrote: > > > > > > > > > > Subject: [PATCH v3 7/7] app/testpmd: adjust ethdev port > > > > > > > > > > ownership > > > > > > > > > > > > > > > > > > > > Testpmd should not use ethdev ports which are managed by > > > > > > > > > > other DPDK entities. > > > > > > > > > > > > > > > > > > > > Set Testpmd ownership to each port which is not used by > > > > > > > > > > other entity and prevent any usage of ethdev ports which > > > > > > > > > > are not owned by > > > > > > > Testpmd. > > > > > > > > > > > > > > > > > > > > Signed-off-by: Matan Azrad > > > > > > > > > > --- > > > > > > > > > > app/test-pmd/cmdline.c | 89 +++++++++++++++++++------ > > ---- > > > > ------- > > > > > > > ---- > > > > > > > > > ----- > > > > > > > > > > app/test-pmd/cmdline_flow.c | 2 +- > > > > > > > > > > app/test-pmd/config.c | 37 ++++++++++--------- > > > > > > > > > > app/test-pmd/parameters.c | 4 +- > > > > > > > > > > app/test-pmd/testpmd.c | 63 ++++++++++++++++++++---- > > ---- > > > > ---- > > > > > > > > > > app/test-pmd/testpmd.h | 3 ++ > > > > > > > > > > 6 files changed, 103 insertions(+), 95 deletions(-) > > > > > > > > > > > > > > > > > > > > diff --git a/app/test-pmd/cmdline.c > > > > > > > > > > b/app/test-pmd/cmdline.c index > > > > > > > > > > 31919ba..6199c64 100644 > > > > > > > > > > --- a/app/test-pmd/cmdline.c > > > > > > > > > > +++ b/app/test-pmd/cmdline.c > > > > > > > > > > @@ -1394,7 +1394,7 @@ struct cmd_config_speed_all { > > > > > > > > > > &link_speed) < 0) > > > > > > > > > > return; > > > > > > > > > > > > > > > > > > > > - RTE_ETH_FOREACH_DEV(pid) { > > > > > > > > > > + RTE_ETH_FOREACH_DEV_OWNED_BY(pid, > > my_owner.id) { > > > > > > > > > > > > > > > > > > Why do we need all these changes? > > > > > > > > > As I understand you changed definition of > > > > > > > > > RTE_ETH_FOREACH_DEV(), so no testpmd should work ok > > > > > > > > > default > > > > (no_owner case). > > > > > > > > > Am I missing something here? > > > > > > > > > > > > > > > > Now, After Gaetan suggestion RTE_ETH_FOREACH_DEV(pid) will > > > > > > > > iterate > > > > > > > over all valid and ownerless ports. > > > > > > > > > > > > > > Yes. > > > > > > > > > > > > > > > Here Testpmd wants to iterate over its owned ports. > > > > > > > > > > > > > > Why? Why it can't just iterate over all valid and ownerless ports? > > > > > > > As I understand it would be enough to fix current problems and > > > > > > > would allow us to avoid any changes in testmpd (which I think > > > > > > > is a good > > > > thing). > > > > > > > > > > > > Yes, I understand that this big change is very daunted, But I > > > > > > think the current a lot of bugs in testpmd(regarding port > > > > > > ownership) even more > > > > > daunted. > > > > > > > > > > > > Look, > > > > > > Testpmd initiates some of its internal databases depends on > > > > > > specific port iteration, In some time someone may take ownership > > > > > > of Testpmd ports and testpmd will continue to touch them. > > > > > > > > But if someone will take the ownership (assign new owner_id) that > > > > port will not appear in RTE_ETH_FOREACH_DEV() any more. > > > > > > > > > > Yes, but testpmd sometimes depends on previous iteration using internal database. > > > So it uses internal database that was updated by old iteration. > > > > That sounds like just a bug in testpmd that need to be fixed, no? > > If Testpmd already took ownership for these ports(like I did), it is ok. > Have you tested using the default iterator (NO_OWNER)? It worked until now with the bare minimal device tagging using DEV_DEFERRED. Testpmd did not seem to mind having to skip this port. I'm sure there were places where this was overlooked, but overall, I'd think everything should be fixable using only the NO_OWNER iteration. Can you point to a specific scenario (command line, chain of event) that would lead to a problem? > > Any particular places where outdated device info is used? > > For example, look for the stream management in testpmd(I think I saw it there). > The stream management is certainly shaky, but it happens after the EAL initial port creation, and is not able to update itself for new hotplugged ports (unless something changed). > > > > > If I look back on the fail-safe, its sole purpose is to have > > > > > seamless hotplug with existing applications. > > > > > > > > > > Port ownership is a genericization of some functions introduced by > > > > > the fail-safe, that could structure DPDK further. It should allow > > > > > applications to have a seamless integration with subsystems using > > > > > port ownership. Without this, port ownership cannot be used. > > > > > > > > > > Testpmd should be fixed, but follow the most common design > > > > > patterns of DPDK applications. Going with port ownership seems > > > > > like a paradigm shift. > > > > > > > > > > > In addition > > > > > > Using the old iterator in some places in testpmd will cause a > > > > > > race for run- > > > > time new ports(can be created by failsafe or any hotplug code): > > > > > > - testpmd finds an ownerless port(just now created) by the old > > > > > > iterator and start traffic there, How does testpmd start traffic there? Testpmd has only a callback for displaying that it received an event for a new port. It has no concept of hotplugging beyond that. Testpmd will not start using any new port probed using the hotplug API on its own, again, unless something has drastically changed. > > > > > > - failsafe takes ownership of this new port and start traffic there. > > > > > > Problem! > > > > > > > > Could you shed a bit more light here - it would be race condition > > > > between whom and whom? > > > > > > Sure. > > > > > > > As I remember in testpmd all control ops are done within one thread > > > > (main lcore). > > > > > > But other dpdk entity can use another thread, for example: > > > Failsafe uses the host thread(using alarm callback) to create a new port and > > to take ownership of a port. > > > > Hm, and you create new ports inside failsafe PMD, right and then set new > > owner_id for it? > > Yes. > > > And all this in alarm in interrupt thread? > > Yes. > > > If so I wonder how you can guarantee that no-one else will set different > > owner_id between > > rte_eth_dev_allocate() and rte_eth_dev_owner_set()? > > I check it (see failsafe patch to this series - V5). > Function: fs_bus_init. > > > Could you point me to that place (I am not really familiar with familiar with > > failsafe code)? > > > > > > > > The race: > > > Testpmd iterates over all ports by the master thread. > > > Failsafe takes ownership of a port by the host thread and start using it. > > > => The two dpdk entities may use the device at same time! > > When can this happen? Fail-safe creates its initial pool of ports during EAL init, before testpmd scans eth_dev ports and configure its streams. At that point, it has taken ownership, from the master lcore context. After this point, new ports could be detected and hotplugged by fail-safe. However, even if testpmd had a callback to capture those new ports and reconfigure its streams, it would be executed from within the intr-thread, same as failsafe. If the thread was interrupted, by a dataplane-lcore for example, streams would not have been reconfigured. The fail-safe would execute its callback and set the owner-id before the callback chains goes to the application. And that would only be if testpmd had any callback for hotplugging ports and reconfiguring its streams, which it hasn't, as far as I know. > > Ok, if failsafe really assigns its owner_id(s) to ports that are already in use by > > the app, then how such scheme supposed to work at all? > > If the app works well (with the new rules) it already took ownership and failsafe will see it and will wait until the application release it. > Every dpdk entity should know which port it wants to manage, > If 2 entities want to manage the same device - it can be ok and port ownership can synchronize the usage. > > Probably, application which will run fail-safe wants to manage only the fail-safe port and therefor to take ownership only for it. > > > I.E. application has a port - it assigns some owner_id != 0 to it, then PMD tries > > to set its owner_id tot the same port. > > Obviously failsafe's set_owner() will always fail in such case. > > > Yes, and will try again after some time. > > > From what I hear we need to introduce a concept of 'default owner id'. > > I.E. when failsafe PMD is created - user assigns some owner_id to it (default). > > Then failsafe PMD generates it's own owner_id and assigns it only to the > > ports whose current owner_id is equal either 0 or 'default' owner_id. > > > > It is a suggestion and we need to think about it more (I'm talking about it with Gaetan in another thread). > Actually I think, if we want a generic solution to the generic problem the current solution is ok. > We could as well conclude this other thread there. The only solution would be to have a default relationship between owners, something that goes beyond the scope assigned by Thomas to your evolution, but would be necessary for this API to be properly used by existing applications. I think it's the only way to have a sane default behavior with your API, but I also think this goes beyong the scope of the DPDK altogether. But even with those considerations that could be ironed out later (API is still experimental anyway), in the meantime, I think we should strive not to break "userland" as much as possible. Meaning that unless you have a specific situation creating a bug, you shouldn't have to modify testpmd, and if an issues arises, you need to try to improve your API before resorting to changing the resource management model of all existing applications. > > > > > > Obeying the new ownership rules can prevent all these races. > > > > > > > When we discussed RFC of owner_id patch, I thought we all agreed that > > owner_id API shouldn't be mandatory - i.e. existing apps not required to > > change to work normally with that. > > Yes, it is not mandatory if app doesn't use hotplug. > > I think with hotplug it is mandatory in the most cases. > > And it can ease the secondary process model too. > > Again, in the generic ownership problem as discussed in RFC: > Every entity, include app, should know which ports it wants to manage and to take ownership only for them. > > > Though right now it seems that application changes seems necessary, at least > > to work ok with failsafe PMD. > > And for solving the generic problem of ownership.(will defend from future issues by sure). > > > Which makes we wonder was it some sort of misunderstanding or we did we > > do something wrong here? > > Mistakes can be done all the time, but I think we are all understand the big issue of ownership and how the current solution solves it. > fail-safe it is just a current example for the problems which are possible because of the generic ownership issue. > Regards, -- Gaëtan Rivet 6WIND