From: Matan Azrad
To: Gaëtan Rivet
CC: "Ananyev, Konstantin", Thomas Monjalon, "Wu, Jingjing", "dev@dpdk.org", Neil Horman, "Richardson, Bruce"
Date: Tue, 23 Jan 2018 14:30:22 +0000
Subject: Re: [dpdk-dev] [PATCH v3 7/7] app/testpmd: adjust ethdev port ownership
In-Reply-To: <20180123125637.p2kufd6n2erpiar5@bidouze.vm.6wind.com>
References: <1516293317-30748-8-git-send-email-matan@mellanox.com>
 <2601191342CEEE43887BDE71AB97725886280A68@irsmsx105.ger.corp.intel.com>
 <2601191342CEEE43887BDE71AB97725886280AE8@irsmsx105.ger.corp.intel.com>
 <20180119150017.mljpcdmldqx32mkq@bidouze.vm.6wind.com>
 <2601191342CEEE43887BDE71AB97725886281B1D@irsmsx105.ger.corp.intel.com>
 <2601191342CEEE43887BDE71AB97725886281E73@irsmsx105.ger.corp.intel.com>
 <20180123125637.p2kufd6n2erpiar5@bidouze.vm.6wind.com>
List-Id: DPDK patches and discussions

Hi

From: Gaëtan Rivet [mailto:gaetan.rivet@6wind.com]
> > > > > > Look,
> > > > > > > Testpmd initiates some of its internal databases depends on
> > > > > > > specific port iteration, In some time someone may take
> > > > > > > ownership of Testpmd ports and testpmd will continue to touch them.
> > > > >
> > > > > But if someone will take the ownership (assign new owner_id)
> > > > > that port will not appear in RTE_ETH_FOREACH_DEV() any more.
> > > > >
> > > >
> > > > Yes, but testpmd sometimes depends on previous iteration using internal database.
> > > > So it uses internal database that was updated by old iteration.
> > > >
> > > That sounds like just a bug in testpmd that need to be fixed, no?
> > >
> > If Testpmd already took ownership for these ports (like I did), it is ok.
> >
>
> Have you tested using the default iterator (NO_OWNER)?
> It worked until now with the bare minimal device tagging using
> DEV_DEFERRED. Testpmd did not seem to mind having to skip this port.
>
> I'm sure there were places where this was overlooked, but overall, I'd think
> everything should be fixable using only the NO_OWNER iteration.

I don't think so.

> Can you point to a specific scenario (command line, chain of event) that
> would lead to a problem?
>

I didn't construct a race test to catch a testpmd issue, but I think that without this patch there are a lot of issues.
Go to the testpmd code (before ownership) and find the usages of the old iterator (after the first iteration in main).
Ask yourself what should happen if, exactly at that time, a new port is created by fail-safe (plug-in event).

> > > Any particular places where outdated device info is used?
> >
> > For example, look for the stream management in testpmd (I think I saw it there).
> >
>
> The stream management is certainly shaky, but it happens after the EAL
> initial port creation, and is not able to update itself for new hotplugged ports
> (unless something changed).
>

Yes, but conceptually someone may take the port in the future (because it is ownerless).

> > > > > > If I look back on the fail-safe, its sole purpose is to have
> > > > > > seamless hotplug with existing applications.
> > > > > >
> > > > > > Port ownership is a genericization of some functions
> > > > > > introduced by the fail-safe, that could structure DPDK
> > > > > > further. It should allow applications to have a seamless
> > > > > > integration with subsystems using port ownership. Without this,
> > > > > > port ownership cannot be used.
> > > > > >
> > > > > > Testpmd should be fixed, but follow the most common design
> > > > > > patterns of DPDK applications. Going with port ownership seems
> > > > > > like a paradigm shift.
> > > > > >
> > > > > > > In addition
> > > > > > > Using the old iterator in some places in testpmd will cause
> > > > > > > a race for run-time new ports (can be created by failsafe or any hotplug code):
> > > > > > > - testpmd finds an ownerless port (just now created) by the
> > > > > > > old iterator and start traffic there,
>
> How does testpmd start traffic there? Testpmd has only a callback for
> displaying that it received an event for a new port. It has no concept of
> hotplugging beyond that.
>

Yes, so no traffic, just some control command.

> Testpmd will not start using any new port probed using the hotplug API on its
> own, again, unless something has drastically changed.
>

Every iterator used in testpmd is exposed to a race.

> > > > > > > - failsafe takes ownership of this new port and start traffic there.
> > > > > > > Problem!
> > > > >
> > > > > Could you shed a bit more light here - it would be race
> > > > > condition between whom and whom?
> > > >
> > > > Sure.
> > > >
> > > > > As I remember in testpmd all control ops are done within one
> > > > > thread (main lcore).
> > > >
> > > > But other dpdk entity can use another thread, for example:
> > > > Failsafe uses the host thread (using alarm callback) to create a
> > > > new port and to take ownership of a port.
> > >
> > > Hm, and you create new ports inside failsafe PMD, right and then set
> > > new owner_id for it?
> >
> > Yes.
> >
> > > And all this in alarm in interrupt thread?
> >
> > Yes.
> >
> > > If so I wonder how you can guarantee that no-one else will set
> > > different owner_id between
> > > rte_eth_dev_allocate() and rte_eth_dev_owner_set()?
> >
> > I check it (see failsafe patch to this series - V5).
> > Function: fs_bus_init.
> >
> > > Could you point me to that place (I am not really familiar with
> > > failsafe code)?
> > >
> > > >
> > > > The race:
> > > > Testpmd iterates over all ports by the master thread.
> > > > Failsafe takes ownership of a port by the host thread and start using it.
> > > > => The two dpdk entities may use the device at same time!
> > >
>
> When can this happen? Fail-safe creates its initial pool of ports during EAL
> init, before testpmd scans eth_dev ports and configure its streams.
> At that point, it has taken ownership, from the master lcore context.
>
> After this point, new ports could be detected and hotplugged by fail-safe.
> However, even if testpmd had a callback to capture those new ports and
> reconfigure its streams, it would be executed from within the intr-thread,
> same as failsafe. If the thread was interrupted, by a dataplane-lcore for
> example, streams would not have been reconfigured.
> The fail-safe would execute its callback and set the owner-id before the
> callback chains goes to the application.
>

Some iterator may be invoked during the plug-out process by another thread in testpmd and cause a control command to be issued.

> And that would only be if testpmd had any callback for hotplugging ports and
> reconfiguring its streams, which it hasn't, as far as I know.
>

We don't need to implement it in testpmd.

> > > Ok, if failsafe really assigns its owner_id(s) to ports that are
> > > already in use by the app, then how such scheme supposed to work at all?
> >
> > If the app works well (with the new rules) it already took ownership and
> > failsafe will see it and will wait until the application release it.
> > Every dpdk entity should know which port it wants to manage, If 2
> > entities want to manage the same device - it can be ok and port ownership
> > can synchronize the usage.
> >
> > Probably, application which will run fail-safe wants to manage only the fail-safe
> > port and therefore to take ownership only for it.
> >
> > > I.E. application has a port - it assigns some owner_id != 0 to it,
> > > then PMD tries to set its owner_id to the same port.
> > > Obviously failsafe's set_owner() will always fail in such case.
> > >
> > Yes, and will try again after some time.
> >
> > > From what I hear we need to introduce a concept of 'default owner id'.
> > > I.E. when failsafe PMD is created - user assigns some owner_id to it (default).
> > > Then failsafe PMD generates it's own owner_id and assigns it only to
> > > the ports whose current owner_id is equal either 0 or 'default' owner_id.
> > >
> >
> > It is a suggestion and we need to think about it more (I'm talking about it
> > with Gaetan in another thread).
> > Actually I think, if we want a generic solution to the generic problem the
> > current solution is ok.
> >
>
> We could as well conclude this other thread there.
>
> The only solution would be to have a default relationship between owners,
> something that goes beyond the scope assigned by Thomas to your
> evolution, but would be necessary for this API to be properly used by
> existing applications.
>
> I think it's the only way to have a sane default behavior with your API, but I
> also think this goes beyond the scope of the DPDK altogether.
>
> But even with those considerations that could be ironed out later (API is still
> experimental anyway), in the meantime, I think we should strive not to
> break "userland" as much as possible.
> Meaning that unless you have a
> specific situation creating a bug, you shouldn't have to modify testpmd, and if
> an issue arises, you need to try to improve your API before resorting to
> changing the resource management model of all existing applications.
>

I understand it.

Suggestion:

2 system owners:
APP_OWNER - 0.
NO_OWNER - 1.
More owners are allowed beyond these, as now.

1. Every port creation sets the owner to NO_OWNER (as now).
2. All dpdk entities have the option to take ownership of NO_OWNER ports at any time (as now).
3. At some point at the end of EAL init, set all the NO_OWNER ports to APP_OWNER (for V6).
4. Change the old iterator to iterate over APP_OWNER ports (for V6).

What do you think?
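
As a rough illustration of the four steps above, here is a minimal standalone sketch in plain C. It is a toy model, not DPDK code: toy_port, owner_new(), port_create(), port_claim(), eal_init_done(), port_owner(), next_port_owned_by() and FOREACH_APP_PORT are hypothetical names made up for this example, and only the APP_OWNER = 0 / NO_OWNER = 1 numbering comes from the suggestion. It is single-threaded, so it ignores the locking a real implementation would need.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model only: none of these names exist in DPDK. */
#define MAX_PORTS 8
#define APP_OWNER 0u /* proposed: ports handed over to the application */
#define NO_OWNER  1u /* proposed: freshly created, not yet claimed */

struct toy_port {
    int      in_use;
    uint32_t owner;
};

static struct toy_port ports[MAX_PORTS];
static uint32_t next_owner_id = 2u; /* ids 0 and 1 are the system owners */

/* Hand out a fresh owner id to an entity such as fail-safe. */
uint32_t owner_new(void)
{
    return next_owner_id++;
}

/* Step 1: every new port starts as NO_OWNER. Returns the port id or -1. */
int port_create(void)
{
    for (int i = 0; i < MAX_PORTS; i++) {
        if (!ports[i].in_use) {
            ports[i].in_use = 1;
            ports[i].owner = NO_OWNER;
            return i;
        }
    }
    return -1;
}

/* Step 2: any entity may take a port, but only while it is NO_OWNER. */
int port_claim(int id, uint32_t owner)
{
    if (id < 0 || id >= MAX_PORTS || !ports[id].in_use)
        return -1;
    if (ports[id].owner != NO_OWNER)
        return -1;
    ports[id].owner = owner;
    return 0;
}

/* Step 3: at the end of init, unclaimed ports go to the application.
 * Returns how many ports were moved to APP_OWNER. */
int eal_init_done(void)
{
    int moved = 0;

    for (int i = 0; i < MAX_PORTS; i++) {
        if (ports[i].in_use && ports[i].owner == NO_OWNER) {
            ports[i].owner = APP_OWNER;
            moved++;
        }
    }
    return moved;
}

/* Current owner of an existing port. */
uint32_t port_owner(int id)
{
    assert(id >= 0 && id < MAX_PORTS && ports[id].in_use);
    return ports[id].owner;
}

/* First port at or after 'start' with the given owner, else MAX_PORTS. */
int next_port_owned_by(int start, uint32_t owner)
{
    assert(start >= 0);
    for (int i = start; i < MAX_PORTS; i++) {
        if (ports[i].in_use && ports[i].owner == owner)
            return i;
    }
    return MAX_PORTS;
}

/* Step 4: the "old" iterator walks APP_OWNER ports only. */
#define FOREACH_APP_PORT(p) \
    for ((p) = next_port_owned_by(0, APP_OWNER); (p) < MAX_PORTS; \
         (p) = next_port_owned_by((p) + 1, APP_OWNER))
```

In this model a port created after eal_init_done() stays NO_OWNER, so FOREACH_APP_PORT skips it until some entity takes it with port_claim(). An unmodified application that only uses the old iterator would therefore never touch a port hot-plugged by fail-safe.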