From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id 555CEA0350; Tue, 30 Jun 2020 16:35:35 +0200 (CEST) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 016401BFE9; Tue, 30 Jun 2020 16:35:35 +0200 (CEST) Received: from new2-smtp.messagingengine.com (new2-smtp.messagingengine.com [66.111.4.224]) by dpdk.org (Postfix) with ESMTP id 348FB1BFE4 for ; Tue, 30 Jun 2020 16:35:33 +0200 (CEST) Received: from compute7.internal (compute7.nyi.internal [10.202.2.47]) by mailnew.nyi.internal (Postfix) with ESMTP id CA07B58052A; Tue, 30 Jun 2020 10:35:30 -0400 (EDT) Received: from mailfrontend1 ([10.202.2.162]) by compute7.internal (MEProxy); Tue, 30 Jun 2020 10:35:30 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=monjalon.net; h= from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding:content-type; s=fm1; bh= WnP8XSbWlircADjefQr01gFAud4ekZGtHwmWfRx5pNM=; b=q7EMMlR2YR4c6hMg bR+ubcWS9hRVSFE5PVYSD398Z0GGcIJt4+vzogIOQ9TosA04z2vzqTis2dn+dllm raeqh7qot9vxYh+yjDj2TG2KxL6IaaBoClVYYjOX0CPY6IrflphBnjm1WWqMGHV/ KZ8BmOObesQlRxzXkCz35XIxyxMk8Xamqqy1Y+DsLfkeqqFE2jiQLdGygUgmpPuF 5FjWDhr+TjD2D6E9ZAjVtdoeZyKlRH/aU4QDLFzOm2uR1FaMvriG2wm+P8cGwZ8y 5tX/X6ZVeNrOyyVRV5Eai1fNijD34rblwSqXWkuNvpOdSbfxvwvoqqId4D0o0Yg+ On5nrw== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-transfer-encoding:content-type :date:from:in-reply-to:message-id:mime-version:references :subject:to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender :x-sasl-enc; s=fm3; bh=WnP8XSbWlircADjefQr01gFAud4ekZGtHwmWfRx5p NM=; b=X2CB+fpO8D833ZrSA5woUsVPjHlec0MPMN2+Jcn3Mw4zDbWX6esy316Qi Pp259dTgNu8AWIWisacsKH5Hfa+QixyKSXsIWovXTpI25aaZSyPUgrjgKDhttBR6 h2EdtA6/B27U+NRY1t6Pl6Y6Hudvla6a5tnbaWzxfwbRKVrp0UAPyUTACnYuu7A2 TzbWAdx0rktxCKD2vHCfIDbif85weZNa0ErtWoeU0CAz0vJqwc9EGUGdYBEa9Rnt sZytiuEyuVVpliHuBJS6rlueta47K19D/2DCzF5jlgaEXkMdEaZpdTRCFhZw9XdY agPvegVu2XlCkT7G8njbM+EFe02jQ== X-ME-Sender: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgeduiedrtddtgdeglecutefuodetggdotefrodftvf curfhrohhfihhlvgemucfhrghsthforghilhdpqfgfvfdpuffrtefokffrpgfnqfghnecu uegrihhlohhuthemuceftddtnecusecvtfgvtghiphhivghnthhsucdlqddutddtmdenuc fjughrpefhvffufffkjghfggfgtgesthfuredttddtvdenucfhrhhomhepvfhhohhmrghs ucfoohhnjhgrlhhonhcuoehthhhomhgrshesmhhonhhjrghlohhnrdhnvghtqeenucggtf frrghtthgvrhhnpeegjeffhffhgefhteevffegffetleevkefhgffhfeegvdelueevteff gfduleevhfenucffohhmrghinhepghhithhhuhgsrdgtohhmnecukfhppeejjedrudefge drvddtfedrudekgeenucevlhhushhtvghrufhiiigvpedtnecurfgrrhgrmhepmhgrihhl fhhrohhmpehthhhomhgrshesmhhonhhjrghlohhnrdhnvght X-ME-Proxy: Received: from xps.localnet (184.203.134.77.rev.sfr.net [77.134.203.184]) by mail.messagingengine.com (Postfix) with ESMTPA id 7936B3280064; Tue, 30 Jun 2020 10:35:27 -0400 (EDT) From: Thomas Monjalon To: David Marchand , "Ananyev, Konstantin" Cc: "dev@dpdk.org" , "jerinjacobk@gmail.com" , "Richardson, Bruce" , "mdr@ashroe.eu" , "ktraynor@redhat.com" , "Stokes, Ian" , "i.maximets@ovn.org" , "Mcnamara, John" , "Kovacevic, Marko" , "Burakov, Anatoly" , Olivier Matz , Andrew Rybchenko , Neil Horman Date: Tue, 30 Jun 2020 16:35:26 +0200 Message-ID: <2881429.9YLJUJncU7@thomas> In-Reply-To: References: <20200610144506.30505-1-david.marchand@redhat.com> <2939263.AvGHZF5Fiy@thomas> MIME-Version: 1.0 Content-Transfer-Encoding: 7Bit Content-Type: text/plain; charset="us-ascii" Subject: Re: [dpdk-dev] [PATCH v3 6/9] eal: register non-EAL threads as lcores X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" 30/06/2020 14:07, Ananyev, Konstantin: > > 26/06/2020 16:43, David Marchand: > > > On Wed, Jun 24, 2020 at 1:59 PM Ananyev, Konstantin > > > wrote: > > > > > > Do you mean - make this new dynamic-lcore API return an error if callied > > > > > > from secondary process? > > > > > > > > > > Yes, and prohibiting from attaching a secondary process if dynamic > > > > > lcore API has been used in primary. > > > > > I intend to squash in patch 6: > > > > > https://github.com/david-marchand/dpdk/commit/e5861ee734bfe2e4dc23d9b919b0db2a32a58aee > > > > > > > > But secondary process can attach before lcore_register, so we'll have some sort of inconsistency in behaviour. > > > > > > If the developer tries to use both features, he gets an ERROR log in > > > the two init path. > > > So whatever the order at runtime, we inform the developer (who did not > > > read/understand the rte_thread_register() documentation) that what he > > > is doing is unsupported. > > > > I agree. > > Before this patch, pinning a thread on a random core can > > trigger some issues. > > After this patch, register an external thread will > > take care of logging errors in case of inconsistencies. > > So the user will know he is doing something not supported > > by the app. > > I understand that, and return a meaningful error is definitely > better the silent crash or memory corruption. > The problem with that approach, as I said before, MP group > behaviour becomes non-deterministic. It was already non-deterministic before these patches. > > It is an nice improvement. > > > > > > If we really want to go ahead with such workaround - > > > > It is not a workaround. > > It is fixing some old issues and making clear what is really impossible. > > The root cause of the problem is in our MP model design decisions: > from one side we treat lcore_id as process local data, from other side > in some shared data-structures we use lcore_id as an index. > I think to fix it properly we need either: > make lcore_id data shared or stop using lcore_id as an index for shared data. > So from my perspective this approach is just one of possible workarounds. > BTW, there is nothing wrong to have a workaround for the problem > we are not ready to fix right now. I think you are trying to fix multi-process handling. This patch is not about multi-process, it only highlight incompatibilities. > > > > probably better to introduce explicit EAL flag ( --single-process or so). > > > > As Thomas and Bruce suggested, if I understood them properly. > > > > No I was thinking to maintain the tri-state information: > > - secondary is possible > > - secondary is attached > > - secondary is forbidden > > Ok, then I misunderstood you. > > > Asking the user to use an option to forbid attaching a secondary process > > is the same as telling him it is forbidden. > > I don't think it is the same. > On a live and complex system user can't always predict will the primary proc > use dynamic lcore and if it will at what particular moment. > Same for secondary process launching - user might never start it, > might start it straight after the primary one, > or might be after several hours. I don't see the difference. An app which register external threads is not compatible with multi-process. It needs to be clear. If the user tries to do it anyway, there can be some error, OK. > > The error log is enough in my opinion. > > I think it is better than nothing, but probably not the best one. > Apart from possible non-consistent behaviour, it is quite restrictive: > dynamic lcore_id wouldn't be available on any DPDK MP deployment. > Which is a pity - I think it is a cool and useful feature. So you are asking to extend the feature. Honestly, I'm not a fan of multi-process, so I would not push any feature for it. If we don't add any new option now, and restrict MP handling to error messages, it would not prevent from extending in future, right? > What do you guys think about different approach: > introduce new optional EAL parameter to restrict lcore_id > values available for the process. > > #let say to start primary proc that can use lcore_id=[0-99] only: > dpdk_primary --lcore-allow=0-99 ... --file-prefix=xz1 > > #to start secondary one for it with allowed lcore_id=[100-109]: > dpdk_secondary --lcore-allow=100-109 ... --file-prefix=xz1 --proc-type=secondary > > It is still a workaround, but that way we don't need to > add any new limitations for dynamic lcores and secondary process usage. > Now it is up to user to decide would multiple-process use the same shared data > and if so - split lcore_id space properly among them > (same as he has to do now with static lcores). Isn't it pushing too much to the user? > > > A EAL flag is a stable API from the start, as there is nothing > > > describing how we can remove one. > > > So a new EAL flag for an experimental API/feature seems contradictory. > > > > > > Going with a new features status API... I think it is beyond this series. > > > > > > Thomas seems to suggest an automatic resolution when features conflict > > > happens.. ? > > > > I suggest allowing the maximum and raise an error when usage conflicts. > > It seems this is what you did in v4. > > > > > I'll send the v4, let's discuss it there if you want.