Date: Tue, 30 Jun 2020 14:44:09 +0200
From: Olivier Matz
To: "Ananyev, Konstantin"
Cc: Thomas Monjalon, David Marchand, "dev@dpdk.org",
 "jerinjacobk@gmail.com", "Richardson, Bruce", "mdr@ashroe.eu",
 "ktraynor@redhat.com", "Stokes, Ian", "i.maximets@ovn.org",
 "Mcnamara, John", "Kovacevic, Marko", "Burakov, Anatoly",
 Andrew Rybchenko, Neil Horman
Message-ID: <20200630124409.GL5869@platinum>
References: <20200610144506.30505-1-david.marchand@redhat.com>
 <2939263.AvGHZF5Fiy@thomas>
Subject: Re: [dpdk-dev] [PATCH v3 6/9] eal: register non-EAL threads as lcores
List-Id: DPDK patches and discussions

On Tue, Jun 30, 2020 at 12:07:32PM +0000, Ananyev, Konstantin wrote:
> >
> > 26/06/2020 16:43, David Marchand:
> > > On Wed, Jun 24, 2020 at 1:59 PM Ananyev, Konstantin
> > > wrote:
> > > > > > Do you mean - make this new dynamic-lcore API return an error if called
> > > > > > from secondary process?
> > > > >
> > > > > Yes, and prohibiting from attaching a secondary process if dynamic
> > > > > lcore API has been used in primary.
> > > > > I intend to squash in patch 6:
> > > > > https://github.com/david-marchand/dpdk/commit/e5861ee734bfe2e4dc23d9b919b0db2a32a58aee
> > > >
> > > > But secondary process can attach before lcore_register, so we'll have some sort of inconsistency in behaviour.
> > >
> > > If the developer tries to use both features, he gets an ERROR log in
> > > the two init paths.
> > > So whatever the order at runtime, we inform the developer (who did not
> > > read/understand the rte_thread_register() documentation) that what he
> > > is doing is unsupported.
> >
> > I agree.
> > Before this patch, pinning a thread on a random core can
> > trigger some issues.
> > After this patch, register an external thread will
> > take care of logging errors in case of inconsistencies.
> > So the user will know he is doing something not supported
> > by the app.
>
> I understand that, and return a meaningful error is definitely
> better than a silent crash or memory corruption.
> The problem with that approach, as I said before, MP group
> behaviour becomes non-deterministic.
>
> >
> > It is a nice improvement.
> >
> > > > If we really want to go ahead with such workaround -
> >
> > It is not a workaround.
> > It is fixing some old issues and making clear what is really impossible.
>
> The root cause of the problem is in our MP model design decisions:
> from one side we treat lcore_id as process local data, from other side
> in some shared data-structures we use lcore_id as an index.
> I think to fix it properly we need either:
> make lcore_id data shared or stop using lcore_id as an index for shared data.
> So from my perspective this approach is just one of possible workarounds.
> BTW, there is nothing wrong to have a workaround for the problem
> we are not ready to fix right now.
>
> > > > probably better to introduce explicit EAL flag ( --single-process or so).
> > > > As Thomas and Bruce suggested, if I understood them properly.
> >
> > No I was thinking to maintain the tri-state information:
> > - secondary is possible
> > - secondary is attached
> > - secondary is forbidden
>
> Ok, then I misunderstood you.
>
> > Asking the user to use an option to forbid attaching a secondary process
> > is the same as telling him it is forbidden.
>
> I don't think it is the same.
> On a live and complex system user can't always predict will the primary proc
> use dynamic lcore and if it will at what particular moment.
> Same for secondary process launching - user might never start it,
> might start it straight after the primary one,
> or might be after several hours.
>
> > The error log is enough in my opinion.
>
> I think it is better than nothing, but probably not the best one.
> Apart from possible non-consistent behaviour, it is quite restrictive:
> dynamic lcore_id wouldn't be available on any DPDK MP deployment.
> Which is a pity - I think it is a cool and useful feature.
>
> What do you guys think about different approach:
> introduce new optional EAL parameter to restrict lcore_id
> values available for the process.
>
> #let say to start primary proc that can use lcore_id=[0-99] only:
> dpdk_primary --lcore-allow=0-99 ... --file-prefix=xz1
>
> #to start secondary one for it with allowed lcore_id=[100-109]:
> dpdk_secondary --lcore-allow=100-109 ... --file-prefix=xz1 --proc-type=secondary
>
> It is still a workaround, but that way we don't need to
> add any new limitations for dynamic lcores and secondary process usage.
> Now it is up to user to decide would multiple-process use the same shared data
> and if so - split lcore_id space properly among them
> (same as he has to do now with static lcores).

A simpler variant of your approach could be to add
"--proc-type=standalone" to explicitly disable MP and enable dynamic
thread registration.

> > > An EAL flag is a stable API from the start, as there is nothing
> > > describing how we can remove one.
> > > So a new EAL flag for an experimental API/feature seems contradictory.
> > >
> > > Going with a new features status API... I think it is beyond this series.
> > >
> > > Thomas seems to suggest an automatic resolution when features conflict
> > > happens.. ?
> >
> > I suggest allowing the maximum and raise an error when usage conflicts.
> > It seems this is what you did in v4.
> >
> > > I'll send the v4, let's discuss it there if you want.
> > >
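
For illustration only, the tri-state quoted above (secondary is possible /
attached / forbidden) could be sketched as follows. This is a minimal
standalone C sketch, not DPDK code: the names mp_state,
mp_register_thread() and mp_attach_secondary() are hypothetical, and it
assumes a single caller. A real implementation would presumably keep the
state in memory shared between processes and update it atomically.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical tri-state; these names are not part of the DPDK API. */
enum mp_state {
	MP_SECONDARY_POSSIBLE,  /* nothing decided yet */
	MP_SECONDARY_ATTACHED,  /* at least one secondary process attached */
	MP_SECONDARY_FORBIDDEN, /* a non-EAL thread has been registered */
};

/*
 * A non-EAL thread asks to be registered as an lcore.
 * Returns 0 on success, -1 (with an error log) if a secondary
 * process is already attached.
 */
int
mp_register_thread(enum mp_state *state)
{
	assert(state != NULL);
	if (*state == MP_SECONDARY_ATTACHED) {
		fprintf(stderr,
			"ERROR: cannot register thread, secondary attached\n");
		return -1;
	}
	/* From now on, no secondary process may attach. */
	*state = MP_SECONDARY_FORBIDDEN;
	return 0;
}

/*
 * A secondary process asks to attach.
 * Returns 0 on success, -1 (with an error log) if a non-EAL thread
 * has already been registered.
 */
int
mp_attach_secondary(enum mp_state *state)
{
	assert(state != NULL);
	if (*state == MP_SECONDARY_FORBIDDEN) {
		fprintf(stderr,
			"ERROR: cannot attach secondary, threads registered\n");
		return -1;
	}
	*state = MP_SECONDARY_ATTACHED;
	return 0;
}
```

Whichever of the two calls comes first succeeds and the other one fails
with an error log, so the outcome depends on runtime ordering. That is
the "ERROR log in the two init paths" behaviour from the quoted
discussion, and also the non-determinism Konstantin objects to.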