From mboxrd@z Thu Jan  1 00:00:00 1970
From: Thomas Monjalon <thomas@monjalon.net>
To: "Andrzej Ostruszka [C]" <aostruszka@marvell.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>
Date: Sat, 04 Apr 2020 21:51:18 +0200
Message-ID: <4852266.7OPFtVAQ6q@xps>
In-Reply-To: <9c033551-481e-d90e-9cc6-b10e627512c1@marvell.com>
References: <20200306164104.15528-1-aostruszka@marvell.com>
 <6610147.nAD6y4vbrC@xps> <9c033551-481e-d90e-9cc6-b10e627512c1@marvell.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 7Bit
Content-Type: text/plain; charset="us-ascii"
Subject: Re: [dpdk-dev] [PATCH v2 0/4] Introduce IF proxy library

04/04/2020 20:07, Andrzej Ostruszka [C]:
> On 4/3/20 11:42 PM, Thomas Monjalon wrote:
> > 10/03/2020 12:10, Andrzej Ostruszka:
> [...]
> >> The purpose of this library is to help with both of these tasks (as long
> >> as they remain in domain of configuration available to the system).  In
> >> other words, if DPDK application has some special needs, that cannot be
> >> addressed by the normal system configuration utilities, then they need
> >> to be solved by the application itself.
> > 
> > In any case, the application must be in the loop.
> > The application should always remain in control.
> 
> OK - so let me try to understand what you mean here on the example of
> this IF Proxy.  I wanted (and that is an explicit goal) to use iproute2
> tools to configure DPDK ports.  In this context allowing application to
> have control might mean two things:
> 
> - application can accept/ignore/deny change requested by the user from
> the shell in a dynamic way (based on some state/event/...)

Yes

> - application writer has a choice to bind or not, but once the proxy is
> bound you simply accept the requests - just like you accept user
> requests in e.g. testpmd shell.

No, the application may check each user request before accepting.
And on the path, the application may need to adapt based on user request.

> So ...
> 
> > When querying some information, nothing needs to be controlled I guess.
> > But when adjusting some configuration, the application must be able
> > to be notified and decide which change is allowed.
> > Of course, the application might allow being bypassed.
> 
> ... it looks like you are talking about the first option ("bypass" is I
> guess the second option).  In the concrete example of IF Proxy that
> might be a bit problematic.  The user requests changes on the proxy
> interface, the kernel accepts them, and the DPDK application is just
> notified about that - it has no chance to deny the request (say "busy"
> or "not permitted" to the user).

The application must return a decision.

> Of course app can ignore it (do nothing in the callback or drop the
> event) and have a mismatch between port and its proxy.  I'm not so sure
> if this is what you had in mind.

If the change is denied, the proxy may roll back the change in the kernel.
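
To make the deny-and-restore idea concrete, here is a minimal sketch in plain C.
Nothing in it is existing DPDK or IF proxy API: struct proxy_state,
struct mtu_change, mtu_decision_cb, proxy_handle_mtu_change() and deny_jumbo()
are hypothetical names, and the "kernel" is modelled as a simple struct
instead of a real netlink request.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the proxy interface state held by the kernel. */
struct proxy_state {
	uint16_t mtu;
};

/* Hypothetical notification: the kernel has already applied new_mtu. */
struct mtu_change {
	uint16_t old_mtu;
	uint16_t new_mtu;
};

/* Application decision hook: true accepts the change, false denies it. */
typedef bool (*mtu_decision_cb)(const struct mtu_change *chg);

/*
 * Called after the kernel accepted the change. If the application denies
 * it, the previous value is restored so that the port and its proxy do not
 * drift apart. Returns true when the change is kept.
 */
static bool
proxy_handle_mtu_change(struct proxy_state *kernel,
			const struct mtu_change *chg,
			mtu_decision_cb decide)
{
	/* No hook registered: the application allows being bypassed. */
	if (decide == NULL || decide(chg))
		return true;

	/* Stand-in for a real request restoring the old MTU in the kernel. */
	kernel->mtu = chg->old_mtu;
	return false;
}

/* Example policy (an assumption): refuse any MTU above 1500. */
static bool
deny_jumbo(const struct mtu_change *chg)
{
	return chg->new_mtu <= 1500;
}
```

With deny_jumbo() registered, a change from 1500 to 9000 is reverted to 1500,
while a change from 1500 to 1400 is kept. A real implementation would also
have to ignore the notification caused by its own restore request.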

> > Currently this rule is not respected in the rte_mp IPC system.
> > I think rte_mp and IF proxy should follow the same path,
> > keeping the primary application process in control.
> > 
> > I would like this control path to be usable not only by secondary
> > processes and IF proxy. It should be generic enough to allow any
> > application (local or remote) to be part of the control path,
> > communicating with the DPDK application primary process.
> 
> That goal sounds indeed like ZMQ.  Is the consensus about that already
> reached?  On a general level that sounds good to me - the devil might be
> in details, like e.g. trying to be simple and generic enough.  I'd like
> also to solicit here input from other members of the community.

No, there is no consensus. Integrating ZMQ is a very fresh idea.
As said, we should open this major topic in a separate email thread.

> > As a summary, I propose to target the following goal:
> > implement a user configuration path as a DPDK standard
> > that the application can enable.
> > 
> > Do we agree that the exception packet path is out of scope?
> 
> Could you rephrase this question?  I'm not sure I understand it.  If you
> wanted to say that we should:
> 
> - implement general config/notification mechanism
> - rebase IF Proxy upon it (so instead of deciding whether this should be
> callback/queue it simply uses this new scheme to deliver the change to
> application)

Yes, this is what I mean in general.

> then I'm fine with this.  If you meant something else then please explain.

I ask for confirmation that IF proxy is not managing an exception datapath.
You said the netdev used as proxy can also be used to send/receive
packets to/from kernel stack. But it is out of scope of IF proxy features,
right?

> > [...]
> >> We create two proxy interfaces (here based on Tap driver) and bind the
> >> ports to their proxies.  When user issues a command changing MTU for
> >> Tap1 interface the library notes this and calls "mtu_change" callback
> >> for the Port1.  Similarly when user adds an IPv4 address to the Tap2
> >> interface "addr_add" callback is called for the Port2 and the same
> >> happens for configuration of routing rule pointing to Tap2.
> > 
> > Will it work as well with TC flow configuration converted to rte_flow?
> 
> Not at the moment.  But should be doable - as long as there is good
> mapping between them (I haven't checked).

That's an interesting challenge.

> >> Apart from
> >> callbacks this library can notify about changes via adding events to
> >> notification queues.  See below for more information about that and
> >> a complete list of available callbacks.
> > 
> > There is choice between callback in a random context,
> > or a read from a message queue in a controlled context.
> > Second option looks better.
> 
> Note that callback can be a simple enqueue to some ring.  From the IF
> Proxy implementation point of view - this is not much of a difference.
> I notice the change and in that place in code I can either call callback
> or queue an event.  Since I expect queues to be a popular choice, support
> for them is built in, but without it the user could register a callback
> that does the enqueuing (one more indirection in the slow path).
> 
> Having said that - the only reason callbacks are kept is that (as
> mentioned in cover):
> 
> - I can easily implement global action (true - in random context), since
> queues are "match all" each event will be added to all queues and cores
> would have to decide which one of them performs the global action
> - I can do single/global preparation before queueing event
> 
> But I guess this is rather a moot point since we are going in the
> direction of general config which IF Proxy would be using.
> 
> >> Please note that nothing has been mentioned about forwarding of the
> >> packets between system and DPDK.  Since the proxies are normal DPDK
> >> ports you can receive/send to them via usual RX/TX burst API.  However
> >> since the library is not aware of the structure of packet processing
> >> used by the application it cannot automatically forward the packets - it
> >> is responsibility of the application to include proxy ports into its
> >> packet processing engine.
> > 
> > So IF proxy does nothing special with packets, right?
> 
> Correct.
> 
> >> Although the library only helps you to identify proxy for given port
> >> (and vice versa) and calls appropriate callbacks it does open some
> >> interesting possibilities.  For example you can use the proxy ports to
> >> forward packets for protocols that you do not wish to handle in DPDK
> >> application to the system protocol stack and just listen to the
> >> configuration changes - so that way you can "offload" handling of those
> >> protocols to the system.
> > 
> > Note that when using a bifurcated driver (af_xdp or mlx),
> > the exception path in the kernel is not going through DPDK.
> > Moreover, no proxy is needed for device configuration in such case.
> 
> True for the link level info.  But if application would like to have
> also address/routing/neighbouring info then I guess proxy would be
> needed.  As for the bifurcated drivers - in one version of the library I
> had an option to bind port to itself.  The binding is there only to tell
> library which if_index is interesting and how to report event (if_index
> -> port_id)

Yes you're right, and that's interesting.
I wonder whether we should have a flag in the proxy port to mark
bifurcated model.

> [...]
> >> This creates logical binding - as mentioned above there is no automatic
> >> packet forwarding.  With this binding whenever user changes the state of
> >> proxy interface in the system (link up/down, change mac/mtu, add/remove
> >> IPv4/IPv6) you get appropriate notification for the bound port.
> > 
> > When configuring a port via DPDK API, is it mirrored automatically
> > to the kernel device?
> 
> No it isn't.  It's one way at the moment.  If we wanted bidirectional
> then I would have to plug in somewhere in eth_dev to monitor changes to
> ports and request similar changes to the proxy.

OK, I think it is a gap we need to fill.
A bidirectional path looks mandatory to me.
Given that the application needs to be aware of the proxy binding,
can we have an explicit mechanism to notify the proxy of any change?
Or do we want to use something like the failsafe PMD to pair the port and
its proxy as sub-devices of a main one, dispatching all changes?

> [...]
> >> and the actual logic used is: if there is callback registered then it is
> >> called, if it returns non-zero then event is considered completed,
> >> otherwise event is added to each configured notification queue.
> >> That way application can update data structures that are safe to be
> >> modified by single writer from within callback or do the common
> >> preprocessing steps (if any needed) in callback and data that is
> >> replicated can be updated during handling of queued events.
> > 
> > As explained above, the application must control every change.
> > 
> > One issue is thread safety.
> > The simplest model is to manage control path from a single thread
> > in the primary process.
> > 
> > If we create an API to allow the application managing the control path
> > from external requests, I think it should be a building block
> > independent of IF proxy. Then IF proxy can plug into this subsystem.
> > It would allow other control path mechanisms to co-exist.
> 
> I'm fine with this.
> 
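
For reference, the callback-then-queues logic quoted above could be sketched
as follows. This is only an illustration under assumptions: struct event,
struct event_queue, struct dispatcher, dispatch_event() and consume_type0()
are hypothetical names, and fixed-size arrays stand in for whatever ring
the library would really use.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_QUEUES 4
#define QUEUE_DEPTH 8

/* Hypothetical event: a type and the port it is reported for. */
struct event {
	int type;
	int port_id;
};

/* Hypothetical notification queue (a real one would be a ring). */
struct event_queue {
	struct event ev[QUEUE_DEPTH];
	size_t count;
};

/* A callback returns non-zero when the event is considered completed. */
typedef int (*event_cb)(const struct event *ev);

struct dispatcher {
	event_cb cb; /* may be NULL when no callback is registered */
	struct event_queue *queues[MAX_QUEUES];
	size_t nb_queues;
};

/*
 * Call the callback first; if it is absent or returns zero, copy the event
 * to every configured queue. Returns the number of queues that received it
 * (0 when the callback completed the event).
 */
static size_t
dispatch_event(struct dispatcher *d, const struct event *ev)
{
	size_t i, queued = 0;

	if (d->cb != NULL && d->cb(ev) != 0)
		return 0;

	for (i = 0; i < d->nb_queues; i++) {
		struct event_queue *q = d->queues[i];

		if (q->count < QUEUE_DEPTH) { /* full queues drop the event */
			q->ev[q->count++] = *ev;
			queued++;
		}
	}
	return queued;
}

/* Example callback (an assumption): completes events of type 0 only. */
static int
consume_type0(const struct event *ev)
{
	return ev->type == 0;
}
```

With consume_type0() registered and two queues configured, an event of
type 0 ends in the callback, while an event of type 1 is copied to both
queues. The queues are "match all", so the consumers still have to decide
which one performs any global action.
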
> > [...]
> >> It is also worth mentioning that while the typical case would be a 1-to-1
> >> mapping between port and proxy, the 1-to-many mapping is also supported.
> >> In that case related callbacks will be called for each port bound to
> >> given proxy interface - it is application responsibility to define
> >> semantic of such mapping (e.g. all changes apply to all ports, or link
> >> changes apply to all but other are accepted in "round robin" fashion, or
> >> some other logic).
> > 
> > I don't get the interest of one-to-many mapping.
> 
> That was a request during early version of library - with bridging in
> mind.  However this is an "experimental" part of an "experimental"
> library - I would not focus on this, as it might get removed if we don't
> find a real use case for that.

If we don't have a real use-case, I suggest not implementing it,
but keep some room for it in the design.

> > [...]
> > 
> > Thanks for the work.
> > It seems there are some overlaps with telemetry and rte_mp channels.
> > The same channel could be used also for dynamic tracing command
> > or for remote control.
> > Would you be OK to extend it to a global control subsystem,
> > having IF proxy plugged in?
> 
> Yes.  I don't have a problem with that - however at the moment the
> requirements/design are a bit vague, so let's discuss this more.

Thanks a lot