DPDK patches and discussions
 help / color / mirror / Atom feed
From: "Gaëtan Rivet" <gaetan.rivet@6wind.com>
To: Neil Horman <nhorman@tuxdriver.com>
Cc: Bruce Richardson <bruce.richardson@intel.com>,
	dev@dpdk.org, Thomas Monjalon <thomas.monjalon@6wind.com>,
	Adrien Mazarguil <adrien.mazarguil@6wind.com>
Subject: Re: [dpdk-dev] [PATCH v2 00/13] introduce fail-safe PMD
Date: Tue, 14 Mar 2017 15:49:47 +0100	[thread overview]
Message-ID: <20170314144947.GO908@bidouze.vm.6wind.com> (raw)
In-Reply-To: <20170310224355.GA28527@hmswarspite.think-freely.org>

> > The central question that I would like to tackle is this: why should
> > we require from our users declaring a bonding device to have
> > hot-plug support?
> >
> We'll, strictly speaking, I suppose we don't have to require it.  But by that
> same token, we don't need to do it in a separate PMD either, there are lots of
> other options.
>

I think I used an ambiguous formulation here. To be sure that we
understand each other, what I want to express is that it will certainly
seem very strange for a user to declare a bond on a single device,
first, and to be expected to do so if they wanted hot-plug support on
that particular device.

The bonding here would only be used as a place-holder, which does not
really make sense from the user point of view.

I think you understood that, and what I take from your response is that
while agreeing you would like to do so differently. I just wanted to be
clear.

> > I took some time to illustrate a few modes of operation:
> >
> > Fig. 1
> >
> >    .-------------.
> >    | application |
> >    `------.------'
> >           |
> >      .----'-----.---------. <------ init, conf, Rx/Tx
> >      |          |         |
> >      |      .---|--.------|--. <--- conf, link check, Rx/Tx
> >      |      |   |  |      |  |
> >      v      |   v  v      v  v
> > .---------. | .-------. .------.
> > | bonding | | | ixgbe | | mlx4 |
> > `----.----' | `-------' `------'
> >      |      |
> >      `------'
> >
> > Typical link fail-over.
> >
> >
> > Fig. 2
> >
> >  .-------------.
> >  | application |
> >  `------.------'
> >         | <-------- init, conf, Rx/Tx
> >         v
> >   .-----------.
> >   | fail-safe |
> >   `-----.-----'
> >         |
> >     .---'----. <--- init, conf, dev check, Rx/Tx
> >     |        |
> >     v        v
> > .-------. .------.
> > | ixgbe | | mlx4 |
> > `-------' `------'
> >
> > Typical automatic hot-plug handling with device fail-over.
> >
> > [...]
>
> Yes, I think we all understand the purpose of your PMD - its there to provide a
> placeholder device that can respond to application requests in a sane manner,
> until such time as real hardware is put in its place via a hot plug/failure.  Thats all
> well and good, I'm saying there are better ways to go about this that can
> provide the same functionality without having to add an extra 4k lines of code
> to the project, many of which already exist.
>

Ah, yes, I didn't want to imply that the purpose of this PMD wasn't
understood already by many. These figures are there to illustrate some
use cases that some users could recognize, and serve as a support for
the point made afterward.

The main thing that can be taken from these is the division along the
link-level and device-level checking that is done. This explains most of
my position. The nature of those checks imply different kind of code,
most of which is thus actually not duplicated / would require pretty
much the same amount of code to be implemented either as libraries or as
part of the bonding PMD. This is the crux of my argument, which I expand
upon below.

> > 1. LSC vs. RMV
> >
> >  A link status change is a valid state for a device. It calls for
> >  specific responses, e.g. a link switch in a bonding device, without
> >  losing the general configuration of the port.
> >
> >  The removal of a device calls for more than pausing operations and
> >  switching an active device. The party responsible for initializing the
> >  device should take care of closing it properly. If this party also
> >  wants to be able to restore the device if it was plugged back in, it
> >  would need be able to initialize it back and reconfigure its previous
> >  state.
> >
> >  As we can see that in [Fig. 1], this responsibility lies upon the
> >  application.
> >
> Again, yes, I think we all see the benefit of centralizing hot plug operations,
> no one is disagreing with that, its the code/functional duplication that is
> concerning.
>

Certainly, I will try to explain why the code is not actually duplicated
/ why the functions are actually only superficially overlapping.

> > 2. Bonding and link availability
> >
> >  The hot-plug functionality is not a core function of the bonding PMD.
> >  It is only interested in knowing if the link is active or not.
> >
> Currently, yes.  The suggestion was that you augment the bonding driver so that
> hot plug is a core function of bonding.
>
> >  Adding the device persistence to the bonding PMD would mean adding the
> >  ability to flexibly parse device definitions to cope with plug-ins in
> >  evolving busses (PCI hot-plug could mean changing bus addresses), being
> >  able to emulate the EAL and the ether layer and to properly store the
> >  device configuration.  This means formally describing the life of a
> >  device in a DPDK application from start to finish.
> >
> Which seems to me to be exactly what your PMD does.  I don't see why its
> fundamentally harder to do that in an existing pmd, than it is in a new one.
>

Indeed it does. I must emphasize the "formally describe the life of a
device". The hot-plug functionality goes beyong the link-level check.
The description of a device from a DPDK standpoint is complete in the
fail-safe PMD. The state-machine must be able to describe the entire
life of a device, from the devargs parsing to its start-up.

We cannot reuse the existing bonding PMD architecture for this.  We
would have to rewrite the bonding PMD from the ground up for the
hot-plug function. Because it is actually a different approach to
managing the slaves.

This is what I wanted to illustrate in [Fig. 1] and [Fig. 2]:

- In the bonding, the init and configuration steps are still the
  responsibility of the application and no one else. The bonding PMD
  captures the device, re-applies its configuration upon dev_configure()
  which is actually re-applying part of the configuration already 
  present within the slave eth_dev (cf rte_eth_dev_config_restore).

- In the fail-safe, the init and configuration are both the
  responsibilities of the fail-safe PMD itself, not the application
  anymore. This handling of these responsibilities in lieu of the
  application is the whole point of the "deferred hot-plug" support, of
  proposing a simple implementation to the user.

This change in responsibilities is the bulk of the fail-safe code. It
would have to be added as-is to the bonding. Verifying the correctness
of the sync of the initialization phase (acceptable states of a device
following several events registered by the fail-safe PMD) and the
configuration items between the state the application believes it is in
and the fail-safe knows it is in, is the bulk of the fail-safe code.

This function is not overlapping with that of the bonding. The reason I
did not add this whole architecture to the bonding is that when I tried
to do so, I found that I only had two possibilities:

- The current slave handling path is kept, and we only add a new one
  with additional functionalities: full init and conf handling with
  extended parsing capabilities.

- The current slave handling is scraped and replaced entirely by the new
  slave management. The old capturing of existing device is not done
  anymore.

The first solution is not acceptable, because we effectively end-up with
a maintenance nightmare by having to validate two types of slaves with
differing capabilities, differing initialization paths and differing
configuration code.  This is extremely awkward and architecturally
unsound. This is essentially the same as having the exact code of the
fail-safe as an aside in the bonding, maintening exactly the same
breadth of code while having muddier interfaces and organization.

The second solution is not acceptable, because we are bending the whole
existing bonding API to our whim. We could just as well simply rename
the fail-safe PMD as bonding, add a few grouping capabilities and call
it a day. This is not acceptable for users.

All of this, for hotplug support in the bonding PMD, while it actually
makes sense to offer this function on its own. Again, this third path
with a new PMD clarifies the functions for the users and for the reasons
exposed, helps the maintenance of the code.

There is no code duplication, or very little, because the slave
management done by the fail-safe is specific to the hot-plug function.
We do not handle slaves in the same way, and this is entirely due to the
difference in nature between a link-level check and a device-level
check.

>
> My suggestion to you would be to re-architect this in the following way:
>
> 1) Augment the null pmd to accept arbitrary parameters.  Paremeters which are
> not explicitly recognized are stored.  Create an api call via the ethdev api to
> retrieve said arguments.
>
> 2) Augment the bonding driver so that, upon configuration, null drivers can be
> added to the bond, and marked as replaceable in some fashion
>
> 3) Port your hotplug implementation into the bonding code (or make it a separate
> library, not sure which yet).  On a hot plug event trigger, react by querying
> the replaceable interfaces to see if the hot plugged nic is meant to replace an
> existing one.  If a match is found, detach the null pmd instance, and add the
> hot plugged interface.
>

Porting this feature either as part of the bonding PMD or as a library 
means having a good part of the fail-safe PMD added somewhere else. It 
does not reduce much code length.

It does however modify the proposed user interface: either as a library 
that an application would have to explicitly use, or part of the bonding 
PMD with the problems previously explained.

The library form has been considered before. We would have to propose 
both a library for hot-plug handling (initializing and uninitializing a 
device), and a library for handling configuration. Neither can be 
seamlessly integrated in existing applications. They will have to be 
called correctly, an API will have to be respected to add the hot-plug 
support.

Additionally, their implementation is not so simple in this form. The 
configuration store / restore for example will need a full dev_ops 
coverage in order to deal with each configuration items that can be 
used. We end up with a complete dev_ops layer that strongly resemble a 
PMD. It also requires a saving space that will need to be managed, 
either by an application or by a PMD calling it. Having this space 
somehow managed by the EAL leads to implementing a PMD.

The PMD form allows for a clean abstraction layer that can be interfaced 
both with an application on top and a driver underneath. We are able to 
respect both APIs and to be inserted without either being aware of it.  
It does not overcomplicate the code itself nor the architecture while 
the overhead of a PMD is a negligible part, which can be justified to 
offer the seamless integration feature.

> The intent here is to provide you with the functionality that you have (which is
> a good feature), without having to support an entire new pmd that replicates a
> great deal of existing code an functionality.
>

We already considered using the bonding PMD as much as possible, having
generic libraries otherwise for the new facilities that we want. It is
not simpler or easier to maintain. It however impedes users in adopting 
the feature.

Regards,
-- 
Gaëtan Rivet
6WIND

  reply	other threads:[~2017-03-14 14:49 UTC|newest]

Thread overview: 49+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-03-03 15:40 [dpdk-dev] [PATCH 00/12] " Gaetan Rivet
2017-03-03 15:40 ` [dpdk-dev] [PATCH 01/12] ethdev: save VLAN filter setting Gaetan Rivet
2017-03-03 17:33   ` Stephen Hemminger
2017-03-03 15:40 ` [dpdk-dev] [PATCH 02/12] ethdev: add flow API rule copy function Gaetan Rivet
2017-03-03 15:40 ` [dpdk-dev] [PATCH 03/12] ethdev: add deferred intermediate device state Gaetan Rivet
2017-03-03 17:34   ` Stephen Hemminger
2017-03-03 15:40 ` [dpdk-dev] [PATCH 04/12] pci: expose device detach routine Gaetan Rivet
2017-03-03 15:40 ` [dpdk-dev] [PATCH 05/12] pci: expose parse and probe routines Gaetan Rivet
2017-03-03 15:40 ` [dpdk-dev] [PATCH 06/12] net/failsafe: add fail-safe PMD Gaetan Rivet
2017-03-03 17:38   ` Stephen Hemminger
2017-03-06 14:19     ` Gaëtan Rivet
2017-03-03 15:40 ` [dpdk-dev] [PATCH 07/12] net/failsafe: add plug-in support Gaetan Rivet
2017-03-03 15:40 ` [dpdk-dev] [PATCH 08/12] net/failsafe: add flexible device definition Gaetan Rivet
2017-03-03 15:40 ` [dpdk-dev] [PATCH 09/12] net/failsafe: support flow API Gaetan Rivet
2017-03-03 15:40 ` [dpdk-dev] [PATCH 10/12] net/failsafe: support offload capabilities Gaetan Rivet
2017-03-03 15:40 ` [dpdk-dev] [PATCH 11/12] net/failsafe: add fast burst functions Gaetan Rivet
2017-03-03 15:40 ` [dpdk-dev] [PATCH 12/12] net/failsafe: support device removal Gaetan Rivet
2017-03-03 16:14 ` [dpdk-dev] [PATCH 00/12] introduce fail-safe PMD Bruce Richardson
2017-03-06 13:53   ` Gaëtan Rivet
2017-03-03 17:27 ` Stephen Hemminger
2017-03-08 15:15 ` [dpdk-dev] [PATCH v2 00/13] " Gaetan Rivet
2017-03-08 15:15   ` [dpdk-dev] [PATCH v2 01/13] ethdev: save VLAN filter setting Gaetan Rivet
2017-03-08 15:15   ` [dpdk-dev] [PATCH v2 02/13] ethdev: add flow API rule copy function Gaetan Rivet
2017-03-08 15:15   ` [dpdk-dev] [PATCH v2 03/13] ethdev: add deferred intermediate device state Gaetan Rivet
2017-03-08 15:15   ` [dpdk-dev] [PATCH v2 04/13] pci: expose device detach routine Gaetan Rivet
2017-03-08 15:15   ` [dpdk-dev] [PATCH v2 05/13] pci: expose parse and probe routines Gaetan Rivet
2017-03-08 15:15   ` [dpdk-dev] [PATCH v2 06/13] net/failsafe: add fail-safe PMD Gaetan Rivet
2017-03-08 15:15   ` [dpdk-dev] [PATCH v2 07/13] net/failsafe: add plug-in support Gaetan Rivet
2017-03-08 15:15   ` [dpdk-dev] [PATCH v2 08/13] net/failsafe: add flexible device definition Gaetan Rivet
2017-03-08 15:15   ` [dpdk-dev] [PATCH v2 09/13] net/failsafe: support flow API Gaetan Rivet
2017-03-08 15:15   ` [dpdk-dev] [PATCH v2 10/13] net/failsafe: support offload capabilities Gaetan Rivet
2017-03-08 15:15   ` [dpdk-dev] [PATCH v2 11/13] net/failsafe: add fast burst functions Gaetan Rivet
2017-03-08 15:15   ` [dpdk-dev] [PATCH v2 12/13] net/failsafe: support device removal Gaetan Rivet
2017-03-08 15:15   ` [dpdk-dev] [PATCH v2 13/13] net/failsafe: support link status change event Gaetan Rivet
2017-03-08 16:54   ` [dpdk-dev] [PATCH v2 00/13] introduce fail-safe PMD Neil Horman
2017-03-09  9:15     ` Bruce Richardson
2017-03-10  9:13       ` Gaëtan Rivet
2017-03-10 22:43         ` Neil Horman
2017-03-14 14:49           ` Gaëtan Rivet [this message]
2017-03-15  3:28             ` Bruce Richardson
2017-03-15 11:15               ` Thomas Monjalon
2017-03-15 14:25                 ` Gaëtan Rivet
2017-03-16 20:50                   ` Neil Horman
2017-03-17 10:56                     ` Gaëtan Rivet
2017-03-18 19:51                       ` Neil Horman
2017-03-20 15:00   ` Thomas Monjalon
2017-05-17 12:50     ` Ferruh Yigit
2017-05-17 16:59       ` Gaëtan Rivet
2017-03-23 13:01   ` Ferruh Yigit

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170314144947.GO908@bidouze.vm.6wind.com \
    --to=gaetan.rivet@6wind.com \
    --cc=adrien.mazarguil@6wind.com \
    --cc=bruce.richardson@intel.com \
    --cc=dev@dpdk.org \
    --cc=nhorman@tuxdriver.com \
    --cc=thomas.monjalon@6wind.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).