From: Adrien Mazarguil <adrien.mazarguil@6wind.com>
To: Thomas Monjalon <thomas@monjalon.net>
Cc: dev@dpdk.org, Gaetan Rivet <gaetan.rivet@6wind.com>,
	Ferruh Yigit <ferruh.yigit@intel.com>,
	David Marchand <david.marchand@redhat.com>,
	stable@dpdk.org
Subject: Re: [dpdk-stable] [dpdk-dev] [PATCH v2] net/failsafe: fix source port ID in Rx packets
Date: Thu, 18 Apr 2019 18:46:24 +0200
Message-ID: <20190418164624.GG4889@6wind.com>
In-Reply-To: <2732210.eFvFzSlaYO@xps>

On Thu, Apr 18, 2019 at 05:51:18PM +0200, Thomas Monjalon wrote:
> 18/04/2019 17:39, Thomas Monjalon:
> > 18/04/2019 17:32, Adrien Mazarguil:
> > > When passed to the application, Rx packets retain the port ID value
> > > originally set by slave devices. Unfortunately these IDs have no meaning to
> > > applications, which are typically unaware of their existence.
> > > 
> > > This confuses those caring about the source port field in mbufs (m->port)
> > > which experience issues ranging from traffic drop to crashes.
> [...]
> > > +/*
> > > + * Override source port in Rx packets.
> > > + *
> > > + * Make Rx packets originate from this PMD instance instead of one of its
> > > + * slaves. This is mandatory to avoid breaking applications.
> > > + */
<snip>
> > "slave" is a wording from bonding.
> > In failsafe, it is sub-device, isn't it?

I don't mind, although grep shows a couple of comments talking about slaves
already. Either way I think it fits as those are failsafe's pets, as in
failsafe does whatever it wants to them and they don't have a say :)

Does it warrant a v3?

> > > +static void
> > > +failsafe_rx_set_port(struct rte_mbuf **rx_pkts, uint16_t nb_pkts, uint16_t port)
> > > +{
> > > +	unsigned int i;
> > > +
> > > +	for (i = 0; i != nb_pkts; ++i)
> > > +		rx_pkts[i]->port = port;
> > > +}
> > > +
> > >  uint16_t
> > >  failsafe_rx_burst(void *queue,
> > >  		  struct rte_mbuf **rx_pkts,
> > > @@ -87,6 +102,9 @@ failsafe_rx_burst(void *queue,
> > >  		sdev = sdev->next;
> > >  	} while (nb_rx == 0 && sdev != rxq->sdev);
> > >  	rxq->sdev = sdev;
> > > +	if (nb_rx)
> > > +		failsafe_rx_set_port(rx_pkts, nb_rx,
> > > +				     rxq->priv->data->port_id);
> > >  	return nb_rx;
> > >  }
> > 
> > I'm afraid the performance drop will be severe.

Mbufs are still hot from the oven at this stage, so it's not *that*
expensive. I don't see a more efficient approach.

> > How exactly is the port ID in the mbuf used?

It matters to applications that dissociate Rx itself from packet processing,
or whenever a networking stack is involved. Basically, it is used every time
some code wonders where a packet comes from due to lack of context and looks
at m->port for the answer (e.g. checking that a packet arrives on the right
port given its destination address).

> > What crash are you seeing?

None, thankfully. In my specific use case, 6WINDGate's stack simply drops
traffic coming from unknown ports.

However, nothing prevents applications from using m->port as an index into
some array they allocated, to quickly retrieve port context without looking
it up. They wouldn't expect indices they do not know about in there;
assuming it will result in a crash is not far-fetched.

> Another way to fix it without performance drop would be to add
> a new driver op to set the top-level port id.
> This top-level id would be stored in the private structure of the port,
> initialized with the port id of the port itself, and used to fill mbufs.
> 
> Thoughts?

Adding a new devop as a fix would be a problem for stable releases, so this
patch is definitely needed, at least as a first step.

I'm not against a new API; however, would it be worth the trouble? Especially
considering it would only be used by failsafe-like drivers with something to
hide from applications, which is not the main use case.

For some PMDs, this operation could only be done at init time before port ID
is stored in private Rx queue data for fast retrieval. Retrieving it through
a pointer so it can be updated anytime would make it more expensive than
necessary for them.
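
A rough sketch of the two layouts I am comparing, under stated assumptions:
none of these types exist in DPDK as written. struct pkt stands in for
struct rte_mbuf, struct dev_data for whatever structure a new op would
update, and the rxq_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for struct rte_mbuf; only the field discussed here. */
struct pkt {
	uint16_t port;
};

/* Hypothetical device data whose port_id a new op could rewrite. */
struct dev_data {
	uint16_t port_id;
};

/* Layout A: port ID copied into the Rx queue once, at setup time. */
struct rxq_cached {
	uint16_t port_id;
};

/* Layout B: Rx queue keeps a pointer and reads the ID on every packet. */
struct rxq_indirect {
	const struct dev_data *data;
};

static void
rxq_cached_setup(struct rxq_cached *q, const struct dev_data *d)
{
	assert(d != NULL);
	q->port_id = d->port_id; /* later updates to *d are not seen */
}

static void
rx_fill_cached(const struct rxq_cached *q, struct pkt *m)
{
	m->port = q->port_id; /* one load from queue-local data */
}

static void
rx_fill_indirect(const struct rxq_indirect *q, struct pkt *m)
{
	m->port = q->data->port_id; /* extra dereference per packet */
}
```

With layout A, the top-level ID has to be known before queue setup; with
layout B, it can change at any time, at the price of the extra indirection
in the Rx path.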

It's understood that having failsafe in the dataplane has a cost, but even
with the proposed fix, that cost is dwarfed by the amount of work done by a
true PMD (and the application) for Rx processing.

My suggestion is to wait for someone to complain about the performance
compared to what they had before that fix, and only then see what we can do.

-- 
Adrien Mazarguil
6WIND
