From: Neil Horman <nhorman@tuxdriver.com>
To: "Wodkowski, PawelX" <pawelx.wodkowski@intel.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>,
"Jastrzebski, MichalX K" <michalx.k.jastrzebski@intel.com>
Subject: Re: [dpdk-dev] [PATCH 2/2] bond: add mode 4 support
Date: Thu, 18 Sep 2014 12:02:34 -0400 [thread overview]
Message-ID: <20140918160234.GJ20389@hmsreliant.think-freely.org> (raw)
In-Reply-To: <F6F2A6264E145F47A18AB6DF8E87425D12B24CC2@IRSMSX102.ger.corp.intel.com>
On Thu, Sep 18, 2014 at 08:07:31AM +0000, Wodkowski, PawelX wrote:
> > > +int
> > > +bond_mode_8023ad_deactivate_slave(struct rte_eth_dev *bond_dev,
> > > + uint8_t slave_pos)
> > > +{
> > > + struct bond_dev_private *internals = bond_dev->data->dev_private;
> > > + struct mode8023ad_data *data = &internals->mode4;
> > > + struct port *port;
> > > + uint8_t i;
> > > +
> > > + bond_mode_8023ad_stop(bond_dev);
> > > +
> > > + /* Exclude slave from transmit policy. If this slave is an aggregator
> > > + * make all aggregated slaves unselected to force sellection logic
> > > + * to select suitable aggregator for this port */
> > > + for (i = 0; i < internals->active_slave_count; i++) {
> > > + port = &data->port_list[slave_pos];
> > > + if (port->used_agregator_idx == slave_pos) {
> > > + port->selected = UNSELECTED;
> > > + port->actor_state &= ~(STATE_SYNCHRONIZATION |
> > STATE_DISTRIBUTING |
> > > + STATE_COLLECTING);
> > > +
> > > + /* Use default aggregator */
> > > + port->used_agregator_idx = i;
> > > + }
> > > + }
> > > +
> > > + port = &data->port_list[slave_pos];
> > > + timer_cancel(&port->current_while_timer);
> > > + timer_cancel(&port->periodic_timer);
> > > + timer_cancel(&port->wait_while_timer);
> > > + timer_cancel(&port->tx_machine_timer);
> > > +
> > These all seem rather racy. Alarm callbacks are executed with the alarm list
> > locks not held. So there is every possibility that you could execute these (or
> > any timer_cancel calls in this PMD in parallel with the internal state machine
> > timer callback, and leave either with a corrupted timer list (resulting from a
> > double free between here, and the actual callback site),
>
> I don't think so. Yes, callbacks are executed with alarm list locks not held, but
> this is not the issue because access to list itself is guarded by lock and
> ap->executing variable. So list will not be trashed. Check source of
> eal_alarm_callback(), rte_eal_alarm_set() and rte_eal_alarm_cancel().
>
Yes, you're right, the list is probably safe wht the executing bit.
> > or a timer that is
> > actually still pending when a slave is removed.
> >
> This is not the issue also, but problem might be similar. I assumed that alarms
> are atomic but when I looked at rte alarms closer I saw a race condition
> between and rte_eal_alarm_cancel() from bond_mode_8023ad_stop()
> and rte_eal_alarm_set() from state machines callback. This need to be
> reworked in some way.
Yes, this is what I was referring to:
CPU0 CPU1
rte_eal_alarm_callback bond_8023ad_deactivate_slave
-bond_8023_ad_periodic_cb timer_cancel
timer_set
If those timer functions operate on the same timer, the result is that you can
leave the stop/deactivate slave paths with a timer function for that slave still
pending. The bonding mode needs some internal state to serialize those
operations and determine if the timer should be reactivated.
Neil
next prev parent reply other threads:[~2014-09-18 15:56 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-09-17 14:21 [dpdk-dev] [PATCH 0/2] " Pawel Wodkowski
2014-09-17 14:21 ` [dpdk-dev] [PATCH 1/2] bond: extract common code to separate functions Pawel Wodkowski
2014-09-17 14:51 ` Neil Horman
2014-09-17 14:21 ` [dpdk-dev] [PATCH 2/2] bond: add mode 4 support Pawel Wodkowski
2014-09-17 15:13 ` Neil Horman
2014-09-18 8:07 ` Wodkowski, PawelX
2014-09-18 16:02 ` Neil Horman [this message]
2014-09-19 12:47 ` Wodkowski, PawelX
2014-09-19 17:29 ` Neil Horman
2014-09-22 6:26 ` Wodkowski, PawelX
2014-09-22 10:24 ` Neil Horman
[not found] ` <60ABE07DBB3A454EB7FAD707B4BB158213896977@IRSMSX109.ger.corp.intel.com>
2014-09-22 13:15 ` Neil Horman
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20140918160234.GJ20389@hmsreliant.think-freely.org \
--to=nhorman@tuxdriver.com \
--cc=dev@dpdk.org \
--cc=michalx.k.jastrzebski@intel.com \
--cc=pawelx.wodkowski@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).