From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <pawelx.wodkowski@intel.com>
Received: from mga09.intel.com (mga09.intel.com [134.134.136.24])
 by dpdk.org (Postfix) with ESMTP id 8CD48B3AF
 for <dev@dpdk.org>; Fri, 19 Sep 2014 14:42:09 +0200 (CEST)
Received: from orsmga002.jf.intel.com ([10.7.209.21])
 by orsmga102.jf.intel.com with ESMTP; 19 Sep 2014 05:41:46 -0700
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.04,554,1406617200"; d="scan'208";a="605367164"
Received: from irsmsx103.ger.corp.intel.com ([163.33.3.157])
 by orsmga002.jf.intel.com with ESMTP; 19 Sep 2014 05:47:54 -0700
Received: from irsmsx106.ger.corp.intel.com (163.33.3.31) by
 IRSMSX103.ger.corp.intel.com (163.33.3.157) with Microsoft SMTP Server (TLS)
 id 14.3.195.1; Fri, 19 Sep 2014 13:47:36 +0100
Received: from irsmsx102.ger.corp.intel.com ([169.254.2.24]) by
 IRSMSX106.ger.corp.intel.com ([169.254.8.3]) with mapi id 14.03.0195.001;
 Fri, 19 Sep 2014 13:47:36 +0100
From: "Wodkowski, PawelX" <pawelx.wodkowski@intel.com>
To: Neil Horman <nhorman@tuxdriver.com>
Thread-Topic: [dpdk-dev] [PATCH 2/2] bond: add mode 4 support
Thread-Index: AQHP0oLWm9lr6LYAYECquqxpYdLEtpwFXfgAgAEYBtCAAIgjAIABXSYA
Date: Fri, 19 Sep 2014 12:47:35 +0000
Message-ID: <F6F2A6264E145F47A18AB6DF8E87425D12B2513B@IRSMSX102.ger.corp.intel.com>
References: <1410963713-13837-1-git-send-email-pawelx.wodkowski@intel.com>
 <1410963713-13837-3-git-send-email-pawelx.wodkowski@intel.com>
 <20140917151304.GD4213@localhost.localdomain>
 <F6F2A6264E145F47A18AB6DF8E87425D12B24CC2@IRSMSX102.ger.corp.intel.com>
 <20140918160234.GJ20389@hmsreliant.think-freely.org>
In-Reply-To: <20140918160234.GJ20389@hmsreliant.think-freely.org>
Accept-Language: pl-PL, en-US
Content-Language: en-US
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
x-originating-ip: [163.33.239.180]
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Cc: "dev@dpdk.org" <dev@dpdk.org>, "Jastrzebski,
 MichalX K" <michalx.k.jastrzebski@intel.com>
Subject: Re: [dpdk-dev] [PATCH 2/2] bond: add mode 4 support
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: patches and discussions about DPDK <dev.dpdk.org>
List-Unsubscribe: <http://dpdk.org/ml/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://dpdk.org/ml/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <http://dpdk.org/ml/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
X-List-Received-Date: Fri, 19 Sep 2014 12:42:10 -0000

> -----Original Message-----
> From: Neil Horman [mailto:nhorman@tuxdriver.com]
> Sent: Thursday, September 18, 2014 18:03
> To: Wodkowski, PawelX
> Cc: dev@dpdk.org; Jastrzebski, MichalX K; Doherty, Declan
> Subject: Re: [dpdk-dev] [PATCH 2/2] bond: add mode 4 support
>=20
> On Thu, Sep 18, 2014 at 08:07:31AM +0000, Wodkowski, PawelX wrote:
> > > > +int
> > > > +bond_mode_8023ad_deactivate_slave(struct rte_eth_dev *bond_dev,
> > > > +		uint8_t slave_pos)
> > > > +{
> > > > +	struct bond_dev_private *internals =3D bond_dev->data->dev_privat=
e;
> > > > +	struct mode8023ad_data *data =3D &internals->mode4;
> > > > +	struct port *port;
> > > > +	uint8_t i;
> > > > +
> > > > +	bond_mode_8023ad_stop(bond_dev);
> > > > +
> > > > +	/* Exclude slave from transmit policy. If this slave is an aggreg=
ator
> > > > +	 * make all aggregated slaves unselected to force sellection logi=
c
> > > > +	 * to select suitable aggregator for this port	 */
> > > > +	for (i =3D 0; i < internals->active_slave_count; i++) {
> > > > +		port =3D &data->port_list[slave_pos];
> > > > +		if (port->used_agregator_idx =3D=3D slave_pos) {
> > > > +			port->selected =3D UNSELECTED;
> > > > +			port->actor_state &=3D ~(STATE_SYNCHRONIZATION |
> > > STATE_DISTRIBUTING |
> > > > +				STATE_COLLECTING);
> > > > +
> > > > +			/* Use default aggregator */
> > > > +			port->used_agregator_idx =3D i;
> > > > +		}
> > > > +	}
> > > > +
> > > > +	port =3D &data->port_list[slave_pos];
> > > > +	timer_cancel(&port->current_while_timer);
> > > > +	timer_cancel(&port->periodic_timer);
> > > > +	timer_cancel(&port->wait_while_timer);
> > > > +	timer_cancel(&port->tx_machine_timer);
> > > > +
> > > These all seem rather racy.  Alarm callbacks are executed with the al=
arm list
> > > locks not held.  So there is every possibility that you could execute=
 these (or
> > > any timer_cancel calls in this PMD in parallel with the internal stat=
e machine
> > > timer callback, and leave either with a corrupted timer list (resulti=
ng from a
> > > double free between here, and the actual callback site),
> >
> > I don't think so. Yes, callbacks are executed with  alarm list locks no=
t held, but
> > this is not the issue because access to list itself is guarded by lock =
and
> > ap->executing variable. So list will not be trashed. Check source of
> > eal_alarm_callback(), rte_eal_alarm_set() and rte_eal_alarm_cancel().
> >
> Yes, you're right, the list is probably safe wht the executing bit.
>=20
> > > or a timer that is
> > > actually still pending when a slave is removed.
> > >
> > This is not the issue also, but problem might be similar. I assumed tha=
t alarms
> > are atomic but when I looked at rte alarms closer I saw a race conditio=
n
> > between and rte_eal_alarm_cancel() from  bond_mode_8023ad_stop()
> > and rte_eal_alarm_set() from state machines callback. This need to be
> > reworked in some way.
>=20
> Yes, this is what I was referring to:
>=20
> CPU0				CPU1
> rte_eal_alarm_callback		bond_8023ad_deactivate_slave
> -bond_8023_ad_periodic_cb	timer_cancel
> timer_set
>=20
> If those timer functions operate on the same timer, the result is that yo=
u can
> leave the stop/deactivate slave paths with a timer function for that slav=
e still
> pending. The bonding mode needs some internal state to serialize those
> operations and determine if the timer should be reactivated.
>=20
> Neil

I did rethink the issue and problem is much simpler than it looks like. I d=
id the=20
following:
1. Change internal state machine alarms to use rte_rdtsc(). This makes all=
=20
 mode 4 internal timer_*() function not affected by any race condition.
2. Do a busy loop when canceling main callback timer until cancel is succes=
sfull.
This should do the trick about race condition. Do you agree?

Here is part involving timers I have changed:

static void
-timer_expired_cb(void *arg)
+timer_stop(uint64_t *timer)
 {
-	enum timer_state *timer =3D arg;
-
-	BOND_ASSERT(*timer =3D=3D TIMER_RUNNING);
-	*timer =3D TIMER_EXPIRED;
+	*timer =3D 0;
 }
=20
 static void
-timer_cancel(enum timer_state *timer)
+timer_set(uint64_t *timer, uint64_t timeout_ms)
 {
-	rte_eal_alarm_cancel(&timer_expired_cb, timer);
-	*timer =3D TIMER_NOT_STARTED;
+	*timer =3D rte_rdtsc() + timeout_ms * rte_get_tsc_hz() / 1000;
 }
=20
+/* Forces given timer to be in expired state. */
 static void
-timer_set(enum timer_state *timer, uint64_t timeout)
+timer_force_expired(uint64_t *timer)
 {
-	rte_eal_alarm_cancel(&timer_expired_cb, timer);
-	rte_eal_alarm_set(timeout * 1000, &timer_expired_cb, timer);
-	*timer =3D TIMER_RUNNING;
+	*timer =3D rte_rdtsc();
 }
=20
 static bool
-timer_is_expired(enum timer_state *timer)
+timer_is_stopped(uint64_t *timer)
 {
-	return *timer =3D=3D TIMER_EXPIRED;
+	return *timer =3D=3D 0;
+}
+
+/* Timer is in running state if it is not stopped nor expired */
+static bool
+timer_is_running(uint64_t *timer)
+{
+	return *timer > 0 && *timer < rte_rdtsc();
+}
+
+
+static bool
+timer_is_expired(uint64_t *timer)
+{
+	return *timer <=3D rte_rdtsc();
 }
---

And part stopping mode 4 callback.

-int
+void
 bond_mode_8023ad_stop(struct rte_eth_dev *bond_dev)
 {
-	rte_eal_alarm_cancel(bond_mode_8023ad_periodic_cb, bond_dev);
-	return 0;
+	/* Loop untill we cancel pending alarm. Alarm that is executing will
+	 * not be canceled but when reshedules it self it will be canceled. */
+	while (rte_eal_alarm_cancel(&bond_mode_8023ad_periodic_cb, bond_dev) =3D=
=3D 0)
+		rte_pause();
 }