From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mga09.intel.com (mga09.intel.com [134.134.136.24]) by dpdk.org (Postfix) with ESMTP id 8CD48B3AF for ; Fri, 19 Sep 2014 14:42:09 +0200 (CEST) Received: from orsmga002.jf.intel.com ([10.7.209.21]) by orsmga102.jf.intel.com with ESMTP; 19 Sep 2014 05:41:46 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.04,554,1406617200"; d="scan'208";a="605367164" Received: from irsmsx103.ger.corp.intel.com ([163.33.3.157]) by orsmga002.jf.intel.com with ESMTP; 19 Sep 2014 05:47:54 -0700 Received: from irsmsx106.ger.corp.intel.com (163.33.3.31) by IRSMSX103.ger.corp.intel.com (163.33.3.157) with Microsoft SMTP Server (TLS) id 14.3.195.1; Fri, 19 Sep 2014 13:47:36 +0100 Received: from irsmsx102.ger.corp.intel.com ([169.254.2.24]) by IRSMSX106.ger.corp.intel.com ([169.254.8.3]) with mapi id 14.03.0195.001; Fri, 19 Sep 2014 13:47:36 +0100 From: "Wodkowski, PawelX" To: Neil Horman Thread-Topic: [dpdk-dev] [PATCH 2/2] bond: add mode 4 support Thread-Index: AQHP0oLWm9lr6LYAYECquqxpYdLEtpwFXfgAgAEYBtCAAIgjAIABXSYA Date: Fri, 19 Sep 2014 12:47:35 +0000 Message-ID: References: <1410963713-13837-1-git-send-email-pawelx.wodkowski@intel.com> <1410963713-13837-3-git-send-email-pawelx.wodkowski@intel.com> <20140917151304.GD4213@localhost.localdomain> <20140918160234.GJ20389@hmsreliant.think-freely.org> In-Reply-To: <20140918160234.GJ20389@hmsreliant.think-freely.org> Accept-Language: pl-PL, en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-originating-ip: [163.33.239.180] Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Cc: "dev@dpdk.org" , "Jastrzebski, MichalX K" Subject: Re: [dpdk-dev] [PATCH 2/2] bond: add mode 4 support X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: patches and discussions about DPDK List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 19 Sep 2014 12:42:10 -0000 > -----Original Message----- > From: Neil Horman [mailto:nhorman@tuxdriver.com] > Sent: Thursday, September 18, 2014 18:03 > To: Wodkowski, PawelX > Cc: dev@dpdk.org; Jastrzebski, MichalX K; Doherty, Declan > Subject: Re: [dpdk-dev] [PATCH 2/2] bond: add mode 4 support >=20 > On Thu, Sep 18, 2014 at 08:07:31AM +0000, Wodkowski, PawelX wrote: > > > > +int > > > > +bond_mode_8023ad_deactivate_slave(struct rte_eth_dev *bond_dev, > > > > + uint8_t slave_pos) > > > > +{ > > > > + struct bond_dev_private *internals =3D bond_dev->data->dev_privat= e; > > > > + struct mode8023ad_data *data =3D &internals->mode4; > > > > + struct port *port; > > > > + uint8_t i; > > > > + > > > > + bond_mode_8023ad_stop(bond_dev); > > > > + > > > > + /* Exclude slave from transmit policy. If this slave is an aggreg= ator > > > > + * make all aggregated slaves unselected to force sellection logi= c > > > > + * to select suitable aggregator for this port */ > > > > + for (i =3D 0; i < internals->active_slave_count; i++) { > > > > + port =3D &data->port_list[slave_pos]; > > > > + if (port->used_agregator_idx =3D=3D slave_pos) { > > > > + port->selected =3D UNSELECTED; > > > > + port->actor_state &=3D ~(STATE_SYNCHRONIZATION | > > > STATE_DISTRIBUTING | > > > > + STATE_COLLECTING); > > > > + > > > > + /* Use default aggregator */ > > > > + port->used_agregator_idx =3D i; > > > > + } > > > > + } > > > > + > > > > + port =3D &data->port_list[slave_pos]; > > > > + timer_cancel(&port->current_while_timer); > > > > + timer_cancel(&port->periodic_timer); > > > > + timer_cancel(&port->wait_while_timer); > > > > + timer_cancel(&port->tx_machine_timer); > > > > + > > > These all seem rather racy. Alarm callbacks are executed with the al= arm list > > > locks not held. So there is every possibility that you could execute= these (or > > > any timer_cancel calls in this PMD in parallel with the internal stat= e machine > > > timer callback, and leave either with a corrupted timer list (resulti= ng from a > > > double free between here, and the actual callback site), > > > > I don't think so. Yes, callbacks are executed with alarm list locks no= t held, but > > this is not the issue because access to list itself is guarded by lock = and > > ap->executing variable. So list will not be trashed. Check source of > > eal_alarm_callback(), rte_eal_alarm_set() and rte_eal_alarm_cancel(). > > > Yes, you're right, the list is probably safe wht the executing bit. >=20 > > > or a timer that is > > > actually still pending when a slave is removed. > > > > > This is not the issue also, but problem might be similar. I assumed tha= t alarms > > are atomic but when I looked at rte alarms closer I saw a race conditio= n > > between and rte_eal_alarm_cancel() from bond_mode_8023ad_stop() > > and rte_eal_alarm_set() from state machines callback. This need to be > > reworked in some way. >=20 > Yes, this is what I was referring to: >=20 > CPU0 CPU1 > rte_eal_alarm_callback bond_8023ad_deactivate_slave > -bond_8023_ad_periodic_cb timer_cancel > timer_set >=20 > If those timer functions operate on the same timer, the result is that yo= u can > leave the stop/deactivate slave paths with a timer function for that slav= e still > pending. The bonding mode needs some internal state to serialize those > operations and determine if the timer should be reactivated. >=20 > Neil I did rethink the issue and problem is much simpler than it looks like. I d= id the=20 following: 1. Change internal state machine alarms to use rte_rdtsc(). This makes all= =20 mode 4 internal timer_*() function not affected by any race condition. 2. Do a busy loop when canceling main callback timer until cancel is succes= sfull. This should do the trick about race condition. Do you agree? Here is part involving timers I have changed: static void -timer_expired_cb(void *arg) +timer_stop(uint64_t *timer) { - enum timer_state *timer =3D arg; - - BOND_ASSERT(*timer =3D=3D TIMER_RUNNING); - *timer =3D TIMER_EXPIRED; + *timer =3D 0; } =20 static void -timer_cancel(enum timer_state *timer) +timer_set(uint64_t *timer, uint64_t timeout_ms) { - rte_eal_alarm_cancel(&timer_expired_cb, timer); - *timer =3D TIMER_NOT_STARTED; + *timer =3D rte_rdtsc() + timeout_ms * rte_get_tsc_hz() / 1000; } =20 +/* Forces given timer to be in expired state. */ static void -timer_set(enum timer_state *timer, uint64_t timeout) +timer_force_expired(uint64_t *timer) { - rte_eal_alarm_cancel(&timer_expired_cb, timer); - rte_eal_alarm_set(timeout * 1000, &timer_expired_cb, timer); - *timer =3D TIMER_RUNNING; + *timer =3D rte_rdtsc(); } =20 static bool -timer_is_expired(enum timer_state *timer) +timer_is_stopped(uint64_t *timer) { - return *timer =3D=3D TIMER_EXPIRED; + return *timer =3D=3D 0; +} + +/* Timer is in running state if it is not stopped nor expired */ +static bool +timer_is_running(uint64_t *timer) +{ + return *timer > 0 && *timer < rte_rdtsc(); +} + + +static bool +timer_is_expired(uint64_t *timer) +{ + return *timer <=3D rte_rdtsc(); } --- And part stopping mode 4 callback. -int +void bond_mode_8023ad_stop(struct rte_eth_dev *bond_dev) { - rte_eal_alarm_cancel(bond_mode_8023ad_periodic_cb, bond_dev); - return 0; + /* Loop untill we cancel pending alarm. Alarm that is executing will + * not be canceled but when reshedules it self it will be canceled. */ + while (rte_eal_alarm_cancel(&bond_mode_8023ad_periodic_cb, bond_dev) =3D= =3D 0) + rte_pause(); }