From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mga01.intel.com (mga01.intel.com [192.55.52.88]) by dpdk.org (Postfix) with ESMTP id 19596B39B for ; Thu, 18 Sep 2014 10:01:49 +0200 (CEST) Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by fmsmga101.fm.intel.com with ESMTP; 18 Sep 2014 01:07:33 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="4.97,862,1389772800"; d="scan'208";a="387914117" Received: from irsmsx103.ger.corp.intel.com ([163.33.3.157]) by FMSMGA003.fm.intel.com with ESMTP; 18 Sep 2014 01:02:00 -0700 Received: from irsmsx155.ger.corp.intel.com (163.33.192.3) by IRSMSX103.ger.corp.intel.com (163.33.3.157) with Microsoft SMTP Server (TLS) id 14.3.195.1; Thu, 18 Sep 2014 09:07:32 +0100 Received: from irsmsx102.ger.corp.intel.com ([169.254.2.24]) by IRSMSX155.ger.corp.intel.com ([169.254.14.85]) with mapi id 14.03.0195.001; Thu, 18 Sep 2014 09:07:32 +0100 From: "Wodkowski, PawelX" To: Neil Horman Thread-Topic: [dpdk-dev] [PATCH 2/2] bond: add mode 4 support Thread-Index: AQHP0oLWm9lr6LYAYECquqxpYdLEtpwFXfgAgAEYBtA= Date: Thu, 18 Sep 2014 08:07:31 +0000 Message-ID: References: <1410963713-13837-1-git-send-email-pawelx.wodkowski@intel.com> <1410963713-13837-3-git-send-email-pawelx.wodkowski@intel.com> <20140917151304.GD4213@localhost.localdomain> In-Reply-To: <20140917151304.GD4213@localhost.localdomain> Accept-Language: pl-PL, en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-originating-ip: [163.33.239.180] Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Cc: "dev@dpdk.org" , "Jastrzebski, MichalX K" Subject: Re: [dpdk-dev] [PATCH 2/2] bond: add mode 4 support X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: patches and discussions about DPDK List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 18 Sep 2014 08:01:50 -0000 > > +int > > +bond_mode_8023ad_deactivate_slave(struct rte_eth_dev *bond_dev, > > + uint8_t slave_pos) > > +{ > > + struct bond_dev_private *internals =3D bond_dev->data->dev_private; > > + struct mode8023ad_data *data =3D &internals->mode4; > > + struct port *port; > > + uint8_t i; > > + > > + bond_mode_8023ad_stop(bond_dev); > > + > > + /* Exclude slave from transmit policy. If this slave is an aggregator > > + * make all aggregated slaves unselected to force sellection logic > > + * to select suitable aggregator for this port */ > > + for (i =3D 0; i < internals->active_slave_count; i++) { > > + port =3D &data->port_list[slave_pos]; > > + if (port->used_agregator_idx =3D=3D slave_pos) { > > + port->selected =3D UNSELECTED; > > + port->actor_state &=3D ~(STATE_SYNCHRONIZATION | > STATE_DISTRIBUTING | > > + STATE_COLLECTING); > > + > > + /* Use default aggregator */ > > + port->used_agregator_idx =3D i; > > + } > > + } > > + > > + port =3D &data->port_list[slave_pos]; > > + timer_cancel(&port->current_while_timer); > > + timer_cancel(&port->periodic_timer); > > + timer_cancel(&port->wait_while_timer); > > + timer_cancel(&port->tx_machine_timer); > > + > These all seem rather racy. Alarm callbacks are executed with the alarm = list > locks not held. So there is every possibility that you could execute the= se (or > any timer_cancel calls in this PMD in parallel with the internal state ma= chine > timer callback, and leave either with a corrupted timer list (resulting f= rom a > double free between here, and the actual callback site), I don't think so. Yes, callbacks are executed with alarm list locks not he= ld, but=20 this is not the issue because access to list itself is guarded by lock and= =20 ap->executing variable. So list will not be trashed. Check source of=20 eal_alarm_callback(), rte_eal_alarm_set() and rte_eal_alarm_cancel(). > or a timer that is > actually still pending when a slave is removed. >=20 This is not the issue also, but problem might be similar. I assumed that al= arms are atomic but when I looked at rte alarms closer I saw a race condition between and rte_eal_alarm_cancel() from bond_mode_8023ad_stop() and rte_eal_alarm_set() from state machines callback. This need to be=20 reworked in some way.