From mboxrd@z Thu Jan 1 00:00:00 1970
From: Tim Shearer
To: "dev@dpdk.org"
Date: Thu, 24 Sep 2015 20:44:33 +0000
Message-ID: <33526A3108217C45B7DAFFA5277E4B67485277B3@mbx024-e1-nj-2.exch024.domain.local>
Subject: [dpdk-dev] [PATCH] librte: Link status interrupt race condition, IGB E1000

I encountered an issue with DPDK 2.1.0 which occasionally causes the
link status interrupt callback not to be called after the interface is
started for the first time. I traced the problem back to the function
eth_igb_link_update(), which is used to determine if the link has
changed state since the previous time it was called. It appears that
this function can be called simultaneously from two different threads:

(1) From the main application/configuration thread, via
rte_eth_dev_start() - pointed to by (*dev->dev_ops->link_update)

(2) From the EAL interrupt thread, via eth_igb_interrupt_action(), to
check if the link state has transitioned up or down.
The user callback is only executed if the link has changed state.

The race condition manifests itself as follows:

- Main thread configures the interface with link status interrupt (LSI)
  enabled, sets up the queues, etc.
- Main thread calls rte_eth_dev_start(). The interface is started and
  then we call eth_igb_link_update().
- While in this call, the link goes up. Accordingly, we detect the
  transition, and write the new link state (up) into the global
  rte_eth_dev struct.
- The interrupt fires, which also drops into the eth_igb_link_update()
  function, and finds that the global link status has already been set
  to up (no change).
- Therefore, the handler thinks the interrupt was spurious, and the
  callback doesn't get called.

I suspect that rte_eth_dev_start() shouldn't be checking the link state
if interrupts are enabled. Would someone mind taking a quick look at the
patch below?

Thanks!
Tim

--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -1300,7 +1300,7 @@ rte_eth_dev_start(uint8_t port_id)
 
 	rte_eth_dev_config_restore(port_id);
 
-	if (dev->data->dev_conf.intr_conf.lsc != 0) {
+	if (dev->data->dev_conf.intr_conf.lsc == 0) {
 		FUNC_PTR_OR_ERR_RET(*dev->dev_ops->link_update, -ENOTSUP);
 		(*dev->dev_ops->link_update)(dev, 0);
 	}
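
For anyone who wants to reason about the interleaving without hardware,
the sequence described in the report can be modelled in a single thread.
This is only a sketch under stated assumptions: every name below
(struct sim_port, sim_link_update, sim_interrupt_action, sim_dev_start,
sim_callbacks_after_start) is hypothetical and none of it is DPDK code.
It assumes the link comes up while the start call is in progress and
that the interrupt is serviced only after the start call has finished
its poll.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the shared per-port state; not a DPDK type. */
struct sim_port {
	bool hw_link_up;     /* what the NIC reports right now */
	bool cached_link_up; /* last status written to the shared device struct */
	bool lsc_enabled;    /* models dev_conf.intr_conf.lsc != 0 */
	int callbacks_run;   /* number of user link-status callbacks delivered */
};

/* Models the link_update op: refresh the cache, report whether it changed. */
static bool sim_link_update(struct sim_port *p)
{
	bool changed = (p->cached_link_up != p->hw_link_up);

	p->cached_link_up = p->hw_link_up;
	return changed;
}

/* Models the interrupt handler: the callback runs only on a detected change. */
static void sim_interrupt_action(struct sim_port *p)
{
	if (sim_link_update(p))
		p->callbacks_run++;
}

/*
 * Models the tail of the start call. With patched == false the link is
 * polled when LSC is enabled (condition before the patch); with
 * patched == true it is polled only when LSC is disabled.
 */
static void sim_dev_start(struct sim_port *p, bool patched)
{
	bool poll = patched ? !p->lsc_enabled : p->lsc_enabled;

	p->hw_link_up = true; /* assumption: link comes up during start */
	if (poll)
		(void)sim_link_update(p);
}

/* Runs the assumed interleaving once and returns the callback count. */
int sim_callbacks_after_start(bool patched)
{
	struct sim_port p = { .lsc_enabled = true };

	sim_dev_start(&p, patched);
	sim_interrupt_action(&p); /* interrupt serviced after the poll */
	return p.callbacks_run;
}
```

Under these assumptions sim_callbacks_after_start(false) returns 0 (the
poll in the start path consumes the transition, so the handler sees no
change), while sim_callbacks_after_start(true) returns 1. The model only
covers the one ordering from the report; it says nothing about other
interleavings or about applications that rely on the poll when LSC is
enabled.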