From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path: 
Received: from szxga03-in.huawei.com (szxga03-in.huawei.com [45.249.212.189])
 by dpdk.org (Postfix) with ESMTP id 8B1C64C57
 for ; Wed, 26 Jul 2017 12:13:28 +0200 (CEST)
Received: from 172.30.72.56 (EHLO DGGEML404-HUB.china.huawei.com)
 ([172.30.72.56]) by dggrg03-dlp.huawei.com (MOS 4.4.6-GA FastPath queued)
 with ESMTP id ASD00087; Wed, 26 Jul 2017 18:13:26 +0800 (CST)
Received: from localhost (10.177.220.209) by DGGEML404-HUB.china.huawei.com
 (10.3.17.39) with Microsoft SMTP Server id 14.3.301.0; Wed, 26 Jul 2017
 18:13:15 +0800
From: 
To: , 
CC: , , , Sha Zhang 
Date: Wed, 26 Jul 2017 18:13:12 +0800
Message-ID: <1501063992-10704-1-git-send-email-zhangsha.zhang@huawei.com>
X-Mailer: git-send-email 1.9.4.msysgit.1
MIME-Version: 1.0
Content-Type: text/plain
X-Originating-IP: [10.177.220.209]
X-CFilter-Loop: Reflected
Subject: [dpdk-dev] [PATCH v3] bonding: fix the segfault caused by the race
 condition between master thread and eal-intr-thread
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions 
List-Unsubscribe: , 
List-Archive: 
List-Post: 
List-Help: 
List-Subscribe: , 
X-List-Received-Date: Wed, 26 Jul 2017 10:13:30 -0000

From: Sha Zhang

Function slave_configure calls bond_ethdev_lsc_event_callback and
slave_eth_dev->dev_ops->link_update to update the slave link status.
However, there is a small chance that the process crashes: the master
thread, which created the bonding device, may raise the bond's
active_slave_count to a nonzero value before the bond's rx_ring and
tx_ring have been created.
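As context for the fix below, the defer-and-retry shape it relies on (arm an alarm, trylock, re-arm on contention, do the work once the lock is taken) can be sketched in plain C. Everything in this sketch is a hypothetical stand-in rather than DPDK code: fake_bond_private, fake_alarm_set and lsc_delay_cb only imitate internals->lock, rte_eal_alarm_set and the patch's callback, and pthread_mutex_trylock stands in for rte_spinlock_trylock.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for the bond's private data: the same lock the
 * data path takes, plus counters so the demo can observe what happened. */
struct fake_bond_private {
	pthread_mutex_t lock;   /* stands in for internals->lock */
	int slaves_updated;     /* work done once the lock is acquired */
	int reschedules;        /* how many times the callback deferred */
};

/* Stand-in for rte_eal_alarm_set(): invokes the callback immediately so
 * the retry path can be exercised in a single thread. */
static void fake_alarm_set(void (*cb)(void *), void *arg)
{
	cb(arg);
}

/* Mirrors the shape of the patch's deferred callback: try the lock,
 * re-arm on contention, otherwise do the slave link-status update. */
static void lsc_delay_cb(void *cb_arg)
{
	struct fake_bond_private *priv = cb_arg;

	if (pthread_mutex_trylock(&priv->lock) != 0) {
		priv->reschedules++;
		if (priv->reschedules < 3) /* bounded only for the demo */
			fake_alarm_set(lsc_delay_cb, priv);
		return;
	}
	priv->slaves_updated = 1;
	pthread_mutex_unlock(&priv->lock);
}
```

In the real patch the alarm fires later on the eal-intr-thread; calling the callback synchronously here just makes the behavior easy to check: while the lock is held the callback only reschedules itself, and once the lock is released the work runs exactly once.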
This patch moves the calls to bond_ethdev_lsc_event_callback and
slave_eth_dev->dev_ops->link_update into the eal-intr-thread to avoid
the race.

Fixes: 210903803f6e ("net/bonding: fix updating slave link status")

Signed-off-by: Sha Zhang 
---
 drivers/net/bonding/rte_eth_bond_pmd.c | 58 +++++++++++++++++++++++++++++-----
 1 file changed, 50 insertions(+), 8 deletions(-)

diff --git a/drivers/net/bonding/rte_eth_bond_pmd.c b/drivers/net/bonding/rte_eth_bond_pmd.c
index 383e27c..bc0ee7f 100644
--- a/drivers/net/bonding/rte_eth_bond_pmd.c
+++ b/drivers/net/bonding/rte_eth_bond_pmd.c
@@ -53,6 +53,7 @@
 
 #define REORDER_PERIOD_MS 10
 #define DEFAULT_POLLING_INTERVAL_10_MS (10)
+#define BOND_LSC_DELAY_TIME_US (10 * 1000)
 
 #define HASH_L4_PORTS(h) ((h)->src_port ^ (h)->dst_port)
@@ -1800,14 +1801,6 @@ struct bwg_slave {
 		}
 	}
 
-	/* If lsc interrupt is set, check initial slave's link status */
-	if (slave_eth_dev->data->dev_flags & RTE_ETH_DEV_INTR_LSC) {
-		slave_eth_dev->dev_ops->link_update(slave_eth_dev, 0);
-		bond_ethdev_lsc_event_callback(slave_eth_dev->data->port_id,
-			RTE_ETH_EVENT_INTR_LSC, &bonded_eth_dev->data->port_id,
-			NULL);
-	}
-
 	return 0;
 }
 
@@ -1878,6 +1871,51 @@ struct bwg_slave {
 static void bond_ethdev_promiscuous_enable(struct rte_eth_dev *eth_dev);
 
+static void
+bond_ethdev_slave_lsc_delay(void *cb_arg)
+{
+	struct rte_eth_dev *bonded_ethdev, *slave_dev;
+	struct bond_dev_private *internals;
+
+	/* Default value for polling slave found is true as we don't
+	 * want to disable the polling thread if we cannot get the lock.
+	 */
+	int i = 0;
+
+	if (!cb_arg)
+		return;
+
+	bonded_ethdev = (struct rte_eth_dev *)cb_arg;
+	if (!bonded_ethdev->data->dev_started)
+		return;
+
+	internals = (struct bond_dev_private *)bonded_ethdev->data->dev_private;
+	if (!rte_spinlock_trylock(&internals->lock)) {
+		rte_eal_alarm_set(BOND_LSC_DELAY_TIME_US * 10,
+				bond_ethdev_slave_lsc_delay,
+				(void *)&rte_eth_devices[internals->port_id]);
+		return;
+	}
+
+	for (i = 0; i < internals->slave_count; i++) {
+		slave_dev = &(rte_eth_devices[internals->slaves[i].port_id]);
+		if (slave_dev->data->dev_conf.intr_conf.lsc != 0) {
+			if (slave_dev->dev_ops &&
+				slave_dev->dev_ops->link_update)
+				slave_dev->dev_ops->link_update(slave_dev, 0);
+			bond_ethdev_lsc_event_callback(
+					internals->slaves[i].port_id,
+					RTE_ETH_EVENT_INTR_LSC,
+					&bonded_ethdev->data->port_id, NULL);
+		}
+	}
+	rte_spinlock_unlock(&internals->lock);
+	RTE_LOG(INFO, EAL,
+		"bond %s(%u): slave num %d, current active slave num %d\n",
+		bonded_ethdev->data->name, bonded_ethdev->data->port_id,
+		internals->slave_count, internals->active_slave_count);
+}
+
 static int
 bond_ethdev_start(struct rte_eth_dev *eth_dev)
 {
@@ -1953,6 +1991,10 @@ struct bwg_slave {
 		if (internals->slaves[i].link_status_poll_enabled)
 			internals->link_status_polling_enabled = 1;
 	}
+
+	rte_eal_alarm_set(BOND_LSC_DELAY_TIME_US, bond_ethdev_slave_lsc_delay,
+			(void *)&rte_eth_devices[internals->port_id]);
+
 	/* start polling if needed */
 	if (internals->link_status_polling_enabled) {
 		rte_eal_alarm_set(
-- 
1.8.3.1
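A note on the guard clauses at the top of the new callback: because the alarm may fire after the device has been stopped, the callback re-checks its argument and dev_started before touching anything. That pattern can be sketched with hypothetical stand-in types (fake_dev, fake_dev_data, lsc_cb_guards_pass are illustration only, not DPDK structures):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature of rte_eth_dev / rte_eth_dev_data: only the
 * field the guard clauses consult. */
struct fake_dev_data {
	int dev_started;
};

struct fake_dev {
	struct fake_dev_data *data;
};

/* Returns 1 if a deferred callback would proceed, 0 if a guard rejects
 * it: the alarm may fire with no context, or after the device was
 * stopped between arming the alarm and the callback running. */
static int lsc_cb_guards_pass(void *cb_arg)
{
	struct fake_dev *dev = cb_arg;

	if (cb_arg == NULL)
		return 0;	/* no context supplied */
	if (!dev->data->dev_started)
		return 0;	/* device stopped after the alarm was armed */
	return 1;
}
```

The same reasoning explains why the work is skipped, not retried, in these two cases: once the device is stopped there is no slave state left worth updating.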