From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from EUR02-VE1-obe.outbound.protection.outlook.com (mail-eopbgr20049.outbound.protection.outlook.com [40.107.2.49]) by dpdk.org (Postfix) with ESMTP id C2A011B85F; Thu, 8 Feb 2018 20:24:36 +0100 (CET) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=Mellanox.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version; bh=T/lGlF8GSsAowUEwNGeFALqjI0iJDIU0wdS4Jp81P+M=; b=aDtheYbbcH2EoIN5miTjdxCl705wAiL3GVWvE/gsmT9ClFX6pjHsmjzOhLnKUjGd7VqljggK9W7EZQPABdWQ4F2cDnOeRT11TTVRSBFMWZIf8bLAn8TkvhzkpOmltl3LyyAph0xpIvUbtXC/xHG1/QwvbR7qYfZsJOb4CYOs76E= Received: from AM4PR0501MB2657.eurprd05.prod.outlook.com (10.172.215.19) by AM4PR0501MB2706.eurprd05.prod.outlook.com (10.172.215.143) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384_P256) id 15.20.485.10; Thu, 8 Feb 2018 19:24:35 +0000 Received: from AM4PR0501MB2657.eurprd05.prod.outlook.com ([fe80::80c6:df5:b1b0:ff05]) by AM4PR0501MB2657.eurprd05.prod.outlook.com ([fe80::80c6:df5:b1b0:ff05%17]) with mapi id 15.20.0485.009; Thu, 8 Feb 2018 19:24:35 +0000 From: Matan Azrad To: =?iso-8859-1?Q?Ga=EBtan_Rivet?= CC: "dev@dpdk.org" , "stable@dpdk.org" , "Ophir Munk" Thread-Topic: [PATCH v5 3/3] net/failsafe: fix calling device during RMV events Thread-Index: AQHToQg4MiBMVw5LcUysGsiy5c1Id6Oa3O1Q Date: Thu, 8 Feb 2018 19:24:34 +0000 Message-ID: References: <1518092427-4333-1-git-send-email-matan@mellanox.com> <1518107653-15466-1-git-send-email-matan@mellanox.com> <1518107653-15466-4-git-send-email-matan@mellanox.com> <20180208181103.qmkjffn7tfdb7vm3@bidouze.vm.6wind.com> In-Reply-To: <20180208181103.qmkjffn7tfdb7vm3@bidouze.vm.6wind.com> Accept-Language: en-US, he-IL Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: authentication-results: spf=none (sender IP is ) smtp.mailfrom=matan@mellanox.com; x-originating-ip: [85.64.136.190] x-ms-publictraffictype: Email x-microsoft-exchange-diagnostics: 1; AM4PR0501MB2706; 7:ypsGkuhULFbmxcayogtFEq8LtmPdtS/iEbfhOcjOdlkqjf5IknWrmJ0wGJw7n4Vhc7QJOTFcxH4bk3+9xjvNJCdr5hAJoTWXDfY/thsGGfYURQznVATJwhFxJt0cDf8pwTz+U2QitZ87iQRgqp8Ra85tp0tl9DROjoHr6m9/qTGzowpJpn724mwlgAphSxdTqwv8gC7uHMs62agOOVd8wkHaLGEPMjcUgTan3uxt9ImeBsFkPxqm+Pba+OOSNSf3 x-ms-exchange-antispam-srfa-diagnostics: SSOS;SSOR; x-ms-office365-filtering-ht: Tenant x-ms-office365-filtering-correlation-id: 11c86772-452c-44b8-a2b1-08d56f29966f x-microsoft-antispam: UriScan:; BCL:0; PCL:0; RULEID:(7020095)(4652020)(48565401081)(4534165)(4627221)(201703031133081)(201702281549075)(5600026)(4604075)(3008032)(2017052603307)(7153060)(7193020); SRVR:AM4PR0501MB2706; x-ms-traffictypediagnostic: AM4PR0501MB2706: x-microsoft-antispam-prvs: x-exchange-antispam-report-test: UriScan:; x-exchange-antispam-report-cfa-test: BCL:0; PCL:0; RULEID:(6040501)(2401047)(5005006)(8121501046)(93006095)(93001095)(3002001)(3231101)(2400082)(944501161)(10201501046)(6055026)(6041288)(20161123564045)(201703131423095)(201702281528075)(20161123555045)(201703061421075)(201703061406153)(20161123558120)(20161123562045)(20161123560045)(6072148)(201708071742011); SRVR:AM4PR0501MB2706; BCL:0; PCL:0; RULEID:; SRVR:AM4PR0501MB2706; x-forefront-prvs: 0577AD41D6 x-forefront-antispam-report: SFV:NSPM; SFS:(10009020)(346002)(39860400002)(376002)(39380400002)(396003)(366004)(199004)(43544003)(189003)(93886005)(76176011)(102836004)(305945005)(7736002)(8676002)(74316002)(2900100001)(14454004)(2950100002)(6916009)(81156014)(6436002)(81166006)(4326008)(25786009)(106356001)(99286004)(6116002)(7696005)(316002)(9686003)(97736004)(107886003)(54906003)(66066001)(3846002)(5250100002)(86362001)(68736007)(53936002)(55016002)(8936002)(3660700001)(5660300001)(59450400001)(229853002)(6246003)(3280700002)(2906002)(26005)(6506007)(186003)(105586002)(478600001)(33656002); DIR:OUT; SFP:1101; SCL:1; SRVR:AM4PR0501MB2706; H:AM4PR0501MB2657.eurprd05.prod.outlook.com; FPR:; SPF:None; PTR:InfoNoRecords; A:1; MX:1; LANG:en; received-spf: None (protection.outlook.com: mellanox.com does not designate permitted sender hosts) x-microsoft-antispam-message-info: kaL3ZpG8P9ffLl24ivLMZdZBI2Ds6Z+VpWkOyawG5y+89gDuRedEl0h3OGWKE4DW76qdUZj1XeNJPqS6DFe9LQ== spamdiagnosticoutput: 1:99 spamdiagnosticmetadata: NSPM Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-OriginatorOrg: Mellanox.com X-MS-Exchange-CrossTenant-Network-Message-Id: 11c86772-452c-44b8-a2b1-08d56f29966f X-MS-Exchange-CrossTenant-originalarrivaltime: 08 Feb 2018 19:24:34.9627 (UTC) X-MS-Exchange-CrossTenant-fromentityheader: Hosted X-MS-Exchange-CrossTenant-id: a652971c-7d2e-4d9b-a6a4-d149256f461b X-MS-Exchange-Transport-CrossTenantHeadersStamped: AM4PR0501MB2706 Subject: Re: [dpdk-dev] [PATCH v5 3/3] net/failsafe: fix calling device during RMV events X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 08 Feb 2018 19:24:37 -0000 Hi Gaetan > From: Ga=EBtan Rivet, Thursday, February 8, 2018 8:11 PM > On Thu, Feb 08, 2018 at 04:34:13PM +0000, Matan Azrad wrote: > > Following are the failure steps: > > 1. The physical device is removed due to change in one of PF > > parameters (e.g. MTU) 2. The interrupt thread flags the device 3. > > Within 2 seconds Interrupt thread initializes the actual device > > removal, then every 2 seconds it tries to re-sync (plug in) the > > device. The trials fail as long as VF parameter mismatches the PF > parameter. > > 4. A control thread initiates a control operation on failsafe which > > initiates this operation on the device. > > 5. A race condition occurs between the control thread and interrupt > > thread when accessing the device data structures. > > > > This patch mitigates the race condition in step 5. > > > > Fixes: a46f8d584eb8 ("net/failsafe: add fail-safe PMD") > > Cc: stable@dpdk.org > > > > Signed-off-by: Matan Azrad > > --- > > drivers/net/failsafe/failsafe.c | 2 +- > > drivers/net/failsafe/failsafe_eal.c | 2 +- > > drivers/net/failsafe/failsafe_ether.c | 2 +- > > drivers/net/failsafe/failsafe_ops.c | 26 +++++++++++++++++-------- > > drivers/net/failsafe/failsafe_private.h | 34 > > ++++++++++++++++++++++++++------- > > 5 files changed, 48 insertions(+), 18 deletions(-) > > > > diff --git a/drivers/net/failsafe/failsafe.c > > b/drivers/net/failsafe/failsafe.c index 7b2cdbb..6cdefd0 100644 > > --- a/drivers/net/failsafe/failsafe.c > > +++ b/drivers/net/failsafe/failsafe.c > > @@ -187,7 +187,7 @@ > > * If MAC address was provided as a parameter, > > * apply to all probed slaves. > > */ > > - FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_PROBED) { > > + FOREACH_SUBDEV_STATE_UNSAFE(sdev, i, dev, > DEV_PROBED) { >=20 > No need for the UNSAFE here. The ports should have been just initialized, > and sdev->remove should be 0. So, the check is not relevant, why to do so? The UNSAFE skip the check. > If sdev->remove is 1, then it means it has been set already by a plugout > event, meaning that rte_eth_dev_default_mac_addr_set should not even > be called on it. >=20 > > ret =3D > rte_eth_dev_default_mac_addr_set(PORT_ID(sdev), > > mac); > > if (ret) { > > diff --git a/drivers/net/failsafe/failsafe_eal.c > > b/drivers/net/failsafe/failsafe_eal.c > > index c3d6731..b3b9c32 100644 > > --- a/drivers/net/failsafe/failsafe_eal.c > > +++ b/drivers/net/failsafe/failsafe_eal.c > > @@ -126,7 +126,7 @@ > > int sdev_ret; > > int ret =3D 0; > > > > - FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_PROBED) { > > + FOREACH_SUBDEV_STATE_UNSAFE(sdev, i, dev, DEV_PROBED) { > > sdev_ret =3D rte_eal_hotplug_remove(sdev->bus->name, > > sdev->dev->name); > > if (sdev_ret) { > > diff --git a/drivers/net/failsafe/failsafe_ether.c > > b/drivers/net/failsafe/failsafe_ether.c > > index ca42376..f2a52c9 100644 > > --- a/drivers/net/failsafe/failsafe_ether.c > > +++ b/drivers/net/failsafe/failsafe_ether.c > > @@ -325,7 +325,7 @@ > > struct sub_device *sdev; > > uint8_t i; > > > > - FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) > > + FOREACH_SUBDEV_STATE_UNSAFE(sdev, i, dev, DEV_ACTIVE) > > if (sdev->remove && fs_rxtx_clean(sdev)) { > > fs_dev_stats_save(sdev); > > fs_dev_remove(sdev); > > diff --git a/drivers/net/failsafe/failsafe_ops.c > > b/drivers/net/failsafe/failsafe_ops.c > > index a7c2dba..3d2cb32 100644 > > --- a/drivers/net/failsafe/failsafe_ops.c > > +++ b/drivers/net/failsafe/failsafe_ops.c > > @@ -181,6 +181,9 @@ > > FOREACH_SUBDEV(sdev, i, dev) { > > if (sdev->state !=3D DEV_ACTIVE) > > continue; > > + if (sdev->remove =3D=3D 1 && PRIV(dev)->state < DEV_STARTED) > > + /* Application shouldn't start removed sub-devices. > */ > > + continue; >=20 > FOREACH_SUBDEV should already have avoided sub-devices which remove > flag is 1. fs_dev_start() is called by the alarm thread too to restart a removed devic= e(marked by remove flag), so it should not condition the remove flag. > If not, then the fs_err(sdev, ret) stanza right after will let the loop c= ontinue, > and the port should be handled by the next slave cleanup. >=20 > > DEBUG("Starting sub_device %d", i); > > ret =3D rte_eth_dev_start(PORT_ID(sdev)); > > if (ret) { > > @@ -265,10 +268,17 @@ > > uint8_t i; > > > > failsafe_hotplug_alarm_cancel(dev); > > - if (PRIV(dev)->state =3D=3D DEV_STARTED) > > + if (PRIV(dev)->state =3D=3D DEV_STARTED) { > > + /* > > + * Clean remove flags to allow stop for all sub-devices > because > > + * there is not hot-plug alarm to stop the removed sub- > devices. > > + */ > > + FOREACH_SUBDEV_STATE_UNSAFE(sdev, i, dev, > DEV_STARTED) > > + sdev->remove =3D 0; > Why make this conditional to state =3D=3D DEV_STARTED? > > dev->dev_ops->dev_stop(dev); > > + } > > PRIV(dev)->state =3D DEV_ACTIVE - 1; > > - FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) { > > + FOREACH_SUBDEV_STATE_UNSAFE(sdev, i, dev, DEV_ACTIVE) { > > DEBUG("Closing sub_device %d", i); > > rte_eth_dev_close(PORT_ID(sdev)); > > sdev->state =3D DEV_ACTIVE - 1; >=20 > --> >=20 > /* > * Clean remove flags to allow stop for all sub-devices because > * there is no hot-plug alarm to clean the removed sub-devices. > */ > FOREACH_SUBDEV_STATE_UNSAFE(sdev, i, dev, DEV_ACTIVE) > sdev->remove =3D 0; > if (PRIV(dev)->state =3D=3D DEV_STARTED) > dev->dev_ops->dev_stop(dev); > PRIV(dev)->state =3D DEV_ACTIVE - 1; > FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) { > DEBUG("Closing sub_device %d", i); > rte_eth_dev_close(PORT_ID(sdev)); > sdev->state =3D DEV_ACTIVE - 1; > Agree. =20 > > @@ -309,7 +319,7 @@ > > if (rxq->event_fd > 0) > > close(rxq->event_fd); > > dev =3D rxq->priv->dev; > > - FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) > > + FOREACH_SUBDEV_STATE_UNSAFE(sdev, i, dev, DEV_ACTIVE) >=20 > No need here, as you would have reset sdev->remove if the port was > closing, or it would be dealt with by fs_dev_remove if the alarm is still > running. Agree. >=20 > > SUBOPS(sdev, rx_queue_release) > > (ETH(sdev)->data->rx_queues[rxq->qid]); > > dev->data->rx_queues[rxq->qid] =3D NULL; @@ -376,7 +386,7 @@ >=20 > you really should update your git, it is difficult to verify these change= s > without the function contexts. > Agree :) sorry. =20 > > return ret; > > rxq->event_fd =3D intr_handle.efds[0]; > > dev->data->rx_queues[rx_queue_id] =3D rxq; > > - FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) { > > + FOREACH_SUBDEV_STATE_UNSAFE(sdev, i, dev, DEV_ACTIVE) { >=20 > Why should we setup queues on ports marked for removal? >=20 Need to change it.=20 > > ret =3D rte_eth_rx_queue_setup(PORT_ID(sdev), > > rx_queue_id, > > nb_rx_desc, socket_id, > > @@ -493,7 +503,7 @@ > > return; > > txq =3D queue; > > dev =3D txq->priv->dev; > > - FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) > > + FOREACH_SUBDEV_STATE_UNSAFE(sdev, i, dev, DEV_ACTIVE) >=20 > Same as for rx_queue_release: either the device is closing, and the flag = has > been reset, or the alarm is still running and will take care of this. >=20 Agree. > > SUBOPS(sdev, tx_queue_release) > > (ETH(sdev)->data->tx_queues[txq->qid]); >=20 > Actually, now that I think about it, there seems to be an issue with queu= es > not released on plugout? >=20 > In fs_dev_remove, we only do the general dev_stop and dev_close on the > sub_device. >=20 > shouldn't we call tx_queue_release as well before? Isn't it done by dev_close()? >=20 > > dev->data->tx_queues[txq->qid] =3D NULL; @@ -548,7 +558,7 @@ > > txq->info.nb_desc =3D nb_tx_desc; > > txq->priv =3D PRIV(dev); > > dev->data->tx_queues[tx_queue_id] =3D txq; > > - FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) { > > + FOREACH_SUBDEV_STATE_UNSAFE(sdev, i, dev, DEV_ACTIVE) { >=20 > Why using the UNSAFE operator for a setup operation? (Same as for > rx_queue_setup.) >=20 No need , you right, all the queue operation should be safe too. > > ret =3D rte_eth_tx_queue_setup(PORT_ID(sdev), > > tx_queue_id, > > nb_tx_desc, socket_id, > > @@ -663,7 +673,7 @@ > > int ret; > > > > rte_memcpy(stats, &PRIV(dev)->stats_accumulator, sizeof(*stats)); > > - FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) { > > + FOREACH_SUBDEV_STATE_UNSAFE(sdev, i, dev, DEV_ACTIVE) { >=20 > Why do you want to attempt a stat read on port bound for removal? SW counters may success, this function deals with removal case. > > struct rte_eth_stats *snapshot =3D &sdev- > >stats_snapshot.stats; > > uint64_t *timestamp =3D &sdev->stats_snapshot.timestamp; > > > > @@ -746,7 +756,7 @@ > > > > rx_offload_capa =3D default_infos.rx_offload_capa; > > rxq_offload_capa =3D default_infos.rx_queue_offload_capa; > > - FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_PROBED) { > > + FOREACH_SUBDEV_STATE_UNSAFE(sdev, i, dev, > DEV_PROBED) { >=20 > same here. No need the check, so why? >=20 > > rte_eth_dev_info_get(PORT_ID(sdev), > > &PRIV(dev)->infos); > > rx_offload_capa &=3D PRIV(dev)- > >infos.rx_offload_capa; diff --git > > a/drivers/net/failsafe/failsafe_private.h > > b/drivers/net/failsafe/failsafe_private.h > > index f3be152..7ddd63a 100644 > > --- a/drivers/net/failsafe/failsafe_private.h > > +++ b/drivers/net/failsafe/failsafe_private.h > > @@ -244,16 +244,31 @@ int failsafe_eth_lsc_event_callback(uint16_t > port_id, > > ((sdev)->sid) > > > > /** > > - * Stateful iterator construct over fail-safe sub-devices: > > + * Stateful iterator construct over fail-safe sub-devices, > > + * including the removed sub-devices: >=20 > "including sub-devices marked for removal" is more correct here, as the > device is not actually removed yet, only scheduled for. >=20 Maybe "including unsynchronized sub-devices"?=20 > > + * s: (struct sub_device *), iterator > > + * i: (uint8_t), increment > > + * dev: (struct rte_eth_dev *), fail-safe ethdev > > + * state: (enum dev_state), minimum acceptable device state */ > > + >=20 > Here the same documentation as for other macros: parameters type, quick > description of what it does. > Don't understand you here. =20 > > +#define FOREACH_SUBDEV_STATE_UNSAFE(s, i, dev, state) \ > > + for (s =3D fs_find_next((dev), 0, state, 0, &i); \ > > + s !=3D NULL; \ > > + s =3D fs_find_next((dev), i + 1, state, 0, &i)) > > + > > +/** > > + * Stateful iterator construct over fail-safe sub-devices, > > + * except the removed sub-devices: > > * s: (struct sub_device *), iterator > > * i: (uint8_t), increment > > * dev: (struct rte_eth_dev *), fail-safe ethdev > > * state: (enum dev_state), minimum acceptable device state > > */ > > #define FOREACH_SUBDEV_STATE(s, i, dev, state) \ > > - for (s =3D fs_find_next((dev), 0, state, &i); \ > > + for (s =3D fs_find_next((dev), 0, state, 1, &i); \ > > s !=3D NULL; \ > > - s =3D fs_find_next((dev), i + 1, state, &i)) > > + s =3D fs_find_next((dev), i + 1, state, 1, &i)) > > > > /** > > * Iterator construct over fail-safe sub-devices: > > @@ -262,7 +277,7 @@ int failsafe_eth_lsc_event_callback(uint16_t > port_id, > > * dev: (struct rte_eth_dev *), fail-safe ethdev > > */ > > #define FOREACH_SUBDEV(s, i, dev) \ > > - FOREACH_SUBDEV_STATE(s, i, dev, DEV_UNDEFINED) > > + FOREACH_SUBDEV_STATE_UNSAFE(s, i, dev, DEV_UNDEFINED) >=20 > No actually, the default case should be using the "SAFE" iterator, so no > change needed here. Also here, I think the check is unnecessary, so using UNSAFE skip it. > > > > /* dev: (struct rte_eth_dev *) fail-safe device */ #define > > PREFERRED_SUBDEV(dev) \ @@ -328,6 +343,7 @@ int > > failsafe_eth_lsc_event_callback(uint16_t port_id, fs_find_next(struct > > rte_eth_dev *dev, > > uint8_t sid, > > enum dev_state min_state, > > + uint8_t check_remove, >=20 > skip_remove? Seems more descriptive. >=20 Agree. > > uint8_t *sid_out) > > { > > struct sub_device *subs; > > @@ -336,8 +352,12 @@ int failsafe_eth_lsc_event_callback(uint16_t > port_id, > > subs =3D PRIV(dev)->subs; > > tail =3D PRIV(dev)->subs_tail; > > while (sid < tail) { > > - if (subs[sid].state >=3D min_state) > > - break; > > + if (subs[sid].state >=3D min_state) { > > + if (check_remove =3D=3D 0) > > + break; > > + if (PRIV(dev)->subs[sid].remove =3D=3D 0) > > + break; > > + } > > sid++; > > } > > *sid_out =3D sid; > > @@ -376,7 +396,7 @@ int failsafe_eth_lsc_event_callback(uint16_t > port_id, > > uint8_t i; > > > > /* Using acceptable device */ > > - FOREACH_SUBDEV_STATE(sdev, i, dev, req_state) { > > + FOREACH_SUBDEV_STATE_UNSAFE(sdev, i, dev, req_state) { >=20 > Why should we switch emitting device to one marked for removal? Agree, should be changed. > > if (sdev =3D=3D banned) > > continue; > > DEBUG("Switching tx_dev to sub_device %d", > > -- > > 1.8.3.1 > > >=20 > Regards, > -- > Ga=EBtan Rivet > 6WIND