From mboxrd@z Thu Jan 1 00:00:00 1970
From: Matan Azrad
To: Gaëtan Rivet
CC: "dev@dpdk.org", "stable@dpdk.org"
Subject: Re: [dpdk-dev] [PATCH v6 3/3] net/failsafe: fix hotplug races
Date: Mon, 12 Feb 2018 20:35:52 +0000
References: <1518107653-15466-1-git-send-email-matan@mellanox.com> <1518369872-12324-1-git-send-email-matan@mellanox.com> <1518369872-12324-4-git-send-email-matan@mellanox.com> <20180212183325.ecp6i4ei4mio7khx@bidouze.vm.6wind.com>
In-Reply-To: <20180212183325.ecp6i4ei4mio7khx@bidouze.vm.6wind.com>
List-Id: DPDK patches and discussions

Hi Gaetan

From: Gaëtan Rivet, Sent: Monday, February 12, 2018 8:33 PM
> Hi Matan,
>
> On Sun, Feb 11, 2018 at 05:24:32PM +0000, Matan Azrad wrote:
> > Fail-safe uses periodic alarm mechanism, running from the host thread,
> > to manage the hot-plug events of its sub-devices. This management
> > requires a lot of sub-devices PMDs operations (stop,close,start,etc).
> >
> > While the hot-plug alarm runs in the host thread, the application may
> > call fail-safe operations which directly trigger the sub-devices PMDs
> > operations too, This call may occur from any thread decided by the
> > application (probably the master thread).
> >
> > So, more than one operation can execute to a sub-device in same time
> > what can cause a lot of races in the sub-PMDs.
> >
> > Moreover, some control operations update the fail-safe internal
> > databases which can be used by the alarm mechanism in the same time,
> > what also can cause to races and crashes.
> >
> > Fail-safe is the owner of its sub-devices and must to synchronize
> > their use according to the ETHDEV ownership rules.
> >
> > Synchronize hot-plug management by a new lock mechanism uses a mutex
> > to atomically defend each critical section in the fail-safe hot-plug
> > mechanism and control operations to prevent any races between them.
> >
> > Fixes: a46f8d5 ("net/failsafe: add fail-safe PMD")
> > Cc: stable@dpdk.org
> >
> > Signed-off-by: Matan Azrad
> > ---
> >  drivers/net/failsafe/Makefile           |   1 +
> >  drivers/net/failsafe/failsafe.c         |  35 ++++++++
> >  drivers/net/failsafe/failsafe_ether.c   |   6 +-
> >  drivers/net/failsafe/failsafe_flow.c    |  20 ++++-
> >  drivers/net/failsafe/failsafe_ops.c     | 148 ++++++++++++++++++++++++++------
> >  drivers/net/failsafe/failsafe_private.h |  62 +++++++++++--
> >  6 files changed, 239 insertions(+), 33 deletions(-)
> >
> > diff --git a/drivers/net/failsafe/Makefile b/drivers/net/failsafe/Makefile
> > index d1ae899..bd2f019 100644
> > --- a/drivers/net/failsafe/Makefile
> > +++ b/drivers/net/failsafe/Makefile
> > @@ -68,5 +68,6 @@ CFLAGS += -pedantic
> >  LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool -lrte_ring
> >  LDLIBS += -lrte_ethdev -lrte_net -lrte_kvargs
> >  LDLIBS += -lrte_bus_vdev
> > +LDLIBS += -lpthread
> >
> >  include $(RTE_SDK)/mk/rte.lib.mk
> > diff --git a/drivers/net/failsafe/failsafe.c b/drivers/net/failsafe/failsafe.c
> > index 7b2cdbb..c499bfb 100644
> > --- a/drivers/net/failsafe/failsafe.c
> > +++ b/drivers/net/failsafe/failsafe.c
> > @@ -113,17 +113,46 @@
> >  			break;
> >  	/* if we have non-probed device */
> >  	if (i != PRIV(dev)->subs_tail) {
> > +		if (fs_lock(dev, 1) != 0)
> > +			goto reinstall;
>
> You have left a few operations unlocked further down the call stack.
> With these discrepancies fixed, the RECURSIVE attribute could be removed,
> and this lock as well.
>
> >  		ret = failsafe_eth_dev_state_sync(dev);
> > +		fs_unlock(dev, 1);
>
> Compared to the first version of these changes, I much prefer having a
> wrapper for locking. However, I dislike having the arguably unnecessary
> additional argument (alarm_lock).
>
> I guess you added this for debugging purpose, but in the end either the
> design is simple and clear, and you have a proper model, or you don't, and
> that's an issue.

Not for debug; the debug output is only a side benefit.
Actually, it is just nice to know whether the alarm is running in the critical sections, and it may be used in the future.
In fact, the following patch, "fix reconfiguration", is using it.

> And having the RECURSIVE attribute "just in case", is not reassuring.

You know, there are a lot of pros and cons to the RECURSIVE usage, and I can understand your concern.
To avoid creating a bigger patch this time, I think it can be changed in the next release.

> >  		if (ret)
> >  			ERROR("Unable to synchronize sub_device state");
> >  	}
> >  	failsafe_dev_remove(dev);
> > +reinstall:
> >  	ret = failsafe_hotplug_alarm_install(dev);
> >  	if (ret)
> >  		ERROR("Unable to set up next alarm");
> >  }
> >
> >  static int
> > +fs_mutex_init(struct fs_priv *priv)
> > +{
> > +	int ret;
> > +	pthread_mutexattr_t attr;
> > +
> > +	ret = pthread_mutexattr_init(&attr);
> > +	if (ret) {
> > +		ERROR("Cannot initiate mutex attributes - %s", strerror(ret));
> > +		return ret;
> > +	}
> > +	/* Allow mutex relocks for the thread holding the mutex. */
> > +	ret = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
> > +	if (ret) {
>
> Just to emphasize, I think this should be removed.
>
> Please explain why you thought it was necessary.
>

To simplify the code:
1. It allows fewer calls to the lock operations.
2. It makes it easy to differentiate between alarm time and application time in the shared code (dev_configure(), dev_start()).
3. It makes it easy to use try_lock in the alarm thread.
4. It defends against some kinds of deadlock.
Actually, the alternative of using a simple lock is more complicated.

> > +		ERROR("Cannot set mutex type - %s", strerror(ret));
> > +		return ret;
> > +	}
> > +	ret = pthread_mutex_init(&priv->hotplug_mutex, &attr);
> > +	if (ret) {
> > +		ERROR("Cannot initiate mutex - %s", strerror(ret));
> > +		return ret;
> > +	}
> > +	return 0;
> > +}
> > +
> > +static int
> >  fs_eth_dev_create(struct rte_vdev_device *vdev)
> >  {
> >  	struct rte_eth_dev *dev;
> > @@ -176,6 +205,9 @@
> >  	ret = failsafe_eal_init(dev);
> >  	if (ret)
> >  		goto free_args;
> > +	ret = fs_mutex_init(priv);
> > +	if (ret)
> > +		goto free_args;
> >  	ret = failsafe_hotplug_alarm_install(dev);
> >  	if (ret) {
> >  		ERROR("Could not set up plug-in event detection");
> > @@ -250,6 +282,9 @@
> >  		ERROR("Error while uninitializing sub-EAL");
> >  	failsafe_args_free(dev);
> >  	fs_sub_device_free(dev);
> > +	ret = pthread_mutex_destroy(&PRIV(dev)->hotplug_mutex);
> > +	if (ret)
> > +		ERROR("Error while destroying hotplug mutex");
> >  	rte_free(PRIV(dev));
> >  	rte_eth_dev_release_port(dev);
> >  	return ret;
> > diff --git a/drivers/net/failsafe/failsafe_ether.c b/drivers/net/failsafe/failsafe_ether.c
> > index d820faf..8672819 100644
> > --- a/drivers/net/failsafe/failsafe_ether.c
> > +++ b/drivers/net/failsafe/failsafe_ether.c
>
> Locking fs_eth_dev_conf_apply should allow to remove the lock in
> fs_hotplug_alarm, as long as we make sure only public rte_ether API is used
> in fs_eth_dev_conf_apply and its callee.

No, all the state updates (the fail-safe state and the sub-device states) in the fs_hotplug_alarm stack should be protected by a lock (or some other atomic mechanism).

> > @@ -328,8 +328,11 @@
> >
> >  	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE)
> >  		if (sdev->remove && fs_rxtx_clean(sdev)) {
> > +			if (fs_lock(dev, 1) != 0)
> > +				return;
> >  			fs_dev_stats_save(sdev);
> >  			fs_dev_remove(sdev);
> > +			fs_unlock(dev, 1);
> >  		}
> >  }
> >
> > @@ -428,7 +431,7 @@
> >  		void *cb_arg, void *out __rte_unused)
> >  {
> >  	struct sub_device *sdev = cb_arg;
> > -
>
> This line should remain.

Sure.

> > +	fs_lock(sdev->fs_dev, 0);
> >  	/* Switch as soon as possible tx_dev. */
> >  	fs_switch_dev(sdev->fs_dev, sdev);
> >  	/* Use safe bursts in any case. */
> > @@ -438,6 +441,7 @@
> >  	 * the callback at the source of the current thread context.
> >  	 */
> >  	sdev->remove = 1;
> > +	fs_unlock(sdev->fs_dev, 0);
> >  	return 0;
> >  }
> >
>
> >
> > diff --git a/drivers/net/failsafe/failsafe_private.h b/drivers/net/failsafe/failsafe_private.h
> > index f3be152..ef1f63b 100644
> > --- a/drivers/net/failsafe/failsafe_private.h
> > +++ b/drivers/net/failsafe/failsafe_private.h
> > @@ -7,6 +7,7 @@
> >  #define _RTE_ETH_FAILSAFE_PRIVATE_H_
> >
> >  #include
> > +#include
> >
> >  #include
> >  #include
> > @@ -161,6 +162,9 @@ struct fs_priv {
> >  	 * appropriate failsafe Rx queue.
> >  	 */
> >  	struct rx_proxy rxp;
> > +	pthread_mutex_t hotplug_mutex;
> > +	/* Hot-plug mutex is locked by the alarm mechanism. */
> > +	volatile unsigned int alarm_lock:1;
>
> Without the RECURSIVE attribute, I believe this becomes unnecessary.

I explained the potential usage above.
>
> >  	unsigned int pending_alarm:1; /* An alarm is pending */
> >  	/* flow isolation state */
> >  	int flow_isolated:1;
> > @@ -255,12 +259,6 @@ int failsafe_eth_lsc_event_callback(uint16_t port_id,
> >  		s != NULL; \
> >  		s = fs_find_next((dev), i + 1, state, &i))
> >
> > -/**
> > - * Iterator construct over fail-safe sub-devices:
> > - * s: (struct sub_device *), iterator
> > - * i: (uint8_t), increment
> > - * dev: (struct rte_eth_dev *), fail-safe ethdev
> > - */
>
> Editing mistake I think here.

Sure.

> >  #define FOREACH_SUBDEV(s, i, dev) \
> >  	FOREACH_SUBDEV_STATE(s, i, dev, DEV_UNDEFINED)
> >
> > @@ -347,6 +345,58 @@ int failsafe_eth_lsc_event_callback(uint16_t port_id,
> >  }
> >
> >  /*
> > + * Lock hot-plug mutex.
> > + * is_alarm means that the caller is, for sure, the hot-plug alarm mechanism.
> > + */
> > +static inline int
> > +fs_lock(struct rte_eth_dev *dev, unsigned int is_alarm)
>
> The "is_alarm" should be removed without RECURSIVE.

Not sure.

> > +{
> > +	int ret;
> > +
> > +	if (is_alarm) {
> > +		ret = pthread_mutex_trylock(&PRIV(dev)->hotplug_mutex);
> > +		if (ret) {
> > +			DEBUG("Hot-plug mutex lock trying failed(%s), will try"
> > +			      " again later...", strerror(ret));
> > +			return ret;
> > +		}
> > +		PRIV(dev)->alarm_lock = 1;
> > +	} else {
> > +		ret = pthread_mutex_lock(&PRIV(dev)->hotplug_mutex);
> > +		if (ret) {
> > +			ERROR("Cannot lock mutex(%s)", strerror(ret));
> > +			return ret;
> > +		}
> > +	}
> > +	DEBUG("Hot-plug mutex was locked by thread %lu%s", pthread_self(),
> > +	      PRIV(dev)->alarm_lock ? " by the hot-plug alarm" : "");
> > +	return ret;
> > +}
> > +
> > +/*
> > + * Unlock hot-plug mutex.
> > + * is_alarm means that the caller is, for sure, the hot-plug alarm mechanism.
> > + */
> > +static inline void
> > +fs_unlock(struct rte_eth_dev *dev, unsigned int is_alarm)
>
> ditto

Ditto

> > +{
> > +	int ret;
> > +	unsigned int prev_alarm_lock = PRIV(dev)->alarm_lock;
> > +
> > +	if (is_alarm) {
> > +		RTE_ASSERT(PRIV(dev)->alarm_lock == 1);
> > +		PRIV(dev)->alarm_lock = 0;
> > +	}
> > +	ret = pthread_mutex_unlock(&PRIV(dev)->hotplug_mutex);
> > +	if (ret)
> > +		ERROR("Cannot unlock hot-plug mutex(%s)", strerror(ret));
> > +	else
> > +		DEBUG("Hot-plug mutex was unlocked by thread %lu%s",
> > +		      pthread_self(),
> > +		      prev_alarm_lock ? " by the hot-plug alarm" : "");
> > +}
>
> I know that using a RECURSIVE lock allows you having an implementation
> quicker.

Yes.

> So this choice of implementation is only done due to the impending release,
> not because it is the right one. I think it should work, and I heard that it was
> heavily tested internally.

It is right, but it can be implemented in another way that has other pros and cons.

> So I guess this patch can go in with the few other nits (removed blank line,
> removed macro doc), as long as it is reworked soon after.

Sure, I will send a V7 for it.

> On this matter, I do not think that blindly testing implementations that all
> either copied each other or weren't too complicated does the trick regarding
> concurrency issues.
>
> You were thinking about an example app for your ownership library, in order
> to validate its implementation. I think this could work nicely as a torture
> instrument for this patchset as well, with some care.

Can be, yes.

Thanks.

> Regards,
> --
> Gaëtan Rivet
> 6WIND