From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wm0-f65.google.com (mail-wm0-f65.google.com [74.125.82.65]) by dpdk.org (Postfix) with ESMTP id 25ED11B36E for ; Mon, 12 Feb 2018 19:33:40 +0100 (CET) Received: by mail-wm0-f65.google.com with SMTP id g1so11359818wmg.2 for ; Mon, 12 Feb 2018 10:33:40 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=6wind-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:content-transfer-encoding:in-reply-to :user-agent; bh=wSr3Q2ATiDirHawRXCZuGnhYaxRb5jFtXvb+EjSWJyQ=; b=fTkkavk5Y1gBBkDkZ4oFvP+9V/h6BWdjgDOO2Yh8M9ulzvuGqQ1vQYULEs7cfSM6O1 41EbEkaNdg+CH83S8mNF7XSAgST5qqZJNB/IDV37mFSbSNwQNupn8BmFGiNrqykYU0wO E/uP6q7FSm9aThzklz/jJpd4uTIgiXB9oDFFP812XDM2tR+6lHc2FgeHrC2wLJQ7zfOk 8x2aK2dIokB3AesRKpP/WyQDxiIa2fXcWVS12x6EkOs+5564g7IHgNcD5nHmMTjaXjl8 yZ6Ox/pN3V4fNa4dLJzzN9m7n5baudVF+o7J0feea1/YvBZaSOc2+pCD9pa3x14Sw0b6 xweA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:content-transfer-encoding :in-reply-to:user-agent; bh=wSr3Q2ATiDirHawRXCZuGnhYaxRb5jFtXvb+EjSWJyQ=; b=B9GrDr6MMkPVBwRo0GB7PGqFO831ktQsp/+QJjEl7OPyz11V8b+pVdTaUFz4cp32Fh 9x30Eudu+P2Xv1i7qUziQL8q2bglvE5wFYLXjiDGI8ZMecwxh2GYRoavlU4JGM2Z/3Do ogL/6ovlapmM15P5plTI0bhQY0507MIpROT/TkqEDgclTGeRpDD+kpQ3iyDeMKipssy1 3yJj6rbwPmywA4BO0SwLFhHnKBiSiqX9Iv55qhz3xPCqO3LiSbtDTLJ6XYgvUY/NQbrR MaQnfqQ7j3VcNGOPoKb9gzdPlakuL1qSD/IwQd5mCMo2YiVTuURDkR8oLBmKuqz86TmL BV+A== X-Gm-Message-State: APf1xPC/Qd+s/Ic9xORMtqevBHWTqg3JqqWEb+iwYVIxHLmcfBCFqz7j VKkK5zwV2zOFMhOXHUXMtB+uRnVn X-Google-Smtp-Source: AH8x225vVB49dph9kN6xNqxWi5+RJIwLFvXRAzZ37Pbp6clfbrFwDWiuzq5HYRPH/IUi+1SncFaiPw== X-Received: by 10.28.14.10 with SMTP id 10mr4783909wmo.2.1518460419478; Mon, 12 Feb 2018 10:33:39 -0800 (PST) Received: from bidouze.vm.6wind.com (host.78.145.23.62.rev.coltfrance.com. [62.23.145.78]) by smtp.gmail.com with ESMTPSA id t66sm13947280wrc.21.2018.02.12.10.33.37 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Mon, 12 Feb 2018 10:33:38 -0800 (PST) Date: Mon, 12 Feb 2018 19:33:25 +0100 From: =?iso-8859-1?Q?Ga=EBtan?= Rivet To: Matan Azrad Cc: dev@dpdk.org, stable@dpdk.org Message-ID: <20180212183325.ecp6i4ei4mio7khx@bidouze.vm.6wind.com> References: <1518107653-15466-1-git-send-email-matan@mellanox.com> <1518369872-12324-1-git-send-email-matan@mellanox.com> <1518369872-12324-4-git-send-email-matan@mellanox.com> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <1518369872-12324-4-git-send-email-matan@mellanox.com> User-Agent: NeoMutt/20170113 (1.7.2) Subject: Re: [dpdk-dev] [PATCH v6 3/3] net/failsafe: fix hotplug races X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 12 Feb 2018 18:33:40 -0000 Hi Matan, On Sun, Feb 11, 2018 at 05:24:32PM +0000, Matan Azrad wrote: > Fail-safe uses periodic alarm mechanism, running from the host thread, > to manage the hot-plug events of its sub-devices. This management > requires a lot of sub-devices PMDs operations (stop,close,start,etc). > > While the hot-plug alarm runs in the host thread, the application may > call fail-safe operations which directly trigger the sub-devices PMDs > operations too, This call may occur from any thread decided by the > application (probably the master thread). > > So, more than one operation can execute to a sub-device in same time > what can cause a lot of races in the sub-PMDs. > > Moreover, some control operations update the fail-safe internal > databases which can be used by the alarm mechanism in the same > time, what also can cause to races and crashes. > > Fail-safe is the owner of its sub-devices and must to synchronize their > use according to the ETHDEV ownership rules. > > Synchronize hot-plug management by a new lock mechanism uses a mutex to > atomically defend each critical section in the fail-safe hot-plug > mechanism and control operations to prevent any races between them. > > Fixes: a46f8d5 ("net/failsafe: add fail-safe PMD") > Cc: stable@dpdk.org > > Signed-off-by: Matan Azrad > --- > drivers/net/failsafe/Makefile | 1 + > drivers/net/failsafe/failsafe.c | 35 ++++++++ > drivers/net/failsafe/failsafe_ether.c | 6 +- > drivers/net/failsafe/failsafe_flow.c | 20 ++++- > drivers/net/failsafe/failsafe_ops.c | 148 ++++++++++++++++++++++++++------ > drivers/net/failsafe/failsafe_private.h | 62 +++++++++++-- > 6 files changed, 239 insertions(+), 33 deletions(-) > > diff --git a/drivers/net/failsafe/Makefile b/drivers/net/failsafe/Makefile > index d1ae899..bd2f019 100644 > --- a/drivers/net/failsafe/Makefile > +++ b/drivers/net/failsafe/Makefile > @@ -68,5 +68,6 @@ CFLAGS += -pedantic > LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool -lrte_ring > LDLIBS += -lrte_ethdev -lrte_net -lrte_kvargs > LDLIBS += -lrte_bus_vdev > +LDLIBS += -lpthread > > include $(RTE_SDK)/mk/rte.lib.mk > diff --git a/drivers/net/failsafe/failsafe.c b/drivers/net/failsafe/failsafe.c > index 7b2cdbb..c499bfb 100644 > --- a/drivers/net/failsafe/failsafe.c > +++ b/drivers/net/failsafe/failsafe.c > @@ -113,17 +113,46 @@ > break; > /* if we have non-probed device */ > if (i != PRIV(dev)->subs_tail) { > + if (fs_lock(dev, 1) != 0) > + goto reinstall; You have left a few operations unlocked further down the call stack. With these discrepancies fixed, the RECURSIVE attribute could be removed, and this lock as well. > ret = failsafe_eth_dev_state_sync(dev); > + fs_unlock(dev, 1); Compared to the first version of these changes, I much prefer having a wrapper for locking. However, I dislike having the arguably unnecessary additional argument (alarm_lock). I guess you added this for debugging purpose, but in the end either the design is simple and clear, and you have a proper model, or you don't, and that's an issue. And having the RECURSIVE attribute "just in case", is not reassuring. > if (ret) > ERROR("Unable to synchronize sub_device state"); > } > failsafe_dev_remove(dev); > +reinstall: > ret = failsafe_hotplug_alarm_install(dev); > if (ret) > ERROR("Unable to set up next alarm"); > } > > static int > +fs_mutex_init(struct fs_priv *priv) > +{ > + int ret; > + pthread_mutexattr_t attr; > + > + ret = pthread_mutexattr_init(&attr); > + if (ret) { > + ERROR("Cannot initiate mutex attributes - %s", strerror(ret)); > + return ret; > + } > + /* Allow mutex relocks for the thread holding the mutex. */ > + ret = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE); > + if (ret) { Just to emphasize, I think this should be removed. Please explain why you thought it was necessary. > + ERROR("Cannot set mutex type - %s", strerror(ret)); > + return ret; > + } > + ret = pthread_mutex_init(&priv->hotplug_mutex, &attr); > + if (ret) { > + ERROR("Cannot initiate mutex - %s", strerror(ret)); > + return ret; > + } > + return 0; > +} > + > +static int > fs_eth_dev_create(struct rte_vdev_device *vdev) > { > struct rte_eth_dev *dev; > @@ -176,6 +205,9 @@ > ret = failsafe_eal_init(dev); > if (ret) > goto free_args; > + ret = fs_mutex_init(priv); > + if (ret) > + goto free_args; > ret = failsafe_hotplug_alarm_install(dev); > if (ret) { > ERROR("Could not set up plug-in event detection"); > @@ -250,6 +282,9 @@ > ERROR("Error while uninitializing sub-EAL"); > failsafe_args_free(dev); > fs_sub_device_free(dev); > + ret = pthread_mutex_destroy(&PRIV(dev)->hotplug_mutex); > + if (ret) > + ERROR("Error while destroying hotplug mutex"); > rte_free(PRIV(dev)); > rte_eth_dev_release_port(dev); > return ret; > diff --git a/drivers/net/failsafe/failsafe_ether.c b/drivers/net/failsafe/failsafe_ether.c > index d820faf..8672819 100644 > --- a/drivers/net/failsafe/failsafe_ether.c > +++ b/drivers/net/failsafe/failsafe_ether.c Locking fs_eth_dev_conf_apply should allow to remove the lock in fs_hotplug_alarm, as long as we make sure only public rte_ether API is used in fs_eth_dev_conf_apply and its callee. > @@ -328,8 +328,11 @@ > > FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) > if (sdev->remove && fs_rxtx_clean(sdev)) { > + if (fs_lock(dev, 1) != 0) > + return; > fs_dev_stats_save(sdev); > fs_dev_remove(sdev); > + fs_unlock(dev, 1); > } > } > > @@ -428,7 +431,7 @@ > void *cb_arg, void *out __rte_unused) > { > struct sub_device *sdev = cb_arg; > - This line should remain. > + fs_lock(sdev->fs_dev, 0); > /* Switch as soon as possible tx_dev. */ > fs_switch_dev(sdev->fs_dev, sdev); > /* Use safe bursts in any case. */ > @@ -438,6 +441,7 @@ > * the callback at the source of the current thread context. > */ > sdev->remove = 1; > + fs_unlock(sdev->fs_dev, 0); > return 0; > } > > diff --git a/drivers/net/failsafe/failsafe_private.h b/drivers/net/failsafe/failsafe_private.h > index f3be152..ef1f63b 100644 > --- a/drivers/net/failsafe/failsafe_private.h > +++ b/drivers/net/failsafe/failsafe_private.h > @@ -7,6 +7,7 @@ > #define _RTE_ETH_FAILSAFE_PRIVATE_H_ > > #include > +#include > > #include > #include > @@ -161,6 +162,9 @@ struct fs_priv { > * appropriate failsafe Rx queue. > */ > struct rx_proxy rxp; > + pthread_mutex_t hotplug_mutex; > + /* Hot-plug mutex is locked by the alarm mechanism. */ > + volatile unsigned int alarm_lock:1; Without the RECURSIVE attribute, I believe this becomes unnecessary. > unsigned int pending_alarm:1; /* An alarm is pending */ > /* flow isolation state */ > int flow_isolated:1; > @@ -255,12 +259,6 @@ int failsafe_eth_lsc_event_callback(uint16_t port_id, > s != NULL; \ > s = fs_find_next((dev), i + 1, state, &i)) > > -/** > - * Iterator construct over fail-safe sub-devices: > - * s: (struct sub_device *), iterator > - * i: (uint8_t), increment > - * dev: (struct rte_eth_dev *), fail-safe ethdev > - */ Editing mistake I think here. > #define FOREACH_SUBDEV(s, i, dev) \ > FOREACH_SUBDEV_STATE(s, i, dev, DEV_UNDEFINED) > > @@ -347,6 +345,58 @@ int failsafe_eth_lsc_event_callback(uint16_t port_id, > } > > /* > + * Lock hot-plug mutex. > + * is_alarm means that the caller is, for sure, the hot-plug alarm mechanism. > + */ > +static inline int > +fs_lock(struct rte_eth_dev *dev, unsigned int is_alarm) The "is_alarm" should be removed without RECURSIVE. > +{ > + int ret; > + > + if (is_alarm) { > + ret = pthread_mutex_trylock(&PRIV(dev)->hotplug_mutex); > + if (ret) { > + DEBUG("Hot-plug mutex lock trying failed(%s), will try" > + " again later...", strerror(ret)); > + return ret; > + } > + PRIV(dev)->alarm_lock = 1; > + } else { > + ret = pthread_mutex_lock(&PRIV(dev)->hotplug_mutex); > + if (ret) { > + ERROR("Cannot lock mutex(%s)", strerror(ret)); > + return ret; > + } > + } > + DEBUG("Hot-plug mutex was locked by thread %lu%s", pthread_self(), > + PRIV(dev)->alarm_lock ? " by the hot-plug alarm" : ""); > + return ret; > +} > + > +/* > + * Unlock hot-plug mutex. > + * is_alarm means that the caller is, for sure, the hot-plug alarm mechanism. > + */ > +static inline void > +fs_unlock(struct rte_eth_dev *dev, unsigned int is_alarm) ditto > +{ > + int ret; > + unsigned int prev_alarm_lock = PRIV(dev)->alarm_lock; > + > + if (is_alarm) { > + RTE_ASSERT(PRIV(dev)->alarm_lock == 1); > + PRIV(dev)->alarm_lock = 0; > + } > + ret = pthread_mutex_unlock(&PRIV(dev)->hotplug_mutex); > + if (ret) > + ERROR("Cannot unlock hot-plug mutex(%s)", strerror(ret)); > + else > + DEBUG("Hot-plug mutex was unlocked by thread %lu%s", > + pthread_self(), > + prev_alarm_lock ? " by the hot-plug alarm" : ""); > +} I know that using a RECURSIVE lock allows you having an implementation quicker. So this choice of implementation is only done due to the impending release, not because it is the right one. I think it should work, and I heard that it was heavily tested internally. So I guess this patch can go in with the few other nits (removed blank line, removed macro doc), as long as it is reworked soon after. On this matter, I do not think that blindly testing implementations that all either copied each other or weren't too complicated does the trick regarding concurrency issues. You were thinking about an example app for your ownership library, in order to validate its implementation. I think this could work nicely as a torture instrument for this patchset as well, with some care. Regards, -- Gaëtan Rivet 6WIND