From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wr0-f172.google.com (mail-wr0-f172.google.com [209.85.128.172]) by dpdk.org (Postfix) with ESMTP id 7154E2BE5 for ; Fri, 2 Mar 2018 20:13:13 +0100 (CET) Received: by mail-wr0-f172.google.com with SMTP id w77so11154093wrc.6 for ; Fri, 02 Mar 2018 11:13:13 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=6wind-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=kWubSTtrSpZHESEdWfC+d2ExYMI5a5uR3IUv4oCnVC8=; b=W3oyIduO8xEBliD3BF041BKI+6zjtzRpPlJLILeoLAAbJCv9Q7XID3hloMMzhFiAwc UiUePYHukjwo4NkY01+H3yr/Y5x4NFaJrGO+Yszi2B3CRsoIzwenrB3rB8Gk0fI4rg5C zo2Mffj2P+3bUoL2f/Vdv75eum60Ja4MmFJkdw0PQvOQTyc0hGSRIFdRxFgEL+F7XoI7 QHvL313wNxuS04PBvS5ow7T26Qj58+Xp1xpFCLtOBwDHfNuLj6AmFYDeQ0erBzoLRcY6 Ups5nWLxQuuNN2kbF69VwIdf3JwK9AvsmvbhA1tV4/wiNzfeX1KklEoZcc7DlJUOjNGV iU7g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=kWubSTtrSpZHESEdWfC+d2ExYMI5a5uR3IUv4oCnVC8=; b=Yirz6jhN2dHx+gjmruS5XvftDFuHjr5ow4dg9vcL1ZgUrSIqVWH2rjAte77j4WcDd4 c8fjKE0A3RGfO1F+nNOTKEA8wAdMNM7isBtXGiNCvXU+f5oik1KrkOuiR2kTRTblaxZ/ RI9naPZ6+rmYuWljnx1NBWf0ZJkPPdw7BPZmU80WDL8KJV94LUa20QtUEcw7kS7/JNQo u66DwsauLNbEPAZnduG7S+KlkAhKEdx8uNKcZ/l6Dkl9DluJTbApDAfnFdEuJ0ds8fPd iODziTzD5G1ZJ+cuom93TWVEokSD+hRVjs0CY61nXB+39O+vvpxz6yc6Zt/ZKKOB3XAx dwwQ== X-Gm-Message-State: APf1xPC1PRwZnRJfiv5GQRykpxHrE+lE2rCY4876khGsfr68KyMLY0CD IchceDp+kr2eF8DUu7O5X5VD/g== X-Google-Smtp-Source: AG47ELvAiz7hKzj/Bfg7yWRk8ijNSWnNR8Mzk09tysC57TdqBpO0ZvThC+lzxlHg+EJFIoi9QhouJg== X-Received: by 10.223.201.15 with SMTP id m15mr5479962wrh.54.1520017993101; Fri, 02 Mar 2018 11:13:13 -0800 (PST) Received: from 6wind.com (host.78.145.23.62.rev.coltfrance.com. [62.23.145.78]) by smtp.gmail.com with ESMTPSA id h38sm5871051wrh.61.2018.03.02.11.13.11 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 02 Mar 2018 11:13:11 -0800 (PST) Date: Fri, 2 Mar 2018 20:12:58 +0100 From: Adrien Mazarguil To: Ophir Munk Cc: dev@dpdk.org, Thomas Monjalon , Olga Shern , stable@dpdk.org, Gaetan Rivet Message-ID: <20180302191258.GF4256@6wind.com> References: <1519836450-11895-1-git-send-email-ophirmu@mellanox.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1519836450-11895-1-git-send-email-ophirmu@mellanox.com> Subject: Re: [dpdk-dev] [PATCH v1] net/mlx4: fix 'show port info all' during detach X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 02 Mar 2018 19:13:13 -0000 On Wed, Feb 28, 2018 at 04:47:30PM +0000, Ophir Munk wrote: > The following scenario causes a crash in function mlx4_get_ifname > 1. On testpmd startup mlx4 device is probed and started > 2. mlx4 sriov is disabled. As a result an RMV event is sent to > testpmd which closes the device and nullify the priv struct > members. > 3. Running 'show port info all' in testpmd results in segmentation > fault because of accessing NULL pointer priv->ctx > > The fix is to return with an error from mlx4_get_ifname() if priv->ctx > member is NULL. So priv->ctx is NULL after mlx4_dev_close() was called. In my opinion the fact testpmd is allowed to call rte_eth_dev_info_get() on a closed port is fishy, there are quite a few other ethdev callbacks that may end up crashing in such a scenario. Since commit 284c908cc588 ("app/testpmd: request device removal interrupt"), testpmd automatically calls rte_eth_dev_close() and rte_eal_dev_detach() after receiving a removal event on a port. rte_eth_dev_close() documentation says: "Close a stopped Ethernet device. The device cannot be restarted! The function frees all resources except for needed by the closed state. To free these resources, call rte_eth_dev_detach()." Unfortunately testpmd calls rte_*eal*_dev_detach() not rte_*eth*_dev_detach(), the former doesn't release ethdev ports while the latter does. I think it's a mistake, there's no point in keeping a closed device around after it's been physically unplugged. In short, rmv_event_callback() should call detach_port() instead of rte_eal_dev_detach(). > Fixes: 61cbdd419478 ("net/mlx4: separate device control functions") > Cc: stable@dpdk.org > > Signed-off-by: Ophir Munk The above suggests the problem is actually in testpmd and was introduced in v17.05 by commit 284c908cc588. The proposed patch is a workaround that doesn't address the underlying issue, thus NACK unless proven otherwise :) > --- > drivers/net/mlx4/mlx4_ethdev.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/drivers/net/mlx4/mlx4_ethdev.c b/drivers/net/mlx4/mlx4_ethdev.c > index 3bc6927..cca5223 100644 > --- a/drivers/net/mlx4/mlx4_ethdev.c > +++ b/drivers/net/mlx4/mlx4_ethdev.c > @@ -67,6 +67,9 @@ mlx4_get_ifname(const struct priv *priv, char (*ifname)[IF_NAMESIZE]) > char match[IF_NAMESIZE] = ""; > > { > + if (priv->ctx == NULL) > + return -ENOENT; > + > MKSTR(path, "%s/device/net", priv->ctx->device->ibdev_path); > > dir = opendir(path); > -- > 2.7.4 > -- Adrien Mazarguil 6WIND