DPDK patches and discussions
 help / color / mirror / Atom feed
* [dpdk-dev] [PATCH v1] net/mlx4: fix 'show port info all' during detach
@ 2018-02-28 16:47 Ophir Munk
  2018-03-02 19:12 ` Adrien Mazarguil
  0 siblings, 1 reply; 3+ messages in thread
From: Ophir Munk @ 2018-02-28 16:47 UTC (permalink / raw)
  To: dev, Adrien Mazarguil; +Cc: Thomas Monjalon, Olga Shern, Ophir Munk, stable

The following scenario causes a crash in function mlx4_get_ifname
1. On testpmd startup mlx4 device is probed and started
2. mlx4 sriov is disabled. As a result an RMV event is sent to
testpmd which closes the device and nullify the priv struct
members.
3. Running 'show port info all' in testpmd results in segmentation
fault because of accessing NULL pointer priv->ctx

The fix is to return with an error from mlx4_get_ifname() if priv->ctx
member is NULL.

Fixes: 61cbdd419478 ("net/mlx4: separate device control functions")
Cc: stable@dpdk.org

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
---
 drivers/net/mlx4/mlx4_ethdev.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/mlx4/mlx4_ethdev.c b/drivers/net/mlx4/mlx4_ethdev.c
index 3bc6927..cca5223 100644
--- a/drivers/net/mlx4/mlx4_ethdev.c
+++ b/drivers/net/mlx4/mlx4_ethdev.c
@@ -67,6 +67,9 @@ mlx4_get_ifname(const struct priv *priv, char (*ifname)[IF_NAMESIZE])
 	char match[IF_NAMESIZE] = "";
 
 	{
+		if (priv->ctx == NULL)
+			return -ENOENT;
+
 		MKSTR(path, "%s/device/net", priv->ctx->device->ibdev_path);
 
 		dir = opendir(path);
-- 
2.7.4

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [dpdk-dev] [PATCH v1] net/mlx4: fix 'show port info all' during detach
  2018-02-28 16:47 [dpdk-dev] [PATCH v1] net/mlx4: fix 'show port info all' during detach Ophir Munk
@ 2018-03-02 19:12 ` Adrien Mazarguil
  2018-03-20 14:02   ` Gaëtan Rivet
  0 siblings, 1 reply; 3+ messages in thread
From: Adrien Mazarguil @ 2018-03-02 19:12 UTC (permalink / raw)
  To: Ophir Munk; +Cc: dev, Thomas Monjalon, Olga Shern, stable, Gaetan Rivet

On Wed, Feb 28, 2018 at 04:47:30PM +0000, Ophir Munk wrote:
> The following scenario causes a crash in function mlx4_get_ifname
> 1. On testpmd startup mlx4 device is probed and started
> 2. mlx4 sriov is disabled. As a result an RMV event is sent to
> testpmd which closes the device and nullify the priv struct
> members.
> 3. Running 'show port info all' in testpmd results in segmentation
> fault because of accessing NULL pointer priv->ctx
> 
> The fix is to return with an error from mlx4_get_ifname() if priv->ctx
> member is NULL.

So priv->ctx is NULL after mlx4_dev_close() was called. In my opinion the
fact testpmd is allowed to call rte_eth_dev_info_get() on a closed port is
fishy, there are quite a few other ethdev callbacks that may end up
crashing in such a scenario.

Since commit 284c908cc588 ("app/testpmd: request device removal interrupt"),
testpmd automatically calls rte_eth_dev_close() and rte_eal_dev_detach()
after receiving a removal event on a port.

rte_eth_dev_close() documentation says:

 "Close a stopped Ethernet device. The device cannot be restarted!
  The function frees all resources except for needed by the
  closed state. To free these resources, call rte_eth_dev_detach()."

Unfortunately testpmd calls rte_*eal*_dev_detach() not
rte_*eth*_dev_detach(), the former doesn't release ethdev ports while the
latter does. I think it's a mistake, there's no point in keeping a closed
device around after it's been physically unplugged.

In short, rmv_event_callback() should call detach_port() instead of
rte_eal_dev_detach().

> Fixes: 61cbdd419478 ("net/mlx4: separate device control functions")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Ophir Munk <ophirmu@mellanox.com>

The above suggests the problem is actually in testpmd and was introduced in
v17.05 by commit 284c908cc588. The proposed patch is a workaround that
doesn't address the underlying issue, thus NACK unless proven otherwise :)

> ---
>  drivers/net/mlx4/mlx4_ethdev.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/net/mlx4/mlx4_ethdev.c b/drivers/net/mlx4/mlx4_ethdev.c
> index 3bc6927..cca5223 100644
> --- a/drivers/net/mlx4/mlx4_ethdev.c
> +++ b/drivers/net/mlx4/mlx4_ethdev.c
> @@ -67,6 +67,9 @@ mlx4_get_ifname(const struct priv *priv, char (*ifname)[IF_NAMESIZE])
>  	char match[IF_NAMESIZE] = "";
>  
>  	{
> +		if (priv->ctx == NULL)
> +			return -ENOENT;
> +
>  		MKSTR(path, "%s/device/net", priv->ctx->device->ibdev_path);
>  
>  		dir = opendir(path);
> -- 
> 2.7.4
> 

-- 
Adrien Mazarguil
6WIND

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [dpdk-dev] [PATCH v1] net/mlx4: fix 'show port info all' during detach
  2018-03-02 19:12 ` Adrien Mazarguil
@ 2018-03-20 14:02   ` Gaëtan Rivet
  0 siblings, 0 replies; 3+ messages in thread
From: Gaëtan Rivet @ 2018-03-20 14:02 UTC (permalink / raw)
  To: Adrien Mazarguil; +Cc: Ophir Munk, dev, Thomas Monjalon, Olga Shern, stable

Hi Ophir, Adrien,

On Fri, Mar 02, 2018 at 08:12:58PM +0100, Adrien Mazarguil wrote:
> On Wed, Feb 28, 2018 at 04:47:30PM +0000, Ophir Munk wrote:
> > The following scenario causes a crash in function mlx4_get_ifname
> > 1. On testpmd startup mlx4 device is probed and started
> > 2. mlx4 sriov is disabled. As a result an RMV event is sent to
> > testpmd which closes the device and nullify the priv struct
> > members.
> > 3. Running 'show port info all' in testpmd results in segmentation
> > fault because of accessing NULL pointer priv->ctx
> > 
> > The fix is to return with an error from mlx4_get_ifname() if priv->ctx
> > member is NULL.
> 
> So priv->ctx is NULL after mlx4_dev_close() was called. In my opinion the
> fact testpmd is allowed to call rte_eth_dev_info_get() on a closed port is
> fishy, there are quite a few other ethdev callbacks that may end up
> crashing in such a scenario.
> 
> Since commit 284c908cc588 ("app/testpmd: request device removal interrupt"),
> testpmd automatically calls rte_eth_dev_close() and rte_eal_dev_detach()
> after receiving a removal event on a port.
> 
> rte_eth_dev_close() documentation says:
> 
>  "Close a stopped Ethernet device. The device cannot be restarted!
>   The function frees all resources except for needed by the
>   closed state. To free these resources, call rte_eth_dev_detach()."
> 
> Unfortunately testpmd calls rte_*eal*_dev_detach() not
> rte_*eth*_dev_detach(), the former doesn't release ethdev ports while the
> latter does. I think it's a mistake, there's no point in keeping a closed
> device around after it's been physically unplugged.
> 
> In short, rmv_event_callback() should call detach_port() instead of
> rte_eal_dev_detach().
> 

This is correct, the issue is actually testpmd making a mistake when
calling directly rte_eal_dev_detach().

The fix should thus be simply to change this call.

> > Fixes: 61cbdd419478 ("net/mlx4: separate device control functions")
> > Cc: stable@dpdk.org
> > 
> > Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
> 
> The above suggests the problem is actually in testpmd and was introduced in
> v17.05 by commit 284c908cc588. The proposed patch is a workaround that
> doesn't address the underlying issue, thus NACK unless proven otherwise :)
> 
> > ---
> >  drivers/net/mlx4/mlx4_ethdev.c | 3 +++
> >  1 file changed, 3 insertions(+)
> > 
> > diff --git a/drivers/net/mlx4/mlx4_ethdev.c b/drivers/net/mlx4/mlx4_ethdev.c
> > index 3bc6927..cca5223 100644
> > --- a/drivers/net/mlx4/mlx4_ethdev.c
> > +++ b/drivers/net/mlx4/mlx4_ethdev.c
> > @@ -67,6 +67,9 @@ mlx4_get_ifname(const struct priv *priv, char (*ifname)[IF_NAMESIZE])
> >  	char match[IF_NAMESIZE] = "";
> >  
> >  	{
> > +		if (priv->ctx == NULL)
> > +			return -ENOENT;
> > +
> >  		MKSTR(path, "%s/device/net", priv->ctx->device->ibdev_path);
> >  
> >  		dir = opendir(path);
> > -- 
> > 2.7.4
> > 
> 
> -- 
> Adrien Mazarguil
> 6WIND

-- 
Gaëtan Rivet
6WIND

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2018-03-20 14:02 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-02-28 16:47 [dpdk-dev] [PATCH v1] net/mlx4: fix 'show port info all' during detach Ophir Munk
2018-03-02 19:12 ` Adrien Mazarguil
2018-03-20 14:02   ` Gaëtan Rivet

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).