[PATCH] failsafe: fix segfault on hotplug event

patches for DPDK stable branches
 help / color / mirror / Atom feed

* [PATCH] failsafe: fix segfault on hotplug event
@ 2022-11-10 16:34 Luc Pelletier
  2022-11-10 17:42 ` [PATCH v2] " Luc Pelletier
                   ` (9 more replies)
  0 siblings, 10 replies; 16+ messages in thread
From: Luc Pelletier @ 2022-11-10 16:34 UTC (permalink / raw)
  To: grive; +Cc: dev, Luc Pelletier, Konstantin Ananyev, stable

When the failsafe PMD encounters a hotplug event, it switches its rx/tx
functions to "safe" ones that validate the sub-device's rx/tx functions
before calling them. It switches the rx/tx functions by changing the
function pointers in the rte_eth_dev structure.

Following commit 7a0935239b, the rx/tx functions of PMDs are no longer
called through the function pointers in the rte_eth_dev structure. They
are rather called through a flat array named rte_eth_fp_ops. The
function pointers in that array are initialized when the devices start
and are initialized.

When a hotplug event occurs, the function pointers in rte_eth_fp_ops
still point to the "unsafe" rx/tx functions in the failsafe PMD since
they haven't been updated. This results in a segmentation fault because
it ends up using the "unsafe" functions, when the "safe" functions
should have been used.

To fix the problem, the failsafe PMD code was changed to update the
function pointers in the rte_eth_fp_ops array when a hotplug event
occurs.

Fixes: 7a0935239b ("ethdev: make fast-path functions to use new flat array")
Cc: Konstantin Ananyev <konstantin.ananyev@intel.com>
Cc: stable@dpdk.org

Signed-off-by: Luc Pelletier <lucp.at.work@gmail.com>
---
 drivers/net/failsafe/failsafe_rxtx.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drivers/net/failsafe/failsafe_rxtx.c b/drivers/net/failsafe/failsafe_rxtx.c
index fe67293299..34d59dfbb1 100644
--- a/drivers/net/failsafe/failsafe_rxtx.c
+++ b/drivers/net/failsafe/failsafe_rxtx.c
@@ -5,6 +5,7 @@

 #include <rte_atomic.h>
 #include <rte_debug.h>
+#include <rte_ethdev.h>
 #include <rte_mbuf.h>
 #include <ethdev_driver.h>

@@ -44,9 +45,13 @@ failsafe_set_burst_fn(struct rte_eth_dev *dev, int force_safe)
 		DEBUG("Using safe RX bursts%s",
 		      (force_safe ? " (forced)" : ""));
 		dev->rx_pkt_burst = &failsafe_rx_burst;
+		rte_eth_fp_ops[dev->data->port_id].rx_pkt_burst =
+			&failsafe_rx_burst;
 	} else if (!need_safe && safe_set) {
 		DEBUG("Using fast RX bursts");
 		dev->rx_pkt_burst = &failsafe_rx_burst_fast;
+		rte_eth_fp_ops[dev->data->port_id].rx_pkt_burst =
+			&failsafe_rx_burst_fast;
 	}
 	need_safe = force_safe || fs_tx_unsafe(TX_SUBDEV(dev));
 	safe_set = (dev->tx_pkt_burst == &failsafe_tx_burst);
@@ -54,9 +59,13 @@ failsafe_set_burst_fn(struct rte_eth_dev *dev, int force_safe)
 		DEBUG("Using safe TX bursts%s",
 		      (force_safe ? " (forced)" : ""));
 		dev->tx_pkt_burst = &failsafe_tx_burst;
+		rte_eth_fp_ops[dev->data->port_id].tx_pkt_burst =
+			&failsafe_tx_burst;
 	} else if (!need_safe && safe_set) {
 		DEBUG("Using fast TX bursts");
 		dev->tx_pkt_burst = &failsafe_tx_burst_fast;
+		rte_eth_fp_ops[dev->data->port_id].tx_pkt_burst =
+			&failsafe_tx_burst_fast;
 	}
 	rte_wmb();
 }
-- 
2.25.1

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH v2] failsafe: fix segfault on hotplug event
  2022-11-10 16:34 [PATCH] failsafe: fix segfault on hotplug event Luc Pelletier
@ 2022-11-10 17:42 ` Luc Pelletier
  2022-11-10 23:33   ` Stephen Hemminger
  2022-11-16 17:35 ` [PATCH] " Konstantin Ananyev
                   ` (8 subsequent siblings)
  9 siblings, 1 reply; 16+ messages in thread
From: Luc Pelletier @ 2022-11-10 17:42 UTC (permalink / raw)
  To: grive; +Cc: dev, Luc Pelletier, Konstantin Ananyev, stable

When the failsafe PMD encounters a hotplug event, it switches its rx/tx
functions to "safe" ones that validate the sub-device's rx/tx functions
before calling them. It switches the rx/tx functions by changing the
function pointers in the rte_eth_dev structure.

Following commit 7a0935239b9e, the rx/tx functions of PMDs are no longer
called through the function pointers in the rte_eth_dev structure. They
are rather called through a flat array named rte_eth_fp_ops. The
function pointers in that array are initialized when the devices start
and are initialized.

When a hotplug event occurs, the function pointers in rte_eth_fp_ops
still point to the "unsafe" rx/tx functions in the failsafe PMD since
they haven't been updated. This results in a segmentation fault because
it ends up using the "unsafe" functions, when the "safe" functions
should have been used.

To fix the problem, the failsafe PMD code was changed to update the
function pointers in the rte_eth_fp_ops array when a hotplug event
occurs.

Fixes: 7a0935239b9e ("ethdev: make fast-path functions to use new flat array")
Cc: Konstantin Ananyev <konstantin.ananyev@intel.com>
Cc: stable@dpdk.org

Signed-off-by: Luc Pelletier <lucp.at.work@gmail.com>
---

v2:
* fixed git commit hashes in commit message

 drivers/net/failsafe/failsafe_rxtx.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drivers/net/failsafe/failsafe_rxtx.c b/drivers/net/failsafe/failsafe_rxtx.c
index fe67293299..34d59dfbb1 100644
--- a/drivers/net/failsafe/failsafe_rxtx.c
+++ b/drivers/net/failsafe/failsafe_rxtx.c
@@ -5,6 +5,7 @@
 
 #include <rte_atomic.h>
 #include <rte_debug.h>
+#include <rte_ethdev.h>
 #include <rte_mbuf.h>
 #include <ethdev_driver.h>
 
@@ -44,9 +45,13 @@ failsafe_set_burst_fn(struct rte_eth_dev *dev, int force_safe)
 		DEBUG("Using safe RX bursts%s",
 		      (force_safe ? " (forced)" : ""));
 		dev->rx_pkt_burst = &failsafe_rx_burst;
+		rte_eth_fp_ops[dev->data->port_id].rx_pkt_burst =
+			&failsafe_rx_burst;
 	} else if (!need_safe && safe_set) {
 		DEBUG("Using fast RX bursts");
 		dev->rx_pkt_burst = &failsafe_rx_burst_fast;
+		rte_eth_fp_ops[dev->data->port_id].rx_pkt_burst =
+			&failsafe_rx_burst_fast;
 	}
 	need_safe = force_safe || fs_tx_unsafe(TX_SUBDEV(dev));
 	safe_set = (dev->tx_pkt_burst == &failsafe_tx_burst);
@@ -54,9 +59,13 @@ failsafe_set_burst_fn(struct rte_eth_dev *dev, int force_safe)
 		DEBUG("Using safe TX bursts%s",
 		      (force_safe ? " (forced)" : ""));
 		dev->tx_pkt_burst = &failsafe_tx_burst;
+		rte_eth_fp_ops[dev->data->port_id].tx_pkt_burst =
+			&failsafe_tx_burst;
 	} else if (!need_safe && safe_set) {
 		DEBUG("Using fast TX bursts");
 		dev->tx_pkt_burst = &failsafe_tx_burst_fast;
+		rte_eth_fp_ops[dev->data->port_id].tx_pkt_burst =
+			&failsafe_tx_burst_fast;
 	}
 	rte_wmb();
 }
-- 
2.25.1


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2] failsafe: fix segfault on hotplug event
  2022-11-10 17:42 ` [PATCH v2] " Luc Pelletier
@ 2022-11-10 23:33   ` Stephen Hemminger
  0 siblings, 0 replies; 16+ messages in thread
From: Stephen Hemminger @ 2022-11-10 23:33 UTC (permalink / raw)
  To: Luc Pelletier; +Cc: grive, dev, Konstantin Ananyev, stable

On Thu, 10 Nov 2022 12:42:43 -0500
Luc Pelletier <lucp.at.work@gmail.com> wrote:

> When the failsafe PMD encounters a hotplug event, it switches its rx/tx
> functions to "safe" ones that validate the sub-device's rx/tx functions
> before calling them. It switches the rx/tx functions by changing the
> function pointers in the rte_eth_dev structure.
> 
> Following commit 7a0935239b9e, the rx/tx functions of PMDs are no longer
> called through the function pointers in the rte_eth_dev structure. They
> are rather called through a flat array named rte_eth_fp_ops. The
> function pointers in that array are initialized when the devices start
> and are initialized.
> 
> When a hotplug event occurs, the function pointers in rte_eth_fp_ops
> still point to the "unsafe" rx/tx functions in the failsafe PMD since
> they haven't been updated. This results in a segmentation fault because
> it ends up using the "unsafe" functions, when the "safe" functions
> should have been used.
> 
> To fix the problem, the failsafe PMD code was changed to update the
> function pointers in the rte_eth_fp_ops array when a hotplug event
> occurs.

Have it in both places might be breaking other drivers as well.
Shouldn't there be a ethdev function when changing rx/tx burst.

Also, changing a variable used by another thread needs to be
using __atomic_store and __atomic_load to guarantee that CPU
or compiler will no that it changed.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* RE: [PATCH] failsafe: fix segfault on hotplug event
  2022-11-10 16:34 [PATCH] failsafe: fix segfault on hotplug event Luc Pelletier
  2022-11-10 17:42 ` [PATCH v2] " Luc Pelletier
@ 2022-11-16 17:35 ` Konstantin Ananyev
  2022-11-16 21:51   ` Luc Pelletier
  2022-11-29 14:48 ` [PATCH v3 1/5] " Luc Pelletier
                   ` (7 subsequent siblings)
  9 siblings, 1 reply; 16+ messages in thread
From: Konstantin Ananyev @ 2022-11-16 17:35 UTC (permalink / raw)
  To: Luc Pelletier, grive; +Cc: dev, Konstantin Ananyev, stable



 
> When the failsafe PMD encounters a hotplug event, it switches its rx/tx
> functions to "safe" ones that validate the sub-device's rx/tx functions
> before calling them. It switches the rx/tx functions by changing the
> function pointers in the rte_eth_dev structure.
> 
> Following commit 7a0935239b, the rx/tx functions of PMDs are no longer
> called through the function pointers in the rte_eth_dev structure. They
> are rather called through a flat array named rte_eth_fp_ops. The
> function pointers in that array are initialized when the devices start
> and are initialized.
> 
> When a hotplug event occurs, the function pointers in rte_eth_fp_ops
> still point to the "unsafe" rx/tx functions in the failsafe PMD since
> they haven't been updated. This results in a segmentation fault because
> it ends up using the "unsafe" functions, when the "safe" functions
> should have been used.
> 
> To fix the problem, the failsafe PMD code was changed to update the
> function pointers in the rte_eth_fp_ops array when a hotplug event
> occurs.

 
It is not recommended way to update rte_eth_fp_ops[] contents directly.
There are eth_dev_fp_ops_setup()/ eth_dev_fp_ops_reset() that supposed
to be used for that.
About the fix itself - while it might help till some extent,
I think it will not remove the problem completely.
There still remain a race-condition between rte_eth_rx_burst() and failsafe_eth_rmv_event_callback().
Right now DPDK doesn't support switching PMD fast-ops functions (or updating rxq/txq data)
on the fly.
  
> Fixes: 7a0935239b ("ethdev: make fast-path functions to use new flat array")
> Cc: Konstantin Ananyev <konstantin.ananyev@intel.com>
> Cc: stable@dpdk.org
> 
> Signed-off-by: Luc Pelletier <lucp.at.work@gmail.com>
> ---
>  drivers/net/failsafe/failsafe_rxtx.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> diff --git a/drivers/net/failsafe/failsafe_rxtx.c b/drivers/net/failsafe/failsafe_rxtx.c
> index fe67293299..34d59dfbb1 100644
> --- a/drivers/net/failsafe/failsafe_rxtx.c
> +++ b/drivers/net/failsafe/failsafe_rxtx.c
> @@ -5,6 +5,7 @@
> 
>  #include <rte_atomic.h>
>  #include <rte_debug.h>
> +#include <rte_ethdev.h>
>  #include <rte_mbuf.h>
>  #include <ethdev_driver.h>
> 
> @@ -44,9 +45,13 @@ failsafe_set_burst_fn(struct rte_eth_dev *dev, int force_safe)
>  		DEBUG("Using safe RX bursts%s",
>  		      (force_safe ? " (forced)" : ""));
>  		dev->rx_pkt_burst = &failsafe_rx_burst;
> +		rte_eth_fp_ops[dev->data->port_id].rx_pkt_burst =
> +			&failsafe_rx_burst;
>  	} else if (!need_safe && safe_set) {
>  		DEBUG("Using fast RX bursts");
>  		dev->rx_pkt_burst = &failsafe_rx_burst_fast;
> +		rte_eth_fp_ops[dev->data->port_id].rx_pkt_burst =
> +			&failsafe_rx_burst_fast;
>  	}
>  	need_safe = force_safe || fs_tx_unsafe(TX_SUBDEV(dev));
>  	safe_set = (dev->tx_pkt_burst == &failsafe_tx_burst);
> @@ -54,9 +59,13 @@ failsafe_set_burst_fn(struct rte_eth_dev *dev, int force_safe)
>  		DEBUG("Using safe TX bursts%s",
>  		      (force_safe ? " (forced)" : ""));
>  		dev->tx_pkt_burst = &failsafe_tx_burst;
> +		rte_eth_fp_ops[dev->data->port_id].tx_pkt_burst =
> +			&failsafe_tx_burst;
>  	} else if (!need_safe && safe_set) {
>  		DEBUG("Using fast TX bursts");
>  		dev->tx_pkt_burst = &failsafe_tx_burst_fast;
> +		rte_eth_fp_ops[dev->data->port_id].tx_pkt_burst =
> +			&failsafe_tx_burst_fast;
>  	}
>  	rte_wmb();
>  }
> --
> 2.25.1


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] failsafe: fix segfault on hotplug event
  2022-11-16 17:35 ` [PATCH] " Konstantin Ananyev
@ 2022-11-16 21:51   ` Luc Pelletier
  2022-11-16 22:25     ` Stephen Hemminger
  0 siblings, 1 reply; 16+ messages in thread
From: Luc Pelletier @ 2022-11-16 21:51 UTC (permalink / raw)
  To: Konstantin Ananyev; +Cc: grive, dev, stable, hyonkim, johndale

Hi Konstantin,

> It is not recommended way to update rte_eth_fp_ops[] contents directly.
> There are eth_dev_fp_ops_setup()/ eth_dev_fp_ops_reset() that supposed
> to be used for that.

Good to know. I see another fix that was made in a different PMD that
does exactly the same thing:

https://github.com/DPDK/dpdk/commit/bcd68b68415172815e55fc67cf3947c0433baf74

CC'ing the authors for awareness.

> About the fix itself - while it might help till some extent,
> I think it will not remove the problem completely.
> There still remain a race-condition between rte_eth_rx_burst() and failsafe_eth_rmv_event_callback().
> Right now DPDK doesn't support switching PMD fast-ops functions (or updating rxq/txq data)
> on the fly.

Thanks for the information. This is very helpful.

Are you saying that the previous code also had that same race
condition? It was only updating the rte_eth_dev structure, but I
assume the problem would have been the same since rte_eth_rx_burst()
in DPDK versions <=20 use the function pointers in rte_eth_dev, not
rte_eth_fp_ops.

Can you think of a possible solution to this problem? I'm happy to
provide a patch to properly fix the problem. Having your guidance
would be extremely helpful.

Thanks!

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] failsafe: fix segfault on hotplug event
  2022-11-16 21:51   ` Luc Pelletier
@ 2022-11-16 22:25     ` Stephen Hemminger
  2022-11-18 16:31       ` Konstantin Ananyev
  0 siblings, 1 reply; 16+ messages in thread
From: Stephen Hemminger @ 2022-11-16 22:25 UTC (permalink / raw)
  To: Luc Pelletier; +Cc: Konstantin Ananyev, grive, dev, stable, hyonkim, johndale

On Wed, 16 Nov 2022 16:51:59 -0500
Luc Pelletier <lucp.at.work@gmail.com> wrote:

> Hi Konstantin,
> 
> > It is not recommended way to update rte_eth_fp_ops[] contents directly.
> > There are eth_dev_fp_ops_setup()/ eth_dev_fp_ops_reset() that supposed
> > to be used for that.  
> 
> Good to know. I see another fix that was made in a different PMD that
> does exactly the same thing:
> 
> https://github.com/DPDK/dpdk/commit/bcd68b68415172815e55fc67cf3947c0433baf74
> 
> CC'ing the authors for awareness.
> 
> > About the fix itself - while it might help till some extent,
> > I think it will not remove the problem completely.
> > There still remain a race-condition between rte_eth_rx_burst() and failsafe_eth_rmv_event_callback().
> > Right now DPDK doesn't support switching PMD fast-ops functions (or updating rxq/txq data)
> > on the fly.  
> 
> Thanks for the information. This is very helpful.
> 
> Are you saying that the previous code also had that same race
> condition? It was only updating the rte_eth_dev structure, but I
> assume the problem would have been the same since rte_eth_rx_burst()
> in DPDK versions <=20 use the function pointers in rte_eth_dev, not
> rte_eth_fp_ops.
> 
> Can you think of a possible solution to this problem? I'm happy to
> provide a patch to properly fix the problem. Having your guidance
> would be extremely helpful.
> 
> Thanks!

Changing burst mode on a running device is not safe because
of lack of locking and/or memory barriers.

Would have been better to not to do this optimization.
Just have one rx_burst/tx_burst function and look at what
ever conditions are present there.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* RE: [PATCH] failsafe: fix segfault on hotplug event
  2022-11-16 22:25     ` Stephen Hemminger
@ 2022-11-18 16:31       ` Konstantin Ananyev
  0 siblings, 0 replies; 16+ messages in thread
From: Konstantin Ananyev @ 2022-11-18 16:31 UTC (permalink / raw)
  To: Stephen Hemminger, Luc Pelletier; +Cc: grive, dev, stable, hyonkim, johndale



Hi Luc,
 
> > Hi Konstantin,
> >
> > > It is not recommended way to update rte_eth_fp_ops[] contents directly.
> > > There are eth_dev_fp_ops_setup()/ eth_dev_fp_ops_reset() that supposed
> > > to be used for that.
> >
> > Good to know. I see another fix that was made in a different PMD that
> > does exactly the same thing:
> >
> > https://github.com/DPDK/dpdk/commit/bcd68b68415172815e55fc67cf3947c0433baf74
> >
> > CC'ing the authors for awareness.
> >
> > > About the fix itself - while it might help till some extent,
> > > I think it will not remove the problem completely.
> > > There still remain a race-condition between rte_eth_rx_burst() and failsafe_eth_rmv_event_callback().
> > > Right now DPDK doesn't support switching PMD fast-ops functions (or updating rxq/txq data)
> > > on the fly.
> >
> > Thanks for the information. This is very helpful.
> >
> > Are you saying that the previous code also had that same race
> > condition?

Yes, I believe so. 

> It was only updating the rte_eth_dev structure, but I
> > assume the problem would have been the same since rte_eth_rx_burst()
> > in DPDK versions <=20 use the function pointers in rte_eth_dev, not
> > rte_eth_fp_ops.
> >
> > Can you think of a possible solution to this problem? I'm happy to
> > provide a patch to properly fix the problem. Having your guidance
> > would be extremely helpful.
> >
> > Thanks!
> 
> Changing burst mode on a running device is not safe because
> of lack of locking and/or memory barriers.
> 
> Would have been better to not to do this optimization.
> Just have one rx_burst/tx_burst function and look at what
> ever conditions are present there.

I think Stephen is right - within DPDK it is just not possible to switch RX/TX
function on the fly (without some external synchronization).
So the safe way is to always use safe version of RX/TX call.
I personally don't think such few extra checks will affect performance that much.

As another nit: inside failsafe rx_burst functions it is probably better not to access dev->data->rx_queues directly,
but call rte_eth_rx_burst(sdev->sdev_port_id, ...); instead.
Same for TX.

Thanks
Konstantin

 



^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH v3 1/5] failsafe: fix segfault on hotplug event
  2022-11-10 16:34 [PATCH] failsafe: fix segfault on hotplug event Luc Pelletier
  2022-11-10 17:42 ` [PATCH v2] " Luc Pelletier
  2022-11-16 17:35 ` [PATCH] " Konstantin Ananyev
@ 2022-11-29 14:48 ` Luc Pelletier
  2023-10-31 17:35   ` Stephen Hemminger
  2022-11-29 14:48 ` [PATCH v3 3/5] failsafe: fix double release of port Luc Pelletier
                   ` (6 subsequent siblings)
  9 siblings, 1 reply; 16+ messages in thread
From: Luc Pelletier @ 2022-11-29 14:48 UTC (permalink / raw)
  To: grive; +Cc: dev, stephen, konstantin.ananyev, Luc Pelletier, stable

When the failsafe PMD encounters a hotplug event, it switches its rx/tx
functions to "safe" ones that validate the sub-device's rx/tx functions
before calling them. It switches the rx/tx functions by changing the
function pointers in the rte_eth_dev structure.

Following commit 7a0935239b9e, the rx/tx functions of PMDs are no longer
called through the function pointers in the rte_eth_dev structure. They
are rather called through a flat array named rte_eth_fp_ops. The
function pointers in that array are initialized when the devices start
and are initialized.

When a hotplug event occurs, the function pointers in rte_eth_fp_ops
still point to the "unsafe" rx/tx functions in the failsafe PMD since
they haven't been updated. This results in a segmentation fault because
it ends up using the "unsafe" functions, when the "safe" functions
should have been used.

To fix the problem, the "unsafe" rx/tx functions were completely
removed. The "safe" functions are now always used. Modifying the rx/tx
functions on-the-fly is not supported by DPDK, so this is the correct
approach and should have very minimal impact on performance.

Fixes: 7a0935239b9e ("ethdev: make fast-path functions to use new flat array")
Cc: Konstantin Ananyev <konstantin.ananyev@huawei.com>
Cc: stable@dpdk.org

Signed-off-by: Luc Pelletier <lucp.at.work@gmail.com>
---
 drivers/net/failsafe/failsafe_ether.c   |  2 -
 drivers/net/failsafe/failsafe_private.h |  8 ---
 drivers/net/failsafe/failsafe_rxtx.c    | 83 +------------------------
 3 files changed, 1 insertion(+), 92 deletions(-)

diff --git a/drivers/net/failsafe/failsafe_ether.c b/drivers/net/failsafe/failsafe_ether.c
index 10b90fd837..517126565f 100644
--- a/drivers/net/failsafe/failsafe_ether.c
+++ b/drivers/net/failsafe/failsafe_ether.c
@@ -595,8 +595,6 @@ failsafe_eth_rmv_event_callback(uint16_t port_id __rte_unused,
 	fs_lock(fs_dev(sdev), 0);
 	/* Switch as soon as possible tx_dev. */
 	fs_switch_dev(fs_dev(sdev), sdev);
-	/* Use safe bursts in any case. */
-	failsafe_set_burst_fn(fs_dev(sdev), 1);
 	/*
 	 * Async removal, the sub-PMD will try to unregister
 	 * the callback at the source of the current thread context.
diff --git a/drivers/net/failsafe/failsafe_private.h b/drivers/net/failsafe/failsafe_private.h
index 53a451c1b1..3865f2fc34 100644
--- a/drivers/net/failsafe/failsafe_private.h
+++ b/drivers/net/failsafe/failsafe_private.h
@@ -208,18 +208,11 @@ int failsafe_hotplug_alarm_cancel(struct rte_eth_dev *dev);
 
 /* RX / TX */
 
-void failsafe_set_burst_fn(struct rte_eth_dev *dev, int force_safe);
-
 uint16_t failsafe_rx_burst(void *rxq,
 		struct rte_mbuf **rx_pkts, uint16_t nb_pkts);
 uint16_t failsafe_tx_burst(void *txq,
 		struct rte_mbuf **tx_pkts, uint16_t nb_pkts);
 
-uint16_t failsafe_rx_burst_fast(void *rxq,
-		struct rte_mbuf **rx_pkts, uint16_t nb_pkts);
-uint16_t failsafe_tx_burst_fast(void *txq,
-		struct rte_mbuf **tx_pkts, uint16_t nb_pkts);
-
 /* ARGS */
 
 int failsafe_args_parse(struct rte_eth_dev *dev, const char *params);
@@ -487,7 +480,6 @@ fs_switch_dev(struct rte_eth_dev *dev,
 	} else {
 		return;
 	}
-	failsafe_set_burst_fn(dev, 0);
 	rte_wmb();
 }
 
diff --git a/drivers/net/failsafe/failsafe_rxtx.c b/drivers/net/failsafe/failsafe_rxtx.c
index fe67293299..707fe60a36 100644
--- a/drivers/net/failsafe/failsafe_rxtx.c
+++ b/drivers/net/failsafe/failsafe_rxtx.c
@@ -28,39 +28,6 @@ fs_tx_unsafe(struct sub_device *sdev)
 		(sdev->state != DEV_STARTED);
 }
 
-void
-failsafe_set_burst_fn(struct rte_eth_dev *dev, int force_safe)
-{
-	struct sub_device *sdev;
-	uint8_t i;
-	int need_safe;
-	int safe_set;
-
-	need_safe = force_safe;
-	FOREACH_SUBDEV(sdev, i, dev)
-		need_safe |= fs_rx_unsafe(sdev);
-	safe_set = (dev->rx_pkt_burst == &failsafe_rx_burst);
-	if (need_safe && !safe_set) {
-		DEBUG("Using safe RX bursts%s",
-		      (force_safe ? " (forced)" : ""));
-		dev->rx_pkt_burst = &failsafe_rx_burst;
-	} else if (!need_safe && safe_set) {
-		DEBUG("Using fast RX bursts");
-		dev->rx_pkt_burst = &failsafe_rx_burst_fast;
-	}
-	need_safe = force_safe || fs_tx_unsafe(TX_SUBDEV(dev));
-	safe_set = (dev->tx_pkt_burst == &failsafe_tx_burst);
-	if (need_safe && !safe_set) {
-		DEBUG("Using safe TX bursts%s",
-		      (force_safe ? " (forced)" : ""));
-		dev->tx_pkt_burst = &failsafe_tx_burst;
-	} else if (!need_safe && safe_set) {
-		DEBUG("Using fast TX bursts");
-		dev->tx_pkt_burst = &failsafe_tx_burst_fast;
-	}
-	rte_wmb();
-}
-
 /*
  * Override source port in Rx packets.
  *
@@ -89,7 +56,7 @@ failsafe_rx_burst(void *queue,
 	rxq = queue;
 	sdev = rxq->sdev;
 	do {
-		if (fs_rx_unsafe(sdev)) {
+		if (unlikely(fs_rx_unsafe(sdev))) {
 			nb_rx = 0;
 			sdev = sdev->next;
 			continue;
@@ -108,34 +75,6 @@ failsafe_rx_burst(void *queue,
 	return nb_rx;
 }
 
-uint16_t
-failsafe_rx_burst_fast(void *queue,
-			 struct rte_mbuf **rx_pkts,
-			 uint16_t nb_pkts)
-{
-	struct sub_device *sdev;
-	struct rxq *rxq;
-	void *sub_rxq;
-	uint16_t nb_rx;
-
-	rxq = queue;
-	sdev = rxq->sdev;
-	do {
-		RTE_ASSERT(!fs_rx_unsafe(sdev));
-		sub_rxq = ETH(sdev)->data->rx_queues[rxq->qid];
-		FS_ATOMIC_P(rxq->refcnt[sdev->sid]);
-		nb_rx = ETH(sdev)->
-			rx_pkt_burst(sub_rxq, rx_pkts, nb_pkts);
-		FS_ATOMIC_V(rxq->refcnt[sdev->sid]);
-		sdev = sdev->next;
-	} while (nb_rx == 0 && sdev != rxq->sdev);
-	rxq->sdev = sdev;
-	if (nb_rx)
-		failsafe_rx_set_port(rx_pkts, nb_rx,
-				     rxq->priv->data->port_id);
-	return nb_rx;
-}
-
 uint16_t
 failsafe_tx_burst(void *queue,
 		  struct rte_mbuf **tx_pkts,
@@ -156,23 +95,3 @@ failsafe_tx_burst(void *queue,
 	FS_ATOMIC_V(txq->refcnt[sdev->sid]);
 	return nb_tx;
 }
-
-uint16_t
-failsafe_tx_burst_fast(void *queue,
-			 struct rte_mbuf **tx_pkts,
-			 uint16_t nb_pkts)
-{
-	struct sub_device *sdev;
-	struct txq *txq;
-	void *sub_txq;
-	uint16_t nb_tx;
-
-	txq = queue;
-	sdev = TX_SUBDEV(&rte_eth_devices[txq->priv->data->port_id]);
-	RTE_ASSERT(!fs_tx_unsafe(sdev));
-	sub_txq = ETH(sdev)->data->tx_queues[txq->qid];
-	FS_ATOMIC_P(txq->refcnt[sdev->sid]);
-	nb_tx = ETH(sdev)->tx_pkt_burst(sub_txq, tx_pkts, nb_pkts);
-	FS_ATOMIC_V(txq->refcnt[sdev->sid]);
-	return nb_tx;
-}
-- 
2.38.1


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH v3 3/5] failsafe: fix double release of port
  2022-11-10 16:34 [PATCH] failsafe: fix segfault on hotplug event Luc Pelletier
                   ` (2 preceding siblings ...)
  2022-11-29 14:48 ` [PATCH v3 1/5] " Luc Pelletier
@ 2022-11-29 14:48 ` Luc Pelletier
  2022-11-29 14:48 ` [PATCH v3 4/5] failsafe: use public APIs in fs_link_update Luc Pelletier
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 16+ messages in thread
From: Luc Pelletier @ 2022-11-29 14:48 UTC (permalink / raw)
  To: grive
  Cc: dev, stephen, konstantin.ananyev, Luc Pelletier, Matan Azrad, stable

When a sub-device is hot-unplugged, rte_eth_dev_release_port is being
called twice. Once indirectly via rte_eth_dev_close and once explicitly.

Changed the code to remove the explicit call since it's not required and
results in mistakenly releasing port 0 (due to zero'ing of memory being
done in rte_eth_dev_release_port).

Fixes: fac0ae546e5f ("net/failsafe: free port by dedicated function")
Cc: Matan Azrad <matan@nvidia.com>
Cc: stable@dpdk.org

Signed-off-by: Luc Pelletier <lucp.at.work@gmail.com>
---
 drivers/net/failsafe/failsafe_ether.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/net/failsafe/failsafe_ether.c b/drivers/net/failsafe/failsafe_ether.c
index 517126565f..3bfc446638 100644
--- a/drivers/net/failsafe/failsafe_ether.c
+++ b/drivers/net/failsafe/failsafe_ether.c
@@ -301,8 +301,6 @@ fs_dev_remove(struct sub_device *sdev)
 		if (ret < 0) {
 			ERROR("Bus detach failed for sub_device %u",
 			      SUB_ID(sdev));
-		} else {
-			rte_eth_dev_release_port(ETH(sdev));
 		}
 		sdev->state = DEV_PARSED;
 		/* fallthrough */
-- 
2.38.1


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH v3 4/5] failsafe: use public APIs in fs_link_update
  2022-11-10 16:34 [PATCH] failsafe: fix segfault on hotplug event Luc Pelletier
                   ` (3 preceding siblings ...)
  2022-11-29 14:48 ` [PATCH v3 3/5] failsafe: fix double release of port Luc Pelletier
@ 2022-11-29 14:48 ` Luc Pelletier
  2022-11-29 14:48 ` [PATCH v3 5/5] failsafe: make sub-device remove flag thread-safe Luc Pelletier
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 16+ messages in thread
From: Luc Pelletier @ 2022-11-29 14:48 UTC (permalink / raw)
  To: grive
  Cc: dev, stephen, konstantin.ananyev, Luc Pelletier, Matan Azrad, stable

The link_update function pointer is called directly on a sub-device, via
its rte_eth_dev structure. In addition, it assumes that if link_update
returns a value other than 0 or -1, an error occurred. That's not
accurate. For example, the mlx5 PMD returns a positive value to indicate
that the link status was updated. This results in fs_link_update failing
when it really should have succeeded.

The code now uses the public APIs rte_eth_link_get and
rte_eth_link_get_nowait to query the link status of each sub-device. It
also uses rte_eth_linkstatus_set to set the link status of the failsafe
device.

Fixes: ae80146c5a1b ("net/failsafe: fix removed device handling")
Cc: Matan Azrad <matan@nvidia.com>
Cc: stable@dpdk.org

Signed-off-by: Luc Pelletier <lucp.at.work@gmail.com>
---
 drivers/net/failsafe/failsafe_ops.c | 38 ++++++++++++++---------------
 1 file changed, 18 insertions(+), 20 deletions(-)

diff --git a/drivers/net/failsafe/failsafe_ops.c b/drivers/net/failsafe/failsafe_ops.c
index d357e1bc83..7bd223af37 100644
--- a/drivers/net/failsafe/failsafe_ops.c
+++ b/drivers/net/failsafe/failsafe_ops.c
@@ -820,34 +820,32 @@ fs_link_update(struct rte_eth_dev *dev,
 {
 	struct sub_device *sdev;
 	uint8_t i;
-	int ret;
+	int ret = -1;
+	int sdev_link_ret;
+	struct rte_eth_link link_info;
 
 	fs_lock(dev, 0);
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
-		DEBUG("Calling link_update on sub_device %d", i);
-		ret = (SUBOPS(sdev, link_update))(ETH(sdev), wait_to_complete);
-		if (ret && ret != -1 && sdev->remove == 0 &&
-		    rte_eth_dev_is_removed(PORT_ID(sdev)) == 0) {
-			ERROR("Link update failed for sub_device %d with error %d",
-			      i, ret);
-			fs_unlock(dev, 0);
-			return ret;
+		DEBUG("Calling rte_eth_link_get on sub_device %d", i);
+		if (wait_to_complete) {
+			sdev_link_ret = rte_eth_link_get(PORT_ID(sdev), &link_info);
+		} else {
+			sdev_link_ret = rte_eth_link_get_nowait(PORT_ID(sdev),
+				&link_info);
 		}
-	}
-	if (TX_SUBDEV(dev)) {
-		struct rte_eth_link *l1;
-		struct rte_eth_link *l2;
 
-		l1 = &dev->data->dev_link;
-		l2 = &ETH(TX_SUBDEV(dev))->data->dev_link;
-		if (memcmp(l1, l2, sizeof(*l1))) {
-			*l1 = *l2;
-			fs_unlock(dev, 0);
-			return 0;
+		if (likely(sdev_link_ret == 0)) {
+			if (TX_SUBDEV(dev) == sdev)
+				ret = rte_eth_linkstatus_set(dev, &link_info);
+		}
+		else if (sdev->remove == 0 &&
+				rte_eth_dev_is_removed(PORT_ID(sdev)) == 0) {
+			ERROR("Link get failed for sub_device %d with error %d",
+				i, sdev_link_ret);
 		}
 	}
 	fs_unlock(dev, 0);
-	return -1;
+	return ret;
 }
 
 static int
-- 
2.38.1


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH v3 5/5] failsafe: make sub-device remove flag thread-safe
  2022-11-10 16:34 [PATCH] failsafe: fix segfault on hotplug event Luc Pelletier
                   ` (4 preceding siblings ...)
  2022-11-29 14:48 ` [PATCH v3 4/5] failsafe: use public APIs in fs_link_update Luc Pelletier
@ 2022-11-29 14:48 ` Luc Pelletier
  2022-11-29 15:25 ` [PATCH v4 1/5] failsafe: fix segfault on hotplug event Luc Pelletier
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 16+ messages in thread
From: Luc Pelletier @ 2022-11-29 14:48 UTC (permalink / raw)
  To: grive; +Cc: dev, stephen, konstantin.ananyev, Luc Pelletier, stable

For each sub-device, there's a remove flag that's used to indicate if
the sub-device has been removed or not. This flag is currently a
volatile bit field. Even if it's marked as volatile, a bit field is not
thread-safe according to the C standard.

The remove flag is now an unsigned int rather than a bit field. We also
now use the atomic built-ins to read/write the value of the remove flag
to ensure thread-safety.

Fixes: 598fb8aec6f6 ("net/failsafe: support device removal")
Cc: Gaetan Rivet <grive@u256.net>
Cc: stable@dpdk.org

Signed-off-by: Luc Pelletier <lucp.at.work@gmail.com>
---
 drivers/net/failsafe/failsafe_ether.c   | 8 ++++----
 drivers/net/failsafe/failsafe_ops.c     | 2 +-
 drivers/net/failsafe/failsafe_private.h | 4 ++--
 drivers/net/failsafe/failsafe_rxtx.c    | 2 +-
 4 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/net/failsafe/failsafe_ether.c b/drivers/net/failsafe/failsafe_ether.c
index 3bfc446638..9014777e52 100644
--- a/drivers/net/failsafe/failsafe_ether.c
+++ b/drivers/net/failsafe/failsafe_ether.c
@@ -311,7 +311,7 @@ fs_dev_remove(struct sub_device *sdev)
 		/* the end */
 		break;
 	}
-	sdev->remove = 0;
+	__atomic_store_n(&sdev->remove, 0, __ATOMIC_RELEASE);
 	failsafe_hotplug_alarm_install(fs_dev(sdev));
 }
 
@@ -388,7 +388,7 @@ failsafe_dev_remove(struct rte_eth_dev *dev)
 	uint8_t i;
 
 	FOREACH_SUBDEV(sdev, i, dev) {
-		if (!sdev->remove)
+		if (!__atomic_load_n(&sdev->remove, __ATOMIC_ACQUIRE))
 			continue;
 
 		/* Active devices must have finished their burst and
@@ -556,7 +556,7 @@ failsafe_eth_dev_state_sync(struct rte_eth_dev *dev)
 err_remove:
 	FOREACH_SUBDEV(sdev, i, dev)
 		if (sdev->state != PRIV(dev)->state)
-			sdev->remove = 1;
+			__atomic_store_n(&sdev->remove, 1, __ATOMIC_RELEASE);
 	return ret;
 }
 
@@ -597,7 +597,7 @@ failsafe_eth_rmv_event_callback(uint16_t port_id __rte_unused,
 	 * Async removal, the sub-PMD will try to unregister
 	 * the callback at the source of the current thread context.
 	 */
-	sdev->remove = 1;
+	__atomic_store_n(&sdev->remove, 1, __ATOMIC_RELEASE);
 	fs_unlock(fs_dev(sdev), 0);
 	return 0;
 }
diff --git a/drivers/net/failsafe/failsafe_ops.c b/drivers/net/failsafe/failsafe_ops.c
index 7bd223af37..86d97970bc 100644
--- a/drivers/net/failsafe/failsafe_ops.c
+++ b/drivers/net/failsafe/failsafe_ops.c
@@ -838,7 +838,7 @@ fs_link_update(struct rte_eth_dev *dev,
 			if (TX_SUBDEV(dev) == sdev)
 				ret = rte_eth_linkstatus_set(dev, &link_info);
 		}
-		else if (sdev->remove == 0 &&
+		else if (__atomic_load_n(&sdev->remove, __ATOMIC_ACQUIRE) == 0 &&
 				rte_eth_dev_is_removed(PORT_ID(sdev)) == 0) {
 			ERROR("Link get failed for sub_device %d with error %d",
 				i, sdev_link_ret);
diff --git a/drivers/net/failsafe/failsafe_private.h b/drivers/net/failsafe/failsafe_private.h
index 3865f2fc34..0577a62ab2 100644
--- a/drivers/net/failsafe/failsafe_private.h
+++ b/drivers/net/failsafe/failsafe_private.h
@@ -131,7 +131,7 @@ struct sub_device {
 	/* sub device port id*/
 	uint16_t sdev_port_id; /* shared between processes */
 	/* flag calling for recollection */
-	volatile unsigned int remove:1;
+	unsigned int remove;
 	/* flow isolation state */
 	int flow_isolated:1;
 	/* RMV callback registration state */
@@ -490,7 +490,7 @@ static inline int
 fs_err(struct sub_device *sdev, int err)
 {
 	/* A device removal shouldn't be reported as an error. */
-	if (sdev->remove == 1 || err == -EIO)
+	if (__atomic_load_n(&sdev->remove, __ATOMIC_ACQUIRE) == 1 || err == -EIO)
 		return rte_errno = 0;
 	return err;
 }
diff --git a/drivers/net/failsafe/failsafe_rxtx.c b/drivers/net/failsafe/failsafe_rxtx.c
index 6d8ab0a6e7..c5caab4204 100644
--- a/drivers/net/failsafe/failsafe_rxtx.c
+++ b/drivers/net/failsafe/failsafe_rxtx.c
@@ -16,7 +16,7 @@ fs_rx_unsafe(struct sub_device *sdev)
 	return (ETH(sdev) == NULL) ||
 		(ETH(sdev)->rx_pkt_burst == NULL) ||
 		(sdev->state != DEV_STARTED) ||
-		(sdev->remove != 0);
+		(__atomic_load_n(&sdev->remove, __ATOMIC_ACQUIRE) != 0);
 }
 
 static inline int
-- 
2.38.1


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH v4 1/5] failsafe: fix segfault on hotplug event
  2022-11-10 16:34 [PATCH] failsafe: fix segfault on hotplug event Luc Pelletier
                   ` (5 preceding siblings ...)
  2022-11-29 14:48 ` [PATCH v3 5/5] failsafe: make sub-device remove flag thread-safe Luc Pelletier
@ 2022-11-29 15:25 ` Luc Pelletier
  2022-11-29 15:25 ` [PATCH v4 3/5] failsafe: fix double release of port Luc Pelletier
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 16+ messages in thread
From: Luc Pelletier @ 2022-11-29 15:25 UTC (permalink / raw)
  To: grive; +Cc: dev, stephen, konstantin.ananyev, Luc Pelletier, stable

When the failsafe PMD encounters a hotplug event, it switches its rx/tx
functions to "safe" ones that validate the sub-device's rx/tx functions
before calling them. It switches the rx/tx functions by changing the
function pointers in the rte_eth_dev structure.

Following commit 7a0935239b9e, the rx/tx functions of PMDs are no longer
called through the function pointers in the rte_eth_dev structure. They
are rather called through a flat array named rte_eth_fp_ops. The
function pointers in that array are initialized when the devices start
and are initialized.

When a hotplug event occurs, the function pointers in rte_eth_fp_ops
still point to the "unsafe" rx/tx functions in the failsafe PMD since
they haven't been updated. This results in a segmentation fault because
it ends up using the "unsafe" functions, when the "safe" functions
should have been used.

To fix the problem, the "unsafe" rx/tx functions were completely
removed. The "safe" functions are now always used. Modifying the rx/tx
functions on-the-fly is not supported by DPDK, so this is the correct
approach and should have very minimal impact on performance.

Fixes: 7a0935239b9e ("ethdev: make fast-path functions to use new flat array")
Cc: Konstantin Ananyev <konstantin.ananyev@huawei.com>
Cc: stable@dpdk.org

Signed-off-by: Luc Pelletier <lucp.at.work@gmail.com>
---
 drivers/net/failsafe/failsafe_ether.c   |  2 -
 drivers/net/failsafe/failsafe_private.h |  8 ---
 drivers/net/failsafe/failsafe_rxtx.c    | 83 +------------------------
 3 files changed, 1 insertion(+), 92 deletions(-)

diff --git a/drivers/net/failsafe/failsafe_ether.c b/drivers/net/failsafe/failsafe_ether.c
index 10b90fd837..517126565f 100644
--- a/drivers/net/failsafe/failsafe_ether.c
+++ b/drivers/net/failsafe/failsafe_ether.c
@@ -595,8 +595,6 @@ failsafe_eth_rmv_event_callback(uint16_t port_id __rte_unused,
 	fs_lock(fs_dev(sdev), 0);
 	/* Switch as soon as possible tx_dev. */
 	fs_switch_dev(fs_dev(sdev), sdev);
-	/* Use safe bursts in any case. */
-	failsafe_set_burst_fn(fs_dev(sdev), 1);
 	/*
 	 * Async removal, the sub-PMD will try to unregister
 	 * the callback at the source of the current thread context.
diff --git a/drivers/net/failsafe/failsafe_private.h b/drivers/net/failsafe/failsafe_private.h
index 53a451c1b1..3865f2fc34 100644
--- a/drivers/net/failsafe/failsafe_private.h
+++ b/drivers/net/failsafe/failsafe_private.h
@@ -208,18 +208,11 @@ int failsafe_hotplug_alarm_cancel(struct rte_eth_dev *dev);
 
 /* RX / TX */
 
-void failsafe_set_burst_fn(struct rte_eth_dev *dev, int force_safe);
-
 uint16_t failsafe_rx_burst(void *rxq,
 		struct rte_mbuf **rx_pkts, uint16_t nb_pkts);
 uint16_t failsafe_tx_burst(void *txq,
 		struct rte_mbuf **tx_pkts, uint16_t nb_pkts);
 
-uint16_t failsafe_rx_burst_fast(void *rxq,
-		struct rte_mbuf **rx_pkts, uint16_t nb_pkts);
-uint16_t failsafe_tx_burst_fast(void *txq,
-		struct rte_mbuf **tx_pkts, uint16_t nb_pkts);
-
 /* ARGS */
 
 int failsafe_args_parse(struct rte_eth_dev *dev, const char *params);
@@ -487,7 +480,6 @@ fs_switch_dev(struct rte_eth_dev *dev,
 	} else {
 		return;
 	}
-	failsafe_set_burst_fn(dev, 0);
 	rte_wmb();
 }
 
diff --git a/drivers/net/failsafe/failsafe_rxtx.c b/drivers/net/failsafe/failsafe_rxtx.c
index fe67293299..707fe60a36 100644
--- a/drivers/net/failsafe/failsafe_rxtx.c
+++ b/drivers/net/failsafe/failsafe_rxtx.c
@@ -28,39 +28,6 @@ fs_tx_unsafe(struct sub_device *sdev)
 		(sdev->state != DEV_STARTED);
 }
 
-void
-failsafe_set_burst_fn(struct rte_eth_dev *dev, int force_safe)
-{
-	struct sub_device *sdev;
-	uint8_t i;
-	int need_safe;
-	int safe_set;
-
-	need_safe = force_safe;
-	FOREACH_SUBDEV(sdev, i, dev)
-		need_safe |= fs_rx_unsafe(sdev);
-	safe_set = (dev->rx_pkt_burst == &failsafe_rx_burst);
-	if (need_safe && !safe_set) {
-		DEBUG("Using safe RX bursts%s",
-		      (force_safe ? " (forced)" : ""));
-		dev->rx_pkt_burst = &failsafe_rx_burst;
-	} else if (!need_safe && safe_set) {
-		DEBUG("Using fast RX bursts");
-		dev->rx_pkt_burst = &failsafe_rx_burst_fast;
-	}
-	need_safe = force_safe || fs_tx_unsafe(TX_SUBDEV(dev));
-	safe_set = (dev->tx_pkt_burst == &failsafe_tx_burst);
-	if (need_safe && !safe_set) {
-		DEBUG("Using safe TX bursts%s",
-		      (force_safe ? " (forced)" : ""));
-		dev->tx_pkt_burst = &failsafe_tx_burst;
-	} else if (!need_safe && safe_set) {
-		DEBUG("Using fast TX bursts");
-		dev->tx_pkt_burst = &failsafe_tx_burst_fast;
-	}
-	rte_wmb();
-}
-
 /*
  * Override source port in Rx packets.
  *
@@ -89,7 +56,7 @@ failsafe_rx_burst(void *queue,
 	rxq = queue;
 	sdev = rxq->sdev;
 	do {
-		if (fs_rx_unsafe(sdev)) {
+		if (unlikely(fs_rx_unsafe(sdev))) {
 			nb_rx = 0;
 			sdev = sdev->next;
 			continue;
@@ -108,34 +75,6 @@ failsafe_rx_burst(void *queue,
 	return nb_rx;
 }
 
-uint16_t
-failsafe_rx_burst_fast(void *queue,
-			 struct rte_mbuf **rx_pkts,
-			 uint16_t nb_pkts)
-{
-	struct sub_device *sdev;
-	struct rxq *rxq;
-	void *sub_rxq;
-	uint16_t nb_rx;
-
-	rxq = queue;
-	sdev = rxq->sdev;
-	do {
-		RTE_ASSERT(!fs_rx_unsafe(sdev));
-		sub_rxq = ETH(sdev)->data->rx_queues[rxq->qid];
-		FS_ATOMIC_P(rxq->refcnt[sdev->sid]);
-		nb_rx = ETH(sdev)->
-			rx_pkt_burst(sub_rxq, rx_pkts, nb_pkts);
-		FS_ATOMIC_V(rxq->refcnt[sdev->sid]);
-		sdev = sdev->next;
-	} while (nb_rx == 0 && sdev != rxq->sdev);
-	rxq->sdev = sdev;
-	if (nb_rx)
-		failsafe_rx_set_port(rx_pkts, nb_rx,
-				     rxq->priv->data->port_id);
-	return nb_rx;
-}
-
 uint16_t
 failsafe_tx_burst(void *queue,
 		  struct rte_mbuf **tx_pkts,
@@ -156,23 +95,3 @@ failsafe_tx_burst(void *queue,
 	FS_ATOMIC_V(txq->refcnt[sdev->sid]);
 	return nb_tx;
 }
-
-uint16_t
-failsafe_tx_burst_fast(void *queue,
-			 struct rte_mbuf **tx_pkts,
-			 uint16_t nb_pkts)
-{
-	struct sub_device *sdev;
-	struct txq *txq;
-	void *sub_txq;
-	uint16_t nb_tx;
-
-	txq = queue;
-	sdev = TX_SUBDEV(&rte_eth_devices[txq->priv->data->port_id]);
-	RTE_ASSERT(!fs_tx_unsafe(sdev));
-	sub_txq = ETH(sdev)->data->tx_queues[txq->qid];
-	FS_ATOMIC_P(txq->refcnt[sdev->sid]);
-	nb_tx = ETH(sdev)->tx_pkt_burst(sub_txq, tx_pkts, nb_pkts);
-	FS_ATOMIC_V(txq->refcnt[sdev->sid]);
-	return nb_tx;
-}
-- 
2.38.1


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH v4 3/5] failsafe: fix double release of port
  2022-11-10 16:34 [PATCH] failsafe: fix segfault on hotplug event Luc Pelletier
                   ` (6 preceding siblings ...)
  2022-11-29 15:25 ` [PATCH v4 1/5] failsafe: fix segfault on hotplug event Luc Pelletier
@ 2022-11-29 15:25 ` Luc Pelletier
  2022-11-29 15:25 ` [PATCH v4 4/5] failsafe: use public APIs in fs_link_update Luc Pelletier
  2022-11-29 15:25 ` [PATCH v4 5/5] failsafe: make sub-device remove flag thread-safe Luc Pelletier
  9 siblings, 0 replies; 16+ messages in thread
From: Luc Pelletier @ 2022-11-29 15:25 UTC (permalink / raw)
  To: grive
  Cc: dev, stephen, konstantin.ananyev, Luc Pelletier, Matan Azrad, stable

When a sub-device is hot-unplugged, rte_eth_dev_release_port is being
called twice. Once indirectly via rte_eth_dev_close and once explicitly.

Changed the code to remove the explicit call since it's not required and
results in mistakenly releasing port 0 (due to zero'ing of memory being
done in rte_eth_dev_release_port).

Fixes: fac0ae546e5f ("net/failsafe: free port by dedicated function")
Cc: Matan Azrad <matan@nvidia.com>
Cc: stable@dpdk.org

Signed-off-by: Luc Pelletier <lucp.at.work@gmail.com>
---
 drivers/net/failsafe/failsafe_ether.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/net/failsafe/failsafe_ether.c b/drivers/net/failsafe/failsafe_ether.c
index 517126565f..3bfc446638 100644
--- a/drivers/net/failsafe/failsafe_ether.c
+++ b/drivers/net/failsafe/failsafe_ether.c
@@ -301,8 +301,6 @@ fs_dev_remove(struct sub_device *sdev)
 		if (ret < 0) {
 			ERROR("Bus detach failed for sub_device %u",
 			      SUB_ID(sdev));
-		} else {
-			rte_eth_dev_release_port(ETH(sdev));
 		}
 		sdev->state = DEV_PARSED;
 		/* fallthrough */
-- 
2.38.1


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH v4 4/5] failsafe: use public APIs in fs_link_update
  2022-11-10 16:34 [PATCH] failsafe: fix segfault on hotplug event Luc Pelletier
                   ` (7 preceding siblings ...)
  2022-11-29 15:25 ` [PATCH v4 3/5] failsafe: fix double release of port Luc Pelletier
@ 2022-11-29 15:25 ` Luc Pelletier
  2022-11-29 15:25 ` [PATCH v4 5/5] failsafe: make sub-device remove flag thread-safe Luc Pelletier
  9 siblings, 0 replies; 16+ messages in thread
From: Luc Pelletier @ 2022-11-29 15:25 UTC (permalink / raw)
  To: grive
  Cc: dev, stephen, konstantin.ananyev, Luc Pelletier, Matan Azrad, stable

The link_update function pointer is called directly on a sub-device, via
its rte_eth_dev structure. In addition, it assumes that if link_update
returns a value other than 0 or -1, an error occurred. That's not
accurate. For example, the mlx5 PMD returns a positive value to indicate
that the link status was updated. This results in fs_link_update failing
when it really should have succeeded.

The code now uses the public APIs rte_eth_link_get and
rte_eth_link_get_nowait to query the link status of each sub-device. It
also uses rte_eth_linkstatus_set to set the link status of the failsafe
device.

Fixes: ae80146c5a1b ("net/failsafe: fix removed device handling")
Cc: Matan Azrad <matan@nvidia.com>
Cc: stable@dpdk.org

Signed-off-by: Luc Pelletier <lucp.at.work@gmail.com>
---
 drivers/net/failsafe/failsafe_ops.c | 37 +++++++++++++----------------
 1 file changed, 17 insertions(+), 20 deletions(-)

diff --git a/drivers/net/failsafe/failsafe_ops.c b/drivers/net/failsafe/failsafe_ops.c
index d357e1bc83..b66dfa269b 100644
--- a/drivers/net/failsafe/failsafe_ops.c
+++ b/drivers/net/failsafe/failsafe_ops.c
@@ -820,34 +820,31 @@ fs_link_update(struct rte_eth_dev *dev,
 {
 	struct sub_device *sdev;
 	uint8_t i;
-	int ret;
+	int ret = -1;
+	int sdev_link_ret;
+	struct rte_eth_link link_info;
 
 	fs_lock(dev, 0);
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
-		DEBUG("Calling link_update on sub_device %d", i);
-		ret = (SUBOPS(sdev, link_update))(ETH(sdev), wait_to_complete);
-		if (ret && ret != -1 && sdev->remove == 0 &&
-		    rte_eth_dev_is_removed(PORT_ID(sdev)) == 0) {
-			ERROR("Link update failed for sub_device %d with error %d",
-			      i, ret);
-			fs_unlock(dev, 0);
-			return ret;
+		DEBUG("Calling rte_eth_link_get on sub_device %d", i);
+		if (wait_to_complete) {
+			sdev_link_ret = rte_eth_link_get(PORT_ID(sdev), &link_info);
+		} else {
+			sdev_link_ret = rte_eth_link_get_nowait(PORT_ID(sdev),
+				&link_info);
 		}
-	}
-	if (TX_SUBDEV(dev)) {
-		struct rte_eth_link *l1;
-		struct rte_eth_link *l2;
 
-		l1 = &dev->data->dev_link;
-		l2 = &ETH(TX_SUBDEV(dev))->data->dev_link;
-		if (memcmp(l1, l2, sizeof(*l1))) {
-			*l1 = *l2;
-			fs_unlock(dev, 0);
-			return 0;
+		if (likely(sdev_link_ret == 0)) {
+			if (TX_SUBDEV(dev) == sdev)
+				ret = rte_eth_linkstatus_set(dev, &link_info);
+		} else if (sdev->remove == 0 &&
+				rte_eth_dev_is_removed(PORT_ID(sdev)) == 0) {
+			ERROR("Link get failed for sub_device %d with error %d",
+				i, sdev_link_ret);
 		}
 	}
 	fs_unlock(dev, 0);
-	return -1;
+	return ret;
 }
 
 static int
-- 
2.38.1


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH v4 5/5] failsafe: make sub-device remove flag thread-safe
  2022-11-10 16:34 [PATCH] failsafe: fix segfault on hotplug event Luc Pelletier
                   ` (8 preceding siblings ...)
  2022-11-29 15:25 ` [PATCH v4 4/5] failsafe: use public APIs in fs_link_update Luc Pelletier
@ 2022-11-29 15:25 ` Luc Pelletier
  9 siblings, 0 replies; 16+ messages in thread
From: Luc Pelletier @ 2022-11-29 15:25 UTC (permalink / raw)
  To: grive; +Cc: dev, stephen, konstantin.ananyev, Luc Pelletier, stable

For each sub-device, there's a remove flag that's used to indicate if
the sub-device has been removed or not. This flag is currently a
volatile bit field. Even if it's marked as volatile, a bit field is not
thread-safe according to the C standard.

The remove flag is now an unsigned int rather than a bit field. We also
now use the atomic built-ins to read/write the value of the remove flag
to ensure thread-safety.

Fixes: 598fb8aec6f6 ("net/failsafe: support device removal")
Cc: Gaetan Rivet <grive@u256.net>
Cc: stable@dpdk.org

Signed-off-by: Luc Pelletier <lucp.at.work@gmail.com>
---
 drivers/net/failsafe/failsafe_ether.c   | 8 ++++----
 drivers/net/failsafe/failsafe_ops.c     | 2 +-
 drivers/net/failsafe/failsafe_private.h | 4 ++--
 drivers/net/failsafe/failsafe_rxtx.c    | 2 +-
 4 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/net/failsafe/failsafe_ether.c b/drivers/net/failsafe/failsafe_ether.c
index 3bfc446638..9014777e52 100644
--- a/drivers/net/failsafe/failsafe_ether.c
+++ b/drivers/net/failsafe/failsafe_ether.c
@@ -311,7 +311,7 @@ fs_dev_remove(struct sub_device *sdev)
 		/* the end */
 		break;
 	}
-	sdev->remove = 0;
+	__atomic_store_n(&sdev->remove, 0, __ATOMIC_RELEASE);
 	failsafe_hotplug_alarm_install(fs_dev(sdev));
 }
 
@@ -388,7 +388,7 @@ failsafe_dev_remove(struct rte_eth_dev *dev)
 	uint8_t i;
 
 	FOREACH_SUBDEV(sdev, i, dev) {
-		if (!sdev->remove)
+		if (!__atomic_load_n(&sdev->remove, __ATOMIC_ACQUIRE))
 			continue;
 
 		/* Active devices must have finished their burst and
@@ -556,7 +556,7 @@ failsafe_eth_dev_state_sync(struct rte_eth_dev *dev)
 err_remove:
 	FOREACH_SUBDEV(sdev, i, dev)
 		if (sdev->state != PRIV(dev)->state)
-			sdev->remove = 1;
+			__atomic_store_n(&sdev->remove, 1, __ATOMIC_RELEASE);
 	return ret;
 }
 
@@ -597,7 +597,7 @@ failsafe_eth_rmv_event_callback(uint16_t port_id __rte_unused,
 	 * Async removal, the sub-PMD will try to unregister
 	 * the callback at the source of the current thread context.
 	 */
-	sdev->remove = 1;
+	__atomic_store_n(&sdev->remove, 1, __ATOMIC_RELEASE);
 	fs_unlock(fs_dev(sdev), 0);
 	return 0;
 }
diff --git a/drivers/net/failsafe/failsafe_ops.c b/drivers/net/failsafe/failsafe_ops.c
index b66dfa269b..e65c8a261a 100644
--- a/drivers/net/failsafe/failsafe_ops.c
+++ b/drivers/net/failsafe/failsafe_ops.c
@@ -837,7 +837,7 @@ fs_link_update(struct rte_eth_dev *dev,
 		if (likely(sdev_link_ret == 0)) {
 			if (TX_SUBDEV(dev) == sdev)
 				ret = rte_eth_linkstatus_set(dev, &link_info);
-		} else if (sdev->remove == 0 &&
+		} else if (__atomic_load_n(&sdev->remove, __ATOMIC_ACQUIRE) == 0 &&
 				rte_eth_dev_is_removed(PORT_ID(sdev)) == 0) {
 			ERROR("Link get failed for sub_device %d with error %d",
 				i, sdev_link_ret);
diff --git a/drivers/net/failsafe/failsafe_private.h b/drivers/net/failsafe/failsafe_private.h
index 3865f2fc34..0577a62ab2 100644
--- a/drivers/net/failsafe/failsafe_private.h
+++ b/drivers/net/failsafe/failsafe_private.h
@@ -131,7 +131,7 @@ struct sub_device {
 	/* sub device port id*/
 	uint16_t sdev_port_id; /* shared between processes */
 	/* flag calling for recollection */
-	volatile unsigned int remove:1;
+	unsigned int remove;
 	/* flow isolation state */
 	int flow_isolated:1;
 	/* RMV callback registration state */
@@ -490,7 +490,7 @@ static inline int
 fs_err(struct sub_device *sdev, int err)
 {
 	/* A device removal shouldn't be reported as an error. */
-	if (sdev->remove == 1 || err == -EIO)
+	if (__atomic_load_n(&sdev->remove, __ATOMIC_ACQUIRE) == 1 || err == -EIO)
 		return rte_errno = 0;
 	return err;
 }
diff --git a/drivers/net/failsafe/failsafe_rxtx.c b/drivers/net/failsafe/failsafe_rxtx.c
index 6d8ab0a6e7..c5caab4204 100644
--- a/drivers/net/failsafe/failsafe_rxtx.c
+++ b/drivers/net/failsafe/failsafe_rxtx.c
@@ -16,7 +16,7 @@ fs_rx_unsafe(struct sub_device *sdev)
 	return (ETH(sdev) == NULL) ||
 		(ETH(sdev)->rx_pkt_burst == NULL) ||
 		(sdev->state != DEV_STARTED) ||
-		(sdev->remove != 0);
+		(__atomic_load_n(&sdev->remove, __ATOMIC_ACQUIRE) != 0);
 }
 
 static inline int
-- 
2.38.1


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3 1/5] failsafe: fix segfault on hotplug event
  2022-11-29 14:48 ` [PATCH v3 1/5] " Luc Pelletier
@ 2023-10-31 17:35   ` Stephen Hemminger
  0 siblings, 0 replies; 16+ messages in thread
From: Stephen Hemminger @ 2023-10-31 17:35 UTC (permalink / raw)
  To: Luc Pelletier; +Cc: grive, dev, konstantin.ananyev, stable

On Tue, 29 Nov 2022 09:48:29 -0500
Luc Pelletier <lucp.at.work@gmail.com> wrote:

> When the failsafe PMD encounters a hotplug event, it switches its rx/tx
> functions to "safe" ones that validate the sub-device's rx/tx functions
> before calling them. It switches the rx/tx functions by changing the
> function pointers in the rte_eth_dev structure.
> 
> Following commit 7a0935239b9e, the rx/tx functions of PMDs are no longer
> called through the function pointers in the rte_eth_dev structure. They
> are rather called through a flat array named rte_eth_fp_ops. The
> function pointers in that array are initialized when the devices start
> and are initialized.
> 
> When a hotplug event occurs, the function pointers in rte_eth_fp_ops
> still point to the "unsafe" rx/tx functions in the failsafe PMD since
> they haven't been updated. This results in a segmentation fault because
> it ends up using the "unsafe" functions, when the "safe" functions
> should have been used.
> 
> To fix the problem, the "unsafe" rx/tx functions were completely
> removed. The "safe" functions are now always used. Modifying the rx/tx
> functions on-the-fly is not supported by DPDK, so this is the correct
> approach and should have very minimal impact on performance.
> 
> Fixes: 7a0935239b9e ("ethdev: make fast-path functions to use new flat array")
> Cc: Konstantin Ananyev <konstantin.ananyev@huawei.com>
> Cc: stable@dpdk.org
> 
> Signed-off-by: Luc Pelletier <lucp.at.work@gmail.com>


I don't use or have way to test failsafe anymore.
But wanted to give a try at reviewing this.
This version does not apply to current code base.
If still interested, please rebase and resubmit.

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2023-10-31 17:35 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-11-10 16:34 [PATCH] failsafe: fix segfault on hotplug event Luc Pelletier
2022-11-10 17:42 ` [PATCH v2] " Luc Pelletier
2022-11-10 23:33   ` Stephen Hemminger
2022-11-16 17:35 ` [PATCH] " Konstantin Ananyev
2022-11-16 21:51   ` Luc Pelletier
2022-11-16 22:25     ` Stephen Hemminger
2022-11-18 16:31       ` Konstantin Ananyev
2022-11-29 14:48 ` [PATCH v3 1/5] " Luc Pelletier
2023-10-31 17:35   ` Stephen Hemminger
2022-11-29 14:48 ` [PATCH v3 3/5] failsafe: fix double release of port Luc Pelletier
2022-11-29 14:48 ` [PATCH v3 4/5] failsafe: use public APIs in fs_link_update Luc Pelletier
2022-11-29 14:48 ` [PATCH v3 5/5] failsafe: make sub-device remove flag thread-safe Luc Pelletier
2022-11-29 15:25 ` [PATCH v4 1/5] failsafe: fix segfault on hotplug event Luc Pelletier
2022-11-29 15:25 ` [PATCH v4 3/5] failsafe: fix double release of port Luc Pelletier
2022-11-29 15:25 ` [PATCH v4 4/5] failsafe: use public APIs in fs_link_update Luc Pelletier
2022-11-29 15:25 ` [PATCH v4 5/5] failsafe: make sub-device remove flag thread-safe Luc Pelletier

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).