DPDK patches and discussions
 help / color / mirror / Atom feed
* [PATCH] net/iavf: fix crash on closing representor ports
@ 2023-10-26  9:51 Mingjin Ye
  2023-10-27  7:17 ` David Marchand
                   ` (2 more replies)
  0 siblings, 3 replies; 17+ messages in thread
From: Mingjin Ye @ 2023-10-26  9:51 UTC (permalink / raw)
  To: dev; +Cc: qiming.yang, yidingx.zhou, Mingjin Ye, stable, Qi Zhang

Since the representor port needs to access the resources of the
associated DCF when it is closed. Therefore, the correct close
port operation is to close all the representor ports first, and
then close the associated DCF port.

If the DCF port is closed before the representor port on pmd exit.
This will result in accessing freed resources and eventually a
core dump will occur.

This patch fixes this issue by notifying all presentor ports
that DCF is not accessible when the DCF port is closed.
And when the presentor port is closed, it determines if the DCF
resources are accessible. If it can't be accessed, it will
report an error and return.

Fixes: 5674465a32c8 ("net/ice: add DCF VLAN handling")
Fixes: 295968d17407 ("ethdev: add namespace")
Fixes: da9cdcd1f372 ("net/ice: fix crash on representor port closing")
Cc: stable@dpdk.org

Signed-off-by: Mingjin Ye <mingjinx.ye@intel.com>
---
 drivers/net/ice/ice_dcf_ethdev.h         |  1 +
 drivers/net/ice/ice_dcf_vf_representor.c | 12 +++++++-----
 2 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ice/ice_dcf_ethdev.h b/drivers/net/ice/ice_dcf_ethdev.h
index 4baaec4b8b..d94ef10244 100644
--- a/drivers/net/ice/ice_dcf_ethdev.h
+++ b/drivers/net/ice/ice_dcf_ethdev.h
@@ -60,6 +60,7 @@ struct ice_dcf_vf_repr {
 	struct rte_ether_addr mac_addr;
 	uint16_t switch_domain_id;
 	uint16_t vf_id;
+	bool dcf_valid;
 
 	struct ice_dcf_vlan outer_vlan_info; /* DCF always handle outer VLAN */
 };
diff --git a/drivers/net/ice/ice_dcf_vf_representor.c b/drivers/net/ice/ice_dcf_vf_representor.c
index b9fcfc80ad..b4a94427df 100644
--- a/drivers/net/ice/ice_dcf_vf_representor.c
+++ b/drivers/net/ice/ice_dcf_vf_representor.c
@@ -45,6 +45,8 @@ ice_dcf_vf_repr_dev_start(struct rte_eth_dev *dev)
 static int
 ice_dcf_vf_repr_dev_stop(struct rte_eth_dev *dev)
 {
+	struct ice_dcf_vf_repr *repr = dev->data->dev_private;
+	repr->dcf_valid = false;
 	dev->data->dev_link.link_status = RTE_ETH_LINK_DOWN;
 
 	return 0;
@@ -111,14 +113,13 @@ ice_dcf_vf_repr_link_update(__rte_unused struct rte_eth_dev *ethdev,
 static __rte_always_inline struct ice_dcf_hw *
 ice_dcf_vf_repr_hw(struct ice_dcf_vf_repr *repr)
 {
-	struct ice_dcf_adapter *dcf_adapter =
-			repr->dcf_eth_dev->data->dev_private;
-
-	if (!dcf_adapter) {
+	struct ice_dcf_adapter *dcf_adapter;
+	if (!repr->dcf_valid) {
 		PMD_DRV_LOG(ERR, "DCF for VF representor has been released\n");
 		return NULL;
 	}
-
+dcf_adapter =
+			repr->dcf_eth_dev->data->dev_private;
 	return &dcf_adapter->real_hw;
 }
 
@@ -414,6 +415,7 @@ ice_dcf_vf_repr_init(struct rte_eth_dev *vf_rep_eth_dev, void *init_param)
 	repr->dcf_eth_dev = param->dcf_eth_dev;
 	repr->switch_domain_id = param->switch_domain_id;
 	repr->vf_id = param->vf_id;
+	repr->dcf_valid = true;
 	repr->outer_vlan_info.port_vlan_ena = false;
 	repr->outer_vlan_info.stripping_ena = false;
 	repr->outer_vlan_info.tpid = RTE_ETHER_TYPE_VLAN;
-- 
2.25.1


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] net/iavf: fix crash on closing representor ports
  2023-10-26  9:51 [PATCH] net/iavf: fix crash on closing representor ports Mingjin Ye
@ 2023-10-27  7:17 ` David Marchand
  2023-10-30  2:35 ` [PATCH v2] " Mingjin Ye
  2023-10-30  8:44 ` [PATCH v2] net/ice: " Mingjin Ye
  2 siblings, 0 replies; 17+ messages in thread
From: David Marchand @ 2023-10-27  7:17 UTC (permalink / raw)
  To: Mingjin Ye, Jiale Song; +Cc: dev, qiming.yang, yidingx.zhou, stable, Qi Zhang

On Thu, Oct 26, 2023 at 12:01 PM Mingjin Ye <mingjinx.ye@intel.com> wrote:
>

Afaics, the patch title is wrong, this is a net/ice fix.


> Since the representor port needs to access the resources of the
> associated DCF when it is closed. Therefore, the correct close
> port operation is to close all the representor ports first, and
> then close the associated DCF port.
>
> If the DCF port is closed before the representor port on pmd exit.
> This will result in accessing freed resources and eventually a
> core dump will occur.
>
> This patch fixes this issue by notifying all presentor ports
> that DCF is not accessible when the DCF port is closed.
> And when the presentor port is closed, it determines if the DCF
> resources are accessible. If it can't be accessed, it will
> report an error and return.
>
> Fixes: 5674465a32c8 ("net/ice: add DCF VLAN handling")
> Fixes: 295968d17407 ("ethdev: add namespace")

This Fixes: tag above is very likely wrong: this 295968d17407 commit
only prefixed ethdev macros and functions with the right namespace.
Please remove.


> Fixes: da9cdcd1f372 ("net/ice: fix crash on representor port closing")
> Cc: stable@dpdk.org
>

I got pinged by Song Jiale about a ice DCF crash.

Is this the same issue?
If so, if a bugzilla was opened, it must be referenced here.
And please confirm the issue is fixed.


> Signed-off-by: Mingjin Ye <mingjinx.ye@intel.com>
> ---
>  drivers/net/ice/ice_dcf_ethdev.h         |  1 +
>  drivers/net/ice/ice_dcf_vf_representor.c | 12 +++++++-----
>  2 files changed, 8 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/net/ice/ice_dcf_ethdev.h b/drivers/net/ice/ice_dcf_ethdev.h
> index 4baaec4b8b..d94ef10244 100644
> --- a/drivers/net/ice/ice_dcf_ethdev.h
> +++ b/drivers/net/ice/ice_dcf_ethdev.h
> @@ -60,6 +60,7 @@ struct ice_dcf_vf_repr {
>         struct rte_ether_addr mac_addr;
>         uint16_t switch_domain_id;
>         uint16_t vf_id;
> +       bool dcf_valid;
>
>         struct ice_dcf_vlan outer_vlan_info; /* DCF always handle outer VLAN */
>  };
> diff --git a/drivers/net/ice/ice_dcf_vf_representor.c b/drivers/net/ice/ice_dcf_vf_representor.c
> index b9fcfc80ad..b4a94427df 100644
> --- a/drivers/net/ice/ice_dcf_vf_representor.c
> +++ b/drivers/net/ice/ice_dcf_vf_representor.c
> @@ -45,6 +45,8 @@ ice_dcf_vf_repr_dev_start(struct rte_eth_dev *dev)
>  static int
>  ice_dcf_vf_repr_dev_stop(struct rte_eth_dev *dev)
>  {
> +       struct ice_dcf_vf_repr *repr = dev->data->dev_private;

Newline missing.

> +       repr->dcf_valid = false;
>         dev->data->dev_link.link_status = RTE_ETH_LINK_DOWN;
>
>         return 0;
> @@ -111,14 +113,13 @@ ice_dcf_vf_repr_link_update(__rte_unused struct rte_eth_dev *ethdev,
>  static __rte_always_inline struct ice_dcf_hw *
>  ice_dcf_vf_repr_hw(struct ice_dcf_vf_repr *repr)
>  {
> -       struct ice_dcf_adapter *dcf_adapter =
> -                       repr->dcf_eth_dev->data->dev_private;
> -
> -       if (!dcf_adapter) {
> +       struct ice_dcf_adapter *dcf_adapter;

Newline missing.

> +       if (!repr->dcf_valid) {
>                 PMD_DRV_LOG(ERR, "DCF for VF representor has been released\n");
>                 return NULL;
>         }
> -
> +dcf_adapter =

Indent is wrong.

> +                       repr->dcf_eth_dev->data->dev_private;

And no need for extra newline.


>         return &dcf_adapter->real_hw;
>  }
>
> @@ -414,6 +415,7 @@ ice_dcf_vf_repr_init(struct rte_eth_dev *vf_rep_eth_dev, void *init_param)
>         repr->dcf_eth_dev = param->dcf_eth_dev;
>         repr->switch_domain_id = param->switch_domain_id;
>         repr->vf_id = param->vf_id;
> +       repr->dcf_valid = true;
>         repr->outer_vlan_info.port_vlan_ena = false;
>         repr->outer_vlan_info.stripping_ena = false;
>         repr->outer_vlan_info.tpid = RTE_ETHER_TYPE_VLAN;
> --
> 2.25.1
>

Thanks.


-- 
David Marchand


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v2] net/iavf: fix crash on closing representor ports
  2023-10-26  9:51 [PATCH] net/iavf: fix crash on closing representor ports Mingjin Ye
  2023-10-27  7:17 ` David Marchand
@ 2023-10-30  2:35 ` Mingjin Ye
  2023-10-30  8:44 ` [PATCH v2] net/ice: " Mingjin Ye
  2 siblings, 0 replies; 17+ messages in thread
From: Mingjin Ye @ 2023-10-30  2:35 UTC (permalink / raw)
  To: dev; +Cc: qiming.yang, yidingx.zhou, Mingjin Ye, stable, Qi Zhang

Since the representor port needs to access the resources of the
associated DCF when it is closed. Therefore, the correct close
port operation is to close all the representor ports first, and
then close the associated DCF port.

If the DCF port is closed before the representor port on pmd exit.
This will result in accessing freed resources and eventually a
core dump will occur.

This patch fixes this issue by notifying all presentor ports
that DCF is not accessible when the DCF port is closed.
And when the presentor port is closed, it determines if the DCF
resources are accessible. If it can't be accessed, it will
report an error and return.

Fixes: 5674465a32c8 ("net/ice: add DCF VLAN handling")
Fixes: da9cdcd1f372 ("net/ice: fix crash on representor port closing")
Cc: stable@dpdk.org

Signed-off-by: Mingjin Ye <mingjinx.ye@intel.com>
---
v2: Reformat code to remove unneeded fixlines.
---
 drivers/net/ice/ice_dcf_ethdev.h         |  1 +
 drivers/net/ice/ice_dcf_vf_representor.c | 11 ++++++++---
 2 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ice/ice_dcf_ethdev.h b/drivers/net/ice/ice_dcf_ethdev.h
index 4baaec4b8b..d94ef10244 100644
--- a/drivers/net/ice/ice_dcf_ethdev.h
+++ b/drivers/net/ice/ice_dcf_ethdev.h
@@ -60,6 +60,7 @@ struct ice_dcf_vf_repr {
 	struct rte_ether_addr mac_addr;
 	uint16_t switch_domain_id;
 	uint16_t vf_id;
+	bool dcf_valid;
 
 	struct ice_dcf_vlan outer_vlan_info; /* DCF always handle outer VLAN */
 };
diff --git a/drivers/net/ice/ice_dcf_vf_representor.c b/drivers/net/ice/ice_dcf_vf_representor.c
index b9fcfc80ad..eb49eae4e4 100644
--- a/drivers/net/ice/ice_dcf_vf_representor.c
+++ b/drivers/net/ice/ice_dcf_vf_representor.c
@@ -45,6 +45,9 @@ ice_dcf_vf_repr_dev_start(struct rte_eth_dev *dev)
 static int
 ice_dcf_vf_repr_dev_stop(struct rte_eth_dev *dev)
 {
+	struct ice_dcf_vf_repr *repr = dev->data->dev_private;
+
+	repr->dcf_valid = false;
 	dev->data->dev_link.link_status = RTE_ETH_LINK_DOWN;
 
 	return 0;
@@ -111,14 +114,15 @@ ice_dcf_vf_repr_link_update(__rte_unused struct rte_eth_dev *ethdev,
 static __rte_always_inline struct ice_dcf_hw *
 ice_dcf_vf_repr_hw(struct ice_dcf_vf_repr *repr)
 {
-	struct ice_dcf_adapter *dcf_adapter =
-			repr->dcf_eth_dev->data->dev_private;
+	struct ice_dcf_adapter *dcf_adapter;
 
-	if (!dcf_adapter) {
+	if (!repr->dcf_valid) {
 		PMD_DRV_LOG(ERR, "DCF for VF representor has been released\n");
 		return NULL;
 	}
 
+	dcf_adapter = repr->dcf_eth_dev->data->dev_private;
+
 	return &dcf_adapter->real_hw;
 }
 
@@ -414,6 +418,7 @@ ice_dcf_vf_repr_init(struct rte_eth_dev *vf_rep_eth_dev, void *init_param)
 	repr->dcf_eth_dev = param->dcf_eth_dev;
 	repr->switch_domain_id = param->switch_domain_id;
 	repr->vf_id = param->vf_id;
+	repr->dcf_valid = true;
 	repr->outer_vlan_info.port_vlan_ena = false;
 	repr->outer_vlan_info.stripping_ena = false;
 	repr->outer_vlan_info.tpid = RTE_ETHER_TYPE_VLAN;
-- 
2.25.1


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v2] net/ice: fix crash on closing representor ports
  2023-10-26  9:51 [PATCH] net/iavf: fix crash on closing representor ports Mingjin Ye
  2023-10-27  7:17 ` David Marchand
  2023-10-30  2:35 ` [PATCH v2] " Mingjin Ye
@ 2023-10-30  8:44 ` Mingjin Ye
  2023-11-01  1:11   ` Zhang, Qi Z
  2023-11-01 10:13   ` [PATCH v3] " Mingjin Ye
  2 siblings, 2 replies; 17+ messages in thread
From: Mingjin Ye @ 2023-10-30  8:44 UTC (permalink / raw)
  To: dev; +Cc: qiming.yang, yidingx.zhou, Mingjin Ye, stable, Qi Zhang

Since the representor port needs to access the resources of the
associated DCF when it is closed. Therefore, the correct close
port operation is to close all the representor ports first, and
then close the associated DCF port.

If the DCF port is closed before the representor port on pmd exit.
This will result in accessing freed resources and eventually a
core dump will occur.

This patch fixes this issue by notifying all presentor ports
that DCF is not accessible when the DCF port is closed.
And when the presentor port is closed, it determines if the DCF
resources are accessible. If it can't be accessed, it will
report an error and return.

Fixes: 5674465a32c8 ("net/ice: add DCF VLAN handling")
Fixes: da9cdcd1f372 ("net/ice: fix crash on representor port closing")
Cc: stable@dpdk.org

Signed-off-by: Mingjin Ye <mingjinx.ye@intel.com>
---
v2: Reformat code to remove unneeded fixlines.
---
 drivers/net/ice/ice_dcf_ethdev.h         |  1 +
 drivers/net/ice/ice_dcf_vf_representor.c | 11 ++++++++---
 2 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ice/ice_dcf_ethdev.h b/drivers/net/ice/ice_dcf_ethdev.h
index 4baaec4b8b..d94ef10244 100644
--- a/drivers/net/ice/ice_dcf_ethdev.h
+++ b/drivers/net/ice/ice_dcf_ethdev.h
@@ -60,6 +60,7 @@ struct ice_dcf_vf_repr {
 	struct rte_ether_addr mac_addr;
 	uint16_t switch_domain_id;
 	uint16_t vf_id;
+	bool dcf_valid;
 
 	struct ice_dcf_vlan outer_vlan_info; /* DCF always handle outer VLAN */
 };
diff --git a/drivers/net/ice/ice_dcf_vf_representor.c b/drivers/net/ice/ice_dcf_vf_representor.c
index b9fcfc80ad..eb49eae4e4 100644
--- a/drivers/net/ice/ice_dcf_vf_representor.c
+++ b/drivers/net/ice/ice_dcf_vf_representor.c
@@ -45,6 +45,9 @@ ice_dcf_vf_repr_dev_start(struct rte_eth_dev *dev)
 static int
 ice_dcf_vf_repr_dev_stop(struct rte_eth_dev *dev)
 {
+	struct ice_dcf_vf_repr *repr = dev->data->dev_private;
+
+	repr->dcf_valid = false;
 	dev->data->dev_link.link_status = RTE_ETH_LINK_DOWN;
 
 	return 0;
@@ -111,14 +114,15 @@ ice_dcf_vf_repr_link_update(__rte_unused struct rte_eth_dev *ethdev,
 static __rte_always_inline struct ice_dcf_hw *
 ice_dcf_vf_repr_hw(struct ice_dcf_vf_repr *repr)
 {
-	struct ice_dcf_adapter *dcf_adapter =
-			repr->dcf_eth_dev->data->dev_private;
+	struct ice_dcf_adapter *dcf_adapter;
 
-	if (!dcf_adapter) {
+	if (!repr->dcf_valid) {
 		PMD_DRV_LOG(ERR, "DCF for VF representor has been released\n");
 		return NULL;
 	}
 
+	dcf_adapter = repr->dcf_eth_dev->data->dev_private;
+
 	return &dcf_adapter->real_hw;
 }
 
@@ -414,6 +418,7 @@ ice_dcf_vf_repr_init(struct rte_eth_dev *vf_rep_eth_dev, void *init_param)
 	repr->dcf_eth_dev = param->dcf_eth_dev;
 	repr->switch_domain_id = param->switch_domain_id;
 	repr->vf_id = param->vf_id;
+	repr->dcf_valid = true;
 	repr->outer_vlan_info.port_vlan_ena = false;
 	repr->outer_vlan_info.stripping_ena = false;
 	repr->outer_vlan_info.tpid = RTE_ETHER_TYPE_VLAN;
-- 
2.25.1


^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: [PATCH v2] net/ice: fix crash on closing representor ports
  2023-10-30  8:44 ` [PATCH v2] net/ice: " Mingjin Ye
@ 2023-11-01  1:11   ` Zhang, Qi Z
  2023-11-01 10:13   ` [PATCH v3] " Mingjin Ye
  1 sibling, 0 replies; 17+ messages in thread
From: Zhang, Qi Z @ 2023-11-01  1:11 UTC (permalink / raw)
  To: Ye, MingjinX, dev; +Cc: Yang, Qiming, Zhou, YidingX, stable



> -----Original Message-----
> From: Ye, MingjinX <mingjinx.ye@intel.com>
> Sent: Monday, October 30, 2023 4:45 PM
> To: dev@dpdk.org
> Cc: Yang, Qiming <qiming.yang@intel.com>; Zhou, YidingX
> <yidingx.zhou@intel.com>; Ye, MingjinX <mingjinx.ye@intel.com>;
> stable@dpdk.org; Zhang, Qi Z <qi.z.zhang@intel.com>
> Subject: [PATCH v2] net/ice: fix crash on closing representor ports
> 
> Since the representor port needs to access the resources of the associated
> DCF when it is closed. Therefore, the correct close port operation is to close
> all the representor ports first, and then close the associated DCF port.
> 
> If the DCF port is closed before the representor port on pmd exit.
> This will result in accessing freed resources and eventually a core dump will
> occur.
> 
> This patch fixes this issue by notifying all presentor ports that DCF is not
> accessible when the DCF port is closed.
> And when the presentor port is closed, it determines if the DCF resources are
> accessible. If it can't be accessed, it will report an error and return.
> 
> Fixes: 5674465a32c8 ("net/ice: add DCF VLAN handling")
> Fixes: da9cdcd1f372 ("net/ice: fix crash on representor port closing")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Mingjin Ye <mingjinx.ye@intel.com>
> ---
> v2: Reformat code to remove unneeded fixlines.
> ---
>  drivers/net/ice/ice_dcf_ethdev.h         |  1 +
>  drivers/net/ice/ice_dcf_vf_representor.c | 11 ++++++++---
>  2 files changed, 9 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/ice/ice_dcf_ethdev.h
> b/drivers/net/ice/ice_dcf_ethdev.h
> index 4baaec4b8b..d94ef10244 100644
> --- a/drivers/net/ice/ice_dcf_ethdev.h
> +++ b/drivers/net/ice/ice_dcf_ethdev.h
> @@ -60,6 +60,7 @@ struct ice_dcf_vf_repr {
>  	struct rte_ether_addr mac_addr;
>  	uint16_t switch_domain_id;
>  	uint16_t vf_id;
> +	bool dcf_valid;
> 
>  	struct ice_dcf_vlan outer_vlan_info; /* DCF always handle outer VLAN
> */  }; diff --git a/drivers/net/ice/ice_dcf_vf_representor.c
> b/drivers/net/ice/ice_dcf_vf_representor.c
> index b9fcfc80ad..eb49eae4e4 100644
> --- a/drivers/net/ice/ice_dcf_vf_representor.c
> +++ b/drivers/net/ice/ice_dcf_vf_representor.c
> @@ -45,6 +45,9 @@ ice_dcf_vf_repr_dev_start(struct rte_eth_dev *dev)
> static int  ice_dcf_vf_repr_dev_stop(struct rte_eth_dev *dev)  {
> +	struct ice_dcf_vf_repr *repr = dev->data->dev_private;
> +
> +	repr->dcf_valid = false;

The fix assume the ice_dcf_vf_repr_dev_stop will be invoked during ice_dcf_dev_stop.
But what if DCF port is not closed, while only the port representor port is stopped, should we still reset the flag?
Based your description of the issue, is it better to check on a per DCF port flag, but not a per representer flag?



^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v3] net/ice: fix crash on closing representor ports
  2023-10-30  8:44 ` [PATCH v2] net/ice: " Mingjin Ye
  2023-11-01  1:11   ` Zhang, Qi Z
@ 2023-11-01 10:13   ` Mingjin Ye
  2023-11-01 10:48     ` Zhang, Qi Z
  2023-11-02 10:11     ` [PATCH v4] " Mingjin Ye
  1 sibling, 2 replies; 17+ messages in thread
From: Mingjin Ye @ 2023-11-01 10:13 UTC (permalink / raw)
  To: dev; +Cc: qiming.yang, yidingx.zhou, Mingjin Ye, stable, Qi Zhang

The data resource in struct rte_eth_dev is cleared and points to NULL
when the DCF port is closed.

If the DCF representor port is closed after the DCF port is closed,
a segmentation fault occurs because the representor port accesses
the data resource released by the DCF port.

This patch checks if the resource is present before accessing.

Fixes: 5674465a32c8 ("net/ice: add DCF VLAN handling")
Fixes: da9cdcd1f372 ("net/ice: fix crash on representor port closing")
Cc: stable@dpdk.org

Signed-off-by: Mingjin Ye <mingjinx.ye@intel.com>
---
v3: New solution.
---
 drivers/net/ice/ice_dcf_vf_representor.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ice/ice_dcf_vf_representor.c b/drivers/net/ice/ice_dcf_vf_representor.c
index b9fcfc80ad..8c45e28f02 100644
--- a/drivers/net/ice/ice_dcf_vf_representor.c
+++ b/drivers/net/ice/ice_dcf_vf_representor.c
@@ -111,14 +111,16 @@ ice_dcf_vf_repr_link_update(__rte_unused struct rte_eth_dev *ethdev,
 static __rte_always_inline struct ice_dcf_hw *
 ice_dcf_vf_repr_hw(struct ice_dcf_vf_repr *repr)
 {
-	struct ice_dcf_adapter *dcf_adapter =
-			repr->dcf_eth_dev->data->dev_private;
+	struct rte_eth_dev_data *dcf_data = repr->dcf_eth_dev->data;
+	struct ice_dcf_adapter *dcf_adapter;
 
-	if (!dcf_adapter) {
+	if (!dcf_data || !dcf_data->dev_private) {
 		PMD_DRV_LOG(ERR, "DCF for VF representor has been released\n");
 		return NULL;
 	}
 
+	dcf_adapter = dcf_data->dev_private;
+
 	return &dcf_adapter->real_hw;
 }
 
-- 
2.25.1


^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: [PATCH v3] net/ice: fix crash on closing representor ports
  2023-11-01 10:13   ` [PATCH v3] " Mingjin Ye
@ 2023-11-01 10:48     ` Zhang, Qi Z
  2023-11-02  2:38       ` Ye, MingjinX
  2023-11-02 10:11     ` [PATCH v4] " Mingjin Ye
  1 sibling, 1 reply; 17+ messages in thread
From: Zhang, Qi Z @ 2023-11-01 10:48 UTC (permalink / raw)
  To: Ye, MingjinX, dev; +Cc: Yang, Qiming, Zhou, YidingX, stable



> -----Original Message-----
> From: Ye, MingjinX <mingjinx.ye@intel.com>
> Sent: Wednesday, November 1, 2023 6:14 PM
> To: dev@dpdk.org
> Cc: Yang, Qiming <qiming.yang@intel.com>; Zhou, YidingX
> <yidingx.zhou@intel.com>; Ye, MingjinX <mingjinx.ye@intel.com>;
> stable@dpdk.org; Zhang, Qi Z <qi.z.zhang@intel.com>
> Subject: [PATCH v3] net/ice: fix crash on closing representor ports
> 
> The data resource in struct rte_eth_dev is cleared and points to NULL when
> the DCF port is closed.
> 
> If the DCF representor port is closed after the DCF port is closed, a
> segmentation fault occurs because the representor port accesses the data
> resource released by the DCF port.
> 
> This patch checks if the resource is present before accessing.
> 
> Fixes: 5674465a32c8 ("net/ice: add DCF VLAN handling")
> Fixes: da9cdcd1f372 ("net/ice: fix crash on representor port closing")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Mingjin Ye <mingjinx.ye@intel.com>
> ---
> v3: New solution.
> ---
>  drivers/net/ice/ice_dcf_vf_representor.c | 8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/ice/ice_dcf_vf_representor.c
> b/drivers/net/ice/ice_dcf_vf_representor.c
> index b9fcfc80ad..8c45e28f02 100644
> --- a/drivers/net/ice/ice_dcf_vf_representor.c
> +++ b/drivers/net/ice/ice_dcf_vf_representor.c
> @@ -111,14 +111,16 @@ ice_dcf_vf_repr_link_update(__rte_unused struct
> rte_eth_dev *ethdev,  static __rte_always_inline struct ice_dcf_hw *
> ice_dcf_vf_repr_hw(struct ice_dcf_vf_repr *repr)  {
> -	struct ice_dcf_adapter *dcf_adapter =
> -			repr->dcf_eth_dev->data->dev_private;
> +	struct rte_eth_dev_data *dcf_data = repr->dcf_eth_dev->data;

Seems this expose another issue, if dcf port already be closed, the dcf_eth_dev instance could already be reused by another driver.
So we can't assume dcf_eth_dev->data is NULL,  I think you can refine based on v2's method, but don't update dcf_valid flag in representor port's dev_stop.



> +	struct ice_dcf_adapter *dcf_adapter;
> 
> -	if (!dcf_adapter) {
> +	if (!dcf_data || !dcf_data->dev_private) {
>  		PMD_DRV_LOG(ERR, "DCF for VF representor has been
> released\n");
>  		return NULL;
>  	}
> 
> +	dcf_adapter = dcf_data->dev_private;
> +
>  	return &dcf_adapter->real_hw;
>  }
> 
> --
> 2.25.1


^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: [PATCH v3] net/ice: fix crash on closing representor ports
  2023-11-01 10:48     ` Zhang, Qi Z
@ 2023-11-02  2:38       ` Ye, MingjinX
  2023-11-02  3:59         ` Zhang, Qi Z
  0 siblings, 1 reply; 17+ messages in thread
From: Ye, MingjinX @ 2023-11-02  2:38 UTC (permalink / raw)
  To: Zhang, Qi Z, dev; +Cc: Yang, Qiming, Zhou, YidingX, stable



> -----Original Message-----
> From: Zhang, Qi Z <qi.z.zhang@intel.com>
> Sent: Wednesday, November 1, 2023 6:49 PM
> To: Ye, MingjinX <mingjinx.ye@intel.com>; dev@dpdk.org
> Cc: Yang, Qiming <qiming.yang@intel.com>; Zhou, YidingX
> <yidingx.zhou@intel.com>; stable@dpdk.org
> Subject: RE: [PATCH v3] net/ice: fix crash on closing representor ports
> 
> 
> 
> > -----Original Message-----
> > From: Ye, MingjinX <mingjinx.ye@intel.com>
> > Sent: Wednesday, November 1, 2023 6:14 PM
> > To: dev@dpdk.org
> > Cc: Yang, Qiming <qiming.yang@intel.com>; Zhou, YidingX
> > <yidingx.zhou@intel.com>; Ye, MingjinX <mingjinx.ye@intel.com>;
> > stable@dpdk.org; Zhang, Qi Z <qi.z.zhang@intel.com>
> > Subject: [PATCH v3] net/ice: fix crash on closing representor ports
> >
> > The data resource in struct rte_eth_dev is cleared and points to NULL
> > when the DCF port is closed.
> >
> > If the DCF representor port is closed after the DCF port is closed, a
> > segmentation fault occurs because the representor port accesses the
> > data resource released by the DCF port.
> >
> > This patch checks if the resource is present before accessing.
> >
> > Fixes: 5674465a32c8 ("net/ice: add DCF VLAN handling")
> > Fixes: da9cdcd1f372 ("net/ice: fix crash on representor port closing")
> > Cc: stable@dpdk.org
> >
> > Signed-off-by: Mingjin Ye <mingjinx.ye@intel.com>
> > ---
> > v3: New solution.
> > ---
> >  drivers/net/ice/ice_dcf_vf_representor.c | 8 +++++---
> >  1 file changed, 5 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/net/ice/ice_dcf_vf_representor.c
> > b/drivers/net/ice/ice_dcf_vf_representor.c
> > index b9fcfc80ad..8c45e28f02 100644
> > --- a/drivers/net/ice/ice_dcf_vf_representor.c
> > +++ b/drivers/net/ice/ice_dcf_vf_representor.c
> > @@ -111,14 +111,16 @@ ice_dcf_vf_repr_link_update(__rte_unused
> struct
> > rte_eth_dev *ethdev,  static __rte_always_inline struct ice_dcf_hw *
> > ice_dcf_vf_repr_hw(struct ice_dcf_vf_repr *repr)  {
> > -	struct ice_dcf_adapter *dcf_adapter =
> > -			repr->dcf_eth_dev->data->dev_private;
> > +	struct rte_eth_dev_data *dcf_data = repr->dcf_eth_dev->data;
> 
> Seems this expose another issue, if dcf port already be closed, the
> dcf_eth_dev instance could already be reused by another driver.
Your analysis is spot on.

The reason for the issue:
Affected by 36c46e738120c381cf663c96692454c5aa75e203 commit.
da9cdcd1f37220e87db23993d6352637d71df25b commit does not fix the issue.

v2 patch:
Notify every proxy port of DCF port status.

v3 patch:
Based on da9cdcd1f37220e87db23993d636352637d71df25b, made optimization to fix the issue. 

> So we can't assume dcf_eth_dev->data is NULL,  I think you can refine based
> on v2's method, but don't update dcf_valid flag in representor port's
> dev_stop.
Implementation difficulties:
1. When the DCF is created, all associated proxy ports are created and each proxy port (struct rte_eth_dev) pointer is recorded in an array.
2. When shutting down the DCF, all all proxy ports are notified at ice_dcf_vf_repr_stop_all. Finally the array is deleted.

Based on the original scenario, notifying all agent ports of DCF state failure is the simplest and most effective way. This is very similar to the v2 patch scenario.

Check each DCF port flag?
If the DCF port has failed and the resource has been released. the DCF port status, naturally, is not available. This is very similar to the v3 patch scenario.

Other ideas
1. Re-implement DCF port and proxy port communication, can't directly use (struct rte_eth_dev) pointer to access the device resources on the other end.

2. Designed to return an error when removing the proxy port directly. The proxy port device is managed by the control DCF (the proxy port is created by the DCF port and the removal work should be done by it).

The above 2 options will bring a huge amount of work. It seems beyond the scope of this discussion.



^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: [PATCH v3] net/ice: fix crash on closing representor ports
  2023-11-02  2:38       ` Ye, MingjinX
@ 2023-11-02  3:59         ` Zhang, Qi Z
  0 siblings, 0 replies; 17+ messages in thread
From: Zhang, Qi Z @ 2023-11-02  3:59 UTC (permalink / raw)
  To: Ye, MingjinX, dev; +Cc: Yang, Qiming, Zhou, YidingX, stable



> -----Original Message-----
> From: Ye, MingjinX <mingjinx.ye@intel.com>
> Sent: Thursday, November 2, 2023 10:39 AM
> To: Zhang, Qi Z <qi.z.zhang@intel.com>; dev@dpdk.org
> Cc: Yang, Qiming <qiming.yang@intel.com>; Zhou, YidingX
> <yidingx.zhou@intel.com>; stable@dpdk.org
> Subject: RE: [PATCH v3] net/ice: fix crash on closing representor ports
> 
> 
> 
> > -----Original Message-----
> > From: Zhang, Qi Z <qi.z.zhang@intel.com>
> > Sent: Wednesday, November 1, 2023 6:49 PM
> > To: Ye, MingjinX <mingjinx.ye@intel.com>; dev@dpdk.org
> > Cc: Yang, Qiming <qiming.yang@intel.com>; Zhou, YidingX
> > <yidingx.zhou@intel.com>; stable@dpdk.org
> > Subject: RE: [PATCH v3] net/ice: fix crash on closing representor
> > ports
> >
> >
> >
> > > -----Original Message-----
> > > From: Ye, MingjinX <mingjinx.ye@intel.com>
> > > Sent: Wednesday, November 1, 2023 6:14 PM
> > > To: dev@dpdk.org
> > > Cc: Yang, Qiming <qiming.yang@intel.com>; Zhou, YidingX
> > > <yidingx.zhou@intel.com>; Ye, MingjinX <mingjinx.ye@intel.com>;
> > > stable@dpdk.org; Zhang, Qi Z <qi.z.zhang@intel.com>
> > > Subject: [PATCH v3] net/ice: fix crash on closing representor ports
> > >
> > > The data resource in struct rte_eth_dev is cleared and points to
> > > NULL when the DCF port is closed.
> > >
> > > If the DCF representor port is closed after the DCF port is closed,
> > > a segmentation fault occurs because the representor port accesses
> > > the data resource released by the DCF port.
> > >
> > > This patch checks if the resource is present before accessing.
> > >
> > > Fixes: 5674465a32c8 ("net/ice: add DCF VLAN handling")
> > > Fixes: da9cdcd1f372 ("net/ice: fix crash on representor port
> > > closing")
> > > Cc: stable@dpdk.org
> > >
> > > Signed-off-by: Mingjin Ye <mingjinx.ye@intel.com>
> > > ---
> > > v3: New solution.
> > > ---
> > >  drivers/net/ice/ice_dcf_vf_representor.c | 8 +++++---
> > >  1 file changed, 5 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/drivers/net/ice/ice_dcf_vf_representor.c
> > > b/drivers/net/ice/ice_dcf_vf_representor.c
> > > index b9fcfc80ad..8c45e28f02 100644
> > > --- a/drivers/net/ice/ice_dcf_vf_representor.c
> > > +++ b/drivers/net/ice/ice_dcf_vf_representor.c
> > > @@ -111,14 +111,16 @@ ice_dcf_vf_repr_link_update(__rte_unused
> > struct
> > > rte_eth_dev *ethdev,  static __rte_always_inline struct ice_dcf_hw *
> > > ice_dcf_vf_repr_hw(struct ice_dcf_vf_repr *repr)  {
> > > -	struct ice_dcf_adapter *dcf_adapter =
> > > -			repr->dcf_eth_dev->data->dev_private;
> > > +	struct rte_eth_dev_data *dcf_data = repr->dcf_eth_dev->data;
> >
> > Seems this expose another issue, if dcf port already be closed, the
> > dcf_eth_dev instance could already be reused by another driver.
> Your analysis is spot on.
> 
> The reason for the issue:
> Affected by 36c46e738120c381cf663c96692454c5aa75e203 commit.
> da9cdcd1f37220e87db23993d6352637d71df25b commit does not fix the
> issue.
> 
> v2 patch:
> Notify every proxy port of DCF port status.
> 
> v3 patch:
> Based on da9cdcd1f37220e87db23993d636352637d71df25b, made
> optimization to fix the issue.
> 
> > So we can't assume dcf_eth_dev->data is NULL,  I think you can refine
> > based on v2's method, but don't update dcf_valid flag in representor
> > port's dev_stop.
> Implementation difficulties:
> 1. When the DCF is created, all associated proxy ports are created and each
> proxy port (struct rte_eth_dev) pointer is recorded in an array.
> 2. When shutting down the DCF, all all proxy ports are notified at
> ice_dcf_vf_repr_stop_all. Finally the array is deleted.
> 
> Based on the original scenario, notifying all agent ports of DCF state failure is
> the simplest and most effective way. This is very similar to the v2 patch
> scenario.

Yes, that's why I suggest to refine based on v2 after notice the issue on v3, seems we need to maintain a per repr flag for DCF status, but need to update it carefully.

> 
> Check each DCF port flag?
> If the DCF port has failed and the resource has been released. the DCF port
> status, naturally, is not available. This is very similar to the v3 patch scenario.

Check dev->data is NULL does not work, a better way is to check eth_dev->state and device name to make sure it is still used by DCF.
But I think v2's method is more reliable.


> 
> Other ideas
> 1. Re-implement DCF port and proxy port communication, can't directly use
> (struct rte_eth_dev) pointer to access the device resources on the other end.
> 
> 2. Designed to return an error when removing the proxy port directly. The
> proxy port device is managed by the control DCF (the proxy port is created by
> the DCF port and the removal work should be done by it).
> 
> The above 2 options will bring a huge amount of work. It seems beyond the
> scope of this discussion.
> 


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v4] net/ice: fix crash on closing representor ports
  2023-11-01 10:13   ` [PATCH v3] " Mingjin Ye
  2023-11-01 10:48     ` Zhang, Qi Z
@ 2023-11-02 10:11     ` Mingjin Ye
  2023-11-06  1:23       ` Zhang, Qi Z
  2023-11-06 10:00       ` [PATCH v5] " Mingjin Ye
  1 sibling, 2 replies; 17+ messages in thread
From: Mingjin Ye @ 2023-11-02 10:11 UTC (permalink / raw)
  To: dev; +Cc: qiming.yang, yidingx.zhou, Mingjin Ye, stable, Qi Zhang

Since the representor port needs to access the resource of the
associated DCF when it is closing. Therefore, all the representor
port should be closed first, and then close the associated DCF port.

If the DCF port is closed before the representor port on PMD exit.
This will result in accessing freed resources and eventually a
core dump will occur.

This patch fixes this issue by notifying each other to unassociate
when the DCF port and the representor port are closed.

Fixes: 5674465a32c8 ("net/ice: add DCF VLAN handling")
Fixes: c7e1a1a3bfeb ("net/ice: refactor DCF VLAN handling")
Fixes: da9cdcd1f372 ("net/ice: fix crash on representor port closing")
Cc: stable@dpdk.org

Signed-off-by: Mingjin Ye <mingjinx.ye@intel.com>
---
v2: Reformat code to remove unneeded fixlines.
---
v3: New solution.
---
v4: Optimize v2 patch.
---
 drivers/net/ice/ice_dcf_ethdev.c         | 20 ++++++++++++++++++++
 drivers/net/ice/ice_dcf_ethdev.h         |  2 ++
 drivers/net/ice/ice_dcf_vf_representor.c | 23 ++++++++++++++++++-----
 3 files changed, 40 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ice/ice_dcf_ethdev.c b/drivers/net/ice/ice_dcf_ethdev.c
index 065ec728c2..63e63a23de 100644
--- a/drivers/net/ice/ice_dcf_ethdev.c
+++ b/drivers/net/ice/ice_dcf_ethdev.c
@@ -1618,6 +1618,26 @@ ice_dcf_free_repr_info(struct ice_dcf_adapter *dcf_adapter)
 	}
 }
 
+int
+ice_dcf_handle_vf_repr_uninit(struct ice_dcf_adapter *dcf_adapter,
+				uint16_t vf_id)
+{
+	struct ice_dcf_repr_info *vf_rep_info;
+
+	if (dcf_adapter->num_reprs >= vf_id) {
+		PMD_DRV_LOG(ERR, "Invalid VF id: %d", vf_id);
+		return -1;
+	}
+
+	if (!dcf_adapter->repr_infos)
+		return 0;
+
+	vf_rep_info = &dcf_adapter->repr_infos[vf_id];
+	vf_rep_info->vf_rep_eth_dev = NULL;
+
+	return 0;
+}
+
 static int
 ice_dcf_init_repr_info(struct ice_dcf_adapter *dcf_adapter)
 {
diff --git a/drivers/net/ice/ice_dcf_ethdev.h b/drivers/net/ice/ice_dcf_ethdev.h
index 4baaec4b8b..094e2a36db 100644
--- a/drivers/net/ice/ice_dcf_ethdev.h
+++ b/drivers/net/ice/ice_dcf_ethdev.h
@@ -60,6 +60,7 @@ struct ice_dcf_vf_repr {
 	struct rte_ether_addr mac_addr;
 	uint16_t switch_domain_id;
 	uint16_t vf_id;
+	bool dcf_valid;
 
 	struct ice_dcf_vlan outer_vlan_info; /* DCF always handle outer VLAN */
 };
@@ -81,5 +82,6 @@ int ice_dcf_vf_repr_uninit(struct rte_eth_dev *vf_rep_eth_dev);
 int ice_dcf_vf_repr_init_vlan(struct rte_eth_dev *vf_rep_eth_dev);
 void ice_dcf_vf_repr_stop_all(struct ice_dcf_adapter *dcf_adapter);
 bool ice_dcf_adminq_need_retry(struct ice_adapter *ad);
+int ice_dcf_handle_vf_repr_uninit(struct ice_dcf_adapter *dcf_adapter, uint16_t vf_id);
 
 #endif /* _ICE_DCF_ETHDEV_H_ */
diff --git a/drivers/net/ice/ice_dcf_vf_representor.c b/drivers/net/ice/ice_dcf_vf_representor.c
index b9fcfc80ad..167abaa780 100644
--- a/drivers/net/ice/ice_dcf_vf_representor.c
+++ b/drivers/net/ice/ice_dcf_vf_representor.c
@@ -45,6 +45,9 @@ ice_dcf_vf_repr_dev_start(struct rte_eth_dev *dev)
 static int
 ice_dcf_vf_repr_dev_stop(struct rte_eth_dev *dev)
 {
+	struct ice_dcf_vf_repr *repr = dev->data->dev_private;
+
+	repr->dcf_valid = false;
 	dev->data->dev_link.link_status = RTE_ETH_LINK_DOWN;
 
 	return 0;
@@ -111,14 +114,15 @@ ice_dcf_vf_repr_link_update(__rte_unused struct rte_eth_dev *ethdev,
 static __rte_always_inline struct ice_dcf_hw *
 ice_dcf_vf_repr_hw(struct ice_dcf_vf_repr *repr)
 {
-	struct ice_dcf_adapter *dcf_adapter =
-			repr->dcf_eth_dev->data->dev_private;
+	struct ice_dcf_adapter *dcf_adapter;
 
-	if (!dcf_adapter) {
+	if (!repr->dcf_valid) {
 		PMD_DRV_LOG(ERR, "DCF for VF representor has been released\n");
 		return NULL;
 	}
 
+	dcf_adapter = repr->dcf_eth_dev->data->dev_private;
+
 	return &dcf_adapter->real_hw;
 }
 
@@ -414,6 +418,7 @@ ice_dcf_vf_repr_init(struct rte_eth_dev *vf_rep_eth_dev, void *init_param)
 	repr->dcf_eth_dev = param->dcf_eth_dev;
 	repr->switch_domain_id = param->switch_domain_id;
 	repr->vf_id = param->vf_id;
+	repr->dcf_valid = true;
 	repr->outer_vlan_info.port_vlan_ena = false;
 	repr->outer_vlan_info.stripping_ena = false;
 	repr->outer_vlan_info.tpid = RTE_ETHER_TYPE_VLAN;
@@ -437,6 +442,14 @@ ice_dcf_vf_repr_init(struct rte_eth_dev *vf_rep_eth_dev, void *init_param)
 int
 ice_dcf_vf_repr_uninit(struct rte_eth_dev *vf_rep_eth_dev)
 {
+	struct ice_dcf_vf_repr *repr = vf_rep_eth_dev->data->dev_private;
+	struct ice_dcf_adapter *dcf_adapter;
+
+	if (repr->dcf_valid) {
+		dcf_adapter = repr->dcf_eth_dev->data->dev_private;
+		ice_dcf_handle_vf_repr_uninit(dcf_adapter, repr->vf_id);
+	}
+
 	vf_rep_eth_dev->data->mac_addrs = NULL;
 
 	return 0;
@@ -480,11 +493,11 @@ ice_dcf_vf_repr_stop_all(struct ice_dcf_adapter *dcf_adapter)
 	for (vf_id = 0; vf_id < dcf_adapter->real_hw.num_vfs; vf_id++) {
 		struct rte_eth_dev *vf_rep_eth_dev =
 				dcf_adapter->repr_infos[vf_id].vf_rep_eth_dev;
-		if (!vf_rep_eth_dev || vf_rep_eth_dev->data->dev_started == 0)
+		if (!vf_rep_eth_dev)
 			continue;
 
 		ret = ice_dcf_vf_repr_dev_stop(vf_rep_eth_dev);
 		if (!ret)
-			vf_rep_eth_dev->data->dev_started = 0;
+			dcf_adapter->repr_infos[vf_id].vf_rep_eth_dev = NULL;
 	}
 }
-- 
2.25.1


^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: [PATCH v4] net/ice: fix crash on closing representor ports
  2023-11-02 10:11     ` [PATCH v4] " Mingjin Ye
@ 2023-11-06  1:23       ` Zhang, Qi Z
  2023-11-06 10:00       ` [PATCH v5] " Mingjin Ye
  1 sibling, 0 replies; 17+ messages in thread
From: Zhang, Qi Z @ 2023-11-06  1:23 UTC (permalink / raw)
  To: Ye, MingjinX, dev; +Cc: Yang, Qiming, Zhou, YidingX, stable



> -----Original Message-----
> From: Ye, MingjinX <mingjinx.ye@intel.com>
> Sent: Thursday, November 2, 2023 6:11 PM
> To: dev@dpdk.org
> Cc: Yang, Qiming <qiming.yang@intel.com>; Zhou, YidingX
> <yidingx.zhou@intel.com>; Ye, MingjinX <mingjinx.ye@intel.com>;
> stable@dpdk.org; Zhang, Qi Z <qi.z.zhang@intel.com>
> Subject: [PATCH v4] net/ice: fix crash on closing representor ports
> 
> Since the representor port needs to access the resource of the associated DCF
> when it is closing. Therefore, all the representor port should be closed first,
> and then close the associated DCF port.
> 
> If the DCF port is closed before the representor port on PMD exit.
> This will result in accessing freed resources and eventually a core dump will
> occur.
> 
> This patch fixes this issue by notifying each other to unassociate when the
> DCF port and the representor port are closed.
> 
> Fixes: 5674465a32c8 ("net/ice: add DCF VLAN handling")
> Fixes: c7e1a1a3bfeb ("net/ice: refactor DCF VLAN handling")
> Fixes: da9cdcd1f372 ("net/ice: fix crash on representor port closing")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Mingjin Ye <mingjinx.ye@intel.com>
> ---
> v2: Reformat code to remove unneeded fixlines.
> ---
> v3: New solution.
> ---
> v4: Optimize v2 patch.
> ---
>  drivers/net/ice/ice_dcf_ethdev.c         | 20 ++++++++++++++++++++
>  drivers/net/ice/ice_dcf_ethdev.h         |  2 ++
>  drivers/net/ice/ice_dcf_vf_representor.c | 23 ++++++++++++++++++-----
>  3 files changed, 40 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/net/ice/ice_dcf_ethdev.c b/drivers/net/ice/ice_dcf_ethdev.c
> index 065ec728c2..63e63a23de 100644
> --- a/drivers/net/ice/ice_dcf_ethdev.c
> +++ b/drivers/net/ice/ice_dcf_ethdev.c
> @@ -1618,6 +1618,26 @@ ice_dcf_free_repr_info(struct ice_dcf_adapter
> *dcf_adapter)
>  	}
>  }
> 
> +int
> +ice_dcf_handle_vf_repr_uninit(struct ice_dcf_adapter *dcf_adapter,
> +				uint16_t vf_id)
> +{
> +	struct ice_dcf_repr_info *vf_rep_info;
> +
> +	if (dcf_adapter->num_reprs >= vf_id) {
> +		PMD_DRV_LOG(ERR, "Invalid VF id: %d", vf_id);
> +		return -1;
> +	}
> +
> +	if (!dcf_adapter->repr_infos)
> +		return 0;
> +
> +	vf_rep_info = &dcf_adapter->repr_infos[vf_id];
> +	vf_rep_info->vf_rep_eth_dev = NULL;
> +
> +	return 0;
> +}
> +
>  static int
>  ice_dcf_init_repr_info(struct ice_dcf_adapter *dcf_adapter)  { diff --git
> a/drivers/net/ice/ice_dcf_ethdev.h b/drivers/net/ice/ice_dcf_ethdev.h
> index 4baaec4b8b..094e2a36db 100644
> --- a/drivers/net/ice/ice_dcf_ethdev.h
> +++ b/drivers/net/ice/ice_dcf_ethdev.h
> @@ -60,6 +60,7 @@ struct ice_dcf_vf_repr {
>  	struct rte_ether_addr mac_addr;
>  	uint16_t switch_domain_id;
>  	uint16_t vf_id;
> +	bool dcf_valid;
> 
>  	struct ice_dcf_vlan outer_vlan_info; /* DCF always handle outer VLAN
> */  }; @@ -81,5 +82,6 @@ int ice_dcf_vf_repr_uninit(struct rte_eth_dev
> *vf_rep_eth_dev);  int ice_dcf_vf_repr_init_vlan(struct rte_eth_dev
> *vf_rep_eth_dev);  void ice_dcf_vf_repr_stop_all(struct ice_dcf_adapter
> *dcf_adapter);  bool ice_dcf_adminq_need_retry(struct ice_adapter *ad);
> +int ice_dcf_handle_vf_repr_uninit(struct ice_dcf_adapter *dcf_adapter,
> +uint16_t vf_id);
> 
>  #endif /* _ICE_DCF_ETHDEV_H_ */
> diff --git a/drivers/net/ice/ice_dcf_vf_representor.c
> b/drivers/net/ice/ice_dcf_vf_representor.c
> index b9fcfc80ad..167abaa780 100644
> --- a/drivers/net/ice/ice_dcf_vf_representor.c
> +++ b/drivers/net/ice/ice_dcf_vf_representor.c
> @@ -45,6 +45,9 @@ ice_dcf_vf_repr_dev_start(struct rte_eth_dev *dev)
> static int  ice_dcf_vf_repr_dev_stop(struct rte_eth_dev *dev)  {
> +	struct ice_dcf_vf_repr *repr = dev->data->dev_private;
> +
> +	repr->dcf_valid = false;

Is this correct ? 

ice_dcf_handle_vf_repr_uninit will not be called,
if we stop a representor port (call ice_dcf_vf_repr_dev_stop), then close it (call ice_dcf_vf_repr_uninit) while DCF still be active.

if you want to use dcf_valid as a notification flag sent from dcf to each representors,  never set it in a representer's own callback function.

Btw, besides set dcf_valid to true in ice_dcf_vf_repr_init, you may also consider the case when a DCF reset happen, dcf_valid need to be reset to true for each active representor. 

...

> *init_param)  int  ice_dcf_vf_repr_uninit(struct rte_eth_dev *vf_rep_eth_dev)
> {
> +	struct ice_dcf_vf_repr *repr = vf_rep_eth_dev->data->dev_private;
> +	struct ice_dcf_adapter *dcf_adapter;
> +
> +	if (repr->dcf_valid) {
> +		dcf_adapter = repr->dcf_eth_dev->data->dev_private;
> +		ice_dcf_handle_vf_repr_uninit(dcf_adapter, repr->vf_id);
> +	}
> +
>  	vf_rep_eth_dev->data->mac_addrs = NULL;
> 
>  	return 0;


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v5] net/ice: fix crash on closing representor ports
  2023-11-02 10:11     ` [PATCH v4] " Mingjin Ye
  2023-11-06  1:23       ` Zhang, Qi Z
@ 2023-11-06 10:00       ` Mingjin Ye
  2023-11-06 12:05         ` Zhang, Qi Z
  2023-11-07 10:12         ` [PATCH v6] " Mingjin Ye
  1 sibling, 2 replies; 17+ messages in thread
From: Mingjin Ye @ 2023-11-06 10:00 UTC (permalink / raw)
  To: dev; +Cc: qiming.yang, yidingx.zhou, Mingjin Ye, stable, Qi Zhang

The data resource in struct rte_eth_dev is cleared and points to NULL
when the DCF port is closed.

If the DCF representor port is closed after the DCF port is closed,
a segmentation fault occurs because the representor port accesses the
data resource released by the DCF port.

This patch fixes this issue by synchronizing the state of DCF ports and
representor ports to the peer in real time when their state changes.

Fixes: 5674465a32c8 ("net/ice: add DCF VLAN handling")
Fixes: da9cdcd1f372 ("net/ice: fix crash on representor port closing")
Fixes: 7564d5509611 ("net/ice: add DCF hardware initialization")
Fixes: c7e1a1a3bfeb ("net/ice: refactor DCF VLAN handling")
Fixes: 1a86f4dbdf42 ("net/ice: support DCF device reset")
Cc: stable@dpdk.org

Signed-off-by: Mingjin Ye <mingjinx.ye@intel.com>
---
v2: Reformat code to remove unneeded fixlines.
---
v3: New solution.
---
v4: Optimize v2 patch.
---
v5: optimization.
---
 drivers/net/ice/ice_dcf_ethdev.c         | 30 ++++++++++++--
 drivers/net/ice/ice_dcf_ethdev.h         |  3 ++
 drivers/net/ice/ice_dcf_vf_representor.c | 50 ++++++++++++++++++++++--
 3 files changed, 77 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ice/ice_dcf_ethdev.c b/drivers/net/ice/ice_dcf_ethdev.c
index 065ec728c2..eea24ee3a9 100644
--- a/drivers/net/ice/ice_dcf_ethdev.c
+++ b/drivers/net/ice/ice_dcf_ethdev.c
@@ -1618,6 +1618,26 @@ ice_dcf_free_repr_info(struct ice_dcf_adapter *dcf_adapter)
 	}
 }
 
+int
+ice_dcf_handle_vf_repr_close(struct ice_dcf_adapter *dcf_adapter,
+				uint16_t vf_id)
+{
+	struct ice_dcf_repr_info *vf_rep_info;
+
+	if (dcf_adapter->num_reprs >= vf_id) {
+		PMD_DRV_LOG(ERR, "Invalid VF id: %d", vf_id);
+		return -1;
+	}
+
+	if (!dcf_adapter->repr_infos)
+		return 0;
+
+	vf_rep_info = &dcf_adapter->repr_infos[vf_id];
+	vf_rep_info->vf_rep_eth_dev = NULL;
+
+	return 0;
+}
+
 static int
 ice_dcf_init_repr_info(struct ice_dcf_adapter *dcf_adapter)
 {
@@ -1641,11 +1661,10 @@ ice_dcf_dev_close(struct rte_eth_dev *dev)
 	if (rte_eal_process_type() != RTE_PROC_PRIMARY)
 		return 0;
 
+	ice_dcf_vf_repr_notify_all(adapter, false);
 	(void)ice_dcf_dev_stop(dev);
 
 	ice_free_queues(dev);
-
-	ice_dcf_free_repr_info(adapter);
 	ice_dcf_uninit_parent_adapter(dev);
 	ice_dcf_uninit_hw(dev, &adapter->real_hw);
 
@@ -1835,7 +1854,7 @@ ice_dcf_dev_reset(struct rte_eth_dev *dev)
 		ice_dcf_reset_hw(dev, hw);
 	}
 
-	ret = ice_dcf_dev_uninit(dev);
+	ret = ice_dcf_dev_close(dev);
 	if (ret)
 		return ret;
 
@@ -1938,12 +1957,17 @@ ice_dcf_dev_init(struct rte_eth_dev *eth_dev)
 	}
 
 	dcf_config_promisc(adapter, false, false);
+	ice_dcf_vf_repr_notify_all(adapter, true);
+
 	return 0;
 }
 
 static int
 ice_dcf_dev_uninit(struct rte_eth_dev *eth_dev)
 {
+	struct ice_dcf_adapter *adapter = eth_dev->data->dev_private;
+
+	ice_dcf_free_repr_info(adapter);
 	ice_dcf_dev_close(eth_dev);
 
 	return 0;
diff --git a/drivers/net/ice/ice_dcf_ethdev.h b/drivers/net/ice/ice_dcf_ethdev.h
index 4baaec4b8b..6dcbaac5eb 100644
--- a/drivers/net/ice/ice_dcf_ethdev.h
+++ b/drivers/net/ice/ice_dcf_ethdev.h
@@ -60,6 +60,7 @@ struct ice_dcf_vf_repr {
 	struct rte_ether_addr mac_addr;
 	uint16_t switch_domain_id;
 	uint16_t vf_id;
+	bool dcf_valid;
 
 	struct ice_dcf_vlan outer_vlan_info; /* DCF always handle outer VLAN */
 };
@@ -80,6 +81,8 @@ int ice_dcf_vf_repr_init(struct rte_eth_dev *vf_rep_eth_dev, void *init_param);
 int ice_dcf_vf_repr_uninit(struct rte_eth_dev *vf_rep_eth_dev);
 int ice_dcf_vf_repr_init_vlan(struct rte_eth_dev *vf_rep_eth_dev);
 void ice_dcf_vf_repr_stop_all(struct ice_dcf_adapter *dcf_adapter);
+void ice_dcf_vf_repr_notify_all(struct ice_dcf_adapter *dcf_adapter, bool valid);
+int ice_dcf_handle_vf_repr_close(struct ice_dcf_adapter *dcf_adapter, uint16_t vf_id);
 bool ice_dcf_adminq_need_retry(struct ice_adapter *ad);
 
 #endif /* _ICE_DCF_ETHDEV_H_ */
diff --git a/drivers/net/ice/ice_dcf_vf_representor.c b/drivers/net/ice/ice_dcf_vf_representor.c
index b9fcfc80ad..6c342798ac 100644
--- a/drivers/net/ice/ice_dcf_vf_representor.c
+++ b/drivers/net/ice/ice_dcf_vf_representor.c
@@ -50,9 +50,32 @@ ice_dcf_vf_repr_dev_stop(struct rte_eth_dev *dev)
 	return 0;
 }
 
+static bool
+ice_dcf_vf_repr_set_dcf_valid(struct rte_eth_dev *dev, bool valid)
+{
+	struct ice_dcf_vf_repr *repr = dev->data->dev_private;
+
+	if (!repr)
+		return false;
+
+	repr->dcf_valid = valid;
+
+	return true;
+}
+
 static int
 ice_dcf_vf_repr_dev_close(struct rte_eth_dev *dev)
 {
+	struct ice_dcf_vf_repr *repr = dev->data->dev_private;
+	struct ice_dcf_adapter *dcf_adapter;
+
+	if (repr->dcf_valid) {
+		dcf_adapter = repr->dcf_eth_dev->data->dev_private;
+		ice_dcf_handle_vf_repr_close(dcf_adapter, repr->vf_id);
+		/* Commitment to not access DCF port resources */
+		repr->dcf_valid = false;
+	}
+
 	return ice_dcf_vf_repr_uninit(dev);
 }
 
@@ -111,14 +134,15 @@ ice_dcf_vf_repr_link_update(__rte_unused struct rte_eth_dev *ethdev,
 static __rte_always_inline struct ice_dcf_hw *
 ice_dcf_vf_repr_hw(struct ice_dcf_vf_repr *repr)
 {
-	struct ice_dcf_adapter *dcf_adapter =
-			repr->dcf_eth_dev->data->dev_private;
+	struct ice_dcf_adapter *dcf_adapter;
 
-	if (!dcf_adapter) {
+	if (!repr->dcf_valid) {
 		PMD_DRV_LOG(ERR, "DCF for VF representor has been released\n");
 		return NULL;
 	}
 
+	dcf_adapter = repr->dcf_eth_dev->data->dev_private;
+
 	return &dcf_adapter->real_hw;
 }
 
@@ -414,6 +438,7 @@ ice_dcf_vf_repr_init(struct rte_eth_dev *vf_rep_eth_dev, void *init_param)
 	repr->dcf_eth_dev = param->dcf_eth_dev;
 	repr->switch_domain_id = param->switch_domain_id;
 	repr->vf_id = param->vf_id;
+	repr->dcf_valid = true;
 	repr->outer_vlan_info.port_vlan_ena = false;
 	repr->outer_vlan_info.stripping_ena = false;
 	repr->outer_vlan_info.tpid = RTE_ETHER_TYPE_VLAN;
@@ -488,3 +513,22 @@ ice_dcf_vf_repr_stop_all(struct ice_dcf_adapter *dcf_adapter)
 			vf_rep_eth_dev->data->dev_started = 0;
 	}
 }
+
+void
+ice_dcf_vf_repr_notify_all(struct ice_dcf_adapter *dcf_adapter, bool valid)
+{
+	uint16_t vf_id;
+	struct rte_eth_dev *vf_rep_eth_dev;
+
+	if (!dcf_adapter->repr_infos)
+		return;
+
+	for (vf_id = 0; vf_id < dcf_adapter->real_hw.num_vfs; vf_id++) {
+		vf_rep_eth_dev = dcf_adapter->repr_infos[vf_id].vf_rep_eth_dev;
+
+		if (!vf_rep_eth_dev)
+			continue;
+
+		ice_dcf_vf_repr_set_dcf_valid(vf_rep_eth_dev, valid);
+	}
+}
-- 
2.25.1


^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: [PATCH v5] net/ice: fix crash on closing representor ports
  2023-11-06 10:00       ` [PATCH v5] " Mingjin Ye
@ 2023-11-06 12:05         ` Zhang, Qi Z
  2023-11-07 10:12         ` [PATCH v6] " Mingjin Ye
  1 sibling, 0 replies; 17+ messages in thread
From: Zhang, Qi Z @ 2023-11-06 12:05 UTC (permalink / raw)
  To: Ye, MingjinX, dev; +Cc: Yang, Qiming, Zhou, YidingX, stable



> -----Original Message-----
> From: Ye, MingjinX <mingjinx.ye@intel.com>
> Sent: Monday, November 6, 2023 6:00 PM
> To: dev@dpdk.org
> Cc: Yang, Qiming <qiming.yang@intel.com>; Zhou, YidingX
> <yidingx.zhou@intel.com>; Ye, MingjinX <mingjinx.ye@intel.com>;
> stable@dpdk.org; Zhang, Qi Z <qi.z.zhang@intel.com>
> Subject: [PATCH v5] net/ice: fix crash on closing representor ports
> 
> The data resource in struct rte_eth_dev is cleared and points to NULL when
> the DCF port is closed.
> 
> If the DCF representor port is closed after the DCF port is closed, a
> segmentation fault occurs because the representor port accesses the data
> resource released by the DCF port.
> 
> This patch fixes this issue by synchronizing the state of DCF ports and
> representor ports to the peer in real time when their state changes.
> 
> Fixes: 5674465a32c8 ("net/ice: add DCF VLAN handling")
> Fixes: da9cdcd1f372 ("net/ice: fix crash on representor port closing")
> Fixes: 7564d5509611 ("net/ice: add DCF hardware initialization")
> Fixes: c7e1a1a3bfeb ("net/ice: refactor DCF VLAN handling")
> Fixes: 1a86f4dbdf42 ("net/ice: support DCF device reset")

These fixlines does not make sense to me, I believe the issue comes from when we enabled port representor for DCF, 
A patch expose the issue does not imply it cause the issue.

> Cc: stable@dpdk.org
> 
> Signed-off-by: Mingjin Ye <mingjinx.ye@intel.com>
> ---
> v2: Reformat code to remove unneeded fixlines.
> ---
> v3: New solution.
> ---
> v4: Optimize v2 patch.
> ---
> v5: optimization.
> ---
>  drivers/net/ice/ice_dcf_ethdev.c         | 30 ++++++++++++--
>  drivers/net/ice/ice_dcf_ethdev.h         |  3 ++
>  drivers/net/ice/ice_dcf_vf_representor.c | 50 ++++++++++++++++++++++--
>  3 files changed, 77 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/net/ice/ice_dcf_ethdev.c b/drivers/net/ice/ice_dcf_ethdev.c
> index 065ec728c2..eea24ee3a9 100644
> --- a/drivers/net/ice/ice_dcf_ethdev.c
> +++ b/drivers/net/ice/ice_dcf_ethdev.c
> @@ -1618,6 +1618,26 @@ ice_dcf_free_repr_info(struct ice_dcf_adapter
> *dcf_adapter)
>  	}
>  }
> 
> +int
> +ice_dcf_handle_vf_repr_close(struct ice_dcf_adapter *dcf_adapter,
> +				uint16_t vf_id)
> +{
> +	struct ice_dcf_repr_info *vf_rep_info;
> +
> +	if (dcf_adapter->num_reprs >= vf_id) {
> +		PMD_DRV_LOG(ERR, "Invalid VF id: %d", vf_id);
> +		return -1;
> +	}
> +
> +	if (!dcf_adapter->repr_infos)
> +		return 0;
> +
> +	vf_rep_info = &dcf_adapter->repr_infos[vf_id];
> +	vf_rep_info->vf_rep_eth_dev = NULL;
> +
> +	return 0;
> +}
> +
>  static int
>  ice_dcf_init_repr_info(struct ice_dcf_adapter *dcf_adapter)  { @@ -1641,11
> +1661,10 @@ ice_dcf_dev_close(struct rte_eth_dev *dev)
>  	if (rte_eal_process_type() != RTE_PROC_PRIMARY)
>  		return 0;
> 
> +	ice_dcf_vf_repr_notify_all(adapter, false);
>  	(void)ice_dcf_dev_stop(dev);
> 
>  	ice_free_queues(dev);
> -
> -	ice_dcf_free_repr_info(adapter);
>  	ice_dcf_uninit_parent_adapter(dev);
>  	ice_dcf_uninit_hw(dev, &adapter->real_hw);
> 
> @@ -1835,7 +1854,7 @@ ice_dcf_dev_reset(struct rte_eth_dev *dev)
>  		ice_dcf_reset_hw(dev, hw);
>  	}
> 
> -	ret = ice_dcf_dev_uninit(dev);
> +	ret = ice_dcf_dev_close(dev);
>  	if (ret)
>  		return ret;
> 
> @@ -1938,12 +1957,17 @@ ice_dcf_dev_init(struct rte_eth_dev *eth_dev)
>  	}
> 
>  	dcf_config_promisc(adapter, false, false);
> +	ice_dcf_vf_repr_notify_all(adapter, true);
> +
>  	return 0;
>  }
> 
>  static int
>  ice_dcf_dev_uninit(struct rte_eth_dev *eth_dev)  {
> +	struct ice_dcf_adapter *adapter = eth_dev->data->dev_private;
> +
> +	ice_dcf_free_repr_info(adapter);
>  	ice_dcf_dev_close(eth_dev);
> 
>  	return 0;
> diff --git a/drivers/net/ice/ice_dcf_ethdev.h
> b/drivers/net/ice/ice_dcf_ethdev.h
> index 4baaec4b8b..6dcbaac5eb 100644
> --- a/drivers/net/ice/ice_dcf_ethdev.h
> +++ b/drivers/net/ice/ice_dcf_ethdev.h
> @@ -60,6 +60,7 @@ struct ice_dcf_vf_repr {
>  	struct rte_ether_addr mac_addr;
>  	uint16_t switch_domain_id;
>  	uint16_t vf_id;
> +	bool dcf_valid;
> 
>  	struct ice_dcf_vlan outer_vlan_info; /* DCF always handle outer VLAN
> */  }; @@ -80,6 +81,8 @@ int ice_dcf_vf_repr_init(struct rte_eth_dev
> *vf_rep_eth_dev, void *init_param);  int ice_dcf_vf_repr_uninit(struct
> rte_eth_dev *vf_rep_eth_dev);  int ice_dcf_vf_repr_init_vlan(struct
> rte_eth_dev *vf_rep_eth_dev);  void ice_dcf_vf_repr_stop_all(struct
> ice_dcf_adapter *dcf_adapter);
> +void ice_dcf_vf_repr_notify_all(struct ice_dcf_adapter *dcf_adapter,
> +bool valid); int ice_dcf_handle_vf_repr_close(struct ice_dcf_adapter
> +*dcf_adapter, uint16_t vf_id);
>  bool ice_dcf_adminq_need_retry(struct ice_adapter *ad);
> 
>  #endif /* _ICE_DCF_ETHDEV_H_ */
> diff --git a/drivers/net/ice/ice_dcf_vf_representor.c
> b/drivers/net/ice/ice_dcf_vf_representor.c
> index b9fcfc80ad..6c342798ac 100644
> --- a/drivers/net/ice/ice_dcf_vf_representor.c
> +++ b/drivers/net/ice/ice_dcf_vf_representor.c
> @@ -50,9 +50,32 @@ ice_dcf_vf_repr_dev_stop(struct rte_eth_dev *dev)
>  	return 0;
>  }
> 
> +static bool

This function return Boolean, but it never be checked.

> +ice_dcf_vf_repr_set_dcf_valid(struct rte_eth_dev *dev, bool valid) {
> +	struct ice_dcf_vf_repr *repr = dev->data->dev_private;
> +
> +	if (!repr)
> +		return false;

Is this check necessary? If repr = null, means the representor port already be closed, the function will not be called for that port.

> +
> +	repr->dcf_valid = valid;
> +
> +	return true;
> +}
> +
>  static int
>  ice_dcf_vf_repr_dev_close(struct rte_eth_dev *dev)  {
> +	struct ice_dcf_vf_repr *repr = dev->data->dev_private;
> +	struct ice_dcf_adapter *dcf_adapter;
> +
> +	if (repr->dcf_valid) {
> +		dcf_adapter = repr->dcf_eth_dev->data->dev_private;
> +		ice_dcf_handle_vf_repr_close(dcf_adapter, repr->vf_id);
> +		/* Commitment to not access DCF port resources */
> +		repr->dcf_valid = false;

This is not necessary, as you already called ice_dcf_handle_vf_repr_close to notify dcf.
And the repr will be freed in ice_dcf_vf_repr_uninit anyway.


> +	}
> +
>  	return ice_dcf_vf_repr_uninit(dev);
>  }
> 
> @@ -111,14 +134,15 @@ ice_dcf_vf_repr_link_update(__rte_unused struct
> rte_eth_dev *ethdev,  static __rte_always_inline struct ice_dcf_hw *
> ice_dcf_vf_repr_hw(struct ice_dcf_vf_repr *repr)  {
> -	struct ice_dcf_adapter *dcf_adapter =
> -			repr->dcf_eth_dev->data->dev_private;
> +	struct ice_dcf_adapter *dcf_adapter;
> 
> -	if (!dcf_adapter) {
> +	if (!repr->dcf_valid) {
>  		PMD_DRV_LOG(ERR, "DCF for VF representor has been
> released\n");
>  		return NULL;
>  	}
> 
> +	dcf_adapter = repr->dcf_eth_dev->data->dev_private;
> +
>  	return &dcf_adapter->real_hw;
>  }
> 
> @@ -414,6 +438,7 @@ ice_dcf_vf_repr_init(struct rte_eth_dev
> *vf_rep_eth_dev, void *init_param)
>  	repr->dcf_eth_dev = param->dcf_eth_dev;
>  	repr->switch_domain_id = param->switch_domain_id;
>  	repr->vf_id = param->vf_id;
> +	repr->dcf_valid = true;
>  	repr->outer_vlan_info.port_vlan_ena = false;
>  	repr->outer_vlan_info.stripping_ena = false;
>  	repr->outer_vlan_info.tpid = RTE_ETHER_TYPE_VLAN; @@ -488,3
> +513,22 @@ ice_dcf_vf_repr_stop_all(struct ice_dcf_adapter *dcf_adapter)
>  			vf_rep_eth_dev->data->dev_started = 0;
>  	}
>  }
> +
> +void
> +ice_dcf_vf_repr_notify_all(struct ice_dcf_adapter *dcf_adapter, bool
> +valid) {
> +	uint16_t vf_id;
> +	struct rte_eth_dev *vf_rep_eth_dev;
> +
> +	if (!dcf_adapter->repr_infos)
> +		return;
> +
> +	for (vf_id = 0; vf_id < dcf_adapter->real_hw.num_vfs; vf_id++) {
> +		vf_rep_eth_dev = dcf_adapter-
> >repr_infos[vf_id].vf_rep_eth_dev;
> +
> +		if (!vf_rep_eth_dev)
> +			continue;
> +
> +		ice_dcf_vf_repr_set_dcf_valid(vf_rep_eth_dev, valid);
> +	}
> +}
> --
> 2.25.1


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v6] net/ice: fix crash on closing representor ports
  2023-11-06 10:00       ` [PATCH v5] " Mingjin Ye
  2023-11-06 12:05         ` Zhang, Qi Z
@ 2023-11-07 10:12         ` Mingjin Ye
  2023-11-07 12:18           ` Zhang, Qi Z
  2023-11-08 11:39           ` [PATCH v7] " Mingjin Ye
  1 sibling, 2 replies; 17+ messages in thread
From: Mingjin Ye @ 2023-11-07 10:12 UTC (permalink / raw)
  To: dev; +Cc: qiming.yang, Mingjin Ye, stable, Qi Zhang

The data resource in struct rte_eth_dev is cleared and points to NULL
when the DCF port is closed.

If the DCF representor port is closed after the DCF port is closed,
a segmentation fault occurs because the representor port accesses the
data resource released by the DCF port.

This patch fixes this issue by synchronizing the state of DCF ports and
representor ports to the peer in real time when their state changes.

Fixes: da9cdcd1f372 ("net/ice: fix crash on representor port closing")
Cc: stable@dpdk.org

Signed-off-by: Mingjin Ye <mingjinx.ye@intel.com>
---
v2: Reformat code to remove unneeded fixlines.
---
v3: New solution.
---
v4: Optimize v2 patch.
---
v5: optimization.
---
v6: Optimize and resolve conflicts.
---
 drivers/net/ice/ice_dcf_ethdev.c         | 30 ++++++++++++--
 drivers/net/ice/ice_dcf_ethdev.h         |  3 ++
 drivers/net/ice/ice_dcf_vf_representor.c | 51 ++++++++++++++++++++++--
 3 files changed, 78 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ice/ice_dcf_ethdev.c b/drivers/net/ice/ice_dcf_ethdev.c
index 29699c2c32..5d845bba31 100644
--- a/drivers/net/ice/ice_dcf_ethdev.c
+++ b/drivers/net/ice/ice_dcf_ethdev.c
@@ -1618,6 +1618,26 @@ ice_dcf_free_repr_info(struct ice_dcf_adapter *dcf_adapter)
 	}
 }
 
+int
+ice_dcf_handle_vf_repr_close(struct ice_dcf_adapter *dcf_adapter,
+				uint16_t vf_id)
+{
+	struct ice_dcf_repr_info *vf_rep_info;
+
+	if (dcf_adapter->num_reprs >= vf_id) {
+		PMD_DRV_LOG(ERR, "Invalid VF id: %d", vf_id);
+		return -1;
+	}
+
+	if (!dcf_adapter->repr_infos)
+		return 0;
+
+	vf_rep_info = &dcf_adapter->repr_infos[vf_id];
+	vf_rep_info->vf_rep_eth_dev = NULL;
+
+	return 0;
+}
+
 static int
 ice_dcf_init_repr_info(struct ice_dcf_adapter *dcf_adapter)
 {
@@ -1641,11 +1661,10 @@ ice_dcf_dev_close(struct rte_eth_dev *dev)
 	if (rte_eal_process_type() != RTE_PROC_PRIMARY)
 		return 0;
 
+	ice_dcf_vf_repr_notify_all(adapter, false);
 	(void)ice_dcf_dev_stop(dev);
 
 	ice_free_queues(dev);
-
-	ice_dcf_free_repr_info(adapter);
 	ice_dcf_uninit_parent_adapter(dev);
 	ice_dcf_uninit_hw(dev, &adapter->real_hw);
 
@@ -1835,7 +1854,7 @@ ice_dcf_dev_reset(struct rte_eth_dev *dev)
 		ice_dcf_reset_hw(dev, hw);
 	}
 
-	ret = ice_dcf_dev_uninit(dev);
+	ret = ice_dcf_dev_close(dev);
 	if (ret)
 		return ret;
 
@@ -1940,12 +1959,17 @@ ice_dcf_dev_init(struct rte_eth_dev *eth_dev)
 	ice_dcf_stats_reset(eth_dev);
 
 	dcf_config_promisc(adapter, false, false);
+	ice_dcf_vf_repr_notify_all(adapter, true);
+
 	return 0;
 }
 
 static int
 ice_dcf_dev_uninit(struct rte_eth_dev *eth_dev)
 {
+	struct ice_dcf_adapter *adapter = eth_dev->data->dev_private;
+
+	ice_dcf_free_repr_info(adapter);
 	ice_dcf_dev_close(eth_dev);
 
 	return 0;
diff --git a/drivers/net/ice/ice_dcf_ethdev.h b/drivers/net/ice/ice_dcf_ethdev.h
index 4baaec4b8b..6dcbaac5eb 100644
--- a/drivers/net/ice/ice_dcf_ethdev.h
+++ b/drivers/net/ice/ice_dcf_ethdev.h
@@ -60,6 +60,7 @@ struct ice_dcf_vf_repr {
 	struct rte_ether_addr mac_addr;
 	uint16_t switch_domain_id;
 	uint16_t vf_id;
+	bool dcf_valid;
 
 	struct ice_dcf_vlan outer_vlan_info; /* DCF always handle outer VLAN */
 };
@@ -80,6 +81,8 @@ int ice_dcf_vf_repr_init(struct rte_eth_dev *vf_rep_eth_dev, void *init_param);
 int ice_dcf_vf_repr_uninit(struct rte_eth_dev *vf_rep_eth_dev);
 int ice_dcf_vf_repr_init_vlan(struct rte_eth_dev *vf_rep_eth_dev);
 void ice_dcf_vf_repr_stop_all(struct ice_dcf_adapter *dcf_adapter);
+void ice_dcf_vf_repr_notify_all(struct ice_dcf_adapter *dcf_adapter, bool valid);
+int ice_dcf_handle_vf_repr_close(struct ice_dcf_adapter *dcf_adapter, uint16_t vf_id);
 bool ice_dcf_adminq_need_retry(struct ice_adapter *ad);
 
 #endif /* _ICE_DCF_ETHDEV_H_ */
diff --git a/drivers/net/ice/ice_dcf_vf_representor.c b/drivers/net/ice/ice_dcf_vf_representor.c
index b9fcfc80ad..00dc322b30 100644
--- a/drivers/net/ice/ice_dcf_vf_representor.c
+++ b/drivers/net/ice/ice_dcf_vf_representor.c
@@ -50,9 +50,30 @@ ice_dcf_vf_repr_dev_stop(struct rte_eth_dev *dev)
 	return 0;
 }
 
+static int
+ice_dcf_vf_repr_set_dcf_valid(struct rte_eth_dev *dev, bool valid)
+{
+	struct ice_dcf_vf_repr *repr = dev->data->dev_private;
+
+	repr->dcf_valid = valid;
+
+	return 0;
+}
+
 static int
 ice_dcf_vf_repr_dev_close(struct rte_eth_dev *dev)
 {
+	struct ice_dcf_vf_repr *repr = dev->data->dev_private;
+	struct ice_dcf_adapter *dcf_adapter;
+	int err;
+
+	if (repr->dcf_valid) {
+		dcf_adapter = repr->dcf_eth_dev->data->dev_private;
+		err = ice_dcf_handle_vf_repr_close(dcf_adapter, repr->vf_id);
+		if (err)
+			PMD_DRV_LOG(ERR, "VF representor invalid");
+	}
+
 	return ice_dcf_vf_repr_uninit(dev);
 }
 
@@ -111,14 +132,15 @@ ice_dcf_vf_repr_link_update(__rte_unused struct rte_eth_dev *ethdev,
 static __rte_always_inline struct ice_dcf_hw *
 ice_dcf_vf_repr_hw(struct ice_dcf_vf_repr *repr)
 {
-	struct ice_dcf_adapter *dcf_adapter =
-			repr->dcf_eth_dev->data->dev_private;
+	struct ice_dcf_adapter *dcf_adapter;
 
-	if (!dcf_adapter) {
+	if (!repr->dcf_valid) {
 		PMD_DRV_LOG(ERR, "DCF for VF representor has been released\n");
 		return NULL;
 	}
 
+	dcf_adapter = repr->dcf_eth_dev->data->dev_private;
+
 	return &dcf_adapter->real_hw;
 }
 
@@ -414,6 +436,7 @@ ice_dcf_vf_repr_init(struct rte_eth_dev *vf_rep_eth_dev, void *init_param)
 	repr->dcf_eth_dev = param->dcf_eth_dev;
 	repr->switch_domain_id = param->switch_domain_id;
 	repr->vf_id = param->vf_id;
+	repr->dcf_valid = true;
 	repr->outer_vlan_info.port_vlan_ena = false;
 	repr->outer_vlan_info.stripping_ena = false;
 	repr->outer_vlan_info.tpid = RTE_ETHER_TYPE_VLAN;
@@ -488,3 +511,25 @@ ice_dcf_vf_repr_stop_all(struct ice_dcf_adapter *dcf_adapter)
 			vf_rep_eth_dev->data->dev_started = 0;
 	}
 }
+
+void
+ice_dcf_vf_repr_notify_all(struct ice_dcf_adapter *dcf_adapter, bool valid)
+{
+	uint16_t vf_id;
+	int err;
+	struct rte_eth_dev *vf_rep_eth_dev;
+
+	if (!dcf_adapter->repr_infos)
+		return;
+
+	for (vf_id = 0; vf_id < dcf_adapter->real_hw.num_vfs; vf_id++) {
+		vf_rep_eth_dev = dcf_adapter->repr_infos[vf_id].vf_rep_eth_dev;
+
+		if (!vf_rep_eth_dev)
+			continue;
+
+		err = ice_dcf_vf_repr_set_dcf_valid(vf_rep_eth_dev, valid);
+		if (err)
+			PMD_DRV_LOG(ERR, "Failed to notify VF representor: %d", vf_id);
+	}
+}
-- 
2.25.1


^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: [PATCH v6] net/ice: fix crash on closing representor ports
  2023-11-07 10:12         ` [PATCH v6] " Mingjin Ye
@ 2023-11-07 12:18           ` Zhang, Qi Z
  2023-11-08 11:39           ` [PATCH v7] " Mingjin Ye
  1 sibling, 0 replies; 17+ messages in thread
From: Zhang, Qi Z @ 2023-11-07 12:18 UTC (permalink / raw)
  To: Ye, MingjinX, dev; +Cc: Yang, Qiming, stable



> -----Original Message-----
> From: Ye, MingjinX <mingjinx.ye@intel.com>
> Sent: Tuesday, November 7, 2023 6:12 PM
> To: dev@dpdk.org
> Cc: Yang, Qiming <qiming.yang@intel.com>; Ye, MingjinX
> <mingjinx.ye@intel.com>; stable@dpdk.org; Zhang, Qi Z
> <qi.z.zhang@intel.com>
> Subject: [PATCH v6] net/ice: fix crash on closing representor ports
> 
> The data resource in struct rte_eth_dev is cleared and points to NULL when
> the DCF port is closed.
> 
> If the DCF representor port is closed after the DCF port is closed, a
> segmentation fault occurs because the representor port accesses the data
> resource released by the DCF port.
> 
> This patch fixes this issue by synchronizing the state of DCF ports and
> representor ports to the peer in real time when their state changes.
> 
> Fixes: da9cdcd1f372 ("net/ice: fix crash on representor port closing")

The fixline still not make sense, the issue should be already exist before above patch.

> Cc: stable@dpdk.org
> 
> Signed-off-by: Mingjin Ye <mingjinx.ye@intel.com>
> ---
> v2: Reformat code to remove unneeded fixlines.
> ---
> v3: New solution.
> ---
> v4: Optimize v2 patch.
> ---
> v5: optimization.
> ---
> v6: Optimize and resolve conflicts.
> ---
>  drivers/net/ice/ice_dcf_ethdev.c         | 30 ++++++++++++--
>  drivers/net/ice/ice_dcf_ethdev.h         |  3 ++
>  drivers/net/ice/ice_dcf_vf_representor.c | 51 ++++++++++++++++++++++--
>  3 files changed, 78 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/net/ice/ice_dcf_ethdev.c b/drivers/net/ice/ice_dcf_ethdev.c
> index 29699c2c32..5d845bba31 100644
> --- a/drivers/net/ice/ice_dcf_ethdev.c
> +++ b/drivers/net/ice/ice_dcf_ethdev.c
> @@ -1618,6 +1618,26 @@ ice_dcf_free_repr_info(struct ice_dcf_adapter
> *dcf_adapter)
>  	}
>  }
> 
> +int
> +ice_dcf_handle_vf_repr_close(struct ice_dcf_adapter *dcf_adapter,
> +				uint16_t vf_id)
> +{
> +	struct ice_dcf_repr_info *vf_rep_info;
> +
> +	if (dcf_adapter->num_reprs >= vf_id) {
> +		PMD_DRV_LOG(ERR, "Invalid VF id: %d", vf_id);
> +		return -1;
> +	}
> +
> +	if (!dcf_adapter->repr_infos)
> +		return 0;
> +
> +	vf_rep_info = &dcf_adapter->repr_infos[vf_id];
> +	vf_rep_info->vf_rep_eth_dev = NULL;
> +
> +	return 0;
> +}
> +
>  static int
>  ice_dcf_init_repr_info(struct ice_dcf_adapter *dcf_adapter)  { @@ -1641,11
> +1661,10 @@ ice_dcf_dev_close(struct rte_eth_dev *dev)
>  	if (rte_eal_process_type() != RTE_PROC_PRIMARY)
>  		return 0;
> 
> +	ice_dcf_vf_repr_notify_all(adapter, false);
>  	(void)ice_dcf_dev_stop(dev);
> 
>  	ice_free_queues(dev);
> -
> -	ice_dcf_free_repr_info(adapter);
>  	ice_dcf_uninit_parent_adapter(dev);
>  	ice_dcf_uninit_hw(dev, &adapter->real_hw);
> 
> @@ -1835,7 +1854,7 @@ ice_dcf_dev_reset(struct rte_eth_dev *dev)
>  		ice_dcf_reset_hw(dev, hw);
>  	}
> 
> -	ret = ice_dcf_dev_uninit(dev);
> +	ret = ice_dcf_dev_close(dev);
>  	if (ret)
>  		return ret;
> 
> @@ -1940,12 +1959,17 @@ ice_dcf_dev_init(struct rte_eth_dev *eth_dev)
>  	ice_dcf_stats_reset(eth_dev);
> 
>  	dcf_config_promisc(adapter, false, false);
> +	ice_dcf_vf_repr_notify_all(adapter, true);
> +
>  	return 0;
>  }
> 
>  static int
>  ice_dcf_dev_uninit(struct rte_eth_dev *eth_dev)  {
> +	struct ice_dcf_adapter *adapter = eth_dev->data->dev_private;
> +
> +	ice_dcf_free_repr_info(adapter);
>  	ice_dcf_dev_close(eth_dev);
> 
>  	return 0;
> diff --git a/drivers/net/ice/ice_dcf_ethdev.h
> b/drivers/net/ice/ice_dcf_ethdev.h
> index 4baaec4b8b..6dcbaac5eb 100644
> --- a/drivers/net/ice/ice_dcf_ethdev.h
> +++ b/drivers/net/ice/ice_dcf_ethdev.h
> @@ -60,6 +60,7 @@ struct ice_dcf_vf_repr {
>  	struct rte_ether_addr mac_addr;
>  	uint16_t switch_domain_id;
>  	uint16_t vf_id;
> +	bool dcf_valid;
> 
>  	struct ice_dcf_vlan outer_vlan_info; /* DCF always handle outer VLAN
> */  }; @@ -80,6 +81,8 @@ int ice_dcf_vf_repr_init(struct rte_eth_dev
> *vf_rep_eth_dev, void *init_param);  int ice_dcf_vf_repr_uninit(struct
> rte_eth_dev *vf_rep_eth_dev);  int ice_dcf_vf_repr_init_vlan(struct
> rte_eth_dev *vf_rep_eth_dev);  void ice_dcf_vf_repr_stop_all(struct
> ice_dcf_adapter *dcf_adapter);
> +void ice_dcf_vf_repr_notify_all(struct ice_dcf_adapter *dcf_adapter,
> +bool valid); int ice_dcf_handle_vf_repr_close(struct ice_dcf_adapter
> +*dcf_adapter, uint16_t vf_id);
>  bool ice_dcf_adminq_need_retry(struct ice_adapter *ad);
> 
>  #endif /* _ICE_DCF_ETHDEV_H_ */
> diff --git a/drivers/net/ice/ice_dcf_vf_representor.c
> b/drivers/net/ice/ice_dcf_vf_representor.c
> index b9fcfc80ad..00dc322b30 100644
> --- a/drivers/net/ice/ice_dcf_vf_representor.c
> +++ b/drivers/net/ice/ice_dcf_vf_representor.c
> @@ -50,9 +50,30 @@ ice_dcf_vf_repr_dev_stop(struct rte_eth_dev *dev)
>  	return 0;
>  }
> 
> +static int
> +ice_dcf_vf_repr_set_dcf_valid(struct rte_eth_dev *dev, bool valid) {
> +	struct ice_dcf_vf_repr *repr = dev->data->dev_private;
> +
> +	repr->dcf_valid = valid;
> +
> +	return 0;
> +}
> +
>  static int
>  ice_dcf_vf_repr_dev_close(struct rte_eth_dev *dev)  {
> +	struct ice_dcf_vf_repr *repr = dev->data->dev_private;
> +	struct ice_dcf_adapter *dcf_adapter;
> +	int err;
> +
> +	if (repr->dcf_valid) {
> +		dcf_adapter = repr->dcf_eth_dev->data->dev_private;
> +		err = ice_dcf_handle_vf_repr_close(dcf_adapter, repr->vf_id);
> +		if (err)
> +			PMD_DRV_LOG(ERR, "VF representor invalid");
> +	}
> +
>  	return ice_dcf_vf_repr_uninit(dev);
>  }
> 
> @@ -111,14 +132,15 @@ ice_dcf_vf_repr_link_update(__rte_unused struct
> rte_eth_dev *ethdev,  static __rte_always_inline struct ice_dcf_hw *
> ice_dcf_vf_repr_hw(struct ice_dcf_vf_repr *repr)  {
> -	struct ice_dcf_adapter *dcf_adapter =
> -			repr->dcf_eth_dev->data->dev_private;
> +	struct ice_dcf_adapter *dcf_adapter;
> 
> -	if (!dcf_adapter) {
> +	if (!repr->dcf_valid) {
>  		PMD_DRV_LOG(ERR, "DCF for VF representor has been
> released\n");
>  		return NULL;
>  	}
> 
> +	dcf_adapter = repr->dcf_eth_dev->data->dev_private;
> +
>  	return &dcf_adapter->real_hw;
>  }
> 
> @@ -414,6 +436,7 @@ ice_dcf_vf_repr_init(struct rte_eth_dev
> *vf_rep_eth_dev, void *init_param)
>  	repr->dcf_eth_dev = param->dcf_eth_dev;
>  	repr->switch_domain_id = param->switch_domain_id;
>  	repr->vf_id = param->vf_id;
> +	repr->dcf_valid = true;
>  	repr->outer_vlan_info.port_vlan_ena = false;
>  	repr->outer_vlan_info.stripping_ena = false;
>  	repr->outer_vlan_info.tpid = RTE_ETHER_TYPE_VLAN; @@ -488,3
> +511,25 @@ ice_dcf_vf_repr_stop_all(struct ice_dcf_adapter *dcf_adapter)
>  			vf_rep_eth_dev->data->dev_started = 0;
>  	}
>  }
> +
> +void
> +ice_dcf_vf_repr_notify_all(struct ice_dcf_adapter *dcf_adapter, bool
> +valid) {
> +	uint16_t vf_id;
> +	int err;
> +	struct rte_eth_dev *vf_rep_eth_dev;
> +
> +	if (!dcf_adapter->repr_infos)
> +		return;
> +
> +	for (vf_id = 0; vf_id < dcf_adapter->real_hw.num_vfs; vf_id++) {
> +		vf_rep_eth_dev = dcf_adapter-
> >repr_infos[vf_id].vf_rep_eth_dev;
> +
> +		if (!vf_rep_eth_dev)
> +			continue;
> +
> +		err = ice_dcf_vf_repr_set_dcf_valid(vf_rep_eth_dev, valid);

Better to rename ice_dcf_vf_repr_set_dcf_valid  to ice_dcf_vf_repr_notify_one make it more readable in ice_dcf_vf_repr_notify_all

And it's not necessary to check the return value,  it's a internal function not an API, you can add it when it is necessary.



> +		if (err)
> +			PMD_DRV_LOG(ERR, "Failed to notify VF
> representor: %d", vf_id);
> +	}
> +}
> --
> 2.25.1


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v7] net/ice: fix crash on closing representor ports
  2023-11-07 10:12         ` [PATCH v6] " Mingjin Ye
  2023-11-07 12:18           ` Zhang, Qi Z
@ 2023-11-08 11:39           ` Mingjin Ye
  2023-11-09  0:26             ` Zhang, Qi Z
  1 sibling, 1 reply; 17+ messages in thread
From: Mingjin Ye @ 2023-11-08 11:39 UTC (permalink / raw)
  To: dev; +Cc: qiming.yang, yidingx.zhou, Mingjin Ye, stable, Qi Zhang

The data resource in struct rte_eth_dev is cleared and points to NULL
when the DCF port is closed.

If the DCF representor port is closed after the DCF port is closed,
a segmentation fault occurs because the representor port accesses the
data resource released by the DCF port.

This patch fixes this issue by synchronizing the state of DCF ports and
representor ports to the peer in real time when their state changes.

Fixes: c7e1a1a3bfeb ("net/ice: refactor DCF VLAN handling")
Cc: stable@dpdk.org

Signed-off-by: Mingjin Ye <mingjinx.ye@intel.com>
---
v2: Reformat code to remove unneeded fixlines.
---
v3: New solution.
---
v4: Optimize v2 patch.
---
v5: optimization.
---
v6: Optimize and resolve conflicts.
---
v7: optimization.
---
 drivers/net/ice/ice_dcf_ethdev.c         | 30 ++++++++++++++--
 drivers/net/ice/ice_dcf_ethdev.h         |  3 ++
 drivers/net/ice/ice_dcf_vf_representor.c | 46 ++++++++++++++++++++++--
 3 files changed, 73 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ice/ice_dcf_ethdev.c b/drivers/net/ice/ice_dcf_ethdev.c
index 29699c2c32..5d845bba31 100644
--- a/drivers/net/ice/ice_dcf_ethdev.c
+++ b/drivers/net/ice/ice_dcf_ethdev.c
@@ -1618,6 +1618,26 @@ ice_dcf_free_repr_info(struct ice_dcf_adapter *dcf_adapter)
 	}
 }
 
+int
+ice_dcf_handle_vf_repr_close(struct ice_dcf_adapter *dcf_adapter,
+				uint16_t vf_id)
+{
+	struct ice_dcf_repr_info *vf_rep_info;
+
+	if (dcf_adapter->num_reprs >= vf_id) {
+		PMD_DRV_LOG(ERR, "Invalid VF id: %d", vf_id);
+		return -1;
+	}
+
+	if (!dcf_adapter->repr_infos)
+		return 0;
+
+	vf_rep_info = &dcf_adapter->repr_infos[vf_id];
+	vf_rep_info->vf_rep_eth_dev = NULL;
+
+	return 0;
+}
+
 static int
 ice_dcf_init_repr_info(struct ice_dcf_adapter *dcf_adapter)
 {
@@ -1641,11 +1661,10 @@ ice_dcf_dev_close(struct rte_eth_dev *dev)
 	if (rte_eal_process_type() != RTE_PROC_PRIMARY)
 		return 0;
 
+	ice_dcf_vf_repr_notify_all(adapter, false);
 	(void)ice_dcf_dev_stop(dev);
 
 	ice_free_queues(dev);
-
-	ice_dcf_free_repr_info(adapter);
 	ice_dcf_uninit_parent_adapter(dev);
 	ice_dcf_uninit_hw(dev, &adapter->real_hw);
 
@@ -1835,7 +1854,7 @@ ice_dcf_dev_reset(struct rte_eth_dev *dev)
 		ice_dcf_reset_hw(dev, hw);
 	}
 
-	ret = ice_dcf_dev_uninit(dev);
+	ret = ice_dcf_dev_close(dev);
 	if (ret)
 		return ret;
 
@@ -1940,12 +1959,17 @@ ice_dcf_dev_init(struct rte_eth_dev *eth_dev)
 	ice_dcf_stats_reset(eth_dev);
 
 	dcf_config_promisc(adapter, false, false);
+	ice_dcf_vf_repr_notify_all(adapter, true);
+
 	return 0;
 }
 
 static int
 ice_dcf_dev_uninit(struct rte_eth_dev *eth_dev)
 {
+	struct ice_dcf_adapter *adapter = eth_dev->data->dev_private;
+
+	ice_dcf_free_repr_info(adapter);
 	ice_dcf_dev_close(eth_dev);
 
 	return 0;
diff --git a/drivers/net/ice/ice_dcf_ethdev.h b/drivers/net/ice/ice_dcf_ethdev.h
index 4baaec4b8b..6dcbaac5eb 100644
--- a/drivers/net/ice/ice_dcf_ethdev.h
+++ b/drivers/net/ice/ice_dcf_ethdev.h
@@ -60,6 +60,7 @@ struct ice_dcf_vf_repr {
 	struct rte_ether_addr mac_addr;
 	uint16_t switch_domain_id;
 	uint16_t vf_id;
+	bool dcf_valid;
 
 	struct ice_dcf_vlan outer_vlan_info; /* DCF always handle outer VLAN */
 };
@@ -80,6 +81,8 @@ int ice_dcf_vf_repr_init(struct rte_eth_dev *vf_rep_eth_dev, void *init_param);
 int ice_dcf_vf_repr_uninit(struct rte_eth_dev *vf_rep_eth_dev);
 int ice_dcf_vf_repr_init_vlan(struct rte_eth_dev *vf_rep_eth_dev);
 void ice_dcf_vf_repr_stop_all(struct ice_dcf_adapter *dcf_adapter);
+void ice_dcf_vf_repr_notify_all(struct ice_dcf_adapter *dcf_adapter, bool valid);
+int ice_dcf_handle_vf_repr_close(struct ice_dcf_adapter *dcf_adapter, uint16_t vf_id);
 bool ice_dcf_adminq_need_retry(struct ice_adapter *ad);
 
 #endif /* _ICE_DCF_ETHDEV_H_ */
diff --git a/drivers/net/ice/ice_dcf_vf_representor.c b/drivers/net/ice/ice_dcf_vf_representor.c
index b9fcfc80ad..af281f069a 100644
--- a/drivers/net/ice/ice_dcf_vf_representor.c
+++ b/drivers/net/ice/ice_dcf_vf_representor.c
@@ -50,9 +50,28 @@ ice_dcf_vf_repr_dev_stop(struct rte_eth_dev *dev)
 	return 0;
 }
 
+static void
+ice_dcf_vf_repr_notify_one(struct rte_eth_dev *dev, bool valid)
+{
+	struct ice_dcf_vf_repr *repr = dev->data->dev_private;
+
+	repr->dcf_valid = valid;
+}
+
 static int
 ice_dcf_vf_repr_dev_close(struct rte_eth_dev *dev)
 {
+	struct ice_dcf_vf_repr *repr = dev->data->dev_private;
+	struct ice_dcf_adapter *dcf_adapter;
+	int err;
+
+	if (repr->dcf_valid) {
+		dcf_adapter = repr->dcf_eth_dev->data->dev_private;
+		err = ice_dcf_handle_vf_repr_close(dcf_adapter, repr->vf_id);
+		if (err)
+			PMD_DRV_LOG(ERR, "VF representor invalid");
+	}
+
 	return ice_dcf_vf_repr_uninit(dev);
 }
 
@@ -111,14 +130,15 @@ ice_dcf_vf_repr_link_update(__rte_unused struct rte_eth_dev *ethdev,
 static __rte_always_inline struct ice_dcf_hw *
 ice_dcf_vf_repr_hw(struct ice_dcf_vf_repr *repr)
 {
-	struct ice_dcf_adapter *dcf_adapter =
-			repr->dcf_eth_dev->data->dev_private;
+	struct ice_dcf_adapter *dcf_adapter;
 
-	if (!dcf_adapter) {
+	if (!repr->dcf_valid) {
 		PMD_DRV_LOG(ERR, "DCF for VF representor has been released\n");
 		return NULL;
 	}
 
+	dcf_adapter = repr->dcf_eth_dev->data->dev_private;
+
 	return &dcf_adapter->real_hw;
 }
 
@@ -414,6 +434,7 @@ ice_dcf_vf_repr_init(struct rte_eth_dev *vf_rep_eth_dev, void *init_param)
 	repr->dcf_eth_dev = param->dcf_eth_dev;
 	repr->switch_domain_id = param->switch_domain_id;
 	repr->vf_id = param->vf_id;
+	repr->dcf_valid = true;
 	repr->outer_vlan_info.port_vlan_ena = false;
 	repr->outer_vlan_info.stripping_ena = false;
 	repr->outer_vlan_info.tpid = RTE_ETHER_TYPE_VLAN;
@@ -488,3 +509,22 @@ ice_dcf_vf_repr_stop_all(struct ice_dcf_adapter *dcf_adapter)
 			vf_rep_eth_dev->data->dev_started = 0;
 	}
 }
+
+void
+ice_dcf_vf_repr_notify_all(struct ice_dcf_adapter *dcf_adapter, bool valid)
+{
+	uint16_t vf_id;
+	struct rte_eth_dev *vf_rep_eth_dev;
+
+	if (!dcf_adapter->repr_infos)
+		return;
+
+	for (vf_id = 0; vf_id < dcf_adapter->real_hw.num_vfs; vf_id++) {
+		vf_rep_eth_dev = dcf_adapter->repr_infos[vf_id].vf_rep_eth_dev;
+
+		if (!vf_rep_eth_dev)
+			continue;
+
+		ice_dcf_vf_repr_notify_one(vf_rep_eth_dev, valid);
+	}
+}
-- 
2.25.1


^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: [PATCH v7] net/ice: fix crash on closing representor ports
  2023-11-08 11:39           ` [PATCH v7] " Mingjin Ye
@ 2023-11-09  0:26             ` Zhang, Qi Z
  0 siblings, 0 replies; 17+ messages in thread
From: Zhang, Qi Z @ 2023-11-09  0:26 UTC (permalink / raw)
  To: Ye, MingjinX, dev; +Cc: Yang, Qiming, Zhou, YidingX, stable



> -----Original Message-----
> From: Ye, MingjinX <mingjinx.ye@intel.com>
> Sent: Wednesday, November 8, 2023 7:40 PM
> To: dev@dpdk.org
> Cc: Yang, Qiming <qiming.yang@intel.com>; Zhou, YidingX
> <yidingx.zhou@intel.com>; Ye, MingjinX <mingjinx.ye@intel.com>;
> stable@dpdk.org; Zhang, Qi Z <qi.z.zhang@intel.com>
> Subject: [PATCH v7] net/ice: fix crash on closing representor ports
> 
> The data resource in struct rte_eth_dev is cleared and points to NULL when
> the DCF port is closed.
> 
> If the DCF representor port is closed after the DCF port is closed, a
> segmentation fault occurs because the representor port accesses the data
> resource released by the DCF port.
> 
> This patch fixes this issue by synchronizing the state of DCF ports and
> representor ports to the peer in real time when their state changes.
> 
> Fixes: c7e1a1a3bfeb ("net/ice: refactor DCF VLAN handling")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Mingjin Ye <mingjinx.ye@intel.com>

Acked-by: Qi Zhang <qi.z.zhang@intel.com>

Applied to dpdk-next-net-intel.

Thanks
Qi


^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2023-11-09  0:26 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-10-26  9:51 [PATCH] net/iavf: fix crash on closing representor ports Mingjin Ye
2023-10-27  7:17 ` David Marchand
2023-10-30  2:35 ` [PATCH v2] " Mingjin Ye
2023-10-30  8:44 ` [PATCH v2] net/ice: " Mingjin Ye
2023-11-01  1:11   ` Zhang, Qi Z
2023-11-01 10:13   ` [PATCH v3] " Mingjin Ye
2023-11-01 10:48     ` Zhang, Qi Z
2023-11-02  2:38       ` Ye, MingjinX
2023-11-02  3:59         ` Zhang, Qi Z
2023-11-02 10:11     ` [PATCH v4] " Mingjin Ye
2023-11-06  1:23       ` Zhang, Qi Z
2023-11-06 10:00       ` [PATCH v5] " Mingjin Ye
2023-11-06 12:05         ` Zhang, Qi Z
2023-11-07 10:12         ` [PATCH v6] " Mingjin Ye
2023-11-07 12:18           ` Zhang, Qi Z
2023-11-08 11:39           ` [PATCH v7] " Mingjin Ye
2023-11-09  0:26             ` Zhang, Qi Z

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).