patches for DPDK stable branches
 help / color / mirror / Atom feed
From: "Zhang, Qi Z" <qi.z.zhang@intel.com>
To: "Ye, MingjinX" <mingjinx.ye@intel.com>, "dev@dpdk.org" <dev@dpdk.org>
Cc: "Yang, Qiming" <qiming.yang@intel.com>,
	"Zhou, YidingX" <yidingx.zhou@intel.com>,
	"stable@dpdk.org" <stable@dpdk.org>
Subject: RE: [PATCH v3] net/ice: fix crash on closing representor ports
Date: Thu, 2 Nov 2023 03:59:21 +0000	[thread overview]
Message-ID: <DM4PR11MB59940F596F1E6F481D51A305D7A6A@DM4PR11MB5994.namprd11.prod.outlook.com> (raw)
In-Reply-To: <SN7PR11MB7139CC496AF0A0B64CC17D5DE5A6A@SN7PR11MB7139.namprd11.prod.outlook.com>



> -----Original Message-----
> From: Ye, MingjinX <mingjinx.ye@intel.com>
> Sent: Thursday, November 2, 2023 10:39 AM
> To: Zhang, Qi Z <qi.z.zhang@intel.com>; dev@dpdk.org
> Cc: Yang, Qiming <qiming.yang@intel.com>; Zhou, YidingX
> <yidingx.zhou@intel.com>; stable@dpdk.org
> Subject: RE: [PATCH v3] net/ice: fix crash on closing representor ports
> 
> 
> 
> > -----Original Message-----
> > From: Zhang, Qi Z <qi.z.zhang@intel.com>
> > Sent: Wednesday, November 1, 2023 6:49 PM
> > To: Ye, MingjinX <mingjinx.ye@intel.com>; dev@dpdk.org
> > Cc: Yang, Qiming <qiming.yang@intel.com>; Zhou, YidingX
> > <yidingx.zhou@intel.com>; stable@dpdk.org
> > Subject: RE: [PATCH v3] net/ice: fix crash on closing representor
> > ports
> >
> >
> >
> > > -----Original Message-----
> > > From: Ye, MingjinX <mingjinx.ye@intel.com>
> > > Sent: Wednesday, November 1, 2023 6:14 PM
> > > To: dev@dpdk.org
> > > Cc: Yang, Qiming <qiming.yang@intel.com>; Zhou, YidingX
> > > <yidingx.zhou@intel.com>; Ye, MingjinX <mingjinx.ye@intel.com>;
> > > stable@dpdk.org; Zhang, Qi Z <qi.z.zhang@intel.com>
> > > Subject: [PATCH v3] net/ice: fix crash on closing representor ports
> > >
> > > The data resource in struct rte_eth_dev is cleared and points to
> > > NULL when the DCF port is closed.
> > >
> > > If the DCF representor port is closed after the DCF port is closed,
> > > a segmentation fault occurs because the representor port accesses
> > > the data resource released by the DCF port.
> > >
> > > This patch checks if the resource is present before accessing.
> > >
> > > Fixes: 5674465a32c8 ("net/ice: add DCF VLAN handling")
> > > Fixes: da9cdcd1f372 ("net/ice: fix crash on representor port
> > > closing")
> > > Cc: stable@dpdk.org
> > >
> > > Signed-off-by: Mingjin Ye <mingjinx.ye@intel.com>
> > > ---
> > > v3: New solution.
> > > ---
> > >  drivers/net/ice/ice_dcf_vf_representor.c | 8 +++++---
> > >  1 file changed, 5 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/drivers/net/ice/ice_dcf_vf_representor.c
> > > b/drivers/net/ice/ice_dcf_vf_representor.c
> > > index b9fcfc80ad..8c45e28f02 100644
> > > --- a/drivers/net/ice/ice_dcf_vf_representor.c
> > > +++ b/drivers/net/ice/ice_dcf_vf_representor.c
> > > @@ -111,14 +111,16 @@ ice_dcf_vf_repr_link_update(__rte_unused
> > struct
> > > rte_eth_dev *ethdev,  static __rte_always_inline struct ice_dcf_hw *
> > > ice_dcf_vf_repr_hw(struct ice_dcf_vf_repr *repr)  {
> > > -	struct ice_dcf_adapter *dcf_adapter =
> > > -			repr->dcf_eth_dev->data->dev_private;
> > > +	struct rte_eth_dev_data *dcf_data = repr->dcf_eth_dev->data;
> >
> > Seems this expose another issue, if dcf port already be closed, the
> > dcf_eth_dev instance could already be reused by another driver.
> Your analysis is spot on.
> 
> The reason for the issue:
> Affected by 36c46e738120c381cf663c96692454c5aa75e203 commit.
> da9cdcd1f37220e87db23993d6352637d71df25b commit does not fix the
> issue.
> 
> v2 patch:
> Notify every proxy port of DCF port status.
> 
> v3 patch:
> Based on da9cdcd1f37220e87db23993d636352637d71df25b, made
> optimization to fix the issue.
> 
> > So we can't assume dcf_eth_dev->data is NULL,  I think you can refine
> > based on v2's method, but don't update dcf_valid flag in representor
> > port's dev_stop.
> Implementation difficulties:
> 1. When the DCF is created, all associated proxy ports are created and each
> proxy port (struct rte_eth_dev) pointer is recorded in an array.
> 2. When shutting down the DCF, all all proxy ports are notified at
> ice_dcf_vf_repr_stop_all. Finally the array is deleted.
> 
> Based on the original scenario, notifying all agent ports of DCF state failure is
> the simplest and most effective way. This is very similar to the v2 patch
> scenario.

Yes, that's why I suggest to refine based on v2 after notice the issue on v3, seems we need to maintain a per repr flag for DCF status, but need to update it carefully.

> 
> Check each DCF port flag?
> If the DCF port has failed and the resource has been released. the DCF port
> status, naturally, is not available. This is very similar to the v3 patch scenario.

Check dev->data is NULL does not work, a better way is to check eth_dev->state and device name to make sure it is still used by DCF.
But I think v2's method is more reliable.


> 
> Other ideas
> 1. Re-implement DCF port and proxy port communication, can't directly use
> (struct rte_eth_dev) pointer to access the device resources on the other end.
> 
> 2. Designed to return an error when removing the proxy port directly. The
> proxy port device is managed by the control DCF (the proxy port is created by
> the DCF port and the removal work should be done by it).
> 
> The above 2 options will bring a huge amount of work. It seems beyond the
> scope of this discussion.
> 


  reply	other threads:[~2023-11-02  3:59 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-10-26  9:51 [PATCH] net/iavf: " Mingjin Ye
2023-10-27  7:17 ` David Marchand
2023-10-30  2:35 ` [PATCH v2] " Mingjin Ye
2023-10-30  8:44 ` [PATCH v2] net/ice: " Mingjin Ye
2023-11-01  1:11   ` Zhang, Qi Z
2023-11-01 10:13   ` [PATCH v3] " Mingjin Ye
2023-11-01 10:48     ` Zhang, Qi Z
2023-11-02  2:38       ` Ye, MingjinX
2023-11-02  3:59         ` Zhang, Qi Z [this message]
2023-11-02 10:11     ` [PATCH v4] " Mingjin Ye
2023-11-06  1:23       ` Zhang, Qi Z
2023-11-06 10:00       ` [PATCH v5] " Mingjin Ye
2023-11-06 12:05         ` Zhang, Qi Z
2023-11-07 10:12         ` [PATCH v6] " Mingjin Ye
2023-11-07 12:18           ` Zhang, Qi Z
2023-11-08 11:39           ` [PATCH v7] " Mingjin Ye
2023-11-09  0:26             ` Zhang, Qi Z

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=DM4PR11MB59940F596F1E6F481D51A305D7A6A@DM4PR11MB5994.namprd11.prod.outlook.com \
    --to=qi.z.zhang@intel.com \
    --cc=dev@dpdk.org \
    --cc=mingjinx.ye@intel.com \
    --cc=qiming.yang@intel.com \
    --cc=stable@dpdk.org \
    --cc=yidingx.zhou@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).