From: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
To: Konstantin Ananyev <konstantin.ananyev@huawei.com>,
Fengchengwen <fengchengwen@huawei.com>,
Stephen Hemminger <stephen@networkplumber.org>,
Ruifeng Wang <Ruifeng.Wang@arm.com>,
"Ajit Khaparde (ajit.khaparde@broadcom.com)"
<ajit.khaparde@broadcom.com>
Cc: Ashok Kaladi <ashok.k.kaladi@intel.com>,
"jerinj@marvell.com" <jerinj@marvell.com>,
"thomas@monjalon.net" <thomas@monjalon.net>,
"dev@dpdk.org" <dev@dpdk.org>,
"s.v.naga.harish.k@intel.com" <s.v.naga.harish.k@intel.com>,
"erik.g.carrillo@intel.com" <erik.g.carrillo@intel.com>,
"abhinandan.gujjar@intel.com" <abhinandan.gujjar@intel.com>,
"stable@dpdk.org" <stable@dpdk.org>, nd <nd@arm.com>
Subject: RE: [PATCH 2/2] ethdev: fix race condition in fast-path ops setup
Date: Thu, 23 Feb 2023 04:40:02 +0000
Message-ID: <DBAPR08MB5814AB1090C2C968D83C0BE098AB9@DBAPR08MB5814.eurprd08.prod.outlook.com>
In-Reply-To: <f90cb5d9703d41ad9a8ddf77afe21853@huawei.com>
<snip>
> > >>>
> > >>> On 2023/2/20 14:08, Ashok Kaladi wrote:
> > >>>> If ethdev enqueue or dequeue function is called during
> > >>>> eth_dev_fp_ops_setup(), it may get pre-empted after setting the
> > >>>> function pointers, but before setting the pointer to port data.
> > >>>> In this case the newly registered enqueue/dequeue function will
> > >>>> use dummy port data and end up in seg fault.
> > >>>>
> > >>>> This patch moves the update of each data pointer before the
> > >>>> update of the corresponding function pointer.
> > >>>>
> > >>>> Fixes: c87d435a4d79 ("ethdev: copy fast-path API into separate
> > >>>> structure")
> > >>>> Cc: stable@dpdk.org
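
To make the ordering described in the commit message concrete, here is a minimal sketch of the publish pattern using C11 atomics. The names (struct fp_ops, fp_ops_init, fp_ops_setup, rx_burst and the two handlers) are hypothetical stand-ins, not the real rte_eth_fp_ops code, and the sketch assumes the data path loads the function pointer before the data pointer. The data pointer is written first and the function pointer is then published with a release store, so a reader that observes the new function pointer through an acquire load also observes the matching data.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; these are NOT the real rte_eth_fp_ops types. */
typedef uint16_t (*burst_fn)(void *queue_data, uint16_t nb_pkts);

struct fp_ops {
	_Atomic(burst_fn) rx_burst; /* function pointer read by the data path */
	void *rxq_data;             /* data the function pointer operates on */
};

/* Dummy handler used while the port is not started: ignores its data. */
static uint16_t dummy_rx_burst(void *queue_data, uint16_t nb_pkts)
{
	(void)queue_data;
	(void)nb_pkts;
	return 0;
}

/* "Real" handler: dereferences its data, so it must never see dummy data. */
static uint16_t real_rx_burst(void *queue_data, uint16_t nb_pkts)
{
	uint16_t *avail = queue_data;

	assert(avail != NULL);
	return nb_pkts < *avail ? nb_pkts : *avail;
}

/* Single-threaded initialisation, before any data-path thread runs. */
static void fp_ops_init(struct fp_ops *ops)
{
	ops->rxq_data = NULL;
	atomic_init(&ops->rx_burst, dummy_rx_burst);
}

/* Control path: write the data first, then publish the function pointer. */
static void fp_ops_setup(struct fp_ops *ops, void *rxq_data)
{
	ops->rxq_data = rxq_data;
	atomic_store_explicit(&ops->rx_burst, real_rx_burst,
			      memory_order_release);
}

/* Data path: acquire-load the function pointer, then read the data. */
static uint16_t rx_burst(struct fp_ops *ops, uint16_t nb_pkts)
{
	burst_fn fn = atomic_load_explicit(&ops->rx_burst,
					   memory_order_acquire);

	return fn(ops->rxq_data, nb_pkts);
}
```

Note that this only covers setup. It does not help a data-path call that is already running when the pointers are reset on stop, which is part of why the wider synchronization question in this thread matters.
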
> > >
> > > > Why is something calling enqueue/dequeue when the device is not
> > > > fully started?
> > > A correctly written application would not call rx/tx burst until
> > > after ethdev start had finished.
> >
> > Please refer to eb0d471a894 (ethdev: add proactive error handling
> > mode): while the driver is recovering itself, the application may still
> > invoke the enqueue/dequeue API.
>
> Right now DPDK ethdev layer *does not* provide synchronization
> mechanisms between data-path and control-path functions.
> That was a deliberate design choice. If we want to change that rule, then I
> suppose we need a community consensus for it.
> I think that if the driver wants to provide some sort of error recovery
> procedure, then it has to provide some synchronization mechanism inside it
> between data-path and control-path functions.
> Actually looking at eb0d471a894 (ethdev: add proactive error handling
> mode) and the following patches, I wonder how it crept in.
> It seems we just introduced a loophole for race condition with this
> approach...
> It probably needs to be either deprecated or reworked.
Looking at the commit, it does not say anything about the data plane functions, which probably means the error recovery happens within the data plane thread. What happens to other data plane threads that are polling the same port while the error recovery is in progress?
Also, the commit log says that the application should not call any control plane APIs while the error recovery is in progress. Does that mean the application has to check for the error condition every time it calls a control plane API?
The commit message also says that "PMD makes sure the control path operations failed with retcode -EBUSY", but it does not say how. Any communication from the PMD thread to the control plane thread may introduce race conditions if not done correctly.
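
As an illustration of what "done correctly" could look like, here is a minimal sketch in which recovery and control path operations contend for a single atomic flag, so the -EBUSY check and the operation itself cannot be split by a recovery that starts in between. Everything here (struct pmd_port, pmd_ctrl_set_mtu, pmd_recovery_begin/end) is hypothetical and is not taken from the commit or from any PMD. A plain "if (recovering) return -EBUSY" test followed by the operation would not be enough, because recovery could begin right after the test.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-port driver state; not taken from any real PMD. */
struct pmd_port {
	atomic_flag busy; /* held by recovery or by one control-path call */
	int mtu;          /* example of state changed by the control path */
};

static void pmd_port_init(struct pmd_port *p, int mtu)
{
	assert(p != NULL);
	*p = (struct pmd_port){ .busy = ATOMIC_FLAG_INIT, .mtu = mtu };
}

/* PMD side: returns true if recovery now owns the port. */
static bool pmd_recovery_begin(struct pmd_port *p)
{
	return !atomic_flag_test_and_set_explicit(&p->busy,
						  memory_order_acquire);
}

static void pmd_recovery_end(struct pmd_port *p)
{
	atomic_flag_clear_explicit(&p->busy, memory_order_release);
}

/*
 * Control path: take the flag for the duration of the operation.
 * If recovery (or another control operation) holds it, fail with -EBUSY
 * without touching the port state.
 */
static int pmd_ctrl_set_mtu(struct pmd_port *p, int mtu)
{
	if (atomic_flag_test_and_set_explicit(&p->busy, memory_order_acquire))
		return -EBUSY;

	p->mtu = mtu;
	atomic_flag_clear_explicit(&p->busy, memory_order_release);
	return 0;
}
```

With this shape the application only needs to handle -EBUSY from the control plane call itself. It says nothing about data plane threads polling the same port, which would need their own mechanism.
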
>
> >
> > >
> > > Would something like this work better?
> > >
> > > Note: there is another bug in the current code. The check for link
> > > state interrupt and link_ops could return -ENOTSUP and leave the
> > > device in an indeterminate state.
> > > The check should be done before calling PMD.
> > >
> > > diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c
> > > index 0266cc82acb6..d6c163ed85e7 100644
> > > --- a/lib/ethdev/rte_ethdev.c
> > > +++ b/lib/ethdev/rte_ethdev.c
> > > @@ -1582,6 +1582,14 @@ rte_eth_dev_start(uint16_t port_id)
> > > return 0;
> > > }
> > >
> > > + if (dev->data->dev_conf.intr_conf.lsc == 0 &&
> > > + dev->dev_ops->link_update == NULL) {
> > > + RTE_ETHDEV_LOG(INFO,
> > > + "Device with port_id=%"PRIu16" link update not supported\n",
> > > + port_id);
> > > + return -ENOTSUP;
> > > + }
> > > +
> > > ret = rte_eth_dev_info_get(port_id, &dev_info);
> > > if (ret != 0)
> > > return ret;
> > > @@ -1591,9 +1599,7 @@ rte_eth_dev_start(uint16_t port_id)
> > > eth_dev_mac_restore(dev, &dev_info);
> > >
> > > diag = (*dev->dev_ops->dev_start)(dev);
> > > - if (diag == 0)
> > > - dev->data->dev_started = 1;
> > > - else
> > > + if (diag != 0)
> > > return eth_err(port_id, diag);
> > >
> > > ret = eth_dev_config_restore(dev, &dev_info, port_id);
> > > @@ -1611,16 +1617,18 @@ rte_eth_dev_start(uint16_t port_id)
> > > return ret;
> > > }
> > >
> > > - if (dev->data->dev_conf.intr_conf.lsc == 0) {
> > > - if (*dev->dev_ops->link_update == NULL)
> > > - return -ENOTSUP;
> > > - (*dev->dev_ops->link_update)(dev, 0);
> > > - }
> > > -
> > > /* expose selection of PMD fast-path functions */
> > > eth_dev_fp_ops_setup(rte_eth_fp_ops + port_id, dev);
> > >
> > > + /* ensure state is set before marking device ready */
> > > + rte_smp_wmb();
> > > +
> > > rte_ethdev_trace_start(port_id);
> > > +
> > > + /* Update current link state */
> > > + if (dev->data->dev_conf.intr_conf.lsc == 0)
> > > + (*dev->dev_ops->link_update)(dev, 0);
> > > +
> > > return 0;
> > > }
> > >
> > >
> > > .
> > >