* [dpdk-dev] [PATCH 0/3] fix error path of multi-process probe @ 2019-03-02 2:42 Thomas Monjalon 2019-03-02 2:42 ` [dpdk-dev] [PATCH 1/3] eal: remove useless checks for already probed device Thomas Monjalon ` (3 more replies) 0 siblings, 4 replies; 9+ messages in thread From: Thomas Monjalon @ 2019-03-02 2:42 UTC (permalink / raw) To: dev While working on multi-process support of failsafe with Raslan, some issues have been discovered on probing failures. Thomas Monjalon (3): eal: remove useless checks for already probed device eal: remove error logs for already probed device eal: fix multi-process probe failure handling lib/librte_eal/common/eal_common_dev.c | 31 ++++++++++---------------- lib/librte_eal/common/eal_private.h | 2 +- lib/librte_eal/common/hotplug_mp.c | 14 +++++++----- 3 files changed, 21 insertions(+), 26 deletions(-) -- 2.20.1 ^ permalink raw reply [flat|nested] 9+ messages in thread
* [dpdk-dev] [PATCH 1/3] eal: remove useless checks for already probed device 2019-03-02 2:42 [dpdk-dev] [PATCH 0/3] fix error path of multi-process probe Thomas Monjalon @ 2019-03-02 2:42 ` Thomas Monjalon 2019-03-13 13:46 ` Zhang, Qi Z 2019-03-02 2:42 ` [dpdk-dev] [PATCH 2/3] eal: remove error logs " Thomas Monjalon ` (2 subsequent siblings) 3 siblings, 1 reply; 9+ messages in thread From: Thomas Monjalon @ 2019-03-02 2:42 UTC (permalink / raw) To: dev; +Cc: dariusz.stojaczyk, qi.z.zhang, stable The function eal_dev_hotplug_request_to_secondary() never returns -EEXIST result. The case of already probed device is filtered out. The test in __handle_secondary_request() was always true. The test in rte_dev_probe() was never true, and that's fine not returning -EEXIST if device is already attached in secondary processes. Fixes: 494db286f37d ("eal: fix multi-process hotplug if attached in secondary") Cc: dariusz.stojaczyk@intel.com Cc: qi.z.zhang@intel.com Cc: stable@dpdk.org Signed-off-by: Thomas Monjalon <thomas@monjalon.net> --- lib/librte_eal/common/eal_common_dev.c | 9 +-------- lib/librte_eal/common/hotplug_mp.c | 3 +-- 2 files changed, 2 insertions(+), 10 deletions(-) diff --git a/lib/librte_eal/common/eal_common_dev.c b/lib/librte_eal/common/eal_common_dev.c index fd7f5ca7d5..048c0b025f 100644 --- a/lib/librte_eal/common/eal_common_dev.c +++ b/lib/librte_eal/common/eal_common_dev.c @@ -247,18 +247,11 @@ rte_dev_probe(const char *devargs) goto rollback; } - /** - * if any secondary failed to attach, we need to consider if rollback - * is necessary. - */ + /* if any secondary failed to attach, need to rollback. */ if (req.result != 0) { RTE_LOG(ERR, EAL, "Failed to attach device on secondary process\n"); ret = req.result; - - /* for -EEXIST, we don't need to rollback. */ - if (ret == -EEXIST) - return ret; goto rollback; } diff --git a/lib/librte_eal/common/hotplug_mp.c b/lib/librte_eal/common/hotplug_mp.c index 4052a5c7fb..94bd1d896e 100644 --- a/lib/librte_eal/common/hotplug_mp.c +++ b/lib/librte_eal/common/hotplug_mp.c @@ -110,8 +110,7 @@ __handle_secondary_request(void *param) if (tmp_req.result != 0) { ret = tmp_req.result; RTE_LOG(ERR, EAL, "Failed to hotplug add device on secondary\n"); - if (ret != -EEXIST) - goto rollback; + goto rollback; } } else if (req->t == EAL_DEV_REQ_TYPE_DETACH) { ret = rte_devargs_parse(&da, req->devargs); -- 2.20.1 ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [dpdk-dev] [PATCH 1/3] eal: remove useless checks for already probed device 2019-03-02 2:42 ` [dpdk-dev] [PATCH 1/3] eal: remove useless checks for already probed device Thomas Monjalon @ 2019-03-13 13:46 ` Zhang, Qi Z 2023-06-11 2:52 ` Stephen Hemminger 0 siblings, 1 reply; 9+ messages in thread From: Zhang, Qi Z @ 2019-03-13 13:46 UTC (permalink / raw) To: Thomas Monjalon, dev; +Cc: Stojaczyk, Dariusz, stable Hi: > -----Original Message----- > From: Thomas Monjalon [mailto:thomas@monjalon.net] > Sent: Saturday, March 2, 2019 10:43 AM > To: dev@dpdk.org > Cc: Stojaczyk, Dariusz <dariusz.stojaczyk@intel.com>; Zhang, Qi Z > <qi.z.zhang@intel.com>; stable@dpdk.org > Subject: [PATCH 1/3] eal: remove useless checks for already probed device > > The function eal_dev_hotplug_request_to_secondary() never returns -EEXIST > result. The case of already probed device is filtered out. > > The test in __handle_secondary_request() was always true. > The test in rte_dev_probe() was never true, and that's fine not returning -EEXIST > if device is already attached in secondary processes. I didn't get this. eal_dev_hotplug_request_to_secondary() never return -EEXIST, but req->result could be -EEXIST. This happens when secondary try to attach an already attached device (__handle_primary_request --> local_dev_probe --> dev->bus->plug ) > > Fixes: 494db286f37d ("eal: fix multi-process hotplug if attached in secondary") > Cc: dariusz.stojaczyk@intel.com > Cc: qi.z.zhang@intel.com > Cc: stable@dpdk.org > > Signed-off-by: Thomas Monjalon <thomas@monjalon.net> > --- > lib/librte_eal/common/eal_common_dev.c | 9 +-------- > lib/librte_eal/common/hotplug_mp.c | 3 +-- > 2 files changed, 2 insertions(+), 10 deletions(-) > > diff --git a/lib/librte_eal/common/eal_common_dev.c > b/lib/librte_eal/common/eal_common_dev.c > index fd7f5ca7d5..048c0b025f 100644 > --- a/lib/librte_eal/common/eal_common_dev.c > +++ b/lib/librte_eal/common/eal_common_dev.c > @@ -247,18 +247,11 @@ rte_dev_probe(const char *devargs) > goto rollback; > } > > - /** > - * if any secondary failed to attach, we need to consider if rollback > - * is necessary. > - */ > + /* if any secondary failed to attach, need to rollback. */ > if (req.result != 0) { > RTE_LOG(ERR, EAL, > "Failed to attach device on secondary process\n"); > ret = req.result; > - > - /* for -EEXIST, we don't need to rollback. */ > - if (ret == -EEXIST) > - return ret; > goto rollback; > } > > diff --git a/lib/librte_eal/common/hotplug_mp.c > b/lib/librte_eal/common/hotplug_mp.c > index 4052a5c7fb..94bd1d896e 100644 > --- a/lib/librte_eal/common/hotplug_mp.c > +++ b/lib/librte_eal/common/hotplug_mp.c > @@ -110,8 +110,7 @@ __handle_secondary_request(void *param) > if (tmp_req.result != 0) { > ret = tmp_req.result; > RTE_LOG(ERR, EAL, "Failed to hotplug add device on > secondary\n"); > - if (ret != -EEXIST) > - goto rollback; > + goto rollback; > } > } else if (req->t == EAL_DEV_REQ_TYPE_DETACH) { > ret = rte_devargs_parse(&da, req->devargs); > -- > 2.20.1 ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [dpdk-dev] [PATCH 1/3] eal: remove useless checks for already probed device 2019-03-13 13:46 ` Zhang, Qi Z @ 2023-06-11 2:52 ` Stephen Hemminger 0 siblings, 0 replies; 9+ messages in thread From: Stephen Hemminger @ 2023-06-11 2:52 UTC (permalink / raw) To: Zhang, Qi Z; +Cc: Thomas Monjalon, dev, Stojaczyk, Dariusz, stable On Wed, 13 Mar 2019 13:46:01 +0000 "Zhang, Qi Z" <qi.z.zhang@intel.com> wrote: > > -----Original Message----- > > From: Thomas Monjalon [mailto:thomas@monjalon.net] > > Sent: Saturday, March 2, 2019 10:43 AM > > To: dev@dpdk.org > > Cc: Stojaczyk, Dariusz <dariusz.stojaczyk@intel.com>; Zhang, Qi Z > > <qi.z.zhang@intel.com>; stable@dpdk.org > > Subject: [PATCH 1/3] eal: remove useless checks for already probed device > > > > The function eal_dev_hotplug_request_to_secondary() never returns -EEXIST > > result. The case of already probed device is filtered out. > > > > The test in __handle_secondary_request() was always true. > > The test in rte_dev_probe() was never true, and that's fine not returning -EEXIST > > if device is already attached in secondary processes. > > I didn't get this. > eal_dev_hotplug_request_to_secondary() never return -EEXIST, but req->result could be -EEXIST. > > This happens when secondary try to attach an already attached device > (__handle_primary_request --> local_dev_probe --> dev->bus->plug ) This seems to be the outstanding question on this old patch. Is this possible? If so then the original code is ok, and patch is not required. ^ permalink raw reply [flat|nested] 9+ messages in thread
* [dpdk-dev] [PATCH 2/3] eal: remove error logs for already probed device 2019-03-02 2:42 [dpdk-dev] [PATCH 0/3] fix error path of multi-process probe Thomas Monjalon 2019-03-02 2:42 ` [dpdk-dev] [PATCH 1/3] eal: remove useless checks for already probed device Thomas Monjalon @ 2019-03-02 2:42 ` Thomas Monjalon 2019-03-02 2:42 ` [dpdk-dev] [PATCH 3/3] eal: fix multi-process probe failure handling Thomas Monjalon 2023-06-14 19:39 ` [dpdk-dev] [PATCH 0/3] fix error path of multi-process probe Stephen Hemminger 3 siblings, 0 replies; 9+ messages in thread From: Thomas Monjalon @ 2019-03-02 2:42 UTC (permalink / raw) To: dev In functions rte_dev_probe() and __handle_secondary_request(), an error message was logged as a failure if the probe returns -EEXIST. Anyway this error code is not returned by rte_dev_probe() as it is not considered as an error. For instance, in multi-process case, some synchronizations may request to probe a device which was already probed before by the process. Signed-off-by: Thomas Monjalon <thomas@monjalon.net> --- lib/librte_eal/common/eal_common_dev.c | 12 ++---------- lib/librte_eal/common/hotplug_mp.c | 5 ++--- 2 files changed, 4 insertions(+), 13 deletions(-) diff --git a/lib/librte_eal/common/eal_common_dev.c b/lib/librte_eal/common/eal_common_dev.c index 048c0b025f..deaaea9345 100644 --- a/lib/librte_eal/common/eal_common_dev.c +++ b/lib/librte_eal/common/eal_common_dev.c @@ -222,18 +222,10 @@ rte_dev_probe(const char *devargs) /* primary attach the new device itself. */ ret = local_dev_probe(devargs, &dev); - if (ret != 0) { + if (ret != 0 && ret != -EEXIST) { RTE_LOG(ERR, EAL, "Failed to attach device on primary process\n"); - - /** - * it is possible that secondary process failed to attached a - * device that primary process have during initialization, - * so for -EEXIST case, we still need to sync with secondary - * process. - */ - if (ret != -EEXIST) - return ret; + return ret; } /* primary send attach sync request to secondary. */ diff --git a/lib/librte_eal/common/hotplug_mp.c b/lib/librte_eal/common/hotplug_mp.c index 94bd1d896e..69e9a16d6a 100644 --- a/lib/librte_eal/common/hotplug_mp.c +++ b/lib/librte_eal/common/hotplug_mp.c @@ -96,10 +96,9 @@ __handle_secondary_request(void *param) if (req->t == EAL_DEV_REQ_TYPE_ATTACH) { ret = local_dev_probe(req->devargs, &dev); - if (ret != 0) { + if (ret != 0 && ret != -EEXIST) { RTE_LOG(ERR, EAL, "Failed to hotplug add device on primary\n"); - if (ret != -EEXIST) - goto finish; + goto finish; } ret = eal_dev_hotplug_request_to_secondary(&tmp_req); if (ret != 0) { -- 2.20.1 ^ permalink raw reply [flat|nested] 9+ messages in thread
* [dpdk-dev] [PATCH 3/3] eal: fix multi-process probe failure handling 2019-03-02 2:42 [dpdk-dev] [PATCH 0/3] fix error path of multi-process probe Thomas Monjalon 2019-03-02 2:42 ` [dpdk-dev] [PATCH 1/3] eal: remove useless checks for already probed device Thomas Monjalon 2019-03-02 2:42 ` [dpdk-dev] [PATCH 2/3] eal: remove error logs " Thomas Monjalon @ 2019-03-02 2:42 ` Thomas Monjalon 2019-03-14 7:15 ` Zhang, Qi Z 2023-06-14 19:39 ` [dpdk-dev] [PATCH 0/3] fix error path of multi-process probe Stephen Hemminger 3 siblings, 1 reply; 9+ messages in thread From: Thomas Monjalon @ 2019-03-02 2:42 UTC (permalink / raw) To: dev; +Cc: qi.z.zhang, stable If probe fails in multi-process context, the device must removed in other processes for consistency. This is a rollback mechanism. However the rollback should not happen for devices which were already probed before the current probe transaction. When probing an already probed device, the driver may reject with -EEXIST or update and succeed with code 0. In order to distinguish successful new probe from re-probe, in the function local_dev_probe(), the positive EEXIST code is returned for the latter case. The functions rte_dev_probe() and __handle_secondary_request() can test for -EEXIST and +EEXIST, and skip rollback in such case. Fixes: 244d5130719c ("eal: enable hotplug on multi-process") Fixes: ac9e4a17370f ("eal: support attach/detach shared device from secondary") Cc: qi.z.zhang@intel.com Cc: stable@dpdk.org Signed-off-by: Thomas Monjalon <thomas@monjalon.net> --- lib/librte_eal/common/eal_common_dev.c | 12 ++++++++++-- lib/librte_eal/common/eal_private.h | 2 +- lib/librte_eal/common/hotplug_mp.c | 8 ++++++-- 3 files changed, 17 insertions(+), 5 deletions(-) diff --git a/lib/librte_eal/common/eal_common_dev.c b/lib/librte_eal/common/eal_common_dev.c index deaaea9345..2c7b1ab071 100644 --- a/lib/librte_eal/common/eal_common_dev.c +++ b/lib/librte_eal/common/eal_common_dev.c @@ -132,6 +132,7 @@ local_dev_probe(const char *devargs, struct rte_device **new_dev) { struct rte_device *dev; struct rte_devargs *da; + bool already_probed; int ret; *new_dev = NULL; @@ -171,12 +172,15 @@ local_dev_probe(const char *devargs, struct rte_device **new_dev) * those devargs shouldn't be removed manually anymore. */ + already_probed = rte_dev_is_probed(dev); ret = dev->bus->plug(dev); if (ret && !rte_dev_is_probed(dev)) { /* if hasn't ever succeeded */ RTE_LOG(ERR, EAL, "Driver cannot attach the device (%s)\n", dev->name); return ret; } + if (ret == 0 && already_probed) + ret = EEXIST; /* hint to avoid any rollback */ *new_dev = dev; return ret; @@ -194,6 +198,7 @@ rte_dev_probe(const char *devargs) { struct eal_dev_mp_req req; struct rte_device *dev; + bool already_probed; int ret; memset(&req, 0, sizeof(req)); @@ -221,8 +226,8 @@ rte_dev_probe(const char *devargs) /* primary attach the new device itself. */ ret = local_dev_probe(devargs, &dev); - - if (ret != 0 && ret != -EEXIST) { + already_probed = (ret == -EEXIST || ret == EEXIST); + if (ret < 0 && !already_probed) { RTE_LOG(ERR, EAL, "Failed to attach device on primary process\n"); return ret; @@ -250,6 +255,9 @@ rte_dev_probe(const char *devargs) return 0; rollback: + if (already_probed) + return ret; /* skip rollback */ + req.t = EAL_DEV_REQ_TYPE_ATTACH_ROLLBACK; /* primary send rollback request to secondary. */ diff --git a/lib/librte_eal/common/eal_private.h b/lib/librte_eal/common/eal_private.h index 798ede553b..a01d252930 100644 --- a/lib/librte_eal/common/eal_private.h +++ b/lib/librte_eal/common/eal_private.h @@ -304,7 +304,7 @@ rte_devargs_layers_parse(struct rte_devargs *devargs, * @param new_dev * new device be probed as output. * @return - * 0 on success, negative on error. + * >=0 on success (+EEXIST if already probed), negative on error. */ int local_dev_probe(const char *devargs, struct rte_device **new_dev); diff --git a/lib/librte_eal/common/hotplug_mp.c b/lib/librte_eal/common/hotplug_mp.c index 69e9a16d6a..9f8ef28a3b 100644 --- a/lib/librte_eal/common/hotplug_mp.c +++ b/lib/librte_eal/common/hotplug_mp.c @@ -90,13 +90,15 @@ __handle_secondary_request(void *param) struct rte_devargs da; struct rte_device *dev; struct rte_bus *bus; + bool already_probed = false; int ret = 0; tmp_req = *req; if (req->t == EAL_DEV_REQ_TYPE_ATTACH) { ret = local_dev_probe(req->devargs, &dev); - if (ret != 0 && ret != -EEXIST) { + already_probed = (ret == -EEXIST || ret == EEXIST); + if (ret < 0 && !already_probed) { RTE_LOG(ERR, EAL, "Failed to hotplug add device on primary\n"); goto finish; } @@ -159,7 +161,7 @@ __handle_secondary_request(void *param) goto finish; rollback: - if (req->t == EAL_DEV_REQ_TYPE_ATTACH) { + if (req->t == EAL_DEV_REQ_TYPE_ATTACH && !already_probed) { tmp_req.t = EAL_DEV_REQ_TYPE_ATTACH_ROLLBACK; eal_dev_hotplug_request_to_secondary(&tmp_req); local_dev_remove(dev); @@ -238,6 +240,8 @@ static void __handle_primary_request(void *param) case EAL_DEV_REQ_TYPE_ATTACH: case EAL_DEV_REQ_TYPE_DETACH_ROLLBACK: ret = local_dev_probe(req->devargs, &dev); + if (ret > 0) + ret = 0; /* return only errors */ break; case EAL_DEV_REQ_TYPE_DETACH: case EAL_DEV_REQ_TYPE_ATTACH_ROLLBACK: -- 2.20.1 ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [dpdk-dev] [PATCH 3/3] eal: fix multi-process probe failure handling 2019-03-02 2:42 ` [dpdk-dev] [PATCH 3/3] eal: fix multi-process probe failure handling Thomas Monjalon @ 2019-03-14 7:15 ` Zhang, Qi Z 2019-03-14 7:15 ` Zhang, Qi Z 0 siblings, 1 reply; 9+ messages in thread From: Zhang, Qi Z @ 2019-03-14 7:15 UTC (permalink / raw) To: Thomas Monjalon, dev; +Cc: stable > -----Original Message----- > From: Thomas Monjalon [mailto:thomas@monjalon.net] > Sent: Saturday, March 2, 2019 10:43 AM > To: dev@dpdk.org > Cc: Zhang, Qi Z <qi.z.zhang@intel.com>; stable@dpdk.org > Subject: [PATCH 3/3] eal: fix multi-process probe failure handling > > If probe fails in multi-process context, the device must removed in other > processes for consistency. This is a rollback mechanism. > However the rollback should not happen for devices which were already probed > before the current probe transaction. > > When probing an already probed device, the driver may reject with -EEXIST or > update and succeed with code 0. > In order to distinguish successful new probe from re-probe, in the function > local_dev_probe(), the positive EEXIST code is returned for the latter case. > > The functions rte_dev_probe() and __handle_secondary_request() can test for > -EEXIST and +EEXIST, and skip rollback in such case. > > Fixes: 244d5130719c ("eal: enable hotplug on multi-process") > Fixes: ac9e4a17370f ("eal: support attach/detach shared device from secondary") > Cc: qi.z.zhang@intel.com > Cc: stable@dpdk.org > > Signed-off-by: Thomas Monjalon <thomas@monjalon.net> > --- > lib/librte_eal/common/eal_common_dev.c | 12 ++++++++++-- > lib/librte_eal/common/eal_private.h | 2 +- > lib/librte_eal/common/hotplug_mp.c | 8 ++++++-- > 3 files changed, 17 insertions(+), 5 deletions(-) > > diff --git a/lib/librte_eal/common/eal_common_dev.c > b/lib/librte_eal/common/eal_common_dev.c > index deaaea9345..2c7b1ab071 100644 > --- a/lib/librte_eal/common/eal_common_dev.c > +++ b/lib/librte_eal/common/eal_common_dev.c > @@ -132,6 +132,7 @@ local_dev_probe(const char *devargs, struct rte_device > **new_dev) { > struct rte_device *dev; > struct rte_devargs *da; > + bool already_probed; > int ret; > > *new_dev = NULL; > @@ -171,12 +172,15 @@ local_dev_probe(const char *devargs, struct rte_device > **new_dev) > * those devargs shouldn't be removed manually anymore. > */ > > + already_probed = rte_dev_is_probed(dev); > ret = dev->bus->plug(dev); > if (ret && !rte_dev_is_probed(dev)) { /* if hasn't ever succeeded */ > RTE_LOG(ERR, EAL, "Driver cannot attach the device (%s)\n", > dev->name); > return ret; > } > + if (ret == 0 && already_probed) > + ret = EEXIST; /* hint to avoid any rollback */ What if bus->plug return -EEXIST and rte_dev_is_probed return true? (See rte_pci_probe_one_driver) You will not give hint here, but is this expected? > > *new_dev = dev; > return ret; > @@ -194,6 +198,7 @@ rte_dev_probe(const char *devargs) { > struct eal_dev_mp_req req; > struct rte_device *dev; > + bool already_probed; > int ret; > > memset(&req, 0, sizeof(req)); > @@ -221,8 +226,8 @@ rte_dev_probe(const char *devargs) > > /* primary attach the new device itself. */ > ret = local_dev_probe(devargs, &dev); > - > - if (ret != 0 && ret != -EEXIST) { > + already_probed = (ret == -EEXIST || ret == EEXIST); > + if (ret < 0 && !already_probed) { > RTE_LOG(ERR, EAL, > "Failed to attach device on primary process\n"); > return ret; > @@ -250,6 +255,9 @@ rte_dev_probe(const char *devargs) > return 0; > > rollback: > + if (already_probed) > + return ret; /* skip rollback */ > + > req.t = EAL_DEV_REQ_TYPE_ATTACH_ROLLBACK; > > /* primary send rollback request to secondary. */ diff --git > a/lib/librte_eal/common/eal_private.h b/lib/librte_eal/common/eal_private.h > index 798ede553b..a01d252930 100644 > --- a/lib/librte_eal/common/eal_private.h > +++ b/lib/librte_eal/common/eal_private.h > @@ -304,7 +304,7 @@ rte_devargs_layers_parse(struct rte_devargs *devargs, > * @param new_dev > * new device be probed as output. > * @return > - * 0 on success, negative on error. > + * >=0 on success (+EEXIST if already probed), negative on error. > */ > int local_dev_probe(const char *devargs, struct rte_device **new_dev); > > diff --git a/lib/librte_eal/common/hotplug_mp.c > b/lib/librte_eal/common/hotplug_mp.c > index 69e9a16d6a..9f8ef28a3b 100644 > --- a/lib/librte_eal/common/hotplug_mp.c > +++ b/lib/librte_eal/common/hotplug_mp.c > @@ -90,13 +90,15 @@ __handle_secondary_request(void *param) > struct rte_devargs da; > struct rte_device *dev; > struct rte_bus *bus; > + bool already_probed = false; > int ret = 0; > > tmp_req = *req; > > if (req->t == EAL_DEV_REQ_TYPE_ATTACH) { > ret = local_dev_probe(req->devargs, &dev); > - if (ret != 0 && ret != -EEXIST) { > + already_probed = (ret == -EEXIST || ret == EEXIST); > + if (ret < 0 && !already_probed) { > RTE_LOG(ERR, EAL, "Failed to hotplug add device on primary\n"); > goto finish; > } > @@ -159,7 +161,7 @@ __handle_secondary_request(void *param) > goto finish; > > rollback: > - if (req->t == EAL_DEV_REQ_TYPE_ATTACH) { > + if (req->t == EAL_DEV_REQ_TYPE_ATTACH && !already_probed) { > tmp_req.t = EAL_DEV_REQ_TYPE_ATTACH_ROLLBACK; > eal_dev_hotplug_request_to_secondary(&tmp_req); > local_dev_remove(dev); > @@ -238,6 +240,8 @@ static void __handle_primary_request(void *param) > case EAL_DEV_REQ_TYPE_ATTACH: > case EAL_DEV_REQ_TYPE_DETACH_ROLLBACK: > ret = local_dev_probe(req->devargs, &dev); > + if (ret > 0) > + ret = 0; /* return only errors */ > break; > case EAL_DEV_REQ_TYPE_DETACH: > case EAL_DEV_REQ_TYPE_ATTACH_ROLLBACK: > -- > 2.20.1 ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [dpdk-dev] [PATCH 3/3] eal: fix multi-process probe failure handling 2019-03-14 7:15 ` Zhang, Qi Z @ 2019-03-14 7:15 ` Zhang, Qi Z 0 siblings, 0 replies; 9+ messages in thread From: Zhang, Qi Z @ 2019-03-14 7:15 UTC (permalink / raw) To: Thomas Monjalon, dev; +Cc: stable > -----Original Message----- > From: Thomas Monjalon [mailto:thomas@monjalon.net] > Sent: Saturday, March 2, 2019 10:43 AM > To: dev@dpdk.org > Cc: Zhang, Qi Z <qi.z.zhang@intel.com>; stable@dpdk.org > Subject: [PATCH 3/3] eal: fix multi-process probe failure handling > > If probe fails in multi-process context, the device must removed in other > processes for consistency. This is a rollback mechanism. > However the rollback should not happen for devices which were already probed > before the current probe transaction. > > When probing an already probed device, the driver may reject with -EEXIST or > update and succeed with code 0. > In order to distinguish successful new probe from re-probe, in the function > local_dev_probe(), the positive EEXIST code is returned for the latter case. > > The functions rte_dev_probe() and __handle_secondary_request() can test for > -EEXIST and +EEXIST, and skip rollback in such case. > > Fixes: 244d5130719c ("eal: enable hotplug on multi-process") > Fixes: ac9e4a17370f ("eal: support attach/detach shared device from secondary") > Cc: qi.z.zhang@intel.com > Cc: stable@dpdk.org > > Signed-off-by: Thomas Monjalon <thomas@monjalon.net> > --- > lib/librte_eal/common/eal_common_dev.c | 12 ++++++++++-- > lib/librte_eal/common/eal_private.h | 2 +- > lib/librte_eal/common/hotplug_mp.c | 8 ++++++-- > 3 files changed, 17 insertions(+), 5 deletions(-) > > diff --git a/lib/librte_eal/common/eal_common_dev.c > b/lib/librte_eal/common/eal_common_dev.c > index deaaea9345..2c7b1ab071 100644 > --- a/lib/librte_eal/common/eal_common_dev.c > +++ b/lib/librte_eal/common/eal_common_dev.c > @@ -132,6 +132,7 @@ local_dev_probe(const char *devargs, struct rte_device > **new_dev) { > struct rte_device *dev; > struct rte_devargs *da; > + bool already_probed; > int ret; > > *new_dev = NULL; > @@ -171,12 +172,15 @@ local_dev_probe(const char *devargs, struct rte_device > **new_dev) > * those devargs shouldn't be removed manually anymore. > */ > > + already_probed = rte_dev_is_probed(dev); > ret = dev->bus->plug(dev); > if (ret && !rte_dev_is_probed(dev)) { /* if hasn't ever succeeded */ > RTE_LOG(ERR, EAL, "Driver cannot attach the device (%s)\n", > dev->name); > return ret; > } > + if (ret == 0 && already_probed) > + ret = EEXIST; /* hint to avoid any rollback */ What if bus->plug return -EEXIST and rte_dev_is_probed return true? (See rte_pci_probe_one_driver) You will not give hint here, but is this expected? > > *new_dev = dev; > return ret; > @@ -194,6 +198,7 @@ rte_dev_probe(const char *devargs) { > struct eal_dev_mp_req req; > struct rte_device *dev; > + bool already_probed; > int ret; > > memset(&req, 0, sizeof(req)); > @@ -221,8 +226,8 @@ rte_dev_probe(const char *devargs) > > /* primary attach the new device itself. */ > ret = local_dev_probe(devargs, &dev); > - > - if (ret != 0 && ret != -EEXIST) { > + already_probed = (ret == -EEXIST || ret == EEXIST); > + if (ret < 0 && !already_probed) { > RTE_LOG(ERR, EAL, > "Failed to attach device on primary process\n"); > return ret; > @@ -250,6 +255,9 @@ rte_dev_probe(const char *devargs) > return 0; > > rollback: > + if (already_probed) > + return ret; /* skip rollback */ > + > req.t = EAL_DEV_REQ_TYPE_ATTACH_ROLLBACK; > > /* primary send rollback request to secondary. */ diff --git > a/lib/librte_eal/common/eal_private.h b/lib/librte_eal/common/eal_private.h > index 798ede553b..a01d252930 100644 > --- a/lib/librte_eal/common/eal_private.h > +++ b/lib/librte_eal/common/eal_private.h > @@ -304,7 +304,7 @@ rte_devargs_layers_parse(struct rte_devargs *devargs, > * @param new_dev > * new device be probed as output. > * @return > - * 0 on success, negative on error. > + * >=0 on success (+EEXIST if already probed), negative on error. > */ > int local_dev_probe(const char *devargs, struct rte_device **new_dev); > > diff --git a/lib/librte_eal/common/hotplug_mp.c > b/lib/librte_eal/common/hotplug_mp.c > index 69e9a16d6a..9f8ef28a3b 100644 > --- a/lib/librte_eal/common/hotplug_mp.c > +++ b/lib/librte_eal/common/hotplug_mp.c > @@ -90,13 +90,15 @@ __handle_secondary_request(void *param) > struct rte_devargs da; > struct rte_device *dev; > struct rte_bus *bus; > + bool already_probed = false; > int ret = 0; > > tmp_req = *req; > > if (req->t == EAL_DEV_REQ_TYPE_ATTACH) { > ret = local_dev_probe(req->devargs, &dev); > - if (ret != 0 && ret != -EEXIST) { > + already_probed = (ret == -EEXIST || ret == EEXIST); > + if (ret < 0 && !already_probed) { > RTE_LOG(ERR, EAL, "Failed to hotplug add device on primary\n"); > goto finish; > } > @@ -159,7 +161,7 @@ __handle_secondary_request(void *param) > goto finish; > > rollback: > - if (req->t == EAL_DEV_REQ_TYPE_ATTACH) { > + if (req->t == EAL_DEV_REQ_TYPE_ATTACH && !already_probed) { > tmp_req.t = EAL_DEV_REQ_TYPE_ATTACH_ROLLBACK; > eal_dev_hotplug_request_to_secondary(&tmp_req); > local_dev_remove(dev); > @@ -238,6 +240,8 @@ static void __handle_primary_request(void *param) > case EAL_DEV_REQ_TYPE_ATTACH: > case EAL_DEV_REQ_TYPE_DETACH_ROLLBACK: > ret = local_dev_probe(req->devargs, &dev); > + if (ret > 0) > + ret = 0; /* return only errors */ > break; > case EAL_DEV_REQ_TYPE_DETACH: > case EAL_DEV_REQ_TYPE_ATTACH_ROLLBACK: > -- > 2.20.1 ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [dpdk-dev] [PATCH 0/3] fix error path of multi-process probe 2019-03-02 2:42 [dpdk-dev] [PATCH 0/3] fix error path of multi-process probe Thomas Monjalon ` (2 preceding siblings ...) 2019-03-02 2:42 ` [dpdk-dev] [PATCH 3/3] eal: fix multi-process probe failure handling Thomas Monjalon @ 2023-06-14 19:39 ` Stephen Hemminger 3 siblings, 0 replies; 9+ messages in thread From: Stephen Hemminger @ 2023-06-14 19:39 UTC (permalink / raw) To: dev On Sat, 2 Mar 2019 03:42:50 +0100 Thomas Monjalon <thomas@monjalon.net> wrote: > While working on multi-process support of failsafe with Raslan, > some issues have been discovered on probing failures. > > Thomas Monjalon (3): > eal: remove useless checks for already probed device > eal: remove error logs for already probed device > eal: fix multi-process probe failure handling > > lib/librte_eal/common/eal_common_dev.c | 31 ++++++++++---------------- > lib/librte_eal/common/eal_private.h | 2 +- > lib/librte_eal/common/hotplug_mp.c | 14 +++++++----- > 3 files changed, 21 insertions(+), 26 deletions(-) > Is this old patch set relevant. Seems like it addressing a "this should never happen but if it does don't be stupid" kind of scenario. Marking it as "Changes Requested" since rebase and retest would be needed. ^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2023-06-14 19:39 UTC | newest] Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2019-03-02 2:42 [dpdk-dev] [PATCH 0/3] fix error path of multi-process probe Thomas Monjalon 2019-03-02 2:42 ` [dpdk-dev] [PATCH 1/3] eal: remove useless checks for already probed device Thomas Monjalon 2019-03-13 13:46 ` Zhang, Qi Z 2023-06-11 2:52 ` Stephen Hemminger 2019-03-02 2:42 ` [dpdk-dev] [PATCH 2/3] eal: remove error logs " Thomas Monjalon 2019-03-02 2:42 ` [dpdk-dev] [PATCH 3/3] eal: fix multi-process probe failure handling Thomas Monjalon 2019-03-14 7:15 ` Zhang, Qi Z 2019-03-14 7:15 ` Zhang, Qi Z 2023-06-14 19:39 ` [dpdk-dev] [PATCH 0/3] fix error path of multi-process probe Stephen Hemminger
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).