DPDK patches and discussions
From: "Stojaczyk, Dariusz" <dariusz.stojaczyk@intel.com>
To: "Zhang, Qi Z" <qi.z.zhang@intel.com>, "dev@dpdk.org" <dev@dpdk.org>
Cc: "thomas@monjalon.net" <thomas@monjalon.net>
Subject: Re: [dpdk-dev] [PATCH] dev: fix attach rollback of a device that was already attached
Date: Fri, 23 Nov 2018 20:29:04 +0000
Message-ID: <FBE7E039FA50BF47A673AD0BD3CD56A846234D6F@HASMSX105.ger.corp.intel.com>
In-Reply-To: <039ED4275CED7440929022BC67E70611532EA1D0@SHSMSX103.ccr.corp.intel.com>



> -----Original Message-----
> From: Zhang, Qi Z
> Sent: Friday, November 23, 2018 8:11 PM
> To: Stojaczyk, Dariusz <dariusz.stojaczyk@intel.com>; dev@dpdk.org
> Cc: thomas@monjalon.net
> Subject: RE: [PATCH] dev: fix attach rollback of a device that was already attached
> 
> 
> 
> > -----Original Message-----
> > From: Stojaczyk, Dariusz
> > Sent: Friday, November 23, 2018 6:45 AM
> > To: dev@dpdk.org
> > Cc: thomas@monjalon.net; Stojaczyk, Dariusz <dariusz.stojaczyk@intel.com>; Zhang, Qi Z <qi.z.zhang@intel.com>
> > Subject: [PATCH] dev: fix attach rollback of a device that was already attached
> >
> > When the primary process receives an IPC attach request for a device
> > that's already locally attached, it doesn't set up its variables properly
> > and is prone to segfaulting on a subsequent rollback.
> >
> > `ret = local_dev_probe(req->devargs, &dev)`
> >
> > The above function will set the `dev` pointer to the proper device
> > *unless* it returns with an error. One of those errors is -EEXIST, which
> > the hotplug function explicitly ignores. For -EEXIST, it proceeds with
> > attaching the device and expects the `dev` pointer to be valid.
> 
> Good capture.
> >
> > Despite this patch being a fix, it also introduces a design decision: when
> > any secondary process fails to attach a device, the primary process that
> > already had the device attached won't attempt to detach that device locally
> > as part of the rollback routine.
> > The primary process would have already printed a message "Failed to [...]
> > on secondary" and now it will also print a warning "Devices may not be in
> > sync [...]".
> 
> I am a little concerned about this.
> We should try to avoid the abnormal situation where devices are not in sync.
> The scenario you describe actually starts from an abnormal situation caused
> by some previous error,
> so it is better to always take the chance to end up in a normal state.
> 
> It looks better to me if we fix it in local_dev_probe so that it returns a
> valid device along with -EEXIST.

Actually that was my original idea, but I gave it up in the end.
Ok, I'll do that in V2.
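
A minimal sketch of that idea, for illustration only. It does not use the
real EAL code: sketch_device, sketch_probe, sketch_remove and
sketch_attach_then_rollback are hypothetical stand-ins, and the device table
is made up. The assumption is simply that the probe function fills in the
output pointer even when it reports -EEXIST, so the rollback path always has
a valid device to work with.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct rte_device; not the real DPDK type. */
struct sketch_device {
	const char *name;
	int attached;
};

/* Made-up device table: dev0 is already attached locally, dev1 is not. */
static struct sketch_device sketch_devs[] = {
	{ "dev0", 1 },
	{ "dev1", 0 },
};

/*
 * Hypothetical stand-in for local_dev_probe(). It sets *out for an
 * already-attached device too, while still returning -EEXIST.
 */
static int
sketch_probe(const char *name, struct sketch_device **out)
{
	size_t i;

	*out = NULL;
	for (i = 0; i < sizeof(sketch_devs) / sizeof(sketch_devs[0]); i++) {
		if (strcmp(sketch_devs[i].name, name) != 0)
			continue;
		*out = &sketch_devs[i];
		if (sketch_devs[i].attached)
			return -EEXIST;
		sketch_devs[i].attached = 1;
		return 0;
	}
	return -ENODEV;
}

/* Hypothetical stand-in for local_dev_remove(); needs a valid device. */
static int
sketch_remove(struct sketch_device *dev)
{
	assert(dev != NULL);
	dev->attached = 0;
	return 0;
}

/* Attach, then roll back as if a secondary process had failed to attach. */
static int
sketch_attach_then_rollback(const char *name)
{
	struct sketch_device *dev = NULL;
	int ret = sketch_probe(name, &dev);

	/* -EEXIST is tolerated, mirroring what the hotplug handler does. */
	if (ret != 0 && ret != -EEXIST)
		return ret;

	/* dev is valid in both the 0 and the -EEXIST case. */
	return sketch_remove(dev);
}
```

With this shape the caller needs no NULL check before the rollback, and the
primary ends up detached in both cases rather than possibly out of sync.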

Thanks,
D.

> 
> >
> > Fixes: ac9e4a17370f ("eal: support attach/detach shared device from secondary")
> > Cc: qi.z.zhang@intel.com
> >
> > Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
> > ---
> >  lib/librte_eal/common/hotplug_mp.c | 12 ++++++++++--
> >  1 file changed, 10 insertions(+), 2 deletions(-)
> >
> > diff --git a/lib/librte_eal/common/hotplug_mp.c b/lib/librte_eal/common/hotplug_mp.c
> > index 7c9fcc46c..7ee074a31 100644
> > --- a/lib/librte_eal/common/hotplug_mp.c
> > +++ b/lib/librte_eal/common/hotplug_mp.c
> > @@ -88,7 +88,7 @@ __handle_secondary_request(void *param)
> >  		(const struct eal_dev_mp_req *)msg->param;
> >  	struct eal_dev_mp_req tmp_req;
> >  	struct rte_devargs *da;
> > -	struct rte_device *dev;
> > +	struct rte_device *dev = NULL;
> >  	struct rte_bus *bus;
> >  	int ret = 0;
> >
> > @@ -168,7 +168,15 @@ __handle_secondary_request(void *param)
> >  	if (req->t == EAL_DEV_REQ_TYPE_ATTACH) {
> >  		tmp_req.t = EAL_DEV_REQ_TYPE_ATTACH_ROLLBACK;
> >  		eal_dev_hotplug_request_to_secondary(&tmp_req);
> > -		local_dev_remove(dev);
> > +		if (dev == NULL) {
> > +			/* device was already attached at the time we got the
> > +			 * request, don't detach it now.
> > +			 */
> > +			RTE_LOG(WARNING, EAL,
> > +				"Devices in secondary may not sync with primary\n");
> > +		} else {
> > +			local_dev_remove(dev);
> > +		}
> >  	} else {
> >  		tmp_req.t = EAL_DEV_REQ_TYPE_DETACH_ROLLBACK;
> >  		eal_dev_hotplug_request_to_secondary(&tmp_req);
> > --
> > 2.17.1


Thread overview: 5+ messages
2018-11-23 14:45 Darek Stojaczyk
2018-11-23 19:10 ` Zhang, Qi Z
2018-11-23 20:29   ` Stojaczyk, Dariusz [this message]
2018-11-23 21:26 ` [dpdk-dev] [PATCH v2] " Darek Stojaczyk
2018-11-25 12:25   ` Thomas Monjalon
