From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mga12.intel.com (mga12.intel.com [192.55.52.136]) by dpdk.org (Postfix) with ESMTP id 19C831B5E2 for ; Fri, 23 Nov 2018 21:29:09 +0100 (CET) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga008.fm.intel.com ([10.253.24.58]) by fmsmga106.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 23 Nov 2018 12:29:09 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.56,271,1539673200"; d="scan'208";a="88693163" Received: from fmsmsx107.amr.corp.intel.com ([10.18.124.205]) by fmsmga008.fm.intel.com with ESMTP; 23 Nov 2018 12:29:08 -0800 Received: from lcsmsx154.ger.corp.intel.com (10.186.165.229) by fmsmsx107.amr.corp.intel.com (10.18.124.205) with Microsoft SMTP Server (TLS) id 14.3.408.0; Fri, 23 Nov 2018 12:29:08 -0800 Received: from hasmsx105.ger.corp.intel.com ([169.254.1.193]) by LCSMSX154.ger.corp.intel.com ([169.254.7.79]) with mapi id 14.03.0415.000; Fri, 23 Nov 2018 22:29:05 +0200 From: "Stojaczyk, Dariusz" To: "Zhang, Qi Z" , "dev@dpdk.org" CC: "thomas@monjalon.net" Thread-Topic: [PATCH] dev: fix attach rollback of a device that was already attached Thread-Index: AQHUgzvsYqpLh2GJeUWuRc5wZiMduaVdmHmAgAA0hzA= Date: Fri, 23 Nov 2018 20:29:04 +0000 Message-ID: References: <20181123144506.95367-1-dariusz.stojaczyk@intel.com> <039ED4275CED7440929022BC67E70611532EA1D0@SHSMSX103.ccr.corp.intel.com> In-Reply-To: <039ED4275CED7440929022BC67E70611532EA1D0@SHSMSX103.ccr.corp.intel.com> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: dlp-product: dlpe-windows dlp-version: 11.0.400.15 dlp-reaction: no-action x-titus-metadata-40: eyJDYXRlZ29yeUxhYmVscyI6IiIsIk1ldGFkYXRhIjp7Im5zIjoiaHR0cDpcL1wvd3d3LnRpdHVzLmNvbVwvbnNcL0ludGVsMyIsImlkIjoiOTRiODA0NDUtMWU4ZC00MjliLTg3YjItNGNkMGE5YjRkOTk3IiwicHJvcHMiOlt7Im4iOiJDVFBDbGFzc2lmaWNhdGlvbiIsInZhbHMiOlt7InZhbHVlIjoiQ1RQX05UIn1dfV19LCJTdWJqZWN0TGFiZWxzIjpbXSwiVE1DVmVyc2lvbiI6IjE3LjEwLjE4MDQuNDkiLCJUcnVzdGVkTGFiZWxIYXNoIjoieDloOEV1MERnNndRdlF5Q2czVFg0ajlNcVhkcElcL1dEZlwvc2xcL2VxREVEXC9VMDY0NzlXQ3c5bWZ6U09OWkdoblwvIn0= x-ctpclassification: CTP_NT x-originating-ip: [10.252.41.58] Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Subject: Re: [dpdk-dev] [PATCH] dev: fix attach rollback of a device that was already attached X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 23 Nov 2018 20:29:10 -0000 > -----Original Message----- > From: Zhang, Qi Z > Sent: Friday, November 23, 2018 8:11 PM > To: Stojaczyk, Dariusz ; dev@dpdk.org > Cc: thomas@monjalon.net > Subject: RE: [PATCH] dev: fix attach rollback of a device that was alread= y > attached >=20 >=20 >=20 > > -----Original Message----- > > From: Stojaczyk, Dariusz > > Sent: Friday, November 23, 2018 6:45 AM > > To: dev@dpdk.org > > Cc: thomas@monjalon.net; Stojaczyk, Dariusz > ; > > Zhang, Qi Z > > Subject: [PATCH] dev: fix attach rollback of a device that was already > attached > > > > When primary process receives an IPC attach request of a device that's > already > > locally-attached, it doesn't setup its variables properly and is prone = to > segfaulting > > on a subsequent rollback. > > > > `ret =3D local_dev_probe(req->devargs, &dev)` > > > > The above function will set `dev` pointer to the proper device *unless*= it > returns > > with error. One of those errors is -EEXIST, which the hotplug function > explicitly > > ignores. For -EEXIST, it proceeds with attaching the device and expects= the > dev > > pointer to be valid. >=20 > Good capture. > > > > Despite this patch being a fix, it also introduces a design decision - = when > any > > secondary process fails to attach a device, the primary process that al= ready > had > > the device attached won't attempt to detach that device locally as a pa= rt of > the > > rollback routine. > > Primary process would have already printed a message "Failed to [...] o= n > > secondary" and now it will also print a warning "Devices may not be in = sync > [...]". >=20 > A little bit concern for this. > we may try to avoid the abnormal situation that device is not synced. > The scenario you describe actually is start from an abnormal situation du= e to > some previous error. > so is it better to always take chance to end up with a normal situation. >=20 > It looks better for me if we can fixed it in local_dev_probe to return a = valid > device with -EEXIST. Actually that was my original idea, but I gave it up in the end. Ok, I'll do that in V2. Thanks, D. >=20 > > > > Fixes: ac9e4a17370f ("eal: support attach/detach shared device from > > secondary") > > Cc: qi.z.zhang@intel.com > > > > Signed-off-by: Darek Stojaczyk > > --- > > lib/librte_eal/common/hotplug_mp.c | 12 ++++++++++-- > > 1 file changed, 10 insertions(+), 2 deletions(-) > > > > diff --git a/lib/librte_eal/common/hotplug_mp.c > > b/lib/librte_eal/common/hotplug_mp.c > > index 7c9fcc46c..7ee074a31 100644 > > --- a/lib/librte_eal/common/hotplug_mp.c > > +++ b/lib/librte_eal/common/hotplug_mp.c > > @@ -88,7 +88,7 @@ __handle_secondary_request(void *param) > > (const struct eal_dev_mp_req *)msg->param; > > struct eal_dev_mp_req tmp_req; > > struct rte_devargs *da; > > - struct rte_device *dev; > > + struct rte_device *dev =3D NULL; > > struct rte_bus *bus; > > int ret =3D 0; > > > > @@ -168,7 +168,15 @@ __handle_secondary_request(void *param) > > if (req->t =3D=3D EAL_DEV_REQ_TYPE_ATTACH) { > > tmp_req.t =3D EAL_DEV_REQ_TYPE_ATTACH_ROLLBACK; > > eal_dev_hotplug_request_to_secondary(&tmp_req); > > - local_dev_remove(dev); > > + if (dev =3D=3D NULL) { > > + /* device was already attached at the time we got > the > > + * request, don't detach it now. > > + */ > > + RTE_LOG(WARNING, EAL, > > + "Devices in secondary may not sync with > primary\n"); > > + } else { > > + local_dev_remove(dev); > > + } > > } else { > > tmp_req.t =3D EAL_DEV_REQ_TYPE_DETACH_ROLLBACK; > > eal_dev_hotplug_request_to_secondary(&tmp_req); > > -- > > 2.17.1