From: "Van Haaren, Harry"
To: Aaron Conole, David Marchand
CC: dev
Date: Mon, 10 Feb 2020 14:16:39 +0000
Subject: Re: [dpdk-dev] [RFC] service: stop lcore threads before 'finalize'
List-Id: DPDK patches and discussions

> -----Original Message-----
> From: Aaron Conole
> Sent: Tuesday, February 4, 2020 2:51 PM
> To: David Marchand
> Cc: Van Haaren, Harry; dev
> Subject: Re: [RFC] service: stop lcore threads before 'finalize'
>
> David Marchand writes:
>
> > On
> > Fri, Jan 17, 2020 at 9:17 AM David Marchand wrote:
> >>
> >> On Thu, Jan 16, 2020 at 8:50 PM Aaron Conole wrote:
> >> >
> >> > I've noticed an occasional segfault from the build system in the
> >> > service_autotest and after talking with David (CC'd), it seems like it's
> >> > due to the rte_service_finalize deleting the lcore_states object while
> >> > active lcores are running.
> >> >
> >> > The below patch is an attempt to solve it by first reassigning all the
> >> > lcores back to ROLE_RTE before releasing the memory.  There is probably
> >> > a larger question for DPDK proper about actually closing the pending
> >> > lcore threads, but that's a separate issue.  I've been running with the
> >> > patch for a while, and haven't seen the crash anymore on my system.
> >> >
> >> > Thoughts?  Is it acceptable as-is?
> >>
> >> Added this patch to my env, still reproducing the same issue after ~10-20 tries.
> >> I added a breakpoint to service_lcore_uninit that is indeed caught
> >> when exiting the test application (just wanted to make sure your
> >> change was in my binary).
> >
> > Harry,
> >
> > We need a fix for this issue.
>
> +1

Hi All,

> > Interestingly, Stephen patch that joins all pthreads at
> > rte_eal_cleanup [1] makes this issue disappear.
> > So my understanding is that we are missing a api (well, I could not
> > find a way) to synchronously stop service lcores.
>
> Maybe we can take that patch as a fix.  I hate to see this segfault
> in the field.  I need to figure out what I missed in my cleanup
> (probably missed a synchronization point).

I haven't managed to reproduce this easily yet, so I'll look for a way to
reproduce it at close to a 100% rate; then we can identify the root cause
and get a clean fix.

If you have pointers on how to reproduce it easily, please let me know.

-H

> > 1: https://patchwork.dpdk.org/patch/64201/
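
For illustration only, here is a minimal plain-pthread sketch of the ordering
problem discussed in this thread: shared state must not be freed until the
thread polling it has been asked to stop and has actually exited. This is not
DPDK code and not the patch referenced in [1]; struct worker_state,
demo_init() and demo_finalize() are hypothetical names, and the worker loop is
only a stand-in for a service lcore. The assumption is that a stop flag
followed by a join is an acceptable synchronization point.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for per-lcore service state; not a DPDK structure. */
struct worker_state {
	atomic_int run;           /* 1 while the worker should keep polling */
	atomic_ulong iterations;  /* touched by the worker on every loop */
};

static struct worker_state *state; /* heap-allocated shared state */
static pthread_t worker;

/* Worker body: keeps dereferencing the shared state until told to stop. */
static void *worker_loop(void *arg)
{
	struct worker_state *s = arg;

	while (atomic_load_explicit(&s->run, memory_order_acquire))
		atomic_fetch_add_explicit(&s->iterations, 1,
					  memory_order_relaxed);
	return NULL;
}

/* Allocate the shared state and start the worker. Returns 0 on success. */
static int demo_init(void)
{
	state = calloc(1, sizeof(*state));
	if (state == NULL)
		return -1;
	atomic_init(&state->run, 1);
	atomic_init(&state->iterations, 0);
	if (pthread_create(&worker, NULL, worker_loop, state) != 0) {
		free(state);
		state = NULL;
		return -1;
	}
	return 0;
}

/*
 * Synchronous teardown: request the stop, wait for the worker to exit,
 * and only then release the memory. Freeing before the join would let
 * the worker touch freed memory. Returns the worker's loop count.
 */
static unsigned long demo_finalize(void)
{
	unsigned long n;
	int rc;

	atomic_store_explicit(&state->run, 0, memory_order_release);
	rc = pthread_join(worker, NULL);
	assert(rc == 0);
	(void)rc; /* keep builds with NDEBUG warning-free */

	n = atomic_load(&state->iterations);
	free(state);
	state = NULL;
	return n;
}
```

A caller would invoke demo_init() once, do its work, and call demo_finalize()
at shutdown. The join is what makes the stop synchronous; clearing the flag
alone only requests the stop and leaves a window in which the worker can still
read the state.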