From: Aaron Conole
To: David Marchand
Cc: Harry Van Haaren, dev
Date: Tue, 04 Feb 2020 09:50:39 -0500
In-Reply-To: (David Marchand's message of "Tue, 4 Feb 2020 14:34:08 +0100")
Subject: Re: [dpdk-dev] [RFC] service: stop lcore threads before 'finalize'

David Marchand writes:

> On Fri, Jan 17, 2020 at 9:17 AM David Marchand wrote:
>>
>> On Thu, Jan 16, 2020 at 8:50 PM Aaron Conole wrote:
>> >
>> > I've noticed an occasional segfault from the build system in the
>> > service_autotest and after talking with David (CC'd), it seems like it's
>> > due to the rte_service_finalize deleting the lcore_states object while
>> > active lcores are running.
>> >
>> > The below patch is an attempt to solve it by first reassigning all the
>> > lcores back to ROLE_RTE before releasing the memory.  There is probably
>> > a larger question for DPDK proper about actually closing the pending
>> > lcore threads, but that's a separate issue.  I've been running with the
>> > patch for a while, and haven't seen the crash anymore on my system.
>> >
>> > Thoughts?  Is it acceptable as-is?
>>
>> Added this patch to my env, still reproducing the same issue after ~10-20 tries.
>> I added a breakpoint to service_lcore_uninit that is indeed caught
>> when exiting the test application (just wanted to make sure your
>> change was in my binary).
>
> Harry,
>
> We need a fix for this issue.

+1

> Interestingly, Stephen's patch that joins all pthreads at
> rte_eal_cleanup [1] makes this issue disappear.
> So my understanding is that we are missing an API (well, I could not
> find a way) to synchronously stop service lcores.

Maybe we can take that patch as a fix.  I hate to see this segfault
in the field.
I need to figure out what I missed in my cleanup (probably missed a
synchronization point).

>
> 1: https://patchwork.dpdk.org/patch/64201/
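
For illustration only, the ordering problem discussed in this thread can
be sketched with plain pthreads.  Nothing below is DPDK code or DPDK API:
struct worker_state, workers_start() and workers_finalize() are
hypothetical names, with worker_state standing in for the lcore_states
object.  The sketch assumes that clearing a run flag alone is not a
synchronization point, and that the memory may only be released once
every worker has actually been joined.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

#define NUM_WORKERS 4 /* hypothetical worker count */

/* Hypothetical stand-in for the shared per-lcore state. */
struct worker_state {
	atomic_int run;          /* 1 while the worker should keep looping */
	atomic_ulong iterations; /* touched on every pass through the loop */
};

static struct worker_state *states;
static pthread_t threads[NUM_WORKERS];
static int started; /* number of threads successfully created */

static void *worker_loop(void *arg)
{
	struct worker_state *st = arg;

	assert(st != NULL);
	/* Each pass dereferences shared state, so it must outlive the loop. */
	while (atomic_load(&st->run))
		atomic_fetch_add(&st->iterations, 1);
	return NULL;
}

/* Stop and join every worker, and only then release the shared state. */
int workers_finalize(void)
{
	int rc = 0;

	if (states == NULL)
		return -1;

	/* Step 1: ask every worker to leave its loop. */
	for (int i = 0; i < started; i++)
		atomic_store(&states[i].run, 0);

	/* Step 2: wait until each worker has really exited. */
	for (int i = 0; i < started; i++)
		if (pthread_join(threads[i], NULL) != 0)
			rc = -1;
	started = 0;

	/* Prefer a leak over a use-after-free if a join failed. */
	if (rc != 0)
		return rc;

	/* Step 3: no thread can touch the state any more. */
	free(states);
	states = NULL;
	return 0;
}

int workers_start(void)
{
	states = calloc(NUM_WORKERS, sizeof(*states));
	if (states == NULL)
		return -1;

	for (started = 0; started < NUM_WORKERS; started++) {
		struct worker_state *st = &states[started];

		atomic_init(&st->run, 1);
		atomic_init(&st->iterations, 0);
		if (pthread_create(&threads[started], NULL, worker_loop, st) != 0) {
			workers_finalize(); /* tear down the workers created so far */
			return -1;
		}
	}
	return 0;
}
```

Calling workers_start() and then workers_finalize() exercises the safe
ordering.  Dropping step 2 and freeing right after clearing the flags
would leave a window where a worker still reads freed memory, which is
the kind of race suspected above.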