From: David Marchand
Date: Tue, 4 Feb 2020 14:34:08 +0100
To: Harry Van Haaren
Cc: dev, Aaron Conole
Subject: Re: [dpdk-dev] [RFC] service: stop lcore threads before 'finalize'

On Fri, Jan 17, 2020 at 9:17 AM David Marchand wrote:
>
> On Thu, Jan 16, 2020 at 8:50 PM Aaron Conole wrote:
> >
> > I've noticed an occasional segfault from the build system in the
> > service_autotest and after talking with David (CC'd), it seems like it's
> > due to the rte_service_finalize deleting the lcore_states object while
> > active lcores are running.
> >
> > The below patch is an attempt to solve it by first reassigning all the
> > lcores back to ROLE_RTE before releasing the memory. There is probably
> > a larger question for DPDK proper about actually closing the pending
> > lcore threads, but that's a separate issue. I've been running with the
> > patch for a while, and haven't seen the crash anymore on my system.
> >
> > Thoughts? Is it acceptable as-is?
>
> Added this patch to my env, still reproducing the same issue after ~10-20 tries.
> I added a breakpoint to service_lcore_uninit that is indeed caught
> when exiting the test application (just wanted to make sure your
> change was in my binary).

Harry,

We need a fix for this issue.

Interestingly, Stephen's patch that joins all pthreads at rte_eal_cleanup [1]
makes this issue disappear.
So my understanding is that we are missing an API (well, I could not find a
way) to synchronously stop service lcores.

1: https://patchwork.dpdk.org/patch/64201/

-- 
David Marchand
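
To illustrate what "synchronously stop" means here, below is a minimal sketch in plain pthreads and C11 atomics. It is not DPDK code and does not reflect the actual service core implementation: struct worker_state, worker_loop(), worker_stop_sync() and run_demo() are hypothetical names invented for this example. The assumption is that the crash comes from shared state being freed while a thread may still be reading it, so the sketch only frees the state once pthread_join() has confirmed that the thread has exited.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for per-lcore service state; not a DPDK structure. */
struct worker_state {
	atomic_int run;            /* 1 while the worker should keep looping */
	atomic_ulong iterations;   /* updated by the worker on every loop */
};

/* Worker body: keeps touching the shared state until asked to stop. */
static void *worker_loop(void *arg)
{
	struct worker_state *st = arg;

	while (atomic_load_explicit(&st->run, memory_order_acquire))
		atomic_fetch_add_explicit(&st->iterations, 1,
					  memory_order_relaxed);
	return NULL;
}

/* Request the stop, then block until the thread has really exited. */
static int worker_stop_sync(pthread_t tid, struct worker_state *st)
{
	atomic_store_explicit(&st->run, 0, memory_order_release);
	return pthread_join(tid, NULL);
}

/* Returns 0 when the state was freed only after the worker was joined. */
int run_demo(void)
{
	struct worker_state *st = malloc(sizeof(*st));
	pthread_t tid;

	if (st == NULL)
		return -1;
	atomic_init(&st->run, 1);
	atomic_init(&st->iterations, 0);

	if (pthread_create(&tid, NULL, worker_loop, st) != 0) {
		free(st);
		return -1;
	}

	/* Wait until the worker has demonstrably used the shared state. */
	while (atomic_load(&st->iterations) == 0)
		sched_yield();

	/* If the join fails, leak rather than free memory still in use. */
	if (worker_stop_sync(tid, st) != 0)
		return -1;

	free(st);
	return 0;
}
```

Clearing the run flag alone would only be an asynchronous stop request: the thread could still be partway through an iteration when free() is called. The join is what closes that window, which would be consistent with the observation that joining all pthreads at cleanup makes the issue disappear.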