Date: Mon, 16 Feb 2015 07:32:00 -0600
From: Jay Rolette
To: Alejandro Lucero
Cc: dev
Subject: Re: [dpdk-dev] kernel: BUG: soft lockup - CPU#1 stuck for 22s! [kni_single:1782]
List-Id: patches and discussions about DPDK

Thanks Alejandro.

I'll look into the kernel dump if there is one. The system is extremely
brittle once this happens. Usually I can't do much other than power-cycle
the box.
Anything requiring sudo just locks the terminal up, so little to look at
besides the messages on the console.

Matthew Hall also suggested a few things for me to look into, so I'm
following up on that as well.

Jay

On Wed, Feb 11, 2015 at 10:25 AM, Alejandro Lucero <
alejandro.lucero@netronome.com> wrote:

> Hi Jay,
>
> I saw these errors when I worked in the HPC sector. They come usually with
> a kernel dump for each core in the machine so you can know, after some
> peering at the kernel code, how the soft lockup triggers. When I did that
> it was always an issue with the memory.
>
> So those times that you can still work on the machine after the problem,
> look at the kernel messages. I will be glad to look at it.
>
> On Wed, Feb 11, 2015 at 1:33 AM, Jay Rolette wrote:
>
> > Environment:
> >   * DPDK 1.6.0r2
> >   * Ubuntu 14.04 LTS
> >   * kernel: 3.13.0-38-generic
> >
> > When we start exercising KNI a fair bit (transferring files across it,
> > both sending and receiving), I'm starting to see a fair bit of these
> > kernel lockups:
> >
> > kernel: BUG: soft lockup - CPU#1 stuck for 22s! [kni_single:1782]
> >
> > Frequently I can't do much other than get a screenshot of the error
> > message coming across the console session once we get into this state,
> > so debugging what is happening is "interesting"...
> >
> > I've seen this on multiple hardware platforms (so not box specific) as
> > well as virtual machines.
> >
> > Are there any known issues with KNI that would cause kernel lockups in
> > DPDK 1.6? Really hoping someone that knows KNI well can point me in the
> > right direction.
> >
> > KNI in the 1.8 tree is significantly different, so it didn't look
> > straight-forward to back-port it, although I do see a few changes that
> > might be relevant.
> >
> > Any suggestions, pointers or other general help for tracking this down?
> >
> > Thanks!
> > Jay