From: Jay Rolette
To: Dev
Date: Mon, 16 Feb 2015 10:33:52 -0600
Subject: Re: [dpdk-dev] kernel: BUG: soft lockup - CPU#1 stuck for 22s! [kni_single:1782]
List-Id: patches and discussions about DPDK

On Tue, Feb 10, 2015 at 7:33 PM, Jay Rolette wrote:

> Environment:
> * DPDK 1.6.0r2
> * Ubuntu 14.04 LTS
> * kernel: 3.13.0-38-generic
>
> When we start exercising KNI a fair bit (transferring files across it,
> both sending and receiving), I'm starting to see a fair bit of these kernel
> lockups:
>
> kernel: BUG: soft lockup - CPU#1 stuck for 22s! [kni_single:1782]
>
> Frequently I can't do much other than get a screenshot of the error
> message coming across the console session once we get into this state, so
> debugging what is happening is "interesting"...
>
> I've seen this on multiple hardware platforms (so not box specific) as
> well as virtual machines.
>
> Are there any known issues with KNI that would cause kernel lockups in
> DPDK 1.6? Really hoping someone that knows KNI well can point me in the
> right direction.
>
> KNI in the 1.8 tree is significantly different, so it didn't look
> straight-forward to back-port it, although I do see a few changes that
> might be relevant.
>

Found the problem. No patch to submit since it's already fixed in later
versions of DPDK, but thought I'd follow up with the details since I'm sure
we aren't the only ones trying to use bleeding-edge versions of DPDK...

kni_net_rx_normal() was calling netif_receive_skb() instead of netif_rx().
The source for netif_receive_skb() points out that it should only be called
from soft-irq context, which isn't the case for KNI.

As is typical, it's a simple fix once you track it down.

Yao-Po Wang's fix: commit 41a6ebded53982107c1adfc0652d6cc1375a7db9.

Cheers,
Jay
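
For anyone who wants the context rule spelled out, here is a rough
userspace sketch of it. This is a toy model, not kernel code and not the
actual patch: every model_* name is hypothetical, and it assumes (as the
kni_single thread name in the log suggests) that the KNI receive path runs
in a kernel thread, i.e. process context. It only illustrates that
netif_receive_skb() handles the packet inline in the caller's context,
while netif_rx() queues it so the stack processes it later from soft-irq
context.

```c
#include <stdbool.h>

/* Toy userspace model of the calling-context rule. NOT kernel code.
 * All model_* identifiers are hypothetical and exist only for illustration. */

enum model_ctx { MODEL_CTX_PROCESS, MODEL_CTX_SOFTIRQ };

struct model_stats {
    unsigned handled_inline; /* packets processed directly in the caller */
    unsigned queued;         /* packets deferred for later soft-irq handling */
    unsigned misuse;         /* calls that broke the soft-irq-only rule */
};

/* Stands in for netif_receive_skb(): handles the packet synchronously,
 * so it is only valid when the caller is already in soft-irq context. */
static void model_netif_receive_skb(enum model_ctx ctx, struct model_stats *s)
{
    if (ctx != MODEL_CTX_SOFTIRQ)
        s->misuse++;
    s->handled_inline++;
}

/* Stands in for netif_rx(): only queues the packet; the real processing
 * happens later in soft-irq context, so the caller's context is not
 * constrained in the same way. */
static void model_netif_rx(enum model_ctx ctx, struct model_stats *s)
{
    (void)ctx;
    s->queued++;
}

/* Stands in for the KNI receive loop, assumed to run in process context.
 * use_fix == false models the DPDK 1.6 behaviour described above,
 * use_fix == true models the behaviour after the fix. */
static struct model_stats model_kni_rx(unsigned npkts, bool use_fix)
{
    struct model_stats s = {0, 0, 0};

    for (unsigned i = 0; i < npkts; i++) {
        if (use_fix)
            model_netif_rx(MODEL_CTX_PROCESS, &s);
        else
            model_netif_receive_skb(MODEL_CTX_PROCESS, &s);
    }
    return s;
}
```

In this model, model_kni_rx(8, false) records eight rule violations and
model_kni_rx(8, true) records none. In the real driver the change is the
single call swap described above.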