From: Tyrone Lau
To: dev@dpdk.org
Date: Thu, 12 Jun 2014 22:46:14 +0800
Subject: [dpdk-dev] A deadlock may occur in kni kernel thread while netif_receive_skb is called

Hi, all.

I recently found that the Linux kernel occasionally reports a deadlock when I use the rte_kni kernel module provided with DPDK. After reviewing the DPDK source code and searching online, I found that the deadlock occurs because netif_receive_skb is invoked in a non-softirq context (the KNI kernel thread), where softirqs are still enabled.

The erroneous source code is listed below (in lib/librte_eal/linuxapp/kni/kni_net.c:kni_net_rx_normal):

    /* Transfer received packets to netif */
    for (i = 0; i < num; i++) {
        kva = (void *)va[i] - kni->mbuf_va + kni->mbuf_kva;
        len = kva->data_len;
        data_kva = kva->data - kni->mbuf_va + kni->mbuf_kva;

        skb = dev_alloc_skb(len + 2);
        if (!skb) {
            KNI_ERR("Out of mem, dropping pkts\n");
            /* Update statistics */
            kni->stats.rx_dropped++;
        } else {
            /* Align IP on 16B boundary */
            skb_reserve(skb, 2);
            memcpy(skb_put(skb, len), data_kva, len);
            skb->dev = dev;
            skb->protocol = eth_type_trans(skb, dev);
            skb->ip_summed = CHECKSUM_UNNECESSARY;

            /* Call netif interface */
            netif_receive_skb(skb);

            /* Update statistics */
            kni->stats.rx_bytes += len;
            kni->stats.rx_packets++;
        }
    }

A similar bug was reported and fixed in the DPDK extension memnic. See
http://comments.gmane.org/gmane.comp.networking.dpdk.devel/3151

To fix this bug, we should call local_bh_disable()/local_bh_enable() around netif_receive_skb() to disable and re-enable softirqs while the packet is handed to the stack.

Best Regards
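
For illustration, the call ordering proposed above can be sketched as follows. This is only a userspace sketch under stated assumptions, not a patch against kni_net.c: struct sk_buff, local_bh_disable(), local_bh_enable() and netif_receive_skb() are replaced here by small stand-in stubs so the ordering can be checked outside the kernel, and kni_deliver_skb() is a hypothetical helper name that does not exist in DPDK.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the kernel's struct sk_buff (assumption, heavily reduced). */
struct sk_buff {
    unsigned int len;
};

static int bh_disable_depth;    /* models the softirq-disable nesting count */
static int rx_calls;            /* packets handed to the (stub) stack */
static int rx_calls_bh_enabled; /* packets handed over with softirqs enabled */

/* Stub: in the kernel this disables softirq processing on the local CPU. */
static void local_bh_disable(void)
{
    bh_disable_depth++;
}

/* Stub: in the kernel this re-enables softirqs and runs any pending ones. */
static void local_bh_enable(void)
{
    assert(bh_disable_depth > 0);
    bh_disable_depth--;
}

/* Stub: records whether it was entered with softirqs disabled. */
static int netif_receive_skb(struct sk_buff *skb)
{
    (void)skb;
    rx_calls++;
    if (bh_disable_depth == 0)
        rx_calls_bh_enabled++;
    return 0;
}

/*
 * Hypothetical helper: hand one skb to the stack from process context,
 * with softirqs disabled for the duration of the call.
 */
static int kni_deliver_skb(struct sk_buff *skb)
{
    int ret;

    local_bh_disable();
    ret = netif_receive_skb(skb);
    local_bh_enable();

    return ret;
}
```

In kni_net_rx_normal the same bracketing would presumably replace the bare netif_receive_skb(skb) call inside the else branch; whether to disable softirqs once per packet or once around the whole loop is a design choice this sketch does not settle.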