Date: Thu, 12 Jun 2014 17:21:15 -0700
From: Stephen Hemminger
To: Tyrone Lau
Cc: dev@dpdk.org
Subject: Re: [dpdk-dev] A deadlock may occur in kni kernel thread while netif_receive_skb is called
Message-ID: <20140612172115.5dc60812@nehalam.linuxnetplumber.net>

On Thu, 12 Jun 2014 22:46:14 +0800
Tyrone Lau wrote:

> Hi, all. I have found recently the Linux kernel will complain occasionally
> a dead lock, while I use the kernel module rte_kni provided in DPDK. After
> reviewing the dpdk source code and googling,
> I found that the deadlock occurred because netif_receive_skb is invoked in
> a non-softirq context.
> The erroneous source code is listed as below (in
> lib/librte_eal/linuxapp/kni/kni_net.c:kni_net_rx_normal):
>
> 	/* Transfer received packets to netif */
> 	for (i = 0; i < num; i++) {
> 		kva = (void *)va[i] - kni->mbuf_va + kni->mbuf_kva;
> 		len = kva->data_len;
> 		data_kva = kva->data - kni->mbuf_va + kni->mbuf_kva;
>
> 		skb = dev_alloc_skb(len + 2);
> 		if (!skb) {
> 			KNI_ERR("Out of mem, dropping pkts\n");
> 			/* Update statistics */
> 			kni->stats.rx_dropped++;
> 		}
> 		else {
> 			/* Align IP on 16B boundary */
> 			skb_reserve(skb, 2);
> 			memcpy(skb_put(skb, len), data_kva, len);
> 			skb->dev = dev;
> 			skb->protocol = eth_type_trans(skb, dev);
> 			skb->ip_summed = CHECKSUM_UNNECESSARY;
>
> 			/* Call netif interface */
> 			netif_receive_skb(skb);
>
> 			/* Update statistics */
> 			kni->stats.rx_bytes += len;
> 			kni->stats.rx_packets++;
> 		}
> 	}
>
> The similar bug is reported and fixed in dpdk extension memnic. See
>
> http://comments.gmane.org/gmane.comp.networking.dpdk.devel/3151
>
> To fix this bug, we should call local_bh_disable/local_bh_enable
> around netif_receive_skb to disable and re-enable soft-irq.
> Best Regards

Probably better to call netif_rx instead, because that will handle
the case of overrun.

Other comments: this code should be using per-cpu stats.
It should use netdev_alloc_skb_ip_align rather than doing the alignment itself.
Even better yet would be bursting packets into the receive handler.
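
The overrun point behind the netif_rx suggestion can be illustrated with
a small userspace model. This is only a sketch, not kernel code:
model_netif_rx, model_rx_burst, struct model_backlog, struct model_stats
and MODEL_BACKLOG_MAX are hypothetical names made up for the
illustration, and the tiny queue limit is an assumption chosen to keep
the demo short. The two return codes mirror the values of the kernel's
NET_RX_SUCCESS and NET_RX_DROP.

```c
#include <assert.h>
#include <stddef.h>

/* Same numeric values as the kernel's NET_RX_SUCCESS / NET_RX_DROP. */
enum { MODEL_RX_SUCCESS = 0, MODEL_RX_DROP = 1 };

/* Hypothetical backlog limit, kept tiny for the demonstration. */
#define MODEL_BACKLOG_MAX 4

/* Stand-in for a receive backlog: only the queue depth is modelled. */
struct model_backlog {
	size_t count;
};

/* Stand-in for the rx counters updated in the quoted loop. */
struct model_stats {
	unsigned long rx_packets;
	unsigned long rx_bytes;
	unsigned long rx_dropped;
};

/*
 * Enqueue-style receive: the packet is queued for later processing
 * instead of being handled inline, and is dropped once the backlog
 * is full.
 */
static int model_netif_rx(struct model_backlog *q)
{
	if (q->count >= MODEL_BACKLOG_MAX)
		return MODEL_RX_DROP;

	q->count++;
	assert(q->count <= MODEL_BACKLOG_MAX);
	return MODEL_RX_SUCCESS;
}

/*
 * Feed a burst of num packets of len bytes each into the backlog,
 * counting a packet as received only when the enqueue succeeded.
 */
static void model_rx_burst(struct model_backlog *q, struct model_stats *st,
			   size_t num, size_t len)
{
	size_t i;

	for (i = 0; i < num; i++) {
		if (model_netif_rx(q) == MODEL_RX_DROP) {
			st->rx_dropped++;
			continue;
		}
		st->rx_packets++;
		st->rx_bytes += len;
	}
}
```

With an empty model backlog and a limit of 4, a burst of 6 packets is
accounted as 4 received and 2 dropped. The quoted loop does not look at
the return value of the receive call, so it has no place to account for
such a drop.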