Date: Tue, 1 Mar 2022 08:55:26 -0800
From: Stephen Hemminger
To: bugzilla@dpdk.org
Cc: dev@dpdk.org
Subject: Re: [Bug 943] BUG: scheduling while atomic
Message-ID: <20220301085526.3431ff46@hermes.local>
List-Id: DPDK patches and discussions

On Tue, 01 Mar 2022 08:44:48 +0000
bugzilla@dpdk.org wrote:

> https://bugs.dpdk.org/show_bug.cgi?id=943
>
> Bug ID: 943
> Summary: BUG: scheduling while atomic
> Product: DPDK
> Version: 21.11
> Hardware: All
> OS: Linux
> Status: UNCONFIRMED
> Severity: critical
> Priority: Normal
> Component: examples
> Assignee: dev@dpdk.org
> Reporter: yun.zhou@windriver.com
> Target Milestone: ---
>
> Hi all,
>
> There is a "scheduling while atomic" bug when enslave a macvlan of kni
> interface to a bond master. This issue can be reproduced on dpdk from
> 528057df4c4fb5(kni: support promiscuous mode set).
>
>
> The kernel message is like this.
>
> [ 697.574325] igb 0000:03:00.0: removed PHC on ens8f0
> [ 697.738976] igb_uio 0000:03:00.0: mapping 1K dma=0x40a98a000
> host=000000000d71c45b
> [ 697.738981] igb_uio 0000:03:00.0: unmapping 1K dma=0x40a98a000
> host=000000000d71c45b
> [ 1200.157918] igb_uio 0000:03:00.0: uio device registered with irq 127
> [ 1235.846792] igb_uio 0000:03:00.0: uio device registered with irq 127
> [ 1236.441082] rte_kni: Creating kni...
> [ 1236.463488] IPv6: ADDRCONF(NETDEV_UP): vEth0_0: link is not ready
> [ 1238.272115] IPv6: ADDRCONF(NETDEV_UP): vEth0_0: link is not ready
> [ 1271.311470] rte_kni: Successfully release kni named vEth0_0
> [ 1273.749487] igb_uio 0000:03:00.0: uio device registered with irq 127
> [ 1274.345550] rte_kni: Creating kni...
> [ 1274.367822] IPv6: ADDRCONF(NETDEV_UP): vEth0_0: link is not ready
> [ 1276.275826] IPv6: ADDRCONF(NETDEV_UP): vEth0_0: link is not ready
> [ 1396.811992] Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)
> [ 1426.711326] bonding: remux0 is being created...
> [ 1459.463366] device vEth0_0 entered promiscuous mode
> [ 1459.463368] BUG: scheduling while atomic: bash/3560/0x00000200
> [ 1459.463369] Modules linked in: bonding macvlan rte_kni(OE) iptable_filter
> igb_uio(OE) uio vmnet(OE) vmw_vsock_vmci_transport vsock vmw_vmci vmmon(OE)
> dell_rbu dcdbas nls_iso8859_1 intel_rapl x86_pkg_temp_thermal intel_powerclamp
> coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul
> ghash_clmulni_intel snd_hda_codec_hdmi pcbc hp_wmi sparse_keymap aesni_intel
> aes_x86_64 snd_hda_codec_realtek wmi_bmof snd_hda_codec_generic crypto_simd
> snd_hda_intel snd_hda_codec glue_helper cryptd serio_raw intel_cstate
> intel_rapl_perf joydev input_leds snd_hda_core snd_hwdep snd_pcm snd_seq_midi
> snd_seq_midi_event snd_rawmidi mei_me mei snd_seq snd_seq_device snd_timer wmi
> acpi_pad i915 snd drm_kms_helper drm fb_sys_fops syscopyarea sysfillrect
> sysimgblt intel_pch_thermal video soundcore shpchp mac_hid sch_fq_codel
> [ 1459.463389] nfsd auth_rpcgss parport_pc ppdev nfs_acl lockd lp parport
> grace sunrpc ip_tables x_tables autofs4 psmouse e1000e igb i2c_algo_bit dca ptp
> ahci pps_core libahci hid_generic usbhid hid
> [ 1459.463396] CPU: 2 PID: 3560 Comm: bash Tainted: G OE
> 4.15.0-167-generic #175-Ubuntu
> [ 1459.463397] Hardware name: HP HP EliteDesk 800 G2 SFF/8054, BIOS N01 Ver.
> 02.15 06/02/2016
> [ 1459.463397] Call Trace:
> [ 1459.463403] dump_stack+0x6d/0x8b
> [ 1459.463405] __schedule_bug+0x55/0x70
> [ 1459.463407] __schedule+0x658/0x890
> [ 1459.463408] ? log_store+0x226/0x270
> [ 1459.463409] schedule+0x2c/0x80
> [ 1459.463410] schedule_timeout+0x15d/0x370 ====> in
> wait_event_interruptible_timeout
> [ 1459.463412] ? __next_timer_interrupt+0xe0/0xe0
> [ 1459.463415] kni_net_process_request+0x277/0x300 [rte_kni]
> [ 1459.463416] ? wait_woken+0x80/0x80
> [ 1459.463418] kni_net_change_rx_flags+0x6b/0x90 [rte_kni]
> [ 1459.463420] __dev_set_promiscuity+0x121/0x1d0
> [ 1459.463421] __dev_set_rx_mode+0x83/0x90
> [ 1459.463423] dev_uc_add+0x56/0x70 ====> enter atomic context by calling
> netif_addr_lock_bh()
> [ 1459.463424] macvlan_open+0x15e/0x1d0 [macvlan]
> [ 1459.463426] __dev_open+0xd3/0x160
> [ 1459.463427] dev_open+0x4e/0x90
> [ 1459.463431] bond_enslave+0x62a/0x1530 [bonding]
> [ 1459.463433] ? vsscanf+0x805/0x8d0
> [ 1459.463434] ? sscanf+0x49/0x70
> [ 1459.463438] bond_option_slaves_set+0xd0/0x1a0 [bonding]
> [ 1459.463441] __bond_opt_set+0x101/0x3a0 [bonding]
> [ 1459.463444] __bond_opt_set_notify+0x2c/0x80 [bonding]
> [ 1459.463447] bond_opt_tryset_rtnl+0x56/0xa0 [bonding]
> [ 1459.463450] bonding_sysfs_store_option+0x35/0x60 [bonding]
> [ 1459.463452] dev_attr_store+0x1b/0x30
> [ 1459.463453] sysfs_kf_write+0x3c/0x50
> [ 1459.463454] kernfs_fop_write+0x125/0x1a0
> [ 1459.463456] __vfs_write+0x1b/0x40
> [ 1459.463456] vfs_write+0xb1/0x1a0
> [ 1459.463457] SyS_write+0x5c/0xe0
> [ 1459.463459] do_syscall_64+0x73/0x130 ====> write
> /sys/class/net/remux0/bonding/slaves to enslave eth5
> [ 1459.463460] entry_SYSCALL_64_after_hwframe+0x41/0xa6
> [ 1459.463461] RIP: 0033:0x7fdda9a41224
> [ 1459.463462] RSP: 002b:00007ffe02de1b68 EFLAGS: 00000246 ORIG_RAX:
> 0000000000000001
> [ 1459.463463] RAX: ffffffffffffffda RBX: 0000000000000006 RCX:
> 00007fdda9a41224
> [ 1459.463463] RDX: 0000000000000006 RSI: 000055d2b3e046f0 RDI:
> 0000000000000001
> [ 1459.463464] RBP: 000055d2b3e046f0 R08: 000000000000000a R09:
> 0000000000000005
> [ 1459.463465] R10: 000000000000000a R11: 0000000000000246 R12:
> 00007fdda9d1d760
> [ 1459.463465] R13: 0000000000000006 R14: 00007fdda9d192a0 R15:
> 00007fdda9d18760
> [ 1459.463705] remux0: Enslaving eth5 as an active interface with an up link
>
>
> Reproduction Steps
>
> # modprobe uio
> # insmod dpdk-kmods/linux/igb_uio/igb_uio.ko
> # insmod build/kernel/linux/kni/rte_kni.ko
>
> # ifconfig ens8f0 down
> # ./usertools/dpdk-devbind.py -b igb_uio ens8f0
>
> # ./build/examples/dpdk-kni ./${version}/kni -c 0xf -- -p 0x1 -P
> --config="(0,0,1,2)" &
> [1] 12356
> root@pek-yzhou-d2:/home/wrsadmin/work/dpdk# EAL: Detected CPU lcores: 4
> EAL: Detected NUMA nodes: 1
> EAL: Detected static linkage of DPDK
> EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
> EAL: Selected IOVA mode 'PA'
> EAL: Probe PCI driver: net_e1000_igb (8086:10c9) device: 0000:03:00.0 (socket
> 0)
> TELEMETRY: No legacy callbacks, legacy socket not created
> APP: Initialising port 0 ...
>
> Checking link status
> ...............done
> Port 0 Link up at 100 Mbps FDX Autoneg
> APP: ========================
> APP: KNI Running
> APP: kill -SIGUSR1 12356
> APP: Show KNI Statistics.
> APP: kill -SIGUSR2 12356
> APP: Zero KNI Statistics.
> APP: ========================
> APP: Lcore 1 is writing to port 0
> APP: Lcore 2 has nothing to do
> APP: Lcore 3 has nothing to do
> APP: Lcore 0 is reading from port 0
> APP: Configure network interface of 0 up
>
> # ip li add link vEth0_0 eth5 address 00:11:22:33:44:55 type macvlan mode
> private
> # ip li add link vEth0_0 eth6 address 00:11:22:33:44:56 type macvlan mode
> private
> # modprobe bonding
> # echo +remux0 > /sys/class/net/bonding_masters
> # echo +eth5 > /sys/class/net/remux0/bonding/slaves
>

This looks like a bad idea...
KNI is fragile and uses kernel netdev in ways that are unlikely to be safe.
See the calling userspace with RTNL held bugs.
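
For anyone reading the trace in the report, here is a toy Python model of the failure pattern. It is only a sketch, not kernel code and not how KNI is implemented. The only names taken from the report are dev_uc_add and netif_addr_lock_bh(). ToyTask, might_sleep, blocking_request and deferred_request are hypothetical names invented for the illustration. The assumption is that a request which waits for a userspace reply may sleep, and that the rx-flags callback runs with the address-list lock held. Queueing the request and returning is shown as one common way to stay out of trouble, not as the fix DPDK adopted.

```python
import queue


class SchedulingWhileAtomic(RuntimeError):
    """Toy stand-in for the kernel's 'BUG: scheduling while atomic'."""


class ToyTask:
    """Hypothetical model of a task's preempt count (not a kernel API)."""

    def __init__(self):
        self.preempt_count = 0

    def might_sleep(self):
        # The real scheduler complains when asked to sleep with a
        # non-zero preempt count; this model raises instead.
        if self.preempt_count:
            raise SchedulingWhileAtomic(f"preempt_count={self.preempt_count}")


def dev_uc_add(task, change_rx_flags):
    """Model of the locked section: the callback runs in atomic context."""
    task.preempt_count += 1      # stands in for netif_addr_lock_bh()
    try:
        change_rx_flags()
    finally:
        task.preempt_count -= 1  # stands in for netif_addr_unlock_bh()


def blocking_request(task):
    """Models a request that waits for a userspace reply, so it may sleep."""
    task.might_sleep()
    return "reply"


def deferred_request(pending, request):
    """Alternative: record the request and return without waiting."""
    pending.put_nowait(request)


task = ToyTask()
pending = queue.Queue()

# Deferring the request is fine inside the locked section.
dev_uc_add(task, lambda: deferred_request(pending, "promisc on"))

# Waiting for a reply inside the locked section trips the check.
try:
    dev_uc_add(task, lambda: blocking_request(task))
except SchedulingWhileAtomic as exc:
    print("caught:", exc)
```

The model leaves out everything that makes the real case hard (who drains the queue, how errors get back to the caller), so treat it as a reading aid for the call trace only.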