* [Bug 943] BUG: scheduling while atomic
@ 2022-03-01  8:44 bugzilla
  2022-03-01 16:55 ` Stephen Hemminger
  2023-12-15  3:55 ` bugzilla
  0 siblings, 2 replies; 3+ messages in thread
From: bugzilla @ 2022-03-01  8:44 UTC (permalink / raw)
  To: dev
https://bugs.dpdk.org/show_bug.cgi?id=943
            Bug ID: 943
           Summary: BUG: scheduling while atomic
           Product: DPDK
           Version: 21.11
          Hardware: All
                OS: Linux
            Status: UNCONFIRMED
          Severity: critical
          Priority: Normal
         Component: examples
          Assignee: dev@dpdk.org
          Reporter: yun.zhou@windriver.com
  Target Milestone: ---
Hi all, 
There is a "scheduling while atomic" bug when a macvlan device created on top
of a KNI interface is enslaved to a bond master. The issue can be reproduced on
DPDK since commit 528057df4c4fb5 (kni: support promiscuous mode set).
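To make the failure mode concrete, here is a minimal user-space sketch in Python; it is not kernel code. It assumes a simplified model in which taking a BH-disabling lock bumps a counter and any blocking wait checks that counter, which is roughly what the scheduler's check does. The function names `dev_uc_add` and `kni_net_change_rx_flags` mirror frames in the call trace of this report; `AtomicContext`, `wait_for_userspace`, and the exception type are hypothetical.

```python
from contextlib import contextmanager


class SchedulingWhileAtomic(RuntimeError):
    """Raised when the model tries to sleep while in atomic context."""


class AtomicContext:
    """Toy stand-in for the kernel's per-task preempt counter."""

    def __init__(self):
        self.depth = 0

    @contextmanager
    def addr_lock_bh(self):
        # Models netif_addr_lock_bh(): a spinlock that also disables
        # bottom halves, so the holder must not sleep.
        self.depth += 1
        try:
            yield
        finally:
            self.depth -= 1

    def wait_for_userspace(self):
        # Models the blocking wait in kni_net_process_request().
        if self.depth:
            raise SchedulingWhileAtomic("sleeping with addr lock held")
        return "reply"


def kni_net_change_rx_flags(ctx):
    # The KNI callback forwards the request to user space and waits.
    return ctx.wait_for_userspace()


def dev_uc_add(ctx):
    # The core calls the driver's rx-flags callback under the lock.
    with ctx.addr_lock_bh():
        return kni_net_change_rx_flags(ctx)
```

In this model, calling `kni_net_change_rx_flags()` directly succeeds, while reaching it through `dev_uc_add()` raises, which is the same shape as the reported splat.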
> The kernel log is as follows.
[  697.574325] igb 0000:03:00.0: removed PHC on ens8f0
[  697.738976] igb_uio 0000:03:00.0: mapping 1K dma=0x40a98a000 host=000000000d71c45b
[  697.738981] igb_uio 0000:03:00.0: unmapping 1K dma=0x40a98a000 host=000000000d71c45b
[ 1200.157918] igb_uio 0000:03:00.0: uio device registered with irq 127
[ 1235.846792] igb_uio 0000:03:00.0: uio device registered with irq 127
[ 1236.441082] rte_kni: Creating kni...
[ 1236.463488] IPv6: ADDRCONF(NETDEV_UP): vEth0_0: link is not ready
[ 1238.272115] IPv6: ADDRCONF(NETDEV_UP): vEth0_0: link is not ready
[ 1271.311470] rte_kni: Successfully release kni named vEth0_0
[ 1273.749487] igb_uio 0000:03:00.0: uio device registered with irq 127
[ 1274.345550] rte_kni: Creating kni...
[ 1274.367822] IPv6: ADDRCONF(NETDEV_UP): vEth0_0: link is not ready
[ 1276.275826] IPv6: ADDRCONF(NETDEV_UP): vEth0_0: link is not ready
[ 1396.811992] Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)
[ 1426.711326] bonding: remux0 is being created...
[ 1459.463366] device vEth0_0 entered promiscuous mode
[ 1459.463368] BUG: scheduling while atomic: bash/3560/0x00000200
[ 1459.463369] Modules linked in: bonding macvlan rte_kni(OE) iptable_filter
igb_uio(OE) uio vmnet(OE) vmw_vsock_vmci_transport vsock vmw_vmci vmmon(OE)
dell_rbu dcdbas nls_iso8859_1 intel_rapl x86_pkg_temp_thermal intel_powerclamp
coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul
ghash_clmulni_intel snd_hda_codec_hdmi pcbc hp_wmi sparse_keymap aesni_intel
aes_x86_64 snd_hda_codec_realtek wmi_bmof snd_hda_codec_generic crypto_simd
snd_hda_intel snd_hda_codec glue_helper cryptd serio_raw intel_cstate
intel_rapl_perf joydev input_leds snd_hda_core snd_hwdep snd_pcm snd_seq_midi
snd_seq_midi_event snd_rawmidi mei_me mei snd_seq snd_seq_device snd_timer wmi
acpi_pad i915 snd drm_kms_helper drm fb_sys_fops syscopyarea sysfillrect
sysimgblt intel_pch_thermal video soundcore shpchp mac_hid sch_fq_codel
[ 1459.463389]  nfsd auth_rpcgss parport_pc ppdev nfs_acl lockd lp parport
grace sunrpc ip_tables x_tables autofs4 psmouse e1000e igb i2c_algo_bit dca ptp
ahci pps_core libahci hid_generic usbhid hid
[ 1459.463396] CPU: 2 PID: 3560 Comm: bash Tainted: G           OE    4.15.0-167-generic #175-Ubuntu
[ 1459.463397] Hardware name: HP HP EliteDesk 800 G2 SFF/8054, BIOS N01 Ver. 02.15 06/02/2016
[ 1459.463397] Call Trace:
[ 1459.463403]  dump_stack+0x6d/0x8b
[ 1459.463405]  __schedule_bug+0x55/0x70
[ 1459.463407]  __schedule+0x658/0x890
[ 1459.463408]  ? log_store+0x226/0x270
[ 1459.463409]  schedule+0x2c/0x80
[ 1459.463410]  schedule_timeout+0x15d/0x370 ====> in wait_event_interruptible_timeout
[ 1459.463412]  ? __next_timer_interrupt+0xe0/0xe0
[ 1459.463415]  kni_net_process_request+0x277/0x300 [rte_kni]
[ 1459.463416]  ? wait_woken+0x80/0x80
[ 1459.463418]  kni_net_change_rx_flags+0x6b/0x90 [rte_kni]
[ 1459.463420]  __dev_set_promiscuity+0x121/0x1d0
[ 1459.463421]  __dev_set_rx_mode+0x83/0x90
[ 1459.463423]  dev_uc_add+0x56/0x70 ====> enter atomic context by calling netif_addr_lock_bh()
[ 1459.463424]  macvlan_open+0x15e/0x1d0 [macvlan]
[ 1459.463426]  __dev_open+0xd3/0x160
[ 1459.463427]  dev_open+0x4e/0x90
[ 1459.463431]  bond_enslave+0x62a/0x1530 [bonding]
[ 1459.463433]  ? vsscanf+0x805/0x8d0
[ 1459.463434]  ? sscanf+0x49/0x70
[ 1459.463438]  bond_option_slaves_set+0xd0/0x1a0 [bonding]
[ 1459.463441]  __bond_opt_set+0x101/0x3a0 [bonding]
[ 1459.463444]  __bond_opt_set_notify+0x2c/0x80 [bonding]
[ 1459.463447]  bond_opt_tryset_rtnl+0x56/0xa0 [bonding]
[ 1459.463450]  bonding_sysfs_store_option+0x35/0x60 [bonding]
[ 1459.463452]  dev_attr_store+0x1b/0x30
[ 1459.463453]  sysfs_kf_write+0x3c/0x50
[ 1459.463454]  kernfs_fop_write+0x125/0x1a0
[ 1459.463456]  __vfs_write+0x1b/0x40
[ 1459.463456]  vfs_write+0xb1/0x1a0
[ 1459.463457]  SyS_write+0x5c/0xe0
[ 1459.463459]  do_syscall_64+0x73/0x130 ====> write /sys/class/net/remux0/bonding/slaves to enslave eth5
[ 1459.463460]  entry_SYSCALL_64_after_hwframe+0x41/0xa6
[ 1459.463461] RIP: 0033:0x7fdda9a41224
[ 1459.463462] RSP: 002b:00007ffe02de1b68 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[ 1459.463463] RAX: ffffffffffffffda RBX: 0000000000000006 RCX: 00007fdda9a41224
[ 1459.463463] RDX: 0000000000000006 RSI: 000055d2b3e046f0 RDI: 0000000000000001
[ 1459.463464] RBP: 000055d2b3e046f0 R08: 000000000000000a R09: 0000000000000005
[ 1459.463465] R10: 000000000000000a R11: 0000000000000246 R12: 00007fdda9d1d760
[ 1459.463465] R13: 0000000000000006 R14: 00007fdda9d192a0 R15: 00007fdda9d18760
[ 1459.463705] remux0: Enslaving eth5 as an active interface with an up link
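The last field of the `BUG:` line is the task's preempt count in hex. The sketch below decodes it in Python, assuming the bit layout from the kernel's `include/linux/preempt.h` (preempt depth in bits 0-7, softirq count in bits 8-15, hardirq count in bits 16-19). The layout of the higher bits varies between kernel versions and is ignored here. The function names are illustrative.

```python
import re

# Assumed field layout (include/linux/preempt.h); higher bits are ignored.
PREEMPT_MASK = 0x000000FF
SOFTIRQ_MASK = 0x0000FF00
HARDIRQ_MASK = 0x000F0000

BUG_RE = re.compile(
    r"BUG: scheduling while atomic: "
    r"(?P<comm>.+)/(?P<pid>\d+)/(?P<count>0x[0-9a-fA-F]+)"
)


def decode_preempt_count(count):
    """Split a preempt_count value into its three low fields."""
    return {
        "preempt": count & PREEMPT_MASK,
        "softirq": (count & SOFTIRQ_MASK) >> 8,
        "hardirq": (count & HARDIRQ_MASK) >> 16,
    }


def parse_bug_line(line):
    """Return (comm, pid, decoded fields), or None if the line does not match."""
    m = BUG_RE.search(line)
    if m is None:
        return None
    count = int(m.group("count"), 16)
    return m.group("comm"), int(m.group("pid")), decode_preempt_count(count)


line = "[ 1459.463368] BUG: scheduling while atomic: bash/3560/0x00000200"
print(parse_bug_line(line))
```

For the line in this report the softirq field is 2 and the other two fields are 0. The kernel accounts one bottom-half disable as 2 in that field (`SOFTIRQ_DISABLE_OFFSET` is twice `SOFTIRQ_OFFSET`), so the value is consistent with a single `_bh` lock being held, which agrees with the reporter's annotation on `dev_uc_add`.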
> Reproduction Steps
# modprobe uio
# insmod dpdk-kmods/linux/igb_uio/igb_uio.ko
# insmod build/kernel/linux/kni/rte_kni.ko
# ifconfig ens8f0 down
# ./usertools/dpdk-devbind.py -b igb_uio ens8f0
# ./build/examples/dpdk-kni ./${version}/kni -c 0xf -- -p 0x1 -P --config="(0,0,1,2)" &
[1] 12356
root@pek-yzhou-d2:/home/wrsadmin/work/dpdk# EAL: Detected CPU lcores: 4
EAL: Detected NUMA nodes: 1
EAL: Detected static linkage of DPDK
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: Selected IOVA mode 'PA'
EAL: Probe PCI driver: net_e1000_igb (8086:10c9) device: 0000:03:00.0 (socket 0)
TELEMETRY: No legacy callbacks, legacy socket not created
APP: Initialising port 0 ...
Checking link status
...............done
Port 0 Link up at 100 Mbps FDX Autoneg
APP: ========================
APP: KNI Running
APP: kill -SIGUSR1 12356
APP:     Show KNI Statistics.
APP: kill -SIGUSR2 12356
APP:     Zero KNI Statistics.
APP: ========================
APP: Lcore 1 is writing to port 0
APP: Lcore 2 has nothing to do
APP: Lcore 3 has nothing to do
APP: Lcore 0 is reading from port 0
APP: Configure network interface of 0 up
# ip li add link vEth0_0 eth5 address 00:11:22:33:44:55 type macvlan mode private
# ip li add link vEth0_0 eth6 address 00:11:22:33:44:56 type macvlan mode private
# modprobe bonding
# echo +remux0 > /sys/class/net/bonding_masters
# echo +eth5 > /sys/class/net/remux0/bonding/slaves
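After the last step, the splat should appear in the kernel log. As a convenience, the following Python sketch scans log text (for example the output of `dmesg`) for the `scheduling while atomic` line and collects the stack frames that follow, so one can check whether `rte_kni` is on the stack. The helper name `find_atomic_splats` and the frame pattern are assumptions based on the log format shown in this report.

```python
import re

STAMP = r"\[\s*\d+\.\d+\]"
BUG_RE = re.compile(STAMP + r" BUG: scheduling while atomic")
# A frame looks like "[ ts]  func+0x6d/0x8b [module]"; a leading "? "
# marks an unreliable frame.
FRAME_RE = re.compile(
    STAMP + r"\s+(?P<unreliable>\? )?(?P<func>[\w.]+)"
    r"\+0x[0-9a-f]+/0x[0-9a-f]+(?: \[(?P<mod>\w+)\])?"
)


def find_atomic_splats(log_text):
    """Return one list of (function, module) frames per splat found."""
    splats = []
    frames = None  # frame list of the splat being parsed, if any
    in_trace = False
    for raw in log_text.splitlines():
        line = raw.lstrip()
        if BUG_RE.match(line):
            frames = []
            splats.append(frames)
            in_trace = False
        elif frames is not None and "Call Trace:" in line:
            in_trace = True
        elif in_trace:
            m = FRAME_RE.match(line)
            if m:
                if not m.group("unreliable"):
                    frames.append((m.group("func"), m.group("mod")))
            elif line.startswith("["):
                # First stamped non-frame line (e.g. "RIP: ...") ends the trace.
                frames, in_trace = None, False
    return splats
```

A splat whose frame list contains a `("kni_net_process_request", "rte_kni")` entry above a `dev_uc_add` entry matches the trace in this report. Unstamped continuation lines, such as wrapped annotations, are skipped.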
-- 
You are receiving this mail because:
You are the assignee for the bug.
* Re: [Bug 943] BUG: scheduling while atomic
  2022-03-01  8:44 [Bug 943] BUG: scheduling while atomic bugzilla
@ 2022-03-01 16:55 ` Stephen Hemminger
  2023-12-15  3:55 ` bugzilla
  1 sibling, 0 replies; 3+ messages in thread
From: Stephen Hemminger @ 2022-03-01 16:55 UTC (permalink / raw)
  To: bugzilla; +Cc: dev
On Tue, 01 Mar 2022 08:44:48 +0000
bugzilla@dpdk.org wrote:
> Hi all, 
> 
> There is a "scheduling while atomic" bug when enslave a macvlan of kni
> interface to a bond master. This issue can be reproduced on dpdk from
> 528057df4c4fb5(kni: support promiscuous mode set).
This looks like a bad idea...
KNI is fragile and uses the kernel netdev API in ways that are unlikely
to be safe.  See the bugs about calling into userspace with RTNL held.
* [Bug 943] BUG: scheduling while atomic
  2022-03-01  8:44 [Bug 943] BUG: scheduling while atomic bugzilla
  2022-03-01 16:55 ` Stephen Hemminger
@ 2023-12-15  3:55 ` bugzilla
  1 sibling, 0 replies; 3+ messages in thread
From: bugzilla @ 2023-12-15  3:55 UTC (permalink / raw)
  To: dev
https://bugs.dpdk.org/show_bug.cgi?id=943
Stephen Hemminger (stephen@networkplumber.org) changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |WONTFIX
                 CC|                            |stephen@networkplumber.org
             Status|CONFIRMED                   |RESOLVED
--- Comment #1 from Stephen Hemminger (stephen@networkplumber.org) ---
KNI was flawed by design. It called into userspace with the RTNL mutex held,
leading to all sorts of issues. Because of this and other flaws, the driver was
deprecated and then removed in the 23.11 release. Closing this bug.