The mechanism implemented in bpf_pkt.c is like an open coded version of
seqlock.
There is an inherent race because:
If the CPU running the callback has not yet incremented the in-use count
when the CPU doing the unload checks it, it can race with the CPU doing
the destroy.
CPU 1:                              CPU 2:
bpf_eth_unload()
bc = bpf_eth_cbh_find()
                                    bpf_rx_callback_vm (or
                                    bpf_rx_callback_jit)
rte_eth_remove_rx_callback()
bpf_eth_cbi_unload(bc)
bpf_eth_cbi_wait(bc)
At this point bc->inuse == 0 because the callback has not started yet,
but bc is about to be used by CPU 2. Calling rte_bpf_destroy() then
leads to a use-after-free.
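
To make the interleaving concrete, here is a minimal single-threaded replay of it. This is only a sketch: struct cbi_model, cbi_enter(), cbi_exit(), cbi_idle() and replay_race() are hypothetical names, not the code in bpf_pkt.c. It assumes the counter is odd while a callback is running, and it models only the counter and the destroy, not any other check the real callback may perform.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-callback state, not the DPDK struct. */
struct cbi_model {
	atomic_uint use;	/* assumed: odd while a callback is running */
	void *bpf;		/* non-NULL while the program is alive */
};

/* Datapath side: mark the callback as running (counter becomes odd). */
static void cbi_enter(struct cbi_model *bc)
{
	atomic_fetch_add_explicit(&bc->use, 1, memory_order_acq_rel);
}

/* Datapath side: mark the callback as finished (counter becomes even). */
static void cbi_exit(struct cbi_model *bc)
{
	unsigned int prev;

	prev = atomic_fetch_add_explicit(&bc->use, 1, memory_order_release);
	assert(prev & 1);	/* must have been inside a callback */
	(void)prev;
}

/* Control side: true if a wait on the counter would return immediately. */
static bool cbi_idle(struct cbi_model *bc)
{
	return (atomic_load_explicit(&bc->use, memory_order_acquire) & 1) == 0;
}

/*
 * Replays the problematic ordering on one thread. Returns true if the
 * modelled callback runs after the program was destroyed.
 */
static bool replay_race(void)
{
	static int prog;	/* stands in for the loaded BPF program */
	struct cbi_model bc;
	bool saw_destroyed;

	atomic_init(&bc.use, 0);
	bc.bpf = &prog;

	/* CPU 2 has been handed the callback but has not entered it yet. */

	/* CPU 1: the counter is even, so the wait returns; then destroy. */
	if (cbi_idle(&bc))
		bc.bpf = NULL;	/* models rte_bpf_destroy() */

	/* CPU 2: only now marks itself as running. */
	cbi_enter(&bc);
	saw_destroyed = (bc.bpf == NULL);
	cbi_exit(&bc);

	return saw_destroyed;
}
```

In this model replay_race() returns true: the counter cannot tell "no callback running" apart from "callback about to run", which is the window described above.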
There is no good way to fix this without using RCU.
Also, the code should consistently use C11 atomics rather than explicit barriers.
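
As a rough illustration of what "C11 atomics rather than barriers" could look like, the sketch below uses sequentially consistent atomic operations in place of plain accesses separated by a full barrier. struct slot and the slot_*() helpers are hypothetical and do not reflect the bpf_pkt.c layout; the sketch assumes the barrier exists to order the store of the callback handle before the load of the counter. It does not close the race described above.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical slot; field names are illustrative only. */
struct slot {
	_Atomic(void *) cb;	/* callback handle, NULL once unloaded */
	atomic_uint use;	/* assumed: odd while the datapath is inside */
};

/*
 * Control path: clear the handle, then sample the counter.
 * Both operations default to memory_order_seq_cst, so the store is
 * ordered before the load without a separate full barrier.
 */
static unsigned int slot_unpublish(struct slot *s)
{
	atomic_store(&s->cb, NULL);
	return atomic_load(&s->use);
}

/* Datapath: bump the counter, then re-read the handle (both seq_cst). */
static void *slot_enter(struct slot *s)
{
	atomic_fetch_add(&s->use, 1);
	return atomic_load(&s->cb);
}

/* Datapath: leave; release ordering publishes prior accesses. */
static void slot_exit(struct slot *s)
{
	atomic_fetch_add_explicit(&s->use, 1, memory_order_release);
}
```

The point of the design is that the ordering requirement is attached to the accesses themselves, so a reader does not have to pair up free-standing barriers by hand.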
Not sure if anyone ever uses this code anyway!