On Fri, Jan 26, 2024 at 1:53 AM Stephen Hemminger
<stephen@networkplumber.org> wrote:

> On Thu, 25 Jan 2024 10:48:07 +0200
> Pavel Vazharov wrote:
>
> > Hi there,
> >
> > I'd like to ask for advice on a weird issue that I'm facing while
> > trying to run XDP on top of a bonding device (802.3ad) (and also on
> > the physical interfaces behind the bond).
> >
> > I have a DPDK application which runs on top of XDP sockets, using
> > the DPDK AF_XDP driver. It was a pure DPDK application, but lately
> > it was migrated to run on top of XDP sockets because we need to
> > split the traffic entering the machine between the DPDK application
> > and other "standard-Linux" applications running on the same machine.
> >
> > The application works fine when running on top of a single
> > interface, but it has problems when it runs on top of a bonding
> > interface. It needs to be able to run with multiple XDP sockets,
> > where each socket (or group of XDP sockets) is handled in a separate
> > thread. However, the bonding device is reported with a single queue,
> > and thus the application can't open more than one XDP socket for it.
> > So I've tried binding the XDP sockets to the queues of the physical
> > interfaces. For example:
> > - 3 interfaces, each set to have 8 queues.
> > - I've created 3 virtual af_xdp devices, each with 8 queues, i.e.
> >   24 XDP sockets in total, each bound to a separate queue (this
> >   functionality is provided by the DPDK itself).
> > - I've run the application on 2 threads, where the first thread
> >   handled the first 12 queues (XDP sockets) and the second thread
> >   handled the next 12 queues (XDP sockets), i.e. the first thread
> >   worked with all 8 queues from af_xdp device 0 and the first 4
> >   queues from af_xdp device 1. The second thread worked with the
> >   next 4 queues from af_xdp device 1 and all 8 queues from af_xdp
> >   device 2. I've also tried another distribution scheme (see below).
> >   The given threads just call the receive/transmit functions
> >   provided by the DPDK for the assigned queues.
> > - The problem is that with this scheme the network device on the
> >   other side reports: "The member of the LACP mode Eth-Trunk
> >   interface received an abnormal LACPDU, which may be caused by
> >   optical fiber misconnection". This error is always reported for
> >   the last device/interface in the bonding, and the bonding/LACP
> >   doesn't work.
> > - Another thing is that if I run the DPDK application on a single
> >   thread, so that the sending/receiving on all queues is handled on
> >   that one thread, then the bonding seems to work correctly and the
> >   above error is not reported.
> > - I've checked the code multiple times and I'm sure that each thread
> >   is accessing its own group of queues/sockets.
> > - I've tried 2 different schemes of accessing the queues, but each
> >   one led to the same issue. For example, written as
> >   (device_idx - queue_idx), I've tried these two orders of
> >   accessing:
> >
> >     Thread 1    Thread 2
> >     (0 - 0)     (1 - 4)
> >     (0 - 1)     (1 - 5)
> >     ...         (1 - 6)
> >     ...         (1 - 7)
> >     (0 - 7)     (2 - 0)
> >     (1 - 0)     (2 - 1)
> >     (1 - 1)     ...
> >     (1 - 2)     ...
> >     (1 - 3)     (2 - 7)
> >
> >     Thread 1    Thread 2
> >     (0 - 0)     (0 - 4)
> >     (1 - 0)     (1 - 4)
> >     (2 - 0)     (2 - 4)
> >     (0 - 1)     (0 - 5)
> >     (1 - 1)     (1 - 5)
> >     (2 - 1)     (2 - 5)
> >     ...         ...
> >     (0 - 3)     (0 - 7)
> >     (1 - 3)     (1 - 7)
> >     (2 - 3)     (2 - 7)
> >
> > And here are my questions based on the above situation:
> > 1. I assumed that it's not possible to run multiple XDP sockets on
> >    top of the bonding device itself and that I need to "bind" the
> >    XDP sockets to the physical interfaces behind the bonding device.
> >    Am I right about this or am I missing something?
> > 2. Is the bonding logic (LACP management traffic) affected by the
> >    access pattern of the XDP sockets?
> > 3. Is this scheme supposed to work, or is the design simply wrong?
> >    I mean, maybe a group of queues/sockets shouldn't be handled on a
> >    given thread, and only a single queue should be handled on a
> >    given application thread. It's just that the physical devices
> >    have more queues set up on them than the number of threads in the
> >    DPDK application, and thus multiple queues need to be handled on
> >    a single application thread.
> >
> > Any ideas are appreciated!
> >
> > Regards,
> > Pavel.
>
> Look at recent discussions on netdev mailing list.
> Linux bonding device still needs more work to fully support XDP.

Thank you. Will do so.
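
For reference, the two queue-to-thread assignment schemes from the original message can be sketched as below. This is only an illustration of the index arithmetic, assuming 3 devices with 8 queues each and 2 threads as described above. The function names `contiguous_split` and `interleaved_split` are hypothetical and are not part of DPDK or the application.

```python
from typing import List, Tuple

Queue = Tuple[int, int]  # (device_idx, queue_idx)


def contiguous_split(num_devices: int, queues_per_device: int,
                     num_threads: int) -> List[List[Queue]]:
    """First scheme: list all queues device by device, then cut the
    list into equal consecutive chunks, one chunk per thread."""
    total = num_devices * queues_per_device
    if total % num_threads != 0:
        raise ValueError("queues must divide evenly among threads")
    all_queues = [(d, q) for d in range(num_devices)
                  for q in range(queues_per_device)]
    per_thread = total // num_threads
    return [all_queues[t * per_thread:(t + 1) * per_thread]
            for t in range(num_threads)]


def interleaved_split(num_devices: int, queues_per_device: int,
                      num_threads: int) -> List[List[Queue]]:
    """Second scheme: each thread owns a range of queue indices and
    visits that range on every device, device by device."""
    if queues_per_device % num_threads != 0:
        raise ValueError("queues per device must divide evenly among threads")
    per_thread = queues_per_device // num_threads
    return [[(d, q)
             for q in range(t * per_thread, (t + 1) * per_thread)
             for d in range(num_devices)]
            for t in range(num_threads)]


if __name__ == "__main__":
    for name, split in (("contiguous", contiguous_split),
                        ("interleaved", interleaved_split)):
        print(name)
        for thread_idx, queues in enumerate(split(3, 8, 2)):
            print(f"  thread {thread_idx}: {queues}")
```

In both schemes the per-thread sets are disjoint, which matches the statement that each thread only touches its own group of queues/sockets. In both schemes each thread also handles queues from more than one physical interface.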