From: Pavel Vazharov
Date: Thu, 25 Jan 2024 10:48:07 +0200
Subject: Questions about running XDP sockets on top of bonding device or on the physical interfaces behind the bond
To: users@dpdk.org

Hi there,

I'd like to ask for advice about a weird issue I'm facing when trying to run XDP on top of a bonding device (802.3ad), and also on the physical interfaces behind the bond.

I have a DPDK application which runs on top of XDP sockets, using the DPDK AF_XDP driver. It was a pure DPDK application, but it was recently migrated to run on top of XDP sockets because we need to split the traffic entering the machine between the DPDK application and other "standard-Linux" applications running on the same machine.

The application works fine when running on top of a single interface, but it has problems when it runs on top of a bonding interface. It needs to be able to run with multiple XDP sockets, where each socket (or group of XDP sockets) is handled in a separate thread. However, the bonding device is reported with a single queue, so the application can't open more than one XDP socket for it. So I've tried binding the XDP sockets to the queues of the physical interfaces.
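For reference, the setup is roughly equivalent to the following sketch (the interface names and application binary are placeholders; `iface` and `queue_count` are standard devargs of the DPDK af_xdp PMD):

```shell
# Give each physical slave interface 8 combined queues
# (interface names here are hypothetical).
ethtool -L ens1f0 combined 8
ethtool -L ens1f1 combined 8
ethtool -L ens1f2 combined 8

# Create one af_xdp vdev per physical interface, with 8 XDP sockets
# each, i.e. 24 queues / XDP sockets in total.
./dpdk-app -l 0-2 \
  --vdev net_af_xdp0,iface=ens1f0,queue_count=8 \
  --vdev net_af_xdp1,iface=ens1f1,queue_count=8 \
  --vdev net_af_xdp2,iface=ens1f2,queue_count=8
```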
For example:
- 3 interfaces, each set up with 8 queues.
- I've created 3 virtual af_xdp devices, each with 8 queues, i.e. 24 XDP sockets in total, each bound to a separate queue (this functionality is provided by DPDK itself).
- I've run the application on 2 threads, where the first thread handled the first 12 queues (XDP sockets) and the second thread handled the next 12 queues (XDP sockets), i.e. the first thread worked with all 8 queues from af_xdp device 0 and the first 4 queues from af_xdp device 1; the second thread worked with the next 4 queues from af_xdp device 1 and all 8 queues from af_xdp device 2. I've also tried another distribution scheme (see below). The given threads just call the receive/transmit functions provided by DPDK for the assigned queues.
- The problem is that with this scheme the network device on the other side reports: "The member of the LACP mode Eth-Trunk interface received an abnormal LACPDU, which may be caused by optical fiber misconnection". This error is always reported for the last device/interface in the bonding, and the bonding/LACP doesn't work.
- On the other hand, if I run the DPDK application on a single thread, so that the sending/receiving on all queues is handled on one thread, then the bonding seems to work correctly and the above error is not reported.
- I've checked the code multiple times and I'm sure that each thread is accessing only its own group of queues/sockets.
- I've tried 2 different access schemes, but each one led to the same issue. Written as (device_idx - queue_idx), these are the two orders of access I've tried:

Thread 1    Thread 2
(0 - 0)     (1 - 4)
(0 - 1)     (1 - 5)
...         (1 - 6)
...         (1 - 7)
(0 - 7)     (2 - 0)
(1 - 0)     (2 - 1)
(1 - 1)     ...
(1 - 2)     ...
(1 - 3)     (2 - 7)

Thread 1    Thread 2
(0 - 0)     (0 - 4)
(1 - 0)     (1 - 4)
(2 - 0)     (2 - 4)
(0 - 1)     (0 - 5)
(1 - 1)     (1 - 5)
(2 - 1)     (2 - 5)
...         ...
(0 - 3)     (0 - 7)
(1 - 3)     (1 - 7)
(2 - 3)     (2 - 7)

And here are my questions based on the above situation:
1. I assumed that it's not possible to run multiple XDP sockets on top of the bonding device itself, and that I need to bind the XDP sockets to the physical interfaces behind the bonding device. Am I right about this, or am I missing something?
2. Is the bonding logic (the LACP management traffic) affected by the access pattern of the XDP sockets?
3. Is this scheme supposed to work, or is the design simply wrong? I mean, maybe a group of queues/sockets shouldn't be handled on a given thread; maybe only a single queue should be handled per application thread. It's just that the physical devices have more queues set up on them than there are threads in the DPDK application, so multiple queues need to be handled on a single application thread.

Any ideas are appreciated!

Regards,
Pavel.