From: Pavel Vazharov
Date: Fri, 26 Jan 2024 16:01:09 +0200
Subject: Re: Questions about running XDP sockets on top of bonding device or on the physical interfaces behind the bond
To: Stephen Hemminger
Cc: users
In-Reply-To: <20240125155304.6816355b@hermes.local>
List-Id: DPDK usage discussions

On Fri, Jan 26, 2024 at 1:53 AM Stephen Hemminger <stephen@networkplumber.org> wrote:
> On Thu, 25 Jan 2024 10:48:07 +0200
> Pavel Vazharov <freakpv@gmail.com> wrote:
>
> > Hi there,
> >
> > I'd like to ask for advice for a weird issue that I'm facing trying to run
> > XDP on top of a bonding device (802.3ad) (and also on the physical
> > interfaces behind the bond).
> >
> > I've a DPDK application which runs on top of XDP sockets, using the DPDK AF_XDP
> > driver <https://doc.dpdk.org/guides/nics/af_xdp.html>. It was a pure DPDK
> > application but lately it was migrated to run on top of XDP sockets because
> > we need to split the traffic entering the machine between the DPDK
> > application and other "standard-Linux" applications running on the same
> > machine.
> > The application works fine when running on top of a single interface but it
> > has problems when it runs on top of a bonding interface. It needs to be
> > able to run with multiple XDP sockets where each socket (or group of XDP
> > sockets) is/are handled in a separate thread. However, the bonding device
> > is reported with a single queue and thus the application can't open more
> > than one XDP socket for it. So I've tried binding the XDP sockets to the
> > queues of the physical interfaces. For example:
> > - 3 interfaces each one is set to have 8 queues
> > - I've created 3 virtual af_xdp devices each one with 8 queues i.e. in
> > summary 24 XDP sockets each bound to a separate queue (this functionality
> > is provided by the DPDK itself).
> > - I've run the application on 2 threads where the first thread handled the
> > first 12 queues (XDP sockets) and the second thread handled the next 12
> > queues (XDP sockets) i.e. the first thread worked with all 8 queues from
> > af_xdp device 0 and the first 4 queues from af_xdp device 1. The second
> > thread worked with the next 4 queues from af_xdp device 1 and all 8 queues
> > from af_xdp device 2. I've also tried another distribution scheme (see
> > below). The given threads just call the receive/transmit functions provided
> > by the DPDK for the assigned queues.
> > - The problem is that with this scheme the network device on the other side
> > reports: "The member of the LACP mode Eth-Trunk interface received an
> > abnormal LACPDU, which may be caused by optical fiber misconnection". And
> > this error is always reported for the last device/interface in the bonding
> > and the bonding/LACP doesn't work.
> > - Another thing is that if I run the DPDK application on a single thread,
> > and the sending/receiving on all queues is handled on a single thread, then
> > the bonding seems to work correctly and the above error is not reported.
> > - I've checked the code multiple times and I'm sure that each thread is
> > accessing its own group of queues/sockets.
> > - I've tried 2 different schemes of accessing but each one led to the same
> > issue. For example (device_idx - queue_idx), I've tried these two orders of
> > accessing:
> > Thread 1    Thread 2
> > (0 - 0)     (1 - 4)
> > (0 - 1)     (1 - 5)
> > ...         (1 - 6)
> > ...         (1 - 7)
> > (0 - 7)     (2 - 0)
> > (1 - 0)     (2 - 1)
> > (1 - 1)     ...
> > (1 - 2)     ...
> > (1 - 3)     (2 - 7)
> >
> > Thread 1    Thread 2
> > (0 - 0)     (0 - 4)
> > (1 - 0)     (1 - 4)
> > (2 - 0)     (2 - 4)
> > (0 - 1)     (0 - 5)
> > (1 - 1)     (1 - 5)
> > (2 - 1)     (2 - 5)
> > ...         ...
> > (0 - 3)     (0 - 7)
> > (1 - 3)     (1 - 7)
> > (2 - 3)     (2 - 7)
> >
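For readers following the two tables, here is a minimal sketch that regenerates both access orders, assuming 3 af_xdp devices with 8 queues each and 2 threads as described in the quoted message. The function names are illustrative only; they are not part of DPDK or of the original application, and the sketch says nothing about why LACP breaks.

```python
from itertools import product


def contiguous_split(n_devices, n_queues, n_threads):
    """First table: device-major order, cut into equal contiguous chunks.

    Assumes n_devices * n_queues is divisible by n_threads.
    """
    pairs = list(product(range(n_devices), range(n_queues)))
    chunk = len(pairs) // n_threads
    return [pairs[i * chunk:(i + 1) * chunk] for i in range(n_threads)]


def interleaved_split(n_devices, n_queues, n_threads):
    """Second table: queue-major order, so each thread visits every device
    but only its own band of queue indices.

    Assumes n_devices * n_queues is divisible by n_threads.
    """
    pairs = [(dev, q) for q in range(n_queues) for dev in range(n_devices)]
    chunk = len(pairs) // n_threads
    return [pairs[i * chunk:(i + 1) * chunk] for i in range(n_threads)]


def queues_are_exclusive(assignment):
    """True if no (device_idx, queue_idx) pair is owned by two threads."""
    seen = [pair for thread_pairs in assignment for pair in thread_pairs]
    return len(seen) == len(set(seen))


for name, split in (("contiguous", contiguous_split),
                    ("interleaved", interleaved_split)):
    assignment = split(3, 8, 2)
    for thread_idx, thread_pairs in enumerate(assignment, start=1):
        print(name, "thread", thread_idx, thread_pairs)
    print(name, "exclusive:", queues_are_exclusive(assignment))
```

In both orders every (device_idx - queue_idx) pair belongs to exactly one thread, which matches the statement that each thread accesses only its own group of queues/sockets.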
> > And here are my questions based on the above situation:
> > 1. I assumed that it's not possible to run multiple XDP sockets on top of
> > the bonding device itself and I need to "bind" the XDP sockets on the
> > physical interfaces behind the bonding device. Am I right about this or am
> > I missing something?
> > 2. Is the bonding logic (LACP management traffic) affected by the access
> > pattern of the XDP sockets?
> > 3. Is this scheme supposed to work or is it just that the design is wrong? I
> > mean, maybe a group of queues/sockets shouldn't be handled on a given
> > thread but only a single queue should be handled on a given application
> > thread. It's just that the physical devices have more queues set up on them
> > than the number of threads in the DPDK application and thus multiple queues
> > need to be handled on a single application thread.
> >
> > Any ideas are appreciated!
> >
> > Regards,
> > Pavel.
>
> Look at recent discussions on the netdev mailing list.
> The Linux bonding device still needs more work to fully support XDP.

Thank you. Will do so.