Date: Thu, 25 Jan 2024 15:53:04 -0800
From: Stephen Hemminger
To: Pavel Vazharov
Cc: users
Subject: Re: Questions about running XDP sockets on top of bonding device or on the physical interfaces behind the bond
Message-ID: <20240125155304.6816355b@hermes.local>
List-Id: DPDK usage discussions

On Thu, 25 Jan 2024 10:48:07 +0200
Pavel Vazharov wrote:

> Hi there,
>
> I'd like to ask for advice about a weird issue I'm facing when trying to
> run XDP on top of a bonding device (802.3ad), and also on the physical
> interfaces behind the bond.
>
> I have a DPDK application which runs on top of XDP sockets, using the
> DPDK AF_XDP driver. It was a pure DPDK application, but it was recently
> migrated to run on top of XDP sockets because we need to split the
> traffic entering the machine between the DPDK application and other
> "standard Linux" applications running on the same machine.
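[For context, the af_xdp devices described below are typically created via DPDK vdev arguments on the EAL command line. This is an illustrative fragment only; the application name, interface names, and queue counts are placeholders, not the exact command used here.]

```shell
# Each --vdev creates one virtual af_xdp port bound to a kernel netdev.
# iface names (ens1f0..ens1f2) and queue_count=8 are placeholder values
# matching the 3-interface / 8-queue setup described in this thread.
./dpdk-app \
  --no-pci \
  --vdev net_af_xdp0,iface=ens1f0,start_queue=0,queue_count=8 \
  --vdev net_af_xdp1,iface=ens1f1,start_queue=0,queue_count=8 \
  --vdev net_af_xdp2,iface=ens1f2,start_queue=0,queue_count=8 \
  -- <application args>
```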
> The application works fine when running on top of a single interface,
> but it has problems when running on top of a bonding interface. It needs
> to be able to run with multiple XDP sockets, where each socket (or group
> of XDP sockets) is handled in a separate thread. However, the bonding
> device is reported with a single queue, so the application can't open
> more than one XDP socket on it. So I tried binding the XDP sockets to
> the queues of the physical interfaces instead. For example:
> - 3 interfaces, each set up with 8 queues.
> - I created 3 virtual af_xdp devices, each with 8 queues, i.e. 24 XDP
>   sockets in total, each bound to a separate queue (this functionality
>   is provided by DPDK itself).
> - I ran the application with 2 threads, where the first thread handled
>   the first 12 queues (XDP sockets) and the second thread handled the
>   next 12: the first thread worked with all 8 queues of af_xdp device 0
>   and the first 4 queues of af_xdp device 1; the second thread worked
>   with the last 4 queues of af_xdp device 1 and all 8 queues of af_xdp
>   device 2. I also tried another distribution scheme (see below). The
>   threads just call the receive/transmit functions provided by DPDK for
>   their assigned queues.
> - The problem is that with this scheme the network device on the other
>   side reports: "The member of the LACP mode Eth-Trunk interface
>   received an abnormal LACPDU, which may be caused by optical fiber
>   misconnection". This error is always reported for the last
>   device/interface in the bond, and the bonding/LACP doesn't work.
> - On the other hand, if I run the DPDK application single-threaded, so
>   that sending/receiving on all queues is handled by one thread, the
>   bonding works correctly and the above error is not reported.
> - I've checked the code multiple times and I'm sure each thread accesses
>   only its own group of queues/sockets.
> - I tried 2 different access schemes, but each one led to the same
>   issue. Shown as (device_idx - queue_idx), the two orders were:
>
>   Thread 1    Thread 2
>   (0 - 0)     (1 - 4)
>   (0 - 1)     (1 - 5)
>   ...         (1 - 6)
>   ...         (1 - 7)
>   (0 - 7)     (2 - 0)
>   (1 - 0)     (2 - 1)
>   (1 - 1)     ...
>   (1 - 2)     ...
>   (1 - 3)     (2 - 7)
>
>   Thread 1    Thread 2
>   (0 - 0)     (0 - 4)
>   (1 - 0)     (1 - 4)
>   (2 - 0)     (2 - 4)
>   (0 - 1)     (0 - 5)
>   (1 - 1)     (1 - 5)
>   (2 - 1)     (2 - 5)
>   ...         ...
>   (0 - 3)     (0 - 7)
>   (1 - 3)     (1 - 7)
>   (2 - 3)     (2 - 7)
>
> And here are my questions based on the above situation:
> 1. I assumed it's not possible to run multiple XDP sockets on top of the
>    bonding device itself, and that I need to bind the XDP sockets to the
>    physical interfaces behind the bond. Am I right about this, or am I
>    missing something?
> 2. Is the bonding logic (LACP management traffic) affected by the access
>    pattern of the XDP sockets?
> 3. Is this scheme supposed to work at all, or is the design wrong?
>    Maybe a group of queues/sockets shouldn't be handled by one thread,
>    and each application thread should handle only a single queue. It's
>    just that the physical devices have more queues set up than the
>    number of threads in the DPDK application, so multiple queues need to
>    be handled by a single application thread.
>
> Any ideas are appreciated!
>
> Regards,
> Pavel.

Look at the recent discussions on the netdev mailing list. The Linux
bonding device still needs more work to fully support XDP.