Date: Sat, 4 Jan 2025 21:40:32 +0300
From: Dmitry Kozlyuk
To: Alan Beadle
Cc: users@dpdk.org
Subject: Re: Multiprocess App Problems with tx_burst
Message-ID: <20250104214032.04eb6d25@sovereign>

2025-01-04 11:22 (UTC-0500), Alan Beadle:
> Hi everyone,
>
> I'm still stuck on this. Most likely I am doing something wrong in the
> initialization phase. I am trying to follow the standard code example
> for symmetric multi-process, but since my code does very different
> things from this example, I cannot even begin to guess where I am
> going wrong. I do not even know whether what I am trying to do is
> permissible in the DPDK API.
>
> It would be very helpful if someone could provide an initialization
> checklist for my use case (below).
>
> As explained previously, I have several separately launched processes.
> These processes already share a memory region for local communication.
> I want all of these processes to have equal ability to read incoming
> packets, place pointers to the mbufs in shared memory, and wake each
> other up when a packet destined for a particular one of these
> processes arrives. I have one X550-T2 NIC and I am only using one of
> the physical ports. It connects to a second machine which is doing
> essentially the same thing, running the same DPDK code.
>
> In summary, each of my processes should be equally able to receive
> packets on behalf of the others, and leave pointers to rx'ed mbufs
> for each other in shared memory according to which process each mbuf
> was destined for. Outbound packets may also be shared with local peer
> processes for reading. To do this I am also bumping the mbuf refcount
> until the peer process has read the mbuf.
>
> I already thought I had all of this working fine, but it turns out
> the processes were all taking turns on the same physical core, and
> everything breaks when they run concurrently on separate cores. I
> have seen conflicting information in online threads about the thread
> safety of the various DPDK functions that I am using. I tried adding
> synchronization around DPDK allocation and tx/rx bursts, to no avail.
> My code detects weird errors where either mbufs contain unexpected
> contents (invalid reuse?) or tx bursts start to fail in one of the
> processes.
>
> Frankly, I also feel very confused about how ports, queues, mempools,
> etc. work, and I suspect that a lot of what I have been reading is
> outdated or faulty information.
>
> Any guidance at all would be greatly appreciated!
> -Alan
>
> On Tue, Dec 31, 2024 at 12:49 PM Alan Beadle wrote:
> >
> > Hi everyone,
> >
> > I am working on a multi-process DPDK application. It uses one NIC
> > and one port; the separate processes both send and receive, and
> > they share memory for synchronization and IPC.
> >
> > I had previously made a mistake in setting up the lcores, and all of
> > the processes were assigned to the same physical core. This seems to
> > have concealed some DPDK thread-safety issues which I am now dealing
> > with.
> >
> > I understand that rte_eth_tx_burst() and rte_eth_rx_burst() are not
> > thread safe. Previously I did not have any synchronization around
> > these functions. Now that I am successfully using separate cores, I
> > have added a shared spinlock around all invocations of these
> > functions, as well as around all mbuf frees and allocations.
> >
> > However, when my code sends a packet, it checks the return value of
> > rte_eth_tx_burst() to ensure that the packet was actually sent. If
> > it fails to send, my app exits with an error. This was not happening
> > before, but now it happens every time I run the app. I thought this
> > was due to the lack of synchronization, but it is still happening
> > after I added the lock. Why would rte_eth_tx_burst() be failing now?
> >
> > Thank you,
> > -Alan

Hi Alan,

A lot is still unclear, so let's start gradually.

It is queues that are thread-unsafe, not the rte_eth_rx/tx_burst()
calls as such. You can call rte_eth_rx/tx_burst() concurrently without
synchronization as long as the calls operate on different queues.
Typically you assign each lcore one or more queues to operate on, and
no queue is operated on by multiple lcores. Otherwise you need to
synchronize access, which obviously hurts scaling.
Does this hold in your case?

An lcore is a thread to which DPDK can dispatch work. By default each
lcore is pinned to one physical core, unless --lcores is used.
What is the lcore-to-CPU mapping in your case?

What is the design of your app regarding processes, lcores, and queues?
That is: which process runs which lcores, and which queues do the
latter serve? The sketches below illustrate the points above.

P.S. Please don't top-post.
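
For illustration, a minimal sketch of the "one queue pair per process"
layout (NB_PROCS, NB_DESC, BURST_SIZE, pool, and my_queue_id are
placeholders, not names from your code). The primary process configures
the port once; each process then touches only its own queue index, so
the bursts need no locking:

    #include <rte_ethdev.h>

    #define NB_PROCS   4     /* one RX/TX queue pair per process */
    #define NB_DESC    1024
    #define BURST_SIZE 32

    /* Primary process, once, before anyone polls: */
    struct rte_eth_conf port_conf = {0};
    rte_eth_dev_configure(port_id, NB_PROCS, NB_PROCS, &port_conf);
    for (uint16_t q = 0; q < NB_PROCS; q++) {
            rte_eth_rx_queue_setup(port_id, q, NB_DESC,
                    rte_eth_dev_socket_id(port_id), NULL, pool);
            rte_eth_tx_queue_setup(port_id, q, NB_DESC,
                    rte_eth_dev_socket_id(port_id), NULL);
    }
    rte_eth_dev_start(port_id);

    /* Every process, in its main loop, on its own queue only: */
    struct rte_mbuf *bufs[BURST_SIZE];
    uint16_t nb_rx = rte_eth_rx_burst(port_id, my_queue_id,
                    bufs, BURST_SIZE);

Note that with more than one RX queue the NIC must be told how to
spread incoming traffic across them (RSS in port_conf, or rte_flow
rules); otherwise packets may all land in queue 0.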
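
Regarding the mapping, the usual way to keep processes off each other's
cores is a distinct core list per process, e.g. (binary names made up):

    ./primary   --proc-type=primary   -l 1
    ./secondary --proc-type=secondary -l 2
    ./secondary --proc-type=secondary -l 3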
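
Also, a return value of rte_eth_tx_burst() smaller than nb_pkts is not
by itself an error: it means the TX ring had no room for the rest, and
the caller still owns the unsent mbufs. A bounded-retry sketch (the
MAX_TX_RETRIES policy is an assumption, pick what suits your app):

    #include <rte_ethdev.h>
    #include <rte_mbuf.h>

    #define MAX_TX_RETRIES 64

    static void
    send_burst(uint16_t port_id, uint16_t queue_id,
               struct rte_mbuf **pkts, uint16_t nb_pkts)
    {
            uint16_t sent = 0;
            unsigned int tries = 0;

            /* A short count means "ring full": retry the remainder. */
            while (sent < nb_pkts && tries++ < MAX_TX_RETRIES)
                    sent += rte_eth_tx_burst(port_id, queue_id,
                                    pkts + sent, nb_pkts - sent);

            /* Unsent mbufs are still owned by the caller: drop them. */
            while (sent < nb_pkts)
                    rte_pktmbuf_free(pkts[sent++]);
    }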
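
As for handing rx'ed mbufs to peer processes: bumping the refcount
before publishing the pointer is the right idea, and with the default
RTE_MBUF_REFCNT_ATOMIC build option the update itself needs no extra
locking. Roughly (publish_to_peer() stands for your shared-memory
handoff, it is not a DPDK call):

    #include <rte_mbuf.h>

    /* Receiving process: keep the mbuf alive for the peer too. */
    rte_mbuf_refcnt_update(m, 1);  /* refcnt is now at least 2 */
    publish_to_peer(m);            /* your shared-memory mechanism */
    rte_pktmbuf_free(m);           /* drops only this process's reference */

    /* Peer process, once it has read the data: */
    rte_pktmbuf_free(m);           /* at refcnt 0 the buffer returns
                                    * to the mempool */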