From: Alan Beadle
Date: Sat, 4 Jan 2025 11:22:16 -0500
Subject: Re: Multiprocess App Problems with tx_burst
To: users@dpdk.org
List-Id: DPDK usage discussions

Hi everyone,

I'm still stuck on this. Most likely I am doing something wrong in the initialization phase. I am trying to follow the standard symmetric multi-process code example, but since my code does very different things from that example, I cannot even begin to guess where I am going wrong. I do not even know whether what I am trying to do is permitted by the DPDK API. It would be very helpful if someone could provide an initialization checklist for my use case (below).

As explained previously, I have several separately launched processes. These processes already share a memory region for local communication. I want all of these processes to have equal ability to read incoming packets, place pointers to the mbufs in shared memory, and wake each other up when packets destined for a particular one of these processes arrive.
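For reference, my understanding of the multi-process pattern (pieced together from the symmetric_mp example) is roughly the sketch below. This is not compilable as-is; "MBUF_POOL", the queue counts, and the descriptor counts are just placeholders, and I may well be misreading the example:

```c
/* Primary process (launched with --proc-type=primary or auto): */
rte_eal_init(argc, argv);
struct rte_mempool *pool = rte_pktmbuf_pool_create("MBUF_POOL",
        NUM_MBUFS, MBUF_CACHE_SIZE, 0,
        RTE_MBUF_DEFAULT_BUF_SIZE, rte_socket_id());
rte_eth_dev_configure(port_id, nb_queues, nb_queues, &port_conf);
for (q = 0; q < nb_queues; q++) {
        /* one rx/tx queue pair per process */
        rte_eth_rx_queue_setup(port_id, q, RX_DESC,
                rte_eth_dev_socket_id(port_id), NULL, pool);
        rte_eth_tx_queue_setup(port_id, q, TX_DESC,
                rte_eth_dev_socket_id(port_id), NULL);
}
rte_eth_dev_start(port_id);

/* Each secondary process (launched with --proc-type=secondary): */
rte_eal_init(argc, argv);       /* attaches to the primary's hugepages */
struct rte_mempool *pool = rte_mempool_lookup("MBUF_POOL");
/* then use only this process's own queue index q in
 * rte_eth_rx_burst(port_id, q, ...) and rte_eth_tx_burst(port_id, q, ...) */
```

Is giving each process its own queue index like this the intended way to avoid the tx/rx burst thread-safety problem, rather than locking around a single shared queue?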
I have one X550-T2 NIC and I am only using one of the physical ports. It connects to a second machine which is doing essentially the same thing, running the same DPDK code.

In summary, each of my processes should be able to receive packets on behalf of the others, and leave pointers to rx'ed mbufs for each other in shared memory according to which process each mbuf was destined for. Outbound packets may also be shared with local peer processes for reading. To allow this, I am also bumping the mbuf refcount until the peer process has read the mbuf.

I already thought I had all of this working fine, but it turns out that the processes were all taking turns on the same physical core, and everything breaks when they run concurrently on separate cores. I have seen conflicting information in online threads about the thread safety of the various DPDK functions I am using. I tried adding synchronization around DPDK allocation and tx/rx bursts, to no avail. My code detects weird errors where either mbufs contain unexpected contents (invalid reuse?) or tx bursts start to fail in one of the processes.

Frankly, I also feel very confused about how ports, queues, mempools, etc. work, and I suspect that a lot of what I have been reading is outdated or faulty information.

Any guidance at all would be greatly appreciated!
-Alan

On Tue, Dec 31, 2024 at 12:49 PM Alan Beadle wrote:
>
> Hi everyone,
>
> I am working on a multi-process DPDK application. It uses one NIC and
> one port, the separate processes both send and receive, and they
> share memory for synchronization and IPC.
>
> I had previously made a mistake in setting up the lcores, and all of
> the processes were assigned to the same physical core. This seems to
> have concealed some DPDK thread safety issues which I am now dealing
> with.
>
> I understand that rte_eth_tx_burst() and rte_eth_rx_burst() are not
> thread safe.
> Previously I did not have any synchronization around
> these functions. Now that I am successfully using separate cores, I
> have added a shared spinlock around all invocations of these
> functions, as well as around all mbuf frees and allocations.
>
> However, when my code sends a packet, it checks the return value of
> rte_eth_tx_burst() to ensure that the packet was actually sent. If it
> fails to send, my app exits with an error. This was not previously
> happening, but now it happens every time I run it. I thought this was
> due to the lack of synchronization, but it is still happening after I
> added the lock. Why would rte_eth_tx_burst() be failing now?
>
> Thank you,
> -Alan