From: Alan Beadle
Date: Mon, 6 Jan 2025 11:05:37 -0500
Subject: Re: Multiprocess App Problems with tx_burst
To: Dmitry Kozlyuk
Cc: users@dpdk.org
References: <20250104214032.04eb6d25@sovereign> <20250105010148.1ef26333@sovereign>
List-Id: DPDK usage discussions

> Note that I am also seeing another error. Sometimes, rather than tx
> failing, my app detects incorrect/corrupted mbuf contents and exits
> immediately. It appears that mbufs are being re-allocated when they
> should not be. I thought I had finally solved this (see my earlier
> threads) but with multi-core concurrency this problem has returned. It
> is very possible that this error is somewhere in my own library code,
> as it looks like the accompanying non-DPDK structures are also being
> corrupted (probably first).
>
> For background, I maintain a hash table of header structs to track
> individual mbufs. The sequence numbers in the headers should match
> those contained in the mbuf's payload. This check is failing after a
> few hundred successful data messages have been exchanged between the
> hosts. The sequence number in the mbuf shows that it is in the wrong
> hash bucket, and the sequence number in the header is a large
> corrupted value which is out of range for my sequence numbers (and
> also not matching the bucket).
>

There is definitely something going wrong with the mbuf allocator. Each
run results in such different errors that it is difficult to add
instrumentation for a specific one, but one frequent error is that a
newly allocated mbuf already has a refcnt of 2, and contains data that
I am still using elsewhere.

At each call to rte_pktmbuf_alloc() (with locks around it) I
immediately do a rte_mbuf_refcnt_read() and ensure that it is 1.
Sometimes it is 2. This should never occur and I believe it proves that
DPDK is not working as expected here for some reason.

-Alan
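
For anyone following along, the header/payload consistency check described
in the quoted text could be modelled roughly as below. This is only a
standalone sketch, not the actual library code: the names (track_hdr,
check_entry, NUM_BUCKETS, MAX_SEQ), the modulo bucket function, and the
assumption that the sequence number sits in the first four bytes of the
payload are all hypothetical, and a plain byte buffer stands in for the
mbuf data. Returning a distinct code per violated invariant may make it
easier to tell which structure was corrupted first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NUM_BUCKETS 64u      /* hypothetical hash table size */
#define MAX_SEQ     100000u  /* hypothetical upper bound on valid sequence numbers */

/* Hypothetical tracking header; the real layout is not shown in the thread. */
struct track_hdr {
    uint32_t seq;            /* sequence number recorded when the entry was created */
    const uint8_t *payload;  /* stands in for the mbuf data pointer */
};

/* One code per invariant, so a failure report says what broke. */
enum check_result {
    CHECK_OK = 0,
    CHECK_HDR_OUT_OF_RANGE,      /* header seq is not a plausible value */
    CHECK_HDR_WRONG_BUCKET,      /* header seq does not hash to this bucket */
    CHECK_PAYLOAD_WRONG_BUCKET,  /* payload seq does not hash to this bucket */
    CHECK_SEQ_MISMATCH           /* same bucket, but header and payload disagree */
};

/* Assumed bucket function: simple modulo on the sequence number. */
static inline uint32_t bucket_of(uint32_t seq)
{
    return seq % NUM_BUCKETS;
}

/* Read the sequence number from the start of the payload (assumed layout).
 * memcpy avoids unaligned access and aliasing problems. */
static inline uint32_t payload_seq(const uint8_t *payload)
{
    uint32_t seq;
    memcpy(&seq, payload, sizeof(seq));
    return seq;
}

/* Check one table entry that was found in the given bucket. */
static enum check_result check_entry(const struct track_hdr *hdr, uint32_t bucket)
{
    assert(hdr != NULL && hdr->payload != NULL);

    uint32_t pseq = payload_seq(hdr->payload);

    if (hdr->seq > MAX_SEQ)
        return CHECK_HDR_OUT_OF_RANGE;
    if (bucket_of(hdr->seq) != bucket)
        return CHECK_HDR_WRONG_BUCKET;
    if (bucket_of(pseq) != bucket)
        return CHECK_PAYLOAD_WRONG_BUCKET;
    if (pseq != hdr->seq)
        return CHECK_SEQ_MISMATCH;
    return CHECK_OK;
}
```

Logging the returned code together with the bucket index and both sequence
numbers at the point of failure would show whether the header or the
payload went bad, which is hard to see when the program just exits.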