From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <dev-bounces@dpdk.org>
Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124])
	by inbox.dpdk.org (Postfix) with ESMTP id 4C27D4387A;
	Wed, 10 Jan 2024 00:09:12 +0100 (CET)
Received: from mails.dpdk.org (localhost [127.0.0.1])
	by mails.dpdk.org (Postfix) with ESMTP id 72AF44067A;
	Wed, 10 Jan 2024 00:09:06 +0100 (CET)
Received: from mail-pl1-f169.google.com (mail-pl1-f169.google.com
 [209.85.214.169])
 by mails.dpdk.org (Postfix) with ESMTP id EE6A54021E
 for <dev@dpdk.org>; Wed, 10 Jan 2024 00:09:04 +0100 (CET)
Received: by mail-pl1-f169.google.com with SMTP id
 d9443c01a7336-1d3e05abcaeso21249905ad.1
 for <dev@dpdk.org>; Tue, 09 Jan 2024 15:09:04 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=networkplumber-org.20230601.gappssmtp.com; s=20230601; t=1704841744;
 x=1705446544; darn=dpdk.org; 
 h=content-transfer-encoding:mime-version:references:in-reply-to
 :message-id:subject:cc:to:from:date:from:to:cc:subject:date
 :message-id:reply-to;
 bh=Hn8ahnCbg5TwrFUdT9AxiDCk2rOnauOzcH02KW76Bh8=;
 b=Cs6B5au+oGkJqpt3sIFIwfP9y8duArKv681PyW4MDU/mXiuAxC3rhit4eBkRnCi6NK
 BlAEAPm09I3QEOk8b2xKn/uJrLb6wJzzoH18uslG1rle+qMMoQ34P6ddlX1R9xMDaz9q
 9j+Cir1YLFRfXClMfyhwOw98l/2yPiI5GXJHMqsxYNO0Z84Js4lLh6GgTXEiIj3IfJZB
 7CHvUki5QW4zU8j4UGPiFK6d6HKOvb0U6oznfV1AI9GLs+XnMeYq6gIn+FPLN9z1lI53
 fmeIHZNj05pSpn4cVJbHjiUNqDsFgMp5BtNwx7/Z/WKod5iYD1JIupSWCI8MTsvuay/i
 5IGg==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20230601; t=1704841744; x=1705446544;
 h=content-transfer-encoding:mime-version:references:in-reply-to
 :message-id:subject:cc:to:from:date:x-gm-message-state:from:to:cc
 :subject:date:message-id:reply-to;
 bh=Hn8ahnCbg5TwrFUdT9AxiDCk2rOnauOzcH02KW76Bh8=;
 b=PBHgTL1k0rgxHL+V3golWeEDmNTx/bmTWrqo7PiPPdBde3oDmyFO29tcuVCW0A3yn4
 B73SrS1drTXM9F/haprvk5i4obCTWQ8MhMx3cyJxAeEbAMoiLvHpSWj5daJGiJbjb3jY
 SHxvtlSbPG31Uqk24+xkmahdSgDMDDem0VclStdK1csYzNBBMWfA/V9rELwiojwEETc+
 7F+kBK7u1qQYFC3XrdV5La0f/pjf9s+mlcsCqasq+OE9dGOPgOoO+FHCC+yabXPt0ZXp
 ITkE0DF33ySmfiNCWTX4D0XUQ4xq+B2RLfb7yUyu56fJFuRt02OMFoEyz0FoURPJyJXP
 W8bQ==
X-Gm-Message-State: AOJu0Yx3nLG8htL3vQ27U/TPlDLs0RX+5WYa31CA7wnrgknTcuT4k0zm
 d7IU2mQivt0nI5m4LGPkgGdvtI4rhO0TIA==
X-Google-Smtp-Source: AGHT+IHMUc08haIGpHUKHsfTkBsckEnRt+1etaUD021bAqTAUbds6NqBxYk1KOKFIVdsi+Q5kx4KPQ==
X-Received: by 2002:a17:903:110f:b0:1d5:6917:c9a8 with SMTP id
 n15-20020a170903110f00b001d56917c9a8mr110555plh.106.1704841744199; 
 Tue, 09 Jan 2024 15:09:04 -0800 (PST)
Received: from hermes.local (204-195-123-141.wavecable.com. [204.195.123.141])
 by smtp.gmail.com with ESMTPSA id
 u2-20020a170902e80200b001d4e058284esm2331526plg.89.2024.01.09.15.09.03
 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
 Tue, 09 Jan 2024 15:09:03 -0800 (PST)
Date: Tue, 9 Jan 2024 15:06:47 -0800
From: Stephen Hemminger <stephen@networkplumber.org>
To: Konstantin Ananyev <konstantin.ananyev@huawei.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>, "arshdeep.kaur@intel.com"
 <arshdeep.kaur@intel.com>, "Gowda, Sandesh" <sandesh.gowda@intel.com>,
 Reshma Pattan <reshma.pattan@intel.com>
Subject: Re: Issues around packet capture when secondary process is doing rx/tx
Message-ID: <20240109150611.00d13e13@hermes.local>
In-Reply-To: <5c28d2a26f5142c3a509cc8bda2fca75@huawei.com>
References: <20240107175900.1276c0a5@hermes.local>
 <5c28d2a26f5142c3a509cc8bda2fca75@huawei.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: DPDK patches and discussions <dev.dpdk.org>
List-Unsubscribe: <https://mails.dpdk.org/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://mails.dpdk.org/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <https://mails.dpdk.org/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
Errors-To: dev-bounces@dpdk.org

On Mon, 8 Jan 2024 15:13:25 +0000
Konstantin Ananyev <konstantin.ananyev@huawei.com> wrote:

> > I have been looking at a problem reported by Sandesh
> > where packet capture does not work if rx/tx burst is done in secondary process.
> > 
> > The root cause is that existing rx/tx callback model just doesn't work
> > unless the process doing the rx/tx burst calls is the same one that
> > registered the callbacks.
> > 
> > An example sequence would be:
> > 	1. dumpcap (or pdump) as secondary tells pdump in primary to register callback
> > 	2. secondary process calls rx_burst.
> > 	3. rx_burst sees the callback but it has pointer pdump_rx which is not necessarily
> > 	   at same location in primary and secondary process.
> > 	4. indirect function call in secondary to bad location likely causes crash.  
> 
> As I remember, RX/TX callbacks were never intended to work across multiple processes.
> Right now RX/TX callbacks are private to the process; a different process simply should not
> see or execute them.
> I.e. the callback list is part of 'struct rte_eth_dev' itself, not the rte_eth_dev.data that is shared
> between processes.
> It should be normal that, for the same port/queue, you end up with a different list of callbacks
> in different processes.
> So, unless I am missing something, I don't see how we can end up with 3) and 4) from above:
> From my understanding, the secondary process will never see/call the primary's callbacks.
> 
> About pdump itself, it has been a while since I last looked at it, but as I remember, for it to work
> the server process has to call rte_pdump_init(), which in turn registers the PDUMP_MP handler.
> I suppose for the secondary process to act as a 'pdump server' it needs to call rte_pdump_init() itself,
> though I am not sure such an option is supported right now.
>  

I did some more tests with a modified testpmd and reached some conclusions:

The logical interface would be to allow rte_pdump_init() to be called by
the process that uses the rx/tx burst APIs.

  This doesn't work as it should because the multi-process socket API
  assumes that the server runs only in the primary.  The secondary
  can start its own MP thread, but that does not help:

  Primary:   EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
  Secondary: EAL: Multi-process socket /var/run/dpdk/rte/mp_socket_6057_1ccd4157fd5

  The problem is that when the client (pdump or dumpcap) runs, it uses the
  mp_socket of the primary, where no mp_pdump action is registered, which
  causes: EAL: Cannot find action: mp_pdump

  Looks like the whole MP socket mechanism is just not up to this.
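
  To make the failure concrete, here is a toy model in C. It is only a
  sketch and not the EAL code: struct proc, proc_register(),
  proc_has_action() and client_request() are hypothetical names invented
  for illustration. The one assumption it encodes is the behaviour seen
  above: a client request is always delivered to the primary's socket, so
  only the primary's action table is ever consulted.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_ACTIONS 8

/* Hypothetical stand-in for one process and its table of named MP actions. */
struct proc {
    const char *socket_path;           /* socket this process listens on */
    const char *actions[MAX_ACTIONS];  /* action names registered locally */
    size_t n_actions;
};

/* Register an action name in this process only; returns 0 or -1 if full. */
static int proc_register(struct proc *p, const char *name)
{
    assert(p != NULL && name != NULL);
    if (p->n_actions == MAX_ACTIONS)
        return -1;
    p->actions[p->n_actions++] = name;
    return 0;
}

/* Return 1 if this process has registered the named action, else 0. */
static int proc_has_action(const struct proc *p, const char *name)
{
    for (size_t i = 0; i < p->n_actions; i++)
        if (strcmp(p->actions[i], name) == 0)
            return 1;
    return 0;
}

/*
 * Assumption of this model: the client always sends to the primary's
 * socket, so a handler registered only in a secondary is never found.
 * Returns 0 on success, -1 for the "Cannot find action" case.
 */
static int client_request(const struct proc *primary, const char *name)
{
    return proc_has_action(primary, name) ? 0 : -1;
}
```

  In this model, registering "mp_pdump" only in the secondary leaves
  client_request() on the primary returning -1, which corresponds to the
  "Cannot find action" message above.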

Maybe pdump needs to have its own socket and control thread?
Or does the MP socket need some multicast fanout to all secondaries?
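
As a rough illustration of the fanout idea, the sketch below delivers one
request to every known peer and counts how many of them had a matching
action registered. It is a toy model under stated assumptions, not a
proposed patch: struct mp_peer and mp_fanout() are hypothetical names, and
a real implementation would have to deal with sockets, replies and
timeouts, none of which is modelled here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical peer record: one entry per process (primary or secondary). */
struct mp_peer {
    const char *action;      /* action name the peer registered, or NULL */
    unsigned int delivered;  /* number of requests this peer has handled */
};

/*
 * Offer the request to every peer instead of only the primary.
 * Returns the number of peers that had the action registered.
 */
static size_t mp_fanout(struct mp_peer *peers, size_t n, const char *action)
{
    size_t handled = 0;

    assert(action != NULL);
    for (size_t i = 0; i < n; i++) {
        if (peers[i].action != NULL &&
            strcmp(peers[i].action, action) == 0) {
            peers[i].delivered++;
            handled++;
        }
    }
    return handled;
}
```

With a fanout like this, a secondary that registered "mp_pdump" would see
the request even though the primary did not, and a return value of 0 would
be the only case that maps to "Cannot find action".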






