Date: Tue, 2 Apr 2024 17:14:04 -0700
From: Stephen Hemminger
To: Konstantin Ananyev
Cc: "dev@dpdk.org", "arshdeep.kaur@intel.com", "Gowda, Sandesh", Reshma Pattan
Subject: Re: Issues around packet capture when secondary process is doing rx/tx
Message-ID: <20240402171404.3a0385ed@hermes.local>
In-Reply-To: <5c28d2a26f5142c3a509cc8bda2fca75@huawei.com>
References: <20240107175900.1276c0a5@hermes.local> <5c28d2a26f5142c3a509cc8bda2fca75@huawei.com>
List-Id: DPDK patches and discussions

On Mon, 8 Jan 2024 15:13:25 +0000
Konstantin Ananyev wrote:

> > I have been looking at a problem reported by Sandesh
> > where packet capture does not work if rx/tx burst is done in a secondary process.
> >
> > The root cause is that the existing rx/tx callback model just doesn't work
> > unless the process doing the rx/tx burst calls is the same one that
> > registered the callbacks.
> >
> > An example sequence would be:
> > 1. dumpcap (or pdump) as secondary tells pdump in primary to register callback
> > 2. secondary process calls rx_burst.
> > 3. rx_burst sees the callback but it has pointer pdump_rx which is not necessarily
> >    at same location in primary and secondary process.
> > 4. indirect function call in secondary to bad location likely causes crash.
>
> As I remember, RX/TX callbacks were never intended to work over multiple processes.
> Right now RX/TX callbacks are private to the process; a different process simply should not
> see/execute them.
> I.e. the callbacks list is part of 'struct rte_eth_dev' itself, not the rte_eth_dev.data that is shared
> between processes.
> It should be normal when, for the same port/queue, you end up with a different list of callbacks
> for different processes.
> So, unless I am missing something, I don't see how we can end up with 3) and 4) from above:
> From my understanding the secondary process will never see/call the primary's callbacks.
>
> About pdump itself, it has been a while since I last looked at it, but as I remember, to get it to work
> the server process has to call rte_pdump_init(), which in turn registers the PDUMP_MP handler.
> I suppose for the secondary process to act as a 'pdump server' it needs to call rte_pdump_init() itself,
> though I am not sure such an option is supported right now.
>
> >
> > Some possible workarounds.
> > 1. Keep callback list per-process: messy, but won't crash. Capture won't work
> >    without other changes. In this primary would register callback, but secondaries
> >    would not use them in rx/tx burst.
> >
> > 2. Replace use of rx/tx callback in pdump with change to rte_ethdev to have
> >    a capture flag. (i.e. don't use indirection). Likely ABI problems.
> >    Basically, ignore the rx/tx callback mechanism. This is my preferred
> >    solution.
>
> It is not only the capture flag, it is also what to do with the captured packets
> (copy? If yes, then where to? examine? drop? do something else?).
> It is probably not the best choice to add all these things into ethdev API.
>
> > 3. Some fix up mechanism (in EAL mp support?) to have each process fixup
> >    its callback mechanism.
>
> Probably the easiest way to fix that - pass to rte_pdump_enable() extra information
> that would allow it to distinguish on what exact process (local, remote)
> we want to enable pdump functionality. Then it could act accordingly.
>
> >
> > 4. Do something in pdump_init to register the callback in same process context
> >    (probably need callbacks to be per-process). Would mean callback is always
> >    on independent of capture being enabled.
> >
> > 5. Get rid of indirect function call pointer, and replace it by index into
> >    a static table of callback functions. Every process would have same code
> >    (in this case pdump_rx) but at different address. Requires all callbacks
> >    to be statically defined at build time.
>
> Doesn't look like a good approach - it will break many things.
>
> > The existing rx/tx callback is not safe if rx/tx burst is called from different process
> > than where callback is registered.
> >

Have been looking into the best way to fix this, and the real answer is not
to use callbacks but instead use a flag per-queue. The natural place to put
these is in rte_ethdev_driver. BUT this will mean an ABI breakage, so it will
have to wait for the 24.11 release. Sometimes fixing a design flaw means an
ABI change.
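
To make the per-queue flag idea concrete, here is a minimal sketch in plain C.
It is not DPDK code and not a proposed patch: every name in it (the sketch_
prefix, the struct layout, the queue limit) is hypothetical and only stands in
for whatever would actually end up in ethdev. The point it tries to show is
that the shared state holds plain data only, and each process reaches the
capture routine through its own symbol, so no function address ever crosses a
process boundary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_QUEUES 4 /* hypothetical limit, for illustration only */

/*
 * Stand-in for per-port state living in memory shared by primary and
 * secondary processes. It contains data only, never a function pointer.
 */
struct sketch_shared_port {
	bool rx_capture[SKETCH_MAX_QUEUES];   /* per-queue capture flag */
	uint64_t captured[SKETCH_MAX_QUEUES]; /* packets handed to capture */
};

/*
 * Capture routine compiled into every process. Each process calls it
 * directly, so its address does not need to match across processes.
 * Here it only counts packets; what to really do with them is left open.
 */
static void
sketch_capture(struct sketch_shared_port *port, uint16_t queue, uint16_t nb_pkts)
{
	port->captured[queue] += nb_pkts;
}

/* Control path: any process may switch capture on or off for a queue. */
static void
sketch_set_capture(struct sketch_shared_port *port, uint16_t queue, bool on)
{
	assert(queue < SKETCH_MAX_QUEUES);
	port->rx_capture[queue] = on;
}

/*
 * Data path: stand-in for the tail end of an rx burst. nb_rx is the number
 * of packets the (imaginary) driver just returned.
 */
static uint16_t
sketch_rx_burst(struct sketch_shared_port *port, uint16_t queue, uint16_t nb_rx)
{
	assert(queue < SKETCH_MAX_QUEUES);
	if (port->rx_capture[queue])
		sketch_capture(port, queue, nb_rx); /* direct call, no indirection */
	return nb_rx;
}
```

A caller would zero-initialize one struct sketch_shared_port, call
sketch_set_capture() from whichever process owns the capture request, and the
burst path in any process would then pick up the flag. A real version would
presumably need atomic access to the flag and an answer to the question raised
above about where the captured packets go; the sketch leaves both out.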