From: Stephen Hemminger <stephen@networkplumber.org>
To: Konstantin Ananyev <konstantin.ananyev@huawei.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>,
"arshdeep.kaur@intel.com" <arshdeep.kaur@intel.com>,
"Gowda, Sandesh" <sandesh.gowda@intel.com>,
Reshma Pattan <reshma.pattan@intel.com>
Subject: Re: Issues around packet capture when secondary process is doing rx/tx
Date: Tue, 2 Apr 2024 17:14:04 -0700
Message-ID: <20240402171404.3a0385ed@hermes.local>
In-Reply-To: <5c28d2a26f5142c3a509cc8bda2fca75@huawei.com>
On Mon, 8 Jan 2024 15:13:25 +0000
Konstantin Ananyev <konstantin.ananyev@huawei.com> wrote:
> > I have been looking at a problem reported by Sandesh
> > where packet capture does not work if rx/tx burst is done in secondary process.
> >
> > The root cause is that existing rx/tx callback model just doesn't work
> > unless the process doing the rx/tx burst calls is the same one that
> > registered the callbacks.
> >
> > An example sequence would be:
> > 1. dumpcap (or pdump) as secondary tells pdump in primary to register callback
> > 2. secondary process calls rx_burst.
> > 3. rx_burst sees the callback but it has pointer pdump_rx which is not necessarily
> > at same location in primary and secondary process.
> > 4. indirect function call in secondary to bad location likely causes crash.
>
> As I remember, RX/TX callbacks were never intended to work over multiple processes.
> Right now RX/TX callbacks are private for the process, different process simply should not
> see/execute them.
> I.e. the callback list is part of 'struct rte_eth_dev' itself, not the rte_eth_dev.data that is shared
> between processes.
> It should be normal that for the same port/queue you end up with a different list of callbacks
> for different processes.
> So, unless I am missing something, I don't see how we can end up with 3) and 4) from above:
> From my understanding secondary process will never see/call primary's callbacks.
>
> About pdump itself, it has been a while since I last looked at it, but as I remember, for it to work
> the server process has to call rte_pdump_init(), which in turn registers the PDUMP_MP handler.
> I suppose for the secondary process to act as a 'pdump server' it needs to call rte_pdump_init() itself,
> though I am not sure such option is supported right now.
>
> >
> > Some possible workarounds.
> > 1. Keep callback list per-process: messy, but won't crash. Capture won't work
> > without other changes. In this primary would register callback, but secondaries
> > would not use them in rx/tx burst.
> >
> > 2. Replace use of rx/tx callback in pdump with change to rte_ethdev to have
> > a capture flag. (i.e. don't use indirection). Likely ABI problems.
> > Basically, ignore the rx/tx callback mechanism. This is my preferred
> > solution.
>
> It is not only the capture flag, it is also what to do with the captured packets
> (copy? If yes, then where to? examine? drop?, do something else?).
> It is probably not the best choice to add all these things into ethdev API.
>
> > 3. Some fix up mechanism (in EAL mp support?) to have each process fixup
> > its callback mechanism.
>
> Probably the easiest way to fix that - pass to rte_pdump_enable() extra information
> that would allow it to distinguish on what exact process (local, remote)
> we want to enable pdump functionality. Then it could act accordingly.
>
> >
> > 4. Do something in pdump_init to register the callback in same process context
> > (probably need callbacks to be per-process). Would mean callback is always
> > on independent of capture being enabled.
> >
> > 5. Get rid of indirect function call pointer, and replace it by index into
> > a static table of callback functions. Every process would have same code
> > (in this case pdump_rx) but at different address. Requires all callbacks
> > to be statically defined at build time.
>
> Doesn't look like a good approach - it will break many things.
>
> > The existing rx/tx callback is not safe if rx/tx burst is called from a different process
> > than the one where the callback is registered.
>
>
I have been looking into the best way to fix this, and the real answer is not to use
callbacks but instead to use a per-queue flag. The natural place to put these flags is in
rte_ethdev_driver. But this will mean an ABI breakage, so it will have to wait for the 24.11
release. Sometimes fixing a design flaw means an ABI change.
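
To make the per-queue flag idea concrete, here is a rough standalone sketch. It is
not a patch and does not use any existing ethdev API: shared_port_data,
capture_enable(), capture_burst() and rx_burst() are all made-up names, and the
"packets" are plain integers. The only point it illustrates is that the shared
state holds data (a flag), while the capture function is called directly, so each
process resolves it at its own address.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_QUEUES 4	/* hypothetical limit for this sketch */

/*
 * Stand-in for per-port data that would live in memory shared between
 * primary and secondary. It holds only plain data, never function pointers.
 */
struct shared_port_data {
	atomic_bool rx_capture[MAX_QUEUES];
};

/* Process-local counter standing in for "copy packets to the capture ring". */
static unsigned int captured_pkts;

/* Hypothetical capture hook; same code in every process, any address. */
static void capture_burst(uint16_t queue, const uint32_t *pkts, uint16_t n)
{
	(void)queue;
	(void)pkts;
	captured_pkts += n;
}

/* Control path: may be called from any process that maps the shared data. */
static void capture_enable(struct shared_port_data *d, uint16_t queue, bool on)
{
	assert(queue < MAX_QUEUES);
	atomic_store_explicit(&d->rx_capture[queue], on, memory_order_release);
}

/* Stand-in for rx burst: pretends the driver received n packets. */
static uint16_t rx_burst(struct shared_port_data *d, uint16_t queue,
			 uint32_t *pkts, uint16_t n)
{
	assert(queue < MAX_QUEUES);
	for (uint16_t i = 0; i < n; i++)
		pkts[i] = i;

	/* Test the shared flag, then make a direct (not indirect) call. */
	if (atomic_load_explicit(&d->rx_capture[queue], memory_order_acquire))
		capture_burst(queue, pkts, n);

	return n;
}
```

This leaves open the question Konstantin raised about what to do with the captured
packets; the sketch just counts them.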