* Re: Issues around packet capture when secondary process is doing rx/tx
2024-01-08 15:13 ` Konstantin Ananyev
@ 2024-01-08 17:02 ` Stephen Hemminger
2024-01-08 17:55 ` Stephen Hemminger
` (3 subsequent siblings)
4 siblings, 0 replies; 18+ messages in thread
From: Stephen Hemminger @ 2024-01-08 17:02 UTC (permalink / raw)
To: Konstantin Ananyev; +Cc: dev, arshdeep.kaur, Gowda, Sandesh, Reshma Pattan
On Mon, 8 Jan 2024 15:13:25 +0000
Konstantin Ananyev <konstantin.ananyev@huawei.com> wrote:
> >
> > 2. Replace use of rx/tx callback in pdump with change to rte_ethdev to have
> > a capture flag. (i.e. don't use indirection). Likely ABI problems.
> > Basically, ignore the rx/tx callback mechanism. This is my preferred
> > solution.
>
> It is not only the capture flag, it is also what to do with the captured packets
> (copy? If yes, then where to? examine? drop?, do something else?).
> It is probably not the best choice to add all these things into ethdev API.
The part that pdump does is trivial: it just copies the packet and puts the copy in a ring.
This will work from any process.
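
As a rough illustration of the "copy and put in ring" step, here is a self-contained toy model in plain C. It does not use the DPDK API: `toy_pkt`, `toy_ring`, `toy_ring_enqueue` and `toy_capture` are hypothetical names invented for this sketch, whereas the real pdump code works on mbufs and an rte_ring.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for an mbuf and a capture ring (not DPDK types). */
struct toy_pkt {
    size_t len;
    unsigned char data[64];
};

#define TOY_RING_SIZE 4

struct toy_ring {
    struct toy_pkt *slots[TOY_RING_SIZE];
    size_t count;
};

/* Enqueue one pointer; returns 0 on success, -1 if the ring is full. */
int toy_ring_enqueue(struct toy_ring *r, struct toy_pkt *p)
{
    assert(r != NULL && p != NULL);
    if (r->count == TOY_RING_SIZE)
        return -1;
    r->slots[r->count++] = p;
    return 0;
}

/*
 * Capture hook: duplicate each packet in the burst and queue the copy.
 * The originals are left untouched so forwarding continues as usual.
 * Returns the number of packets captured.
 */
size_t toy_capture(struct toy_ring *r, struct toy_pkt *const burst[], size_t n)
{
    size_t captured = 0;

    for (size_t i = 0; i < n; i++) {
        struct toy_pkt *copy = malloc(sizeof(*copy));
        if (copy == NULL)
            break;
        memcpy(copy, burst[i], sizeof(*copy));
        if (toy_ring_enqueue(r, copy) != 0) {
            free(copy);   /* ring full: drop the copy, not the packet */
            break;
        }
        captured++;
    }
    return captured;
}
```

In this model the capture side never modifies or frees the packets in the burst, which is the property that makes the step cheap to run from whichever process does rx/tx.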
* Re: Issues around packet capture when secondary process is doing rx/tx
2024-01-08 15:13 ` Konstantin Ananyev
2024-01-08 17:02 ` Stephen Hemminger
@ 2024-01-08 17:55 ` Stephen Hemminger
2024-01-09 23:06 ` Stephen Hemminger
` (2 subsequent siblings)
4 siblings, 0 replies; 18+ messages in thread
From: Stephen Hemminger @ 2024-01-08 17:55 UTC (permalink / raw)
To: Konstantin Ananyev; +Cc: dev, arshdeep.kaur, Gowda, Sandesh, Reshma Pattan
On Mon, 8 Jan 2024 15:13:25 +0000
Konstantin Ananyev <konstantin.ananyev@huawei.com> wrote:
> > I have been looking at a problem reported by Sandesh
> > where packet capture does not work if rx/tx burst is done in secondary process.
> >
> > The root cause is that existing rx/tx callback model just doesn't work
> > unless the process doing the rx/tx burst calls is the same one that
> > registered the callbacks.
> >
> > An example sequence would be:
> > 1. dumpcap (or pdump) as secondary tells pdump in primary to register callback
> > 2. secondary process calls rx_burst.
> > 3. rx_burst sees the callback but it has pointer pdump_rx which is not necessarily
> > at same location in primary and secondary process.
> > 4. indirect function call in secondary to bad location likely causes crash.
>
> As I remember, RX/TX callbacks were never intended to work over multiple processes.
> Right now RX/TX callbacks are private to the process; a different process simply should not
> see/execute them.
> I.e. the callbacks list is part of 'struct rte_eth_dev' itself, not the rte_eth_dev.data that is shared
> between processes.
> It should be normal when, for the same port/queue, you end up with a different list of callbacks
> for different processes.
> So, unless I am missing something, I don't see how we can end up with 3) and 4) from above:
> From my understanding, the secondary process will never see/call the primary's callbacks.
>
> About pdump itself: it has been a while since I last looked at it, but as I remember, to get it to work
> the server process has to call rte_pdump_init(), which in turn registers the PDUMP_MP handler.
> I suppose for the secondary process to act as a 'pdump server' it needs to call rte_pdump_init() itself,
> though I am not sure such an option is supported right now.
Maybe the simplest would be just to make sure that rte_pdump_init() is called
in the process that does rx/tx burst. That might be made to work.
It still won't work for the case where there are multiple secondary processes and some
of the ethdev ports are used differently in each one, but it would work better than now.
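
The split described in the quoted text (callbacks private to each process, device data shared) can be sketched with a small toy model in plain C. Everything here is hypothetical and only loosely mirrors the real structures: `toy_dev` stands in for the per-process device handle and `toy_dev_data` for the shared part; no DPDK calls are used.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: state every process sees (think shared memory). */
struct toy_dev_data {
    unsigned int rx_count;
};

typedef void (*toy_rx_cb)(struct toy_dev_data *data);

/* Hypothetical model: per-process device handle with private callbacks. */
struct toy_dev {
    struct toy_dev_data *data;  /* shared between processes */
    toy_rx_cb rx_cb;            /* private to this process, may be NULL */
};

/* Example callback: counts bursts in the shared data. */
void toy_count_rx(struct toy_dev_data *data)
{
    data->rx_count++;
}

/* Only the handle passed in is changed; other processes are unaffected. */
void toy_add_rx_callback(struct toy_dev *dev, toy_rx_cb cb)
{
    assert(dev != NULL);
    dev->rx_cb = cb;
}

/* An rx burst runs whatever callback this process has installed, if any. */
void toy_rx_burst(struct toy_dev *dev)
{
    assert(dev != NULL && dev->data != NULL);
    if (dev->rx_cb != NULL)
        dev->rx_cb(dev->data);
}
```

With two handles pointing at the same `toy_dev_data`, a callback added to the "primary" handle is never run by a burst on the "secondary" handle, which matches the observation that capture only works when the callback is registered in the process doing rx/tx.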
* Re: Issues around packet capture when secondary process is doing rx/tx
2024-01-08 15:13 ` Konstantin Ananyev
2024-01-08 17:02 ` Stephen Hemminger
2024-01-08 17:55 ` Stephen Hemminger
@ 2024-01-09 23:06 ` Stephen Hemminger
2024-01-09 23:07 ` Stephen Hemminger
2024-01-10 20:11 ` Konstantin Ananyev
2024-04-03 0:14 ` Stephen Hemminger
2024-04-03 11:42 ` Ferruh Yigit
4 siblings, 2 replies; 18+ messages in thread
From: Stephen Hemminger @ 2024-01-09 23:06 UTC (permalink / raw)
To: Konstantin Ananyev; +Cc: dev, arshdeep.kaur, Gowda, Sandesh, Reshma Pattan
On Mon, 8 Jan 2024 15:13:25 +0000
Konstantin Ananyev <konstantin.ananyev@huawei.com> wrote:
> > I have been looking at a problem reported by Sandesh
> > where packet capture does not work if rx/tx burst is done in secondary process.
> >
> > The root cause is that existing rx/tx callback model just doesn't work
> > unless the process doing the rx/tx burst calls is the same one that
> > registered the callbacks.
> >
> > An example sequence would be:
> > 1. dumpcap (or pdump) as secondary tells pdump in primary to register callback
> > 2. secondary process calls rx_burst.
> > 3. rx_burst sees the callback but it has pointer pdump_rx which is not necessarily
> > at same location in primary and secondary process.
> > 4. indirect function call in secondary to bad location likely causes crash.
>
> As I remember, RX/TX callbacks were never intended to work over multiple processes.
> Right now RX/TX callbacks are private for the process, different process simply should not
> see/execute them.
> I.E. it callbacks list is part of 'struct rte_eth_dev' itself, not the rte_eth_dev.data that is shared
> between processes.
> It should be normal, wehn for the same port/queue you will end-up with different list of callbacks
> for different processes.
> So, unless I am missing something, I don't see how we can end-up with 3) and 4) from above:
> From my understanding secondary process will never see/call primary's callbacks.
>
> About pdump itself, it was a while when I looked at it last time, but as I remember to start it to work,
> server process has to call rte_pdump_init() which in terns register PDUMP_MP handler.
> I suppose for the secondary process to act as a 'pdump server' it needs to call rte_pdump_init() itself,
> though I am not sure such option is supported right now.
>
Did some more tests with modified testpmd, and reached some conclusions:
The logical interface would be to allow rte_pdump_init() to be called by
the process that would be using the rx/tx burst APIs.
This doesn't work as it should because the multi-process socket API
assumes that it only runs the server in the primary. The secondary
can start its own MP thread, but it won't work:
Primary EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
Secondary: EAL: Multi-process socket /var/run/dpdk/rte/mp_socket_6057_1ccd4157fd5
The problem is that when the client (pdump or dumpcap) tries to run, it uses the mp_socket
in the primary, which causes: EAL: Cannot find action: mp_pdump
Looks like the whole MP socket mechanism is just not up to this.
Maybe pdump needs to have its own socket and control thread?
Or MP socket needs to have some multicast fanout to all secondaries?
2. Fut
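
The "Cannot find action" failure reported in this message can be illustrated with a toy model in plain C, assuming each process keeps its own table of named handlers and a request is looked up only in the table of the endpoint it was sent to. The names `toy_endpoint`, `toy_register`, `toy_deliver` and `toy_pdump_handler` are hypothetical; this is not the EAL multi-process API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_ACTIONS 8

typedef int (*toy_action_fn)(const char *payload);

struct toy_action {
    const char *name;
    toy_action_fn fn;
};

/* Hypothetical model of one process's MP endpoint and its handlers. */
struct toy_endpoint {
    struct toy_action actions[TOY_MAX_ACTIONS];
    size_t n_actions;
};

/* Register a named handler on one endpoint; -1 if the table is full. */
int toy_register(struct toy_endpoint *ep, const char *name, toy_action_fn fn)
{
    assert(ep != NULL && name != NULL && fn != NULL);
    if (ep->n_actions == TOY_MAX_ACTIONS)
        return -1;
    ep->actions[ep->n_actions].name = name;
    ep->actions[ep->n_actions].fn = fn;
    ep->n_actions++;
    return 0;
}

/* Deliver a request to one endpoint; -1 models "Cannot find action". */
int toy_deliver(const struct toy_endpoint *ep, const char *name,
                const char *payload)
{
    assert(ep != NULL && name != NULL);
    for (size_t i = 0; i < ep->n_actions; i++) {
        if (strcmp(ep->actions[i].name, name) == 0)
            return ep->actions[i].fn(payload);
    }
    return -1;
}

/* Placeholder handler; a real one would install the capture callbacks. */
int toy_pdump_handler(const char *payload)
{
    (void)payload;
    return 0;
}
```

If only the secondary's endpoint registers "mp_pdump" while the client sends its request to the primary's endpoint, the lookup fails even though a handler exists elsewhere.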
* Re: Issues around packet capture when secondary process is doing rx/tx
2024-01-09 23:06 ` Stephen Hemminger
@ 2024-01-09 23:07 ` Stephen Hemminger
2024-04-03 12:11 ` Ferruh Yigit
2024-01-10 20:11 ` Konstantin Ananyev
1 sibling, 1 reply; 18+ messages in thread
From: Stephen Hemminger @ 2024-01-09 23:07 UTC (permalink / raw)
To: Konstantin Ananyev; +Cc: dev, arshdeep.kaur, Gowda, Sandesh, Reshma Pattan
On Tue, 9 Jan 2024 15:06:47 -0800
Stephen Hemminger <stephen@networkplumber.org> wrote:
> On Mon, 8 Jan 2024 15:13:25 +0000
> Konstantin Ananyev <konstantin.ananyev@huawei.com> wrote:
>
> > > I have been looking at a problem reported by Sandesh
> > > where packet capture does not work if rx/tx burst is done in secondary process.
> > >
> > > The root cause is that existing rx/tx callback model just doesn't work
> > > unless the process doing the rx/tx burst calls is the same one that
> > > registered the callbacks.
> > >
> > > An example sequence would be:
> > > 1. dumpcap (or pdump) as secondary tells pdump in primary to register callback
> > > 2. secondary process calls rx_burst.
> > > 3. rx_burst sees the callback but it has pointer pdump_rx which is not necessarily
> > > at same location in primary and secondary process.
> > > 4. indirect function call in secondary to bad location likely causes crash.
> >
> > As I remember, RX/TX callbacks were never intended to work over multiple processes.
> > Right now RX/TX callbacks are private for the process, different process simply should not
> > see/execute them.
> > I.E. it callbacks list is part of 'struct rte_eth_dev' itself, not the rte_eth_dev.data that is shared
> > between processes.
> > It should be normal, wehn for the same port/queue you will end-up with different list of callbacks
> > for different processes.
> > So, unless I am missing something, I don't see how we can end-up with 3) and 4) from above:
> > From my understanding secondary process will never see/call primary's callbacks.
> >
> > About pdump itself, it was a while when I looked at it last time, but as I remember to start it to work,
> > server process has to call rte_pdump_init() which in terns register PDUMP_MP handler.
> > I suppose for the secondary process to act as a 'pdump server' it needs to call rte_pdump_init() itself,
> > though I am not sure such option is supported right now.
> >
>
> Did some more tests with modified testpmd, and reached some conclusions:
>
> The logical interface would be to allow rte_pdump_init() to be called by
> the process that would be using rx/tx burst API's.
>
> This doesn't work as it should because the multi-process socket API
> assumes that the it only runs the server in primary. The secondary
> can start its own MP thread, but it won't work:
>
> Primary EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
> Secondary: EAL: Multi-process socket /var/run/dpdk/rte/mp_socket_6057_1ccd4157fd5
>
> The problem is when client (pdump or dumpcap) tries to run, it uses the mp_socket
> in the primary which causes: EAL: Cannot find action: mp_pdump
>
> Looks like the whole MP socket mechanism is just not up to this.
>
> Maybe pdump needs to have its own socket and control thread?
> Or MP socket needs to have some multicast fanout to all secondaries?
>
> 2. Fut
* Re: Issues around packet capture when secondary process is doing rx/tx
2024-01-09 23:07 ` Stephen Hemminger
@ 2024-04-03 12:11 ` Ferruh Yigit
0 siblings, 0 replies; 18+ messages in thread
From: Ferruh Yigit @ 2024-04-03 12:11 UTC (permalink / raw)
To: Stephen Hemminger, Konstantin Ananyev
Cc: dev, arshdeep.kaur, Gowda, Sandesh, Reshma Pattan
On 1/9/2024 11:07 PM, Stephen Hemminger wrote:
> On Tue, 9 Jan 2024 15:06:47 -0800
> Stephen Hemminger <stephen@networkplumber.org> wrote:
>
>> On Mon, 8 Jan 2024 15:13:25 +0000
>> Konstantin Ananyev <konstantin.ananyev@huawei.com> wrote:
>>
>>>> I have been looking at a problem reported by Sandesh
>>>> where packet capture does not work if rx/tx burst is done in secondary process.
>>>>
>>>> The root cause is that existing rx/tx callback model just doesn't work
>>>> unless the process doing the rx/tx burst calls is the same one that
>>>> registered the callbacks.
>>>>
>>>> An example sequence would be:
>>>> 1. dumpcap (or pdump) as secondary tells pdump in primary to register callback
>>>> 2. secondary process calls rx_burst.
>>>> 3. rx_burst sees the callback but it has pointer pdump_rx which is not necessarily
>>>> at same location in primary and secondary process.
>>>> 4. indirect function call in secondary to bad location likely causes crash.
>>>
>>> As I remember, RX/TX callbacks were never intended to work over multiple processes.
>>> Right now RX/TX callbacks are private for the process, different process simply should not
>>> see/execute them.
>>> I.E. it callbacks list is part of 'struct rte_eth_dev' itself, not the rte_eth_dev.data that is shared
>>> between processes.
>>> It should be normal, wehn for the same port/queue you will end-up with different list of callbacks
>>> for different processes.
>>> So, unless I am missing something, I don't see how we can end-up with 3) and 4) from above:
>>> From my understanding secondary process will never see/call primary's callbacks.
>>>
>>> About pdump itself, it was a while when I looked at it last time, but as I remember to start it to work,
>>> server process has to call rte_pdump_init() which in terns register PDUMP_MP handler.
>>> I suppose for the secondary process to act as a 'pdump server' it needs to call rte_pdump_init() itself,
>>> though I am not sure such option is supported right now.
>>>
>>
>> Did some more tests with modified testpmd, and reached some conclusions:
>>
>> The logical interface would be to allow rte_pdump_init() to be called by
>> the process that would be using rx/tx burst API's.
>>
>> This doesn't work as it should because the multi-process socket API
>> assumes that the it only runs the server in primary. The secondary
>> can start its own MP thread, but it won't work:
>>
>> Primary EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
>> Secondary: EAL: Multi-process socket /var/run/dpdk/rte/mp_socket_6057_1ccd4157fd5
>>
>> The problem is when client (pdump or dumpcap) tries to run, it uses the mp_socket
>> in the primary which causes: EAL: Cannot find action: mp_pdump
>>
>> Looks like the whole MP socket mechanism is just not up to this.
>>
>> Maybe pdump needs to have its own socket and control thread?
>> Or MP socket needs to have some multicast fanout to all secondaries?
>>
I replied to the old email, but you seem to have already figured out the root cause.
So when a secondary sends an MP message, the registered MP handler in
another secondary is not called.
As you suggested, fan-out to all secondaries with a flag in the message
can be an option.
One of the reasons the MP socket was added was so that, when a device is hotplugged in
a secondary, the new device is populated in both the primary and the secondary.
As far as I know, if there are multiple secondaries, the device is populated
to all of them, not via secondary-to-secondary communication, but via communication
from the primary to all secondaries.
So some kind of fan-out to all secondaries should already be happening for
the device hotplug use case.
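
A minimal sketch of the fan-out idea, as a toy model in plain C: the primary forwards one request to every secondary it knows about, and only those with a handler act on it. `toy_secondary` and `toy_fanout` are hypothetical names, and the model says nothing about how the real MP socket code tracks its peers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a secondary: handler present, and requests handled. */
struct toy_secondary {
    int has_handler;
    unsigned int handled;
};

/*
 * Primary-side fan-out: forward one request to every known secondary.
 * Returns how many secondaries actually handled it.
 */
size_t toy_fanout(struct toy_secondary *peers, size_t n_peers)
{
    size_t handled = 0;

    assert(peers != NULL || n_peers == 0);
    for (size_t i = 0; i < n_peers; i++) {
        if (!peers[i].has_handler)
            continue;           /* e.g. a capture client with no handler */
        peers[i].handled++;
        handled++;
    }
    return handled;
}
```

The return value lets the sender tell "nobody handled this" apart from "at least one process enabled capture", which a flag in the message alone would not report.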
* RE: Issues around packet capture when secondary process is doing rx/tx
2024-01-09 23:06 ` Stephen Hemminger
2024-01-09 23:07 ` Stephen Hemminger
@ 2024-01-10 20:11 ` Konstantin Ananyev
2024-04-03 12:20 ` Ferruh Yigit
1 sibling, 1 reply; 18+ messages in thread
From: Konstantin Ananyev @ 2024-01-10 20:11 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: dev, arshdeep.kaur, Gowda, Sandesh, Reshma Pattan
> -----Original Message-----
> From: Stephen Hemminger <stephen@networkplumber.org>
> Sent: Tuesday, January 9, 2024 11:07 PM
> To: Konstantin Ananyev <konstantin.ananyev@huawei.com>
> Cc: dev@dpdk.org; arshdeep.kaur@intel.com; Gowda, Sandesh <sandesh.gowda@intel.com>; Reshma Pattan
> <reshma.pattan@intel.com>
> Subject: Re: Issues around packet capture when secondary process is doing rx/tx
>
> On Mon, 8 Jan 2024 15:13:25 +0000
> Konstantin Ananyev <konstantin.ananyev@huawei.com> wrote:
>
> > > I have been looking at a problem reported by Sandesh
> > > where packet capture does not work if rx/tx burst is done in secondary process.
> > >
> > > The root cause is that existing rx/tx callback model just doesn't work
> > > unless the process doing the rx/tx burst calls is the same one that
> > > registered the callbacks.
> > >
> > > An example sequence would be:
> > > 1. dumpcap (or pdump) as secondary tells pdump in primary to register callback
> > > 2. secondary process calls rx_burst.
> > > 3. rx_burst sees the callback but it has pointer pdump_rx which is not necessarily
> > > at same location in primary and secondary process.
> > > 4. indirect function call in secondary to bad location likely causes crash.
> >
> > As I remember, RX/TX callbacks were never intended to work over multiple processes.
> > Right now RX/TX callbacks are private for the process, different process simply should not
> > see/execute them.
> > I.E. it callbacks list is part of 'struct rte_eth_dev' itself, not the rte_eth_dev.data that is shared
> > between processes.
> > It should be normal, wehn for the same port/queue you will end-up with different list of callbacks
> > for different processes.
> > So, unless I am missing something, I don't see how we can end-up with 3) and 4) from above:
> > From my understanding secondary process will never see/call primary's callbacks.
> >
> > About pdump itself, it was a while when I looked at it last time, but as I remember to start it to work,
> > server process has to call rte_pdump_init() which in terns register PDUMP_MP handler.
> > I suppose for the secondary process to act as a 'pdump server' it needs to call rte_pdump_init() itself,
> > though I am not sure such option is supported right now.
> >
>
> Did some more tests with modified testpmd, and reached some conclusions:
>
> The logical interface would be to allow rte_pdump_init() to be called by
> the process that would be using rx/tx burst API's.
>
> This doesn't work as it should because the multi-process socket API
> assumes that the it only runs the server in primary. The secondary
> can start its own MP thread, but it won't work:
>
> Primary EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
> Secondary: EAL: Multi-process socket /var/run/dpdk/rte/mp_socket_6057_1ccd4157fd5
>
> The problem is when client (pdump or dumpcap) tries to run, it uses the mp_socket
> in the primary which causes: EAL: Cannot find action: mp_pdump
>
> Looks like the whole MP socket mechanism is just not up to this.
>
> Maybe pdump needs to have its own socket and control thread?
> Or MP socket needs to have some multicast fanout to all secondaries?
Maybe we can do something simpler: pass to pdump_enable() where we want to enable it:
on the primary (remote) process or on the secondary (local) process?
Then for the primary, send a message over the MP socket (as we are doing now), and for the secondary (itself),
just do the actual pdump enablement on its own (install callbacks, etc.).
Yes, that way one secondary would not be able to enable/disable pdump on another secondary,
only on itself, but maybe that is not needed?
Konstantin
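
The proposal above could look roughly like the following toy model in plain C. The target selector and all names (`toy_pdump_target`, `toy_pdump_state`, `toy_pdump_enable`) are hypothetical and are not part of the existing pdump API; the two branches only count what a real implementation would do (send an MP request, or install callbacks in the calling process).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical target selector; not part of the existing pdump API. */
enum toy_pdump_target {
    TOY_PDUMP_REMOTE_PRIMARY,  /* ask the primary over the MP socket */
    TOY_PDUMP_LOCAL            /* enable capture in the calling process */
};

/* Counters standing in for the side effects of each path. */
struct toy_pdump_state {
    unsigned int mp_requests_sent;
    unsigned int local_callbacks_installed;
};

/* Returns 0 on success, -1 for an unknown target. */
int toy_pdump_enable(struct toy_pdump_state *st, enum toy_pdump_target target)
{
    assert(st != NULL);
    switch (target) {
    case TOY_PDUMP_REMOTE_PRIMARY:
        st->mp_requests_sent++;          /* existing path: message to primary */
        return 0;
    case TOY_PDUMP_LOCAL:
        st->local_callbacks_installed++; /* proposed path: no MP message */
        return 0;
    }
    return -1;
}
```

The local path needs no cross-process message at all, which is why in this scheme a secondary can only enable capture on itself and not on another secondary.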
* Re: Issues around packet capture when secondary process is doing rx/tx
2024-01-10 20:11 ` Konstantin Ananyev
@ 2024-04-03 12:20 ` Ferruh Yigit
2024-04-04 13:26 ` Konstantin Ananyev
0 siblings, 1 reply; 18+ messages in thread
From: Ferruh Yigit @ 2024-04-03 12:20 UTC (permalink / raw)
To: Konstantin Ananyev, Stephen Hemminger
Cc: dev, arshdeep.kaur, Gowda, Sandesh, Reshma Pattan
On 1/10/2024 8:11 PM, Konstantin Ananyev wrote:
>
>
>> -----Original Message-----
>> From: Stephen Hemminger <stephen@networkplumber.org>
>> Sent: Tuesday, January 9, 2024 11:07 PM
>> To: Konstantin Ananyev <konstantin.ananyev@huawei.com>
>> Cc: dev@dpdk.org; arshdeep.kaur@intel.com; Gowda, Sandesh <sandesh.gowda@intel.com>; Reshma Pattan
>> <reshma.pattan@intel.com>
>> Subject: Re: Issues around packet capture when secondary process is doing rx/tx
>>
>> On Mon, 8 Jan 2024 15:13:25 +0000
>> Konstantin Ananyev <konstantin.ananyev@huawei.com> wrote:
>>
>>>> I have been looking at a problem reported by Sandesh
>>>> where packet capture does not work if rx/tx burst is done in secondary process.
>>>>
>>>> The root cause is that existing rx/tx callback model just doesn't work
>>>> unless the process doing the rx/tx burst calls is the same one that
>>>> registered the callbacks.
>>>>
>>>> An example sequence would be:
>>>> 1. dumpcap (or pdump) as secondary tells pdump in primary to register callback
>>>> 2. secondary process calls rx_burst.
>>>> 3. rx_burst sees the callback but it has pointer pdump_rx which is not necessarily
>>>> at same location in primary and secondary process.
>>>> 4. indirect function call in secondary to bad location likely causes crash.
>>>
>>> As I remember, RX/TX callbacks were never intended to work over multiple processes.
>>> Right now RX/TX callbacks are private for the process, different process simply should not
>>> see/execute them.
>>> I.E. it callbacks list is part of 'struct rte_eth_dev' itself, not the rte_eth_dev.data that is shared
>>> between processes.
>>> It should be normal, wehn for the same port/queue you will end-up with different list of callbacks
>>> for different processes.
>>> So, unless I am missing something, I don't see how we can end-up with 3) and 4) from above:
>>> From my understanding secondary process will never see/call primary's callbacks.
>>>
>>> About pdump itself, it was a while when I looked at it last time, but as I remember to start it to work,
>>> server process has to call rte_pdump_init() which in terns register PDUMP_MP handler.
>>> I suppose for the secondary process to act as a 'pdump server' it needs to call rte_pdump_init() itself,
>>> though I am not sure such option is supported right now.
>>>
>>
>> Did some more tests with modified testpmd, and reached some conclusions:
>>
>> The logical interface would be to allow rte_pdump_init() to be called by
>> the process that would be using rx/tx burst API's.
>>
>> This doesn't work as it should because the multi-process socket API
>> assumes that the it only runs the server in primary. The secondary
>> can start its own MP thread, but it won't work:
>>
>> Primary EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
>> Secondary: EAL: Multi-process socket /var/run/dpdk/rte/mp_socket_6057_1ccd4157fd5
>>
>> The problem is when client (pdump or dumpcap) tries to run, it uses the mp_socket
>> in the primary which causes: EAL: Cannot find action: mp_pdump
>>
>> Looks like the whole MP socket mechanism is just not up to this.
>>
>> Maybe pdump needs to have its own socket and control thread?
>> Or MP socket needs to have some multicast fanout to all secondaries?
>
> Might be we can do something simpler: pass to pdump_enable(), where we want to enable it:
> on primary (remote_ process or secondary (local) process?
> And then for primary send a message over MP socket (as we doing now), and for secondary (itself)
> just do actual pdump enablement on it's own (install callbacks, etc.).
> Yes, in that way, one secondary would not be able to enable/idable pdump on another secondary,
> only on itself, but might be it is not needed?
>
>
How would a secondary, let's say a testpmd secondary, install callbacks without
getting the 'mp' and 'ring' info from the pdump secondary process?
* RE: Issues around packet capture when secondary process is doing rx/tx
2024-04-03 12:20 ` Ferruh Yigit
@ 2024-04-04 13:26 ` Konstantin Ananyev
2024-04-04 14:28 ` Ferruh Yigit
0 siblings, 1 reply; 18+ messages in thread
From: Konstantin Ananyev @ 2024-04-04 13:26 UTC (permalink / raw)
To: Ferruh Yigit, Stephen Hemminger
Cc: dev, arshdeep.kaur, Gowda, Sandesh, Reshma Pattan
> >> -----Original Message-----
> >> From: Stephen Hemminger <stephen@networkplumber.org>
> >> Sent: Tuesday, January 9, 2024 11:07 PM
> >> To: Konstantin Ananyev <konstantin.ananyev@huawei.com>
> >> Cc: dev@dpdk.org; arshdeep.kaur@intel.com; Gowda, Sandesh <sandesh.gowda@intel.com>; Reshma Pattan
> >> <reshma.pattan@intel.com>
> >> Subject: Re: Issues around packet capture when secondary process is doing rx/tx
> >>
> >> On Mon, 8 Jan 2024 15:13:25 +0000
> >> Konstantin Ananyev <konstantin.ananyev@huawei.com> wrote:
> >>
> >>>> I have been looking at a problem reported by Sandesh
> >>>> where packet capture does not work if rx/tx burst is done in secondary process.
> >>>>
> >>>> The root cause is that existing rx/tx callback model just doesn't work
> >>>> unless the process doing the rx/tx burst calls is the same one that
> >>>> registered the callbacks.
> >>>>
> >>>> An example sequence would be:
> >>>> 1. dumpcap (or pdump) as secondary tells pdump in primary to register callback
> >>>> 2. secondary process calls rx_burst.
> >>>> 3. rx_burst sees the callback but it has pointer pdump_rx which is not necessarily
> >>>> at same location in primary and secondary process.
> >>>> 4. indirect function call in secondary to bad location likely causes crash.
> >>>
> >>> As I remember, RX/TX callbacks were never intended to work over multiple processes.
> >>> Right now RX/TX callbacks are private for the process, different process simply should not
> >>> see/execute them.
> >>> I.E. it callbacks list is part of 'struct rte_eth_dev' itself, not the rte_eth_dev.data that is shared
> >>> between processes.
> >>> It should be normal, wehn for the same port/queue you will end-up with different list of callbacks
> >>> for different processes.
> >>> So, unless I am missing something, I don't see how we can end-up with 3) and 4) from above:
> >>> From my understanding secondary process will never see/call primary's callbacks.
> >>>
> >>> About pdump itself, it was a while when I looked at it last time, but as I remember to start it to work,
> >>> server process has to call rte_pdump_init() which in terns register PDUMP_MP handler.
> >>> I suppose for the secondary process to act as a 'pdump server' it needs to call rte_pdump_init() itself,
> >>> though I am not sure such option is supported right now.
> >>>
> >>
> >> Did some more tests with modified testpmd, and reached some conclusions:
> >>
> >> The logical interface would be to allow rte_pdump_init() to be called by
> >> the process that would be using rx/tx burst API's.
> >>
> >> This doesn't work as it should because the multi-process socket API
> >> assumes that the it only runs the server in primary. The secondary
> >> can start its own MP thread, but it won't work:
> >>
> >> Primary EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
> >> Secondary: EAL: Multi-process socket /var/run/dpdk/rte/mp_socket_6057_1ccd4157fd5
> >>
> >> The problem is when client (pdump or dumpcap) tries to run, it uses the mp_socket
> >> in the primary which causes: EAL: Cannot find action: mp_pdump
> >>
> >> Looks like the whole MP socket mechanism is just not up to this.
> >>
> >> Maybe pdump needs to have its own socket and control thread?
> >> Or MP socket needs to have some multicast fanout to all secondaries?
> >
> > Might be we can do something simpler: pass to pdump_enable(), where we want to enable it:
> > on primary (remote_ process or secondary (local) process?
> > And then for primary send a message over MP socket (as we doing now), and for secondary (itself)
> > just do actual pdump enablement on it's own (install callbacks, etc.).
> > Yes, in that way, one secondary would not be able to enable/idable pdump on another secondary,
> > only on itself, but might be it is not needed?
> >
> >
>
> How secondary, lets say testpmd secondary, install callbacks without
> getting 'mp' & 'ring' info from pdump secondary process?
Please see my comment above (I copied it here too):
> Yes, that way one secondary would not be able to enable/disable pdump on another secondary, only on itself, but maybe that is not needed?
* Re: Issues around packet capture when secondary process is doing rx/tx
2024-04-04 13:26 ` Konstantin Ananyev
@ 2024-04-04 14:28 ` Ferruh Yigit
2024-04-04 15:21 ` Stephen Hemminger
2024-04-04 16:18 ` Konstantin Ananyev
0 siblings, 2 replies; 18+ messages in thread
From: Ferruh Yigit @ 2024-04-04 14:28 UTC (permalink / raw)
To: Konstantin Ananyev, Stephen Hemminger
Cc: dev, arshdeep.kaur, Gowda, Sandesh, Reshma Pattan
On 4/4/2024 2:26 PM, Konstantin Ananyev wrote:
>
>
>>>> -----Original Message-----
>>>> From: Stephen Hemminger <stephen@networkplumber.org>
>>>> Sent: Tuesday, January 9, 2024 11:07 PM
>>>> To: Konstantin Ananyev <konstantin.ananyev@huawei.com>
>>>> Cc: dev@dpdk.org; arshdeep.kaur@intel.com; Gowda, Sandesh <sandesh.gowda@intel.com>; Reshma Pattan
>>>> <reshma.pattan@intel.com>
>>>> Subject: Re: Issues around packet capture when secondary process is doing rx/tx
>>>>
>>>> On Mon, 8 Jan 2024 15:13:25 +0000
>>>> Konstantin Ananyev <konstantin.ananyev@huawei.com> wrote:
>>>>
>>>>>> I have been looking at a problem reported by Sandesh
>>>>>> where packet capture does not work if rx/tx burst is done in secondary process.
>>>>>>
>>>>>> The root cause is that existing rx/tx callback model just doesn't work
>>>>>> unless the process doing the rx/tx burst calls is the same one that
>>>>>> registered the callbacks.
>>>>>>
>>>>>> An example sequence would be:
>>>>>> 1. dumpcap (or pdump) as secondary tells pdump in primary to register callback
>>>>>> 2. secondary process calls rx_burst.
>>>>>> 3. rx_burst sees the callback but it has pointer pdump_rx which is not necessarily
>>>>>> at same location in primary and secondary process.
>>>>>> 4. indirect function call in secondary to bad location likely causes crash.
>>>>>
>>>>> As I remember, RX/TX callbacks were never intended to work over multiple processes.
>>>>> Right now RX/TX callbacks are private for the process, different process simply should not
>>>>> see/execute them.
>>>>> I.E. it callbacks list is part of 'struct rte_eth_dev' itself, not the rte_eth_dev.data that is shared
>>>>> between processes.
>>>>> It should be normal, wehn for the same port/queue you will end-up with different list of callbacks
>>>>> for different processes.
>>>>> So, unless I am missing something, I don't see how we can end-up with 3) and 4) from above:
>>>>> From my understanding secondary process will never see/call primary's callbacks.
>>>>>
>>>>> About pdump itself, it was a while when I looked at it last time, but as I remember to start it to work,
>>>>> server process has to call rte_pdump_init() which in terns register PDUMP_MP handler.
>>>>> I suppose for the secondary process to act as a 'pdump server' it needs to call rte_pdump_init() itself,
>>>>> though I am not sure such option is supported right now.
>>>>>
>>>>
>>>> Did some more tests with modified testpmd, and reached some conclusions:
>>>>
>>>> The logical interface would be to allow rte_pdump_init() to be called by
>>>> the process that would be using rx/tx burst API's.
>>>>
>>>> This doesn't work as it should because the multi-process socket API
>>>> assumes that the it only runs the server in primary. The secondary
>>>> can start its own MP thread, but it won't work:
>>>>
>>>> Primary EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
>>>> Secondary: EAL: Multi-process socket /var/run/dpdk/rte/mp_socket_6057_1ccd4157fd5
>>>>
>>>> The problem is when client (pdump or dumpcap) tries to run, it uses the mp_socket
>>>> in the primary which causes: EAL: Cannot find action: mp_pdump
>>>>
>>>> Looks like the whole MP socket mechanism is just not up to this.
>>>>
>>>> Maybe pdump needs to have its own socket and control thread?
>>>> Or MP socket needs to have some multicast fanout to all secondaries?
>>>
>>> Might be we can do something simpler: pass to pdump_enable(), where we want to enable it:
>>> on primary (remote_ process or secondary (local) process?
>>> And then for primary send a message over MP socket (as we doing now), and for secondary (itself)
>>> just do actual pdump enablement on it's own (install callbacks, etc.).
>>> Yes, in that way, one secondary would not be able to enable/idable pdump on another secondary,
>>> only on itself, but might be it is not needed?
>>>
>>>
>>
>> How secondary, lets say testpmd secondary, install callbacks without
>> getting 'mp' & 'ring' info from pdump secondary process?
>
> Please see my comment above (I copied it here too):
>> Yes, in that way, one secondary would not be able to enable/disable pdump on another secondary, only on itself, but might be it is not needed?
>
I saw it, Konstantin, but it wasn't clear to me what you are suggesting;
that is why I am asking for more detail.
Do you suggest that when testpmd runs as a secondary process and does
forwarding, it should do the tasks of pdump itself and we don't use
pdump at all?
* Re: Issues around packet capture when secondary process is doing rx/tx
2024-04-04 14:28 ` Ferruh Yigit
@ 2024-04-04 15:21 ` Stephen Hemminger
2024-04-04 16:18 ` Konstantin Ananyev
1 sibling, 0 replies; 18+ messages in thread
From: Stephen Hemminger @ 2024-04-04 15:21 UTC (permalink / raw)
To: Ferruh Yigit
Cc: Konstantin Ananyev, dev, arshdeep.kaur, Gowda, Sandesh, Reshma Pattan
> >>>> Maybe pdump needs to have its own socket and control thread?
> >>>> Or MP socket needs to have some multicast fanout to all secondaries?
> >>>
> >>> Maybe we can do something simpler: pass to pdump_enable() where we want to enable it:
> >>> on the primary (remote) process or the secondary (local) process?
> >>> Then for the primary send a message over the MP socket (as we are doing now), and for the secondary (itself)
> >>> just do the actual pdump enablement on its own (install callbacks, etc.).
> >>> Yes, in that way one secondary would not be able to enable/disable pdump on another secondary,
> >>> only on itself, but maybe that is not needed?
> >>>
> >>>
> >>
> >> How would a secondary, let's say the testpmd secondary, install callbacks without
> >> getting the 'mp' & 'ring' info from the pdump secondary process?
> >
> > Please see my comment above (I copied it here too):
> >> Yes, in that way one secondary would not be able to enable/disable pdump on another secondary, only on itself, but maybe that is not needed?
> >
>
> I saw it, Konstantin, but it wasn't clear to me what you are suggesting,
> which is why I am asking for more detail.
>
> Are you suggesting that when testpmd runs as a secondary process and does the
> forwarding, it should do the tasks of pdump itself, so that we don't use
> pdump at all?
>
I looked into calling rte_pdump_init() in the active secondary process,
but that won't work because the passive secondary won't talk to it
over the right Unix domain socket. It might be possible to have multiple
MP server sockets and use some form of AF_UNIX multicast, but that gets
complex to handle.
Probably best to skip callbacks for this and use a state flag in eth_dev_driver.
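
The socket mismatch can be shown with a small model in plain C (no DPDK
headers). Each process keeps its own table of named actions behind its own
socket, and a request delivered to the primary's socket can only find actions
registered in the primary's table. The names mp_endpoint, mp_register,
mp_request and client_enable_capture are hypothetical; they only mimic the
"Cannot find action: mp_pdump" failure reported earlier in the thread and are
not the EAL API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_ACTIONS 8

/* Hypothetical model: one action table per process, reached via one socket. */
struct mp_endpoint {
	const char *socket_path;
	const char *actions[MAX_ACTIONS];
	size_t nb_actions;
};

/* Register a named action in this process only.
 * Returns 0 on success, -1 if the table is full. */
static int mp_register(struct mp_endpoint *ep, const char *name)
{
	assert(ep != NULL && name != NULL);
	if (ep->nb_actions == MAX_ACTIONS)
		return -1;
	ep->actions[ep->nb_actions++] = name;
	return 0;
}

/* Deliver a request to the process that owns this socket.
 * Returns 0 if a matching action exists there, -1 otherwise. */
static int mp_request(const struct mp_endpoint *ep, const char *name)
{
	assert(ep != NULL && name != NULL);
	for (size_t i = 0; i < ep->nb_actions; i++)
		if (strcmp(ep->actions[i], name) == 0)
			return 0;
	return -1;
}

/* In this model the capture client always targets the primary's socket. */
static int client_enable_capture(const struct mp_endpoint *primary)
{
	return mp_request(primary, "mp_pdump");
}
```

In this model, registering "mp_pdump" only on the secondary's endpoint leaves
client_enable_capture() failing against the primary, which is the situation
described above.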
* RE: Issues around packet capture when secondary process is doing rx/tx
2024-04-04 14:28 ` Ferruh Yigit
2024-04-04 15:21 ` Stephen Hemminger
@ 2024-04-04 16:18 ` Konstantin Ananyev
1 sibling, 0 replies; 18+ messages in thread
From: Konstantin Ananyev @ 2024-04-04 16:18 UTC (permalink / raw)
To: Ferruh Yigit, Stephen Hemminger
Cc: dev, arshdeep.kaur, Gowda, Sandesh, Reshma Pattan
> >>>> -----Original Message-----
> >>>> From: Stephen Hemminger <stephen@networkplumber.org>
> >>>> Sent: Tuesday, January 9, 2024 11:07 PM
> >>>> To: Konstantin Ananyev <konstantin.ananyev@huawei.com>
> >>>> Cc: dev@dpdk.org; arshdeep.kaur@intel.com; Gowda, Sandesh <sandesh.gowda@intel.com>; Reshma Pattan
> >>>> <reshma.pattan@intel.com>
> >>>> Subject: Re: Issues around packet capture when secondary process is doing rx/tx
> >>>>
> >>>> On Mon, 8 Jan 2024 15:13:25 +0000
> >>>> Konstantin Ananyev <konstantin.ananyev@huawei.com> wrote:
> >>>>
> >>>>>> I have been looking at a problem reported by Sandesh
> >>>>>> where packet capture does not work if rx/tx burst is done in secondary process.
> >>>>>>
> >>>>>> The root cause is that existing rx/tx callback model just doesn't work
> >>>>>> unless the process doing the rx/tx burst calls is the same one that
> >>>>>> registered the callbacks.
> >>>>>>
> >>>>>> An example sequence would be:
> >>>>>> 1. dumpcap (or pdump) as secondary tells pdump in primary to register callback
> >>>>>> 2. secondary process calls rx_burst.
> >>>>>> 3. rx_burst sees the callback but it has pointer pdump_rx which is not necessarily
> >>>>>> at same location in primary and secondary process.
> >>>>>> 4. indirect function call in secondary to bad location likely causes crash.
> >>>>>
> >>>>> As I remember, RX/TX callbacks were never intended to work across multiple processes.
> >>>>> Right now RX/TX callbacks are private to the process; a different process simply should not
> >>>>> see/execute them.
> >>>>> I.e. the callback list is part of 'struct rte_eth_dev' itself, not the rte_eth_dev.data that is shared
> >>>>> between processes.
> >>>>> It should be normal when, for the same port/queue, you end up with different lists of callbacks
> >>>>> for different processes.
> >>>>> So, unless I am missing something, I don't see how we can end up with 3) and 4) from above:
> >>>>> From my understanding, the secondary process will never see/call the primary's callbacks.
> >>>>>
> >>>>> About pdump itself, it has been a while since I last looked at it, but as I remember, for it to work
> >>>>> the server process has to call rte_pdump_init(), which in turn registers the PDUMP_MP handler.
> >>>>> I suppose for the secondary process to act as a 'pdump server' it needs to call rte_pdump_init() itself,
> >>>>> though I am not sure such an option is supported right now.
> >>>>>
> >>>>
> >>>> Did some more tests with modified testpmd, and reached some conclusions:
> >>>>
> >>>> The logical interface would be to allow rte_pdump_init() to be called by
> >>>> the process that would be using rx/tx burst API's.
> >>>>
> >>>> This doesn't work as it should because the multi-process socket API
> >>>> assumes that it only runs the server in the primary. The secondary
> >>>> can start its own MP thread, but it won't work:
> >>>>
> >>>> Primary EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
> >>>> Secondary: EAL: Multi-process socket /var/run/dpdk/rte/mp_socket_6057_1ccd4157fd5
> >>>>
> >>>> The problem is when client (pdump or dumpcap) tries to run, it uses the mp_socket
> >>>> in the primary which causes: EAL: Cannot find action: mp_pdump
> >>>>
> >>>> Looks like the whole MP socket mechanism is just not up to this.
> >>>>
> >>>> Maybe pdump needs to have its own socket and control thread?
> >>>> Or MP socket needs to have some multicast fanout to all secondaries?
> >>>
> >>> Maybe we can do something simpler: pass to pdump_enable() where we want to enable it:
> >>> on the primary (remote) process or the secondary (local) process?
> >>> Then for the primary send a message over the MP socket (as we are doing now), and for the secondary (itself)
> >>> just do the actual pdump enablement on its own (install callbacks, etc.).
> >>> Yes, in that way one secondary would not be able to enable/disable pdump on another secondary,
> >>> only on itself, but maybe that is not needed?
> >>>
> >>>
> >>
> >> How would a secondary, let's say the testpmd secondary, install callbacks without
> >> getting the 'mp' & 'ring' info from the pdump secondary process?
> >
> > Please see my comment above (I copied it here too):
> >> Yes, in that way one secondary would not be able to enable/disable pdump on another secondary, only on itself, but maybe that is not needed?
> >
>
> I saw it, Konstantin, but it wasn't clear to me what you are suggesting,
> which is why I am asking for more detail.
>
> Are you suggesting that when testpmd runs as a secondary process and does the
> forwarding, it should do the tasks of pdump itself, so that we don't use
> pdump at all?
Sort of: we can still use the pdump API, but under the hood, instead of sending a request to the primary,
the secondary would just install an RX/TX callback for itself.
Again, with that scheme secondary<->secondary would not be supported.
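
A minimal sketch of that dispatch, as a plain C model rather than real pdump
code. The enum pdump_target, the struct proc_state and the function
pdump_enable_sketch() are hypothetical names introduced only for illustration;
the counters stand in for "install an RX/TX callback locally" and "send a
request over the MP socket".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical selector: where the capture hooks should be installed. */
enum pdump_target {
	PDUMP_TARGET_REMOTE, /* ask the primary over the MP socket (as done now) */
	PDUMP_TARGET_LOCAL   /* install rx/tx callbacks in the calling process */
};

/* Per-process bookkeeping for the model. */
struct proc_state {
	unsigned int local_callbacks; /* callbacks installed in this process */
	unsigned int requests_sent;   /* MP requests sent to the primary */
};

/* Same entry point for both cases; only the action taken differs.
 * Returns 0 on success, -1 on an unknown target. */
static int pdump_enable_sketch(struct proc_state *self, enum pdump_target target)
{
	assert(self != NULL);
	switch (target) {
	case PDUMP_TARGET_LOCAL:
		/* The caller is the process doing rx/tx: hook itself directly. */
		self->local_callbacks++;
		return 0;
	case PDUMP_TARGET_REMOTE:
		/* The caller wants the primary to capture: send it a request. */
		self->requests_sent++;
		return 0;
	}
	return -1;
}
```

Note that nothing in this model lets one secondary reach another secondary,
which matches the limitation stated above.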
* Re: Issues around packet capture when secondary process is doing rx/tx
2024-01-08 15:13 ` Konstantin Ananyev
` (2 preceding siblings ...)
2024-01-09 23:06 ` Stephen Hemminger
@ 2024-04-03 0:14 ` Stephen Hemminger
2024-04-03 11:42 ` Ferruh Yigit
4 siblings, 0 replies; 18+ messages in thread
From: Stephen Hemminger @ 2024-04-03 0:14 UTC (permalink / raw)
To: Konstantin Ananyev; +Cc: dev, arshdeep.kaur, Gowda, Sandesh, Reshma Pattan
On Mon, 8 Jan 2024 15:13:25 +0000
Konstantin Ananyev <konstantin.ananyev@huawei.com> wrote:
> > I have been looking at a problem reported by Sandesh
> > where packet capture does not work if rx/tx burst is done in secondary process.
> >
> > The root cause is that existing rx/tx callback model just doesn't work
> > unless the process doing the rx/tx burst calls is the same one that
> > registered the callbacks.
> >
> > An example sequence would be:
> > 1. dumpcap (or pdump) as secondary tells pdump in primary to register callback
> > 2. secondary process calls rx_burst.
> > 3. rx_burst sees the callback but it has pointer pdump_rx which is not necessarily
> > at same location in primary and secondary process.
> > 4. indirect function call in secondary to bad location likely causes crash.
>
> As I remember, RX/TX callbacks were never intended to work across multiple processes.
> Right now RX/TX callbacks are private to the process; a different process simply should not
> see/execute them.
> I.e. the callback list is part of 'struct rte_eth_dev' itself, not the rte_eth_dev.data that is shared
> between processes.
> It should be normal when, for the same port/queue, you end up with different lists of callbacks
> for different processes.
> So, unless I am missing something, I don't see how we can end up with 3) and 4) from above:
> From my understanding, the secondary process will never see/call the primary's callbacks.
>
> About pdump itself, it has been a while since I last looked at it, but as I remember, for it to work
> the server process has to call rte_pdump_init(), which in turn registers the PDUMP_MP handler.
> I suppose for the secondary process to act as a 'pdump server' it needs to call rte_pdump_init() itself,
> though I am not sure such an option is supported right now.
>
> >
> > Some possible workarounds.
> > 1. Keep callback list per-process: messy, but won't crash. Capture won't work
> > without other changes. In this primary would register callback, but secondaries
> > would not use them in rx/tx burst.
> >
> > 2. Replace use of rx/tx callback in pdump with change to rte_ethdev to have
> > a capture flag. (i.e. don't use indirection). Likely ABI problems.
> > Basically, ignore the rx/tx callback mechanism. This is my preferred
> > solution.
>
> It is not only the capture flag, it is also what to do with the captured packets
> (copy? If yes, then where to? examine? drop?, do something else?).
> It is probably not the best choice to add all these things into ethdev API.
>
> > 3. Some fix up mechanism (in EAL mp support?) to have each process fixup
> > its callback mechanism.
>
> Probably the easiest way to fix that - pass to rte_pdump_enable() extra information
> that would allow it to distinguish on what exact process (local, remote)
> we want to enable pdump functionality. Then it could act accordingly.
>
> >
> > 4. Do something in pdump_init to register the callback in same process context
> > (probably need callbacks to be per-process). Would mean callback is always
> > on independent of capture being enabled.
> >
> > 5. Get rid of indirect function call pointer, and replace it by index into
> > a static table of callback functions. Every process would have same code
> > (in this case pdump_rx) but at different address. Requires all callbacks
> > to be statically defined at build time.
>
> Doesn't look like a good approach - it will break many things.
>
> > The existing rx/tx callback is not safe if rx/tx burst is called from a different process
> > than the one where the callback is registered.
>
>
I have been looking into the best way to fix this, and the real answer is not to use
callbacks but to use a per-queue flag instead. The natural place to put these flags is in
rte_ethdev_driver. But this will mean an ABI breakage, so it will have to wait for the 24.11
release. Sometimes fixing a design flaw means an ABI change.
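
A rough sketch of the per-queue flag idea in plain C, assuming the flags live
in state that every process can see. The struct shared_port_state, the
capture_set() helper and rx_burst_flag_sketch() are hypothetical names, and
the "capture" here only counts packets instead of copying them to a ring. The
point is that only a flag, not a function pointer, is read from shared state,
so it does not matter where the capture code is loaded in each process.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_QUEUES 4

/* Hypothetical per-port state that would be shared between processes. */
struct shared_port_state {
	atomic_bool capture_rx[MAX_QUEUES];
};

/* Stand-in for the capture path: counts packets it would have copied. */
struct capture_stats {
	uint64_t copied;
};

/* Turn capture on or off for one queue; any process may call this. */
static void capture_set(struct shared_port_state *st, uint16_t queue, bool on)
{
	assert(st != NULL && queue < MAX_QUEUES);
	atomic_store_explicit(&st->capture_rx[queue], on, memory_order_release);
}

/* Model of an rx burst that received nb_rx packets on the given queue.
 * The flag is checked inline; no indirect call through shared memory. */
static uint16_t rx_burst_flag_sketch(struct shared_port_state *st, uint16_t queue,
				     uint16_t nb_rx, struct capture_stats *stats)
{
	assert(st != NULL && stats != NULL && queue < MAX_QUEUES);
	if (atomic_load_explicit(&st->capture_rx[queue], memory_order_acquire))
		stats->copied += nb_rx;
	return nb_rx;
}
```

Where such flags would actually be stored, and what the capture path does with
the packets, are open questions in the thread and are not settled by this
sketch.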
* Re: Issues around packet capture when secondary process is doing rx/tx
2024-01-08 15:13 ` Konstantin Ananyev
` (3 preceding siblings ...)
2024-04-03 0:14 ` Stephen Hemminger
@ 2024-04-03 11:42 ` Ferruh Yigit
4 siblings, 0 replies; 18+ messages in thread
From: Ferruh Yigit @ 2024-04-03 11:42 UTC (permalink / raw)
To: Konstantin Ananyev, Stephen Hemminger, dev
Cc: arshdeep.kaur, Gowda, Sandesh, Reshma Pattan
On 1/8/2024 3:13 PM, Konstantin Ananyev wrote:
>
>
>> I have been looking at a problem reported by Sandesh
>> where packet capture does not work if rx/tx burst is done in secondary process.
>>
>> The root cause is that existing rx/tx callback model just doesn't work
>> unless the process doing the rx/tx burst calls is the same one that
>> registered the callbacks.
>>
>> An example sequence would be:
>> 1. dumpcap (or pdump) as secondary tells pdump in primary to register callback
>> 2. secondary process calls rx_burst.
>> 3. rx_burst sees the callback but it has pointer pdump_rx which is not necessarily
>> at same location in primary and secondary process.
>> 4. indirect function call in secondary to bad location likely causes crash.
>
> As I remember, RX/TX callbacks were never intended to work across multiple processes.
> Right now RX/TX callbacks are private to the process; a different process simply should not
> see/execute them.
> I.e. the callback list is part of 'struct rte_eth_dev' itself, not the rte_eth_dev.data that is shared
> between processes.
> It should be normal when, for the same port/queue, you end up with different lists of callbacks
> for different processes.
> So, unless I am missing something, I don't see how we can end up with 3) and 4) from above:
> From my understanding, the secondary process will never see/call the primary's callbacks.
>
Ack. There must be another reason for the crash.
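
The per-process nature of the callback list can be modelled in a few lines of
plain C. This is only an illustration of the point above, not the ethdev
structures: port_shared stands in for the data shared between processes,
port_local for the per-process device structure, and all function names are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the part of a port that is shared between processes. */
struct port_shared {
	uint16_t port_id;
};

typedef uint16_t (*rx_cb_fn)(uint16_t nb_pkts);

/* Stand-in for the per-process part: it points at the shared data
 * but owns its own callback slot. */
struct port_local {
	struct port_shared *data;
	rx_cb_fn rx_cb; /* NULL means no callback in this process */
};

/* Example callback: reports how many packets it was shown. */
static uint16_t count_only_cb(uint16_t nb_pkts)
{
	return nb_pkts;
}

/* Install a callback in one process's private structure only. */
static void add_rx_callback(struct port_local *dev, rx_cb_fn fn)
{
	assert(dev != NULL);
	dev->rx_cb = fn;
}

/* Returns the number of packets seen by this process's callback, or 0
 * if this process has no callback installed. */
static uint16_t local_rx_burst(const struct port_local *dev, uint16_t nb_rx)
{
	assert(dev != NULL);
	return dev->rx_cb != NULL ? dev->rx_cb(nb_rx) : 0;
}
```

Two port_local instances pointing at the same port_shared behave like the
primary and the secondary here: a callback added to one is never visible
through the other.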
> About pdump itself, it has been a while since I last looked at it, but as I remember, for it to work
> the server process has to call rte_pdump_init(), which in turn registers the PDUMP_MP handler.
> I suppose for the secondary process to act as a 'pdump server' it needs to call rte_pdump_init() itself,
> though I am not sure such an option is supported right now.
>
Currently testpmd calls 'rte_pdump_init()'. Both the primary and the
secondary testpmd processes call this API and both register the PDUMP_MP
handler; I think this is OK.
When the pdump secondary process sends the MP message, both the primary and
the secondary testpmd processes should register callbacks with the provided
ring and mempool information.
I don't know whether the primary and secondary process callbacks running
simultaneously is what causes this problem; otherwise I would expect it to work.
>>
>> Some possible workarounds.
>> 1. Keep callback list per-process: messy, but won't crash. Capture won't work
>> without other changes. In this primary would register callback, but secondaries
>> would not use them in rx/tx burst.
>>
>> 2. Replace use of rx/tx callback in pdump with change to rte_ethdev to have
>> a capture flag. (i.e. don't use indirection). Likely ABI problems.
>> Basically, ignore the rx/tx callback mechanism. This is my preferred
>> solution.
>
> It is not only the capture flag, it is also what to do with the captured packets
> (copy? If yes, then where to? examine? drop?, do something else?).
> It is probably not the best choice to add all these things into ethdev API.
>
>> 3. Some fix up mechanism (in EAL mp support?) to have each process fixup
>> its callback mechanism.
>
> Probably the easiest way to fix that - pass to rte_pdump_enable() extra information
> that would allow it to distinguish on what exact process (local, remote)
> we want to enable pdump functionality. Then it could act accordingly.
>
>>
>> 4. Do something in pdump_init to register the callback in same process context
>> (probably need callbacks to be per-process). Would mean callback is always
>> on independent of capture being enabled.
>>
>> 5. Get rid of indirect function call pointer, and replace it by index into
>> a static table of callback functions. Every process would have same code
>> (in this case pdump_rx) but at different address. Requires all callbacks
>> to be statically defined at build time.
>
> Doesn't look like a good approach - it will break many things.
>
>> The existing rx/tx callback is not safe if rx/tx burst is called from a different process
>> than the one where the callback is registered.
>
>