From: Ferruh Yigit <ferruh.yigit@amd.com>
To: fengchengwen <fengchengwen@huawei.com>, <yuying.zhang@intel.com>,
<aman.deep.singh@intel.com>
Cc: <dev@dpdk.org>, <xiaoyun.li@intel.com>,
Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>,
Thomas Monjalon <thomas@monjalon.net>
Subject: Re: [PATCH] app/testpmd: remove invalid ports when other process detach
Date: Fri, 20 May 2022 16:14:58 +0100 [thread overview]
Message-ID: <ff3aefc7-6c80-1fab-0ed3-a0dc4d07edff@amd.com> (raw)
In-Reply-To: <80bdb6d3-8259-80b1-3733-ef19b6b4f840@amd.com>
On 5/20/2022 4:05 PM, Ferruh Yigit wrote:
> [CAUTION: External Email]
>
> On 3/2/2022 8:36 AM, fengchengwen wrote:
>> On 2022/3/2 16:26, Thomas Monjalon wrote:
>>> 02/03/2022 03:33, Chengwen Feng:
>>>> Start main and secondary process:
>>>> ./dpdk-testpmd -a BDF0 -a BDF1 --proc-type=auto -- -i --rxq=8 --txq=8
>>>> --num-procs=2 --proc-id=0
>>>> ./dpdk-testpmd -a BDF0 -a BDF1 --proc-type=auto -- -i --rxq=8 --txq=8
>>>> --num-procs=2 --proc-id=1
>>>> Execute following command in main process:
>>>> port stop 0
>>>> port detach 0
>>>> Execute following command in secondary process:
>>>> set fwd mac
>>>> start
>>>> The secondary process will display:
>>>> Invalid port_id=0
>>>> telcore 19 called rx_pkt_burst for not ready port 0
>>>> stpmd> 8: [/lib64/libc.so.6(+0xdf600) [0xffff9e1dc600]]
>>>> 7: [/lib64/libpthread.so.0(+0x7c48) [0xffff9e28ac48]]
>>>> 6: [/usr/app/testpmd(eal_thread_loop+0x2c4) [0xb23574]]
>>>> 5: [/usr/app/testpmd() [0x9c21d8]]
>>>> 4: [/usr/app/testpmd() [0x9c2108]]
>>>> 3: [/usr/app/testpmd() [0x9b6cf0]]
>>>> 2: [/usr/app/testpmd() [0xad8620]]
>>>> 1: [/usr/app/testpmd(rte_dump_stack+0x20) [0xb1a130]]
>>>>
>>>> The root cause it that the secondary process has not removed invalid
>>>> ports when it processes RTE_ETH_EVENT_DESTROY event.
>>>
>>> Why the ports are not removed?
>>
>
> It is referring to application (testpmd) level state, ethdev port is
> removed.
>
> Above mentioned problem is valid since testpmd secondary process support
> is added.
> When primary hot remove a device, it updates relevant application state
> too, but for secondary process ethdev removes device without secondary
> process updating its state.
>
> I agree that using ethdev events is appropriate for this case, since
> action originated from ethdev for secondary process, need a way to
> notify application about it.
> Another option can be checking and removing invalid ports in secondary
> before each forwarding start, but since we already have an event for
> destroy, using it looks good to me.
>
Closing port in the primary has the same problem, secondary process
state is not updated and crashes.
A high level design decision can be to reduce the application level
states and rely on ethdev states more, to reduce/remove this ethdev and
application state sync requirement,
@Aman, @Yuying, what do you think about this high level, long term goal?
(Like why there is a testpmd 'RTE_PORT_CLOSED' state, should be in ethdev?)
((BTW, why testpmd defined 'RTE_PORT_CLOSED' has 'RTE_' prefix, @Aman
can you do a cleanup for it, 'RTE_' -> 'TESTPMD_' ?))
>
>> Testpmd register function eth_event_callback to deal with DESTROY event,
>> currently it only assign ports[port_id].port_status with
>> RTE_PORT_CLOSED, it
>> doesn't update other global variables like nb_ports.
>>
>>>
>>>> This patch adds a delay remove invalid ports invoke when process the
>>>> RTE_ETH_EVENT_DESTROY event.
>>>
>>> Why do we need this delay?
>>
>> The remove_invalid_ports will scan rte_eth_devices[], when process the
>> DESTROY
>> event, the rte_eth_devices[x] still valid, so here we should a delay
>> logic.
>>
>
> There is a dependency problem in the DESTROY event.
>
> We need some ethdev information to be able to deliver the event, so
> ethdev can't be completely destroyed until DESTROY event is processed.
> Which means when application received DESTROY event, ethdev is not fully
> destroyed yet.
>
> Except from above, 'remove_invalid_ports()' is called twice for primary
> process (as mentioned in commit log), this timer can be helping these
> two calls not conflict, but I am not sure if we can rely on a time for
> this.
> Since the problem is mainly for secondary process, assuming control
> commands like detaching a device will be called by primary process, so
> @Feng what do you think about adding a secondary process check to in the
> 'remove_invalid_ports_callback()' call? So 'remove_invalid_ports()' is
> called only once for primary process.
>
>>>
>>> [...]
>>>> +static void
>>>> +remove_invalid_ports_callback(void *arg)
>>>> +{
>>>> + RTE_SET_USED(arg);
>>>> + remove_invalid_ports();
>>>> +}
>>> [...]
>>>> case RTE_ETH_EVENT_DESTROY:
>>>> ports[port_id].port_status = RTE_PORT_CLOSED;
>>>> printf("Port %u is closed\n", port_id);
>>>> + if (rte_eal_alarm_set(100000,
>>>> remove_invalid_ports_callback,
>>>> + (void *)(intptr_t)port_id))
>>>> + fprintf(stderr,
>>>> + "Could not set up deferred device
>>>> released\n");
>>>> break;
>>>
>>>
>>>
>>>
>>> .
>>>
>>
>
next prev parent reply other threads:[~2022-05-20 15:15 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-03-02 2:33 Chengwen Feng
2022-03-02 8:26 ` Thomas Monjalon
2022-03-02 8:36 ` fengchengwen
2022-04-11 2:05 ` fengchengwen
2022-05-20 15:05 ` Ferruh Yigit
2022-05-20 15:14 ` Ferruh Yigit [this message]
2022-05-21 10:00 ` fengchengwen
2022-05-23 8:43 ` Ferruh Yigit
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ff3aefc7-6c80-1fab-0ed3-a0dc4d07edff@amd.com \
--to=ferruh.yigit@amd.com \
--cc=aman.deep.singh@intel.com \
--cc=andrew.rybchenko@oktetlabs.ru \
--cc=dev@dpdk.org \
--cc=fengchengwen@huawei.com \
--cc=thomas@monjalon.net \
--cc=xiaoyun.li@intel.com \
--cc=yuying.zhang@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).