DPDK patches and discussions
 help / color / mirror / Atom feed
From: Ferruh Yigit <ferruh.yigit@amd.com>
To: fengchengwen <fengchengwen@huawei.com>, <yuying.zhang@intel.com>,
	<aman.deep.singh@intel.com>
Cc: <dev@dpdk.org>, <xiaoyun.li@intel.com>,
	Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>,
	Thomas Monjalon <thomas@monjalon.net>
Subject: Re: [PATCH] app/testpmd: remove invalid ports when other process detach
Date: Fri, 20 May 2022 16:14:58 +0100	[thread overview]
Message-ID: <ff3aefc7-6c80-1fab-0ed3-a0dc4d07edff@amd.com> (raw)
In-Reply-To: <80bdb6d3-8259-80b1-3733-ef19b6b4f840@amd.com>

On 5/20/2022 4:05 PM, Ferruh Yigit wrote:
> [CAUTION: External Email]
> 
> On 3/2/2022 8:36 AM, fengchengwen wrote:
>> On 2022/3/2 16:26, Thomas Monjalon wrote:
>>> 02/03/2022 03:33, Chengwen Feng:
>>>> Start main and secondary process:
>>>> ./dpdk-testpmd -a BDF0 -a BDF1 --proc-type=auto -- -i --rxq=8 --txq=8
>>>>     --num-procs=2 --proc-id=0
>>>> ./dpdk-testpmd -a BDF0 -a BDF1 --proc-type=auto -- -i --rxq=8 --txq=8
>>>>     --num-procs=2 --proc-id=1
>>>> Execute following command in main process:
>>>>     port stop 0
>>>>     port detach 0
>>>> Execute following command in secondary process:
>>>>     set fwd mac
>>>>     start
>>>> The secondary process will display:
>>>>     Invalid port_id=0
>>>>     telcore 19 called rx_pkt_burst for not ready port 0
>>>>     stpmd> 8: [/lib64/libc.so.6(+0xdf600) [0xffff9e1dc600]]
>>>>     7: [/lib64/libpthread.so.0(+0x7c48) [0xffff9e28ac48]]
>>>>     6: [/usr/app/testpmd(eal_thread_loop+0x2c4) [0xb23574]]
>>>>     5: [/usr/app/testpmd() [0x9c21d8]]
>>>>     4: [/usr/app/testpmd() [0x9c2108]]
>>>>     3: [/usr/app/testpmd() [0x9b6cf0]]
>>>>     2: [/usr/app/testpmd() [0xad8620]]
>>>>     1: [/usr/app/testpmd(rte_dump_stack+0x20) [0xb1a130]]
>>>>
>>>> The root cause it that the secondary process has not removed invalid
>>>> ports when it processes RTE_ETH_EVENT_DESTROY event.
>>>
>>> Why the ports are not removed?
>>
> 
> It is referring to application (testpmd) level state, ethdev port is
> removed.
> 
> Above mentioned problem is valid since testpmd secondary process support
> is added.
> When primary hot remove a device, it updates relevant application state
> too, but for secondary process ethdev removes device without secondary
> process updating its state.
> 
> I agree that using ethdev events is appropriate for this case, since
> action originated from ethdev for secondary process, need a way to
> notify application about it.
> Another option can be checking and removing invalid ports in secondary
> before each forwarding start, but since we already have an event for
> destroy, using it looks good to me.
> 

Closing port in the primary has the same problem, secondary process 
state is not updated and crashes.
A high level design decision can be to reduce the application level 
states and rely on ethdev states more, to reduce/remove this ethdev and 
application state sync requirement,
@Aman, @Yuying, what do you think about this high level, long term goal?

(Like why there is a testpmd 'RTE_PORT_CLOSED' state, should be in ethdev?)

((BTW, why testpmd defined 'RTE_PORT_CLOSED' has 'RTE_' prefix, @Aman 
can you do a cleanup for it, 'RTE_' -> 'TESTPMD_' ?))


> 
>> Testpmd register function eth_event_callback to deal with DESTROY event,
>> currently it only assign ports[port_id].port_status with 
>> RTE_PORT_CLOSED, it
>> doesn't update other global variables like nb_ports.
>>
>>>
>>>> This patch adds a delay remove invalid ports invoke when process the
>>>> RTE_ETH_EVENT_DESTROY event.
>>>
>>> Why do we need this delay?
>>
>> The remove_invalid_ports will scan rte_eth_devices[], when process the 
>> DESTROY
>> event, the rte_eth_devices[x] still valid, so here we should a delay 
>> logic.
>>
> 
> There is a dependency problem in the DESTROY event.
> 
> We need some ethdev information to be able to deliver the event, so
> ethdev can't be completely destroyed until DESTROY event is processed.
> Which means when application received DESTROY event, ethdev is not fully
> destroyed yet.
> 
> Except from above, 'remove_invalid_ports()' is called twice for primary
> process (as mentioned in commit log), this timer can be helping these
> two calls not conflict, but I am not sure if we can rely on a time for 
> this.
> Since the problem is mainly for secondary process, assuming control
> commands like detaching a device will be called by primary process, so
> @Feng what do you think about adding a secondary process check to in the
> 'remove_invalid_ports_callback()' call? So 'remove_invalid_ports()' is
> called only once for primary process.
> 
>>>
>>> [...]
>>>> +static void
>>>> +remove_invalid_ports_callback(void *arg)
>>>> +{
>>>> +   RTE_SET_USED(arg);
>>>> +   remove_invalid_ports();
>>>> +}
>>> [...]
>>>>     case RTE_ETH_EVENT_DESTROY:
>>>>             ports[port_id].port_status = RTE_PORT_CLOSED;
>>>>             printf("Port %u is closed\n", port_id);
>>>> +           if (rte_eal_alarm_set(100000, 
>>>> remove_invalid_ports_callback,
>>>> +                           (void *)(intptr_t)port_id))
>>>> +                   fprintf(stderr,
>>>> +                           "Could not set up deferred device 
>>>> released\n");
>>>>             break;
>>>
>>>
>>>
>>>
>>> .
>>>
>>
> 


  reply	other threads:[~2022-05-20 15:15 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-03-02  2:33 Chengwen Feng
2022-03-02  8:26 ` Thomas Monjalon
2022-03-02  8:36   ` fengchengwen
2022-04-11  2:05     ` fengchengwen
2022-05-20 15:05     ` Ferruh Yigit
2022-05-20 15:14       ` Ferruh Yigit [this message]
2022-05-21 10:00       ` fengchengwen
2022-05-23  8:43         ` Ferruh Yigit

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ff3aefc7-6c80-1fab-0ed3-a0dc4d07edff@amd.com \
    --to=ferruh.yigit@amd.com \
    --cc=aman.deep.singh@intel.com \
    --cc=andrew.rybchenko@oktetlabs.ru \
    --cc=dev@dpdk.org \
    --cc=fengchengwen@huawei.com \
    --cc=thomas@monjalon.net \
    --cc=xiaoyun.li@intel.com \
    --cc=yuying.zhang@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).