DPDK patches and discussions
From: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
To: fengchengwen <fengchengwen@huawei.com>,
	Konstantin Ananyev <konstantin.v.ananyev@yandex.ru>,
	"dev@dpdk.org" <dev@dpdk.org>,
	"thomas@monjalon.net" <thomas@monjalon.net>,
	Ferruh Yigit <ferruh.yigit@amd.com>,
	Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>,
	Kalesh AP <kalesh-anakkur.purayil@broadcom.com>,
	"Ajit Khaparde (ajit.khaparde@broadcom.com)"
	<ajit.khaparde@broadcom.com>
Cc: nd <nd@arm.com>, nd <nd@arm.com>
Subject: RE: [PATCH 1/5] ethdev: fix race-condition of proactive error handling mode
Date: Fri, 10 Mar 2023 03:25:11 +0000
Message-ID: <DBAPR08MB58145544AA01909D9F8FEFBF98BA9@DBAPR08MB5814.eurprd08.prod.outlook.com>
In-Reply-To: <af9f8d01-e8e0-3fdf-5cbc-106c05b519a3@huawei.com>



> -----Original Message-----
> From: fengchengwen <fengchengwen@huawei.com>
> Sent: Thursday, March 9, 2023 5:31 AM
> To: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>; Konstantin
> Ananyev <konstantin.v.ananyev@yandex.ru>; dev@dpdk.org;
> thomas@monjalon.net; Ferruh Yigit <ferruh.yigit@amd.com>; Andrew
> Rybchenko <andrew.rybchenko@oktetlabs.ru>; Kalesh AP
> <kalesh-anakkur.purayil@broadcom.com>; Ajit Khaparde
> (ajit.khaparde@broadcom.com) <ajit.khaparde@broadcom.com>
> Cc: nd <nd@arm.com>
> Subject: Re: [PATCH 1/5] ethdev: fix race-condition of proactive error handling
> mode
> 
> 
> 
> On 2023/3/9 11:03, Honnappa Nagarahalli wrote:
> >
> >
> >> -----Original Message-----
> >> From: fengchengwen <fengchengwen@huawei.com>
> >> Sent: Wednesday, March 8, 2023 7:00 PM
> >> To: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>; Konstantin
> >> Ananyev <konstantin.v.ananyev@yandex.ru>; dev@dpdk.org;
> >> thomas@monjalon.net; Ferruh Yigit <ferruh.yigit@amd.com>; Andrew
> >> Rybchenko <andrew.rybchenko@oktetlabs.ru>; Kalesh AP
> >> <kalesh-anakkur.purayil@broadcom.com>; Ajit Khaparde
> >> (ajit.khaparde@broadcom.com) <ajit.khaparde@broadcom.com>
> >> Cc: nd <nd@arm.com>
> >> Subject: Re: [PATCH 1/5] ethdev: fix race-condition of proactive
> >> error handling mode
> >>
> >>
> >>
> >> On 2023/3/8 9:09, Honnappa Nagarahalli wrote:
> >>> <snip>
> >>>
> >>>>>>>>>
> >>>>>>>
> >>>>>>> Is there any reason not to design this in the same way as
> >>>>>>> 'rte_eth_dev_reset'? Why does the PMD have to recover by itself?
> >>>>>>
> >>>>>> I suppose it is a question for the authors of original patch...
> >>>>> Appreciate if the authors could comment on this.
> >>>>
> >>>> The main cause is a hardware implementation limit. I will
> >>>> try to explain from the hns3 PMD's view.
> >>>> For a global reset, all the functions need to respond within a
> >>>> certain period of time; otherwise, the reset will fail. The reset
> >>>> also requires a few steps (all of which may take a long time).
> >>>>
> >>>> With multiple functions in one DPDK process, when a global reset
> >>>> is triggered, rte_eth_dev_reset will not cover this scenario:
> >>>> 1. Each port will report RTE_ETH_EVENT_INTR_RESET in the interrupt
> >>>>    thread.
> >>>> 2. The application callback is then invoked, but because it runs in
> >>>>    the same thread and each port's recovery takes a long time, the
> >>>>    later ports will fail to reset.
> > I am reading this again. What you are saying is, a single thread running the
> > recovery process in sequence for multiple ports will not meet the required
> > time limits. Hence, the recovery process needs to run in multiple threads
> > simultaneously. This way each thread could run the recovery for a different
> > port. Do I understand this correctly?
> 
> No
> It's not realistic to have threads on every port.
> 
> >
> > (Assuming my understanding is correct) The current implementation is
> > running the recovery process in the context of data plane threads and not in
> > the interrupt thread. Is this correct?
> 
> No, the recovery process is running in the interrupt thread.
Ok.

> 
> >
> >>> If the design were to introduce RTE_ETH_EVENT_INTR_RECOVER and
> >>> rte_eth_dev_recover, what problems do you see?
> >>
> >> I see that 'RTE_ETH_EVENT_INTR_RECOVER and rte_eth_dev_recover' is no
> >> different from the RTE_ETH_EVENT_INTR_RESET mechanism.
> >> Could you give more detail?
They are similar, i.e., RTE_ETH_EVENT_INTR_RECOVER would indicate a recovery interrupt (as opposed to a reset event), and the recovery process would be invoked through a new rte_eth_dev_recover API. What problems do you see with that approach?
I am unable to understand the problems you have described above.
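
A rough, self-contained sketch of the flow proposed above, modelled on how RTE_ETH_EVENT_INTR_RESET is usually handled. It deliberately does not include any DPDK headers: RTE_ETH_EVENT_INTR_RECOVER and rte_eth_dev_recover are only proposals in this thread, so the names below (enum port_event, on_port_event(), dev_recover(), service_pending_recoveries()) are hypothetical stand-ins. Deferring the recovery call from the interrupt-thread callback to a control thread is also an assumption of this sketch, not something the thread has agreed on.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_PORTS 4

/* Hypothetical event types; they only model the ethdev events discussed here. */
enum port_event {
	EVENT_INTR_RESET,   /* models RTE_ETH_EVENT_INTR_RESET */
	EVENT_INTR_RECOVER, /* models the proposed RTE_ETH_EVENT_INTR_RECOVER */
};

/* Set by the event callback, consumed by the control thread. */
static atomic_bool recover_pending[MAX_PORTS];

/* Stand-in for the proposed rte_eth_dev_recover(); a real PMD would run
 * its (possibly slow) recovery steps here. Returns 0 on success. */
static int
dev_recover(uint16_t port_id)
{
	assert(port_id < MAX_PORTS);
	return 0;
}

/* Event callback, assumed to run in the interrupt thread. It only records
 * the request so that the interrupt thread is not blocked by recovery. */
static int
on_port_event(uint16_t port_id, enum port_event ev)
{
	if (port_id >= MAX_PORTS)
		return -1;
	if (ev == EVENT_INTR_RECOVER)
		atomic_store(&recover_pending[port_id], true);
	return 0;
}

/* Called from an application control thread. Recovers every port that has
 * a pending request and returns the number of ports recovered. */
static int
service_pending_recoveries(void)
{
	int recovered = 0;

	for (uint16_t port_id = 0; port_id < MAX_PORTS; port_id++) {
		/* Clear the flag first so a new event during recovery is kept. */
		if (atomic_exchange(&recover_pending[port_id], false) &&
		    dev_recover(port_id) == 0)
			recovered++;
	}
	return recovered;
}
```

In this model the application decides when and in which thread recovery runs, which is the main difference from a PMD that recovers by itself inside the interrupt thread.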

> >>
> >>>
> >>>>
> >>>>>
> >>>>>>
> >>>>>>> We could have a similar API 'rte_eth_dev_recover' to do the
> >>>>>>> recovery functionality.
> >>>>>>
> >>>>>> I suppose such approach is also possible.
> >>>>>> Personally I am fine with both ways: either existing one or what
> >>>>>> you propose, as long as we'll fix existing race-condition.
> >>>>>> What is good with what you suggest - that way we probably don't
> >>>>>> need to worry how to allow user to enable/disable auto-recovery
> >>>>>> inside PMD.
> >>>>>>
> >>>>>> Konstantin
> >>>>>>
> >>>>>
