DPDK patches and discussions
From: Thomas Monjalon <thomas@monjalon.net>
To: fengchengwen <fengchengwen@huawei.com>,
	Ferruh Yigit <ferruh.yigit@intel.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>,
	Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Subject: Re: [dpdk-dev] Question about hardware error handling policy
Date: Fri, 23 Jul 2021 14:51:55 +0200
Message-ID: <10636274.k5fJVezR7q@thomas>
In-Reply-To: <6e220d0b-5683-ee12-bdab-1ef78d19ebdc@intel.com>

23/07/2021 14:33, Ferruh Yigit:
> On 7/22/2021 4:46 PM, Thomas Monjalon wrote:
> > 22/07/2021 15:50, fengchengwen:
> >> Hi, all
> >>
> >>     I notice that ethdev supports the dev_reset op, which can be used to
> >> recover from errors, but only 13+ drivers implement it.
> 
> 'rte_eth_dev_reset()' can be used to reset the device config to defaults; it
> does not have to be for error recovery.
> 
> >>     There is also a reset event, RTE_ETH_EVENT_INTR_RESET, and only 6
> >> drivers support it (most of them are VF).
> >>
> >>     This provides users with two ways to handle hardware errors:
> >>     a. the driver reports RTE_ETH_EVENT_INTR_RESET, and the application
> >>     performs the reset.
> >>     b. the application detects errors (the detection method is unclear) and
> >>     calls the reset op to recover.
> >>
> >>     According to the design of this API, error handling is assigned to the
> >> application, and the driver is only responsible for reporting events. This
> >> simplifies the driver design (for example, the driver does not need to maintain
> >> mutex locks).
> >>
> >>     As we know, many modern NICs come with firmware, have PCIe interfaces,
> >> and support SR-IOV. The possible hardware errors include firmware reboot,
> >> PF reset, VF reset and FLR, but these errors (particularly firmware/PF) are
> >> not addressed in most drivers.
> >>
> >>     Question 1: what do we think of these errors (particularly firmware/PF)? Do
> >> we think that the probability is very low and that there is no need to deal with
> >> them?
> > 
> > Even rare errors must be managed.
> > 
> 
> +1
> 
> >>     Question 2: I prefer to put error handling in the application layer, because
> >> doing it in the driver can make the driver complex, but no app currently
> >> registers an INTR_RESET event handler. I think we can build a standard handler
> >> in testpmd. What do you think?
> > 
> > Absolutely. Like any ethdev API, it must be tested with testpmd.
> > 
> 
> Testpmd registers for the RESET event, but when the event is received all it
> does is print a log, so there is no logic behind it.
> 
> If the intention is to add error handling logic to testpmd, my concern is
> that it will be too complex or too device specific.

It shows a problem in the API.
We don't have a clear generic recovery process.

> And if there is something to clean up, recover, etc. at the application
> level, it makes sense for the application to receive the event and act on it.
> But if the reset/recovery can be handled in the PMD, if possible
> transparently, I think that is a better choice.
> 
> Another thing is that I am not sure whether what applications should do on
> the reset event is clear, or the same for all PMDs, which is not good.

Indeed we should improve this area,
and implement the recovery logic in testpmd.
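
To make option (a) from the quoted text more concrete, here is a minimal
sketch of the pattern: the event callback only records that a reset was
requested, and the application's main loop performs the reset later. It is
not testpmd code and it does not call DPDK. In a real application the
callback would be registered with rte_eth_dev_callback_register() for
RTE_ETH_EVENT_INTR_RESET, and the reset hook would call rte_eth_dev_reset()
and then reconfigure and restart the port. Every name below (MAX_PORTS,
port_state, on_reset_event, service_pending_resets, demo_reset_ok,
demo_reset_fail) is hypothetical and exists only for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_PORTS 4 /* hypothetical limit for this sketch */

/* Stand-in for the recovery step; a real app would call
 * rte_eth_dev_reset() and then reconfigure the port here. */
typedef int (*reset_fn)(uint16_t port_id);

struct port_state {
    atomic_bool reset_pending;  /* set by the event callback */
    unsigned int resets_done;   /* touched only by the main loop */
};

static struct port_state ports[MAX_PORTS];

/* Event callback: only records the request, because it may run in a
 * different thread than the datapath. Simplified signature, not the
 * real rte_eth_dev_cb_fn prototype. */
static int on_reset_event(uint16_t port_id)
{
    if (port_id >= MAX_PORTS)
        return -1;
    atomic_store(&ports[port_id].reset_pending, true);
    return 0;
}

/* Called from the main loop: resets every port with a pending request
 * and returns how many ports were recovered. */
static int service_pending_resets(reset_fn do_reset)
{
    int recovered = 0;

    assert(do_reset != NULL);
    for (uint16_t p = 0; p < MAX_PORTS; p++) {
        /* Claim the request atomically so an event is not lost. */
        if (!atomic_exchange(&ports[p].reset_pending, false))
            continue;
        if (do_reset(p) == 0) {
            ports[p].resets_done++;
            recovered++;
        } else {
            /* Reset failed: keep the request so it is retried. */
            atomic_store(&ports[p].reset_pending, true);
        }
    }
    return recovered;
}

/* Hypothetical reset hooks used to exercise the sketch. */
static int demo_reset_ok(uint16_t port_id) { (void)port_id; return 0; }
static int demo_reset_fail(uint16_t port_id) { (void)port_id; return -1; }
```

Deferring the work to the main loop keeps the callback trivial and leaves
the driver free of extra locking, which is the simplification the original
question points out. Whether the same sequence is valid for every PMD is
exactly the open question in this thread.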
