DPDK patches and discussions
 help / color / mirror / Atom feed
From: "Zhang, Qi Z" <qi.z.zhang@intel.com>
To: "Yigit, Ferruh" <ferruh.yigit@intel.com>,
	"thomas@monjalon.net" <thomas@monjalon.net>,
	"Doherty, Declan" <declan.doherty@intel.com>
Cc: "ktraynor@redhat.com" <ktraynor@redhat.com>,
	"dev@dpdk.org" <dev@dpdk.org>,
	"Shelton, Benjamin H" <benjamin.h.shelton@intel.com>,
	"Vangati, Narender" <narender.vangati@intel.com>
Subject: Re: [dpdk-dev] [RFC v3] ethdev: claim device reset as async
Date: Thu, 4 Oct 2018 15:58:42 +0000	[thread overview]
Message-ID: <039ED4275CED7440929022BC67E70611532AA4B1@SHSMSX103.ccr.corp.intel.com> (raw)
In-Reply-To: <b4388875-2404-e33e-3242-9f5c76db7b70@intel.com>



> -----Original Message-----
> From: Yigit, Ferruh
> Sent: Thursday, October 4, 2018 7:30 PM
> To: Zhang, Qi Z <qi.z.zhang@intel.com>; thomas@monjalon.net; Doherty,
> Declan <declan.doherty@intel.com>
> Cc: ktraynor@redhat.com; dev@dpdk.org; Shelton, Benjamin H
> <benjamin.h.shelton@intel.com>; Vangati, Narender
> <narender.vangati@intel.com>
> Subject: Re: [RFC v3] ethdev: claim device reset as async
> 
> On 9/20/2018 5:56 AM, Qi Zhang wrote:
> > Device reset should be implemented in an async way since it is
> > possible to be invoked in interrupt thread and sometimes to reset a
> > device need to wait for some dependency, for example, a VF expects for
> > PF ready or a NIC function as part of a SOC wait for the whole system
> > reset complete, and all these time-consuming tasks will block the
> > interrupt thread.
> > The patch rename rte_eth_dev_reset to rte_eth_dev_reset_async and
> > rework the implementation. It will spawn a new thread which will call
> > ops->dev_reset, and when finished it will raise the event
> > RTE_ETH_EVENT_RESET_COMPLETE. The application should always wait for
> > this event before it continues to configure and restart the device.
> >
> > Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
> 
> <...>
> 
> > @@ -1385,10 +1413,26 @@ rte_eth_dev_reset(uint16_t port_id)
> >
> >  	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->dev_reset, -ENOTSUP);
> >
> > +	/* already on resetting */
> > +	if (dev->state == RTE_ETH_DEV_RESETTING)
> > +		return 0;
> > +
> > +	args = calloc(1, sizeof(struct dev_reset_args));
> > +	if (!args)
> > +		return -ENOMEM;
> > +
> >  	rte_eth_dev_stop(port_id);
> > -	ret = dev->dev_ops->dev_reset(dev);
> >
> > -	return eth_err(port_id, ret);
> > +	/* store previous device state temporary */
> > +	args->pre_state = dev->state;
> > +
> > +	dev->state = RTE_ETH_DEV_RESETTING;
> 
> Do we need to update the state, I think this will break rte_eth_dev_count() and
> friends, like during device reset app will think it has one less device in system.

I'd like to have this new state which represent the situation of the device more accurate.
In this patch RTE_ETH_DEV_RESETTING is just to be used to prevent double reset, but in future it can also be used to prevent invalid operation during device reset.

Of cause we need to make sure it does not break exist behavior and seems add RTE_ETH_DEV_RESETTING check in rte_eth_find_next_owned_by and rte_eth_find_next is able to fix the issue you observed.

I can add this in v4 if you agree the idea.

> 
> <...>
> 
> > @@ -1814,21 +1816,29 @@ void rte_eth_dev_close(uint16_t port_id);
> >   * RTE_ETH_EVENT_INTR_RESET event is detected, but can also use it to
> start
> >   * a port reset in other circumstances.
> >   *
> > - * When this function is called, it first stops the port and then
> > calls the
> > - * PMD specific dev_uninit( ) and dev_init( ) to return the port to
> > initial
> > - * state, in which no Tx and Rx queues are setup, as if the port has
> > been
> > - * reset and not started. The port keeps the port id it had before
> > the
> > - * function call.
> > - *
> > - * After calling rte_eth_dev_reset( ), the application should use
> > - * rte_eth_dev_configure( ), rte_eth_rx_queue_setup( ),
> > - * rte_eth_tx_queue_setup( ), and rte_eth_dev_start( )
> > - * to reconfigure the device as appropriate.
> > - *
> > - * Note: To avoid unexpected behavior, the application should stop
> > calling
> > - * Tx and Rx functions before calling rte_eth_dev_reset( ). For
> > thread
> > - * safety, all these controlling functions should be called from the
> > same
> > - * thread.
> > + * @note
> > + * Device reset may have the dependency, for example, a VF reset
> > + expects
> > + * PF ready, or a NIC function as a part of a SOC need to wait for
> > + other
> > + * parts of the system be ready, these are time-consuming tasks and
> > + will
> > + * block current thread.
> > + *
> > + * As the name, rte_eth_dev_reset_async is an async API, it will
> > + spwan a
> > + * new thread to call ops->dev_reset, once it is finished, it will
> > + raise
> > + * the RTE_ETH_EVENT_RESET_COMPLETE event to notify application.
> > + That makes
> > + * things easy for an application that what to reset the device from
> > + the
> > + * interrupt thread since typically a RTE_ETH_EVENT_INTR_RESET
> > + handler is
> > + * invoked in interrupt thread.
> 
> thread calls dev_ops->dev_reset(dev) and wait for it, so it means
> dev_ops->dev_reset is synchronous, perhaps it would be good to highlight this
> in "dev_reset" comment to help PMD developers.

OK

> 
> of dev_ops->dev_reset() is synchronous, means existing rte_eth_dev_reset() is
> synchronous, so what do you thinks keep rte_eth_dev_reset() as it is and add
> new
> rte_eth_dev_reset_async() API? Than we will have both sync and async
> solution.

Typically device reset happens when application receive RTE_ETH_EVENT_INTR_RESET and this is in interrupt thread. 
Invoke an async API in interrupt thread is the right way, is it better if we highlight this is the only way?
I may not prefer to expose the sync API right now, it's better to figure out some typical usage before we expose this, but so far I don't have.

Regards
Qi


> 
> > + *
> > + * Application should not assume device reset is finished after
> > + * rte_eth_dev_reset_async return, it should always wait for a
> > + * RTE_ETH_EVENT_RESET_COMPLETE event and check the reset result.
> > + * If reset success, application should call rte_eth_dev_configure(
> > + ),
> > + * rte_eth_rx_queue_setup( ), rte_eth_tx_queue_setup( ),
> > + * and rte_eth_dev_start( ) to reconfigure the device as appropriate.
> > + *
> > + * @Note
> > + * To avoid unexpected behavior, the application should stop calling
> > + * Tx and Rx functions before calling rte_eth_dev_reset_async( ).
> >   *
> >   * @param port_id
> >   *   The port identifier of the Ethernet device.
> > @@ -1837,12 +1847,10 @@ void rte_eth_dev_close(uint16_t port_id);
> >   *   - (0) if successful.
> >   *   - (-EINVAL) if port identifier is invalid.
> >   *   - (-ENOTSUP) if hardware doesn't support this function.
> > - *   - (-EPERM) if not ran from the primary process.
> > - *   - (-EIO) if re-initialisation failed or device is removed.
> >   *   - (-ENOMEM) if the reset failed due to OOM.
> > - *   - (-EAGAIN) if the reset temporarily failed and should be retried later.
> > + *   - (<0) other errors from low level driver.
> >   */
> > -int rte_eth_dev_reset(uint16_t port_id);
> > +int rte_eth_dev_reset_async(uint16_t port_id);
> >
> >  /**
> >   * Enable receipt in promiscuous mode for an Ethernet device.
> > @@ -2574,6 +2582,8 @@ enum rte_eth_event_type {
> >  				/**< queue state event (enabled/disabled) */
> >  	RTE_ETH_EVENT_INTR_RESET,
> >  			/**< reset interrupt event, sent to VF on PF reset */
> > +	RTE_ETH_EVENT_RESET_COMPLETE,
> > +			/**< inform application that reset is completed */
> >  	RTE_ETH_EVENT_VF_MBOX,  /**< message from the VF received by PF
> */
> >  	RTE_ETH_EVENT_MACSEC,   /**< MACsec offload related event */
> >  	RTE_ETH_EVENT_INTR_RMV, /**< device removal event */
> >


  reply	other threads:[~2018-10-04 16:02 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-09-20  4:56 Qi Zhang
2018-10-04 11:30 ` Ferruh Yigit
2018-10-04 15:58   ` Zhang, Qi Z [this message]
2019-03-19 13:13     ` Ferruh Yigit
2019-03-19 13:13       ` Ferruh Yigit
2019-03-19 13:40       ` Zhang, Qi Z
2019-03-19 13:40         ` Zhang, Qi Z
2019-03-19 16:41         ` Ferruh Yigit
2019-03-19 16:41           ` Ferruh Yigit

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=039ED4275CED7440929022BC67E70611532AA4B1@SHSMSX103.ccr.corp.intel.com \
    --to=qi.z.zhang@intel.com \
    --cc=benjamin.h.shelton@intel.com \
    --cc=declan.doherty@intel.com \
    --cc=dev@dpdk.org \
    --cc=ferruh.yigit@intel.com \
    --cc=ktraynor@redhat.com \
    --cc=narender.vangati@intel.com \
    --cc=thomas@monjalon.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).