DPDK patches and discussions
 help / color / mirror / Atom feed
From: "Lu, Wenzhuo" <wenzhuo.lu@intel.com>
To: Luca Boccassi <lboccass@Brocade.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>
Subject: Re: [dpdk-dev] [PATCH v6 0/4] support reset of VF link
Date: Fri, 8 Jul 2016 00:14:13 +0000	[thread overview]
Message-ID: <6A0DE07E22DDAD4C9103DF62FEBC09090348E4AF@shsmsx102.ccr.corp.intel.com> (raw)
In-Reply-To: <1467908383.1762.22.camel@brocade.com>

> > > > > > > > > Hello Wenzhuo,
> > > > > > > > >
> > > > > > > > > I'm testing this patchset, but I am sporadically running
> > > > > > > > > into an issue where the VFs reset fails after the PF flaps.
> > > > > > > > >
> > > > > > > > > I have a VM running on a KVM box with a X540-AT2, passing 2 VFs
> in.
> > > > > > > > >
> > > > > > > > > I am using calling rte_eth_dev_reset in response to a
> > > > > > > > > RTE_ETH_EVENT_INTR_RESET callback, and the following
> > > > > > > > > errors appear in the
> > > > > > > > > log:
> > > > > > > > >
> > > > > > > > > PMD: ixgbevf_dev_reset(): Ixgbe VF reset: Failed to update link.
> > > > > > > > > PMD: ixgbe_alloc_rx_queue_mbufs(): RX mbuf alloc failed
> > > > > > > > > queue_id=0
> > > > > > > > > PMD: ixgbevf_dev_start(): Unable to initialize RX
> > > > > > > > > hardware
> > > > > > > > > (-12)
> > > > > > > > > PMD: ixgbevf_dev_reset(): Ixgbe VF reset: Failed to start device.
> > > > > > > > >
> > > > > > > > > Jumping in with GDB, it seems that the rte_rxmbuf_alloc
> > > > > > > > > call in ixgbe_alloc_rx_queue_mbufs returns NULL at
> > > > > > > > > iteration 64 out of
> > > 2048.
> > > > > > > > > The application has ~500 2MB hugepages, and there's 2GB
> > > > > > > > > of free memory available on top of that.
> > > > > > > > >
> > > > > > > > > Have you seen this before? Any pointer or suggestion for
> debugging?
> > > > > > > > >
> > > > > > > > > Thanks!
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > Kind regards,
> > > > > > > > > Luca Boccassi
> > > > > > > > I think the problem is the mbuf occupied by the packets is
> > > > > > > > not released. This
> > > > > > > memory has to be released by the APP, so my patches haven’t
> > > > > > > covered
> > > this.
> > > > > > > Actually an example is needed to show how to use the reset API.
> > > > > > > I plan to modify the testpmd.
> > > > > > > > You may notice this feature is postponed to 16.11. Would
> > > > > > > > you like to wait for
> > > > > > > the new version that will include an example?
> > > > > > >
> > > > > > > Hi,
> > > > > > >
> > > > > > > Unfortunately we need the VF reset working sooner than that,
> > > > > > > so one way or the other I'll need to sort it out. Given I've
> > > > > > > got a use case where this is happening, if it can be helpful
> > > > > > > for you I'm more than happy to help as a guinea pig. If you
> > > > > > > could please give some guidance/guidelines with regards to
> > > > > > > which API to use to sort the mbuf
> > > > > problem, I can try it out and give back some feedback.
> > > > > > >
> > > > > > > Thanks!
> > > > > > I made a stupid mistake and deleted all my code. So, I have to
> > > > > > take some time to rewrite it :( Attached the example I used to
> > > > > > test the reset API. It's
> > > > > modified from the l2fwd example. So you can compare it with
> > > > > l2fwd to see what need to be added.
> > > > > > Hopefully it can help :)
> > > > >
> > > > > Thanks! That made me understand a couple of things more, and
> > > > > I've got past the problem.
> > > > >
> > > > > Unfortunately now there's a bigger issue - rte_eth_dev_reset is
> > > > > a blocking
> > > call.
> > > > > the _RESET event callback is fired when the PF goes down, but
> > > > > when I call rte_eth_dev_reset it will block until the PF goes back up.
> > > > > There is no way, as far as I can see, to know if the PF is back
> > > > > up before
> > > calling rte_eth_dev_reset.
> > > > >
> > > > > This is a problem because, as far as I understand, I have to
> > > > > call all the rte_eth_dev_ APIs from the same thread, in my case
> > > > > the master thread, and I can't have that block potentially indefinitely.
> > > > >
> > > > > Would it be possible to have 2 events instead of 1, one when the
> > > > > PF goes down and one when it goes up? This way an application
> > > > > would be able to soft-stop the port (drain queues, etc) when the
> > > > > PF is down, and then call the reset API when it goes back up.
> > > > >
> > > > > Thanks!
> > > > Sorry we cannot have 2 events now. There're 2 problems to have 2 events.
> > > > 1, Normally we use kernel driver for PF. Now the kernel driver
> > > > only have one
> > > kind of message for link down and up. So we cannot tell if it's down or up.
> > > > 2, When the PF is down, if we don't reset the VF, VF is not working.
> > > > It cannot receive any message from PF. So we cannot know that when
> > > > PF is up. It means normally we have to reset VF twice when PF down
> > > > and up. (Surely we can wait a while when we receive the message
> > > > from PF until PF is up. But we cannot tell how long the time is appropriate.
> > > > So this *wait a while* may work for flash.)
> > >
> > > Thanks for the clarification, I understand.
> > >
> > > The problem with a blocking call is that we basically need to spawn
> > > one thread per rte_eth_dev_reset call, since there is no way of
> > > knowing if a PF is down for good or just flapping, and we can't have
> > > a single thread managing all the interfaces being blocked forever
> > > (EG: PF 1 and 2 go down, thread blocks on PF 1 reset call but it
> > > never returns, meanwhile PF 2 goes back up but call is never made).
> > >
> > > A colleague of mine, Eric Kinzie, suggested to add a blocking
> > > boolean parameter to rte_eth_dev_reset API. If set to false, then
> > > the call will not block and just does one try and return an error (EAGAIN ?).
> Would this be an acceptable proposition?
> > It's a good suggestion.
> > And I think if the parameter is set to false and the link is not up after trying
> once, it will be APP's responsibility to setup a timer or something like that to
> keep trying to bring up the link.
> 
> That seems reasonable. I've thrown together a quick diff and played with it on
> top of your patches and DPDK 2.2, seems to work as intended, I'm attaching it
> for reference. Feel free to pick it up, adapt it or ignore it :-)
> 
> Also I've noticed that the ixgbe is the only one that actually blocks,
> e1000 returns already immediately if the dev_start fails (perhaps it should be
> changed to be consistent?) and ixgb40 does weird things that I'm not sure about,
> but couldn't spot a loop in there :-)
> 
> Also I've used int instead of bool because
> drivers/net/e1000/base/e1000_osdep.h redefines bool and true/false, so
> compilation fails when including stdbool.h and using bool in rte_ethdev.h
> 
> --
> Kind regards,
> Luca Boccassi
Glad to know it's working now.  Thanks for your patch.  Surely I'll try to include it in the next version :)


  reply	other threads:[~2016-07-08  0:14 UTC|newest]

Thread overview: 72+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-06-06  5:40 [dpdk-dev] [PATCH 0/8] " Wenzhuo Lu
2016-06-06  5:40 ` [dpdk-dev] [PATCH 1/8] lib/librte_ether: support device reset Wenzhuo Lu
2016-06-06  5:40 ` [dpdk-dev] [PATCH 2/8] lib/librte_ether: defind RX/TX lock mode Wenzhuo Lu
2016-06-08  2:15   ` Stephen Hemminger
2016-06-08  7:34     ` Lu, Wenzhuo
2016-06-09  7:50       ` Olivier Matz
2016-06-12  5:25         ` Lu, Wenzhuo
2016-06-10 18:12       ` Stephen Hemminger
2016-06-12  5:27         ` Lu, Wenzhuo
2016-06-06  5:40 ` [dpdk-dev] [PATCH 3/8] ixgbe: RX/TX with lock on VF Wenzhuo Lu
2016-06-06  5:40 ` [dpdk-dev] [PATCH 4/8] ixgbe: implement device reset " Wenzhuo Lu
2016-06-06  5:40 ` [dpdk-dev] [PATCH 5/8] igb: RX/TX with lock " Wenzhuo Lu
2016-06-06  5:40 ` [dpdk-dev] [PATCH 6/8] igb: implement device reset " Wenzhuo Lu
2016-06-06  5:40 ` [dpdk-dev] [PATCH 7/8] i40e:RX/TX with lock " Wenzhuo Lu
2016-06-06  5:40 ` [dpdk-dev] [PATCH 8/8] i40e: implement device reset " Wenzhuo Lu
2016-06-15  3:03 ` [dpdk-dev] [PATCH v5 0/4] support reset of VF link Wenzhuo Lu
2016-06-15  3:03   ` [dpdk-dev] [PATCH v5 1/4] lib/librte_ether: support device reset Wenzhuo Lu
2016-06-16 15:31     ` Bruce Richardson
2016-06-16 15:36     ` Thomas Monjalon
2016-06-15  3:03   ` [dpdk-dev] [PATCH v5 2/4] ixgbe: implement device reset on VF Wenzhuo Lu
2016-06-15  3:03   ` [dpdk-dev] [PATCH v5 3/4] igb: " Wenzhuo Lu
2016-06-15  3:03   ` [dpdk-dev] [PATCH v5 4/4] i40e: " Wenzhuo Lu
2016-06-20  6:24 ` [dpdk-dev] [PATCH v6 0/4] support reset of VF link Wenzhuo Lu
2016-06-20  6:24   ` [dpdk-dev] [PATCH v6 1/4] lib/librte_ether: support device reset Wenzhuo Lu
2016-06-20  9:14     ` Jerin Jacob
2016-06-20 16:17       ` Stephen Hemminger
2016-06-21  3:51         ` Jerin Jacob
2016-06-21  6:14           ` Lu, Wenzhuo
2016-06-21  7:37             ` Jerin Jacob
2016-06-21  8:24               ` Lu, Wenzhuo
2016-06-21  8:55                 ` Jerin Jacob
2016-06-21  9:26                   ` Ananyev, Konstantin
2016-06-21 10:57                     ` Jerin Jacob
2016-06-21 13:10                       ` Ananyev, Konstantin
2016-06-21 13:30                         ` Jerin Jacob
2016-06-21 14:03                           ` Ananyev, Konstantin
2016-06-21 14:29                             ` Jerin Jacob
2016-06-22  1:35                               ` Lu, Wenzhuo
2016-06-22  2:37                                 ` Jerin Jacob
2016-06-22  3:32                                   ` Lu, Wenzhuo
2016-06-22  4:14                                     ` Jerin Jacob
2016-06-22  5:05                                       ` Lu, Wenzhuo
2016-06-22  6:10                                         ` Jerin Jacob
2016-06-22  6:42                                           ` Lu, Wenzhuo
2016-06-22  7:59                                             ` Jerin Jacob
2016-06-22  8:17                                               ` Thomas Monjalon
2016-06-22  8:25                                                 ` Lu, Wenzhuo
2016-06-22  9:18                                                   ` Thomas Monjalon
2016-06-22 11:06                                                     ` Jerin Jacob
2016-06-23  0:45                                                       ` Lu, Wenzhuo
2016-06-23  0:39                                                     ` Lu, Wenzhuo
2016-06-21  0:51       ` Lu, Wenzhuo
2016-06-20  6:24   ` [dpdk-dev] [PATCH v6 2/4] ixgbe: implement device reset on VF Wenzhuo Lu
2016-06-20  6:24   ` [dpdk-dev] [PATCH v6 3/4] igb: " Wenzhuo Lu
2016-06-20  6:24   ` [dpdk-dev] [PATCH v6 4/4] i40e: " Wenzhuo Lu
2016-07-04 15:48   ` [dpdk-dev] [PATCH v6 0/4] support reset of VF link Luca Boccassi
2016-07-05  0:52     ` Lu, Wenzhuo
2016-07-05  9:52       ` Luca Boccassi
2016-07-06  0:45         ` Lu, Wenzhuo
2016-07-06 16:26           ` Luca Boccassi
     [not found]           ` <1467822182.32466.34.camel@brocade.com>
2016-07-07  1:09             ` Lu, Wenzhuo
2016-07-07 10:20               ` Luca Boccassi
2016-07-07 13:12                 ` Lu, Wenzhuo
2016-07-07 16:19                   ` Luca Boccassi
2016-07-08  0:14                     ` Lu, Wenzhuo [this message]
2016-07-08 17:15                       ` Luca Boccassi
2016-07-11  1:32                         ` Lu, Wenzhuo
2016-07-11 12:02                           ` Luca Boccassi
2016-07-11 15:43                             ` Luca Boccassi
2016-07-12  1:19                               ` Lu, Wenzhuo
2016-08-26 12:58                               ` Luca Boccassi
2016-08-29  1:04                                 ` Lu, Wenzhuo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=6A0DE07E22DDAD4C9103DF62FEBC09090348E4AF@shsmsx102.ccr.corp.intel.com \
    --to=wenzhuo.lu@intel.com \
    --cc=dev@dpdk.org \
    --cc=lboccass@Brocade.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).