DPDK patches and discussions
 help / color / mirror / Atom feed
From: Edwin Brossette <edwin.brossette@6wind.com>
To: Somnath Kotur <somnath.kotur@broadcom.com>
Cc: dev@dpdk.org, ajit.khaparde@broadcom.com,
	 Didier Pallard <didier.pallard@6wind.com>,
	Olivier Matz <olivier.matz@6wind.com>,
	 Laurent Hardy <laurent.hardy@6wind.com>
Subject: Re: net/bnxt: wrong link status when lsc_intr is used
Date: Mon, 6 Feb 2023 09:44:56 +0100	[thread overview]
Message-ID: <CANDF9xCBXR+iLes+TeHW1TLdhM0W4qVLY-vK6rqAUuj1o0zG5A@mail.gmail.com> (raw)
In-Reply-To: <CAOBf=mu9w4OSdiXeh+-+YP=h92eg2NSGbBrB7Hk7jfJqn8oL9g@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 17168 bytes --]

Hello,

Thank you for your quick answer.

This was added long back in the driver code by this commit (as you can
> see 6+ yrs old :)) , so I believe at the time the intent was to get
> this link notification asynchronous
> and 'lsc' I believe was for link state change?  Please suggest an
> alternative place to do this or you may even post the patch yourself
> to help do this the right way
>

Looking into testpmd's code, I found the following piece of code in
init_port_config() :

  >  if (lsc_interrupt && (*port->dev_info.dev_flags &
RTE_ETH_DEV_INTR_LSC))
  >              port->dev_conf.intr_conf.lsc = 1;

I believe the flag RTE_ETH_DEV_INTR_LSC is used to inform the application
that the pmd supports the use of link state change interrupts. Thus using
this flag, the application itself can decide whether or not to enable the
lsc interrupts depending on the need. The above code is used to enable
these interrupts if the option --no-lsc-interrupts was not specified at
test-pmd's start.

Therefore, I can suggest enabling this flag and removing the line which
sets lsc to 1 in bnxt_dev_info_get_op() so the above application can decide
to turn these interrupts ON/OFF.
A patch would look like something like this:

diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c
> index b3de490d3667..753e86b4b2af 100644
> --- a/drivers/net/bnxt/bnxt_ethdev.c
> +++ b/drivers/net/bnxt/bnxt_ethdev.c
> @@ -1017,7 +1017,6 @@ static int bnxt_dev_info_get_op(struct rte_eth_dev
> *eth_dev,
>                 .tx_free_thresh = 32,
>                 .tx_rs_thresh = 32,
>         };
> -       eth_dev->data->dev_conf.intr_conf.lsc = 1;
>
>         dev_info->rx_desc_lim.nb_min = BNXT_MIN_RING_DESC;
>         dev_info->rx_desc_lim.nb_max = BNXT_MAX_RX_RING_DESC;
> @@ -5859,6 +5858,7 @@ bnxt_dev_init(struct rte_eth_dev *eth_dev, void
> *params __rte_unused)
>
>         rte_eth_copy_pci_info(eth_dev, pci_dev);
>         eth_dev->data->dev_flags |= RTE_ETH_DEV_AUTOFILL_QUEUE_XSTATS;
> +       eth_dev->data->dev_flags |= RTE_ETH_DEV_INTR_LSC;
>
>         bp = eth_dev->data->dev_private;
>

I will submit a proper patch to dev as soon as I will have my subscription
to the mailing list verified.

It can be tested in test-pmd: When starting test-pmd with
--no-lsc-interrupt, the log I introduced previously shows:
><><><><!!<devstart>: dev->data->dev_conf.intr_conf=0

If I start it without the option, it shows instead:
><><><><!!<devstart>: dev->data->dev_conf.intr_conf=1

So what happens if/when you get rid of that lsc setting line in
> dev_info_get_op() and actually enabled the 'else' part of the above
> code snippet ? Was it helping your link to come up with proper values?
>

With the patch, my link is displayed with the proper state: link cannot be
set up and is shown in down state, which corresponds to its actual state.
I can show the following testpmd logs to illustrate:

testpmd> port config 0 speed auto duplex auto
> testpmd> port start 0
> Configuring Port 0 (socket 0)
> bnxt_rx_queue_setup_op(): App supplied RXQ drop_en status : 1
> bnxt_rx_queue_setup_op(): RX Buf MTU 1500
> bnxt_rx_queue_setup_op(): RX Buf size is 9728
> bnxt_rx_queue_setup_op(): App supplied RXQ drop_en status : 1
> bnxt_rx_queue_setup_op(): RX Buf MTU 1500
> bnxt_rx_queue_setup_op(): RX Buf size is 9728
> bnxt_mq_rx_configure(): pools = 1 nb_q_per_grp = 2
> bnxt_mq_rx_configure(): rxq[0] = 0x10040b080 vnic[0] = 0x100227080
> bnxt_mq_rx_configure(): rxq[1] = 0x105fb0e40 vnic[0] = 0x100227080
> bnxt_setup_one_vnic(): vnic[0] = 0x100227080 vnic->fw_grp_ids = 0x105fa7400
> bnxt_hwrm_vnic_alloc(): Alloc VNIC. Start 0, End 2
> bnxt_hwrm_vnic_alloc(): VNIC ID 2
> bnxt_setup_one_vnic(): rxq[0]->vnic=0x100227080
> vnic->fw_grp_ids=0x105fa7400
> bnxt_setup_one_vnic(): rxq[1]->vnic=0x100227080
> vnic->fw_grp_ids=0x105fa7400
> bnxt_setup_one_vnic(): vnic->rx_queue_cnt = 2
> bnxt_hwrm_port_phy_qcfg(): Link Speed:0,Auto:0:0:140,Support:140,Force:64
> bnxt_hwrm_port_phy_qcfg(): Link Signal:0,PAM::Auto:0,Support:0,Force:0
> bnxt_hwrm_port_phy_qcfg(): Link Speed:0,Auto:4:64:140,Support:140,Force:0
> bnxt_hwrm_port_phy_qcfg(): Link Signal:0,PAM::Auto:0,Support:0,Force:0
> bnxt_ulp_port_init(): Skip ulp init for port: 0, TF is not enabled
> bnxt_receive_function(): Using SSE vector mode receive for port 0
> bnxt_transmit_function(): Using SSE vector mode transmit for port 0
> bnxt_hwrm_port_phy_qcfg(): Link Speed:0,Auto:4:64:140,Support:140,Force:0
> bnxt_hwrm_port_phy_qcfg(): Link Signal:0,PAM::Auto:0,Support:0,Force:0
> Port 0: 00:0A:F7:B6:E3:D0
> Checking link statuses...
> bnxt_hwrm_port_phy_qcfg(): Link Speed:0,Auto:4:64:140,Support:140,Force:0
> bnxt_hwrm_port_phy_qcfg(): Link Signal:0,PAM::Auto:0,Support:0,Force:0
> bnxt_hwrm_port_phy_qcfg(): Link Speed:0,Auto:4:64:140,Support:140,Force:0
> bnxt_hwrm_port_phy_qcfg(): Link Signal:0,PAM::Auto:0,Support:0,Force:0
>
[...]
>
bnxt_hwrm_port_phy_qcfg(): Link Speed:0,Auto:4:64:140,Support:140,Force:0
> bnxt_hwrm_port_phy_qcfg(): Link Signal:0,PAM::Auto:0,Support:0,Force:0
> Port 0 Link down
> bnxt_hwrm_port_phy_qcfg(): Link Speed:0,Auto:4:64:40,Support:140,Force:0
> bnxt_hwrm_port_phy_qcfg(): Link Signal:0,PAM::Auto:0,Support:0,Force:0
> Port 1 Link down
> Done
> testpmd> show port info 0
> bnxt_hwrm_port_phy_qcfg(): Link Speed:0,Auto:4:64:140,Support:140,Force:0
> bnxt_hwrm_port_phy_qcfg(): Link Signal:0,PAM::Auto:0,Support:0,Force:0
>
> ********************* Infos for port 0  *********************
> MAC address: 00:0A:F7:B6:E3:D0
> Device name: 0000:02:00.0
> Driver name: net_bnxt
> Firmware-version: 223.0.161.0
> Devargs:
> Connect to socket: 0
> memory allocation on the socket: 0
> Link status: down                                <======================
> Link speed: None
> Link duplex: half-duplex
> Autoneg status: Off
> MTU: 1500
> Promiscuous mode: enabled
> Allmulticast mode: disabled
> Maximum number of MAC addresses: 128
> Maximum number of MAC addresses of hash filtering: 0
> VLAN offload:
>   strip off, filter off, extend off, qinq strip off
> Hash key size in bytes: 40
> Redirection table size: 128
> Supported RSS offload flow types:
>   ipv4  ipv4-tcp  ipv4-udp  ipv6  ipv6-tcp  ipv6-udp
>   user-defined-50  user-defined-51
> Minimum size of RX buffer: 1
> Maximum configurable length of RX packet: 9600
> Maximum configurable size of LRO aggregated packet: 0
> Maximum number of VMDq pools: 64
> Current number of RX queues: 2
> Max possible RX queues: 117
> Max possible number of RXDs per queue: 8192
> Min possible number of RXDs per queue: 16
> RXDs number alignment: 1
> Current number of TX queues: 2
> Max possible TX queues: 117
> Max possible number of TXDs per queue: 4096
> Min possible number of TXDs per queue: 16
> TXDs number alignment: 1
> Max segment number per packet: 65535
> Max segment number per MTU/TSO: 65535
> Device capabilities: 0x3( RUNTIME_RX_QUEUE_SETUP RUNTIME_TX_QUEUE_SETUP )
> Switch name: 0000:02:00.0
> Switch domain Id: 0
> Switch Port Id: 32768
> Device error handling mode: proactive
> testpmd>
>

 Here you can see the link status appears down if lsc interrupts are turned
off, which is the expected result.

Sincerely,
Edwin Brossette.

On Tue, Jan 31, 2023 at 6:08 AM Somnath Kotur <somnath.kotur@broadcom.com>
wrote:

> On Thu, Jan 19, 2023 at 7:07 PM Edwin Brossette
> <edwin.brossette@6wind.com> wrote:
> >
> > Hello,
> >
> Hi Edwin,
>  Thanks for reaching out, here's my attempt at answering your questions
> > I am trying to operate a Broadcom BCM57414 2x10G nic using dpdk bnxt
> pmd. I use DPDK 22.11.
> > However, doing so I stumbled over a number of different issues. Mainly,
> using the dpdk rte_eth library, I don't seem to be able to correctly poll
> the link status: I expect my nic has a problem using autoneg to set the
> link speed/duplex, because these parameters remain unknown while autoneg is
> on. However, after trying to set link up, instead of showing the link state
> as down, I see the link being up, which is in truth not the case, as no
> packets can transit on the line and the switch at the other end sees it
> down.
> >
> > When searching around and trying to debug the code, I found the function
> bnxt_dev_info_get_op() sets my nic in interrupt mode:
> >
> > > eth_dev->data->dev_conf.intr_conf.lsc = 1;
> This was added long back in the driver code by this commit (as you can
> see 6+ yrs old :)) , so I believe at the time the intent was to get
> this link notification asynchronous
> and 'lsc' I believe was for link state change?  Please suggest an
> alternative place to do this or you may even post the patch yourself
> to help do this the right way
> commit 7bc8e9a227ccbc649c2d230f0ee92ee58b1cb955
> Author: Ajit Khaparde <ajit.khaparde@broadcom.com>
> Date:   Tue Oct 11 16:47:50 2016 -0500
>
>     net/bnxt: support async link notification
>
> >
> > Which is a bit of a surprising thing to do for a function meant to get
> info.
> > Thus my card ends up working in a mode I didn't configure it to, which
> may be the cause of my issue: later when setting the link up in function
> bnxt_dev_set_link_up_op():
> >
> > >  if (!bp->link_info->link_up)
> > >     rc = bnxt_set_hwrm_link_config(bp, true);
> > >  if (!rc)
> > >     eth_dev->data->dev_link.link_status = 1;
> >
> > So link_status in eth_dev gets set to 1 as long as the operation did not
> return any error code. This is the case when setting my card's link up
> (rc=0), although the link clearly can't get up, for whatever other bug is
> present. Now this shouldn't be much of an issue given we will update the
> link status at some point, mainly in rte_eth_link_get_nowait():
> >
> > >  if (dev->data->dev_conf.intr_conf.lsc && dev->data->dev_started)
> > >     rte_eth_linkstatus_get(dev, eth_link);
> > >  else {
> > >     if (*dev->dev_ops->link_update == NULL)
> > >         return -ENOTSUP;
> > >     (*dev->dev_ops->link_update)(dev, 0);
> > >      *eth_link = dev->data->dev_link;
> >
> > Here we can see in the else statement that the link status gets updated.
> However because the pmd auto-configured the nic in interrupt mode when
> calling the get_info function, we are not going through that else
> statement. So when reading the value of the link_status, we read 1 instead
> of 0. I suppose with interrupt mode enabled, the nic should be able to
> update this variable on its own, but it is clearly not the case in my
> setup: link status is never updated and incorrectly indicates the link is
> UP.
> >
> So what happens if/when you get rid of that lsc setting line in
> dev_info_get_op() and actually enabled the 'else' part of the above
> code snippet ? Was it helping your link to come up with proper values?
> What is your link status on the other side ? (I believe you said
> switch port, but have you tried with another card connected back-back
> ?)  Also I'm assuming it comes up fine with the regular kernel driver
> ?
> What does that show ?
> > I can suggest a testpmd reproduction setup using the --no-lsc-interrupt
> option. With this option, dev_conf.intr_conf.lsc should be 0. In addition,
> I added a log to the rte_eth library in rte_eth_dev_start() to display
> dev_conf.intr_conf.lsc state when starting the port:
> >
> > > + RTE_ETHDEV_LOG(ERR, "><><><><!!<devstart>:
> dev->data->dev_conf.intr_conf=%d\n", dev->data->dev_conf.intr_conf.lsc);
> > >   diag = (*dev->dev_ops->dev_start)(dev);
> >
> > Running testpmd, we can see the following outpout when starting port:
> >
> > dpdk-testpmd --log-level=pmd.net.bnxt.driver:8 -a 0000:02:00.0 -a
> 0000:02:00.1 -- -i --rxq=2 --txq=2 --coremask=0x0c --total-num-mbufs=250000
> --no-lsc-interrupt
> > [...]
> > Configuring Port 0 (socket 0)
> > bnxt_rx_queue_setup_op(): App supplied RXQ drop_en status : 1
> > bnxt_rx_queue_setup_op(): RX Buf MTU 1500
> > bnxt_rx_queue_setup_op(): RX Buf size is 9728
> > bnxt_rx_queue_setup_op(): App supplied RXQ drop_en status : 1
> > bnxt_rx_queue_setup_op(): RX Buf MTU 1500
> > bnxt_rx_queue_setup_op(): RX Buf size is 9728
> > ><><><><!!<devstart>: dev->data->dev_conf.intr_conf=1
> > bnxt_mq_rx_configure(): pools = 1 nb_q_per_grp = 2
> > bnxt_mq_rx_configure(): rxq[0] = 0x105fb7ac0 vnic[0] = 0x100227080
> > bnxt_mq_rx_configure(): rxq[1] = 0x105fb0e40 vnic[0] = 0x100227080
> > bnxt_setup_one_vnic(): vnic[0] = 0x100227080 vnic->fw_grp_ids =
> 0x105fa7e00
> > bnxt_hwrm_vnic_alloc(): Alloc VNIC. Start 0, End 2
> > bnxt_hwrm_vnic_alloc(): VNIC ID 2
> > bnxt_setup_one_vnic(): rxq[0]->vnic=0x100227080
> vnic->fw_grp_ids=0x105fa7e00
> > bnxt_setup_one_vnic(): rxq[1]->vnic=0x100227080
> vnic->fw_grp_ids=0x105fa7e00
> > bnxt_setup_one_vnic(): vnic->rx_queue_cnt = 2
> > bnxt_hwrm_port_phy_qcfg(): Link Speed:0,Auto:4:64:140,Support:140,Force:0
> > bnxt_hwrm_port_phy_qcfg(): Link Signal:0,PAM::Auto:0,Support:0,Force:0
> > bnxt_hwrm_port_phy_qcfg(): Link Speed:0,Auto:4:64:140,Support:140,Force:0
> > bnxt_hwrm_port_phy_qcfg(): Link Signal:0,PAM::Auto:0,Support:0,Force:0
> > bnxt_ulp_port_init(): Skip ulp init for port: 0, TF is not enabled
> > bnxt_receive_function(): Using SSE vector mode receive for port 0
> > bnxt_transmit_function(): Using SSE vector mode transmit for port 0
> > Port 0: 00:0A:F7:B6:E3:D0
> > Configuring Port 1 (socket 0)
> > bnxt_rx_queue_setup_op(): App supplied RXQ drop_en status : 1
> > bnxt_rx_queue_setup_op(): RX Buf MTU 1500
> > bnxt_rx_queue_setup_op(): RX Buf size is 9728
> > bnxt_rx_queue_setup_op(): App supplied RXQ drop_en status : 1
> > bnxt_rx_queue_setup_op(): RX Buf MTU 1500
> > bnxt_rx_queue_setup_op(): RX Buf size is 9728
> > ><><><><!!<devstart>: dev->data->dev_conf.intr_conf=1
> > bnxt_mq_rx_configure(): pools = 1 nb_q_per_grp = 2
> > bnxt_mq_rx_configure(): rxq[0] = 0x106200280 vnic[0] = 0x100200080
> > bnxt_mq_rx_configure(): rxq[1] = 0x105f10e40 vnic[0] = 0x100200080
> > bnxt_setup_one_vnic(): vnic[0] = 0x100200080 vnic->fw_grp_ids =
> 0x105f07e00
> > bnxt_hwrm_vnic_alloc(): Alloc VNIC. Start 0, End 2
> > bnxt_hwrm_vnic_alloc(): VNIC ID 3
> > bnxt_setup_one_vnic(): rxq[0]->vnic=0x100200080
> vnic->fw_grp_ids=0x105f07e00
> > bnxt_setup_one_vnic(): rxq[1]->vnic=0x100200080
> vnic->fw_grp_ids=0x105f07e00
> > bnxt_setup_one_vnic(): vnic->rx_queue_cnt = 2
> > bnxt_hwrm_port_phy_qcfg(): Link Speed:0,Auto:4:64:140,Support:140,Force:0
> > bnxt_hwrm_port_phy_qcfg(): Link Signal:0,PAM::Auto:0,Support:0,Force:0
> > bnxt_hwrm_port_phy_qcfg(): Link Speed:0,Auto:4:64:140,Support:140,Force:0
> > bnxt_hwrm_port_phy_qcfg(): Link Signal:0,PAM::Auto:0,Support:0,Force:0
> > bnxt_ulp_port_init(): Skip ulp init for port: 1, TF is not enabled
> > bnxt_receive_function(): Using SSE vector mode receive for port 1
> > bnxt_transmit_function(): Using SSE vector mode transmit for port 1
> > Port 1: 00:0A:F7:B6:E3:D1
> >
> > Here, we can see that lsc interrupts are enabled even though we
> specified not to enable them. Then given autoneg does not work on my nic, I
> can try setting the link up and showing port info:
> >
> > testpmd> set link-up port 0
> > bnxt_print_link_info(): Port 0 Link Up - speed 0 Mbps - half-duplex
> >
> > testpmd> show port info 0
> >
> > ********************* Infos for port 0  *********************
> > MAC address: 00:0A:F7:B6:E3:D0
> > Device name: 0000:02:00.0
> > Driver name: net_bnxt
> > Firmware-version: 223.0.161.0
> > Devargs:
> > Connect to socket: 0
> > memory allocation on the socket: 0
> > Link status: up
> > Link speed: None
> > Link duplex: half-duplex
> > Autoneg status: Off
> > MTU: 1500
> > Promiscuous mode: enabled
> > Allmulticast mode: disabled
> > Maximum number of MAC addresses: 128
> > Maximum number of MAC addresses of hash filtering: 0
> > VLAN offload:
> >   strip off, filter off, extend off, qinq strip off
> > Hash key size in bytes: 40
> > Redirection table size: 128
> > Supported RSS offload flow types:
> >   ipv4  ipv4-tcp  ipv4-udp  ipv6  ipv6-tcp  ipv6-udp
> >   user-defined-50  user-defined-51
> > Minimum size of RX buffer: 1
> > Maximum configurable length of RX packet: 9600
> > Maximum configurable size of LRO aggregated packet: 0
> > Maximum number of VMDq pools: 64
> > Current number of RX queues: 2
> > Max possible RX queues: 117
> > Max possible number of RXDs per queue: 8192
> > Min possible number of RXDs per queue: 16
> > RXDs number alignment: 1
> > Current number of TX queues: 2
> > Max possible TX queues: 117
> > Max possible number of TXDs per queue: 4096
> > Min possible number of TXDs per queue: 16
> > TXDs number alignment: 1
> > Max segment number per packet: 65535
> > Max segment number per MTU/TSO: 65535
> > Device capabilities: 0x3( RUNTIME_RX_QUEUE_SETUP RUNTIME_TX_QUEUE_SETUP )
> > Switch name: 0000:02:00.0
> > Switch domain Id: 0
> > Switch Port Id: 32768
> > Device error handling mode: proactive
> >
> > This shows link status is seen as up, even although link speed is None.
> >
> > I was wondering if patching the code to move this line which sets lsc
> interrupt on somewhere else might be reasonable, or if this could cause
> further trouble. Maybe having a parameter to trigger it ON/OFF might be a
> good addition.
> > May I have your opinion on this matter?
> >
> > Sincerely,
> > Edwin Brossette
>

[-- Attachment #2: Type: text/html, Size: 20630 bytes --]

  reply	other threads:[~2023-02-08 13:16 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-01-19 13:37 Edwin Brossette
2023-01-20  1:04 ` Ajit Khaparde
2023-01-31  5:08 ` Somnath Kotur
2023-02-06  8:44   ` Edwin Brossette [this message]
2023-02-07  2:07     ` Somnath Kotur
2023-02-06 17:11 ` [PATCH] bnxt: fix unwanted interrupt config on link state change edwin.brossette
2023-02-07  5:44   ` Somnath Kotur
2023-02-08 18:19     ` Ajit Khaparde

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CANDF9xCBXR+iLes+TeHW1TLdhM0W4qVLY-vK6rqAUuj1o0zG5A@mail.gmail.com \
    --to=edwin.brossette@6wind.com \
    --cc=ajit.khaparde@broadcom.com \
    --cc=dev@dpdk.org \
    --cc=didier.pallard@6wind.com \
    --cc=laurent.hardy@6wind.com \
    --cc=olivier.matz@6wind.com \
    --cc=somnath.kotur@broadcom.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).