From: Thomas Monjalon <thomas@monjalon.net>
To: CJ Sculti <cj@cj.gy>
Cc: Yasuhiro Ohara <yasu1976@gmail.com>,
users@dpdk.org, Dariusz Sosnowski <dsosnowski@nvidia.com>
Subject: Re: DPDK with Mellanox ConnectX-5, complaining about mlx5_eth?
Date: Mon, 18 Nov 2024 18:46:31 +0100
Message-ID: <1835831.3VsfAaAtOV@thomas>
In-Reply-To: <CANvjkS8rW4XpaHpjEgJO2JUg=__Ok-SLufV-ASRjV7rQuvMWzw@mail.gmail.com>
I'm not sure I understand your need.
Do you want the ports to be bonded in the kernel but used separately in userspace?
As for "reinjecting" into the kernel, you don't need to do that with mlx5.
As I said before, you can rely on the bifurcated model to have some flows
processed directly in the kernel: you choose which flows go to the kernel and which go to userspace.
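
For reference, the traffic in question is easy to identify. The sketch below
is only an illustration in plain C, not mlx5 or DPDK code: it shows the check
a software reinjection path would have to make on every received frame, and
that the bifurcated model lets you leave to NIC steering instead. The function
name frame_is_lacp() is hypothetical, and the frame is assumed to be untagged
(no VLAN header). LACP uses the Slow Protocols EtherType 0x8809 with subtype 1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ETHERTYPE_SLOW    0x8809u /* IEEE 802.3 Slow Protocols EtherType */
#define SLOW_SUBTYPE_LACP 0x01u   /* subtype 1 = LACP, 2 = Marker */
#define LACP_MIN_LEN      15u     /* 14-byte Ethernet header + subtype byte */

/*
 * Hypothetical helper: returns true if an untagged Ethernet frame
 * carries an LACPDU. VLAN-tagged frames are not handled here.
 */
static bool frame_is_lacp(const uint8_t *frame, size_t len)
{
    assert(frame != NULL); /* precondition: caller passes a valid buffer */

    if (len < LACP_MIN_LEN)
        return false;

    /* EtherType is stored big-endian at bytes 12..13. */
    uint16_t ethertype = (uint16_t)((frame[12] << 8) | frame[13]);

    return ethertype == ETHERTYPE_SLOW && frame[14] == SLOW_SUBTYPE_LACP;
}
```

With mlx5 no such per-packet check should be needed in the application for the
default case, since LACP traffic is steered to the kernel unless told otherwise.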
Also, you can find this in the mlx5 documentation:
"
- ``lacp_by_user`` parameter [int]

  A nonzero value enables the control of LACP traffic by the user application.
  When a bond exists in the driver, by default it should be managed by the
  kernel and therefore LACP traffic should be steered to the kernel.
  If this devarg is set to 1 it will allow the user to manage the bond by
  itself and not steer LACP traffic to the kernel.

  Disabled by default (set to 0).
"
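
If you do want the application to handle LACP itself, the devarg is appended
to the device's allow-list entry on the EAL command line. Below is a minimal
sketch in plain C that only builds that string. The helper name
format_mlx5_devargs() and the PCI address used in the usage note are
hypothetical; only the lacp_by_user devarg name comes from the documentation
quoted above.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: formats "<pci_addr>,lacp_by_user=<0|1>" into buf.
 * Returns 0 on success, -1 if the output did not fit in buf.
 */
static int format_mlx5_devargs(char *buf, size_t buflen,
                               const char *pci_addr, int lacp_by_user)
{
    assert(buf != NULL && pci_addr != NULL);

    /* Any nonzero value enables the option, so normalise it to 1. */
    int n = snprintf(buf, buflen, "%s,lacp_by_user=%d",
                     pci_addr, lacp_by_user ? 1 : 0);

    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```

For a made-up device at 0000:03:00.0 this produces
"0000:03:00.0,lacp_by_user=1", which would be passed to EAL after the -a
(allow) option. Leaving the devarg out keeps the default of 0, so LACP
traffic stays with the kernel.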
18/11/2024 18:33, CJ Sculti:
> Thank you for the advice on that. I have removed the reset, and I also
> fixed the rte_eth_dev_get_supported_ptypes(). I moved
> my rte_eth_dev_get_supported_ptypes() calls after the bypass interface was
> configured, and now they're returning expected values.
>
> My next issue I'm running into is the bonding issue that I talked about in
> my initial post. It seems that when part of a kernel bond, the mlx5_core
> driver combines both ports into a single 'verbs' device named
> 'mlx5_bond_0'. On my old setup with igb_uio, it worked like this:
>
> - At Linux boot, the 2x Intel ports were configured as a bond on the kernel
> via /etc/network/interfaces file.
> - Before starting DPDK software, both ports bound to igb_uio driver.
> - Software started, ports are setup by my software as kernel bypass.
> - Enslaved the 2x 'new' bypass interfaces onto the same bond.
> - Software reinjected LACP packets into kernel, to let kernel handle LACP
> protocol.
>
> This behavior where both ports are 'combined' into a single verbs device is
> strange to me. How should I handle this? Is there any way to disable it and
> just have the 2 ports be separate interfaces?
>
> root@DDoSMitigation:~/anubis/engine/bin# ibv_devinfo
> hca_id: mlx5_bond_0
>         transport:        InfiniBand (0)
>         fw_ver:           16.35.4030
>         node_guid:        506b:4b03:00b6:76ec
>         sys_image_guid:   506b:4b03:00b6:76ec
>         vendor_id:        0x02c9
>         vendor_part_id:   4119
>         hw_ver:           0x0
>         board_id:         MT_0000000090
>         phys_port_cnt:    1
>                 port: 1
>                         state:       PORT_ACTIVE (4)
>                         max_mtu:     4096 (5)
>                         active_mtu:  1024 (3)
>                         sm_lid:      0
>                         port_lid:    0
>                         port_lmc:    0x00
>                         link_layer:  Ethernet
>
> On Fri, Nov 15, 2024 at 5:12 AM Yasuhiro Ohara <yasu1976@gmail.com> wrote:
>
> > > I think you should try a bit more, we are here to help.
> >
> > I second Thomas's opinion. Mellanox CX5 is a well-tested NIC on DPDK,
> > and I think you can make it work in only a few more steps.
> >
> > I've never tried rte_eth_dev_reset(), and now I suspect that ENOTSUPP
> > might have made the whole NIC functions stopped.
> > IIRC the ptype was working fine on CX5 in my app too.
> > Can you comment out the rte_eth_dev_reset() ?
> > (I think it's worth to try.)
> >
> > I don't think it is so uphill, but I don't disagree with you about
> > purchasing another Intel 40G NICs. It'll also work perfectly fine.
> >
> > 2024年11月15日(金) 4:16 Thomas Monjalon <thomas@monjalon.net>:
> > >
> > > 14/11/2024 17:10, CJ Sculti:
> > > > I figured out the initial issue. For some reason, having both devices
> > > > in a bond on the kernel results in only 1 of the two ports being
> > > > exposed as 'verb' devices. Previously, ibv_devinfo returned only one
> > > > port. After removing both from the bond, ibv_devinfo returns both
> > > > ports, and the DPDK application successfully takes both in. I'm still
> > > > having some weird behavior trying to create a bypass interface with
> > > > these ports though. I'm using the same code that I've been using on my
> > > > Intel NICs with igb_uio for years, but seeing weird behavior. The ports
> > > > are connected to our 40Gbps Ethernet switch, and set to link_layer:
> > > > Ethernet.
> > >
> > > You should be able to make it work with kernel bonding,
> > > but I'm not sure what's wrong to do that.
> > > And it looks not a priority for you. Let's focus on the other parts.
> > >
> > >
> > > > The first thing I noticed is that rte_eth_dev_reset() fails on these
> > > > interfaces with "ENOTSUP: hardware doesn't support reset".
> > >
> > > You don't need the reset procedure with mlx5,
> > > so you can make this code optional.
> > >
> > >
> > > > Secondly, when checking ptypes, I noticed my code says these NICs are
> > > > unable to support any sort of packet detection capability (code below,
> > > > all return false). The MLX5 docs do say that all of these ptypes used
> > > > here are supported by MLX5.
> > >
> > > The supported ptypes can be checked in mlx5_dev_supported_ptypes_get()
> > > code.
> > > I don't understand why it does not work for you.
> > >
> > >
> > > > I'm just picking up a project that was left off by an older dev. It
> > > > hasn't been touched in years, but has been working fine with our Intel
> > > > NICs. All I'm trying to do is update DPDK (which is done, updated from
> > > > dpdk 19.05 to DPDK 22.11, latest version with KNI support),
> > >
> > > You don't need KNI with mlx5.
> > > That's a big benefit of mlx5 design, it is natively bifurcated:
> > > https://doc.dpdk.org/guides/howto/flow_bifurcation.html
> > >
> > >
> > > > and get it to work with our Mellanox CX5 NICs.
> > > > This is my first time working with DPDK and I'm not very
> > > > familiar. Should I expect to be able to do this without having to make
> > > > a ton of code changes, or is this going to be an uphill battle for me?
> > > > If it's the latter, I will likely just go purchase Intel NICs and give
> > > > up on this.
> > >
> > > The NICs have difference that DPDK is trying to hide.
> > > If something is not compatible you may consider it as a bug or a
> > > limitation.
> > > I think you should try a bit more, we are here to help.
> > >
> > >
> >
>