DPDK patches and discussions
* [DPDK/ethdev Bug 1419] [mlx5] Segfault when calling rte_eth_dev_start() twice
From: bugzilla @ 2024-04-24  8:38 UTC
  To: dev

https://bugs.dpdk.org/show_bug.cgi?id=1419

            Bug ID: 1419
           Summary: [mlx5] Segfault when calling rte_eth_dev_start() twice
           Product: DPDK
           Version: unspecified
          Hardware: x86
                OS: Linux
            Status: UNCONFIRMED
          Severity: normal
          Priority: Normal
         Component: ethdev
          Assignee: dev@dpdk.org
          Reporter: vojanec@cesnet.cz
  Target Milestone: ---

Created attachment 279
  --> https://bugs.dpdk.org/attachment.cgi?id=279&action=edit
Example application for reproducing

When 'rte_eth_dev_start()' is called on a port whose mempool is not large
enough, the function fails with error code '-ENOMEM' and the message:

    mlx5_net: port 0 Rx queue allocation failed: Cannot allocate memory

This is expected behaviour. However, when the same call is retried right
after the failure, the function fails with error code '-EINVAL' and the
message:

    mlx5_net: port 0 failed to set defaults flows

This behaviour is suspicious: since no additional memory was made available
in the meantime, the second call would be expected to fail with the same
error message.

Even more suspicious, and clearly incorrect, behaviour is observed when flow
isolated mode is enabled. In that case, the first call to
'rte_eth_dev_start()' fails as expected, but the second call succeeds
(return value 0). This leads to undefined behaviour and a segfault when
'rte_eth_rx_burst()' is called later.
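
As an illustration of the sequence above, and of one possible caller-side
guard, here is a minimal self-contained sketch. It does not use DPDK:
'fake_dev_start()' is a hypothetical stand-in that only replays the return
codes reported here, and 'struct port_guard' / 'guarded_start()' are
illustrative names, not DPDK API. The guard remembers the first start
failure and keeps returning it, so a later retry can never "succeed" on a
port that was left half torn down.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver. It replays only the return codes
 * described in this report and is NOT DPDK or mlx5 code. */
struct fake_port {
    bool isolated;   /* flow isolated mode enabled */
    int start_calls; /* number of start attempts so far */
};

static int fake_dev_start(struct fake_port *p)
{
    assert(p != NULL);
    p->start_calls++;
    if (p->start_calls == 1)
        return -ENOMEM;               /* Rx queue allocation failed */
    return p->isolated ? 0 : -EINVAL; /* what the retry was observed to return */
}

/* Illustrative caller-side guard: stores the first start error. */
struct port_guard {
    int first_err; /* 0 while no start attempt has failed */
};

static int guarded_start(struct port_guard *g, struct fake_port *p)
{
    int ret;

    assert(g != NULL);
    if (g->first_err != 0)
        return g->first_err; /* do not retry a port that failed to start */

    ret = fake_dev_start(p);
    if (ret < 0)
        g->first_err = ret;
    return ret;
}
```

In a real application the guard would have to be reset once the port has
been reconfigured (for example with a larger mempool); that step is left out
of the sketch.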

[Steps to reproduce]
See the attached patch, which introduces an example application. Apply the
patch and build the application using 'make'. Run the application as follows:

    # dpdk-hugepages --setup 2G
    # ./build/crash <EAL ARGS> -- 1024

The only application argument is the packet mempool size. Setting it to 1024
ensures that the mempool is small enough to be allocated, but too small for
the first 'rte_eth_dev_start()' to succeed.

The application initializes a single DPDK port (use the '--allow' EAL
argument to specify which one), enables flow isolated mode and attempts to
start the port twice. After that, the application segfaults when calling
'rte_eth_rx_burst()'.

[Bug investigation]
When its first call fails, the 'mlx5_dev_start()' function deallocates the
memory it used. However, it seems to deallocate more than it actually
allocated, effectively unconfiguring the queues (or possibly the entire
port; this is unconfirmed). In flow isolated mode, the second call to
'mlx5_dev_start()' appears to skip some initialization and does not return
an error.
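
For comparison, a balanced error path releases only what the failed call
itself acquired and leaves earlier configuration untouched. The sketch below
is generic C and not taken from the mlx5 driver; 'struct port_state',
'port_start()' and the 'fail_step' parameter are hypothetical names used
only to illustrate the unwinding pattern.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical port state: counts of resources currently held. */
struct port_state {
    int configured_queues; /* set up before start; must survive a failed start */
    int rx_buffers;        /* acquired by start */
    int flows;             /* acquired by start */
};

/* Illustrative start routine. fail_step selects which step fails
 * (0 = no failure, 1 = Rx buffer allocation, 2 = flow setup). */
static int port_start(struct port_state *s, int fail_step)
{
    assert(s != NULL);

    if (fail_step == 1)
        return -ENOMEM;   /* nothing acquired yet, so nothing to undo */
    s->rx_buffers = 1;

    if (fail_step == 2)
        goto err_buffers; /* undo only the Rx buffers acquired above */
    s->flows = 1;

    return 0;

err_buffers:
    s->rx_buffers = 0;
    return -EINVAL;
}
```

With this shape, a failed start leaves 'configured_queues' as it was, so a
retry sees the same port state as the first attempt and fails (or succeeds)
for the same reason.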

[DPDK Version]
Tested on:
    e2e546ab5b ("version: 24.07-rc0")
    eeb0605f11 ("version: 23.11.0"), tag: v23.11

[OS Version]
Operating system: Red Hat Enterprise Linux release 8.9 (Ootpa)
Kernel: 4.18.0-477.10.1.el8_8.x86_64
Architecture: x86_64

[Network Devices]
0000:c4:00.0 'MT2892 Family [ConnectX-6 Dx] 101d' if=ens3f0np0 drv=mlx5_core unused=
0000:c4:00.1 'MT2892 Family [ConnectX-6 Dx] 101d' if=ens3f1np1 drv=mlx5_core unused=

-- 
You are receiving this mail because:
You are the assignee for the bug.
