You can look at the failsafe driver as an example:

https://git.dpdk.org/dpdk/tree/drivers/net/failsafe?h=v19.11

 

In general, it manages two devices: a primary and a secondary.

Once the primary device gets the RMV event, the driver closes it and switches control to the secondary device.

Meanwhile, the driver periodically checks whether the primary device is back on the bus. If so, it probes the device again and reconfigures it so that control can be switched back to the primary device.
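
The switching logic described above can be modelled as a small state machine. This is only a simplified sketch in plain C, not the failsafe source: the type and function names (failover_state, failover_on_primary_rmv, failover_poll) are hypothetical, and the real close/probe/reconfigure work is reduced to comments so the model compiles without DPDK.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Which sub-device currently carries the traffic. */
enum active_dev { ACTIVE_PRIMARY, ACTIVE_SECONDARY };

/* Hypothetical model state; the real driver keeps far more per sub-device. */
struct failover_state {
    bool primary_ready;     /* primary is probed and configured */
    enum active_dev active; /* sub-device in control */
};

static void failover_init(struct failover_state *fs)
{
    assert(fs != NULL);
    fs->primary_ready = true;
    fs->active = ACTIVE_PRIMARY;
}

/* RMV event on the primary: drop it and hand control to the secondary. */
static void failover_on_primary_rmv(struct failover_state *fs)
{
    assert(fs != NULL);
    /* A real driver would close the primary port at this point. */
    fs->primary_ready = false;
    fs->active = ACTIVE_SECONDARY;
}

/*
 * Periodic check. primary_on_bus is the result of the caller's bus scan.
 * Returns true when control was switched back to the primary.
 */
static bool failover_poll(struct failover_state *fs, bool primary_on_bus)
{
    assert(fs != NULL);
    if (fs->primary_ready || !primary_on_bus)
        return false;
    /* A real driver would probe and reconfigure the primary here. */
    fs->primary_ready = true;
    fs->active = ACTIVE_PRIMARY;
    return true;
}
```

Calling failover_poll() from a timer keeps the data path untouched while the primary is away; control only moves back once the primary has been probed and configured again.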

 

Thanks,

Matan

 

 

From: jinag <15720603159@163.com>
Sent: Monday, 14 August 2023 13:06
To: Matan Azrad <matan@nvidia.com>
Cc: users@dpdk.org; Shahaf Shuler <shahafs@nvidia.com>; Slava Ovsiienko <viacheslavo@nvidia.com>
Subject: Re:RE: Does the mlx5 NIC support reloading

 

Hi Matan:

Could you please provide me with an example?

Thanks a lot

 

At 2023-08-14 14:41:22, "Matan Azrad" <matan@nvidia.com> wrote:

Hi Jinag

 

After the device is unplugged from the bus, you should receive the RTE_ETH_EVENT_INTR_RMV event.

You need to listen to this event and close the port when you see it.

 

After plugging in the device, you need to scan the bus again and attach the mlx5 device, so it will be probed again and a new ethdev port will be created.

 

Then, you need to reconfigure the port, as usual, and reuse the device.
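
A rough sketch of this sequence in C follows. To keep it compilable without DPDK, the DPDK calls sit behind hypothetical function pointers (struct port_ops), and the comments name the calls each hook would presumably wrap: rte_eth_dev_close(), a probe such as rte_dev_probe(), then rte_eth_dev_configure(), queue setup and rte_eth_dev_start(). The RMV callback would be registered with rte_eth_dev_callback_register(); here it only sets a flag, on the assumption that closing the port directly from the event callback should be avoided. All other names, including the fake_* stand-ins, are made up for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical hooks; a real application would wrap the DPDK calls named below. */
struct port_ops {
    /* would wrap rte_eth_dev_close() */
    void (*close_port)(uint16_t port_id);
    /* would wrap a probe such as rte_dev_probe() plus a port id lookup; 0 on success */
    int (*probe_device)(const char *devargs, uint16_t *new_port_id);
    /* would wrap rte_eth_dev_configure(), queue setup, rte_eth_dev_start(); 0 on success */
    int (*configure_and_start)(uint16_t port_id);
};

struct port_ctx {
    uint16_t port_id;
    bool attached;       /* true while we hold a usable ethdev port */
    atomic_bool removed; /* set by the event callback, consumed by the main loop */
};

/* Body of the RTE_ETH_EVENT_INTR_RMV callback: only record the event. */
static void on_rmv_event(struct port_ctx *ctx)
{
    assert(ctx != NULL);
    atomic_store(&ctx->removed, true);
}

/* Main loop, step 1: close the port after a removal. Returns 1 if closed. */
static int handle_removal(struct port_ctx *ctx, const struct port_ops *ops)
{
    bool removed = atomic_exchange(&ctx->removed, false);

    if (!ctx->attached || !removed)
        return 0;
    ops->close_port(ctx->port_id);
    ctx->attached = false;
    return 1;
}

/* Main loop, step 2: probe again and reconfigure.
 * Returns 1 if re-attached, 0 if already attached, -1 to retry later. */
static int reattach(struct port_ctx *ctx, const struct port_ops *ops,
                    const char *devargs)
{
    uint16_t new_id;

    if (ctx->attached)
        return 0;
    if (ops->probe_device(devargs, &new_id) != 0)
        return -1; /* device not back on the bus yet */
    if (ops->configure_and_start(new_id) != 0) {
        ops->close_port(new_id); /* do not leak a half-configured port */
        return -1;
    }
    ctx->port_id = new_id; /* the new ethdev port may have a different id */
    ctx->attached = true;
    return 1;
}

/* Stand-ins used only to exercise the flow without hardware. */
static bool fake_device_on_bus = true;
static uint16_t fake_next_port_id = 1;
static int fake_close_calls;

static void fake_close(uint16_t port_id) { (void)port_id; fake_close_calls++; }

static int fake_probe(const char *devargs, uint16_t *new_port_id)
{
    (void)devargs;
    if (!fake_device_on_bus)
        return -1;
    *new_port_id = fake_next_port_id++;
    return 0;
}

static int fake_configure_and_start(uint16_t port_id) { (void)port_id; return 0; }

static const struct port_ops fake_ops = {
    fake_close, fake_probe, fake_configure_and_start
};
```

The point of the sketch is the ordering: the old port is closed before the device is probed again, and the full configure, queue setup and start sequence runs on the newly created port rather than a stop/start on the old one.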

 

Thanks,

Matan

 

From: jinag <15720603159@163.com>
Sent: Monday, 14 August 2023 6:07
To: users@dpdk.org; Matan Azrad <matan@nvidia.com>; Shahaf Shuler <shahafs@nvidia.com>; Slava Ovsiienko <viacheslavo@nvidia.com>
Subject: Does the mlx5 NIC support reloading

 

Hi 

I am verifying the reload function of the mlx5 NIC with DPDK 19.11:

echo 1 > /sys/bus/pci/devices/$pci_address/remove

echo 1 > /sys/bus/pci/rescan

rte_bus_probe();  

rte_eth_dev_stop(); 

rte_eth_dev_start(); 

    net_mlx5: port 0 TX queue 0 CQ creation failure

    net_mlx5: port 0 TX queue allocation failed: cannot allocate memory

The NIC cannot be reinitialized.

 

I am not sure whether the above sequence is correct. Could you please tell me whether the mlx5 NIC supports reloading (for example, when the network card goes down during normal operation) and which DPDK functions need to be called?

 

Thanks!