Hi Dariusz, Khadem,
On Mon, 21 Jul 2025, Dariusz Sosnowski wrote:
> On Mon, Jul 21, 2025 at 01:59:59PM +0200, Thomas Monjalon wrote:
>> 21/07/2025 13:46, Ivan Malov:
>>> On Mon, 21 Jul 2025, Dariusz Sosnowski wrote:
>>>
>>>> + mlx5 maintainers
>>>>
>>>> Thank you for the patch.
>>>>
>>>> Could you please include other PMD maintainers (or other maintainers, depending on changed code)
>>>> in the future patches?
>>>> There is a script which automatically adds maintainers while sending a patch.
>>>> It is described in: https://doc.dpdk.org/guides/contributing/patches.html#sending-patches
>>>>
>>>> On Mon, Jul 21, 2025 at 03:38:51AM -0400, Khadem Ullah wrote:
>>>>> When the primary process exits, the shared mlx5 state becomes
>>>>> unavailable to secondary processes. If a secondary process attempts
>>>>> to query device information (e.g., via testpmd), a NULL dereference
>>>>> may occur due to missing shared data.
>>>>>
>>>>> This patch adds a check for shared context availability and fails
>>>>> gracefully while preventing a crash.
>>>>>
>>>>> Fixes: e60fbd5b24fc ("mlx5: add device configure/start/stop")
>>>>> Cc: stable@dpdk.org
>>>>>
>>>>> Signed-off-by: Khadem Ullah <14pwcse1224@uetpeshawar.edu.pk>
>>>>> ---
>>>>> drivers/net/mlx5/mlx5_ethdev.c | 6 ++++++
>>>>> 1 file changed, 6 insertions(+)
>>>>>
>>>>> diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
>>>>> index 68d1c1bfa7..1848f6536a 100644
>>>>> --- a/drivers/net/mlx5/mlx5_ethdev.c
>>>>> +++ b/drivers/net/mlx5/mlx5_ethdev.c
>>>>> @@ -368,6 +368,12 @@ mlx5_dev_infos_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *info)
>>>>> * Since we need one CQ per QP, the limit is the minimum number
>>>>> * between the two values.
>>>>> */
>>>>> + if (priv == NULL || priv->sh == NULL) {
>>>>> + DRV_LOG(ERR,
>>>>> + "mlx5 shared data unavailable (primary process likely exited)");
>>>>> + rte_errno = ENODEV;
>>>>> + return -rte_errno;
>>>>> + }
>>>>
>>>> I don't think it's an issue on PMD level, but rather on
>>>> ethdev/multi-process handling level.
>>>
>>> I may be very wrong here, but can't the PMD use its internal 'shared' state to
>>> somehow memorise the fact that a secondary process has attached, in order to not
>>> let the primary process close in the first place (before the secondary process
>>> detaches)? Or is this going to violate some established convention?
>>
>> It looks to be a good idea.
>
> I agree with idea of adding these checks, but not entirely agree with
> it being at driver level, since all drivers would have to duplicate this logic.
>
> Drivers already have to go through ethdev library when port is probed
> on secondary process - rte_eth_dev_attach_secondary() must be
> called to retrieve port data local to the process and device data shared
> between processes.
> Memorizing whether a secondary process is using given port can be added
> to rte_eth_dev_attach_secondary(), and relevant check for primary
> process can then be added to rte_eth_dev_close(), so that all drivers
> benefit.
>
> What do you think?
Yes, in general, the idea would be to have a "secondary reference counter" field
in 'rte_eth_dev_data' and have it incremented/decremented/checked in some
generic places. The issue is that, in addition to 'rte_eth_dev_close', a device
can be "closed" via its respective bus 'remove' method. Yes, there are drivers
that invoke 'rte_eth_dev_close' inside the 'remove' implementation, but not all
of them, unfortunately. For example, take a look at 'af_packet' and similar.
On the other hands, there are drivers that use 'rte_eth_dev_create' and its
counterpart 'rte_eth_dev_destroy' in the implementations of bus 'probe/remove',
but that is also not consistent across the 'drivers/net' tree.
Given all these issues and the fact that ethdev is not the only possible device
type, may be 'rte_device' can house the reference counter, with 'rte_dev_probe'
and 'rte_dev_remove' augmented to do the inrement/decrement/validate thing?
And, since 'rte_eth_dev_close' invokes direct ethdev 'close' method, also add a
check to over there (and may be to 'rte_eth_dev_reset', too)? May be I'm wrong.
However, all of that might still make no sense, given the fact that the primary
process can just die. So, may be I am indeed very wrong and, as Stephen has
suggested in [1], this issue it to be addressed by clarifying the documentation.
[1] https://mails.dpdk.org/archives/dev/2025-July/321865.html
Thank you.
>
>>
>>>> When primary process closes the port, ethdev library zeroes and frees
>>>> device data shared between processes.
>>>> ethdev port data (rte_eth_dev) on secondary is not updated so it now points to
>>>> invalid data. rte_eth_dev_info_get() is not the only API call affected.
>>>>
>>>> If the primary process closes the port before exiting
>>>> (like testpmd does) and it exits before the secondary,
>>>> the any driver call seems invalid because of that use-after-free behavior.
>>>>
>>>> @Thomas, @Andrew - Do you happen to know if doing anything on ethdev ports
>>>> in secondary process after primary has gracefully exited is supported?
>>
>> No there is no statement about whether it is supported or not.
>> I think we should at least return an error instead of crashing.
>>
>>
>