* [dpdk-dev] [RFC V1] ethdev: fix the issue that dev uninit may be called twice
@ 2021-08-02 12:46 Huisong Li
2021-08-03 2:30 ` [dpdk-dev] [RFC V2] ethdev: fix issue that dev close in PMD calls twice Huisong Li
` (5 more replies)
0 siblings, 6 replies; 33+ messages in thread
From: Huisong Li @ 2021-08-02 12:46 UTC (permalink / raw)
To: dev; +Cc: thomas, ferruh.yigit, lihuisong
Ethernet devices in DPDK can be released by rte_eth_dev_close() and
rte_dev_remove(). However, these two APIs do not have explicit invocation
restrictions. In other words, at the ethdev layer, calling
rte_eth_dev_close() and then rte_dev_remove() or rte_eal_hotplug_remove()
is allowed. In such a bad scenario, the primary process may be fine, but it
may cause that dev_unint() in the secondary process will be called twice,
and even other serious problems. So this patch fixes it.
Fixes: 99a2dd955fba ("lib: remove librte_ prefix from directory names")
Signed-off-by: Huisong Li <lihuisong@huawei.com>
---
lib/ethdev/ethdev_pci.h | 13 +++++++++++++
1 file changed, 13 insertions(+)
diff --git a/lib/ethdev/ethdev_pci.h b/lib/ethdev/ethdev_pci.h
index 8edca82..14a0e01 100644
--- a/lib/ethdev/ethdev_pci.h
+++ b/lib/ethdev/ethdev_pci.h
@@ -151,6 +151,19 @@ rte_eth_dev_pci_generic_remove(struct rte_pci_device *pci_dev,
if (!eth_dev)
return 0;
+ /*
+ * The eth_dev->data->name doesn't be cleared by the secondary precess,
+ * so above "eth_dev" isn't NULL after rte_eth_dev_close() called.
+ * Namely, whether "eth_dev" is NULL cannot be used to determine whether
+ * an ethdev port has been released.
+ * For both primary precess and secondary precess, eth_dev->state is
+ * RTE_ETH_DEV_UNUSED, which means the ethdev port has been released.
+ */
+ if (eth_dev->state == RTE_ETH_DEV_UNUSED) {
+ RTE_ETHDEV_LOG(INFO, "The ethdev port has been released.");
+ return 0;
+ }
+
if (dev_uninit) {
ret = dev_uninit(eth_dev);
if (ret)
--
2.8.1
^ permalink raw reply [flat|nested] 33+ messages in thread
* [dpdk-dev] [RFC V2] ethdev: fix issue that dev close in PMD calls twice
2021-08-02 12:46 [dpdk-dev] [RFC V1] ethdev: fix the issue that dev uninit may be called twice Huisong Li
@ 2021-08-03 2:30 ` Huisong Li
2021-08-13 2:11 ` Huisong Li
2021-08-18 9:47 ` [dpdk-dev] [RFC V1] ethdev: fix the issue that dev uninit may be called twice Singh, Aman Deep
` (4 subsequent siblings)
5 siblings, 1 reply; 33+ messages in thread
From: Huisong Li @ 2021-08-03 2:30 UTC (permalink / raw)
To: dev; +Cc: thomas, ferruh.yigit, lihuisong
Ethernet devices in DPDK can be released by rte_eth_dev_close() and
rte_dev_remove(). These APIs both call xxx_dev_close() in PMD layer
to uninstall hardware. However, the two APIs do not have explicit
invocation restrictions. In other words, at the ethdev layer, it is
possible to call rte_eth_dev_close() before calling rte_dev_remove()
or rte_eal_hotplug_remove(). In such a bad scenario, the primary
process may be fine, but it may cause that xxx_dev_close() in the PMD
layer will be called twice in the secondary process. So this patch
fixes it.
Fixes: 99a2dd955fba ("lib: remove librte_ prefix from directory names")
Signed-off-by: Huisong Li <lihuisong@huawei.com>
---
v2: fix commit description
---
lib/ethdev/ethdev_pci.h | 13 +++++++++++++
1 file changed, 13 insertions(+)
diff --git a/lib/ethdev/ethdev_pci.h b/lib/ethdev/ethdev_pci.h
index 8edca82..429c4c7 100644
--- a/lib/ethdev/ethdev_pci.h
+++ b/lib/ethdev/ethdev_pci.h
@@ -151,6 +151,19 @@ rte_eth_dev_pci_generic_remove(struct rte_pci_device *pci_dev,
if (!eth_dev)
return 0;
+ /*
+ * The eth_dev->data->name doesn't be cleared by the secondary process,
+ * so above "eth_dev" isn't NULL after rte_eth_dev_close() called.
+ * Namely, whether "eth_dev" is NULL cannot be used to determine whether
+ * an ethdev port has been released.
+ * For both primary process and secondary process, eth_dev->state is
+ * RTE_ETH_DEV_UNUSED, which means the ethdev port has been released.
+ */
+ if (eth_dev->state == RTE_ETH_DEV_UNUSED) {
+ RTE_ETHDEV_LOG(INFO, "The ethdev port has been released.");
+ return 0;
+ }
+
if (dev_uninit) {
ret = dev_uninit(eth_dev);
if (ret)
--
2.8.1
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [dpdk-dev] [RFC V2] ethdev: fix issue that dev close in PMD calls twice
2021-08-03 2:30 ` [dpdk-dev] [RFC V2] ethdev: fix issue that dev close in PMD calls twice Huisong Li
@ 2021-08-13 2:11 ` Huisong Li
2021-08-13 6:12 ` Thomas Monjalon
0 siblings, 1 reply; 33+ messages in thread
From: Huisong Li @ 2021-08-13 2:11 UTC (permalink / raw)
To: Thomas Monjalon, Yigit, Ferruh, Andrew Rybchenko; +Cc: dev
Hi, all
This patch can enhance the security of device uninstallation to
eliminate dependency on user usage methods.
Can you check this patch?
在 2021/8/3 10:30, Huisong Li 写道:
> Ethernet devices in DPDK can be released by rte_eth_dev_close() and
> rte_dev_remove(). These APIs both call xxx_dev_close() in PMD layer
> to uninstall hardware. However, the two APIs do not have explicit
> invocation restrictions. In other words, at the ethdev layer, it is
> possible to call rte_eth_dev_close() before calling rte_dev_remove()
> or rte_eal_hotplug_remove(). In such a bad scenario, the primary
> process may be fine, but it may cause that xxx_dev_close() in the PMD
> layer will be called twice in the secondary process. So this patch
> fixes it.
>
> Fixes: 99a2dd955fba ("lib: remove librte_ prefix from directory names")
>
> Signed-off-by: Huisong Li <lihuisong@huawei.com>
> ---
> v2: fix commit description
> ---
> lib/ethdev/ethdev_pci.h | 13 +++++++++++++
> 1 file changed, 13 insertions(+)
>
> diff --git a/lib/ethdev/ethdev_pci.h b/lib/ethdev/ethdev_pci.h
> index 8edca82..429c4c7 100644
> --- a/lib/ethdev/ethdev_pci.h
> +++ b/lib/ethdev/ethdev_pci.h
> @@ -151,6 +151,19 @@ rte_eth_dev_pci_generic_remove(struct rte_pci_device *pci_dev,
> if (!eth_dev)
> return 0;
>
> + /*
> + * The eth_dev->data->name doesn't be cleared by the secondary process,
> + * so above "eth_dev" isn't NULL after rte_eth_dev_close() called.
> + * Namely, whether "eth_dev" is NULL cannot be used to determine whether
> + * an ethdev port has been released.
> + * For both primary process and secondary process, eth_dev->state is
> + * RTE_ETH_DEV_UNUSED, which means the ethdev port has been released.
> + */
> + if (eth_dev->state == RTE_ETH_DEV_UNUSED) {
> + RTE_ETHDEV_LOG(INFO, "The ethdev port has been released.");
> + return 0;
> + }
> +
> if (dev_uninit) {
> ret = dev_uninit(eth_dev);
> if (ret)
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [dpdk-dev] [RFC V2] ethdev: fix issue that dev close in PMD calls twice
2021-08-13 2:11 ` Huisong Li
@ 2021-08-13 6:12 ` Thomas Monjalon
2021-08-13 8:16 ` Huisong Li
0 siblings, 1 reply; 33+ messages in thread
From: Thomas Monjalon @ 2021-08-13 6:12 UTC (permalink / raw)
To: Huisong Li; +Cc: Yigit, Ferruh, Andrew Rybchenko, dev
13/08/2021 04:11, Huisong Li:
> Hi, all
>
> This patch can enhance the security of device uninstallation to
> eliminate dependency on user usage methods.
>
> Can you check this patch?
>
>
> 在 2021/8/3 10:30, Huisong Li 写道:
> > Ethernet devices in DPDK can be released by rte_eth_dev_close() and
> > rte_dev_remove(). These APIs both call xxx_dev_close() in PMD layer
> > to uninstall hardware. However, the two APIs do not have explicit
> > invocation restrictions. In other words, at the ethdev layer, it is
> > possible to call rte_eth_dev_close() before calling rte_dev_remove()
> > or rte_eal_hotplug_remove(). In such a bad scenario,
It is not a bad scenario.
If there is no more port for the device after calling close,
the device should be removed automatically.
Keep in mind "close" is for one port, "remove" is for the entire device
which can have more than one port.
> > the primary
> > process may be fine, but it may cause that xxx_dev_close() in the PMD
> > layer will be called twice in the secondary process. So this patch
> > fixes it.
If a port is closed in primary, it should be the same in secondary.
> > + /*
> > + * The eth_dev->data->name doesn't be cleared by the secondary process,
> > + * so above "eth_dev" isn't NULL after rte_eth_dev_close() called.
This assumption is not clear. All should be closed together.
> > + * Namely, whether "eth_dev" is NULL cannot be used to determine whether
> > + * an ethdev port has been released.
> > + * For both primary process and secondary process, eth_dev->state is
> > + * RTE_ETH_DEV_UNUSED, which means the ethdev port has been released.
> > + */
> > + if (eth_dev->state == RTE_ETH_DEV_UNUSED) {
> > + RTE_ETHDEV_LOG(INFO, "The ethdev port has been released.");
> > + return 0;
> > + }
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [dpdk-dev] [RFC V2] ethdev: fix issue that dev close in PMD calls twice
2021-08-13 6:12 ` Thomas Monjalon
@ 2021-08-13 8:16 ` Huisong Li
2021-08-18 11:24 ` Ferruh Yigit
0 siblings, 1 reply; 33+ messages in thread
From: Huisong Li @ 2021-08-13 8:16 UTC (permalink / raw)
To: Thomas Monjalon; +Cc: Yigit, Ferruh, Andrew Rybchenko, dev
在 2021/8/13 14:12, Thomas Monjalon 写道:
> 13/08/2021 04:11, Huisong Li:
>> Hi, all
>>
>> This patch can enhance the security of device uninstallation to
>> eliminate dependency on user usage methods.
>>
>> Can you check this patch?
>>
>>
>> 在 2021/8/3 10:30, Huisong Li 写道:
>>> Ethernet devices in DPDK can be released by rte_eth_dev_close() and
>>> rte_dev_remove(). These APIs both call xxx_dev_close() in PMD layer
>>> to uninstall hardware. However, the two APIs do not have explicit
>>> invocation restrictions. In other words, at the ethdev layer, it is
>>> possible to call rte_eth_dev_close() before calling rte_dev_remove()
>>> or rte_eal_hotplug_remove(). In such a bad scenario,
> It is not a bad scenario.
> If there is no more port for the device after calling close,
> the device should be removed automatically.
> Keep in mind "close" is for one port, "remove" is for the entire device
> which can have more than one port.
I know.
dev_close() is for removing an eth device. And rte_dev_remove() can be used
for removing the rte device and all its eth devices belonging to the rte
device.
In rte_dev_remove(), "remove" is executed in primary or one of secondary,
all eth devices having same pci address will be closed and removed.
>
>>> the primary
>>> process may be fine, but it may cause that xxx_dev_close() in the PMD
>>> layer will be called twice in the secondary process. So this patch
>>> fixes it.
> If a port is closed in primary, it should be the same in secondary.
>
>
>>> + /*
>>> + * The eth_dev->data->name doesn't be cleared by the secondary process,
>>> + * so above "eth_dev" isn't NULL after rte_eth_dev_close() called.
> This assumption is not clear. All should be closed together.
However, dev_close() does not have the feature similar to rte_dev_remove().
Namely, it is not guaranteed that all eth devices are closed together in
ethdev layer. It depends on app or user.
If the app does not close together, the operation of repeatedly
uninstalling an eth device in the secondary process
will be triggered when dev_close() is first called by one secondary
process, and then rte_dev_remove() is called.
So I think it should be avoided.
>
>>> + * Namely, whether "eth_dev" is NULL cannot be used to determine whether
>>> + * an ethdev port has been released.
>>> + * For both primary process and secondary process, eth_dev->state is
>>> + * RTE_ETH_DEV_UNUSED, which means the ethdev port has been released.
>>> + */
>>> + if (eth_dev->state == RTE_ETH_DEV_UNUSED) {
>>> + RTE_ETHDEV_LOG(INFO, "The ethdev port has been released.");
>>> + return 0;
>>> + }
>
>
> .
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [dpdk-dev] [RFC V1] ethdev: fix the issue that dev uninit may be called twice
2021-08-02 12:46 [dpdk-dev] [RFC V1] ethdev: fix the issue that dev uninit may be called twice Huisong Li
2021-08-03 2:30 ` [dpdk-dev] [RFC V2] ethdev: fix issue that dev close in PMD calls twice Huisong Li
@ 2021-08-18 9:47 ` Singh, Aman Deep
2021-08-24 2:10 ` Huisong Li
2021-10-08 8:21 ` [dpdk-dev] [PATCH] ethdev: fix eth device released repeatedly Huisong Li
` (3 subsequent siblings)
5 siblings, 1 reply; 33+ messages in thread
From: Singh, Aman Deep @ 2021-08-18 9:47 UTC (permalink / raw)
To: Huisong Li, dev; +Cc: thomas, ferruh.yigit
Hi Huison,
On 8/2/2021 6:16 PM, Huisong Li wrote:
> Ethernet devices in DPDK can be released by rte_eth_dev_close() and
> rte_dev_remove(). However, these two APIs do not have explicit invocation
> restrictions. In other words, at the ethdev layer, calling
> rte_eth_dev_close() and then rte_dev_remove() or rte_eal_hotplug_remove()
> is allowed. In such a bad scenario, the primary process may be fine, but it
> may cause that dev_unint() in the secondary process will be called twice,
Shouldn't dev_unint() for Secondary process, simply return with no-action.
> and even other serious problems. So this patch fixes it.
>
> Fixes: 99a2dd955fba ("lib: remove librte_ prefix from directory names")
>
> Signed-off-by: Huisong Li <lihuisong@huawei.com>
> ---
> lib/ethdev/ethdev_pci.h | 13 +++++++++++++
> 1 file changed, 13 insertions(+)
>
> diff --git a/lib/ethdev/ethdev_pci.h b/lib/ethdev/ethdev_pci.h
> index 8edca82..14a0e01 100644
> --- a/lib/ethdev/ethdev_pci.h
> +++ b/lib/ethdev/ethdev_pci.h
> @@ -151,6 +151,19 @@ rte_eth_dev_pci_generic_remove(struct rte_pci_device *pci_dev,
> if (!eth_dev)
> return 0;
>
> + /*
> + * The eth_dev->data->name doesn't be cleared by the secondary precess,
Can we reprase above sentence "doesn't be cleared "
> + * so above "eth_dev" isn't NULL after rte_eth_dev_close() called.
> + * Namely, whether "eth_dev" is NULL cannot be used to determine whether
> + * an ethdev port has been released.
> + * For both primary precess and secondary precess, eth_dev->state is
s/ precess / process
> + * RTE_ETH_DEV_UNUSED, which means the ethdev port has been released.
> + */
> + if (eth_dev->state == RTE_ETH_DEV_UNUSED) {
> + RTE_ETHDEV_LOG(INFO, "The ethdev port has been released.");
> + return 0;
> + }
> +
> if (dev_uninit) {
> ret = dev_uninit(eth_dev);
> if (ret)
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [dpdk-dev] [RFC V2] ethdev: fix issue that dev close in PMD calls twice
2021-08-13 8:16 ` Huisong Li
@ 2021-08-18 11:24 ` Ferruh Yigit
2021-08-19 3:45 ` Huisong Li
0 siblings, 1 reply; 33+ messages in thread
From: Ferruh Yigit @ 2021-08-18 11:24 UTC (permalink / raw)
To: Huisong Li, Thomas Monjalon
Cc: Andrew Rybchenko, dev, Anatoly Burakov, David Marchand
On 8/13/2021 9:16 AM, Huisong Li wrote:
>
> 在 2021/8/13 14:12, Thomas Monjalon 写道:
>> 13/08/2021 04:11, Huisong Li:
>>> Hi, all
>>>
>>> This patch can enhance the security of device uninstallation to
>>> eliminate dependency on user usage methods.
>>>
>>> Can you check this patch?
>>>
>>>
>>> 在 2021/8/3 10:30, Huisong Li 写道:
>>>> Ethernet devices in DPDK can be released by rte_eth_dev_close() and
>>>> rte_dev_remove(). These APIs both call xxx_dev_close() in PMD layer
>>>> to uninstall hardware. However, the two APIs do not have explicit
>>>> invocation restrictions. In other words, at the ethdev layer, it is
>>>> possible to call rte_eth_dev_close() before calling rte_dev_remove()
>>>> or rte_eal_hotplug_remove(). In such a bad scenario,
>> It is not a bad scenario.
>> If there is no more port for the device after calling close,
>> the device should be removed automatically.
>> Keep in mind "close" is for one port, "remove" is for the entire device
>> which can have more than one port.
>
> I know.
>
> dev_close() is for removing an eth device. And rte_dev_remove() can be used
>
> for removing the rte device and all its eth devices belonging to the rte device.
>
> In rte_dev_remove(), "remove" is executed in primary or one of secondary,
>
> all eth devices having same pci address will be closed and removed.
>
>>
>>>> the primary
>>>> process may be fine, but it may cause that xxx_dev_close() in the PMD
>>>> layer will be called twice in the secondary process. So this patch
>>>> fixes it.
>> If a port is closed in primary, it should be the same in secondary.
>>
>>
>>>> + /*
>>>> + * The eth_dev->data->name doesn't be cleared by the secondary process,
>>>> + * so above "eth_dev" isn't NULL after rte_eth_dev_close() called.
>> This assumption is not clear. All should be closed together.
>
> However, dev_close() does not have the feature similar to rte_dev_remove().
>
> Namely, it is not guaranteed that all eth devices are closed together in ethdev
> layer. It depends on app or user.
>
> If the app does not close together, the operation of repeatedly uninstalling an
> eth device in the secondary process
>
> will be triggered when dev_close() is first called by one secondary process, and
> then rte_dev_remove() is called.
>
> So I think it should be avoided.
First of all, I am not sure about calling 'rte_eth_dev_close()' or
'rte_dev_remove()' from the secondary process.
There are explicit checks in various locations to prevent clearing resources
completely from secondary process.
Calling 'rte_eth_dev_close()' or 'rte_dev_remove()' by secondary is technically
can be done but application needs to be extra cautious and should take extra
measures and synchronization to make it work.
Regular use-case is secondary processes do the packet processing and all control
commands run by primary.
In primary, if you call 'rte_eth_dev_close()' it will clear all ethdev resources
and further 'rte_dev_remove()' call will detect missing ethdev resources and
won't try to clear them again.
In secondary, if you call 'rte_eth_dev_close()', it WON'T clear all resources
and further 'rte_dev_remove()' call (either from primary or secondary) will try
to clean ethdev resources again. You are trying to prevent this retry in remove
happening for secondary process.
In secondary it won't free ethdev resources anyway if you let it continue, but I
guess here you are trying to prevent the PMD dev_close() called again. Why? Is
it just for optimization or does it cause unexpected behavior in the PMD?
Overall, to free resources you need to do the 'rte_eth_dev_close()' or
'rte_dev_remove()' in the primary anyway. So instead of this workaround, I would
suggest making PMD dev_close() safe to be called multiple times (if this is the
problem.)
And again, please re-consider calling 'rte_eth_dev_close()' or
'rte_dev_remove()' from the secondary process.
>
>>
>>>> + * Namely, whether "eth_dev" is NULL cannot be used to determine whether
>>>> + * an ethdev port has been released.
>>>> + * For both primary process and secondary process, eth_dev->state is
>>>> + * RTE_ETH_DEV_UNUSED, which means the ethdev port has been released.
>>>> + */
>>>> + if (eth_dev->state == RTE_ETH_DEV_UNUSED) {
>>>> + RTE_ETHDEV_LOG(INFO, "The ethdev port has been released.");
>>>> + return 0;
>>>> + }
>>
>>
>> .
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [dpdk-dev] [RFC V2] ethdev: fix issue that dev close in PMD calls twice
2021-08-18 11:24 ` Ferruh Yigit
@ 2021-08-19 3:45 ` Huisong Li
2021-08-24 14:42 ` Ferruh Yigit
0 siblings, 1 reply; 33+ messages in thread
From: Huisong Li @ 2021-08-19 3:45 UTC (permalink / raw)
To: Ferruh Yigit, Thomas Monjalon
Cc: Andrew Rybchenko, dev, Anatoly Burakov, David Marchand
在 2021/8/18 19:24, Ferruh Yigit 写道:
> On 8/13/2021 9:16 AM, Huisong Li wrote:
>> 在 2021/8/13 14:12, Thomas Monjalon 写道:
>>> 13/08/2021 04:11, Huisong Li:
>>>> Hi, all
>>>>
>>>> This patch can enhance the security of device uninstallation to
>>>> eliminate dependency on user usage methods.
>>>>
>>>> Can you check this patch?
>>>>
>>>>
>>>> 在 2021/8/3 10:30, Huisong Li 写道:
>>>>> Ethernet devices in DPDK can be released by rte_eth_dev_close() and
>>>>> rte_dev_remove(). These APIs both call xxx_dev_close() in PMD layer
>>>>> to uninstall hardware. However, the two APIs do not have explicit
>>>>> invocation restrictions. In other words, at the ethdev layer, it is
>>>>> possible to call rte_eth_dev_close() before calling rte_dev_remove()
>>>>> or rte_eal_hotplug_remove(). In such a bad scenario,
>>> It is not a bad scenario.
>>> If there is no more port for the device after calling close,
>>> the device should be removed automatically.
>>> Keep in mind "close" is for one port, "remove" is for the entire device
>>> which can have more than one port.
>> I know.
>>
>> dev_close() is for removing an eth device. And rte_dev_remove() can be used
>>
>> for removing the rte device and all its eth devices belonging to the rte device.
>>
>> In rte_dev_remove(), "remove" is executed in primary or one of secondary,
>>
>> all eth devices having same pci address will be closed and removed.
>>
>>>>> the primary
>>>>> process may be fine, but it may cause that xxx_dev_close() in the PMD
>>>>> layer will be called twice in the secondary process. So this patch
>>>>> fixes it.
>>> If a port is closed in primary, it should be the same in secondary.
>>>
>>>
>>>>> + /*
>>>>> + * The eth_dev->data->name doesn't be cleared by the secondary process,
>>>>> + * so above "eth_dev" isn't NULL after rte_eth_dev_close() called.
>>> This assumption is not clear. All should be closed together.
>> However, dev_close() does not have the feature similar to rte_dev_remove().
>>
>> Namely, it is not guaranteed that all eth devices are closed together in ethdev
>> layer. It depends on app or user.
>>
>> If the app does not close together, the operation of repeatedly uninstalling an
>> eth device in the secondary process
>>
>> will be triggered when dev_close() is first called by one secondary process, and
>> then rte_dev_remove() is called.
>>
>> So I think it should be avoided.
> First of all, I am not sure about calling 'rte_eth_dev_close()' or
> 'rte_dev_remove()' from the secondary process.
> There are explicit checks in various locations to prevent clearing resources
> completely from secondary process.
There's no denying that.
Generally, hardware resources of eth device and shared data of the
primary and secondary process
are cleared by primary, which are controled by ethdev layer or PMD layer.
But there may be some private data or resources of each process (primary
or secondary ), such as mp action
registered by rte_mp_action_register() or others. For these resources,
the secondary process still needs to clear.
Namely, both primary and secondary processes need to prevent repeated
offloading of resources.
>
> Calling 'rte_eth_dev_close()' or 'rte_dev_remove()' by secondary is technically
> can be done but application needs to be extra cautious and should take extra
> measures and synchronization to make it work.
> Regular use-case is secondary processes do the packet processing and all control
> commands run by primary.
You are right. We have a consensus that 'rte_eth_dev_close()' or
'rte_dev_remove()'
can be called by primary and secondary processes.
But DPDK framework cannot assume user behavior.😁
We need to make it more secure and reliable for both primary and
secondary processes.
>
> In primary, if you call 'rte_eth_dev_close()' it will clear all ethdev resources
> and further 'rte_dev_remove()' call will detect missing ethdev resources and
> won't try to clear them again.
>
> In secondary, if you call 'rte_eth_dev_close()', it WON'T clear all resources
> and further 'rte_dev_remove()' call (either from primary or secondary) will try
> to clean ethdev resources again. You are trying to prevent this retry in remove
> happening for secondary process.
Right. However, if secondary process in PMD layer has its own private
resources to be
cleared, it still need to do it by calling 'rte_eth_dev_close()' or
'rte_dev_remove()'.
> In secondary it won't free ethdev resources anyway if you let it continue, but I
> guess here you are trying to prevent the PMD dev_close() called again. Why? Is
> it just for optimization or does it cause unexpected behavior in the PMD?
>
>
> Overall, to free resources you need to do the 'rte_eth_dev_close()' or
> 'rte_dev_remove()' in the primary anyway. So instead of this workaround, I would
> suggest making PMD dev_close() safe to be called multiple times (if this is the
> problem.)
In conclusion, primary and secondary processes in PMD layer may have
their own
private data and resources, which need to be processed and released.
Currently, these for PMD are either handled and cleaned up in
dev_close() or remove().
However, code streams in rte_dev_remove() cannot ensure that the
uninstallation
from secondary process will not be repeated if rte_eth_dev_close() is
first called by
secondary(primary is ok, plese review this patch).
I think, this is the same for each PMD and is better suited to doing it
in ethdev layer.
>
> And again, please re-consider calling 'rte_eth_dev_close()' or
> 'rte_dev_remove()' from the secondary process.
>
>>>>> + * Namely, whether "eth_dev" is NULL cannot be used to determine whether
>>>>> + * an ethdev port has been released.
>>>>> + * For both primary process and secondary process, eth_dev->state is
>>>>> + * RTE_ETH_DEV_UNUSED, which means the ethdev port has been released.
>>>>> + */
>>>>> + if (eth_dev->state == RTE_ETH_DEV_UNUSED) {
>>>>> + RTE_ETHDEV_LOG(INFO, "The ethdev port has been released.");
>>>>> + return 0;
>>>>> + }
>>>
>>> .
> .
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [dpdk-dev] [RFC V1] ethdev: fix the issue that dev uninit may be called twice
2021-08-18 9:47 ` [dpdk-dev] [RFC V1] ethdev: fix the issue that dev uninit may be called twice Singh, Aman Deep
@ 2021-08-24 2:10 ` Huisong Li
0 siblings, 0 replies; 33+ messages in thread
From: Huisong Li @ 2021-08-24 2:10 UTC (permalink / raw)
To: Singh, Aman Deep, dev; +Cc: thomas, ferruh.yigit
Hi, Singh, Aman Deep
Sorry, I missed your review. Thank you for your review.😁
在 2021/8/18 17:47, Singh, Aman Deep 写道:
>
> Hi Huison,
>
> On 8/2/2021 6:16 PM, Huisong Li wrote:
>> Ethernet devices in DPDK can be released by rte_eth_dev_close() and
>> rte_dev_remove(). However, these two APIs do not have explicit invocation
>> restrictions. In other words, at the ethdev layer, calling
>> rte_eth_dev_close() and then rte_dev_remove() or rte_eal_hotplug_remove()
>> is allowed. In such a bad scenario, the primary process may be fine, but it
>> may cause that dev_unint() in the secondary process will be called twice,
> Shouldn't dev_unint() for Secondary process, simply return with no-action.
The prerequisite is that the secondary process does not have any
resources that need to be released.
However, some resources from secondary process may also need to be
released. For example, mp action
registered by rte_mp_action_register() is used to for multi-process
communication. It should be unloaded
when all eth devices driven by one PMD in a process are removed. In
order to achieve the above purpose,
secondary process may have data recording the number of device to decide
when to deregister the action.
Of course, this is just the case. In short, secondary process may have
its own private data or resources to
be released.
It is mentioned in RFC v2. Please go to discussion line of RFC v2.
>> and even other serious problems. So this patch fixes it.
>>
>> Fixes: 99a2dd955fba ("lib: remove librte_ prefix from directory names")
>>
>> Signed-off-by: Huisong Li<lihuisong@huawei.com>
>> ---
>> lib/ethdev/ethdev_pci.h | 13 +++++++++++++
>> 1 file changed, 13 insertions(+)
>>
>> diff --git a/lib/ethdev/ethdev_pci.h b/lib/ethdev/ethdev_pci.h
>> index 8edca82..14a0e01 100644
>> --- a/lib/ethdev/ethdev_pci.h
>> +++ b/lib/ethdev/ethdev_pci.h
>> @@ -151,6 +151,19 @@ rte_eth_dev_pci_generic_remove(struct rte_pci_device *pci_dev,
>> if (!eth_dev)
>> return 0;
>>
>> + /*
>> + * The eth_dev->data->name doesn't be cleared by the secondary precess,
> Can we reprase above sentence "doesn't be cleared "
ok
>> + * so above "eth_dev" isn't NULL after rte_eth_dev_close() called.
>> + * Namely, whether "eth_dev" is NULL cannot be used to determine whether
>> + * an ethdev port has been released.
>> + * For both primary precess and secondary precess, eth_dev->state is
> s/ precess / process
Thanks. RFC v2 has corrected it.
>> + * RTE_ETH_DEV_UNUSED, which means the ethdev port has been released.
>> + */
>> + if (eth_dev->state == RTE_ETH_DEV_UNUSED) {
>> + RTE_ETHDEV_LOG(INFO, "The ethdev port has been released.");
>> + return 0;
>> + }
>> +
>> if (dev_uninit) {
>> ret = dev_uninit(eth_dev);
>> if (ret)
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [dpdk-dev] [RFC V2] ethdev: fix issue that dev close in PMD calls twice
2021-08-19 3:45 ` Huisong Li
@ 2021-08-24 14:42 ` Ferruh Yigit
2021-08-25 9:53 ` Huisong Li
0 siblings, 1 reply; 33+ messages in thread
From: Ferruh Yigit @ 2021-08-24 14:42 UTC (permalink / raw)
To: Huisong Li, Thomas Monjalon
Cc: Andrew Rybchenko, dev, Anatoly Burakov, David Marchand
On 8/19/2021 4:45 AM, Huisong Li wrote:
> 在 2021/8/18 19:24, Ferruh Yigit 写道:
>> On 8/13/2021 9:16 AM, Huisong Li wrote:
>>> 在 2021/8/13 14:12, Thomas Monjalon 写道:
>>>> 13/08/2021 04:11, Huisong Li:
>>>>> Hi, all
>>>>>
>>>>> This patch can enhance the security of device uninstallation to
>>>>> eliminate dependency on user usage methods.
>>>>>
>>>>> Can you check this patch?
>>>>>
>>>>>
>>>>> 在 2021/8/3 10:30, Huisong Li 写道:
>>>>>> Ethernet devices in DPDK can be released by rte_eth_dev_close() and
>>>>>> rte_dev_remove(). These APIs both call xxx_dev_close() in PMD layer
>>>>>> to uninstall hardware. However, the two APIs do not have explicit
>>>>>> invocation restrictions. In other words, at the ethdev layer, it is
>>>>>> possible to call rte_eth_dev_close() before calling rte_dev_remove()
>>>>>> or rte_eal_hotplug_remove(). In such a bad scenario,
>>>> It is not a bad scenario.
>>>> If there is no more port for the device after calling close,
>>>> the device should be removed automatically.
>>>> Keep in mind "close" is for one port, "remove" is for the entire device
>>>> which can have more than one port.
>>> I know.
>>>
>>> dev_close() is for removing an eth device. And rte_dev_remove() can be used
>>>
>>> for removing the rte device and all its eth devices belonging to the rte device.
>>>
>>> In rte_dev_remove(), "remove" is executed in primary or one of secondary,
>>>
>>> all eth devices having same pci address will be closed and removed.
>>>
>>>>>> the primary
>>>>>> process may be fine, but it may cause that xxx_dev_close() in the PMD
>>>>>> layer will be called twice in the secondary process. So this patch
>>>>>> fixes it.
>>>> If a port is closed in primary, it should be the same in secondary.
>>>>
>>>>
>>>>>> + /*
>>>>>> + * The eth_dev->data->name doesn't be cleared by the secondary process,
>>>>>> + * so above "eth_dev" isn't NULL after rte_eth_dev_close() called.
>>>> This assumption is not clear. All should be closed together.
>>> However, dev_close() does not have the feature similar to rte_dev_remove().
>>>
>>> Namely, it is not guaranteed that all eth devices are closed together in ethdev
>>> layer. It depends on app or user.
>>>
>>> If the app does not close together, the operation of repeatedly uninstalling an
>>> eth device in the secondary process
>>>
>>> will be triggered when dev_close() is first called by one secondary process, and
>>> then rte_dev_remove() is called.
>>>
>>> So I think it should be avoided.
>> First of all, I am not sure about calling 'rte_eth_dev_close()' or
>> 'rte_dev_remove()' from the secondary process.
>> There are explicit checks in various locations to prevent clearing resources
>> completely from secondary process.
>
> There's no denying that.
>
> Generally, hardware resources of eth device and shared data of the primary and
> secondary process
>
> are cleared by primary, which are controled by ethdev layer or PMD layer.
>
> But there may be some private data or resources of each process (primary or
> secondary ), such as mp action
>
> registered by rte_mp_action_register() or others. For these resources, the
> secondary process still needs to clear.
>
> Namely, both primary and secondary processes need to prevent repeated offloading
> of resources.
>
>>
>> Calling 'rte_eth_dev_close()' or 'rte_dev_remove()' by secondary is technically
>> can be done but application needs to be extra cautious and should take extra
>> measures and synchronization to make it work.
>> Regular use-case is secondary processes do the packet processing and all control
>> commands run by primary.
>
> You are right. We have a consensus that 'rte_eth_dev_close()' or 'rte_dev_remove()'
>
> can be called by primary and secondary processes.
>
> But DPDK framework cannot assume user behavior.😁
>
> We need to make it more secure and reliable for both primary and secondary
> processes.
>
>>
>> In primary, if you call 'rte_eth_dev_close()' it will clear all ethdev resources
>> and further 'rte_dev_remove()' call will detect missing ethdev resources and
>> won't try to clear them again.
>>
>> In secondary, if you call 'rte_eth_dev_close()', it WON'T clear all resources
>> and further 'rte_dev_remove()' call (either from primary or secondary) will try
>> to clean ethdev resources again. You are trying to prevent this retry in remove
>> happening for secondary process.
>
> Right. However, if secondary process in PMD layer has its own private resources
> to be
>
> cleared, it still need to do it by calling 'rte_eth_dev_close()' or
> 'rte_dev_remove()'.
>
>> In secondary it won't free ethdev resources anyway if you let it continue, but I
>> guess here you are trying to prevent the PMD dev_close() called again. Why? Is
>> it just for optimization or does it cause unexpected behavior in the PMD?
>>
>>
>> Overall, to free resources you need to do the 'rte_eth_dev_close()' or
>> 'rte_dev_remove()' in the primary anyway. So instead of this workaround, I would
>> suggest making PMD dev_close() safe to be called multiple times (if this is the
>> problem.)
>
> In conclusion, primary and secondary processes in PMD layer may have their own
>
> private data and resources, which need to be processed and released.
>
> Currently, these for PMD are either handled and cleaned up in dev_close() or
> remove().
>
> However, code streams in rte_dev_remove() cannot ensure that the uninstallation
>
> from secondary process will not be repeated if rte_eth_dev_close() is first
> called by
>
> secondary(primary is ok, plese review this patch).
>
> I think, this is the same for each PMD and is better suited to doing it in
> ethdev layer.
>
This patch prevents to call dev_close() twice in the secondary process, is this
fixing a theoretical problem or an actual problem?
If it is an actual problem can you please provide details, callstack of the
problematic case?
>>
>> And again, please re-consider calling 'rte_eth_dev_close()' or
>> 'rte_dev_remove()' from the secondary process.
>>
>>>>>> + * Namely, whether "eth_dev" is NULL cannot be used to determine whether
>>>>>> + * an ethdev port has been released.
>>>>>> + * For both primary process and secondary process, eth_dev->state is
>>>>>> + * RTE_ETH_DEV_UNUSED, which means the ethdev port has been released.
>>>>>> + */
>>>>>> + if (eth_dev->state == RTE_ETH_DEV_UNUSED) {
>>>>>> + RTE_ETHDEV_LOG(INFO, "The ethdev port has been released.");
>>>>>> + return 0;
>>>>>> + }
>>>>
>>>> .
>> .
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [dpdk-dev] [RFC V2] ethdev: fix issue that dev close in PMD calls twice
2021-08-24 14:42 ` Ferruh Yigit
@ 2021-08-25 9:53 ` Huisong Li
2021-09-04 1:23 ` Huisong Li
` (2 more replies)
0 siblings, 3 replies; 33+ messages in thread
From: Huisong Li @ 2021-08-25 9:53 UTC (permalink / raw)
To: Ferruh Yigit, Thomas Monjalon
Cc: Andrew Rybchenko, dev, Anatoly Burakov, David Marchand
在 2021/8/24 22:42, Ferruh Yigit 写道:
> On 8/19/2021 4:45 AM, Huisong Li wrote:
>> 在 2021/8/18 19:24, Ferruh Yigit 写道:
>>> On 8/13/2021 9:16 AM, Huisong Li wrote:
>>>> 在 2021/8/13 14:12, Thomas Monjalon 写道:
>>>>> 13/08/2021 04:11, Huisong Li:
>>>>>> Hi, all
>>>>>>
>>>>>> This patch can enhance the security of device uninstallation to
>>>>>> eliminate dependency on user usage methods.
>>>>>>
>>>>>> Can you check this patch?
>>>>>>
>>>>>>
>>>>>> 在 2021/8/3 10:30, Huisong Li 写道:
>>>>>>> Ethernet devices in DPDK can be released by rte_eth_dev_close() and
>>>>>>> rte_dev_remove(). These APIs both call xxx_dev_close() in PMD layer
>>>>>>> to uninstall hardware. However, the two APIs do not have explicit
>>>>>>> invocation restrictions. In other words, at the ethdev layer, it is
>>>>>>> possible to call rte_eth_dev_close() before calling rte_dev_remove()
>>>>>>> or rte_eal_hotplug_remove(). In such a bad scenario,
>>>>> It is not a bad scenario.
>>>>> If there is no more port for the device after calling close,
>>>>> the device should be removed automatically.
>>>>> Keep in mind "close" is for one port, "remove" is for the entire device
>>>>> which can have more than one port.
>>>> I know.
>>>>
>>>> dev_close() is for removing an eth device. And rte_dev_remove() can be used
>>>>
>>>> for removing the rte device and all its eth devices belonging to the rte device.
>>>>
>>>> In rte_dev_remove(), "remove" is executed in primary or one of secondary,
>>>>
>>>> all eth devices having same pci address will be closed and removed.
>>>>
>>>>>>> the primary
>>>>>>> process may be fine, but it may cause that xxx_dev_close() in the PMD
>>>>>>> layer will be called twice in the secondary process. So this patch
>>>>>>> fixes it.
>>>>> If a port is closed in primary, it should be the same in secondary.
>>>>>
>>>>>
>>>>>>> + /*
>>>>>>> + * The eth_dev->data->name doesn't be cleared by the secondary process,
>>>>>>> + * so above "eth_dev" isn't NULL after rte_eth_dev_close() called.
>>>>> This assumption is not clear. All should be closed together.
>>>> However, dev_close() does not have the feature similar to rte_dev_remove().
>>>>
>>>> Namely, it is not guaranteed that all eth devices are closed together in ethdev
>>>> layer. It depends on app or user.
>>>>
>>>> If the app does not close together, the operation of repeatedly uninstalling an
>>>> eth device in the secondary process
>>>>
>>>> will be triggered when dev_close() is first called by one secondary process, and
>>>> then rte_dev_remove() is called.
>>>>
>>>> So I think it should be avoided.
>>> First of all, I am not sure about calling 'rte_eth_dev_close()' or
>>> 'rte_dev_remove()' from the secondary process.
>>> There are explicit checks in various locations to prevent clearing resources
>>> completely from secondary process.
>> There's no denying that.
>>
>> Generally, hardware resources of eth device and shared data of the primary and
>> secondary process
>>
>> are cleared by primary, which are controled by ethdev layer or PMD layer.
>>
>> But there may be some private data or resources of each process (primary or
>> secondary ), such as mp action
>>
>> registered by rte_mp_action_register() or others. For these resources, the
>> secondary process still needs to clear.
>>
>> Namely, both primary and secondary processes need to prevent repeated offloading
>> of resources.
>>
>>> Calling 'rte_eth_dev_close()' or 'rte_dev_remove()' by secondary is technically
>>> can be done but application needs to be extra cautious and should take extra
>>> measures and synchronization to make it work.
>>> Regular use-case is secondary processes do the packet processing and all control
>>> commands run by primary.
>> You are right. We have a consensus that 'rte_eth_dev_close()' or 'rte_dev_remove()'
>>
>> can be called by primary and secondary processes.
>>
>> But DPDK framework cannot assume user behavior.😁
>>
>> We need to make it more secure and reliable for both primary and secondary
>> processes.
>>
>>> In primary, if you call 'rte_eth_dev_close()' it will clear all ethdev resources
>>> and further 'rte_dev_remove()' call will detect missing ethdev resources and
>>> won't try to clear them again.
>>>
>>> In secondary, if you call 'rte_eth_dev_close()', it WON'T clear all resources
>>> and further 'rte_dev_remove()' call (either from primary or secondary) will try
>>> to clean ethdev resources again. You are trying to prevent this retry in remove
>>> happening for secondary process.
>> Right. However, if secondary process in PMD layer has its own private resources
>> to be
>>
>> cleared, it still need to do it by calling 'rte_eth_dev_close()' or
>> 'rte_dev_remove()'.
>>
>>> In secondary it won't free ethdev resources anyway if you let it continue, but I
>>> guess here you are trying to prevent the PMD dev_close() called again. Why? Is
>>> it just for optimization or does it cause unexpected behavior in the PMD?
>>>
>>>
>>> Overall, to free resources you need to do the 'rte_eth_dev_close()' or
>>> 'rte_dev_remove()' in the primary anyway. So instead of this workaround, I would
>>> suggest making PMD dev_close() safe to be called multiple times (if this is the
>>> problem.)
>> In conclusion, primary and secondary processes in PMD layer may have their own
>>
>> private data and resources, which need to be processed and released.
>>
>> Currently, these for PMD are either handled and cleaned up in dev_close() or
>> remove().
>>
>> However, code streams in rte_dev_remove() cannot ensure that the uninstallation
>>
>> from secondary process will not be repeated if rte_eth_dev_close() is first
>> called by
>>
>> secondary(primary is ok, plese review this patch).
>>
>> I think, this is the same for each PMD and is better suited to doing it in
>> ethdev layer.
>>
> This patch prevents to call dev_close() twice in the secondary process, is this
> fixing a theoretical problem or an actual problem?
>
> If it is an actual problem can you please provide details, callstack of the
> problematic case?
We analyzed the code when modifying the bug and found that the problem
did exist.
The ethdev layer did not guarantee the security.
The general function of the two interfaces is as follows:
rte_eth_dev_close() --> release eth device
rte_dev_remove() --> release eth device + remove and free
rte_pci_device(primary andsecondary).
According to the OVS application scenario, first call dev_close() and
then call
remove(), which is possible.
We constructed this scenario using testpmd to start the secondary
process(the multi-process
patch of testpmd is being uploaded.). It is proved that the ethdev layer
cannot guarantee
this security. The callstack is as follows:
**************************************************************************************************************************************************
gdb) set args -a 0000:7d:00.0 -l 2-3 --file-prefix=rte_lee
--proc-type=auto -- -i --rxq 4 --txq 4 --num-procs=2 --proc-id=1
(gdb) b hns3_dev_uninit
Breakpoint 1 at 0xbca7d0: file ../drivers/net/hns3/hns3_ethdev.c, line 7806.
(gdb) b hns3_dev_close
Breakpoint 2 at 0xbc6d44: file ../drivers/net/hns3/hns3_ethdev.c, line 6189.
(gdb) r
Starting program:
/home/lihuisong/v500/gcov-test/dpdk/build/app/dpdk-testpmd -a
0000:7d:00.0 -l 2-3 --file-prefix=rte_lee --proc-type=auto -- -i --rxq 4
--txq 4 --num-procs=2 --proc-id=1
Missing separate debuginfo for /root/lib/libnuma.so.1
Try: yum --enablerepo='*debug*' install
/usr/lib/debug/.build-id/ce/4eea0b0f2150f70a080bbd8835e43e78373096.debug
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
EAL: Detected 128 lcore(s)
EAL: Detected 4 NUMA nodes
EAL: Auto-detected process type: SECONDARY
EAL: Detected static linkage of DPDK
[New Thread 0xfffff7a0ad10 (LWP 17717)]
EAL: Multi-process socket /var/run/dpdk/rte_lee/mp_socket_17714_66aa8d3a60a9
[New Thread 0xfffff7209d10 (LWP 17718)]
EAL: Selected IOVA mode 'VA'
EAL: VFIO support initialized
[New Thread 0xfffff69f8d10 (LWP 17719)]
EAL: Using IOMMU type 1 (Type 1)
EAL: Probe PCI driver: net_hns3 (19e5:a222) device: 0000:7d:00.0 (socket 0)
[New Thread 0xfffff61f7d10 (LWP 17720)]
TELEMETRY: No legacy callbacks, legacy socket not created
Interactive-mode selected
0000:7d:00.0 hns3_dev_mtu_set(): Failed to set mtu, port 0 must be
stopped before configuration
Failed to set MTU to 1500 for port 0
testpmd: create a new mbuf pool <mb_pool_0>: n=155456, size=2176, socket=0
testpmd: preferred mempool ops selected: ring_mp_mc
Warning! port-topology=paired and odd forward ports number, the last
port will pair with itself.
Configuring Port 0 (socket 0)
Port 0: 00:18:2D:00:00:9E
Checking link statuses...
Done
testpmd> port close 0
Closing ports...
Breakpoint 2, hns3_dev_close (eth_dev=0x21cdb80 <rte_eth_devices>)
at ../drivers/net/hns3/hns3_ethdev.c:6189
6189 struct hns3_adapter *hns = eth_dev->data->dev_private;
Missing separate debuginfos, use: debuginfo-install
glibc-2.17-260.el7.aarch64 libpcap-1.5.3-11.el7.aarch64
openssl-libs-1.0.2k-16.el7.aarch64 zlib-1.2.7-18.el7.aarch64
(gdb) bt
#0 hns3_dev_close (eth_dev=0x21cdb80 <rte_eth_devices>) at
../drivers/net/hns3/hns3_ethdev.c:6189
#1 0x0000000000742eac in rte_eth_dev_close (port_id=0) at
../lib/ethdev/rte_ethdev.c:1870
#2 0x0000000000542a8c in close_port (pid=0) at
../app/test-pmd/testpmd.c:2895
#3 0x00000000004df82c in cmd_operate_specific_port_parsed
(parsed_result=0xffffffffb0f0,
cl=0x2475080, data=0x0) at ../app/test-pmd/cmdline.c:1272
#4 0x0000000000738ce4 in cmdline_parse (cl=0x2475080, buf=0x24750c8
"port close 0\n")
at ../lib/cmdline/cmdline_parse.c:290
#5 0x0000000000736b7c in cmdline_valid_buffer (rdl=0x2475090,
buf=0x24750c8 "port close 0\n",
size=14) at ../lib/cmdline/cmdline.c:26
#6 0x000000000073c078 in rdline_char_in (rdl=0x2475090, c=10 '\n')
at ../lib/cmdline/cmdline_rdline.c:421
#7 0x0000000000736fcc in cmdline_in (cl=0x2475080, buf=0xfffffffff25f
"\n\200\362\377\377\377\377",
size=1) at ../lib/cmdline/cmdline.c:149
#8 0x0000000000737270 in cmdline_interact (cl=0x2475080) at
../lib/cmdline/cmdline.c:223
#9 0x00000000004f0ccc in prompt () at ../app/test-pmd/cmdline.c:17882
#10 0x0000000000545528 in main (argc=8, argv=0xfffffffff470) at
../app/test-pmd/testpmd.c:3998
(gdb) c
Continuing.
Port 0 is closed
Done
testpmd> device detach 0000:7d:00.0
Removing a device...
[Switching to Thread 0xfffff7a0ad10 (LWP 17717)]
Breakpoint 1, hns3_dev_uninit (eth_dev=0x21cdb80 <rte_eth_devices>)
at ../drivers/net/hns3/hns3_ethdev.c:7806
7806 struct hns3_adapter *hns = eth_dev->data->dev_private;
(gdb) bt
#0 hns3_dev_uninit (eth_dev=0x21cdb80 <rte_eth_devices>) at
../drivers/net/hns3/hns3_ethdev.c:7806
#1 0x0000000000bb8668 in rte_eth_dev_pci_generic_remove
(pci_dev=0x247a600,
dev_uninit=0xbca7c4 <hns3_dev_uninit>) at
../lib/ethdev/ethdev_pci.h:155
#2 0x0000000000bca89c in eth_hns3_pci_remove (pci_dev=0x247a600)
at ../drivers/net/hns3/hns3_ethdev.c:7833
#3 0x00000000007e85f4 in rte_pci_detach_dev (dev=0x247a600) at
../drivers/bus/pci/pci_common.c:287
#4 0x00000000007e8f14 in pci_unplug (dev=0x247a610) at
../drivers/bus/pci/pci_common.c:570
#5 0x0000000000775678 in local_dev_remove (dev=0x247a610) at
../lib/eal/common/eal_common_dev.c:319
#6 0x000000000078f114 in __handle_primary_request (param=0xfffff00008c0)
at ../lib/eal/common/hotplug_mp.c:284
#7 0x000000000079ebf4 in eal_alarm_callback (arg=0x0) at
../lib/eal/linux/eal_alarm.c:92
#8 0x00000000007a3f30 in eal_intr_process_interrupts
(events=0xfffff7a0a3e0, nfds=1)
at ../lib/eal/linux/eal_interrupts.c:998
#9 0x00000000007a4224 in eal_intr_handle_interrupts (pfd=10, totalfds=2)
at ../lib/eal/linux/eal_interrupts.c:1071
#10 0x00000000007a442c in eal_intr_thread_main (arg=0x0) at
../lib/eal/linux/eal_interrupts.c:1142
#11 0x0000000000789388 in ctrl_thread_init (arg=0x246ef40)
at ../lib/eal/common/eal_common_thread.c:202
#12 0x0000fffff7bf3c48 in start_thread () from /lib64/libpthread.so.0
#13 0x0000fffff7b45600 in thread_start () from /lib64/libc.so.6
**************************************************************************************************************************************************
Note: above log is from the secondary process.
>>> And again, please re-consider calling 'rte_eth_dev_close()' or
>>> 'rte_dev_remove()' from the secondary process.
>>>
>>>>>>> + * Namely, whether "eth_dev" is NULL cannot be used to determine whether
>>>>>>> + * an ethdev port has been released.
>>>>>>> + * For both primary process and secondary process, eth_dev->state is
>>>>>>> + * RTE_ETH_DEV_UNUSED, which means the ethdev port has been released.
>>>>>>> + */
>>>>>>> + if (eth_dev->state == RTE_ETH_DEV_UNUSED) {
>>>>>>> + RTE_ETHDEV_LOG(INFO, "The ethdev port has been released.");
>>>>>>> + return 0;
>>>>>>> + }
>>>>> .
>>> .
> .
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [dpdk-dev] [RFC V2] ethdev: fix issue that dev close in PMD calls twice
2021-08-25 9:53 ` Huisong Li
@ 2021-09-04 1:23 ` Huisong Li
2021-09-18 3:31 ` Huisong Li
2021-09-20 14:07 ` Ferruh Yigit
2 siblings, 0 replies; 33+ messages in thread
From: Huisong Li @ 2021-09-04 1:23 UTC (permalink / raw)
To: Ferruh Yigit
Cc: Andrew Rybchenko, dev, Anatoly Burakov, David Marchand, Singh,
Aman Deep, Thomas Monjalon
Hi, Ferruh
The callstack information for this problem has been sent. Please check it.
Looking forward to your reply.
Thanks.
在 2021/8/25 17:53, Huisong Li 写道:
>
> 在 2021/8/24 22:42, Ferruh Yigit 写道:
>> On 8/19/2021 4:45 AM, Huisong Li wrote:
>>> 在 2021/8/18 19:24, Ferruh Yigit 写道:
>>>> On 8/13/2021 9:16 AM, Huisong Li wrote:
>>>>> 在 2021/8/13 14:12, Thomas Monjalon 写道:
>>>>>> 13/08/2021 04:11, Huisong Li:
>>>>>>> Hi, all
>>>>>>>
>>>>>>> This patch can enhance the security of device uninstallation to
>>>>>>> eliminate dependency on user usage methods.
>>>>>>>
>>>>>>> Can you check this patch?
>>>>>>>
>>>>>>>
>>>>>>> 在 2021/8/3 10:30, Huisong Li 写道:
>>>>>>>> Ethernet devices in DPDK can be released by rte_eth_dev_close()
>>>>>>>> and
>>>>>>>> rte_dev_remove(). These APIs both call xxx_dev_close() in PMD
>>>>>>>> layer
>>>>>>>> to uninstall hardware. However, the two APIs do not have explicit
>>>>>>>> invocation restrictions. In other words, at the ethdev layer,
>>>>>>>> it is
>>>>>>>> possible to call rte_eth_dev_close() before calling
>>>>>>>> rte_dev_remove()
>>>>>>>> or rte_eal_hotplug_remove(). In such a bad scenario,
>>>>>> It is not a bad scenario.
>>>>>> If there is no more port for the device after calling close,
>>>>>> the device should be removed automatically.
>>>>>> Keep in mind "close" is for one port, "remove" is for the entire
>>>>>> device
>>>>>> which can have more than one port.
>>>>> I know.
>>>>>
>>>>> dev_close() is for removing an eth device. And rte_dev_remove()
>>>>> can be used
>>>>>
>>>>> for removing the rte device and all its eth devices belonging to
>>>>> the rte device.
>>>>>
>>>>> In rte_dev_remove(), "remove" is executed in primary or one of
>>>>> secondary,
>>>>>
>>>>> all eth devices having same pci address will be closed and removed.
>>>>>
>>>>>>>> the primary
>>>>>>>> process may be fine, but it may cause that xxx_dev_close() in
>>>>>>>> the PMD
>>>>>>>> layer will be called twice in the secondary process. So this patch
>>>>>>>> fixes it.
>>>>>> If a port is closed in primary, it should be the same in secondary.
>>>>>>
>>>>>>
>>>>>>>> + /*
>>>>>>>> + * The eth_dev->data->name doesn't be cleared by the
>>>>>>>> secondary process,
>>>>>>>> + * so above "eth_dev" isn't NULL after rte_eth_dev_close()
>>>>>>>> called.
>>>>>> This assumption is not clear. All should be closed together.
>>>>> However, dev_close() does not have the feature similar to
>>>>> rte_dev_remove().
>>>>>
>>>>> Namely, it is not guaranteed that all eth devices are closed
>>>>> together in ethdev
>>>>> layer. It depends on app or user.
>>>>>
>>>>> If the app does not close together, the operation of repeatedly
>>>>> uninstalling an
>>>>> eth device in the secondary process
>>>>>
>>>>> will be triggered when dev_close() is first called by one
>>>>> secondary process, and
>>>>> then rte_dev_remove() is called.
>>>>>
>>>>> So I think it should be avoided.
>>>> First of all, I am not sure about calling 'rte_eth_dev_close()' or
>>>> 'rte_dev_remove()' from the secondary process.
>>>> There are explicit checks in various locations to prevent clearing
>>>> resources
>>>> completely from secondary process.
>>> There's no denying that.
>>>
>>> Generally, hardware resources of eth device and shared data of the
>>> primary and
>>> secondary process
>>>
>>> are cleared by primary, which are controled by ethdev layer or PMD
>>> layer.
>>>
>>> But there may be some private data or resources of each process
>>> (primary or
>>> secondary ), such as mp action
>>>
>>> registered by rte_mp_action_register() or others. For these
>>> resources, the
>>> secondary process still needs to clear.
>>>
>>> Namely, both primary and secondary processes need to prevent
>>> repeated offloading
>>> of resources.
>>>
>>>> Calling 'rte_eth_dev_close()' or 'rte_dev_remove()' by secondary is
>>>> technically
>>>> can be done but application needs to be extra cautious and should
>>>> take extra
>>>> measures and synchronization to make it work.
>>>> Regular use-case is secondary processes do the packet processing
>>>> and all control
>>>> commands run by primary.
>>> You are right. We have a consensus that 'rte_eth_dev_close()' or
>>> 'rte_dev_remove()'
>>>
>>> can be called by primary and secondary processes.
>>>
>>> But DPDK framework cannot assume user behavior.😁
>>>
>>> We need to make it more secure and reliable for both primary and
>>> secondary
>>> processes.
>>>
>>>> In primary, if you call 'rte_eth_dev_close()' it will clear all
>>>> ethdev resources
>>>> and further 'rte_dev_remove()' call will detect missing ethdev
>>>> resources and
>>>> won't try to clear them again.
>>>>
>>>> In secondary, if you call 'rte_eth_dev_close()', it WON'T clear all
>>>> resources
>>>> and further 'rte_dev_remove()' call (either from primary or
>>>> secondary) will try
>>>> to clean ethdev resources again. You are trying to prevent this
>>>> retry in remove
>>>> happening for secondary process.
>>> Right. However, if secondary process in PMD layer has its own
>>> private resources
>>> to be
>>>
>>> cleared, it still need to do it by calling 'rte_eth_dev_close()' or
>>> 'rte_dev_remove()'.
>>>
>>>> In secondary it won't free ethdev resources anyway if you let it
>>>> continue, but I
>>>> guess here you are trying to prevent the PMD dev_close() called
>>>> again. Why? Is
>>>> it just for optimization or does it cause unexpected behavior in
>>>> the PMD?
>>>>
>>>>
>>>> Overall, to free resources you need to do the 'rte_eth_dev_close()' or
>>>> 'rte_dev_remove()' in the primary anyway. So instead of this
>>>> workaround, I would
>>>> suggest making PMD dev_close() safe to be called multiple times (if
>>>> this is the
>>>> problem.)
>>> In conclusion, primary and secondary processes in PMD layer may
>>> have their own
>>>
>>> private data and resources, which need to be processed and released.
>>>
>>> Currently, these for PMD are either handled and cleaned up in
>>> dev_close() or
>>> remove().
>>>
>>> However, code streams in rte_dev_remove() cannot ensure that the
>>> uninstallation
>>>
>>> from secondary process will not be repeated if rte_eth_dev_close()
>>> is first
>>> called by
>>>
>>> secondary(primary is ok, plese review this patch).
>>>
>>> I think, this is the same for each PMD and is better suited to doing
>>> it in
>>> ethdev layer.
>>>
>> This patch prevents to call dev_close() twice in the secondary
>> process, is this
>> fixing a theoretical problem or an actual problem?
>>
>> If it is an actual problem can you please provide details, callstack
>> of the
>> problematic case?
>
> We analyzed the code when modifying the bug and found that the problem
> did exist.
>
> The ethdev layer did not guarantee the security.
>
> The general function of the two interfaces is as follows:
>
> rte_eth_dev_close() --> release eth device
>
> rte_dev_remove() --> release eth device + remove and free
> rte_pci_device(primary andsecondary).
>
> According to the OVS application scenario, first call dev_close() and
> then call
>
> remove(), which is possible.
>
> We constructed this scenario using testpmd to start the secondary
> process(the multi-process
>
> patch of testpmd is being uploaded.). It is proved that the ethdev
> layer cannot guarantee
>
> this security. The callstack is as follows:
>
> **************************************************************************************************************************************************
>
>
> gdb) set args -a 0000:7d:00.0 -l 2-3 --file-prefix=rte_lee
> --proc-type=auto -- -i --rxq 4 --txq 4 --num-procs=2 --proc-id=1
> (gdb) b hns3_dev_uninit
> Breakpoint 1 at 0xbca7d0: file ../drivers/net/hns3/hns3_ethdev.c, line
> 7806.
> (gdb) b hns3_dev_close
> Breakpoint 2 at 0xbc6d44: file ../drivers/net/hns3/hns3_ethdev.c, line
> 6189.
> (gdb) r
> Starting program:
> /home/lihuisong/v500/gcov-test/dpdk/build/app/dpdk-testpmd -a
> 0000:7d:00.0 -l 2-3 --file-prefix=rte_lee --proc-type=auto -- -i --rxq
> 4 --txq 4 --num-procs=2 --proc-id=1
> Missing separate debuginfo for /root/lib/libnuma.so.1
> Try: yum --enablerepo='*debug*' install
> /usr/lib/debug/.build-id/ce/4eea0b0f2150f70a080bbd8835e43e78373096.debug
> [Thread debugging using libthread_db enabled]
> Using host libthread_db library "/lib64/libthread_db.so.1".
> EAL: Detected 128 lcore(s)
> EAL: Detected 4 NUMA nodes
> EAL: Auto-detected process type: SECONDARY
> EAL: Detected static linkage of DPDK
> [New Thread 0xfffff7a0ad10 (LWP 17717)]
> EAL: Multi-process socket
> /var/run/dpdk/rte_lee/mp_socket_17714_66aa8d3a60a9
> [New Thread 0xfffff7209d10 (LWP 17718)]
> EAL: Selected IOVA mode 'VA'
> EAL: VFIO support initialized
> [New Thread 0xfffff69f8d10 (LWP 17719)]
> EAL: Using IOMMU type 1 (Type 1)
> EAL: Probe PCI driver: net_hns3 (19e5:a222) device: 0000:7d:00.0
> (socket 0)
> [New Thread 0xfffff61f7d10 (LWP 17720)]
> TELEMETRY: No legacy callbacks, legacy socket not created
> Interactive-mode selected
> 0000:7d:00.0 hns3_dev_mtu_set(): Failed to set mtu, port 0 must be
> stopped before configuration
> Failed to set MTU to 1500 for port 0
> testpmd: create a new mbuf pool <mb_pool_0>: n=155456, size=2176,
> socket=0
> testpmd: preferred mempool ops selected: ring_mp_mc
>
> Warning! port-topology=paired and odd forward ports number, the last
> port will pair with itself.
>
> Configuring Port 0 (socket 0)
> Port 0: 00:18:2D:00:00:9E
> Checking link statuses...
> Done
> testpmd> port close 0
> Closing ports...
>
> Breakpoint 2, hns3_dev_close (eth_dev=0x21cdb80 <rte_eth_devices>)
> at ../drivers/net/hns3/hns3_ethdev.c:6189
> 6189 struct hns3_adapter *hns = eth_dev->data->dev_private;
> Missing separate debuginfos, use: debuginfo-install
> glibc-2.17-260.el7.aarch64 libpcap-1.5.3-11.el7.aarch64
> openssl-libs-1.0.2k-16.el7.aarch64 zlib-1.2.7-18.el7.aarch64
> (gdb) bt
> #0 hns3_dev_close (eth_dev=0x21cdb80 <rte_eth_devices>) at
> ../drivers/net/hns3/hns3_ethdev.c:6189
> #1 0x0000000000742eac in rte_eth_dev_close (port_id=0) at
> ../lib/ethdev/rte_ethdev.c:1870
> #2 0x0000000000542a8c in close_port (pid=0) at
> ../app/test-pmd/testpmd.c:2895
> #3 0x00000000004df82c in cmd_operate_specific_port_parsed
> (parsed_result=0xffffffffb0f0,
> cl=0x2475080, data=0x0) at ../app/test-pmd/cmdline.c:1272
> #4 0x0000000000738ce4 in cmdline_parse (cl=0x2475080, buf=0x24750c8
> "port close 0\n")
> at ../lib/cmdline/cmdline_parse.c:290
> #5 0x0000000000736b7c in cmdline_valid_buffer (rdl=0x2475090,
> buf=0x24750c8 "port close 0\n",
> size=14) at ../lib/cmdline/cmdline.c:26
> #6 0x000000000073c078 in rdline_char_in (rdl=0x2475090, c=10 '\n')
> at ../lib/cmdline/cmdline_rdline.c:421
> #7 0x0000000000736fcc in cmdline_in (cl=0x2475080, buf=0xfffffffff25f
> "\n\200\362\377\377\377\377",
> size=1) at ../lib/cmdline/cmdline.c:149
> #8 0x0000000000737270 in cmdline_interact (cl=0x2475080) at
> ../lib/cmdline/cmdline.c:223
> #9 0x00000000004f0ccc in prompt () at ../app/test-pmd/cmdline.c:17882
> #10 0x0000000000545528 in main (argc=8, argv=0xfffffffff470) at
> ../app/test-pmd/testpmd.c:3998
> (gdb) c
> Continuing.
> Port 0 is closed
> Done
> testpmd> device detach 0000:7d:00.0
> Removing a device...
> [Switching to Thread 0xfffff7a0ad10 (LWP 17717)]
>
> Breakpoint 1, hns3_dev_uninit (eth_dev=0x21cdb80 <rte_eth_devices>)
> at ../drivers/net/hns3/hns3_ethdev.c:7806
> 7806 struct hns3_adapter *hns = eth_dev->data->dev_private;
> (gdb) bt
> #0 hns3_dev_uninit (eth_dev=0x21cdb80 <rte_eth_devices>) at
> ../drivers/net/hns3/hns3_ethdev.c:7806
> #1 0x0000000000bb8668 in rte_eth_dev_pci_generic_remove
> (pci_dev=0x247a600,
> dev_uninit=0xbca7c4 <hns3_dev_uninit>) at
> ../lib/ethdev/ethdev_pci.h:155
> #2 0x0000000000bca89c in eth_hns3_pci_remove (pci_dev=0x247a600)
> at ../drivers/net/hns3/hns3_ethdev.c:7833
> #3 0x00000000007e85f4 in rte_pci_detach_dev (dev=0x247a600) at
> ../drivers/bus/pci/pci_common.c:287
> #4 0x00000000007e8f14 in pci_unplug (dev=0x247a610) at
> ../drivers/bus/pci/pci_common.c:570
> #5 0x0000000000775678 in local_dev_remove (dev=0x247a610) at
> ../lib/eal/common/eal_common_dev.c:319
> #6 0x000000000078f114 in __handle_primary_request (param=0xfffff00008c0)
> at ../lib/eal/common/hotplug_mp.c:284
> #7 0x000000000079ebf4 in eal_alarm_callback (arg=0x0) at
> ../lib/eal/linux/eal_alarm.c:92
> #8 0x00000000007a3f30 in eal_intr_process_interrupts
> (events=0xfffff7a0a3e0, nfds=1)
> at ../lib/eal/linux/eal_interrupts.c:998
> #9 0x00000000007a4224 in eal_intr_handle_interrupts (pfd=10, totalfds=2)
> at ../lib/eal/linux/eal_interrupts.c:1071
> #10 0x00000000007a442c in eal_intr_thread_main (arg=0x0) at
> ../lib/eal/linux/eal_interrupts.c:1142
> #11 0x0000000000789388 in ctrl_thread_init (arg=0x246ef40)
> at ../lib/eal/common/eal_common_thread.c:202
> #12 0x0000fffff7bf3c48 in start_thread () from /lib64/libpthread.so.0
> #13 0x0000fffff7b45600 in thread_start () from /lib64/libc.so.6
>
> **************************************************************************************************************************************************
>
>
> Note: above log is from the secondary process.
>
>>>> And again, please re-consider calling 'rte_eth_dev_close()' or
>>>> 'rte_dev_remove()' from the secondary process.
>>>>
>>>>>>>> + * Namely, whether "eth_dev" is NULL cannot be used to
>>>>>>>> determine whether
>>>>>>>> + * an ethdev port has been released.
>>>>>>>> + * For both primary process and secondary process,
>>>>>>>> eth_dev->state is
>>>>>>>> + * RTE_ETH_DEV_UNUSED, which means the ethdev port has
>>>>>>>> been released.
>>>>>>>> + */
>>>>>>>> + if (eth_dev->state == RTE_ETH_DEV_UNUSED) {
>>>>>>>> + RTE_ETHDEV_LOG(INFO, "The ethdev port has been
>>>>>>>> released.");
>>>>>>>> + return 0;
>>>>>>>> + }
>>>>>> .
>>>> .
>> .
> .
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [dpdk-dev] [RFC V2] ethdev: fix issue that dev close in PMD calls twice
2021-08-25 9:53 ` Huisong Li
2021-09-04 1:23 ` Huisong Li
@ 2021-09-18 3:31 ` Huisong Li
2021-09-20 14:07 ` Ferruh Yigit
2 siblings, 0 replies; 33+ messages in thread
From: Huisong Li @ 2021-09-18 3:31 UTC (permalink / raw)
To: Ferruh Yigit, Thomas Monjalon
Cc: Andrew Rybchenko, dev, Anatoly Burakov, David Marchand
Hi, All
Can you take a look at this problem?
Thanks.
在 2021/8/25 17:53, Huisong Li 写道:
>
> 在 2021/8/24 22:42, Ferruh Yigit 写道:
>> On 8/19/2021 4:45 AM, Huisong Li wrote:
>>> 在 2021/8/18 19:24, Ferruh Yigit 写道:
>>>> On 8/13/2021 9:16 AM, Huisong Li wrote:
>>>>> 在 2021/8/13 14:12, Thomas Monjalon 写道:
>>>>>> 13/08/2021 04:11, Huisong Li:
>>>>>>> Hi, all
>>>>>>>
>>>>>>> This patch can enhance the security of device uninstallation to
>>>>>>> eliminate dependency on user usage methods.
>>>>>>>
>>>>>>> Can you check this patch?
>>>>>>>
>>>>>>>
>>>>>>> 在 2021/8/3 10:30, Huisong Li 写道:
>>>>>>>> Ethernet devices in DPDK can be released by rte_eth_dev_close()
>>>>>>>> and
>>>>>>>> rte_dev_remove(). These APIs both call xxx_dev_close() in PMD
>>>>>>>> layer
>>>>>>>> to uninstall hardware. However, the two APIs do not have explicit
>>>>>>>> invocation restrictions. In other words, at the ethdev layer,
>>>>>>>> it is
>>>>>>>> possible to call rte_eth_dev_close() before calling
>>>>>>>> rte_dev_remove()
>>>>>>>> or rte_eal_hotplug_remove(). In such a bad scenario,
>>>>>> It is not a bad scenario.
>>>>>> If there is no more port for the device after calling close,
>>>>>> the device should be removed automatically.
>>>>>> Keep in mind "close" is for one port, "remove" is for the entire
>>>>>> device
>>>>>> which can have more than one port.
>>>>> I know.
>>>>>
>>>>> dev_close() is for removing an eth device. And rte_dev_remove()
>>>>> can be used
>>>>>
>>>>> for removing the rte device and all its eth devices belonging to
>>>>> the rte device.
>>>>>
>>>>> In rte_dev_remove(), "remove" is executed in primary or one of
>>>>> secondary,
>>>>>
>>>>> all eth devices having same pci address will be closed and removed.
>>>>>
>>>>>>>> the primary
>>>>>>>> process may be fine, but it may cause that xxx_dev_close() in
>>>>>>>> the PMD
>>>>>>>> layer will be called twice in the secondary process. So this patch
>>>>>>>> fixes it.
>>>>>> If a port is closed in primary, it should be the same in secondary.
>>>>>>
>>>>>>
>>>>>>>> + /*
>>>>>>>> + * The eth_dev->data->name doesn't be cleared by the
>>>>>>>> secondary process,
>>>>>>>> + * so above "eth_dev" isn't NULL after rte_eth_dev_close()
>>>>>>>> called.
>>>>>> This assumption is not clear. All should be closed together.
>>>>> However, dev_close() does not have the feature similar to
>>>>> rte_dev_remove().
>>>>>
>>>>> Namely, it is not guaranteed that all eth devices are closed
>>>>> together in ethdev
>>>>> layer. It depends on app or user.
>>>>>
>>>>> If the app does not close together, the operation of repeatedly
>>>>> uninstalling an
>>>>> eth device in the secondary process
>>>>>
>>>>> will be triggered when dev_close() is first called by one
>>>>> secondary process, and
>>>>> then rte_dev_remove() is called.
>>>>>
>>>>> So I think it should be avoided.
>>>> First of all, I am not sure about calling 'rte_eth_dev_close()' or
>>>> 'rte_dev_remove()' from the secondary process.
>>>> There are explicit checks in various locations to prevent clearing
>>>> resources
>>>> completely from secondary process.
>>> There's no denying that.
>>>
>>> Generally, hardware resources of eth device and shared data of the
>>> primary and
>>> secondary process
>>>
>>> are cleared by primary, which are controled by ethdev layer or PMD
>>> layer.
>>>
>>> But there may be some private data or resources of each process
>>> (primary or
>>> secondary ), such as mp action
>>>
>>> registered by rte_mp_action_register() or others. For these
>>> resources, the
>>> secondary process still needs to clear.
>>>
>>> Namely, both primary and secondary processes need to prevent
>>> repeated offloading
>>> of resources.
>>>
>>>> Calling 'rte_eth_dev_close()' or 'rte_dev_remove()' by secondary is
>>>> technically
>>>> can be done but application needs to be extra cautious and should
>>>> take extra
>>>> measures and synchronization to make it work.
>>>> Regular use-case is secondary processes do the packet processing
>>>> and all control
>>>> commands run by primary.
>>> You are right. We have a consensus that 'rte_eth_dev_close()' or
>>> 'rte_dev_remove()'
>>>
>>> can be called by primary and secondary processes.
>>>
>>> But DPDK framework cannot assume user behavior.😁
>>>
>>> We need to make it more secure and reliable for both primary and
>>> secondary
>>> processes.
>>>
>>>> In primary, if you call 'rte_eth_dev_close()' it will clear all
>>>> ethdev resources
>>>> and further 'rte_dev_remove()' call will detect missing ethdev
>>>> resources and
>>>> won't try to clear them again.
>>>>
>>>> In secondary, if you call 'rte_eth_dev_close()', it WON'T clear all
>>>> resources
>>>> and further 'rte_dev_remove()' call (either from primary or
>>>> secondary) will try
>>>> to clean ethdev resources again. You are trying to prevent this
>>>> retry in remove
>>>> happening for secondary process.
>>> Right. However, if secondary process in PMD layer has its own
>>> private resources
>>> to be
>>>
>>> cleared, it still need to do it by calling 'rte_eth_dev_close()' or
>>> 'rte_dev_remove()'.
>>>
>>>> In secondary it won't free ethdev resources anyway if you let it
>>>> continue, but I
>>>> guess here you are trying to prevent the PMD dev_close() called
>>>> again. Why? Is
>>>> it just for optimization or does it cause unexpected behavior in
>>>> the PMD?
>>>>
>>>>
>>>> Overall, to free resources you need to do the 'rte_eth_dev_close()' or
>>>> 'rte_dev_remove()' in the primary anyway. So instead of this
>>>> workaround, I would
>>>> suggest making PMD dev_close() safe to be called multiple times (if
>>>> this is the
>>>> problem.)
>>> In conclusion, primary and secondary processes in PMD layer may
>>> have their own
>>>
>>> private data and resources, which need to be processed and released.
>>>
>>> Currently, these for PMD are either handled and cleaned up in
>>> dev_close() or
>>> remove().
>>>
>>> However, code streams in rte_dev_remove() cannot ensure that the
>>> uninstallation
>>>
>>> from secondary process will not be repeated if rte_eth_dev_close()
>>> is first
>>> called by
>>>
>>> secondary(primary is ok, plese review this patch).
>>>
>>> I think, this is the same for each PMD and is better suited to doing
>>> it in
>>> ethdev layer.
>>>
>> This patch prevents to call dev_close() twice in the secondary
>> process, is this
>> fixing a theoretical problem or an actual problem?
>>
>> If it is an actual problem can you please provide details, callstack
>> of the
>> problematic case?
>
> We analyzed the code when modifying the bug and found that the problem
> did exist.
>
> The ethdev layer did not guarantee the security.
>
> The general function of the two interfaces is as follows:
>
> rte_eth_dev_close() --> release eth device
>
> rte_dev_remove() --> release eth device + remove and free
> rte_pci_device(primary andsecondary).
>
> According to the OVS application scenario, first call dev_close() and
> then call
>
> remove(), which is possible.
>
> We constructed this scenario using testpmd to start the secondary
> process(the multi-process
>
> patch of testpmd is being uploaded.). It is proved that the ethdev
> layer cannot guarantee
>
> this security. The callstack is as follows:
>
> **************************************************************************************************************************************************
>
>
> gdb) set args -a 0000:7d:00.0 -l 2-3 --file-prefix=rte_lee
> --proc-type=auto -- -i --rxq 4 --txq 4 --num-procs=2 --proc-id=1
> (gdb) b hns3_dev_uninit
> Breakpoint 1 at 0xbca7d0: file ../drivers/net/hns3/hns3_ethdev.c, line
> 7806.
> (gdb) b hns3_dev_close
> Breakpoint 2 at 0xbc6d44: file ../drivers/net/hns3/hns3_ethdev.c, line
> 6189.
> (gdb) r
> Starting program:
> /home/lihuisong/v500/gcov-test/dpdk/build/app/dpdk-testpmd -a
> 0000:7d:00.0 -l 2-3 --file-prefix=rte_lee --proc-type=auto -- -i --rxq
> 4 --txq 4 --num-procs=2 --proc-id=1
> Missing separate debuginfo for /root/lib/libnuma.so.1
> Try: yum --enablerepo='*debug*' install
> /usr/lib/debug/.build-id/ce/4eea0b0f2150f70a080bbd8835e43e78373096.debug
> [Thread debugging using libthread_db enabled]
> Using host libthread_db library "/lib64/libthread_db.so.1".
> EAL: Detected 128 lcore(s)
> EAL: Detected 4 NUMA nodes
> EAL: Auto-detected process type: SECONDARY
> EAL: Detected static linkage of DPDK
> [New Thread 0xfffff7a0ad10 (LWP 17717)]
> EAL: Multi-process socket
> /var/run/dpdk/rte_lee/mp_socket_17714_66aa8d3a60a9
> [New Thread 0xfffff7209d10 (LWP 17718)]
> EAL: Selected IOVA mode 'VA'
> EAL: VFIO support initialized
> [New Thread 0xfffff69f8d10 (LWP 17719)]
> EAL: Using IOMMU type 1 (Type 1)
> EAL: Probe PCI driver: net_hns3 (19e5:a222) device: 0000:7d:00.0
> (socket 0)
> [New Thread 0xfffff61f7d10 (LWP 17720)]
> TELEMETRY: No legacy callbacks, legacy socket not created
> Interactive-mode selected
> 0000:7d:00.0 hns3_dev_mtu_set(): Failed to set mtu, port 0 must be
> stopped before configuration
> Failed to set MTU to 1500 for port 0
> testpmd: create a new mbuf pool <mb_pool_0>: n=155456, size=2176,
> socket=0
> testpmd: preferred mempool ops selected: ring_mp_mc
>
> Warning! port-topology=paired and odd forward ports number, the last
> port will pair with itself.
>
> Configuring Port 0 (socket 0)
> Port 0: 00:18:2D:00:00:9E
> Checking link statuses...
> Done
> testpmd> port close 0
> Closing ports...
>
> Breakpoint 2, hns3_dev_close (eth_dev=0x21cdb80 <rte_eth_devices>)
> at ../drivers/net/hns3/hns3_ethdev.c:6189
> 6189 struct hns3_adapter *hns = eth_dev->data->dev_private;
> Missing separate debuginfos, use: debuginfo-install
> glibc-2.17-260.el7.aarch64 libpcap-1.5.3-11.el7.aarch64
> openssl-libs-1.0.2k-16.el7.aarch64 zlib-1.2.7-18.el7.aarch64
> (gdb) bt
> #0 hns3_dev_close (eth_dev=0x21cdb80 <rte_eth_devices>) at
> ../drivers/net/hns3/hns3_ethdev.c:6189
> #1 0x0000000000742eac in rte_eth_dev_close (port_id=0) at
> ../lib/ethdev/rte_ethdev.c:1870
> #2 0x0000000000542a8c in close_port (pid=0) at
> ../app/test-pmd/testpmd.c:2895
> #3 0x00000000004df82c in cmd_operate_specific_port_parsed
> (parsed_result=0xffffffffb0f0,
> cl=0x2475080, data=0x0) at ../app/test-pmd/cmdline.c:1272
> #4 0x0000000000738ce4 in cmdline_parse (cl=0x2475080, buf=0x24750c8
> "port close 0\n")
> at ../lib/cmdline/cmdline_parse.c:290
> #5 0x0000000000736b7c in cmdline_valid_buffer (rdl=0x2475090,
> buf=0x24750c8 "port close 0\n",
> size=14) at ../lib/cmdline/cmdline.c:26
> #6 0x000000000073c078 in rdline_char_in (rdl=0x2475090, c=10 '\n')
> at ../lib/cmdline/cmdline_rdline.c:421
> #7 0x0000000000736fcc in cmdline_in (cl=0x2475080, buf=0xfffffffff25f
> "\n\200\362\377\377\377\377",
> size=1) at ../lib/cmdline/cmdline.c:149
> #8 0x0000000000737270 in cmdline_interact (cl=0x2475080) at
> ../lib/cmdline/cmdline.c:223
> #9 0x00000000004f0ccc in prompt () at ../app/test-pmd/cmdline.c:17882
> #10 0x0000000000545528 in main (argc=8, argv=0xfffffffff470) at
> ../app/test-pmd/testpmd.c:3998
> (gdb) c
> Continuing.
> Port 0 is closed
> Done
> testpmd> device detach 0000:7d:00.0
> Removing a device...
> [Switching to Thread 0xfffff7a0ad10 (LWP 17717)]
>
> Breakpoint 1, hns3_dev_uninit (eth_dev=0x21cdb80 <rte_eth_devices>)
> at ../drivers/net/hns3/hns3_ethdev.c:7806
> 7806 struct hns3_adapter *hns = eth_dev->data->dev_private;
> (gdb) bt
> #0 hns3_dev_uninit (eth_dev=0x21cdb80 <rte_eth_devices>) at
> ../drivers/net/hns3/hns3_ethdev.c:7806
> #1 0x0000000000bb8668 in rte_eth_dev_pci_generic_remove
> (pci_dev=0x247a600,
> dev_uninit=0xbca7c4 <hns3_dev_uninit>) at
> ../lib/ethdev/ethdev_pci.h:155
> #2 0x0000000000bca89c in eth_hns3_pci_remove (pci_dev=0x247a600)
> at ../drivers/net/hns3/hns3_ethdev.c:7833
> #3 0x00000000007e85f4 in rte_pci_detach_dev (dev=0x247a600) at
> ../drivers/bus/pci/pci_common.c:287
> #4 0x00000000007e8f14 in pci_unplug (dev=0x247a610) at
> ../drivers/bus/pci/pci_common.c:570
> #5 0x0000000000775678 in local_dev_remove (dev=0x247a610) at
> ../lib/eal/common/eal_common_dev.c:319
> #6 0x000000000078f114 in __handle_primary_request (param=0xfffff00008c0)
> at ../lib/eal/common/hotplug_mp.c:284
> #7 0x000000000079ebf4 in eal_alarm_callback (arg=0x0) at
> ../lib/eal/linux/eal_alarm.c:92
> #8 0x00000000007a3f30 in eal_intr_process_interrupts
> (events=0xfffff7a0a3e0, nfds=1)
> at ../lib/eal/linux/eal_interrupts.c:998
> #9 0x00000000007a4224 in eal_intr_handle_interrupts (pfd=10, totalfds=2)
> at ../lib/eal/linux/eal_interrupts.c:1071
> #10 0x00000000007a442c in eal_intr_thread_main (arg=0x0) at
> ../lib/eal/linux/eal_interrupts.c:1142
> #11 0x0000000000789388 in ctrl_thread_init (arg=0x246ef40)
> at ../lib/eal/common/eal_common_thread.c:202
> #12 0x0000fffff7bf3c48 in start_thread () from /lib64/libpthread.so.0
> #13 0x0000fffff7b45600 in thread_start () from /lib64/libc.so.6
>
> **************************************************************************************************************************************************
>
>
> Note: above log is from the secondary process.
>
>>>> And again, please re-consider calling 'rte_eth_dev_close()' or
>>>> 'rte_dev_remove()' from the secondary process.
>>>>
>>>>>>>> + * Namely, whether "eth_dev" is NULL cannot be used to
>>>>>>>> determine whether
>>>>>>>> + * an ethdev port has been released.
>>>>>>>> + * For both primary process and secondary process,
>>>>>>>> eth_dev->state is
>>>>>>>> + * RTE_ETH_DEV_UNUSED, which means the ethdev port has
>>>>>>>> been released.
>>>>>>>> + */
>>>>>>>> + if (eth_dev->state == RTE_ETH_DEV_UNUSED) {
>>>>>>>> + RTE_ETHDEV_LOG(INFO, "The ethdev port has been
>>>>>>>> released.");
>>>>>>>> + return 0;
>>>>>>>> + }
>>>>>> .
>>>> .
>> .
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [dpdk-dev] [RFC V2] ethdev: fix issue that dev close in PMD calls twice
2021-08-25 9:53 ` Huisong Li
2021-09-04 1:23 ` Huisong Li
2021-09-18 3:31 ` Huisong Li
@ 2021-09-20 14:07 ` Ferruh Yigit
2021-09-22 3:31 ` Huisong Li
2 siblings, 1 reply; 33+ messages in thread
From: Ferruh Yigit @ 2021-09-20 14:07 UTC (permalink / raw)
To: Huisong Li, Thomas Monjalon
Cc: Andrew Rybchenko, dev, Anatoly Burakov, David Marchand
On 8/25/2021 10:53 AM, Huisong Li wrote:
>
> 在 2021/8/24 22:42, Ferruh Yigit 写道:
>> On 8/19/2021 4:45 AM, Huisong Li wrote:
>>> 在 2021/8/18 19:24, Ferruh Yigit 写道:
>>>> On 8/13/2021 9:16 AM, Huisong Li wrote:
>>>>> 在 2021/8/13 14:12, Thomas Monjalon 写道:
>>>>>> 13/08/2021 04:11, Huisong Li:
>>>>>>> Hi, all
>>>>>>>
>>>>>>> This patch can enhance the security of device uninstallation to
>>>>>>> eliminate dependency on user usage methods.
>>>>>>>
>>>>>>> Can you check this patch?
>>>>>>>
>>>>>>>
>>>>>>> 在 2021/8/3 10:30, Huisong Li 写道:
>>>>>>>> Ethernet devices in DPDK can be released by rte_eth_dev_close() and
>>>>>>>> rte_dev_remove(). These APIs both call xxx_dev_close() in PMD layer
>>>>>>>> to uninstall hardware. However, the two APIs do not have explicit
>>>>>>>> invocation restrictions. In other words, at the ethdev layer, it is
>>>>>>>> possible to call rte_eth_dev_close() before calling rte_dev_remove()
>>>>>>>> or rte_eal_hotplug_remove(). In such a bad scenario,
>>>>>> It is not a bad scenario.
>>>>>> If there is no more port for the device after calling close,
>>>>>> the device should be removed automatically.
>>>>>> Keep in mind "close" is for one port, "remove" is for the entire device
>>>>>> which can have more than one port.
>>>>> I know.
>>>>>
>>>>> dev_close() is for removing an eth device. And rte_dev_remove() can be used
>>>>>
>>>>> for removing the rte device and all its eth devices belonging to the rte
>>>>> device.
>>>>>
>>>>> In rte_dev_remove(), "remove" is executed in primary or one of secondary,
>>>>>
>>>>> all eth devices having same pci address will be closed and removed.
>>>>>
>>>>>>>> the primary
>>>>>>>> process may be fine, but it may cause that xxx_dev_close() in the PMD
>>>>>>>> layer will be called twice in the secondary process. So this patch
>>>>>>>> fixes it.
>>>>>> If a port is closed in primary, it should be the same in secondary.
>>>>>>
>>>>>>
>>>>>>>> + /*
>>>>>>>> + * The eth_dev->data->name doesn't be cleared by the secondary
>>>>>>>> process,
>>>>>>>> + * so above "eth_dev" isn't NULL after rte_eth_dev_close() called.
>>>>>> This assumption is not clear. All should be closed together.
>>>>> However, dev_close() does not have the feature similar to rte_dev_remove().
>>>>>
>>>>> Namely, it is not guaranteed that all eth devices are closed together in
>>>>> ethdev
>>>>> layer. It depends on app or user.
>>>>>
>>>>> If the app does not close together, the operation of repeatedly
>>>>> uninstalling an
>>>>> eth device in the secondary process
>>>>>
>>>>> will be triggered when dev_close() is first called by one secondary
>>>>> process, and
>>>>> then rte_dev_remove() is called.
>>>>>
>>>>> So I think it should be avoided.
>>>> First of all, I am not sure about calling 'rte_eth_dev_close()' or
>>>> 'rte_dev_remove()' from the secondary process.
>>>> There are explicit checks in various locations to prevent clearing resources
>>>> completely from secondary process.
>>> There's no denying that.
>>>
>>> Generally, hardware resources of eth device and shared data of the primary and
>>> secondary process
>>>
>>> are cleared by primary, which are controled by ethdev layer or PMD layer.
>>>
>>> But there may be some private data or resources of each process (primary or
>>> secondary ), such as mp action
>>>
>>> registered by rte_mp_action_register() or others. For these resources, the
>>> secondary process still needs to clear.
>>>
>>> Namely, both primary and secondary processes need to prevent repeated offloading
>>> of resources.
>>>
>>>> Calling 'rte_eth_dev_close()' or 'rte_dev_remove()' by secondary is technically
>>>> can be done but application needs to be extra cautious and should take extra
>>>> measures and synchronization to make it work.
>>>> Regular use-case is secondary processes do the packet processing and all
>>>> control
>>>> commands run by primary.
>>> You are right. We have a consensus that 'rte_eth_dev_close()' or
>>> 'rte_dev_remove()'
>>>
>>> can be called by primary and secondary processes.
>>>
>>> But DPDK framework cannot assume user behavior.😁
>>>
>>> We need to make it more secure and reliable for both primary and secondary
>>> processes.
>>>
>>>> In primary, if you call 'rte_eth_dev_close()' it will clear all ethdev
>>>> resources
>>>> and further 'rte_dev_remove()' call will detect missing ethdev resources and
>>>> won't try to clear them again.
>>>>
>>>> In secondary, if you call 'rte_eth_dev_close()', it WON'T clear all resources
>>>> and further 'rte_dev_remove()' call (either from primary or secondary) will try
>>>> to clean ethdev resources again. You are trying to prevent this retry in remove
>>>> happening for secondary process.
>>> Right. However, if secondary process in PMD layer has its own private resources
>>> to be
>>>
>>> cleared, it still need to do it by calling 'rte_eth_dev_close()' or
>>> 'rte_dev_remove()'.
>>>
>>>> In secondary it won't free ethdev resources anyway if you let it continue,
>>>> but I
>>>> guess here you are trying to prevent the PMD dev_close() called again. Why? Is
>>>> it just for optimization or does it cause unexpected behavior in the PMD?
>>>>
>>>>
>>>> Overall, to free resources you need to do the 'rte_eth_dev_close()' or
>>>> 'rte_dev_remove()' in the primary anyway. So instead of this workaround, I
>>>> would
>>>> suggest making PMD dev_close() safe to be called multiple times (if this is the
>>>> problem.)
>>> In conclusion, primary and secondary processes in PMD layer may have their own
>>>
>>> private data and resources, which need to be processed and released.
>>>
>>> Currently, these for PMD are either handled and cleaned up in dev_close() or
>>> remove().
>>>
>>> However, code streams in rte_dev_remove() cannot ensure that the uninstallation
>>>
>>> from secondary process will not be repeated if rte_eth_dev_close() is first
>>> called by
>>>
>>> secondary(primary is ok, plese review this patch).
>>>
>>> I think, this is the same for each PMD and is better suited to doing it in
>>> ethdev layer.
>>>
>> This patch prevents to call dev_close() twice in the secondary process, is this
>> fixing a theoretical problem or an actual problem?
>>
>> If it is an actual problem can you please provide details, callstack of the
>> problematic case?
>
> We analyzed the code when modifying the bug and found that the problem did exist.
>
> The ethdev layer did not guarantee the security.
>
I was wondering if there is a crash for an unexpected path, for below case the
primary process check in the 'hns3_dev_uninit()' should already prevent anything
unexpected. So I assume this is a fix for a theoretical issue.
In secondary process, these init/uninit device is already not very safe. For
example, as far as I can see in secondary if you hot remove a device device and
hot plug a new one, new device will use wrong device data (since hot remove
won't clear device data, new one will continue to use it).
So I am not sure about adding secondary process related checks in that area and
causing a false sense of security, and polluting the logic with secondary
specific checks.
Also the check you add may hit by primary process and I am worried on an
unexpected side affect it cause.
As said above I am not sure about this new check, but even we continue with it,
what about wrapping the check with secondary process check, at least to be sure
there won't be any side affect for primary process.
> The general function of the two interfaces is as follows:
>
> rte_eth_dev_close() --> release eth device
>
> rte_dev_remove() --> release eth device + remove and free
> rte_pci_device(primary andsecondary).
>
> According to the OVS application scenario, first call dev_close() and then call
>
> remove(), which is possible.
>
> We constructed this scenario using testpmd to start the secondary process(the
> multi-process
>
> patch of testpmd is being uploaded.). It is proved that the ethdev layer cannot
> guarantee
>
> this security. The callstack is as follows:
>
> **************************************************************************************************************************************************
>
>
> gdb) set args -a 0000:7d:00.0 -l 2-3 --file-prefix=rte_lee --proc-type=auto --
> -i --rxq 4 --txq 4 --num-procs=2 --proc-id=1
> (gdb) b hns3_dev_uninit
> Breakpoint 1 at 0xbca7d0: file ../drivers/net/hns3/hns3_ethdev.c, line 7806.
> (gdb) b hns3_dev_close
> Breakpoint 2 at 0xbc6d44: file ../drivers/net/hns3/hns3_ethdev.c, line 6189.
> (gdb) r
> Starting program: /home/lihuisong/v500/gcov-test/dpdk/build/app/dpdk-testpmd -a
> 0000:7d:00.0 -l 2-3 --file-prefix=rte_lee --proc-type=auto -- -i --rxq 4 --txq 4
> --num-procs=2 --proc-id=1
> Missing separate debuginfo for /root/lib/libnuma.so.1
> Try: yum --enablerepo='*debug*' install
> /usr/lib/debug/.build-id/ce/4eea0b0f2150f70a080bbd8835e43e78373096.debug
> [Thread debugging using libthread_db enabled]
> Using host libthread_db library "/lib64/libthread_db.so.1".
> EAL: Detected 128 lcore(s)
> EAL: Detected 4 NUMA nodes
> EAL: Auto-detected process type: SECONDARY
> EAL: Detected static linkage of DPDK
> [New Thread 0xfffff7a0ad10 (LWP 17717)]
> EAL: Multi-process socket /var/run/dpdk/rte_lee/mp_socket_17714_66aa8d3a60a9
> [New Thread 0xfffff7209d10 (LWP 17718)]
> EAL: Selected IOVA mode 'VA'
> EAL: VFIO support initialized
> [New Thread 0xfffff69f8d10 (LWP 17719)]
> EAL: Using IOMMU type 1 (Type 1)
> EAL: Probe PCI driver: net_hns3 (19e5:a222) device: 0000:7d:00.0 (socket 0)
> [New Thread 0xfffff61f7d10 (LWP 17720)]
> TELEMETRY: No legacy callbacks, legacy socket not created
> Interactive-mode selected
> 0000:7d:00.0 hns3_dev_mtu_set(): Failed to set mtu, port 0 must be stopped
> before configuration
> Failed to set MTU to 1500 for port 0
> testpmd: create a new mbuf pool <mb_pool_0>: n=155456, size=2176, socket=0
> testpmd: preferred mempool ops selected: ring_mp_mc
>
> Warning! port-topology=paired and odd forward ports number, the last port will
> pair with itself.
>
> Configuring Port 0 (socket 0)
> Port 0: 00:18:2D:00:00:9E
> Checking link statuses...
> Done
> testpmd> port close 0
> Closing ports...
>
> Breakpoint 2, hns3_dev_close (eth_dev=0x21cdb80 <rte_eth_devices>)
> at ../drivers/net/hns3/hns3_ethdev.c:6189
> 6189 struct hns3_adapter *hns = eth_dev->data->dev_private;
> Missing separate debuginfos, use: debuginfo-install glibc-2.17-260.el7.aarch64
> libpcap-1.5.3-11.el7.aarch64 openssl-libs-1.0.2k-16.el7.aarch64
> zlib-1.2.7-18.el7.aarch64
> (gdb) bt
> #0 hns3_dev_close (eth_dev=0x21cdb80 <rte_eth_devices>) at
> ../drivers/net/hns3/hns3_ethdev.c:6189
> #1 0x0000000000742eac in rte_eth_dev_close (port_id=0) at
> ../lib/ethdev/rte_ethdev.c:1870
> #2 0x0000000000542a8c in close_port (pid=0) at ../app/test-pmd/testpmd.c:2895
> #3 0x00000000004df82c in cmd_operate_specific_port_parsed
> (parsed_result=0xffffffffb0f0,
> cl=0x2475080, data=0x0) at ../app/test-pmd/cmdline.c:1272
> #4 0x0000000000738ce4 in cmdline_parse (cl=0x2475080, buf=0x24750c8 "port close
> 0\n")
> at ../lib/cmdline/cmdline_parse.c:290
> #5 0x0000000000736b7c in cmdline_valid_buffer (rdl=0x2475090, buf=0x24750c8
> "port close 0\n",
> size=14) at ../lib/cmdline/cmdline.c:26
> #6 0x000000000073c078 in rdline_char_in (rdl=0x2475090, c=10 '\n')
> at ../lib/cmdline/cmdline_rdline.c:421
> #7 0x0000000000736fcc in cmdline_in (cl=0x2475080, buf=0xfffffffff25f
> "\n\200\362\377\377\377\377",
> size=1) at ../lib/cmdline/cmdline.c:149
> #8 0x0000000000737270 in cmdline_interact (cl=0x2475080) at
> ../lib/cmdline/cmdline.c:223
> #9 0x00000000004f0ccc in prompt () at ../app/test-pmd/cmdline.c:17882
> #10 0x0000000000545528 in main (argc=8, argv=0xfffffffff470) at
> ../app/test-pmd/testpmd.c:3998
> (gdb) c
> Continuing.
> Port 0 is closed
> Done
> testpmd> device detach 0000:7d:00.0
> Removing a device...
> [Switching to Thread 0xfffff7a0ad10 (LWP 17717)]
>
> Breakpoint 1, hns3_dev_uninit (eth_dev=0x21cdb80 <rte_eth_devices>)
> at ../drivers/net/hns3/hns3_ethdev.c:7806
> 7806 struct hns3_adapter *hns = eth_dev->data->dev_private;
> (gdb) bt
> #0 hns3_dev_uninit (eth_dev=0x21cdb80 <rte_eth_devices>) at
> ../drivers/net/hns3/hns3_ethdev.c:7806
> #1 0x0000000000bb8668 in rte_eth_dev_pci_generic_remove (pci_dev=0x247a600,
> dev_uninit=0xbca7c4 <hns3_dev_uninit>) at ../lib/ethdev/ethdev_pci.h:155
> #2 0x0000000000bca89c in eth_hns3_pci_remove (pci_dev=0x247a600)
> at ../drivers/net/hns3/hns3_ethdev.c:7833
> #3 0x00000000007e85f4 in rte_pci_detach_dev (dev=0x247a600) at
> ../drivers/bus/pci/pci_common.c:287
> #4 0x00000000007e8f14 in pci_unplug (dev=0x247a610) at
> ../drivers/bus/pci/pci_common.c:570
> #5 0x0000000000775678 in local_dev_remove (dev=0x247a610) at
> ../lib/eal/common/eal_common_dev.c:319
> #6 0x000000000078f114 in __handle_primary_request (param=0xfffff00008c0)
> at ../lib/eal/common/hotplug_mp.c:284
> #7 0x000000000079ebf4 in eal_alarm_callback (arg=0x0) at
> ../lib/eal/linux/eal_alarm.c:92
> #8 0x00000000007a3f30 in eal_intr_process_interrupts (events=0xfffff7a0a3e0,
> nfds=1)
> at ../lib/eal/linux/eal_interrupts.c:998
> #9 0x00000000007a4224 in eal_intr_handle_interrupts (pfd=10, totalfds=2)
> at ../lib/eal/linux/eal_interrupts.c:1071
> #10 0x00000000007a442c in eal_intr_thread_main (arg=0x0) at
> ../lib/eal/linux/eal_interrupts.c:1142
> #11 0x0000000000789388 in ctrl_thread_init (arg=0x246ef40)
> at ../lib/eal/common/eal_common_thread.c:202
> #12 0x0000fffff7bf3c48 in start_thread () from /lib64/libpthread.so.0
> #13 0x0000fffff7b45600 in thread_start () from /lib64/libc.so.6
>
> **************************************************************************************************************************************************
>
>
> Note: above log is from the secondary process.
>
>>>> And again, please re-consider calling 'rte_eth_dev_close()' or
>>>> 'rte_dev_remove()' from the secondary process.
>>>>
>>>>>>>> + * Namely, whether "eth_dev" is NULL cannot be used to determine
>>>>>>>> whether
>>>>>>>> + * an ethdev port has been released.
>>>>>>>> + * For both primary process and secondary process, eth_dev->state is
>>>>>>>> + * RTE_ETH_DEV_UNUSED, which means the ethdev port has been released.
>>>>>>>> + */
>>>>>>>> + if (eth_dev->state == RTE_ETH_DEV_UNUSED) {
>>>>>>>> + RTE_ETHDEV_LOG(INFO, "The ethdev port has been released.");
>>>>>>>> + return 0;
>>>>>>>> + }
>>>>>> .
>>>> .
>> .
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [dpdk-dev] [RFC V2] ethdev: fix issue that dev close in PMD calls twice
2021-09-20 14:07 ` Ferruh Yigit
@ 2021-09-22 3:31 ` Huisong Li
2021-09-28 7:19 ` Singh, Aman Deep
0 siblings, 1 reply; 33+ messages in thread
From: Huisong Li @ 2021-09-22 3:31 UTC (permalink / raw)
To: Ferruh Yigit, Thomas Monjalon
Cc: Andrew Rybchenko, dev, Anatoly Burakov, David Marchand
在 2021/9/20 22:07, Ferruh Yigit 写道:
> On 8/25/2021 10:53 AM, Huisong Li wrote:
>> 在 2021/8/24 22:42, Ferruh Yigit 写道:
>>> On 8/19/2021 4:45 AM, Huisong Li wrote:
>>>> 在 2021/8/18 19:24, Ferruh Yigit 写道:
>>>>> On 8/13/2021 9:16 AM, Huisong Li wrote:
>>>>>> 在 2021/8/13 14:12, Thomas Monjalon 写道:
>>>>>>> 13/08/2021 04:11, Huisong Li:
>>>>>>>> Hi, all
>>>>>>>>
>>>>>>>> This patch can enhance the security of device uninstallation to
>>>>>>>> eliminate dependency on user usage methods.
>>>>>>>>
>>>>>>>> Can you check this patch?
>>>>>>>>
>>>>>>>>
>>>>>>>> 在 2021/8/3 10:30, Huisong Li 写道:
>>>>>>>>> Ethernet devices in DPDK can be released by rte_eth_dev_close() and
>>>>>>>>> rte_dev_remove(). These APIs both call xxx_dev_close() in PMD layer
>>>>>>>>> to uninstall hardware. However, the two APIs do not have explicit
>>>>>>>>> invocation restrictions. In other words, at the ethdev layer, it is
>>>>>>>>> possible to call rte_eth_dev_close() before calling rte_dev_remove()
>>>>>>>>> or rte_eal_hotplug_remove(). In such a bad scenario,
>>>>>>> It is not a bad scenario.
>>>>>>> If there is no more port for the device after calling close,
>>>>>>> the device should be removed automatically.
>>>>>>> Keep in mind "close" is for one port, "remove" is for the entire device
>>>>>>> which can have more than one port.
>>>>>> I know.
>>>>>>
>>>>>> dev_close() is for removing an eth device. And rte_dev_remove() can be used
>>>>>>
>>>>>> for removing the rte device and all its eth devices belonging to the rte
>>>>>> device.
>>>>>>
>>>>>> In rte_dev_remove(), "remove" is executed in primary or one of secondary,
>>>>>>
>>>>>> all eth devices having same pci address will be closed and removed.
>>>>>>
>>>>>>>>> the primary
>>>>>>>>> process may be fine, but it may cause that xxx_dev_close() in the PMD
>>>>>>>>> layer will be called twice in the secondary process. So this patch
>>>>>>>>> fixes it.
>>>>>>> If a port is closed in primary, it should be the same in secondary.
>>>>>>>
>>>>>>>
>>>>>>>>> + /*
>>>>>>>>> + * The eth_dev->data->name doesn't be cleared by the secondary
>>>>>>>>> process,
>>>>>>>>> + * so above "eth_dev" isn't NULL after rte_eth_dev_close() called.
>>>>>>> This assumption is not clear. All should be closed together.
>>>>>> However, dev_close() does not have the feature similar to rte_dev_remove().
>>>>>>
>>>>>> Namely, it is not guaranteed that all eth devices are closed together in
>>>>>> ethdev
>>>>>> layer. It depends on app or user.
>>>>>>
>>>>>> If the app does not close together, the operation of repeatedly
>>>>>> uninstalling an
>>>>>> eth device in the secondary process
>>>>>>
>>>>>> will be triggered when dev_close() is first called by one secondary
>>>>>> process, and
>>>>>> then rte_dev_remove() is called.
>>>>>>
>>>>>> So I think it should be avoided.
>>>>> First of all, I am not sure about calling 'rte_eth_dev_close()' or
>>>>> 'rte_dev_remove()' from the secondary process.
>>>>> There are explicit checks in various locations to prevent clearing resources
>>>>> completely from secondary process.
>>>> There's no denying that.
>>>>
>>>> Generally, hardware resources of eth device and shared data of the primary and
>>>> secondary process
>>>>
>>>> are cleared by primary, which are controled by ethdev layer or PMD layer.
>>>>
>>>> But there may be some private data or resources of each process (primary or
>>>> secondary ), such as mp action
>>>>
>>>> registered by rte_mp_action_register() or others. For these resources, the
>>>> secondary process still needs to clear.
>>>>
>>>> Namely, both primary and secondary processes need to prevent repeated offloading
>>>> of resources.
>>>>
>>>>> Calling 'rte_eth_dev_close()' or 'rte_dev_remove()' by secondary is technically
>>>>> can be done but application needs to be extra cautious and should take extra
>>>>> measures and synchronization to make it work.
>>>>> Regular use-case is secondary processes do the packet processing and all
>>>>> control
>>>>> commands run by primary.
>>>> You are right. We have a consensus that 'rte_eth_dev_close()' or
>>>> 'rte_dev_remove()'
>>>>
>>>> can be called by primary and secondary processes.
>>>>
>>>> But DPDK framework cannot assume user behavior.😁
>>>>
>>>> We need to make it more secure and reliable for both primary and secondary
>>>> processes.
>>>>
>>>>> In primary, if you call 'rte_eth_dev_close()' it will clear all ethdev
>>>>> resources
>>>>> and further 'rte_dev_remove()' call will detect missing ethdev resources and
>>>>> won't try to clear them again.
>>>>>
>>>>> In secondary, if you call 'rte_eth_dev_close()', it WON'T clear all resources
>>>>> and further 'rte_dev_remove()' call (either from primary or secondary) will try
>>>>> to clean ethdev resources again. You are trying to prevent this retry in remove
>>>>> happening for secondary process.
>>>> Right. However, if secondary process in PMD layer has its own private resources
>>>> to be
>>>>
>>>> cleared, it still need to do it by calling 'rte_eth_dev_close()' or
>>>> 'rte_dev_remove()'.
>>>>
>>>>> In secondary it won't free ethdev resources anyway if you let it continue,
>>>>> but I
>>>>> guess here you are trying to prevent the PMD dev_close() called again. Why? Is
>>>>> it just for optimization or does it cause unexpected behavior in the PMD?
>>>>>
>>>>>
>>>>> Overall, to free resources you need to do the 'rte_eth_dev_close()' or
>>>>> 'rte_dev_remove()' in the primary anyway. So instead of this workaround, I
>>>>> would
>>>>> suggest making PMD dev_close() safe to be called multiple times (if this is the
>>>>> problem.)
>>>> In conclusion, primary and secondary processes in PMD layer may have their own
>>>>
>>>> private data and resources, which need to be processed and released.
>>>>
>>>> Currently, these for PMD are either handled and cleaned up in dev_close() or
>>>> remove().
>>>>
>>>> However, code streams in rte_dev_remove() cannot ensure that the uninstallation
>>>>
>>>> from secondary process will not be repeated if rte_eth_dev_close() is first
>>>> called by
>>>>
>>>> secondary(primary is ok, plese review this patch).
>>>>
>>>> I think, this is the same for each PMD and is better suited to doing it in
>>>> ethdev layer.
>>>>
>>> This patch prevents to call dev_close() twice in the secondary process, is this
>>> fixing a theoretical problem or an actual problem?
>>>
>>> If it is an actual problem can you please provide details, callstack of the
>>> problematic case?
>> We analyzed the code when modifying the bug and found that the problem did exist.
>>
>> The ethdev layer did not guarantee the security.
>>
> I was wondering if there is a crash for an unexpected path, for below case the
> primary process check in the 'hns3_dev_uninit()' should already prevent anything
> unexpected. So I assume this is a fix for a theoretical issue.
Yes. For primary process, hns3 can prevent multiple device
uninstallation through
"adapter_state" controled by primary process.
The problem of multiple device uninstallation has been prevented at
rte_eth_dev_pci_generic_remove()
in the ethdev layer, as described in the patch.
In primary process, rte_eth_dev_allocated() in
rte_eth_dev_pci_generic_remove() will return NULL
when first calling dev_close(), and then calling rte_dev_remove(). So it
is ok. But the logic can not
prevent the same case in secondary because secondary does not clear
dev->data.
>
> In secondary process, these init/uninit device is already not very safe. For
> example, as far as I can see in secondary if you hot remove a device device and
> hot plug a new one, new device will use wrong device data (since hot remove
> won't clear device data, new one will continue to use it).
No. If we hot remove a device in secondary, the secondary will request
its primary
send "remove device" message to all secondaries. After all secondaries
are removed,
the primary also removes the device and the device data will be cleared.
This is the logic of rte_dev_remove().
> So I am not sure about adding secondary process related checks in that area and
> causing a false sense of security, and polluting the logic with secondary
> specific checks.
> Also the check you add may hit by primary process and I am worried on an
> unexpected side affect it cause.
>
>
> As said above I am not sure about this new check, but even we continue with it,
> what about wrapping the check with secondary process check, at least to be sure
> there won't be any side affect for primary process.
>
If app hot remove device in primary/secondary, the original logic is
still used in this interface, because of
bing RTE_ETH_DEV_ATTACHED state for eth_dev. If app first calls
dev_close(), eth device is
RTE_ETH_DEV_UNUSED state in primary/secondary. In this issue case, this
interface is still return from the
original place instead of the new check.
So I don't think the new check will affect the primary process.
Conversely, it's safer for secondary processes.
Please check it again, thanks.
>> The general function of the two interfaces is as follows:
>>
>> rte_eth_dev_close() --> release eth device
>>
>> rte_dev_remove() --> release eth device + remove and free
>> rte_pci_device(primary andsecondary).
>>
>> According to the OVS application scenario, first call dev_close() and then call
>>
>> remove(), which is possible.
>>
>> We constructed this scenario using testpmd to start the secondary process(the
>> multi-process
>>
>> patch of testpmd is being uploaded.). It is proved that the ethdev layer cannot
>> guarantee
>>
>> this security. The callstack is as follows:
>>
>> **************************************************************************************************************************************************
>>
>>
>> gdb) set args -a 0000:7d:00.0 -l 2-3 --file-prefix=rte_lee --proc-type=auto --
>> -i --rxq 4 --txq 4 --num-procs=2 --proc-id=1
>> (gdb) b hns3_dev_uninit
>> Breakpoint 1 at 0xbca7d0: file ../drivers/net/hns3/hns3_ethdev.c, line 7806.
>> (gdb) b hns3_dev_close
>> Breakpoint 2 at 0xbc6d44: file ../drivers/net/hns3/hns3_ethdev.c, line 6189.
>> (gdb) r
>> Starting program: /home/lihuisong/v500/gcov-test/dpdk/build/app/dpdk-testpmd -a
>> 0000:7d:00.0 -l 2-3 --file-prefix=rte_lee --proc-type=auto -- -i --rxq 4 --txq 4
>> --num-procs=2 --proc-id=1
>> Missing separate debuginfo for /root/lib/libnuma.so.1
>> Try: yum --enablerepo='*debug*' install
>> /usr/lib/debug/.build-id/ce/4eea0b0f2150f70a080bbd8835e43e78373096.debug
>> [Thread debugging using libthread_db enabled]
>> Using host libthread_db library "/lib64/libthread_db.so.1".
>> EAL: Detected 128 lcore(s)
>> EAL: Detected 4 NUMA nodes
>> EAL: Auto-detected process type: SECONDARY
>> EAL: Detected static linkage of DPDK
>> [New Thread 0xfffff7a0ad10 (LWP 17717)]
>> EAL: Multi-process socket /var/run/dpdk/rte_lee/mp_socket_17714_66aa8d3a60a9
>> [New Thread 0xfffff7209d10 (LWP 17718)]
>> EAL: Selected IOVA mode 'VA'
>> EAL: VFIO support initialized
>> [New Thread 0xfffff69f8d10 (LWP 17719)]
>> EAL: Using IOMMU type 1 (Type 1)
>> EAL: Probe PCI driver: net_hns3 (19e5:a222) device: 0000:7d:00.0 (socket 0)
>> [New Thread 0xfffff61f7d10 (LWP 17720)]
>> TELEMETRY: No legacy callbacks, legacy socket not created
>> Interactive-mode selected
>> 0000:7d:00.0 hns3_dev_mtu_set(): Failed to set mtu, port 0 must be stopped
>> before configuration
>> Failed to set MTU to 1500 for port 0
>> testpmd: create a new mbuf pool <mb_pool_0>: n=155456, size=2176, socket=0
>> testpmd: preferred mempool ops selected: ring_mp_mc
>>
>> Warning! port-topology=paired and odd forward ports number, the last port will
>> pair with itself.
>>
>> Configuring Port 0 (socket 0)
>> Port 0: 00:18:2D:00:00:9E
>> Checking link statuses...
>> Done
>> testpmd> port close 0
>> Closing ports...
>>
>> Breakpoint 2, hns3_dev_close (eth_dev=0x21cdb80 <rte_eth_devices>)
>> at ../drivers/net/hns3/hns3_ethdev.c:6189
>> 6189 struct hns3_adapter *hns = eth_dev->data->dev_private;
>> Missing separate debuginfos, use: debuginfo-install glibc-2.17-260.el7.aarch64
>> libpcap-1.5.3-11.el7.aarch64 openssl-libs-1.0.2k-16.el7.aarch64
>> zlib-1.2.7-18.el7.aarch64
>> (gdb) bt
>> #0 hns3_dev_close (eth_dev=0x21cdb80 <rte_eth_devices>) at
>> ../drivers/net/hns3/hns3_ethdev.c:6189
>> #1 0x0000000000742eac in rte_eth_dev_close (port_id=0) at
>> ../lib/ethdev/rte_ethdev.c:1870
>> #2 0x0000000000542a8c in close_port (pid=0) at ../app/test-pmd/testpmd.c:2895
>> #3 0x00000000004df82c in cmd_operate_specific_port_parsed
>> (parsed_result=0xffffffffb0f0,
>> cl=0x2475080, data=0x0) at ../app/test-pmd/cmdline.c:1272
>> #4 0x0000000000738ce4 in cmdline_parse (cl=0x2475080, buf=0x24750c8 "port close
>> 0\n")
>> at ../lib/cmdline/cmdline_parse.c:290
>> #5 0x0000000000736b7c in cmdline_valid_buffer (rdl=0x2475090, buf=0x24750c8
>> "port close 0\n",
>> size=14) at ../lib/cmdline/cmdline.c:26
>> #6 0x000000000073c078 in rdline_char_in (rdl=0x2475090, c=10 '\n')
>> at ../lib/cmdline/cmdline_rdline.c:421
>> #7 0x0000000000736fcc in cmdline_in (cl=0x2475080, buf=0xfffffffff25f
>> "\n\200\362\377\377\377\377",
>> size=1) at ../lib/cmdline/cmdline.c:149
>> #8 0x0000000000737270 in cmdline_interact (cl=0x2475080) at
>> ../lib/cmdline/cmdline.c:223
>> #9 0x00000000004f0ccc in prompt () at ../app/test-pmd/cmdline.c:17882
>> #10 0x0000000000545528 in main (argc=8, argv=0xfffffffff470) at
>> ../app/test-pmd/testpmd.c:3998
>> (gdb) c
>> Continuing.
>> Port 0 is closed
>> Done
>> testpmd> device detach 0000:7d:00.0
>> Removing a device...
>> [Switching to Thread 0xfffff7a0ad10 (LWP 17717)]
>>
>> Breakpoint 1, hns3_dev_uninit (eth_dev=0x21cdb80 <rte_eth_devices>)
>> at ../drivers/net/hns3/hns3_ethdev.c:7806
>> 7806 struct hns3_adapter *hns = eth_dev->data->dev_private;
>> (gdb) bt
>> #0 hns3_dev_uninit (eth_dev=0x21cdb80 <rte_eth_devices>) at
>> ../drivers/net/hns3/hns3_ethdev.c:7806
>> #1 0x0000000000bb8668 in rte_eth_dev_pci_generic_remove (pci_dev=0x247a600,
>> dev_uninit=0xbca7c4 <hns3_dev_uninit>) at ../lib/ethdev/ethdev_pci.h:155
>> #2 0x0000000000bca89c in eth_hns3_pci_remove (pci_dev=0x247a600)
>> at ../drivers/net/hns3/hns3_ethdev.c:7833
>> #3 0x00000000007e85f4 in rte_pci_detach_dev (dev=0x247a600) at
>> ../drivers/bus/pci/pci_common.c:287
>> #4 0x00000000007e8f14 in pci_unplug (dev=0x247a610) at
>> ../drivers/bus/pci/pci_common.c:570
>> #5 0x0000000000775678 in local_dev_remove (dev=0x247a610) at
>> ../lib/eal/common/eal_common_dev.c:319
>> #6 0x000000000078f114 in __handle_primary_request (param=0xfffff00008c0)
>> at ../lib/eal/common/hotplug_mp.c:284
>> #7 0x000000000079ebf4 in eal_alarm_callback (arg=0x0) at
>> ../lib/eal/linux/eal_alarm.c:92
>> #8 0x00000000007a3f30 in eal_intr_process_interrupts (events=0xfffff7a0a3e0,
>> nfds=1)
>> at ../lib/eal/linux/eal_interrupts.c:998
>> #9 0x00000000007a4224 in eal_intr_handle_interrupts (pfd=10, totalfds=2)
>> at ../lib/eal/linux/eal_interrupts.c:1071
>> #10 0x00000000007a442c in eal_intr_thread_main (arg=0x0) at
>> ../lib/eal/linux/eal_interrupts.c:1142
>> #11 0x0000000000789388 in ctrl_thread_init (arg=0x246ef40)
>> at ../lib/eal/common/eal_common_thread.c:202
>> #12 0x0000fffff7bf3c48 in start_thread () from /lib64/libpthread.so.0
>> #13 0x0000fffff7b45600 in thread_start () from /lib64/libc.so.6
>>
>> **************************************************************************************************************************************************
>>
>>
>> Note: above log is from the secondary process.
>>
>>>>> And again, please re-consider calling 'rte_eth_dev_close()' or
>>>>> 'rte_dev_remove()' from the secondary process.
>>>>>
>>>>>>>>> + * Namely, whether "eth_dev" is NULL cannot be used to determine
>>>>>>>>> whether
>>>>>>>>> + * an ethdev port has been released.
>>>>>>>>> + * For both primary process and secondary process, eth_dev->state is
>>>>>>>>> + * RTE_ETH_DEV_UNUSED, which means the ethdev port has been released.
>>>>>>>>> + */
>>>>>>>>> + if (eth_dev->state == RTE_ETH_DEV_UNUSED) {
>>>>>>>>> + RTE_ETHDEV_LOG(INFO, "The ethdev port has been released.");
>>>>>>>>> + return 0;
>>>>>>>>> + }
>>>>>>> .
>>>>> .
>>> .
> .
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [dpdk-dev] [RFC V2] ethdev: fix issue that dev close in PMD calls twice
2021-09-22 3:31 ` Huisong Li
@ 2021-09-28 7:19 ` Singh, Aman Deep
2021-09-30 10:54 ` Huisong Li
0 siblings, 1 reply; 33+ messages in thread
From: Singh, Aman Deep @ 2021-09-28 7:19 UTC (permalink / raw)
To: Huisong Li, Ferruh Yigit, Thomas Monjalon
Cc: Andrew Rybchenko, dev, Anatoly Burakov, David Marchand
On 9/22/2021 9:01 AM, Huisong Li wrote:
>
> 在 2021/9/20 22:07, Ferruh Yigit 写道:
>> On 8/25/2021 10:53 AM, Huisong Li wrote:
>>> 在 2021/8/24 22:42, Ferruh Yigit 写道:
>>>> On 8/19/2021 4:45 AM, Huisong Li wrote:
>>>>> 在 2021/8/18 19:24, Ferruh Yigit 写道:
>>>>>> On 8/13/2021 9:16 AM, Huisong Li wrote:
>>>>>>> 在 2021/8/13 14:12, Thomas Monjalon 写道:
>>>>>>>> 13/08/2021 04:11, Huisong Li:
>>>>>>>>> Hi, all
>>>>>>>>>
>>>>>>>>> This patch can enhance the security of device uninstallation to
>>>>>>>>> eliminate dependency on user usage methods.
>>>>>>>>>
>>>>>>>>> Can you check this patch?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> 在 2021/8/3 10:30, Huisong Li 写道:
>>>>>>>>>> Ethernet devices in DPDK can be released by
>>>>>>>>>> rte_eth_dev_close() and
>>>>>>>>>> rte_dev_remove(). These APIs both call xxx_dev_close() in PMD
>>>>>>>>>> layer
>>>>>>>>>> to uninstall hardware. However, the two APIs do not have
>>>>>>>>>> explicit
>>>>>>>>>> invocation restrictions. In other words, at the ethdev layer,
>>>>>>>>>> it is
>>>>>>>>>> possible to call rte_eth_dev_close() before calling
>>>>>>>>>> rte_dev_remove()
>>>>>>>>>> or rte_eal_hotplug_remove(). In such a bad scenario,
>>>>>>>> It is not a bad scenario.
>>>>>>>> If there is no more port for the device after calling close,
>>>>>>>> the device should be removed automatically.
>>>>>>>> Keep in mind "close" is for one port, "remove" is for the
>>>>>>>> entire device
>>>>>>>> which can have more than one port.
>>>>>>> I know.
>>>>>>>
>>>>>>> dev_close() is for removing an eth device. And rte_dev_remove()
>>>>>>> can be used
>>>>>>>
>>>>>>> for removing the rte device and all its eth devices belonging to
>>>>>>> the rte
>>>>>>> device.
>>>>>>>
>>>>>>> In rte_dev_remove(), "remove" is executed in primary or one of
>>>>>>> secondary,
>>>>>>>
>>>>>>> all eth devices having same pci address will be closed and removed.
>>>>>>>
>>>>>>>>>> the primary
>>>>>>>>>> process may be fine, but it may cause that xxx_dev_close() in
>>>>>>>>>> the PMD
>>>>>>>>>> layer will be called twice in the secondary process. So this
>>>>>>>>>> patch
>>>>>>>>>> fixes it.
>>>>>>>> If a port is closed in primary, it should be the same in
>>>>>>>> secondary.
>>>>>>>>
>>>>>>>>
>>>>>>>>>> + /*
>>>>>>>>>> + * The eth_dev->data->name doesn't be cleared by the
>>>>>>>>>> secondary
>>>>>>>>>> process,
>>>>>>>>>> + * so above "eth_dev" isn't NULL after
>>>>>>>>>> rte_eth_dev_close() called.
>>>>>>>> This assumption is not clear. All should be closed together.
>>>>>>> However, dev_close() does not have the feature similar to
>>>>>>> rte_dev_remove().
>>>>>>>
>>>>>>> Namely, it is not guaranteed that all eth devices are closed
>>>>>>> together in
>>>>>>> ethdev
>>>>>>> layer. It depends on app or user.
>>>>>>>
>>>>>>> If the app does not close together, the operation of repeatedly
>>>>>>> uninstalling an
>>>>>>> eth device in the secondary process
>>>>>>>
>>>>>>> will be triggered when dev_close() is first called by one secondary
>>>>>>> process, and
>>>>>>> then rte_dev_remove() is called.
>>>>>>>
>>>>>>> So I think it should be avoided.
>>>>>> First of all, I am not sure about calling 'rte_eth_dev_close()' or
>>>>>> 'rte_dev_remove()' from the secondary process.
>>>>>> There are explicit checks in various locations to prevent
>>>>>> clearing resources
>>>>>> completely from secondary process.
>>>>> There's no denying that.
>>>>>
>>>>> Generally, hardware resources of eth device and shared data of the
>>>>> primary and
>>>>> secondary process
>>>>>
>>>>> are cleared by primary, which are controled by ethdev layer or PMD
>>>>> layer.
>>>>>
>>>>> But there may be some private data or resources of each process
>>>>> (primary or
>>>>> secondary ), such as mp action
>>>>>
>>>>> registered by rte_mp_action_register() or others. For these
>>>>> resources, the
>>>>> secondary process still needs to clear.
>>>>>
>>>>> Namely, both primary and secondary processes need to prevent
>>>>> repeated offloading
>>>>> of resources.
>>>>>
>>>>>> Calling 'rte_eth_dev_close()' or 'rte_dev_remove()' by secondary
>>>>>> is technically
>>>>>> can be done but application needs to be extra cautious and should
>>>>>> take extra
>>>>>> measures and synchronization to make it work.
>>>>>> Regular use-case is secondary processes do the packet processing
>>>>>> and all
>>>>>> control
>>>>>> commands run by primary.
>>>>> You are right. We have a consensus that 'rte_eth_dev_close()' or
>>>>> 'rte_dev_remove()'
>>>>>
>>>>> can be called by primary and secondary processes.
>>>>>
>>>>> But DPDK framework cannot assume user behavior.😁
>>>>>
>>>>> We need to make it more secure and reliable for both primary and
>>>>> secondary
>>>>> processes.
>>>>>
>>>>>> In primary, if you call 'rte_eth_dev_close()' it will clear all
>>>>>> ethdev
>>>>>> resources
>>>>>> and further 'rte_dev_remove()' call will detect missing ethdev
>>>>>> resources and
>>>>>> won't try to clear them again.
>>>>>>
>>>>>> In secondary, if you call 'rte_eth_dev_close()', it WON'T clear
>>>>>> all resources
>>>>>> and further 'rte_dev_remove()' call (either from primary or
>>>>>> secondary) will try
>>>>>> to clean ethdev resources again. You are trying to prevent this
>>>>>> retry in remove
>>>>>> happening for secondary process.
>>>>> Right. However, if secondary process in PMD layer has its own
>>>>> private resources
>>>>> to be
>>>>>
>>>>> cleared, it still need to do it by calling 'rte_eth_dev_close()' or
>>>>> 'rte_dev_remove()'.
>>>>>
>>>>>> In secondary it won't free ethdev resources anyway if you let it
>>>>>> continue,
>>>>>> but I
>>>>>> guess here you are trying to prevent the PMD dev_close() called
>>>>>> again. Why? Is
>>>>>> it just for optimization or does it cause unexpected behavior in
>>>>>> the PMD?
>>>>>>
>>>>>>
>>>>>> Overall, to free resources you need to do the
>>>>>> 'rte_eth_dev_close()' or
>>>>>> 'rte_dev_remove()' in the primary anyway. So instead of this
>>>>>> workaround, I
>>>>>> would
>>>>>> suggest making PMD dev_close() safe to be called multiple times
>>>>>> (if this is the
>>>>>> problem.)
>>>>> In conclusion, primary and secondary processes in PMD layer may
>>>>> have their own
>>>>>
>>>>> private data and resources, which need to be processed and released.
>>>>>
>>>>> Currently, these for PMD are either handled and cleaned up in
>>>>> dev_close() or
>>>>> remove().
>>>>>
>>>>> However, code streams in rte_dev_remove() cannot ensure that the
>>>>> uninstallation
>>>>>
>>>>> from secondary process will not be repeated if rte_eth_dev_close()
>>>>> is first
>>>>> called by
>>>>>
>>>>> secondary(primary is ok, plese review this patch).
>>>>>
>>>>> I think, this is the same for each PMD and is better suited to
>>>>> doing it in
>>>>> ethdev layer.
>>>>>
>>>> This patch prevents to call dev_close() twice in the secondary
>>>> process, is this
>>>> fixing a theoretical problem or an actual problem?
>>>>
>>>> If it is an actual problem can you please provide details,
>>>> callstack of the
>>>> problematic case?
>>> We analyzed the code when modifying the bug and found that the
>>> problem did exist.
>>>
>>> The ethdev layer did not guarantee the security.
>>>
>> I was wondering if there is a crash for an unexpected path, for below
>> case the
>> primary process check in the 'hns3_dev_uninit()' should already
>> prevent anything
>> unexpected. So I assume this is a fix for a theoretical issue.
>
> Yes. For primary process, hns3 can prevent multiple device
> uninstallation through
>
> "adapter_state" controled by primary process.
>
> The problem of multiple device uninstallation has been prevented at
> rte_eth_dev_pci_generic_remove()
>
> in the ethdev layer, as described in the patch.
>
> In primary process, rte_eth_dev_allocated() in
> rte_eth_dev_pci_generic_remove() will return NULL
>
> when first calling dev_close(), and then calling rte_dev_remove(). So
> it is ok. But the logic can not
>
> prevent the same case in secondary because secondary does not clear
> dev->data.
>
>>
>> In secondary process, these init/uninit device is already not very
>> safe. For
>> example, as far as I can see in secondary if you hot remove a device
>> device and
>> hot plug a new one, new device will use wrong device data (since hot
>> remove
>> won't clear device data, new one will continue to use it).
>
> No. If we hot remove a device in secondary, the secondary will request
> its primary
>
> send "remove device" message to all secondaries. After all secondaries
> are removed,
>
> the primary also removes the device and the device data will be cleared.
>
> This is the logic of rte_dev_remove().
>
>> So I am not sure about adding secondary process related checks in
>> that area and
>> causing a false sense of security, and polluting the logic with
>> secondary
>> specific checks.
>> Also the check you add may hit by primary process and I am worried on an
>> unexpected side affect it cause.
>>
>>
>> As said above I am not sure about this new check, but even we
>> continue with it,
>> what about wrapping the check with secondary process check, at least
>> to be sure
>> there won't be any side affect for primary process.
>>
> If app hot remove device in primary/secondary, the original logic is
> still used in this interface, because of
>
> bing RTE_ETH_DEV_ATTACHED state for eth_dev. If app first calls
> dev_close(), eth device is
>
> RTE_ETH_DEV_UNUSED state in primary/secondary. In this issue case,
> this interface is still return from the
>
> original place instead of the new check.
>
> So I don't think the new check will affect the primary process.
> Conversely, it's safer for secondary processes.
>
> Please check it again, thanks.
As this issue is specific to secondary process, can we have these change
under a check like "if (rte_eal_process_type() != RTE_PROC_PRIMARY)"
By this we can avoid any side effect of these changes for primary
process and also it makes code readablity easier for designers who are
not checking secondary process changes.
>
>>> The general function of the two interfaces is as follows:
>>>
>>> rte_eth_dev_close() --> release eth device
>>>
>>> rte_dev_remove() --> release eth device + remove and free
>>> rte_pci_device(primary andsecondary).
>>>
>>> According to the OVS application scenario, first call dev_close()
>>> and then call
>>>
>>> remove(), which is possible.
>>>
>>> We constructed this scenario using testpmd to start the secondary
>>> process(the
>>> multi-process
>>>
>>> patch of testpmd is being uploaded.). It is proved that the ethdev
>>> layer cannot
>>> guarantee
>>>
>>> this security. The callstack is as follows:
>>>
>>> **************************************************************************************************************************************************
>>>
>>>
>>>
>>> gdb) set args -a 0000:7d:00.0 -l 2-3 --file-prefix=rte_lee
>>> --proc-type=auto --
>>> -i --rxq 4 --txq 4 --num-procs=2 --proc-id=1
>>> (gdb) b hns3_dev_uninit
>>> Breakpoint 1 at 0xbca7d0: file ../drivers/net/hns3/hns3_ethdev.c,
>>> line 7806.
>>> (gdb) b hns3_dev_close
>>> Breakpoint 2 at 0xbc6d44: file ../drivers/net/hns3/hns3_ethdev.c,
>>> line 6189.
>>> (gdb) r
>>> Starting program:
>>> /home/lihuisong/v500/gcov-test/dpdk/build/app/dpdk-testpmd -a
>>> 0000:7d:00.0 -l 2-3 --file-prefix=rte_lee --proc-type=auto -- -i
>>> --rxq 4 --txq 4
>>> --num-procs=2 --proc-id=1
>>> Missing separate debuginfo for /root/lib/libnuma.so.1
>>> Try: yum --enablerepo='*debug*' install
>>> /usr/lib/debug/.build-id/ce/4eea0b0f2150f70a080bbd8835e43e78373096.debug
>>>
>>> [Thread debugging using libthread_db enabled]
>>> Using host libthread_db library "/lib64/libthread_db.so.1".
>>> EAL: Detected 128 lcore(s)
>>> EAL: Detected 4 NUMA nodes
>>> EAL: Auto-detected process type: SECONDARY
>>> EAL: Detected static linkage of DPDK
>>> [New Thread 0xfffff7a0ad10 (LWP 17717)]
>>> EAL: Multi-process socket
>>> /var/run/dpdk/rte_lee/mp_socket_17714_66aa8d3a60a9
>>> [New Thread 0xfffff7209d10 (LWP 17718)]
>>> EAL: Selected IOVA mode 'VA'
>>> EAL: VFIO support initialized
>>> [New Thread 0xfffff69f8d10 (LWP 17719)]
>>> EAL: Using IOMMU type 1 (Type 1)
>>> EAL: Probe PCI driver: net_hns3 (19e5:a222) device: 0000:7d:00.0
>>> (socket 0)
>>> [New Thread 0xfffff61f7d10 (LWP 17720)]
>>> TELEMETRY: No legacy callbacks, legacy socket not created
>>> Interactive-mode selected
>>> 0000:7d:00.0 hns3_dev_mtu_set(): Failed to set mtu, port 0 must be
>>> stopped
>>> before configuration
>>> Failed to set MTU to 1500 for port 0
>>> testpmd: create a new mbuf pool <mb_pool_0>: n=155456, size=2176,
>>> socket=0
>>> testpmd: preferred mempool ops selected: ring_mp_mc
>>>
>>> Warning! port-topology=paired and odd forward ports number, the last
>>> port will
>>> pair with itself.
>>>
>>> Configuring Port 0 (socket 0)
>>> Port 0: 00:18:2D:00:00:9E
>>> Checking link statuses...
>>> Done
>>> testpmd> port close 0
>>> Closing ports...
>>>
>>> Breakpoint 2, hns3_dev_close (eth_dev=0x21cdb80 <rte_eth_devices>)
>>> at ../drivers/net/hns3/hns3_ethdev.c:6189
>>> 6189 struct hns3_adapter *hns = eth_dev->data->dev_private;
>>> Missing separate debuginfos, use: debuginfo-install
>>> glibc-2.17-260.el7.aarch64
>>> libpcap-1.5.3-11.el7.aarch64 openssl-libs-1.0.2k-16.el7.aarch64
>>> zlib-1.2.7-18.el7.aarch64
>>> (gdb) bt
>>> #0 hns3_dev_close (eth_dev=0x21cdb80 <rte_eth_devices>) at
>>> ../drivers/net/hns3/hns3_ethdev.c:6189
>>> #1 0x0000000000742eac in rte_eth_dev_close (port_id=0) at
>>> ../lib/ethdev/rte_ethdev.c:1870
>>> #2 0x0000000000542a8c in close_port (pid=0) at
>>> ../app/test-pmd/testpmd.c:2895
>>> #3 0x00000000004df82c in cmd_operate_specific_port_parsed
>>> (parsed_result=0xffffffffb0f0,
>>> cl=0x2475080, data=0x0) at ../app/test-pmd/cmdline.c:1272
>>> #4 0x0000000000738ce4 in cmdline_parse (cl=0x2475080, buf=0x24750c8
>>> "port close
>>> 0\n")
>>> at ../lib/cmdline/cmdline_parse.c:290
>>> #5 0x0000000000736b7c in cmdline_valid_buffer (rdl=0x2475090,
>>> buf=0x24750c8
>>> "port close 0\n",
>>> size=14) at ../lib/cmdline/cmdline.c:26
>>> #6 0x000000000073c078 in rdline_char_in (rdl=0x2475090, c=10 '\n')
>>> at ../lib/cmdline/cmdline_rdline.c:421
>>> #7 0x0000000000736fcc in cmdline_in (cl=0x2475080, buf=0xfffffffff25f
>>> "\n\200\362\377\377\377\377",
>>> size=1) at ../lib/cmdline/cmdline.c:149
>>> #8 0x0000000000737270 in cmdline_interact (cl=0x2475080) at
>>> ../lib/cmdline/cmdline.c:223
>>> #9 0x00000000004f0ccc in prompt () at ../app/test-pmd/cmdline.c:17882
>>> #10 0x0000000000545528 in main (argc=8, argv=0xfffffffff470) at
>>> ../app/test-pmd/testpmd.c:3998
>>> (gdb) c
>>> Continuing.
>>> Port 0 is closed
>>> Done
>>> testpmd> device detach 0000:7d:00.0
>>> Removing a device...
>>> [Switching to Thread 0xfffff7a0ad10 (LWP 17717)]
>>>
>>> Breakpoint 1, hns3_dev_uninit (eth_dev=0x21cdb80 <rte_eth_devices>)
>>> at ../drivers/net/hns3/hns3_ethdev.c:7806
>>> 7806 struct hns3_adapter *hns = eth_dev->data->dev_private;
>>> (gdb) bt
>>> #0 hns3_dev_uninit (eth_dev=0x21cdb80 <rte_eth_devices>) at
>>> ../drivers/net/hns3/hns3_ethdev.c:7806
>>> #1 0x0000000000bb8668 in rte_eth_dev_pci_generic_remove
>>> (pci_dev=0x247a600,
>>> dev_uninit=0xbca7c4 <hns3_dev_uninit>) at
>>> ../lib/ethdev/ethdev_pci.h:155
>>> #2 0x0000000000bca89c in eth_hns3_pci_remove (pci_dev=0x247a600)
>>> at ../drivers/net/hns3/hns3_ethdev.c:7833
>>> #3 0x00000000007e85f4 in rte_pci_detach_dev (dev=0x247a600) at
>>> ../drivers/bus/pci/pci_common.c:287
>>> #4 0x00000000007e8f14 in pci_unplug (dev=0x247a610) at
>>> ../drivers/bus/pci/pci_common.c:570
>>> #5 0x0000000000775678 in local_dev_remove (dev=0x247a610) at
>>> ../lib/eal/common/eal_common_dev.c:319
>>> #6 0x000000000078f114 in __handle_primary_request
>>> (param=0xfffff00008c0)
>>> at ../lib/eal/common/hotplug_mp.c:284
>>> #7 0x000000000079ebf4 in eal_alarm_callback (arg=0x0) at
>>> ../lib/eal/linux/eal_alarm.c:92
>>> #8 0x00000000007a3f30 in eal_intr_process_interrupts
>>> (events=0xfffff7a0a3e0,
>>> nfds=1)
>>> at ../lib/eal/linux/eal_interrupts.c:998
>>> #9 0x00000000007a4224 in eal_intr_handle_interrupts (pfd=10,
>>> totalfds=2)
>>> at ../lib/eal/linux/eal_interrupts.c:1071
>>> #10 0x00000000007a442c in eal_intr_thread_main (arg=0x0) at
>>> ../lib/eal/linux/eal_interrupts.c:1142
>>> #11 0x0000000000789388 in ctrl_thread_init (arg=0x246ef40)
>>> at ../lib/eal/common/eal_common_thread.c:202
>>> #12 0x0000fffff7bf3c48 in start_thread () from /lib64/libpthread.so.0
>>> #13 0x0000fffff7b45600 in thread_start () from /lib64/libc.so.6
>>>
>>> **************************************************************************************************************************************************
>>>
>>>
>>>
>>> Note: above log is from the secondary process.
>>>
>>>>>> And again, please re-consider calling 'rte_eth_dev_close()' or
>>>>>> 'rte_dev_remove()' from the secondary process.
>>>>>>
>>>>>>>>>> + * Namely, whether "eth_dev" is NULL cannot be used to
>>>>>>>>>> determine
>>>>>>>>>> whether
>>>>>>>>>> + * an ethdev port has been released.
>>>>>>>>>> + * For both primary process and secondary process,
>>>>>>>>>> eth_dev->state is
>>>>>>>>>> + * RTE_ETH_DEV_UNUSED, which means the ethdev port has
>>>>>>>>>> been released.
>>>>>>>>>> + */
>>>>>>>>>> + if (eth_dev->state == RTE_ETH_DEV_UNUSED) {
>>>>>>>>>> + RTE_ETHDEV_LOG(INFO, "The ethdev port has been
>>>>>>>>>> released.");
>>>>>>>>>> + return 0;
>>>>>>>>>> + }
>>>>>>>> .
>>>>>> .
>>>> .
>> .
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [dpdk-dev] [RFC V2] ethdev: fix issue that dev close in PMD calls twice
2021-09-28 7:19 ` Singh, Aman Deep
@ 2021-09-30 10:54 ` Huisong Li
2021-09-30 11:01 ` Ferruh Yigit
0 siblings, 1 reply; 33+ messages in thread
From: Huisong Li @ 2021-09-30 10:54 UTC (permalink / raw)
To: Singh, Aman Deep, Ferruh Yigit, Thomas Monjalon
Cc: Andrew Rybchenko, dev, Anatoly Burakov, David Marchand
在 2021/9/28 15:19, Singh, Aman Deep 写道:
>
> On 9/22/2021 9:01 AM, Huisong Li wrote:
>>
>> 在 2021/9/20 22:07, Ferruh Yigit 写道:
>>> On 8/25/2021 10:53 AM, Huisong Li wrote:
>>>> 在 2021/8/24 22:42, Ferruh Yigit 写道:
>>>>> On 8/19/2021 4:45 AM, Huisong Li wrote:
>>>>>> 在 2021/8/18 19:24, Ferruh Yigit 写道:
>>>>>>> On 8/13/2021 9:16 AM, Huisong Li wrote:
>>>>>>>> 在 2021/8/13 14:12, Thomas Monjalon 写道:
>>>>>>>>> 13/08/2021 04:11, Huisong Li:
>>>>>>>>>> Hi, all
>>>>>>>>>>
>>>>>>>>>> This patch can enhance the security of device uninstallation to
>>>>>>>>>> eliminate dependency on user usage methods.
>>>>>>>>>>
>>>>>>>>>> Can you check this patch?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> 在 2021/8/3 10:30, Huisong Li 写道:
>>>>>>>>>>> Ethernet devices in DPDK can be released by
>>>>>>>>>>> rte_eth_dev_close() and
>>>>>>>>>>> rte_dev_remove(). These APIs both call xxx_dev_close() in
>>>>>>>>>>> PMD layer
>>>>>>>>>>> to uninstall hardware. However, the two APIs do not have
>>>>>>>>>>> explicit
>>>>>>>>>>> invocation restrictions. In other words, at the ethdev
>>>>>>>>>>> layer, it is
>>>>>>>>>>> possible to call rte_eth_dev_close() before calling
>>>>>>>>>>> rte_dev_remove()
>>>>>>>>>>> or rte_eal_hotplug_remove(). In such a bad scenario,
>>>>>>>>> It is not a bad scenario.
>>>>>>>>> If there is no more port for the device after calling close,
>>>>>>>>> the device should be removed automatically.
>>>>>>>>> Keep in mind "close" is for one port, "remove" is for the
>>>>>>>>> entire device
>>>>>>>>> which can have more than one port.
>>>>>>>> I know.
>>>>>>>>
>>>>>>>> dev_close() is for removing an eth device. And rte_dev_remove()
>>>>>>>> can be used
>>>>>>>>
>>>>>>>> for removing the rte device and all its eth devices belonging
>>>>>>>> to the rte
>>>>>>>> device.
>>>>>>>>
>>>>>>>> In rte_dev_remove(), "remove" is executed in primary or one of
>>>>>>>> secondary,
>>>>>>>>
>>>>>>>> all eth devices having same pci address will be closed and
>>>>>>>> removed.
>>>>>>>>
>>>>>>>>>>> the primary
>>>>>>>>>>> process may be fine, but it may cause that xxx_dev_close()
>>>>>>>>>>> in the PMD
>>>>>>>>>>> layer will be called twice in the secondary process. So this
>>>>>>>>>>> patch
>>>>>>>>>>> fixes it.
>>>>>>>>> If a port is closed in primary, it should be the same in
>>>>>>>>> secondary.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>> + /*
>>>>>>>>>>> + * The eth_dev->data->name doesn't be cleared by the
>>>>>>>>>>> secondary
>>>>>>>>>>> process,
>>>>>>>>>>> + * so above "eth_dev" isn't NULL after
>>>>>>>>>>> rte_eth_dev_close() called.
>>>>>>>>> This assumption is not clear. All should be closed together.
>>>>>>>> However, dev_close() does not have the feature similar to
>>>>>>>> rte_dev_remove().
>>>>>>>>
>>>>>>>> Namely, it is not guaranteed that all eth devices are closed
>>>>>>>> together in
>>>>>>>> ethdev
>>>>>>>> layer. It depends on app or user.
>>>>>>>>
>>>>>>>> If the app does not close together, the operation of repeatedly
>>>>>>>> uninstalling an
>>>>>>>> eth device in the secondary process
>>>>>>>>
>>>>>>>> will be triggered when dev_close() is first called by one
>>>>>>>> secondary
>>>>>>>> process, and
>>>>>>>> then rte_dev_remove() is called.
>>>>>>>>
>>>>>>>> So I think it should be avoided.
>>>>>>> First of all, I am not sure about calling 'rte_eth_dev_close()' or
>>>>>>> 'rte_dev_remove()' from the secondary process.
>>>>>>> There are explicit checks in various locations to prevent
>>>>>>> clearing resources
>>>>>>> completely from secondary process.
>>>>>> There's no denying that.
>>>>>>
>>>>>> Generally, hardware resources of eth device and shared data of
>>>>>> the primary and
>>>>>> secondary process
>>>>>>
>>>>>> are cleared by primary, which are controled by ethdev layer or
>>>>>> PMD layer.
>>>>>>
>>>>>> But there may be some private data or resources of each process
>>>>>> (primary or
>>>>>> secondary ), such as mp action
>>>>>>
>>>>>> registered by rte_mp_action_register() or others. For these
>>>>>> resources, the
>>>>>> secondary process still needs to clear.
>>>>>>
>>>>>> Namely, both primary and secondary processes need to prevent
>>>>>> repeated offloading
>>>>>> of resources.
>>>>>>
>>>>>>> Calling 'rte_eth_dev_close()' or 'rte_dev_remove()' by secondary
>>>>>>> is technically
>>>>>>> can be done but application needs to be extra cautious and
>>>>>>> should take extra
>>>>>>> measures and synchronization to make it work.
>>>>>>> Regular use-case is secondary processes do the packet processing
>>>>>>> and all
>>>>>>> control
>>>>>>> commands run by primary.
>>>>>> You are right. We have a consensus that 'rte_eth_dev_close()' or
>>>>>> 'rte_dev_remove()'
>>>>>>
>>>>>> can be called by primary and secondary processes.
>>>>>>
>>>>>> But DPDK framework cannot assume user behavior.😁
>>>>>>
>>>>>> We need to make it more secure and reliable for both primary and
>>>>>> secondary
>>>>>> processes.
>>>>>>
>>>>>>> In primary, if you call 'rte_eth_dev_close()' it will clear all
>>>>>>> ethdev
>>>>>>> resources
>>>>>>> and further 'rte_dev_remove()' call will detect missing ethdev
>>>>>>> resources and
>>>>>>> won't try to clear them again.
>>>>>>>
>>>>>>> In secondary, if you call 'rte_eth_dev_close()', it WON'T clear
>>>>>>> all resources
>>>>>>> and further 'rte_dev_remove()' call (either from primary or
>>>>>>> secondary) will try
>>>>>>> to clean ethdev resources again. You are trying to prevent this
>>>>>>> retry in remove
>>>>>>> happening for secondary process.
>>>>>> Right. However, if secondary process in PMD layer has its own
>>>>>> private resources
>>>>>> to be
>>>>>>
>>>>>> cleared, it still need to do it by calling 'rte_eth_dev_close()' or
>>>>>> 'rte_dev_remove()'.
>>>>>>
>>>>>>> In secondary it won't free ethdev resources anyway if you let it
>>>>>>> continue,
>>>>>>> but I
>>>>>>> guess here you are trying to prevent the PMD dev_close() called
>>>>>>> again. Why? Is
>>>>>>> it just for optimization or does it cause unexpected behavior in
>>>>>>> the PMD?
>>>>>>>
>>>>>>>
>>>>>>> Overall, to free resources you need to do the
>>>>>>> 'rte_eth_dev_close()' or
>>>>>>> 'rte_dev_remove()' in the primary anyway. So instead of this
>>>>>>> workaround, I
>>>>>>> would
>>>>>>> suggest making PMD dev_close() safe to be called multiple times
>>>>>>> (if this is the
>>>>>>> problem.)
>>>>>> In conclusion, primary and secondary processes in PMD layer may
>>>>>> have their own
>>>>>>
>>>>>> private data and resources, which need to be processed and released.
>>>>>>
>>>>>> Currently, these for PMD are either handled and cleaned up in
>>>>>> dev_close() or
>>>>>> remove().
>>>>>>
>>>>>> However, code streams in rte_dev_remove() cannot ensure that the
>>>>>> uninstallation
>>>>>>
>>>>>> from secondary process will not be repeated if
>>>>>> rte_eth_dev_close() is first
>>>>>> called by
>>>>>>
>>>>>> secondary(primary is ok, plese review this patch).
>>>>>>
>>>>>> I think, this is the same for each PMD and is better suited to
>>>>>> doing it in
>>>>>> ethdev layer.
>>>>>>
>>>>> This patch prevents to call dev_close() twice in the secondary
>>>>> process, is this
>>>>> fixing a theoretical problem or an actual problem?
>>>>>
>>>>> If it is an actual problem can you please provide details,
>>>>> callstack of the
>>>>> problematic case?
>>>> We analyzed the code when modifying the bug and found that the
>>>> problem did exist.
>>>>
>>>> The ethdev layer did not guarantee the security.
>>>>
>>> I was wondering if there is a crash for an unexpected path, for
>>> below case the
>>> primary process check in the 'hns3_dev_uninit()' should already
>>> prevent anything
>>> unexpected. So I assume this is a fix for a theoretical issue.
>>
>> Yes. For primary process, hns3 can prevent multiple device
>> uninstallation through
>>
>> "adapter_state" controled by primary process.
>>
>> The problem of multiple device uninstallation has been prevented at
>> rte_eth_dev_pci_generic_remove()
>>
>> in the ethdev layer, as described in the patch.
>>
>> In primary process, rte_eth_dev_allocated() in
>> rte_eth_dev_pci_generic_remove() will return NULL
>>
>> when first calling dev_close(), and then calling rte_dev_remove(). So
>> it is ok. But the logic can not
>>
>> prevent the same case in secondary because secondary does not clear
>> dev->data.
>>
>>>
>>> In secondary process, these init/uninit device is already not very
>>> safe. For
>>> example, as far as I can see in secondary if you hot remove a device
>>> device and
>>> hot plug a new one, new device will use wrong device data (since hot
>>> remove
>>> won't clear device data, new one will continue to use it).
>>
>> No. If we hot remove a device in secondary, the secondary will
>> request its primary
>>
>> send "remove device" message to all secondaries. After all
>> secondaries are removed,
>>
>> the primary also removes the device and the device data will be cleared.
>>
>> This is the logic of rte_dev_remove().
>>
>>> So I am not sure about adding secondary process related checks in
>>> that area and
>>> causing a false sense of security, and polluting the logic with
>>> secondary
>>> specific checks.
>>> Also the check you add may hit by primary process and I am worried
>>> on an
>>> unexpected side affect it cause.
>>>
>>>
>>> As said above I am not sure about this new check, but even we
>>> continue with it,
>>> what about wrapping the check with secondary process check, at least
>>> to be sure
>>> there won't be any side affect for primary process.
>>>
>> If app hot remove device in primary/secondary, the original logic is
>> still used in this interface, because of
>>
>> bing RTE_ETH_DEV_ATTACHED state for eth_dev. If app first calls
>> dev_close(), eth device is
>>
>> RTE_ETH_DEV_UNUSED state in primary/secondary. In this issue case,
>> this interface is still return from the
>>
>> original place instead of the new check.
>>
>> So I don't think the new check will affect the primary process.
>> Conversely, it's safer for secondary processes.
>>
>> Please check it again, thanks.
>
> As this issue is specific to secondary process, can we have these
> change under a check like "if (rte_eal_process_type() !=
> RTE_PROC_PRIMARY)"
>
> By this we can avoid any side effect of these changes for primary
> process and also it makes code readablity easier for designers who are
> not checking secondary process changes.
yes.
@Ferruh, what do you think?
>
>>
>>>> The general function of the two interfaces is as follows:
>>>>
>>>> rte_eth_dev_close() --> release eth device
>>>>
>>>> rte_dev_remove() --> release eth device + remove and free
>>>> rte_pci_device(primary andsecondary).
>>>>
>>>> According to the OVS application scenario, first call dev_close()
>>>> and then call
>>>>
>>>> remove(), which is possible.
>>>>
>>>> We constructed this scenario using testpmd to start the secondary
>>>> process(the
>>>> multi-process
>>>>
>>>> patch of testpmd is being uploaded.). It is proved that the ethdev
>>>> layer cannot
>>>> guarantee
>>>>
>>>> this security. The callstack is as follows:
>>>>
>>>> **************************************************************************************************************************************************
>>>>
>>>>
>>>>
>>>> gdb) set args -a 0000:7d:00.0 -l 2-3 --file-prefix=rte_lee
>>>> --proc-type=auto --
>>>> -i --rxq 4 --txq 4 --num-procs=2 --proc-id=1
>>>> (gdb) b hns3_dev_uninit
>>>> Breakpoint 1 at 0xbca7d0: file ../drivers/net/hns3/hns3_ethdev.c,
>>>> line 7806.
>>>> (gdb) b hns3_dev_close
>>>> Breakpoint 2 at 0xbc6d44: file ../drivers/net/hns3/hns3_ethdev.c,
>>>> line 6189.
>>>> (gdb) r
>>>> Starting program:
>>>> /home/lihuisong/v500/gcov-test/dpdk/build/app/dpdk-testpmd -a
>>>> 0000:7d:00.0 -l 2-3 --file-prefix=rte_lee --proc-type=auto -- -i
>>>> --rxq 4 --txq 4
>>>> --num-procs=2 --proc-id=1
>>>> Missing separate debuginfo for /root/lib/libnuma.so.1
>>>> Try: yum --enablerepo='*debug*' install
>>>> /usr/lib/debug/.build-id/ce/4eea0b0f2150f70a080bbd8835e43e78373096.debug
>>>>
>>>> [Thread debugging using libthread_db enabled]
>>>> Using host libthread_db library "/lib64/libthread_db.so.1".
>>>> EAL: Detected 128 lcore(s)
>>>> EAL: Detected 4 NUMA nodes
>>>> EAL: Auto-detected process type: SECONDARY
>>>> EAL: Detected static linkage of DPDK
>>>> [New Thread 0xfffff7a0ad10 (LWP 17717)]
>>>> EAL: Multi-process socket
>>>> /var/run/dpdk/rte_lee/mp_socket_17714_66aa8d3a60a9
>>>> [New Thread 0xfffff7209d10 (LWP 17718)]
>>>> EAL: Selected IOVA mode 'VA'
>>>> EAL: VFIO support initialized
>>>> [New Thread 0xfffff69f8d10 (LWP 17719)]
>>>> EAL: Using IOMMU type 1 (Type 1)
>>>> EAL: Probe PCI driver: net_hns3 (19e5:a222) device: 0000:7d:00.0
>>>> (socket 0)
>>>> [New Thread 0xfffff61f7d10 (LWP 17720)]
>>>> TELEMETRY: No legacy callbacks, legacy socket not created
>>>> Interactive-mode selected
>>>> 0000:7d:00.0 hns3_dev_mtu_set(): Failed to set mtu, port 0 must be
>>>> stopped
>>>> before configuration
>>>> Failed to set MTU to 1500 for port 0
>>>> testpmd: create a new mbuf pool <mb_pool_0>: n=155456, size=2176,
>>>> socket=0
>>>> testpmd: preferred mempool ops selected: ring_mp_mc
>>>>
>>>> Warning! port-topology=paired and odd forward ports number, the
>>>> last port will
>>>> pair with itself.
>>>>
>>>> Configuring Port 0 (socket 0)
>>>> Port 0: 00:18:2D:00:00:9E
>>>> Checking link statuses...
>>>> Done
>>>> testpmd> port close 0
>>>> Closing ports...
>>>>
>>>> Breakpoint 2, hns3_dev_close (eth_dev=0x21cdb80 <rte_eth_devices>)
>>>> at ../drivers/net/hns3/hns3_ethdev.c:6189
>>>> 6189 struct hns3_adapter *hns = eth_dev->data->dev_private;
>>>> Missing separate debuginfos, use: debuginfo-install
>>>> glibc-2.17-260.el7.aarch64
>>>> libpcap-1.5.3-11.el7.aarch64 openssl-libs-1.0.2k-16.el7.aarch64
>>>> zlib-1.2.7-18.el7.aarch64
>>>> (gdb) bt
>>>> #0 hns3_dev_close (eth_dev=0x21cdb80 <rte_eth_devices>) at
>>>> ../drivers/net/hns3/hns3_ethdev.c:6189
>>>> #1 0x0000000000742eac in rte_eth_dev_close (port_id=0) at
>>>> ../lib/ethdev/rte_ethdev.c:1870
>>>> #2 0x0000000000542a8c in close_port (pid=0) at
>>>> ../app/test-pmd/testpmd.c:2895
>>>> #3 0x00000000004df82c in cmd_operate_specific_port_parsed
>>>> (parsed_result=0xffffffffb0f0,
>>>> cl=0x2475080, data=0x0) at ../app/test-pmd/cmdline.c:1272
>>>> #4 0x0000000000738ce4 in cmdline_parse (cl=0x2475080,
>>>> buf=0x24750c8 "port close
>>>> 0\n")
>>>> at ../lib/cmdline/cmdline_parse.c:290
>>>> #5 0x0000000000736b7c in cmdline_valid_buffer (rdl=0x2475090,
>>>> buf=0x24750c8
>>>> "port close 0\n",
>>>> size=14) at ../lib/cmdline/cmdline.c:26
>>>> #6 0x000000000073c078 in rdline_char_in (rdl=0x2475090, c=10 '\n')
>>>> at ../lib/cmdline/cmdline_rdline.c:421
>>>> #7 0x0000000000736fcc in cmdline_in (cl=0x2475080, buf=0xfffffffff25f
>>>> "\n\200\362\377\377\377\377",
>>>> size=1) at ../lib/cmdline/cmdline.c:149
>>>> #8 0x0000000000737270 in cmdline_interact (cl=0x2475080) at
>>>> ../lib/cmdline/cmdline.c:223
>>>> #9 0x00000000004f0ccc in prompt () at ../app/test-pmd/cmdline.c:17882
>>>> #10 0x0000000000545528 in main (argc=8, argv=0xfffffffff470) at
>>>> ../app/test-pmd/testpmd.c:3998
>>>> (gdb) c
>>>> Continuing.
>>>> Port 0 is closed
>>>> Done
>>>> testpmd> device detach 0000:7d:00.0
>>>> Removing a device...
>>>> [Switching to Thread 0xfffff7a0ad10 (LWP 17717)]
>>>>
>>>> Breakpoint 1, hns3_dev_uninit (eth_dev=0x21cdb80 <rte_eth_devices>)
>>>> at ../drivers/net/hns3/hns3_ethdev.c:7806
>>>> 7806 struct hns3_adapter *hns = eth_dev->data->dev_private;
>>>> (gdb) bt
>>>> #0 hns3_dev_uninit (eth_dev=0x21cdb80 <rte_eth_devices>) at
>>>> ../drivers/net/hns3/hns3_ethdev.c:7806
>>>> #1 0x0000000000bb8668 in rte_eth_dev_pci_generic_remove
>>>> (pci_dev=0x247a600,
>>>> dev_uninit=0xbca7c4 <hns3_dev_uninit>) at
>>>> ../lib/ethdev/ethdev_pci.h:155
>>>> #2 0x0000000000bca89c in eth_hns3_pci_remove (pci_dev=0x247a600)
>>>> at ../drivers/net/hns3/hns3_ethdev.c:7833
>>>> #3 0x00000000007e85f4 in rte_pci_detach_dev (dev=0x247a600) at
>>>> ../drivers/bus/pci/pci_common.c:287
>>>> #4 0x00000000007e8f14 in pci_unplug (dev=0x247a610) at
>>>> ../drivers/bus/pci/pci_common.c:570
>>>> #5 0x0000000000775678 in local_dev_remove (dev=0x247a610) at
>>>> ../lib/eal/common/eal_common_dev.c:319
>>>> #6 0x000000000078f114 in __handle_primary_request
>>>> (param=0xfffff00008c0)
>>>> at ../lib/eal/common/hotplug_mp.c:284
>>>> #7 0x000000000079ebf4 in eal_alarm_callback (arg=0x0) at
>>>> ../lib/eal/linux/eal_alarm.c:92
>>>> #8 0x00000000007a3f30 in eal_intr_process_interrupts
>>>> (events=0xfffff7a0a3e0,
>>>> nfds=1)
>>>> at ../lib/eal/linux/eal_interrupts.c:998
>>>> #9 0x00000000007a4224 in eal_intr_handle_interrupts (pfd=10,
>>>> totalfds=2)
>>>> at ../lib/eal/linux/eal_interrupts.c:1071
>>>> #10 0x00000000007a442c in eal_intr_thread_main (arg=0x0) at
>>>> ../lib/eal/linux/eal_interrupts.c:1142
>>>> #11 0x0000000000789388 in ctrl_thread_init (arg=0x246ef40)
>>>> at ../lib/eal/common/eal_common_thread.c:202
>>>> #12 0x0000fffff7bf3c48 in start_thread () from /lib64/libpthread.so.0
>>>> #13 0x0000fffff7b45600 in thread_start () from /lib64/libc.so.6
>>>>
>>>> **************************************************************************************************************************************************
>>>>
>>>>
>>>>
>>>> Note: above log is from the secondary process.
>>>>
>>>>>>> And again, please re-consider calling 'rte_eth_dev_close()' or
>>>>>>> 'rte_dev_remove()' from the secondary process.
>>>>>>>
>>>>>>>>>>> + * Namely, whether "eth_dev" is NULL cannot be used to
>>>>>>>>>>> determine
>>>>>>>>>>> whether
>>>>>>>>>>> + * an ethdev port has been released.
>>>>>>>>>>> + * For both primary process and secondary process,
>>>>>>>>>>> eth_dev->state is
>>>>>>>>>>> + * RTE_ETH_DEV_UNUSED, which means the ethdev port has
>>>>>>>>>>> been released.
>>>>>>>>>>> + */
>>>>>>>>>>> + if (eth_dev->state == RTE_ETH_DEV_UNUSED) {
>>>>>>>>>>> + RTE_ETHDEV_LOG(INFO, "The ethdev port has been
>>>>>>>>>>> released.");
>>>>>>>>>>> + return 0;
>>>>>>>>>>> + }
>>>>>>>>> .
>>>>>>> .
>>>>> .
>>> .
> .
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [dpdk-dev] [RFC V2] ethdev: fix issue that dev close in PMD calls twice
2021-09-30 10:54 ` Huisong Li
@ 2021-09-30 11:01 ` Ferruh Yigit
2021-10-08 6:13 ` lihuisong (C)
0 siblings, 1 reply; 33+ messages in thread
From: Ferruh Yigit @ 2021-09-30 11:01 UTC (permalink / raw)
To: Huisong Li, Singh, Aman Deep, Thomas Monjalon
Cc: Andrew Rybchenko, dev, Anatoly Burakov, David Marchand
On 9/30/2021 11:54 AM, Huisong Li wrote:
>
> 在 2021/9/28 15:19, Singh, Aman Deep 写道:
>>
>> On 9/22/2021 9:01 AM, Huisong Li wrote:
>>>
>>> 在 2021/9/20 22:07, Ferruh Yigit 写道:
>>>> On 8/25/2021 10:53 AM, Huisong Li wrote:
>>>>> 在 2021/8/24 22:42, Ferruh Yigit 写道:
>>>>>> On 8/19/2021 4:45 AM, Huisong Li wrote:
>>>>>>> 在 2021/8/18 19:24, Ferruh Yigit 写道:
>>>>>>>> On 8/13/2021 9:16 AM, Huisong Li wrote:
>>>>>>>>> 在 2021/8/13 14:12, Thomas Monjalon 写道:
>>>>>>>>>> 13/08/2021 04:11, Huisong Li:
>>>>>>>>>>> Hi, all
>>>>>>>>>>>
>>>>>>>>>>> This patch can enhance the security of device uninstallation to
>>>>>>>>>>> eliminate dependency on user usage methods.
>>>>>>>>>>>
>>>>>>>>>>> Can you check this patch?
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> 在 2021/8/3 10:30, Huisong Li 写道:
>>>>>>>>>>>> Ethernet devices in DPDK can be released by rte_eth_dev_close() and
>>>>>>>>>>>> rte_dev_remove(). These APIs both call xxx_dev_close() in PMD layer
>>>>>>>>>>>> to uninstall hardware. However, the two APIs do not have explicit
>>>>>>>>>>>> invocation restrictions. In other words, at the ethdev layer, it is
>>>>>>>>>>>> possible to call rte_eth_dev_close() before calling rte_dev_remove()
>>>>>>>>>>>> or rte_eal_hotplug_remove(). In such a bad scenario,
>>>>>>>>>> It is not a bad scenario.
>>>>>>>>>> If there is no more port for the device after calling close,
>>>>>>>>>> the device should be removed automatically.
>>>>>>>>>> Keep in mind "close" is for one port, "remove" is for the entire device
>>>>>>>>>> which can have more than one port.
>>>>>>>>> I know.
>>>>>>>>>
>>>>>>>>> dev_close() is for removing an eth device. And rte_dev_remove() can be
>>>>>>>>> used
>>>>>>>>>
>>>>>>>>> for removing the rte device and all its eth devices belonging to the rte
>>>>>>>>> device.
>>>>>>>>>
>>>>>>>>> In rte_dev_remove(), "remove" is executed in primary or one of secondary,
>>>>>>>>>
>>>>>>>>> all eth devices having same pci address will be closed and removed.
>>>>>>>>>
>>>>>>>>>>>> the primary
>>>>>>>>>>>> process may be fine, but it may cause that xxx_dev_close() in the PMD
>>>>>>>>>>>> layer will be called twice in the secondary process. So this patch
>>>>>>>>>>>> fixes it.
>>>>>>>>>> If a port is closed in primary, it should be the same in secondary.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>> + /*
>>>>>>>>>>>> + * The eth_dev->data->name doesn't be cleared by the secondary
>>>>>>>>>>>> process,
>>>>>>>>>>>> + * so above "eth_dev" isn't NULL after rte_eth_dev_close() called.
>>>>>>>>>> This assumption is not clear. All should be closed together.
>>>>>>>>> However, dev_close() does not have the feature similar to
>>>>>>>>> rte_dev_remove().
>>>>>>>>>
>>>>>>>>> Namely, it is not guaranteed that all eth devices are closed together in
>>>>>>>>> ethdev
>>>>>>>>> layer. It depends on app or user.
>>>>>>>>>
>>>>>>>>> If the app does not close together, the operation of repeatedly
>>>>>>>>> uninstalling an
>>>>>>>>> eth device in the secondary process
>>>>>>>>>
>>>>>>>>> will be triggered when dev_close() is first called by one secondary
>>>>>>>>> process, and
>>>>>>>>> then rte_dev_remove() is called.
>>>>>>>>>
>>>>>>>>> So I think it should be avoided.
>>>>>>>> First of all, I am not sure about calling 'rte_eth_dev_close()' or
>>>>>>>> 'rte_dev_remove()' from the secondary process.
>>>>>>>> There are explicit checks in various locations to prevent clearing
>>>>>>>> resources
>>>>>>>> completely from secondary process.
>>>>>>> There's no denying that.
>>>>>>>
>>>>>>> Generally, hardware resources of eth device and shared data of the
>>>>>>> primary and
>>>>>>> secondary process
>>>>>>>
>>>>>>> are cleared by primary, which are controled by ethdev layer or PMD layer.
>>>>>>>
>>>>>>> But there may be some private data or resources of each process (primary or
>>>>>>> secondary ), such as mp action
>>>>>>>
>>>>>>> registered by rte_mp_action_register() or others. For these resources, the
>>>>>>> secondary process still needs to clear.
>>>>>>>
>>>>>>> Namely, both primary and secondary processes need to prevent repeated
>>>>>>> offloading
>>>>>>> of resources.
>>>>>>>
>>>>>>>> Calling 'rte_eth_dev_close()' or 'rte_dev_remove()' by secondary is
>>>>>>>> technically
>>>>>>>> can be done but application needs to be extra cautious and should take
>>>>>>>> extra
>>>>>>>> measures and synchronization to make it work.
>>>>>>>> Regular use-case is secondary processes do the packet processing and all
>>>>>>>> control
>>>>>>>> commands run by primary.
>>>>>>> You are right. We have a consensus that 'rte_eth_dev_close()' or
>>>>>>> 'rte_dev_remove()'
>>>>>>>
>>>>>>> can be called by primary and secondary processes.
>>>>>>>
>>>>>>> But DPDK framework cannot assume user behavior.😁
>>>>>>>
>>>>>>> We need to make it more secure and reliable for both primary and secondary
>>>>>>> processes.
>>>>>>>
>>>>>>>> In primary, if you call 'rte_eth_dev_close()' it will clear all ethdev
>>>>>>>> resources
>>>>>>>> and further 'rte_dev_remove()' call will detect missing ethdev resources
>>>>>>>> and
>>>>>>>> won't try to clear them again.
>>>>>>>>
>>>>>>>> In secondary, if you call 'rte_eth_dev_close()', it WON'T clear all
>>>>>>>> resources
>>>>>>>> and further 'rte_dev_remove()' call (either from primary or secondary)
>>>>>>>> will try
>>>>>>>> to clean ethdev resources again. You are trying to prevent this retry in
>>>>>>>> remove
>>>>>>>> happening for secondary process.
>>>>>>> Right. However, if secondary process in PMD layer has its own private
>>>>>>> resources
>>>>>>> to be
>>>>>>>
>>>>>>> cleared, it still need to do it by calling 'rte_eth_dev_close()' or
>>>>>>> 'rte_dev_remove()'.
>>>>>>>
>>>>>>>> In secondary it won't free ethdev resources anyway if you let it continue,
>>>>>>>> but I
>>>>>>>> guess here you are trying to prevent the PMD dev_close() called again.
>>>>>>>> Why? Is
>>>>>>>> it just for optimization or does it cause unexpected behavior in the PMD?
>>>>>>>>
>>>>>>>>
>>>>>>>> Overall, to free resources you need to do the 'rte_eth_dev_close()' or
>>>>>>>> 'rte_dev_remove()' in the primary anyway. So instead of this workaround, I
>>>>>>>> would
>>>>>>>> suggest making PMD dev_close() safe to be called multiple times (if this
>>>>>>>> is the
>>>>>>>> problem.)
>>>>>>> In conclusion, primary and secondary processes in PMD layer may have
>>>>>>> their own
>>>>>>>
>>>>>>> private data and resources, which need to be processed and released.
>>>>>>>
>>>>>>> Currently, these for PMD are either handled and cleaned up in
>>>>>>> dev_close() or
>>>>>>> remove().
>>>>>>>
>>>>>>> However, code streams in rte_dev_remove() cannot ensure that the
>>>>>>> uninstallation
>>>>>>>
>>>>>>> from secondary process will not be repeated if rte_eth_dev_close() is first
>>>>>>> called by
>>>>>>>
>>>>>>> secondary(primary is ok, plese review this patch).
>>>>>>>
>>>>>>> I think, this is the same for each PMD and is better suited to doing it in
>>>>>>> ethdev layer.
>>>>>>>
>>>>>> This patch prevents to call dev_close() twice in the secondary process, is
>>>>>> this
>>>>>> fixing a theoretical problem or an actual problem?
>>>>>>
>>>>>> If it is an actual problem can you please provide details, callstack of the
>>>>>> problematic case?
>>>>> We analyzed the code when modifying the bug and found that the problem did
>>>>> exist.
>>>>>
>>>>> The ethdev layer did not guarantee the security.
>>>>>
>>>> I was wondering if there is a crash for an unexpected path, for below case the
>>>> primary process check in the 'hns3_dev_uninit()' should already prevent
>>>> anything
>>>> unexpected. So I assume this is a fix for a theoretical issue.
>>>
>>> Yes. For primary process, hns3 can prevent multiple device uninstallation
>>> through
>>>
>>> "adapter_state" controled by primary process.
>>>
>>> The problem of multiple device uninstallation has been prevented at
>>> rte_eth_dev_pci_generic_remove()
>>>
>>> in the ethdev layer, as described in the patch.
>>>
>>> In primary process, rte_eth_dev_allocated() in
>>> rte_eth_dev_pci_generic_remove() will return NULL
>>>
>>> when first calling dev_close(), and then calling rte_dev_remove(). So it is
>>> ok. But the logic can not
>>>
>>> prevent the same case in secondary because secondary does not clear dev->data.
>>>
>>>>
>>>> In secondary process, these init/uninit device is already not very safe. For
>>>> example, as far as I can see in secondary if you hot remove a device device and
>>>> hot plug a new one, new device will use wrong device data (since hot remove
>>>> won't clear device data, new one will continue to use it).
>>>
>>> No. If we hot remove a device in secondary, the secondary will request its
>>> primary
>>>
>>> send "remove device" message to all secondaries. After all secondaries are
>>> removed,
>>>
>>> the primary also removes the device and the device data will be cleared.
>>>
>>> This is the logic of rte_dev_remove().
>>>
>>>> So I am not sure about adding secondary process related checks in that area and
>>>> causing a false sense of security, and polluting the logic with secondary
>>>> specific checks.
>>>> Also the check you add may hit by primary process and I am worried on an
>>>> unexpected side affect it cause.
>>>>
>>>>
>>>> As said above I am not sure about this new check, but even we continue with it,
>>>> what about wrapping the check with secondary process check, at least to be sure
>>>> there won't be any side affect for primary process.
>>>>
>>> If app hot remove device in primary/secondary, the original logic is still
>>> used in this interface, because of
>>>
>>> bing RTE_ETH_DEV_ATTACHED state for eth_dev. If app first calls dev_close(),
>>> eth device is
>>>
>>> RTE_ETH_DEV_UNUSED state in primary/secondary. In this issue case, this
>>> interface is still return from the
>>>
>>> original place instead of the new check.
>>>
>>> So I don't think the new check will affect the primary process. Conversely,
>>> it's safer for secondary processes.
>>>
>>> Please check it again, thanks.
>>
>> As this issue is specific to secondary process, can we have these change under
>> a check like "if (rte_eal_process_type() != RTE_PROC_PRIMARY)"
>>
>> By this we can avoid any side effect of these changes for primary process and
>> also it makes code readablity easier for designers who are not checking
>> secondary process changes.
>
> yes.
>
> @Ferruh, what do you think?
>
+1 to have it, it clarifies the intention of the check, plus prevents possible
sice affect for primary as Aman mentioned.
>>
>>>
>>>>> The general function of the two interfaces is as follows:
>>>>>
>>>>> rte_eth_dev_close() --> release eth device
>>>>>
>>>>> rte_dev_remove() --> release eth device + remove and free
>>>>> rte_pci_device(primary andsecondary).
>>>>>
>>>>> According to the OVS application scenario, first call dev_close() and then
>>>>> call
>>>>>
>>>>> remove(), which is possible.
>>>>>
>>>>> We constructed this scenario using testpmd to start the secondary process(the
>>>>> multi-process
>>>>>
>>>>> patch of testpmd is being uploaded.). It is proved that the ethdev layer
>>>>> cannot
>>>>> guarantee
>>>>>
>>>>> this security. The callstack is as follows:
>>>>>
>>>>> **************************************************************************************************************************************************
>>>>>
>>>>>
>>>>>
>>>>> gdb) set args -a 0000:7d:00.0 -l 2-3 --file-prefix=rte_lee --proc-type=auto --
>>>>> -i --rxq 4 --txq 4 --num-procs=2 --proc-id=1
>>>>> (gdb) b hns3_dev_uninit
>>>>> Breakpoint 1 at 0xbca7d0: file ../drivers/net/hns3/hns3_ethdev.c, line 7806.
>>>>> (gdb) b hns3_dev_close
>>>>> Breakpoint 2 at 0xbc6d44: file ../drivers/net/hns3/hns3_ethdev.c, line 6189.
>>>>> (gdb) r
>>>>> Starting program:
>>>>> /home/lihuisong/v500/gcov-test/dpdk/build/app/dpdk-testpmd -a
>>>>> 0000:7d:00.0 -l 2-3 --file-prefix=rte_lee --proc-type=auto -- -i --rxq 4
>>>>> --txq 4
>>>>> --num-procs=2 --proc-id=1
>>>>> Missing separate debuginfo for /root/lib/libnuma.so.1
>>>>> Try: yum --enablerepo='*debug*' install
>>>>> /usr/lib/debug/.build-id/ce/4eea0b0f2150f70a080bbd8835e43e78373096.debug
>>>>> [Thread debugging using libthread_db enabled]
>>>>> Using host libthread_db library "/lib64/libthread_db.so.1".
>>>>> EAL: Detected 128 lcore(s)
>>>>> EAL: Detected 4 NUMA nodes
>>>>> EAL: Auto-detected process type: SECONDARY
>>>>> EAL: Detected static linkage of DPDK
>>>>> [New Thread 0xfffff7a0ad10 (LWP 17717)]
>>>>> EAL: Multi-process socket /var/run/dpdk/rte_lee/mp_socket_17714_66aa8d3a60a9
>>>>> [New Thread 0xfffff7209d10 (LWP 17718)]
>>>>> EAL: Selected IOVA mode 'VA'
>>>>> EAL: VFIO support initialized
>>>>> [New Thread 0xfffff69f8d10 (LWP 17719)]
>>>>> EAL: Using IOMMU type 1 (Type 1)
>>>>> EAL: Probe PCI driver: net_hns3 (19e5:a222) device: 0000:7d:00.0 (socket 0)
>>>>> [New Thread 0xfffff61f7d10 (LWP 17720)]
>>>>> TELEMETRY: No legacy callbacks, legacy socket not created
>>>>> Interactive-mode selected
>>>>> 0000:7d:00.0 hns3_dev_mtu_set(): Failed to set mtu, port 0 must be stopped
>>>>> before configuration
>>>>> Failed to set MTU to 1500 for port 0
>>>>> testpmd: create a new mbuf pool <mb_pool_0>: n=155456, size=2176, socket=0
>>>>> testpmd: preferred mempool ops selected: ring_mp_mc
>>>>>
>>>>> Warning! port-topology=paired and odd forward ports number, the last port will
>>>>> pair with itself.
>>>>>
>>>>> Configuring Port 0 (socket 0)
>>>>> Port 0: 00:18:2D:00:00:9E
>>>>> Checking link statuses...
>>>>> Done
>>>>> testpmd> port close 0
>>>>> Closing ports...
>>>>>
>>>>> Breakpoint 2, hns3_dev_close (eth_dev=0x21cdb80 <rte_eth_devices>)
>>>>> at ../drivers/net/hns3/hns3_ethdev.c:6189
>>>>> 6189 struct hns3_adapter *hns = eth_dev->data->dev_private;
>>>>> Missing separate debuginfos, use: debuginfo-install glibc-2.17-260.el7.aarch64
>>>>> libpcap-1.5.3-11.el7.aarch64 openssl-libs-1.0.2k-16.el7.aarch64
>>>>> zlib-1.2.7-18.el7.aarch64
>>>>> (gdb) bt
>>>>> #0 hns3_dev_close (eth_dev=0x21cdb80 <rte_eth_devices>) at
>>>>> ../drivers/net/hns3/hns3_ethdev.c:6189
>>>>> #1 0x0000000000742eac in rte_eth_dev_close (port_id=0) at
>>>>> ../lib/ethdev/rte_ethdev.c:1870
>>>>> #2 0x0000000000542a8c in close_port (pid=0) at ../app/test-pmd/testpmd.c:2895
>>>>> #3 0x00000000004df82c in cmd_operate_specific_port_parsed
>>>>> (parsed_result=0xffffffffb0f0,
>>>>> cl=0x2475080, data=0x0) at ../app/test-pmd/cmdline.c:1272
>>>>> #4 0x0000000000738ce4 in cmdline_parse (cl=0x2475080, buf=0x24750c8 "port
>>>>> close
>>>>> 0\n")
>>>>> at ../lib/cmdline/cmdline_parse.c:290
>>>>> #5 0x0000000000736b7c in cmdline_valid_buffer (rdl=0x2475090, buf=0x24750c8
>>>>> "port close 0\n",
>>>>> size=14) at ../lib/cmdline/cmdline.c:26
>>>>> #6 0x000000000073c078 in rdline_char_in (rdl=0x2475090, c=10 '\n')
>>>>> at ../lib/cmdline/cmdline_rdline.c:421
>>>>> #7 0x0000000000736fcc in cmdline_in (cl=0x2475080, buf=0xfffffffff25f
>>>>> "\n\200\362\377\377\377\377",
>>>>> size=1) at ../lib/cmdline/cmdline.c:149
>>>>> #8 0x0000000000737270 in cmdline_interact (cl=0x2475080) at
>>>>> ../lib/cmdline/cmdline.c:223
>>>>> #9 0x00000000004f0ccc in prompt () at ../app/test-pmd/cmdline.c:17882
>>>>> #10 0x0000000000545528 in main (argc=8, argv=0xfffffffff470) at
>>>>> ../app/test-pmd/testpmd.c:3998
>>>>> (gdb) c
>>>>> Continuing.
>>>>> Port 0 is closed
>>>>> Done
>>>>> testpmd> device detach 0000:7d:00.0
>>>>> Removing a device...
>>>>> [Switching to Thread 0xfffff7a0ad10 (LWP 17717)]
>>>>>
>>>>> Breakpoint 1, hns3_dev_uninit (eth_dev=0x21cdb80 <rte_eth_devices>)
>>>>> at ../drivers/net/hns3/hns3_ethdev.c:7806
>>>>> 7806 struct hns3_adapter *hns = eth_dev->data->dev_private;
>>>>> (gdb) bt
>>>>> #0 hns3_dev_uninit (eth_dev=0x21cdb80 <rte_eth_devices>) at
>>>>> ../drivers/net/hns3/hns3_ethdev.c:7806
>>>>> #1 0x0000000000bb8668 in rte_eth_dev_pci_generic_remove (pci_dev=0x247a600,
>>>>> dev_uninit=0xbca7c4 <hns3_dev_uninit>) at ../lib/ethdev/ethdev_pci.h:155
>>>>> #2 0x0000000000bca89c in eth_hns3_pci_remove (pci_dev=0x247a600)
>>>>> at ../drivers/net/hns3/hns3_ethdev.c:7833
>>>>> #3 0x00000000007e85f4 in rte_pci_detach_dev (dev=0x247a600) at
>>>>> ../drivers/bus/pci/pci_common.c:287
>>>>> #4 0x00000000007e8f14 in pci_unplug (dev=0x247a610) at
>>>>> ../drivers/bus/pci/pci_common.c:570
>>>>> #5 0x0000000000775678 in local_dev_remove (dev=0x247a610) at
>>>>> ../lib/eal/common/eal_common_dev.c:319
>>>>> #6 0x000000000078f114 in __handle_primary_request (param=0xfffff00008c0)
>>>>> at ../lib/eal/common/hotplug_mp.c:284
>>>>> #7 0x000000000079ebf4 in eal_alarm_callback (arg=0x0) at
>>>>> ../lib/eal/linux/eal_alarm.c:92
>>>>> #8 0x00000000007a3f30 in eal_intr_process_interrupts (events=0xfffff7a0a3e0,
>>>>> nfds=1)
>>>>> at ../lib/eal/linux/eal_interrupts.c:998
>>>>> #9 0x00000000007a4224 in eal_intr_handle_interrupts (pfd=10, totalfds=2)
>>>>> at ../lib/eal/linux/eal_interrupts.c:1071
>>>>> #10 0x00000000007a442c in eal_intr_thread_main (arg=0x0) at
>>>>> ../lib/eal/linux/eal_interrupts.c:1142
>>>>> #11 0x0000000000789388 in ctrl_thread_init (arg=0x246ef40)
>>>>> at ../lib/eal/common/eal_common_thread.c:202
>>>>> #12 0x0000fffff7bf3c48 in start_thread () from /lib64/libpthread.so.0
>>>>> #13 0x0000fffff7b45600 in thread_start () from /lib64/libc.so.6
>>>>>
>>>>> **************************************************************************************************************************************************
>>>>>
>>>>>
>>>>>
>>>>> Note: above log is from the secondary process.
>>>>>
>>>>>>>> And again, please re-consider calling 'rte_eth_dev_close()' or
>>>>>>>> 'rte_dev_remove()' from the secondary process.
>>>>>>>>
>>>>>>>>>>>> + * Namely, whether "eth_dev" is NULL cannot be used to determine
>>>>>>>>>>>> whether
>>>>>>>>>>>> + * an ethdev port has been released.
>>>>>>>>>>>> + * For both primary process and secondary process,
>>>>>>>>>>>> eth_dev->state is
>>>>>>>>>>>> + * RTE_ETH_DEV_UNUSED, which means the ethdev port has been
>>>>>>>>>>>> released.
>>>>>>>>>>>> + */
>>>>>>>>>>>> + if (eth_dev->state == RTE_ETH_DEV_UNUSED) {
>>>>>>>>>>>> + RTE_ETHDEV_LOG(INFO, "The ethdev port has been released.");
>>>>>>>>>>>> + return 0;
>>>>>>>>>>>> + }
>>>>>>>>>> .
>>>>>>>> .
>>>>>> .
>>>> .
>> .
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [dpdk-dev] [RFC V2] ethdev: fix issue that dev close in PMD calls twice
2021-09-30 11:01 ` Ferruh Yigit
@ 2021-10-08 6:13 ` lihuisong (C)
0 siblings, 0 replies; 33+ messages in thread
From: lihuisong (C) @ 2021-10-08 6:13 UTC (permalink / raw)
To: Ferruh Yigit, Singh, Aman Deep, Thomas Monjalon
Cc: Andrew Rybchenko, dev, Anatoly Burakov, David Marchand
在 2021/9/30 19:01, Ferruh Yigit 写道:
> On 9/30/2021 11:54 AM, Huisong Li wrote:
>> 在 2021/9/28 15:19, Singh, Aman Deep 写道:
>>> On 9/22/2021 9:01 AM, Huisong Li wrote:
>>>> 在 2021/9/20 22:07, Ferruh Yigit 写道:
>>>>> On 8/25/2021 10:53 AM, Huisong Li wrote:
>>>>>> 在 2021/8/24 22:42, Ferruh Yigit 写道:
>>>>>>> On 8/19/2021 4:45 AM, Huisong Li wrote:
>>>>>>>> 在 2021/8/18 19:24, Ferruh Yigit 写道:
>>>>>>>>> On 8/13/2021 9:16 AM, Huisong Li wrote:
>>>>>>>>>> 在 2021/8/13 14:12, Thomas Monjalon 写道:
>>>>>>>>>>> 13/08/2021 04:11, Huisong Li:
>>>>>>>>>>>> Hi, all
>>>>>>>>>>>>
>>>>>>>>>>>> This patch can enhance the security of device uninstallation to
>>>>>>>>>>>> eliminate dependency on user usage methods.
>>>>>>>>>>>>
>>>>>>>>>>>> Can you check this patch?
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> 在 2021/8/3 10:30, Huisong Li 写道:
>>>>>>>>>>>>> Ethernet devices in DPDK can be released by rte_eth_dev_close() and
>>>>>>>>>>>>> rte_dev_remove(). These APIs both call xxx_dev_close() in PMD layer
>>>>>>>>>>>>> to uninstall hardware. However, the two APIs do not have explicit
>>>>>>>>>>>>> invocation restrictions. In other words, at the ethdev layer, it is
>>>>>>>>>>>>> possible to call rte_eth_dev_close() before calling rte_dev_remove()
>>>>>>>>>>>>> or rte_eal_hotplug_remove(). In such a bad scenario,
>>>>>>>>>>> It is not a bad scenario.
>>>>>>>>>>> If there is no more port for the device after calling close,
>>>>>>>>>>> the device should be removed automatically.
>>>>>>>>>>> Keep in mind "close" is for one port, "remove" is for the entire device
>>>>>>>>>>> which can have more than one port.
>>>>>>>>>> I know.
>>>>>>>>>>
>>>>>>>>>> dev_close() is for removing an eth device. And rte_dev_remove() can be
>>>>>>>>>> used
>>>>>>>>>>
>>>>>>>>>> for removing the rte device and all its eth devices belonging to the rte
>>>>>>>>>> device.
>>>>>>>>>>
>>>>>>>>>> In rte_dev_remove(), "remove" is executed in primary or one of secondary,
>>>>>>>>>>
>>>>>>>>>> all eth devices having same pci address will be closed and removed.
>>>>>>>>>>
>>>>>>>>>>>>> the primary
>>>>>>>>>>>>> process may be fine, but it may cause that xxx_dev_close() in the PMD
>>>>>>>>>>>>> layer will be called twice in the secondary process. So this patch
>>>>>>>>>>>>> fixes it.
>>>>>>>>>>> If a port is closed in primary, it should be the same in secondary.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>> + /*
>>>>>>>>>>>>> + * The eth_dev->data->name doesn't be cleared by the secondary
>>>>>>>>>>>>> process,
>>>>>>>>>>>>> + * so above "eth_dev" isn't NULL after rte_eth_dev_close() called.
>>>>>>>>>>> This assumption is not clear. All should be closed together.
>>>>>>>>>> However, dev_close() does not have the feature similar to
>>>>>>>>>> rte_dev_remove().
>>>>>>>>>>
>>>>>>>>>> Namely, it is not guaranteed that all eth devices are closed together in
>>>>>>>>>> ethdev
>>>>>>>>>> layer. It depends on app or user.
>>>>>>>>>>
>>>>>>>>>> If the app does not close together, the operation of repeatedly
>>>>>>>>>> uninstalling an
>>>>>>>>>> eth device in the secondary process
>>>>>>>>>>
>>>>>>>>>> will be triggered when dev_close() is first called by one secondary
>>>>>>>>>> process, and
>>>>>>>>>> then rte_dev_remove() is called.
>>>>>>>>>>
>>>>>>>>>> So I think it should be avoided.
>>>>>>>>> First of all, I am not sure about calling 'rte_eth_dev_close()' or
>>>>>>>>> 'rte_dev_remove()' from the secondary process.
>>>>>>>>> There are explicit checks in various locations to prevent clearing
>>>>>>>>> resources
>>>>>>>>> completely from secondary process.
>>>>>>>> There's no denying that.
>>>>>>>>
>>>>>>>> Generally, hardware resources of eth device and shared data of the
>>>>>>>> primary and
>>>>>>>> secondary process
>>>>>>>>
>>>>>>>> are cleared by primary, which are controled by ethdev layer or PMD layer.
>>>>>>>>
>>>>>>>> But there may be some private data or resources of each process (primary or
>>>>>>>> secondary ), such as mp action
>>>>>>>>
>>>>>>>> registered by rte_mp_action_register() or others. For these resources, the
>>>>>>>> secondary process still needs to clear.
>>>>>>>>
>>>>>>>> Namely, both primary and secondary processes need to prevent repeated
>>>>>>>> offloading
>>>>>>>> of resources.
>>>>>>>>
>>>>>>>>> Calling 'rte_eth_dev_close()' or 'rte_dev_remove()' by secondary is
>>>>>>>>> technically
>>>>>>>>> can be done but application needs to be extra cautious and should take
>>>>>>>>> extra
>>>>>>>>> measures and synchronization to make it work.
>>>>>>>>> Regular use-case is secondary processes do the packet processing and all
>>>>>>>>> control
>>>>>>>>> commands run by primary.
>>>>>>>> You are right. We have a consensus that 'rte_eth_dev_close()' or
>>>>>>>> 'rte_dev_remove()'
>>>>>>>>
>>>>>>>> can be called by primary and secondary processes.
>>>>>>>>
>>>>>>>> But DPDK framework cannot assume user behavior.😁
>>>>>>>>
>>>>>>>> We need to make it more secure and reliable for both primary and secondary
>>>>>>>> processes.
>>>>>>>>
>>>>>>>>> In primary, if you call 'rte_eth_dev_close()' it will clear all ethdev
>>>>>>>>> resources
>>>>>>>>> and further 'rte_dev_remove()' call will detect missing ethdev resources
>>>>>>>>> and
>>>>>>>>> won't try to clear them again.
>>>>>>>>>
>>>>>>>>> In secondary, if you call 'rte_eth_dev_close()', it WON'T clear all
>>>>>>>>> resources
>>>>>>>>> and further 'rte_dev_remove()' call (either from primary or secondary)
>>>>>>>>> will try
>>>>>>>>> to clean ethdev resources again. You are trying to prevent this retry in
>>>>>>>>> remove
>>>>>>>>> happening for secondary process.
>>>>>>>> Right. However, if secondary process in PMD layer has its own private
>>>>>>>> resources
>>>>>>>> to be
>>>>>>>>
>>>>>>>> cleared, it still need to do it by calling 'rte_eth_dev_close()' or
>>>>>>>> 'rte_dev_remove()'.
>>>>>>>>
>>>>>>>>> In secondary it won't free ethdev resources anyway if you let it continue,
>>>>>>>>> but I
>>>>>>>>> guess here you are trying to prevent the PMD dev_close() called again.
>>>>>>>>> Why? Is
>>>>>>>>> it just for optimization or does it cause unexpected behavior in the PMD?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Overall, to free resources you need to do the 'rte_eth_dev_close()' or
>>>>>>>>> 'rte_dev_remove()' in the primary anyway. So instead of this workaround, I
>>>>>>>>> would
>>>>>>>>> suggest making PMD dev_close() safe to be called multiple times (if this
>>>>>>>>> is the
>>>>>>>>> problem.)
>>>>>>>> In conclusion, primary and secondary processes in PMD layer may have
>>>>>>>> their own
>>>>>>>>
>>>>>>>> private data and resources, which need to be processed and released.
>>>>>>>>
>>>>>>>> Currently, these for PMD are either handled and cleaned up in
>>>>>>>> dev_close() or
>>>>>>>> remove().
>>>>>>>>
>>>>>>>> However, code streams in rte_dev_remove() cannot ensure that the
>>>>>>>> uninstallation
>>>>>>>>
>>>>>>>> from secondary process will not be repeated if rte_eth_dev_close() is first
>>>>>>>> called by
>>>>>>>>
>>>>>>>> secondary(primary is ok, plese review this patch).
>>>>>>>>
>>>>>>>> I think, this is the same for each PMD and is better suited to doing it in
>>>>>>>> ethdev layer.
>>>>>>>>
>>>>>>> This patch prevents to call dev_close() twice in the secondary process, is
>>>>>>> this
>>>>>>> fixing a theoretical problem or an actual problem?
>>>>>>>
>>>>>>> If it is an actual problem can you please provide details, callstack of the
>>>>>>> problematic case?
>>>>>> We analyzed the code when modifying the bug and found that the problem did
>>>>>> exist.
>>>>>>
>>>>>> The ethdev layer did not guarantee the security.
>>>>>>
>>>>> I was wondering if there is a crash for an unexpected path, for below case the
>>>>> primary process check in the 'hns3_dev_uninit()' should already prevent
>>>>> anything
>>>>> unexpected. So I assume this is a fix for a theoretical issue.
>>>> Yes. For primary process, hns3 can prevent multiple device uninstallation
>>>> through
>>>>
>>>> "adapter_state" controled by primary process.
>>>>
>>>> The problem of multiple device uninstallation has been prevented at
>>>> rte_eth_dev_pci_generic_remove()
>>>>
>>>> in the ethdev layer, as described in the patch.
>>>>
>>>> In primary process, rte_eth_dev_allocated() in
>>>> rte_eth_dev_pci_generic_remove() will return NULL
>>>>
>>>> when first calling dev_close(), and then calling rte_dev_remove(). So it is
>>>> ok. But the logic can not
>>>>
>>>> prevent the same case in secondary because secondary does not clear dev->data.
>>>>
>>>>> In secondary process, these init/uninit device is already not very safe. For
>>>>> example, as far as I can see in secondary if you hot remove a device device and
>>>>> hot plug a new one, new device will use wrong device data (since hot remove
>>>>> won't clear device data, new one will continue to use it).
>>>> No. If we hot remove a device in secondary, the secondary will request its
>>>> primary
>>>>
>>>> send "remove device" message to all secondaries. After all secondaries are
>>>> removed,
>>>>
>>>> the primary also removes the device and the device data will be cleared.
>>>>
>>>> This is the logic of rte_dev_remove().
>>>>
>>>>> So I am not sure about adding secondary process related checks in that area and
>>>>> causing a false sense of security, and polluting the logic with secondary
>>>>> specific checks.
>>>>> Also the check you add may hit by primary process and I am worried on an
>>>>> unexpected side affect it cause.
>>>>>
>>>>>
>>>>> As said above I am not sure about this new check, but even we continue with it,
>>>>> what about wrapping the check with secondary process check, at least to be sure
>>>>> there won't be any side affect for primary process.
>>>>>
>>>> If app hot remove device in primary/secondary, the original logic is still
>>>> used in this interface, because of
>>>>
>>>> bing RTE_ETH_DEV_ATTACHED state for eth_dev. If app first calls dev_close(),
>>>> eth device is
>>>>
>>>> RTE_ETH_DEV_UNUSED state in primary/secondary. In this issue case, this
>>>> interface is still return from the
>>>>
>>>> original place instead of the new check.
>>>>
>>>> So I don't think the new check will affect the primary process. Conversely,
>>>> it's safer for secondary processes.
>>>>
>>>> Please check it again, thanks.
>>> As this issue is specific to secondary process, can we have these change under
>>> a check like "if (rte_eal_process_type() != RTE_PROC_PRIMARY)"
>>>
>>> By this we can avoid any side effect of these changes for primary process and
>>> also it makes code readablity easier for designers who are not checking
>>> secondary process changes.
>> yes.
>>
>> @Ferruh, what do you think?
>>
> +1 to have it, it clarifies the intention of the check, plus prevents possible
> sice affect for primary as Aman mentioned.
ok. I will modify it in patch V1.
>
>>>>>> The general function of the two interfaces is as follows:
>>>>>>
>>>>>> rte_eth_dev_close() --> release eth device
>>>>>>
>>>>>> rte_dev_remove() --> release eth device + remove and free
>>>>>> rte_pci_device(primary andsecondary).
>>>>>>
>>>>>> According to the OVS application scenario, first call dev_close() and then
>>>>>> call
>>>>>>
>>>>>> remove(), which is possible.
>>>>>>
>>>>>> We constructed this scenario using testpmd to start the secondary process(the
>>>>>> multi-process
>>>>>>
>>>>>> patch of testpmd is being uploaded.). It is proved that the ethdev layer
>>>>>> cannot
>>>>>> guarantee
>>>>>>
>>>>>> this security. The callstack is as follows:
>>>>>>
>>>>>> **************************************************************************************************************************************************
>>>>>>
>>>>>>
>>>>>>
>>>>>> gdb) set args -a 0000:7d:00.0 -l 2-3 --file-prefix=rte_lee --proc-type=auto --
>>>>>> -i --rxq 4 --txq 4 --num-procs=2 --proc-id=1
>>>>>> (gdb) b hns3_dev_uninit
>>>>>> Breakpoint 1 at 0xbca7d0: file ../drivers/net/hns3/hns3_ethdev.c, line 7806.
>>>>>> (gdb) b hns3_dev_close
>>>>>> Breakpoint 2 at 0xbc6d44: file ../drivers/net/hns3/hns3_ethdev.c, line 6189.
>>>>>> (gdb) r
>>>>>> Starting program:
>>>>>> /home/lihuisong/v500/gcov-test/dpdk/build/app/dpdk-testpmd -a
>>>>>> 0000:7d:00.0 -l 2-3 --file-prefix=rte_lee --proc-type=auto -- -i --rxq 4
>>>>>> --txq 4
>>>>>> --num-procs=2 --proc-id=1
>>>>>> Missing separate debuginfo for /root/lib/libnuma.so.1
>>>>>> Try: yum --enablerepo='*debug*' install
>>>>>> /usr/lib/debug/.build-id/ce/4eea0b0f2150f70a080bbd8835e43e78373096.debug
>>>>>> [Thread debugging using libthread_db enabled]
>>>>>> Using host libthread_db library "/lib64/libthread_db.so.1".
>>>>>> EAL: Detected 128 lcore(s)
>>>>>> EAL: Detected 4 NUMA nodes
>>>>>> EAL: Auto-detected process type: SECONDARY
>>>>>> EAL: Detected static linkage of DPDK
>>>>>> [New Thread 0xfffff7a0ad10 (LWP 17717)]
>>>>>> EAL: Multi-process socket /var/run/dpdk/rte_lee/mp_socket_17714_66aa8d3a60a9
>>>>>> [New Thread 0xfffff7209d10 (LWP 17718)]
>>>>>> EAL: Selected IOVA mode 'VA'
>>>>>> EAL: VFIO support initialized
>>>>>> [New Thread 0xfffff69f8d10 (LWP 17719)]
>>>>>> EAL: Using IOMMU type 1 (Type 1)
>>>>>> EAL: Probe PCI driver: net_hns3 (19e5:a222) device: 0000:7d:00.0 (socket 0)
>>>>>> [New Thread 0xfffff61f7d10 (LWP 17720)]
>>>>>> TELEMETRY: No legacy callbacks, legacy socket not created
>>>>>> Interactive-mode selected
>>>>>> 0000:7d:00.0 hns3_dev_mtu_set(): Failed to set mtu, port 0 must be stopped
>>>>>> before configuration
>>>>>> Failed to set MTU to 1500 for port 0
>>>>>> testpmd: create a new mbuf pool <mb_pool_0>: n=155456, size=2176, socket=0
>>>>>> testpmd: preferred mempool ops selected: ring_mp_mc
>>>>>>
>>>>>> Warning! port-topology=paired and odd forward ports number, the last port will
>>>>>> pair with itself.
>>>>>>
>>>>>> Configuring Port 0 (socket 0)
>>>>>> Port 0: 00:18:2D:00:00:9E
>>>>>> Checking link statuses...
>>>>>> Done
>>>>>> testpmd> port close 0
>>>>>> Closing ports...
>>>>>>
>>>>>> Breakpoint 2, hns3_dev_close (eth_dev=0x21cdb80 <rte_eth_devices>)
>>>>>> at ../drivers/net/hns3/hns3_ethdev.c:6189
>>>>>> 6189 struct hns3_adapter *hns = eth_dev->data->dev_private;
>>>>>> Missing separate debuginfos, use: debuginfo-install glibc-2.17-260.el7.aarch64
>>>>>> libpcap-1.5.3-11.el7.aarch64 openssl-libs-1.0.2k-16.el7.aarch64
>>>>>> zlib-1.2.7-18.el7.aarch64
>>>>>> (gdb) bt
>>>>>> #0 hns3_dev_close (eth_dev=0x21cdb80 <rte_eth_devices>) at
>>>>>> ../drivers/net/hns3/hns3_ethdev.c:6189
>>>>>> #1 0x0000000000742eac in rte_eth_dev_close (port_id=0) at
>>>>>> ../lib/ethdev/rte_ethdev.c:1870
>>>>>> #2 0x0000000000542a8c in close_port (pid=0) at ../app/test-pmd/testpmd.c:2895
>>>>>> #3 0x00000000004df82c in cmd_operate_specific_port_parsed
>>>>>> (parsed_result=0xffffffffb0f0,
>>>>>> cl=0x2475080, data=0x0) at ../app/test-pmd/cmdline.c:1272
>>>>>> #4 0x0000000000738ce4 in cmdline_parse (cl=0x2475080, buf=0x24750c8 "port
>>>>>> close
>>>>>> 0\n")
>>>>>> at ../lib/cmdline/cmdline_parse.c:290
>>>>>> #5 0x0000000000736b7c in cmdline_valid_buffer (rdl=0x2475090, buf=0x24750c8
>>>>>> "port close 0\n",
>>>>>> size=14) at ../lib/cmdline/cmdline.c:26
>>>>>> #6 0x000000000073c078 in rdline_char_in (rdl=0x2475090, c=10 '\n')
>>>>>> at ../lib/cmdline/cmdline_rdline.c:421
>>>>>> #7 0x0000000000736fcc in cmdline_in (cl=0x2475080, buf=0xfffffffff25f
>>>>>> "\n\200\362\377\377\377\377",
>>>>>> size=1) at ../lib/cmdline/cmdline.c:149
>>>>>> #8 0x0000000000737270 in cmdline_interact (cl=0x2475080) at
>>>>>> ../lib/cmdline/cmdline.c:223
>>>>>> #9 0x00000000004f0ccc in prompt () at ../app/test-pmd/cmdline.c:17882
>>>>>> #10 0x0000000000545528 in main (argc=8, argv=0xfffffffff470) at
>>>>>> ../app/test-pmd/testpmd.c:3998
>>>>>> (gdb) c
>>>>>> Continuing.
>>>>>> Port 0 is closed
>>>>>> Done
>>>>>> testpmd> device detach 0000:7d:00.0
>>>>>> Removing a device...
>>>>>> [Switching to Thread 0xfffff7a0ad10 (LWP 17717)]
>>>>>>
>>>>>> Breakpoint 1, hns3_dev_uninit (eth_dev=0x21cdb80 <rte_eth_devices>)
>>>>>> at ../drivers/net/hns3/hns3_ethdev.c:7806
>>>>>> 7806 struct hns3_adapter *hns = eth_dev->data->dev_private;
>>>>>> (gdb) bt
>>>>>> #0 hns3_dev_uninit (eth_dev=0x21cdb80 <rte_eth_devices>) at
>>>>>> ../drivers/net/hns3/hns3_ethdev.c:7806
>>>>>> #1 0x0000000000bb8668 in rte_eth_dev_pci_generic_remove (pci_dev=0x247a600,
>>>>>> dev_uninit=0xbca7c4 <hns3_dev_uninit>) at ../lib/ethdev/ethdev_pci.h:155
>>>>>> #2 0x0000000000bca89c in eth_hns3_pci_remove (pci_dev=0x247a600)
>>>>>> at ../drivers/net/hns3/hns3_ethdev.c:7833
>>>>>> #3 0x00000000007e85f4 in rte_pci_detach_dev (dev=0x247a600) at
>>>>>> ../drivers/bus/pci/pci_common.c:287
>>>>>> #4 0x00000000007e8f14 in pci_unplug (dev=0x247a610) at
>>>>>> ../drivers/bus/pci/pci_common.c:570
>>>>>> #5 0x0000000000775678 in local_dev_remove (dev=0x247a610) at
>>>>>> ../lib/eal/common/eal_common_dev.c:319
>>>>>> #6 0x000000000078f114 in __handle_primary_request (param=0xfffff00008c0)
>>>>>> at ../lib/eal/common/hotplug_mp.c:284
>>>>>> #7 0x000000000079ebf4 in eal_alarm_callback (arg=0x0) at
>>>>>> ../lib/eal/linux/eal_alarm.c:92
>>>>>> #8 0x00000000007a3f30 in eal_intr_process_interrupts (events=0xfffff7a0a3e0,
>>>>>> nfds=1)
>>>>>> at ../lib/eal/linux/eal_interrupts.c:998
>>>>>> #9 0x00000000007a4224 in eal_intr_handle_interrupts (pfd=10, totalfds=2)
>>>>>> at ../lib/eal/linux/eal_interrupts.c:1071
>>>>>> #10 0x00000000007a442c in eal_intr_thread_main (arg=0x0) at
>>>>>> ../lib/eal/linux/eal_interrupts.c:1142
>>>>>> #11 0x0000000000789388 in ctrl_thread_init (arg=0x246ef40)
>>>>>> at ../lib/eal/common/eal_common_thread.c:202
>>>>>> #12 0x0000fffff7bf3c48 in start_thread () from /lib64/libpthread.so.0
>>>>>> #13 0x0000fffff7b45600 in thread_start () from /lib64/libc.so.6
>>>>>>
>>>>>> **************************************************************************************************************************************************
>>>>>>
>>>>>>
>>>>>>
>>>>>> Note: above log is from the secondary process.
>>>>>>
>>>>>>>>> And again, please re-consider calling 'rte_eth_dev_close()' or
>>>>>>>>> 'rte_dev_remove()' from the secondary process.
>>>>>>>>>
>>>>>>>>>>>>> + * Namely, whether "eth_dev" is NULL cannot be used to determine
>>>>>>>>>>>>> whether
>>>>>>>>>>>>> + * an ethdev port has been released.
>>>>>>>>>>>>> + * For both primary process and secondary process,
>>>>>>>>>>>>> eth_dev->state is
>>>>>>>>>>>>> + * RTE_ETH_DEV_UNUSED, which means the ethdev port has been
>>>>>>>>>>>>> released.
>>>>>>>>>>>>> + */
>>>>>>>>>>>>> + if (eth_dev->state == RTE_ETH_DEV_UNUSED) {
>>>>>>>>>>>>> + RTE_ETHDEV_LOG(INFO, "The ethdev port has been released.");
>>>>>>>>>>>>> + return 0;
>>>>>>>>>>>>> + }
>>>>>>>>>>> .
>>>>>>>>> .
>>>>>>> .
>>>>> .
>>> .
> .
>
^ permalink raw reply [flat|nested] 33+ messages in thread
* [dpdk-dev] [PATCH] ethdev: fix eth device released repeatedly
2021-08-02 12:46 [dpdk-dev] [RFC V1] ethdev: fix the issue that dev uninit may be called twice Huisong Li
2021-08-03 2:30 ` [dpdk-dev] [RFC V2] ethdev: fix issue that dev close in PMD calls twice Huisong Li
2021-08-18 9:47 ` [dpdk-dev] [RFC V1] ethdev: fix the issue that dev uninit may be called twice Singh, Aman Deep
@ 2021-10-08 8:21 ` Huisong Li
2021-10-08 10:23 ` Thomas Monjalon
2021-10-12 11:39 ` [dpdk-dev] [PATCH V2] " Huisong Li
` (2 subsequent siblings)
5 siblings, 1 reply; 33+ messages in thread
From: Huisong Li @ 2021-10-08 8:21 UTC (permalink / raw)
To: dev
Cc: thomas, ferruh.yigit, aman.deep.singh, andrew.rybchenko,
anatoly.burakov, lihuisong
In secondary process, because it doesn't clear eth_dev->data, the "eth_dev"
above will not be NULL when rte_eth_dev_close() has been called before this
interface is called. In this case, Ethernet device will be released
repeatedly. The state of the Ethernet device is equal to RTE_ETH_DEV_UNUSED
after calling rte_eth_dev_close(). Using this state resolves problem.
Fixes: 99a2dd955fba ("lib: remove librte_ prefix from directory names")
Signed-off-by: Huisong Li <lihuisong@huawei.com>
---
RFC -> v1:
* fix commit log and add a judgment for secondary process.
---
lib/ethdev/ethdev_pci.h | 14 ++++++++++++++
1 file changed, 14 insertions(+)
diff --git a/lib/ethdev/ethdev_pci.h b/lib/ethdev/ethdev_pci.h
index 8edca82ce8..be695feefe 100644
--- a/lib/ethdev/ethdev_pci.h
+++ b/lib/ethdev/ethdev_pci.h
@@ -151,6 +151,20 @@ rte_eth_dev_pci_generic_remove(struct rte_pci_device *pci_dev,
if (!eth_dev)
return 0;
+ /*
+ * In secondary process, because it doesn't clear eth_dev->data, the
+ * "eth_dev" above will not be NULL when rte_eth_dev_close() has been
+ * called before this interface is called.
+ * In this case, Ethernet device will be released repeatedly.
+ * The state of the Ethernet device is equal to RTE_ETH_DEV_UNUSED after
+ * calling rte_eth_dev_close(). Using this state resolves problem.
+ */
+ if (rte_eal_process_type() != RTE_PROC_PRIMARY &&
+ eth_dev->state == RTE_ETH_DEV_UNUSED) {
+ RTE_ETHDEV_LOG(INFO, "The ethdev port has been released.");
+ return 0;
+ }
+
if (dev_uninit) {
ret = dev_uninit(eth_dev);
if (ret)
--
2.33.0
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [dpdk-dev] [PATCH] ethdev: fix eth device released repeatedly
2021-10-08 8:21 ` [dpdk-dev] [PATCH] ethdev: fix eth device released repeatedly Huisong Li
@ 2021-10-08 10:23 ` Thomas Monjalon
2021-10-09 1:29 ` lihuisong (C)
0 siblings, 1 reply; 33+ messages in thread
From: Thomas Monjalon @ 2021-10-08 10:23 UTC (permalink / raw)
To: Huisong Li
Cc: dev, ferruh.yigit, aman.deep.singh, andrew.rybchenko,
anatoly.burakov, lihuisong
08/10/2021 10:21, Huisong Li:
> In secondary process, because it doesn't clear eth_dev->data, the "eth_dev"
> above will not be NULL when rte_eth_dev_close() has been called before this
> interface is called. In this case, Ethernet device will be released
> repeatedly. The state of the Ethernet device is equal to RTE_ETH_DEV_UNUSED
> after calling rte_eth_dev_close(). Using this state resolves problem.
Sorry I have difficulties to understand.
The use of "it" everywhere doesn't help.
You should name things instead of refering to "it".
> Fixes: 99a2dd955fba ("lib: remove librte_ prefix from directory names")
For sure that's not the root cause.
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [dpdk-dev] [PATCH] ethdev: fix eth device released repeatedly
2021-10-08 10:23 ` Thomas Monjalon
@ 2021-10-09 1:29 ` lihuisong (C)
0 siblings, 0 replies; 33+ messages in thread
From: lihuisong (C) @ 2021-10-09 1:29 UTC (permalink / raw)
To: Thomas Monjalon
Cc: dev, ferruh.yigit, aman.deep.singh, andrew.rybchenko, anatoly.burakov
在 2021/10/8 18:23, Thomas Monjalon 写道:
> 08/10/2021 10:21, Huisong Li:
>> In secondary process, because it doesn't clear eth_dev->data, the "eth_dev"
>> above will not be NULL when rte_eth_dev_close() has been called before this
>> interface is called. In this case, Ethernet device will be released
>> repeatedly. The state of the Ethernet device is equal to RTE_ETH_DEV_UNUSED
>> after calling rte_eth_dev_close(). Using this state resolves problem.
> Sorry I have difficulties to understand.
> The use of "it" everywhere doesn't help.
> You should name things instead of refering to "it".
ok. I will fix it. Thanks.
>> Fixes: 99a2dd955fba ("lib: remove librte_ prefix from directory names")
> For sure that's not the root cause.
>
>
>
> .
^ permalink raw reply [flat|nested] 33+ messages in thread
* [dpdk-dev] [PATCH V2] ethdev: fix eth device released repeatedly
2021-08-02 12:46 [dpdk-dev] [RFC V1] ethdev: fix the issue that dev uninit may be called twice Huisong Li
` (2 preceding siblings ...)
2021-10-08 8:21 ` [dpdk-dev] [PATCH] ethdev: fix eth device released repeatedly Huisong Li
@ 2021-10-12 11:39 ` Huisong Li
2021-10-12 15:33 ` Thomas Monjalon
2021-10-15 3:44 ` [dpdk-dev] [PATCH V3] " Huisong Li
2021-10-21 2:24 ` [dpdk-dev] [PATCH V4] " Huisong Li
5 siblings, 1 reply; 33+ messages in thread
From: Huisong Li @ 2021-10-12 11:39 UTC (permalink / raw)
To: dev
Cc: thomas, ferruh.yigit, aman.deep.singh, andrew.rybchenko,
anatoly.burakov, lihuisong
The rte_eth_dev_pci_generic_remove() will be called to detach an Ethernet
device when App calls rte_dev_remove() to detach a pci device. In addition,
the rte_eth_dev_close() can also detach an Ethernet device.
In secondary process, if App first calls rte_eth_dev_close() and then calls
rte_dev_remove(), because rte_eth_dev_close() doesn't clear "eth_dev->data"
, the address of the released Ethernet device can still be found by device
name. As a result, the Ethernet device will be released repeatedly in this
case. The state of the Ethernet device is equal to RTE_ETH_DEV_UNUSED after
calling rte_eth_dev_close(). Use this state to avoid this problem.
Signed-off-by: Huisong Li <lihuisong@huawei.com>
---
v1 -> v2:
* fix the commit log description.
RFC -> v1:
* fix commit log and add a judgment for secondary process.
---
lib/ethdev/ethdev_pci.h | 15 +++++++++++++++
1 file changed, 15 insertions(+)
diff --git a/lib/ethdev/ethdev_pci.h b/lib/ethdev/ethdev_pci.h
index 8edca82ce8..af01cceddf 100644
--- a/lib/ethdev/ethdev_pci.h
+++ b/lib/ethdev/ethdev_pci.h
@@ -151,6 +151,21 @@ rte_eth_dev_pci_generic_remove(struct rte_pci_device *pci_dev,
if (!eth_dev)
return 0;
+ /*
+ * In secondary process, if applications first call rte_eth_dev_close()
+ * and then call this interface, because rte_eth_dev_close() doesn't
+ * clear eth_dev->data, the address of the released Ethernet device can
+ * still be found by device name. As a result, the Ethernet device will
+ * be released repeatedly in this case.
+ * The state of the Ethernet device is equal to RTE_ETH_DEV_UNUSED after
+ * calling rte_eth_dev_close(). Use this state to avoid this problem.
+ */
+ if (rte_eal_process_type() != RTE_PROC_PRIMARY &&
+ eth_dev->state == RTE_ETH_DEV_UNUSED) {
+ RTE_ETHDEV_LOG(INFO, "The ethdev port has been released.");
+ return 0;
+ }
+
if (dev_uninit) {
ret = dev_uninit(eth_dev);
if (ret)
--
2.33.0
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [dpdk-dev] [PATCH V2] ethdev: fix eth device released repeatedly
2021-10-12 11:39 ` [dpdk-dev] [PATCH V2] " Huisong Li
@ 2021-10-12 15:33 ` Thomas Monjalon
2021-10-14 3:50 ` lihuisong (C)
2021-10-14 12:32 ` lihuisong (C)
0 siblings, 2 replies; 33+ messages in thread
From: Thomas Monjalon @ 2021-10-12 15:33 UTC (permalink / raw)
To: Huisong Li
Cc: dev, ferruh.yigit, aman.deep.singh, andrew.rybchenko, anatoly.burakov
12/10/2021 13:39, Huisong Li:
> The rte_eth_dev_pci_generic_remove() will be called to detach an Ethernet
> device when App calls rte_dev_remove() to detach a pci device. In addition,
> the rte_eth_dev_close() can also detach an Ethernet device.
> In secondary process, if App first calls rte_eth_dev_close() and then calls
> rte_dev_remove(), because rte_eth_dev_close() doesn't clear "eth_dev->data"
It would be clearer if you start this sentence with:
"In secondary process, rte_eth_dev_close() doesn't clear eth_dev->data."
Then you can explain that if calling rte_dev_remove() after rte_eth_dev_close(),
etc...
> , the address of the released Ethernet device can still be found by device
> name. As a result, the Ethernet device will be released repeatedly in this
> case. The state of the Ethernet device is equal to RTE_ETH_DEV_UNUSED after
> calling rte_eth_dev_close(). Use this state to avoid this problem.
>
> Signed-off-by: Huisong Li <lihuisong@huawei.com>
> ---
> + /*
> + * In secondary process, if applications first call rte_eth_dev_close()
> + * and then call this interface, because rte_eth_dev_close() doesn't
> + * clear eth_dev->data, the address of the released Ethernet device can
> + * still be found by device name. As a result, the Ethernet device will
> + * be released repeatedly in this case.
> + * The state of the Ethernet device is equal to RTE_ETH_DEV_UNUSED after
> + * calling rte_eth_dev_close(). Use this state to avoid this problem.
This is a comment for the commit log.
Inside the code, we should be more to the point.
I suggest this comment:
/* A released port can be found by its name in shared memory. */
> + */
> + if (rte_eal_process_type() != RTE_PROC_PRIMARY &&
Better to directly compare with RTE_PROC_SECONDARY
> + eth_dev->state == RTE_ETH_DEV_UNUSED) {
> + RTE_ETHDEV_LOG(INFO, "The ethdev port has been released.");
Not sure we need any log here.
> + return 0;
> + }
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [dpdk-dev] [PATCH V2] ethdev: fix eth device released repeatedly
2021-10-12 15:33 ` Thomas Monjalon
@ 2021-10-14 3:50 ` lihuisong (C)
2021-10-14 12:32 ` lihuisong (C)
1 sibling, 0 replies; 33+ messages in thread
From: lihuisong (C) @ 2021-10-14 3:50 UTC (permalink / raw)
To: Thomas Monjalon
Cc: dev, ferruh.yigit, aman.deep.singh, andrew.rybchenko, anatoly.burakov
在 2021/10/12 23:33, Thomas Monjalon 写道:
> 12/10/2021 13:39, Huisong Li:
>> The rte_eth_dev_pci_generic_remove() will be called to detach an Ethernet
>> device when App calls rte_dev_remove() to detach a pci device. In addition,
>> the rte_eth_dev_close() can also detach an Ethernet device.
>> In secondary process, if App first calls rte_eth_dev_close() and then calls
>> rte_dev_remove(), because rte_eth_dev_close() doesn't clear "eth_dev->data"
> It would be clearer if you start this sentence with:
> "In secondary process, rte_eth_dev_close() doesn't clear eth_dev->data."
> Then you can explain that if calling rte_dev_remove() after rte_eth_dev_close(),
> etc...
Right. Thanks!😁
>> , the address of the released Ethernet device can still be found by device
>> name. As a result, the Ethernet device will be released repeatedly in this
>> case. The state of the Ethernet device is equal to RTE_ETH_DEV_UNUSED after
>> calling rte_eth_dev_close(). Use this state to avoid this problem.
>>
>> Signed-off-by: Huisong Li <lihuisong@huawei.com>
>> ---
>> + /*
>> + * In secondary process, if applications first call rte_eth_dev_close()
>> + * and then call this interface, because rte_eth_dev_close() doesn't
>> + * clear eth_dev->data, the address of the released Ethernet device can
>> + * still be found by device name. As a result, the Ethernet device will
>> + * be released repeatedly in this case.
>> + * The state of the Ethernet device is equal to RTE_ETH_DEV_UNUSED after
>> + * calling rte_eth_dev_close(). Use this state to avoid this problem.
> This is a comment for the commit log.
> Inside the code, we should be more to the point.
> I suggest this comment:
> /* A released port can be found by its name in shared memory. */
ack
>
>> + */
>> + if (rte_eal_process_type() != RTE_PROC_PRIMARY &&
> Better to directly compare with RTE_PROC_SECONDARY
ack
>
>> + eth_dev->state == RTE_ETH_DEV_UNUSED) {
>> + RTE_ETHDEV_LOG(INFO, "The ethdev port has been released.");
> Not sure we need any log here.
ack
>
>> + return 0;
>> + }
>
>
> .
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [dpdk-dev] [PATCH V2] ethdev: fix eth device released repeatedly
2021-10-12 15:33 ` Thomas Monjalon
2021-10-14 3:50 ` lihuisong (C)
@ 2021-10-14 12:32 ` lihuisong (C)
2021-10-14 12:50 ` Thomas Monjalon
1 sibling, 1 reply; 33+ messages in thread
From: lihuisong (C) @ 2021-10-14 12:32 UTC (permalink / raw)
To: Thomas Monjalon
Cc: dev, ferruh.yigit, aman.deep.singh, andrew.rybchenko, anatoly.burakov
Hi, Thomas
*The commit log:*
In secondary process, rte_eth_dev_close() doesn't clear eth_dev->data.
If calling rte_dev_remove() after rte_eth_dev_close(), in
rte_eth_dev_pci_generic_remove()
function, the released eth device still can be found by its name in
shared memory.
As a result, the eth device will be released repeatedly. The state of
the eth device
is modified to RTE_ETH_DEV_UNUSED after rte_eth_dev_close(). So this
state can
be used to avoid this problem.
Is that will be more clear?
/*
* A released eth device can be found by its name in shared memory.
* If the state of the eth device is RTE_ETH_DEV_UNUSED, which means
* the eth device has been released.
*/
Is it ok to use the above description as a comment in the code?
Hope for your reply. Thanks.
在 2021/10/12 23:33, Thomas Monjalon 写道:
> 12/10/2021 13:39, Huisong Li:
>> The rte_eth_dev_pci_generic_remove() will be called to detach an Ethernet
>> device when App calls rte_dev_remove() to detach a pci device. In addition,
>> the rte_eth_dev_close() can also detach an Ethernet device.
>> In secondary process, if App first calls rte_eth_dev_close() and then calls
>> rte_dev_remove(), because rte_eth_dev_close() doesn't clear "eth_dev->data"
> It would be clearer if you start this sentence with:
> "In secondary process, rte_eth_dev_close() doesn't clear eth_dev->data."
> Then you can explain that if calling rte_dev_remove() after rte_eth_dev_close(),
> etc...
>
>> , the address of the released Ethernet device can still be found by device
>> name. As a result, the Ethernet device will be released repeatedly in this
>> case. The state of the Ethernet device is equal to RTE_ETH_DEV_UNUSED after
>> calling rte_eth_dev_close(). Use this state to avoid this problem.
>>
>> Signed-off-by: Huisong Li<lihuisong@huawei.com>
>> ---
>> + /*
>> + * In secondary process, if applications first call rte_eth_dev_close()
>> + * and then call this interface, because rte_eth_dev_close() doesn't
>> + * clear eth_dev->data, the address of the released Ethernet device can
>> + * still be found by device name. As a result, the Ethernet device will
>> + * be released repeatedly in this case.
>> + * The state of the Ethernet device is equal to RTE_ETH_DEV_UNUSED after
>> + * calling rte_eth_dev_close(). Use this state to avoid this problem.
> This is a comment for the commit log.
> Inside the code, we should be more to the point.
> I suggest this comment:
> /* A released port can be found by its name in shared memory. */
>
>> + */
>> + if (rte_eal_process_type() != RTE_PROC_PRIMARY &&
> Better to directly compare with RTE_PROC_SECONDARY
>
>> + eth_dev->state == RTE_ETH_DEV_UNUSED) {
>> + RTE_ETHDEV_LOG(INFO, "The ethdev port has been released.");
> Not sure we need any log here.
>
>> + return 0;
>> + }
>
>
> .
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [dpdk-dev] [PATCH V2] ethdev: fix eth device released repeatedly
2021-10-14 12:32 ` lihuisong (C)
@ 2021-10-14 12:50 ` Thomas Monjalon
2021-10-15 3:03 ` lihuisong (C)
0 siblings, 1 reply; 33+ messages in thread
From: Thomas Monjalon @ 2021-10-14 12:50 UTC (permalink / raw)
To: lihuisong (C)
Cc: dev, ferruh.yigit, aman.deep.singh, andrew.rybchenko, anatoly.burakov
14/10/2021 14:32, lihuisong (C):
> Hi, Thomas
>
> *The commit log:*
> In secondary process, rte_eth_dev_close() doesn't clear eth_dev->data.
> If calling rte_dev_remove() after rte_eth_dev_close(), in
> rte_eth_dev_pci_generic_remove()
> function, the released eth device still can be found by its name in
> shared memory.
> As a result, the eth device will be released repeatedly. The state of
> the eth device
> is modified to RTE_ETH_DEV_UNUSED after rte_eth_dev_close(). So this
> state can
> be used to avoid this problem.
>
> Is that will be more clear?
Yes, that's clear (at least for me).
> /*
> * A released eth device can be found by its name in shared memory.
> * If the state of the eth device is RTE_ETH_DEV_UNUSED, which means
> * the eth device has been released.
> */
>
> Is it ok to use the above description as a comment in the code?
Yes. One small change, I think "which" should be "it".
> Hope for your reply. Thanks.
Thanks
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [dpdk-dev] [PATCH V2] ethdev: fix eth device released repeatedly
2021-10-14 12:50 ` Thomas Monjalon
@ 2021-10-15 3:03 ` lihuisong (C)
0 siblings, 0 replies; 33+ messages in thread
From: lihuisong (C) @ 2021-10-15 3:03 UTC (permalink / raw)
To: Thomas Monjalon
Cc: dev, ferruh.yigit, aman.deep.singh, andrew.rybchenko, anatoly.burakov
在 2021/10/14 20:50, Thomas Monjalon 写道:
> 14/10/2021 14:32, lihuisong (C):
>> Hi, Thomas
>>
>> *The commit log:*
>> In secondary process, rte_eth_dev_close() doesn't clear eth_dev->data.
>> If calling rte_dev_remove() after rte_eth_dev_close(), in
>> rte_eth_dev_pci_generic_remove()
>> function, the released eth device still can be found by its name in
>> shared memory.
>> As a result, the eth device will be released repeatedly. The state of
>> the eth device
>> is modified to RTE_ETH_DEV_UNUSED after rte_eth_dev_close(). So this
>> state can
>> be used to avoid this problem.
>>
>> Is that will be more clear?
> Yes, that's clear (at least for me).
>
>> /*
>> * A released eth device can be found by its name in shared memory.
>> * If the state of the eth device is RTE_ETH_DEV_UNUSED, which means
>> * the eth device has been released.
>> */
>>
>> Is it ok to use the above description as a comment in the code?
> Yes. One small change, I think "which" should be "it".
Thanks. I will fix it.
>
>> Hope for your reply. Thanks.
> Thanks
>
>
>
>
> .
^ permalink raw reply [flat|nested] 33+ messages in thread
* [dpdk-dev] [PATCH V3] ethdev: fix eth device released repeatedly
2021-08-02 12:46 [dpdk-dev] [RFC V1] ethdev: fix the issue that dev uninit may be called twice Huisong Li
` (3 preceding siblings ...)
2021-10-12 11:39 ` [dpdk-dev] [PATCH V2] " Huisong Li
@ 2021-10-15 3:44 ` Huisong Li
2021-10-19 13:09 ` Ferruh Yigit
2021-10-21 2:24 ` [dpdk-dev] [PATCH V4] " Huisong Li
5 siblings, 1 reply; 33+ messages in thread
From: Huisong Li @ 2021-10-15 3:44 UTC (permalink / raw)
To: dev
Cc: thomas, ferruh.yigit, aman.deep.singh, andrew.rybchenko,
anatoly.burakov, lihuisong
In secondary process, rte_eth_dev_close() doesn't clear eth_dev->data.
If calling rte_dev_remove() after rte_eth_dev_close(),
in rte_eth_dev_pci_generic_remove() function, the released eth device still
can be found by its name in shared memory. As a result, the eth device will
be released repeatedly. The state of the eth device is modified to
RTE_ETH_DEV_UNUSED after rte_eth_dev_close(). So this state can be used to
avoid this problem.
Signed-off-by: Huisong Li <lihuisong@huawei.com>
---
v2 -> v3:
* fix the commit log description and the comment inside the code.
v1 -> v2:
* fix the commit log description.
RFC -> v1:
* fix commit log and add a judgment for secondary process.
---
lib/ethdev/ethdev_pci.h | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/lib/ethdev/ethdev_pci.h b/lib/ethdev/ethdev_pci.h
index 8edca82ce8..fcabae02fa 100644
--- a/lib/ethdev/ethdev_pci.h
+++ b/lib/ethdev/ethdev_pci.h
@@ -151,6 +151,16 @@ rte_eth_dev_pci_generic_remove(struct rte_pci_device *pci_dev,
if (!eth_dev)
return 0;
+ /*
+ * In secondary process, a released eth device can be found by its name
+ * in shared memory.
+ * If the state of the eth device is RTE_ETH_DEV_UNUSED, it means the
+ * eth device has been released.
+ */
+ if (rte_eal_process_type() == RTE_PROC_SECONDARY &&
+ eth_dev->state == RTE_ETH_DEV_UNUSED)
+ return 0;
+
if (dev_uninit) {
ret = dev_uninit(eth_dev);
if (ret)
--
2.33.0
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [dpdk-dev] [PATCH V3] ethdev: fix eth device released repeatedly
2021-10-15 3:44 ` [dpdk-dev] [PATCH V3] " Huisong Li
@ 2021-10-19 13:09 ` Ferruh Yigit
2021-10-21 2:31 ` lihuisong (C)
0 siblings, 1 reply; 33+ messages in thread
From: Ferruh Yigit @ 2021-10-19 13:09 UTC (permalink / raw)
To: Huisong Li, dev
Cc: thomas, aman.deep.singh, andrew.rybchenko, anatoly.burakov
On 10/15/2021 4:44 AM, Huisong Li wrote:
> In secondary process, rte_eth_dev_close() doesn't clear eth_dev->data.
> If calling rte_dev_remove() after rte_eth_dev_close(),
> in rte_eth_dev_pci_generic_remove() function, the released eth device still
> can be found by its name in shared memory. As a result, the eth device will
> be released repeatedly. The state of the eth device is modified to
> RTE_ETH_DEV_UNUSED after rte_eth_dev_close(). So this state can be used to
> avoid this problem.
>
Hi Huisong,
Can you please add Fixes line, and stable tag if the change is requested
for backport?
> Signed-off-by: Huisong Li <lihuisong@huawei.com>
> ---
> v2 -> v3:
> * fix the commit log description and the comment inside the code.
> v1 -> v2:
> * fix the commit log description.
> RFC -> v1:
> * fix commit log and add a judgment for secondary process.
> ---
> lib/ethdev/ethdev_pci.h | 10 ++++++++++
> 1 file changed, 10 insertions(+)
>
> diff --git a/lib/ethdev/ethdev_pci.h b/lib/ethdev/ethdev_pci.h
> index 8edca82ce8..fcabae02fa 100644
> --- a/lib/ethdev/ethdev_pci.h
> +++ b/lib/ethdev/ethdev_pci.h
> @@ -151,6 +151,16 @@ rte_eth_dev_pci_generic_remove(struct rte_pci_device *pci_dev,
> if (!eth_dev)
> return 0;
>
> + /*
> + * In secondary process, a released eth device can be found by its name
> + * in shared memory.
> + * If the state of the eth device is RTE_ETH_DEV_UNUSED, it means the
> + * eth device has been released.
> + */
> + if (rte_eal_process_type() == RTE_PROC_SECONDARY &&
> + eth_dev->state == RTE_ETH_DEV_UNUSED)
> + return 0;
> +
> if (dev_uninit) {
> ret = dev_uninit(eth_dev);
> if (ret)
>
^ permalink raw reply [flat|nested] 33+ messages in thread
* [dpdk-dev] [PATCH V4] ethdev: fix eth device released repeatedly
2021-08-02 12:46 [dpdk-dev] [RFC V1] ethdev: fix the issue that dev uninit may be called twice Huisong Li
` (4 preceding siblings ...)
2021-10-15 3:44 ` [dpdk-dev] [PATCH V3] " Huisong Li
@ 2021-10-21 2:24 ` Huisong Li
2021-10-21 21:19 ` Ferruh Yigit
5 siblings, 1 reply; 33+ messages in thread
From: Huisong Li @ 2021-10-21 2:24 UTC (permalink / raw)
To: dev
Cc: thomas, ferruh.yigit, aman.deep.singh, andrew.rybchenko,
anatoly.burakov, lihuisong
In secondary process, rte_eth_dev_close() doesn't clear eth_dev->data.
If calling rte_dev_remove() after rte_eth_dev_close(),
in rte_eth_dev_pci_generic_remove() function, the released eth device still
can be found by its name in shared memory. As a result, the eth device will
be released repeatedly. The state of the eth device is modified to
RTE_ETH_DEV_UNUSED after rte_eth_dev_close(). So this state can be used to
avoid this problem.
Fixes: dcd5c8112bc3 ("ethdev: add PCI driver helpers")
Cc: stable@dpdk.org
Signed-off-by: Huisong Li <lihuisong@huawei.com>
---
v4:
* add fixes line and stable tag.
v3:
* fix the commit log description and the comment inside the code.
v2:
* fix the commit log description.
v1:
* fix commit log and add a judgment for secondary process.
---
lib/ethdev/ethdev_pci.h | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/lib/ethdev/ethdev_pci.h b/lib/ethdev/ethdev_pci.h
index 8edca82ce8..fcabae02fa 100644
--- a/lib/ethdev/ethdev_pci.h
+++ b/lib/ethdev/ethdev_pci.h
@@ -151,6 +151,16 @@ rte_eth_dev_pci_generic_remove(struct rte_pci_device *pci_dev,
if (!eth_dev)
return 0;
+ /*
+ * In secondary process, a released eth device can be found by its name
+ * in shared memory.
+ * If the state of the eth device is RTE_ETH_DEV_UNUSED, it means the
+ * eth device has been released.
+ */
+ if (rte_eal_process_type() == RTE_PROC_SECONDARY &&
+ eth_dev->state == RTE_ETH_DEV_UNUSED)
+ return 0;
+
if (dev_uninit) {
ret = dev_uninit(eth_dev);
if (ret)
--
2.33.0
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [dpdk-dev] [PATCH V3] ethdev: fix eth device released repeatedly
2021-10-19 13:09 ` Ferruh Yigit
@ 2021-10-21 2:31 ` lihuisong (C)
0 siblings, 0 replies; 33+ messages in thread
From: lihuisong (C) @ 2021-10-21 2:31 UTC (permalink / raw)
To: Ferruh Yigit, dev
Cc: thomas, aman.deep.singh, andrew.rybchenko, anatoly.burakov
在 2021/10/19 21:09, Ferruh Yigit 写道:
> On 10/15/2021 4:44 AM, Huisong Li wrote:
>> In secondary process, rte_eth_dev_close() doesn't clear eth_dev->data.
>> If calling rte_dev_remove() after rte_eth_dev_close(),
>> in rte_eth_dev_pci_generic_remove() function, the released eth device
>> still
>> can be found by its name in shared memory. As a result, the eth
>> device will
>> be released repeatedly. The state of the eth device is modified to
>> RTE_ETH_DEV_UNUSED after rte_eth_dev_close(). So this state can be
>> used to
>> avoid this problem.
>>
>
> Hi Huisong,
>
> Can you please add Fixes line, and stable tag if the change is requested
> for backport?
ok. It has been added. Please review v4.
>
>
>> Signed-off-by: Huisong Li <lihuisong@huawei.com>
>> ---
>> v2 -> v3:
>> * fix the commit log description and the comment inside the code.
>> v1 -> v2:
>> * fix the commit log description.
>> RFC -> v1:
>> * fix commit log and add a judgment for secondary process.
>> ---
>> lib/ethdev/ethdev_pci.h | 10 ++++++++++
>> 1 file changed, 10 insertions(+)
>>
>> diff --git a/lib/ethdev/ethdev_pci.h b/lib/ethdev/ethdev_pci.h
>> index 8edca82ce8..fcabae02fa 100644
>> --- a/lib/ethdev/ethdev_pci.h
>> +++ b/lib/ethdev/ethdev_pci.h
>> @@ -151,6 +151,16 @@ rte_eth_dev_pci_generic_remove(struct
>> rte_pci_device *pci_dev,
>> if (!eth_dev)
>> return 0;
>> + /*
>> + * In secondary process, a released eth device can be found by
>> its name
>> + * in shared memory.
>> + * If the state of the eth device is RTE_ETH_DEV_UNUSED, it
>> means the
>> + * eth device has been released.
>> + */
>> + if (rte_eal_process_type() == RTE_PROC_SECONDARY &&
>> + eth_dev->state == RTE_ETH_DEV_UNUSED)
>> + return 0;
>> +
>> if (dev_uninit) {
>> ret = dev_uninit(eth_dev);
>> if (ret)
>>
>
> .
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [dpdk-dev] [PATCH V4] ethdev: fix eth device released repeatedly
2021-10-21 2:24 ` [dpdk-dev] [PATCH V4] " Huisong Li
@ 2021-10-21 21:19 ` Ferruh Yigit
0 siblings, 0 replies; 33+ messages in thread
From: Ferruh Yigit @ 2021-10-21 21:19 UTC (permalink / raw)
To: Huisong Li, dev
Cc: thomas, aman.deep.singh, andrew.rybchenko, anatoly.burakov
On 10/21/2021 3:24 AM, Huisong Li wrote:
> In secondary process, rte_eth_dev_close() doesn't clear eth_dev->data.
> If calling rte_dev_remove() after rte_eth_dev_close(),
> in rte_eth_dev_pci_generic_remove() function, the released eth device still
> can be found by its name in shared memory. As a result, the eth device will
> be released repeatedly. The state of the eth device is modified to
> RTE_ETH_DEV_UNUSED after rte_eth_dev_close(). So this state can be used to
> avoid this problem.
>
> Fixes: dcd5c8112bc3 ("ethdev: add PCI driver helpers")
> Cc: stable@dpdk.org
>
> Signed-off-by: Huisong Li <lihuisong@huawei.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Applied to dpdk-next-net/main, thanks.
^ permalink raw reply [flat|nested] 33+ messages in thread
end of thread, other threads:[~2021-10-21 21:19 UTC | newest]
Thread overview: 33+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-08-02 12:46 [dpdk-dev] [RFC V1] ethdev: fix the issue that dev uninit may be called twice Huisong Li
2021-08-03 2:30 ` [dpdk-dev] [RFC V2] ethdev: fix issue that dev close in PMD calls twice Huisong Li
2021-08-13 2:11 ` Huisong Li
2021-08-13 6:12 ` Thomas Monjalon
2021-08-13 8:16 ` Huisong Li
2021-08-18 11:24 ` Ferruh Yigit
2021-08-19 3:45 ` Huisong Li
2021-08-24 14:42 ` Ferruh Yigit
2021-08-25 9:53 ` Huisong Li
2021-09-04 1:23 ` Huisong Li
2021-09-18 3:31 ` Huisong Li
2021-09-20 14:07 ` Ferruh Yigit
2021-09-22 3:31 ` Huisong Li
2021-09-28 7:19 ` Singh, Aman Deep
2021-09-30 10:54 ` Huisong Li
2021-09-30 11:01 ` Ferruh Yigit
2021-10-08 6:13 ` lihuisong (C)
2021-08-18 9:47 ` [dpdk-dev] [RFC V1] ethdev: fix the issue that dev uninit may be called twice Singh, Aman Deep
2021-08-24 2:10 ` Huisong Li
2021-10-08 8:21 ` [dpdk-dev] [PATCH] ethdev: fix eth device released repeatedly Huisong Li
2021-10-08 10:23 ` Thomas Monjalon
2021-10-09 1:29 ` lihuisong (C)
2021-10-12 11:39 ` [dpdk-dev] [PATCH V2] " Huisong Li
2021-10-12 15:33 ` Thomas Monjalon
2021-10-14 3:50 ` lihuisong (C)
2021-10-14 12:32 ` lihuisong (C)
2021-10-14 12:50 ` Thomas Monjalon
2021-10-15 3:03 ` lihuisong (C)
2021-10-15 3:44 ` [dpdk-dev] [PATCH V3] " Huisong Li
2021-10-19 13:09 ` Ferruh Yigit
2021-10-21 2:31 ` lihuisong (C)
2021-10-21 2:24 ` [dpdk-dev] [PATCH V4] " Huisong Li
2021-10-21 21:19 ` Ferruh Yigit
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).