From: Huisong Li <lihuisong@huawei.com>
To: Thomas Monjalon <thomas@monjalon.net>
Cc: <dev@dpdk.org>, <ferruh.yigit@intel.com>, <david.marchand@redhat.com>
Subject: Re: [dpdk-dev] [RFC V1] examples/l3fwd-power: fix memory leak for rte_pci_device
Date: Sun, 26 Sep 2021 20:20:27 +0800
Message-ID: <430246ab-36ce-402f-8570-d305ada9d720@huawei.com>
In-Reply-To: <2004569.RrOHqjGOaX@thomas>
On 2021/9/18 16:46, Thomas Monjalon wrote:
> 18/09/2021 05:24, Huisong Li:
>> On 2021/9/17 20:50, Thomas Monjalon wrote:
>>> 17/09/2021 04:13, Huisong Li:
>>>> On 2021/9/16 18:36, Thomas Monjalon wrote:
>>>>> 16/09/2021 10:01, Huisong Li:
>>>>>> On 2021/9/8 15:20, Thomas Monjalon wrote:
>>>>>>> 08/09/2021 04:01, Huisong Li:
>>>>>>>> On 2021/9/7 16:53, Thomas Monjalon wrote:
>>>>>>>>> 07/09/2021 05:41, Huisong Li:
>>>>>>>>>> Calling rte_eth_dev_close() releases the resources of the eth device and
>>>>>>>>>> closes it. But the rte_pci_device struct isn't released when the app exits,
>>>>>>>>>> which leads to a memory leak.
>>>>>>>>> That's a PMD issue.
>>>>>>>>> When the last port of a PCI device is closed, the device should be freed.
>>>>>>>> Why is this a PMD problem? I don't understand.
>>>>>>> In the PMD close function, freeing of PCI device must be managed,
>>>>>>> so the app doesn't have to bother.
>>>>>> I know what you mean. Currently, there are two ways to close a PMD device:
>>>>>> rte_eth_dev_close() and rte_dev_remove().
>>>>>>
>>>>>> With rte_dev_remove(), the eth device is closed and the rte_pci_device is
>>>>>> also freed, so the app does not need to care about it.
>>>>>>
>>>>>> But dev_close() only closes the eth device, and nothing in the framework
>>>>>> layer call stack of dev_close() touches rte_pci_device. The rte_pci_device
>>>>>> is allocated and initialized when rte_pci_bus scans the
>>>>>> "/sys/bus/pci/devices" directory.
>>>>>>
>>>>>> Generally, the PMD of an eth device operates on eth devices, and rarely
>>>>>> on rte_pci_device.
>>>>> No. The PMD makes the link between the PCI device and the ethdev port.
>>>> It seems that the ethdev layer can create eth devices based on
>>>> rte_pci_device, but does not release rte_pci_device.
>>> No, the ethdev layer does not manage any bus.
>>> Only the PMD does that.
>> I don't mean that the ethdev layer manages the bus.
>>
>> I mean, it neither allocates rte_pci_device nor frees it.
>>
>>>>>> And the rte_pci_device corresponding to the eth devices is managed and
>>>>>> processed by rte_pci_bus.
>>>>>>
>>>>>> So the PMD is closed based only on the port ID of the eth device, which
>>>>>> only shuts down the eth device; it does not free the rte_pci_device
>>>>>> or remove it from rte_pci_bus.
>>>>> Not really.
>>>> I do not see any PMD driver releasing rte_pci_device in dev_close().
>>> Maybe not but we should.
>> I'm sure.
>>
>> As far as I know, the PMD does not free rte_pci_device for devices under
>> the PCI bus, whether ethdev or dmadev.
>>
>>>>> If there is no port using the PCI device, it should be released.
>>>> Yes.
>>>>>>>> As far as I know, most apps or examples in the DPDK project have only
>>>>>>>> one port for a pci device.
>>>>>>> The number of ports per PCI device is driver-specific.
>>>>>>>
>>>>>>>> When the port is closed, the rte_pci_device should be freed. But none of
>>>>>>>> the apps seem to do this.
>>>>>>> That's because from the app point of view, only ports should be managed.
>>>>>>> The hardware device is managed by the PMD.
>>>>>>> Only drivers (PMDs) have to make the link between class ports
>>>>>>> and hardware devices.
>>>>>> Yes. But the current app only closes the port to disable the PMD, and
>>>>>> the rte_pci_device cannot be freed.
>>>>> Why not?
>>>> Because most apps in DPDK call dev_close() to close the eth device
>>>> corresponding to a port.
>>> You don't say why the underlying PCI device could not be freed.
>> In the current implementation, neither rte_eth_dev_close() in the ethdev
>> layer nor dev_close() in the PMD frees it.
>>>>>> Because rte_pci_device cannot be released in dev_close() of the PMD, and
>>>>>> is managed by the framework layer.
>>>>> No
>>>>>
>>>>>> Btw. Excluding rte_dev_probe() and rte_dev_remove(), it seems that the
>>>>>> DPDK framework only scans PCI devices automatically, but does not
>>>>>> release them automatically when the process exits.
>>>>> Indeed, because such freeing is the responsibility of the PMD.
>>>> Do you mean to free rte_pci_device in the dev_close() API?
>>> I mean free the PCI device in the PMD implementation of dev_close.
>> I don't think it's reasonable.
> What is not reasonable, is to not free a device which is closed.
>
>> In the normal flow, the rte_pci_device is allocated in rte_eal_init(),
>> when the PCI bus scans "/sys/bus/pci/devices" via rte_bus_scan(), and is
>> inserted into rte_pci_bus.device_list.
>>
>> Then rte_bus_probe() is called in rte_eal_init() to match the
>> rte_pci_device against the rte_pci_driver registered under rte_pci_bus
>> and generate an eth device.
>>
>> From this point of view, the rte_pci_device should be managed and
>> released by the rte_pci_bus.
>>
>> Generally, teardown should happen in the reverse order: release the
>> eth device first and then release the rte_pci_device.
> Same for mbuf in mempool: allocation is done by the app,
> free is done by the PMD.
That doesn't seem to be the case. In the Rx direction, the mbuf is
allocated by the PMD, so it should be freed by the PMD.
> Not everything is symmetrical.
>
>> Therefore the rte_pci_device should not be freed in the PMD
>> implementation of dev_close.
>>
>>>> How should PMD free it? What should we do? Any good suggestions?
>>> Check that there is no other port sharing the same PCI device,
>>> then call the PMD callback for rte_pci_remove_t.
>> For primary and secondary processes, their rte_pci_device is independent.
> Yes, it needs to be freed in both primary and secondary.
>
>> Is this for a scenario where there are multiple representor ports under
>> the same PCI address in the same process?
> A PCI device can have multiple physical or representor ports.
Got it.
>
>>>> Would it be more appropriate to do this in rte_eal_cleanup() if it
>>>> can't be done in the API above?
>>> rte_eal_cleanup is a last cleanup for what was not done earlier.
>>> We could do that but first we should properly free devices when closed.
>>>
>> In general, it is appropriate for rte_eal_cleanup to be responsible for
>> releasing devices under the PCI bus.
> Yes, but if a device is closed while the rest of the app keeps running,
> we should not wait to free it.
From this point of view, that makes sense. However, OVS-DPDK calls
dev_close() first, then checks whether all ports under the PCI address are
closed, and only then frees the rte_pci_device by calling rte_dev_remove().

If we do not want the user to be aware of this, and we want the
rte_pci_device to be freed in a timely manner, can we add logic to
rte_eth_dev_close() that counts the ports under a PCI address and calls
rte_dev_remove() to free the rte_pci_device and delete it from rte_pci_bus?

If we do, some extra work may be needed; otherwise some applications, such
as OVS-DPDK, will fail because of a second call to rte_dev_remove().
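
To make the double-remove concern concrete, here is a minimal standalone C
model of the idea. It is only a sketch: every model_* type and function is a
hypothetical stand-in and not a DPDK API, and the assumption that a second
remove returns an error comes from the concern above rather than from the
actual behaviour of rte_dev_remove().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for rte_pci_device; NOT the DPDK struct. */
struct model_pci_dev {
    bool attached;   /* still present on the (modelled) bus device list */
    int  open_ports; /* ports backed by this device that are not closed */
};

/* Hypothetical stand-in for an ethdev port. */
struct model_port {
    struct model_pci_dev *dev;
    bool open;
};

/*
 * Models the remove step. Assumption: removing a device that is already
 * gone is reported as an error (-1), which is the "second call" case.
 */
static int model_dev_remove(struct model_pci_dev *dev)
{
    if (!dev->attached)
        return -1;
    dev->attached = false;
    return 0;
}

/*
 * Models the proposed close path: close the port, and if it was the last
 * open port on the device, remove the device as well.
 */
static int model_port_close(struct model_port *port)
{
    if (!port->open)
        return -1; /* port was already closed */
    port->open = false;

    assert(port->dev->open_ports > 0);
    port->dev->open_ports--;

    if (port->dev->open_ports == 0)
        return model_dev_remove(port->dev);
    return 0;
}

/*
 * One possible form of the "extra work": an application-side remove that
 * treats an already detached device as success instead of failing.
 */
static int model_app_remove_if_attached(struct model_pci_dev *dev)
{
    return dev->attached ? model_dev_remove(dev) : 0;
}
```

In this model, closing the last open port detaches the device. An
application that still issues its own remove afterwards hits the error path
in model_dev_remove(), while model_app_remove_if_attached() tolerates it.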