DPDK patches and discussions
 help / color / mirror / Atom feed
From: Ferruh Yigit <ferruh.yigit@intel.com>
To: Huisong Li <lihuisong@huawei.com>, Thomas Monjalon <thomas@monjalon.net>
Cc: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>,
	"dev@dpdk.org" <dev@dpdk.org>,
	Anatoly Burakov <anatoly.burakov@intel.com>,
	David Marchand <david.marchand@redhat.com>
Subject: Re: [dpdk-dev] [RFC V2] ethdev: fix issue that dev close in PMD calls twice
Date: Mon, 20 Sep 2021 15:07:46 +0100
Message-ID: <f20be8f2-fd91-147f-efd5-468805317363@intel.com> (raw)
In-Reply-To: <cdd011aa-f6d6-6dd5-a1db-f1b57f3e06f9@huawei.com>

On 8/25/2021 10:53 AM, Huisong Li wrote:
> 
> 在 2021/8/24 22:42, Ferruh Yigit 写道:
>> On 8/19/2021 4:45 AM, Huisong Li wrote:
>>> 在 2021/8/18 19:24, Ferruh Yigit 写道:
>>>> On 8/13/2021 9:16 AM, Huisong Li wrote:
>>>>> 在 2021/8/13 14:12, Thomas Monjalon 写道:
>>>>>> 13/08/2021 04:11, Huisong Li:
>>>>>>> Hi, all
>>>>>>>
>>>>>>> This patch can enhance the security of device uninstallation to
>>>>>>> eliminate dependency on user usage methods.
>>>>>>>
>>>>>>> Can you check this patch?
>>>>>>>
>>>>>>>
>>>>>>> 在 2021/8/3 10:30, Huisong Li 写道:
>>>>>>>> Ethernet devices in DPDK can be released by rte_eth_dev_close() and
>>>>>>>> rte_dev_remove(). These APIs both call xxx_dev_close() in PMD layer
>>>>>>>> to uninstall hardware. However, the two APIs do not have explicit
>>>>>>>> invocation restrictions. In other words, at the ethdev layer, it is
>>>>>>>> possible to call rte_eth_dev_close() before calling rte_dev_remove()
>>>>>>>> or rte_eal_hotplug_remove(). In such a bad scenario,
>>>>>> It is not a bad scenario.
>>>>>> If there is no more port for the device after calling close,
>>>>>> the device should be removed automatically.
>>>>>> Keep in mind "close" is for one port, "remove" is for the entire device
>>>>>> which can have more than one port.
>>>>> I know.
>>>>>
>>>>> dev_close() is for removing an eth device. And rte_dev_remove() can be used
>>>>>
>>>>> for removing the rte device and all its eth devices belonging to the rte
>>>>> device.
>>>>>
>>>>> In rte_dev_remove(), "remove" is executed in primary or one of secondary,
>>>>>
>>>>> all eth devices having same pci address will be closed and removed.
>>>>>
>>>>>>>> the primary
>>>>>>>> process may be fine, but it may cause that xxx_dev_close() in the PMD
>>>>>>>> layer will be called twice in the secondary process. So this patch
>>>>>>>> fixes it.
>>>>>> If a port is closed in primary, it should be the same in secondary.
>>>>>>
>>>>>>
>>>>>>>> +    /*
>>>>>>>> +     * The eth_dev->data->name doesn't be cleared by the secondary
>>>>>>>> process,
>>>>>>>> +     * so above "eth_dev" isn't NULL after rte_eth_dev_close() called.
>>>>>> This assumption is not clear. All should be closed together.
>>>>> However, dev_close() does not have the feature similar to rte_dev_remove().
>>>>>
>>>>> Namely, it is not guaranteed that all eth devices are closed together in
>>>>> ethdev
>>>>> layer. It depends on app or user.
>>>>>
>>>>> If the app does not close together, the operation of repeatedly
>>>>> uninstalling an
>>>>> eth device in the secondary process
>>>>>
>>>>> will be triggered when dev_close() is first called by one secondary
>>>>> process, and
>>>>> then rte_dev_remove() is called.
>>>>>
>>>>> So I think it should be avoided.
>>>> First of all, I am not sure about calling 'rte_eth_dev_close()' or
>>>> 'rte_dev_remove()' from the secondary process.
>>>> There are explicit checks in various locations to prevent clearing resources
>>>> completely from secondary process.
>>> There's no denying that.
>>>
>>> Generally, hardware resources of eth device and shared data of the primary and
>>> secondary process
>>>
>>> are cleared by primary, which are controled by ethdev layer or PMD layer.
>>>
>>> But there may be some private data or resources of each process (primary or
>>> secondary ), such as mp action
>>>
>>> registered by rte_mp_action_register() or others.  For these resources, the
>>> secondary process still needs to clear.
>>>
>>> Namely, both primary and secondary processes need to prevent repeated offloading
>>> of resources.
>>>
>>>> Calling 'rte_eth_dev_close()' or 'rte_dev_remove()' by secondary is technically
>>>> can be done but application needs to be extra cautious and should take extra
>>>> measures and synchronization to make it work.
>>>> Regular use-case is secondary processes do the packet processing and all
>>>> control
>>>> commands run by primary.
>>> You are right. We have a consensus that 'rte_eth_dev_close()' or
>>> 'rte_dev_remove()'
>>>
>>> can be called by primary and secondary processes.
>>>
>>> But DPDK framework cannot assume user behavior.😁
>>>
>>> We need to make it more secure and reliable for both primary and secondary
>>> processes.
>>>
>>>> In primary, if you call 'rte_eth_dev_close()' it will clear all ethdev
>>>> resources
>>>> and further 'rte_dev_remove()' call will detect missing ethdev resources and
>>>> won't try to clear them again.
>>>>
>>>> In secondary, if you call 'rte_eth_dev_close()', it WON'T clear all resources
>>>> and further 'rte_dev_remove()' call (either from primary or secondary) will try
>>>> to clean ethdev resources again. You are trying to prevent this retry in remove
>>>> happening for secondary process.
>>> Right. However, if secondary process in PMD layer has its own private resources
>>> to be
>>>
>>> cleared, it still need to do it by calling 'rte_eth_dev_close()' or
>>> 'rte_dev_remove()'.
>>>
>>>> In secondary it won't free ethdev resources anyway if you let it continue,
>>>> but I
>>>> guess here you are trying to prevent the PMD dev_close() called again. Why? Is
>>>> it just for optimization or does it cause unexpected behavior in the PMD?
>>>>
>>>>
>>>> Overall, to free resources you need to do the 'rte_eth_dev_close()' or
>>>> 'rte_dev_remove()' in the primary anyway. So instead of this workaround, I
>>>> would
>>>> suggest making PMD dev_close() safe to be called multiple times (if this is the
>>>> problem.)
>>> In conclusion,  primary and secondary processes in PMD layer may have their own
>>>
>>> private data and resources, which need to be processed and released.
>>>
>>> Currently,  these for PMD are either handled and cleaned up in dev_close() or
>>> remove().
>>>
>>> However, code streams in rte_dev_remove() cannot ensure that the uninstallation
>>>
>>> from secondary process will not be repeated if rte_eth_dev_close() is first
>>> called by
>>>
>>> secondary(primary is ok, plese review this patch).
>>>
>>> I think, this is the same for each PMD and is better suited to doing it in
>>> ethdev layer.
>>>
>> This patch prevents to call dev_close() twice in the secondary process, is this
>> fixing a theoretical problem or an actual problem?
>>
>> If it is an actual problem can you please provide details, callstack of the
>> problematic case?
> 
> We analyzed the code when modifying the bug and found that the problem did exist.
> 
> The ethdev layer did not guarantee the security.
> 

I was wondering if there is a crash for an unexpected path, for below case the
primary process check in the 'hns3_dev_uninit()' should already prevent anything
unexpected. So I assume this is a fix for a theoretical issue.

In secondary process, these init/uninit device is already not very safe. For
example, as far as I can see in secondary if you hot remove a device device and
hot plug a new one, new device will use wrong device data (since hot remove
won't clear device data, new one will continue to use it).

So I am not sure about adding secondary process related checks in that area and
causing a false sense of security, and polluting the logic with secondary
specific checks.
Also the check you add may hit by primary process and I am worried on an
unexpected side affect it cause.


As said above I am not sure about this new check, but even we continue with it,
what about wrapping the check with secondary process check, at least to be sure
there won't be any side affect for primary process.


> The general function of the two interfaces is as follows:
> 
> rte_eth_dev_close() --> release eth device
> 
> rte_dev_remove() -->   release eth device + remove and free
> rte_pci_device(primary andsecondary).
> 
> According to the OVS application scenario, first call dev_close() and then call
> 
> remove(), which is possible.
> 
> We constructed this scenario using testpmd to start the secondary process(the
> multi-process
> 
> patch of testpmd is being uploaded.). It is proved that the ethdev layer cannot
> guarantee
> 
> this security. The callstack is as follows:
> 
> **************************************************************************************************************************************************
> 
> 
> gdb) set args -a 0000:7d:00.0 -l 2-3 --file-prefix=rte_lee --proc-type=auto --
> -i --rxq 4 --txq 4 --num-procs=2 --proc-id=1
> (gdb) b hns3_dev_uninit
> Breakpoint 1 at 0xbca7d0: file ../drivers/net/hns3/hns3_ethdev.c, line 7806.
> (gdb) b hns3_dev_close
> Breakpoint 2 at 0xbc6d44: file ../drivers/net/hns3/hns3_ethdev.c, line 6189.
> (gdb) r
> Starting program: /home/lihuisong/v500/gcov-test/dpdk/build/app/dpdk-testpmd -a
> 0000:7d:00.0 -l 2-3 --file-prefix=rte_lee --proc-type=auto -- -i --rxq 4 --txq 4
> --num-procs=2 --proc-id=1
> Missing separate debuginfo for /root/lib/libnuma.so.1
> Try: yum --enablerepo='*debug*' install
> /usr/lib/debug/.build-id/ce/4eea0b0f2150f70a080bbd8835e43e78373096.debug
> [Thread debugging using libthread_db enabled]
> Using host libthread_db library "/lib64/libthread_db.so.1".
> EAL: Detected 128 lcore(s)
> EAL: Detected 4 NUMA nodes
> EAL: Auto-detected process type: SECONDARY
> EAL: Detected static linkage of DPDK
> [New Thread 0xfffff7a0ad10 (LWP 17717)]
> EAL: Multi-process socket /var/run/dpdk/rte_lee/mp_socket_17714_66aa8d3a60a9
> [New Thread 0xfffff7209d10 (LWP 17718)]
> EAL: Selected IOVA mode 'VA'
> EAL: VFIO support initialized
> [New Thread 0xfffff69f8d10 (LWP 17719)]
> EAL: Using IOMMU type 1 (Type 1)
> EAL: Probe PCI driver: net_hns3 (19e5:a222) device: 0000:7d:00.0 (socket 0)
> [New Thread 0xfffff61f7d10 (LWP 17720)]
> TELEMETRY: No legacy callbacks, legacy socket not created
> Interactive-mode selected
> 0000:7d:00.0 hns3_dev_mtu_set(): Failed to set mtu, port 0 must be stopped
> before configuration
> Failed to set MTU to 1500 for port 0
> testpmd: create a new mbuf pool <mb_pool_0>: n=155456, size=2176, socket=0
> testpmd: preferred mempool ops selected: ring_mp_mc
> 
> Warning! port-topology=paired and odd forward ports number, the last port will
> pair with itself.
> 
> Configuring Port 0 (socket 0)
> Port 0: 00:18:2D:00:00:9E
> Checking link statuses...
> Done
> testpmd> port close 0
> Closing ports...
> 
> Breakpoint 2, hns3_dev_close (eth_dev=0x21cdb80 <rte_eth_devices>)
>     at ../drivers/net/hns3/hns3_ethdev.c:6189
> 6189        struct hns3_adapter *hns = eth_dev->data->dev_private;
> Missing separate debuginfos, use: debuginfo-install glibc-2.17-260.el7.aarch64
> libpcap-1.5.3-11.el7.aarch64 openssl-libs-1.0.2k-16.el7.aarch64
> zlib-1.2.7-18.el7.aarch64
> (gdb) bt
> #0  hns3_dev_close (eth_dev=0x21cdb80 <rte_eth_devices>) at
> ../drivers/net/hns3/hns3_ethdev.c:6189
> #1  0x0000000000742eac in rte_eth_dev_close (port_id=0) at
> ../lib/ethdev/rte_ethdev.c:1870
> #2  0x0000000000542a8c in close_port (pid=0) at ../app/test-pmd/testpmd.c:2895
> #3  0x00000000004df82c in cmd_operate_specific_port_parsed
> (parsed_result=0xffffffffb0f0,
>     cl=0x2475080, data=0x0) at ../app/test-pmd/cmdline.c:1272
> #4  0x0000000000738ce4 in cmdline_parse (cl=0x2475080, buf=0x24750c8 "port close
> 0\n")
>     at ../lib/cmdline/cmdline_parse.c:290
> #5  0x0000000000736b7c in cmdline_valid_buffer (rdl=0x2475090, buf=0x24750c8
> "port close 0\n",
>     size=14) at ../lib/cmdline/cmdline.c:26
> #6  0x000000000073c078 in rdline_char_in (rdl=0x2475090, c=10 '\n')
>     at ../lib/cmdline/cmdline_rdline.c:421
> #7  0x0000000000736fcc in cmdline_in (cl=0x2475080, buf=0xfffffffff25f
> "\n\200\362\377\377\377\377",
>     size=1) at ../lib/cmdline/cmdline.c:149
> #8  0x0000000000737270 in cmdline_interact (cl=0x2475080) at
> ../lib/cmdline/cmdline.c:223
> #9  0x00000000004f0ccc in prompt () at ../app/test-pmd/cmdline.c:17882
> #10 0x0000000000545528 in main (argc=8, argv=0xfffffffff470) at
> ../app/test-pmd/testpmd.c:3998
> (gdb) c
> Continuing.
> Port 0 is closed
> Done
> testpmd> device detach 0000:7d:00.0
> Removing a device...
> [Switching to Thread 0xfffff7a0ad10 (LWP 17717)]
> 
> Breakpoint 1, hns3_dev_uninit (eth_dev=0x21cdb80 <rte_eth_devices>)
>     at ../drivers/net/hns3/hns3_ethdev.c:7806
> 7806        struct hns3_adapter *hns = eth_dev->data->dev_private;
> (gdb) bt
> #0  hns3_dev_uninit (eth_dev=0x21cdb80 <rte_eth_devices>) at
> ../drivers/net/hns3/hns3_ethdev.c:7806
> #1  0x0000000000bb8668 in rte_eth_dev_pci_generic_remove (pci_dev=0x247a600,
>     dev_uninit=0xbca7c4 <hns3_dev_uninit>) at ../lib/ethdev/ethdev_pci.h:155
> #2  0x0000000000bca89c in eth_hns3_pci_remove (pci_dev=0x247a600)
>     at ../drivers/net/hns3/hns3_ethdev.c:7833
> #3  0x00000000007e85f4 in rte_pci_detach_dev (dev=0x247a600) at
> ../drivers/bus/pci/pci_common.c:287
> #4  0x00000000007e8f14 in pci_unplug (dev=0x247a610) at
> ../drivers/bus/pci/pci_common.c:570
> #5  0x0000000000775678 in local_dev_remove (dev=0x247a610) at
> ../lib/eal/common/eal_common_dev.c:319
> #6  0x000000000078f114 in __handle_primary_request (param=0xfffff00008c0)
>     at ../lib/eal/common/hotplug_mp.c:284
> #7  0x000000000079ebf4 in eal_alarm_callback (arg=0x0) at
> ../lib/eal/linux/eal_alarm.c:92
> #8  0x00000000007a3f30 in eal_intr_process_interrupts (events=0xfffff7a0a3e0,
> nfds=1)
>     at ../lib/eal/linux/eal_interrupts.c:998
> #9  0x00000000007a4224 in eal_intr_handle_interrupts (pfd=10, totalfds=2)
>     at ../lib/eal/linux/eal_interrupts.c:1071
> #10 0x00000000007a442c in eal_intr_thread_main (arg=0x0) at
> ../lib/eal/linux/eal_interrupts.c:1142
> #11 0x0000000000789388 in ctrl_thread_init (arg=0x246ef40)
>     at ../lib/eal/common/eal_common_thread.c:202
> #12 0x0000fffff7bf3c48 in start_thread () from /lib64/libpthread.so.0
> #13 0x0000fffff7b45600 in thread_start () from /lib64/libc.so.6
> 
> **************************************************************************************************************************************************
> 
> 
> Note: above log is from the secondary process.
> 
>>>> And again, please re-consider calling 'rte_eth_dev_close()' or
>>>> 'rte_dev_remove()' from the secondary process.
>>>>
>>>>>>>> +     * Namely, whether "eth_dev" is NULL cannot be used to determine
>>>>>>>> whether
>>>>>>>> +     * an ethdev port has been released.
>>>>>>>> +     * For both primary process and secondary process, eth_dev->state is
>>>>>>>> +     * RTE_ETH_DEV_UNUSED, which means the ethdev port has been released.
>>>>>>>> +     */
>>>>>>>> +    if (eth_dev->state == RTE_ETH_DEV_UNUSED) {
>>>>>>>> +        RTE_ETHDEV_LOG(INFO, "The ethdev port has been released.");
>>>>>>>> +        return 0;
>>>>>>>> +    }
>>>>>> .
>>>> .
>> .


  parent reply	other threads:[~2021-09-20 14:08 UTC|newest]

Thread overview: 33+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-08-02 12:46 [dpdk-dev] [RFC V1] ethdev: fix the issue that dev uninit may be called twice Huisong Li
2021-08-03  2:30 ` [dpdk-dev] [RFC V2] ethdev: fix issue that dev close in PMD calls twice Huisong Li
2021-08-13  2:11   ` Huisong Li
2021-08-13  6:12     ` Thomas Monjalon
2021-08-13  8:16       ` Huisong Li
2021-08-18 11:24         ` Ferruh Yigit
2021-08-19  3:45           ` Huisong Li
2021-08-24 14:42             ` Ferruh Yigit
2021-08-25  9:53               ` Huisong Li
2021-09-04  1:23                 ` Huisong Li
2021-09-18  3:31                 ` Huisong Li
2021-09-20 14:07                 ` Ferruh Yigit [this message]
2021-09-22  3:31                   ` Huisong Li
2021-09-28  7:19                     ` Singh, Aman Deep
2021-09-30 10:54                       ` Huisong Li
2021-09-30 11:01                         ` Ferruh Yigit
2021-10-08  6:13                           ` lihuisong (C)
2021-08-18  9:47 ` [dpdk-dev] [RFC V1] ethdev: fix the issue that dev uninit may be called twice Singh, Aman Deep
2021-08-24  2:10   ` Huisong Li
2021-10-08  8:21 ` [dpdk-dev] [PATCH] ethdev: fix eth device released repeatedly Huisong Li
2021-10-08 10:23   ` Thomas Monjalon
2021-10-09  1:29     ` lihuisong (C)
2021-10-12 11:39 ` [dpdk-dev] [PATCH V2] " Huisong Li
2021-10-12 15:33   ` Thomas Monjalon
2021-10-14  3:50     ` lihuisong (C)
2021-10-14 12:32     ` lihuisong (C)
2021-10-14 12:50       ` Thomas Monjalon
2021-10-15  3:03         ` lihuisong (C)
2021-10-15  3:44 ` [dpdk-dev] [PATCH V3] " Huisong Li
2021-10-19 13:09   ` Ferruh Yigit
2021-10-21  2:31     ` lihuisong (C)
2021-10-21  2:24 ` [dpdk-dev] [PATCH V4] " Huisong Li
2021-10-21 21:19   ` Ferruh Yigit

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=f20be8f2-fd91-147f-efd5-468805317363@intel.com \
    --to=ferruh.yigit@intel.com \
    --cc=anatoly.burakov@intel.com \
    --cc=andrew.rybchenko@oktetlabs.ru \
    --cc=david.marchand@redhat.com \
    --cc=dev@dpdk.org \
    --cc=lihuisong@huawei.com \
    --cc=thomas@monjalon.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

DPDK patches and discussions

This inbox may be cloned and mirrored by anyone:

	git clone --mirror http://inbox.dpdk.org/dev/0 dev/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 dev dev/ http://inbox.dpdk.org/dev \
		dev@dpdk.org
	public-inbox-index dev

Example config snippet for mirrors.
Newsgroup available over NNTP:
	nntp://inbox.dpdk.org/inbox.dpdk.dev


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git