From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id 9F5A9A04C1; Wed, 20 Nov 2019 14:03:57 +0100 (CET) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 8DD672956; Wed, 20 Nov 2019 14:03:56 +0100 (CET) Received: from us-smtp-delivery-1.mimecast.com (us-smtp-2.mimecast.com [205.139.110.61]) by dpdk.org (Postfix) with ESMTP id 48EC3235 for ; Wed, 20 Nov 2019 14:03:54 +0100 (CET) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1574255033; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Zx7f+C4/iHFnWEmeVlnnAdzcHJqBjthgXDhL9QgeJTE=; b=e3mTGXjvrr+vgCeDaY4S6/BcpVUMdF+xZLkwogI5sPp2UfFhXTQMe09LVYz7oj15NptsY1 Ywx2efuNqPSLeJ1aHjEgqi3IUjY4jIuPbPfTMXIsKL/0TAUkwPS3LqL/YIgKUvoJdl6vf5 H05piL0pHaK/VSy8YXhLjFJ02CPDvqg= Received: from mail-vk1-f200.google.com (mail-vk1-f200.google.com [209.85.221.200]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-266-mldr5MPtM7a9Gko_dNQZMw-1; Wed, 20 Nov 2019 08:03:51 -0500 Received: by mail-vk1-f200.google.com with SMTP id s17so11344813vkb.5 for ; Wed, 20 Nov 2019 05:03:51 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=j6xZ8jaOj9JZnXH9H2LWP8J11l1kRYwK2aycIfuplX4=; b=RefqVcAOr64RyB9y4reQMTbDpdaMtyJ8xSd0dVCkekhQjKAmjwGi5KLf8L2an6gsZU pbh5A0lVDp66lXYB3hTrK8UUbEk1twIKnjqv6FQ941G26v9w+6ldb8zSoj5GVmmFvgUl WC8fm9kv/9xvYycXof6xVZm7xqm2TOjKzOxYWZVFgeWnFJ1PyI/gHQ8SMo6ClXFDQNbe 8/z6rLGwBsXIdVBSz/yuHArOvjsy5EtRgf6sD83oABJH75d+yRuAjYqvSuTiO3RFQSvL 1t9x6zUNes0Gw7BG/Pg4kQLbf8stDpggKRG6HewrxBcqDcB6M4I8f4iAU+WL6VieWA2e AXhw== X-Gm-Message-State: APjAAAXoGjaamIbrAMNqcyUECY6i6CdLrGDfVJ+0IynbkMJE8Y+RGJER 6najZQmGuWdPSti84CFKPQ6c2YKaWB8+8dcqs9kClBjWs1gGasC0ZhMiHDos/h90FLIzqDdoEL1 1nAFwiRGSB+T0yKl4Dvk= X-Received: by 2002:a67:bd05:: with SMTP id y5mr1531638vsq.180.1574255031331; Wed, 20 Nov 2019 05:03:51 -0800 (PST) X-Google-Smtp-Source: APXvYqyLr+dzwD/MY2MrhBEk3ovOf08buKhoHxYzkPgZFUorIcXdqMxbCJlj3/HkqEklB0lMjZw+e9+fs8YIMRIiSgk= X-Received: by 2002:a67:bd05:: with SMTP id y5mr1531608vsq.180.1574255030873; Wed, 20 Nov 2019 05:03:50 -0800 (PST) MIME-Version: 1.0 References: <1573548459-6931-1-git-send-email-matan@mellanox.com> <1574243271-27734-1-git-send-email-matan@mellanox.com> In-Reply-To: <1574243271-27734-1-git-send-email-matan@mellanox.com> From: David Marchand Date: Wed, 20 Nov 2019 14:03:39 +0100 Message-ID: To: Matan Azrad , Thomas Monjalon Cc: dev , mukawa@igel.co.jp, dpdk stable X-MC-Unique: mldr5MPtM7a9Gko_dNQZMw-1 X-Mimecast-Spam-Score: 0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Subject: Re: [dpdk-dev] [dpdk-stable] [PATCH v2] bus/pci: fix driver detach clear X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" On Wed, Nov 20, 2019 at 10:48 AM Matan Azrad wrote: > > When a rte_device is unplugged, the driver should be detached from the > device. > > The PCI detach driver operation wrongly didn't clear the driver from the > device structure what remain the device in probe state from the EAL > point of view. > > For example, when a device is removed twice using rte_dev_remove, it > cause a crash in EAL. I can see a crash when using port detach in testpmd with a virtio pci devic= e. testpmd> port attach 0000:07:00.0 Attaching a new port... EAL: PCI device 0000:07:00.0 on NUMA socket -1 EAL: Invalid NUMA socket, default to 0 EAL: probe driver: 1af4:1041 net_virtio Port 1 is attached. Now total ports is 2 Done testpmd> port close 1 Closing ports... EAL: Releasing pci mapped resource for 0000:07:00.0 EAL: Calling pci_unmap_resource for 0000:07:00.0 at 0x2200006000 Done testpmd> port detach 1 Removing a device... Breakpoint 1, local_dev_remove (dev=3D0x1de64b0) at /root/dpdk/lib/librte_eal/common/eal_common_dev.c:315 315 if (dev->bus->unplug =3D=3D NULL) { Missing separate debuginfos, use: debuginfo-install glibc-2.17-292.el7.x86_64 libgcc-4.8.5-39.el7.x86_64 libpcap-1.5.3-11.el7.x86_64 numactl-libs-2.0.12-3.el7.x86_64 (gdb) p *dev $1 =3D {next =3D {tqe_next =3D 0x0, tqe_prev =3D 0x0}, name =3D 0x1cf8078 "0000:07:00.0", driver =3D 0x16c68f0 , bus =3D 0x16b2640 , numa_node =3D 0, devargs =3D 0x1cf8060} (gdb) c Continuing. Device of port 1 is detached Now total ports is 1 Done On the first detach, the pci bus frees the rte_pci_device which embeds the rte_device object. static int pci_unplug(struct rte_device *dev) { struct rte_pci_device *pdev; int ret; pdev =3D RTE_DEV_TO_PCI(dev); ret =3D rte_pci_detach_dev(pdev); if (ret =3D=3D 0) { rte_pci_remove_device(pdev); rte_devargs_remove(dev->devargs); free(pdev); } return ret; } testpmd> port detach 1 Removing a device... Breakpoint 1, local_dev_remove (dev=3D0x1de64b0) at /root/dpdk/lib/librte_eal/common/eal_common_dev.c:315 315 if (dev->bus->unplug =3D=3D NULL) { (gdb) p *dev $2 =3D {next =3D {tqe_next =3D 0x0, tqe_prev =3D 0x0}, name =3D 0xa , driver =3D 0x0, bus =3D 0x4637, numa_node =3D 1, devargs = =3D 0x40000002e040018} (gdb) c Continuing. Program received signal SIGSEGV, Segmentation fault. 0x00000000007c1ddd in local_dev_remove (dev=3D0x1de64b0) at /root/dpdk/lib/librte_eal/common/eal_common_dev.c:315 315 if (dev->bus->unplug =3D=3D NULL) { On the second detach, testpmd passes the same rte_device pointer it extracts from rte_eth_devices, but the malloc'd location has been reused (with watchpoint on the location, I found somewhere around rte_mp_request_sync/opendir()), and then *crunch* on dev->bus. >From my pov: - testpmd is wrongly reusing a pointer coming from rte_eth_devices[], without caring about the port state (this is what your second patch fixes), - testpmd is directly kicking pointers in rte_eth_devices[] (setting ->device =3D NULL for its own logic), which is bad too, - this patch just hides the reuse of a freed pointer, --=20 David Marchand