From: David Marchand
Date: Wed, 20 Nov 2019 14:03:39 +0100
To: Matan Azrad, Thomas Monjalon
Cc: dev, mukawa@igel.co.jp, dpdk stable
Subject: Re: [dpdk-stable] [PATCH v2] bus/pci: fix driver detach clear
References: <1573548459-6931-1-git-send-email-matan@mellanox.com> <1574243271-27734-1-git-send-email-matan@mellanox.com>
In-Reply-To: <1574243271-27734-1-git-send-email-matan@mellanox.com>

On Wed, Nov 20, 2019 at 10:48 AM Matan Azrad wrote:
>
> When a rte_device is unplugged, the driver should be detached from the
> device.
>
> The PCI detach driver operation wrongly didn't clear the driver from the
> device structure what remain the device in probe state from the EAL
> point of view.
>
> For example, when a device is removed twice using rte_dev_remove, it
> cause a crash in EAL.

I can see a crash when using port detach in testpmd with a virtio pci device.

testpmd> port attach 0000:07:00.0
Attaching a new port...
EAL: PCI device 0000:07:00.0 on NUMA socket -1
EAL: Invalid NUMA socket, default to 0
EAL: probe driver: 1af4:1041 net_virtio
Port 1 is attached.
Now total ports is 2
Done

testpmd> port close 1
Closing ports...
EAL: Releasing pci mapped resource for 0000:07:00.0
EAL: Calling pci_unmap_resource for 0000:07:00.0 at 0x2200006000
Done

testpmd> port detach 1
Removing a device...

Breakpoint 1, local_dev_remove (dev=0x1de64b0) at
/root/dpdk/lib/librte_eal/common/eal_common_dev.c:315
315 if (dev->bus->unplug == NULL) {
Missing separate debuginfos, use: debuginfo-install
glibc-2.17-292.el7.x86_64 libgcc-4.8.5-39.el7.x86_64
libpcap-1.5.3-11.el7.x86_64 numactl-libs-2.0.12-3.el7.x86_64
(gdb) p *dev
$1 = {next = {tqe_next = 0x0, tqe_prev = 0x0}, name = 0x1cf8078
"0000:07:00.0", driver = 0x16c68f0 , bus = 0x16b2640 ,
numa_node = 0, devargs = 0x1cf8060}
(gdb) c
Continuing.
Device of port 1 is detached
Now total ports is 1
Done

On the first detach, the pci bus frees the rte_pci_device which embeds
the rte_device object.

static int
pci_unplug(struct rte_device *dev)
{
        struct rte_pci_device *pdev;
        int ret;

        pdev = RTE_DEV_TO_PCI(dev);
        ret = rte_pci_detach_dev(pdev);
        if (ret == 0) {
                rte_pci_remove_device(pdev);
                rte_devargs_remove(dev->devargs);
                free(pdev);
        }
        return ret;
}

testpmd> port detach 1
Removing a device...

Breakpoint 1, local_dev_remove (dev=0x1de64b0) at
/root/dpdk/lib/librte_eal/common/eal_common_dev.c:315
315 if (dev->bus->unplug == NULL) {
(gdb) p *dev
$2 = {next = {tqe_next = 0x0, tqe_prev = 0x0}, name = 0xa ,
driver = 0x0, bus = 0x4637, numa_node = 1, devargs = 0x40000002e040018}
(gdb) c
Continuing.

Program received signal SIGSEGV, Segmentation fault.
0x00000000007c1ddd in local_dev_remove (dev=0x1de64b0) at
/root/dpdk/lib/librte_eal/common/eal_common_dev.c:315
315 if (dev->bus->unplug == NULL) {

On the second detach, testpmd passes the same rte_device pointer it
extracts from rte_eth_devices, but the malloc'd location has been
reused (with a watchpoint on the location, I found the reuse somewhere
around rte_mp_request_sync/opendir()), and then *crunch* on dev->bus.

From my pov:
- testpmd is wrongly reusing a pointer coming from rte_eth_devices[],
  without caring about the port state (this is what your second patch
  fixes),
- testpmd is directly kicking pointers in rte_eth_devices[] (setting
  ->device = NULL for its own logic), which is bad too,
- this patch just hides the reuse of a freed pointer.

-- 
David Marchand
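
To make the ownership problem above concrete, here is a minimal
standalone sketch. It is not DPDK code: every demo_* name is
hypothetical, the structs only mimic how rte_pci_device embeds
rte_device, and the fixed-size port table is an assumption standing in
for rte_eth_devices[]. It shows that freeing the bus-specific object
also frees the embedded generic device, so the caller has to check the
port state before touching the device pointer again.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for rte_device / rte_pci_device (not DPDK types). */
struct demo_device {
	const char *name;
};

struct demo_pci_device {
	int vendor_id;
	struct demo_device device; /* generic device embedded in the PCI one */
};

/* Recover the enclosing PCI device from a pointer to its embedded device. */
#define DEMO_DEV_TO_PCI(ptr) \
	((struct demo_pci_device *)((char *)(ptr) - \
		offsetof(struct demo_pci_device, device)))

enum demo_port_state { DEMO_PORT_UNUSED = 0, DEMO_PORT_ATTACHED };

struct demo_port {
	enum demo_port_state state;
	struct demo_device *device; /* only valid while state is ATTACHED */
};

#define DEMO_MAX_PORTS 4
static struct demo_port demo_ports[DEMO_MAX_PORTS];

/* Allocate a PCI device and publish its embedded device in the port table. */
int demo_port_attach(unsigned int id, const char *name)
{
	struct demo_pci_device *pdev;

	if (id >= DEMO_MAX_PORTS || demo_ports[id].state != DEMO_PORT_UNUSED)
		return -1;
	pdev = calloc(1, sizeof(*pdev));
	if (pdev == NULL)
		return -1;
	pdev->device.name = name;
	demo_ports[id].device = &pdev->device;
	demo_ports[id].state = DEMO_PORT_ATTACHED;
	return 0;
}

/* Detach checks the port state first, so a stale pointer is never used. */
int demo_port_detach(unsigned int id)
{
	struct demo_port *port;

	if (id >= DEMO_MAX_PORTS)
		return -1;
	port = &demo_ports[id];
	if (port->state != DEMO_PORT_ATTACHED)
		return -1; /* already detached: refuse instead of dereferencing */
	assert(port->device != NULL);
	/* Freeing the PCI device also releases the embedded generic device. */
	free(DEMO_DEV_TO_PCI(port->device));
	port->device = NULL;
	port->state = DEMO_PORT_UNUSED;
	return 0;
}
```

With this sketch, attaching port 1 and detaching it twice returns 0 for
the first detach and -1 for the second, instead of dereferencing memory
that the allocator may already have handed to someone else.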