From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wr1-f65.google.com (mail-wr1-f65.google.com [209.85.221.65]) by dpdk.org (Postfix) with ESMTP id F1B541B117 for ; Wed, 24 Oct 2018 00:39:51 +0200 (CEST) Received: by mail-wr1-f65.google.com with SMTP id u1-v6so3420567wrn.0 for ; Tue, 23 Oct 2018 15:39:51 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=6wind-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:content-transfer-encoding:in-reply-to :user-agent; bh=N557qOKS9L9RMYi4dpp9lyTZxr6RkfVsc1vIT6YC/40=; b=Kmh7d38SZLRbBOaWZExu+JmB0F8OD2hWGjCfSYubCXoIxqiTJVZ2k339+iWKU1TAU2 eyb5cvrRTr+JvOxsbCTFRYsif4Q7oRq2PXsNv/FulFnOTsCWWmG/GhtvoIPyhJVZmlM2 6g6tDGCTMRIwmd+yy5LYL6hE7sBlzb6thF3hi2t2R6kEVxBWVEHfL3pn5M++41WKQpWr 2GCuGBFS+JAfpArBH1KhJelg3wm7TKNIPthp7OSWM1CJDNIAF37YFIzd6xB1/PG+mYQd 0/diFMyA4eZQkKcBrskIQeLE7kBt29FlYHFSSyKlz/zJD1aC3FnVjbHKc0Tg1Rbm8ucd Be/Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:content-transfer-encoding :in-reply-to:user-agent; bh=N557qOKS9L9RMYi4dpp9lyTZxr6RkfVsc1vIT6YC/40=; b=HdOpNZCXNSfkb+J5dqNZDII6Zpmv5l/YZ+6k93t/YD2UjvFQBvvBWn98ja10WzUM1d AqGeE8miimsLXecpevmzt9k7YjMJ+VBwnSOlmSanPn2LVfLp7MjmHLgoXcpfunXF4XhN hzW+1Wu9pc6Zakir1oaqUOvxsr+6FgL76YI/lYt58tVhTNu6Pu9NgBSw3FPruKADg7rE mqztf+Vb1fCkze+V7luYsuR8azRaMTNXycEzxhJxzDdNj88/6WpD/VCCBKemuWAa/S8b Y8IIoahpID5zpH4dc2e/gcmBpz8L1Gv2mwr3f3vGXT/iNeXuzLeUypjR8IktqsXSTBGf KZpQ== X-Gm-Message-State: AGRZ1gKIsboB+1hUdcOIPa3h9stSOSe8NJbFtr0gD2b4MeTbg7/GudFW uXIkdnbowT1j6hB90BEcVxeXFA== X-Google-Smtp-Source: AJdET5ey2wPhRAsNGgAzI2k0T1qJzelXdg9kXYfvcGV7LDi9I9zR8udN5H0sMNsgy0hdKq9jjBIOOw== X-Received: by 2002:adf:8444:: with SMTP id 62-v6mr127915wrf.251.1540334391080; Tue, 23 Oct 2018 15:39:51 -0700 (PDT) Received: from bidouze.vm.6wind.com (host.78.145.23.62.rev.coltfrance.com. [62.23.145.78]) by smtp.gmail.com with ESMTPSA id z4-v6sm3758316wmz.14.2018.10.23.15.39.49 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Tue, 23 Oct 2018 15:39:49 -0700 (PDT) Date: Wed, 24 Oct 2018 00:39:31 +0200 From: =?iso-8859-1?Q?Ga=EBtan?= Rivet To: Thomas Monjalon Cc: Qi Zhang , dev@dpdk.org Message-ID: <20181023223931.kmro2zfyp4c4wbqm@bidouze.vm.6wind.com> References: <20181022054932.39052-1-qi.z.zhang@intel.com> <1576298.HKmtsfqzoT@xps> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <1576298.HKmtsfqzoT@xps> User-Agent: NeoMutt/20170113 (1.7.2) Subject: Re: [dpdk-dev] [PATCH] eal: fix floating device argument pointer X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 23 Oct 2018 22:39:52 -0000 Hi, On Mon, Oct 22, 2018 at 09:25:22AM +0200, Thomas Monjalon wrote: > 22/10/2018 07:49, Qi Zhang: > > After we insert a devargs into devargs_list, following bus->scan may > > destroy it due to another rte_devargs_insert. Its better not to use > > a devargs pointer after it has been inserted. > A bus scan calls rte_devargs_insert? Mapping devargs to device is the responsibility of the bus scan, if it calls potentially destructive functions, it must rebuild the map. > I think the problem is in: > > rte_devargs_insert(struct rte_devargs *da) > { > int ret; > > ret = rte_devargs_remove(da); > if (ret < 0) > return ret; > TAILQ_INSERT_TAIL(&devargs_list, da, next); > return 0; > } > > We insert a structure which is freed! Not usually, I hope! > > See http://git.dpdk.org/dpdk/commit/?id=55744d83d525 > > Gaetan, what can be the fix? 1. rte_devargs_insert is misdefined. It is designed as a function that can never fail. The function should return void instead. 2. rte_devargs_remove(da), will not remove da itself. It will remove whichever rte_devargs matches da within the internal list. If da does not match any in the list, it does nothing. As da is a newly-callocated structure, it is actually safe to continue using it after having called rte_devargs_remove(), because it cannot possibly have been inserted in the meantime (so would not have been freed, even if another devargs matched it). The actual issue is that the matching rte_devargs within the list would be referenced by a device after a successful scan, meaning that this reference is not safe if someone attemps to insert the same device after the bus->scan(). If my understanding is correct, the above fix is not necessary, but probing should be guarded against re-entrancy. 3. To fix this bug, one should check that the device one attempts to hotplug does not already exists as a probed rte_device. An existing rte_devargs is not sufficient, because a blacklisted device would have an rte_devargs without having a probed rte_device, and the current behavior is to supersede the current blacklist and forcibly insert the new device, as if it was newly whitelisted. This check can only happen at rte_dev level. 4. Your confusion about rte_devargs_remove is understandable, the API is muddy. The reason for these quirks is because I wanted a user to be able to remove any devargs, even without having a direct reference to it: you only had to define the bus and the device id (name), and it would find it and remove it. It might be preferrable to force the user to find the rte_device, and from it, use the actual rte_devargs reference to remove it, but then, it would be impossible to remove devargs for non-existing devices (spoiler: that's the blacklisted ones). 5. It bears repeating: blacklisted mode is horrible and should be removed. It is all-around abominable, forces unsightly designs to exist and be used, makes managers ask questions about "why do you add this quirky `-w 00:00.0` parameter to your command line and what is your timeline for not needing it?", makes at least one team integrating OVS ask themselves "why not --no-pci? but then why can't I hotplug PCI ports?", and I would not be surprised if it killed puppies as a hobby. So far, I was able to collect "but it simplifies testing bot configuration" as a plus, which I do not agree with. And anyone trying to package DPDK on their platform, expecting users not to know or care about it, would be better off developping a proper autoconf tool, instead of baking it in the entrails of the EAL, which are ugly enough as it is. /rant Regards, -- Gaëtan Rivet 6WIND