From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail04.ics.ntt-tx.co.jp (mail05.ics.ntt-tx.co.jp [210.232.35.69]) by dpdk.org (Postfix) with ESMTP id 4583C1B47C for ; Mon, 17 Dec 2018 12:04:42 +0100 (CET) Received: from gwchk03.silk.ntt-tx.co.jp (gwchk03.silk.ntt-tx.co.jp [10.107.0.111]) by mail04.ics.ntt-tx.co.jp (unknown) with ESMTP id wBHB4fO9007246; Mon, 17 Dec 2018 20:04:41 +0900 Received: (from root@localhost) by gwchk03.silk.ntt-tx.co.jp (unknown) id wBHB4f4o025362; Mon, 17 Dec 2018 20:04:41 +0900 Received: from gwchk.silk.ntt-tx.co.jp [10.107.0.110] by gwchk03.silk.ntt-tx.co.jp with ESMTP id WAA25358; Mon, 17 Dec 2018 20:04:41 +0900 Received: from imss06.silk.ntt-tx.co.jp (localhost [127.0.0.1]) by ccmail04.silk.ntt-tx.co.jp (unknown) with ESMTP id wBHB4f0Z008388; Mon, 17 Dec 2018 20:04:41 +0900 Received: from imss06.silk.ntt-tx.co.jp (localhost [127.0.0.1]) by imss06.silk.ntt-tx.co.jp (unknown) with ESMTP id wBHB4fms005949; Mon, 17 Dec 2018 20:04:41 +0900 Received: from ccmail04 (smtp03.silk.ntt-tx.co.jp [10.107.0.135]) by imss06.silk.ntt-tx.co.jp (unknown) with SMTP id wBHB4fHO005945; Mon, 17 Dec 2018 20:04:41 +0900 Date: Mon, 17 Dec 2018 20:03:11 +0900 From: Hideyuki Yamashita In-Reply-To: References: <0eb59f12-0192-a234-a773-7f3061abea3a@intel.com> MIME-Version: 1.0 Content-Type: text/plain; charset="ISO-2022-JP" Content-Transfer-Encoding: 7bit X-Mailer: Becky! ver. 2.74 [ja] X-CCMail7: CC-Mail-V7.0.2-Client-Relayed Message-Id: <201812171104.wBHB4QrE008289@ccmail04.silk.ntt-tx.co.jp> X-TM-AS-MML: No X-CC-Mail-RelayStamp: CC-Mail-V5.14-Server To: "Burakov, Anatoly" Cc: dev@dpdk.org, Yasufumi Ogawa , Nakamura Hioryuki Subject: Re: [dpdk-dev] rte_eal_hotplug_remove() generates error message X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 17 Dec 2018 11:04:43 -0000 > On 17-Dec-18 10:23 AM, Burakov, Anatoly wrote: > > On 17-Dec-18 10:02 AM, Hideyuki Yamashita wrote: > >> Dear Thomas and all, > >> > >> I took a look on dpdk code. > >> I firstly write qustions and my analisys > >> on the current dpdk code follows after that. > >> > >> [1.Questions] > >> I have several questions to ask again. > >> Is my understanding correct about followings > >> > >> Q1: "EAL:ERROR, Invalid memory" is ignorable > >> > >> Q2: there is no big difference between calling > >> rte_eal_hotplug_remove(bus->name, dev->name) > >> and > >> rte_dev_remove(dev) because anyway it > >> reaches to rte_pmd_vhost_remove and encounter > >> the same error. > >> > >> [2.snip from my code] > >> ..... > >> ???????? rte_eth_dev_close(port_id); > >> ???????? ret = rte_dev_remove(dev); > >> ???????? if (ret < 0) > >> ???????????????? return ret; > >> ???????? rte_eth_dev_release_port(&rte_eth_devices[port_id]); > >> > >> [3. My analysis on dpdk code] > >> static int > >> ?? rte_pmd_vhost_remove(struct rte_vdev_device *dev) > >> ?? { > >> ??? ........... > >> ????????  eth_dev_close(eth_dev); > >> > >> ?????????? rte_free(vring_states[eth_dev->data->port_id]); > >> ?????????? vring_states[eth_dev->data->port_id] = NULL; > >> > >> ?????????? rte_eth_dev_release_port(eth_dev); > >> > >> As you can see in rte_eth_vhost.c > >> It calls both eth_dev_close and rte_eth_dev_release_port. > >> And inside both functions, it tries to free mac_addrs. > >> ???????? rte_free(dev->data->mac_addrs);?????? //in rth_dev_close > >> ???????? rte_free(eth_dev->data->mac_addrs);? //in > >> rte_eth_dev_release_port > >> > >> I understand that is the reason why > >> /* Free the memory space back to heap */ > >> void rte_free(void *addr) > >> { > >> ???????? if (addr == NULL) return; > >> ???????? if (malloc_heap_free(malloc_elem_from_data(addr)) < 0) > >> ???????????????? RTE_LOG(ERR, EAL, "Error: Invalid memory\n"); > >> } > >> encounter the error. > >> As an experiment, I commented out one of them, "ERR, Invalid memory" > >> disappered. > >> > >> Thanks and BR, > >> Hideyuki Yamashita > >> NTT TechnoCross > >> > >>> Adding my colleague Yasufumi and Hiroyuki as CC. > >>> > >>> We are waiting valuable advice from you. > >>> > >>> Thanks in advance, > >>> Hideyuki Yamashita > >>> NTT TechnoCross > >>> > >>>> > >>>> Dear Thomas and all, > >>>> > >>>> I hope you all get safely back home after DPDK summit. > >>>> (When I get back Japan, it is chilling. (start of winter)) > >>>> > >>>> On DPDK 18.11.0, we tried to remove vhost device by using > >>>> rte_eal_hotplug_remove(). > >>>> However, following syslog message is printed. > >>>> “EAL: Error: Invalid memory” > >>>> > >>>> At DPDK summit San Jose, we had chance to ask Thomas how to handle > >>>> the error message, and he gave us following advice: > >>>> Replace “rte_eal_hotplug_add()” to “rte_dev_probe(devargs)” and > >>>> “rte_eal_hotplug_remove()” to “rte_eth_dev_close() and > >>>> rte_dev_remove(rte_dev)” > >>>> > >>>> We tested above changes, but the result is the same (i.e., same > >>>> error message is printed). > >>>> The debug log message says: > >>>> --- > >>>> [primary] > >>>> VHOST_CONFIG: vhost-user server: socket created, fd: 17 > >>>> VHOST_CONFIG: bind to /tmp/sock0 > >>>> EAL: Error: Invalid memory > >>>> VHOST_CONFIG: vhost-user server: socket created, fd: 17 > >>>> VHOST_CONFIG: bind to /tmp/sock0 > >>>> > >>>> [secondary] > >>>> APP: devargs=eth_vhost0,iface=/tmp/sock0,queues=1 > >>>> EAL: request: eal_dev_mp_request > >>>> EAL: msg: eal_dev_mp_request > >>>> EAL: request: bus_vdev_mp > >>>> EAL: msg: bus_vdev_mp > >>>> EAL: msg: bus_vdev_mp > >>>> EAL: reply: eal_dev_mp_request > >>>> EAL: msg: eal_dev_mp_request > >>>> rte_eth_promiscuous_disable: Function not supported > >>>> rte_eth_allmulticast_disable: Function not supported > >>>> APP: To Server: add > >>>> EAL: request: eal_dev_mp_request > >>>> EAL: msg: eal_dev_mp_request > >>>> EAL: reply: eal_dev_mp_request > >>>> EAL: msg: eal_dev_mp_request > >>>> APP: To Server: del > >>>> APP: devargs=eth_vhost0,iface=/tmp/sock0,queues=1 > >>>> EAL: request: eal_dev_mp_request > >>>> EAL: msg: eal_dev_mp_request > >>>> EAL: request: bus_vdev_mp > >>>> EAL: msg: bus_vdev_mp > >>>> EAL: msg: bus_vdev_mp > >>>> EAL: reply: eal_dev_mp_request > >>>> EAL: msg: eal_dev_mp_request > >>>> rte_eth_promiscuous_disable: Function not supported > >>>> rte_eth_allmulticast_disable: Function not supported > >>>> APP: To Server: add > >>>> --- > >>>> > >>>> We would like to ask: > >>>> 1)??? Is the message “EAL: Error: Invalid memory” ignorable or not? > >>>> There is no obstacle except this message to re-add the vhost device. > >>>> 2)??? Which is the better(best?) way to add/del vhost device > >>>> “rte_eal_hotplug_add/remove()” or the way Thomas suggested? > >>>> > >>>> Thanks in advance and have a nice day. > >>>> > >>>> BR, > >>>> Hideyuki Yamashita > >>>> NTT TechnoCross > >>> > >> > >> > >> > > > > Hi Hideyuki, > > > > The error you're referring to (about invalid memory) means that you're > > trying to free a pointer that points to invalid memory. Meaning, either > > the pointer itself is not pointing to an allocated area, or it points to > > memory that has already been freed. > > > > If dev->data->mac_addrs and eth_dev->data->mac_addrs point to the same > > area, this is a bug, because this would lead to double free, and > > rte_malloc will rightly complain about invalid memory. Now, malloc won't > > try to do anything with the invalid memory, so the error itself is > > harmless *as far as malloc is concerned*, but these attempts to free the > > memory twice should be fixed whereever they happen. > > > > I'm not well-versed in dev infrastructure, so i wouldn't be able to say > > which one of the rte_free calls is an extra, unneeded one. This is > > something e.g. Thomas could help with, or the driver maintainer. > > > ...with that said, to me, this looks suspicious: > > > > rte_eth_dev_close(port_id); > > ret = rte_dev_remove(dev); > > if (ret < 0) > > return ret; > > rte_eth_dev_release_port(&rte_eth_devices[port_id]); > > So, you're closing the port, then you remove the device, then you release port. From your own analysis, what dev_remove does is the following: > > > rte_pmd_vhost_remove(struct rte_vdev_device *dev) > > { > > ........... > >  eth_dev_close(eth_dev); > > > > rte_free(vring_states[eth_dev->data->port_id]); > > vring_states[eth_dev->data->port_id] = NULL; > > > > rte_eth_dev_release_port(eth_dev); > > So, you have duplicate call to eth_dev_close() (which is likely not a problem), and a duplicate call to rte_eth_dev_release_port(). So, if rte_dev_remove() already calls rte_eth_dev_release_port(), why are you calling it in your code? > > Now, as i said above, i'm not well-versed enough in the dev infrastructure to know which of the calls should go where, but i'm pretty sure you shouldn't call rte_eth_dev_release_port twice - this is likely what's causing your double-free. > > -- Thanks, > Anatoly Hello Anatoly, You are right, I should not call rte_eth_dev_release_port. As an experiment, I commented out rte_eth_dev_release_port. But the same error remains. (I compiled my app(SPP) after the modification) VHOST_CONFIG: vhost-user server: socket created, fd: 15 VHOST_CONFIG: bind to /tmp/sock0 EAL: Error: Invalid memory So I think something is still there in DPDK. BR, Hideyuki Yamashita NTT TechnoCross