From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mga03.intel.com (mga03.intel.com [134.134.136.65]) by dpdk.org (Postfix) with ESMTP id C48381B893 for ; Mon, 17 Dec 2018 12:21:04 +0100 (CET) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga001.fm.intel.com ([10.253.24.23]) by orsmga103.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 17 Dec 2018 03:21:03 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.56,365,1539673200"; d="scan'208";a="130591081" Received: from aburakov-mobl1.ger.corp.intel.com (HELO [10.251.86.46]) ([10.251.86.46]) by fmsmga001.fm.intel.com with ESMTP; 17 Dec 2018 03:21:01 -0800 To: Hideyuki Yamashita Cc: dev@dpdk.org, Yasufumi Ogawa , Nakamura Hioryuki References: <201812171004.wBHA432F013027@ccmail04.silk.ntt-tx.co.jp> <0eb59f12-0192-a234-a773-7f3061abea3a@intel.com> <201812171046.wBHAkViW032759@ccmail04.silk.ntt-tx.co.jp> From: "Burakov, Anatoly" Message-ID: <272166ee-d05f-528a-06c9-962e3237674e@intel.com> Date: Mon, 17 Dec 2018 11:21:01 +0000 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:60.0) Gecko/20100101 Thunderbird/60.3.3 MIME-Version: 1.0 In-Reply-To: <201812171046.wBHAkViW032759@ccmail04.silk.ntt-tx.co.jp> Content-Type: text/plain; charset=iso-2022-jp; format=flowed; delsp=yes Content-Language: en-US Content-Transfer-Encoding: 7bit Subject: Re: [dpdk-dev] rte_eal_hotplug_remove() generates error message X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 17 Dec 2018 11:21:05 -0000 On 17-Dec-18 10:45 AM, Hideyuki Yamashita wrote: >> On 17-Dec-18 10:02 AM, Hideyuki Yamashita wrote: >>> Dear Thomas and all, >>> >>> I took a look on dpdk code. >>> I firstly write qustions and my analisys >>> on the current dpdk code follows after that. >>> >>> [1.Questions] >>> I have several questions to ask again. >>> Is my understanding correct about followings >>> >>> Q1: "EAL:ERROR, Invalid memory" is ignorable >>> >>> Q2: there is no big difference between calling >>> rte_eal_hotplug_remove(bus->name, dev->name) >>> and >>> rte_dev_remove(dev) because anyway it >>> reaches to rte_pmd_vhost_remove and encounter >>> the same error. >>> >>> [2.snip from my code] >>> ..... >>> rte_eth_dev_close(port_id); >>> ret = rte_dev_remove(dev); >>> if (ret < 0) >>> return ret; >>> rte_eth_dev_release_port(&rte_eth_devices[port_id]); >>> >>> [3. My analysis on dpdk code] >>> static int >>> rte_pmd_vhost_remove(struct rte_vdev_device *dev) >>> { >>> ........... >>>  eth_dev_close(eth_dev); >>> >>> rte_free(vring_states[eth_dev->data->port_id]); >>> vring_states[eth_dev->data->port_id] = NULL; >>> >>> rte_eth_dev_release_port(eth_dev); >>> >>> As you can see in rte_eth_vhost.c >>> It calls both eth_dev_close and rte_eth_dev_release_port. >>> And inside both functions, it tries to free mac_addrs. >>> rte_free(dev->data->mac_addrs); //in rth_dev_close >>> rte_free(eth_dev->data->mac_addrs); //in rte_eth_dev_release_port >>> >>> I understand that is the reason why >>> /* Free the memory space back to heap */ >>> void rte_free(void *addr) >>> { >>> if (addr == NULL) return; >>> if (malloc_heap_free(malloc_elem_from_data(addr)) < 0) >>> RTE_LOG(ERR, EAL, "Error: Invalid memory\n"); >>> } >>> encounter the error. >>> As an experiment, I commented out one of them, "ERR, Invalid memory" >>> disappered. >>> >>> Thanks and BR, >>> Hideyuki Yamashita >>> NTT TechnoCross >>> >>>> Adding my colleague Yasufumi and Hiroyuki as CC. >>>> >>>> We are waiting valuable advice from you. >>>> >>>> Thanks in advance, >>>> Hideyuki Yamashita >>>> NTT TechnoCross >>>> >>>>> >>>>> Dear Thomas and all, >>>>> >>>>> I hope you all get safely back home after DPDK summit. >>>>> (When I get back Japan, it is chilling. (start of winter)) >>>>> >>>>> On DPDK 18.11.0, we tried to remove vhost device by using rte_eal_hotplug_remove(). >>>>> However, following syslog message is printed. >>>>> “EAL: Error: Invalid memory” >>>>> >>>>> At DPDK summit San Jose, we had chance to ask Thomas how to handle the error message, and he gave us following advice: >>>>> Replace “rte_eal_hotplug_add()” to “rte_dev_probe(devargs)” and >>>>> “rte_eal_hotplug_remove()” to “rte_eth_dev_close() and rte_dev_remove(rte_dev)” >>>>> >>>>> We tested above changes, but the result is the same (i.e., same error message is printed). >>>>> The debug log message says: >>>>> --- >>>>> [primary] >>>>> VHOST_CONFIG: vhost-user server: socket created, fd: 17 >>>>> VHOST_CONFIG: bind to /tmp/sock0 >>>>> EAL: Error: Invalid memory >>>>> VHOST_CONFIG: vhost-user server: socket created, fd: 17 >>>>> VHOST_CONFIG: bind to /tmp/sock0 >>>>> >>>>> [secondary] >>>>> APP: devargs=eth_vhost0,iface=/tmp/sock0,queues=1 >>>>> EAL: request: eal_dev_mp_request >>>>> EAL: msg: eal_dev_mp_request >>>>> EAL: request: bus_vdev_mp >>>>> EAL: msg: bus_vdev_mp >>>>> EAL: msg: bus_vdev_mp >>>>> EAL: reply: eal_dev_mp_request >>>>> EAL: msg: eal_dev_mp_request >>>>> rte_eth_promiscuous_disable: Function not supported >>>>> rte_eth_allmulticast_disable: Function not supported >>>>> APP: To Server: add >>>>> EAL: request: eal_dev_mp_request >>>>> EAL: msg: eal_dev_mp_request >>>>> EAL: reply: eal_dev_mp_request >>>>> EAL: msg: eal_dev_mp_request >>>>> APP: To Server: del >>>>> APP: devargs=eth_vhost0,iface=/tmp/sock0,queues=1 >>>>> EAL: request: eal_dev_mp_request >>>>> EAL: msg: eal_dev_mp_request >>>>> EAL: request: bus_vdev_mp >>>>> EAL: msg: bus_vdev_mp >>>>> EAL: msg: bus_vdev_mp >>>>> EAL: reply: eal_dev_mp_request >>>>> EAL: msg: eal_dev_mp_request >>>>> rte_eth_promiscuous_disable: Function not supported >>>>> rte_eth_allmulticast_disable: Function not supported >>>>> APP: To Server: add >>>>> --- >>>>> >>>>> We would like to ask: >>>>> 1) Is the message “EAL: Error: Invalid memory” ignorable or not? There is no obstacle except this message to re-add the vhost device. >>>>> 2) Which is the better(best?) way to add/del vhost device “rte_eal_hotplug_add/remove()” or the way Thomas suggested? >>>>> >>>>> Thanks in advance and have a nice day. >>>>> >>>>> BR, >>>>> Hideyuki Yamashita >>>>> NTT TechnoCross >>>> >>> >>> >>> >> Hi Hideyuki, >> >> The error you're referring to (about invalid memory) means that you're trying to free a pointer that points to invalid memory. Meaning, either the pointer itself is not pointing to an allocated area, or it points to memory that has already been freed. >> >> If dev->data->mac_addrs and eth_dev->data->mac_addrs point to the same area, this is a bug, because this would lead to double free, and rte_malloc will rightly complain about invalid memory. Now, malloc won't try to do anything with the invalid memory, so the error itself is harmless *as far as malloc is concerned*, but these attempts to free the memory twice should be fixed whereever they happen. >> >> I'm not well-versed in dev infrastructure, so i wouldn't be able to say which one of the rte_free calls is an extra, unneeded one. This is something e.g. Thomas could help with, or the driver maintainer. >> >> --Thanks, >> Anatoly > Hello Anatoly, > > Thanks for your reply for my newbie question. > Now I understand that this error is harmless from DPDK application(SPP) > point of view in practice. Thanks. > But anyway if there is a double free logic, it is a bug and should be > fixed. > The remaining issues are > 1. If it is really a bug (or my mis-understanding) > 2. If is is a bug which function should remove rte_free(mac_addrs) From description, it looks like a bug. Correct usage of API (rte_dev_close() followed by rte_dev_remove()) should not trigger any errors. You might want to create a BugZilla entry describing the issue. > > Thanks, > Hideyuki Yamashita > NTT TechnoCross > > -- Thanks, Anatoly