From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail04.ics.ntt-tx.co.jp (mail05.ics.ntt-tx.co.jp [210.232.35.69]) by dpdk.org (Postfix) with ESMTP id C9057262E for ; Thu, 27 Dec 2018 06:54:25 +0100 (CET) Received: from gwchk03.silk.ntt-tx.co.jp (gwchk03.silk.ntt-tx.co.jp [10.107.0.111]) by mail04.ics.ntt-tx.co.jp (unknown) with ESMTP id wBR5sNRr002876; Thu, 27 Dec 2018 14:54:23 +0900 Received: (from root@localhost) by gwchk03.silk.ntt-tx.co.jp (unknown) id wBR5sNEK005949; Thu, 27 Dec 2018 14:54:23 +0900 Received: from gwchk.silk.ntt-tx.co.jp [10.107.0.110] by gwchk03.silk.ntt-tx.co.jp with ESMTP id QAA05948; Thu, 27 Dec 2018 14:54:23 +0900 Received: from imss06.silk.ntt-tx.co.jp (localhost [127.0.0.1]) by ccmail04.silk.ntt-tx.co.jp (unknown) with ESMTP id wBR5sNvL005269; Thu, 27 Dec 2018 14:54:23 +0900 Received: from imss06.silk.ntt-tx.co.jp (localhost [127.0.0.1]) by imss06.silk.ntt-tx.co.jp (unknown) with ESMTP id wBR5sNlX002937; Thu, 27 Dec 2018 14:54:23 +0900 Received: from ccmail04 (smtp03.silk.ntt-tx.co.jp [10.107.0.135]) by imss06.silk.ntt-tx.co.jp (unknown) with SMTP id wBR5sNGJ002934; Thu, 27 Dec 2018 14:54:23 +0900 Date: Thu, 27 Dec 2018 14:52:17 +0900 From: Hideyuki Yamashita In-Reply-To: <20181218130901.GB2464@dpdk-tbie.sh.intel.com> References: <201812181131.wBIBVX8p018158@ccmail04.silk.ntt-tx.co.jp> <20181218130901.GB2464@dpdk-tbie.sh.intel.com> MIME-Version: 1.0 Content-Type: text/plain; charset="ISO-2022-JP" Content-Transfer-Encoding: 7bit X-Mailer: Becky! ver. 2.74 [ja] X-CCMail7: CC-Mail-V7.0.2-Client-Relayed Message-Id: <201812270554.wBR5rnsV005042@ccmail04.silk.ntt-tx.co.jp> X-TM-AS-MML: No X-CC-Mail-RelayStamp: CC-Mail-V5.14-Server To: Tiwei Bie Cc: "Burakov, Anatoly" , dev@dpdk.org, Yasufumi Ogawa , Nakamura Hioryuki , maxime.coquelin@redhat.com, zhihong.wang@intel.com Subject: Re: [dpdk-dev] rte_eal_hotplug_remove() generates error message X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 27 Dec 2018 05:54:26 -0000 > On Tue, Dec 18, 2018 at 08:30:16PM +0900, Hideyuki Yamashita wrote: > > > On Tue, Dec 18, 2018 at 02:52:16PM +0900, Hideyuki Yamashita wrote: > > > > > On Tue, Dec 18, 2018 at 12:11:33PM +0900, Hideyuki Yamashita wrote: > > > > > > > > > > > > On Mon, Dec 17, 2018 at 11:21:01AM +0000, Burakov, Anatoly wrote: > > > > > > > > On 17-Dec-18 10:45 AM, Hideyuki Yamashita wrote: > > > > > > > > > > On 17-Dec-18 10:02 AM, Hideyuki Yamashita wrote: > > > > > > > > > > > Dear Thomas and all, > > > > > > > > > > > > > > > > > > > > > > I took a look on dpdk code. > > > > > > > > > > > I firstly write qustions and my analisys > > > > > > > > > > > on the current dpdk code follows after that. > > > > > > > > > > > > > > > > > > > > > > [1.Questions] > > > > > > > > > > > I have several questions to ask again. > > > > > > > > > > > Is my understanding correct about followings > > > > > > > > > > > > > > > > > > > > > > Q1: "EAL:ERROR, Invalid memory" is ignorable > > > > > > > > > > > > > > > > > > > > > > Q2: there is no big difference between calling > > > > > > > > > > > rte_eal_hotplug_remove(bus->name, dev->name) > > > > > > > > > > > and > > > > > > > > > > > rte_dev_remove(dev) because anyway it > > > > > > > > > > > reaches to rte_pmd_vhost_remove and encounter > > > > > > > > > > > the same error. > > > > > > > > > > > > > > > > > > > > > > [2.snip from my code] > > > > > > > > > > > ..... > > > > > > > > > > > rte_eth_dev_close(port_id); > > > > > > > > > > > ret = rte_dev_remove(dev); > > > > > > > > > > > if (ret < 0) > > > > > > > > > > > return ret; > > > > > > > > > > > rte_eth_dev_release_port(&rte_eth_devices[port_id]); > > > > > > > > > > > > > > > > > > > > > > [3. My analysis on dpdk code] > > > > > > > > > > > static int > > > > > > > > > > > rte_pmd_vhost_remove(struct rte_vdev_device *dev) > > > > > > > > > > > { > > > > > > > > > > > ........... > > > > > > > > > > >  eth_dev_close(eth_dev); > > > > > > > > > > > > > > > > > > > > > > rte_free(vring_states[eth_dev->data->port_id]); > > > > > > > > > > > vring_states[eth_dev->data->port_id] = NULL; > > > > > > > > > > > > > > > > > > > > > > rte_eth_dev_release_port(eth_dev); > > > > > > > > > > > > > > > > > > > > > > As you can see in rte_eth_vhost.c > > > > > > > > > > > It calls both eth_dev_close and rte_eth_dev_release_port. > > > > > > > > > > > And inside both functions, it tries to free mac_addrs. > > > > > > > > > > > rte_free(dev->data->mac_addrs); //in rth_dev_close > > > > > > > > > > > rte_free(eth_dev->data->mac_addrs); //in rte_eth_dev_release_port > > > > > > > > > > > > > > > > > > > > > > I understand that is the reason why > > > > > > > > > > > /* Free the memory space back to heap */ > > > > > > > > > > > void rte_free(void *addr) > > > > > > > > > > > { > > > > > > > > > > > if (addr == NULL) return; > > > > > > > > > > > if (malloc_heap_free(malloc_elem_from_data(addr)) < 0) > > > > > > > > > > > RTE_LOG(ERR, EAL, "Error: Invalid memory\n"); > > > > > > > > > > > } > > > > > > > > > > > encounter the error. > > > > > > > > > > > As an experiment, I commented out one of them, "ERR, Invalid memory" > > > > > > > > > > > disappered. > > > > > > > > > > > > > > > > > > > > > > Thanks and BR, > > > > > > > > > > > Hideyuki Yamashita > > > > > > > > > > > NTT TechnoCross > > > > > > > > > > > > > > > > > > > > > > > Adding my colleague Yasufumi and Hiroyuki as CC. > > > > > > > > > > > > > > > > > > > > > > > > We are waiting valuable advice from you. > > > > > > > > > > > > > > > > > > > > > > > > Thanks in advance, > > > > > > > > > > > > Hideyuki Yamashita > > > > > > > > > > > > NTT TechnoCross > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Dear Thomas and all, > > > > > > > > > > > > > > > > > > > > > > > > > > I hope you all get safely back home after DPDK summit. > > > > > > > > > > > > > (When I get back Japan, it is chilling. (start of winter)) > > > > > > > > > > > > > > > > > > > > > > > > > > On DPDK 18.11.0, we tried to remove vhost device by using rte_eal_hotplug_remove(). > > > > > > > > > > > > > However, following syslog message is printed. > > > > > > > > > > > > > “EAL: Error: Invalid memory” > > > > > > > > > > > > > > > > > > > > > > > > > > At DPDK summit San Jose, we had chance to ask Thomas how to handle the error message, and he gave us following advice: > > > > > > > > > > > > > Replace “rte_eal_hotplug_add()” to “rte_dev_probe(devargs)” and > > > > > > > > > > > > > “rte_eal_hotplug_remove()” to “rte_eth_dev_close() and rte_dev_remove(rte_dev)” > > > > > > > > > > > > > > > > > > > > > > > > > > We tested above changes, but the result is the same (i.e., same error message is printed). > > > > > > > > > > > > > The debug log message says: > > > > > > > > > > > > > --- > > > > > > > > > > > > > [primary] > > > > > > > > > > > > > VHOST_CONFIG: vhost-user server: socket created, fd: 17 > > > > > > > > > > > > > VHOST_CONFIG: bind to /tmp/sock0 > > > > > > > > > > > > > EAL: Error: Invalid memory > > > > > > > > > > > > > VHOST_CONFIG: vhost-user server: socket created, fd: 17 > > > > > > > > > > > > > VHOST_CONFIG: bind to /tmp/sock0 > > > > > > > > > > > > > > > > > > > > > > > > > > [secondary] > > > > > > > > > > > > > APP: devargs=eth_vhost0,iface=/tmp/sock0,queues=1 > > > > > > > > > > > > > EAL: request: eal_dev_mp_request > > > > > > > > > > > > > EAL: msg: eal_dev_mp_request > > > > > > > > > > > > > EAL: request: bus_vdev_mp > > > > > > > > > > > > > EAL: msg: bus_vdev_mp > > > > > > > > > > > > > EAL: msg: bus_vdev_mp > > > > > > > > > > > > > EAL: reply: eal_dev_mp_request > > > > > > > > > > > > > EAL: msg: eal_dev_mp_request > > > > > > > > > > > > > rte_eth_promiscuous_disable: Function not supported > > > > > > > > > > > > > rte_eth_allmulticast_disable: Function not supported > > > > > > > > > > > > > APP: To Server: add > > > > > > > > > > > > > EAL: request: eal_dev_mp_request > > > > > > > > > > > > > EAL: msg: eal_dev_mp_request > > > > > > > > > > > > > EAL: reply: eal_dev_mp_request > > > > > > > > > > > > > EAL: msg: eal_dev_mp_request > > > > > > > > > > > > > APP: To Server: del > > > > > > > > > > > > > APP: devargs=eth_vhost0,iface=/tmp/sock0,queues=1 > > > > > > > > > > > > > EAL: request: eal_dev_mp_request > > > > > > > > > > > > > EAL: msg: eal_dev_mp_request > > > > > > > > > > > > > EAL: request: bus_vdev_mp > > > > > > > > > > > > > EAL: msg: bus_vdev_mp > > > > > > > > > > > > > EAL: msg: bus_vdev_mp > > > > > > > > > > > > > EAL: reply: eal_dev_mp_request > > > > > > > > > > > > > EAL: msg: eal_dev_mp_request > > > > > > > > > > > > > rte_eth_promiscuous_disable: Function not supported > > > > > > > > > > > > > rte_eth_allmulticast_disable: Function not supported > > > > > > > > > > > > > APP: To Server: add > > > > > > > > > > > > > --- > > > > > > > > > > > > > > > > > > > > > > > > > > We would like to ask: > > > > > > > > > > > > > 1) Is the message “EAL: Error: Invalid memory” ignorable or not? There is no obstacle except this message to re-add the vhost device. > > > > > > > > > > > > > 2) Which is the better(best?) way to add/del vhost device “rte_eal_hotplug_add/remove()” or the way Thomas suggested? > > > > > > > > > > > > > > > > > > > > > > > > > > Thanks in advance and have a nice day. > > > > > > > > > > > > > > > > > > > > > > > > > > BR, > > > > > > > > > > > > > Hideyuki Yamashita > > > > > > > > > > > > > NTT TechnoCross > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Hi Hideyuki, > > > > > > > > > > > > > > > > > > > > The error you're referring to (about invalid memory) means that you're trying to free a pointer that points to invalid memory. Meaning, either the pointer itself is not pointing to an allocated area, or it points to memory that has already been freed. > > > > > > > > > > > > > > > > > > > > If dev->data->mac_addrs and eth_dev->data->mac_addrs point to the same area, this is a bug, because this would lead to double free, and rte_malloc will rightly complain about invalid memory. Now, malloc won't try to do anything with the invalid memory, so the error itself is harmless *as far as malloc is concerned*, but these attempts to free the memory twice should be fixed whereever they happen. > > > > > > > > > > > > > > > > > > > > I'm not well-versed in dev infrastructure, so i wouldn't be able to say which one of the rte_free calls is an extra, unneeded one. This is something e.g. Thomas could help with, or the driver maintainer. > > > > > > > > > > > > > > > > > > > > --Thanks, > > > > > > > > > > Anatoly > > > > > > > > > Hello Anatoly, > > > > > > > > > > > > > > > > > > Thanks for your reply for my newbie question. > > > > > > > > > Now I understand that this error is harmless from DPDK application(SPP) > > > > > > > > > point of view in practice. Thanks. > > > > > > > > > But anyway if there is a double free logic, it is a bug and should be > > > > > > > > > fixed. > > > > > > > > > The remaining issues are > > > > > > > > > 1. If it is really a bug (or my mis-understanding) > > > > > > > > > 2. If is is a bug which function should remove rte_free(mac_addrs) > > > > > > > > > > > > > > > > From description, it looks like a bug. Correct usage of API (rte_dev_close() > > > > > > > > followed by rte_dev_remove()) should not trigger any errors. You might want > > > > > > > > to create a BugZilla entry describing the issue. > > > > > > > > > > > > > > I also think it's a bug. The rte_free(dev->data->mac_addrs) in > > > > > > > vhost PMD's eth_dev_close() should be removed, as the common data > > > > > > > freeing has been moved to rte_eth_dev_release_port(). > > > > > > > > > > > > > > @Hideyuki, thanks for digging into this! > > > > > > > > > > > > Hello Twei, > > > > > > > > > > > > I share the same view with you. > > > > > > rte_eth_dev_release_port() is used in commonly in many pmds. > > > > > > rte_dev_close() is vhost pmd local. > > > > > > > > > > > > BTW, what is the difference between using > > > > > > rte_eal_hotplug_remove and rte_eth_dev_close()+rte_dev_remove? > > > > > > (I see no big difference right now from application developer point of > > > > > > view) > > > > > > > > > > IMO, the difference is that the former one doesn't require > > > > > operating on rte_device directly. > > > > > > > > > > > > > > > > > Next question is what should I do next (you know I am very newbie). > > > > > > File this bug into BUgzzila of DPDK? > > > > > > > > > > The issue is quite clear, I can fix it directly. > > > > > > > > > > And as the fix is also quite clear, if you want to contribute > > > > > patch, you can also try to send a fix patch [1]. > > > > > > > > > > Either way is fine for me. Just let me know. :-) > > > > > > > > > > [1] https://core.dpdk.org/contribute/#send > > > > Hello, > > > > > > > > Thanks for your guidance. > > > > > > > > Well, I think it is a good chance to be "a contributer of DPDK". > > > > (even for small one. I am getting excited already) > > > > So let me give a try. > > > > > > Great! Looking forward to your patch. :-) > > Hello, > > > > I posted my very first patch to DPDK. > > After that very basic questions come up > > to my mind. > > When will those will be released(if that is merged safely)? > > For the release time, you can check the schedule of the > next release from below link: > > https://core.dpdk.org/roadmap/#dates > > > That is my natural question as application developer. > > (Next release 19.02 of course?) > > Yes. Dear Tiwei and all, I have question. Q1. I CCed my patch to stable@dpdk.org Is that mean it will be reflected into 18.11 stable release? The reason why I ask this is I think it is better that bug should be reflected into LTS because I and SPP users mainly use DPDK LTS(18.11). Q2. If yes, when will those stable release will be released? (e.g. once a month ) Q3. I understand my patch is acked and merged into dpdk-next-virtio. What is dpdk-next-virtio? It is the branch which will be reflected into next release? Thanks in advance! BR, Hideyuki Yamashita NTT TechnoCross > > > > I need to craeate patch also to SPP(my app) > > and I have to write down something about > > fix release timing of this bug in my commit message > > to notify the SPP users. > > (e.g. Until the next release of DPDK, you may encounter > > "EAL:Error Invalid memory" when deleteing vhost pmd.) > > > > Thanks in advance for your kind advice to newbie. > > > > BR, > > Hideyuki Yamashita > > NTT TechnoCross > > > > > > > > > > Thanks for your kindness. > > > > > > > > BR, > > > > Hideyuki Yamashita > > > > NTT TechnoCross > > > > > > > > > > > > > > > > > > > > > Thanks for your confirming. > > > > > > > > > > > > BR, > > > > > > Hideyuki Yamashita > > > > > > NTT TechnoCross > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Thanks, > > > > > > > > > Hideyuki Yamashita > > > > > > > > > NTT TechnoCross > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > Thanks, > > > > > > > > Anatoly > > > > > > > > > > > > > > > > > > > > > >