From: Yuanhan Liu
To: Gopakumar Choorakkot Edakkunni
Cc: dev@dpdk.org
Date: Fri, 17 Mar 2017 13:40:36 +0800
Message-ID: <20170317054036.GB18844@yliu-dev.sh.intel.com>
Subject: Re: [dpdk-dev] virtio "how to restart applications" - //dpdk.org/doc/virtio-net-pmd

On Thu, Mar 16, 2017 at 10:30:09PM -0700, Gopakumar Choorakkot Edakkunni wrote:
> Thanks for the confirmation, glad I reached the person who knows the nuts and
> bolts of virtio :-). So if the host is not in our control (i.e. if I am just
> running as a VM on host provided by thirdparty vendor), is there any workaround
> I can do from the guest side to prevent problems from happening on a guest
> restart ?

Not too much. You might want to hack the guest DPDK EAL memory initialization
part, though, so that it does not reset the hugepage memory on start. But that
is too hacky for me to recommend!

> And if there's no workarounds at all and the host has to change, instead of
> asking the third party vendor to do a wholesale upgrade to 16.04, is there one/
> few commits that can be added to the host ovs-dpdk to take care of this guest
> restart virtio-reset-before opening case ?

Yes, backporting the commits I have mentioned should fix it. But please note
that I did some code refactoring before those fixes, so they won't apply
cleanly to DPDK v2.2.

And if you want to upgrade, I'd suggest upgrading to v16.11, which is an LTS
release.

	--yliu

>
> Rgds,
> Gopa.
>
> On Thu, Mar 16, 2017 at 10:24 PM, Yuanhan Liu wrote:
>
> > On Thu, Mar 16, 2017 at 10:20:30PM -0700, Gopakumar Choorakkot Edakkunni wrote:
> > > >> When I was saying dpdk version, I meant the DPDK version with OVS.
> > >
> > > Oh I see! My apologies for the misunderstanding. The dpdk version used by host
> > > ovs should be dpdk2.2, the guest process uses dpdk16.07. The OVS process is not
> > > getting restarted, what is getting restarted is the guest process using
> > > dpdk16.07 - so the above clarifications you had about virtio being
> > > reset-before-opened on guest restart - does that still hold good or does that
> > > need the HOST side dpdk to be 16.04 or above ?
> >
> > Yes, the HOST dpdk should be >= v16.04.
> >
> >         --yliu
> >
> > > >> And yes, the fixes are not included in the DPDK required for OVS 2.4.
> > >
> > > Thanks for the info.
> > >
> > > Rgds,
> > > Gopa.
> > >
> > > On Thu, Mar 16, 2017 at 10:13 PM, Yuanhan Liu <yuanhan.liu@linux.intel.com>
> > > wrote:
> > >
> > > > On Thu, Mar 16, 2017 at 09:56:01PM -0700, Gopakumar Choorakkot Edakkunni
> > > > wrote:
> > > > > Hi Yuanhan,
> > > > >
> > > > > Thanks for the confirmation about not having to do anything special to close
> > > > > the ports on dpdk going down or coming up.
> > > > >
> > > > > As for the question about if I met any issue of ovs getting stuck - yes, my
> > > > > guest process runs dpdk 16.07 as I mentioned earlier - and if I kill my guest
> > > > > process, then the host OVS-dpdk on the host reports stall ! The OVS-dpdk and
> > > > > emu versions I use are as below. But maybe that is because of the ovs missing
> > > > > the fixes you mentioned ?
> > > >
> > > > When I was saying dpdk version, I meant the DPDK version with OVS.
> > > >
> > > > > ~# ovs-vswitchd --version
> > > > > ovs-vswitchd (Open vSwitch) 2.4.1
> > > >
> > > > And yes, the fixes are not included in the DPDK required for OVS 2.4.
> > > >
> > > >         --yliu
> > > >
> > > > > Compiled Nov 14 2016 06:53:31
> > > > > # kvm --version
> > > > > QEMU emulator version 2.2.0, Copyright (c) 2003-2008 Fabrice Bellard
> > > > > ~#
> > > > >
> > > > >
> > > > > Rgds,
> > > > > Gopa.
> > > > >
> > > > > On Thu, Mar 16, 2017 at 9:35 PM, Yuanhan Liu <yuanhan.liu@linux.intel.com>
> > > > > wrote:
> > > > >
> > > > > > On Thu, Mar 16, 2017 at 07:48:28PM -0700, Gopakumar Choorakkot Edakkunni
> > > > > > wrote:
> > > > > > > Thanks a lot for the response Yuanhan. I am using dpdk v16.07. So what you are
> > > > > > > saying is that in 16.07, we dont really need to call rte_eth_dev_close() on
> > > > > > > exit,
> > > > > >
> > > > > > It's not about "don't really need", it's more like "it's hard to". Just
> > > > > > think that it may crash at any time.
> > > > > >
> > > > > > > because dpdk will ensure that it will do virtio reset before init when it
> > > > > > > comes up right ?
> > > > > >
> > > > > > No, It just handles the abnormal case well when guest APP restarts.
> > > > > >
> > > > > > > Regarding the vhost commits you mentioned - do we still need those fixes if we
> > > > > > > have the "virtio reset before init" mechanism ?
> > > > > >
> > > > > > Yes, we still need them: just think some malicious guest may also forge
> > > > > > data like that.
> > > > > >
> > > > > > I'm a bit confused then. Have you actually met any issue (like got stuck)
> > > > > > with DPDK v16.07?
> > > > > >
> > > > > >         --yliu
> > > > > >
> > > > > > > Or that is a separate problem
> > > > > > > altogether (and hence we would need those fixes) ?
> > > > > > >
> > > > > > > Rgds,
> > > > > > > Gopa.
> > > > > > >
> > > > > > > On Thu, Mar 16, 2017 at 7:06 PM, Yuanhan Liu <yuanhan.liu@linux.intel.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > On Thu, Mar 16, 2017 at 12:39:16PM -0700, Gopakumar Choorakkot Edakkunni
> > > > > > > > wrote:
> > > > > > > > > So the doc says we should call rte_eth_dev_close() *before* going down. And I
> > > > > > > > > know that especially in dpdk-virtionet in the guest + ovs-dpdk in the host,
> > > > > > > > > the ovs ends up getting stalled/stuck (!!) if I dont close the port before
> > > > > > > > > starting() it when the guest dpdk process comes back up.
> > > > > > > >
> > > > > > > > I'm assuming you were using an old version, something like dpdk v2.2?
> > > > > > > > IIRC, DPDK v16.04 should have fixed your issue.
> > > > > > > >
> > > > > > > > > Considering that this not done properly can screw up the HOST ovs, and I want
> > > > > > > > > to do everything possible to avoid that, I want to be 200% sure that I call
> > > > > > > > > close even if my process gets a kill -9 .. So obviously the only way of doing
> > > > > > > > > that is to close the port when the dpdk process comes back up and *before* we
> > > > > > > > > init the port. rte_eth_dev_close() is not capable of doing that as it expects
> > > > > > > > > the port parameters to be initialized etc.. before it can be called.
> > > > > > > >
> > > > > > > > We do virtio reset before init, which is basically what rte_eth_dev_close()
> > > > > > > > mainly does. So I see no big issue here.
> > > > > > > >
> > > > > > > > The stuck issue is due to hugepage reset by the guest DPDK application,
> > > > > > > > leading all virtio vring elements being mem zeroed. The old vhost doesn't
> > > > > > > > handle it well, as a result, it got stuck. And here are some relevant
> > > > > > > > commits:
> > > > > > > >
> > > > > > > >     a436f53 vhost: avoid dead loop chain
> > > > > > > >     c687b0b vhost: check for ring descriptors overflow
> > > > > > > >     623bc47 vhost: do sanity check for ring descriptor length
> > > > > > > >
> > > > > > > >         --yliu
> > > > > > > >
> > > > > > > > > Any other
> > > > > > > > > suggestions on what can be done to close on restart rather than close on going
> > > > > > > > > down ? Thought of bouncing this by the alias before I add a version of close
> > > > > > > > > myself that can do this close-on-restart
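
For readers following the thread: the three vhost commits quoted above are
described only by their titles (dead loop chain, ring descriptor overflow,
descriptor length sanity check). The sketch below is not the DPDK vhost code.
It is a minimal, self-contained illustration, under stated assumptions, of the
kind of host-side checks those titles suggest, and of one plausible way a ring
zeroed by a guest restart could confuse a host that does not check. The
descriptor layout follows the virtio split-ring format; walk_chain(),
enum walk_result, and NET_HDR_LEN are hypothetical names and values chosen for
this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VRING_DESC_F_NEXT 1   /* descriptor continues via the 'next' field */
#define NET_HDR_LEN 12u       /* assumed virtio-net header size for this sketch */

/* Split-ring descriptor layout as defined by the virtio specification. */
struct vring_desc {
    uint64_t addr;
    uint32_t len;
    uint16_t flags;
    uint16_t next;
};

/* Hypothetical result codes for this sketch. */
enum walk_result { WALK_OK = 0, WALK_SHORT_HEAD, WALK_BAD_INDEX, WALK_LOOP };

/*
 * Hypothetical helper: walk one descriptor chain starting at 'head' and
 * store the payload size (chain length minus the header) in '*payload'.
 * Returns WALK_OK, or the reason the chain was rejected.
 */
static enum walk_result
walk_chain(const struct vring_desc *ring, uint16_t ring_size,
           uint16_t head, uint64_t *payload)
{
    uint64_t total;
    uint16_t idx = head;
    uint32_t seen = 0;

    assert(ring != NULL && payload != NULL && ring_size > 0);

    /* Index check: never read outside the descriptor table. */
    if (idx >= ring_size)
        return WALK_BAD_INDEX;

    /* Length check: a zeroed ring has len == 0, so an unchecked
     * "len - NET_HDR_LEN" would wrap to a huge unsigned value. */
    if (ring[idx].len < NET_HDR_LEN)
        return WALK_SHORT_HEAD;
    total = ring[idx].len - NET_HDR_LEN;

    for (;;) {
        /* Loop check: a valid chain cannot visit more descriptors
         * than the ring holds. */
        if (++seen > ring_size)
            return WALK_LOOP;
        if (!(ring[idx].flags & VRING_DESC_F_NEXT))
            break;
        idx = ring[idx].next;
        if (idx >= ring_size)
            return WALK_BAD_INDEX;
        total += ring[idx].len;
    }

    *payload = total;
    return WALK_OK;
}
```

With an all-zero descriptor table, as left behind when the guest resets its
hugepage memory, this helper rejects the chain at the length check instead of
computing a wrapped-around size. The same checks also cover the forged-ring
case raised in the thread, which is why they remain useful even when the guest
does a virtio reset before init.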