From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <yuanhan.liu@linux.intel.com>
Received: from mga03.intel.com (mga03.intel.com [134.134.136.65])
 by dpdk.org (Postfix) with ESMTP id 2007020F
 for <dev@dpdk.org>; Mon, 11 Jul 2016 13:02:39 +0200 (CEST)
Received: from fmsmga002.fm.intel.com ([10.253.24.26])
 by orsmga103.jf.intel.com with ESMTP; 11 Jul 2016 04:02:38 -0700
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.28,346,1464678000"; d="scan'208";a="1019378978"
Received: from yliu-dev.sh.intel.com (HELO yliu-dev) ([10.239.67.162])
 by fmsmga002.fm.intel.com with ESMTP; 11 Jul 2016 04:02:37 -0700
Date: Mon, 11 Jul 2016 19:05:03 +0800
From: Yuanhan Liu <yuanhan.liu@linux.intel.com>
To: Ilya Maximets <i.maximets@samsung.com>
Cc: dev@dpdk.org, Huawei Xie <huawei.xie@intel.com>,
 Dyasly Sergey <s.dyasly@samsung.com>,
 Heetae Ahn <heetae82.ahn@samsung.com>,
 Jianfeng Tan <jianfeng.tan@intel.com>,
 Stephen Hemminger <stephen@networkplumber.org>
Message-ID: <20160711110503.GZ26521@yliu-dev.sh.intel.com>
References: <1463748604-27251-1-git-send-email-i.maximets@samsung.com>
 <20160701073506.GQ2831@yliu-dev.sh.intel.com>
 <577CE930.2070007@samsung.com>
 <20160706122446.GO26521@yliu-dev.sh.intel.com>
 <577F9328.1030901@samsung.com>
 <20160710131731.GS26521@yliu-dev.sh.intel.com>
 <20160711083825.GY26521@yliu-dev.sh.intel.com>
 <57836BE0.2070401@samsung.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <57836BE0.2070401@samsung.com>
User-Agent: Mutt/1.5.23 (2014-03-12)
Subject: Re: [dpdk-dev] [PATCH] vhost: fix segfault on bad descriptor
	address.
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: patches and discussions about DPDK <dev.dpdk.org>
List-Unsubscribe: <http://dpdk.org/ml/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://dpdk.org/ml/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <http://dpdk.org/ml/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
X-List-Received-Date: Mon, 11 Jul 2016 11:02:40 -0000

On Mon, Jul 11, 2016 at 12:50:24PM +0300, Ilya Maximets wrote:
> On 11.07.2016 11:38, Yuanhan Liu wrote:
> > On Sun, Jul 10, 2016 at 09:17:31PM +0800, Yuanhan Liu wrote:
> >> On Fri, Jul 08, 2016 at 02:48:56PM +0300, Ilya Maximets wrote:
> >>>
> >>> Another point is that the crash consistently happens on queue_id=3 (second
> >>> RX queue) in my scenario. It is a virtqueue newly allocated during
> >>> reconfiguration from rxq=1 to rxq=2.
> >>
> >> That's valuable information: what was your DPDK HEAD commit when you
> >> triggered this issue?
> 
> fbfd99551ca3 ("mbuf: add raw allocation function")
> 
> > 
> > I guess I have understood what goes wrong in your case.
> > 
> > I would guess that your vhost has 2 queues configured (here I mean
> > queue pairs, each including one Tx and one Rx queue; the same usage
> > applies below), and so does your QEMU. However, you enabled just 1 queue
> > when starting testpmd inside the guest, and you want to enable 2 queues
> > by running the following testpmd commands:
> > 
> >     stop
> >     port stop all
> >     port config all rxq 2
> >     port config all txq 2
> >     port start all
> > 
> > Unfortunately, that won't work with the current virtio PMD
> > implementation, and what's worse, it triggers a vhost crash, the one
> > you saw.
> > 
> > Here is how it happens. Since you enabled just 1 queue when starting
> > testpmd, it sets up 1 queue only, meaning only one queue's **valid**
> > information is sent to vhost. You might see SET_VRING_ADDR
> > (and related vhost messages) for the other queue as well, but they
> > are just dummy messages: they don't include any valid/real
> > information about the 2nd queue, since the driver never set it up.
> > 
> > So far, so good. It breaks when you run the above commands. Those
> > commands do set up the 2nd queue; however, they fail to trigger
> > the QEMU virtio device to start the vhost-user negotiation, meaning
> > no SET_VRING_ADDR is sent for the 2nd queue, so vhost is never told
> > about it and is left with stale information.
> > 
> > What's worse, the above commands trigger QEMU to send SET_VRING_ENABLE
> > messages to enable all the vrings. And since the vrings for the 2nd
> > queue are not properly configured, the crash happens.
> 
> Hmm, why does the 2nd queue work properly with my fix to vhost in this case?

Hmm, really? Are you sure that data flows in your 2nd queue after those
commands? From what I know, your patch just avoids the crash; it does
not fix the underlying problem.
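
To illustrate the difference between avoiding a crash and fixing the
problem, here is a minimal sketch in C. It is not the DPDK code: every
name in it (demo_vq, demo_set_vring_addr, demo_set_vring_enable,
demo_burst) is hypothetical, and it assumes the patch boils down to a
check that skips a vring whose descriptor address was never set.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified model of a vhost virtqueue (not the DPDK struct). */
struct demo_vq {
	uint64_t *desc;    /* descriptor table; NULL until a valid address arrives */
	size_t    size;    /* number of descriptors in the table */
	bool      enabled; /* set by the modeled SET_VRING_ENABLE */
};

/* Models a SET_VRING_ADDR message that carries real information. */
static void demo_set_vring_addr(struct demo_vq *vq, uint64_t *desc, size_t size)
{
	vq->desc = desc;
	vq->size = size;
}

/* Models SET_VRING_ENABLE: it flips the flag but configures nothing. */
static void demo_set_vring_enable(struct demo_vq *vq, bool on)
{
	vq->enabled = on;
}

/*
 * Counts the non-zero descriptors in the queue. The NULL check is the
 * assumed "fix": it prevents the bad dereference, but a queue that was
 * enabled without a valid address still processes nothing.
 */
static size_t demo_burst(const struct demo_vq *vq)
{
	size_t i, n = 0;

	if (!vq->enabled || vq->desc == NULL)
		return 0;

	for (i = 0; i < vq->size; i++)
		if (vq->desc[i] != 0)
			n++;
	return n;
}
```

In this model, a queue that received both messages returns a non-zero
count, while a queue that only received the enable message returns 0
forever: no crash, but no data either.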

> > So maybe we should do virtio reset on port start?
> 
> I guess it was removed by this patch:
> a85786dc816f ("virtio: fix states handling during initialization").

It seems so.

	--yliu