From: Yuanhan Liu <yuanhan.liu@linux.intel.com>
To: Kevin Traynor
Cc: dev@dpdk.org, Thomas Monjalon, Tan Jianfeng, Ilya Maximets, Kyle Larose
Date: Sat, 5 Nov 2016 14:15:33 +0800
Subject: Re: [dpdk-dev] [PATCH 4/8] net/virtio: allocate queue at init stage
Message-ID: <20161105061533.GC12283@yliu-dev.sh.intel.com>
In-Reply-To: <4e317dd8-19d5-0914-0572-fe5d0651400a@redhat.com>

On Fri, Nov 04, 2016 at 08:30:00PM +0000, Kevin Traynor wrote:
> On 11/04/2016 03:21 PM, Kevin Traynor wrote:
> > On 11/03/2016 04:09 PM, Yuanhan Liu wrote:
> >> Queue allocation should be done once, since the queue related info
> >> (such as the vring address) will only be passed to the vhost-user
> >> backend once, unless the virtio device is reset.
> >>
> >> That means, if you allocate queues again after the vhost-user
> >> negotiation, the vhost-user backend will not be informed any more.
> >> This leads to a state where the vring info mismatches between the
> >> virtio PMD and the vhost backend: the driver switches to the newly
> >> allocated address, while the vhost backend still sticks to the old
> >> address assigned at the init stage.
> >>
> >> Unfortunately, that is exactly how the virtio driver is coded so far:
> >> queue allocation is done at the queue_setup stage (when
> >> rte_eth_tx/rx_queue_setup is invoked). This is wrong, because
> >> queue_setup can be invoked several times. For example,
> >>
> >> $ start_testpmd.sh ... --txq=1 --rxq=1 ...
> >> > port stop 0
> >> > port config all txq 1    # just trigger the queue_setup callback again
> >> > port config all rxq 1
> >> > port start 0
> >>
> >> The right way is to allocate the queues at the init stage, so that the
> >> vring info stays persistent with the vhost-user backend.
> >>
> >> Besides that, we should allocate the max number of queue pairs the
> >> device supports, not the number of queue pairs configured first, to
> >> make the following case work:
> >>
> >> $ start_testpmd.sh ... --txq=1 --rxq=1 ...
> >> > port stop 0
> >> > port config all txq 2
> >> > port config all rxq 2
> >> > port start 0
> >
> > Hi Yuanhan, firstly - thanks for this patchset. It is certainly needed
> > to fix the silent failure after increasing the number of queues.
> >
> > I tried a few tests and I'm seeing an issue. I can stop the port,
> > increase the number of queues, and traffic is ok, but if I try to
> > decrease the number of queues it hangs on port start. I'm running head
> > of master with your patches in the guest and 16.07 in the host.
> >
> > $ testpmd -c 0x5f -n 4 --socket-mem 1024 -- --burst=64 -i
> >   --disable-hw-vlan --rxq=2 --txq=2 --rxd=256 --txd=256 --forward-mode=io
> >> port stop all
> >> port config all rxq 1
> >> port config all txq 1
> >> port start all
> > Configuring Port 0 (socket 0)
> > (hang here)
> >
> > I've tested a few different scenarios and anytime the queues are
> > decreased from the previous number, the hang occurs.
> >
> > I can debug further but wanted to report early as maybe the issue is an
> > obvious one?

Kevin, thanks for testing!

Hmm, it's a case I missed: I was thinking/testing more about increasing
(but not shrinking) the queue count :(

> virtio_dev_start() is getting stuck as soon as it needs to send a

That's because the connection is closed (for a bad reason; see details
below). You could figure it out quickly from the vhost log:

    testpmd> VHOST_CONFIG: read message VHOST_USER_GET_VRING_BASE
    PMD: Connection closed

Those messages show up immediately after you execute "port start all".

It's actually triggered by "rx_queue_release", which in turn invokes the
"del_queue" virtio-pci method. QEMU then resets the device once it
receives that message. Hence the log above.

Since we now allocate the queues once, it no longer makes sense to free
them and invoke the "del_queue" method in the rx/tx_queue_release
callbacks, which are exactly what get invoked when we shrink the queue
count. With that removed, this case works like a charm.

I'm about to catch a train soon, but I will try to re-post v2 today,
with another minor fix I noticed while checking this issue: we should
also send the VIRTIO_NET_CTRL_MQ message when the queue count shrinks
from 2 to 1.

	--yliu
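The allocate-once scheme discussed in this thread can be sketched with a
tiny C model (this is not the actual DPDK code; the toy_* names are made
up for illustration): all queues the device supports are allocated at
init, queue_setup only hands back the pre-allocated queue however many
times it is called, and queue_release is a no-op, so the vring address
the vhost backend learned at negotiation time never changes and no
"del_queue"/device reset is triggered.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define TOY_MAX_QPAIRS 8  /* pretend the device advertises 8 queue pairs */

struct toy_vq {
    uintptr_t vring_addr;  /* address the vhost backend was told about */
};

struct toy_dev {
    struct toy_vq *vqs[TOY_MAX_QPAIRS];
};

/* init stage: allocate every queue the device supports, exactly once */
static int toy_dev_init(struct toy_dev *dev)
{
    for (int i = 0; i < TOY_MAX_QPAIRS; i++) {
        dev->vqs[i] = malloc(sizeof(struct toy_vq));
        if (dev->vqs[i] == NULL)
            return -1;
        /* stand-in for the real vring address given to the backend */
        dev->vqs[i]->vring_addr = (uintptr_t)dev->vqs[i];
    }
    return 0;
}

/* queue_setup: may run many times; only returns the existing queue */
static struct toy_vq *toy_queue_setup(struct toy_dev *dev, int idx)
{
    return dev->vqs[idx];
}

/* queue_release: a no-op, so no "del_queue" and no device reset */
static void toy_queue_release(struct toy_vq *vq)
{
    (void)vq;
}
```

In this model a "port stop / port config / port start" cycle maps to
release followed by another setup, and the address stays stable across
the cycle, which is the invariant the patch is after.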
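For reference, the VIRTIO_NET_CTRL_MQ message mentioned at the end
carries a single little-endian 16-bit queue-pair count; per the virtio
spec the class is 4 (VIRTIO_NET_CTRL_MQ) and the command is 0
(VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET). Below is a hedged sketch of the byte
layout only: build_ctrl_mq_cmd is a made-up helper, and the real PMD
sends class, command, payload and an ack byte as separate descriptors
on the control virtqueue rather than one flat buffer.

```c
#include <stddef.h>
#include <stdint.h>

/* values from the virtio spec */
#define VIRTIO_NET_CTRL_MQ               4
#define VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET  0

/* hypothetical helper: serialize the MQ command as class byte,
 * command byte, then the queue-pair count as a little-endian u16 */
static size_t build_ctrl_mq_cmd(uint8_t *buf, uint16_t nb_qpairs)
{
    buf[0] = VIRTIO_NET_CTRL_MQ;
    buf[1] = VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET;
    buf[2] = (uint8_t)(nb_qpairs & 0xff);  /* low byte first */
    buf[3] = (uint8_t)(nb_qpairs >> 8);
    return 4;
}
```

The point of the v2 fix is simply that this message must also be sent
when nb_qpairs goes down (e.g. 2 -> 1), not only when it goes up,
otherwise the device keeps steering packets to queues the driver no
longer services.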