From: Maxime Coquelin
To: "Michael S. Tsirkin", "Yao, Lei A"
Cc: "dev@dpdk.org", "yliu@fridaylinux.org", "Horton, Remy", "Bie, Tiwei", "jfreiman@redhat.com", "vkaplans@redhat.com", "jasowang@redhat.com"
Subject: Re: [dpdk-dev] [PATCH v3 17/19] vhost-user: iommu: postpone device creation until ring are mapped
Date: Fri, 3 Nov 2017 16:54:05 +0100
Message-ID: <041c6163-f4fc-a6e7-7142-8ea6594800b5@redhat.com>
In-Reply-To: <20171103171230-mutt-send-email-mst@kernel.org>
References: <20171005083627.27828-1-maxime.coquelin@redhat.com> <20171005083627.27828-18-maxime.coquelin@redhat.com> <2DBBFF226F7CF64BAFCA79B681719D953A2CD10A@shsmsx102.ccr.corp.intel.com> <8d27d7d9-9567-1a57-5a5f-760f2f117b73@redhat.com> <4054d863-8909-23a7-aba2-b8675cdbb4ba@redhat.com> <20171103171230-mutt-send-email-mst@kernel.org>

On 11/03/2017 04:15 PM, Michael S. Tsirkin wrote:
> On Fri, Nov 03, 2017 at 09:25:58AM +0100, Maxime Coquelin wrote:
>>
>> On 11/02/2017 05:02 PM, Maxime Coquelin wrote:
>>>
>>> On 11/02/2017 09:21 AM, Maxime Coquelin wrote:
>>>> Hi Lei,
>>>>
>>>> On 11/02/2017 08:21 AM, Yao, Lei A wrote:
>>>>>
>>>> ...
>>>>> Hi, Maxime
>>>>> I met one issue with your patch set during the v17.11 test.
>>>>
>>>> Is it with v17.11-rc2 or -rc1?
>>>>
>>>>> The test scenario is as follows:
>>>>> 1. Bind one NIC, use testpmd to set up vhost-user with 2 queues
>>>>>    usertools/dpdk-devbind.py --bind=igb_uio 0000:05:00.0
>>>>>    ./x86_64-native-linuxapp-gcc/app/testpmd -c 0xe -n 4 \
>>>>>      --socket-mem 1024,1024 \
>>>>>      --vdev 'net_vhost0,iface=vhost-net,queues=2' -- -i --rxq=2 \
>>>>>      --txq=2 --nb-cores=2 --rss-ip
>>>>> 2. Launch qemu with a virtio device which has 2 queues
>>>>> 3. In the VM, launch testpmd with virtio-pmd using only 1 queue.
>>>>>    x86_64-native-linuxapp-gcc/app/testpmd -c 0x07 -n 3 -- -i \
>>>>>      --txqflags=0xf01 \
>>>>>      --rxq=1 --txq=1 --rss-ip --nb-cores=1
>>>>>
>>>>> First,
>>>>> commit 09927b5249694bad1c094d3068124673722e6b8f
>>>>> vhost: translate ring addresses when IOMMU enabled
>>>>> The patch causes no traffic in the PVP test, but link status is
>>>>> still up in vhost-user.
>>>>>
>>>>> Second,
>>>>> eefac9536a901a1f0bb52aa3b6fec8f375f09190
>>>>> vhost: postpone device creation until rings are mapped
>>>>> The patch causes link status "down" in vhost-user.
>>>
>>> I reproduced this one, and understand why link status remains down.
>>> My series fixed a potential issue Michael raised: the vring
>>> addresses should only be interpreted once the ring is enabled.
>>> When VHOST_USER_F_PROTOCOL_FEATURES is negotiated, the ring addresses
>>> are translated when the ring is enabled via VHOST_USER_SET_VRING_ENABLE.
>>> When not negotiated, the ring is considered to start enabled, so
>>> translation is done at VHOST_USER_SET_VRING_KICK time.
>>>
>>> In your case, protocol features are negotiated, so the ring addresses
>>> are translated at enable time. The problem is that the code considers
>>> the device ready only once addresses for all the rings are translated.
>>> But since only the first pair of rings is used, that never happens, and
>>> the link remains down.
>>>
>>> One of the reasons this check is done is to avoid starting the PMD
>>> threads before the addresses are translated in case of NUMA
>>> reallocation, as virtqueues and virtio-net device structs can be
>>> reallocated on a different node.
>>>
>>> I think the right fix would be to only perform NUMA reallocation for
>>> vring 0, as today we would end up reallocating the virtio-net struct
>>> multiple times if VQs are on different NUMA nodes.
>>>
>>> Doing that, we could then consider the device ready if vring 0 is
>>> enabled and its ring addresses are translated, and if other vrings have
>>> been kicked.
>>>
>>> I'll post a patch shortly implementing this idea.
>>
>> The proposed solution doesn't work, because disabled queues get accessed at
>> device start time:
>>
>> int
>> rte_vhost_enable_guest_notification(int vid, uint16_t queue_id, int enable)
>> {
>>     ..
>>     dev->virtqueue[queue_id]->used->flags = VRING_USED_F_NO_NOTIFY;
>>     return 0;
>> }
>>
>> The above function is called in the Vhost PMD for every queue, enabled
>> or not. While we could fix the PMD, it could break other applications
>> using the Vhost lib API directly, so we cannot translate at enable
>> time reliably.
>>
>> I think we may be a bit less conservative, and postpone address
>> translation to kick time, whether or not VHOST_USER_F_PROTOCOL_FEATURES
>> is negotiated.
>>
>> Regards,
>> Maxime
>>
>>> Thanks,
>>> Maxime
>
> I agree, enabling has nothing to do with it.
>
> The spec is quite explicit:
>
> Client must only process each ring when it is started.
>
> and
>
> Client must start ring upon receiving a kick (that is, detecting that file
> descriptor is readable) on the descriptor specified by
> VHOST_USER_SET_VRING_KICK, and stop ring upon receiving
> VHOST_USER_GET_VRING_BASE.
>

Thanks for the confirmation, Michael. Fix posted.

Maxime
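
The rule agreed on in this thread (a ring is processed only once it is started, it starts on kick, and enabling plays no part) can be sketched as a small per-ring state machine. This is only a minimal illustration, not the DPDK vhost code and not the posted fix: `ring_state`, `ring_event`, `ring_handle_event` and `ring_can_process` are hypothetical names, the enum values are arbitrary, and dropping the translated addresses when the ring stops is an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical event tags named after the vhost-user requests
 * discussed above; the numeric values are arbitrary. */
enum ring_event {
    EV_SET_VRING_ENABLE,
    EV_SET_VRING_KICK,
    EV_GET_VRING_BASE,
};

/* Hypothetical per-ring bookkeeping for this sketch. */
struct ring_state {
    bool enabled;    /* last value received with SET_VRING_ENABLE */
    bool started;    /* true between a kick and GET_VRING_BASE */
    bool translated; /* ring addresses have been translated */
};

/* Apply one event to a ring. enable_arg is only read for
 * EV_SET_VRING_ENABLE. */
static void
ring_handle_event(struct ring_state *r, enum ring_event ev, bool enable_arg)
{
    assert(r != NULL);

    switch (ev) {
    case EV_SET_VRING_ENABLE:
        /* Enabling only records the flag; no translation here. */
        r->enabled = enable_arg;
        break;
    case EV_SET_VRING_KICK:
        /* The ring starts on kick, so addresses are translated at
         * this point whether or not protocol features were negotiated. */
        r->translated = true;
        r->started = true;
        break;
    case EV_GET_VRING_BASE:
        /* The ring stops. Dropping the translation as well is an
         * assumption of this sketch. */
        r->started = false;
        r->translated = false;
        break;
    }
}

/* A ring may only be processed once it is started and its addresses
 * are translated; the enabled flag does not enter into it. */
static bool
ring_can_process(const struct ring_state *r)
{
    assert(r != NULL);
    return r->started && r->translated;
}
```

In the reported scenario the second queue pair is never kicked by the guest, so under this model those rings simply stay stopped and are never touched, instead of blocking the whole device from becoming ready.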