From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28]) by dpdk.org (Postfix) with ESMTP id DD9C42946 for ; Thu, 29 Sep 2016 17:30:56 +0200 (CEST) Received: from int-mx10.intmail.prod.int.phx2.redhat.com (int-mx10.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 4877D68803; Thu, 29 Sep 2016 15:30:56 +0000 (UTC) Received: from [10.36.5.134] (vpn1-5-134.ams2.redhat.com [10.36.5.134]) by int-mx10.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id u8TFUrWR026822 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Thu, 29 Sep 2016 11:30:54 -0400 To: Yuanhan Liu , "Michael S. Tsirkin" References: <1474872056-24665-1-git-send-email-yuanhan.liu@linux.intel.com> <1474872056-24665-2-git-send-email-yuanhan.liu@linux.intel.com> <20160926221112-mutt-send-email-mst@kernel.org> <20160927031158.GA25823@yliu-dev.sh.intel.com> <20160927224935-mutt-send-email-mst@kernel.org> <20160928022848.GE1597@yliu-dev.sh.intel.com> Cc: Stephen Hemminger , dev@dpdk.org, qemu-devel@nongnu.org From: Maxime Coquelin Message-ID: Date: Thu, 29 Sep 2016 17:30:53 +0200 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.2.0 MIME-Version: 1.0 In-Reply-To: <20160928022848.GE1597@yliu-dev.sh.intel.com> Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit X-Scanned-By: MIMEDefang 2.68 on 10.5.11.23 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.38]); Thu, 29 Sep 2016 15:30:56 +0000 (UTC) Subject: Re: [dpdk-dev] [Qemu-devel] [PATCH 1/2] vhost: enable any layout feature X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: patches and discussions about DPDK List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 29 Sep 2016 15:30:57 -0000 On 09/28/2016 04:28 AM, Yuanhan Liu wrote: > On Tue, Sep 27, 2016 at 10:56:40PM +0300, Michael S. Tsirkin wrote: >> On Tue, Sep 27, 2016 at 11:11:58AM +0800, Yuanhan Liu wrote: >>> On Mon, Sep 26, 2016 at 10:24:55PM +0300, Michael S. Tsirkin wrote: >>>> On Mon, Sep 26, 2016 at 11:01:58AM -0700, Stephen Hemminger wrote: >>>>> I assume that if using Version 1 that the bit will be ignored >>> >>> Yes, but I will just quote what you just said: what if the guest >>> virtio device is a legacy device? I also gave my reasons in another >>> email why I consistently set this flag: >>> >>> - we have to return all features we support to the guest. >>> >>> We don't know the guest is a modern or legacy device. That means >>> we should claim we support both: VERSION_1 and ANY_LAYOUT. >>> >>> Assume guest is a legacy device and we just set VERSION_1 (the current >>> case), ANY_LAYOUT will never be negotiated. >>> >>> - I'm following the way Linux kernel takes: it also set both features. >>> >>> Maybe, we could unset ANY_LAYOUT when VERSION_1 is _negotiated_? >>> >>> The unset after negotiation I proposed turned out it won't work: the >>> feature is already negotiated; unsetting it only in vhost side doesn't >>> change anything. Besides, it may break the migration as Michael stated >>> below. >> >> I think the reverse. Teach vhost user that for future machine types >> only VERSION_1 implies ANY_LAYOUT. >> >> >>>> Therein lies a problem. If dpdk tweaks flags, updating it >>>> will break guest migration. >>>> >>>> One way is to require that users specify all flags fully when >>>> creating the virtio net device. >>> >>> Like how? By a new command line option? And user has to type >>> all those features? >> >> Make libvirt do this. users use management normally. those that don't >> likely don't migrate VMs. > > Fair enough. > >> >>>> QEMU could verify that all required >>>> flags are set, and fail init if not. >>>> >>>> This has other advantages, e.g. it adds ability to >>>> init device without waiting for dpdk to connect. > > Will the feature negotiation between DPDK and QEMU still exist > in your proposal? > >>>> >>>> However, enabling each new feature would now require >>>> management work. How about dpdk ships the list >>>> of supported features instead? >>>> Management tools could read them on source and destination >>>> and select features supported on both sides. >>> >>> That means the management tool would somehow has a dependency on >>> DPDK project, which I have no objection at all. But, is that >>> a good idea? >> >> It already starts the bridge somehow, does it not? > > Indeed. I was firstly thinking about reading the dpdk source file > to determine the DPDK supported feature list, with which the bind > is too tight. I later realized you may ask DPDK to provide a binary > to dump the list, or something like that. > >> >>> BTW, I'm not quite sure I followed your idea. I mean, how it supposed >>> to fix the ANY_LAYOUT issue here? How this flag will be set for >>> legacy device? >>> >>> --yliu >> >> For ANY_LAYOUT, I think we should just set in in qemu, >> but only for new machine types. > > What do you mean by "new machine types"? Virtio device with newer > virtio-spec version? > >> This addresses migration >> concerns. > > To make sure I followed you, do you mean the migration issue from > an older "dpdk + qemu" combo to a newer "dpdk + qemu" combo (that > more new features might be shipped)? > > Besides that, your proposal looks like a big work to accomplish. > Are you okay to make it simple first: set it consistently like > what Linux kernel does? This would at least make the ANY_LAYOUT > actually be enabled for legacy device (which is also the default > one that's widely used so far). Before enabling anything by default, we should first optimize the 1 slot case. Indeed, micro-benchmark using testpmd in txonly[0] shows ~17% perf regression for 64 bytes case: - 2 descs per packet: 11.6Mpps - 1 desc per packet: 9.6Mpps This is due to the virtio header clearing in virtqueue_enqueue_xmit(). Removing it, we get better results than with 2 descs (1.20Mpps). Since the Virtio PMD doesn't support offloads, I wonder whether we can just drop the memset? -- Maxime [0]: For testing, you'll need these patches, else only first packets will use a single slot: - http://dpdk.org/dev/patchwork/patch/16222/ - http://dpdk.org/dev/patchwork/patch/16223/