From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28])
	by dpdk.org (Postfix) with ESMTP id F087D5680
	for ; Thu, 29 Sep 2016 19:57:49 +0200 (CEST)
Received: from int-mx14.intmail.prod.int.phx2.redhat.com (int-mx14.intmail.prod.int.phx2.redhat.com [10.5.11.27])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by mx1.redhat.com (Postfix) with ESMTPS id 43AFEA0B5B;
	Thu, 29 Sep 2016 17:57:49 +0000 (UTC)
Received: from redhat.com (vpn-57-22.rdu2.redhat.com [10.10.57.22])
	by int-mx14.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with SMTP id u8THvlZH032229;
	Thu, 29 Sep 2016 13:57:48 -0400
Date: Thu, 29 Sep 2016 20:57:47 +0300
From: "Michael S. Tsirkin"
To: Maxime Coquelin
Cc: Yuanhan Liu , Stephen Hemminger , dev@dpdk.org, qemu-devel@nongnu.org
Message-ID: <20160929205047-mutt-send-email-mst@kernel.org>
References: <1474872056-24665-1-git-send-email-yuanhan.liu@linux.intel.com>
	<1474872056-24665-2-git-send-email-yuanhan.liu@linux.intel.com>
	<20160926221112-mutt-send-email-mst@kernel.org>
	<20160927031158.GA25823@yliu-dev.sh.intel.com>
	<20160927224935-mutt-send-email-mst@kernel.org>
	<20160928022848.GE1597@yliu-dev.sh.intel.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
X-Scanned-By: MIMEDefang 2.68 on 10.5.11.27
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.25]); Thu, 29 Sep 2016 17:57:49 +0000 (UTC)
Subject: Re: [dpdk-dev] [Qemu-devel] [PATCH 1/2] vhost: enable any layout feature
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: patches and discussions about DPDK
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 29 Sep 2016 17:57:50 -0000

On Thu, Sep 29, 2016 at 05:30:53PM +0200, Maxime Coquelin wrote:
>
>
> On 09/28/2016 04:28 AM, Yuanhan Liu wrote:
> > On Tue, Sep 27, 2016 at 10:56:40PM +0300, Michael S. Tsirkin wrote:
> > > On Tue, Sep 27, 2016 at 11:11:58AM +0800, Yuanhan Liu wrote:
> > > > On Mon, Sep 26, 2016 at 10:24:55PM +0300, Michael S. Tsirkin wrote:
> > > > > On Mon, Sep 26, 2016 at 11:01:58AM -0700, Stephen Hemminger wrote:
> > > > > > I assume that if using Version 1 that the bit will be ignored
> > > >
> > > > Yes, but I will just quote what you just said: what if the guest
> > > > virtio device is a legacy device? I also gave my reasons in another
> > > > email why I consistently set this flag:
> > > >
> > > > - we have to return all features we support to the guest.
> > > >
> > > > We don't know the guest is a modern or legacy device. That means
> > > > we should claim we support both: VERSION_1 and ANY_LAYOUT.
> > > >
> > > > Assume guest is a legacy device and we just set VERSION_1 (the current
> > > > case), ANY_LAYOUT will never be negotiated.
> > > >
> > > > - I'm following the way Linux kernel takes: it also set both features.
> > > >
> > > > Maybe, we could unset ANY_LAYOUT when VERSION_1 is _negotiated_?
> > > >
> > > > The unset after negotiation I proposed turned out it won't work: the
> > > > feature is already negotiated; unsetting it only in vhost side doesn't
> > > > change anything. Besides, it may break the migration as Michael stated
> > > > below.
> > >
> > > I think the reverse. Teach vhost user that for future machine types
> > > only VERSION_1 implies ANY_LAYOUT.
> > >
> > >
> > > > > Therein lies a problem. If dpdk tweaks flags, updating it
> > > > > will break guest migration.
> > > > >
> > > > > One way is to require that users specify all flags fully when
> > > > > creating the virtio net device.
> > > >
> > > > Like how? By a new command line option? And user has to type
> > > > all those features?
> > >
> > > Make libvirt do this. users use management normally. those that don't
> > > likely don't migrate VMs.
> >
> > Fair enough.
> >
> > >
> > > > > QEMU could verify that all required
> > > > > flags are set, and fail init if not.
> > > > >
> > > > > This has other advantages, e.g. it adds ability to
> > > > > init device without waiting for dpdk to connect.
> >
> > Will the feature negotiation between DPDK and QEMU still exist
> > in your proposal?
> >
> > > > >
> > > > > However, enabling each new feature would now require
> > > > > management work. How about dpdk ships the list
> > > > > of supported features instead?
> > > > > Management tools could read them on source and destination
> > > > > and select features supported on both sides.
> > > >
> > > > That means the management tool would somehow has a dependency on
> > > > DPDK project, which I have no objection at all. But, is that
> > > > a good idea?
> > >
> > > It already starts the bridge somehow, does it not?
> >
> > Indeed. I was firstly thinking about reading the dpdk source file
> > to determine the DPDK supported feature list, with which the bind
> > is too tight. I later realized you may ask DPDK to provide a binary
> > to dump the list, or something like that.
> >
> > >
> > > > BTW, I'm not quite sure I followed your idea. I mean, how it supposed
> > > > to fix the ANY_LAYOUT issue here? How this flag will be set for
> > > > legacy device?
> > > >
> > > > --yliu
> > >
> > > For ANY_LAYOUT, I think we should just set in in qemu,
> > > but only for new machine types.
> >
> > What do you mean by "new machine types"? Virtio device with newer
> > virtio-spec version?
> >
> > > This addresses migration
> > > concerns.
> >
> > To make sure I followed you, do you mean the migration issue from
> > an older "dpdk + qemu" combo to a newer "dpdk + qemu" combo (that
> > more new features might be shipped)?
> >
> > Besides that, your proposal looks like a big work to accomplish.
> > Are you okay to make it simple first: set it consistently like
> > what Linux kernel does? This would at least make the ANY_LAYOUT
> > actually be enabled for legacy device (which is also the default
> > one that's widely used so far).
>
> Before enabling anything by default, we should first optimize the 1 slot
> case. Indeed, micro-benchmark using testpmd in txonly[0] shows ~17%
> perf regression for 64 bytes case:
> - 2 descs per packet: 11.6Mpps
> - 1 desc per packet: 9.6Mpps
>
> This is due to the virtio header clearing in virtqueue_enqueue_xmit().
> Removing it, we get better results than with 2 descs (1.20Mpps).
> Since the Virtio PMD doesn't support offloads, I wonder whether we can
> just drop the memset?

What will happen? Will the header be uninitialized?

The spec says:
    The driver can send a completely checksummed packet. In this case, flags
    will be zero, and gso_type will be VIRTIO_NET_HDR_GSO_NONE.

and
    The driver MUST set num_buffers to zero.
    If VIRTIO_NET_F_CSUM is not negotiated, the driver MUST set flags to
    zero and SHOULD supply a fully checksummed packet to the device.

and
    If none of the VIRTIO_NET_F_HOST_TSO4, TSO6 or UFO options have been
    negotiated, the driver MUST set gso_type to VIRTIO_NET_HDR_GSO_NONE.

so doing this unconditionally would be a spec violation, but if you see
value in this, we can add a feature bit.

> -- Maxime
> [0]: For testing, you'll need these patches, else only first packets
> will use a single slot:
> - http://dpdk.org/dev/patchwork/patch/16222/
> - http://dpdk.org/dev/patchwork/patch/16223/