Date: Mon, 26 Sep 2016 21:04:15 +0800
From: Yuanhan Liu
To: "Michael S. Tsirkin"
Cc: Stephen Hemminger, Maxime Coquelin, huawei.xie@intel.com, dev@dpdk.org, vkaplans@redhat.com
Message-ID: <20160926130415.GF20278@yliu-dev.sh.intel.com>
In-Reply-To: <20160926152442-mutt-send-email-mst@kernel.org>
Subject: Re: [dpdk-dev] [PATCH v3] vhost: Add indirect descriptors support to the TX path

On Mon, Sep 26, 2016 at 03:25:35PM +0300, Michael S. Tsirkin wrote:
> On Mon, Sep 26, 2016 at 11:03:54AM +0800, Yuanhan Liu wrote:
> > On Fri, Sep 23, 2016 at 01:24:14PM -0700, Stephen Hemminger wrote:
> > > On Fri, 23 Sep 2016 21:22:23 +0300
> > > "Michael S. Tsirkin" wrote:
> > >
> > > > On Fri, Sep 23, 2016 at 08:16:09PM +0200, Maxime Coquelin wrote:
> > > > > On 09/23/2016 08:06 PM, Michael S. Tsirkin wrote:
> > > > > > On Fri, Sep 23, 2016 at 08:02:27PM +0200, Maxime Coquelin wrote:
> > > > > > > On 09/23/2016 05:49 PM, Michael S. Tsirkin wrote:
> > > > > > > > On Fri, Sep 23, 2016 at 10:28:23AM +0200, Maxime Coquelin wrote:
> > > > > > > > > Indirect descriptors are usually supported by virtio-net devices,
> > > > > > > > > allowing to dispatch a larger number of requests.
> > > > > > > > >
> > > > > > > > > When the virtio device sends a packet using indirect descriptors,
> > > > > > > > > only one slot is used in the ring, even for large packets.
> > > > > > > > >
> > > > > > > > > The main effect is to improve the 0% packet loss benchmark.
> > > > > > > > > A PVP benchmark using Moongen (64 bytes) on the TE, and testpmd
> > > > > > > > > (fwd io for host, macswap for VM) on DUT shows a +50% gain for
> > > > > > > > > zero loss.
> > > > > > > > >
> > > > > > > > > On the downside, a micro-benchmark using testpmd txonly in the VM
> > > > > > > > > and rxonly on the host shows a loss between 1 and 4%. But depending
> > > > > > > > > on the needs, the feature can be disabled at VM boot time by passing
> > > > > > > > > the indirect_desc=off argument to the vhost-user device in QEMU.
> > > > > > > >
> > > > > > > > Even better, change the guest pmd to only use indirect
> > > > > > > > descriptors when this makes sense (e.g. sufficiently
> > > > > > > > large packets).
> > > > > > >
> > > > > > > With the micro-benchmark, the degradation is quite constant whatever
> > > > > > > the packet size.
> > > > > > >
> > > > > > > For PVP, I could not test with larger packets than 64 bytes, as I
> > > > > > > don't have a 40G interface,
> > > > > >
> > > > > > Don't 64 byte packets fit in a single slot anyway?
> > > > >
> > > > > No, indirect is used. I didn't check in detail, but I think this is
> > > > > because there is no headroom reserved in the mbuf.
> > > > >
> > > > > This is the condition to meet to fit in a single slot:
> > > > >
> > > > > 	/* optimize ring usage */
> > > > > 	if (vtpci_with_feature(hw, VIRTIO_F_ANY_LAYOUT) &&
> > > > > 	    rte_mbuf_refcnt_read(txm) == 1 &&
> > > > > 	    RTE_MBUF_DIRECT(txm) &&
> > > > > 	    txm->nb_segs == 1 &&
> > > > > 	    rte_pktmbuf_headroom(txm) >= hdr_size &&
> > > > > 	    rte_is_aligned(rte_pktmbuf_mtod(txm, char *),
> > > > > 			   __alignof__(struct virtio_net_hdr_mrg_rxbuf)))
> > > > > 		can_push = 1;
> > > > > 	else if (vtpci_with_feature(hw, VIRTIO_RING_F_INDIRECT_DESC) &&
> > > > > 		 txm->nb_segs < VIRTIO_MAX_TX_INDIRECT)
> > > > > 		use_indirect = 1;
> > > > >
> > > > > I will check more in detail next week.
> > > >
> > > > Two thoughts then
> > > > 1. so can some headroom be reserved?
> > > > 2. how about using indirect with 3 s/g entries,
> > > >    but direct with 2 and down?
> > >
> > > The default mbuf allocator does keep headroom available. Sounds like a
> > > test bug.
> >
> > That's because we don't have VIRTIO_F_ANY_LAYOUT set, as Stephen noted
> > in his comment on v2.
> >
> > Since DPDK vhost has actually supported VIRTIO_F_ANY_LAYOUT for a while,
> > I'd like to add it to the features list (VHOST_SUPPORTED_FEATURES).
> >
> > Will drop a patch shortly.
> >
> > 	--yliu
>
> If VERSION_1 is set then this implies ANY_LAYOUT without it being set.

Yes, I saw this note from you in another email. I kept it as it is, for
two reasons (maybe I should have stated them earlier):

- We have to return all features we support to the guest, and we don't
  know whether the guest is a modern or a legacy device. That means we
  should claim we support both VERSION_1 and ANY_LAYOUT. Assume the
  guest is a legacy device and we set only VERSION_1 (the current case):
  ANY_LAYOUT would then never be negotiated.

- I'm following what the Linux kernel does: it also sets both features.

Maybe we could unset ANY_LAYOUT when VERSION_1 is _negotiated_?

	--yliu