From mboxrd@z Thu Jan  1 00:00:00 1970
From: Maxime Coquelin
To: "Michael S. Tsirkin"
Cc: Yuanhan Liu, Stephen Hemminger, dev@dpdk.org, qemu-devel@nongnu.org
Message-ID: <2b458818-01ef-0533-4366-1c35a8452e4a@redhat.com>
In-Reply-To: <05d62750-303c-4b9b-a5cb-9db8552f0ab2@redhat.com>
Date: Fri, 30 Sep 2016 14:05:10 +0200
Subject: Re: [dpdk-dev]
[Qemu-devel] [PATCH 1/2] vhost: enable any layout feature

On 09/29/2016 11:23 PM, Maxime Coquelin wrote:
>
>
> On 09/29/2016 10:21 PM, Michael S. Tsirkin wrote:
>> On Thu, Sep 29, 2016 at 10:05:22PM +0200, Maxime Coquelin wrote:
>>>
>>>
>>> On 09/29/2016 07:57 PM, Michael S. Tsirkin wrote:
>>>> On Thu, Sep 29, 2016 at 05:30:53PM +0200, Maxime Coquelin wrote:
>>> ...
>>>>>
>>>>> Before enabling anything by default, we should first optimize the
>>>>> 1 slot case. Indeed, a micro-benchmark using testpmd in txonly[0]
>>>>> shows a ~17% perf regression for the 64 bytes case:
>>>>>  - 2 descs per packet: 11.6Mpps
>>>>>  - 1 desc per packet:   9.6Mpps
>>>>>
>>>>> This is due to the virtio header clearing in
>>>>> virtqueue_enqueue_xmit(). Removing it, we get better results than
>>>>> with 2 descs (12.0Mpps). Since the Virtio PMD doesn't support
>>>>> offloads, I wonder whether we can just drop the memset?
>>>>
>>>> What will happen? Will the header be uninitialized?
>>> Yes.
>>> I didn't look closely at the spec, but just looked at DPDK's and
>>> Linux vhost implementations. IIUC, the header is just skipped in
>>> both implementations.
>>
>> In the Linux guest, skbs are initialized AFAIK. See
>> virtio_net_hdr_from_skb: the first thing it does is
>>         memset(hdr, 0, sizeof(*hdr));
>
> I meant that in the vhost-net Linux implementation, the header is
> just skipped:
>
> static void handle_tx(struct vhost_net *net)
> {
> ...
>         /* Skip header. TODO: support TSO.
>          */
>         len = iov_length(vq->iov, out);
>         iov_iter_init(&msg.msg_iter, WRITE, vq->iov, out, len);
>         iov_iter_advance(&msg.msg_iter, hdr_size);
>
> And the same is done in DPDK:
>
> static inline int __attribute__((always_inline))
> copy_desc_to_mbuf(struct virtio_net *dev, struct vring_desc *descs,
>                   uint16_t max_desc, struct rte_mbuf *m,
>                   uint16_t desc_idx, struct rte_mempool *mbuf_pool)
> {
> ...
>         /*
>          * A virtio driver normally uses at least 2 desc buffers
>          * for Tx: the first for storing the header, and others
>          * for storing the data.
>          */
>         if (likely((desc->len == dev->vhost_hlen) &&
>                    (desc->flags & VRING_DESC_F_NEXT) != 0)) {
>                 desc = &descs[desc->next];
>                 if (unlikely(desc->flags & VRING_DESC_F_INDIRECT))
>                         return -1;
>
>                 desc_addr = gpa_to_vva(dev, desc->addr);
>                 if (unlikely(!desc_addr))
>                         return -1;
>
>                 rte_prefetch0((void *)(uintptr_t)desc_addr);
>
>                 desc_offset = 0;
>                 desc_avail  = desc->len;
>                 nr_desc    += 1;
>
>                 PRINT_PACKET(dev, (uintptr_t)desc_addr, desc->len, 0);
>         } else {
>                 desc_avail  = desc->len - dev->vhost_hlen;
>                 desc_offset = dev->vhost_hlen;
>         }

Actually, the header is parsed in the DPDK vhost implementation. But
as the Virtio PMD provides a zero'ed header, we could just parse the
header only if VIRTIO_NET_F_NO_TX_HEADER is not negotiated.

>>
>>
>>
>>>>
>>>> The spec says:
>>>>     The driver can send a completely checksummed packet. In this
>>>>     case, flags will be zero, and gso_type will be
>>>>     VIRTIO_NET_HDR_GSO_NONE.
>>>>
>>>> and
>>>>     The driver MUST set num_buffers to zero.
>>>>     If VIRTIO_NET_F_CSUM is not negotiated, the driver MUST set
>>>>     flags to zero and SHOULD supply a fully checksummed packet to
>>>>     the device.
>>>>
>>>> and
>>>>     If none of the VIRTIO_NET_F_HOST_TSO4, TSO6 or UFO options
>>>>     have been negotiated, the driver MUST set gso_type to
>>>>     VIRTIO_NET_HDR_GSO_NONE.
>>>>
>>>> so doing this unconditionally would be a spec violation, but if
>>>> you see value in this, we can add a feature bit.
>>> Right, it would be a spec violation, so it should be done
>>> conditionally. If a feature bit is to be added, what about
>>> VIRTIO_NET_F_NO_TX_HEADER? It would imply VIRTIO_NET_F_CSUM not
>>> set, and no GSO features set. If negotiated, we wouldn't need to
>>> prepend a header.
>>
>> Yes but two points.
>>
>> 1. why is this memset expensive? Is the test completely skipping
>>    looking at the packet otherwise?
> Yes.
>>
>> 2. As long as we are doing this, see
>>     Alignment vs. Networking
>>     ========================
>>     in Documentation/unaligned-memory-access.txt
> Thanks, I'll have a look tomorrow.

I did a rough prototype which removes Tx headers unconditionally, to
see what gain we could expect. I expect the results to be a little
lower with no headers in a full implementation, as some more checks
will have to be done.

For the PVP use-case with 0.05% acceptable packet loss:
 - Original (with headers):  9.43Mpps
 - Indirect descs:           9.36Mpps
 - Prototype (no headers):  10.65Mpps

For the PVP use-case with 0% acceptable packet loss:
 - Original (with headers):  5.23Mpps
 - Indirect descs:           7.13Mpps
 - Prototype (no headers):   7.92Mpps

Maxime