From: Maxime Coquelin
Date: Tue, 22 Nov 2016 14:17:23 +0100
To: "Pierre Pfister (ppfister)"
Cc: Yuanhan Liu, "dev@dpdk.org"
Subject: Re: [dpdk-dev] [PATCH] virtio: tx with can_push when VERSION_1 is set

Hi Pierre,

On 11/22/2016 10:54 AM, Pierre Pfister (ppfister) wrote:
> Hello Maxime,
>
>> On 9 Nov 2016, at 15:51, Maxime Coquelin wrote:
>>
>> Hi Pierre,
>>
>> On 11/09/2016 01:42 PM, Pierre Pfister (ppfister) wrote:
>>> Hello Maxime,
>>>
>>> Sorry for the late reply.
>>>
>>>> On 8 Nov 2016, at 10:44, Maxime Coquelin wrote:
>>>>
>>>> Hi Pierre,
>>>>
>>>> On 11/08/2016 10:31 AM, Pierre Pfister (ppfister) wrote:
>>>>> The current virtio driver advertises VERSION_1 support, but does
>>>>> not handle the device's VERSION_1 support when sending packets
>>>>> (it looks for the ANY_LAYOUT feature, which is absent).
>>>>>
>>>>> This patch enables 'can_push' in the tx path when VERSION_1 is
>>>>> advertised by the device.
>>>>>
>>>>> This significantly improves the small-packet forwarding rate
>>>>> towards devices advertising the VERSION_1 feature.
>>>> I think it depends on whether offloading is enabled or not.
>>>> If no offloading is enabled, I measured a significant drop.
>>>> Indeed, when no offloading is enabled, the Tx path in Virtio does
>>>> not access the virtio header before your patch, as the header is
>>>> memset to zero at device init time.
>>>> With your patch, it gets memset to zero on every transmit, in the
>>>> hot path.
>>>
>>> Right. On the virtio side that is true, but on the device side, we
>>> have to access the header anyway.
>> Not anymore, if no offload features have been negotiated.
>> I made a patch, which landed in v16.11, to skip header parsing in
>> this case.
>> That said, we still have to access its descriptor.
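To make the feature check at the heart of this thread concrete, here
is a minimal, self-contained sketch. with_feature() and can_push()
below only mimic what DPDK's vtpci_with_feature() and the Tx-path
condition do; the bit numbers are the standard virtio 1.0 ones, but
the code is an illustration, not the actual driver source:

#include <stdint.h>
#include <stdio.h>

/* Feature bit numbers from the virtio 1.0 spec. */
#define VIRTIO_F_ANY_LAYOUT  27
#define VIRTIO_F_VERSION_1   32

/* Stands in for DPDK's vtpci_with_feature(): tests a negotiated bit. */
static int with_feature(uint64_t features, unsigned int bit)
{
	return (features & (1ULL << bit)) != 0;
}

/* The gist of the patch: treat VERSION_1 as implying ANY_LAYOUT, so
 * the Tx path may prepend the virtio-net header into the mbuf
 * headroom and use a single descriptor ("can_push"). */
static int can_push(uint64_t features)
{
	return with_feature(features, VIRTIO_F_ANY_LAYOUT) ||
	       with_feature(features, VIRTIO_F_VERSION_1);
}

int main(void)
{
	/* A virtio 1.0 device sets VERSION_1 and drops the now-implied
	 * ANY_LAYOUT bit, which is the case this thread is about. */
	uint64_t features = 1ULL << VIRTIO_F_VERSION_1;

	printf("can_push before the patch: %d\n",
	       with_feature(features, VIRTIO_F_ANY_LAYOUT)); /* prints 0 */
	printf("can_push with the patch:   %d\n",
	       can_push(features));                          /* prints 1 */
	return 0;
}

With only VERSION_1 negotiated, the pre-patch check fails and the
driver falls back to using a separate descriptor for the header.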
>>
>>> And accessing two descriptors (with the address resolution and
>>> memory fetch that come with it) is a costly operation compared to
>>> a single one.
>>> In the case where indirect descriptors are used, this is 1
>>> descriptor access instead of 3.
>> I agree this is far from optimal.
>>
>>> And in the case where chained descriptors are used, this doubles
>>> the number of packets that you can put in your queue.
>>>
>>> These are the results in my PHY -> VM (testpmd) -> PHY setup.
>>> Traffic is flowing bidirectionally. The numbers are lossless rates.
>>>
>>> When chained buffers are used for dpdk's TX: 2x2.13Mpps
>>> When indirect descriptors are used for dpdk's TX: 2x2.38Mpps
>>> When shallow buffers are used for dpdk's TX (with this patch): 2x2.42Mpps
>> When I tried it, I also ran a 0%-loss PVP benchmark, and I got the
>> opposite results: the chained and indirect cases were significantly
>> better.
>>
>> My PVP setup used a single NIC and a single Virtio PMD; NIC-to-VM
>> forwarding was done in IO mode with testpmd on the host, and Rx->Tx
>> forwarding used macswap mode on the guest side.
>>
>> I also saw some performance regression when running a simple testpmd
>> test on both ends.
>>
>> Yuanhan, did you run some benchmarks with your series enabling
>> ANY_LAYOUT?
>
> It was enabled. But the spec specifies that VERSION_1 implies
> ANY_LAYOUT. Therefore, QEMU removes ANY_LAYOUT when VERSION_1 is set.
>
> We can keep arguing about which is fastest. I guess we have different
> setups and different results, so we are probably deadlocked here.
> But in any case, the current code is inconsistent, as it uses a
> single descriptor when ANY_LAYOUT is set, but not when VERSION_1 is
> set.
>
> I believe it makes sense to use a single descriptor any time it is
> possible, but you are free to think otherwise.
> Please make a call and make the code consistent (remove single
> descriptors altogether, or use them when VERSION_1 is set too).
> Otherwise it just creates yet another testing headache.

I also think it makes sense to use a single descriptor, but I had to
highlight that I noticed a significant performance degradation when
using a single descriptor on my setup.

I'm fine with taking your patch in virtio-next, so that more testing
can be conducted.

Thanks,
Maxime

>
> Thanks,
>
> - Pierre
>
>>
>>>
>>> I must also note that qemu 2.5 does not seem to deal with VERSION_1
>>> and ANY_LAYOUT correctly.
>>> The patch I am proposing here works with qemu 2.7, but with qemu
>>> 2.5, testpmd still behaves as if ANY_LAYOUT (or VERSION_1) was not
>>> available. This is not catastrophic, but just note that you will
>>> not see the performance gain in some cases with qemu 2.5.
>>
>> Thanks for the info.
>>
>> Regards,
>> Maxime
>
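For completeness, here is a rough accounting of what each of the Tx
layouts compared above costs for a single-segment packet. It is a
self-contained sketch that just restates the queue-usage and
1-instead-of-3 arguments from this thread, not measured driver
behaviour:

#include <stdio.h>

/* Guest-visible ring slots consumed and descriptors the device must
 * fetch, per single-segment packet, for each Tx layout discussed. */
struct tx_mode {
	const char *name;
	int ring_slots;   /* slots used in the descriptor ring     */
	int desc_fetches; /* descriptor reads the device performs  */
};

int main(void)
{
	struct tx_mode modes[] = {
		/* header and data in two chained descriptors: halves
		 * the number of packets the queue can hold */
		{ "chained",  2, 2 },
		/* one ring slot pointing at an indirect table holding
		 * header + data entries: 1 + 2 = 3 fetches */
		{ "indirect", 1, 3 },
		/* header pushed into the mbuf headroom, so one
		 * descriptor covers header + data ("can_push") */
		{ "can_push", 1, 1 },
	};

	for (size_t i = 0; i < sizeof(modes) / sizeof(modes[0]); i++)
		printf("%-8s: %d ring slot(s), %d descriptor fetch(es)\n",
		       modes[i].name, modes[i].ring_slots,
		       modes[i].desc_fetches);
	return 0;
}

Which layout wins in practice clearly depends on the setup, hence the
interest in getting the patch into virtio-next for wider testing.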