From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-lf0-f53.google.com (mail-lf0-f53.google.com [209.85.215.53]) by dpdk.org (Postfix) with ESMTP id BA2B42E41 for ; Wed, 12 Oct 2016 17:31:15 +0200 (CEST) Received: by mail-lf0-f53.google.com with SMTP id b75so80797105lfg.3 for ; Wed, 12 Oct 2016 08:31:15 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=6wind-com.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:user-agent:in-reply-to :references:mime-version:content-transfer-encoding; bh=YluDEmqzZGFBMwk02igIeNS0IZwGc2rO1dpCQCztpfQ=; b=1/7ToqIcGTUMpYpWs62FR2m4C+NzDG+Mtj2r1FRtJfZbb61ecQmj8lYX97wLSCYAUN +gpS+7uznUQpUj9LWHuOYWRjDsLHfui38O3nBefQujJGAnJXQbhJ6NecjQcPg/CcGm4g Ow/YZydnO5T9U4FYHYbGfXRdGhlFQ2Wh63iTYjLqGDvf0uVh7+DYUojdt48IaWBBoiO1 E3Po8En9HsZvEEMXkF8kDPkLpl3ed/5eTE2VK9SswBKXxohLOPRW9hM5ONOc7anoNn2W Z+zlT8Yrm+btFpO+86oBNeIpktTbF2VYnwGcnD0MfCyJXDawt4TZ4J+f3w4UHiP3xW1b hD1w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:from:to:cc:subject:date:message-id:user-agent :in-reply-to:references:mime-version:content-transfer-encoding; bh=YluDEmqzZGFBMwk02igIeNS0IZwGc2rO1dpCQCztpfQ=; b=SsL/nlOIzhCZUokDT4qtBEhPiBkUDpP/F0+6LUXIpQ4HQx+u8fMk4GTGA3I29OwsXN tzuL4Wq40VTy60Hs7DryqCKrDtrneR7mBhXsnYcqIzhT8oVytFygSKiGPlDpzA4OeAUV HHIM16jsRQMMqCEXtqjK6vJX/wMZf3UW7VT14CnwPQk++Al6N8JIfA8vqm79H0lar3fP cO4pHa+LvuKMKuwva7n7UBOBbE7mhFkm0/LIkEqVAjglo3OuUVd/1JpMnk8loEigoZFo g7DyW58kOfqRG5RjvBScMuGatYvQqrljg/XCaA2oafRsI3FRca64V8KOqJ5B0zQX/UEV aQAQ== X-Gm-Message-State: AA6/9RmKQjfufKvtx7lkwNr0o3Wn99Dfn0X2Cirk22L2F4FIG7jctgKxHg8VLgZz4Y1n+5Ai X-Received: by 10.25.19.37 with SMTP id j37mr2101347lfi.124.1476286274831; Wed, 12 Oct 2016 08:31:14 -0700 (PDT) Received: from xps13.localnet (184.203.134.77.rev.sfr.net. [77.134.203.184]) by smtp.gmail.com with ESMTPSA id g84sm2395333lji.11.2016.10.12.08.31.12 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 12 Oct 2016 08:31:13 -0700 (PDT) From: Thomas Monjalon To: "Wang, Zhihong" Cc: Yuanhan Liu , Jianbo Liu , Maxime Coquelin , dev@dpdk.org Date: Wed, 12 Oct 2016 17:31:12 +0200 Message-ID: <1707408.heiIBWjog8@xps13> User-Agent: KMail/4.14.10 (Linux/4.5.4-1-ARCH; KDE/4.14.11; x86_64; ; ) In-Reply-To: <8F6C2BD409508844A0EFC19955BE09414E7CD78E@SHSMSX103.ccr.corp.intel.com> References: <1471319402-112998-1-git-send-email-zhihong.wang@intel.com> <20161012025307.GG16751@yliu-dev.sh.intel.com> <8F6C2BD409508844A0EFC19955BE09414E7CD78E@SHSMSX103.ccr.corp.intel.com> MIME-Version: 1.0 Content-Transfer-Encoding: 7Bit Content-Type: text/plain; charset="us-ascii" Subject: Re: [dpdk-dev] [PATCH v3 0/5] vhost: optimize enqueue X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: patches and discussions about DPDK List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 12 Oct 2016 15:31:16 -0000 Sorry guys, you lost me in the discussion. Is there some regression only on ARM? Does it need some work specifically on memcpy for ARM, or vhost for ARM? Who can work on ARM optimization? More comments below. 2016-10-12 12:22, Wang, Zhihong: > From: Yuanhan Liu [mailto:yuanhan.liu@linux.intel.com] > > > It's an ARM server. > > > > > > >> > 3. How many percentage of drop are you seeing? > > > The testing result: > > > size (bytes) improvement (%) > > > 64 3.92 > > > 128 11.51 > > > 256 24.16 > > > 512 -13.79 > > > 1024 -22.51 > > > 1500 -12.22 > > > A correction is that performance is dropping if byte size is larger than 512. > > > > I have thought of this twice. Unfortunately, I think I may need NACK this > > series. > > > > Merging two code path into one is really good: as you stated, it improves > > the maintainability. But only if we see no performance regression on both > > path after the refactor. Unfortunately, that's not the case here: it hurts > > the performance for one code path (non-mrg Rx). +1 > > That makes me think we may should not do the code path merge at all. I think > > that also aligns with what you have said before (internally): we could do the > > merge if it gives comparable performance before and after that. > > > > Besides that, I don't quite like the way you did in patch 2 (rewrite enqueue): > > you made a lot of changes in one patch. That means if something wrong > > happened, > > it is hard to narrow down which change introduces that regression. Badly, > > that's exactly what we met here. Weeks have been passed, I see no progress. +1, it is important to have simple patches making changes step by step. > > That's the reason we like the idea of "one patch only does one thing, an > > atomic thing". > > > Yuanhan, folks, > > Thanks for the analysis. I disagree here though. > > I analyze, develop, benchmark on x86 platforms, where this patch > works great. > > I've been trying to analyze on ARM too but it takes time and I've > had a schedule. Also since the ARM perf issue comes when it's > v6 already, I might not be able to make it in time. However > that's what I have to do for this patch to be merged in this > or the next release. > > In the meantime, may I suggest we consider the possibility to > have dedicated codes for **perf critical paths** for different > kinds of architecture? Yes that's what we do in several parts of DPDK. > It can be hard for a person to have both the knowledge and the > development environment for multiple archs at the same time. Yes we do not expect you work on ARM. So if nobody work on the ARM issue, you could make 2 code paths in order to allow your optimization for x86 only. But that's not the preferred way. And you must split your rework to better identify which part is a regression on ARM. > Moreover, different optimization techniques might be required for > different archs, so it's hard and unnecessary to make a function > works for all archs, sometimes it's just not the right thing to do. Yes sometimes. Please help us to be convinced for this case. > > So I will apply the first patch (it's a bug fixing patch) and ask you to > > refactor the rest, without the code path merge. > > > > I think we could still have a good maintainability code base if we introduce > > more common helper functions that can be used on both Rx path, or even on > > Tx path (such as update_used_ring, or shadow_used_ring). Yes it is a good step. And the code path merge could be reconsidered later. > > It's a bit late for too many changes for v16.11. I think you could just > > grab patch 6 (vhost: optimize cache access) to the old mrg-Rx code path, > > if that also helps the performance? Let us handle the left in next release, > > such as shadow used ring. Thank you