From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from stargate3.asicdesigners.com (unknown [67.207.115.98]) by dpdk.org (Postfix) with ESMTP id 516BE593A for ; Wed, 7 Oct 2015 17:27:22 +0200 (CEST)
Received: from localhost (scalar.blr.asicdesigners.com [10.193.185.94]) by stargate3.asicdesigners.com (8.13.8/8.13.8) with ESMTP id t97FRITC010848; Wed, 7 Oct 2015 08:27:19 -0700
Date: Wed, 7 Oct 2015 20:57:22 +0530
From: Rahul Lakkireddy
To: "Ananyev, Konstantin"
Message-ID: <20151007152721.GA2689@scalar.blr.asicdesigners.com>
References: <318fc8559675b1157e7f049a6a955a6a2059bac7.1443704150.git.rahul.lakkireddy@chelsio.com> <20151005100620.GA2487@scalar.blr.asicdesigners.com> <2601191342CEEE43887BDE71AB97725836AA36CF@irsmsx105.ger.corp.intel.com> <20151005124205.GA24533@scalar.blr.asicdesigners.com> <2601191342CEEE43887BDE71AB97725836AA37E2@irsmsx105.ger.corp.intel.com> <20151005150729.GA8809@scalar.blr.asicdesigners.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20151005150729.GA8809@scalar.blr.asicdesigners.com>
User-Agent: Mutt/1.5.24 (2015-08-30)
Cc: "dev@dpdk.org" , Felix Marti , Nirranjan Kirubaharan , Kumar A S
Subject: Re: [dpdk-dev] [PATCH 1/6] cxgbe: Optimize forwarding performance for 40G
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: patches and discussions about DPDK
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Wed, 07 Oct 2015 15:27:22 -0000

On Monday, October 10/05/15, 2015 at 20:37:31 +0530, Rahul Lakkireddy wrote:
> On Monday, October 10/05/15, 2015 at 07:09:27 -0700, Ananyev, Konstantin wrote:
> > Hi Rahul,
>
> [...]
>
> > > > > This additional check seems redundant for single segment
> > > > > packets since rte_pktmbuf_free_seg also performs rte_mbuf_sanity_check.
> > > > >
> > > > > Several PMDs already prefer to use rte_pktmbuf_free_seg directly over
> > > > > rte_pktmbuf_free as it is faster.
> > > >
> > > > Other PMDs use rte_pktmbuf_free_seg() as each TD has an associated
> > > > with it segment. So as HW is done with the TD, SW frees associated segment.
> > > > In your case I don't see any point in re-implementing rte_pktmbuf_free() manually,
> > > > and I don't think it would be any faster.
> > > >
> > > > Konstantin
> > >
> > > As I mentioned below, I am clearly seeing a difference of 1 Mpps. And 1
> > > Mpps is not a small difference IMHO.
> >
> > Agree with you here - it is a significant difference.
> >
> > >
> > > When running l3fwd with 8 queues, I also collected a perf report.
> > > When using rte_pktmbuf_free, I see that it eats up around 6% cpu as
> > > below in perf top report:-
> > > --------------------
> > > 32.00%  l3fwd  [.] cxgbe_poll
> > > 22.25%  l3fwd  [.] t4_eth_xmit
> > > 20.30%  l3fwd  [.] main_loop
> > >  6.77%  l3fwd  [.] rte_pktmbuf_free
> > >  4.86%  l3fwd  [.] refill_fl_usembufs
> > >  2.00%  l3fwd  [.] write_sgl
> > > .....
> > > --------------------
> > >
> > > While, when using rte_pktmbuf_free_seg directly, I don't see above
> > > problem. perf top report now comes as:-
> > > -------------------
> > > 33.36%  l3fwd  [.] cxgbe_poll
> > > 32.69%  l3fwd  [.] t4_eth_xmit
> > > 19.05%  l3fwd  [.] main_loop
> > >  5.21%  l3fwd  [.] refill_fl_usembufs
> > >  2.40%  l3fwd  [.] write_sgl
> > > ....
> > > -------------------
> >
> > I don't think these 6% disappeared anywhere.
> > As I can see, now t4_eth_xmit() increased by roughly same amount
> > (you still have same job to do).
>
> Right.
>
> > To me it looks like in that case compiler didn't really inline rte_pktmbuf_free().
> > Wonder can you add 'always_inline' attribute to the rte_pktmbuf_free(),
> > and see would it make any difference?
> >
> > Konstantin
>
> I will try out above and update further.
>

Tried always_inline and didn't see any difference in performance in RHEL
6.4 with gcc 4.4.7, but was seeing 1 MPPS improvement with the above
block.

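
For anyone following along, the always_inline experiment amounts to
roughly the sketch below. It is a self-contained illustration only: the
mock_* names and the mock_mbuf struct are hypothetical stand-ins and not
the real rte_mbuf or cxgbe code. It assumes GCC or a compatible compiler
for the __attribute__((always_inline)) syntax, and it shows a chain free
that calls a per-segment free, with both forced inline.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a chained packet buffer (not rte_mbuf). */
struct mock_mbuf {
    struct mock_mbuf *next;
};

/* Counts freed segments so the effect of the calls is observable. */
static int segs_freed;

/* Free exactly one segment; plays the role of a per-segment free. */
static inline __attribute__((always_inline)) void
mock_free_seg(struct mock_mbuf *m)
{
    free(m);
    segs_freed++;
}

/*
 * Walk the chain and free every segment. always_inline asks the
 * compiler to inline this into each caller instead of leaving the
 * decision to its heuristics.
 */
static inline __attribute__((always_inline)) void
mock_free_chain(struct mock_mbuf *m)
{
    while (m != NULL) {
        struct mock_mbuf *next = m->next;

        mock_free_seg(m);
        m = next;
    }
}

/* Build a chain of nsegs segments for the demo. */
static struct mock_mbuf *
mock_alloc_chain(int nsegs)
{
    struct mock_mbuf *head = NULL;
    int i;

    for (i = 0; i < nsegs; i++) {
        struct mock_mbuf *m = malloc(sizeof(*m));

        assert(m != NULL);
        m->next = head;
        head = m;
    }
    return head;
}

/* Free a single-segment chain and a three-segment chain. */
int
mock_demo(void)
{
    segs_freed = 0;
    mock_free_chain(mock_alloc_chain(1));
    mock_free_chain(mock_alloc_chain(3));
    return segs_freed;
}
```

For a single-segment packet the loop body runs once, so once the chain
walk is inlined the remaining work is essentially the per-segment free.
Whether that shows up in throughput depends on the compiler, as the
numbers in this thread suggest.
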
I've moved to the latest RHEL 7.1 with gcc 4.8.3 and tried both
always_inline and the above block, and I'm not seeing any difference
with either.

Will drop this block and submit a v2.

Thanks for the review, Aaron and Konstantin.

Thanks,
Rahul