From: Andrew Rybchenko
To: Jerin Jacob, Shahaf Shuler
CC: "Ananyev, Konstantin", Stephen Hemminger, Thomas Monjalon, "dev@dpdk.org", "Zhang, Helin", "Wu, Jingjing"
Date: Tue, 12 Sep 2017 09:43:20 +0300
In-Reply-To: <20170912055137.GA24921@jerin>
Subject: Re: [dpdk-dev] [PATCH v2 2/2] ethdev: introduce Tx queue offloads API

On 09/12/2017 08:51 AM, Jerin Jacob wrote:
>> Tuesday, September 12, 2017 7:01 AM, Jerin Jacob:
>>> Yes, only when ETH_TXQ_FLAGS_NOMULTMEMP and
>>> ETH_TXQ_FLAGS_NOREFCOUNT are selected at Tx queue configuration.
>>>
>>>> So literally, yes, it is not a Tx HW offload, though I understand your
>>>> intention to have such a possibility - it might help to save some cycles.
>>> It is not just a few cycles. We could see a ~24% drop per core (with 64B
>>> packets) with testpmd and l3fwd on some SoCs. It is not very specific to
>>> nicvf HW; the problem is the limited cache hierarchy on very low-end
>>> arm64 machines. For the Tx buffer recycling case, we need to touch the
>>> mbuf again to find the associated mempool to free it to. That is fine if
>>> the application demands it, but not every application does.
>>>
>>> We have two categories of arm64 machines: high-end machines whose cache
>>> hierarchy is similar to x86 server machines, and low-end ones with very
>>> limited cache resources. Unfortunately, we need to run the same binary
>>> on both.
>>>
>>>> Wonder would some new driver-specific function help in that case?
>>>> nicvf_txq_pool_setup(portid, queueid, struct rte_mempool *txpool,
>>>> uint32_t flags); or so?
>>> It is possible, but then how do we make such a change in testpmd, l3fwd
>>> or ipsec-gw - in-tree applications which need only NOMULTMEMP &
>>> NOREFCOUNT?
>>>
>>> If there is a concern about making it Tx queue level, that is fine. We
>>> can move from queue level to port level or global level.
>>> IMO, the application should express in some form that it wants only
>>> NOMULTMEMP & NOREFCOUNT, and that is the case for l3fwd and ipsec-gw.
>>>
>> I understand the use case, and the fact that those flags improve performance on low-end ARM CPUs.
>> IMO those flags cannot be on queue/port level. They must be global.
> Where should we have it as global (in terms of API)?
> And why can it not be at port level?

I think port level is the right place for these flags. They define which 
transmit and transmit cleanup callbacks can be used, and those callbacks 
are specified at port level now. I see no good reason to go more global 
than that: it would complicate a possible future move to per-queue 
(rather than per-port) transmit and transmit cleanup callbacks. All 
three flags (no-multi-seg, no-multi-mempool, no-reference-counter) 
belong to one group and should go together.

>> Even though the use case is generic, the nicvf PMD is the only one which does such an optimization.
>> So I am suggesting again - why not expose it as a PMD-specific parameter?
> Why make it PMD-specific if the application can express it through
> normative DPDK APIs?
>
>> - The application can express that it wants such an optimization.
>> - It is global.
>>
>> Currently it does not seem there is high demand for such flags from other PMDs. If such demand arises, we can discuss again how to expose it properly.
> It is not PMD-specific; it is all about where it runs. It is applicable
> to any PMD that runs on low-end hardware where it needs SW-based Tx
> buffer recycling (the NPU is a different story, as it has a HW-assisted
> mempool manager).
> What do we lose by letting DPDK run effectively on low-end hardware
> with such "on demand" runtime configuration through normative DPDK APIs?

+1, and it improves performance on amd64 as well - definitely less than 
24%, but noticeable. If the application architecture meets these 
conditions, why not allow it to use the advantage and run faster?
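For readers joining the thread: the flags under discussion come from the pre-offloads txq_flags field of rte_eth_txconf. The sketch below shows the kind of check a PMD can make at queue setup time to choose a cheaper Tx free path. The macro values mirror the ETH_TXQ_FLAGS_* definitions in rte_ethdev.h of this DPDK generation but are redefined locally so the sketch compiles on its own; the helper function is illustrative, not any driver's actual code.

```c
#include <assert.h>
#include <stdint.h>

/* Local copies of the relevant txq_flags bits (values as in rte_ethdev.h
 * of the 17.xx era), so this sketch builds without DPDK headers. */
#define ETH_TXQ_FLAGS_NOMULTSEGS 0x0001
#define ETH_TXQ_FLAGS_NOREFCOUNT 0x0002
#define ETH_TXQ_FLAGS_NOMULTMEMP 0x0004

/* Trimmed-down stand-in for struct rte_eth_txconf. */
struct txconf {
    uint32_t txq_flags;
};

/* A PMD may select a cheaper Tx completion path only when BOTH hints are
 * set: no per-mbuf reference counting AND a single mempool per Tx queue.
 * Either one alone is not enough to skip touching the mbuf on free. */
static int
txq_can_use_fast_free(const struct txconf *conf)
{
    const uint32_t need = ETH_TXQ_FLAGS_NOREFCOUNT | ETH_TXQ_FLAGS_NOMULTMEMP;

    return (conf->txq_flags & need) == need;
}
```

With the real API, an application such as l3fwd would set these bits in rte_eth_txconf.txq_flags before calling rte_eth_tx_queue_setup(); the open question in the thread is only whether this expression should live at queue, port, or global scope.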
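To make the cache argument concrete, here is a hedged sketch (not nicvf code; all types are local stand-ins) of why the generic completion path costs more on machines with small caches: it must dereference every mbuf to read its refcount and owning mempool, while the restricted path can return the whole batch to the one pool configured for the queue without reading the mbufs at all.

```c
#include <assert.h>
#include <stdint.h>

/* Minimal stand-ins for rte_mempool / rte_mbuf. */
struct mempool {
    int freed;              /* count of buffers returned to this pool */
};

struct mbuf {
    struct mempool *pool;   /* owning pool: one dereference (and likely a
                             * cache miss on small caches) per mbuf */
    uint16_t refcnt;
};

/* Generic Tx cleanup: must touch each mbuf to check its refcount and
 * find which mempool to return it to. */
static void
tx_free_generic(struct mbuf **pkts, int n)
{
    for (int i = 0; i < n; i++) {
        if (--pkts[i]->refcnt == 0)
            pkts[i]->pool->freed++;   /* per-mbuf pool dereference */
    }
}

/* NOMULTMEMP + NOREFCOUNT cleanup: the pool is known per queue and no
 * refcounts exist, so the batch is returned without reading any mbuf. */
static void
tx_free_fast(struct mempool *queue_pool, struct mbuf **pkts, int n)
{
    (void)pkts;                       /* no per-mbuf cache traffic */
    queue_pool->freed += n;
}
```

Both paths free the same number of buffers; the difference is purely in how much per-packet memory traffic the cleanup generates, which is where the reported ~24% on cache-limited arm64 parts comes from.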