From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wm0-f53.google.com (mail-wm0-f53.google.com [74.125.82.53]) by dpdk.org (Postfix) with ESMTP id A185B5598 for ; Thu, 1 Dec 2016 08:15:27 +0100 (CET) Received: by mail-wm0-f53.google.com with SMTP id a197so289473707wmd.0 for ; Wed, 30 Nov 2016 23:15:27 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=6wind-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=et2yVG91S7geCwRAOMuAJHh9Vom32iOUGyT5zmlTqU4=; b=NGTiRBDoVEy5xYD5J+zxvgRAXziLxsA05VfPQBTObvF5CvcYWxQpzbYI9fUuqDrgB4 6frJZEdfbIMNLjBRy9qO7BfN4D3ZefAKYmDjakUlH7q3hU4UkJ2Sadom2YqUlZOqI+vp zuRbo2/rk9YQCqbOgjlabcA3Xh5sTtCmqPVHt8XXnHQxsIFmr3o5HCms6zkkDGXFJHmp Dkv8qZifN3bUmY0jbAGfp8DrjEZ5IcP66eiV/4CPS0EvSH610UupAJZDQzgMacSsndTg /9PQqnV0iwaqaOQQK9dYBwObCAaDWcMo0JIgIKjsN2s6hoXl26TS1lMo5iMu3W7WeTVV xsiQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=et2yVG91S7geCwRAOMuAJHh9Vom32iOUGyT5zmlTqU4=; b=Ek2WR77CuctOeRcriagy8qLLiyD4RG90i5YeaKnadBUXapjrLytxhHW9gPswUAfJW4 yz9+6P/bSaqLu7mxHJfLrHnY7HZXj5j4jTUveAjoOLBG7Df1eAWVqv07k8VHQrMtIAuH tKon/5e5VkS4RTYnKvVEkKBKNUiKz4M0qGwqtabvtH9LxjTfW6H4PEJb9I26d3BCwMD9 Unk+03kyNMrhI+rbBywCTnObGngoiE3tIyEW15w5dFwtUWKHeD4/cNYAYVuTxGjaLNE7 0aajGkCslG7Xb58J0vI1+0auhEGn5CiLoqTyAKFbzDOaiMzsToASCKAC2dINpLIJHH2s o45g== X-Gm-Message-State: AKaTC01VRJ3+SkxiCqiWHg0sbrbrKgvYr80eRSw45qTvtI92YFulHdp5puUZny0SseHuGwfF X-Received: by 10.28.138.135 with SMTP id m129mr30163097wmd.36.1480576527378; Wed, 30 Nov 2016 23:15:27 -0800 (PST) Received: from 6wind.com (guy78-3-82-239-227-177.fbx.proxad.net. [82.239.227.177]) by smtp.gmail.com with ESMTPSA id l67sm11561113wmf.20.2016.11.30.23.15.25 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 30 Nov 2016 23:15:26 -0800 (PST) Date: Thu, 1 Dec 2016 08:15:18 +0100 From: Adrien Mazarguil To: "Ananyev, Konstantin" Cc: Thomas Monjalon , "dev@dpdk.org" , Rahul Lakkireddy , Stephen Hurd , Jan Medala , Jakub Palider , John Daley , Alejandro Lucero , Harish Patil , Rasesh Mody , Jerin Jacob , Yuanhan Liu , Yong Wang , "Kulasek, TomaszX" , "olivier.matz@6wind.com" Message-ID: <20161201071518.GG10340@6wind.com> References: <1477486575-25148-1-git-send-email-tomaszx.kulasek@intel.com> <1479922585-8640-1-git-send-email-tomaszx.kulasek@intel.com> <8317180.L80Qf11uiu@xps13> <20161130074003.GD10340@6wind.com> <2601191342CEEE43887BDE71AB9772583F0E20C8@irsmsx105.ger.corp.intel.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <2601191342CEEE43887BDE71AB9772583F0E20C8@irsmsx105.ger.corp.intel.com> Subject: Re: [dpdk-dev] [PATCH v12 0/6] add Tx preparation X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: patches and discussions about DPDK List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 01 Dec 2016 07:15:27 -0000 Hi Konstantin, On Wed, Nov 30, 2016 at 10:54:50AM +0000, Ananyev, Konstantin wrote: [...] > > Something is definitely needed here, and only PMDs can provide it. I think > > applications should not have to clear checksum fields or initialize them to > > some magic value, same goes for any other offload or hardware limitation > > that needs to be worked around. > > > > tx_prep() is one possible answer to this issue, however as mentioned in the > > original patch it can be very expensive if exposed by the PMD. > > > > Another issue I'm more concerned about is the way limitations are managed > > (struct rte_eth_desc_lim). While not officially tied to tx_prep(), this > > structure contains new fields that are only relevant to a few devices, and I > > fear it will keep growing with each new hardware quirk to manage, breaking > > ABIs in the process. > > Well, if some new HW capability/limitation would arise and we'd like to support > it in DPDK, then yes we probably would need to think how to incorporate it here. > Do you have anything particular in mind here? Nothing in particular, so for the sake of the argument, let's suppose that I would like to add a field to expose some limitation that only applies to my PMD during TX but looks generic enough to make sense, e.g. maximum packet size when VLAN tagging is requested. PMDs are free to set that field to some special value (say, 0) if they do not care. Since that field exists however, conscious applications should check its value for each packet that needs to be transmitted. This extra code causes a slowdown just by sitting in the data path. Since it is not the only field in that structure, the performance impact can be significant. Even though this code is inside applications, it remains unfair to PMDs for which these tests are irrelevant. This problem is identified and addressed by tx_prepare(). Thanks to tx_prepare(), these checks are moved back into PMDs where they belong. PMDs that do not need them do not have to provide support for tx_prepare() and do not suffer any performance impact as result; applications only have to make sure tx_prepare() is always called at some point before tx_burst(). Once you reach this stage, you've effectively made tx_prepare() mandatory before tx_burst(). If some bug occurs, then perhaps you forgot to call tx_prepare(), you just need to add it. The total cost for doing TX is therefore tx_prepare() + tx_burst(). I'm perhaps a bit pessimistic mind you, but I do not think tx_prepare() will remain optional for long. Sure, PMDs that do not implement it do not care, I'm focusing on applications, for which the performance impact of calling tx_prepare() followed by tx_burst() is higher than a single tx_burst() performing all the necessary preparation at once. [...] > > Following the same logic, why can't such a thing be made part of the TX > > burst function as well (through a direct call to rte_phdr_cksum_fix() > > whenever necessary). From an application standpoint, what are the advantages > > of having to: > > > > if (tx_prep()) // iterate and update mbufs as needed > > tx_burst(); // iterate and send > > > > Compared to: > > > > tx_burst(); // iterate, update as needed and send > > I think that was discussed extensively quite a lot previously here: > As Thomas already replied - main motivation is to allow user > to execute them on different stages of packet TX pipeline, > and probably on different cores. > I think that provides better flexibility to the user to when/where > do these preparations and hopefully would lead to better performance. And I agree, I think this use case is valid but does not warrant such a high penalty when your application does not need that much flexibility. Simple (yet conscious) applications need the highest performance. Complex ones as you described already suffer quite a bit from IPCs and won't mind a couple of extra CPU cycles right? Yes they will, therefore we need a method that satisfies both cases. As a possible solution, a special mbuf flag could be added to each mbuf having gone through tx_prepare(). That way, tx_burst() could skip some checks and things it would otherwise have done. Another possibility, telling the PMD first that you always intend to use tx_prepare() and getting a simpler/faster tx_burst() callback as a result. > Though, if you or any other PMD developer/maintainer would prefer > for particular PMD to combine both functionalities into tx_burst() and > keep tx_prep() as NOP - this is still possible too. Whether they implement it or not, this issue does not impact PMDs anyway, we should probably ask DPDK application developers instead. [...] > > For both mlx4 and mlx5 then, > > "it is OK, we do not need any checksum preparation for TSO". > > > > Actually I do not think we'll ever need tx_prep() unless we add our own > > quirks to struct rte_eth_desc_lim (and friends) which are currently quietly > > handled by TX burst functions. > > Ok, so MLX PMD is not affected by these changes and tx_prep for MLX can be safely > set to NULL, correct? Correct, actually the rest of this message should be in a separate thread. From the MLX side, there is no issue with tx_prepare(). -- Adrien Mazarguil 6WIND