Date: Thu, 21 Jul 2016 14:47:34 +0200
From: Adrien Mazarguil
To: "Lu, Wenzhuo"
Cc: "dev@dpdk.org", Thomas Monjalon, "Zhang, Helin", "Wu, Jingjing",
 Rasesh Mody, Ajit Khaparde, Rahul Lakkireddy, Jan Medala, John Daley,
 "Chen, Jing D", "Ananyev, Konstantin", Matej Vido, Alejandro Lucero,
 Sony Chacko, Jerin Jacob, "De Lara Guarch, Pablo", Olga Shern
Message-ID: <20160721124734.GR7621@6wind.com>
References: <20160705181646.GO7621@6wind.com>
 <6A0DE07E22DDAD4C9103DF62FEBC09090348E1A7@shsmsx102.ccr.corp.intel.com>
 <20160707102650.GU7621@6wind.com>
 <6A0DE07E22DDAD4C9103DF62FEBC090903492563@shsmsx102.ccr.corp.intel.com>
 <20160719131219.GK7621@6wind.com>
 <6A0DE07E22DDAD4C9103DF62FEBC090903492A93@shsmsx102.ccr.corp.intel.com>
 <20160720104115.GN7621@6wind.com>
 <6A0DE07E22DDAD4C9103DF62FEBC090903492F92@shsmsx102.ccr.corp.intel.com>
In-Reply-To: <6A0DE07E22DDAD4C9103DF62FEBC090903492F92@shsmsx102.ccr.corp.intel.com>
Subject: Re: [dpdk-dev] [RFC] Generic flow director/filtering/classification API
List-Id: patches and discussions about DPDK

Hi Wenzhuo,

It seems that we agree on almost everything now; just a few more comments
below after snipping the now
irrelevant parts.

On Thu, Jul 21, 2016 at 03:18:11AM +0000, Lu, Wenzhuo wrote:
[...]
> > > > > Does it mean the PMD should store and maintain all the rules? Why
> > > > > not let RTE do that? I think if the PMD maintains all the rules,
> > > > > it means every kind of NIC should have a copy of the code for the
> > > > > rules. But if RTE does that, only one copy of the code needs to be
> > > > > maintained, right?
> > > >
> > > > I've considered having rules stored in a common format understood at
> > > > the RTE level and not specific to each PMD and decided that the
> > > > opaque rte_flow pointer was a better choice for the following
> > > > reasons:
> > > >
> > > > - Even though flow rules management is done in the control path,
> > > >   processing must be as fast as possible. Letting PMDs store flow
> > > >   rules using their own internal representation gives them the
> > > >   chance to achieve better performance.
> > >
> > > I don't quite understand. I think we're talking about maintaining the
> > > rules in SW. I don't think there's anything that needs to be optimized
> > > for specific NICs. If we need to optimize the code, I think we need to
> > > consider the CPU, OS ... and some common means. Am I wrong?
> >
> > Perhaps we were talking about different things; here I was only
> > explaining why rte_flow (the result of creating a flow rule) should be
> > opaque and fully managed by the PMD. More on the SW side of things
> > below.
> >
> > > > - An opaque context managed by PMDs would probably have to be stored
> > > >   somewhere as well anyway.
> > > >
> > > > - PMDs may not need to allocate/store anything at all if they
> > > >   exclusively rely on HW state for everything. In my opinion, the
> > > >   generic API has enough constraints for this to work and maintain
> > > >   consistency between flow rules. Note this is currently how most
> > > >   PMDs implement FDIR and other filter types.
> > >
> > > Yes, the rules are stored by HW. But when stopping/starting the
> > > device, the rules in HW will be lost.
> > > We have to store the rules in SW and re-program them when restarting
> > > the device.
> >
> > Assume a HW capable of keeping flow rules programmed even during a
> > stop/start cycle (e.g. mlx4/mlx5 may be able to do it from a DPDK point
> > of view), don't you think it is more efficient to standardize on this
> > behavior and let PMDs restore flow rules for HW that does not support
> > it, regardless of whether it would be done by RTE or the application
> > (SW)?
>
> Didn't know that. As some NICs already have the ability to keep the rules
> during a stop/start cycle, maybe it could be a trend :)

Well yeah, if you are wondering about that, these PMDs do not have the same
definition for port stop/start as lower level PMDs like ixgbe and i40e. In
the mlx4/mlx5 cases, most control path operations (queue creation,
destruction and general management) end up performed by kernel drivers.
Stopping a port does not really shut it down as the kernel still manages its
own netdevice independently.

[...]
> > > > - The flow rules format described in this specification (pattern /
> > > >   actions) will be used directly by applications, which will be free
> > > >   to arrange them in lists, trees or in any other way if they need
> > > >   to keep flow specifications around for further processing.
> > >
> > > Who will create the lists, trees or something else? According to the
> > > previous discussion, I think the APP will program the rules one by
> > > one. So if the APP organizes the rules into lists, trees..., the PMD
> > > doesn't know that.
> > > And you said "Given that the opaque rte_flow pointer associated with a
> > > flow rule is to be stored by the application". I'm lost here.
> >
> > I guess that's because we're discussing two different things, flow rule
> > specifications and flow rule objects. Let me sum it up:
> >
> > - Flow rule specifications are the patterns/actions combinations
> >   provided by applications to rte_flow_create().
> >   Applications can store those as needed and organize them as they wish
> >   (hash, tree, list). Neither PMDs nor RTE will do it for them.
> >
> > - Flow rule objects (struct rte_flow *) are generated when a flow rule
> >   is created. Applications must keep these around if they want to
> >   manipulate them later (i.e. destroy or query existing rules).
>
> Thanks for this clarification. So the specifications can be different from
> the objects, right? The specifications are what the APP wants, the objects
> are what the APP really gets, as rte_flow_create can fail. Right?

Yes, precisely. Apps are also free to keep specifications around even in the
event of a flow creation failure. I think a generic software fallback will
be provided at some point.

[...]
> > What we seem to not agree about is that you think RTE should be
> > responsible for restoring flow rules of devices that lose them when
> > stopped. I think doing so is unfair to devices for which it is not the
> > case and not really nice to applications, so my opinion is that the PMD
> > is responsible for restoring flow rules however it wants. It is free to
> > use RTE helpers to keep track of them, as long as it's all managed
> > internally.
>
> What I think is RTE can store the flow rules and recreate them after
> restarting, in a function like rte_dev_start, so the APP knows nothing
> about it. But according to the discussion above, I think the design
> doesn't support it, right?

Yes. Right now the design explicitly states that PMDs are on their own
regarding this (4.3 Behavior). While it could be modified, I really think it
would be less efficient for the reasons stated above.

> RTE doesn't store the flow rule objects, and even if it stored them,
> there's no way designed to re-program the objects. Also, some HW doesn't
> need to be re-programmed. I think it's OK to let the PMD maintain the
> rules as re-programming is a NIC-specific requirement.

Great to finally agree on this point.
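
To illustrate the point agreed on above, here is a minimal sketch in C of a
PMD whose HW loses its rules on stop and which re-programs them on start,
while the opaque handles held by the application stay valid. All names in it
(struct toy_flow, struct toy_port, toy_flow_create(), toy_port_stop(),
toy_port_start(), toy_flow_destroy() and the hw_program() stub) are
hypothetical placeholders for illustration only; none of them come from the
RFC or from a real driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical PMD-private flow object; applications only hold a pointer. */
struct toy_flow {
	struct toy_flow *next; /* PMD-internal list linkage */
	unsigned int match;    /* stand-in for a PMD-specific rule encoding */
	unsigned int queue;    /* stand-in for the rule's action */
	int in_hw;             /* nonzero while programmed in the (fake) HW */
};

/* Hypothetical per-port PMD state. */
struct toy_port {
	struct toy_flow *flows; /* every rule the PMD tracks for this port */
	int started;
};

/* Stub standing in for writing one rule to the device. */
static int hw_program(struct toy_flow *f)
{
	f->in_hw = 1;
	return 0;
}

/* Create a rule; it is programmed right away only if the port is started. */
struct toy_flow *toy_flow_create(struct toy_port *p, unsigned int match,
				 unsigned int queue)
{
	struct toy_flow *f = calloc(1, sizeof(*f));

	if (f == NULL)
		return NULL;
	f->match = match;
	f->queue = queue;
	if (p->started && hw_program(f) != 0) {
		free(f);
		return NULL;
	}
	f->next = p->flows;
	p->flows = f;
	return f;
}

/* Stopping wipes the HW state but keeps the PMD-side objects alive. */
void toy_port_stop(struct toy_port *p)
{
	struct toy_flow *f;

	for (f = p->flows; f != NULL; f = f->next)
		f->in_hw = 0;
	p->started = 0;
}

/* Starting re-programs every tracked rule; handles do not change. */
int toy_port_start(struct toy_port *p)
{
	struct toy_flow *f;

	p->started = 1;
	for (f = p->flows; f != NULL; f = f->next)
		if (hw_program(f) != 0)
			return -1;
	return 0;
}

/* Unlink and free a rule previously returned by toy_flow_create(). */
void toy_flow_destroy(struct toy_port *p, struct toy_flow *f)
{
	struct toy_flow **pp = &p->flows;

	while (*pp != NULL && *pp != f)
		pp = &(*pp)->next;
	if (f != NULL && *pp == f) {
		*pp = f->next;
		free(f);
	}
}
```

In this sketch a PMD for HW that keeps its rules across stop/start would
simply leave toy_port_stop() and toy_port_start() without the loops, which is
why the restore logic is kept on the PMD side rather than in a common layer.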
> > > > Thus from an application point of view, whatever happens when
> > > > stopping and restarting a port should not matter. If a flow rule was
> > > > present before, it must still be present afterwards. If the PMD had
> > > > to destroy flow rules and re-create them, it does not actually
> > > > matter if they differ slightly at the HW level, as long as:
> > > >
> > > > - Existing opaque flow rule pointers (rte_flow) are still valid to
> > > >   the PMD and refer to the same rules.
> > > >
> > > > - The overall behavior of all rules is the same.
> > > >
> > > > The list of rules you think of (patterns / actions) is maintained by
> > > > applications (not RTE), and only if they need them. RTE would
> > > > needlessly duplicate this.
> > >
> > > As said before, need more details to understand this. Maybe an example
> > > is better :)
> >
> > The generic format both RTE and applications might understand is the one
> > described in this API (struct rte_flow_pattern and struct
> > rte_flow_actions).
> >
> > If we wanted RTE to maintain some sort of per-port state for flow rule
> > specifications, it would have to be a copy of these structures arranged
> > somehow (list or something else).
> >
> > If we consider that PMDs need to keep a context object associated to a
> > flow rule (the opaque struct rte_flow *), then RTE would most likely
> > have to store it along with the flow specification.
> >
> > Such a list may not be useful to applications (list lookups take time),
> > so they would implement their own redundant method. They might also
> > require extra room to attach some application context to flow rules. A
> > generic list cannot plan for it.
> >
> > Applications know what they want to do with flow rules and are
> > responsible for managing them efficiently with RTE out of the way.
> >
> > I'm not sure if this answered your question, if not, please describe a
> > scenario where a RTE-managed list of flow rules would be mandatory.
>
> Got your point and agree :) Thanks.
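
The application side of the argument quoted above can be sketched in C as
follows: the application owns its own table, where each entry keeps the
specification it asked for, the opaque handle it got back (if any) and some
application context. This is only an illustration under assumptions: struct
mock_spec is a simplified stand-in for the pattern/actions structures, and
mock_flow_create(), app_rule_install() and app_rules_offloaded() are
hypothetical names, not part of the proposed API.

```c
#include <assert.h>
#include <stddef.h>

#define MOCK_NB_QUEUES 4 /* assumed number of RX queues in the fake device */
#define MOCK_MAX_FLOWS 8 /* assumed capacity of the fake device */

/* Simplified stand-in for a flow rule specification (pattern / actions). */
struct mock_spec {
	unsigned int ethertype; /* what to match */
	unsigned int queue;     /* where to steer matching packets */
};

/* Fake PMD-side object; the application treats it as an opaque handle. */
struct mock_flow {
	unsigned int queue;
};

/* Fake rule creation: fails on an invalid queue or when the pool is full. */
struct mock_flow *mock_flow_create(const struct mock_spec *spec)
{
	static struct mock_flow pool[MOCK_MAX_FLOWS];
	static size_t used;

	if (spec->queue >= MOCK_NB_QUEUES || used == MOCK_MAX_FLOWS)
		return NULL;
	pool[used].queue = spec->queue;
	return &pool[used++];
}

/* Application-owned entry: layout and storage are the application's call. */
struct app_rule {
	struct mock_spec spec;    /* what the application wants */
	struct mock_flow *handle; /* what it really got; NULL on failure */
	void *app_ctx;            /* room for application-specific context */
};

/* Try to offload one rule; the specification is kept even on failure. */
int app_rule_install(struct app_rule *r)
{
	r->handle = mock_flow_create(&r->spec);
	return r->handle != NULL ? 0 : -1;
}

/* Count the rules that currently have a valid handle. */
size_t app_rules_offloaded(const struct app_rule *rules, size_t n)
{
	size_t i, count = 0;

	for (i = 0; i < n; i++)
		if (rules[i].handle != NULL)
			count++;
	return count;
}
```

An entry whose handle is NULL still carries its specification, so the
application can retry later or process matching traffic in software. A plain
array is used here for brevity; a hash or tree would work equally well, which
is the flexibility a generic RTE-managed list could not offer.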

-- 
Adrien Mazarguil
6WIND