From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wm0-f50.google.com (mail-wm0-f50.google.com [74.125.82.50]) by dpdk.org (Postfix) with ESMTP id AA9342C29 for ; Wed, 3 Aug 2016 18:44:17 +0200 (CEST) Received: by mail-wm0-f50.google.com with SMTP id f65so455232996wmi.0 for ; Wed, 03 Aug 2016 09:44:17 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=6wind-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:mail-followup-to:references :mime-version:content-disposition:in-reply-to; bh=B4AwfHxFy4dB+Re38RBZUF9xYNeoMScu6uKqn221Upg=; b=b/yvPxCPg1JCcJw3Dn4EMP49STHWLSj3i8ONdIcPgElKcvuIw5/Dyzkx+Zk6VV77Nv JZKpluh+GAaDUkh6K84QontOQml/WD/+HBuAJkWiKBQBSTm7bPcpZDQmkx+6OzabF1NX 416shFkZWJMwzVT6eJ9gH2cAOY6iFcyRWFjQssu2fOpnF+mG9WYps6QS0nmxLD22P7Tc 3UOcN5jkx80c4ZU2/Lxf7Lxgj5c23dGS9UcX1Oi2x1bBtseMzWFENQJAF5J6l2Dv3Sga ALY1SO51SOhda6AZitKeTFAeRuqwPZ/INrA4nklfqsM4AWC3RpL2FNXMzmANx/yAyfLO xzlA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:date:from:to:cc:subject:message-id :mail-followup-to:references:mime-version:content-disposition :in-reply-to; bh=B4AwfHxFy4dB+Re38RBZUF9xYNeoMScu6uKqn221Upg=; b=UEWF+b1riVQCTivnTUVA5cDVENYVPZ0+d8FV1wNmJ8K6nsfw1BYbWFCxWCBSFvqu0C dt8ArMtqym8TZwyRDzxT00qXezcmVmz0DbBQkjV1ScdDuw0ffd08kV/riOxD2heObT2F 0FrQj6ISSMmglJCxjleUX+xHdOdyprXhWvZHZxejEZegERdaKlvxuWha/Bo+TFGINom7 RnR+JJD1NW982YjS6l14tifs6GTw47tAqOHSgR904XtdHDEJjW4E5a8FY+HkV7Ou/7b0 mGXZei8nP7V0/WPbKKf7Fl2ogMWQX5k+adzQsZZYwWbNP6HiblXgjheQkyTMnc/N7K4K KBXw== X-Gm-Message-State: AEkoousq5AbA4un7kujGXq4XFrC5X/aBOAMgHprQiCfpur1q2fsJ8Kejfs5xa32OEnoPe92B X-Received: by 10.28.185.202 with SMTP id j193mr26124376wmf.78.1470242657238; Wed, 03 Aug 2016 09:44:17 -0700 (PDT) Received: from 6wind.com (guy78-3-82-239-227-177.fbx.proxad.net. [82.239.227.177]) by smtp.gmail.com with ESMTPSA id o142sm28025311wme.20.2016.08.03.09.44.14 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 03 Aug 2016 09:44:16 -0700 (PDT) Date: Wed, 3 Aug 2016 18:44:11 +0200 From: Adrien Mazarguil To: Rahul Lakkireddy Cc: John Fastabend , dev@dpdk.org, Thomas Monjalon , Helin Zhang , Jingjing Wu , Rasesh Mody , Ajit Khaparde , Wenzhuo Lu , Jan Medala , John Daley , Jing Chen , Konstantin Ananyev , Matej Vido , Alejandro Lucero , Sony Chacko , Jerin Jacob , Pablo de Lara , Olga Shern , Kumar A S , Nirranjan Kirubaharan , Indranil Choudhury Message-ID: <20160803164410.GH3336@6wind.com> Mail-Followup-To: Rahul Lakkireddy , John Fastabend , dev@dpdk.org, Thomas Monjalon , Helin Zhang , Jingjing Wu , Rasesh Mody , Ajit Khaparde , Wenzhuo Lu , Jan Medala , John Daley , Jing Chen , Konstantin Ananyev , Matej Vido , Alejandro Lucero , Sony Chacko , Jerin Jacob , Pablo de Lara , Olga Shern , Kumar A S , Nirranjan Kirubaharan , Indranil Choudhury References: <20160705181646.GO7621@6wind.com> <20160721081335.GA15856@chelsio.com> <20160721170738.GT7621@6wind.com> <20160725113229.GA24036@chelsio.com> <579640E2.50702@gmail.com> <20160726100731.GA2542@chelsio.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20160726100731.GA2542@chelsio.com> Subject: Re: [dpdk-dev] [RFC] Generic flow director/filtering/classification API X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: patches and discussions about DPDK List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 03 Aug 2016 16:44:18 -0000 Replying to everything at once, please see below. On Tue, Jul 26, 2016 at 03:37:35PM +0530, Rahul Lakkireddy wrote: > On Monday, July 07/25/16, 2016 at 09:40:02 -0700, John Fastabend wrote: > > On 16-07-25 04:32 AM, Rahul Lakkireddy wrote: > > > Hi Adrien, > > > > > > On Thursday, July 07/21/16, 2016 at 19:07:38 +0200, Adrien Mazarguil wrote: > > >> Hi Rahul, > > >> > > >> Please see below. > > >> > > >> On Thu, Jul 21, 2016 at 01:43:37PM +0530, Rahul Lakkireddy wrote: > > >>> Hi Adrien, > > >>> > > >>> The proposal looks very good. It satisfies most of the features > > >>> supported by Chelsio NICs. We are looking for suggestions on exposing > > >>> more additional features supported by Chelsio NICs via this API. > > >>> > > >>> Chelsio NICs have two regions in which filters can be placed - > > >>> Maskfull and Maskless regions. As their names imply, maskfull region > > >>> can accept masks to match a range of values; whereas, maskless region > > >>> don't accept any masks and hence perform a more strict exact-matches. > > >>> Filters without masks can also be placed in maskfull region. By > > >>> default, maskless region have higher priority over the maskfull region. > > >>> However, the priority between the two regions is configurable. > > >> > > >> I understand this configuration affects the entire device. Just to be clear, > > >> assuming some filters are already configured, are they affected by a change > > >> of region priority later? > > >> > > > > > > Both the regions exist at the same time in the device. Each filter can > > > either belong to maskfull or the maskless region. > > > > > > The priority is configured at time of filter creation for every > > > individual filter and cannot be changed while the filter is still > > > active. If priority needs to be changed for a particular filter then, > > > it needs to be deleted first and re-created. > > > > Could you model this as two tables and add a table_id to the API? This > > way user space could populate the table it chooses. We would have to add > > some capabilities attributes to "learn" if tables support masks or not > > though. > > > > This approach sounds interesting. Now I understand the idea behind these tables, however from an application point of view I still think it's better if the PMD could take care of flow rules optimizations automatically. Think about it, PMDs have exactly a single kind of device they know perfectly well to manage, while applications want the best possible performance out of any device in the most generic fashion. > > I don't see how the PMD can sort this out in any meaningful way and it > > has to be exposed to the application that has the intelligence to 'know' > > priorities between masks and non-masks filters. I'm sure you could come > > up with something but it would be less than ideal in many cases I would > > guess and we can't have the driver getting priorities wrong or we may > > not get the correct behavior. It may be solved by having the PMD maintain a SW state to quickly know which rules are currently created and in what state the device is so basically the application doesn't have to perform this work. This API allows applications to express basic needs such as "redirect packets matching this pattern to that queue". It must not deal with HW details and limitations in my opinion. If a request cannot be satisfied, then the rule cannot be created. No help from the application must be expected by PMDs, otherwise it opens the door to the same issues as the legacy filtering APIs. [...] > > > Unfortunately, our maskfull region is extremely small too compared to > > > maskless region. > > > > > > > To me this means a userspace application would want to pack it > > carefully to get the full benefit. So you need some mechanism to specify > > the "region" hence the above table proposal. > > > > Right. Makes sense. I do not agree, applications should not be aware of it. Note this case can be handled differently, so that rules do not have to be moved back and forth between both tables. If the first created rule requires a maskfull entry, then all subsequent rules will be entered into that table. Otherwise no maskfull entry can be created as long as there is one maskless entry. When either table is full, no more rules may be added. Would that work for you? > [...] > > >> Now about this "promisc" match criteria, it can be added as a new meta > > >> pattern item (4.1.3 Meta item types). Do you want it to be defined from the > > >> start or add it later with the related code in your PMD? > > >> > > > > > > It could be added as a meta item. If there are other interested > > > parties, it can be added now. Otherwise, we'll add it with our filtering > > > related code. > > > > > > > hmm I guess by "promisc" here you mean match packets received from the > > wire before they have been switched by the silicon? > > > > Match packets received from wire before they have been switched by > silicon, and which also includes packets not destined for DUT and were > still received due to interface being in promisc mode. I think it's fine, but we'll have to precisely define what happens when a packet matched with such pattern is part of a terminating rule. For instance if it is duplicated by HW, then the rule cannot be terminating. [...] > > > This raises another interesting question. What should the PMD do > > > if it has support to only a subset of fields in the particular item? > > > > > > For example, if a rule has been sent to match IP fragmentation along > > > with several other IPv4 fields, and if the underlying hardware doesn't > > > support matching based on IP fragmentation, does the PMD reject the > > > complete rule although it could have done the matching for rest of the > > > IPv4 fields? > > > > I think it has to fail the command other wise user space will not have > > any way to understand that the full match criteria can not be met and > > we will get different behavior for the same applications on different > > nics depending on hardware feature set. This will most likely break > > applications so we need the error IMO. > > > > Ok. Makes sense. Yes, I fully agree with this. > > >>> - Match range of physical ports on the NIC in a single rule via masks. > > >>> For ex: match all UDP packets coming on ports 3 and 4 out of 4 > > >>> ports available on the NIC. > > >> > > >> Applications create flow rules per port, I'd thus suggest that the PMD > > >> should detect identical rules created on different ports and aggregate them > > >> as a single HW rule automatically. > > >> > > >> If you think this approach is not right, the alternative is a meta pattern > > >> item that provides a list of ports. I'm not sure this is the right approach > > >> considering it would most likely not be supported by most NICs. Applications > > >> may not request it explicitly. > > >> > > > > > > Aggregating via PMD will be expensive operation since it would involve: > > > - Search of existing filters. > > > - Deleting those filters. > > > - Creating a single combined filter. > > > > > > And all of above 3 operations would need to be atomic so as not to > > > affect existing traffic which is hitting above filters. Atomicity may not be a problem if the PMD makes sure the new combined rule is inserted before the others, so they do not need to be removed either. > > > Adding a > > > meta item would be a simpler solution here. Yes, clearly. > > For this adding a meta-data item seems simplest to me. And if you want > > to make the default to be only a single port that would maybe make it > > easier for existing apps to port from flow director. Then if an > > application cares it can create a list of ports if needed. > > > > Agreed. However although I'm not opposed to adding dedicated meta items, remember applications will not automatically benefit from the increased performance if a single PMD implements this feature, their maintainers will probably not bother with it. > > >>> - Match range of Physical Functions (PFs) on the NIC in a single rule > > >>> via masks. For ex: match all traffic coming on several PFs. > > >> > > >> The PF and VF pattern items assume there is a single PF associated with a > > >> DPDK port. VFs are identified with an ID. I basically took the same > > >> definitions as the existing filter types, perhaps this is not enough for > > >> Chelsio adapters. > > >> > > >> Do you expose more than one PF for a DPDK port? > > >> > > >> Anyway, I'd suggest the same approach as above, automatic aggregation of > > >> rules for performance reasons, otherwise new or updated PF/VF pattern items, > > >> in which case it would be great if you could provide ideal structure > > >> definitions for this use case. > > >> > > > > > > In Chelsio hardware, all the ports of a device are exposed via single > > > PF4. There could be many VFs attached to a PF. Physical NIC functions > > > are operational on PF4, while VFs can be attached to PFs 0-3. > > > So, Chelsio hardware doesn't remain tied on a PF-to-Port, one-to-one > > > mapping assumption. > > > > > > There already seems to be a PF meta-item, but it doesn't seem to accept > > > any "spec" and "mask" field. Similarly, the VF meta-item doesn't > > > seem to accept a "mask" field. We could probably enable these fields > > > in the PF and VF meta-items to allow configuration. > > > > Maybe a range field would help here as well? So you could specify a VF > > range. It might be one of the things to consider adding later though if > > there is no clear use for it now. > > > > VF-value and VF-mask would help to achieve the desired filter. > VF-mask would also enable to specify a range of VF values. Like John, I think a range or even a list instead of a mask would be better, the PMD can easily create a mask from that if necessary. Reason is that we've always had bad experiences with bit-fields, they're always too short at some point and we would like to avoid having to break the ABI to update existing pattern items later. Also while I don't think this is the case yet, perhaps it will be a good idea for PFs/VFs to have global unique IDs, just like DPDK ports. Thanks. -- Adrien Mazarguil 6WIND