From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pa0-f67.google.com (mail-pa0-f67.google.com [209.85.220.67]) by dpdk.org (Postfix) with ESMTP id 62B225597 for ; Mon, 25 Jul 2016 18:40:34 +0200 (CEST) Received: by mail-pa0-f67.google.com with SMTP id cf3so11696923pad.2 for ; Mon, 25 Jul 2016 09:40:34 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=subject:to:references:cc:from:message-id:date:user-agent :mime-version:in-reply-to:content-transfer-encoding; bh=dzzm16LYp34qG7eEBusEMh8F3uveQ8f9FIXh9TTO+3g=; b=hpXp2UkczlFe251tzrjgCeMqsfNKG75NEie5gwiQhFHYOqIiBod0tvmb0rphoqtm7H gec0Cb8veYCAaFqf2JRQqSsdbdGkWKxU/bmTdNqh6oD5Ra156trMxbyH8ylUEuXvwvC5 saOhL78z2r3m9doSaECwUeWHLJquWs+Z118HLWTXiXBAtgr+C7mqaPJ11ShlzGZCX2l0 ac/rOxopUhTxXGB9E6U1571lX3i9NLbR8RBgzg8wkKvOrcidv+v7pQf4skP/+MKU6RP9 rTi4oHLz7JlYGlbTMaW1K/2isAZloUxOxUgU3bjscUuT1cNgrdr6A/vldMiCFpGOo+a8 H19g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:subject:to:references:cc:from:message-id:date :user-agent:mime-version:in-reply-to:content-transfer-encoding; bh=dzzm16LYp34qG7eEBusEMh8F3uveQ8f9FIXh9TTO+3g=; b=fDqw9OPMtMVCmUj55iZ22EcMSVnytb5iWsKh575yXCeM/4TwKL9jBKAWrQYi6EfMUn ifGOE6GCS/19TSQ1SDq6znFn2GPiirdeFO5/Uyaa4rsKDNW4DtvIIlCl1sH/hxkmyKPr STYGP0jOpGjuDkC/dCMt1KrKbqMfP4R021fNhmPgUo+pmZJ3omZCHOPahGbgnTSGGd6u JjvwOf1seSFzbvcfkdXKKasXUi2FNqKOvkEBC9iHNJD8gjnB5bnXejE1KTY4OdhIZxrY BQg3aHyFuKKCe6s32BVkC5eS0d6W9vItnKlcAxPDvnWnX4sCQgNqWHwS+P+tkofMcswu 5JMA== X-Gm-Message-State: AEkoous/qvanAiBhVFOT3EkYzX1h4GdyLA7kJU7huuwRBE0b3c0s+8PF6qF2WwZBTpE7qg== X-Received: by 10.66.27.243 with SMTP id w19mr30687498pag.112.1469464832729; Mon, 25 Jul 2016 09:40:32 -0700 (PDT) Received: from [192.168.1.6] ([72.168.145.98]) by smtp.googlemail.com with ESMTPSA id ph12sm41534693pab.21.2016.07.25.09.40.10 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Mon, 25 Jul 2016 09:40:32 -0700 (PDT) To: Rahul Lakkireddy , Adrien Mazarguil References: <20160705181646.GO7621@6wind.com> <20160721081335.GA15856@chelsio.com> <20160721170738.GT7621@6wind.com> <20160725113229.GA24036@chelsio.com> Cc: dev@dpdk.org, Thomas Monjalon , Helin Zhang , Jingjing Wu , Rasesh Mody , Ajit Khaparde , Wenzhuo Lu , Jan Medala , John Daley , Jing Chen , Konstantin Ananyev , Matej Vido , Alejandro Lucero , Sony Chacko , Jerin Jacob , Pablo de Lara , Olga Shern , Kumar Sanghvi , Nirranjan Kirubaharan , Indranil Choudhury From: John Fastabend Message-ID: <579640E2.50702@gmail.com> Date: Mon, 25 Jul 2016 09:40:02 -0700 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.6.0 MIME-Version: 1.0 In-Reply-To: <20160725113229.GA24036@chelsio.com> Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: 7bit Subject: Re: [dpdk-dev] [RFC] Generic flow director/filtering/classification API X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: patches and discussions about DPDK List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 25 Jul 2016 16:40:34 -0000 On 16-07-25 04:32 AM, Rahul Lakkireddy wrote: > Hi Adrien, > > On Thursday, July 07/21/16, 2016 at 19:07:38 +0200, Adrien Mazarguil wrote: >> Hi Rahul, >> >> Please see below. >> >> On Thu, Jul 21, 2016 at 01:43:37PM +0530, Rahul Lakkireddy wrote: >>> Hi Adrien, >>> >>> The proposal looks very good. It satisfies most of the features >>> supported by Chelsio NICs. We are looking for suggestions on exposing >>> more additional features supported by Chelsio NICs via this API. >>> >>> Chelsio NICs have two regions in which filters can be placed - >>> Maskfull and Maskless regions. As their names imply, maskfull region >>> can accept masks to match a range of values; whereas, maskless region >>> don't accept any masks and hence perform a more strict exact-matches. >>> Filters without masks can also be placed in maskfull region. By >>> default, maskless region have higher priority over the maskfull region. >>> However, the priority between the two regions is configurable. >> >> I understand this configuration affects the entire device. Just to be clear, >> assuming some filters are already configured, are they affected by a change >> of region priority later? >> > > Both the regions exist at the same time in the device. Each filter can > either belong to maskfull or the maskless region. > > The priority is configured at time of filter creation for every > individual filter and cannot be changed while the filter is still > active. If priority needs to be changed for a particular filter then, > it needs to be deleted first and re-created. Could you model this as two tables and add a table_id to the API? This way user space could populate the table it chooses. We would have to add some capabilities attributes to "learn" if tables support masks or not though. I don't see how the PMD can sort this out in any meaningful way and it has to be exposed to the application that has the intelligence to 'know' priorities between masks and non-masks filters. I'm sure you could come up with something but it would be less than ideal in many cases I would guess and we can't have the driver getting priorities wrong or we may not get the correct behavior. > >>> Please suggest on how we can let the apps configure in which region >>> filters must be placed and set the corresponding priority accordingly >>> via this API. >> >> Okay. Applications, like customers, are always right. >> >> With this in mind, PMDs are not allowed to break existing flow rules, and >> face two options when applications provide a flow specification that would >> break an existing rule: >> >> - Refuse to create it (easiest). >> >> - Find a workaround to apply it anyway (possibly quite complicated). >> >> The reason you have these two regions is performance right? Otherwise I'd >> just say put everything in the maskfull region. >> > > Unfortunately, our maskfull region is extremely small too compared to > maskless region. > To me this means a userspace application would want to pack it carefully to get the full benefit. So you need some mechanism to specify the "region" hence the above table proposal. >> PMDs are allowed to rearrange existing rules or change device parameters as >> long as existing constraints are satisfied. In my opinion it does not matter >> which region has the highest default priority. Applications always want the >> best performance so the first created rule should be in the most appropriate >> region. >> >> If a subsequent rule requires it to be in the other region but the >> application specified the wrong priority for this to work, then the PMD can >> either choose to swap region priorities on the device (assuming it does not >> affect other rules), or destroy and recreate the original rule in a way that >> satisfies all constraints (i.e. moving conflicting rules from the maskless >> region to the maskfull one). >> >> Going further, when subsequent rules get destroyed the PMD should ideally >> move back maskfull rules back into the maskless region for better >> performance. >> >> This is only a suggestion. PMDs have the right to say "no" at some point. >> > > Filter search and deletion are expensive operations and they need to be > atomic in order to not affect existing traffic already hitting the > filters. Also, Chelsio hardware can support upto ~500,000 maskless > region entries. So, the cost of search becomes too high for the PMD > when there are large number of filter entries present. > > In my opinion, rather than PMD deciding on priority by itself, it would > be more simpler if the apps have the flexibility to configure the > priority between the regions by themselves? > Agreed I think this will be a common problem especially as the hardware tables get bigger and support masks where priority becomes important to distinguish between multiple matching rules. +1 for having a priority field. >> More important in my opinion is to make sure applications can create a given >> set of flow rules in any order. If rules a/b/c can be created, then it won't >> make sense from an application point of view if c/a/b for some reason cannot >> and the PMD maintainers will rightfully get a bug report. [...] > >> Now about this "promisc" match criteria, it can be added as a new meta >> pattern item (4.1.3 Meta item types). Do you want it to be defined from the >> start or add it later with the related code in your PMD? >> > > It could be added as a meta item. If there are other interested > parties, it can be added now. Otherwise, we'll add it with our filtering > related code. > hmm I guess by "promisc" here you mean match packets received from the wire before they have been switched by the silicon? >>> - Match FCoE packets. >> >> The "oE" part is covered by the ETH item (using the proper Ethertype), but a >> new pattern item should probably be added for the FC protocol itself. As >> above, I suggest adding it later. >> > > Same here. > >>> - Match IP Fragmented packets. >> >> It seems that I missed quite a few fields in the IPv4 item definition >> (struct rte_flow_item_ipv4). It should be in there. >> > > This raises another interesting question. What should the PMD do > if it has support to only a subset of fields in the particular item? > > For example, if a rule has been sent to match IP fragmentation along > with several other IPv4 fields, and if the underlying hardware doesn't > support matching based on IP fragmentation, does the PMD reject the > complete rule although it could have done the matching for rest of the > IPv4 fields? I think it has to fail the command other wise user space will not have any way to understand that the full match criteria can not be met and we will get different behavior for the same applications on different nics depending on hardware feature set. This will most likely break applications so we need the error IMO. > >>> - Match range of physical ports on the NIC in a single rule via masks. >>> For ex: match all UDP packets coming on ports 3 and 4 out of 4 >>> ports available on the NIC. >> >> Applications create flow rules per port, I'd thus suggest that the PMD >> should detect identical rules created on different ports and aggregate them >> as a single HW rule automatically. >> >> If you think this approach is not right, the alternative is a meta pattern >> item that provides a list of ports. I'm not sure this is the right approach >> considering it would most likely not be supported by most NICs. Applications >> may not request it explicitly. >> > > Aggregating via PMD will be expensive operation since it would involve: > - Search of existing filters. > - Deleting those filters. > - Creating a single combined filter. > > And all of above 3 operations would need to be atomic so as not to > affect existing traffic which is hitting above filters. Adding a > meta item would be a simpler solution here. > For this adding a meta-data item seems simplest to me. And if you want to make the default to be only a single port that would maybe make it easier for existing apps to port from flow director. Then if an application cares it can create a list of ports if needed. >>> - Match range of Physical Functions (PFs) on the NIC in a single rule >>> via masks. For ex: match all traffic coming on several PFs. >> >> The PF and VF pattern items assume there is a single PF associated with a >> DPDK port. VFs are identified with an ID. I basically took the same >> definitions as the existing filter types, perhaps this is not enough for >> Chelsio adapters. >> >> Do you expose more than one PF for a DPDK port? >> >> Anyway, I'd suggest the same approach as above, automatic aggregation of >> rules for performance reasons, otherwise new or updated PF/VF pattern items, >> in which case it would be great if you could provide ideal structure >> definitions for this use case. >> > > In Chelsio hardware, all the ports of a device are exposed via single > PF4. There could be many VFs attached to a PF. Physical NIC functions > are operational on PF4, while VFs can be attached to PFs 0-3. > So, Chelsio hardware doesn't remain tied on a PF-to-Port, one-to-one > mapping assumption. > > There already seems to be a PF meta-item, but it doesn't seem to accept > any "spec" and "mask" field. Similarly, the VF meta-item doesn't > seem to accept a "mask" field. We could probably enable these fields > in the PF and VF meta-items to allow configuration. Maybe a range field would help here as well? So you could specify a VF range. It might be one of the things to consider adding later though if there is no clear use for it now. > >>> Please suggest on how we can expose the above features to DPDK apps via >>> this API. >>> [...] Thanks, John