From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <john.fastabend@gmail.com>
Received: from mail-pa0-f43.google.com (mail-pa0-f43.google.com
 [209.85.220.43]) by dpdk.org (Postfix) with ESMTP id 8506E5913
 for <dev@dpdk.org>; Wed, 10 Aug 2016 18:36:01 +0200 (CEST)
Received: by mail-pa0-f43.google.com with SMTP id ti13so17334865pac.0
 for <dev@dpdk.org>; Wed, 10 Aug 2016 09:36:01 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=subject:to:references:from:message-id:date:user-agent:mime-version
 :in-reply-to:content-transfer-encoding;
 bh=kGFYnkZ+B4doXp+4kiQRmp/rEmVD4xE7v02Gs9qF0fE=;
 b=RfEYff1RuaNLzKg0pawWuYL3LY5+JH3/yKqs1Oe/NrLA3XH1ZYEyZV7wryf4PO+jtV
 497Th71I8FWiMqHEOi7SG+kFNwnVp4bFTm24rRNbdEcSVN/AlUyBf4KLnXgnCiC71y0L
 LXUIlVv8oJmh91i7wDNgpVVz0ucCi4RccnAGlezlazQYgjwtIsGJ5Qw+YkOKzbWvkS+e
 2ZM6CGaMKE3EZJXU8WqSqEzAV+2FjmdsFf69pJayVXdtHdkjD7s3RbTBA084UC/W/jm5
 19VFQbaurjpMh6jS0DKjRb/L+IfME5UQFlNI+FGDBP7pptz1k5xecd6M0zuuI8kQ+ORP
 vQKg==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20130820;
 h=x-gm-message-state:subject:to:references:from:message-id:date
 :user-agent:mime-version:in-reply-to:content-transfer-encoding;
 bh=kGFYnkZ+B4doXp+4kiQRmp/rEmVD4xE7v02Gs9qF0fE=;
 b=V15COTIWIBB2yiAzl5/hd0B+aan0eMIa4MFW+5uR2M3y0gxu0C1H0GpZK3JsnGRlRt
 vr07aKiVHFKraY0vS/ITYNBfmvZZA0RX3qZpWCr32kvYgwiQ+g9QCdpJgSEhfEX6bUP3
 XYM+DZGraWkgPm8+BebSU8Rq8V/IHdZJEtRoeLNnieeeRklSO87TPLmaHR7N+GJuPOmR
 qAS/Hr5m4G9s8S8WuWDVKsIR/Pn5ufy1VeaIXWaUx0IiTbmz8QjgRnFU+k4diSkYCSVn
 WCAkPtAqLy5js/jGtgG+l6Zd4Y7NAcTTfiIHYCi/q40cJD0S2gQJYOlbi/er1Mq7FvNF
 DMTA==
X-Gm-Message-State: AEkoouvmGhXnQmp1u1WscNxYPyJSOB3pF4JQwQrfiPnsuFaHIPPGxzrPlZDoULCFr8DkTw==
X-Received: by 10.66.82.42 with SMTP id f10mr8749983pay.17.1470846960591;
 Wed, 10 Aug 2016 09:36:00 -0700 (PDT)
Received: from [192.168.1.6] ([72.168.145.191])
 by smtp.googlemail.com with ESMTPSA id
 x126sm65226113pfx.61.2016.08.10.09.35.41
 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);
 Wed, 10 Aug 2016 09:35:59 -0700 (PDT)
To: Jerin Jacob <jerin.jacob@caviumnetworks.com>, dev@dpdk.org,
 Thomas Monjalon <thomas.monjalon@6wind.com>,
 Helin Zhang <helin.zhang@intel.com>, Jingjing Wu <jingjing.wu@intel.com>,
 Rasesh Mody <rasesh.mody@qlogic.com>,
 Ajit Khaparde <ajit.khaparde@broadcom.com>,
 Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>,
 Wenzhuo Lu <wenzhuo.lu@intel.com>, Jan Medala <jan@semihalf.com>,
 John Daley <johndale@cisco.com>, Jing Chen <jing.d.chen@intel.com>,
 Konstantin Ananyev <konstantin.ananyev@intel.com>,
 Matej Vido <matejvido@gmail.com>,
 Alejandro Lucero <alejandro.lucero@netronome.com>,
 Sony Chacko <sony.chacko@qlogic.com>,
 Pablo de Lara <pablo.de.lara.guarch@intel.com>,
 Olga Shern <olgas@mellanox.com>
References: <20160705181646.GO7621@6wind.com>
 <20160711104141.GA10172@localhost.localdomain>
 <20160721192023.GU7621@6wind.com> <5793DD3E.3080605@gmail.com>
 <57A0E423.2030804@gmail.com> <20160803143049.GF3336@6wind.com>
 <57A233A9.3000006@gmail.com> <20160804130528.GM3336@6wind.com>
 <57AA4A0A.8060809@gmail.com> <20160810110236.GA3336@6wind.com>
From: John Fastabend <john.fastabend@gmail.com>
Message-ID: <57AB57D8.7070507@gmail.com>
Date: Wed, 10 Aug 2016 09:35:36 -0700
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101
 Thunderbird/38.6.0
MIME-Version: 1.0
In-Reply-To: <20160810110236.GA3336@6wind.com>
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit
Subject: Re: [dpdk-dev] [RFC] Generic flow director/filtering/classification
 API
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: patches and discussions about DPDK <dev.dpdk.org>
List-Unsubscribe: <http://dpdk.org/ml/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://dpdk.org/ml/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <http://dpdk.org/ml/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
X-List-Received-Date: Wed, 10 Aug 2016 16:36:01 -0000

On 16-08-10 04:02 AM, Adrien Mazarguil wrote:
> On Tue, Aug 09, 2016 at 02:24:26PM -0700, John Fastabend wrote:
> [...]
>>> Just an idea, could some kind of meta pattern items specifying time
>>> constraints for a rule address this issue? Say, how long (cycles/ms) the PMD
>>> may take to query/apply/delete the rule. If it cannot be guaranteed, the
>>> rule cannot be created. Applications could maintain statistics counters about
>>> failed rules to determine if performance issues are caused by the inability
>>> to create them.
>>
>> It seems a bit heavy to me to have each PMD driver implementing
>> something like this. But it would be interesting to explore probably
>> after the basic support is implemented though.
> 
> OK, let's keep this for later.
> 
> [...]
>>> Such a pattern could be generated from a separate function before feeding it
>>> to rte_flow_create(), or translated by the PMD afterwards assuming a
>>> separate meta item such as RAW_END exists to signal the end of a RAW layer.
>>> Of course processing this would be more expensive.
>>>
>>
>> Or the supported parse graph could be fetched from the hardware with the
>> values for each protocol so that the programming interface is the same.
>> The well known protocols could keep the 'enum values' in the header
>> rte_flow_item_type enum so that users would not be required to do
>> the parse graph but for new or experimental protocols we could query
>> the parse graph and get the programming pattern matching id for them.
>>
>> The normal flow would be unchanged but we don't get stuck upgrading
>> everything to add our own protocol. So the flow would be,
>>
>>  rte_get_parse_graph(graph);
>>  flow_item_proto = is_my_proto_supported(graph);
>>
>>  pattern = build_flow_match(flow_item_proto, value, mask);
>>  action = build_action();
>>  rte_flow_create(my_port, pattern, action);
>>
>> The only change to the API proposed to support this would be to allow
>> unsupported RTE_FLOW_ values to be pushed to the hardware and define
>> a range of values that are reserved for use by the parse graph discovery.
>>
>> This would not have to be any more expensive.
> 
> Makes sense. Unless made entirely out of RAW items however the ensuing
> pattern would not be portable across DPDK ports, instances and versions if
> dumped in binary form for later use.
> 

Right.

> Since those would have to be recognized by PMDs and applications regardless of
> the API version, I suggest making generated item types negative (enums are
> signed, let's use that).

That works. The normal positive enums then maintain the list of
known/accepted protocols.

> 
> DPDK would have to maintain a list of expended values to avoid collisions
> between PMDs. A method should be provided to release them.

The 'middle layer' could have a non-public API for PMDs to get new
values; call it get_flow_type_item_id() or something.
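
As a rough sketch of how that could look (nothing here is in the RFC; the
signature of get_flow_type_item_id(), the put_flow_type_item_id() release
call, DYN_ITEM_MAX and the pool layout are all made up for illustration),
the middle layer would hand out negative ids and take them back on release:

```c
#include <stdbool.h>
#include <stddef.h>

#define DYN_ITEM_MAX 64	/* arbitrary pool size for this sketch */

/* dyn_used[i] tracks whether id -(i + 1) is currently handed out. */
static bool dyn_used[DYN_ITEM_MAX];

/* Well-known item types stay >= 0; generated ones are negative. */
static bool flow_item_type_is_generated(int type)
{
	return type < 0;
}

/* Hypothetical allocator: returns a free negative id, or 0 if none left. */
static int get_flow_type_item_id(void)
{
	for (size_t i = 0; i < DYN_ITEM_MAX; i++) {
		if (!dyn_used[i]) {
			dyn_used[i] = true;
			return -(int)(i + 1);
		}
	}
	return 0;
}

/* Hypothetical release: returns 0 on success, -1 if id was not allocated. */
static int put_flow_type_item_id(int id)
{
	if (id >= 0 || id < -(DYN_ITEM_MAX))
		return -1;

	size_t i = (size_t)(-id - 1);

	if (!dyn_used[i])
		return -1;
	dyn_used[i] = false;
	return 0;
}
```

A PMD would call get_flow_type_item_id() once per discovered protocol and
release the id when it goes away. A real version would also need locking
and some way to tie an id to the port that requested it.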

> 
> [...]
>> hmm for performance reasons building an entire graph up using RAW items
>> seems to be a bit heavy. Another alternative to the above parse graph
>> notion would be to allow users to add RAW node definitions at init time
>> and have the PMD give an ID back for those. Then the new node could be
>> used just like any other RTE_FLOW_ITEM_TYPE in a pattern.
>>
>> Something like,
>>
>> 	ret_flow_item_type_foo = rte_create_raw_node(foo_raw_pattern);
>> 	ret_flow_item_type_bar = rte_create_raw_node(bar_raw_pattern);
>>
>> then allow ret_flow_item_type_{foo|bar} to be used in subsequent
>> pattern matching items. And if the hardware can not support this return
>> an error from the initial rte_create_raw_node() API call.
>>
>> Do either of those proposals sound like reasonable extensions?
> 
> Both seem acceptable in my opinion as they fit in the described API. However
> I think it would be better for this function to spit out a pattern made of
> any number of items instead of a single new item type. That way, existing
> fixed items can be reused as well, the entire pattern may even become
> portable as a result, it could be considered as a method to optimize a
> RAW pattern.
> 
> The RAW approach has the advantage of not requiring much additional code in
> the public API besides a couple of function declarations. A proper full
> blown graph would require a lot more as described in your original
> reply. Not sure which is better.
> 
> Either way they won't be part of the initial specification but it looks like
> they can be added later without affecting the basics.
> 

Right, it's not needed in the initial spec as long as we have a path to
get there, and it looks like we have two usable possibilities, so that
works for me.


>>> [...]
>>>> The two open items from me are: do we need to support adding new variable
>>>> length headers? And how do we handle multiple tables? I'll take that up
>>>> in the other thread.
>>>
>>> I think variable length headers may eventually be supported through pattern
>>> tricks or a separate conversion layer.
>>>
>>
>> A parse graph notion would support this naturally though without pattern
>> tricks hence my above suggestions.
> 
> All right, I agree a method to let applications precisely define what they
> want to match can be useful, now that I understand what you mean by
> "dynamically".
> 
>> Also in the current scheme how would I match an ipv6 option or specific
>> nsh option or mpls tag?
> 
> Ideally through specific pattern items defined for this purpose, which is
> how I thought the API would evolve. Of course it wouldn't be fully dynamic
> and you'd have to wait for a DPDK release that implements them.
> 

The only trouble is that if you don't know exactly where the option is in
the list of options (which you won't in general), it's a bit hard to get
right with the existing spec, as best I can tell. RAW patterns would
require you to know where the option is in the list, and an ANY pattern
wouldn't guarantee that the match lands in the correct header when headers
are stacked. At least, if I'm reading the spec correctly, it seems to be
an issue.
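
To make the concern concrete, here is a small sketch of what finding an
option actually involves. It assumes the IPv6-style option encoding (one
byte of type, one byte of length, then data, with Pad1 being a lone zero
byte); find_tlv_option() is a made-up helper, not part of any proposed API.
The offset of the wanted option depends on everything that precedes it, so
a fixed-offset RAW match cannot express it:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Walk a list of TLV options and return the offset of the first
 * well-formed option of the wanted type, or -1 if it is absent or the
 * list is truncated.
 */
static int find_tlv_option(const uint8_t *opts, size_t len, uint8_t wanted)
{
	size_t off = 0;

	while (off < len) {
		uint8_t type = opts[off];

		if (type == 0) {	/* Pad1: single byte, no length field */
			if (wanted == 0)
				return (int)off;
			off++;
			continue;
		}
		if (off + 2 > len)	/* no room for the length byte */
			return -1;

		size_t optlen = opts[off + 1];

		if (off + 2 + optlen > len)	/* data runs past the buffer */
			return -1;
		if (type == wanted)
			return (int)off;
		off += 2 + optlen;	/* skip type, length and data */
	}
	return -1;
}
```

The same option ends up at offset 0 in one packet and offset 4 in another
depending on padding and ordering, which is the case a parse graph node
would handle without the application knowing the position up front.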

.John