From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mga01.intel.com (mga01.intel.com [192.55.52.88]) by dpdk.org (Postfix) with ESMTP id 4F1DE2C6D for ; Thu, 18 May 2017 14:12:40 +0200 (CEST) Received: from orsmga004.jf.intel.com ([10.7.209.38]) by fmsmga101.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 18 May 2017 05:12:40 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.38,359,1491289200"; d="scan'208";a="89007348" Received: from fyigit-mobl1.ger.corp.intel.com (HELO [10.237.220.81]) ([10.237.220.81]) by orsmga004.jf.intel.com with ESMTP; 18 May 2017 05:12:37 -0700 To: "Ananyev, Konstantin" , "dev@dpdk.org" Cc: "Mcnamara, John" , "Tahhan, Maryam" References: <20170420185448.19162-1-ferruh.yigit@intel.com> <20170420185448.19162-2-ferruh.yigit@intel.com> <2601191342CEEE43887BDE71AB9772583FAF774A@IRSMSX109.ger.corp.intel.com> <23ee1b56-ded8-024d-9668-9a4b0b830130@intel.com> <2601191342CEEE43887BDE71AB9772583FAF77CA@IRSMSX109.ger.corp.intel.com> From: Ferruh Yigit Message-ID: Date: Thu, 18 May 2017 13:12:36 +0100 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:52.0) Gecko/20100101 Thunderbird/52.1.1 MIME-Version: 1.0 In-Reply-To: <2601191342CEEE43887BDE71AB9772583FAF77CA@IRSMSX109.ger.corp.intel.com> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 8bit Subject: Re: [dpdk-dev] [RFC 17.08] flow_classify: add librte_flow_classify library X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 18 May 2017 12:12:43 -0000 On 5/17/2017 5:10 PM, Ananyev, Konstantin wrote: >>> Hi Ferruh, >>> Please see my comments/questions below. >>> Thanks >>> Konstantin >>> >>>> + >>>> +/** >>>> + * @file >>>> + * >>>> + * RTE Flow Classify Library >>>> + * >>>> + * This library provides flow record information with some measured properties. >>>> + * >>>> + * Application can select variety of flow types based on various flow keys. >>>> + * >>>> + * Library only maintains flow records between rte_flow_classify_stats_get() >>>> + * calls and with a maximum limit. >>>> + * >>>> + * Provided flow record will be linked list rte_flow_classify_stat_xxx >>>> + * structure. >>>> + * >>>> + * Library is responsible from allocating and freeing memory for flow record >>>> + * table. Previous table freed with next rte_flow_classify_stats_get() call and >>>> + * all tables are freed with rte_flow_classify_type_reset() or >>>> + * rte_flow_classify_type_set(x, 0). Memory for table allocated on the fly while >>>> + * creating records. >>>> + * >>>> + * A rte_flow_classify_type_set() with a valid type will register Rx/Tx >>>> + * callbacks and start filling flow record table. >>>> + * With rte_flow_classify_stats_get(), pointer sent to caller and meanwhile >>>> + * library continues collecting records. >>>> + * >>>> + * Usage: >>>> + * - application calls rte_flow_classify_type_set() for a device >>>> + * - library creates Rx/Tx callbacks for packets and start filling flow table >>> >>> Does it necessary to use an RX callback here? >>> Can library provide an API like collect(port_id, input_mbuf[], pkt_num) instead? >>> So the user would have a choice either setup a callback or call collect() directly. >> >> This was also comment from Morten, I will update RFC to use direct API call. >> >>> >>>> + * for that type of flow (currently only one flow type supported) >>>> + * - application calls rte_flow_classify_stats_get() to get pointer to linked >>>> + * listed flow table. Library assigns this pointer to another value and keeps >>>> + * collecting flow data. In next rte_flow_classify_stats_get(), library first >>>> + * free the previous table, and pass current table to the application, keep >>>> + * collecting data. >>> >>> Ok, but that means that you can't use stats_get() for the same type >>> from 2 different threads without explicit synchronization? >> >> Correct. >> And multiple threads shouldn't be calling this API. It doesn't store >> previous flow data, so multiple threads calling this only can have piece >> of information. Do you see any use case that multiple threads can call >> this API? > > One example would be when you have multiple queues per port, > managed/monitored by different cores. > BTW, how are you going to collect the stats in that way? > >> >>> >>>> + * - application calls rte_flow_classify_type_reset(), library unregisters the >>>> + * callbacks and free all flow table data. >>>> + * >>>> + */ >>>> + >>>> +enum rte_flow_classify_type { >>>> + RTE_FLOW_CLASSIFY_TYPE_GENERIC = (1 << 0), >>>> + RTE_FLOW_CLASSIFY_TYPE_MAX, >>>> +}; >>>> + >>>> +#define RTE_FLOW_CLASSIFY_TYPE_MASK = (((RTE_FLOW_CLASSIFY_TYPE_MAX - 1) << 1) - 1) >>>> + >>>> +/** >>>> + * Global configuration struct >>>> + */ >>>> +struct rte_flow_classify_config { >>>> + uint32_t type; /* bitwise enum rte_flow_classify_type values */ >>>> + void *flow_table_prev; >>>> + uint32_t flow_table_prev_item_count; >>>> + void *flow_table_current; >>>> + uint32_t flow_table_current_item_count; >>>> +} rte_flow_classify_config[RTE_MAX_ETHPORTS]; >>>> + >>>> +#define RTE_FLOW_CLASSIFY_STAT_MAX UINT16_MAX >>>> + >>>> +/** >>>> + * Classification stats data struct >>>> + */ >>>> +struct rte_flow_classify_stat_generic { >>>> + struct rte_flow_classify_stat_generic *next; >>>> + uint32_t id; >>>> + uint64_t timestamp; >>>> + >>>> + struct ether_addr src_mac; >>>> + struct ether_addr dst_mac; >>>> + uint32_t src_ipv4; >>>> + uint32_t dst_ipv4; >>>> + uint8_t l3_protocol_id; >>>> + uint16_t src_port; >>>> + uint16_t dst_port; >>>> + >>>> + uint64_t packet_count; >>>> + uint64_t packet_size; /* bytes */ >>>> +}; >>> >>> Ok, so if I understood things right, for generic type it will always classify all incoming packets by: >>> >>> all by absolute values, and represent results as a linked list. >>> Is that correct, or I misunderstood your intentions here? >> >> Correct. >> >>> If so, then I see several disadvantages here: >>> 1) It is really hard to predict what kind of stats is required for that particular cases. >>> Let say some people would like to collect stat by , >>> another by , third ones by and so on. >>> Having just one hardcoded filter doesn't seem very felxable/usable. >>> I think you need to find a way to allow user to define what type of filter they want to apply. >> >> The flow type should be provided by applications, according their needs, >> and needs to be implemented in this library. The generic one will be the >> only one implemented in first version: >> enum rte_flow_classify_type { >> RTE_FLOW_CLASSIFY_TYPE_GENERIC = (1 << 0), >> RTE_FLOW_CLASSIFY_TYPE_MAX, >> }; >> >> >> App should set the type first via the API: >> rte_flow_classify_type_set(uint8_t port_id, uint32_t type); >> >> >> And the stats for this type will be returned, because returned type can >> be different type of struct, returned as void: >> rte_flow_classify_stats_get(uint8_t port_id, void *stats); > > I understand that, but it means that for every different filter user wants to use, > someone has to update the library: define a new type and write a new piece of code to handle it. > That seems not flexible and totally un-extendable from user perspective. > Even HW allows some flexibility with RX filters. > Why not allow user to specify a classification filter he/she wants for that particular case? > In a way both rte_flow and rte_acl work? > >> >>> I think it was discussed already, but I still wonder why rte_flow_item can't be used for that approach? >> >> >>> 2) Even one 10G port can produce you ~14M rte_flow_classify_stat_generic entries in one second >>> (all packets have different ipv4/ports or so). >>> Accessing/retrieving items over linked list with 14M entries - doesn't sound like a good idea. >>> I'd say we need some better way to retrieve/present collected data. >> >> This is to keep flows, so I expect the numbers will be less comparing to >> the packet numbers. > > That was an extreme example to show how bad the selected approach should behave. > What I am trying to say: we need a way to collect and retrieve stats in a quick and easy way. > Let say right now user invoked stats_get(port=0, type=generic). > Now, he is interested to get stats for particular dst_ip only. > The only way to get it: walk over whole list stats_get() returned and examine each entry one by one. > > I think would be much better to have something like: > > struct rte_flow_stats {timestamp; packet_count; packet_bytes; ..}; > > > > filter_id = rte_flow_stats_register(.., &rte_flow_item); > .... > struct rte_flow_stats stats; > rte_flow_stats_get(..., filter_id, &stats); > > That allows user to define flows to collect stats for. > Again in that case you don't need to worry about when/where to destroy the previous > version of your stats. Except from using rte_flow, above suggest instead of: - set key/filter - poll collect() - when ever app wants call stats_get() using: - poll stats_get(key/filter); specially after switched from callbacks to polling, this makes sense because application already will have to do to continuous calls to this library. Merging set filter/collect/stats_get into same function saves library from storing/deleting stats until app asks for them, as you mentioned above. So, I will update RFC according. > Of course the open question is how to treat packets that would match more than one flow > (priority/insertion order/something else?), but I suppose we'll need to deal with that question anyway. > > Konstantin > >> It is possible to use fixed size arrays for this. But I think it is easy >> to make this switch later, I would like to see the performance effect >> before doing this switch. Do you think is it OK to start like this and >> give that decision during implementation?