From: Ori Kam
To: Slava Ovsiienko, "dev@dpdk.org"
CC: Raslan Darawsheh, Matan Azrad, Shahaf Shuler, Gregory Etelson, NBU-Contact-Thomas Monjalon
Date: Thu, 7 Oct 2021 11:08:23 +0000
References: <20210922180418.20663-1-viacheslavo@nvidia.com> <20211001193415.23288-1-viacheslavo@nvidia.com> <20211001193415.23288-2-viacheslavo@nvidia.com>
In-Reply-To: <20211001193415.23288-2-viacheslavo@nvidia.com>
Subject: Re: [dpdk-dev] [PATCH v2 01/14] ethdev: introduce configurable flexible item

Hi Slava,

> -----Original Message-----
> From: Slava Ovsiienko
> Sent: Friday, October 1, 2021 10:34 PM
> Subject: [PATCH v2 01/14] ethdev: introduce configurable flexible item
>
> 1. Introduction and Retrospective
>
> Nowadays the networks are evolving fast and wide, the network structures are
> getting more and more complicated, the new application areas are emerging.
> To address these challenges the new network protocols are continuously being
> developed, considered by technical communities, adopted by industry and,
> eventually implemented in hardware and software. The DPDK framework
> follows the common trends and if we bother to glance at the RTE Flow API
> header we see the multiple new items were introduced during the last years
> since the initial release.
>
> The new protocol adoption and implementation process is not straightforward
> and takes time, the new protocol passes development, consideration,
> adoption, and implementation phases. The industry tries to mitigate and
> address the forthcoming network protocols, for example, many hardware
> vendors are implementing flexible and configurable network protocol parsers.
> As DPDK developers, could we anticipate the near future in the same fashion
> and introduce the similar flexibility in RTE Flow API?
>
> Let's check what we already have merged in our project, and we see the nice
> raw item (rte_flow_item_raw).
> At the first glance, it looks superior and we can
> try to implement a flow matching on the header of some relatively new tunnel
> protocol, say on the GENEVE header with variable length options. And, under
> further consideration, we run into the raw item
> limitations:
>
> - only fixed size network header can be represented
> - the entire network header pattern of fixed format
>   (header field offsets are fixed) must be provided
> - the search for patterns is not robust (the wrong matches
>   might be triggered), and actually is not supported
>   by existing PMDs
> - no explicitly specified relations with preceding
>   and following items
> - no tunnel hint support
>
> As the result, implementing the support for tunnel protocols like
> aforementioned GENEVE with variable extra protocol option with flow raw
> item becomes very complicated and would require multiple flows and
> multiple raw items chained in the same flow (by the way, there is no support
> found for chained raw items in implemented drivers).
>
> This RFC introduces the dedicated flex item (rte_flow_item_flex) to handle
> matches with existing and new network protocol headers in a unified fashion.
>
> 2. Flex Item Life Cycle
>
> Let's assume there are the requirements to support the new network protocol
> with RTE Flows. What is given within protocol
> specification:
>
> - header format
> - header length, (can be variable, depending on options)
> - potential presence of extra options following or included
>   in the header the header
> - the relations with preceding protocols. For example,
>   the GENEVE follows UDP, eCPRI can follow either UDP
>   or L2 header
> - the relations with following protocols.
>   For example,
>   the next layer after tunnel header can be L2 or L3
> - whether the new protocol is a tunnel and the header
>   is a splitting point between outer and inner layers
>
> The supposed way to operate with flex item:
>
> - application defines the header structures according to
>   protocol specification
>
> - application calls rte_flow_flex_item_create() with desired
>   configuration according to the protocol specification, it
>   creates the flex item object over specified ethernet device
>   and prepares PMD and underlying hardware to handle flex
>   item. On item creation call PMD backing the specified
>   ethernet device returns the opaque handle identifying
>   the object have been created
>
> - application uses the rte_flow_item_flex with obtained handle
>   in the flows, the values/masks to match with fields in the
>   header are specified in the flex item per flow as for regular
>   items (except that pattern buffer combines all fields)
>
> - flows with flex items match with packets in a regular fashion,
>   the values and masks for the new protocol header match are
>   taken from the flex items in the flows
>
> - application destroys flows with flex items
>
> - application calls rte_flow_flex_item_release() as part of
>   ethernet device API and destroys the flex item object in
>   PMD and releases the engaged hardware resources
>
> 3. Flex Item Structure
>
> The flex item structure is intended to be used as part of the flow pattern like
> regular RTE flow items and provides the mask and value to match with fields of
> the protocol item was configured for.
>
>   struct rte_flow_item_flex {
>     void *handle;
>     uint32_t length;
>     const uint8_t* pattern;
>   };
>
> The handle is some opaque object maintained on per device basis by
> underlying driver.
>
> The protocol header fields are considered as bit fields, all offsets and widths
> are expressed in bits.
> The pattern is the buffer containing the bit
> concatenation of all the fields presented at item configuration time, in the
> same order and same amount. If byte boundary alignment is needed an
> application can use a dummy type field, this is just some kind of gap filler.
>
> The length field specifies the pattern buffer length in bytes and is needed to
> allow rte_flow_copy() operations. The approach of multiple pattern pointers
> and lengths (per field) was considered and found clumsy - it seems to be much
> suitable for the application to maintain the single structure within the single
> pattern buffer.
>

The main thing that was unclear to me (I think I understand it now from
reading the code) is that the pattern is the entire flex header structure.
Maybe a better word would be "header"?
At first I thought that you should only give the matchable fields.
Also, you say everything is in bits and then suddenly you are talking in bytes.

> 4. Flex Item Configuration
>
> The flex item configuration consists of the following parts:
>
> - header field descriptors:
>   - next header
>   - next protocol
>   - sample to match
> - input link descriptors
> - output link descriptors
>
> The field descriptors tell driver and hardware what data should be extracted
> from the packet and then presented to match in the flows. Each field is a bit
> pattern. It has width, offset from the header beginning, mode of offset
> calculation, and offset related parameters.
>

I'm not sure your indentation is correct for the next header, next protocol,
sample to match.
Reading the first line suggests that all of these fields are going to be matched,
while in the following sections only the sample fields are matchable.

> The next header field is special, no data are actually taken from the packet,
> but its offset is used as pointer to the next header in the packet, in other word
> the next header offset specifies the size of the header being parsed by flex
> item.
>

So should the next header field be named len?

> There is one more special field - next protocol, it specifies where the next
> protocol identifier is contained and packet data sampled from this field will be
> used to determine the next protocol header type to continue packet parsing.
> The next protocol field is like eth_type field in MAC2, or proto field in IPv4/v6
> headers.
>
> The sample fields are used to represent the data be sampled from the packet
> and then matched with established flows.

Should this be samples?

>
> There are several methods supposed to calculate field offset in runtime
> depending on configuration and packet content:
>
> - FIELD_MODE_FIXED - fixed offset. The bit offset from
>   header beginning is permanent and defined by field_base
>   configuration parameter.
>
> - FIELD_MODE_OFFSET - the field bit offset is extracted
>   from other header field (indirect offset field). The
>   resulting field offset to match is calculated from as:
>
>     field_base + (*field_offset & offset_mask) << field_shift
>

Not all of those field names are defined later in this patch, and I'm not
sure what they mean. Does * mean "take the value that is in field_offset"?
How do we know the width of the field (by the value of the mask)?

>   This mode is useful to sample some extra options following
>   the main header with field containing main header length.
>   Also, this mode can be used to calculate offset to the
>   next protocol header, for example - IPv4 header contains
>   the 4-bit field with IPv4 header length expressed in dwords.
>   One more example - this mode would allow us to skip GENEVE
>   header variable length options.
>
> - FIELD_MODE_BITMASK - the field bit offset is extracted
>   from other header field (indirect offset field), the latter
>   is considered as bitmask containing some number of one bits,
>   the resulting field offset to match is calculated as:
>
>     field_base + bitcount(*field_offset & offset_mask) << field_shift

Same comment as above: you are using names that are not defined later.

>
>   This mode would be useful to skip the GTP header and its
>   extra options with specified flags.
>
> - FIELD_MODE_DUMMY - dummy field, optionally used for byte
>   boundary alignment in pattern. Pattern mask and data are
>   ignored in the match. All configuration parameters besides
>   field size and offset are ignored.
>
> The offset mode list can be extended by vendors according to hardware
> supported options.
>
> The input link configuration section tells the driver after what protocols and at
> what conditions the flex item can follow.
> Input link specified the preceding header pattern, for example for GENEVE it
> can be UDP item specifying match on destination port with value 6081. The
> flex item can follow multiple header types and multiple input links should be
> specified. At flow creation type the item with one of input link types should
> precede the flex item and driver will select the correct flex item settings,
> depending on actual flow pattern.
>
> The output link configuration section tells the driver how to continue packet
> parsing after the flex item protocol.
> If multiple protocols can follow the flex item header the flex item should
> contain the field with next protocol identifier, and the parsing will be
> continued depending on the data contained in this field in the actual packet.
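
To make the questions about the offset formulas concrete, here is a minimal
sketch of how the two indirect modes (FIELD_MODE_OFFSET and FIELD_MODE_BITMASK)
could be read. It is only an interpretation, not code from the patch. It
assumes that "*field_offset" means the raw value sampled from the packet at the
indirect offset field, that "field_shift" in the formulas is the offset_shift
member, and that the shift applies to the masked value only. The last point
matters because, read literally as a C expression, "a + b << c" parses as
"(a + b) << c". The helper names bitcount32, flex_offset_indirect and
flex_offset_bitmask are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Count the one bits in v (portable loop, no compiler builtins assumed). */
static uint32_t bitcount32(uint32_t v)
{
	uint32_t n = 0;

	while (v != 0) {
		v &= v - 1; /* clear the lowest set bit */
		n++;
	}
	return n;
}

/*
 * FIELD_MODE_OFFSET, read as:
 *   field_base + ((raw & offset_mask) << offset_shift)
 * "raw" is the value sampled at the indirect offset field; this reading of
 * "*field_offset" is an assumption, not something the patch states.
 */
static int32_t flex_offset_indirect(int32_t field_base, uint32_t raw,
				    uint32_t offset_mask, uint32_t offset_shift)
{
	assert(offset_shift < 32);
	return field_base + (int32_t)((raw & offset_mask) << offset_shift);
}

/*
 * FIELD_MODE_BITMASK, read the same way, except that the masked value
 * contributes the number of its one bits instead of its numeric value.
 */
static int32_t flex_offset_bitmask(int32_t field_base, uint32_t raw,
				   uint32_t offset_mask, uint32_t offset_shift)
{
	assert(offset_shift < 32);
	return field_base +
	       (int32_t)(bitcount32(raw & offset_mask) << offset_shift);
}
```

With this reading, a 4-bit length field holding 5 dwords (for example the low
nibble of 0x45) gives 160 with a shift of 5, which is a bit offset, and 20
with a shift of 2, which is a byte offset. That is the same bits-versus-bytes
ambiguity raised earlier in this review.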
>
> The flex item fields can participate in RSS hash calculation, the dedicated flag
> is present in field description to specify what fields should be provided for
> hashing.
>
> 5. Flex Item Chaining
>
> If there are multiple protocols supposed to be supported with flex items in
> chained fashion - two or more flex items within the same flow and these ones
> might be neighbors in pattern - it means the flex items are mutual referencing.
> In this case, the item that occurred first should be created with empty output
> link list or with the list including existing items, and then the second flex item
> should be created referencing the first flex item as input arc.
>

And then, I assume, we should update the output list.

> Also, the hardware resources used by flex items to handle the packet can be
> limited. If there are multiple flex items that are supposed to be used within the
> same flow it would be nice to provide some hint for the driver that these two
> or more flex items are intended for simultaneous usage.
> The fields of items should be assigned with hint indices and these indices from
> two or more flex items should not overlap (be unique per field). For this case,
> the driver will try to engage not overlapping hardware resources and provide
> independent handling of the fields with unique indices. If the hint index is zero
> the driver assigns resources on its own.
>
> 6. Example of New Protocol Handling
>
> Let's suppose we have the requirements to handle the new tunnel protocol
> that follows UDP header with destination port 0xFADE and is followed by MAC
> header.
> Let the new protocol header format be like this:
>
>   struct new_protocol_header {
>     rte_be32 header_length; /* length in dwords, including options */
>     rte_be32 specific0; /* some protocol data, no intention */
>     rte_be32 specific1; /* to match in flows on these fields */
>     rte_be32 crucial; /* data of interest, match is needed */
>     rte_be32 options[0]; /* optional protocol data, variable length */
>   };
>
> The supposed flex item configuration:
>
>   struct rte_flow_item_flex_field field0 = {
>     .field_mode = FIELD_MODE_DUMMY, /* Affects match pattern only */
>     .field_size = 96, /* three dwords from the beginning */
>   };
>   struct rte_flow_item_flex_field field1 = {
>     .field_mode = FIELD_MODE_FIXED,
>     .field_size = 32, /* Field size is one dword */
>     .field_base = 96, /* Skip three dwords from the beginning */
>   };
>   struct rte_flow_item_udp spec0 = {
>     .hdr = {
>       .dst_port = RTE_BE16(0xFADE),
>     }
>   };
>   struct rte_flow_item_udp mask0 = {
>     .hdr = {
>       .dst_port = RTE_BE16(0xFFFF),
>     }
>   };
>   struct rte_flow_item_flex_link link0 = {
>     .item = {
>       .type = RTE_FLOW_ITEM_TYPE_UDP,
>       .spec = &spec0,
>       .mask = &mask0,
>   };
>
>   struct rte_flow_item_flex_conf conf = {
>     .next_header = {
>       .field_mode = FIELD_MODE_OFFSET,
>       .field_base = 0,
>       .offset_base = 0,
>       .offset_mask = 0xFFFFFFFF,
>       .offset_shift = 2 /* Expressed in dwords, shift left by 2 */
>     },
>     .sample = {
>       &field0,
>       &field1,
>     },

Why do you give both fields in sample? By your own description we just want to
match on field1.

>     .sample_num = 2,
>     .input_link[0] = &link0,
>     .input_num = 1
>   };
>
> Let's suppose we have created the flex item successfully, and PMD returned
> the handle 0x123456789A.
> We can use the following item pattern to match the
> crucial field in the packet with value 0x00112233:
>
>   struct new_protocol_header spec_pattern =
>   {
>     .crucial = RTE_BE32(0x00112233),
>   };
>   struct new_protocol_header mask_pattern =
>   {
>     .crucial = RTE_BE32(0xFFFFFFFF),
>   };
>   struct rte_flow_item_flex spec_flex = {
>     .handle = 0x123456789A
>     .length = sizeiof(struct new_protocol_header),
>     .pattern = &spec_pattern,
>   };
>   struct rte_flow_item_flex mask_flex = {
>     .length = sizeof(struct new_protocol_header),
>     .pattern = &mask_pattern,
>   };
>   struct rte_flow_item item_to_match = {
>     .type = RTE_FLOW_ITEM_TYPE_FLEX,
>     .spec = &spec_flex,
>     .mask = &mask_flex,
>   };
>
> Signed-off-by: Viacheslav Ovsiienko
> ---
> doc/guides/prog_guide/rte_flow.rst | 24 +++
> doc/guides/rel_notes/release_21_11.rst | 7 +
> lib/ethdev/rte_ethdev.h | 1 +
> lib/ethdev/rte_flow.h | 228 +++++++++++++++++++++++++
> 4 files changed, 260 insertions(+)
>
> diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
> index 2b42d5ec8c..628f30cea7 100644
> --- a/doc/guides/prog_guide/rte_flow.rst
> +++ b/doc/guides/prog_guide/rte_flow.rst
> @@ -1425,6 +1425,30 @@ Matches a conntrack state after conntrack action.
> - ``flags``: conntrack packet state flags.
> - Default ``mask`` matches all state bits.
>
> +Item: ``FLEX``
> +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> +
> +Matches with the network protocol header of preliminary configured format.
> +The application describes the desired header structure, defines the
> +header fields attributes and header relations with preceding and
> +following protocols and configures the ethernet devices accordingly via
> +rte_flow_flex_item_create() routine.

How about: matches a custom header that was created using
rte_flow_flex_item_create().

> +
> +- ``handle``: the flex item handle returned by the PMD on successful
> +  rte_flow_flex_item_create() call.
> +  The item handle is unique within
> +  the device port, mask for this field is ignored.

I think you can remove the statement that the handle is unique.

> +- ``length``: match pattern length in bytes. If the length does not cover
> +  all fields defined in item configuration, the pattern spec and mask are
> +  supposed to be appended with zeroes till the full configured item length.

It looks buggy to say that any length can be given while expecting the
application to supply the full length.

> +- ``pattern``: pattern to match. The protocol header fields are considered
> +  as bit fields, all offsets and widths are expressed in bits. The pattern
> +  is the buffer containing the bit concatenation of all the fields presented
> +  at item configuration time, in the same order and same amount. The most
> +  regular way is to define all the header fields in the flex item configuration
> +  and directly use the header structure as pattern template, i.e. application
> +  just can fill the header structures with desired match values and masks and
> +  specify these structures as flex item pattern directly.
> +

It is hard to understand this comment and what the application should set.
I suggest taking the basic approach and just explaining it (I think those are
the last few lines).

> Actions
> ~~~~~~~
>
> diff --git a/doc/guides/rel_notes/release_21_11.rst b/doc/guides/rel_notes/release_21_11.rst
> index 73e377a007..170797f9e9 100644
> --- a/doc/guides/rel_notes/release_21_11.rst
> +++ b/doc/guides/rel_notes/release_21_11.rst
> @@ -55,6 +55,13 @@ New Features
> Also, make sure to start the actual text at the margin.
> =======================================================
>
> +* **Introduced RTE Flow Flex Item.**
> +
> +  * The configurable RTE Flow Flex Item provides the capability to introdude
> +    the arbitrary user specified network protocol header, configure the device
> +    hardware accordingly, and perform match on this header with desired patterns
> +    and masks.
> +
> * **Enabled new devargs parser.**
>
> * Enabled devargs syntax
> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
> index afdc53b674..e9ad7673e9 100644
> --- a/lib/ethdev/rte_ethdev.h
> +++ b/lib/ethdev/rte_ethdev.h
> @@ -558,6 +558,7 @@ struct rte_eth_rss_conf {
> * it takes the reserved value 0 as input for the hash function.
> */
> #define ETH_RSS_L4_CHKSUM (1ULL << 35)
> +#define ETH_RSS_FLEX (1ULL << 36)

Is the indentation right?
How do you support FLEX RSS if more than one FLEX item is configured?

>
> /*
> * We use the following macros to combine with above ETH_RSS_* for
> diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
> index 7b1ed7f110..eccb1e1791 100644
> --- a/lib/ethdev/rte_flow.h
> +++ b/lib/ethdev/rte_flow.h
> @@ -574,6 +574,15 @@ enum rte_flow_item_type {
> * @see struct rte_flow_item_conntrack.
> */
> RTE_FLOW_ITEM_TYPE_CONNTRACK,
> +
> + /**
> + * Matches a configured set of fields at runtime calculated offsets
> + * over the generic network header with variable length and
> + * flexible pattern
> + *

I think it should say: matches an application-configured header.

> + * @see struct rte_flow_item_flex.
> + */
> + RTE_FLOW_ITEM_TYPE_FLEX,
> };
>
> /**
> @@ -1839,6 +1848,160 @@ struct rte_flow_item {
> const void *mask; /**< Bit-mask applied to spec and last.
> */
> };
>
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this structure may change without prior notice
> + *
> + * RTE_FLOW_ITEM_TYPE_FLEX
> + *
> + * Matches a specified set of fields within the network protocol
> + * header. Each field is presented as set of bits with specified width, and
> + * bit offset (this is dynamic one - can be calulated by several methods
> + * in runtime) from the header beginning.
> + *
> + * The pattern is concatenation of all bit fields configured at item creation
> + * by rte_flow_flex_item_create() exactly in the same order and amount, no
> + * fields can be omitted or swapped. The dummy mode field can be used for
> + * pattern byte boundary alignment, least significant bit in byte goes first.
> + * Only the fields specified in sample_data configuration parameter participate
> + * in pattern construction.
> + *
> + * If pattern length is smaller than configured fields overall length it is
> + * extended with trailing zeroes, both for value and mask.
> + *
> + * This type does not support ranges (struct rte_flow_item.last).
> + */

I think it is too complex to understand; see my comment above.

> +struct rte_flow_item_flex {
> + struct rte_flow_item_flex_handle *handle; /**< Opaque item handle. */
> + uint32_t length; /**< Pattern length in bytes. */
> + const uint8_t *pattern; /**< Combined bitfields pattern to match. */
> +};
> +/**
> + * Field bit offset calculation mode.
> + */
> +enum rte_flow_item_flex_field_mode {
> + /**
> + * Dummy field, used for byte boundary alignment in pattern.
> + * Pattern mask and data are ignored in the match. All configuration
> + * parameters besides field size are ignored.

Since in the item we just set value and mask, what will happen if we set a
non-zero mask at an offset that is covered by such a field?

> + */
> + FIELD_MODE_DUMMY = 0,
> + /**
> + * Fixed offset field. The bit offset from header beginning is
> + * is permanent and defined by field_base parameter.
> + */
> + FIELD_MODE_FIXED,
> + /**
> + * The field bit offset is extracted from other header field (indirect
> + * offset field). The resulting field offset to match is calculated as:
> + *
> + * field_base + (*field_offset & offset_mask) << field_shift

I can't find those names in the patch and I'm not clear on what they mean.

> + */
> + FIELD_MODE_OFFSET,
> + /**
> + * The field bit offset is extracted from other header field (indirect
> + * offset field), the latter is considered as bitmask containing some
> + * number of one bits, the resulting field offset to match is
> + * calculated as:

Just like above.

> + *
> + * field_base + bitcount(*field_offset & offset_mask) << field_shift
> + */
> + FIELD_MODE_BITMASK,
> +};
> +
> +/**
> + * Flex item field tunnel mode
> + */
> +enum rte_flow_item_flex_tunnel_mode {
> + FLEX_TUNNEL_MODE_FIRST = 0, /**< First item occurrence. */
> + FLEX_TUNNEL_MODE_OUTER = 1, /**< Outer item. */
> + FLEX_TUNNEL_MODE_INNER = 2 /**< Inner item. */ };
> +

The '}' should be on a new line.
If the item can be both inner and outer, do we need to define two flex objects?
Also, why an enum and not defines?
From the API point of view I think it should have the following options:
mode_outer, mode_inner, mode_global and mode_tunnel.
Why is it per field and not per object?

> +/**
> + * @warning
> + * @b EXPERIMENTAL: this structure may change without prior notice */
> +__extension__ struct rte_flow_item_flex_field {
> + /** Defines how match field offset is calculated over the packet. */
> + enum rte_flow_item_flex_field_mode field_mode;
> + uint32_t field_size; /**< Match field size in bits. */

I think it would be better to remove the word "Match".

> + int32_t field_base; /**< Match field offset in bits. */

I think it would be better to remove the word "Match".

> + uint32_t offset_base; /**< Indirect offset field offset in bits.
> + */

I think a better name would be offset_field (the offset of the field that holds
the offset that should be used from the field_base). What do you think?
Maybe just change offset_base to offset?

> + uint32_t offset_mask; /**< Indirect offset field bit mask. */

Maybe better wording: the mask to apply to the value that is set in the
offset_field.

> + int32_t offset_shift; /**< Indirect offset multiply factor. */
> + uint16_t tunnel_count:2; /**< 0-first occurrence, 1-outer, 2-inner.*/

I think this may result in a warning, since you try to cast an enum to 2 bits.
Also, the same question as above: to support inner and outer, do we need two
objects?

> + uint16_t rss_hash:1; /**< Field participates in RSS hash calculation. */

Please see my comment on the RSS; it is not clear how RSS will work when more
than one flex item is created.

> + uint16_t field_id; /**< device hint, for flows with multiple items. */

How should this be used?
The D in "device" should be capitalized.

> +};
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this structure may change without prior notice */
> +struct rte_flow_item_flex_link {
> + /**
> + * Preceding/following header. The item type must be always provided.
> + * For preceding one item must specify the header value/mask to match
> + * for the link be taken and start the flex item header parsing.
> + */
> + struct rte_flow_item item;
> + /**
> + * Next field value to match to continue with one of the configured
> + * next protocols.
> + */
> + uint32_t next;

Is this the offset of the field or the value?
> + /**
> + * Specifies whether flex item represents tunnel protocol
> + */
> + bool tunnel;
> +};
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this structure may change without prior notice */
> +struct rte_flow_item_flex_conf {
> + /**
> + * The next header offset, it presents the network header size covered
> + * by the flex item and can be obtained with all supported offset
> + * calculating methods (fixed, dedicated field, bitmask, etc).
> + */
> + struct rte_flow_item_flex_field next_header;

I think a better name will be size/len

> + /**
> + * Specifies the next protocol field to match with link next protocol
> + * values and continue packet parsing with matching link.
> + */
> + struct rte_flow_item_flex_field next_protocol;
> + /**
> + * The fields will be sampled and presented for explicit match
> + * with pattern in the rte_flow_flex_item. There can be multiple
> + * fields descriptors, the number should be specified by sample_num.
> + */
> + struct rte_flow_item_flex_field *sample_data;
> + /** Number of field descriptors in the sample_data array. */
> + uint32_t sample_num;

nb_samples?

> + /**
> + * Input link defines the flex item relation with preceding
> + * header. It specified the preceding item type and provides pattern
> + * to match. The flex item will continue parsing and will provide the
> + * data to flow match in case if there is the match with one of input
> + * links.
> + */
> + struct rte_flow_item_flex_link *input_link;
> + /** Number of link descriptors in the input link array. */
> + uint32_t input_num;

nb_inputs?

> + /**
> + * Output link defines the next protocol field value to match and
> + * the following protocol header to continue packet parsing. Also
> + * defines the tunnel-related behaviour.
> + */
> + struct rte_flow_item_flex_link *output_link;
> + /** Number of link descriptors in the output link array. */
> + uint32_t output_num;
> +};
> +
> /**
> * Action types.
> *
> @@ -4288,6 +4451,71 @@ rte_flow_tunnel_item_release(uint16_t port_id,
> struct rte_flow_item *items,
> uint32_t num_of_items,
> struct rte_flow_error *error);
> +
> +/**
> + * Create the flex item with specified configuration over
> + * the Ethernet device.
> + *
> + * @param port_id
> + * Port identifier of Ethernet device.
> + * @param[in] conf
> + * Item configuration.
> + * @param[out] error
> + * Perform verbose error reporting if not NULL. PMDs initialize this
> + * structure in case of error only.
> + *
> + * @return
> + * Non-NULL opaque pointer on success, NULL otherwise and rte_errno is set.
> + */
> +__rte_experimental
> +struct rte_flow_item_flex_handle *
> +rte_flow_flex_item_create(uint16_t port_id,
> + const struct rte_flow_item_flex_conf *conf,
> + struct rte_flow_error *error);
> +
> +/**
> + * Release the flex item on the specified Ethernet device.
> + *
> + * @param port_id
> + * Port identifier of Ethernet device.
> + * @param[in] handle
> + * Handle of the item existing on the specified device.
> + * @param[out] error
> + * Perform verbose error reporting if not NULL. PMDs initialize this
> + * structure in case of error only.
> + *
> + * @return
> + * 0 on success, a negative errno value otherwise and rte_errno is set.
> + */
> +__rte_experimental
> +int
> +rte_flow_flex_item_release(uint16_t port_id,
> + const struct rte_flow_item_flex_handle *handle,
> + struct rte_flow_error *error);
> +
> +/**
> + * Modify the flex item on the specified Ethernet device.
> + *
> + * @param port_id
> + * Port identifier of Ethernet device.
> + * @param[in] handle
> + * Handle of the item existing on the specified device.
> + * @param[in] conf
> + * Item new configuration.

Do you need to supply the full configuration for each update?
Maybe add a mask?

> + * @param[out] error
> + * Perform verbose error reporting if not NULL. PMDs initialize this
> + * structure in case of error only.
> + *
> + * @return
> + * 0 on success, a negative errno value otherwise and rte_errno is set.
> + */
> +__rte_experimental
> +int
> +rte_flow_flex_item_update(uint16_t port_id,
> + const struct rte_flow_item_flex_handle *handle,
> + const struct rte_flow_item_flex_conf *conf,
> + struct rte_flow_error *error);
> +
> #ifdef __cplusplus
> }
> #endif
> --
> 2.18.1

Best,
Ori