From: Oz Shlomo
To: William Tu
Cc: dev@dpdk.org, Thomas Monjalon, Ori Kam, Eli Britstein, Sriharsha Basavapatna, Hemal Shah
Date: Mon, 13 Jul 2020 07:52:41 +0300
Message-ID: <395d4c2d-198f-67e3-a6b2-f40773a0e196@mellanox.com>
References: <5862610e-76cc-7783-7d66-2b2173eeb974@mellanox.com>
Subject: Re: [dpdk-dev] [RFC] - Offloading tunnel ports
List-Id: DPDK patches and discussions

Hi William,

On 7/12/2020 7:34 PM, William Tu wrote:
> Hi Oz,
>
> I started to learn about this and have a couple of questions below.
> Thank you in advance.
>
> On Tue, Jun 9, 2020 at 8:07 AM Oz Shlomo wrote:
>>
>> Rte_flow API provides the building blocks for vendor agnostic flow
>> classification offloads. The rte_flow match and action primitives are fine
>> grained, thus enabling DPDK applications the flexibility to offload network
>> stacks and complex pipelines.
>>
>> Applications wishing to offload complex data structures (e.g. tunnel virtual
>> ports) are required to use the rte_flow primitives, such as group, meta, mark,
>> tag and others to model their high level objects.
>>
>> The hardware model design for high level software objects is not trivial.
>> Furthermore, an optimal design is often vendor specific.
>>
>> The goal of this RFC is to provide applications with the hardware offload
>> model for common high level software objects which is optimal in regards
>> to the underlying hardware.
>>
>> Tunnel ports are the first of such objects.
>>
>> Tunnel ports
>> ------------
>> Ingress processing of tunneled traffic requires the classification
>> of the tunnel type followed by a decap action.
>>
>> In software, once a packet is decapsulated the in_port field is changed
>> to a virtual port representing the tunnel type. The outer header fields
>> are stored as packet metadata members and may be matched by proceeding
>> flows.
>>
>> Openvswitch, for example, uses two flows:
>> 1. classification flow - setting the virtual port representing the tunnel type
>>    For example: match on udp port 4789 actions=tnl_pop(vxlan_vport)
>> 2. steering flow according to outer and inner header matches
>>    match on in_port=vxlan_vport and outer/inner header matches actions=forward to port X
>> The benefits of multi-flow tables are described in [1].
>>
>> Offloading tunnel ports
>> -----------------------
>> Tunnel ports introduce a new stateless field that can be matched on.
>> Currently the rte_flow library provides an API to encap, decap and match
>> on tunnel headers. However, there is no rte_flow primitive to set and
>> match tunnel virtual ports.
>>
>> There are several possible hardware models for offloading virtual tunnel port
>> flows including, but not limited to, the following:
>> 1. Setting the virtual port on a hw register using the rte_flow_action_mark/
>>    rte_flow_action_tag/rte_flow_set_meta objects.
>> 2. Mapping a virtual port to an rte_flow group
>> 3. Avoiding the need to match on transient objects by merging multi-table
>>    flows to a single rte_flow rule.
>>
>> Every approach has its pros and cons.
>> The preferred approach should take into account the entire system architecture
>> and is very often vendor specific.
>
> Are these three solutions mutually exclusive?
> And IIUC, based on the description below, you're proposing solution 1, right?
> and the patch on OVS is using solution 2?
> https://patchwork.ozlabs.org/project/openvswitch/cover/20200120150830.16262-1-elibr@mellanox.com/
>

From the OVS patchset we learned that it might be better to provide each
vendor with the flexibility to implement its optimal hardware model.
We propose this design as an alternative to the submitted OVS patchset.
This patch is designed to provide an abstract API.
As such, any of the solutions listed above, or others, are possible.
The Mellanox PMD is planned to implement solution 2.

>>
>> The proposed rte_flow_tunnel_port_set helper function (drafted below) is designed
>> to provide a common, vendor agnostic, API for setting the virtual port value.
>> The helper API enables PMD implementations to return vendor specific combination of
>> rte_flow actions realizing the vendor's hardware model for setting a tunnel port.
>> Applications may append the list of actions returned from the helper function when
>> creating an rte_flow rule in hardware.
>>
>> Similarly, the rte_flow_tunnel_port_match helper (drafted below) allows for
>> multiple hardware implementations to return a list of rte_flow items.
>>
> And if we're using solution 1 "Setting the virtual port on a hw
> register using the rte_flow_action_mark/
> rte_flow_action_tag/rte_flow_set_meta objects."
> For the classification flow, does that mean HW no longer needs to
> translate tnl_pop to mark + jump,
> but the HW can directly execute the tnl_pop(vxlan_vport) action
> because the outer header is
> saved using rte_flow_set_meta?
>

In this case we would need to map the outer header fields to a unique id.
This can be done either from the datapath (for capable hardware) or from the flows.
The latter option requires the flow to match on the outer header fields
that should be stored. OVS matches on the outer header fields only after it
classifies the tunnel port (i.e. after the tnl_pop action).

>> Miss handling
>> -------------
>> Packets going through multiple rte_flow groups are exposed to hw misses due to
>> partial packet processing. In such cases, the software should continue the
>> packet's processing from the point where the hardware missed.
>>
>> We propose a generic rte_flow_restore structure providing the state that was
>> stored in hardware when the packet missed.
>>
>> Currently, the structure will provide the tunnel state of the packet that
>> missed, namely:
>> 1. The group id that missed
>> 2. The tunnel port that missed
>> 3. Tunnel information that was stored in memory (due to decap action).
>> In the future, we may add additional fields as more state may be stored in
>> the device memory (e.g. ct_state).
>>
>> Applications may query the state via a new rte_flow_get_restore_info(mbuf) API,
>> thus allowing a vendor specific implementation.
>>
>
> Thanks
> William
>
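
To make the miss handling part more concrete, here is a rough sketch of how
an application might consume the three pieces of restore state listed in the
quoted section above. Everything named sketch_* is hypothetical and exists
only for illustration; it is not the rte_flow_restore structure from the RFC
and not an existing DPDK definition. The sketch also assumes that an rte_flow
group id maps one-to-one to a software table id and that the tunnel state can
be reduced to a tunnel id plus outer IPv4 addresses, which a real
implementation may well do differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical outer header state saved by hardware at decap time. */
struct sketch_tunnel_info {
    uint32_t tun_id;        /* e.g. the VXLAN VNI */
    uint32_t outer_src_ip;  /* outer IPv4 source address */
    uint32_t outer_dst_ip;  /* outer IPv4 destination address */
};

/* Hypothetical mirror of the three items listed in the RFC text. */
struct sketch_restore_info {
    uint32_t group_id;      /* 1. the group id that missed */
    uint32_t tunnel_port;   /* 2. the tunnel (virtual) port that missed */
    bool has_tunnel;        /* true if hardware already decapsulated */
    struct sketch_tunnel_info tunnel; /* 3. stored tunnel information */
};

/* Hypothetical software datapath context for a single packet. */
struct sketch_pkt_ctx {
    uint32_t in_port;       /* physical or virtual input port */
    uint32_t start_table;   /* software table to start the lookup from */
    bool tunnel_valid;      /* whether 'tunnel' holds restored metadata */
    struct sketch_tunnel_info tunnel;
};

/*
 * Prepare the software context for a packet received on 'phys_port'.
 * With no restore info the packet is processed from table 0 as usual.
 * With restore info, processing continues from the group where the
 * hardware missed (assumed here to equal the software table id), and
 * the virtual port and outer header metadata are restored if a decap
 * was already performed in hardware.
 */
static void
sketch_resume_after_miss(struct sketch_pkt_ctx *ctx, uint32_t phys_port,
                         const struct sketch_restore_info *info)
{
    assert(ctx != NULL);

    /* Default: full software processing from the first table. */
    *ctx = (struct sketch_pkt_ctx){ .in_port = phys_port };

    if (info == NULL)
        return;

    ctx->start_table = info->group_id;
    if (info->has_tunnel) {
        ctx->in_port = info->tunnel_port;
        ctx->tunnel = info->tunnel;
        ctx->tunnel_valid = true;
    }
}
```

In the proposal the restore state would come from the PMD through
rte_flow_get_restore_info(mbuf); the sketch leaves that call out on purpose,
since its signature is not part of this mail.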