From: Ori Kam
To: "NBU-Contact-Thomas Monjalon (EXTERNAL)", Ilya Maximets
CC: dev, Sriharsha Basavapatna, Gaetan Rivet, Eli Britstein, Ivan Malov, Andrew Rybchenko, Ian Stokes, Slava Ovsiienko, Matan Azrad
Subject: RE: rte_flow API change request: tunnel info restoration is too slow
Date: Wed, 16 Mar 2022 14:01:40 +0000
References: <5248c2ca-f2a6-3fb0-38b8-7f659bfa40de@ovn.org> <6043769.DvuYhMxLoT@thomas> <7c70ff8a-296b-6f4e-53c7-ad0f825b838c@ovn.org> <4878373.LvFx2qVVIh@thomas>
In-Reply-To: <4878373.LvFx2qVVIh@thomas>
List-Id: DPDK patches and discussions

Hi

> -----Original Message-----
> From: Thomas Monjalon
> Sent: Wednesday, March 16, 2022 3:47 PM
> Subject: Re: rte_flow API change request: tunnel info restoration is too slow
>
> 16/03/2022 13:25, Ilya Maximets:
> > On 3/16/22 10:41, Thomas Monjalon wrote:
> > > 15/03/2022 23:12, Ilya Maximets:
> > >> Hi, everyone.
> > >>
> > >> After implementing support for tunnel offloading in OVS we faced a
> > >> significant performance issue caused by the requirement to call
> > >> rte_flow_get_restore_info() function on a per-packet basis.
> > >>
> > >> The main problem is that once the tunnel offloading is configured,
> > >> there is no way for the application to tell if a particular packet
> > >> was partially processed in the hardware and the tunnel info has to
> > >> be restored. What we have to do right now is to call the
> > >> rte_flow_get_restore_info() unconditionally for every packet. The
> > >> result of that call says if we have the tunnel info or not.
> > >>
> > >> rte_flow_get_restore_info() call itself is very heavy. It is at
> > >> least a couple of indirect function calls and the device lock
> > >> on the application side (not-really-thread-safety of the rte_flow
> > >> API is a separate topic). Internal info lookup inside the driver
> > >> also costs a lot (depends on a driver).
> > >>
> > >> It has been measured that having this call on a per-packet basis can
> > >> reduce application performance (OVS) by a factor of 3 in some cases:
> > >>   https://mail.openvswitch.org/pipermail/ovs-dev/2021-November/389265.html
> > >>   https://github.com/openvswitch/ovs/commit/6e50c1651869de0335eb4b7fd0960059c5505f5c
> > >> (Above patch avoid the problem in a hacky way for devices that doesn't
> > >> support tunnel offloading, but it's not applicable to situation
> > >> where device actually supports it, since the API has to be called.)
> > >>
> > >> Another tricky part is that we have to call rte_flow_get_restore_info()
> > >> before checking other parts of the mbuf, because mlx5 driver, for
> > >> example, re-uses the mbuf.hash.fdir value for both tunnel info
> > >> restoration and classification offloading, so the application has
> > >> no way to tell which one is used right now and has to call the
> > >> restoration API first in order to find out.
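
For readers following the thread, the ordering constraint described above can be sketched roughly as follows. This is only an illustration built on mock types: mock_mbuf, mock_get_restore_info() and mock_rx_classify() are hypothetical stand-ins, not DPDK definitions, and the hw_tunnel_miss member models driver-private state that the application cannot see.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Mock stand-ins for illustration only, NOT the real DPDK structures. */
struct mock_mbuf {
    uint32_t fdir_hi;        /* plays the role of the shared mbuf.hash.fdir value */
    bool     hw_tunnel_miss; /* driver-private knowledge, hidden from the app */
};

struct mock_restore_info {
    uint32_t tunnel_id;
};

/* Counts how often the (pretend) heavy driver call is made. */
static unsigned int mock_restore_calls;

/* Stand-in for the heavy call: 0 if tunnel info exists, negative otherwise. */
static int
mock_get_restore_info(const struct mock_mbuf *m, struct mock_restore_info *info)
{
    assert(m != NULL && info != NULL);
    mock_restore_calls++;
    if (!m->hw_tunnel_miss)
        return -1;
    info->tunnel_id = m->fdir_hi;
    return 0;
}

enum mock_rx_kind { MOCK_RX_TUNNEL_RESTORED, MOCK_RX_CLASSIFIED };

/*
 * Current ordering: the restoration call has to come first for every
 * packet, because the shared field cannot be interpreted without it.
 */
static enum mock_rx_kind
mock_rx_classify(const struct mock_mbuf *m, uint32_t *value)
{
    struct mock_restore_info info;

    assert(value != NULL);
    if (mock_get_restore_info(m, &info) == 0) {
        *value = info.tunnel_id;
        return MOCK_RX_TUNNEL_RESTORED;
    }
    /* Only after the call has failed can the field be read as a mark. */
    *value = m->fdir_hi;
    return MOCK_RX_CLASSIFIED;
}
```

Both kinds of packet carry the same raw field value in this sketch, so the counter increments once per packet regardless of the outcome, which is the cost being discussed.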
> > >>
> > >>
> > >> What we need:
> > >>
> > >> A generic and fast (couple of CPU cycles) API that will clearly say
> > >> if the heavy rte_flow_get_restore_info() has to be called for a
> > >> particular packet or not. Ideally, that should be a static mbuf
> > >> flag that can be easily checked by the application.
> > >
> > > A dynamic mbuf flag, defined in the API, looks to be a good idea.
> >
> > Makes sense. OTOH, I'm not sure what is the profit of having it
> > dynamically allocated if it will need to be always defined.
>
> True.
> We need to discuss whether we can have situations where it is not registered
> at all. We recently introduced a function for initial config of rte_flow,
> it could be the trigger to register such flag or field.
>

Why would it always be defined?
As I see it, this flag is only used if the application is planning to use
tunnel offload.
We should also introduce some way to let the application know whether the
PMD is taking over the fdir or metadata.

> > But, well, it doesn't really matter, I guess.
>
> That's a detail but it should be discussed.
>
> > >> Calls inside the device driver are way too slow for that purpose,
> > >> especially if they are not fully thread-safe, or require complex
> > >> lookups or calculations.
> > >>
> > >> I'm assuming here that packets that really need the tunnel info
> > >> restoration should be fairly rare.
> > >>
> > >>
> > >> Current state:
> > >>
> > >> Currently, the get_restore_info() API is implemented only for mlx5
> > >> and sfc drivers, AFAICT.
> > >> SFC driver is already using mbuf flag, but
> > >> it's dynamic and not exposed to the application.
> > >
> > > What is this flag?
> >
> > SFC driver defines a dynamic field 'rte_net_sfc_dynfield_ft_id'
> > and the corresponding flag 'rte_net_sfc_dynflag_ft_id_valid' to
> > check if the field contains a valid data.
>
> OK, so we could deprecate this flag.
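
The flag check requested under "What we need" could look roughly like the sketch below. It uses mock types only: MOCK_RX_F_RESTORE_INFO, mock_pkt, mock_heavy_restore() and mock_rx_burst_restore() are hypothetical names, and the fixed bit position is an assumption made for the example. In real DPDK code a dynamic flag bit would presumably be obtained at run time, for instance through rte_mbuf_dynflag_register().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bit, fixed here only to keep the sketch self-contained. */
#define MOCK_RX_F_RESTORE_INFO (UINT64_C(1) << 40)

/* Mock packet, NOT the real mbuf layout. */
struct mock_pkt {
    uint64_t ol_flags;  /* offload flags set by the (pretend) driver on Rx */
    uint32_t tunnel_id; /* what the heavy call would recover */
};

/* Counts how often the (pretend) heavy driver call is made. */
static unsigned int mock_heavy_calls;

/* Stand-in for the expensive per-packet restoration call. */
static int
mock_heavy_restore(const struct mock_pkt *p, uint32_t *tunnel_id)
{
    assert(p != NULL && tunnel_id != NULL);
    mock_heavy_calls++;
    *tunnel_id = p->tunnel_id;
    return 0;
}

/*
 * Walks a burst and calls the heavy function only for flagged packets.
 * Restored tunnel ids are stored compactly in ids[]; the return value is
 * the number of packets for which restoration succeeded.
 */
static size_t
mock_rx_burst_restore(const struct mock_pkt *pkts, size_t n, uint32_t *ids)
{
    size_t restored = 0;

    for (size_t i = 0; i < n; i++) {
        /* Common case: a single bit test and no call into the driver. */
        if (!(pkts[i].ol_flags & MOCK_RX_F_RESTORE_INFO))
            continue;
        if (mock_heavy_restore(&pkts[i], &ids[restored]) == 0)
            restored++;
    }
    return restored;
}
```

If packets needing restoration are as rare as assumed in the request, the heavy call count tracks the number of flagged packets rather than the burst size.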
>
> > >> MLX5 driver re-uses mbuf.hash.fdir value
> > >> and performs a heavy lookup inside the driver.
> > >
> > > We should avoid re-using a field.
> >
> > +1 from me, but I'm not familiar with the mlx5 driver enough
> > to tell how to change it.
> >

Normally I would agree, but at least in the MLX5 case, due to HW limitations,
the same field is used, so the application can't use the fdir in any case,
and adding a new field would mean a performance penalty.

> > >> For now OVS doesn't support tunnel offload with DPDK formally, the
> > >> code in OVS is under the experimental API ifdef and not compiled-in
> > >> by default.
> > >>
> > >> //Let me know if there is more formal way to submit such requests.
> > >
> > > That's a well written request, thanks.
> > > If you are looking for something more formal,
> > > it could be a patch in the API.
> >
> > I'm not looking for now. :)
> > I think we need an agreement from driver maintainers first. And
> > I don't think we can introduce such API change without changing
> > drivers first, because otherwise we'll end up with the 'has_vlan'
> > situation and the broken offloading support.
>
> OK
>

I think you are missing some driver maintainers.

Best,
Ori