From: Ori Kam
To: Ivan Malov
CC: Stephen Hemminger, "NBU-Contact-Thomas Monjalon (EXTERNAL)", "NBU-Contact-Adrien Mazarguil (EXTERNAL)", "dev@dpdk.org", Andrew Rybchenko
Subject: RE: Understanding Flow API action RSS
Date: Mon, 10 Jan 2022 09:54:35 +0000
In-Reply-To: <1fa28b5-22f4-36f0-a4fe-2ceedad4434@oktetlabs.ru>
List-Id: DPDK patches and discussions

Hi Ivan,

> -----Original Message-----
> From: Ivan Malov
> Sent: Sunday, January 9, 2022 3:03 PM
> Subject: RE: Understanding Flow API action RSS
>
> Hi Ori,
>
> On Sun, 9 Jan 2022, Ori Kam wrote:
>
> > Hi Stephen and Ivan
> >
> >> -----Original Message-----
> >> From: Stephen Hemminger
> >> Sent: Tuesday, January 4, 2022 11:56 PM
> >> Subject: Re: Understanding Flow API action RSS
> >>
> >> On Tue, 4 Jan 2022 21:29:14 +0300 (MSK)
> >> Ivan Malov wrote:
> >>
> >>> Hi Stephen,
> >>>
> >>> On Tue, 4 Jan 2022, Stephen Hemminger wrote:
> >>>
> >>>> On Tue, 04 Jan 2022 13:41:55 +0100
> >>>> Thomas Monjalon wrote:
> >>>>
> >>>>> +Cc Ori Kam, rte_flow maintainer
> >>>>>
> >>>>> 29/12/2021 15:34, Ivan Malov:
> >>>>>> Hi all,
> >>>>>>
> >>>>>> In 'rte_flow.h', there is 'struct rte_flow_action_rss'. In it, 'queue' is
> >>>>>> to provide "Queue indices to use". But it is unclear whether the order of
> >>>>>> elements is meaningful or not. Does that matter? Can queue indices repeat?
> >>>>
> >>>> The order probably doesn't matter, it is like the RSS indirection table.
> >>>
> >>> Sorry, but RSS indirection table (RETA) assumes some structure.
> >>> In it, queue indices can repeat, and the order is meaningful. In DPDK, RETA
> >>> may comprise multiple "groups", each one comprising 64 entries.
> >>>
> >>> This 'queue' array in flow action RSS does not stick with the same
> >>> terminology, it does not reuse the definition of RETA "group", etc.
> >>> Just "queue indices to use". No definition of order, no structure.
> >>>
> >>> The API contract is not clear. Neither to users, nor to PMDs.
> >>>
> >> From API in RSS the queues are simply the queue ID, order doesn't matter,
> > Duplicating the queue may affect the spread based on the HW/PMD.
> > In common case each queue should appear only once and the PMD may duplicate
> > entries to get the best performance.
>
> Look. In a DPDK PMD, one has "global" RSS table. Consider the following
> example: 0, 0, 1, 1, 2, 2, 3, 3 ... and so on. As you may see, queue
> indices may repeat. They may have different order: 1, 1, 0, 0, ... .
> The order is of great importance. If you send a packet to a
> DPDK-powered server, you can know in advance its hash value.
> Hence, you may strictly predict which RSS table entry this
> hash will point at. That predicts the target Rx queue.
>
> So the questions which one should attempt to clarify, are as follows:
> 1) Is the 'queue' array ordered? (Does the order of elements matter?)
> 2) Can its elements repeat? (*allowed* or *not allowed*?)
>

From the API point of view, the array is ordered and may have repeating elements.

> >
> >>>>
> >>>>    rx queue = RSS_indirection_table[ RSS_hash_value % RSS_indirection_table_size ]
> >>>>
> >>>> So you could play with multiple queues matching same hash value, but that
> >>>> would be uncommon.
> >>>>
> >>>>>> An ethdev may have "global" RSS setting with an indirection table of some
> >>>>>> fixed size (say, 512). In what comes to flow rules, does that size matter?
> >>>>
> >>>> Global RSS is only used if the incoming packet does not match any rte_flow
> >>>> action.
> >>>> If there is a RTE_FLOW_ACTION_TYPE_QUEUE or RTE_FLOW_ACTION_TYPE_RSS
> >>>> these take precedence.
> >>>
> >>> Yes, I know all of that. The question is how does the PMD select RETA size
> >>> for this action? Can it select an arbitrary value? Or should it stick with
> >>> the "global" one (eg. 512)? How does the user know the table size?
> >>>
> >>> If the user simply wants to spread traffic across the given queues,
> >>> the effective table size is a don't care to them, and the existing
> >>> API contract is fine. But if the user expects that certain packets
> >>> hit some precise queues, they need to know the table size for that.
> >>>
> > Just like you said RSS simply spread the traffic to the given queues.
>
> Yes, to the given queues. The question is whether the 'queue' array
> has RETA properties (order matters; elements can repeat) or not.
>

Yes, order matters and elements can repeat.

> > If application wants to send traffic to some queue it should use the queue action.
>
> Yes, but that's not what I mean. Consider the following example. The user
> generates packets with random IP addresses at machine A. These packets
> hit DPDK at machine B. For a given *packet*, the sender (A) can
> compute its RSS hash in software. This will point out the RETA
> entry index. But, in order to predict the exact *queue* index,
> the sender has to know the table (its contents, its size).
>

Why does the application need this info?

> For a "global" DPDK RSS setting, the table can be easily obtained with
> an ethdev callback / API. Very simple. Fixed-size table, and it can
> be queried. But how does one obtain similar knowledge for RSS action?
>

The RSS action was designed to allow a balanced traffic spread.
The size of the RETA is PMD-dependent: in some PMDs the size will be the
number of queues, in others it will be the number of queues rounded to a
power of 2, so if the app requested 8 queues the RETA will also have 8 entries.
In any case, the PMD should use the given order; if the PMD needs to expand
the table, it should cycle over the application-requested queues in the order
they were given.

> >
> >>> So, the question is whether the users should or should not build
> >>> any expectations of the effective table size and, if they should,
> >>> are they supposed to use the "global" table size for that?
> >>
> >> You are right this area is completely undocumented. Personally would really like
> >> it if rte_flow had a reference software implementation and all the HW vendors
> >> had to make sure their HW matched the SW reference version. But this a case
> >> where the funding is all on the HW side, and no one has time or resources
> >> to do a complete SW version..
> >>
> >> A sane implementation would configure RSS indirection as across all
> >> rx queues that were available when the device was started; ie all queues
> >> that did not have deferred start set. Then the application would start/stop
> >> queues and use rte_flow to reach them.
> >>
> >> But it doesn't appear the HW follows that model.
> >>
> >>
> >>>>>> When the user selects 'RTE_ETH_HASH_FUNCTION_DEFAULT' in action RSS, does
> >>>>>> that allow the PMD to configure an arbitrary, non-Toeplitz hash algorithm?
> >>>>
> >>>> No the default is always Toeplitz. This goes back to the original definition
> >>>> of RSS which is in Microsoft NDIS and uses Toeplitz.
> >>>
> >>> Then why have a dedicated enum named TOEPLITZ? Also, once again, the
> >>> documentation should be more specific to say which algorithm exactly
> >>> this DEFAULT choice provides. Otherwise, it is very vague.
> >>>
> >>>>
> >>>> DPDK should have more examples of using rte_flow, I have some samples
> >>>> but they aren't that useful.
> >>>>
> >>>
> >>> I could not agree more.
> >
> > Feel free to add/suggest what example are missing.
> >
> >>>
> >>> Thanks,
> >>> Ivan M.
> >
> > Best,
> > Ori
> >

Best,
Ori
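
P.S. To make the expansion rule above concrete, here is a minimal sketch in
plain C of how a PMD might build its table from the 'queue' array of the RSS
action and how a queue is then picked from a hash. This is only an
illustration under stated assumptions, not the code of any real PMD: the names
next_pow2(), expand_reta() and reta_lookup() are hypothetical and are not DPDK
APIs, and rounding the size *up* to a power of 2 is just one possible choice,
since the real size is PMD-dependent. The lookup follows the formula quoted
earlier in the thread (hash modulo table size).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: smallest power of two that is >= n (for n >= 1). */
size_t next_pow2(size_t n)
{
    size_t p = 1;

    while (p < n)
        p <<= 1;
    return p;
}

/*
 * Hypothetical helper (not a DPDK API): expand the 'queue' array of an RSS
 * action into an indirection table. Assumption: the table size is the number
 * of queues rounded up to a power of two. Entries are filled by cycling over
 * the requested queues in the order they were given, so both order and
 * repeated indices are preserved.
 *
 * Returns the number of entries written, or 0 on invalid input or if the
 * caller-provided table is too small.
 */
size_t expand_reta(const uint16_t *queues, size_t nb_queues,
                   uint16_t *reta, size_t reta_cap)
{
    size_t size, i;

    if (queues == NULL || reta == NULL || nb_queues == 0)
        return 0;

    size = next_pow2(nb_queues);
    if (size > reta_cap)
        return 0;

    for (i = 0; i < size; i++)
        reta[i] = queues[i % nb_queues];
    return size;
}

/* Pick the Rx queue for a hash: reta[hash % reta_size]. */
uint16_t reta_lookup(const uint16_t *reta, size_t reta_size, uint32_t hash)
{
    assert(reta != NULL && reta_size > 0);
    return reta[hash % reta_size];
}
```

With queues {5, 2, 7} this sketch yields a 4-entry table {5, 2, 7, 5}, so
queue 5 receives a larger share of the hash space than the other two. That is
why the effective spread, and any prediction of the exact queue for a given
hash, depends on the size the PMD actually chooses.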