From: Yongseok Koh
To: Andrew Rybchenko
CC: "Wang, Haiyue", Shahaf Shuler, Thomas Monjalon, "Yigit, Ferruh",
 Adrien Mazarguil, "olivier.matz@6wind.com", "dev@dpdk.org",
 "Ananyev, Konstantin"
Subject: Re: [dpdk-dev] [RFC 1/3] ethdev: extend flow metadata
Date: Tue, 11 Jun 2019 00:06:43 +0000
Message-ID: <20190611000505.GA25815@mtidpdk.mti.labs.mlnx>
References: <20190603213231.27020-1-yskoh@mellanox.com>
 <7047a597-ea0d-f159-e95d-0fd8bca5b78d@solarflare.com>
 <82445af1-9e66-9de9-f3d2-176de09d904b@solarflare.com>
In-Reply-To: <82445af1-9e66-9de9-f3d2-176de09d904b@solarflare.com>

On Mon, Jun 10, 2019 at 10:20:28AM +0300, Andrew Rybchenko wrote:
> On 6/10/19 6:19 AM, Wang, Haiyue wrote:
> > > -----Original Message-----
> > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Andrew Rybchenko
> > > Sent: Sunday, June 9, 2019 22:24
> > > To: Yongseok Koh; shahafs@mellanox.com; thomas@monjalon.net; Yigit, Ferruh;
> > > adrien.mazarguil@6wind.com; olivier.matz@6wind.com
> > > Cc: dev@dpdk.org
> > > Subject: Re: [dpdk-dev] [RFC 1/3] ethdev: extend flow metadata
> > >
> > > On 6/4/19 12:32 AM, Yongseok Koh wrote:
> > > > Currently,
> > > > metadata can be set on egress path via mbuf tx_metadata field
> > > > with PKT_TX_METADATA flag and RTE_FLOW_ITEM_TYPE_RX_META matches metadata.
> > > >
> > > > This patch extends the usability.
> > > >
> > > > 1) RTE_FLOW_ACTION_TYPE_SET_META
> > > >
> > > > When supporting multiple tables, Tx metadata can also be set by a rule and
> > > > matched by another rule. This new action allows metadata to be set as a
> > > > result of flow match.
> > > >
> > > > 2) Metadata on ingress
> > > >
> > > > There's also a need to support metadata on packet Rx. Metadata can be set by
> > > > SET_META action and matched by META item like Tx. The final value set by
> > > > the action will be delivered to application via mbuf metadata field with
> > > > PKT_RX_METADATA ol_flag.
> > > >
> > > > For this purpose, mbuf->tx_metadata is moved as a separate new field and
> > > > renamed to 'metadata' to support both Rx and Tx metadata.
> > > >
> > > > For loopback/hairpin packet, metadata set on Rx/Tx may or may not be
> > > > propagated to the other path depending on HW capability.
> > > >
> > > > Signed-off-by: Yongseok Koh
> > > There is a mark on Rx which is delivered to application in hash.fdir.hi.
> > > Why do we need one more 32-bit value set by NIC and delivered to
> > > application?
> > > What is the difference between MARK and META on Rx?
> > > When should an application use MARK and when META?
> > > Are there cases when both could be necessary?
> > >
> > In my understanding, MARK is FDIR related thing, META seems to be NIC
> > specific. And we also need this kind of specific data field to export
> > NIC's data to application.
>
> I think it is better to avoid NIC vendor-specifics in motivation.
> I understand that it exists for you, but I think it is better to look at
> it from RTE flow API definition point of view: both are 32-bit (except
> endianness, and I'm not sure that I understand why meta is defined as
> big-endian since it is not a value coming from or going to network in a
> packet; I'm sorry that I've missed it on review that time), both may be
> set using action on Rx, both may be matched using pattern item.

Yes, MARK and META have the same characteristics on the Rx path. Let me
clarify why I picked this way.

What if a device has more bits to deliver to the host? Currently, only 32-bit
data can be delivered to the user via MARK ID. Now we have more requests from
users (OVS connection tracking) that want to see more information generated
during flow match from the device. Let's say it is 64 bits and it may contain
intermediate match results to keep track of multi-table match, to keep the
address of a callback function to call, or so. I thought about extending the
current MARK to 64-bit but I knew that we couldn't make more room in the first
cacheline of mbuf where every vendor has their critical interest. And the FDIR
has been there for a long time and has lots of use-cases in DPDK (not easy to
break). This is why I'm suggesting to obtain another 32 bits in the second
cacheline of the structure.

Also, I thought about another scenario. Even though we have the MARK item
introduced lately, it isn't used by any PMD at all for now, meaning it might
not be match-able on a certain device. What if there are two types of
registers on Rx and one is match-able and the other isn't? A PMD can use META
for the match-able register while MARK is used for the non-match-able register
without supporting item match. If MARK simply becomes 64-bit just because it
has the same characteristic in terms of rte_flow, only one of such registers
can be used, as we can't say only part of the bits are match-able on the item.
Instead of extending the MARK to 64 bits, I thought it would be better to
give more flexibility by bundling it with Tx metadata, which can be set via
mbuf.

The actual issue we have may be how to make it scalable. What if there's a
need to carry more data from the device? Well, IIRC, Olivier once suggested
to put a pointer (like mbuf->userdata) to extend mbuf struct beyond two
cachelines. But we still have some space left at the end.

> > > Moreover, the third patch adds 32-bit tags which are not delivered to
> > > application. Maybe META/MARK should be simply a kind of TAG (e.g. with
> > > index 0 or marked using additional attribute) which is delivered to
> > > application?

Yes, TAG is a kind of transient device-internal data which isn't delivered to
the host. It would be a design choice. I could define all these kinds as an
array of MARK IDs having different attributes - some are exportable/match-able
and others are not, which sounds quite complex. As rte_flow doesn't have a
direct way to check device capability (user has to call a series of validate
functions instead), I thought defining TAG would be better.

> > > (It is either API breakage (if tx_metadata is removed) or ABI breakage
> > > if metadata and tx_metadata will share new location after shinfo).

Fortunately, mlx5 is the only entity which uses tx_metadata so far.

> > Make use of udata64 to export NIC metadata to application ?
> >     RTE_STD_C11
> >     union {
> >         void *userdata;   /**< Can be used for external metadata */
> >         uint64_t udata64; /**< Allow 8-byte userdata on 32-bit */
> >         uint64_t rx_metadata;
> >     };
>
> As I understand it does not work for Tx and I'm not sure that it is
> a good idea to have different locations for Tx and Rx.
>
> RFC adds it at the end of mbuf, but it was rejected before since
> it eats space in mbuf structure (CC Konstantin).

Yep, I was in the discussion. IIRC, the reason wasn't because it ate space but
because it could recycle unused space on Tx path.
We still have 16B after shinfo and I'm not sure how many bytes we should
reserve. I think reserving space for one pointer would be fine.

Thanks,
Yongseok

> There is a long discussion on the topic before [1], [2], [3] and [4].
>
> Andrew.
>
> [1] http://mails.dpdk.org/archives/dev/2018-August/109660.html
> [2] http://mails.dpdk.org/archives/dev/2018-September/111771.html
> [3] http://mails.dpdk.org/archives/dev/2018-October/114559.html
> [4] http://mails.dpdk.org/archives/dev/2018-October/115469.html
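
For illustration only, the Rx delivery described in the quoted commit message
(a 32-bit metadata value that is valid only when an ol_flag is set) might be
consumed by an application roughly as sketched below. This is a minimal sketch
under stated assumptions: struct sketch_mbuf, SKETCH_PKT_RX_METADATA and its
bit value, and sketch_rx_metadata_get() are hypothetical stand-ins, not the
real struct rte_mbuf layout or the definitions proposed in the RFC, and the
byte-order question raised above is ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bit. The RFC names PKT_RX_METADATA, but this value is
 * invented for the sketch. */
#define SKETCH_PKT_RX_METADATA (1ULL << 23)

/* Minimal stand-in for the fields under discussion; NOT struct rte_mbuf. */
struct sketch_mbuf {
    uint64_t ol_flags;  /* offload flags */
    uint32_t mark;      /* stands in for hash.fdir.hi (MARK ID) */
    uint32_t metadata;  /* stands in for the proposed Rx/Tx 'metadata' field */
};

/*
 * Copy the metadata into *value and return true only when the flag marks
 * the field as valid; otherwise leave *value untouched and return false.
 */
static bool
sketch_rx_metadata_get(const struct sketch_mbuf *m, uint32_t *value)
{
    assert(m != NULL && value != NULL);
    if ((m->ol_flags & SKETCH_PKT_RX_METADATA) == 0)
        return false;
    *value = m->metadata;
    return true;
}
```

Checking the flag first mirrors how the existing MARK ID is treated as
meaningful only when its own Rx flag is set; where the field actually lives in
the mbuf is exactly the open question in this thread, so the sketch takes no
position on it.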