From mboxrd@z Thu Jan 1 00:00:00 1970
From: Thomas Monjalon
To: Ivan Malov, Andrew Rybchenko
Cc: dev@dpdk.org, Andy Moreton, orika@nvidia.com, ferruh.yigit@intel.com, olivier.matz@6wind.com
Date: Fri, 01 Oct 2021 10:11:31 +0200
Message-ID: <5427719.I9DohtKF8S@thomas>
In-Reply-To: <9f44035b-9569-746a-d2cd-73a793348f31@oktetlabs.ru>
References: <20210902142359.28138-1-ivan.malov@oktetlabs.ru> <8e727e12-6655-43b9-9af3-bcc5b882508d@oktetlabs.ru> <9f44035b-9569-746a-d2cd-73a793348f31@oktetlabs.ru>
MIME-Version: 1.0
Content-Transfer-Encoding: 7Bit
Content-Type: text/plain; charset="us-ascii"
Subject: Re: [dpdk-dev] [PATCH v3 0/5] A means to negotiate delivery of Rx meta data
List-Id: DPDK patches and discussions

01/10/2021 08:47, Andrew Rybchenko:
> On 9/30/21 10:30 PM, Ivan Malov wrote:
> > Hi Thomas,
> >
> > On 30/09/2021 19:18, Thomas Monjalon wrote:
> >> 23/09/2021 13:20, Ivan Malov:
> >>> In 2019, commit [1] announced changes in DEV_RX_OFFLOAD namespace
> >>> intending to add new flags, RSS_HASH and FLOW_MARK.
> >>> Since then,
> >>> only the former has been added. The problem hasn't been solved.
> >>> Applications still assume that no efforts are needed to enable
> >>> flow mark and similar meta data delivery.
> >>>
> >>> The team behind net/sfc driver has to take over the efforts since
> >>> the problem has started impacting us. Riverhead, a cutting edge
> >>> Xilinx smart NIC family, has two Rx prefix types. Rx meta data
> >>> is available only from long Rx prefix. Switching between the
> >>> prefix formats can't happen in started state. Hence, we run
> >>> into the same problem which [1] was aiming to solve.
> >>
> >> Sorry I don't understand what is Rx prefix?
> >
> > A small chunk of per-packet metadata in Rx packet buffer preceding the
> > actual packet data. In terms of mbuf, this could be something lying
> > before m->data_off.

I've never seen the word "Rx prefix".
In general we talk about mbuf headroom and mbuf metadata,
the rest being the mbuf payload and mbuf tailroom.
I guess you mean mbuf metadata in the space of the struct rte_mbuf?

> >>> Rx meta data (mark, flag, tunnel ID) delivery is not an offload
> >>> on its own since the corresponding flows must be active to set
> >>> the data in the first place. Hence, adding offload flags
> >>> similar to RSS_HASH is not a good idea.
> >>
> >> What means "active" here?
> >
> > Active = inserted and functional. What this paragraph is trying to say
> > is that when you enable, say, RSS_HASH, that implies both computation of
> > the hash and the driver's ability to extract in from packets
> > ("delivery"). But when it comes to MARK, it's just "delivery". No
> > "offload" here: the NIC won't set any mark in packets unless you create
> > a flow rule to make it do so. That's the gist of it.

OK
Yes I agree RTE_FLOW_ACTION_TYPE_MARK doesn't need any offload flag.
Same for RTE_FLOW_ACTION_TYPE_SET_META.
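
To make the distinction above concrete, here is a minimal toy model in C.
It is a sketch only and not DPDK code: every name in it (toy_port, toy_rx,
toy_rx_meta) is hypothetical. It assumes that an RSS_HASH-like offload makes
the device compute and deliver a hash for every packet, while a mark shows up
only when an active flow rule sets it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical port state; none of these names exist in DPDK. */
struct toy_port {
	bool rss_hash_enabled;  /* models an offload flag such as RSS_HASH */
	bool mark_rule_active;  /* models an inserted flow rule with action MARK */
	uint32_t mark_id;       /* mark value carried by that rule */
};

/* Hypothetical per-packet metadata as seen by the application. */
struct toy_rx_meta {
	bool has_hash;
	uint32_t hash;
	bool has_mark;
	uint32_t mark;
};

/* Receive one packet identified by flow_key and fill in its metadata. */
static struct toy_rx_meta
toy_rx(const struct toy_port *port, uint32_t flow_key)
{
	struct toy_rx_meta meta = { 0 };

	assert(port != NULL);

	/* Offload: enabling it implies both computation and delivery. */
	if (port->rss_hash_enabled) {
		meta.has_hash = true;
		meta.hash = flow_key * UINT32_C(2654435761); /* arbitrary toy hash */
	}

	/* Mark: nothing to compute; it exists only if a rule sets it. */
	if (port->mark_rule_active) {
		meta.has_mark = true;
		meta.mark = port->mark_id;
	}

	return meta;
}
```

With rss_hash_enabled set and no rule active, toy_rx() returns a hash and no
mark; the mark appears only once mark_rule_active is set.
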
> >>> Patch [1/5] of this series adds a generic API to let applications
> >>> negotiate delivery of Rx meta data during initialisation period.

What is metadata here?
Do you mean RTE_FLOW_ITEM_TYPE_META and RTE_FLOW_ITEM_TYPE_MARK?
The word "metadata" could cover any field in the mbuf struct, so it is vague.

> >>> This way, an application knows right from the start which parts
> >>> of Rx meta data won't be delivered. Hence, no necessity to try
> >>> inserting flows requesting such data and handle the failures.
> >>
> >> Sorry I don't understand the problem you want to solve.
> >> And sorry for not noticing earlier.
> >
> > No worries. *Some* PMDs do not enable delivery of, say, Rx mark with the
> > packets by default (for performance reasons). If the application tries
> > to insert a flow with action MARK, the PMD may not be able to enable
> > delivery of Rx mark without the need to re-start Rx sub-system. And
> > that's fraught with traffic disruption and similar bad consequences. In
> > order to address it, we need to let the application express its interest
> > in receiving mark with packets as early as possible. This way, the PMD
> > can enable Rx mark delivery in advance. And, as an additional benefit,
> > the application can learn *from the very beginning* whether it will be
> > possible to use the feature or not. If this API tells the application
> > that no mark delivery will be enabled, then the application can just
> > skip many unnecessary attempts to insert wittingly unsupported flows
> > during runtime.

I'm puzzled, because we could have the same reasoning for any offload.
I don't understand why we are focusing on mark only.
I would prefer we find a generic solution using the rte_flow API.
Can we make rte_flow_validate() work before port start?
If validating a fake rule doesn't make sense,
why not have a new function accepting a single action as a parameter?
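
For readers following the thread, the negotiation idea quoted above can be
sketched as a small self-contained C model. This is an illustration under
stated assumptions, not the API from patch [1/5]: the names toy_dev,
toy_rx_meta_negotiate and the TOY_RX_META_* bits are hypothetical. It assumes
that the application passes a bitmask of the Rx meta data it wants (mark,
flag, tunnel ID), that the driver clears the bits it cannot deliver, and that
the call is refused once the port is started.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical feature bits, one per kind of Rx meta data. */
#define TOY_RX_META_FLAG      (UINT64_C(1) << 0)
#define TOY_RX_META_MARK      (UINT64_C(1) << 1)
#define TOY_RX_META_TUNNEL_ID (UINT64_C(1) << 2)

/* Hypothetical device state; not a DPDK structure. */
struct toy_dev {
	int started;         /* non-zero once the port has been started */
	uint64_t supported;  /* meta data the device is able to deliver */
	uint64_t enabled;    /* meta data agreed during negotiation */
};

/*
 * On input, *features holds what the application would like to receive.
 * On output, it holds what will actually be delivered.
 * Returns 0 on success, -1 if the port is already started.
 */
static int
toy_rx_meta_negotiate(struct toy_dev *dev, uint64_t *features)
{
	assert(dev != NULL && features != NULL);

	if (dev->started)
		return -1; /* too late: Rx would have to be restarted */

	*features &= dev->supported;
	dev->enabled = *features;
	return 0;
}

/* The application can check this once instead of probing with flow rules. */
static int
toy_mark_delivery_enabled(const struct toy_dev *dev)
{
	return (dev->enabled & TOY_RX_META_MARK) != 0;
}
```

An application would call toy_rx_meta_negotiate() once before starting the
port and then skip flow rules with action MARK if the MARK bit came back
cleared.
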
> Thomas, if I'm not mistaken, net/mlx5 dv_xmeta_en driver option
> is vendor-specific way to address the same problem.

Not exactly, it is configuring the capabilities:

+------+-----------+-----------+-------------+-------------+
| Mode | ``MARK``  | ``META``  | ``META`` Tx | FDB/Through |
+======+===========+===========+=============+=============+
| 0    | 24 bits   | 32 bits   | 32 bits     | no          |
+------+-----------+-----------+-------------+-------------+
| 1    | 24 bits   | vary 0-32 | 32 bits     | yes         |
+------+-----------+-----------+-------------+-------------+
| 2    | vary 0-24 | 32 bits   | 32 bits     | yes         |
+------+-----------+-----------+-------------+-------------+