From: Yongseok Koh
Date: Wed, 11 Jul 2018 17:59:18 -0700
To: Adrien Mazarguil
Cc: Shahaf Shuler, Nelio Laranjeiro, dev@dpdk.org
Message-ID: <20180712005917.GD69686@yongseok-MBP.local>
References: <20180627173355.4718-1-adrien.mazarguil@6wind.com> <20180627173355.4718-3-adrien.mazarguil@6wind.com>
In-Reply-To: <20180627173355.4718-3-adrien.mazarguil@6wind.com>
Subject: Re: [dpdk-dev] [PATCH 2/6] net/mlx5: add framework for switch flow rules
List-Id: DPDK patches and discussions

On Wed, Jun 27, 2018 at 08:08:12PM +0200, Adrien Mazarguil wrote:
> Because mlx5 switch flow rules are configured through Netlink (TC
> interface) and have little in common with Verbs, this patch adds a separate
> parser function to handle them.
>
> - mlx5_nl_flow_transpose() converts a rte_flow rule to its TC equivalent
>   and stores the result in a buffer.
>
> - mlx5_nl_flow_brand() gives a unique handle to a flow rule buffer.
>
> - mlx5_nl_flow_create() instantiates a flow rule on the device based on
>   such a buffer.
>
> - mlx5_nl_flow_destroy() performs the reverse operation.
>
> These functions are called by the existing implementation when encountering
> flow rules which must be offloaded to the switch (currently relying on the
> transfer attribute).
>
> Signed-off-by: Adrien Mazarguil
> Signed-off-by: Nelio Laranjeiro
> ---
>  drivers/net/mlx5/mlx5.h         |  18 +++
>  drivers/net/mlx5/mlx5_flow.c    | 113 ++++++++++++++
>  drivers/net/mlx5/mlx5_nl_flow.c | 295 +++++++++++++++++++++++++++++++++++
>  3 files changed, 426 insertions(+)
>
> diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
> index 390249adb..aa16057d6 100644
> --- a/drivers/net/mlx5/mlx5.h
> +++ b/drivers/net/mlx5/mlx5.h
> @@ -148,6 +148,12 @@ struct mlx5_drop {
>  	struct mlx5_rxq_ibv *rxq; /* Verbs Rx queue. */
>  };
>
> +/** DPDK port to network interface index (ifindex) conversion. */
> +struct mlx5_nl_flow_ptoi {
> +	uint16_t port_id; /**< DPDK port ID. */
> +	unsigned int ifindex; /**< Network interface index. */
> +};
> +
>  struct mnl_socket;
>
>  struct priv {
> @@ -374,6 +380,18 @@ int mlx5_nl_allmulti(struct rte_eth_dev *dev, int enable);
>
>  /* mlx5_nl_flow.c */
>
> +int mlx5_nl_flow_transpose(void *buf,
> +			   size_t size,
> +			   const struct mlx5_nl_flow_ptoi *ptoi,
> +			   const struct rte_flow_attr *attr,
> +			   const struct rte_flow_item *pattern,
> +			   const struct rte_flow_action *actions,
> +			   struct rte_flow_error *error);
> +void mlx5_nl_flow_brand(void *buf, uint32_t handle);
> +int mlx5_nl_flow_create(struct mnl_socket *nl, void *buf,
> +			struct rte_flow_error *error);
> +int mlx5_nl_flow_destroy(struct mnl_socket *nl, void *buf,
> +			 struct rte_flow_error *error);
>  int mlx5_nl_flow_init(struct mnl_socket *nl, unsigned int ifindex,
>  		      struct rte_flow_error *error);
>  struct mnl_socket *mlx5_nl_flow_socket_create(void);
> diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
> index 9241855be..93b245991 100644
> --- a/drivers/net/mlx5/mlx5_flow.c
> +++ b/drivers/net/mlx5/mlx5_flow.c
> @@ -4,6 +4,7 @@
>   */
>
>  #include
> +#include
>  #include
>  #include
>
> @@ -271,6 +272,7 @@ struct rte_flow {
>  	/**< Store tunnel packet type data to store in Rx queue. */
>  	uint8_t key[40]; /**< RSS hash key. */
>  	uint16_t (*queue)[]; /**< Destination queues to redirect traffic to. */
> +	void *nl_flow; /**< Netlink flow buffer if relevant. */
>  };
>
>  static const struct rte_flow_ops mlx5_flow_ops = {
> @@ -2403,6 +2405,106 @@ mlx5_flow_actions(struct rte_eth_dev *dev,
>  }
>
>  /**
> + * Validate flow rule and fill flow structure accordingly.
> + *
> + * @param dev
> + *   Pointer to Ethernet device.
> + * @param[out] flow
> + *   Pointer to flow structure.
> + * @param flow_size
> + *   Size of allocated space for @p flow.
> + * @param[in] attr
> + *   Flow rule attributes.
> + * @param[in] pattern
> + *   Pattern specification (list terminated by the END pattern item).
> + * @param[in] actions
> + *   Associated actions (list terminated by the END action).
> + * @param[out] error
> + *   Perform verbose error reporting if not NULL.
> + *
> + * @return
> + *   A positive value representing the size of the flow object in bytes
> + *   regardless of @p flow_size on success, a negative errno value otherwise
> + *   and rte_errno is set.
> + */
> +static int
> +mlx5_flow_merge_switch(struct rte_eth_dev *dev,
> +		       struct rte_flow *flow,
> +		       size_t flow_size,
> +		       const struct rte_flow_attr *attr,
> +		       const struct rte_flow_item pattern[],
> +		       const struct rte_flow_action actions[],
> +		       struct rte_flow_error *error)
> +{
> +	struct priv *priv = dev->data->dev_private;
> +	unsigned int n = mlx5_domain_to_port_id(priv->domain_id, NULL, 0);
> +	uint16_t port_list[!n + n];
> +	struct mlx5_nl_flow_ptoi ptoi[!n + n + 1];
> +	size_t off = RTE_ALIGN_CEIL(sizeof(*flow), alignof(max_align_t));
> +	unsigned int i;
> +	unsigned int own = 0;
> +	int ret;
> +
> +	/* At least one port is needed when no switch domain is present. */
> +	if (!n) {
> +		n = 1;
> +		port_list[0] = dev->data->port_id;
> +	} else {
> +		n = mlx5_domain_to_port_id(priv->domain_id, port_list, n);
> +		if (n > RTE_DIM(port_list))
> +			n = RTE_DIM(port_list);
> +	}
> +	for (i = 0; i != n; ++i) {
> +		struct rte_eth_dev_info dev_info;
> +
> +		rte_eth_dev_info_get(port_list[i], &dev_info);
> +		if (port_list[i] == dev->data->port_id)
> +			own = i;
> +		ptoi[i].port_id = port_list[i];
> +		ptoi[i].ifindex = dev_info.if_index;
> +	}
> +	/* Ensure first entry of ptoi[] is the current device. */
> +	if (own) {
> +		ptoi[n] = ptoi[0];
> +		ptoi[0] = ptoi[own];
> +		ptoi[own] = ptoi[n];
> +	}
> +	/* An entry with zero ifindex terminates ptoi[]. */
> +	ptoi[n].port_id = 0;
> +	ptoi[n].ifindex = 0;
> +	if (flow_size < off)
> +		flow_size = 0;
> +	ret = mlx5_nl_flow_transpose((uint8_t *)flow + off,
> +				     flow_size ? flow_size - off : 0,
> +				     ptoi, attr, pattern, actions, error);
> +	if (ret < 0)
> +		return ret;

So the assumption here is that the buffer allocated outside of this API is
large enough to hold everything mlx5_nl_flow_transpose() writes, right? If
flow_size isn't large enough, buf_tmp is used instead and _transpose() returns
the required size rather than an error. That is confusing; either the behavior
should change or the documentation should be clearer.

> +	if (flow_size) {
> +		*flow = (struct rte_flow){
> +			.attributes = *attr,
> +			.nl_flow = (uint8_t *)flow + off,
> +		};
> +		/*
> +		 * Generate a reasonably unique handle based on the address
> +		 * of the target buffer.
> +		 *
> +		 * This is straightforward on 32-bit systems where the flow
> +		 * pointer can be used directly. Otherwise, its least
> +		 * significant part is taken after shifting it by the
> +		 * previous power of two of the pointed buffer size.
> +		 */
> +		if (sizeof(flow) <= 4)
> +			mlx5_nl_flow_brand(flow->nl_flow, (uintptr_t)flow);
> +		else
> +			mlx5_nl_flow_brand
> +				(flow->nl_flow,
> +				 (uintptr_t)flow >>
> +				 rte_log2_u32(rte_align32prevpow2(flow_size)));
> +	}
> +	return off + ret;
> +}
> +
> +/**
>   * Validate the rule and return a flow structure filled accordingly.
>   *
>   * @param dev
> @@ -2439,6 +2541,9 @@ mlx5_flow_merge(struct rte_eth_dev *dev, struct rte_flow *flow,
>  	int ret;
>  	uint32_t i;
>
> +	if (attr->transfer)
> +		return mlx5_flow_merge_switch(dev, flow, flow_size,
> +					      attr, items, actions, error);
>  	if (!remain)
>  		flow = &local_flow;
>  	ret = mlx5_flow_attributes(dev, attr, flow, error);
> @@ -2554,8 +2659,11 @@ mlx5_flow_validate(struct rte_eth_dev *dev,
>  static void
>  mlx5_flow_fate_remove(struct rte_eth_dev *dev, struct rte_flow *flow)
>  {
> +	struct priv *priv = dev->data->dev_private;
>  	struct mlx5_flow_verbs *verbs;
>
> +	if (flow->nl_flow && priv->mnl_socket)
> +		mlx5_nl_flow_destroy(priv->mnl_socket, flow->nl_flow, NULL);
>  	LIST_FOREACH(verbs, &flow->verbs, next) {
>  		if (verbs->flow) {
>  			claim_zero(mlx5_glue->destroy_flow(verbs->flow));
> @@ -2592,6 +2700,7 @@ static int
>  mlx5_flow_fate_apply(struct rte_eth_dev *dev, struct rte_flow *flow,
>  		     struct rte_flow_error *error)
>  {
> +	struct priv *priv = dev->data->dev_private;
>  	struct mlx5_flow_verbs *verbs;
>  	int err;
>
> @@ -2640,6 +2749,10 @@ mlx5_flow_fate_apply(struct rte_eth_dev *dev, struct rte_flow *flow,
>  			goto error;
>  		}
>  	}
> +	if (flow->nl_flow &&
> +	    priv->mnl_socket &&
> +	    mlx5_nl_flow_create(priv->mnl_socket, flow->nl_flow, error))
> +		goto error;
>  	return 0;
>  error:
>  	err = rte_errno; /* Save rte_errno before cleanup. */
> diff --git a/drivers/net/mlx5/mlx5_nl_flow.c b/drivers/net/mlx5/mlx5_nl_flow.c
> index 7a8683b03..1fc62fb0a 100644
> --- a/drivers/net/mlx5/mlx5_nl_flow.c
> +++ b/drivers/net/mlx5/mlx5_nl_flow.c
> @@ -5,7 +5,9 @@
>
>  #include
>  #include
> +#include
>  #include
> +#include
>  #include
>  #include
>  #include
> @@ -14,11 +16,248 @@
>  #include
>  #include
>
> +#include
>  #include
>  #include
>
>  #include "mlx5.h"
>
> +/** Parser state definitions for mlx5_nl_flow_trans[]. */
> +enum mlx5_nl_flow_trans {
> +	INVALID,
> +	BACK,
> +	ATTR,
> +	PATTERN,
> +	ITEM_VOID,
> +	ACTIONS,
> +	ACTION_VOID,
> +	END,
> +};
> +
> +#define TRANS(...) (const enum mlx5_nl_flow_trans []){ __VA_ARGS__, INVALID, }
> +
> +#define PATTERN_COMMON \
> +	ITEM_VOID, ACTIONS
> +#define ACTIONS_COMMON \
> +	ACTION_VOID, END
> +
> +/** Parser state transitions used by mlx5_nl_flow_transpose(). */
> +static const enum mlx5_nl_flow_trans *const mlx5_nl_flow_trans[] = {
> +	[INVALID] = NULL,
> +	[BACK] = NULL,
> +	[ATTR] = TRANS(PATTERN),
> +	[PATTERN] = TRANS(PATTERN_COMMON),
> +	[ITEM_VOID] = TRANS(BACK),
> +	[ACTIONS] = TRANS(ACTIONS_COMMON),
> +	[ACTION_VOID] = TRANS(BACK),
> +	[END] = NULL,
> +};
> +
> +/**
> + * Transpose flow rule description to rtnetlink message.
> + *
> + * This function transposes a flow rule description to a traffic control
> + * (TC) filter creation message ready to be sent over Netlink.
> + *
> + * Target interface is specified as the first entry of the @p ptoi table.
> + * Subsequent entries enable this function to resolve other DPDK port IDs
> + * found in the flow rule.
> + *
> + * @param[out] buf
> + *   Output message buffer. May be NULL when @p size is 0.
> + * @param size
> + *   Size of @p buf. Message may be truncated if not large enough.
> + * @param[in] ptoi
> + *   DPDK port ID to network interface index translation table. This table
> + *   is terminated by an entry with a zero ifindex value.
> + * @param[in] attr
> + *   Flow rule attributes.
> + * @param[in] pattern
> + *   Pattern specification.
> + * @param[in] actions
> + *   Associated actions.
> + * @param[out] error
> + *   Perform verbose error reporting if not NULL.
> + *
> + * @return
> + *   A positive value representing the exact size of the message in bytes
> + *   regardless of the @p size parameter on success, a negative errno value
> + *   otherwise and rte_errno is set.
> + */
> +int
> +mlx5_nl_flow_transpose(void *buf,
> +		       size_t size,
> +		       const struct mlx5_nl_flow_ptoi *ptoi,
> +		       const struct rte_flow_attr *attr,
> +		       const struct rte_flow_item *pattern,
> +		       const struct rte_flow_action *actions,
> +		       struct rte_flow_error *error)
> +{
> +	alignas(struct nlmsghdr)
> +	uint8_t buf_tmp[MNL_SOCKET_BUFFER_SIZE];
> +	const struct rte_flow_item *item;
> +	const struct rte_flow_action *action;
> +	unsigned int n;
> +	struct nlattr *na_flower;
> +	struct nlattr *na_flower_act;
> +	const enum mlx5_nl_flow_trans *trans;
> +	const enum mlx5_nl_flow_trans *back;
> +
> +	if (!size)
> +		goto error_nobufs;
> +init:
> +	item = pattern;
> +	action = actions;
> +	n = 0;
> +	na_flower = NULL;
> +	na_flower_act = NULL;
> +	trans = TRANS(ATTR);
> +	back = trans;
> +trans:
> +	switch (trans[n++]) {
> +		struct nlmsghdr *nlh;
> +		struct tcmsg *tcm;
> +
> +	case INVALID:
> +		if (item->type)
> +			return rte_flow_error_set
> +				(error, ENOTSUP, RTE_FLOW_ERROR_TYPE_ITEM,
> +				 item, "unsupported pattern item combination");
> +		else if (action->type)
> +			return rte_flow_error_set
> +				(error, ENOTSUP, RTE_FLOW_ERROR_TYPE_ACTION,
> +				 action, "unsupported action combination");
> +		return rte_flow_error_set
> +			(error, ENOTSUP, RTE_FLOW_ERROR_TYPE_UNSPECIFIED, NULL,
> +			 "flow rule lacks some kind of fate action");
> +	case BACK:
> +		trans = back;
> +		n = 0;
> +		goto trans;
> +	case ATTR:
> +		/*
> +		 * Supported attributes: no groups, some priorities and
> +		 * ingress only. Don't care about transfer as it is the
> +		 * caller's problem.
> +		 */
> +		if (attr->group)
> +			return rte_flow_error_set
> +				(error, ENOTSUP,
> +				 RTE_FLOW_ERROR_TYPE_ATTR_GROUP,
> +				 attr, "groups are not supported");
> +		if (attr->priority > 0xfffe)
> +			return rte_flow_error_set
> +				(error, ENOTSUP,
> +				 RTE_FLOW_ERROR_TYPE_ATTR_PRIORITY,
> +				 attr, "lowest priority level is 0xfffe");
> +		if (!attr->ingress)
> +			return rte_flow_error_set
> +				(error, ENOTSUP,
> +				 RTE_FLOW_ERROR_TYPE_ATTR_INGRESS,
> +				 attr, "only ingress is supported");
> +		if (attr->egress)
> +			return rte_flow_error_set
> +				(error, ENOTSUP,
> +				 RTE_FLOW_ERROR_TYPE_ATTR_INGRESS,
> +				 attr, "egress is not supported");
> +		if (size < mnl_nlmsg_size(sizeof(*tcm)))
> +			goto error_nobufs;
> +		nlh = mnl_nlmsg_put_header(buf);
> +		nlh->nlmsg_type = 0;
> +		nlh->nlmsg_flags = 0;
> +		nlh->nlmsg_seq = 0;
> +		tcm = mnl_nlmsg_put_extra_header(nlh, sizeof(*tcm));
> +		tcm->tcm_family = AF_UNSPEC;
> +		tcm->tcm_ifindex = ptoi[0].ifindex;
> +		/*
> +		 * Let kernel pick a handle by default. A predictable handle
> +		 * can be set by the caller on the resulting buffer through
> +		 * mlx5_nl_flow_brand().
> +		 */
> +		tcm->tcm_handle = 0;
> +		tcm->tcm_parent = TC_H_MAKE(TC_H_INGRESS, TC_H_MIN_INGRESS);
> +		/*
> +		 * Priority cannot be zero to prevent the kernel from
> +		 * picking one automatically.
> +		 */
> +		tcm->tcm_info = TC_H_MAKE((attr->priority + 1) << 16,
> +					  RTE_BE16(ETH_P_ALL));
> +		break;
> +	case PATTERN:
> +		if (!mnl_attr_put_strz_check(buf, size, TCA_KIND, "flower"))
> +			goto error_nobufs;
> +		na_flower = mnl_attr_nest_start_check(buf, size, TCA_OPTIONS);
> +		if (!na_flower)
> +			goto error_nobufs;
> +		if (!mnl_attr_put_u32_check(buf, size, TCA_FLOWER_FLAGS,
> +					    TCA_CLS_FLAGS_SKIP_SW))
> +			goto error_nobufs;
> +		break;
> +	case ITEM_VOID:
> +		if (item->type != RTE_FLOW_ITEM_TYPE_VOID)
> +			goto trans;
> +		++item;
> +		break;
> +	case ACTIONS:
> +		if (item->type != RTE_FLOW_ITEM_TYPE_END)
> +			goto trans;
> +		assert(na_flower);
> +		assert(!na_flower_act);
> +		na_flower_act =
> +			mnl_attr_nest_start_check(buf, size, TCA_FLOWER_ACT);
> +		if (!na_flower_act)
> +			goto error_nobufs;
> +		break;
> +	case ACTION_VOID:
> +		if (action->type != RTE_FLOW_ACTION_TYPE_VOID)
> +			goto trans;
> +		++action;
> +		break;
> +	case END:
> +		if (item->type != RTE_FLOW_ITEM_TYPE_END ||
> +		    action->type != RTE_FLOW_ACTION_TYPE_END)
> +			goto trans;
> +		if (na_flower_act)
> +			mnl_attr_nest_end(buf, na_flower_act);
> +		if (na_flower)
> +			mnl_attr_nest_end(buf, na_flower);
> +		nlh = buf;
> +		return nlh->nlmsg_len;
> +	}
> +	back = trans;
> +	trans = mlx5_nl_flow_trans[trans[n - 1]];
> +	n = 0;
> +	goto trans;
> +error_nobufs:
> +	if (buf != buf_tmp) {
> +		buf = buf_tmp;
> +		size = sizeof(buf_tmp);
> +		goto init;
> +	}

Continuing my comment above: this part is unclear. It looks to me like this
function does the following:

1) If size is zero, it treats the call as a query for the amount of memory
   required.
2) If size is non-zero but too small, it stops writing to buf and starts over
   in order to return the amount of memory required instead of an error.
3) If size is non-zero and large enough, it fills in buf.

Do I understand this correctly?
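
A minimal sketch of the three cases listed above, assuming that reading of the
code is right. sketch_transpose(), sketch_build() and SCRATCH_SIZE are made-up
names for illustration and are not part of the patch; SCRATCH_SIZE only stands
in for MNL_SOCKET_BUFFER_SIZE, and a plain memcpy() stands in for building the
TC message.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SCRATCH_SIZE 64 /* Hypothetical; stands in for MNL_SOCKET_BUFFER_SIZE. */

/*
 * Hypothetical stand-in for mlx5_nl_flow_transpose(): "builds" a message of
 * msg_len bytes and returns its exact size even when buf is too small.
 */
static int
sketch_transpose(void *buf, size_t size, const void *msg, size_t msg_len)
{
	uint8_t scratch[SCRATCH_SIZE];

	if (size < msg_len) {
		/* Cases 1 and 2: start over in the scratch buffer. */
		buf = scratch;
		size = sizeof(scratch);
	}
	if (size < msg_len)
		return -ENOBUFS; /* Too large even for the scratch buffer. */
	/* Case 3, or a measuring pass whose output is discarded. */
	memcpy(buf, msg, msg_len);
	return (int)msg_len; /* Required size, whatever size was passed in. */
}

/* Caller side: query the required size first, then allocate and fill. */
static void *
sketch_build(const void *msg, size_t msg_len, int *len)
{
	void *buf;
	int ret = sketch_transpose(NULL, 0, msg, msg_len);

	if (ret < 0)
		return NULL;
	buf = malloc((size_t)ret);
	if (buf == NULL)
		return NULL;
	*len = sketch_transpose(buf, (size_t)ret, msg, msg_len);
	return buf;
}
```

One difference worth noting: in this sketch the caller's buffer is left
untouched when it is too small, whereas the patch documents that the message
may be truncated in that case. Either way, the caller can only trust buf when
the returned size fits in what it passed.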

Thanks,
Yongseok

> +	return rte_flow_error_set
> +		(error, ENOBUFS, RTE_FLOW_ERROR_TYPE_UNSPECIFIED, NULL,
> +		 "generated TC message is too large");
> +}
> +
> +/**
> + * Brand rtnetlink buffer with unique handle.
> + *
> + * This handle should be unique for a given network interface to avoid
> + * collisions.
> + *
> + * @param buf
> + *   Flow rule buffer previously initialized by mlx5_nl_flow_transpose().
> + * @param handle
> + *   Unique 32-bit handle to use.
> + */
> +void
> +mlx5_nl_flow_brand(void *buf, uint32_t handle)
> +{
> +	struct tcmsg *tcm = mnl_nlmsg_get_payload(buf);
> +
> +	tcm->tcm_handle = handle;
> +}
> +
>  /**
>   * Send Netlink message with acknowledgment.
>   *
> @@ -54,6 +293,62 @@ mlx5_nl_flow_nl_ack(struct mnl_socket *nl, struct nlmsghdr *nlh)
>  }
>
>  /**
> + * Create a Netlink flow rule.
> + *
> + * @param nl
> + *   Libmnl socket to use.
> + * @param buf
> + *   Flow rule buffer previously initialized by mlx5_nl_flow_transpose().
> + * @param[out] error
> + *   Perform verbose error reporting if not NULL.
> + *
> + * @return
> + *   0 on success, a negative errno value otherwise and rte_errno is set.
> + */
> +int
> +mlx5_nl_flow_create(struct mnl_socket *nl, void *buf,
> +		    struct rte_flow_error *error)
> +{
> +	struct nlmsghdr *nlh = buf;
> +
> +	nlh->nlmsg_type = RTM_NEWTFILTER;
> +	nlh->nlmsg_flags = NLM_F_REQUEST | NLM_F_CREATE | NLM_F_EXCL;
> +	if (!mlx5_nl_flow_nl_ack(nl, nlh))
> +		return 0;
> +	return rte_flow_error_set
> +		(error, rte_errno, RTE_FLOW_ERROR_TYPE_UNSPECIFIED, NULL,
> +		 "netlink: failed to create TC flow rule");
> +}
> +
> +/**
> + * Destroy a Netlink flow rule.
> + *
> + * @param nl
> + *   Libmnl socket to use.
> + * @param buf
> + *   Flow rule buffer previously initialized by mlx5_nl_flow_transpose().
> + * @param[out] error
> + *   Perform verbose error reporting if not NULL.
> + *
> + * @return
> + *   0 on success, a negative errno value otherwise and rte_errno is set.
> + */
> +int
> +mlx5_nl_flow_destroy(struct mnl_socket *nl, void *buf,
> +		     struct rte_flow_error *error)
> +{
> +	struct nlmsghdr *nlh = buf;
> +
> +	nlh->nlmsg_type = RTM_DELTFILTER;
> +	nlh->nlmsg_flags = NLM_F_REQUEST;
> +	if (!mlx5_nl_flow_nl_ack(nl, nlh))
> +		return 0;
> +	return rte_flow_error_set
> +		(error, errno, RTE_FLOW_ERROR_TYPE_UNSPECIFIED, NULL,
> +		 "netlink: failed to destroy TC flow rule");
> +}
> +
> +/**
>   * Initialize ingress qdisc of a given network interface.
>   *
>   * @param nl
> --
> 2.11.0