From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <yskoh@mellanox.com>
Date: Thu, 5 Jul 2018 19:16:35 -0700
From: Yongseok Koh <yskoh@mellanox.com>
To: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Cc: dev@dpdk.org, Adrien Mazarguil <adrien.mazarguil@6wind.com>
Message-ID: <20180706021630.GB47821@yongseok-MBP.local>
References: <cover.1527506071.git.nelio.laranjeiro@6wind.com>
 <cover.1530111623.git.nelio.laranjeiro@6wind.com>
 <ae5d5fc2b1a1501ca622e31c9d1cc6a348b2bd15.1530111623.git.nelio.laranjeiro@6wind.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <ae5d5fc2b1a1501ca622e31c9d1cc6a348b2bd15.1530111623.git.nelio.laranjeiro@6wind.com>
User-Agent: Mutt/1.9.3 (2018-01-21)
Subject: Re: [dpdk-dev] [PATCH v2 13/20] net/mlx5: add RSS flow action
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions <dev.dpdk.org>
List-Unsubscribe: <https://mails.dpdk.org/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://mails.dpdk.org/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <https://mails.dpdk.org/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
X-List-Received-Date: Fri, 06 Jul 2018 02:16:52 -0000

On Wed, Jun 27, 2018 at 05:07:45PM +0200, Nelio Laranjeiro wrote:
> Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
> ---
>  drivers/net/mlx5/mlx5_flow.c | 1211 +++++++++++++++++++++++++---------
>  1 file changed, 899 insertions(+), 312 deletions(-)
> 
> diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
> index a39157533..08e0a6556 100644
> --- a/drivers/net/mlx5/mlx5_flow.c
> +++ b/drivers/net/mlx5/mlx5_flow.c
> @@ -51,13 +51,148 @@ extern const struct eth_dev_ops mlx5_dev_ops_isolate;
>  /* Action fate on the packet. */
>  #define MLX5_FLOW_FATE_DROP (1u << 0)
>  #define MLX5_FLOW_FATE_QUEUE (1u << 1)
> +#define MLX5_FLOW_FATE_RSS (1u << 2)
>  
>  /* Modify a packet. */
>  #define MLX5_FLOW_MOD_FLAG (1u << 0)
>  #define MLX5_FLOW_MOD_MARK (1u << 1)
>  
> +/* Priority reserved for default flows. */
> +#define MLX5_FLOW_PRIO_RSVD ((uint32_t)-1)
> +
> +enum mlx5_expansion {
> +	MLX5_EXPANSION_ROOT,
> +	MLX5_EXPANSION_ROOT2,

How about MLX5_EXPANSION_OUTER_ROOT?
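Something like (just a naming sketch, the other entries stay as they are):

enum mlx5_expansion {
	MLX5_EXPANSION_ROOT,
	MLX5_EXPANSION_OUTER_ROOT, /* instead of MLX5_EXPANSION_ROOT2 */

	...
};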

> +	MLX5_EXPANSION_OUTER_ETH,
> +	MLX5_EXPANSION_OUTER_IPV4,
> +	MLX5_EXPANSION_OUTER_IPV4_UDP,
> +	MLX5_EXPANSION_OUTER_IPV4_TCP,
> +	MLX5_EXPANSION_OUTER_IPV6,
> +	MLX5_EXPANSION_OUTER_IPV6_UDP,
> +	MLX5_EXPANSION_OUTER_IPV6_TCP,
> +	MLX5_EXPANSION_VXLAN,
> +	MLX5_EXPANSION_VXLAN_GPE,
> +	MLX5_EXPANSION_GRE,
> +	MLX5_EXPANSION_MPLS,
> +	MLX5_EXPANSION_ETH,
> +	MLX5_EXPANSION_IPV4,
> +	MLX5_EXPANSION_IPV4_UDP,
> +	MLX5_EXPANSION_IPV4_TCP,
> +	MLX5_EXPANSION_IPV6,
> +	MLX5_EXPANSION_IPV6_UDP,
> +	MLX5_EXPANSION_IPV6_TCP,
> +};
> +
> +/** Supported expansion of items. */
> +static const struct rte_flow_expand_node mlx5_support_expansion[] = {
> +	[MLX5_EXPANSION_ROOT] = {
> +		.next = RTE_FLOW_EXPAND_ITEMS(MLX5_EXPANSION_ETH,
> +					      MLX5_EXPANSION_IPV4,
> +					      MLX5_EXPANSION_IPV6),
> +		.type = RTE_FLOW_ITEM_TYPE_END,
> +	},
> +	[MLX5_EXPANSION_ROOT2] = {
> +		.next = RTE_FLOW_EXPAND_ITEMS(MLX5_EXPANSION_OUTER_ETH,
> +					      MLX5_EXPANSION_OUTER_IPV4,
> +					      MLX5_EXPANSION_OUTER_IPV6),
> +		.type = RTE_FLOW_ITEM_TYPE_END,
> +	},
> +	[MLX5_EXPANSION_OUTER_ETH] = {
> +		.next = RTE_FLOW_EXPAND_ITEMS(MLX5_EXPANSION_OUTER_IPV4,
> +					      MLX5_EXPANSION_OUTER_IPV6),
> +		.type = RTE_FLOW_ITEM_TYPE_ETH,
> +		.rss_types = 0,
> +	},
> +	[MLX5_EXPANSION_OUTER_IPV4] = {
> +		.next = RTE_FLOW_EXPAND_ITEMS(MLX5_EXPANSION_OUTER_IPV4_UDP,
> +					      MLX5_EXPANSION_OUTER_IPV4_TCP),
> +		.type = RTE_FLOW_ITEM_TYPE_IPV4,
> +		.rss_types = ETH_RSS_IPV4 | ETH_RSS_FRAG_IPV4 |
> +			ETH_RSS_NONFRAG_IPV4_OTHER,
> +	},
> +	[MLX5_EXPANSION_OUTER_IPV4_UDP] = {
> +		.next = RTE_FLOW_EXPAND_ITEMS(MLX5_EXPANSION_VXLAN),
> +		.type = RTE_FLOW_ITEM_TYPE_UDP,
> +		.rss_types = ETH_RSS_NONFRAG_IPV4_UDP,
> +	},
> +	[MLX5_EXPANSION_OUTER_IPV4_TCP] = {
> +		.type = RTE_FLOW_ITEM_TYPE_TCP,
> +		.rss_types = ETH_RSS_NONFRAG_IPV4_TCP,
> +	},
> +	[MLX5_EXPANSION_OUTER_IPV6] = {
> +		.next = RTE_FLOW_EXPAND_ITEMS(MLX5_EXPANSION_OUTER_IPV6_UDP,
> +					      MLX5_EXPANSION_OUTER_IPV6_TCP),
> +		.type = RTE_FLOW_ITEM_TYPE_IPV6,
> +		.rss_types = ETH_RSS_IPV6 | ETH_RSS_FRAG_IPV6 |
> +			ETH_RSS_NONFRAG_IPV6_OTHER,
> +	},
> +	[MLX5_EXPANSION_OUTER_IPV6_UDP] = {
> +		.next = RTE_FLOW_EXPAND_ITEMS(MLX5_EXPANSION_VXLAN),
> +		.type = RTE_FLOW_ITEM_TYPE_UDP,
> +		.rss_types = ETH_RSS_NONFRAG_IPV6_UDP,
> +	},
> +	[MLX5_EXPANSION_OUTER_IPV6_TCP] = {
> +		.type = RTE_FLOW_ITEM_TYPE_TCP,
> +		.rss_types = ETH_RSS_NONFRAG_IPV6_TCP,
> +	},
> +	[MLX5_EXPANSION_VXLAN] = {
> +		.next = RTE_FLOW_EXPAND_ITEMS(MLX5_EXPANSION_ETH),
> +		.type = RTE_FLOW_ITEM_TYPE_VXLAN,
> +	},
> +	[MLX5_EXPANSION_VXLAN_GPE] = {
> +		.next = RTE_FLOW_EXPAND_ITEMS(MLX5_EXPANSION_ETH,
> +					      MLX5_EXPANSION_IPV4,
> +					      MLX5_EXPANSION_IPV6),
> +		.type = RTE_FLOW_ITEM_TYPE_VXLAN_GPE,
> +	},
> +	[MLX5_EXPANSION_GRE] = {
> +		.next = RTE_FLOW_EXPAND_ITEMS(MLX5_EXPANSION_IPV4),
> +		.type = RTE_FLOW_ITEM_TYPE_GRE,
> +	},
> +	[MLX5_EXPANSION_ETH] = {
> +		.next = RTE_FLOW_EXPAND_ITEMS(MLX5_EXPANSION_IPV4,
> +					      MLX5_EXPANSION_IPV6),
> +		.type = RTE_FLOW_ITEM_TYPE_ETH,
> +	},
> +	[MLX5_EXPANSION_IPV4] = {
> +		.next = RTE_FLOW_EXPAND_ITEMS(MLX5_EXPANSION_IPV4_UDP,
> +					      MLX5_EXPANSION_IPV4_TCP),
> +		.type = RTE_FLOW_ITEM_TYPE_IPV4,
> +		.rss_types = ETH_RSS_IPV4 | ETH_RSS_FRAG_IPV4 |
> +			ETH_RSS_NONFRAG_IPV4_OTHER,
> +	},
> +	[MLX5_EXPANSION_IPV4_UDP] = {
> +		.type = RTE_FLOW_ITEM_TYPE_UDP,
> +		.rss_types = ETH_RSS_NONFRAG_IPV4_UDP,
> +	},
> +	[MLX5_EXPANSION_IPV4_TCP] = {
> +		.type = RTE_FLOW_ITEM_TYPE_TCP,
> +		.rss_types = ETH_RSS_NONFRAG_IPV4_TCP,
> +	},
> +	[MLX5_EXPANSION_IPV6] = {
> +		.next = RTE_FLOW_EXPAND_ITEMS(MLX5_EXPANSION_IPV6_UDP,
> +					      MLX5_EXPANSION_IPV6_TCP),
> +		.type = RTE_FLOW_ITEM_TYPE_IPV6,
> +		.rss_types = ETH_RSS_IPV6 | ETH_RSS_FRAG_IPV6 |
> +			ETH_RSS_NONFRAG_IPV6_OTHER,
> +	},
> +	[MLX5_EXPANSION_IPV6_UDP] = {
> +		.type = RTE_FLOW_ITEM_TYPE_UDP,
> +		.rss_types = ETH_RSS_NONFRAG_IPV6_UDP,
> +	},
> +	[MLX5_EXPANSION_IPV6_TCP] = {
> +		.type = RTE_FLOW_ITEM_TYPE_TCP,
> +		.rss_types = ETH_RSS_NONFRAG_IPV6_TCP,
> +	},
> +};
> +
>  /** Handles information leading to a drop fate. */
>  struct mlx5_flow_verbs {
> +	LIST_ENTRY(mlx5_flow_verbs) next;
> +	uint32_t layers;
> +	/**< Bit-fields of expanded layers see MLX5_FLOW_ITEMS_*. */
> +	uint32_t modifier;
> +	/**< Bit-fields of expanded modifier see MLX5_FLOW_MOD_*. */
>  	unsigned int size; /**< Size of the attribute. */
>  	struct {
>  		struct ibv_flow_attr *attr;
> @@ -66,20 +201,26 @@ struct mlx5_flow_verbs {
>  	};
>  	struct ibv_flow *flow; /**< Verbs flow pointer. */
>  	struct mlx5_hrxq *hrxq; /**< Hash Rx queue object. */
> +	uint64_t hash_fields; /**< Verbs hash Rx queue hash fields. */
>  };
>  
>  /* Flow structure. */
>  struct rte_flow {
>  	TAILQ_ENTRY(rte_flow) next; /**< Pointer to the next flow structure. */
>  	struct rte_flow_attr attributes; /**< User flow attribute. */
> +	uint32_t expand:1; /**< Flow is expanded due to RSS configuration. */

Suggest 'expanded'.
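I.e.:

	uint32_t expanded:1; /**< Flow is expanded due to RSS configuration. */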

>  	uint32_t layers;
>  	/**< Bit-fields of present layers see MLX5_FLOW_ITEMS_*. */
>  	uint32_t modifier;
>  	/**< Bit-fields of present modifier see MLX5_FLOW_MOD_*. */
>  	uint32_t fate;
>  	/**< Bit-fields of present fate see MLX5_FLOW_FATE_*. */
> -	struct mlx5_flow_verbs verbs; /* Verbs flow. */
> -	uint16_t queue; /**< Destination queue to redirect traffic to. */
> +	LIST_HEAD(verbs, mlx5_flow_verbs) verbs; /**< Verbs flows list. */
> +	struct mlx5_flow_verbs *cur_verbs;
> +	/**< Current Verbs flow structure being filled. */
> +	struct rte_flow_action_rss rss;/**< RSS context. */
> +	uint8_t key[40]; /**< RSS hash key. */

Let's define a macro for '40'.
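For example (the macro name is only a suggestion):

#define MLX5_RSS_HASH_KEY_LEN 40

	uint8_t key[MLX5_RSS_HASH_KEY_LEN]; /**< RSS hash key. */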

> +	uint16_t (*queue)[]; /**< Destination queues to redirect traffic to. */
>  };
>  
>  static const struct rte_flow_ops mlx5_flow_ops = {
> @@ -122,16 +263,27 @@ struct ibv_spec_header {
>  	uint16_t size;
>  };
>  
> - /**
> -  * Get the maximum number of priority available.
> -  *
> -  * @param dev
> -  *   Pointer to Ethernet device.
> -  *
> -  * @return
> -  *   number of supported flow priority on success, a negative errno value
> -  *   otherwise and rte_errno is set.
> -  */
> +/* Map of Verbs to Flow priority with 8 Verbs priorities. */
> +static const uint32_t priority_map_3[][3] = {
> +	{ 0, 1, 2 }, { 2, 3, 4 }, { 5, 6, 7 },
> +};
> +
> +/* Map of Verbs to Flow priority with 16 Verbs priorities. */
> +static const uint32_t priority_map_5[][3] = {
> +	{ 0, 1, 2 }, { 3, 4, 5 }, { 6, 7, 8 },
> +	{ 9, 10, 11 }, { 12, 13, 14 },
> +};

How about 

enum mlx5_sub_priority {
	MLX5_SUB_PRIORITY_0 = 0,
	MLX5_SUB_PRIORITY_1,
	MLX5_SUB_PRIORITY_2,
	MLX5_SUB_PRIORITY_MAX,
};

static const uint32_t priority_map_3[][MLX5_SUB_PRIORITY_MAX] = {
	{ 0, 1, 2 }, { 2, 3, 4 }, { 5, 6, 7 },
};

> +
> +/**
> + * Get the maximum number of priority available.
> + *
> + * @param dev
> + *   Pointer to Ethernet device.
> + *
> + * @return
> + *   number of supported flow priority on success, a negative errno
> + *   value otherwise and rte_errno is set.
> + */
>  int
>  mlx5_flow_priorities(struct rte_eth_dev *dev)

mlx5_flow_priorities() vs mlx5_flow_priority(): similar names but different
functionality. Better to rename them, e.g. mlx5_flow_get_max_priority() and
mlx5_flow_adjust_priority().
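Signatures as in this patch, only renamed:

int mlx5_flow_get_max_priority(struct rte_eth_dev *dev);

static int mlx5_flow_adjust_priority(struct rte_eth_dev *dev,
				     uint32_t priority, uint32_t subpriority);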

>  {
> @@ -156,6 +308,7 @@ mlx5_flow_priorities(struct rte_eth_dev *dev)
>  	struct mlx5_hrxq *drop = mlx5_hrxq_drop_new(dev);
>  	uint16_t vprio[] = { 8, 16 };
>  	int i;
> +	int priority = 0;
>  
>  	if (!drop) {
>  		rte_errno = ENOTSUP;
> @@ -167,11 +320,54 @@ mlx5_flow_priorities(struct rte_eth_dev *dev)
>  		if (!flow)
>  			break;
>  		claim_zero(mlx5_glue->destroy_flow(flow));
> +		priority = vprio[i];
> +	}
> +	switch (priority) {
> +	case 8:
> +		priority = 3;

How about,
	priority = RTE_DIM(priority_map_3);

> +		break;
> +	case 16:
> +		priority = 5;

	priority = RTE_DIM(priority_map_5);

> +		break;
> +	default:
> +		rte_errno = ENOTSUP;
> +		DRV_LOG(ERR,
> +			"port %u verbs maximum priority: %d expected 8/16",
> +			dev->data->port_id, vprio[i]);
> +		return -rte_errno;
>  	}
>  	mlx5_hrxq_drop_release(dev, drop);
>  	DRV_LOG(INFO, "port %u flow maximum priority: %d",
> -		dev->data->port_id, vprio[i]);
> -	return vprio[i];
> +		dev->data->port_id, priority);
> +	return priority;
> +}
> +
> +/**
> + * Adjust flow priority.
> + *
> + * @param dev
> + *   Pointer to Ethernet device.
> + * @param flow
> + *   Pointer to an rte flow.
> + *
> + * @return
> + *   The priority adjusted.
> + */
> +static int
> +mlx5_flow_priority(struct rte_eth_dev *dev, uint32_t priority,
> +		   uint32_t subpriority)
> +{
> +	struct priv *priv = dev->data->dev_private;
> +
> +	switch (priv->config.flow_prio) {
> +	case 3:

	case RTE_DIM(priority_map_3):

> +		priority = priority_map_3[priority][subpriority];
> +		break;
> +	case 5:

	case RTE_DIM(priority_map_5):

> +		priority = priority_map_5[priority][subpriority];
> +		break;
> +	}
> +	return priority;
>  }
>  
>  /**
> @@ -185,6 +381,8 @@ void
>  mlx5_flow_print(struct rte_flow *flow __rte_unused)
>  {
>  #ifndef NDEBUG
> +	struct mlx5_flow_verbs *verbs = LIST_FIRST(&flow->verbs);
> +
>  	fprintf(stdout, "---------8<------------\n");
>  	fprintf(stdout, "%s: flow information\n", MLX5_DRIVER_NAME);
>  	fprintf(stdout, " attributes: group %u priority %u ingress %d egress %d"
> @@ -193,26 +391,36 @@ mlx5_flow_print(struct rte_flow *flow __rte_unused)
>  		flow->attributes.ingress,
>  		flow->attributes.egress,
>  		flow->attributes.transfer);
> -	fprintf(stdout, " layers: %s/%s/%s\n",
> -		flow->layers & MLX5_FLOW_LAYER_OUTER_L2 ? "l2" : "-",
> -		flow->layers & MLX5_FLOW_LAYER_OUTER_L3 ? "l3" : "-",
> -		flow->layers & MLX5_FLOW_LAYER_OUTER_L4 ? "l4" : "-");
> -	if (flow->fate & MLX5_FLOW_FATE_DROP)
> +	if (flow->fate & MLX5_FLOW_FATE_DROP) {
>  		fprintf(stdout, " fate: drop queue\n");
> -	else if (flow->fate & MLX5_FLOW_FATE_QUEUE)
> -		fprintf(stdout, " fate: target queue %u\n", flow->queue);
> -	if (flow->verbs.attr) {
> -		struct ibv_spec_header *hdr =
> -			(struct ibv_spec_header *)flow->verbs.specs;
> -		const int n = flow->verbs.attr->num_of_specs;
> -		int i;
> -
> -		fprintf(stdout, " Verbs attributes: specs_n %u\n",
> -			flow->verbs.attr->num_of_specs);
> -		for (i = 0; i != n; ++i) {
> -			rte_hexdump(stdout, " ", hdr, hdr->size);
> -			hdr = (struct ibv_spec_header *)
> -				((uint8_t *)hdr + hdr->size);
> +	} else {
> +		uint16_t i;
> +
> +		fprintf(stdout, " fate: target queues");
> +		for (i = 0; i != flow->rss.queue_num; ++i)
> +			fprintf(stdout, " %u", (*flow->queue)[i]);
> +		fprintf(stdout, "\n");
> +	}
> +	LIST_FOREACH(verbs, &flow->verbs, next) {
> +		uint32_t layers = flow->layers | verbs->layers;
> +
> +		fprintf(stdout, " layers: %s/%s/%s\n",
> +			layers & MLX5_FLOW_LAYER_OUTER_L2 ? "l2" : "-",
> +			layers & MLX5_FLOW_LAYER_OUTER_L3 ? "l3" : "-",
> +			layers & MLX5_FLOW_LAYER_OUTER_L4 ? "l4" : "-");
> +		if (verbs->attr) {
> +			struct ibv_spec_header *hdr =
> +				(struct ibv_spec_header *)verbs->specs;
> +			const int n = verbs->attr->num_of_specs;
> +			int i;
> +
> +			fprintf(stdout, " Verbs attributes: specs_n %u\n",
> +				verbs->attr->num_of_specs);
> +			for (i = 0; i != n; ++i) {
> +				rte_hexdump(stdout, " ", hdr, hdr->size);
> +				hdr = (struct ibv_spec_header *)
> +					((uint8_t *)hdr + hdr->size);
> +			}
>  		}
>  	}
>  	fprintf(stdout, "--------->8------------\n");
> @@ -239,18 +447,20 @@ mlx5_flow_attributes(struct rte_eth_dev *dev, const struct rte_flow_attr *attr,
>  		     struct rte_flow *flow, struct rte_flow_error *error)
>  {
>  	uint32_t priority_max =
> -		((struct priv *)dev->data->dev_private)->config.flow_prio;
> +		((struct priv *)dev->data->dev_private)->config.flow_prio - 1;
>  
>  	if (attr->group)
>  		return rte_flow_error_set(error, ENOTSUP,
>  					  RTE_FLOW_ERROR_TYPE_ATTR_GROUP,
>  					  NULL,
>  					  "groups are not supported");
> -	if (attr->priority >= priority_max)
> +	if (attr->priority != MLX5_FLOW_PRIO_RSVD &&
> +	    attr->priority >= priority_max)
>  		return rte_flow_error_set(error, ENOTSUP,
>  					  RTE_FLOW_ERROR_TYPE_ATTR_PRIORITY,
>  					  NULL,
> -					  "priority value is not supported");
> +					  "requested priority value is not"
> +					  " supported");
>  	if (attr->egress)
>  		return rte_flow_error_set(error, ENOTSUP,
>  					  RTE_FLOW_ERROR_TYPE_ATTR_EGRESS,
> @@ -267,6 +477,8 @@ mlx5_flow_attributes(struct rte_eth_dev *dev, const struct rte_flow_attr *attr,
>  					  NULL,
>  					  "only ingress is supported");
>  	flow->attributes = *attr;
> +	if (attr->priority == MLX5_FLOW_PRIO_RSVD)
> +		flow->attributes.priority = priority_max;
>  	return 0;
>  }
>  
> @@ -346,14 +558,51 @@ mlx5_flow_item_validate(const struct rte_flow_item *item,
>  static void
>  mlx5_flow_spec_verbs_add(struct rte_flow *flow, void *src, unsigned int size)
>  {
> -	if (flow->verbs.specs) {
> +	struct mlx5_flow_verbs *verbs = flow->cur_verbs;
> +
> +	if (verbs->specs) {
>  		void *dst;
>  
> -		dst = (void *)(flow->verbs.specs + flow->verbs.size);
> +		dst = (void *)(verbs->specs + verbs->size);
>  		memcpy(dst, src, size);
> -		++flow->verbs.attr->num_of_specs;
> +		++verbs->attr->num_of_specs;
>  	}
> -	flow->verbs.size += size;
> +	verbs->size += size;
> +}
> +
> +/**
> + * Update layer bit-field.
> + *
> + * @param flow[in, out]
> + *   Pointer to flow structure.
> + * @param layers
> + *   Bit-fields of layers to add see MLX5_FLOW_ITEMS_*.

Where is MLX5_FLOW_ITEMS_*? Isn't it MLX5_FLOW_LAYER_*?
There are several occurrences.
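The doc comments should presumably read:

 *   Bit-fields of layers to add, see MLX5_FLOW_LAYER_*.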

> + */
> +static void
> +mlx5_flow_layers_update(struct rte_flow *flow, uint32_t layers)
> +{
> +	if (flow->expand) {
> +		if (flow->cur_verbs)
> +			flow->cur_verbs->layers |= layers;

If flow->cur_verbs is null, does that mean it is a testing call? Then, is it
unnecessary to update layers for the testing call? Confusing.

> +	} else {
> +		flow->layers |= layers;
> +	}
> +}
> +
> +/**
> + * Get layers bit-field.
> + *
> + * @param flow[in, out]
> + *   Pointer to flow structure.
> + */
> +static uint32_t
> +mlx5_flow_layers(struct rte_flow *flow)
> +{
> +	uint32_t layers = flow->layers;
> +
> +	if (flow->expand && flow->cur_verbs)

If flow is expanded and it is a testing call, then flow->layers is used?

> +		layers |= flow->cur_verbs->layers;
> +	return layers;

This part is unclear and hard to understand. There are two 'layers'
fields, one in rte_flow and the other in mlx5_flow_verbs. It seems
rte_flow->layers is used only when the flow isn't expanded. If a flow is
expanded, flow->expand is set after processing the first entry in the expanded
list. In mlx5_flow_merge(),

	for (i = 0; i != buf->entries; ++i) {

		...

		flow->expand = !!(buf->entries > 1);
	}

Why is flow->expand set at the end of the loop? Is this in order to avoid
validation for the expanded flows? mlx5_flow_item_xxx() executes validation only
if flow->expand is zero; why?

And why does mlx5_flow_layers() have to return (flow->layers |
flow->cur_verbs->layers) if expanded?

If there are 3 entries in the rte_flow_expand_rss,
	eth
	eth / ipv4 / udp
	eth / ipv6 / udp

Then, the 2nd and 3rd don't have MLX5_FLOW_LAYER_OUTER_L2 in the layers field?
Please explain in detail and add comments appropriately.

>  }
>  
>  /**
> @@ -388,22 +637,26 @@ mlx5_flow_item_eth(const struct rte_flow_item *item, struct rte_flow *flow,
>  		.type = IBV_FLOW_SPEC_ETH,
>  		.size = size,
>  	};
> +	const uint32_t layers = mlx5_flow_layers(flow);
>  	int ret;
>  
> -	if (flow->layers & MLX5_FLOW_LAYER_OUTER_L2)
> -		return rte_flow_error_set(error, ENOTSUP,
> -					  RTE_FLOW_ERROR_TYPE_ITEM,
> -					  item,
> -					  "L2 layers already configured");
> -	if (!mask)
> -		mask = &rte_flow_item_eth_mask;
> -	ret = mlx5_flow_item_validate(item, (const uint8_t *)mask,
> -				      (const uint8_t *)&nic_mask,
> -				      sizeof(struct rte_flow_item_eth),
> -				      error);
> -	if (ret)
> -		return ret;
> -	flow->layers |= MLX5_FLOW_LAYER_OUTER_L2;
> +	if (!flow->expand) {
> +		if (layers & MLX5_FLOW_LAYER_OUTER_L2)
> +			return rte_flow_error_set(error, ENOTSUP,
> +						  RTE_FLOW_ERROR_TYPE_ITEM,
> +						  item,
> +						  "L2 layers already"
> +						  " configured");
> +		if (!mask)
> +			mask = &rte_flow_item_eth_mask;
> +		ret = mlx5_flow_item_validate(item, (const uint8_t *)mask,
> +					      (const uint8_t *)&nic_mask,
> +					      sizeof(struct rte_flow_item_eth),
> +					      error);
> +		if (ret)
> +			return ret;
> +	}
> +	mlx5_flow_layers_update(flow, MLX5_FLOW_LAYER_OUTER_L2);
>  	if (size > flow_size)
>  		return size;
>  	if (spec) {
> @@ -482,6 +735,7 @@ mlx5_flow_item_vlan(const struct rte_flow_item *item, struct rte_flow *flow,
>  		.tci = RTE_BE16(0x0fff),
>  	};
>  	unsigned int size = sizeof(struct ibv_flow_spec_eth);
> +	struct mlx5_flow_verbs *verbs = flow->cur_verbs;
>  	struct ibv_flow_spec_eth eth = {
>  		.type = IBV_FLOW_SPEC_ETH,
>  		.size = size,
> @@ -491,24 +745,30 @@ mlx5_flow_item_vlan(const struct rte_flow_item *item, struct rte_flow *flow,
>  			MLX5_FLOW_LAYER_OUTER_L4;
>  	const uint32_t vlanm = MLX5_FLOW_LAYER_OUTER_VLAN;
>  	const uint32_t l2m = MLX5_FLOW_LAYER_OUTER_L2;
> +	const uint32_t layers = mlx5_flow_layers(flow);
>  
> -	if (flow->layers & vlanm)
> -		return rte_flow_error_set(error, ENOTSUP,
> -					  RTE_FLOW_ERROR_TYPE_ITEM,
> -					  item,
> -					  "L2 layers already configured");
> -	else if ((flow->layers & lm) != 0)
> -		return rte_flow_error_set(error, ENOTSUP,
> -					  RTE_FLOW_ERROR_TYPE_ITEM,
> -					  item,
> -					  "L2 layer cannot follow L3/L4 layer");
> -	if (!mask)
> -		mask = &rte_flow_item_vlan_mask;
> -	ret = mlx5_flow_item_validate(item, (const uint8_t *)mask,
> -				      (const uint8_t *)&nic_mask,
> -				      sizeof(struct rte_flow_item_vlan), error);
> -	if (ret)
> -		return ret;
> +	if (!flow->expand) {
> +		if (layers & vlanm)
> +			return rte_flow_error_set(error, ENOTSUP,
> +						  RTE_FLOW_ERROR_TYPE_ITEM,
> +						  item,
> +						  "L2 layers already"
> +						  " configured");
> +		else if ((layers & lm) != 0)
> +			return rte_flow_error_set(error, ENOTSUP,
> +						  RTE_FLOW_ERROR_TYPE_ITEM,
> +						  item,
> +						  "L2 layer cannot follow"
> +						  " L3/L4 layer");
> +		if (!mask)
> +			mask = &rte_flow_item_vlan_mask;
> +		ret = mlx5_flow_item_validate(item, (const uint8_t *)mask,
> +					      (const uint8_t *)&nic_mask,
> +					      sizeof(struct rte_flow_item_vlan),
> +					      error);
> +		if (ret)
> +			return ret;
> +	}
>  	if (spec) {
>  		eth.val.vlan_tag = spec->tci;
>  		eth.mask.vlan_tag = mask->tci;
> @@ -517,32 +777,34 @@ mlx5_flow_item_vlan(const struct rte_flow_item *item, struct rte_flow *flow,
>  		eth.mask.ether_type = mask->inner_type;
>  		eth.val.ether_type &= eth.mask.ether_type;
>  	}
> -	/*
> -	 * From verbs perspective an empty VLAN is equivalent
> -	 * to a packet without VLAN layer.
> -	 */
> -	if (!eth.mask.vlan_tag)
> -		return rte_flow_error_set(error, EINVAL,
> -					  RTE_FLOW_ERROR_TYPE_ITEM_SPEC,
> -					  item->spec,
> -					  "VLAN cannot be empty");
> -	/* Outer TPID cannot be matched. */
> -	if (eth.mask.ether_type)
> -		return rte_flow_error_set(error, ENOTSUP,
> -					  RTE_FLOW_ERROR_TYPE_ITEM_SPEC,
> -					  item->spec,
> -					  "VLAN TPID matching is not"
> -					  " supported");
> -	if (!(flow->layers & l2m)) {
> +	if (!flow->expand) {
> +		/*
> +		 * From verbs perspective an empty VLAN is equivalent
> +		 * to a packet without VLAN layer.
> +		 */
> +		if (!eth.mask.vlan_tag)
> +			return rte_flow_error_set(error, EINVAL,
> +						  RTE_FLOW_ERROR_TYPE_ITEM_SPEC,
> +						  item->spec,
> +						  "VLAN cannot be empty");
> +		/* Outer TPID cannot be matched. */
> +		if (eth.mask.ether_type)
> +			return rte_flow_error_set(error, ENOTSUP,
> +						  RTE_FLOW_ERROR_TYPE_ITEM_SPEC,
> +						  item->spec,
> +						  "VLAN TPID matching is not"
> +						  " supported");
> +	}
> +	if (!(layers & l2m)) {
>  		if (size <= flow_size)
>  			mlx5_flow_spec_verbs_add(flow, &eth, size);
>  	} else {
> -		if (flow->verbs.attr)
> -			mlx5_flow_item_vlan_update(flow->verbs.attr, &eth);
> +		if (verbs->attr)
> +			mlx5_flow_item_vlan_update(verbs->attr, &eth);
>  		size = 0; /**< Only an update is done in eth specification. */
>  	}
> -	flow->layers |= MLX5_FLOW_LAYER_OUTER_L2 |
> -		MLX5_FLOW_LAYER_OUTER_VLAN;
> +	mlx5_flow_layers_update(flow, MLX5_FLOW_LAYER_OUTER_L2 |
> +				MLX5_FLOW_LAYER_OUTER_VLAN);
>  	return size;
>  }
>  
> @@ -582,25 +844,31 @@ mlx5_flow_item_ipv4(const struct rte_flow_item *item, struct rte_flow *flow,
>  		.size = size,
>  	};
>  	int ret;
> +	const uint32_t layers = mlx5_flow_layers(flow);
>  
> -	if (flow->layers & MLX5_FLOW_LAYER_OUTER_L3)
> -		return rte_flow_error_set(error, ENOTSUP,
> -					  RTE_FLOW_ERROR_TYPE_ITEM,
> -					  item,
> -					  "multiple L3 layers not supported");
> -	else if (flow->layers & MLX5_FLOW_LAYER_OUTER_L4)
> -		return rte_flow_error_set(error, ENOTSUP,
> -					  RTE_FLOW_ERROR_TYPE_ITEM,
> -					  item,
> -					  "L3 cannot follow an L4 layer.");
> -	if (!mask)
> -		mask = &rte_flow_item_ipv4_mask;
> -	ret = mlx5_flow_item_validate(item, (const uint8_t *)mask,
> -				      (const uint8_t *)&nic_mask,
> -				      sizeof(struct rte_flow_item_ipv4), error);
> -	if (ret < 0)
> -		return ret;
> -	flow->layers |= MLX5_FLOW_LAYER_OUTER_L3_IPV4;
> +	if (!flow->expand) {
> +		if (layers & MLX5_FLOW_LAYER_OUTER_L3)
> +			return rte_flow_error_set(error, ENOTSUP,
> +						  RTE_FLOW_ERROR_TYPE_ITEM,
> +						  item,
> +						  "multiple L3 layers not"
> +						  " supported");
> +		else if (layers & MLX5_FLOW_LAYER_OUTER_L4)
> +			return rte_flow_error_set(error, ENOTSUP,
> +						  RTE_FLOW_ERROR_TYPE_ITEM,
> +						  item,
> +						  "L3 cannot follow an L4"
> +						  " layer");
> +		if (!mask)
> +			mask = &rte_flow_item_ipv4_mask;
> +		ret = mlx5_flow_item_validate(item, (const uint8_t *)mask,
> +					      (const uint8_t *)&nic_mask,
> +					      sizeof(struct rte_flow_item_ipv4),
> +					      error);
> +		if (ret < 0)
> +			return ret;
> +	}
> +	mlx5_flow_layers_update(flow, MLX5_FLOW_LAYER_OUTER_L3_IPV4);
>  	if (size > flow_size)
>  		return size;
>  	if (spec) {
> @@ -667,25 +935,31 @@ mlx5_flow_item_ipv6(const struct rte_flow_item *item, struct rte_flow *flow,
>  		.size = size,
>  	};
>  	int ret;
> +	const uint32_t layers = mlx5_flow_layers(flow);
>  
> -	if (flow->layers & MLX5_FLOW_LAYER_OUTER_L3)
> -		return rte_flow_error_set(error, ENOTSUP,
> -					  RTE_FLOW_ERROR_TYPE_ITEM,
> -					  item,
> -					  "multiple L3 layers not supported");
> -	else if (flow->layers & MLX5_FLOW_LAYER_OUTER_L4)
> -		return rte_flow_error_set(error, ENOTSUP,
> -					  RTE_FLOW_ERROR_TYPE_ITEM,
> -					  item,
> -					  "L3 cannot follow an L4 layer.");
> -	if (!mask)
> -		mask = &rte_flow_item_ipv6_mask;
> -	ret = mlx5_flow_item_validate(item, (const uint8_t *)mask,
> -				      (const uint8_t *)&nic_mask,
> -				      sizeof(struct rte_flow_item_ipv6), error);
> -	if (ret < 0)
> -		return ret;
> -	flow->layers |= MLX5_FLOW_LAYER_OUTER_L3_IPV6;
> +	if (!flow->expand) {
> +		if (layers & MLX5_FLOW_LAYER_OUTER_L3)
> +			return rte_flow_error_set(error, ENOTSUP,
> +						  RTE_FLOW_ERROR_TYPE_ITEM,
> +						  item,
> +						  "multiple L3 layers not"
> +						  " supported");
> +		else if (layers & MLX5_FLOW_LAYER_OUTER_L4)
> +			return rte_flow_error_set(error, ENOTSUP,
> +						  RTE_FLOW_ERROR_TYPE_ITEM,
> +						  item,
> +						  "L3 cannot follow an L4"
> +						  " layer");
> +		if (!mask)
> +			mask = &rte_flow_item_ipv6_mask;
> +		ret = mlx5_flow_item_validate(item, (const uint8_t *)mask,
> +					      (const uint8_t *)&nic_mask,
> +					      sizeof(struct rte_flow_item_ipv6),
> +					      error);
> +		if (ret < 0)
> +			return ret;
> +	}
> +	mlx5_flow_layers_update(flow, MLX5_FLOW_LAYER_OUTER_L3_IPV6);
>  	if (size > flow_size)
>  		return size;
>  	if (spec) {
> @@ -759,25 +1033,31 @@ mlx5_flow_item_udp(const struct rte_flow_item *item, struct rte_flow *flow,
>  		.size = size,
>  	};
>  	int ret;
> +	const uint32_t layers = mlx5_flow_layers(flow);
>  
> -	if (!(flow->layers & MLX5_FLOW_LAYER_OUTER_L3))
> -		return rte_flow_error_set(error, ENOTSUP,
> -					  RTE_FLOW_ERROR_TYPE_ITEM,
> -					  item,
> -					  "L3 is mandatory to filter on L4");
> -	if (flow->layers & MLX5_FLOW_LAYER_OUTER_L4)
> -		return rte_flow_error_set(error, ENOTSUP,
> -					  RTE_FLOW_ERROR_TYPE_ITEM,
> -					  item,
> -					  "L4 layer is already present");
> -	if (!mask)
> -		mask = &rte_flow_item_udp_mask;
> -	ret = mlx5_flow_item_validate(item, (const uint8_t *)mask,
> -				      (const uint8_t *)&rte_flow_item_udp_mask,
> -				      sizeof(struct rte_flow_item_udp), error);
> -	if (ret < 0)
> -		return ret;
> -	flow->layers |= MLX5_FLOW_LAYER_OUTER_L4_UDP;
> +	if (!flow->expand) {
> +		if (!(layers & MLX5_FLOW_LAYER_OUTER_L3))
> +			return rte_flow_error_set(error, ENOTSUP,
> +						  RTE_FLOW_ERROR_TYPE_ITEM,
> +						  item,
> +						  "L3 is mandatory to filter"
> +						  " on L4");
> +		if (layers & MLX5_FLOW_LAYER_OUTER_L4)
> +			return rte_flow_error_set(error, ENOTSUP,
> +						  RTE_FLOW_ERROR_TYPE_ITEM,
> +						  item,
> +						  "L4 layer is already"
> +						  " present");
> +		if (!mask)
> +			mask = &rte_flow_item_udp_mask;
> +		ret = mlx5_flow_item_validate
> +			(item, (const uint8_t *)mask,
> +			 (const uint8_t *)&rte_flow_item_udp_mask,
> +			 sizeof(struct rte_flow_item_udp), error);
> +		if (ret < 0)
> +			return ret;
> +	}
> +	mlx5_flow_layers_update(flow, MLX5_FLOW_LAYER_OUTER_L4_UDP);
>  	if (size > flow_size)
>  		return size;
>  	if (spec) {
> @@ -821,25 +1101,31 @@ mlx5_flow_item_tcp(const struct rte_flow_item *item, struct rte_flow *flow,
>  		.size = size,
>  	};
>  	int ret;
> +	const uint32_t layers = mlx5_flow_layers(flow);
>  
> -	if (!(flow->layers & MLX5_FLOW_LAYER_OUTER_L3))
> -		return rte_flow_error_set(error, ENOTSUP,
> -					  RTE_FLOW_ERROR_TYPE_ITEM,
> -					  item,
> -					  "L3 is mandatory to filter on L4");
> -	if (flow->layers & MLX5_FLOW_LAYER_OUTER_L4)
> -		return rte_flow_error_set(error, ENOTSUP,
> -					  RTE_FLOW_ERROR_TYPE_ITEM,
> -					  item,
> -					  "L4 layer is already present");
> -	if (!mask)
> -		mask = &rte_flow_item_tcp_mask;
> -	ret = mlx5_flow_item_validate(item, (const uint8_t *)mask,
> -				      (const uint8_t *)&rte_flow_item_tcp_mask,
> -				      sizeof(struct rte_flow_item_tcp), error);
> -	if (ret < 0)
> -		return ret;
> -	flow->layers |= MLX5_FLOW_LAYER_OUTER_L4_TCP;
> +	if (!flow->expand) {
> +		if (!(layers & MLX5_FLOW_LAYER_OUTER_L3))
> +			return rte_flow_error_set(error, ENOTSUP,
> +						  RTE_FLOW_ERROR_TYPE_ITEM,
> +						  item,
> +						  "L3 is mandatory to filter"
> +						  " on L4");
> +		if (layers & MLX5_FLOW_LAYER_OUTER_L4)
> +			return rte_flow_error_set(error, ENOTSUP,
> +						  RTE_FLOW_ERROR_TYPE_ITEM,
> +						  item,
> +						  "L4 layer is already"
> +						  " present");
> +		if (!mask)
> +			mask = &rte_flow_item_tcp_mask;
> +		ret = mlx5_flow_item_validate
> +			(item, (const uint8_t *)mask,
> +			 (const uint8_t *)&rte_flow_item_tcp_mask,
> +			 sizeof(struct rte_flow_item_tcp), error);
> +		if (ret < 0)
> +			return ret;
> +	}
> +	mlx5_flow_layers_update(flow, MLX5_FLOW_LAYER_OUTER_L4_TCP);
>  	if (size > flow_size)
>  		return size;
>  	if (spec) {
> @@ -954,18 +1240,20 @@ mlx5_flow_action_drop(const struct rte_flow_action *actions,
>  			.size = size,
>  	};
>  
> -	if (flow->fate)
> -		return rte_flow_error_set(error, ENOTSUP,
> -					  RTE_FLOW_ERROR_TYPE_ACTION,
> -					  actions,
> -					  "multiple fate actions are not"
> -					  " supported");
> -	if (flow->modifier & (MLX5_FLOW_MOD_FLAG | MLX5_FLOW_MOD_MARK))
> -		return rte_flow_error_set(error, ENOTSUP,
> -					  RTE_FLOW_ERROR_TYPE_ACTION,
> -					  actions,
> -					  "drop is not compatible with"
> -					  " flag/mark action");
> +	if (!flow->expand) {
> +		if (flow->fate)
> +			return rte_flow_error_set(error, ENOTSUP,
> +						  RTE_FLOW_ERROR_TYPE_ACTION,
> +						  actions,
> +						  "multiple fate actions are"
> +						  " not supported");
> +		if (flow->modifier & (MLX5_FLOW_MOD_FLAG | MLX5_FLOW_MOD_MARK))
> +			return rte_flow_error_set(error, ENOTSUP,
> +						  RTE_FLOW_ERROR_TYPE_ACTION,
> +						  actions,
> +						  "drop is not compatible with"
> +						  " flag/mark action");
> +	}
>  	if (size < flow_size)
>  		mlx5_flow_spec_verbs_add(flow, &drop, size);
>  	flow->fate |= MLX5_FLOW_FATE_DROP;
> @@ -998,6 +1286,8 @@ mlx5_flow_action_queue(struct rte_eth_dev *dev,
>  	struct priv *priv = dev->data->dev_private;
>  	const struct rte_flow_action_queue *queue = actions->conf;
>  
> +	if (flow->expand)
> +		return 0;
>  	if (flow->fate)
>  		return rte_flow_error_set(error, ENOTSUP,
>  					  RTE_FLOW_ERROR_TYPE_ACTION,
> @@ -1014,11 +1304,162 @@ mlx5_flow_action_queue(struct rte_eth_dev *dev,
>  					  RTE_FLOW_ERROR_TYPE_ACTION_CONF,
>  					  &queue->index,
>  					  "queue is not configured");
> -	flow->queue = queue->index;
> +	if (flow->queue)
> +		(*flow->queue)[0] = queue->index;
> +	flow->rss.queue_num = 1;
>  	flow->fate |= MLX5_FLOW_FATE_QUEUE;
>  	return 0;
>  }
>  
> +/**
> + * Store the Verbs hash fields and priority according to the layer and types.
> + *
> + * @param dev
> + *   Pointer to Ethernet device.
> + * @param flow
> + *   Pointer to flow structure.
> + * @param types
> + *   RSS types for this flow (see ETH_RSS_*).
> + *
> + * @return
> + *   0 on success, a negative errno value otherwise and rte_errno is set.
> + */
> +static int
> +mlx5_flow_action_rss_verbs_attr(struct rte_eth_dev *dev, struct rte_flow *flow,
> +				uint32_t types)
> +{
> +	const uint32_t layers = mlx5_flow_layers(flow);
> +	uint64_t hash_fields;
> +	uint32_t priority;
> +
> +	if ((types & ETH_RSS_NONFRAG_IPV4_TCP) &&
> +	    (layers & MLX5_FLOW_LAYER_OUTER_L4_TCP)) {
> +		hash_fields = IBV_RX_HASH_SRC_IPV4 |
> +			IBV_RX_HASH_DST_IPV4 |
> +			IBV_RX_HASH_SRC_PORT_TCP |
> +			IBV_RX_HASH_DST_PORT_TCP;
> +		priority = 0;
> +	} else if ((types & ETH_RSS_NONFRAG_IPV4_UDP) &&
> +		 (layers & MLX5_FLOW_LAYER_OUTER_L4_UDP)) {
> +		hash_fields = IBV_RX_HASH_SRC_IPV4 |
> +			IBV_RX_HASH_DST_IPV4 |
> +			IBV_RX_HASH_SRC_PORT_UDP |
> +			IBV_RX_HASH_DST_PORT_UDP;
> +		priority = 0;
> +	} else if ((types & (ETH_RSS_IPV4 | ETH_RSS_FRAG_IPV4)) &&
> +		 (layers & MLX5_FLOW_LAYER_OUTER_L3_IPV4)) {
> +		hash_fields = IBV_RX_HASH_SRC_IPV4 |
> +			IBV_RX_HASH_DST_IPV4;
> +		priority = 1;
> +	} else if ((types & ETH_RSS_NONFRAG_IPV6_TCP) &&
> +		 (layers & MLX5_FLOW_LAYER_OUTER_L4_TCP)) {
> +		hash_fields = IBV_RX_HASH_SRC_IPV6 |
> +			IBV_RX_HASH_DST_IPV6 |
> +			IBV_RX_HASH_SRC_PORT_TCP |
> +			IBV_RX_HASH_DST_PORT_TCP;
> +		priority = 0;
> +	} else if ((types & ETH_RSS_NONFRAG_IPV6_UDP) &&
> +		 (layers & MLX5_FLOW_LAYER_OUTER_L3_IPV6)) {
> +		hash_fields = IBV_RX_HASH_SRC_IPV6 |
> +			IBV_RX_HASH_DST_IPV6 |
> +			IBV_RX_HASH_SRC_PORT_UDP |
> +			IBV_RX_HASH_DST_PORT_UDP;
> +		priority = 0;
> +	} else if ((types & (ETH_RSS_IPV6 | ETH_RSS_FRAG_IPV6)) &&
> +		 (layers & MLX5_FLOW_LAYER_OUTER_L3_IPV6)) {
> +		hash_fields = IBV_RX_HASH_SRC_IPV6 |
> +			IBV_RX_HASH_DST_IPV6;
> +		priority = 1;
> +	} else {
> +		hash_fields = 0;
> +		priority = 2;

How about 
		priority = MLX5_SUB_PRIORITY_2;

> +	}
> +	flow->cur_verbs->hash_fields = hash_fields;
> +	flow->cur_verbs->attr->priority =
> +		mlx5_flow_priority(dev, flow->attributes.priority, priority);
> +	return 0;
> +}
> +
> +/**
> + * Validate action queue provided by the user.
> + *
> + * @param dev
> + *   Pointer to Ethernet device structure.
> + * @param actions
> + *   Pointer to flow actions array.
> + * @param flow
> + *   Pointer to the rte_flow structure.
> + * @param error
> + *   Pointer to error structure.

Missing return value.
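E.g., matching the wording used elsewhere in this file:

 * @return
 *   0 on success, a negative errno value otherwise and rte_errno is set.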

> + */
> +static int
> +mlx5_flow_action_rss(struct rte_eth_dev *dev,
> +		     const struct rte_flow_action *actions,
> +		     struct rte_flow *flow,
> +		     struct rte_flow_error *error)
> +{
> +	struct priv *priv = dev->data->dev_private;
> +	const struct rte_flow_action_rss *rss = actions->conf;
> +	unsigned int i;
> +
> +	if (flow->expand)
> +		return 0;
> +	if (flow->fate)
> +		return rte_flow_error_set(error, ENOTSUP,
> +					  RTE_FLOW_ERROR_TYPE_ACTION,
> +					  actions,
> +					  "multiple fate actions are not"
> +					  " supported");
> +	if (rss->func != RTE_ETH_HASH_FUNCTION_DEFAULT &&
> +	    rss->func != RTE_ETH_HASH_FUNCTION_TOEPLITZ)
> +		return rte_flow_error_set(error, ENOTSUP,
> +					  RTE_FLOW_ERROR_TYPE_ACTION_CONF,
> +					  &rss->func,
> +					  "RSS hash function not supported");
> +	if (rss->level > 1)
> +		return rte_flow_error_set(error, ENOTSUP,
> +					  RTE_FLOW_ERROR_TYPE_ACTION_CONF,
> +					  &rss->level,
> +					  "tunnel RSS is not supported");
> +	if (rss->key_len < rss_hash_default_key_len)
> +		return rte_flow_error_set(error, ENOTSUP,
> +					  RTE_FLOW_ERROR_TYPE_ACTION_CONF,
> +					  &rss->key_len,
> +					  "RSS hash key too small");
> +	if (rss->key_len > rss_hash_default_key_len)
> +		return rte_flow_error_set(error, ENOTSUP,
> +					  RTE_FLOW_ERROR_TYPE_ACTION_CONF,
> +					  &rss->key_len,
> +					  "RSS hash key too large");
> +	if (rss->queue_num > priv->config.ind_table_max_size)
> +		return rte_flow_error_set(error, ENOTSUP,
> +					  RTE_FLOW_ERROR_TYPE_ACTION_CONF,
> +					  &rss->queue_num,
> +					  "number of queues too large");
> +	if (rss->types & MLX5_RSS_HF_MASK)
> +		return rte_flow_error_set(error, ENOTSUP,
> +					  RTE_FLOW_ERROR_TYPE_ACTION_CONF,
> +					  &rss->types,
> +					  "some RSS protocols are not"
> +					  " supported");
> +	for (i = 0; i != rss->queue_num; ++i) {
> +		if (!(*priv->rxqs)[rss->queue[i]])
> +			return rte_flow_error_set
> +				(error, EINVAL,
> +				 RTE_FLOW_ERROR_TYPE_ACTION_CONF,
> +				 &rss->queue[i],
> +				 "queue is not configured");
> +	}
> +	if (flow->queue)
> +		memcpy((*flow->queue), rss->queue,
> +		       rss->queue_num * sizeof(uint16_t));
> +	flow->rss.queue_num = rss->queue_num;
> +	memcpy(flow->key, rss->key, rss_hash_default_key_len);
> +	flow->rss.types = rss->types;
> +	flow->fate |= MLX5_FLOW_FATE_RSS;
> +	return 0;
> +}
> +
>  /**
>   * Validate action flag provided by the user.
>   *
> @@ -1046,43 +1487,59 @@ mlx5_flow_action_flag(const struct rte_flow_action *actions,
>  		.size = size,
>  		.tag_id = mlx5_flow_mark_set(MLX5_FLOW_MARK_DEFAULT),
>  	};
> +	struct mlx5_flow_verbs *verbs = flow->cur_verbs;
>  
> -	if (flow->modifier & MLX5_FLOW_MOD_FLAG)
> -		return rte_flow_error_set(error, ENOTSUP,
> -					  RTE_FLOW_ERROR_TYPE_ACTION,
> -					  actions,
> -					  "flag action already present");
> -	if (flow->fate & MLX5_FLOW_FATE_DROP)
> -		return rte_flow_error_set(error, ENOTSUP,
> -					  RTE_FLOW_ERROR_TYPE_ACTION,
> -					  actions,
> -					  "flag is not compatible with drop"
> -					  " action");
> -	if (flow->modifier & MLX5_FLOW_MOD_MARK)
> -		return 0;
> +	if (!flow->expand) {
> +		if (flow->modifier & MLX5_FLOW_MOD_FLAG)
> +			return rte_flow_error_set(error, ENOTSUP,
> +						  RTE_FLOW_ERROR_TYPE_ACTION,
> +						  actions,
> +						  "flag action already present");
> +		if (flow->fate & MLX5_FLOW_FATE_DROP)
> +			return rte_flow_error_set(error, ENOTSUP,
> +						  RTE_FLOW_ERROR_TYPE_ACTION,
> +						  actions,
> +						  "flag is not compatible with"
> +						  " drop action");
> +	}
> +	/*
> +	 * The two only possible cases, a mark has already been added in the
> +	 * specification, in such case, the flag is already present in
> +	 * addition of the mark.
> +	 * Second case, has it is not possible to have two flags, it just
> +	 * needs to add it.
> +	 */

Can you rephrase the 'second case'? Maybe 'has' -> 'as'?
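E.g. "As it is not possible to have two flag actions, it just needs to be
added."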

> +	if (verbs) {
> +		verbs->modifier |= MLX5_FLOW_MOD_FLAG;
> +		if (verbs->modifier & MLX5_FLOW_MOD_MARK)
> +			size = 0;
> +		else if (size <= flow_size)
> +			mlx5_flow_spec_verbs_add(flow, &tag, size);
> +	} else {
> +		if (flow->modifier & MLX5_FLOW_MOD_MARK)
> +			size = 0;
> +	}
>  	flow->modifier |= MLX5_FLOW_MOD_FLAG;
> -	if (size <= flow_size)
> -		mlx5_flow_spec_verbs_add(flow, &tag, size);
>  	return size;
>  }
>  
>  /**
>   * Update verbs specification to modify the flag to mark.
>   *
> - * @param flow
> - *   Pointer to the rte_flow structure.
> + * @param verbs
> + *   Pointer to the mlx5_flow_verbs structure.
>   * @param mark_id
>   *   Mark identifier to replace the flag.
>   */
>  static void
> -mlx5_flow_verbs_mark_update(struct rte_flow *flow, uint32_t mark_id)
> +mlx5_flow_verbs_mark_update(struct mlx5_flow_verbs *verbs, uint32_t mark_id)
>  {
>  	struct ibv_spec_header *hdr;
>  	int i;
>  
>  	/* Update Verbs specification. */
> -	hdr = (struct ibv_spec_header *)flow->verbs.specs;
> -	for (i = 0; i != flow->verbs.attr->num_of_specs; ++i) {
> +	hdr = (struct ibv_spec_header *)verbs->specs;
> +	for (i = 0; i != verbs->attr->num_of_specs; ++i) {
>  		if (hdr->type == IBV_FLOW_SPEC_ACTION_TAG) {
>  			struct ibv_flow_spec_action_tag *t =
>  				(struct ibv_flow_spec_action_tag *)hdr;
> @@ -1120,38 +1577,52 @@ mlx5_flow_action_mark(const struct rte_flow_action *actions,
>  		.type = IBV_FLOW_SPEC_ACTION_TAG,
>  		.size = size,
>  	};
> +	struct mlx5_flow_verbs *verbs = flow->cur_verbs;
>  
> -	if (!mark)
> -		return rte_flow_error_set(error, EINVAL,
> -					  RTE_FLOW_ERROR_TYPE_ACTION,
> -					  actions,
> -					  "configuration cannot be null");
> -	if (mark->id >= MLX5_FLOW_MARK_MAX)
> -		return rte_flow_error_set(error, EINVAL,
> -					  RTE_FLOW_ERROR_TYPE_ACTION_CONF,
> -					  &mark->id,
> -					  "mark must be between 0 and"
> -					  " 16777199");
> -	if (flow->modifier & MLX5_FLOW_MOD_MARK)
> -		return rte_flow_error_set(error, ENOTSUP,
> -					  RTE_FLOW_ERROR_TYPE_ACTION,
> -					  actions,
> -					  "mark action already present");
> -	if (flow->fate & MLX5_FLOW_FATE_DROP)
> -		return rte_flow_error_set(error, ENOTSUP,
> -					  RTE_FLOW_ERROR_TYPE_ACTION,
> -					  actions,
> -					  "mark is not compatible with drop"
> -					  " action");
> -	if (flow->modifier & MLX5_FLOW_MOD_FLAG) {
> -		mlx5_flow_verbs_mark_update(flow, mark->id);
> -		size = 0; /**< Only an update is done in the specification. */
> -	} else {
> -		tag.tag_id = mlx5_flow_mark_set(mark->id);
> -		if (size <= flow_size) {
> +	if (!flow->expand) {
> +		if (!mark)
> +			return rte_flow_error_set(error, EINVAL,
> +						  RTE_FLOW_ERROR_TYPE_ACTION,
> +						  actions,
> +						  "configuration cannot be"
> +						  " null");
> +		if (mark->id >= MLX5_FLOW_MARK_MAX)
> +			return rte_flow_error_set
> +				(error, EINVAL,
> +				 RTE_FLOW_ERROR_TYPE_ACTION_CONF,
> +				 &mark->id,
> +				 "mark must be between 0 and 16777199");
> +		if (flow->modifier & MLX5_FLOW_MOD_MARK)
> +			return rte_flow_error_set(error, ENOTSUP,
> +						  RTE_FLOW_ERROR_TYPE_ACTION,
> +						  actions,
> +						  "mark action already"
> +						  " present");
> +		if (flow->fate & MLX5_FLOW_FATE_DROP)
> +			return rte_flow_error_set(error, ENOTSUP,
> +						  RTE_FLOW_ERROR_TYPE_ACTION,
> +						  actions,
> +						  "mark is not compatible with"
> +						  " drop action");
> +	}
> +	/*
> +	 * The two only possible cases, a flag has already been added in the
> +	 * specification, in such case, it needs to be update to add the id.
> +	 * Second case, has it is not possible to have two mark, it just
> +	 * needs to add it.
> +	 */

Can you rephrase the 'second case'? Maybe 'has' -> 'as'?

> +	if (verbs) {
> +		verbs->modifier |= MLX5_FLOW_MOD_MARK;
> +		if (verbs->modifier & MLX5_FLOW_MOD_FLAG) {
> +			mlx5_flow_verbs_mark_update(verbs, mark->id);
> +			size = 0;
> +		} else if (size <= flow_size) {

If verbs isn't null (i.e. not a testing call), isn't it guaranteed that there
is enough space? Is it still necessary to check the size?

>  			tag.tag_id = mlx5_flow_mark_set(mark->id);
>  			mlx5_flow_spec_verbs_add(flow, &tag, size);
>  		}
> +	} else {
> +		if (flow->modifier & MLX5_FLOW_MOD_FLAG)
> +			size = 0;
>  	}
>  	flow->modifier |= MLX5_FLOW_MOD_MARK;
>  	return size;
> @@ -1185,6 +1656,15 @@ mlx5_flow_actions(struct rte_eth_dev *dev,
>  	int remain = flow_size;
>  	int ret = 0;
>  
> +	/*
> +	 * FLAG/MARK are the only actions having a specification in Verbs and
> +	 * not making part of the packet fate.  Due to this specificity and to
> +	 * avoid extra variable, their bit in the flow->modifier bit-field are
> +	 * disabled here to compute the exact necessary memory those action
> +	 * needs.
> +	 */
> +	flow->modifier &= ~(MLX5_FLOW_MOD_FLAG | MLX5_FLOW_MOD_MARK);

Can't understand this well. Is this for the case where the flow is expanded? If
so, why don't you reset flow->modifier in the for loop of mlx5_flow_merge()?
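I.e. something like:

	for (i = 0; i != buf->entries; ++i) {
		flow->modifier &= ~(MLX5_FLOW_MOD_FLAG | MLX5_FLOW_MOD_MARK);

		...
	}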

> +	/* Process the actions. */
>  	for (; actions->type != RTE_FLOW_ACTION_TYPE_END; actions++) {
>  		switch (actions->type) {
>  		case RTE_FLOW_ACTION_TYPE_VOID:
> @@ -1204,6 +1684,9 @@ mlx5_flow_actions(struct rte_eth_dev *dev,
>  		case RTE_FLOW_ACTION_TYPE_QUEUE:
>  			ret = mlx5_flow_action_queue(dev, actions, flow, error);
>  			break;
> +		case RTE_FLOW_ACTION_TYPE_RSS:
> +			ret = mlx5_flow_action_rss(dev, actions, flow, error);
> +			break;
>  		default:
>  			return rte_flow_error_set(error, ENOTSUP,
>  						  RTE_FLOW_ERROR_TYPE_ACTION,
> @@ -1257,27 +1740,92 @@ mlx5_flow_merge(struct rte_eth_dev *dev, struct rte_flow *flow,
>  		struct rte_flow_error *error)
>  {
>  	struct rte_flow local_flow = { .layers = 0, };
> -	size_t size = sizeof(*flow) + sizeof(struct ibv_flow_attr);
> +	size_t size = sizeof(*flow);
>  	int remain = (flow_size > size) ? flow_size - size : 0;
> +	struct rte_flow_expand_rss *buf;
>  	int ret;
> +	uint32_t i;
>  
>  	if (!remain)
>  		flow = &local_flow;
>  	ret = mlx5_flow_attributes(dev, attr, flow, error);
>  	if (ret < 0)
>  		return ret;
> -	ret = mlx5_flow_items(items, flow, remain, error);
> -	if (ret < 0)
> -		return ret;
> -	size += ret;
> -	remain = (flow_size > size) ? flow_size - size : 0;
> -	ret = mlx5_flow_actions(dev, actions, flow, remain, error);
> +	ret = mlx5_flow_actions(dev, actions, &local_flow, 0, error);
>  	if (ret < 0)
>  		return ret;
> -	size += ret;
> +	ret = rte_flow_expand_rss(NULL, 0, items, local_flow.rss.types,
> +				  mlx5_support_expansion,
> +				  local_flow.rss.level < 2 ?
> +				  MLX5_EXPANSION_ROOT : MLX5_EXPANSION_ROOT2);
> +	assert(ret > 0);
> +	buf = rte_calloc(__func__, 1, ret, 0);
> +	if (!buf) {
> +		rte_flow_error_set(error, ENOMEM,
> +				   RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
> +				   NULL,
> +				   "not enough memory to expand the RSS flow");
> +		goto error;
> +	}

I'm pretty sure you've already fixed this bug. Validation can't return ENOMEM.

> +	ret = rte_flow_expand_rss(buf, ret, items, local_flow.rss.types,
> +				  mlx5_support_expansion,
> +				  local_flow.rss.level < 2 ?
> +				  MLX5_EXPANSION_ROOT : MLX5_EXPANSION_ROOT2);
> +	assert(ret > 0);
> +	size += RTE_ALIGN_CEIL(local_flow.rss.queue_num * sizeof(uint16_t),
> +			       sizeof(void *));
>  	if (size <= flow_size)
> -		flow->verbs.attr->priority = flow->attributes.priority;
> +		flow->queue = (void *)(flow + 1);
> +	LIST_INIT(&flow->verbs);
> +	flow->layers = 0;
> +	flow->modifier = 0;
> +	flow->fate = 0;
> +	for (i = 0; i != buf->entries; ++i) {
> +		size_t off = size;
> +
> +		size += sizeof(struct ibv_flow_attr) +
> +			sizeof(struct mlx5_flow_verbs);
> +		remain = (flow_size > size) ? flow_size - size : 0;
> +		if (remain) {
> +			flow->cur_verbs = (void *)((uintptr_t)flow + off);
> +			flow->cur_verbs->attr = (void *)(flow->cur_verbs + 1);
> +			flow->cur_verbs->specs =
> +				(void *)(flow->cur_verbs->attr + 1);
> +		}
> +		ret = mlx5_flow_items
> +			((const struct rte_flow_item *)buf->patterns[i],
> +			 flow, remain, error);
> +		if (ret < 0)
> +			goto error;
> +		size += ret;
> +		if (remain > ret)
> +			remain -= ret;
> +		else
> +			remain = 0;
> +		ret = mlx5_flow_actions(dev, actions, flow, remain, error);
> +		if (ret < 0)
> +			goto error;
> +		size += ret;
> +		if (remain > ret)
> +			remain -= ret;
> +		else
> +			remain = 0;
> +		if (size <= flow_size) {
> +			flow->cur_verbs->attr->priority =
> +				flow->attributes.priority;
> +			ret = mlx5_flow_action_rss_verbs_attr(dev, flow,
> +							      flow->rss.types);
> +			if (ret < 0)
> +				goto error;
> +			LIST_INSERT_HEAD(&flow->verbs, flow->cur_verbs, next);
> +		}
> +		flow->expand = !!(buf->entries > 1);
> +	}
> +	rte_free(buf);
>  	return size;
> +error:
> +	rte_free(buf);
> +	return ret;
>  }
>  
>  /**
> @@ -1292,9 +1840,13 @@ static void
>  mlx5_flow_rxq_mark(struct rte_eth_dev *dev, struct rte_flow *flow)
>  {
>  	struct priv *priv = dev->data->dev_private;
> +	const uint32_t mask = MLX5_FLOW_MOD_FLAG | MLX5_FLOW_MOD_MARK;
> +	uint32_t i;
>  
> -	(*priv->rxqs)[flow->queue]->mark |=
> -		flow->modifier & (MLX5_FLOW_MOD_FLAG | MLX5_FLOW_MOD_MARK);
> +	if (!(flow->modifier & mask))
> +		return;
> +	for (i = 0; i != flow->rss.queue_num; ++i)
> +		(*priv->rxqs)[(*flow->queue)[i]]->mark = 1;
>  }
>  
>  /**
> @@ -1328,18 +1880,20 @@ mlx5_flow_validate(struct rte_eth_dev *dev,
>  static void
>  mlx5_flow_fate_remove(struct rte_eth_dev *dev, struct rte_flow *flow)
>  {
> -	if (flow->fate & MLX5_FLOW_FATE_DROP) {
> -		if (flow->verbs.flow) {
> -			claim_zero(mlx5_glue->destroy_flow(flow->verbs.flow));
> -			flow->verbs.flow = NULL;
> +	struct mlx5_flow_verbs *verbs;
> +
> +	LIST_FOREACH(verbs, &flow->verbs, next) {
> +		if (verbs->flow) {
> +			claim_zero(mlx5_glue->destroy_flow(verbs->flow));
> +			verbs->flow = NULL;
> +		}
> +		if (verbs->hrxq) {
> +			if (flow->fate & MLX5_FLOW_FATE_DROP)
> +				mlx5_hrxq_drop_release(dev, verbs->hrxq);
> +			else
> +				mlx5_hrxq_release(dev, verbs->hrxq);
> +			verbs->hrxq = NULL;
>  		}
> -	}
> -	if (flow->verbs.hrxq) {
> -		if (flow->fate & MLX5_FLOW_FATE_DROP)
> -			mlx5_hrxq_drop_release(dev, flow->verbs.hrxq);
> -		else if (flow->fate & MLX5_FLOW_FATE_QUEUE)
> -			mlx5_hrxq_release(dev, flow->verbs.hrxq);
> -		flow->verbs.hrxq = NULL;
>  	}
>  }
>  
> @@ -1360,46 +1914,68 @@ static int
>  mlx5_flow_fate_apply(struct rte_eth_dev *dev, struct rte_flow *flow,
>  		     struct rte_flow_error *error)
>  {
> -	if (flow->fate & MLX5_FLOW_FATE_DROP) {
> -		flow->verbs.hrxq = mlx5_hrxq_drop_new(dev);
> -		if (!flow->verbs.hrxq)
> -			return rte_flow_error_set
> -				(error, errno,
> -				 RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
> -				 NULL,
> -				 "cannot allocate Drop queue");
> -	} else if (flow->fate & MLX5_FLOW_FATE_QUEUE) {
> -		struct mlx5_hrxq *hrxq;
> -
> -		hrxq = mlx5_hrxq_get(dev, rss_hash_default_key,
> -				     rss_hash_default_key_len, 0,
> -				     &flow->queue, 1, 0, 0);
> -		if (!hrxq)
> -			hrxq = mlx5_hrxq_new(dev, rss_hash_default_key,
> -					     rss_hash_default_key_len, 0,
> -					     &flow->queue, 1, 0, 0);
> -		if (!hrxq)
> -			return rte_flow_error_set(error, rte_errno,
> -					RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
> -					NULL,
> -					"cannot create flow");
> -		flow->verbs.hrxq = hrxq;
> -	}
> -	flow->verbs.flow =
> -		mlx5_glue->create_flow(flow->verbs.hrxq->qp, flow->verbs.attr);
> -	if (!flow->verbs.flow) {
> -		if (flow->fate & MLX5_FLOW_FATE_DROP)
> -			mlx5_hrxq_drop_release(dev, flow->verbs.hrxq);
> -		else
> -			mlx5_hrxq_release(dev, flow->verbs.hrxq);
> -		flow->verbs.hrxq = NULL;
> -		return rte_flow_error_set(error, errno,
> -					  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
> -					  NULL,
> -					  "kernel module refuses to create"
> -					  " flow");
> +	struct mlx5_flow_verbs *verbs;
> +	int err;
> +
> +	LIST_FOREACH(verbs, &flow->verbs, next) {
> +		if (flow->fate & MLX5_FLOW_FATE_DROP) {
> +			verbs->hrxq = mlx5_hrxq_drop_new(dev);
> +			if (!verbs->hrxq) {
> +				rte_flow_error_set
> +					(error, errno,
> +					 RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
> +					 NULL,
> +					 "cannot get drop hash queue");
> +				goto error;
> +			}
> +		} else {
> +			struct mlx5_hrxq *hrxq;
> +
> +			hrxq = mlx5_hrxq_get(dev, flow->key,
> +					     rss_hash_default_key_len,
> +					     verbs->hash_fields,
> +					     (*flow->queue),
> +					     flow->rss.queue_num, 0, 0);
> +			if (!hrxq)
> +				hrxq = mlx5_hrxq_new(dev, flow->key,
> +						     rss_hash_default_key_len,
> +						     verbs->hash_fields,
> +						     (*flow->queue),
> +						     flow->rss.queue_num, 0, 0);
> +			if (!hrxq) {
> +				rte_flow_error_set
> +					(error, rte_errno,
> +					 RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
> +					 NULL,
> +					 "cannot get hash queue");
> +				goto error;
> +			}
> +			verbs->hrxq = hrxq;
> +		}
> +		verbs->flow =
> +			mlx5_glue->create_flow(verbs->hrxq->qp, verbs->attr);
> +		if (!verbs->flow) {
> +			rte_flow_error_set(error, errno,
> +					   RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
> +					   NULL,
> +					   "hardware refuses to create flow");
> +			goto error;
> +		}
>  	}
>  	return 0;
> +error:
> +	err = rte_errno; /* Save rte_errno before cleanup. */
> +	LIST_FOREACH(verbs, &flow->verbs, next) {
> +		if (verbs->hrxq) {
> +			if (flow->fate & MLX5_FLOW_FATE_DROP)
> +				mlx5_hrxq_drop_release(dev, verbs->hrxq);
> +			else
> +				mlx5_hrxq_release(dev, verbs->hrxq);
> +			verbs->hrxq = NULL;
> +		}
> +	}
> +	rte_errno = err; /* Restore rte_errno. */
> +	return -rte_errno;
>  }
>  
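Not a bug, but the get-then-new sequence for the hash Rx queue keeps
getting duplicated. If it spreads further, a small wrapper might be
worth it -- illustrative only, the name is invented and the parameter
types are guessed from the call site above:

	static struct mlx5_hrxq *
	mlx5_hrxq_get_or_new(struct rte_eth_dev *dev, const uint8_t *key,
			     uint32_t key_len, uint64_t hash_fields,
			     const uint16_t *queue, uint32_t queue_num)
	{
		struct mlx5_hrxq *hrxq;

		hrxq = mlx5_hrxq_get(dev, key, key_len, hash_fields,
				     queue, queue_num, 0, 0);
		if (!hrxq)
			hrxq = mlx5_hrxq_new(dev, key, key_len, hash_fields,
					     queue, queue_num, 0, 0);
		return hrxq;
	}
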
>  /**
> @@ -1429,42 +2005,43 @@ mlx5_flow_list_create(struct rte_eth_dev *dev,
>  		      const struct rte_flow_action actions[],
>  		      struct rte_flow_error *error)
>  {
> -	struct rte_flow *flow;
> -	size_t size;
> +	struct rte_flow *flow = NULL;
> +	size_t size = 0;
>  	int ret;
>  
> -	ret = mlx5_flow_merge(dev, NULL, 0, attr, items, actions, error);
> +	ret = mlx5_flow_merge(dev, flow, size, attr, items, actions, error);
>  	if (ret < 0)
>  		return NULL;
>  	size = ret;
> -	flow = rte_zmalloc(__func__, size, 0);
> +	flow = rte_calloc(__func__, 1, size, 0);
>  	if (!flow) {
>  		rte_flow_error_set(error, ENOMEM,
>  				   RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
>  				   NULL,
> -				   "cannot allocate memory");
> +				   "not enough memory to create flow");
>  		return NULL;
>  	}
> -	flow->verbs.attr = (struct ibv_flow_attr *)(flow + 1);
> -	flow->verbs.specs = (uint8_t *)(flow->verbs.attr + 1);
>  	ret = mlx5_flow_merge(dev, flow, size, attr, items, actions, error);
> -	if (ret < 0)
> -		goto error;
> +	if (ret < 0) {
> +		rte_free(flow);
> +		return NULL;
> +	}
>  	assert((size_t)ret == size);
>  	if (dev->data->dev_started) {
>  		ret = mlx5_flow_fate_apply(dev, flow, error);
> -		if (ret < 0)
> -			goto error;
> +		if (ret < 0) {
> +			ret = rte_errno; /* Save rte_errno before cleanup. */
> +			if (flow) {
> +				mlx5_flow_fate_remove(dev, flow);
> +				rte_free(flow);
> +			}
> +			rte_errno = ret; /* Restore rte_errno. */
> +			return NULL;
> +		}
>  	}
>  	mlx5_flow_rxq_mark(dev, flow);
>  	TAILQ_INSERT_TAIL(list, flow, next);
>  	return flow;
> -error:
> -	ret = rte_errno; /* Save rte_errno before cleanup. */
> -	mlx5_flow_fate_remove(dev, flow);
> -	rte_free(flow);
> -	rte_errno = ret; /* Restore rte_errno. */
> -	return NULL;
>  }
>  
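Nit: in the dev_started branch above, flow cannot be NULL (the !flow
case already returned), so the inner if (flow) guard is dead and the
cleanup could be flattened to:

	if (ret < 0) {
		ret = rte_errno; /* Save rte_errno before cleanup. */
		mlx5_flow_fate_remove(dev, flow);
		rte_free(flow);
		rte_errno = ret; /* Restore rte_errno. */
		return NULL;
	}
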
>  /**
> @@ -1502,7 +2079,7 @@ mlx5_flow_list_destroy(struct rte_eth_dev *dev, struct mlx5_flows *list,
>  	struct priv *priv = dev->data->dev_private;
>  	struct rte_flow *rflow;
>  	const uint32_t mask = MLX5_FLOW_MOD_FLAG | MLX5_FLOW_MOD_MARK;
> -	int mark = 0;
> +	unsigned int i;
>  
>  	mlx5_flow_fate_remove(dev, flow);
>  	TAILQ_REMOVE(list, flow, next);
> @@ -1512,18 +2089,28 @@ mlx5_flow_list_destroy(struct rte_eth_dev *dev, struct mlx5_flows *list,
>  	}
>  	/*
>  	 * When a flow is removed and this flow has a flag/mark modifier, all
> -	 * flows needs to be parse to verify if the Rx queue use by the flow
> +	 * flows need to be parsed to verify whether the Rx queues used by the flow
>  	 * still need to track the flag/mark request.
>  	 */
> -	TAILQ_FOREACH(rflow, &priv->flows, next) {
> -		if (!(rflow->modifier & mask))
> -			continue;
> -		if (flow->queue == rflow->queue) {
> -			mark = 1;
> -			break;
> +	for (i = 0; i != flow->rss.queue_num; ++i) {
> +		int mark = 0;
> +
> +		TAILQ_FOREACH(rflow, &priv->flows, next) {
> +			unsigned int j;
> +
> +			if (!(rflow->modifier & mask))
> +				continue;
> +			for (j = 0; j != rflow->rss.queue_num; ++j) {
> +				if ((*flow->queue)[i] == (*rflow->queue)[j]) {
> +					mark = 1;
> +					break;
> +				}
> +			}
> +			if (mark)
> +				break;
>  		}
> +		(*priv->rxqs)[(*flow->queue)[i]]->mark = !!mark;
>  	}
> -	(*priv->rxqs)[flow->queue]->mark = !!mark;
>  	rte_free(flow);
>  }
>  
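The nested loops here essentially ask "is this Rx queue still used by
any remaining flow carrying a flag/mark modifier?". Pulling that
predicate into a helper might make the intent clearer -- illustrative
sketch only, the helper name is invented:

	/*
	 * Return 1 if any flow on the list with a flag/mark modifier
	 * still uses Rx queue index idx, 0 otherwise.
	 */
	static int
	mlx5_flow_rxq_is_marked(struct mlx5_flows *list, uint16_t idx,
				uint32_t mask)
	{
		struct rte_flow *rflow;
		unsigned int j;

		TAILQ_FOREACH(rflow, list, next) {
			if (!(rflow->modifier & mask))
				continue;
			for (j = 0; j != rflow->rss.queue_num; ++j)
				if ((*rflow->queue)[j] == idx)
					return 1;
		}
		return 0;
	}

The body of the outer for loop then shrinks to a single call plus the
rxq mark assignment.
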
> @@ -1654,7 +2241,7 @@ mlx5_ctrl_flow_vlan(struct rte_eth_dev *dev,
>  	struct priv *priv = dev->data->dev_private;
>  	const struct rte_flow_attr attr = {
>  		.ingress = 1,
> -		.priority = priv->config.flow_prio - 1,
> +		.priority = MLX5_FLOW_PRIO_RSVD,
>  	};
>  	struct rte_flow_item items[] = {
>  		{
> -- 
> 2.18.0
>