From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <yskoh@mellanox.com>
Received: from EUR01-HE1-obe.outbound.protection.outlook.com
 (mail-he1eur01on0058.outbound.protection.outlook.com [104.47.0.58])
 by dpdk.org (Postfix) with ESMTP id 122E25F25
 for <dev@dpdk.org>; Thu, 18 Oct 2018 10:00:56 +0200 (CEST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=Mellanox.com;
 s=selector1;
 h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck;
 bh=Sy4g+gTxmuHpJeSsG+P2LxZ6eCgIDV+++NggMkGAIx0=;
 b=B2kCYfOkEb2WlgiKxnFAoZTN+fxy+DDfwnVeX8F+4cpkK4+46f1ZkhYh69sy4QyfLwctrPeit08BfJkPu+9R5dhKgOQ9eli/9AE0mLjUyfnro1R0+kDq12BR+cXIfn1j4U2I0D61XbeskjDaZoY9RAhwAXjwJM54ecS/8Ld43GY=
Received: from DB3PR0502MB3980.eurprd05.prod.outlook.com (52.134.72.27) by
 DB3SPR01MB016.eurprd05.prod.outlook.com (52.134.70.155) with Microsoft SMTP
 Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id
 15.20.1228.31; Thu, 18 Oct 2018 08:00:54 +0000
Received: from DB3PR0502MB3980.eurprd05.prod.outlook.com
 ([fe80::f8a1:fcab:94f0:97cc]) by DB3PR0502MB3980.eurprd05.prod.outlook.com
 ([fe80::f8a1:fcab:94f0:97cc%3]) with mapi id 15.20.1228.033; Thu, 18 Oct 2018
 08:00:54 +0000
From: Yongseok Koh <yskoh@mellanox.com>
To: Dekel Peled <dekelp@mellanox.com>
CC: Shahaf Shuler <shahafs@mellanox.com>, "dev@dpdk.org" <dev@dpdk.org>, Ori
 Kam <orika@mellanox.com>
Thread-Topic: [PATCH v4] net/mlx5: support metadata as flow rule criteria
Thread-Index: AQHUZhA4AaTi6ALDNUW2mycYWkLgUKUkpTiA
Date: Thu, 18 Oct 2018 08:00:53 +0000
Message-ID: <20181018080039.GA29378@mtidpdk.mti.labs.mlnx>
References: <1539256741-9407-1-git-send-email-dekelp@mellanox.com>
 <1539777217-64116-1-git-send-email-dekelp@mellanox.com>
In-Reply-To: <1539777217-64116-1-git-send-email-dekelp@mellanox.com>
Accept-Language: en-US
Content-Language: en-US
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
x-clientproxiedby: BYAPR02CA0069.namprd02.prod.outlook.com
 (2603:10b6:a03:54::46) To DB3PR0502MB3980.eurprd05.prod.outlook.com
 (2603:10a6:8:10::27)
authentication-results: spf=none (sender IP is )
 smtp.mailfrom=yskoh@mellanox.com; 
x-ms-exchange-messagesentrepresentingtype: 1
x-originating-ip: [209.116.155.178]
x-ms-publictraffictype: Email
Content-Type: text/plain; charset="us-ascii"
Content-ID: <BD769C6B86D1424AAAC419A1F1AF24F4@eurprd05.prod.outlook.com>
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
X-OriginatorOrg: Mellanox.com
X-MS-Exchange-CrossTenant-Network-Message-Id: f2c9a311-8e50-4d98-cdc3-08d634cfd379
X-MS-Exchange-CrossTenant-originalarrivaltime: 18 Oct 2018 08:00:53.9416 (UTC)
X-MS-Exchange-CrossTenant-fromentityheader: Hosted
X-MS-Exchange-CrossTenant-id: a652971c-7d2e-4d9b-a6a4-d149256f461b
X-MS-Exchange-Transport-CrossTenantHeadersStamped: DB3SPR01MB016
Subject: Re: [dpdk-dev] [PATCH v4] net/mlx5: support metadata as flow rule
	criteria
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions <dev.dpdk.org>
List-Unsubscribe: <https://mails.dpdk.org/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://mails.dpdk.org/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <https://mails.dpdk.org/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
X-List-Received-Date: Thu, 18 Oct 2018 08:00:56 -0000

On Wed, Oct 17, 2018 at 02:53:37PM +0300, Dekel Peled wrote:
> As described in series starting at [1], it adds option to set
> metadata value as match pattern when creating a new flow rule.
>=20
> This patch adds metadata support in mlx5 driver, in two parts:
> - Add the validation and setting of metadata value in matcher,
>   when creating a new flow rule.
> - Add the passing of metadata value from mbuf to wqe when
>   indicated by ol_flag, in different burst functions.
>=20
> [1] "ethdev: support metadata as flow rule criteria"
>     http://mails.dpdk.org/archives/dev/2018-October/115469.html
>=20
> ---
> v4:
> - Rebase.
> - Apply code review comments.
> v3:
> - Update meta item validation.
> v2:
> - Split the support of egress rules to a different patch.
> ---
> =09
> Signed-off-by: Dekel Peled <dekelp@mellanox.com>
> ---
>  drivers/net/mlx5/mlx5_flow.c          |   2 +-
>  drivers/net/mlx5/mlx5_flow.h          |   8 +++
>  drivers/net/mlx5/mlx5_flow_dv.c       | 109 ++++++++++++++++++++++++++++++++++
>  drivers/net/mlx5/mlx5_prm.h           |   2 +-
>  drivers/net/mlx5/mlx5_rxtx.c          |  33 ++++++++--
>  drivers/net/mlx5/mlx5_rxtx_vec.c      |  38 +++++++++---
>  drivers/net/mlx5/mlx5_rxtx_vec.h      |   1 +
>  drivers/net/mlx5/mlx5_rxtx_vec_neon.h |   9 ++-
>  drivers/net/mlx5/mlx5_rxtx_vec_sse.h  |  10 ++--
>  drivers/net/mlx5/mlx5_txq.c           |   6 ++
>  10 files changed, 192 insertions(+), 26 deletions(-)
>=20
> diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
> index bd70fce..15262f6 100644
> --- a/drivers/net/mlx5/mlx5_flow.c
> +++ b/drivers/net/mlx5/mlx5_flow.c
> @@ -418,7 +418,7 @@ uint32_t mlx5_flow_adjust_priority(struct rte_eth_dev *dev, int32_t priority,
>   * @return
>   *   0 on success, a negative errno value otherwise and rte_errno is set.
>   */
> -static int
> +int
>  mlx5_flow_item_acceptable(const struct rte_flow_item *item,
>  			  const uint8_t *mask,
>  			  const uint8_t *nic_mask,
> diff --git a/drivers/net/mlx5/mlx5_flow.h b/drivers/net/mlx5/mlx5_flow.h
> index 094f666..834a6ed 100644
> --- a/drivers/net/mlx5/mlx5_flow.h
> +++ b/drivers/net/mlx5/mlx5_flow.h
> @@ -43,6 +43,9 @@
>  #define MLX5_FLOW_LAYER_GRE (1u << 14)
>  #define MLX5_FLOW_LAYER_MPLS (1u << 15)
> =20
> +/* General pattern items bits. */
> +#define MLX5_FLOW_ITEM_METADATA (1u << 16)
> +
>  /* Outer Masks. */
>  #define MLX5_FLOW_LAYER_OUTER_L3 \
>  	(MLX5_FLOW_LAYER_OUTER_L3_IPV4 | MLX5_FLOW_LAYER_OUTER_L3_IPV6)
> @@ -307,6 +310,11 @@ int mlx5_flow_validate_action_rss(const struct rte_flow_action *action,
>  int mlx5_flow_validate_attributes(struct rte_eth_dev *dev,
>  				  const struct rte_flow_attr *attributes,
>  				  struct rte_flow_error *error);
> +int mlx5_flow_item_acceptable(const struct rte_flow_item *item,
> +			      const uint8_t *mask,
> +			      const uint8_t *nic_mask,
> +			      unsigned int size,
> +			      struct rte_flow_error *error);
>  int mlx5_flow_validate_item_eth(const struct rte_flow_item *item,
>  				uint64_t item_flags,
>  				struct rte_flow_error *error);
> diff --git a/drivers/net/mlx5/mlx5_flow_dv.c b/drivers/net/mlx5/mlx5_flow_dv.c
> index a013201..bfddfab 100644
> --- a/drivers/net/mlx5/mlx5_flow_dv.c
> +++ b/drivers/net/mlx5/mlx5_flow_dv.c
> @@ -36,6 +36,69 @@
>  #ifdef HAVE_IBV_FLOW_DV_SUPPORT
> =20
>  /**
> + * Validate META item.
> + *
> + * @param[in] dev
> + *   Pointer to the rte_eth_dev structure.
> + * @param[in] item
> + *   Item specification.
> + * @param[in] attr
> + *   Attributes of flow that includes this item.
> + * @param[out] error
> + *   Pointer to error structure.
> + *
> + * @return
> + *   0 on success, a negative errno value otherwise and rte_errno is set.
> + */
> +static int
> +flow_dv_validate_item_meta(struct rte_eth_dev *dev,
> +			   const struct rte_flow_item *item,
> +			   const struct rte_flow_attr *attr,
> +			   struct rte_flow_error *error)
> +{
> +	const struct rte_flow_item_meta *spec =3D item->spec;
> +	const struct rte_flow_item_meta *mask =3D item->mask;
> +

No blank line.

> +	const struct rte_flow_item_meta nic_mask =3D {
> +		.data =3D RTE_BE32(UINT32_MAX)
> +	};
> +

Ditto.

> +	int ret;
> +	uint64_t offloads =3D dev->data->dev_conf.txmode.offloads;
> +
> +	if (!(offloads & DEV_TX_OFFLOAD_MATCH_METADATA))
> +		return rte_flow_error_set(error, EPERM,
> +					  RTE_FLOW_ERROR_TYPE_ITEM,
> +					  NULL,
> +					  "match on metadata offload "
> +					  "configuration is off for this port");
> +	if (!spec)
> +		return rte_flow_error_set(error, EINVAL,
> +					  RTE_FLOW_ERROR_TYPE_ITEM_SPEC,
> +					  item->spec,
> +					  "data cannot be empty");
> +	if (!spec->data)
> +		return rte_flow_error_set(error, EINVAL,
> +					  RTE_FLOW_ERROR_TYPE_ITEM_SPEC,
> +					  NULL,
> +					  "data cannot be zero");
> +	if (!mask)
> +		mask =3D &rte_flow_item_meta_mask;
> +	ret =3D mlx5_flow_item_acceptable(item, (const uint8_t *)mask,
> +					(const uint8_t *)&nic_mask,
> +					sizeof(struct rte_flow_item_meta),
> +					error);
> +	if (ret < 0)
> +		return ret;
> +	if (attr->ingress)
> +		return rte_flow_error_set(error, ENOTSUP,
> +					  RTE_FLOW_ERROR_TYPE_ATTR_INGRESS,
> +					  NULL,
> +					  "pattern not supported for ingress");
> +	return 0;
> +}
> +
> +/**
>   * Verify the @p attributes will be correctly understood by the NIC and store
>   * them in the @p flow if everything is correct.
>   *
> @@ -214,6 +277,13 @@
>  				return ret;
>  			item_flags |=3D MLX5_FLOW_LAYER_MPLS;
>  			break;
> +		case RTE_FLOW_ITEM_TYPE_META:
> +			ret =3D flow_dv_validate_item_meta(dev, items, attr,
> +							 error);
> +			if (ret < 0)
> +				return ret;
> +			item_flags |=3D MLX5_FLOW_ITEM_METADATA;
> +			break;
>  		default:
>  			return rte_flow_error_set(error, ENOTSUP,
>  						  RTE_FLOW_ERROR_TYPE_ITEM,
> @@ -855,6 +925,42 @@
>  }
> =20
>  /**
> + * Add META item to matcher
> + *
> + * @param[in, out] matcher
> + *   Flow matcher.
> + * @param[in, out] key
> + *   Flow matcher value.
> + * @param[in] item
> + *   Flow pattern to translate.
> + * @param[in] inner
> + *   Item is inner pattern.
> + */
> +static void
> +flow_dv_translate_item_meta(void *matcher, void *key,
> +				const struct rte_flow_item *item)
> +{
> +	const struct rte_flow_item_meta *meta_m;
> +	const struct rte_flow_item_meta *meta_v;
> +
> +	void *misc2_m =3D
> +		MLX5_ADDR_OF(fte_match_param, matcher, misc_parameters_2);
> +	void *misc2_v =3D
> +		MLX5_ADDR_OF(fte_match_param, key, misc_parameters_2);
> +
> +	meta_m =3D (const void *)item->mask;
> +	if (!meta_m)
> +		meta_m =3D &rte_flow_item_meta_mask;
> +	meta_v =3D (const void *)item->spec;
> +	if (meta_v) {
> +		MLX5_SET(fte_match_set_misc2, misc2_m, metadata_reg_a,
> +			RTE_BE32(meta_m->data));

Nope. RTE_BE32() is meant for built-in (compile-time) constants, not for
variables. You should use rte_cpu_to_be_32() instead.
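
To illustrate the distinction, here is a minimal sketch. SK_BE32_CONST() and
sk_cpu_to_be32() are hypothetical stand-ins written for this example, not the
DPDK definitions, and the macro assumes a GCC/Clang-style __BYTE_ORDER__ (it
falls back to little-endian when that is undefined). The macro form is a
constant expression, so it can be used in a static initializer; run-time
values go through the function.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical constant-only macro: usable in static initializers. */
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
#define SK_BE32_CONST(x) ((uint32_t)(x))
#else /* Assumed little-endian when the compiler does not say otherwise. */
#define SK_BE32_CONST(x) \
	((uint32_t)((((uint32_t)(x) & 0x000000ffu) << 24) | \
		    (((uint32_t)(x) & 0x0000ff00u) << 8) | \
		    (((uint32_t)(x) & 0x00ff0000u) >> 8) | \
		    (((uint32_t)(x) & 0xff000000u) >> 24)))
#endif

/* A constant expression is fine in a static initializer. */
static const uint32_t sk_nic_mask = SK_BE32_CONST(UINT32_MAX);

/* Hypothetical run-time conversion: lays the bytes out most significant first. */
static inline uint32_t
sk_cpu_to_be32(uint32_t v)
{
	const uint8_t b[4] = {
		(uint8_t)(v >> 24), (uint8_t)(v >> 16),
		(uint8_t)(v >> 8), (uint8_t)v,
	};
	uint32_t out;

	memcpy(&out, b, sizeof(out));
	return out;
}
```

In the patch, meta_m->data and meta_v->data are only known at run time, so
they belong on the function side of this split.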

> +		MLX5_SET(fte_match_set_misc2, misc2_v, metadata_reg_a,
> +			RTE_BE32(meta_v->data));

Same here.

> +	}
> +}
> +
> +/**
>   * Update the matcher and the value based the selected item.
>   *
>   * @param[in, out] matcher
> @@ -940,6 +1046,9 @@
>  		flow_dv_translate_item_vxlan(tmatcher->mask.buf, key, item,
>  					     inner);
>  		break;
> +	case RTE_FLOW_ITEM_TYPE_META:
> +		flow_dv_translate_item_meta(tmatcher->mask.buf, key, item);
> +		break;
>  	default:
>  		break;
>  	}
> diff --git a/drivers/net/mlx5/mlx5_prm.h b/drivers/net/mlx5/mlx5_prm.h
> index 69296a0..29742b1 100644
> --- a/drivers/net/mlx5/mlx5_prm.h
> +++ b/drivers/net/mlx5/mlx5_prm.h
> @@ -159,7 +159,7 @@ struct mlx5_wqe_eth_seg_small {
>  	uint8_t	cs_flags;
>  	uint8_t	rsvd1;
>  	uint16_t mss;
> -	uint32_t rsvd2;
> +	uint32_t flow_table_metadata;
>  	uint16_t inline_hdr_sz;
>  	uint8_t inline_hdr[2];
>  } __rte_aligned(MLX5_WQE_DWORD_SIZE);
> diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
> index 558e6b6..5b4d2fd 100644
> --- a/drivers/net/mlx5/mlx5_rxtx.c
> +++ b/drivers/net/mlx5/mlx5_rxtx.c
> @@ -523,6 +523,7 @@
>  		uint8_t tso =3D txq->tso_en && (buf->ol_flags & PKT_TX_TCP_SEG);
>  		uint32_t swp_offsets =3D 0;
>  		uint8_t swp_types =3D 0;
> +		uint32_t metadata;
>  		uint16_t tso_segsz =3D 0;
>  #ifdef MLX5_PMD_SOFT_COUNTERS
>  		uint32_t total_length =3D 0;
> @@ -566,6 +567,10 @@
>  		cs_flags =3D txq_ol_cksum_to_cs(buf);
>  		txq_mbuf_to_swp(txq, buf, (uint8_t *)&swp_offsets, &swp_types);
>  		raw =3D ((uint8_t *)(uintptr_t)wqe) + 2 * MLX5_WQE_DWORD_SIZE;
> +		/* Copy metadata from mbuf if valid */
> +		metadata =3D buf->ol_flags & PKT_TX_METADATA ?
> +						buf->tx_metadata : 0;

Indentation.

> +

No blank line.

>  		/* Replace the Ethernet type by the VLAN if necessary. */
>  		if (buf->ol_flags & PKT_TX_VLAN_PKT) {
>  			uint32_t vlan =3D rte_cpu_to_be_32(0x81000000 |
> @@ -781,7 +786,7 @@
>  				swp_offsets,
>  				cs_flags | (swp_types << 8) |
>  				(rte_cpu_to_be_16(tso_segsz) << 16),
> -				0,
> +				rte_cpu_to_be_32(metadata),
>  				(ehdr << 16) | rte_cpu_to_be_16(tso_header_sz),
>  			};
>  		} else {
> @@ -795,7 +800,7 @@
>  			wqe->eseg =3D (rte_v128u32_t){
>  				swp_offsets,
>  				cs_flags | (swp_types << 8),
> -				0,
> +				rte_cpu_to_be_32(metadata),
>  				(ehdr << 16) | rte_cpu_to_be_16(pkt_inline_sz),
>  			};
>  		}
> @@ -861,7 +866,7 @@
>  	mpw->wqe->eseg.inline_hdr_sz =3D 0;
>  	mpw->wqe->eseg.rsvd0 =3D 0;
>  	mpw->wqe->eseg.rsvd1 =3D 0;
> -	mpw->wqe->eseg.rsvd2 =3D 0;
> +	mpw->wqe->eseg.flow_table_metadata =3D 0;
>  	mpw->wqe->ctrl[0] =3D rte_cpu_to_be_32((MLX5_OPC_MOD_MPW << 24) |
>  					     (txq->wqe_ci << 8) |
>  					     MLX5_OPCODE_TSO);
> @@ -948,6 +953,7 @@
>  		uint32_t length;
>  		unsigned int segs_n =3D buf->nb_segs;
>  		uint32_t cs_flags;
> +		uint32_t metadata;
> =20
>  		/*
>  		 * Make sure there is enough room to store this packet and
> @@ -964,6 +970,9 @@
>  		max_elts -=3D segs_n;
>  		--pkts_n;
>  		cs_flags =3D txq_ol_cksum_to_cs(buf);
> +		/* Copy metadata from mbuf if valid */
> +		metadata =3D buf->ol_flags & PKT_TX_METADATA ?
> +						buf->tx_metadata : 0;

Indentation.
And doesn't this need to be converted to big-endian? I think it does.
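
A minimal sketch of what I mean, assuming the descriptor field holds the
value in big-endian order. All sk_* names and the flag bit are hypothetical
and exist only for this example. The point is to convert once when reading
the mbuf, so that both the store into the descriptor and the later
comparison against it use the same byte order.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical flag bit and types, defined only for this sketch. */
#define SK_PKT_TX_METADATA (1ULL << 0)

struct sk_mbuf {
	uint64_t ol_flags;
	uint32_t tx_metadata; /* host byte order */
};

struct sk_eseg {
	uint32_t flow_table_metadata; /* big-endian, as written to the WQE */
};

/* Portable host-to-big-endian conversion (stand-in for rte_cpu_to_be_32()). */
static inline uint32_t
sk_cpu_to_be32(uint32_t v)
{
	const uint8_t b[4] = {
		(uint8_t)(v >> 24), (uint8_t)(v >> 16),
		(uint8_t)(v >> 8), (uint8_t)v,
	};
	uint32_t out;

	memcpy(&out, b, sizeof(out));
	return out;
}

/* Metadata in descriptor byte order, or 0 when the flag is not set. */
static inline uint32_t
sk_metadata_be(const struct sk_mbuf *buf)
{
	assert(buf != NULL);
	return (buf->ol_flags & SK_PKT_TX_METADATA) ?
	       sk_cpu_to_be32(buf->tx_metadata) : 0;
}

/* Non-zero when the open session carries different metadata. */
static inline int
sk_session_differs(const struct sk_eseg *eseg, uint32_t metadata_be)
{
	return eseg->flow_table_metadata != metadata_be;
}
```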

>  		/* Retrieve packet information. */
>  		length =3D PKT_LEN(buf);
>  		assert(length);
> @@ -971,6 +980,7 @@
>  		if ((mpw.state =3D=3D MLX5_MPW_STATE_OPENED) &&
>  		    ((mpw.len !=3D length) ||
>  		     (segs_n !=3D 1) ||
> +		     (mpw.wqe->eseg.flow_table_metadata !=3D metadata) ||
>  		     (mpw.wqe->eseg.cs_flags !=3D cs_flags)))
>  			mlx5_mpw_close(txq, &mpw);
>  		if (mpw.state =3D=3D MLX5_MPW_STATE_CLOSED) {
> @@ -984,6 +994,7 @@
>  			max_wqe -=3D 2;
>  			mlx5_mpw_new(txq, &mpw, length);
>  			mpw.wqe->eseg.cs_flags =3D cs_flags;
> +			mpw.wqe->eseg.flow_table_metadata =3D metadata;
>  		}
>  		/* Multi-segment packets must be alone in their MPW. */
>  		assert((segs_n =3D=3D 1) || (mpw.pkts_n =3D=3D 0));
> @@ -1082,7 +1093,7 @@
>  	mpw->wqe->eseg.cs_flags =3D 0;
>  	mpw->wqe->eseg.rsvd0 =3D 0;
>  	mpw->wqe->eseg.rsvd1 =3D 0;
> -	mpw->wqe->eseg.rsvd2 =3D 0;
> +	mpw->wqe->eseg.flow_table_metadata =3D 0;
>  	inl =3D (struct mlx5_wqe_inl_small *)
>  		(((uintptr_t)mpw->wqe) + 2 * MLX5_WQE_DWORD_SIZE);
>  	mpw->data.raw =3D (uint8_t *)&inl->raw;
> @@ -1172,6 +1183,7 @@
>  		uint32_t length;
>  		unsigned int segs_n =3D buf->nb_segs;
>  		uint8_t cs_flags;
> +		uint32_t metadata;
> =20
>  		/*
>  		 * Make sure there is enough room to store this packet and
> @@ -1193,18 +1205,23 @@
>  		 */
>  		max_wqe =3D (1u << txq->wqe_n) - (txq->wqe_ci - txq->wqe_pi);
>  		cs_flags =3D txq_ol_cksum_to_cs(buf);
> +		/* Copy metadata from mbuf if valid */
> +		metadata =3D buf->ol_flags & PKT_TX_METADATA ?
> +						buf->tx_metadata : 0;

Indentation.
And doesn't this need to be converted to big-endian?

>  		/* Retrieve packet information. */
>  		length =3D PKT_LEN(buf);
>  		/* Start new session if packet differs. */
>  		if (mpw.state =3D=3D MLX5_MPW_STATE_OPENED) {
>  			if ((mpw.len !=3D length) ||
>  			    (segs_n !=3D 1) ||
> +			    (mpw.wqe->eseg.flow_table_metadata !=3D metadata) ||
>  			    (mpw.wqe->eseg.cs_flags !=3D cs_flags))
>  				mlx5_mpw_close(txq, &mpw);
>  		} else if (mpw.state =3D=3D MLX5_MPW_INL_STATE_OPENED) {
>  			if ((mpw.len !=3D length) ||
>  			    (segs_n !=3D 1) ||
>  			    (length > inline_room) ||
> +			    (mpw.wqe->eseg.flow_table_metadata !=3D metadata) ||
>  			    (mpw.wqe->eseg.cs_flags !=3D cs_flags)) {
>  				mlx5_mpw_inline_close(txq, &mpw);
>  				inline_room =3D
> @@ -1224,12 +1241,14 @@
>  				max_wqe -=3D 2;
>  				mlx5_mpw_new(txq, &mpw, length);
>  				mpw.wqe->eseg.cs_flags =3D cs_flags;
> +				mpw.wqe->eseg.flow_table_metadata =3D metadata;
>  			} else {
>  				if (unlikely(max_wqe < wqe_inl_n))
>  					break;
>  				max_wqe -=3D wqe_inl_n;
>  				mlx5_mpw_inline_new(txq, &mpw, length);
>  				mpw.wqe->eseg.cs_flags =3D cs_flags;
> +				mpw.wqe->eseg.flow_table_metadata =3D metadata;
>  			}
>  		}
>  		/* Multi-segment packets must be alone in their MPW. */
> @@ -1461,6 +1480,7 @@
>  		unsigned int do_inline =3D 0; /* Whether inline is possible. */
>  		uint32_t length;
>  		uint8_t cs_flags;
> +		uint32_t metadata;
> =20
>  		/* Multi-segmented packet is handled in slow-path outside. */
>  		assert(NB_SEGS(buf) =3D=3D 1);
> @@ -1468,6 +1488,9 @@
>  		if (max_elts - j =3D=3D 0)
>  			break;
>  		cs_flags =3D txq_ol_cksum_to_cs(buf);
> +		/* Copy metadata from mbuf if valid */
> +		metadata =3D buf->ol_flags & PKT_TX_METADATA ?
> +						buf->tx_metadata : 0;

Indentation.
And doesn't this need to be converted to big-endian?

>  		/* Retrieve packet information. */
>  		length =3D PKT_LEN(buf);
>  		/* Start new session if:
> @@ -1482,6 +1505,7 @@
>  			    (length <=3D txq->inline_max_packet_sz &&
>  			     inl_pad + sizeof(inl_hdr) + length >
>  			     mpw_room) ||
> +			     (mpw.wqe->eseg.flow_table_metadata !=3D metadata) ||
>  			    (mpw.wqe->eseg.cs_flags !=3D cs_flags))
>  				max_wqe -=3D mlx5_empw_close(txq, &mpw);
>  		}
> @@ -1505,6 +1529,7 @@
>  				    sizeof(inl_hdr) + length <=3D mpw_room &&
>  				    !txq->mpw_hdr_dseg;
>  			mpw.wqe->eseg.cs_flags =3D cs_flags;
> +			mpw.wqe->eseg.flow_table_metadata =3D metadata;
>  		} else {
>  			/* Evaluate whether the next packet can be inlined.
>  			 * Inlininig is possible when:
> diff --git a/drivers/net/mlx5/mlx5_rxtx_vec.c b/drivers/net/mlx5/mlx5_rxt=
x_vec.c
> index 0a4aed8..16a8608 100644
> --- a/drivers/net/mlx5/mlx5_rxtx_vec.c
> +++ b/drivers/net/mlx5/mlx5_rxtx_vec.c
> @@ -41,6 +41,8 @@
> =20
>  /**
>   * Count the number of packets having same ol_flags and calculate cs_flags.
> + * If PKT_TX_METADATA is set in ol_flags, packets must have same metadata
> + * as well.

Packets can have different metadata; we just want to count the number of
packets having the same metadata. Please correct the comment.

>   *
>   * @param pkts
>   *   Pointer to array of packets.
> @@ -48,26 +50,41 @@
>   *   Number of packets.
>   * @param cs_flags
>   *   Pointer of flags to be returned.
> + * @param metadata
> + *   Pointer of metadata to be returned.
> + * @param txq_offloads
> + *   Offloads enabled on Tx queue
>   *
>   * @return
> - *   Number of packets having same ol_flags.
> + *   Number of packets having same ol_flags and metadata, if relevant.
>   */
>  static inline unsigned int
> -txq_calc_offload(struct rte_mbuf **pkts, uint16_t pkts_n, uint8_t *cs_flags)
> +txq_calc_offload(struct rte_mbuf **pkts, uint16_t pkts_n, uint8_t *cs_flags,
> +		 uint32_t *metadata, const uint64_t txq_offloads)
>  {
>  	unsigned int pos;
>  	const uint64_t ol_mask =3D
>  		PKT_TX_IP_CKSUM | PKT_TX_TCP_CKSUM |
>  		PKT_TX_UDP_CKSUM | PKT_TX_TUNNEL_GRE |
> -		PKT_TX_TUNNEL_VXLAN | PKT_TX_OUTER_IP_CKSUM;
> +		PKT_TX_TUNNEL_VXLAN | PKT_TX_OUTER_IP_CKSUM | PKT_TX_METADATA;

PKT_TX_METADATA shouldn't be added here. As this mask is for cksum, you might
rather want to rename it, e.g., cksum_ol_mask.

> =20
>  	if (!pkts_n)
>  		return 0;
>  	/* Count the number of packets having same ol_flags. */

This comment has to be corrected and moved.

> -	for (pos =3D 1; pos < pkts_n; ++pos)
> -		if ((pkts[pos]->ol_flags ^ pkts[0]->ol_flags) & ol_mask)
> +	for (pos =3D 1; pos < pkts_n; ++pos) {
> +		if ((txq_offloads & MLX5_VEC_TX_CKSUM_OFFLOAD_CAP) &&
> +			((pkts[pos]->ol_flags ^ pkts[0]->ol_flags) & ol_mask))

Indentation.

>  			break;
> +		/* If the metadata ol_flag is set,
> +		 *  metadata must be same in all packets.
> +		 */

Correct the comment. The first line should be empty in a multi-line comment.
And it can't be 'must'. We are not forcing it, just counting the number of
packets having the same metadata, as I mentioned above.

> +		if ((txq_offloads & DEV_TX_OFFLOAD_MATCH_METADATA) &&
> +			(pkts[pos]->ol_flags & PKT_TX_METADATA) &&
> +			pkts[0]->tx_metadata !=3D pkts[pos]->tx_metadata)

Disagree. What if pkts[0] doesn't have PKT_TX_METADATA while pkts[1] has it?
And, indentation.

> +			break;
> +	}
>  	*cs_flags =3D txq_ol_cksum_to_cs(pkts[0]);
> +	*metadata =3D rte_cpu_to_be_32(pkts[0]->tx_metadata);

Same here. You should check if pkts[0] has metadata first.

>  	return pos;

Here's my suggestion for the whole func.

	unsigned int pos;
	const uint64_t cksum_ol_mask =
		PKT_TX_IP_CKSUM | PKT_TX_TCP_CKSUM |
		PKT_TX_UDP_CKSUM | PKT_TX_TUNNEL_GRE |
		PKT_TX_TUNNEL_VXLAN | PKT_TX_OUTER_IP_CKSUM;
	uint32_t p0_metadata;

	if (!pkts_n)
		return 0;
	p0_metadata = pkts[0]->ol_flags & PKT_TX_METADATA ?
		      pkts[0]->tx_metadata : 0;
	/* Count the number of packets having same offload parameters. */
	for (pos = 1; pos < pkts_n; ++pos) {
		/* Check if packet can have same checksum flags. */
		if ((txq_offloads & MLX5_VEC_TX_CKSUM_OFFLOAD_CAP) &&
		    ((pkts[pos]->ol_flags ^ pkts[0]->ol_flags) & cksum_ol_mask))
			break;
		/* Check if packet has same metadata. */
		if (txq_offloads & DEV_TX_OFFLOAD_MATCH_METADATA) {
			const uint32_t p1_metadata =
				pkts[pos]->ol_flags & PKT_TX_METADATA ?
				pkts[pos]->tx_metadata : 0;

			if (p1_metadata != p0_metadata)
				break;
		}
	}
	*cs_flags = txq_ol_cksum_to_cs(pkts[0]);
	*metadata = rte_cpu_to_be_32(p0_metadata);
	return pos;
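
For reference, the same counting logic as a standalone sketch that compiles
without the driver headers. Every sk_* name and flag bit below is
hypothetical and stands in for the mbuf and offload definitions. The single
checksum bit stands in for the whole checksum mask, and the output is left in
host order, whereas the driver version would convert it to big-endian. It
mainly shows how the case where pkts[0] lacks the metadata flag but a later
packet has it is handled: a packet without the flag counts as metadata 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag and offload bits, chosen only for this sketch. */
#define SK_PKT_TX_METADATA  (1ULL << 0)
#define SK_PKT_TX_IP_CKSUM  (1ULL << 1)
#define SK_OFFLOAD_CKSUM    (1ULL << 0)
#define SK_OFFLOAD_METADATA (1ULL << 1)

struct sk_mbuf {
	uint64_t ol_flags;
	uint32_t tx_metadata; /* only meaningful with SK_PKT_TX_METADATA */
};

/* A packet without the flag is treated as carrying metadata 0. */
static inline uint32_t
sk_effective_metadata(const struct sk_mbuf *m)
{
	return (m->ol_flags & SK_PKT_TX_METADATA) ? m->tx_metadata : 0;
}

/* Count leading packets sharing pkts[0]'s checksum flag and metadata. */
static unsigned int
sk_count_same_offload(struct sk_mbuf **pkts, unsigned int pkts_n,
		      uint64_t txq_offloads, uint32_t *metadata)
{
	unsigned int pos;
	uint32_t p0_metadata;

	if (!pkts_n)
		return 0;
	assert(pkts != NULL && metadata != NULL);
	p0_metadata = sk_effective_metadata(pkts[0]);
	for (pos = 1; pos < pkts_n; ++pos) {
		/* Stop at the first packet with a different checksum flag. */
		if ((txq_offloads & SK_OFFLOAD_CKSUM) &&
		    ((pkts[pos]->ol_flags ^ pkts[0]->ol_flags) &
		     SK_PKT_TX_IP_CKSUM))
			break;
		/* Stop at the first packet with different metadata. */
		if ((txq_offloads & SK_OFFLOAD_METADATA) &&
		    sk_effective_metadata(pkts[pos]) != p0_metadata)
			break;
	}
	*metadata = p0_metadata; /* host order in this sketch */
	return pos;
}
```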
>  }
> =20
> @@ -96,7 +113,7 @@
>  		uint16_t ret;
> =20
>  		n =3D RTE_MIN((uint16_t)(pkts_n - nb_tx), MLX5_VPMD_TX_MAX_BURST);
> -		ret =3D txq_burst_v(txq, &pkts[nb_tx], n, 0);
> +		ret =3D txq_burst_v(txq, &pkts[nb_tx], n, 0, 0);
>  		nb_tx +=3D ret;
>  		if (!ret)
>  			break;
> @@ -127,6 +144,7 @@
>  		uint8_t cs_flags =3D 0;
>  		uint16_t n;
>  		uint16_t ret;
> +		uint32_t metadata =3D 0;

Let's use rte_be32_t instead.

> =20
>  		/* Transmit multi-seg packets in the head of pkts list. */
>  		if ((txq->offloads & DEV_TX_OFFLOAD_MULTI_SEGS) &&
> @@ -137,9 +155,11 @@
>  		n =3D RTE_MIN((uint16_t)(pkts_n - nb_tx), MLX5_VPMD_TX_MAX_BURST);
>  		if (txq->offloads & DEV_TX_OFFLOAD_MULTI_SEGS)
>  			n =3D txq_count_contig_single_seg(&pkts[nb_tx], n);
> -		if (txq->offloads & MLX5_VEC_TX_CKSUM_OFFLOAD_CAP)
> -			n =3D txq_calc_offload(&pkts[nb_tx], n, &cs_flags);
> -		ret =3D txq_burst_v(txq, &pkts[nb_tx], n, cs_flags);
> +		if (txq->offloads & (MLX5_VEC_TX_CKSUM_OFFLOAD_CAP |
> +				DEV_TX_OFFLOAD_MATCH_METADATA))

Indentation.

> +			n =3D txq_calc_offload(&pkts[nb_tx], n,
> +					&cs_flags, &metadata, txq->offloads);

Indentation.

> +		ret =3D txq_burst_v(txq, &pkts[nb_tx], n, cs_flags, metadata);
>  		nb_tx +=3D ret;
>  		if (!ret)
>  			break;
> diff --git a/drivers/net/mlx5/mlx5_rxtx_vec.h b/drivers/net/mlx5/mlx5_rxtx_vec.h
> index fb884f9..fda7004 100644
> --- a/drivers/net/mlx5/mlx5_rxtx_vec.h
> +++ b/drivers/net/mlx5/mlx5_rxtx_vec.h
> @@ -22,6 +22,7 @@
>  /* HW offload capabilities of vectorized Tx. */
>  #define MLX5_VEC_TX_OFFLOAD_CAP \
>  	(MLX5_VEC_TX_CKSUM_OFFLOAD_CAP | \
> +	 DEV_TX_OFFLOAD_MATCH_METADATA | \
>  	 DEV_TX_OFFLOAD_MULTI_SEGS)
> =20
>  /*
> diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
> index b37b738..a8a4d7b 100644
> --- a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
> +++ b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
> @@ -201,13 +201,15 @@
>   *   Number of packets to be sent (<=3D MLX5_VPMD_TX_MAX_BURST).
>   * @param cs_flags
>   *   Checksum offload flags to be written in the descriptor.
> + * @param metadata
> + *   Metadata value to be written in the descriptor.
>   *
>   * @return
>   *   Number of packets successfully transmitted (<=3D pkts_n).
>   */
>  static inline uint16_t
>  txq_burst_v(struct mlx5_txq_data *txq, struct rte_mbuf **pkts, uint16_t pkts_n,
> -	    uint8_t cs_flags)
> +	    uint8_t cs_flags, uint32_t metadata)

Let's use rte_be32_t instead.

>  {
>  	struct rte_mbuf **elts;
>  	uint16_t elts_head =3D txq->elts_head;
> @@ -294,10 +296,7 @@
>  	vst1q_u8((void *)t_wqe, ctrl);
>  	/* Fill ESEG in the header. */
>  	vst1q_u8((void *)(t_wqe + 1),
> -		 ((uint8x16_t) { 0, 0, 0, 0,
> -				 cs_flags, 0, 0, 0,
> -				 0, 0, 0, 0,
> -				 0, 0, 0, 0 }));
> +		 ((uint32x4_t) { 0, cs_flags, metadata, 0 }));
>  #ifdef MLX5_PMD_SOFT_COUNTERS
>  	txq->stats.opackets +=3D pkts_n;
>  #endif
> diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
> index 54b3783..31aae4a 100644
> --- a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
> +++ b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
> @@ -202,13 +202,15 @@
>   *   Number of packets to be sent (<=3D MLX5_VPMD_TX_MAX_BURST).
>   * @param cs_flags
>   *   Checksum offload flags to be written in the descriptor.
> + * @param metadata
> + *   Metadata value to be written in the descriptor.
>   *
>   * @return
>   *   Number of packets successfully transmitted (<=3D pkts_n).
>   */
>  static inline uint16_t
>  txq_burst_v(struct mlx5_txq_data *txq, struct rte_mbuf **pkts, uint16_t pkts_n,
> -	    uint8_t cs_flags)
> +	    uint8_t cs_flags, uint32_t metadata)

Let's use rte_be32_t instead.

>  {
>  	struct rte_mbuf **elts;
>  	uint16_t elts_head =3D txq->elts_head;
> @@ -292,11 +294,7 @@
>  	ctrl =3D _mm_shuffle_epi8(ctrl, shuf_mask_ctrl);
>  	_mm_store_si128(t_wqe, ctrl);
>  	/* Fill ESEG in the header. */
> -	_mm_store_si128(t_wqe + 1,
> -			_mm_set_epi8(0, 0, 0, 0,
> -				     0, 0, 0, 0,
> -				     0, 0, 0, cs_flags,
> -				     0, 0, 0, 0));
> +	_mm_store_si128(t_wqe + 1, _mm_set_epi32(0, metadata, cs_flags, 0));
>  #ifdef MLX5_PMD_SOFT_COUNTERS
>  	txq->stats.opackets +=3D pkts_n;
>  #endif
> diff --git a/drivers/net/mlx5/mlx5_txq.c b/drivers/net/mlx5/mlx5_txq.c
> index f9bc473..7263fb1 100644
> --- a/drivers/net/mlx5/mlx5_txq.c
> +++ b/drivers/net/mlx5/mlx5_txq.c
> @@ -128,6 +128,12 @@
>  			offloads |=3D (DEV_TX_OFFLOAD_VXLAN_TNL_TSO |
>  				     DEV_TX_OFFLOAD_GRE_TNL_TSO);
>  	}
> +

Please no blank line.

> +#ifdef HAVE_IBV_FLOW_DV_SUPPORT
> +	if (config->dv_flow_en)
> +		offloads |=3D DEV_TX_OFFLOAD_MATCH_METADATA;
> +#endif
> +

Same here.

>  	return offloads;
>  }
> =20
> --=20
> 1.8.3.1
>=20