DPDK patches and discussions
* [dpdk-dev] [PATCH 0/4] Add delay drop support for Rx queue
@ 2021-11-04 11:26 Bing Zhao
  2021-11-04 11:26 ` [dpdk-dev] [PATCH 1/4] common/mlx5: support delay drop capabilities query Bing Zhao
                   ` (9 more replies)
  0 siblings, 10 replies; 29+ messages in thread
From: Bing Zhao @ 2021-11-04 11:26 UTC (permalink / raw)
  To: viacheslavo, matan; +Cc: dev, rasland, thomas, orika

This patch set introduces a new devarg to support delay drop when
creating the Rx queues.

Bing Zhao (4):
  common/mlx5: support delay drop capabilities query
  net/mlx5: add support for Rx queue delay drop
  net/mlx5: support querying delay drop status via ethtool
  doc: update the description for Rx delay drop

 doc/guides/nics/mlx5.rst                |  26 ++++++
 doc/guides/rel_notes/release_21_11.rst  |   1 +
 drivers/common/mlx5/mlx5_devx_cmds.c    |   1 +
 drivers/common/mlx5/mlx5_devx_cmds.h    |   1 +
 drivers/net/mlx5/linux/mlx5_ethdev_os.c | 113 ++++++++++++++++++++++++
 drivers/net/mlx5/linux/mlx5_os.c        |  11 +++
 drivers/net/mlx5/mlx5.c                 |   7 ++
 drivers/net/mlx5/mlx5.h                 |  10 +++
 drivers/net/mlx5/mlx5_devx.c            |   5 ++
 drivers/net/mlx5/mlx5_rx.h              |   1 +
 drivers/net/mlx5/mlx5_trigger.c         |  10 +++
 11 files changed, 186 insertions(+)

-- 
2.27.0



* [dpdk-dev] [PATCH 1/4] common/mlx5: support delay drop capabilities query
  2021-11-04 11:26 [dpdk-dev] [PATCH 0/4] Add delay drop support for Rx queue Bing Zhao
@ 2021-11-04 11:26 ` Bing Zhao
  2021-11-04 11:26 ` [dpdk-dev] [PATCH 2/4] net/mlx5: add support for Rx queue delay drop Bing Zhao
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 29+ messages in thread
From: Bing Zhao @ 2021-11-04 11:26 UTC (permalink / raw)
  To: viacheslavo, matan; +Cc: dev, rasland, thomas, orika

The "rq_delay_drop" capability in the HCA_CAP is checked and saved
in the output data structure for future use.

Signed-off-by: Bing Zhao <bingz@nvidia.com>
---
 drivers/common/mlx5/mlx5_devx_cmds.c | 1 +
 drivers/common/mlx5/mlx5_devx_cmds.h | 1 +
 2 files changed, 2 insertions(+)

diff --git a/drivers/common/mlx5/mlx5_devx_cmds.c b/drivers/common/mlx5/mlx5_devx_cmds.c
index 12c114a91b..eaf1dd5046 100644
--- a/drivers/common/mlx5/mlx5_devx_cmds.c
+++ b/drivers/common/mlx5/mlx5_devx_cmds.c
@@ -962,6 +962,7 @@ mlx5_devx_cmd_query_hca_attr(void *ctx,
 	attr->ct_offload = !!(MLX5_GET64(cmd_hca_cap, hcattr,
 					 general_obj_types) &
 			      MLX5_GENERAL_OBJ_TYPES_CAP_CONN_TRACK_OFFLOAD);
+	attr->rq_delay_drop = MLX5_GET(cmd_hca_cap, hcattr, rq_delay_drop);
 	if (attr->qos.sup) {
 		hcattr = mlx5_devx_get_hca_cap(ctx, in, out, &rc,
 				MLX5_GET_HCA_CAP_OP_MOD_QOS_CAP |
diff --git a/drivers/common/mlx5/mlx5_devx_cmds.h b/drivers/common/mlx5/mlx5_devx_cmds.h
index 2326f1e968..25e2814ac0 100644
--- a/drivers/common/mlx5/mlx5_devx_cmds.h
+++ b/drivers/common/mlx5/mlx5_devx_cmds.h
@@ -176,6 +176,7 @@ struct mlx5_hca_attr {
 	uint32_t swp_csum:1;
 	uint32_t swp_lso:1;
 	uint32_t lro_max_msg_sz_mode:2;
+	uint32_t rq_delay_drop:1;
 	uint32_t lro_timer_supported_periods[MLX5_LRO_NUM_SUPP_PERIODS];
 	uint16_t lro_min_mss_size;
 	uint32_t flex_parser_protocols;
-- 
2.27.0



* [dpdk-dev] [PATCH 2/4] net/mlx5: add support for Rx queue delay drop
  2021-11-04 11:26 [dpdk-dev] [PATCH 0/4] Add delay drop support for Rx queue Bing Zhao
  2021-11-04 11:26 ` [dpdk-dev] [PATCH 1/4] common/mlx5: support delay drop capabilities query Bing Zhao
@ 2021-11-04 11:26 ` Bing Zhao
  2021-11-04 14:01   ` David Marchand
  2021-11-04 11:26 ` [dpdk-dev] [PATCH 3/4] net/mlx5: support querying delay drop status via ethtool Bing Zhao
                   ` (7 subsequent siblings)
  9 siblings, 1 reply; 29+ messages in thread
From: Bing Zhao @ 2021-11-04 11:26 UTC (permalink / raw)
  To: viacheslavo, matan; +Cc: dev, rasland, thomas, orika

For an Ethernet RQ, packets received when receive WQEs are exhausted
are dropped. This behavior prevents slow or malicious software
entities at the host from affecting the network. For hairpin queues,
however, even though no software is involved in forwarding packets
from the Rx to the Tx side, a hiccup in the hardware or back
pressure from the Tx side may still exhaust the WQEs. In certain
scenarios it may be preferable to configure the device to avoid
such packet drops, assuming the posting of WQEs will resume
shortly.

To support this, a new devarg "delay_drop_en" is introduced. By
default, delay drop is enabled for hairpin Rx queues and disabled
for standard Rx queues. The value is used as a bit mask:
  - bit 0: enablement of standard Rx queue
  - bit 1: enablement of hairpin Rx queue
The setting applies to all Rx queues of a device.

If the hardware does not support delay drop, all the Rx queues are
still created, but without this attribute, and the devarg setting
is ignored.
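
As a rough illustration of the mask handling described above, the
following self-contained sketch decodes a "delay_drop_en" value into
per-queue-type flags and clears both when the capability is absent.
The struct delay_drop_cfg and the functions delay_drop_parse() and
delay_drop_apply_cap() are hypothetical names used only for this
sketch, not driver code; only the bit assignments come from the text
above. The "!!" normalization matters when the flags are one-bit
fields: assigning the raw value 2 (bit 1) to such a field would
truncate it to 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bit assignments taken from the commit message. */
#define DELAY_DROP_STANDARD (UINT32_C(1) << 0)
#define DELAY_DROP_HAIRPIN  (UINT32_C(1) << 1)

/* Hypothetical stand-in for the relevant part of the device config. */
struct delay_drop_cfg {
	unsigned int std_delay_drop:1;
	unsigned int hp_delay_drop:1;
};

/* Decode the devarg value; "!!" keeps bit 1 from being truncated to 0. */
static void
delay_drop_parse(struct delay_drop_cfg *cfg, uint32_t devarg)
{
	assert(cfg != NULL);
	cfg->std_delay_drop = !!(devarg & DELAY_DROP_STANDARD);
	cfg->hp_delay_drop = !!(devarg & DELAY_DROP_HAIRPIN);
}

/*
 * Clear both flags when the HCA does not report the capability.
 * Returns 1 if a requested setting was overridden, 0 otherwise.
 */
static int
delay_drop_apply_cap(struct delay_drop_cfg *cfg, int rq_delay_drop_cap)
{
	assert(cfg != NULL);
	if (rq_delay_drop_cap ||
	    (!cfg->std_delay_drop && !cfg->hp_delay_drop))
		return 0;
	cfg->std_delay_drop = 0;
	cfg->hp_delay_drop = 0;
	return 1;
}
```

With this sketch, a value of 0x2 selects hairpin queues only and 0x3
selects both queue types; a return value of 1 from
delay_drop_apply_cap() is the point where a warning would be logged.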

Signed-off-by: Bing Zhao <bingz@nvidia.com>
---
 drivers/net/mlx5/linux/mlx5_os.c | 11 +++++++++++
 drivers/net/mlx5/mlx5.c          |  7 +++++++
 drivers/net/mlx5/mlx5.h          |  9 +++++++++
 drivers/net/mlx5/mlx5_devx.c     |  5 +++++
 drivers/net/mlx5/mlx5_rx.h       |  1 +
 5 files changed, 33 insertions(+)

diff --git a/drivers/net/mlx5/linux/mlx5_os.c b/drivers/net/mlx5/linux/mlx5_os.c
index f51da8c3a3..def2cca3cd 100644
--- a/drivers/net/mlx5/linux/mlx5_os.c
+++ b/drivers/net/mlx5/linux/mlx5_os.c
@@ -1506,6 +1506,15 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev,
 		goto error;
 #endif
 	}
+	if (config->std_delay_drop || config->hp_delay_drop) {
+		if (!config->hca_attr.rq_delay_drop) {
+			config->std_delay_drop = 0;
+			config->hp_delay_drop = 0;
+			DRV_LOG(WARNING,
+				"dev_port-%u: Rxq delay drop is not supported",
+				priv->dev_port);
+		}
+	}
 	if (sh->devx) {
 		uint32_t reg[MLX5_ST_SZ_DW(register_mtutc)];
 
@@ -2075,6 +2084,8 @@ mlx5_os_config_default(struct mlx5_dev_config *config)
 	config->decap_en = 1;
 	config->log_hp_size = MLX5_ARG_UNSET;
 	config->allow_duplicate_pattern = 1;
+	config->std_delay_drop = 0;
+	config->hp_delay_drop = 1;
 }
 
 /**
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index dc15688f21..80a6692b94 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -183,6 +183,9 @@
 /* Device parameter to configure implicit registration of mempool memory. */
 #define MLX5_MR_MEMPOOL_REG_EN "mr_mempool_reg_en"
 
+/* Device parameter to configure the delay drop when creating Rxqs. */
+#define MLX5_DELAY_DROP_EN "delay_drop_en"
+
 /* Shared memory between primary and secondary processes. */
 struct mlx5_shared_data *mlx5_shared_data;
 
@@ -2095,6 +2098,9 @@ mlx5_args_check(const char *key, const char *val, void *opaque)
 		config->decap_en = !!tmp;
 	} else if (strcmp(MLX5_ALLOW_DUPLICATE_PATTERN, key) == 0) {
 		config->allow_duplicate_pattern = !!tmp;
+	} else if (strcmp(MLX5_DELAY_DROP_EN, key) == 0) {
+		config->std_delay_drop = !!(tmp & MLX5_DELAY_DROP_STANDARD);
+		config->hp_delay_drop = !!(tmp & MLX5_DELAY_DROP_HAIRPIN);
 	} else {
 		DRV_LOG(WARNING, "%s: unknown parameter", key);
 		rte_errno = EINVAL;
@@ -2157,6 +2163,7 @@ mlx5_args(struct mlx5_dev_config *config, struct rte_devargs *devargs)
 		MLX5_DECAP_EN,
 		MLX5_ALLOW_DUPLICATE_PATTERN,
 		MLX5_MR_MEMPOOL_REG_EN,
+		MLX5_DELAY_DROP_EN,
 		NULL,
 	};
 	struct rte_kvargs *kvlist;
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 74af88ec19..8d32d55c9a 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -99,6 +99,13 @@ enum mlx5_flow_type {
 	MLX5_FLOW_TYPE_MAXI,
 };
 
+/* The mode of delay drop for Rx queues. */
+enum mlx5_delay_drop_mode {
+	MLX5_DELAY_DROP_NONE = 0, /* All disabled. */
+	MLX5_DELAY_DROP_STANDARD = RTE_BIT32(0), /* Standard queues enable. */
+	MLX5_DELAY_DROP_HAIRPIN = RTE_BIT32(1), /* Hairpin queues enable. */
+};
+
 /* Hlist and list callback context. */
 struct mlx5_flow_cb_ctx {
 	struct rte_eth_dev *dev;
@@ -264,6 +271,8 @@ struct mlx5_dev_config {
 	unsigned int dv_miss_info:1; /* restore packet after partial hw miss */
 	unsigned int allow_duplicate_pattern:1;
 	/* Allow/Prevent the duplicate rules pattern. */
+	unsigned int std_delay_drop:1; /* Enable standard Rxq delay drop. */
+	unsigned int hp_delay_drop:1; /* Enable hairpin Rxq delay drop. */
 	struct {
 		unsigned int enabled:1; /* Whether MPRQ is enabled. */
 		unsigned int stride_num_n; /* Number of strides. */
diff --git a/drivers/net/mlx5/mlx5_devx.c b/drivers/net/mlx5/mlx5_devx.c
index 424f77be79..2e1d849eab 100644
--- a/drivers/net/mlx5/mlx5_devx.c
+++ b/drivers/net/mlx5/mlx5_devx.c
@@ -280,6 +280,7 @@ mlx5_rxq_create_devx_rq_resources(struct rte_eth_dev *dev,
 						MLX5_WQ_END_PAD_MODE_NONE;
 	rq_attr.wq_attr.pd = cdev->pdn;
 	rq_attr.counter_set_id = priv->counter_set_id;
+	rq_attr.delay_drop_en = rxq_data->delay_drop;
 	/* Create RQ using DevX API. */
 	return mlx5_devx_rq_create(cdev->ctx, &rxq_ctrl->obj->rq_obj, wqe_size,
 				   log_desc_n, &rq_attr, rxq_ctrl->socket);
@@ -443,6 +444,8 @@ mlx5_rxq_obj_hairpin_new(struct rte_eth_dev *dev, uint16_t idx)
 			attr.wq_attr.log_hairpin_data_sz -
 			MLX5_HAIRPIN_QUEUE_STRIDE;
 	attr.counter_set_id = priv->counter_set_id;
+	rxq_data->delay_drop = priv->config.hp_delay_drop;
+	attr.delay_drop_en = priv->config.hp_delay_drop;
 	tmpl->rq = mlx5_devx_cmd_create_rq(priv->sh->cdev->ctx, &attr,
 					   rxq_ctrl->socket);
 	if (!tmpl->rq) {
@@ -503,6 +506,7 @@ mlx5_rxq_devx_obj_new(struct rte_eth_dev *dev, uint16_t idx)
 		DRV_LOG(ERR, "Failed to create CQ.");
 		goto error;
 	}
+	rxq_data->delay_drop = priv->config.std_delay_drop;
 	/* Create RQ using DevX API. */
 	ret = mlx5_rxq_create_devx_rq_resources(dev, rxq_data);
 	if (ret) {
@@ -921,6 +925,7 @@ mlx5_rxq_devx_obj_drop_create(struct rte_eth_dev *dev)
 	rxq_ctrl->priv = priv;
 	rxq_ctrl->obj = rxq;
 	rxq_data = &rxq_ctrl->rxq;
+	rxq_data->delay_drop = 0;
 	/* Create CQ using DevX API. */
 	ret = mlx5_rxq_create_devx_cq_resources(dev, rxq_data);
 	if (ret != 0) {
diff --git a/drivers/net/mlx5/mlx5_rx.h b/drivers/net/mlx5/mlx5_rx.h
index 69b1263339..05807764b8 100644
--- a/drivers/net/mlx5/mlx5_rx.h
+++ b/drivers/net/mlx5/mlx5_rx.h
@@ -92,6 +92,7 @@ struct mlx5_rxq_data {
 	unsigned int lro:1; /* Enable LRO. */
 	unsigned int dynf_meta:1; /* Dynamic metadata is configured. */
 	unsigned int mcqe_format:3; /* CQE compression format. */
+	unsigned int delay_drop:1; /* Enable delay drop. */
 	volatile uint32_t *rq_db;
 	volatile uint32_t *cq_db;
 	uint16_t port_id;
-- 
2.27.0



* [dpdk-dev] [PATCH 3/4] net/mlx5: support querying delay drop status via ethtool
  2021-11-04 11:26 [dpdk-dev] [PATCH 0/4] Add delay drop support for Rx queue Bing Zhao
  2021-11-04 11:26 ` [dpdk-dev] [PATCH 1/4] common/mlx5: support delay drop capabilities query Bing Zhao
  2021-11-04 11:26 ` [dpdk-dev] [PATCH 2/4] net/mlx5: add support for Rx queue delay drop Bing Zhao
@ 2021-11-04 11:26 ` Bing Zhao
  2021-11-04 11:26 ` [dpdk-dev] [PATCH 4/4] doc: update the description for Rx delay drop Bing Zhao
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 29+ messages in thread
From: Bing Zhao @ 2021-11-04 11:26 UTC (permalink / raw)
  To: viacheslavo, matan; +Cc: dev, rasland, thomas, orika

Delay drop is global per PF device, and the kernel driver takes
care of the initialization and rearming. By default, a timeout
value is set to activate the delay drop when the driver is loaded.

A private flag "dropless_rq" controls the rearming. Only when it is
on is the delay drop rearmed after a timeout event. Otherwise, the
delay drop is deactivated after the first timeout and none of the
Rx queues keeps this feature.

The PMD queries this flag and warns the application when some
queues are created with delay drop but the flag is off.
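
The flag lookup can be pictured with the following sketch, which
does not talk to the kernel. It assumes the private flag names have
already been fetched as a table of fixed-width entries (as
ETHTOOL_GSTRINGS returns them) and the flag bits as a 32-bit mask
(as ETHTOOL_GPFLAGS returns them). The helper names
priv_flag_index() and priv_flag_is_on() are hypothetical, and
GSTRING_LEN mirrors ETH_GSTRING_LEN from <linux/ethtool.h>.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Same value as ETH_GSTRING_LEN in <linux/ethtool.h>. */
#define GSTRING_LEN 32

/*
 * Find a flag name in a table of n fixed-width entries that may lack a
 * terminating NUL. Returns the entry index, or -1 if not found.
 */
static int
priv_flag_index(const uint8_t *table, uint32_t n, const char *name)
{
	size_t name_len = strlen(name);
	uint32_t i;

	assert(table != NULL);
	if (name_len >= GSTRING_LEN)
		return -1;
	for (i = 0; i < n; i++) {
		const uint8_t *entry = table + (size_t)i * GSTRING_LEN;

		/* Compare the name and its NUL, staying inside the entry. */
		if (memcmp(entry, name, name_len + 1) == 0)
			return (int)i;
	}
	return -1;
}

/*
 * Returns 1 if the named flag is on, 0 if it is off, and -1 if the flag
 * is unknown or its index does not fit in the 32-bit mask.
 */
static int
priv_flag_is_on(const uint8_t *table, uint32_t n, uint32_t flags,
		const char *name)
{
	int idx = priv_flag_index(table, n, name);

	if (idx < 0 || idx >= 32)
		return -1;
	return !!(flags & (UINT32_C(1) << idx));
}
```

A result of 0 for "dropless_rq" corresponds to the case where the
PMD warns that no rearming will happen; -1 corresponds to a kernel
driver that does not expose the flag at all.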

Signed-off-by: Bing Zhao <bingz@nvidia.com>
---
 drivers/net/mlx5/linux/mlx5_ethdev_os.c | 113 ++++++++++++++++++++++++
 drivers/net/mlx5/mlx5.h                 |   1 +
 drivers/net/mlx5/mlx5_trigger.c         |  10 +++
 3 files changed, 124 insertions(+)

diff --git a/drivers/net/mlx5/linux/mlx5_ethdev_os.c b/drivers/net/mlx5/linux/mlx5_ethdev_os.c
index 9d0e491d0c..9255877dab 100644
--- a/drivers/net/mlx5/linux/mlx5_ethdev_os.c
+++ b/drivers/net/mlx5/linux/mlx5_ethdev_os.c
@@ -1630,3 +1630,116 @@ mlx5_get_mac(struct rte_eth_dev *dev, uint8_t (*mac)[RTE_ETHER_ADDR_LEN])
 	memcpy(mac, request.ifr_hwaddr.sa_data, RTE_ETHER_ADDR_LEN);
 	return 0;
 }
+
+/*
+ * Query dropless_rq private flag provided by ETHTOOL.
+ *
+ * @param dev
+ *   Pointer to Ethernet device.
+ *
+ * @return
+ *   0 on success, negative errno value otherwise and rte_errno is set.
+ */
+int mlx5_get_flag_dropless_rq(struct rte_eth_dev *dev)
+{
+	struct {
+		struct ethtool_sset_info hdr;
+		uint32_t buf[1];
+	} sset_info;
+	struct ethtool_drvinfo drvinfo;
+	struct ifreq ifr;
+	struct ethtool_gstrings *strings = NULL;
+	struct ethtool_value flags;
+	int32_t str_sz;
+	int32_t len;
+	int32_t i;
+	int ret;
+
+	sset_info.hdr.cmd = ETHTOOL_GSSET_INFO;
+	sset_info.hdr.reserved = 0;
+	sset_info.hdr.sset_mask = 1ULL << ETH_SS_PRIV_FLAGS;
+	ifr.ifr_data = (caddr_t)&sset_info;
+	ret = mlx5_ifreq(dev, SIOCETHTOOL, &ifr);
+	if (!ret) {
+		const uint32_t *sset_lengths = sset_info.hdr.data;
+
+		len = sset_info.hdr.sset_mask ? sset_lengths[0] : 0;
+	} else if (ret == -EOPNOTSUPP) {
+		drvinfo.cmd = ETHTOOL_GDRVINFO;
+		ifr.ifr_data = (caddr_t)&drvinfo;
+		ret = mlx5_ifreq(dev, SIOCETHTOOL, &ifr);
+		if (ret) {
+			DRV_LOG(WARNING, "port %u cannot get the driver info",
+				dev->data->port_id);
+			goto exit;
+		}
+		len = *(uint32_t *)((char *)&drvinfo +
+			offsetof(struct ethtool_drvinfo, n_priv_flags));
+	} else {
+		DRV_LOG(WARNING, "port %u cannot get the sset info",
+			dev->data->port_id);
+		goto exit;
+	}
+	if (!len) {
+		DRV_LOG(WARNING, "port %u does not have private flag",
+			dev->data->port_id);
+		rte_errno = EOPNOTSUPP;
+		ret = -rte_errno;
+		goto exit;
+	} else if (len > 32) {
+		DRV_LOG(WARNING, "port %u maximal private flags number is 32",
+			dev->data->port_id);
+		len = 32;
+	}
+	str_sz = ETH_GSTRING_LEN * len;
+	strings = (struct ethtool_gstrings *)
+		  mlx5_malloc(0, str_sz + sizeof(struct ethtool_gstrings), 0,
+			      SOCKET_ID_ANY);
+	if (!strings) {
+		DRV_LOG(WARNING, "port %u unable to allocate memory for"
+			" private flags", dev->data->port_id);
+		rte_errno = ENOMEM;
+		ret = -rte_errno;
+		goto exit;
+	}
+	strings->cmd = ETHTOOL_GSTRINGS;
+	strings->string_set = ETH_SS_PRIV_FLAGS;
+	strings->len = len;
+	ifr.ifr_data = (caddr_t)strings;
+	ret = mlx5_ifreq(dev, SIOCETHTOOL, &ifr);
+	if (ret) {
+		DRV_LOG(WARNING, "port %u unable to get private flags strings",
+			dev->data->port_id);
+		goto exit;
+	}
+	for (i = 0; i < len; i++) {
+		strings->data[(i + 1) * ETH_GSTRING_LEN - 1] = 0;
+		if (!strcmp((const char *)strings->data + i * ETH_GSTRING_LEN,
+			     "dropless_rq"))
+			break;
+	}
+	if (i == len) {
+		DRV_LOG(WARNING, "port %u does not support dropless_rq",
+			dev->data->port_id);
+		rte_errno = EOPNOTSUPP;
+		ret = -rte_errno;
+		goto exit;
+	}
+	flags.cmd = ETHTOOL_GPFLAGS;
+	ifr.ifr_data = (caddr_t)&flags;
+	ret = mlx5_ifreq(dev, SIOCETHTOOL, &ifr);
+	if (ret) {
+		DRV_LOG(WARNING, "port %u unable to get private flags status",
+			dev->data->port_id);
+		goto exit;
+	}
+	if (!(flags.data & (1U << i)))
+		DRV_LOG(WARNING, "port %u dropless_rq flag is off, no rearming",
+			dev->data->port_id);
+	else
+		DRV_LOG(DEBUG, "port %u support dropless_rq with rearming",
+			dev->data->port_id);
+exit:
+	mlx5_free(strings);
+	return ret;
+}
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 8d32d55c9a..e0f40ce31a 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -1599,6 +1599,7 @@ int mlx5_os_read_dev_stat(struct mlx5_priv *priv,
 int mlx5_os_read_dev_counters(struct rte_eth_dev *dev, uint64_t *stats);
 int mlx5_os_get_stats_n(struct rte_eth_dev *dev);
 void mlx5_os_stats_init(struct rte_eth_dev *dev);
+int mlx5_get_flag_dropless_rq(struct rte_eth_dev *dev);
 
 /* mlx5_mac.c */
 
diff --git a/drivers/net/mlx5/mlx5_trigger.c b/drivers/net/mlx5/mlx5_trigger.c
index ebeeae279e..2644855483 100644
--- a/drivers/net/mlx5/mlx5_trigger.c
+++ b/drivers/net/mlx5/mlx5_trigger.c
@@ -1126,6 +1126,16 @@ mlx5_dev_start(struct rte_eth_dev *dev)
 			dev->data->port_id, strerror(rte_errno));
 		goto error;
 	}
+	if (priv->config.std_delay_drop || priv->config.hp_delay_drop) {
+		if (!priv->config.vf && !priv->config.sf &&
+		    !priv->representor) {
+			(void)mlx5_get_flag_dropless_rq(dev);
+		} else {
+			DRV_LOG(INFO,
+				"port %u doesn't support dropless_rq flag",
+				dev->data->port_id);
+		}
+	}
 	ret = mlx5_rxq_start(dev);
 	if (ret) {
 		DRV_LOG(ERR, "port %u Rx queue allocation failed: %s",
-- 
2.27.0



* [dpdk-dev] [PATCH 4/4] doc: update the description for Rx delay drop
  2021-11-04 11:26 [dpdk-dev] [PATCH 0/4] Add delay drop support for Rx queue Bing Zhao
                   ` (2 preceding siblings ...)
  2021-11-04 11:26 ` [dpdk-dev] [PATCH 3/4] net/mlx5: support querying delay drop status via ethtool Bing Zhao
@ 2021-11-04 11:26 ` Bing Zhao
  2021-11-04 14:01 ` [dpdk-dev] [PATCH v2 0/2] Add delay drop support for Rx queue Bing Zhao
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 29+ messages in thread
From: Bing Zhao @ 2021-11-04 11:26 UTC (permalink / raw)
  To: viacheslavo, matan; +Cc: dev, rasland, thomas, orika

The release note and new device parameter "delay_drop_en" for the
delay drop feature in mlx5.rst are updated.

Signed-off-by: Bing Zhao <bingz@nvidia.com>
---
 doc/guides/nics/mlx5.rst               | 26 ++++++++++++++++++++++++++
 doc/guides/rel_notes/release_21_11.rst |  1 +
 2 files changed, 27 insertions(+)

diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index bb92520dff..2874a34cb6 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -1123,6 +1123,27 @@ Driver options
 
   By default, the PMD will set this value to 1.
 
+- ``delay_drop_en`` parameter [int]
+
+  Bitmask value for the Rx queue delay drop attribute. Bit 0 is used for standard
+  Rx queue and bit 1 is used for hairpin Rx queue.
+  By default, the delay drop will be enabled for all hairpin Rx queues (if any)
+  and disabled for all standard Rx queues. It will be ignored if the NIC does
+  not support the attribute.
+  A timeout value is set in the driver to control the waiting time before dropping
+  a packet when there is no WQE available on a delay drop Rx queue. Once the timer
+  expires, the delay drop is deactivated for all queues. To re-activate it,
+  a rearming is needed, which is currently handled by the kernel driver.
+
+  To enable / disable the delay drop rearming, the private flag ``dropless_rq`` can
+  be set and queried via ethtool:
+
+  - ethtool --set-priv-flags <netdev> dropless_rq on (/ off)
+  - ethtool --show-priv-flags <netdev>
+
+  The configuration flag is global per PF and can only be set on the PF. Once it is on,
+  all the VFs', SFs' and representors' Rx queues will share the timer and rearming.
+
 .. _mlx5_firmware_config:
 
 Firmware configuration
@@ -1797,6 +1818,11 @@ Supported hardware offloads
    |                       | |               | | rdma-core 35  |
    |                       | |               | | ConnectX-6 Dx |
    +-----------------------+-----------------+-----------------+
+   | Rxq Delay drop        | | DPDK 21.11    | | DPDK 21.11    |
+   |                       | | OFED 5.5      | | OFED 5.5      |
+   |                       | | N/A           | | N/A           |
+   |                       | | ConnectX-5    | | ConnectX-5    |
+   +-----------------------+-----------------+-----------------+
 
 .. table:: Minimal SW/HW versions for shared action offload
    :name: sact
diff --git a/doc/guides/rel_notes/release_21_11.rst b/doc/guides/rel_notes/release_21_11.rst
index 13d8330873..76d18aeb6b 100644
--- a/doc/guides/rel_notes/release_21_11.rst
+++ b/doc/guides/rel_notes/release_21_11.rst
@@ -191,6 +191,7 @@ New Features
   * Added implicit mempool registration to avoid data path hiccups (opt-out).
   * Added NIC offloads for the PMD on Windows (TSO, VLAN strip, CRC keep).
   * Added socket direct mode bonding support.
+  * Added delay drop support for Rx queue.
 
 * **Updated Solarflare network PMD.**
 
-- 
2.27.0



* Re: [dpdk-dev] [PATCH 2/4] net/mlx5: add support for Rx queue delay drop
  2021-11-04 11:26 ` [dpdk-dev] [PATCH 2/4] net/mlx5: add support for Rx queue delay drop Bing Zhao
@ 2021-11-04 14:01   ` David Marchand
  2021-11-04 14:34     ` Bing Zhao
  0 siblings, 1 reply; 29+ messages in thread
From: David Marchand @ 2021-11-04 14:01 UTC (permalink / raw)
  To: Bing Zhao
  Cc: Slava Ovsiienko, Matan Azrad, dev, Raslan Darawsheh,
	Thomas Monjalon, Ori Kam

On Thu, Nov 4, 2021 at 12:27 PM Bing Zhao <bingz@nvidia.com> wrote:
>
> For an Ethernet RQ, packets received when receive WQEs are exhausted
> are dropped. This behavior prevents slow or malicious software
> entities at the host from affecting the network. While for hairpin
> cases, even if there is no software involved during the packet
> forwarding from Rx to Tx side, some hiccup in the hardware or back
> pressure from Tx side may still cause the WQEs to be exhausted. In
> certain scenarios it may be preferred to configure the device to
> avoid such packet drops, assuming the posting of WQEs will resume
> shortly.
>
> To support this, a new devarg "delay_drop_en" is introduced, by
> default, the delay drop is enabled for hairpin Rx queues and
> disabled for standard Rx queues. This value is used as a bit mask:
>   - bit 0: enablement of standard Rx queue
>   - bit 1: enablement of hairpin Rx queue
> And this attribute will be applied to all Rx queues of a device.

Rather than a devargs, why can't the driver use this option in the
identified usecases where it makes sense?
Here, hairpin.


-- 
David Marchand



* [dpdk-dev] [PATCH v2 0/2] Add delay drop support for Rx queue
  2021-11-04 11:26 [dpdk-dev] [PATCH 0/4] Add delay drop support for Rx queue Bing Zhao
                   ` (3 preceding siblings ...)
  2021-11-04 11:26 ` [dpdk-dev] [PATCH 4/4] doc: update the description for Rx delay drop Bing Zhao
@ 2021-11-04 14:01 ` Bing Zhao
  2021-11-04 14:01   ` [dpdk-dev] [PATCH v2 1/2] net/mlx5: add support for Rx queue delay drop Bing Zhao
  2021-11-04 14:01   ` [dpdk-dev] [PATCH v2 2/2] net/mlx5: check delay drop settings in kernel driver Bing Zhao
  2021-11-04 16:55 ` [dpdk-dev] [PATCH v3 0/2] Add delay drop support for Rx queue Bing Zhao
                   ` (4 subsequent siblings)
  9 siblings, 2 replies; 29+ messages in thread
From: Bing Zhao @ 2021-11-04 14:01 UTC (permalink / raw)
  To: viacheslavo, matan; +Cc: dev, rasland, thomas, orika

This patch set introduces a new devarg to support delay drop when
creating the Rx queues. The default attribute will be disabled and
the behavior will remain the same as before.

---
v2:
  - change hairpin queue delay drop to disable by default
  - combine the commits
  - fix Windows building
  - change the log print
---

Bing Zhao (2):
  net/mlx5: add support for Rx queue delay drop
  net/mlx5: check delay drop settings in kernel driver

 doc/guides/nics/mlx5.rst                  |  26 +++++
 doc/guides/rel_notes/release_21_11.rst    |   1 +
 drivers/common/mlx5/mlx5_devx_cmds.c      |   1 +
 drivers/common/mlx5/mlx5_devx_cmds.h      |   1 +
 drivers/net/mlx5/linux/mlx5_ethdev_os.c   | 114 ++++++++++++++++++++++
 drivers/net/mlx5/linux/mlx5_os.c          |  11 +++
 drivers/net/mlx5/mlx5.c                   |   7 ++
 drivers/net/mlx5/mlx5.h                   |  10 ++
 drivers/net/mlx5/mlx5_devx.c              |   5 +
 drivers/net/mlx5/mlx5_rx.h                |   1 +
 drivers/net/mlx5/mlx5_trigger.c           |  18 ++++
 drivers/net/mlx5/windows/mlx5_ethdev_os.c |  17 ++++
 12 files changed, 212 insertions(+)

-- 
2.27.0



* [dpdk-dev] [PATCH v2 1/2] net/mlx5: add support for Rx queue delay drop
  2021-11-04 14:01 ` [dpdk-dev] [PATCH v2 0/2] Add delay drop support for Rx queue Bing Zhao
@ 2021-11-04 14:01   ` Bing Zhao
  2021-11-04 14:01   ` [dpdk-dev] [PATCH v2 2/2] net/mlx5: check delay drop settings in kernel driver Bing Zhao
  1 sibling, 0 replies; 29+ messages in thread
From: Bing Zhao @ 2021-11-04 14:01 UTC (permalink / raw)
  To: viacheslavo, matan; +Cc: dev, rasland, thomas, orika

For an Ethernet RQ, packets received when receive WQEs are exhausted
are dropped. This behavior prevents slow or malicious software
entities at the host from affecting the network. For hairpin queues,
however, even though no software is involved in forwarding packets
from the Rx to the Tx side, a hiccup in the hardware or back
pressure from the Tx side may still exhaust the WQEs. In certain
scenarios it may be preferable to configure the device to avoid
such packet drops, assuming the posting of WQEs will resume
shortly.

To support this, a new devarg "delay_drop_en" is introduced. By
default, the delay drop is disabled for both hairpin and standard
Rx queues. The value is used as a bit mask:
  - bit 0: enablement of standard Rx queue
  - bit 1: enablement of hairpin Rx queue
The setting applies to all Rx queues of a device.

The "rq_delay_drop" capability in the HCA_CAP is checked before
creating any queue. If the hardware capabilities do not support
this delay drop, all the Rx queues will still be created without
this attribute, and the devarg setting will be ignored even if it
is specified explicitly.
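
The per-queue outcome of these settings can be summarized with a
small sketch. It mirrors the assignments this patch makes at queue
creation (standard queues take the standard flag, hairpin queues
take the hairpin flag, and the internal drop queue never enables
delay drop). The enum rxq_kind, the struct delay_drop_cfg and the
function rxq_delay_drop_en() are hypothetical names for
illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical queue classification used only in this sketch. */
enum rxq_kind {
	RXQ_STANDARD,
	RXQ_HAIRPIN,
	RXQ_DROP,
};

/* Hypothetical stand-in for the two config flags added by the patch. */
struct delay_drop_cfg {
	unsigned int std_delay_drop:1;
	unsigned int hp_delay_drop:1;
};

/* Return the delay drop attribute (0 or 1) for a queue of this kind. */
static unsigned int
rxq_delay_drop_en(const struct delay_drop_cfg *cfg, enum rxq_kind kind)
{
	assert(cfg != NULL);
	switch (kind) {
	case RXQ_STANDARD:
		return cfg->std_delay_drop;
	case RXQ_HAIRPIN:
		return cfg->hp_delay_drop;
	case RXQ_DROP:
	default:
		/* The drop queue always discards, so there is nothing to delay. */
		return 0;
	}
}
```

Because both flags have already been cleared when the capability is
missing, a selection step like this needs no capability check of
its own.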

Signed-off-by: Bing Zhao <bingz@nvidia.com>
---
 drivers/common/mlx5/mlx5_devx_cmds.c |  1 +
 drivers/common/mlx5/mlx5_devx_cmds.h |  1 +
 drivers/net/mlx5/linux/mlx5_os.c     | 11 +++++++++++
 drivers/net/mlx5/mlx5.c              |  7 +++++++
 drivers/net/mlx5/mlx5.h              |  9 +++++++++
 drivers/net/mlx5/mlx5_devx.c         |  5 +++++
 drivers/net/mlx5/mlx5_rx.h           |  1 +
 7 files changed, 35 insertions(+)

diff --git a/drivers/common/mlx5/mlx5_devx_cmds.c b/drivers/common/mlx5/mlx5_devx_cmds.c
index 12c114a91b..eaf1dd5046 100644
--- a/drivers/common/mlx5/mlx5_devx_cmds.c
+++ b/drivers/common/mlx5/mlx5_devx_cmds.c
@@ -962,6 +962,7 @@ mlx5_devx_cmd_query_hca_attr(void *ctx,
 	attr->ct_offload = !!(MLX5_GET64(cmd_hca_cap, hcattr,
 					 general_obj_types) &
 			      MLX5_GENERAL_OBJ_TYPES_CAP_CONN_TRACK_OFFLOAD);
+	attr->rq_delay_drop = MLX5_GET(cmd_hca_cap, hcattr, rq_delay_drop);
 	if (attr->qos.sup) {
 		hcattr = mlx5_devx_get_hca_cap(ctx, in, out, &rc,
 				MLX5_GET_HCA_CAP_OP_MOD_QOS_CAP |
diff --git a/drivers/common/mlx5/mlx5_devx_cmds.h b/drivers/common/mlx5/mlx5_devx_cmds.h
index 2326f1e968..25e2814ac0 100644
--- a/drivers/common/mlx5/mlx5_devx_cmds.h
+++ b/drivers/common/mlx5/mlx5_devx_cmds.h
@@ -176,6 +176,7 @@ struct mlx5_hca_attr {
 	uint32_t swp_csum:1;
 	uint32_t swp_lso:1;
 	uint32_t lro_max_msg_sz_mode:2;
+	uint32_t rq_delay_drop:1;
 	uint32_t lro_timer_supported_periods[MLX5_LRO_NUM_SUPP_PERIODS];
 	uint16_t lro_min_mss_size;
 	uint32_t flex_parser_protocols;
diff --git a/drivers/net/mlx5/linux/mlx5_os.c b/drivers/net/mlx5/linux/mlx5_os.c
index f51da8c3a3..e8894239ed 100644
--- a/drivers/net/mlx5/linux/mlx5_os.c
+++ b/drivers/net/mlx5/linux/mlx5_os.c
@@ -1506,6 +1506,15 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev,
 		goto error;
 #endif
 	}
+	if (config->std_delay_drop || config->hp_delay_drop) {
+		if (!config->hca_attr.rq_delay_drop) {
+			config->std_delay_drop = 0;
+			config->hp_delay_drop = 0;
+			DRV_LOG(WARNING,
+				"dev_port-%u: Rxq delay drop is not supported",
+				priv->dev_port);
+		}
+	}
 	if (sh->devx) {
 		uint32_t reg[MLX5_ST_SZ_DW(register_mtutc)];
 
@@ -2075,6 +2084,8 @@ mlx5_os_config_default(struct mlx5_dev_config *config)
 	config->decap_en = 1;
 	config->log_hp_size = MLX5_ARG_UNSET;
 	config->allow_duplicate_pattern = 1;
+	config->std_delay_drop = 0;
+	config->hp_delay_drop = 0;
 }
 
 /**
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index dc15688f21..80a6692b94 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -183,6 +183,9 @@
 /* Device parameter to configure implicit registration of mempool memory. */
 #define MLX5_MR_MEMPOOL_REG_EN "mr_mempool_reg_en"
 
+/* Device parameter to configure the delay drop when creating Rxqs. */
+#define MLX5_DELAY_DROP_EN "delay_drop_en"
+
 /* Shared memory between primary and secondary processes. */
 struct mlx5_shared_data *mlx5_shared_data;
 
@@ -2095,6 +2098,9 @@ mlx5_args_check(const char *key, const char *val, void *opaque)
 		config->decap_en = !!tmp;
 	} else if (strcmp(MLX5_ALLOW_DUPLICATE_PATTERN, key) == 0) {
 		config->allow_duplicate_pattern = !!tmp;
+	} else if (strcmp(MLX5_DELAY_DROP_EN, key) == 0) {
+		config->std_delay_drop = !!(tmp & MLX5_DELAY_DROP_STANDARD);
+		config->hp_delay_drop = !!(tmp & MLX5_DELAY_DROP_HAIRPIN);
 	} else {
 		DRV_LOG(WARNING, "%s: unknown parameter", key);
 		rte_errno = EINVAL;
@@ -2157,6 +2163,7 @@ mlx5_args(struct mlx5_dev_config *config, struct rte_devargs *devargs)
 		MLX5_DECAP_EN,
 		MLX5_ALLOW_DUPLICATE_PATTERN,
 		MLX5_MR_MEMPOOL_REG_EN,
+		MLX5_DELAY_DROP_EN,
 		NULL,
 	};
 	struct rte_kvargs *kvlist;
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 74af88ec19..8d32d55c9a 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -99,6 +99,13 @@ enum mlx5_flow_type {
 	MLX5_FLOW_TYPE_MAXI,
 };
 
+/* The mode of delay drop for Rx queues. */
+enum mlx5_delay_drop_mode {
+	MLX5_DELAY_DROP_NONE = 0, /* All disabled. */
+	MLX5_DELAY_DROP_STANDARD = RTE_BIT32(0), /* Standard queues enable. */
+	MLX5_DELAY_DROP_HAIRPIN = RTE_BIT32(1), /* Hairpin queues enable. */
+};
+
 /* Hlist and list callback context. */
 struct mlx5_flow_cb_ctx {
 	struct rte_eth_dev *dev;
@@ -264,6 +271,8 @@ struct mlx5_dev_config {
 	unsigned int dv_miss_info:1; /* restore packet after partial hw miss */
 	unsigned int allow_duplicate_pattern:1;
 	/* Allow/Prevent the duplicate rules pattern. */
+	unsigned int std_delay_drop:1; /* Enable standard Rxq delay drop. */
+	unsigned int hp_delay_drop:1; /* Enable hairpin Rxq delay drop. */
 	struct {
 		unsigned int enabled:1; /* Whether MPRQ is enabled. */
 		unsigned int stride_num_n; /* Number of strides. */
diff --git a/drivers/net/mlx5/mlx5_devx.c b/drivers/net/mlx5/mlx5_devx.c
index 424f77be79..2e1d849eab 100644
--- a/drivers/net/mlx5/mlx5_devx.c
+++ b/drivers/net/mlx5/mlx5_devx.c
@@ -280,6 +280,7 @@ mlx5_rxq_create_devx_rq_resources(struct rte_eth_dev *dev,
 						MLX5_WQ_END_PAD_MODE_NONE;
 	rq_attr.wq_attr.pd = cdev->pdn;
 	rq_attr.counter_set_id = priv->counter_set_id;
+	rq_attr.delay_drop_en = rxq_data->delay_drop;
 	/* Create RQ using DevX API. */
 	return mlx5_devx_rq_create(cdev->ctx, &rxq_ctrl->obj->rq_obj, wqe_size,
 				   log_desc_n, &rq_attr, rxq_ctrl->socket);
@@ -443,6 +444,8 @@ mlx5_rxq_obj_hairpin_new(struct rte_eth_dev *dev, uint16_t idx)
 			attr.wq_attr.log_hairpin_data_sz -
 			MLX5_HAIRPIN_QUEUE_STRIDE;
 	attr.counter_set_id = priv->counter_set_id;
+	rxq_data->delay_drop = priv->config.hp_delay_drop;
+	attr.delay_drop_en = priv->config.hp_delay_drop;
 	tmpl->rq = mlx5_devx_cmd_create_rq(priv->sh->cdev->ctx, &attr,
 					   rxq_ctrl->socket);
 	if (!tmpl->rq) {
@@ -503,6 +506,7 @@ mlx5_rxq_devx_obj_new(struct rte_eth_dev *dev, uint16_t idx)
 		DRV_LOG(ERR, "Failed to create CQ.");
 		goto error;
 	}
+	rxq_data->delay_drop = priv->config.std_delay_drop;
 	/* Create RQ using DevX API. */
 	ret = mlx5_rxq_create_devx_rq_resources(dev, rxq_data);
 	if (ret) {
@@ -921,6 +925,7 @@ mlx5_rxq_devx_obj_drop_create(struct rte_eth_dev *dev)
 	rxq_ctrl->priv = priv;
 	rxq_ctrl->obj = rxq;
 	rxq_data = &rxq_ctrl->rxq;
+	rxq_data->delay_drop = 0;
 	/* Create CQ using DevX API. */
 	ret = mlx5_rxq_create_devx_cq_resources(dev, rxq_data);
 	if (ret != 0) {
diff --git a/drivers/net/mlx5/mlx5_rx.h b/drivers/net/mlx5/mlx5_rx.h
index 69b1263339..05807764b8 100644
--- a/drivers/net/mlx5/mlx5_rx.h
+++ b/drivers/net/mlx5/mlx5_rx.h
@@ -92,6 +92,7 @@ struct mlx5_rxq_data {
 	unsigned int lro:1; /* Enable LRO. */
 	unsigned int dynf_meta:1; /* Dynamic metadata is configured. */
 	unsigned int mcqe_format:3; /* CQE compression format. */
+	unsigned int delay_drop:1; /* Enable delay drop. */
 	volatile uint32_t *rq_db;
 	volatile uint32_t *cq_db;
 	uint16_t port_id;
-- 
2.27.0


^ permalink raw reply	[flat|nested] 29+ messages in thread

* [dpdk-dev] [PATCH v2 2/2] net/mlx5: check delay drop settings in kernel driver
  2021-11-04 14:01 ` [dpdk-dev] [PATCH v2 0/2] Add delay drop support for Rx queue Bing Zhao
  2021-11-04 14:01   ` [dpdk-dev] [PATCH v2 1/2] net/mlx5: add support for Rx queue delay drop Bing Zhao
@ 2021-11-04 14:01   ` Bing Zhao
  1 sibling, 0 replies; 29+ messages in thread
From: Bing Zhao @ 2021-11-04 14:01 UTC (permalink / raw)
  To: viacheslavo, matan; +Cc: dev, rasland, thomas, orika

Delay drop is a common feature managed on a per-device basis, and the
kernel driver is responsible for the initialization and rearming.

By default, the timeout value is set to activate the delay drop when
the driver is loaded.

A private flag "dropless_rq" is used to control the rearming. Only
when it is on is the rearming handled once a timeout event is
received. Otherwise, the delay drop is deactivated after the first
timeout occurs and none of the Rx queues will have this feature.

The PMD queries this flag and warns the application when some queues
are created with delay drop but the flag is off.

The documents are also updated in this commit.
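
For illustration only, the name lookup performed on the private-flag
string table can be sketched as a standalone helper. This is a minimal
sketch, not the code in this patch: priv_flag_index() and
priv_flag_is_set() are hypothetical names, GSTRING_LEN is assumed to
match ETH_GSTRING_LEN (32) from <linux/ethtool.h>, and the table and the
flag bitmap are assumed to have been fetched already with
ETHTOOL_GSTRINGS and ETHTOOL_GPFLAGS.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed to match ETH_GSTRING_LEN in <linux/ethtool.h>. */
#define GSTRING_LEN 32

/*
 * Hypothetical helper: return the bit index of "name" in a private-flag
 * string table of "n" fixed-size entries, or -1 when it is absent.
 * Entries are not guaranteed to be NUL-terminated, so the comparison is
 * bounded instead of patching the table in place.
 */
static int
priv_flag_index(const char *table, uint32_t n, const char *name)
{
	size_t name_len = strlen(name);
	uint32_t i;

	if (name_len >= GSTRING_LEN)
		return -1;
	for (i = 0; i < n; i++) {
		const char *entry = table + (size_t)i * GSTRING_LEN;

		/* Match the name and require a NUL right after it. */
		if (!memcmp(entry, name, name_len) && entry[name_len] == '\0')
			return (int)i;
	}
	return -1;
}

/* Hypothetical helper: 1 when bit "idx" is set in the flag bitmap. */
static int
priv_flag_is_set(uint32_t flags, int idx)
{
	return idx >= 0 && idx < 32 && ((flags >> idx) & 1u);
}
```

Unlike the loop in the patch, this variant bounds the comparison rather
than writing a NUL into the returned table; either approach should give
the same index for "dropless_rq".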

Signed-off-by: Bing Zhao <bingz@nvidia.com>
---
 doc/guides/nics/mlx5.rst                  |  26 +++++
 doc/guides/rel_notes/release_21_11.rst    |   1 +
 drivers/net/mlx5/linux/mlx5_ethdev_os.c   | 114 ++++++++++++++++++++++
 drivers/net/mlx5/mlx5.h                   |   1 +
 drivers/net/mlx5/mlx5_trigger.c           |  18 ++++
 drivers/net/mlx5/windows/mlx5_ethdev_os.c |  17 ++++
 6 files changed, 177 insertions(+)

diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index bb92520dff..2874a34cb6 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -1123,6 +1123,27 @@ Driver options
 
   By default, the PMD will set this value to 1.
 
+- ``delay_drop_en`` parameter [int]
+
+  Bitmask value for the Rx queue delay drop attribute. Bit 0 is used for standard
+  Rx queue and bit 1 is used for hairpin Rx queue.
+  By default, the delay drop will be disabled for all Rx queues, both standard
+  and hairpin. The setting will be ignored if the NIC does not support the
+  attribute.
+  A timeout value is set in the driver to control the waiting time before dropping
+  a packet when there is no WQE available on a delay drop Rx queue. Once the timer
+  expires, the delay drop will be deactivated for all queues. To re-activate it,
+  rearming is needed, which is now part of the kernel driver.
+
+  To enable / disable the delay drop rearming, the private flag ``dropless_rq`` can
+  be set and queried via ethtool:
+
+  - ethtool --set-priv-flags <netdev> dropless_rq on (/ off)
+  - ethtool --show-priv-flags <netdev>
+
+  The configuration flag is global per PF and can only be set on the PF. Once it is on,
+  all the VFs', SFs' and representors' Rx queues will share the timer and rearming.
+
 .. _mlx5_firmware_config:
 
 Firmware configuration
@@ -1797,6 +1818,11 @@ Supported hardware offloads
    |                       | |               | | rdma-core 35  |
    |                       | |               | | ConnectX-6 Dx |
    +-----------------------+-----------------+-----------------+
+   | Rxq Delay drop        | | DPDK 21.11    | | DPDK 21.11    |
+   |                       | | OFED 5.5      | | OFED 5.5      |
+   |                       | | N/A           | | N/A           |
+   |                       | | ConnectX-5    | | ConnectX-5    |
+   +-----------------------+-----------------+-----------------+
 
 .. table:: Minimal SW/HW versions for shared action offload
    :name: sact
diff --git a/doc/guides/rel_notes/release_21_11.rst b/doc/guides/rel_notes/release_21_11.rst
index 13d8330873..76d18aeb6b 100644
--- a/doc/guides/rel_notes/release_21_11.rst
+++ b/doc/guides/rel_notes/release_21_11.rst
@@ -191,6 +191,7 @@ New Features
   * Added implicit mempool registration to avoid data path hiccups (opt-out).
   * Added NIC offloads for the PMD on Windows (TSO, VLAN strip, CRC keep).
   * Added socket direct mode bonding support.
+  * Added delay drop support for Rx queue.
 
 * **Updated Solarflare network PMD.**
 
diff --git a/drivers/net/mlx5/linux/mlx5_ethdev_os.c b/drivers/net/mlx5/linux/mlx5_ethdev_os.c
index 9d0e491d0c..1dd2d74c77 100644
--- a/drivers/net/mlx5/linux/mlx5_ethdev_os.c
+++ b/drivers/net/mlx5/linux/mlx5_ethdev_os.c
@@ -1630,3 +1630,117 @@ mlx5_get_mac(struct rte_eth_dev *dev, uint8_t (*mac)[RTE_ETHER_ADDR_LEN])
 	memcpy(mac, request.ifr_hwaddr.sa_data, RTE_ETHER_ADDR_LEN);
 	return 0;
 }
+
+/*
+ * Query dropless_rq private flag value provided by ETHTOOL.
+ *
+ * @param dev
+ *   Pointer to Ethernet device.
+ *
+ * @return
+ *   - 0 on success, flag is not set.
+ *   - 1 on success, flag is set.
+ *   - negative errno value otherwise and rte_errno is set.
+ */
+int mlx5_get_flag_dropless_rq(struct rte_eth_dev *dev)
+{
+	struct {
+		struct ethtool_sset_info hdr;
+		uint32_t buf[1];
+	} sset_info;
+	struct ethtool_drvinfo drvinfo;
+	struct ifreq ifr;
+	struct ethtool_gstrings *strings = NULL;
+	struct ethtool_value flags;
+	const int32_t flag_len = sizeof(flags.data) * CHAR_BIT;
+	int32_t str_sz;
+	int32_t len;
+	int32_t i;
+	int ret;
+
+	sset_info.hdr.cmd = ETHTOOL_GSSET_INFO;
+	sset_info.hdr.reserved = 0;
+	sset_info.hdr.sset_mask = 1ULL << ETH_SS_PRIV_FLAGS;
+	ifr.ifr_data = (caddr_t)&sset_info;
+	ret = mlx5_ifreq(dev, SIOCETHTOOL, &ifr);
+	if (!ret) {
+		const uint32_t *sset_lengths = sset_info.hdr.data;
+
+		len = sset_info.hdr.sset_mask ? sset_lengths[0] : 0;
+	} else if (ret == -EOPNOTSUPP) {
+		drvinfo.cmd = ETHTOOL_GDRVINFO;
+		ifr.ifr_data = (caddr_t)&drvinfo;
+		ret = mlx5_ifreq(dev, SIOCETHTOOL, &ifr);
+		if (ret) {
+			DRV_LOG(WARNING, "port %u cannot get the driver info",
+				dev->data->port_id);
+			goto exit;
+		}
+		len = *(uint32_t *)((char *)&drvinfo +
+			offsetof(struct ethtool_drvinfo, n_priv_flags));
+	} else {
+		DRV_LOG(WARNING, "port %u cannot get the sset info",
+			dev->data->port_id);
+		goto exit;
+	}
+	if (!len) {
+		DRV_LOG(WARNING, "port %u does not have private flag",
+			dev->data->port_id);
+		rte_errno = EOPNOTSUPP;
+		ret = -rte_errno;
+		goto exit;
+	} else if (len > flag_len) {
+		DRV_LOG(WARNING, "port %u maximal private flags number is %d",
+			dev->data->port_id, flag_len);
+		len = flag_len;
+	}
+	str_sz = ETH_GSTRING_LEN * len;
+	strings = (struct ethtool_gstrings *)
+		  mlx5_malloc(0, str_sz + sizeof(struct ethtool_gstrings), 0,
+			      SOCKET_ID_ANY);
+	if (!strings) {
+		DRV_LOG(WARNING, "port %u unable to allocate memory for"
+			" private flags", dev->data->port_id);
+		rte_errno = ENOMEM;
+		ret = -rte_errno;
+		goto exit;
+	}
+	strings->cmd = ETHTOOL_GSTRINGS;
+	strings->string_set = ETH_SS_PRIV_FLAGS;
+	strings->len = len;
+	ifr.ifr_data = (caddr_t)strings;
+	ret = mlx5_ifreq(dev, SIOCETHTOOL, &ifr);
+	if (ret) {
+		DRV_LOG(WARNING, "port %u unable to get private flags strings",
+			dev->data->port_id);
+		goto exit;
+	}
+	for (i = 0; i < len; i++) {
+		strings->data[(i + 1) * ETH_GSTRING_LEN - 1] = 0;
+		if (!strcmp((const char *)strings->data + i * ETH_GSTRING_LEN,
+			     "dropless_rq"))
+			break;
+	}
+	if (i == len) {
+		DRV_LOG(WARNING, "port %u does not support dropless_rq",
+			dev->data->port_id);
+		rte_errno = EOPNOTSUPP;
+		ret = -rte_errno;
+		goto exit;
+	}
+	flags.cmd = ETHTOOL_GPFLAGS;
+	ifr.ifr_data = (caddr_t)&flags;
+	ret = mlx5_ifreq(dev, SIOCETHTOOL, &ifr);
+	if (ret) {
+		DRV_LOG(WARNING, "port %u unable to get private flags status",
+			dev->data->port_id);
+		goto exit;
+	}
+	if (!(flags.data & (1U << i)))
+		ret = 0;
+	else
+		ret = 1;
+exit:
+	mlx5_free(strings);
+	return ret;
+}
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 8d32d55c9a..e0f40ce31a 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -1599,6 +1599,7 @@ int mlx5_os_read_dev_stat(struct mlx5_priv *priv,
 int mlx5_os_read_dev_counters(struct rte_eth_dev *dev, uint64_t *stats);
 int mlx5_os_get_stats_n(struct rte_eth_dev *dev);
 void mlx5_os_stats_init(struct rte_eth_dev *dev);
+int mlx5_get_flag_dropless_rq(struct rte_eth_dev *dev);
 
 /* mlx5_mac.c */
 
diff --git a/drivers/net/mlx5/mlx5_trigger.c b/drivers/net/mlx5/mlx5_trigger.c
index ebeeae279e..34fcd2b441 100644
--- a/drivers/net/mlx5/mlx5_trigger.c
+++ b/drivers/net/mlx5/mlx5_trigger.c
@@ -1126,6 +1126,24 @@ mlx5_dev_start(struct rte_eth_dev *dev)
 			dev->data->port_id, strerror(rte_errno));
 		goto error;
 	}
+	if (priv->config.std_delay_drop || priv->config.hp_delay_drop) {
+		if (!priv->config.vf && !priv->config.sf &&
+		    !priv->representor) {
+			ret = mlx5_get_flag_dropless_rq(dev);
+			if (ret < 0)
+				DRV_LOG(WARNING,
+					"port %u cannot query dropless flag",
+					dev->data->port_id);
+			else if (!ret)
+				DRV_LOG(WARNING,
+					"port %u dropless_rq OFF, no rearming",
+					dev->data->port_id);
+		} else {
+			DRV_LOG(DEBUG,
+				"port %u doesn't support dropless_rq flag",
+				dev->data->port_id);
+		}
+	}
 	ret = mlx5_rxq_start(dev);
 	if (ret) {
 		DRV_LOG(ERR, "port %u Rx queue allocation failed: %s",
diff --git a/drivers/net/mlx5/windows/mlx5_ethdev_os.c b/drivers/net/mlx5/windows/mlx5_ethdev_os.c
index fddc7a6b12..359f73df7c 100644
--- a/drivers/net/mlx5/windows/mlx5_ethdev_os.c
+++ b/drivers/net/mlx5/windows/mlx5_ethdev_os.c
@@ -389,3 +389,20 @@ mlx5_is_removed(struct rte_eth_dev *dev)
 		return 1;
 	return 0;
 }
+
+/*
+ * Query dropless_rq private flag value provided by ETHTOOL.
+ *
+ * @param dev
+ *   Pointer to Ethernet device.
+ *
+ * @return
+ *   - 0 on success, flag is not set.
+ *   - 1 on success, flag is set.
+ *   - negative errno value otherwise and rte_errno is set.
+ */
+int mlx5_get_flag_dropless_rq(struct rte_eth_dev *dev)
+{
+	RTE_SET_USED(dev);
+	return -ENOTSUP;
+}
-- 
2.27.0


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [dpdk-dev] [PATCH 2/4] net/mlx5: add support for Rx queue delay drop
  2021-11-04 14:01   ` David Marchand
@ 2021-11-04 14:34     ` Bing Zhao
  0 siblings, 0 replies; 29+ messages in thread
From: Bing Zhao @ 2021-11-04 14:34 UTC (permalink / raw)
  To: David Marchand
  Cc: Slava Ovsiienko, Matan Azrad, dev, Raslan Darawsheh,
	NBU-Contact-Thomas Monjalon, Ori Kam

Hi David,

Many thanks for these comments. My answers are inline.

> -----Original Message-----
> From: David Marchand <david.marchand@redhat.com>
> Sent: Thursday, November 4, 2021 10:01 PM
> To: Bing Zhao <bingz@nvidia.com>
> Cc: Slava Ovsiienko <viacheslavo@nvidia.com>; Matan Azrad
> <matan@nvidia.com>; dev <dev@dpdk.org>; Raslan Darawsheh
> <rasland@nvidia.com>; NBU-Contact-Thomas Monjalon
> <thomas@monjalon.net>; Ori Kam <orika@nvidia.com>
> Subject: Re: [dpdk-dev] [PATCH 2/4] net/mlx5: add support for Rx
> queue delay drop
> 
> On Thu, Nov 4, 2021 at 12:27 PM Bing Zhao <bingz@nvidia.com> wrote:
> >
> > For an Ethernet RQ, packets received when receive WQEs are exhausted
> > are dropped. This behavior prevents slow or malicious software
> > entities at the host from affecting the network. While for hairpin
> > cases, even if there is no software involved during the packet
> > forwarding from Rx to Tx side, some hiccup in the hardware or back
> > pressure from Tx side may still cause the WQEs to be exhausted. In
> > certain scenarios it may be preferred to configure the device to
> > avoid such packet drops, assuming the posting of WQEs will resume
> > shortly.
> >
> > To support this, a new devarg "delay_drop_en" is introduced, by
> > default, the delay drop is enabled for hairpin Rx queues and
> > disabled for standard Rx queues. This value is used as a bit mask:
> >   - bit 0: enablement of standard Rx queue
> >   - bit 1: enablement of hairpin Rx queue
> > And this attribute will be applied to all Rx queues of a device.
> 
> Rather than a devargs, why can't the driver use this option in the
> identified usecases where it makes sense?
> Here, hairpin.

In patch set v2, the attribute is also disabled by default for hairpin queues, so the default behavior remains the same as today. This is only a minor change, but it may have some impact on the HW processing.
With this attribute ON for a specific queue, the impact is as follows:

Pros: If there is a hiccup in the SW / HW, or a burst that the SW is not fast enough to handle, then once the WQEs are exhausted in the queue, the packets are not dropped immediately but are held in the NIC. This gives more tolerance and makes the queue work as a dropless queue.

Cons: While some packets are waiting for available WQEs, new packets may be dropped if there is not enough space, or they may see a higher latency because the previous ones are waiting. If the traffic exceeds the line rate or the SW is too slow to handle the incoming traffic, the packets will be dropped eventually. Some contexts are global, so waiting on one queue may have an impact on other queues.

So right now this devarg gives the application the flexibility to verify and decide whether this is needed in real-life deployments. Theoretically, it should help in most cases.

> 
> 
> --
> David Marchand

BR. Bing

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [dpdk-dev] [PATCH v3 0/2] Add delay drop support for Rx queue
  2021-11-04 11:26 [dpdk-dev] [PATCH 0/4] Add delay drop support for Rx queue Bing Zhao
                   ` (4 preceding siblings ...)
  2021-11-04 14:01 ` [dpdk-dev] [PATCH v2 0/2] Add delay drop support for Rx queue Bing Zhao
@ 2021-11-04 16:55 ` Bing Zhao
  2021-11-04 16:55   ` [dpdk-dev] [PATCH v3 1/2] net/mlx5: add support for Rx queue delay drop Bing Zhao
  2021-11-04 16:55   ` [dpdk-dev] [PATCH v3 2/2] net/mlx5: check delay drop settings in kernel driver Bing Zhao
  2021-11-04 17:59 ` [dpdk-dev] [PATCH v4 0/2] Add delay drop support for Rx queue Bing Zhao
                   ` (3 subsequent siblings)
  9 siblings, 2 replies; 29+ messages in thread
From: Bing Zhao @ 2021-11-04 16:55 UTC (permalink / raw)
  To: viacheslavo, matan; +Cc: dev, rasland, thomas, orika

This patch set introduces a new devarg to support delay drop when
creating the Rx queues. The attribute is disabled by default, so
the behavior remains the same as before.

---
v2:
  - change hairpin queue delay drop to disable by default
  - combine the commits
  - fix Windows building
  - change the log print
v3: fix conflict and building
---

Bing Zhao (2):
  net/mlx5: add support for Rx queue delay drop
  net/mlx5: check delay drop settings in kernel driver

 doc/guides/nics/mlx5.rst                  |  26 +++++
 doc/guides/rel_notes/release_21_11.rst    |   1 +
 drivers/common/mlx5/mlx5_devx_cmds.c      |   1 +
 drivers/common/mlx5/mlx5_devx_cmds.h      |   1 +
 drivers/net/mlx5/linux/mlx5_ethdev_os.c   | 114 ++++++++++++++++++++++
 drivers/net/mlx5/linux/mlx5_os.c          |  11 +++
 drivers/net/mlx5/mlx5.c                   |   7 ++
 drivers/net/mlx5/mlx5.h                   |  10 ++
 drivers/net/mlx5/mlx5_devx.c              |   5 +
 drivers/net/mlx5/mlx5_rx.h                |   1 +
 drivers/net/mlx5/mlx5_trigger.c           |  18 ++++
 drivers/net/mlx5/windows/mlx5_ethdev_os.c |  17 ++++
 12 files changed, 212 insertions(+)

-- 
2.27.0


^ permalink raw reply	[flat|nested] 29+ messages in thread

* [dpdk-dev] [PATCH v3 1/2] net/mlx5: add support for Rx queue delay drop
  2021-11-04 16:55 ` [dpdk-dev] [PATCH v3 0/2] Add delay drop support for Rx queue Bing Zhao
@ 2021-11-04 16:55   ` Bing Zhao
  2021-11-04 16:55   ` [dpdk-dev] [PATCH v3 2/2] net/mlx5: check delay drop settings in kernel driver Bing Zhao
  1 sibling, 0 replies; 29+ messages in thread
From: Bing Zhao @ 2021-11-04 16:55 UTC (permalink / raw)
  To: viacheslavo, matan; +Cc: dev, rasland, thomas, orika

For an Ethernet RQ, packets received when receive WQEs are exhausted
are dropped. This behavior prevents slow or malicious software
entities at the host from affecting the network. While for hairpin
cases, even if there is no software involved during the packet
forwarding from Rx to Tx side, some hiccup in the hardware or back
pressure from Tx side may still cause the WQEs to be exhausted. In
certain scenarios it may be preferred to configure the device to
avoid such packet drops, assuming the posting of WQEs will resume
shortly.

To support this, a new devarg "delay_drop_en" is introduced. By
default, the delay drop is disabled for both standard and hairpin
Rx queues. This value is used as a bit mask:
  - bit 0: enablement of standard Rx queue
  - bit 1: enablement of hairpin Rx queue
This attribute is applied to all Rx queues of a device.

The "rq_delay_drop" capability in the HCA_CAP is checked before
creating any queue. If the hardware capabilities do not support
this delay drop, all the Rx queues will still be created without
this attribute, and the devarg setting will be ignored even if it
is specified explicitly.
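
As a rough sketch of the two steps above (decoding the bit mask, then
dropping the request when the capability is missing), assuming a
hypothetical struct delay_drop_cfg and helper delay_drop_decode() that
are not part of the driver:

```c
#include <stdint.h>
#include <stdlib.h>

/* Bit layout of the devarg, mirroring enum mlx5_delay_drop_mode. */
#define DELAY_DROP_STANDARD (UINT32_C(1) << 0)
#define DELAY_DROP_HAIRPIN  (UINT32_C(1) << 1)

/* Hypothetical stand-in for the two bits kept in the device config. */
struct delay_drop_cfg {
	unsigned int std_delay_drop:1;
	unsigned int hp_delay_drop:1;
};

/*
 * Decode the devarg string and clear both bits when the device does not
 * report the rq_delay_drop capability (hca_rq_delay_drop == 0).
 */
static struct delay_drop_cfg
delay_drop_decode(const char *val, int hca_rq_delay_drop)
{
	struct delay_drop_cfg cfg = { 0, 0 };
	unsigned long tmp = strtoul(val, NULL, 0);

	/* "!!" matters: (tmp & 2) stored in a 1-bit field would become 0. */
	cfg.std_delay_drop = !!(tmp & DELAY_DROP_STANDARD);
	cfg.hp_delay_drop = !!(tmp & DELAY_DROP_HAIRPIN);
	if (!hca_rq_delay_drop) {
		cfg.std_delay_drop = 0;
		cfg.hp_delay_drop = 0;
	}
	return cfg;
}
```

Under these assumptions, delay_drop_decode("0x2", 1) requests delay drop
for hairpin queues only, and any value combined with a missing
capability leaves both bits cleared.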

Signed-off-by: Bing Zhao <bingz@nvidia.com>
---
 drivers/common/mlx5/mlx5_devx_cmds.c |  1 +
 drivers/common/mlx5/mlx5_devx_cmds.h |  1 +
 drivers/net/mlx5/linux/mlx5_os.c     | 11 +++++++++++
 drivers/net/mlx5/mlx5.c              |  7 +++++++
 drivers/net/mlx5/mlx5.h              |  9 +++++++++
 drivers/net/mlx5/mlx5_devx.c         |  5 +++++
 drivers/net/mlx5/mlx5_rx.h           |  1 +
 7 files changed, 35 insertions(+)

diff --git a/drivers/common/mlx5/mlx5_devx_cmds.c b/drivers/common/mlx5/mlx5_devx_cmds.c
index 4ab3070da0..3748e54b22 100644
--- a/drivers/common/mlx5/mlx5_devx_cmds.c
+++ b/drivers/common/mlx5/mlx5_devx_cmds.c
@@ -964,6 +964,7 @@ mlx5_devx_cmd_query_hca_attr(void *ctx,
 	attr->ct_offload = !!(MLX5_GET64(cmd_hca_cap, hcattr,
 					 general_obj_types) &
 			      MLX5_GENERAL_OBJ_TYPES_CAP_CONN_TRACK_OFFLOAD);
+	attr->rq_delay_drop = MLX5_GET(cmd_hca_cap, hcattr, rq_delay_drop);
 	if (attr->qos.sup) {
 		hcattr = mlx5_devx_get_hca_cap(ctx, in, out, &rc,
 				MLX5_GET_HCA_CAP_OP_MOD_QOS_CAP |
diff --git a/drivers/common/mlx5/mlx5_devx_cmds.h b/drivers/common/mlx5/mlx5_devx_cmds.h
index 86ee4f7b78..50d3264b46 100644
--- a/drivers/common/mlx5/mlx5_devx_cmds.h
+++ b/drivers/common/mlx5/mlx5_devx_cmds.h
@@ -178,6 +178,7 @@ struct mlx5_hca_attr {
 	uint32_t swp_csum:1;
 	uint32_t swp_lso:1;
 	uint32_t lro_max_msg_sz_mode:2;
+	uint32_t rq_delay_drop:1;
 	uint32_t lro_timer_supported_periods[MLX5_LRO_NUM_SUPP_PERIODS];
 	uint16_t lro_min_mss_size;
 	uint32_t flex_parser_protocols;
diff --git a/drivers/net/mlx5/linux/mlx5_os.c b/drivers/net/mlx5/linux/mlx5_os.c
index e0304b685e..de880ee4c9 100644
--- a/drivers/net/mlx5/linux/mlx5_os.c
+++ b/drivers/net/mlx5/linux/mlx5_os.c
@@ -1508,6 +1508,15 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev,
 		goto error;
 #endif
 	}
+	if (config->std_delay_drop || config->hp_delay_drop) {
+		if (!config->hca_attr.rq_delay_drop) {
+			config->std_delay_drop = 0;
+			config->hp_delay_drop = 0;
+			DRV_LOG(WARNING,
+				"dev_port-%u: Rxq delay drop is not supported",
+				priv->dev_port);
+		}
+	}
 	if (sh->devx) {
 		uint32_t reg[MLX5_ST_SZ_DW(register_mtutc)];
 
@@ -2077,6 +2086,8 @@ mlx5_os_config_default(struct mlx5_dev_config *config)
 	config->decap_en = 1;
 	config->log_hp_size = MLX5_ARG_UNSET;
 	config->allow_duplicate_pattern = 1;
+	config->std_delay_drop = 0;
+	config->hp_delay_drop = 0;
 }
 
 /**
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 8614b8ffdd..a961cce430 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -183,6 +183,9 @@
 /* Device parameter to configure implicit registration of mempool memory. */
 #define MLX5_MR_MEMPOOL_REG_EN "mr_mempool_reg_en"
 
+/* Device parameter to configure the delay drop when creating Rxqs. */
+#define MLX5_DELAY_DROP_EN "delay_drop_en"
+
 /* Shared memory between primary and secondary processes. */
 struct mlx5_shared_data *mlx5_shared_data;
 
@@ -2091,6 +2094,9 @@ mlx5_args_check(const char *key, const char *val, void *opaque)
 		config->decap_en = !!tmp;
 	} else if (strcmp(MLX5_ALLOW_DUPLICATE_PATTERN, key) == 0) {
 		config->allow_duplicate_pattern = !!tmp;
+	} else if (strcmp(MLX5_DELAY_DROP_EN, key) == 0) {
+		config->std_delay_drop = !!(tmp & MLX5_DELAY_DROP_STANDARD);
+		config->hp_delay_drop = !!(tmp & MLX5_DELAY_DROP_HAIRPIN);
 	} else {
 		DRV_LOG(WARNING, "%s: unknown parameter", key);
 		rte_errno = EINVAL;
@@ -2153,6 +2159,7 @@ mlx5_args(struct mlx5_dev_config *config, struct rte_devargs *devargs)
 		MLX5_DECAP_EN,
 		MLX5_ALLOW_DUPLICATE_PATTERN,
 		MLX5_MR_MEMPOOL_REG_EN,
+		MLX5_DELAY_DROP_EN,
 		NULL,
 	};
 	struct rte_kvargs *kvlist;
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 51f4578838..b2022f3300 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -99,6 +99,13 @@ enum mlx5_flow_type {
 	MLX5_FLOW_TYPE_MAXI,
 };
 
+/* The mode of delay drop for Rx queues. */
+enum mlx5_delay_drop_mode {
+	MLX5_DELAY_DROP_NONE = 0, /* All disabled. */
+	MLX5_DELAY_DROP_STANDARD = RTE_BIT32(0), /* Standard queues enable. */
+	MLX5_DELAY_DROP_HAIRPIN = RTE_BIT32(1), /* Hairpin queues enable. */
+};
+
 /* Hlist and list callback context. */
 struct mlx5_flow_cb_ctx {
 	struct rte_eth_dev *dev;
@@ -264,6 +271,8 @@ struct mlx5_dev_config {
 	unsigned int dv_miss_info:1; /* restore packet after partial hw miss */
 	unsigned int allow_duplicate_pattern:1;
 	/* Allow/Prevent the duplicate rules pattern. */
+	unsigned int std_delay_drop:1; /* Enable standard Rxq delay drop. */
+	unsigned int hp_delay_drop:1; /* Enable hairpin Rxq delay drop. */
 	struct {
 		unsigned int enabled:1; /* Whether MPRQ is enabled. */
 		unsigned int stride_num_n; /* Number of strides. */
diff --git a/drivers/net/mlx5/mlx5_devx.c b/drivers/net/mlx5/mlx5_devx.c
index a9f9f4af70..e46f79124d 100644
--- a/drivers/net/mlx5/mlx5_devx.c
+++ b/drivers/net/mlx5/mlx5_devx.c
@@ -277,6 +277,7 @@ mlx5_rxq_create_devx_rq_resources(struct mlx5_rxq_priv *rxq)
 						MLX5_WQ_END_PAD_MODE_NONE;
 	rq_attr.wq_attr.pd = cdev->pdn;
 	rq_attr.counter_set_id = priv->counter_set_id;
+	rq_attr.delay_drop_en = rxq_data->delay_drop;
 	rq_attr.user_index = rte_cpu_to_be_16(priv->dev_data->port_id);
 	if (rxq_data->shared) /* Create RMP based RQ. */
 		rxq->devx_rq.rmp = &rxq_ctrl->obj->devx_rmp;
@@ -439,6 +440,8 @@ mlx5_rxq_obj_hairpin_new(struct mlx5_rxq_priv *rxq)
 			attr.wq_attr.log_hairpin_data_sz -
 			MLX5_HAIRPIN_QUEUE_STRIDE;
 	attr.counter_set_id = priv->counter_set_id;
+	rxq_ctrl->rxq.delay_drop = priv->config.hp_delay_drop;
+	attr.delay_drop_en = priv->config.hp_delay_drop;
 	tmpl->rq = mlx5_devx_cmd_create_rq(priv->sh->cdev->ctx, &attr,
 					   rxq_ctrl->socket);
 	if (!tmpl->rq) {
@@ -496,6 +499,7 @@ mlx5_rxq_devx_obj_new(struct mlx5_rxq_priv *rxq)
 		DRV_LOG(ERR, "Failed to create CQ.");
 		goto error;
 	}
+	rxq_data->delay_drop = priv->config.std_delay_drop;
 	/* Create RQ using DevX API. */
 	ret = mlx5_rxq_create_devx_rq_resources(rxq);
 	if (ret) {
@@ -941,6 +945,7 @@ mlx5_rxq_devx_obj_drop_create(struct rte_eth_dev *dev)
 			dev->data->port_id);
 		goto error;
 	}
+	rxq_ctrl->rxq.delay_drop = 0;
 	/* Create RQ using DevX API. */
 	ret = mlx5_rxq_create_devx_rq_resources(rxq);
 	if (ret != 0) {
diff --git a/drivers/net/mlx5/mlx5_rx.h b/drivers/net/mlx5/mlx5_rx.h
index eda6eca8de..3b797e577a 100644
--- a/drivers/net/mlx5/mlx5_rx.h
+++ b/drivers/net/mlx5/mlx5_rx.h
@@ -97,6 +97,7 @@ struct mlx5_rxq_data {
 	unsigned int dynf_meta:1; /* Dynamic metadata is configured. */
 	unsigned int mcqe_format:3; /* CQE compression format. */
 	unsigned int shared:1; /* Shared RXQ. */
+	unsigned int delay_drop:1; /* Enable delay drop. */
 	volatile uint32_t *rq_db;
 	volatile uint32_t *cq_db;
 	uint16_t port_id;
-- 
2.27.0


^ permalink raw reply	[flat|nested] 29+ messages in thread

* [dpdk-dev] [PATCH v3 2/2] net/mlx5: check delay drop settings in kernel driver
  2021-11-04 16:55 ` [dpdk-dev] [PATCH v3 0/2] Add delay drop support for Rx queue Bing Zhao
  2021-11-04 16:55   ` [dpdk-dev] [PATCH v3 1/2] net/mlx5: add support for Rx queue delay drop Bing Zhao
@ 2021-11-04 16:55   ` Bing Zhao
  1 sibling, 0 replies; 29+ messages in thread
From: Bing Zhao @ 2021-11-04 16:55 UTC (permalink / raw)
  To: viacheslavo, matan; +Cc: dev, rasland, thomas, orika

Delay drop is a common feature managed on a per-device basis, and the
kernel driver is responsible for the initialization and rearming.

By default, the timeout value is set to activate the delay drop when
the driver is loaded.

A private flag "dropless_rq" is used to control the rearming. Only
when it is on is the rearming handled once a timeout event is
received. Otherwise, the delay drop is deactivated after the first
timeout occurs and none of the Rx queues will have this feature.

The PMD queries this flag and warns the application when some queues
are created with delay drop but the flag is off.

The documents are also updated in this commit.
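
The start-time check described above can be summarized by the following
sketch. It is only an illustration under stated assumptions: enum
dropless_status and dropless_rq_check() are hypothetical names, and
query_ret stands for the value returned by the flag query (negative on
error, 0 when the flag is off, 1 when it is on).

```c
#include <stdbool.h>

/* Hypothetical outcome codes for the start-time check. */
enum dropless_status {
	DROPLESS_CHECK_SKIPPED, /* No queue type requested delay drop. */
	DROPLESS_NOT_QUERYABLE, /* VF, SF or representor: flag is on the PF. */
	DROPLESS_QUERY_FAILED,  /* The flag query returned an error. */
	DROPLESS_NO_REARM,      /* Flag off: no rearming after a timeout. */
	DROPLESS_REARM_OK,      /* Flag on: the kernel driver rearms. */
};

/*
 * Classify the situation at device start. Only the first and last
 * outcomes are silent; the others would be logged.
 */
static enum dropless_status
dropless_rq_check(bool std_dd, bool hp_dd, bool is_pf, int query_ret)
{
	if (!std_dd && !hp_dd)
		return DROPLESS_CHECK_SKIPPED;
	if (!is_pf)
		return DROPLESS_NOT_QUERYABLE;
	if (query_ret < 0)
		return DROPLESS_QUERY_FAILED;
	return query_ret ? DROPLESS_REARM_OK : DROPLESS_NO_REARM;
}
```

None of these outcomes aborts the device start; the check is advisory
and only affects what gets logged.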

Signed-off-by: Bing Zhao <bingz@nvidia.com>
---
 doc/guides/nics/mlx5.rst                  |  26 +++++
 doc/guides/rel_notes/release_21_11.rst    |   1 +
 drivers/net/mlx5/linux/mlx5_ethdev_os.c   | 114 ++++++++++++++++++++++
 drivers/net/mlx5/mlx5.h                   |   1 +
 drivers/net/mlx5/mlx5_trigger.c           |  18 ++++
 drivers/net/mlx5/windows/mlx5_ethdev_os.c |  17 ++++
 6 files changed, 177 insertions(+)

diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index 824971d89a..006896375f 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -1129,6 +1129,27 @@ Driver options
 
   By default, the PMD will set this value to 1.
 
+- ``delay_drop_en`` parameter [int]
+
+  Bitmask value for the Rx queue delay drop attribute. Bit 0 is used for standard
+  Rx queue and bit 1 is used for hairpin Rx queue.
+  By default, the delay drop will be disabled for all Rx queues, both standard
+  and hairpin. The setting will be ignored if the NIC does not support the
+  attribute.
+  A timeout value is set in the driver to control the waiting time before dropping
+  a packet when there is no WQE available on a delay drop Rx queue. Once the timer
+  expires, the delay drop will be deactivated for all queues. To re-activate it,
+  rearming is needed, which is now part of the kernel driver.
+
+  To enable / disable the delay drop rearming, the private flag ``dropless_rq`` can
+  be set and queried via ethtool:
+
+  - ethtool --set-priv-flags <netdev> dropless_rq on (/ off)
+  - ethtool --show-priv-flags <netdev>
+
+  The configuration flag is global per PF and can only be set on the PF. Once it is on,
+  all the VFs', SFs' and representors' Rx queues will share the timer and rearming.
+
 .. _mlx5_firmware_config:
 
 Firmware configuration
@@ -1803,6 +1824,11 @@ Supported hardware offloads
    |                       | |               | | rdma-core 35  |
    |                       | |               | | ConnectX-6 Dx |
    +-----------------------+-----------------+-----------------+
+   | Rxq Delay drop        | | DPDK 21.11    | | DPDK 21.11    |
+   |                       | | OFED 5.5      | | OFED 5.5      |
+   |                       | | N/A           | | N/A           |
+   |                       | | ConnectX-5    | | ConnectX-5    |
+   +-----------------------+-----------------+-----------------+
 
 .. table:: Minimal SW/HW versions for shared action offload
    :name: sact
diff --git a/doc/guides/rel_notes/release_21_11.rst b/doc/guides/rel_notes/release_21_11.rst
index 13d8330873..76d18aeb6b 100644
--- a/doc/guides/rel_notes/release_21_11.rst
+++ b/doc/guides/rel_notes/release_21_11.rst
@@ -191,6 +191,7 @@ New Features
   * Added implicit mempool registration to avoid data path hiccups (opt-out).
   * Added NIC offloads for the PMD on Windows (TSO, VLAN strip, CRC keep).
   * Added socket direct mode bonding support.
+  * Added delay drop support for Rx queue.
 
 * **Updated Solarflare network PMD.**
 
diff --git a/drivers/net/mlx5/linux/mlx5_ethdev_os.c b/drivers/net/mlx5/linux/mlx5_ethdev_os.c
index 9d0e491d0c..1dd2d74c77 100644
--- a/drivers/net/mlx5/linux/mlx5_ethdev_os.c
+++ b/drivers/net/mlx5/linux/mlx5_ethdev_os.c
@@ -1630,3 +1630,117 @@ mlx5_get_mac(struct rte_eth_dev *dev, uint8_t (*mac)[RTE_ETHER_ADDR_LEN])
 	memcpy(mac, request.ifr_hwaddr.sa_data, RTE_ETHER_ADDR_LEN);
 	return 0;
 }
+
+/*
+ * Query dropless_rq private flag value provided by ETHTOOL.
+ *
+ * @param dev
+ *   Pointer to Ethernet device.
+ *
+ * @return
+ *   - 0 on success, flag is not set.
+ *   - 1 on success, flag is set.
+ *   - negative errno value otherwise and rte_errno is set.
+ */
+int mlx5_get_flag_dropless_rq(struct rte_eth_dev *dev)
+{
+	struct {
+		struct ethtool_sset_info hdr;
+		uint32_t buf[1];
+	} sset_info;
+	struct ethtool_drvinfo drvinfo;
+	struct ifreq ifr;
+	struct ethtool_gstrings *strings = NULL;
+	struct ethtool_value flags;
+	const int32_t flag_len = sizeof(flags.data) * CHAR_BIT;
+	int32_t str_sz;
+	int32_t len;
+	int32_t i;
+	int ret;
+
+	sset_info.hdr.cmd = ETHTOOL_GSSET_INFO;
+	sset_info.hdr.reserved = 0;
+	sset_info.hdr.sset_mask = 1ULL << ETH_SS_PRIV_FLAGS;
+	ifr.ifr_data = (caddr_t)&sset_info;
+	ret = mlx5_ifreq(dev, SIOCETHTOOL, &ifr);
+	if (!ret) {
+		const uint32_t *sset_lengths = sset_info.hdr.data;
+
+		len = sset_info.hdr.sset_mask ? sset_lengths[0] : 0;
+	} else if (ret == -EOPNOTSUPP) {
+		drvinfo.cmd = ETHTOOL_GDRVINFO;
+		ifr.ifr_data = (caddr_t)&drvinfo;
+		ret = mlx5_ifreq(dev, SIOCETHTOOL, &ifr);
+		if (ret) {
+			DRV_LOG(WARNING, "port %u cannot get the driver info",
+				dev->data->port_id);
+			goto exit;
+		}
+		len = *(uint32_t *)((char *)&drvinfo +
+			offsetof(struct ethtool_drvinfo, n_priv_flags));
+	} else {
+		DRV_LOG(WARNING, "port %u cannot get the sset info",
+			dev->data->port_id);
+		goto exit;
+	}
+	if (!len) {
+		DRV_LOG(WARNING, "port %u does not have private flag",
+			dev->data->port_id);
+		rte_errno = EOPNOTSUPP;
+		ret = -rte_errno;
+		goto exit;
+	} else if (len > flag_len) {
+		DRV_LOG(WARNING, "port %u maximal private flags number is %d",
+			dev->data->port_id, flag_len);
+		len = flag_len;
+	}
+	str_sz = ETH_GSTRING_LEN * len;
+	strings = (struct ethtool_gstrings *)
+		  mlx5_malloc(0, str_sz + sizeof(struct ethtool_gstrings), 0,
+			      SOCKET_ID_ANY);
+	if (!strings) {
+		DRV_LOG(WARNING, "port %u unable to allocate memory for"
+			" private flags", dev->data->port_id);
+		rte_errno = ENOMEM;
+		ret = -rte_errno;
+		goto exit;
+	}
+	strings->cmd = ETHTOOL_GSTRINGS;
+	strings->string_set = ETH_SS_PRIV_FLAGS;
+	strings->len = len;
+	ifr.ifr_data = (caddr_t)strings;
+	ret = mlx5_ifreq(dev, SIOCETHTOOL, &ifr);
+	if (ret) {
+		DRV_LOG(WARNING, "port %u unable to get private flags strings",
+			dev->data->port_id);
+		goto exit;
+	}
+	for (i = 0; i < len; i++) {
+		strings->data[(i + 1) * ETH_GSTRING_LEN - 1] = 0;
+		if (!strcmp((const char *)strings->data + i * ETH_GSTRING_LEN,
+			     "dropless_rq"))
+			break;
+	}
+	if (i == len) {
+		DRV_LOG(WARNING, "port %u does not support dropless_rq",
+			dev->data->port_id);
+		rte_errno = EOPNOTSUPP;
+		ret = -rte_errno;
+		goto exit;
+	}
+	flags.cmd = ETHTOOL_GPFLAGS;
+	ifr.ifr_data = (caddr_t)&flags;
+	ret = mlx5_ifreq(dev, SIOCETHTOOL, &ifr);
+	if (ret) {
+		DRV_LOG(WARNING, "port %u unable to get private flags status",
+			dev->data->port_id);
+		goto exit;
+	}
+	if (!(flags.data & (1U << i)))
+		ret = 0;
+	else
+		ret = 1;
+exit:
+	mlx5_free(strings);
+	return ret;
+}
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index b2022f3300..9307a4f95b 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -1602,6 +1602,7 @@ int mlx5_os_read_dev_stat(struct mlx5_priv *priv,
 int mlx5_os_read_dev_counters(struct rte_eth_dev *dev, uint64_t *stats);
 int mlx5_os_get_stats_n(struct rte_eth_dev *dev);
 void mlx5_os_stats_init(struct rte_eth_dev *dev);
+int mlx5_get_flag_dropless_rq(struct rte_eth_dev *dev);
 
 /* mlx5_mac.c */
 
diff --git a/drivers/net/mlx5/mlx5_trigger.c b/drivers/net/mlx5/mlx5_trigger.c
index a3e62e9533..0ecc530043 100644
--- a/drivers/net/mlx5/mlx5_trigger.c
+++ b/drivers/net/mlx5/mlx5_trigger.c
@@ -1129,6 +1129,24 @@ mlx5_dev_start(struct rte_eth_dev *dev)
 			dev->data->port_id, strerror(rte_errno));
 		goto error;
 	}
+	if (priv->config.std_delay_drop || priv->config.hp_delay_drop) {
+		if (!priv->config.vf && !priv->config.sf &&
+		    !priv->representor) {
+			ret = mlx5_get_flag_dropless_rq(dev);
+			if (ret < 0)
+				DRV_LOG(WARNING,
+					"port %u cannot query dropless flag",
+					dev->data->port_id);
+			else if (!ret)
+				DRV_LOG(WARNING,
+					"port %u dropless_rq OFF, no rearming",
+					dev->data->port_id);
+		} else {
+			DRV_LOG(DEBUG,
+				"port %u doesn't support dropless_rq flag",
+				dev->data->port_id);
+		}
+	}
 	ret = mlx5_rxq_start(dev);
 	if (ret) {
 		DRV_LOG(ERR, "port %u Rx queue allocation failed: %s",
diff --git a/drivers/net/mlx5/windows/mlx5_ethdev_os.c b/drivers/net/mlx5/windows/mlx5_ethdev_os.c
index fddc7a6b12..359f73df7c 100644
--- a/drivers/net/mlx5/windows/mlx5_ethdev_os.c
+++ b/drivers/net/mlx5/windows/mlx5_ethdev_os.c
@@ -389,3 +389,20 @@ mlx5_is_removed(struct rte_eth_dev *dev)
 		return 1;
 	return 0;
 }
+
+/*
+ * Query dropless_rq private flag value provided by ETHTOOL.
+ *
+ * @param dev
+ *   Pointer to Ethernet device.
+ *
+ * @return
+ *   - 0 on success, flag is not set.
+ *   - 1 on success, flag is set.
+ *   - negative errno value otherwise and rte_errno is set.
+ */
+int mlx5_get_flag_dropless_rq(struct rte_eth_dev *dev)
+{
+	RTE_SET_USED(dev);
+	return -ENOTSUP;
+}
-- 
2.27.0



* [dpdk-dev] [PATCH v4 0/2] Add delay drop support for Rx queue
  2021-11-04 11:26 [dpdk-dev] [PATCH 0/4] Add delay drop support for Rx queue Bing Zhao
                   ` (5 preceding siblings ...)
  2021-11-04 16:55 ` [dpdk-dev] [PATCH v3 0/2] Add delay drop support for Rx queue Bing Zhao
@ 2021-11-04 17:59 ` Bing Zhao
  2021-11-04 17:59   ` [dpdk-dev] [PATCH v4 1/2] net/mlx5: add support for Rx queue delay drop Bing Zhao
                     ` (2 more replies)
  2021-11-05 13:36 ` [dpdk-dev] [PATCH v5 " Bing Zhao
                   ` (2 subsequent siblings)
  9 siblings, 3 replies; 29+ messages in thread
From: Bing Zhao @ 2021-11-04 17:59 UTC (permalink / raw)
  To: viacheslavo, matan; +Cc: dev, rasland, thomas, orika

This patch set introduces a new devarg to support delay drop when
creating the Rx queues. The attribute is disabled by default, so
the behavior remains the same as before.

---
v2:
  - change hairpin queue delay drop to disable by default
  - combine the commits
  - fix Windows building
  - change the log print
v3: fix conflict and building
v4: code style update and commit log polishing
---

Bing Zhao (2):
  net/mlx5: add support for Rx queue delay drop
  net/mlx5: check delay drop settings in kernel driver

 doc/guides/nics/mlx5.rst                  |  26 +++++
 doc/guides/rel_notes/release_21_11.rst    |   1 +
 drivers/common/mlx5/mlx5_devx_cmds.c      |   1 +
 drivers/common/mlx5/mlx5_devx_cmds.h      |   1 +
 drivers/net/mlx5/linux/mlx5_ethdev_os.c   | 111 ++++++++++++++++++++++
 drivers/net/mlx5/linux/mlx5_os.c          |  11 +++
 drivers/net/mlx5/mlx5.c                   |   7 ++
 drivers/net/mlx5/mlx5.h                   |  10 ++
 drivers/net/mlx5/mlx5_devx.c              |   5 +
 drivers/net/mlx5/mlx5_rx.h                |   1 +
 drivers/net/mlx5/mlx5_trigger.c           |  18 ++++
 drivers/net/mlx5/windows/mlx5_ethdev_os.c |  17 ++++
 12 files changed, 209 insertions(+)

-- 
2.27.0



* [dpdk-dev] [PATCH v4 1/2] net/mlx5: add support for Rx queue delay drop
  2021-11-04 17:59 ` [dpdk-dev] [PATCH v4 0/2] Add delay drop support for Rx queue Bing Zhao
@ 2021-11-04 17:59   ` Bing Zhao
  2021-11-04 18:22     ` Slava Ovsiienko
  2021-11-04 17:59   ` [dpdk-dev] [PATCH v4 2/2] net/mlx5: check delay drop settings in kernel driver Bing Zhao
  2021-11-04 21:46   ` [dpdk-dev] [PATCH v4 0/2] Add delay drop support for Rx queue Raslan Darawsheh
  2 siblings, 1 reply; 29+ messages in thread
From: Bing Zhao @ 2021-11-04 17:59 UTC (permalink / raw)
  To: viacheslavo, matan; +Cc: dev, rasland, thomas, orika

For Ethernet RQs, if all receive descriptors are exhausted, the
packets being received are dropped. This behavior prevents slow or
malicious software entities at the host from affecting the network.
For hairpin queues, even though no software is involved in
forwarding packets from the Rx to the Tx side, a hiccup in the
hardware or back pressure from the Tx side may still exhaust the
descriptors. In certain scenarios it may be preferable to configure
the device to avoid such packet drops, assuming the posting of
descriptors will resume shortly.

To support this, a new devarg "delay_drop_en" is introduced. By
default, delay drop is disabled for both standard and hairpin Rx
queues. The value is used as a bit mask:
  - bit 0: enablement of standard Rx queue
  - bit 1: enablement of hairpin Rx queue
The attribute is applied to all Rx queues of a device.
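
As an illustration of the bit mask described above, here is a minimal
standalone sketch of how such a value could be split into the two
per-device switches. It is not the PMD code: the macro names, the
struct and the helper functions are hypothetical and exist only for
this example. The "!!" normalization is used because a one-bit field
can only hold 0 or 1, so bit 1 (value 2) must be reduced to 1 before
it is stored.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions as described in the commit message (names are hypothetical). */
#define DELAY_DROP_STANDARD (UINT32_C(1) << 0) /* standard Rx queues */
#define DELAY_DROP_HAIRPIN  (UINT32_C(1) << 1) /* hairpin Rx queues */

/* Hypothetical holder for the two per-device switches. */
struct delay_drop_cfg {
	unsigned int std_delay_drop:1;
	unsigned int hp_delay_drop:1;
};

/* Split a devarg value into the two switches; higher bits are ignored. */
static struct delay_drop_cfg
delay_drop_parse(uint32_t devarg)
{
	struct delay_drop_cfg cfg;

	/* Normalize with !! so that any set bit is stored as exactly 1. */
	cfg.std_delay_drop = !!(devarg & DELAY_DROP_STANDARD);
	cfg.hp_delay_drop = !!(devarg & DELAY_DROP_HAIRPIN);
	return cfg;
}

/* Usage example: 0x2 enables hairpin queues only. */
static void
delay_drop_example(void)
{
	struct delay_drop_cfg cfg = delay_drop_parse(0x2);

	assert(cfg.std_delay_drop == 0);
	assert(cfg.hp_delay_drop == 1);
}
```

With this layout, a value of 0x1 selects standard queues only, 0x2
hairpin queues only, and 0x3 both.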

The "rq_delay_drop" capability in the HCA_CAP is checked before
creating any queue. If the hardware capabilities do not support
this delay drop, all the Rx queues will still be created without
this attribute, and the devarg setting will be ignored even if it
is specified explicitly.
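
The capability fallback described above can be sketched as follows.
This is a simplified standalone model, not the driver code: the
struct, the function name and the boolean capability argument are
assumptions made for the example. The only behavior taken from the
description is that an unsupported capability clears the request for
every queue type.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-device request, one switch per queue type. */
struct rxq_delay_drop_req {
	unsigned int std_delay_drop:1;
	unsigned int hp_delay_drop:1;
};

/*
 * Clear the request when the device lacks the capability.
 * Returns true when a requested setting was dropped, so the caller
 * can emit a warning; returns false when nothing had to change.
 */
static bool
delay_drop_apply_cap(struct rxq_delay_drop_req *req, bool rq_delay_drop_cap)
{
	if (!req->std_delay_drop && !req->hp_delay_drop)
		return false; /* Nothing was requested. */
	if (rq_delay_drop_cap)
		return false; /* Supported: keep the request as is. */
	req->std_delay_drop = 0;
	req->hp_delay_drop = 0;
	return true;
}
```

Returning a flag instead of logging inside the helper keeps the
sketch free of any logging dependency; the caller decides how to
report the ignored setting.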

Signed-off-by: Bing Zhao <bingz@nvidia.com>
---
 drivers/common/mlx5/mlx5_devx_cmds.c |  1 +
 drivers/common/mlx5/mlx5_devx_cmds.h |  1 +
 drivers/net/mlx5/linux/mlx5_os.c     | 11 +++++++++++
 drivers/net/mlx5/mlx5.c              |  7 +++++++
 drivers/net/mlx5/mlx5.h              |  9 +++++++++
 drivers/net/mlx5/mlx5_devx.c         |  5 +++++
 drivers/net/mlx5/mlx5_rx.h           |  1 +
 7 files changed, 35 insertions(+)

diff --git a/drivers/common/mlx5/mlx5_devx_cmds.c b/drivers/common/mlx5/mlx5_devx_cmds.c
index 4ab3070da0..3748e54b22 100644
--- a/drivers/common/mlx5/mlx5_devx_cmds.c
+++ b/drivers/common/mlx5/mlx5_devx_cmds.c
@@ -964,6 +964,7 @@ mlx5_devx_cmd_query_hca_attr(void *ctx,
 	attr->ct_offload = !!(MLX5_GET64(cmd_hca_cap, hcattr,
 					 general_obj_types) &
 			      MLX5_GENERAL_OBJ_TYPES_CAP_CONN_TRACK_OFFLOAD);
+	attr->rq_delay_drop = MLX5_GET(cmd_hca_cap, hcattr, rq_delay_drop);
 	if (attr->qos.sup) {
 		hcattr = mlx5_devx_get_hca_cap(ctx, in, out, &rc,
 				MLX5_GET_HCA_CAP_OP_MOD_QOS_CAP |
diff --git a/drivers/common/mlx5/mlx5_devx_cmds.h b/drivers/common/mlx5/mlx5_devx_cmds.h
index 86ee4f7b78..50d3264b46 100644
--- a/drivers/common/mlx5/mlx5_devx_cmds.h
+++ b/drivers/common/mlx5/mlx5_devx_cmds.h
@@ -178,6 +178,7 @@ struct mlx5_hca_attr {
 	uint32_t swp_csum:1;
 	uint32_t swp_lso:1;
 	uint32_t lro_max_msg_sz_mode:2;
+	uint32_t rq_delay_drop:1;
 	uint32_t lro_timer_supported_periods[MLX5_LRO_NUM_SUPP_PERIODS];
 	uint16_t lro_min_mss_size;
 	uint32_t flex_parser_protocols;
diff --git a/drivers/net/mlx5/linux/mlx5_os.c b/drivers/net/mlx5/linux/mlx5_os.c
index e0304b685e..de880ee4c9 100644
--- a/drivers/net/mlx5/linux/mlx5_os.c
+++ b/drivers/net/mlx5/linux/mlx5_os.c
@@ -1508,6 +1508,15 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev,
 		goto error;
 #endif
 	}
+	if (config->std_delay_drop || config->hp_delay_drop) {
+		if (!config->hca_attr.rq_delay_drop) {
+			config->std_delay_drop = 0;
+			config->hp_delay_drop = 0;
+			DRV_LOG(WARNING,
+				"dev_port-%u: Rxq delay drop is not supported",
+				priv->dev_port);
+		}
+	}
 	if (sh->devx) {
 		uint32_t reg[MLX5_ST_SZ_DW(register_mtutc)];
 
@@ -2077,6 +2086,8 @@ mlx5_os_config_default(struct mlx5_dev_config *config)
 	config->decap_en = 1;
 	config->log_hp_size = MLX5_ARG_UNSET;
 	config->allow_duplicate_pattern = 1;
+	config->std_delay_drop = 0;
+	config->hp_delay_drop = 0;
 }
 
 /**
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 8614b8ffdd..a961cce430 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -183,6 +183,9 @@
 /* Device parameter to configure implicit registration of mempool memory. */
 #define MLX5_MR_MEMPOOL_REG_EN "mr_mempool_reg_en"
 
+/* Device parameter to configure the delay drop when creating Rxqs. */
+#define MLX5_DELAY_DROP_EN "delay_drop_en"
+
 /* Shared memory between primary and secondary processes. */
 struct mlx5_shared_data *mlx5_shared_data;
 
@@ -2091,6 +2094,9 @@ mlx5_args_check(const char *key, const char *val, void *opaque)
 		config->decap_en = !!tmp;
 	} else if (strcmp(MLX5_ALLOW_DUPLICATE_PATTERN, key) == 0) {
 		config->allow_duplicate_pattern = !!tmp;
+	} else if (strcmp(MLX5_DELAY_DROP_EN, key) == 0) {
+		config->std_delay_drop = !!(tmp & MLX5_DELAY_DROP_STANDARD);
+		config->hp_delay_drop = !!(tmp & MLX5_DELAY_DROP_HAIRPIN);
 	} else {
 		DRV_LOG(WARNING, "%s: unknown parameter", key);
 		rte_errno = EINVAL;
@@ -2153,6 +2159,7 @@ mlx5_args(struct mlx5_dev_config *config, struct rte_devargs *devargs)
 		MLX5_DECAP_EN,
 		MLX5_ALLOW_DUPLICATE_PATTERN,
 		MLX5_MR_MEMPOOL_REG_EN,
+		MLX5_DELAY_DROP_EN,
 		NULL,
 	};
 	struct rte_kvargs *kvlist;
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 51f4578838..b2022f3300 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -99,6 +99,13 @@ enum mlx5_flow_type {
 	MLX5_FLOW_TYPE_MAXI,
 };
 
+/* The mode of delay drop for Rx queues. */
+enum mlx5_delay_drop_mode {
+	MLX5_DELAY_DROP_NONE = 0, /* All disabled. */
+	MLX5_DELAY_DROP_STANDARD = RTE_BIT32(0), /* Standard queues enable. */
+	MLX5_DELAY_DROP_HAIRPIN = RTE_BIT32(1), /* Hairpin queues enable. */
+};
+
 /* Hlist and list callback context. */
 struct mlx5_flow_cb_ctx {
 	struct rte_eth_dev *dev;
@@ -264,6 +271,8 @@ struct mlx5_dev_config {
 	unsigned int dv_miss_info:1; /* restore packet after partial hw miss */
 	unsigned int allow_duplicate_pattern:1;
 	/* Allow/Prevent the duplicate rules pattern. */
+	unsigned int std_delay_drop:1; /* Enable standard Rxq delay drop. */
+	unsigned int hp_delay_drop:1; /* Enable hairpin Rxq delay drop. */
 	struct {
 		unsigned int enabled:1; /* Whether MPRQ is enabled. */
 		unsigned int stride_num_n; /* Number of strides. */
diff --git a/drivers/net/mlx5/mlx5_devx.c b/drivers/net/mlx5/mlx5_devx.c
index a9f9f4af70..e46f79124d 100644
--- a/drivers/net/mlx5/mlx5_devx.c
+++ b/drivers/net/mlx5/mlx5_devx.c
@@ -277,6 +277,7 @@ mlx5_rxq_create_devx_rq_resources(struct mlx5_rxq_priv *rxq)
 						MLX5_WQ_END_PAD_MODE_NONE;
 	rq_attr.wq_attr.pd = cdev->pdn;
 	rq_attr.counter_set_id = priv->counter_set_id;
+	rq_attr.delay_drop_en = rxq_data->delay_drop;
 	rq_attr.user_index = rte_cpu_to_be_16(priv->dev_data->port_id);
 	if (rxq_data->shared) /* Create RMP based RQ. */
 		rxq->devx_rq.rmp = &rxq_ctrl->obj->devx_rmp;
@@ -439,6 +440,8 @@ mlx5_rxq_obj_hairpin_new(struct mlx5_rxq_priv *rxq)
 			attr.wq_attr.log_hairpin_data_sz -
 			MLX5_HAIRPIN_QUEUE_STRIDE;
 	attr.counter_set_id = priv->counter_set_id;
+	rxq_ctrl->rxq.delay_drop = priv->config.hp_delay_drop;
+	attr.delay_drop_en = priv->config.hp_delay_drop;
 	tmpl->rq = mlx5_devx_cmd_create_rq(priv->sh->cdev->ctx, &attr,
 					   rxq_ctrl->socket);
 	if (!tmpl->rq) {
@@ -496,6 +499,7 @@ mlx5_rxq_devx_obj_new(struct mlx5_rxq_priv *rxq)
 		DRV_LOG(ERR, "Failed to create CQ.");
 		goto error;
 	}
+	rxq_data->delay_drop = priv->config.std_delay_drop;
 	/* Create RQ using DevX API. */
 	ret = mlx5_rxq_create_devx_rq_resources(rxq);
 	if (ret) {
@@ -941,6 +945,7 @@ mlx5_rxq_devx_obj_drop_create(struct rte_eth_dev *dev)
 			dev->data->port_id);
 		goto error;
 	}
+	rxq_ctrl->rxq.delay_drop = 0;
 	/* Create RQ using DevX API. */
 	ret = mlx5_rxq_create_devx_rq_resources(rxq);
 	if (ret != 0) {
diff --git a/drivers/net/mlx5/mlx5_rx.h b/drivers/net/mlx5/mlx5_rx.h
index eda6eca8de..3b797e577a 100644
--- a/drivers/net/mlx5/mlx5_rx.h
+++ b/drivers/net/mlx5/mlx5_rx.h
@@ -97,6 +97,7 @@ struct mlx5_rxq_data {
 	unsigned int dynf_meta:1; /* Dynamic metadata is configured. */
 	unsigned int mcqe_format:3; /* CQE compression format. */
 	unsigned int shared:1; /* Shared RXQ. */
+	unsigned int delay_drop:1; /* Enable delay drop. */
 	volatile uint32_t *rq_db;
 	volatile uint32_t *cq_db;
 	uint16_t port_id;
-- 
2.27.0



* [dpdk-dev] [PATCH v4 2/2] net/mlx5: check delay drop settings in kernel driver
  2021-11-04 17:59 ` [dpdk-dev] [PATCH v4 0/2] Add delay drop support for Rx queue Bing Zhao
  2021-11-04 17:59   ` [dpdk-dev] [PATCH v4 1/2] net/mlx5: add support for Rx queue delay drop Bing Zhao
@ 2021-11-04 17:59   ` Bing Zhao
  2021-11-04 18:22     ` Slava Ovsiienko
  2021-11-04 21:46   ` [dpdk-dev] [PATCH v4 0/2] Add delay drop support for Rx queue Raslan Darawsheh
  2 siblings, 1 reply; 29+ messages in thread
From: Bing Zhao @ 2021-11-04 17:59 UTC (permalink / raw)
  To: viacheslavo, matan; +Cc: dev, rasland, thomas, orika

Delay drop is a common feature managed on a per-device basis, and
the kernel driver is responsible for its initialization and
rearming.

By default, the timeout value is set to activate the delay drop when
the driver is loaded.

A private flag "dropless_rq" controls the rearming. Only when it is
on is the rearming performed after a timeout event is received.
Otherwise, the delay drop is deactivated after the first timeout
occurs and none of the Rx queues keep this feature.

The PMD queries this flag and warns the application when some
queues are created with delay drop but the flag is off.
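
The flag lookup can be pictured with the following standalone sketch.
It models only the string search and the bit test; the ioctl calls
that fetch the name table and the flag word are left out. The helper
names and the table layout argument are hypothetical, while the entry
size of 32 bytes matches ETH_GSTRING_LEN in linux/ethtool.h.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define GSTRING_LEN 32 /* Same value as ETH_GSTRING_LEN. */

/*
 * Find a private flag by name in a table of fixed-size entries.
 * Each entry is forcibly terminated first, since a name may fill
 * the whole entry. Returns the entry index, or -1 if not found.
 */
static int
priv_flag_index(char *table, int n_flags, const char *name)
{
	int i;

	for (i = 0; i < n_flags; i++) {
		char *entry = table + i * GSTRING_LEN;

		entry[GSTRING_LEN - 1] = '\0';
		if (strcmp(entry, name) == 0)
			return i;
	}
	return -1;
}

/*
 * Report the state of a named flag in a 32-bit flag word.
 * Returns 1 if set, 0 if clear, -1 if the name is unknown or its
 * index does not fit in the 32-bit word.
 */
static int
priv_flag_is_set(char *table, int n_flags, uint32_t flags, const char *name)
{
	int idx = priv_flag_index(table, n_flags, name);

	if (idx < 0 || idx >= 32)
		return -1;
	return !!(flags & (UINT32_C(1) << idx));
}
```

The index of the name in the table is also the bit position in the
flag word, which is why the search and the bit test share one index.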

The documents are also updated in this commit.

Signed-off-by: Bing Zhao <bingz@nvidia.com>
---
 doc/guides/nics/mlx5.rst                  |  26 +++++
 doc/guides/rel_notes/release_21_11.rst    |   1 +
 drivers/net/mlx5/linux/mlx5_ethdev_os.c   | 111 ++++++++++++++++++++++
 drivers/net/mlx5/mlx5.h                   |   1 +
 drivers/net/mlx5/mlx5_trigger.c           |  18 ++++
 drivers/net/mlx5/windows/mlx5_ethdev_os.c |  17 ++++
 6 files changed, 174 insertions(+)

diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index 824971d89a..006896375f 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -1129,6 +1129,27 @@ Driver options
 
   By default, the PMD will set this value to 1.
 
+- ``delay_drop_en`` parameter [int]
+
+  Bitmask value for the Rx queue delay drop attribute. Bit 0 is used for standard
+  Rx queue and bit 1 is used for hairpin Rx queue.
+  By default, the delay drop is disabled for all Rx queues, both standard and
+  hairpin. The setting is ignored if the NIC does not support the attribute,
+  even when it is enabled explicitly.
+  A timeout value is set in the driver to control the waiting time before dropping
+  a packet when there is no WQE available on a delay drop Rx queue. Once the timer
+  expires, the delay drop is deactivated for all queues. To re-activate it,
+  rearming is needed, which is now handled by the kernel driver.
+
+  To enable / disable the delay drop rearming, the private flag ``dropless_rq`` can
+  be set and queried via ethtool:
+
+  - ethtool --set-priv-flags <netdev> dropless_rq on (/ off)
+  - ethtool --show-priv-flags <netdev>
+
+  The configuration flag is global per PF and can only be set on the PF. Once it is on,
+  all the VFs', SFs' and representors' Rx queues will share the timer and rearming.
+
 .. _mlx5_firmware_config:
 
 Firmware configuration
@@ -1803,6 +1824,11 @@ Supported hardware offloads
    |                       | |               | | rdma-core 35  |
    |                       | |               | | ConnectX-6 Dx |
    +-----------------------+-----------------+-----------------+
+   | Rxq Delay drop        | | DPDK 21.11    | | DPDK 21.11    |
+   |                       | | OFED 5.5      | | OFED 5.5      |
+   |                       | | N/A           | | N/A           |
+   |                       | | ConnectX-5    | | ConnectX-5    |
+   +-----------------------+-----------------+-----------------+
 
 .. table:: Minimal SW/HW versions for shared action offload
    :name: sact
diff --git a/doc/guides/rel_notes/release_21_11.rst b/doc/guides/rel_notes/release_21_11.rst
index 13d8330873..76d18aeb6b 100644
--- a/doc/guides/rel_notes/release_21_11.rst
+++ b/doc/guides/rel_notes/release_21_11.rst
@@ -191,6 +191,7 @@ New Features
   * Added implicit mempool registration to avoid data path hiccups (opt-out).
   * Added NIC offloads for the PMD on Windows (TSO, VLAN strip, CRC keep).
   * Added socket direct mode bonding support.
+  * Added delay drop support for Rx queue.
 
 * **Updated Solarflare network PMD.**
 
diff --git a/drivers/net/mlx5/linux/mlx5_ethdev_os.c b/drivers/net/mlx5/linux/mlx5_ethdev_os.c
index 9d0e491d0c..c19825ee52 100644
--- a/drivers/net/mlx5/linux/mlx5_ethdev_os.c
+++ b/drivers/net/mlx5/linux/mlx5_ethdev_os.c
@@ -1630,3 +1630,114 @@ mlx5_get_mac(struct rte_eth_dev *dev, uint8_t (*mac)[RTE_ETHER_ADDR_LEN])
 	memcpy(mac, request.ifr_hwaddr.sa_data, RTE_ETHER_ADDR_LEN);
 	return 0;
 }
+
+/*
+ * Query dropless_rq private flag value provided by ETHTOOL.
+ *
+ * @param dev
+ *   Pointer to Ethernet device.
+ *
+ * @return
+ *   - 0 on success, flag is not set.
+ *   - 1 on success, flag is set.
+ *   - negative errno value otherwise and rte_errno is set.
+ */
+int mlx5_get_flag_dropless_rq(struct rte_eth_dev *dev)
+{
+	struct {
+		struct ethtool_sset_info hdr;
+		uint32_t buf[1];
+	} sset_info;
+	struct ethtool_drvinfo drvinfo;
+	struct ifreq ifr;
+	struct ethtool_gstrings *strings = NULL;
+	struct ethtool_value flags;
+	const int32_t flag_len = sizeof(flags.data) * CHAR_BIT;
+	int32_t str_sz;
+	int32_t len;
+	int32_t i;
+	int ret;
+
+	sset_info.hdr.cmd = ETHTOOL_GSSET_INFO;
+	sset_info.hdr.reserved = 0;
+	sset_info.hdr.sset_mask = 1ULL << ETH_SS_PRIV_FLAGS;
+	ifr.ifr_data = (caddr_t)&sset_info;
+	ret = mlx5_ifreq(dev, SIOCETHTOOL, &ifr);
+	if (!ret) {
+		const uint32_t *sset_lengths = sset_info.hdr.data;
+
+		len = sset_info.hdr.sset_mask ? sset_lengths[0] : 0;
+	} else if (ret == -EOPNOTSUPP) {
+		drvinfo.cmd = ETHTOOL_GDRVINFO;
+		ifr.ifr_data = (caddr_t)&drvinfo;
+		ret = mlx5_ifreq(dev, SIOCETHTOOL, &ifr);
+		if (ret) {
+			DRV_LOG(WARNING, "port %u cannot get the driver info",
+				dev->data->port_id);
+			goto exit;
+		}
+		len = *(uint32_t *)((char *)&drvinfo +
+			offsetof(struct ethtool_drvinfo, n_priv_flags));
+	} else {
+		DRV_LOG(WARNING, "port %u cannot get the sset info",
+			dev->data->port_id);
+		goto exit;
+	}
+	if (!len) {
+		DRV_LOG(WARNING, "port %u does not have private flag",
+			dev->data->port_id);
+		rte_errno = EOPNOTSUPP;
+		ret = -rte_errno;
+		goto exit;
+	} else if (len > flag_len) {
+		DRV_LOG(WARNING, "port %u maximal private flags number is %d",
+			dev->data->port_id, flag_len);
+		len = flag_len;
+	}
+	str_sz = ETH_GSTRING_LEN * len;
+	strings = (struct ethtool_gstrings *)
+		  mlx5_malloc(0, str_sz + sizeof(struct ethtool_gstrings), 0,
+			      SOCKET_ID_ANY);
+	if (!strings) {
+		DRV_LOG(WARNING, "port %u unable to allocate memory for"
+			" private flags", dev->data->port_id);
+		rte_errno = ENOMEM;
+		ret = -rte_errno;
+		goto exit;
+	}
+	strings->cmd = ETHTOOL_GSTRINGS;
+	strings->string_set = ETH_SS_PRIV_FLAGS;
+	strings->len = len;
+	ifr.ifr_data = (caddr_t)strings;
+	ret = mlx5_ifreq(dev, SIOCETHTOOL, &ifr);
+	if (ret) {
+		DRV_LOG(WARNING, "port %u unable to get private flags strings",
+			dev->data->port_id);
+		goto exit;
+	}
+	for (i = 0; i < len; i++) {
+		strings->data[(i + 1) * ETH_GSTRING_LEN - 1] = 0;
+		if (!strcmp((const char *)strings->data + i * ETH_GSTRING_LEN,
+			     "dropless_rq"))
+			break;
+	}
+	if (i == len) {
+		DRV_LOG(WARNING, "port %u does not support dropless_rq",
+			dev->data->port_id);
+		rte_errno = EOPNOTSUPP;
+		ret = -rte_errno;
+		goto exit;
+	}
+	flags.cmd = ETHTOOL_GPFLAGS;
+	ifr.ifr_data = (caddr_t)&flags;
+	ret = mlx5_ifreq(dev, SIOCETHTOOL, &ifr);
+	if (ret) {
+		DRV_LOG(WARNING, "port %u unable to get private flags status",
+			dev->data->port_id);
+		goto exit;
+	}
+	ret = !!(flags.data & (1U << i));
+exit:
+	mlx5_free(strings);
+	return ret;
+}
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index b2022f3300..9307a4f95b 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -1602,6 +1602,7 @@ int mlx5_os_read_dev_stat(struct mlx5_priv *priv,
 int mlx5_os_read_dev_counters(struct rte_eth_dev *dev, uint64_t *stats);
 int mlx5_os_get_stats_n(struct rte_eth_dev *dev);
 void mlx5_os_stats_init(struct rte_eth_dev *dev);
+int mlx5_get_flag_dropless_rq(struct rte_eth_dev *dev);
 
 /* mlx5_mac.c */
 
diff --git a/drivers/net/mlx5/mlx5_trigger.c b/drivers/net/mlx5/mlx5_trigger.c
index a3e62e9533..0ecc530043 100644
--- a/drivers/net/mlx5/mlx5_trigger.c
+++ b/drivers/net/mlx5/mlx5_trigger.c
@@ -1129,6 +1129,24 @@ mlx5_dev_start(struct rte_eth_dev *dev)
 			dev->data->port_id, strerror(rte_errno));
 		goto error;
 	}
+	if (priv->config.std_delay_drop || priv->config.hp_delay_drop) {
+		if (!priv->config.vf && !priv->config.sf &&
+		    !priv->representor) {
+			ret = mlx5_get_flag_dropless_rq(dev);
+			if (ret < 0)
+				DRV_LOG(WARNING,
+					"port %u cannot query dropless flag",
+					dev->data->port_id);
+			else if (!ret)
+				DRV_LOG(WARNING,
+					"port %u dropless_rq OFF, no rearming",
+					dev->data->port_id);
+		} else {
+			DRV_LOG(DEBUG,
+				"port %u doesn't support dropless_rq flag",
+				dev->data->port_id);
+		}
+	}
 	ret = mlx5_rxq_start(dev);
 	if (ret) {
 		DRV_LOG(ERR, "port %u Rx queue allocation failed: %s",
diff --git a/drivers/net/mlx5/windows/mlx5_ethdev_os.c b/drivers/net/mlx5/windows/mlx5_ethdev_os.c
index fddc7a6b12..359f73df7c 100644
--- a/drivers/net/mlx5/windows/mlx5_ethdev_os.c
+++ b/drivers/net/mlx5/windows/mlx5_ethdev_os.c
@@ -389,3 +389,20 @@ mlx5_is_removed(struct rte_eth_dev *dev)
 		return 1;
 	return 0;
 }
+
+/*
+ * Query dropless_rq private flag value provided by ETHTOOL.
+ *
+ * @param dev
+ *   Pointer to Ethernet device.
+ *
+ * @return
+ *   - 0 on success, flag is not set.
+ *   - 1 on success, flag is set.
+ *   - negative errno value otherwise and rte_errno is set.
+ */
+int mlx5_get_flag_dropless_rq(struct rte_eth_dev *dev)
+{
+	RTE_SET_USED(dev);
+	return -ENOTSUP;
+}
-- 
2.27.0



* Re: [dpdk-dev] [PATCH v4 1/2] net/mlx5: add support for Rx queue delay drop
  2021-11-04 17:59   ` [dpdk-dev] [PATCH v4 1/2] net/mlx5: add support for Rx queue delay drop Bing Zhao
@ 2021-11-04 18:22     ` Slava Ovsiienko
  0 siblings, 0 replies; 29+ messages in thread
From: Slava Ovsiienko @ 2021-11-04 18:22 UTC (permalink / raw)
  To: Bing Zhao, Matan Azrad
  Cc: dev, Raslan Darawsheh, NBU-Contact-Thomas Monjalon, Ori Kam

> -----Original Message-----
> From: Bing Zhao <bingz@nvidia.com>
> Sent: Thursday, November 4, 2021 19:59
> To: Slava Ovsiienko <viacheslavo@nvidia.com>; Matan Azrad
> <matan@nvidia.com>
> Cc: dev@dpdk.org; Raslan Darawsheh <rasland@nvidia.com>; NBU-Contact-
> Thomas Monjalon <thomas@monjalon.net>; Ori Kam <orika@nvidia.com>
> Subject: [PATCH v4 1/2] net/mlx5: add support for Rx queue delay drop
> 
> For Ethernet RQs, if all receive descriptors are exhausted, the packets
> being received are dropped. This behavior prevents slow or malicious
> software entities at the host from affecting the network. For hairpin
> queues, even though no software is involved in forwarding packets from
> the Rx to the Tx side, a hiccup in the hardware or back pressure from the
> Tx side may still exhaust the descriptors. In certain scenarios it may be
> preferable to configure the device to avoid such packet drops, assuming
> the posting of descriptors will resume shortly.
> 
> To support this, a new devarg "delay_drop_en" is introduced. By default,
> delay drop is disabled for both standard and hairpin Rx queues. The value
> is used as a bit mask:
>   - bit 0: enablement of standard Rx queue
>   - bit 1: enablement of hairpin Rx queue
> The attribute is applied to all Rx queues of a device.
> 
> The "rq_delay_drop" capability in the HCA_CAP is checked before creating
> any queue. If the hardware capabilities do not support this delay drop, all the
> Rx queues will still be created without this attribute, and the devarg setting
> will be ignored even if it is specified explicitly.
> 
> Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>



* Re: [dpdk-dev] [PATCH v4 2/2] net/mlx5: check delay drop settings in kernel driver
  2021-11-04 17:59   ` [dpdk-dev] [PATCH v4 2/2] net/mlx5: check delay drop settings in kernel driver Bing Zhao
@ 2021-11-04 18:22     ` Slava Ovsiienko
  0 siblings, 0 replies; 29+ messages in thread
From: Slava Ovsiienko @ 2021-11-04 18:22 UTC (permalink / raw)
  To: Bing Zhao, Matan Azrad
  Cc: dev, Raslan Darawsheh, NBU-Contact-Thomas Monjalon, Ori Kam

> -----Original Message-----
> From: Bing Zhao <bingz@nvidia.com>
> Sent: Thursday, November 4, 2021 19:59
> To: Slava Ovsiienko <viacheslavo@nvidia.com>; Matan Azrad
> <matan@nvidia.com>
> Cc: dev@dpdk.org; Raslan Darawsheh <rasland@nvidia.com>; NBU-Contact-
> Thomas Monjalon <thomas@monjalon.net>; Ori Kam <orika@nvidia.com>
> Subject: [PATCH v4 2/2] net/mlx5: check delay drop settings in kernel driver
> 
> Delay drop is a common feature managed on a per-device basis, and the
> kernel driver is responsible for its initialization and rearming.
> 
> By default, the timeout value is set to activate the delay drop when the
> driver is loaded.
> 
> A private flag "dropless_rq" controls the rearming. Only when it is on is
> the rearming performed after a timeout event is received. Otherwise, the
> delay drop is deactivated after the first timeout occurs and none of the
> Rx queues keep this feature.
> 
> The PMD queries this flag and warns the application when some queues are
> created with delay drop but the flag is off.
> 
> The documents are also updated in this commit.
> 
> Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>



* Re: [dpdk-dev] [PATCH v4 0/2] Add delay drop support for Rx queue
  2021-11-04 17:59 ` [dpdk-dev] [PATCH v4 0/2] Add delay drop support for Rx queue Bing Zhao
  2021-11-04 17:59   ` [dpdk-dev] [PATCH v4 1/2] net/mlx5: add support for Rx queue delay drop Bing Zhao
  2021-11-04 17:59   ` [dpdk-dev] [PATCH v4 2/2] net/mlx5: check delay drop settings in kernel driver Bing Zhao
@ 2021-11-04 21:46   ` Raslan Darawsheh
  2 siblings, 0 replies; 29+ messages in thread
From: Raslan Darawsheh @ 2021-11-04 21:46 UTC (permalink / raw)
  To: Bing Zhao, Slava Ovsiienko, Matan Azrad
  Cc: dev, NBU-Contact-Thomas Monjalon, Ori Kam

Hi,

> -----Original Message-----
> From: Bing Zhao <bingz@nvidia.com>
> Sent: Thursday, November 4, 2021 7:59 PM
> To: Slava Ovsiienko <viacheslavo@nvidia.com>; Matan Azrad
> <matan@nvidia.com>
> Cc: dev@dpdk.org; Raslan Darawsheh <rasland@nvidia.com>; NBU-Contact-
> Thomas Monjalon <thomas@monjalon.net>; Ori Kam <orika@nvidia.com>
> Subject: [PATCH v4 0/2] Add delay drop support for Rx queue
> 
> This patch set introduces a new devarg to support delay drop when creating
> the Rx queues. The attribute is disabled by default, so the behavior
> remains the same as before.
> 
> ---
> v2:
>   - change hairpin queue delay drop to disable by default
>   - combine the commits
>   - fix Windows building
>   - change the log print
> v3: fix conflict and building
> v4: code style update and commit log polishing
> ---
> 
> Bing Zhao (2):
>   net/mlx5: add support for Rx queue delay drop
>   net/mlx5: check delay drop settings in kernel driver
> 
>  doc/guides/nics/mlx5.rst                  |  26 +++++
>  doc/guides/rel_notes/release_21_11.rst    |   1 +
>  drivers/common/mlx5/mlx5_devx_cmds.c      |   1 +
>  drivers/common/mlx5/mlx5_devx_cmds.h      |   1 +
>  drivers/net/mlx5/linux/mlx5_ethdev_os.c   | 111 ++++++++++++++++++++++
>  drivers/net/mlx5/linux/mlx5_os.c          |  11 +++
>  drivers/net/mlx5/mlx5.c                   |   7 ++
>  drivers/net/mlx5/mlx5.h                   |  10 ++
>  drivers/net/mlx5/mlx5_devx.c              |   5 +
>  drivers/net/mlx5/mlx5_rx.h                |   1 +
>  drivers/net/mlx5/mlx5_trigger.c           |  18 ++++
>  drivers/net/mlx5/windows/mlx5_ethdev_os.c |  17 ++++
>  12 files changed, 209 insertions(+)
> 
> --
> 2.27.0

Series applied to next-net-mlx,

Kindest regards,
Raslan Darawsheh


* [dpdk-dev] [PATCH v5 0/2] Add delay drop support for Rx queue
  2021-11-04 11:26 [dpdk-dev] [PATCH 0/4] Add delay drop support for Rx queue Bing Zhao
                   ` (6 preceding siblings ...)
  2021-11-04 17:59 ` [dpdk-dev] [PATCH v4 0/2] Add delay drop support for Rx queue Bing Zhao
@ 2021-11-05 13:36 ` Bing Zhao
  2021-11-05 13:36   ` [dpdk-dev] [PATCH v5 1/2] net/mlx5: add support for Rx queue delay drop Bing Zhao
  2021-11-05 13:36   ` [dpdk-dev] [PATCH v5 2/2] net/mlx5: check delay drop settings in kernel driver Bing Zhao
  2021-11-05 14:28 ` [dpdk-dev] [PATCH v6 0/2] Add delay drop support for Rx queue Bing Zhao
  2021-11-05 15:30 ` [dpdk-dev] [PATCH v7 0/2] Add delay drop support for Rx queue Bing Zhao
  9 siblings, 2 replies; 29+ messages in thread
From: Bing Zhao @ 2021-11-05 13:36 UTC (permalink / raw)
  To: viacheslavo, matan; +Cc: dev, rasland, thomas, orika

This patch set introduces a new devarg to support delay drop when
creating the Rx queues. The attribute is disabled by default, so
the behavior remains the same as before.

---
v2:
  - change hairpin queue delay drop to disable by default
  - combine the commits
  - fix Windows building
  - change the log print
v3: fix conflict and building
v4: code style update and commit log polishing
v5:
  - split and fix the document description
  - devarg name update
---

Bing Zhao (2):
  net/mlx5: add support for Rx queue delay drop
  net/mlx5: check delay drop settings in kernel driver

 doc/guides/nics/mlx5.rst                  |  27 ++++++
 doc/guides/rel_notes/release_21_11.rst    |   1 +
 drivers/common/mlx5/mlx5_devx_cmds.c      |   1 +
 drivers/common/mlx5/mlx5_devx_cmds.h      |   1 +
 drivers/net/mlx5/linux/mlx5_ethdev_os.c   | 111 ++++++++++++++++++++++
 drivers/net/mlx5/linux/mlx5_os.c          |  11 +++
 drivers/net/mlx5/mlx5.c                   |   7 ++
 drivers/net/mlx5/mlx5.h                   |  10 ++
 drivers/net/mlx5/mlx5_devx.c              |   5 +
 drivers/net/mlx5/mlx5_rx.h                |   1 +
 drivers/net/mlx5/mlx5_trigger.c           |  18 ++++
 drivers/net/mlx5/windows/mlx5_ethdev_os.c |  17 ++++
 12 files changed, 210 insertions(+)

-- 
2.27.0



* [dpdk-dev] [PATCH v5 1/2] net/mlx5: add support for Rx queue delay drop
  2021-11-05 13:36 ` [dpdk-dev] [PATCH v5 " Bing Zhao
@ 2021-11-05 13:36   ` Bing Zhao
  2021-11-05 13:36   ` [dpdk-dev] [PATCH v5 2/2] net/mlx5: check delay drop settings in kernel driver Bing Zhao
  1 sibling, 0 replies; 29+ messages in thread
From: Bing Zhao @ 2021-11-05 13:36 UTC (permalink / raw)
  To: viacheslavo, matan; +Cc: dev, rasland, thomas, orika

For Ethernet RQs, if all receive descriptors are exhausted, incoming
packets are dropped. This behavior prevents slow or malicious software
entities on the host from affecting the network. In hairpin cases,
even though no software is involved in forwarding packets from the Rx
to the Tx side, a hiccup in the hardware or back pressure from the Tx
side may still exhaust the descriptors. In certain scenarios it may be
preferable to configure the device to avoid such packet drops,
assuming the posting of descriptors will resume shortly.

To support this, a new devarg "delay_drop" is introduced. By default,
delay drop is disabled for both standard and hairpin Rx queues. The
value is used as a bit mask:
  - bit 0: enablement of standard Rx queue
  - bit 1: enablement of hairpin Rx queue
This attribute is applied to all Rx queues of a device.

The "rq_delay_drop" capability in the HCA_CAP is checked before
creating any queue. If the hardware capabilities do not support
this delay drop, all the Rx queues will still be created without
this attribute, and the devarg setting will be ignored even if it
is specified explicitly. A warning log is used to notify the
application when this occurs.

The "mlx5.rst" documentation is updated.

Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
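Note: the bit mask decoding described in the commit message can be
sketched as below. This is only an illustration under assumptions, not
the driver code: "struct delay_drop_cfg" and "delay_drop_decode()" are
hypothetical names, and the two macros merely mirror bit 0 and bit 1
from the description. Each bit is normalized with "!!" because
assigning a raw "mask & 2" to a 1-bit field would truncate it to 0.

```c
#include <stdint.h>

/* Bit layout as described in the commit message (names are illustrative). */
#define DELAY_DROP_STANDARD (UINT32_C(1) << 0)
#define DELAY_DROP_HAIRPIN  (UINT32_C(1) << 1)

/* Hypothetical stand-in for the two 1-bit fields of the device config. */
struct delay_drop_cfg {
	unsigned int std_delay_drop:1;
	unsigned int hp_delay_drop:1;
};

/* Split the devarg bit mask into one flag per Rx queue type. */
static struct delay_drop_cfg
delay_drop_decode(unsigned long mask)
{
	struct delay_drop_cfg cfg;

	/* "!!" maps any nonzero result to 1 so it fits a 1-bit field. */
	cfg.std_delay_drop = !!(mask & DELAY_DROP_STANDARD);
	cfg.hp_delay_drop = !!(mask & DELAY_DROP_HAIRPIN);
	return cfg;
}
```

With a mask of 0x2 only hairpin Rx queues would get the attribute,
while 0x3 selects both queue types.
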
 doc/guides/nics/mlx5.rst             | 11 +++++++++++
 drivers/common/mlx5/mlx5_devx_cmds.c |  1 +
 drivers/common/mlx5/mlx5_devx_cmds.h |  1 +
 drivers/net/mlx5/linux/mlx5_os.c     | 11 +++++++++++
 drivers/net/mlx5/mlx5.c              |  7 +++++++
 drivers/net/mlx5/mlx5.h              |  9 +++++++++
 drivers/net/mlx5/mlx5_devx.c         |  5 +++++
 drivers/net/mlx5/mlx5_rx.h           |  1 +
 8 files changed, 46 insertions(+)

diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index 824971d89a..061a44c723 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -114,6 +114,7 @@ Features
 - Sub-Function representors.
 - Sub-Function.
 - Shared Rx queue.
+- Rx queue delay drop.
 
 
 Limitations
@@ -608,6 +609,16 @@ Driver options
   - POWER8 and ARMv8 with ConnectX-4 Lx, ConnectX-5, ConnectX-6, ConnectX-6 Dx,
     ConnectX-6 Lx, BlueField and BlueField-2.
 
+- ``delay_drop`` parameter [int]
+
+  Bitmask value for the Rx queue delay drop attribute. Bit 0 is used for the
+  standard Rx queue and bit 1 is used for the hairpin Rx queue. By default, the
+  delay drop is disabled for all Rx queues. It will be ignored if the port does
+  not support the attribute even if it is enabled explicitly.
+
+  The packets being received will not be dropped immediately when the WQEs are
+  exhausted in a Rx queue with delay drop enabled.
+
 - ``mprq_en`` parameter [int]
 
   A nonzero value enables configuring Multi-Packet Rx queues. Rx queue is
diff --git a/drivers/common/mlx5/mlx5_devx_cmds.c b/drivers/common/mlx5/mlx5_devx_cmds.c
index fca1470be7..49db07facc 100644
--- a/drivers/common/mlx5/mlx5_devx_cmds.c
+++ b/drivers/common/mlx5/mlx5_devx_cmds.c
@@ -965,6 +965,7 @@ mlx5_devx_cmd_query_hca_attr(void *ctx,
 	attr->ct_offload = !!(MLX5_GET64(cmd_hca_cap, hcattr,
 					 general_obj_types) &
 			      MLX5_GENERAL_OBJ_TYPES_CAP_CONN_TRACK_OFFLOAD);
+	attr->rq_delay_drop = MLX5_GET(cmd_hca_cap, hcattr, rq_delay_drop);
 	if (attr->qos.sup) {
 		hcattr = mlx5_devx_get_hca_cap(ctx, in, out, &rc,
 				MLX5_GET_HCA_CAP_OP_MOD_QOS_CAP |
diff --git a/drivers/common/mlx5/mlx5_devx_cmds.h b/drivers/common/mlx5/mlx5_devx_cmds.h
index 344cd7bbf3..447f76f1f9 100644
--- a/drivers/common/mlx5/mlx5_devx_cmds.h
+++ b/drivers/common/mlx5/mlx5_devx_cmds.h
@@ -178,6 +178,7 @@ struct mlx5_hca_attr {
 	uint32_t swp_csum:1;
 	uint32_t swp_lso:1;
 	uint32_t lro_max_msg_sz_mode:2;
+	uint32_t rq_delay_drop:1;
 	uint32_t lro_timer_supported_periods[MLX5_LRO_NUM_SUPP_PERIODS];
 	uint16_t lro_min_mss_size;
 	uint32_t flex_parser_protocols;
diff --git a/drivers/net/mlx5/linux/mlx5_os.c b/drivers/net/mlx5/linux/mlx5_os.c
index e0304b685e..de880ee4c9 100644
--- a/drivers/net/mlx5/linux/mlx5_os.c
+++ b/drivers/net/mlx5/linux/mlx5_os.c
@@ -1508,6 +1508,15 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev,
 		goto error;
 #endif
 	}
+	if (config->std_delay_drop || config->hp_delay_drop) {
+		if (!config->hca_attr.rq_delay_drop) {
+			config->std_delay_drop = 0;
+			config->hp_delay_drop = 0;
+			DRV_LOG(WARNING,
+				"dev_port-%u: Rxq delay drop is not supported",
+				priv->dev_port);
+		}
+	}
 	if (sh->devx) {
 		uint32_t reg[MLX5_ST_SZ_DW(register_mtutc)];
 
@@ -2077,6 +2086,8 @@ mlx5_os_config_default(struct mlx5_dev_config *config)
 	config->decap_en = 1;
 	config->log_hp_size = MLX5_ARG_UNSET;
 	config->allow_duplicate_pattern = 1;
+	config->std_delay_drop = 0;
+	config->hp_delay_drop = 0;
 }
 
 /**
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 8614b8ffdd..4e289402a8 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -183,6 +183,9 @@
 /* Device parameter to configure implicit registration of mempool memory. */
 #define MLX5_MR_MEMPOOL_REG_EN "mr_mempool_reg_en"
 
+/* Device parameter to configure the delay drop when creating Rxqs. */
+#define MLX5_DELAY_DROP_EN "delay_drop"
+
 /* Shared memory between primary and secondary processes. */
 struct mlx5_shared_data *mlx5_shared_data;
 
@@ -2091,6 +2094,9 @@ mlx5_args_check(const char *key, const char *val, void *opaque)
 		config->decap_en = !!tmp;
 	} else if (strcmp(MLX5_ALLOW_DUPLICATE_PATTERN, key) == 0) {
 		config->allow_duplicate_pattern = !!tmp;
+	} else if (strcmp(MLX5_DELAY_DROP_EN, key) == 0) {
+		config->std_delay_drop = !!(tmp & MLX5_DELAY_DROP_STANDARD);
+		config->hp_delay_drop = !!(tmp & MLX5_DELAY_DROP_HAIRPIN);
 	} else {
 		DRV_LOG(WARNING, "%s: unknown parameter", key);
 		rte_errno = EINVAL;
@@ -2153,6 +2159,7 @@ mlx5_args(struct mlx5_dev_config *config, struct rte_devargs *devargs)
 		MLX5_DECAP_EN,
 		MLX5_ALLOW_DUPLICATE_PATTERN,
 		MLX5_MR_MEMPOOL_REG_EN,
+		MLX5_DELAY_DROP_EN,
 		NULL,
 	};
 	struct rte_kvargs *kvlist;
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 51f4578838..b2022f3300 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -99,6 +99,13 @@ enum mlx5_flow_type {
 	MLX5_FLOW_TYPE_MAXI,
 };
 
+/* The mode of delay drop for Rx queues. */
+enum mlx5_delay_drop_mode {
+	MLX5_DELAY_DROP_NONE = 0, /* All disabled. */
+	MLX5_DELAY_DROP_STANDARD = RTE_BIT32(0), /* Standard queues enable. */
+	MLX5_DELAY_DROP_HAIRPIN = RTE_BIT32(1), /* Hairpin queues enable. */
+};
+
 /* Hlist and list callback context. */
 struct mlx5_flow_cb_ctx {
 	struct rte_eth_dev *dev;
@@ -264,6 +271,8 @@ struct mlx5_dev_config {
 	unsigned int dv_miss_info:1; /* restore packet after partial hw miss */
 	unsigned int allow_duplicate_pattern:1;
 	/* Allow/Prevent the duplicate rules pattern. */
+	unsigned int std_delay_drop:1; /* Enable standard Rxq delay drop. */
+	unsigned int hp_delay_drop:1; /* Enable hairpin Rxq delay drop. */
 	struct {
 		unsigned int enabled:1; /* Whether MPRQ is enabled. */
 		unsigned int stride_num_n; /* Number of strides. */
diff --git a/drivers/net/mlx5/mlx5_devx.c b/drivers/net/mlx5/mlx5_devx.c
index a9f9f4af70..e46f79124d 100644
--- a/drivers/net/mlx5/mlx5_devx.c
+++ b/drivers/net/mlx5/mlx5_devx.c
@@ -277,6 +277,7 @@ mlx5_rxq_create_devx_rq_resources(struct mlx5_rxq_priv *rxq)
 						MLX5_WQ_END_PAD_MODE_NONE;
 	rq_attr.wq_attr.pd = cdev->pdn;
 	rq_attr.counter_set_id = priv->counter_set_id;
+	rq_attr.delay_drop_en = rxq_data->delay_drop;
 	rq_attr.user_index = rte_cpu_to_be_16(priv->dev_data->port_id);
 	if (rxq_data->shared) /* Create RMP based RQ. */
 		rxq->devx_rq.rmp = &rxq_ctrl->obj->devx_rmp;
@@ -439,6 +440,8 @@ mlx5_rxq_obj_hairpin_new(struct mlx5_rxq_priv *rxq)
 			attr.wq_attr.log_hairpin_data_sz -
 			MLX5_HAIRPIN_QUEUE_STRIDE;
 	attr.counter_set_id = priv->counter_set_id;
+	rxq_ctrl->rxq.delay_drop = priv->config.hp_delay_drop;
+	attr.delay_drop_en = priv->config.hp_delay_drop;
 	tmpl->rq = mlx5_devx_cmd_create_rq(priv->sh->cdev->ctx, &attr,
 					   rxq_ctrl->socket);
 	if (!tmpl->rq) {
@@ -496,6 +499,7 @@ mlx5_rxq_devx_obj_new(struct mlx5_rxq_priv *rxq)
 		DRV_LOG(ERR, "Failed to create CQ.");
 		goto error;
 	}
+	rxq_data->delay_drop = priv->config.std_delay_drop;
 	/* Create RQ using DevX API. */
 	ret = mlx5_rxq_create_devx_rq_resources(rxq);
 	if (ret) {
@@ -941,6 +945,7 @@ mlx5_rxq_devx_obj_drop_create(struct rte_eth_dev *dev)
 			dev->data->port_id);
 		goto error;
 	}
+	rxq_ctrl->rxq.delay_drop = 0;
 	/* Create RQ using DevX API. */
 	ret = mlx5_rxq_create_devx_rq_resources(rxq);
 	if (ret != 0) {
diff --git a/drivers/net/mlx5/mlx5_rx.h b/drivers/net/mlx5/mlx5_rx.h
index eda6eca8de..3b797e577a 100644
--- a/drivers/net/mlx5/mlx5_rx.h
+++ b/drivers/net/mlx5/mlx5_rx.h
@@ -97,6 +97,7 @@ struct mlx5_rxq_data {
 	unsigned int dynf_meta:1; /* Dynamic metadata is configured. */
 	unsigned int mcqe_format:3; /* CQE compression format. */
 	unsigned int shared:1; /* Shared RXQ. */
+	unsigned int delay_drop:1; /* Enable delay drop. */
 	volatile uint32_t *rq_db;
 	volatile uint32_t *cq_db;
 	uint16_t port_id;
-- 
2.27.0



* [dpdk-dev] [PATCH v5 2/2] net/mlx5: check delay drop settings in kernel driver
  2021-11-05 13:36 ` [dpdk-dev] [PATCH v5 " Bing Zhao
  2021-11-05 13:36   ` [dpdk-dev] [PATCH v5 1/2] net/mlx5: add support for Rx queue delay drop Bing Zhao
@ 2021-11-05 13:36   ` Bing Zhao
  1 sibling, 0 replies; 29+ messages in thread
From: Bing Zhao @ 2021-11-05 13:36 UTC (permalink / raw)
  To: viacheslavo, matan; +Cc: dev, rasland, thomas, orika

Delay drop is a common feature managed on a per-device basis, and the
kernel driver is responsible for its initialization and rearming.

By default, the timeout value is set to activate the delay drop when
the driver is loaded.

A private flag "dropless_rq" is used to control the rearming. Only
when it is on is the rearming handled once a timeout event is
received. Otherwise, the delay drop is deactivated after the first
timeout occurs and none of the Rx queues will have this feature.

The PMD queries this flag and warns the application when some queues
are created with delay drop but the flag is off.

Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
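Note: the private flag lookup described above (find "dropless_rq" in
the list of flag names, then test the bit with the same index) can be
sketched without any ioctl as below. This is an illustration under
assumptions: "priv_flag_index()" and "priv_flag_is_set()" are
hypothetical helpers, GSTRING_LEN copies the value of ETH_GSTRING_LEN,
and the sketch uses a bounded strncmp() instead of terminating each
slot in place as the patch does.

```c
#include <stdint.h>
#include <string.h>

#define GSTRING_LEN 32 /* Same value as ETH_GSTRING_LEN in <linux/ethtool.h>. */

/* Return the index of "wanted" in a packed array of fixed-size names, or -1. */
static int
priv_flag_index(const char *names, uint32_t count, const char *wanted)
{
	uint32_t i;

	for (i = 0; i < count; i++) {
		const char *name = names + (size_t)i * GSTRING_LEN;

		/* Bounded compare: a slot may lack a terminating NUL. */
		if (strncmp(name, wanted, GSTRING_LEN) == 0)
			return (int)i;
	}
	return -1;
}

/* Return 1 or 0 for the flag state, or -1 if the flag name is unknown. */
static int
priv_flag_is_set(const char *names, uint32_t count, uint32_t flags,
		 const char *wanted)
{
	int idx = priv_flag_index(names, count, wanted);

	/* Only 32 private flags fit in the 32-bit flags word. */
	if (idx < 0 || idx >= 32)
		return -1;
	return !!(flags & (UINT32_C(1) << idx));
}
```

The three-way result matches the contract documented for
mlx5_get_flag_dropless_rq(): 1 when set, 0 when clear, negative when
the flag cannot be found.
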
 doc/guides/nics/mlx5.rst                  |  16 ++++
 doc/guides/rel_notes/release_21_11.rst    |   1 +
 drivers/net/mlx5/linux/mlx5_ethdev_os.c   | 111 ++++++++++++++++++++++
 drivers/net/mlx5/mlx5.h                   |   1 +
 drivers/net/mlx5/mlx5_trigger.c           |  18 ++++
 drivers/net/mlx5/windows/mlx5_ethdev_os.c |  17 ++++
 6 files changed, 164 insertions(+)

diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index 061a44c723..97d6c1227c 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -619,6 +619,22 @@ Driver options
   The packets being received will not be dropped immediately when the WQEs are
   exhausted in a Rx queue with delay drop enabled.
 
+  A timeout value is set in the driver to control the waiting time before
+  dropping a packet. Once the timer is expired, the delay drop will be
+  deactivated for all the Rx queues with this feature enabled. To re-activate
+  it, a rearming is needed and it is part of the kernel driver starting from
+  OFED 5.5.
+
+  To enable / disable the delay drop rearming, the private flag ``dropless_rq``
+  can be set and queried via ethtool:
+
+  - ethtool --set-priv-flags <netdev> dropless_rq on (/ off)
+  - ethtool --show-priv-flags <netdev>
+
+  The configuration flag is global per PF and can only be set on the PF. Once
+  it is on, all the VFs', SFs' and representors' Rx queues will share the timer
+  and rearming.
+
 - ``mprq_en`` parameter [int]
 
   A nonzero value enables configuring Multi-Packet Rx queues. Rx queue is
diff --git a/doc/guides/rel_notes/release_21_11.rst b/doc/guides/rel_notes/release_21_11.rst
index 92180bb4bd..9556aa8bd9 100644
--- a/doc/guides/rel_notes/release_21_11.rst
+++ b/doc/guides/rel_notes/release_21_11.rst
@@ -192,6 +192,7 @@ New Features
   * Added implicit mempool registration to avoid data path hiccups (opt-out).
   * Added NIC offloads for the PMD on Windows (TSO, VLAN strip, CRC keep).
   * Added socket direct mode bonding support.
+  * Added delay drop support for Rx queue.
 
 * **Updated Solarflare network PMD.**
 
diff --git a/drivers/net/mlx5/linux/mlx5_ethdev_os.c b/drivers/net/mlx5/linux/mlx5_ethdev_os.c
index 9d0e491d0c..c19825ee52 100644
--- a/drivers/net/mlx5/linux/mlx5_ethdev_os.c
+++ b/drivers/net/mlx5/linux/mlx5_ethdev_os.c
@@ -1630,3 +1630,114 @@ mlx5_get_mac(struct rte_eth_dev *dev, uint8_t (*mac)[RTE_ETHER_ADDR_LEN])
 	memcpy(mac, request.ifr_hwaddr.sa_data, RTE_ETHER_ADDR_LEN);
 	return 0;
 }
+
+/*
+ * Query dropless_rq private flag value provided by ETHTOOL.
+ *
+ * @param dev
+ *   Pointer to Ethernet device.
+ *
+ * @return
+ *   - 0 on success, flag is not set.
+ *   - 1 on success, flag is set.
+ *   - negative errno value otherwise and rte_errno is set.
+ */
+int mlx5_get_flag_dropless_rq(struct rte_eth_dev *dev)
+{
+	struct {
+		struct ethtool_sset_info hdr;
+		uint32_t buf[1];
+	} sset_info;
+	struct ethtool_drvinfo drvinfo;
+	struct ifreq ifr;
+	struct ethtool_gstrings *strings = NULL;
+	struct ethtool_value flags;
+	const int32_t flag_len = sizeof(flags.data) * CHAR_BIT;
+	int32_t str_sz;
+	int32_t len;
+	int32_t i;
+	int ret;
+
+	sset_info.hdr.cmd = ETHTOOL_GSSET_INFO;
+	sset_info.hdr.reserved = 0;
+	sset_info.hdr.sset_mask = 1ULL << ETH_SS_PRIV_FLAGS;
+	ifr.ifr_data = (caddr_t)&sset_info;
+	ret = mlx5_ifreq(dev, SIOCETHTOOL, &ifr);
+	if (!ret) {
+		const uint32_t *sset_lengths = sset_info.hdr.data;
+
+		len = sset_info.hdr.sset_mask ? sset_lengths[0] : 0;
+	} else if (ret == -EOPNOTSUPP) {
+		drvinfo.cmd = ETHTOOL_GDRVINFO;
+		ifr.ifr_data = (caddr_t)&drvinfo;
+		ret = mlx5_ifreq(dev, SIOCETHTOOL, &ifr);
+		if (ret) {
+			DRV_LOG(WARNING, "port %u cannot get the driver info",
+				dev->data->port_id);
+			goto exit;
+		}
+		len = *(uint32_t *)((char *)&drvinfo +
+			offsetof(struct ethtool_drvinfo, n_priv_flags));
+	} else {
+		DRV_LOG(WARNING, "port %u cannot get the sset info",
+			dev->data->port_id);
+		goto exit;
+	}
+	if (!len) {
+		DRV_LOG(WARNING, "port %u does not have private flag",
+			dev->data->port_id);
+		rte_errno = EOPNOTSUPP;
+		ret = -rte_errno;
+		goto exit;
+	} else if (len > flag_len) {
+		DRV_LOG(WARNING, "port %u maximal private flags number is %d",
+			dev->data->port_id, flag_len);
+		len = flag_len;
+	}
+	str_sz = ETH_GSTRING_LEN * len;
+	strings = (struct ethtool_gstrings *)
+		  mlx5_malloc(0, str_sz + sizeof(struct ethtool_gstrings), 0,
+			      SOCKET_ID_ANY);
+	if (!strings) {
+		DRV_LOG(WARNING, "port %u unable to allocate memory for"
+			" private flags", dev->data->port_id);
+		rte_errno = ENOMEM;
+		ret = -rte_errno;
+		goto exit;
+	}
+	strings->cmd = ETHTOOL_GSTRINGS;
+	strings->string_set = ETH_SS_PRIV_FLAGS;
+	strings->len = len;
+	ifr.ifr_data = (caddr_t)strings;
+	ret = mlx5_ifreq(dev, SIOCETHTOOL, &ifr);
+	if (ret) {
+		DRV_LOG(WARNING, "port %u unable to get private flags strings",
+			dev->data->port_id);
+		goto exit;
+	}
+	for (i = 0; i < len; i++) {
+		strings->data[(i + 1) * ETH_GSTRING_LEN - 1] = 0;
+		if (!strcmp((const char *)strings->data + i * ETH_GSTRING_LEN,
+			     "dropless_rq"))
+			break;
+	}
+	if (i == len) {
+		DRV_LOG(WARNING, "port %u does not support dropless_rq",
+			dev->data->port_id);
+		rte_errno = EOPNOTSUPP;
+		ret = -rte_errno;
+		goto exit;
+	}
+	flags.cmd = ETHTOOL_GPFLAGS;
+	ifr.ifr_data = (caddr_t)&flags;
+	ret = mlx5_ifreq(dev, SIOCETHTOOL, &ifr);
+	if (ret) {
+		DRV_LOG(WARNING, "port %u unable to get private flags status",
+			dev->data->port_id);
+		goto exit;
+	}
+	ret = !!(flags.data & (1U << i));
+exit:
+	mlx5_free(strings);
+	return ret;
+}
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index b2022f3300..9307a4f95b 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -1602,6 +1602,7 @@ int mlx5_os_read_dev_stat(struct mlx5_priv *priv,
 int mlx5_os_read_dev_counters(struct rte_eth_dev *dev, uint64_t *stats);
 int mlx5_os_get_stats_n(struct rte_eth_dev *dev);
 void mlx5_os_stats_init(struct rte_eth_dev *dev);
+int mlx5_get_flag_dropless_rq(struct rte_eth_dev *dev);
 
 /* mlx5_mac.c */
 
diff --git a/drivers/net/mlx5/mlx5_trigger.c b/drivers/net/mlx5/mlx5_trigger.c
index a3e62e9533..0ecc530043 100644
--- a/drivers/net/mlx5/mlx5_trigger.c
+++ b/drivers/net/mlx5/mlx5_trigger.c
@@ -1129,6 +1129,24 @@ mlx5_dev_start(struct rte_eth_dev *dev)
 			dev->data->port_id, strerror(rte_errno));
 		goto error;
 	}
+	if (priv->config.std_delay_drop || priv->config.hp_delay_drop) {
+		if (!priv->config.vf && !priv->config.sf &&
+		    !priv->representor) {
+			ret = mlx5_get_flag_dropless_rq(dev);
+			if (ret < 0)
+				DRV_LOG(WARNING,
+					"port %u cannot query dropless flag",
+					dev->data->port_id);
+			else if (!ret)
+				DRV_LOG(WARNING,
+					"port %u dropless_rq OFF, no rearming",
+					dev->data->port_id);
+		} else {
+			DRV_LOG(DEBUG,
+				"port %u doesn't support dropless_rq flag",
+				dev->data->port_id);
+		}
+	}
 	ret = mlx5_rxq_start(dev);
 	if (ret) {
 		DRV_LOG(ERR, "port %u Rx queue allocation failed: %s",
diff --git a/drivers/net/mlx5/windows/mlx5_ethdev_os.c b/drivers/net/mlx5/windows/mlx5_ethdev_os.c
index fddc7a6b12..359f73df7c 100644
--- a/drivers/net/mlx5/windows/mlx5_ethdev_os.c
+++ b/drivers/net/mlx5/windows/mlx5_ethdev_os.c
@@ -389,3 +389,20 @@ mlx5_is_removed(struct rte_eth_dev *dev)
 		return 1;
 	return 0;
 }
+
+/*
+ * Query dropless_rq private flag value provided by ETHTOOL.
+ *
+ * @param dev
+ *   Pointer to Ethernet device.
+ *
+ * @return
+ *   - 0 on success, flag is not set.
+ *   - 1 on success, flag is set.
+ *   - negative errno value otherwise and rte_errno is set.
+ */
+int mlx5_get_flag_dropless_rq(struct rte_eth_dev *dev)
+{
+	RTE_SET_USED(dev);
+	return -ENOTSUP;
+}
-- 
2.27.0



* [dpdk-dev] [PATCH v6 0/2] Add delay drop support for Rx queue
  2021-11-04 11:26 [dpdk-dev] [PATCH 0/4] Add delay drop support for Rx queue Bing Zhao
                   ` (7 preceding siblings ...)
  2021-11-05 13:36 ` [dpdk-dev] [PATCH v5 " Bing Zhao
@ 2021-11-05 14:28 ` Bing Zhao
  2021-11-05 14:28   ` [dpdk-dev] [PATCH v6 1/2] net/mlx5: add support for Rx queue delay drop Bing Zhao
  2021-11-05 14:28   ` [dpdk-dev] [PATCH v6 2/2] net/mlx5: check delay drop settings in kernel driver Bing Zhao
  2021-11-05 15:30 ` [dpdk-dev] [PATCH v7 0/2] Add delay drop support for Rx queue Bing Zhao
  9 siblings, 2 replies; 29+ messages in thread
From: Bing Zhao @ 2021-11-05 14:28 UTC (permalink / raw)
  To: viacheslavo, matan; +Cc: dev, rasland, thomas, orika

---
v2:
  - change hairpin queue delay drop to disable by default
  - combine the commits
  - fix Windows building
  - change the log print
v3: fix conflict and building
v4: code style update and commit log polishing
v5:
  - split and fix the document description
  - devarg name update
v6: document and macro name update
---

Bing Zhao (2):
  net/mlx5: add support for Rx queue delay drop
  net/mlx5: check delay drop settings in kernel driver

 doc/guides/nics/mlx5.rst                  |  27 ++++++
 doc/guides/rel_notes/release_21_11.rst    |   1 +
 drivers/common/mlx5/mlx5_devx_cmds.c      |   1 +
 drivers/common/mlx5/mlx5_devx_cmds.h      |   1 +
 drivers/net/mlx5/linux/mlx5_ethdev_os.c   | 111 ++++++++++++++++++++++
 drivers/net/mlx5/linux/mlx5_os.c          |  11 +++
 drivers/net/mlx5/mlx5.c                   |   7 ++
 drivers/net/mlx5/mlx5.h                   |  10 ++
 drivers/net/mlx5/mlx5_devx.c              |   5 +
 drivers/net/mlx5/mlx5_rx.h                |   1 +
 drivers/net/mlx5/mlx5_trigger.c           |  18 ++++
 drivers/net/mlx5/windows/mlx5_ethdev_os.c |  17 ++++
 12 files changed, 210 insertions(+)

-- 
2.27.0



* [dpdk-dev] [PATCH v6 1/2] net/mlx5: add support for Rx queue delay drop
  2021-11-05 14:28 ` [dpdk-dev] [PATCH v6 0/2] Add delay drop support for Rx queue Bing Zhao
@ 2021-11-05 14:28   ` Bing Zhao
  2021-11-05 14:28   ` [dpdk-dev] [PATCH v6 2/2] net/mlx5: check delay drop settings in kernel driver Bing Zhao
  1 sibling, 0 replies; 29+ messages in thread
From: Bing Zhao @ 2021-11-05 14:28 UTC (permalink / raw)
  To: viacheslavo, matan; +Cc: dev, rasland, thomas, orika

For Ethernet RQs, if all receive descriptors are exhausted, incoming
packets are dropped. This behavior prevents slow or malicious software
entities on the host from affecting the network. In hairpin cases,
even though no software is involved in forwarding packets from the Rx
to the Tx side, a hiccup in the hardware or back pressure from the Tx
side may still exhaust the descriptors. In certain scenarios it may be
preferable to configure the device to avoid such packet drops,
assuming the posting of descriptors will resume shortly.

To support this, a new devarg "delay_drop" is introduced. By default,
delay drop is disabled for both standard and hairpin Rx queues. The
value is used as a bit mask:
  - bit 0: enablement of standard Rx queue
  - bit 1: enablement of hairpin Rx queue
This attribute is applied to all Rx queues of a device.

The "rq_delay_drop" capability in the HCA_CAP is checked before
creating any queue. If the hardware capabilities do not support
this delay drop, all the Rx queues will still be created without
this attribute, and the devarg setting will be ignored even if it
is specified explicitly. A warning log is used to notify the
application when this occurs.

Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
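Note: the capability gate described in the commit message (the request
is dropped with a warning when "rq_delay_drop" is not reported) can be
sketched as below. This is an illustration under assumptions, not the
driver code: "struct delay_drop_req" and "delay_drop_apply_cap()" are
hypothetical names, and the boolean argument stands for the capability
bit read from HCA_CAP.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the two delay drop fields of the device config. */
struct delay_drop_req {
	unsigned int std_delay_drop:1;
	unsigned int hp_delay_drop:1;
};

/*
 * Clear the request when the capability bit is not reported.
 * Returns true when a requested setting was discarded, so that the
 * caller can emit the warning.
 */
static bool
delay_drop_apply_cap(struct delay_drop_req *req, bool rq_delay_drop_cap)
{
	if (!req->std_delay_drop && !req->hp_delay_drop)
		return false; /* Nothing was requested. */
	if (rq_delay_drop_cap)
		return false; /* Supported: keep the request as is. */
	req->std_delay_drop = 0;
	req->hp_delay_drop = 0;
	return true;
}
```

Because the request is cleared once at probe time, queue creation code
only needs to copy the remaining flags and never has to recheck the
capability.
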
 doc/guides/nics/mlx5.rst               | 27 ++++++++++++++++++++++++++
 doc/guides/rel_notes/release_21_11.rst |  1 +
 drivers/common/mlx5/mlx5_devx_cmds.c   |  1 +
 drivers/common/mlx5/mlx5_devx_cmds.h   |  1 +
 drivers/net/mlx5/linux/mlx5_os.c       | 11 +++++++++++
 drivers/net/mlx5/mlx5.c                |  7 +++++++
 drivers/net/mlx5/mlx5.h                |  9 +++++++++
 drivers/net/mlx5/mlx5_devx.c           |  5 +++++
 drivers/net/mlx5/mlx5_rx.h             |  1 +
 9 files changed, 63 insertions(+)

diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index 824971d89a..82dda457c0 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -114,6 +114,7 @@ Features
 - Sub-Function representors.
 - Sub-Function.
 - Shared Rx queue.
+- Rx queue delay drop.
 
 
 Limitations
@@ -608,6 +609,32 @@ Driver options
   - POWER8 and ARMv8 with ConnectX-4 Lx, ConnectX-5, ConnectX-6, ConnectX-6 Dx,
     ConnectX-6 Lx, BlueField and BlueField-2.
 
+- ``delay_drop`` parameter [int]
+
+  Bitmask value for the Rx queue delay drop attribute. Bit 0 is used for the
+  standard Rx queue and bit 1 is used for the hairpin Rx queue. By default, the
+  delay drop is disabled for all Rx queues. It will be ignored if the port does
+  not support the attribute even if it is enabled explicitly.
+
+  The packets being received will not be dropped immediately when the WQEs are
+  exhausted in a Rx queue with delay drop enabled.
+
+  A timeout value is set in the driver to control the waiting time before
+  dropping a packet. Once the timer is expired, the delay drop will be
+  deactivated for all the Rx queues with this feature enabled. To re-activate
+  it, a rearming is needed and it is part of the kernel driver starting from
+  OFED 5.5.
+
+  To enable / disable the delay drop rearming, the private flag ``dropless_rq``
+  can be set and queried via ethtool:
+
+  - ethtool --set-priv-flags <netdev> dropless_rq on (/ off)
+  - ethtool --show-priv-flags <netdev>
+
+  The configuration flag is global per PF and can only be set on the PF. Once
+  it is on, all the VFs', SFs' and representors' Rx queues will share the timer
+  and rearming.
+
 - ``mprq_en`` parameter [int]
 
   A nonzero value enables configuring Multi-Packet Rx queues. Rx queue is
diff --git a/doc/guides/rel_notes/release_21_11.rst b/doc/guides/rel_notes/release_21_11.rst
index 92180bb4bd..9556aa8bd9 100644
--- a/doc/guides/rel_notes/release_21_11.rst
+++ b/doc/guides/rel_notes/release_21_11.rst
@@ -192,6 +192,7 @@ New Features
   * Added implicit mempool registration to avoid data path hiccups (opt-out).
   * Added NIC offloads for the PMD on Windows (TSO, VLAN strip, CRC keep).
   * Added socket direct mode bonding support.
+  * Added delay drop support for Rx queue.
 
 * **Updated Solarflare network PMD.**
 
diff --git a/drivers/common/mlx5/mlx5_devx_cmds.c b/drivers/common/mlx5/mlx5_devx_cmds.c
index fca1470be7..49db07facc 100644
--- a/drivers/common/mlx5/mlx5_devx_cmds.c
+++ b/drivers/common/mlx5/mlx5_devx_cmds.c
@@ -965,6 +965,7 @@ mlx5_devx_cmd_query_hca_attr(void *ctx,
 	attr->ct_offload = !!(MLX5_GET64(cmd_hca_cap, hcattr,
 					 general_obj_types) &
 			      MLX5_GENERAL_OBJ_TYPES_CAP_CONN_TRACK_OFFLOAD);
+	attr->rq_delay_drop = MLX5_GET(cmd_hca_cap, hcattr, rq_delay_drop);
 	if (attr->qos.sup) {
 		hcattr = mlx5_devx_get_hca_cap(ctx, in, out, &rc,
 				MLX5_GET_HCA_CAP_OP_MOD_QOS_CAP |
diff --git a/drivers/common/mlx5/mlx5_devx_cmds.h b/drivers/common/mlx5/mlx5_devx_cmds.h
index 344cd7bbf3..447f76f1f9 100644
--- a/drivers/common/mlx5/mlx5_devx_cmds.h
+++ b/drivers/common/mlx5/mlx5_devx_cmds.h
@@ -178,6 +178,7 @@ struct mlx5_hca_attr {
 	uint32_t swp_csum:1;
 	uint32_t swp_lso:1;
 	uint32_t lro_max_msg_sz_mode:2;
+	uint32_t rq_delay_drop:1;
 	uint32_t lro_timer_supported_periods[MLX5_LRO_NUM_SUPP_PERIODS];
 	uint16_t lro_min_mss_size;
 	uint32_t flex_parser_protocols;
diff --git a/drivers/net/mlx5/linux/mlx5_os.c b/drivers/net/mlx5/linux/mlx5_os.c
index e0304b685e..de880ee4c9 100644
--- a/drivers/net/mlx5/linux/mlx5_os.c
+++ b/drivers/net/mlx5/linux/mlx5_os.c
@@ -1508,6 +1508,15 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev,
 		goto error;
 #endif
 	}
+	if (config->std_delay_drop || config->hp_delay_drop) {
+		if (!config->hca_attr.rq_delay_drop) {
+			config->std_delay_drop = 0;
+			config->hp_delay_drop = 0;
+			DRV_LOG(WARNING,
+				"dev_port-%u: Rxq delay drop is not supported",
+				priv->dev_port);
+		}
+	}
 	if (sh->devx) {
 		uint32_t reg[MLX5_ST_SZ_DW(register_mtutc)];
 
@@ -2077,6 +2086,8 @@ mlx5_os_config_default(struct mlx5_dev_config *config)
 	config->decap_en = 1;
 	config->log_hp_size = MLX5_ARG_UNSET;
 	config->allow_duplicate_pattern = 1;
+	config->std_delay_drop = 0;
+	config->hp_delay_drop = 0;
 }
 
 /**
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 8614b8ffdd..9c8d1cc76f 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -183,6 +183,9 @@
 /* Device parameter to configure implicit registration of mempool memory. */
 #define MLX5_MR_MEMPOOL_REG_EN "mr_mempool_reg_en"
 
+/* Device parameter to configure the delay drop when creating Rxqs. */
+#define MLX5_DELAY_DROP "delay_drop"
+
 /* Shared memory between primary and secondary processes. */
 struct mlx5_shared_data *mlx5_shared_data;
 
@@ -2091,6 +2094,9 @@ mlx5_args_check(const char *key, const char *val, void *opaque)
 		config->decap_en = !!tmp;
 	} else if (strcmp(MLX5_ALLOW_DUPLICATE_PATTERN, key) == 0) {
 		config->allow_duplicate_pattern = !!tmp;
+	} else if (strcmp(MLX5_DELAY_DROP, key) == 0) {
+		config->std_delay_drop = !!(tmp & MLX5_DELAY_DROP_STANDARD);
+		config->hp_delay_drop = !!(tmp & MLX5_DELAY_DROP_HAIRPIN);
 	} else {
 		DRV_LOG(WARNING, "%s: unknown parameter", key);
 		rte_errno = EINVAL;
@@ -2153,6 +2159,7 @@ mlx5_args(struct mlx5_dev_config *config, struct rte_devargs *devargs)
 		MLX5_DECAP_EN,
 		MLX5_ALLOW_DUPLICATE_PATTERN,
 		MLX5_MR_MEMPOOL_REG_EN,
+		MLX5_DELAY_DROP,
 		NULL,
 	};
 	struct rte_kvargs *kvlist;
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 51f4578838..b2022f3300 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -99,6 +99,13 @@ enum mlx5_flow_type {
 	MLX5_FLOW_TYPE_MAXI,
 };
 
+/* The mode of delay drop for Rx queues. */
+enum mlx5_delay_drop_mode {
+	MLX5_DELAY_DROP_NONE = 0, /* All disabled. */
+	MLX5_DELAY_DROP_STANDARD = RTE_BIT32(0), /* Standard queues enable. */
+	MLX5_DELAY_DROP_HAIRPIN = RTE_BIT32(1), /* Hairpin queues enable. */
+};
+
 /* Hlist and list callback context. */
 struct mlx5_flow_cb_ctx {
 	struct rte_eth_dev *dev;
@@ -264,6 +271,8 @@ struct mlx5_dev_config {
 	unsigned int dv_miss_info:1; /* restore packet after partial hw miss */
 	unsigned int allow_duplicate_pattern:1;
 	/* Allow/Prevent the duplicate rules pattern. */
+	unsigned int std_delay_drop:1; /* Enable standard Rxq delay drop. */
+	unsigned int hp_delay_drop:1; /* Enable hairpin Rxq delay drop. */
 	struct {
 		unsigned int enabled:1; /* Whether MPRQ is enabled. */
 		unsigned int stride_num_n; /* Number of strides. */
diff --git a/drivers/net/mlx5/mlx5_devx.c b/drivers/net/mlx5/mlx5_devx.c
index a9f9f4af70..e46f79124d 100644
--- a/drivers/net/mlx5/mlx5_devx.c
+++ b/drivers/net/mlx5/mlx5_devx.c
@@ -277,6 +277,7 @@ mlx5_rxq_create_devx_rq_resources(struct mlx5_rxq_priv *rxq)
 						MLX5_WQ_END_PAD_MODE_NONE;
 	rq_attr.wq_attr.pd = cdev->pdn;
 	rq_attr.counter_set_id = priv->counter_set_id;
+	rq_attr.delay_drop_en = rxq_data->delay_drop;
 	rq_attr.user_index = rte_cpu_to_be_16(priv->dev_data->port_id);
 	if (rxq_data->shared) /* Create RMP based RQ. */
 		rxq->devx_rq.rmp = &rxq_ctrl->obj->devx_rmp;
@@ -439,6 +440,8 @@ mlx5_rxq_obj_hairpin_new(struct mlx5_rxq_priv *rxq)
 			attr.wq_attr.log_hairpin_data_sz -
 			MLX5_HAIRPIN_QUEUE_STRIDE;
 	attr.counter_set_id = priv->counter_set_id;
+	rxq_ctrl->rxq.delay_drop = priv->config.hp_delay_drop;
+	attr.delay_drop_en = priv->config.hp_delay_drop;
 	tmpl->rq = mlx5_devx_cmd_create_rq(priv->sh->cdev->ctx, &attr,
 					   rxq_ctrl->socket);
 	if (!tmpl->rq) {
@@ -496,6 +499,7 @@ mlx5_rxq_devx_obj_new(struct mlx5_rxq_priv *rxq)
 		DRV_LOG(ERR, "Failed to create CQ.");
 		goto error;
 	}
+	rxq_data->delay_drop = priv->config.std_delay_drop;
 	/* Create RQ using DevX API. */
 	ret = mlx5_rxq_create_devx_rq_resources(rxq);
 	if (ret) {
@@ -941,6 +945,7 @@ mlx5_rxq_devx_obj_drop_create(struct rte_eth_dev *dev)
 			dev->data->port_id);
 		goto error;
 	}
+	rxq_ctrl->rxq.delay_drop = 0;
 	/* Create RQ using DevX API. */
 	ret = mlx5_rxq_create_devx_rq_resources(rxq);
 	if (ret != 0) {
diff --git a/drivers/net/mlx5/mlx5_rx.h b/drivers/net/mlx5/mlx5_rx.h
index eda6eca8de..3b797e577a 100644
--- a/drivers/net/mlx5/mlx5_rx.h
+++ b/drivers/net/mlx5/mlx5_rx.h
@@ -97,6 +97,7 @@ struct mlx5_rxq_data {
 	unsigned int dynf_meta:1; /* Dynamic metadata is configured. */
 	unsigned int mcqe_format:3; /* CQE compression format. */
 	unsigned int shared:1; /* Shared RXQ. */
+	unsigned int delay_drop:1; /* Enable delay drop. */
 	volatile uint32_t *rq_db;
 	volatile uint32_t *cq_db;
 	uint16_t port_id;
-- 
2.27.0



* [dpdk-dev] [PATCH v6 2/2] net/mlx5: check delay drop settings in kernel driver
  2021-11-05 14:28 ` [dpdk-dev] [PATCH v6 0/2] Add delay drop support for Rx queue Bing Zhao
  2021-11-05 14:28   ` [dpdk-dev] [PATCH v6 1/2] net/mlx5: add support for Rx queue delay drop Bing Zhao
@ 2021-11-05 14:28   ` Bing Zhao
  1 sibling, 0 replies; 29+ messages in thread
From: Bing Zhao @ 2021-11-05 14:28 UTC (permalink / raw)
  To: viacheslavo, matan; +Cc: dev, rasland, thomas, orika

Delay drop is a common feature managed on a per-device basis, and
the kernel driver is responsible for its initialization and
rearming.

By default, the timeout value is set to activate the delay drop when
the driver is loaded.

A private flag "dropless_rq" is used to control the rearming. Only
when it is on is the rearming handled once a timeout event is
received. Otherwise, the delay drop is deactivated after the first
timeout occurs and none of the Rx queues will have this feature.

The PMD tries to query this flag and warns the application when
some queues are created with delay drop but the flag is off.

Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
 drivers/net/mlx5/linux/mlx5_ethdev_os.c   | 111 ++++++++++++++++++++++
 drivers/net/mlx5/mlx5.h                   |   1 +
 drivers/net/mlx5/mlx5_trigger.c           |  18 ++++
 drivers/net/mlx5/windows/mlx5_ethdev_os.c |  17 ++++
 4 files changed, 147 insertions(+)

diff --git a/drivers/net/mlx5/linux/mlx5_ethdev_os.c b/drivers/net/mlx5/linux/mlx5_ethdev_os.c
index 9d0e491d0c..c19825ee52 100644
--- a/drivers/net/mlx5/linux/mlx5_ethdev_os.c
+++ b/drivers/net/mlx5/linux/mlx5_ethdev_os.c
@@ -1630,3 +1630,114 @@ mlx5_get_mac(struct rte_eth_dev *dev, uint8_t (*mac)[RTE_ETHER_ADDR_LEN])
 	memcpy(mac, request.ifr_hwaddr.sa_data, RTE_ETHER_ADDR_LEN);
 	return 0;
 }
+
+/*
+ * Query dropless_rq private flag value provided by ETHTOOL.
+ *
+ * @param dev
+ *   Pointer to Ethernet device.
+ *
+ * @return
+ *   - 0 on success, flag is not set.
+ *   - 1 on success, flag is set.
+ *   - negative errno value otherwise and rte_errno is set.
+ */
+int mlx5_get_flag_dropless_rq(struct rte_eth_dev *dev)
+{
+	struct {
+		struct ethtool_sset_info hdr;
+		uint32_t buf[1];
+	} sset_info;
+	struct ethtool_drvinfo drvinfo;
+	struct ifreq ifr;
+	struct ethtool_gstrings *strings = NULL;
+	struct ethtool_value flags;
+	const int32_t flag_len = sizeof(flags.data) * CHAR_BIT;
+	int32_t str_sz;
+	int32_t len;
+	int32_t i;
+	int ret;
+
+	sset_info.hdr.cmd = ETHTOOL_GSSET_INFO;
+	sset_info.hdr.reserved = 0;
+	sset_info.hdr.sset_mask = 1ULL << ETH_SS_PRIV_FLAGS;
+	ifr.ifr_data = (caddr_t)&sset_info;
+	ret = mlx5_ifreq(dev, SIOCETHTOOL, &ifr);
+	if (!ret) {
+		const uint32_t *sset_lengths = sset_info.hdr.data;
+
+		len = sset_info.hdr.sset_mask ? sset_lengths[0] : 0;
+	} else if (ret == -EOPNOTSUPP) {
+		drvinfo.cmd = ETHTOOL_GDRVINFO;
+		ifr.ifr_data = (caddr_t)&drvinfo;
+		ret = mlx5_ifreq(dev, SIOCETHTOOL, &ifr);
+		if (ret) {
+			DRV_LOG(WARNING, "port %u cannot get the driver info",
+				dev->data->port_id);
+			goto exit;
+		}
+		len = *(uint32_t *)((char *)&drvinfo +
+			offsetof(struct ethtool_drvinfo, n_priv_flags));
+	} else {
+		DRV_LOG(WARNING, "port %u cannot get the sset info",
+			dev->data->port_id);
+		goto exit;
+	}
+	if (!len) {
+		DRV_LOG(WARNING, "port %u does not have private flag",
+			dev->data->port_id);
+		rte_errno = EOPNOTSUPP;
+		ret = -rte_errno;
+		goto exit;
+	} else if (len > flag_len) {
+		DRV_LOG(WARNING, "port %u maximal private flags number is %d",
+			dev->data->port_id, flag_len);
+		len = flag_len;
+	}
+	str_sz = ETH_GSTRING_LEN * len;
+	strings = (struct ethtool_gstrings *)
+		  mlx5_malloc(0, str_sz + sizeof(struct ethtool_gstrings), 0,
+			      SOCKET_ID_ANY);
+	if (!strings) {
+		DRV_LOG(WARNING, "port %u unable to allocate memory for"
+			" private flags", dev->data->port_id);
+		rte_errno = ENOMEM;
+		ret = -rte_errno;
+		goto exit;
+	}
+	strings->cmd = ETHTOOL_GSTRINGS;
+	strings->string_set = ETH_SS_PRIV_FLAGS;
+	strings->len = len;
+	ifr.ifr_data = (caddr_t)strings;
+	ret = mlx5_ifreq(dev, SIOCETHTOOL, &ifr);
+	if (ret) {
+		DRV_LOG(WARNING, "port %u unable to get private flags strings",
+			dev->data->port_id);
+		goto exit;
+	}
+	for (i = 0; i < len; i++) {
+		strings->data[(i + 1) * ETH_GSTRING_LEN - 1] = 0;
+		if (!strcmp((const char *)strings->data + i * ETH_GSTRING_LEN,
+			     "dropless_rq"))
+			break;
+	}
+	if (i == len) {
+		DRV_LOG(WARNING, "port %u does not support dropless_rq",
+			dev->data->port_id);
+		rte_errno = EOPNOTSUPP;
+		ret = -rte_errno;
+		goto exit;
+	}
+	flags.cmd = ETHTOOL_GPFLAGS;
+	ifr.ifr_data = (caddr_t)&flags;
+	ret = mlx5_ifreq(dev, SIOCETHTOOL, &ifr);
+	if (ret) {
+		DRV_LOG(WARNING, "port %u unable to get private flags status",
+			dev->data->port_id);
+		goto exit;
+	}
+	ret = !!(flags.data & (1U << i));
+exit:
+	mlx5_free(strings);
+	return ret;
+}
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index b2022f3300..9307a4f95b 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -1602,6 +1602,7 @@ int mlx5_os_read_dev_stat(struct mlx5_priv *priv,
 int mlx5_os_read_dev_counters(struct rte_eth_dev *dev, uint64_t *stats);
 int mlx5_os_get_stats_n(struct rte_eth_dev *dev);
 void mlx5_os_stats_init(struct rte_eth_dev *dev);
+int mlx5_get_flag_dropless_rq(struct rte_eth_dev *dev);
 
 /* mlx5_mac.c */
 
diff --git a/drivers/net/mlx5/mlx5_trigger.c b/drivers/net/mlx5/mlx5_trigger.c
index a3e62e9533..0ecc530043 100644
--- a/drivers/net/mlx5/mlx5_trigger.c
+++ b/drivers/net/mlx5/mlx5_trigger.c
@@ -1129,6 +1129,24 @@ mlx5_dev_start(struct rte_eth_dev *dev)
 			dev->data->port_id, strerror(rte_errno));
 		goto error;
 	}
+	if (priv->config.std_delay_drop || priv->config.hp_delay_drop) {
+		if (!priv->config.vf && !priv->config.sf &&
+		    !priv->representor) {
+			ret = mlx5_get_flag_dropless_rq(dev);
+			if (ret < 0)
+				DRV_LOG(WARNING,
+					"port %u cannot query dropless flag",
+					dev->data->port_id);
+			else if (!ret)
+				DRV_LOG(WARNING,
+					"port %u dropless_rq OFF, no rearming",
+					dev->data->port_id);
+		} else {
+			DRV_LOG(DEBUG,
+				"port %u doesn't support dropless_rq flag",
+				dev->data->port_id);
+		}
+	}
 	ret = mlx5_rxq_start(dev);
 	if (ret) {
 		DRV_LOG(ERR, "port %u Rx queue allocation failed: %s",
diff --git a/drivers/net/mlx5/windows/mlx5_ethdev_os.c b/drivers/net/mlx5/windows/mlx5_ethdev_os.c
index fddc7a6b12..359f73df7c 100644
--- a/drivers/net/mlx5/windows/mlx5_ethdev_os.c
+++ b/drivers/net/mlx5/windows/mlx5_ethdev_os.c
@@ -389,3 +389,20 @@ mlx5_is_removed(struct rte_eth_dev *dev)
 		return 1;
 	return 0;
 }
+
+/*
+ * Query dropless_rq private flag value provided by ETHTOOL.
+ *
+ * @param dev
+ *   Pointer to Ethernet device.
+ *
+ * @return
+ *   - 0 on success, flag is not set.
+ *   - 1 on success, flag is set.
+ *   - negative errno value otherwise and rte_errno is set.
+ */
+int mlx5_get_flag_dropless_rq(struct rte_eth_dev *dev)
+{
+	RTE_SET_USED(dev);
+	return -ENOTSUP;
+}
-- 
2.27.0



* [dpdk-dev] [PATCH v7 0/2] Add delay drop support for Rx queue
  2021-11-04 11:26 [dpdk-dev] [PATCH 0/4] Add delay drop support for Rx queue Bing Zhao
                   ` (8 preceding siblings ...)
  2021-11-05 14:28 ` [dpdk-dev] [PATCH v6 0/2] Add delay drop support for Rx queue Bing Zhao
@ 2021-11-05 15:30 ` Bing Zhao
  2021-11-05 15:30   ` [dpdk-dev] [PATCH v7 1/2] net/mlx5: add support for Rx queue delay drop Bing Zhao
                     ` (2 more replies)
  9 siblings, 3 replies; 29+ messages in thread
From: Bing Zhao @ 2021-11-05 15:30 UTC (permalink / raw)
  To: viacheslavo, matan; +Cc: dev, rasland, thomas, orika

---
v2:
  - change hairpin queue delay drop to disable by default
  - combine the commits
  - fix Windows building
  - change the log print
v3: fix conflict and building
v4: code style update and commit log polishing
v5:
  - split and fix the document description
  - devarg name update
v6: document and macro name update
v7: revert the improper merge of mlx5.rst update
---

Bing Zhao (2):
  net/mlx5: add support for Rx queue delay drop
  net/mlx5: check delay drop settings in kernel driver

 doc/guides/nics/mlx5.rst                  |  27 ++++++
 doc/guides/rel_notes/release_21_11.rst    |   1 +
 drivers/common/mlx5/mlx5_devx_cmds.c      |   1 +
 drivers/common/mlx5/mlx5_devx_cmds.h      |   1 +
 drivers/net/mlx5/linux/mlx5_ethdev_os.c   | 111 ++++++++++++++++++++++
 drivers/net/mlx5/linux/mlx5_os.c          |  11 +++
 drivers/net/mlx5/mlx5.c                   |   7 ++
 drivers/net/mlx5/mlx5.h                   |  10 ++
 drivers/net/mlx5/mlx5_devx.c              |   5 +
 drivers/net/mlx5/mlx5_rx.h                |   1 +
 drivers/net/mlx5/mlx5_trigger.c           |  18 ++++
 drivers/net/mlx5/windows/mlx5_ethdev_os.c |  17 ++++
 12 files changed, 210 insertions(+)

-- 
2.27.0



* [dpdk-dev] [PATCH v7 1/2] net/mlx5: add support for Rx queue delay drop
  2021-11-05 15:30 ` [dpdk-dev] [PATCH v7 0/2] Add delay drop support for Rx queue Bing Zhao
@ 2021-11-05 15:30   ` Bing Zhao
  2021-11-05 15:30   ` [dpdk-dev] [PATCH v7 2/2] net/mlx5: check delay drop settings in kernel driver Bing Zhao
  2021-11-05 16:07   ` [dpdk-dev] [PATCH v7 0/2] Add delay drop support for Rx queue Ferruh Yigit
  2 siblings, 0 replies; 29+ messages in thread
From: Bing Zhao @ 2021-11-05 15:30 UTC (permalink / raw)
  To: viacheslavo, matan; +Cc: dev, rasland, thomas, orika

For Ethernet RQs, if all receive descriptors are exhausted, the
packets being received will be dropped. This behavior prevents slow
or malicious software entities at the host from affecting the
network. For hairpin queues, even though no software is involved in
forwarding packets from the Rx to the Tx side, a hiccup in the
hardware or back pressure from the Tx side may still cause the
descriptors to be exhausted. In certain scenarios it may be
preferred to configure the device to avoid such packet drops,
assuming the posting of descriptors will resume shortly.

To support this, a new devarg "delay_drop" is introduced. By default,
the delay drop is disabled for both standard and hairpin Rx queues.
The value is used as a bit mask:
  - bit 0: enablement of standard Rx queue
  - bit 1: enablement of hairpin Rx queue
This attribute is applied to all Rx queues of a device.

The "rq_delay_drop" capability in the HCA_CAP is checked before
creating any queue. If the hardware capabilities do not support
this delay drop, all the Rx queues will still be created without
this attribute, and the devarg setting will be ignored even if it
is specified explicitly. A warning log is used to notify the
application when this occurs.

Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
 doc/guides/nics/mlx5.rst               | 11 +++++++++++
 doc/guides/rel_notes/release_21_11.rst |  1 +
 drivers/common/mlx5/mlx5_devx_cmds.c   |  1 +
 drivers/common/mlx5/mlx5_devx_cmds.h   |  1 +
 drivers/net/mlx5/linux/mlx5_os.c       | 11 +++++++++++
 drivers/net/mlx5/mlx5.c                |  7 +++++++
 drivers/net/mlx5/mlx5.h                |  9 +++++++++
 drivers/net/mlx5/mlx5_devx.c           |  5 +++++
 drivers/net/mlx5/mlx5_rx.h             |  1 +
 9 files changed, 47 insertions(+)

diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index 824971d89a..0ecd4f8738 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -114,6 +114,7 @@ Features
 - Sub-Function representors.
 - Sub-Function.
 - Shared Rx queue.
+- Rx queue delay drop.
 
 
 Limitations
@@ -608,6 +609,16 @@ Driver options
   - POWER8 and ARMv8 with ConnectX-4 Lx, ConnectX-5, ConnectX-6, ConnectX-6 Dx,
     ConnectX-6 Lx, BlueField and BlueField-2.
 
+- ``delay_drop`` parameter [int]
+
+  Bitmask value for the Rx queue delay drop attribute. Bit 0 is used for the
+  standard Rx queue and bit 1 is used for the hairpin Rx queue. By default, the
+  delay drop is disabled for all Rx queues. It will be ignored if the port does
+  not support the attribute even if it is enabled explicitly.
+
+  The packets being received will not be dropped immediately when the WQEs are
+  exhausted in a Rx queue with delay drop enabled.
+
 - ``mprq_en`` parameter [int]
 
   A nonzero value enables configuring Multi-Packet Rx queues. Rx queue is
diff --git a/doc/guides/rel_notes/release_21_11.rst b/doc/guides/rel_notes/release_21_11.rst
index 92180bb4bd..9556aa8bd9 100644
--- a/doc/guides/rel_notes/release_21_11.rst
+++ b/doc/guides/rel_notes/release_21_11.rst
@@ -192,6 +192,7 @@ New Features
   * Added implicit mempool registration to avoid data path hiccups (opt-out).
   * Added NIC offloads for the PMD on Windows (TSO, VLAN strip, CRC keep).
   * Added socket direct mode bonding support.
+  * Added delay drop support for Rx queue.
 
 * **Updated Solarflare network PMD.**
 
diff --git a/drivers/common/mlx5/mlx5_devx_cmds.c b/drivers/common/mlx5/mlx5_devx_cmds.c
index fca1470be7..49db07facc 100644
--- a/drivers/common/mlx5/mlx5_devx_cmds.c
+++ b/drivers/common/mlx5/mlx5_devx_cmds.c
@@ -965,6 +965,7 @@ mlx5_devx_cmd_query_hca_attr(void *ctx,
 	attr->ct_offload = !!(MLX5_GET64(cmd_hca_cap, hcattr,
 					 general_obj_types) &
 			      MLX5_GENERAL_OBJ_TYPES_CAP_CONN_TRACK_OFFLOAD);
+	attr->rq_delay_drop = MLX5_GET(cmd_hca_cap, hcattr, rq_delay_drop);
 	if (attr->qos.sup) {
 		hcattr = mlx5_devx_get_hca_cap(ctx, in, out, &rc,
 				MLX5_GET_HCA_CAP_OP_MOD_QOS_CAP |
diff --git a/drivers/common/mlx5/mlx5_devx_cmds.h b/drivers/common/mlx5/mlx5_devx_cmds.h
index 344cd7bbf3..447f76f1f9 100644
--- a/drivers/common/mlx5/mlx5_devx_cmds.h
+++ b/drivers/common/mlx5/mlx5_devx_cmds.h
@@ -178,6 +178,7 @@ struct mlx5_hca_attr {
 	uint32_t swp_csum:1;
 	uint32_t swp_lso:1;
 	uint32_t lro_max_msg_sz_mode:2;
+	uint32_t rq_delay_drop:1;
 	uint32_t lro_timer_supported_periods[MLX5_LRO_NUM_SUPP_PERIODS];
 	uint16_t lro_min_mss_size;
 	uint32_t flex_parser_protocols;
diff --git a/drivers/net/mlx5/linux/mlx5_os.c b/drivers/net/mlx5/linux/mlx5_os.c
index e0304b685e..de880ee4c9 100644
--- a/drivers/net/mlx5/linux/mlx5_os.c
+++ b/drivers/net/mlx5/linux/mlx5_os.c
@@ -1508,6 +1508,15 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev,
 		goto error;
 #endif
 	}
+	if (config->std_delay_drop || config->hp_delay_drop) {
+		if (!config->hca_attr.rq_delay_drop) {
+			config->std_delay_drop = 0;
+			config->hp_delay_drop = 0;
+			DRV_LOG(WARNING,
+				"dev_port-%u: Rxq delay drop is not supported",
+				priv->dev_port);
+		}
+	}
 	if (sh->devx) {
 		uint32_t reg[MLX5_ST_SZ_DW(register_mtutc)];
 
@@ -2077,6 +2086,8 @@ mlx5_os_config_default(struct mlx5_dev_config *config)
 	config->decap_en = 1;
 	config->log_hp_size = MLX5_ARG_UNSET;
 	config->allow_duplicate_pattern = 1;
+	config->std_delay_drop = 0;
+	config->hp_delay_drop = 0;
 }
 
 /**
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 8614b8ffdd..9c8d1cc76f 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -183,6 +183,9 @@
 /* Device parameter to configure implicit registration of mempool memory. */
 #define MLX5_MR_MEMPOOL_REG_EN "mr_mempool_reg_en"
 
+/* Device parameter to configure the delay drop when creating Rxqs. */
+#define MLX5_DELAY_DROP "delay_drop"
+
 /* Shared memory between primary and secondary processes. */
 struct mlx5_shared_data *mlx5_shared_data;
 
@@ -2091,6 +2094,9 @@ mlx5_args_check(const char *key, const char *val, void *opaque)
 		config->decap_en = !!tmp;
 	} else if (strcmp(MLX5_ALLOW_DUPLICATE_PATTERN, key) == 0) {
 		config->allow_duplicate_pattern = !!tmp;
+	} else if (strcmp(MLX5_DELAY_DROP, key) == 0) {
+		config->std_delay_drop = !!(tmp & MLX5_DELAY_DROP_STANDARD);
+		config->hp_delay_drop = !!(tmp & MLX5_DELAY_DROP_HAIRPIN);
 	} else {
 		DRV_LOG(WARNING, "%s: unknown parameter", key);
 		rte_errno = EINVAL;
@@ -2153,6 +2159,7 @@ mlx5_args(struct mlx5_dev_config *config, struct rte_devargs *devargs)
 		MLX5_DECAP_EN,
 		MLX5_ALLOW_DUPLICATE_PATTERN,
 		MLX5_MR_MEMPOOL_REG_EN,
+		MLX5_DELAY_DROP,
 		NULL,
 	};
 	struct rte_kvargs *kvlist;
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 51f4578838..b2022f3300 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -99,6 +99,13 @@ enum mlx5_flow_type {
 	MLX5_FLOW_TYPE_MAXI,
 };
 
+/* The mode of delay drop for Rx queues. */
+enum mlx5_delay_drop_mode {
+	MLX5_DELAY_DROP_NONE = 0, /* All disabled. */
+	MLX5_DELAY_DROP_STANDARD = RTE_BIT32(0), /* Standard queues enable. */
+	MLX5_DELAY_DROP_HAIRPIN = RTE_BIT32(1), /* Hairpin queues enable. */
+};
+
 /* Hlist and list callback context. */
 struct mlx5_flow_cb_ctx {
 	struct rte_eth_dev *dev;
@@ -264,6 +271,8 @@ struct mlx5_dev_config {
 	unsigned int dv_miss_info:1; /* restore packet after partial hw miss */
 	unsigned int allow_duplicate_pattern:1;
 	/* Allow/Prevent the duplicate rules pattern. */
+	unsigned int std_delay_drop:1; /* Enable standard Rxq delay drop. */
+	unsigned int hp_delay_drop:1; /* Enable hairpin Rxq delay drop. */
 	struct {
 		unsigned int enabled:1; /* Whether MPRQ is enabled. */
 		unsigned int stride_num_n; /* Number of strides. */
diff --git a/drivers/net/mlx5/mlx5_devx.c b/drivers/net/mlx5/mlx5_devx.c
index a9f9f4af70..e46f79124d 100644
--- a/drivers/net/mlx5/mlx5_devx.c
+++ b/drivers/net/mlx5/mlx5_devx.c
@@ -277,6 +277,7 @@ mlx5_rxq_create_devx_rq_resources(struct mlx5_rxq_priv *rxq)
 						MLX5_WQ_END_PAD_MODE_NONE;
 	rq_attr.wq_attr.pd = cdev->pdn;
 	rq_attr.counter_set_id = priv->counter_set_id;
+	rq_attr.delay_drop_en = rxq_data->delay_drop;
 	rq_attr.user_index = rte_cpu_to_be_16(priv->dev_data->port_id);
 	if (rxq_data->shared) /* Create RMP based RQ. */
 		rxq->devx_rq.rmp = &rxq_ctrl->obj->devx_rmp;
@@ -439,6 +440,8 @@ mlx5_rxq_obj_hairpin_new(struct mlx5_rxq_priv *rxq)
 			attr.wq_attr.log_hairpin_data_sz -
 			MLX5_HAIRPIN_QUEUE_STRIDE;
 	attr.counter_set_id = priv->counter_set_id;
+	rxq_ctrl->rxq.delay_drop = priv->config.hp_delay_drop;
+	attr.delay_drop_en = priv->config.hp_delay_drop;
 	tmpl->rq = mlx5_devx_cmd_create_rq(priv->sh->cdev->ctx, &attr,
 					   rxq_ctrl->socket);
 	if (!tmpl->rq) {
@@ -496,6 +499,7 @@ mlx5_rxq_devx_obj_new(struct mlx5_rxq_priv *rxq)
 		DRV_LOG(ERR, "Failed to create CQ.");
 		goto error;
 	}
+	rxq_data->delay_drop = priv->config.std_delay_drop;
 	/* Create RQ using DevX API. */
 	ret = mlx5_rxq_create_devx_rq_resources(rxq);
 	if (ret) {
@@ -941,6 +945,7 @@ mlx5_rxq_devx_obj_drop_create(struct rte_eth_dev *dev)
 			dev->data->port_id);
 		goto error;
 	}
+	rxq_ctrl->rxq.delay_drop = 0;
 	/* Create RQ using DevX API. */
 	ret = mlx5_rxq_create_devx_rq_resources(rxq);
 	if (ret != 0) {
diff --git a/drivers/net/mlx5/mlx5_rx.h b/drivers/net/mlx5/mlx5_rx.h
index eda6eca8de..3b797e577a 100644
--- a/drivers/net/mlx5/mlx5_rx.h
+++ b/drivers/net/mlx5/mlx5_rx.h
@@ -97,6 +97,7 @@ struct mlx5_rxq_data {
 	unsigned int dynf_meta:1; /* Dynamic metadata is configured. */
 	unsigned int mcqe_format:3; /* CQE compression format. */
 	unsigned int shared:1; /* Shared RXQ. */
+	unsigned int delay_drop:1; /* Enable delay drop. */
 	volatile uint32_t *rq_db;
 	volatile uint32_t *cq_db;
 	uint16_t port_id;
-- 
2.27.0



* [dpdk-dev] [PATCH v7 2/2] net/mlx5: check delay drop settings in kernel driver
  2021-11-05 15:30 ` [dpdk-dev] [PATCH v7 0/2] Add delay drop support for Rx queue Bing Zhao
  2021-11-05 15:30   ` [dpdk-dev] [PATCH v7 1/2] net/mlx5: add support for Rx queue delay drop Bing Zhao
@ 2021-11-05 15:30   ` Bing Zhao
  2021-11-05 16:07   ` [dpdk-dev] [PATCH v7 0/2] Add delay drop support for Rx queue Ferruh Yigit
  2 siblings, 0 replies; 29+ messages in thread
From: Bing Zhao @ 2021-11-05 15:30 UTC (permalink / raw)
  To: viacheslavo, matan; +Cc: dev, rasland, thomas, orika

Delay drop is a common feature managed on a per-device basis, and
the kernel driver is responsible for its initialization and
rearming.

By default, the timeout value is set to activate the delay drop when
the driver is loaded.

A private flag "dropless_rq" is used to control the rearming. Only
when it is on is the rearming handled once a timeout event is
received. Otherwise, the delay drop is deactivated after the first
timeout occurs and none of the Rx queues will have this feature.

The PMD tries to query this flag and warns the application when
some queues are created with delay drop but the flag is off.
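
As a rough sketch of the query logic, the lookup reduces to finding
the flag name in a table of fixed-width strings and testing the bit at
the same index in the flags bitmap. The helper below is hypothetical
(priv_flag_lookup is not a PMD or kernel function) and leaves out the
ioctl calls; it only assumes the 32-byte string width used by ethtool.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define GSTRING_LEN 32 /* Same width as ETH_GSTRING_LEN in linux/ethtool.h. */

/*
 * Return 1 if the named flag is set, 0 if it is clear, and -1 if the
 * name is absent. "table" holds "n" names of GSTRING_LEN bytes each,
 * stored back to back; bit i of "bitmap" belongs to the i-th name.
 */
static int
priv_flag_lookup(const char *table, uint32_t n, uint32_t bitmap,
		 const char *name)
{
	uint32_t i;

	assert(table != NULL && name != NULL);
	if (n > 32)
		n = 32; /* The bitmap carries at most 32 flags. */
	for (i = 0; i < n; i++) {
		const char *entry = table + (size_t)i * GSTRING_LEN;

		/* Bounded compare: an entry may use all bytes with no NUL. */
		if (strncmp(entry, name, GSTRING_LEN) == 0)
			return !!(bitmap & (UINT32_C(1) << i));
	}
	return -1;
}
```

A caller would pass the string table and bitmap obtained from the
kernel and treat -1 as "flag not supported", which is the case the
patch reports with a warning.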

Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
---
 doc/guides/nics/mlx5.rst                  |  16 ++++
 drivers/net/mlx5/linux/mlx5_ethdev_os.c   | 111 ++++++++++++++++++++++
 drivers/net/mlx5/mlx5.h                   |   1 +
 drivers/net/mlx5/mlx5_trigger.c           |  18 ++++
 drivers/net/mlx5/windows/mlx5_ethdev_os.c |  17 ++++
 5 files changed, 163 insertions(+)

diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index 0ecd4f8738..82dda457c0 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -619,6 +619,22 @@ Driver options
   The packets being received will not be dropped immediately when the WQEs are
   exhausted in a Rx queue with delay drop enabled.
 
+  A timeout value is set in the driver to control the waiting time before
+  dropping a packet. Once the timer expires, the delay drop will be
+  deactivated for all the Rx queues with this feature enabled. To re-activate
+  it, a rearming is needed; this is handled by the kernel driver starting from
+  OFED 5.5.
+
+  To enable / disable the delay drop rearming, the private flag ``dropless_rq``
+  can be set and queried via ethtool:
+
+  - ethtool --set-priv-flags <netdev> dropless_rq on (/ off)
+  - ethtool --show-priv-flags <netdev>
+
+  The configuration flag is global per PF and can only be set on the PF. Once
+  it is on, all the VFs', SFs' and representors' Rx queues will share the timer
+  and rearming.
+
 - ``mprq_en`` parameter [int]
 
   A nonzero value enables configuring Multi-Packet Rx queues. Rx queue is
diff --git a/drivers/net/mlx5/linux/mlx5_ethdev_os.c b/drivers/net/mlx5/linux/mlx5_ethdev_os.c
index 9d0e491d0c..c19825ee52 100644
--- a/drivers/net/mlx5/linux/mlx5_ethdev_os.c
+++ b/drivers/net/mlx5/linux/mlx5_ethdev_os.c
@@ -1630,3 +1630,114 @@ mlx5_get_mac(struct rte_eth_dev *dev, uint8_t (*mac)[RTE_ETHER_ADDR_LEN])
 	memcpy(mac, request.ifr_hwaddr.sa_data, RTE_ETHER_ADDR_LEN);
 	return 0;
 }
+
+/*
+ * Query dropless_rq private flag value provided by ETHTOOL.
+ *
+ * @param dev
+ *   Pointer to Ethernet device.
+ *
+ * @return
+ *   - 0 on success, flag is not set.
+ *   - 1 on success, flag is set.
+ *   - negative errno value otherwise and rte_errno is set.
+ */
+int mlx5_get_flag_dropless_rq(struct rte_eth_dev *dev)
+{
+	struct {
+		struct ethtool_sset_info hdr;
+		uint32_t buf[1];
+	} sset_info;
+	struct ethtool_drvinfo drvinfo;
+	struct ifreq ifr;
+	struct ethtool_gstrings *strings = NULL;
+	struct ethtool_value flags;
+	const int32_t flag_len = sizeof(flags.data) * CHAR_BIT;
+	int32_t str_sz;
+	int32_t len;
+	int32_t i;
+	int ret;
+
+	sset_info.hdr.cmd = ETHTOOL_GSSET_INFO;
+	sset_info.hdr.reserved = 0;
+	sset_info.hdr.sset_mask = 1ULL << ETH_SS_PRIV_FLAGS;
+	ifr.ifr_data = (caddr_t)&sset_info;
+	ret = mlx5_ifreq(dev, SIOCETHTOOL, &ifr);
+	if (!ret) {
+		const uint32_t *sset_lengths = sset_info.hdr.data;
+
+		len = sset_info.hdr.sset_mask ? sset_lengths[0] : 0;
+	} else if (ret == -EOPNOTSUPP) {
+		drvinfo.cmd = ETHTOOL_GDRVINFO;
+		ifr.ifr_data = (caddr_t)&drvinfo;
+		ret = mlx5_ifreq(dev, SIOCETHTOOL, &ifr);
+		if (ret) {
+			DRV_LOG(WARNING, "port %u cannot get the driver info",
+				dev->data->port_id);
+			goto exit;
+		}
+		len = *(uint32_t *)((char *)&drvinfo +
+			offsetof(struct ethtool_drvinfo, n_priv_flags));
+	} else {
+		DRV_LOG(WARNING, "port %u cannot get the sset info",
+			dev->data->port_id);
+		goto exit;
+	}
+	if (!len) {
+		DRV_LOG(WARNING, "port %u does not have private flag",
+			dev->data->port_id);
+		rte_errno = EOPNOTSUPP;
+		ret = -rte_errno;
+		goto exit;
+	} else if (len > flag_len) {
+		DRV_LOG(WARNING, "port %u maximal private flags number is %d",
+			dev->data->port_id, flag_len);
+		len = flag_len;
+	}
+	str_sz = ETH_GSTRING_LEN * len;
+	strings = (struct ethtool_gstrings *)
+		  mlx5_malloc(0, str_sz + sizeof(struct ethtool_gstrings), 0,
+			      SOCKET_ID_ANY);
+	if (!strings) {
+		DRV_LOG(WARNING, "port %u unable to allocate memory for"
+			" private flags", dev->data->port_id);
+		rte_errno = ENOMEM;
+		ret = -rte_errno;
+		goto exit;
+	}
+	strings->cmd = ETHTOOL_GSTRINGS;
+	strings->string_set = ETH_SS_PRIV_FLAGS;
+	strings->len = len;
+	ifr.ifr_data = (caddr_t)strings;
+	ret = mlx5_ifreq(dev, SIOCETHTOOL, &ifr);
+	if (ret) {
+		DRV_LOG(WARNING, "port %u unable to get private flags strings",
+			dev->data->port_id);
+		goto exit;
+	}
+	for (i = 0; i < len; i++) {
+		strings->data[(i + 1) * ETH_GSTRING_LEN - 1] = 0;
+		if (!strcmp((const char *)strings->data + i * ETH_GSTRING_LEN,
+			     "dropless_rq"))
+			break;
+	}
+	if (i == len) {
+		DRV_LOG(WARNING, "port %u does not support dropless_rq",
+			dev->data->port_id);
+		rte_errno = EOPNOTSUPP;
+		ret = -rte_errno;
+		goto exit;
+	}
+	flags.cmd = ETHTOOL_GPFLAGS;
+	ifr.ifr_data = (caddr_t)&flags;
+	ret = mlx5_ifreq(dev, SIOCETHTOOL, &ifr);
+	if (ret) {
+		DRV_LOG(WARNING, "port %u unable to get private flags status",
+			dev->data->port_id);
+		goto exit;
+	}
+	ret = !!(flags.data & (1U << i));
+exit:
+	mlx5_free(strings);
+	return ret;
+}
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index b2022f3300..9307a4f95b 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -1602,6 +1602,7 @@ int mlx5_os_read_dev_stat(struct mlx5_priv *priv,
 int mlx5_os_read_dev_counters(struct rte_eth_dev *dev, uint64_t *stats);
 int mlx5_os_get_stats_n(struct rte_eth_dev *dev);
 void mlx5_os_stats_init(struct rte_eth_dev *dev);
+int mlx5_get_flag_dropless_rq(struct rte_eth_dev *dev);
 
 /* mlx5_mac.c */
 
diff --git a/drivers/net/mlx5/mlx5_trigger.c b/drivers/net/mlx5/mlx5_trigger.c
index a3e62e9533..0ecc530043 100644
--- a/drivers/net/mlx5/mlx5_trigger.c
+++ b/drivers/net/mlx5/mlx5_trigger.c
@@ -1129,6 +1129,24 @@ mlx5_dev_start(struct rte_eth_dev *dev)
 			dev->data->port_id, strerror(rte_errno));
 		goto error;
 	}
+	if (priv->config.std_delay_drop || priv->config.hp_delay_drop) {
+		if (!priv->config.vf && !priv->config.sf &&
+		    !priv->representor) {
+			ret = mlx5_get_flag_dropless_rq(dev);
+			if (ret < 0)
+				DRV_LOG(WARNING,
+					"port %u cannot query dropless flag",
+					dev->data->port_id);
+			else if (!ret)
+				DRV_LOG(WARNING,
+					"port %u dropless_rq OFF, no rearming",
+					dev->data->port_id);
+		} else {
+			DRV_LOG(DEBUG,
+				"port %u doesn't support dropless_rq flag",
+				dev->data->port_id);
+		}
+	}
 	ret = mlx5_rxq_start(dev);
 	if (ret) {
 		DRV_LOG(ERR, "port %u Rx queue allocation failed: %s",
diff --git a/drivers/net/mlx5/windows/mlx5_ethdev_os.c b/drivers/net/mlx5/windows/mlx5_ethdev_os.c
index fddc7a6b12..359f73df7c 100644
--- a/drivers/net/mlx5/windows/mlx5_ethdev_os.c
+++ b/drivers/net/mlx5/windows/mlx5_ethdev_os.c
@@ -389,3 +389,20 @@ mlx5_is_removed(struct rte_eth_dev *dev)
 		return 1;
 	return 0;
 }
+
+/*
+ * Query dropless_rq private flag value provided by ETHTOOL.
+ *
+ * @param dev
+ *   Pointer to Ethernet device.
+ *
+ * @return
+ *   - 0 on success, flag is not set.
+ *   - 1 on success, flag is set.
+ *   - negative errno value otherwise and rte_errno is set.
+ */
+int mlx5_get_flag_dropless_rq(struct rte_eth_dev *dev)
+{
+	RTE_SET_USED(dev);
+	return -ENOTSUP;
+}
-- 
2.27.0



* Re: [dpdk-dev] [PATCH v7 0/2] Add delay drop support for Rx queue
  2021-11-05 15:30 ` [dpdk-dev] [PATCH v7 0/2] Add delay drop support for Rx queue Bing Zhao
  2021-11-05 15:30   ` [dpdk-dev] [PATCH v7 1/2] net/mlx5: add support for Rx queue delay drop Bing Zhao
  2021-11-05 15:30   ` [dpdk-dev] [PATCH v7 2/2] net/mlx5: check delay drop settings in kernel driver Bing Zhao
@ 2021-11-05 16:07   ` Ferruh Yigit
  2 siblings, 0 replies; 29+ messages in thread
From: Ferruh Yigit @ 2021-11-05 16:07 UTC (permalink / raw)
  To: Bing Zhao, viacheslavo, matan; +Cc: dev, rasland, thomas, orika

On 11/5/2021 3:30 PM, Bing Zhao wrote:
> ---
> v2:
>    - change hairpin queue delay drop to disable by default
>    - combine the commits
>    - fix Windows building
>    - change the log print
> v3: fix conflict and building
> v4: code style update and commit log polishing
> v5:
>    - split and fix the document description
>    - devarg name update
> v6: document and macro name update
> v7: revert the improper merge of mlx5.rst update
> ---
> 
> Bing Zhao (2):
>    net/mlx5: add support for Rx queue delay drop
>    net/mlx5: check delay drop settings in kernel driver
> 

Series applied to dpdk-next-net/main, thanks.

