DPDK patches and discussions
 help / color / mirror / Atom feed
* [dpdk-dev] [PATCH] net/i40e: add interface to choose latest vector path
@ 2018-08-30  2:16 Xiaoyun Li
  2018-08-31 11:24 ` [dpdk-dev] [PATCH v2] " Xiaoyun Li
                   ` (5 more replies)
  0 siblings, 6 replies; 19+ messages in thread
From: Xiaoyun Li @ 2018-08-30  2:16 UTC (permalink / raw)
  To: beilei.xing, qi.z.zhang; +Cc: dev, zhiyong.yang, Xiaoyun Li

Right now, vector path is limited to only use on later platform due
to the frequency penalty. This patch adds a devarg enable-latest-vec
to allow the users to use the latest vector path that the platform
supported. Namely, using AVX2 vector path on broadwell is possible.

Signed-off-by: Xiaoyun Li <xiaoyun.li@intel.com>
---
 doc/guides/nics/i40e.rst               |  8 ++++++++
 doc/guides/rel_notes/release_18_11.rst |  4 ++++
 drivers/net/i40e/i40e_ethdev.c         | 35 ++++++++++++++++++++++++++++++++++
 drivers/net/i40e/i40e_ethdev.h         |  1 +
 drivers/net/i40e/i40e_rxtx.c           | 27 ++++++++++++++++++++++++++
 5 files changed, 75 insertions(+)

diff --git a/doc/guides/nics/i40e.rst b/doc/guides/nics/i40e.rst
index 65d87f8..75605bd 100644
--- a/doc/guides/nics/i40e.rst
+++ b/doc/guides/nics/i40e.rst
@@ -163,6 +163,14 @@ Runtime Config Options
   Currently hot-plugging of representor ports is not supported so all required
   representors must be specified on the creation of the PF.
 
+- ``Enable latest vector`` (default ``disable``)
+
+  AVX2 vector path cannot be chosen on platforms like HSW/BDW due to throttling
+  penalty. But users may want AVX2 vector path on HSW/BDW because it can get better
+  perf such as in VPP. So ``devargs`` parameter ``enable-latest-vec`` is introduced,
+  for example::
+    -w 84:00.0,enable-latest-vec=1
+
 Driver compilation and testing
 ------------------------------
 
diff --git a/doc/guides/rel_notes/release_18_11.rst b/doc/guides/rel_notes/release_18_11.rst
index 3ae6b3f..f8b0f31 100644
--- a/doc/guides/rel_notes/release_18_11.rst
+++ b/doc/guides/rel_notes/release_18_11.rst
@@ -54,6 +54,10 @@ New Features
      Also, make sure to start the actual text at the margin.
      =========================================================
 
+* **Added a devarg to eable the latest vector path.**
+  A new devarg ``enable-latest-vec`` was introduced to allow users to choose
+  the latest vector path that the platform supported. For example, VPP users
+  can use AVX2 vector path on BDW/HSW to get better performance.
 
 API Changes
 -----------
diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
index 85a6a86..e910b14 100644
--- a/drivers/net/i40e/i40e_ethdev.c
+++ b/drivers/net/i40e/i40e_ethdev.c
@@ -12513,6 +12513,41 @@ i40e_config_rss_filter(struct i40e_pf *pf,
 	return 0;
 }
 
+#define ETH_I40E_ENABLE_LATEST_VEC	"enable-latest-vec"
+
+bool
+i40e_parse_latest_vec(struct rte_eth_dev *dev)
+{
+	static const char *const valid_keys[] = {
+		ETH_I40E_ENABLE_LATEST_VEC, NULL};
+	int enable_latest_vec;
+	struct rte_kvargs *kvlist;
+
+	if (!dev->device->devargs)
+		return 0;
+
+	kvlist = rte_kvargs_parse(dev->device->devargs->args, valid_keys);
+	if (!kvlist)
+		return -EINVAL;
+
+	if (rte_kvargs_count(kvlist, ETH_I40E_ENABLE_LATEST_VEC) > 1)
+		PMD_DRV_LOG(WARNING, "More than one argument \"%s\" and only "
+			    "the first one is used !",
+			    ETH_I40E_ENABLE_LATEST_VEC);
+
+	enable_latest_vec = atoi((&kvlist->pairs[0])->value);
+
+	rte_kvargs_free(kvlist);
+
+	if (enable_latest_vec != 0 && enable_latest_vec != 1)
+		PMD_DRV_LOG(WARNING, "Value should be 0 or 1, set it as 1!");
+
+	if (enable_latest_vec)
+		return true;
+	else
+		return false;
+}
+
 RTE_INIT(i40e_init_log)
 {
 	i40e_logtype_init = rte_log_register("pmd.net.i40e.init");
diff --git a/drivers/net/i40e/i40e_ethdev.h b/drivers/net/i40e/i40e_ethdev.h
index 3fffe5a..cdf68cd 100644
--- a/drivers/net/i40e/i40e_ethdev.h
+++ b/drivers/net/i40e/i40e_ethdev.h
@@ -1243,6 +1243,7 @@ int i40e_config_rss_filter(struct i40e_pf *pf,
 		struct i40e_rte_flow_rss_conf *conf, bool add);
 int i40e_vf_representor_init(struct rte_eth_dev *ethdev, void *init_params);
 int i40e_vf_representor_uninit(struct rte_eth_dev *ethdev);
+bool i40e_parse_latest_vec(struct rte_eth_dev *dev);
 
 #define I40E_DEV_TO_PCI(eth_dev) \
 	RTE_DEV_TO_PCI((eth_dev)->device)
diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/i40e/i40e_rxtx.c
index 2a28ee3..b3069ac 100644
--- a/drivers/net/i40e/i40e_rxtx.c
+++ b/drivers/net/i40e/i40e_rxtx.c
@@ -2960,6 +2960,15 @@ i40e_set_rx_function(struct rte_eth_dev *dev)
 			if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F))
 				dev->rx_pkt_burst =
 					i40e_recv_scattered_pkts_vec_avx2;
+			/*
+			 * Give users chance to use the latest vector path
+			 * that the platform supported.
+			 */
+			if (i40e_parse_latest_vec(dev)) {
+				if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX2))
+					dev->rx_pkt_burst =
+					i40e_recv_scattered_pkts_vec_avx2;
+			}
 #endif
 		} else {
 			PMD_INIT_LOG(DEBUG, "Using a Scattered with bulk "
@@ -2989,6 +2998,15 @@ i40e_set_rx_function(struct rte_eth_dev *dev)
 		 */
 		if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F))
 			dev->rx_pkt_burst = i40e_recv_pkts_vec_avx2;
+		/*
+		 * Give users chance to use the latest vector path
+		 * that the platform supported.
+		 */
+		if (i40e_parse_latest_vec(dev)) {
+			if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX2))
+				dev->rx_pkt_burst =
+					i40e_recv_scattered_pkts_vec_avx2;
+		}
 #endif
 	} else if (ad->rx_bulk_alloc_allowed) {
 		PMD_INIT_LOG(DEBUG, "Rx Burst Bulk Alloc Preconditions are "
@@ -3083,6 +3101,15 @@ i40e_set_tx_function(struct rte_eth_dev *dev)
 			 */
 			if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F))
 				dev->tx_pkt_burst = i40e_xmit_pkts_vec_avx2;
+			/*
+			 * Give users chance to use the latest vector path
+			 * that the platform supported.
+			 */
+			if (i40e_parse_latest_vec(dev)) {
+				if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX2))
+					dev->tx_pkt_burst =
+					i40e_xmit_pkts_vec_avx2;
+			}
 #endif
 		} else {
 			PMD_INIT_LOG(DEBUG, "Simple tx finally be used.");
-- 
2.7.4

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [dpdk-dev] [PATCH v2] net/i40e: add interface to choose latest vector path
  2018-08-30  2:16 [dpdk-dev] [PATCH] net/i40e: add interface to choose latest vector path Xiaoyun Li
@ 2018-08-31 11:24 ` Xiaoyun Li
  2018-09-04 11:39 ` [dpdk-dev] [PATCH v3] " Xiaoyun Li
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 19+ messages in thread
From: Xiaoyun Li @ 2018-08-31 11:24 UTC (permalink / raw)
  To: beilei.xing, qi.z.zhang; +Cc: dev, zhiyong.yang, Xiaoyun Li

Right now, vector path is limited to only use on later platform due
to the frequency penalty. This patch adds a devarg enable-latest-vec
to allow the users to use the latest vector path that the platform
supported. Namely, using AVX2 vector path on broadwell is possible.

Signed-off-by: Xiaoyun Li <xiaoyun.li@intel.com>
---
v2:
 * Correct the calling of the wrong function last time.
 * Fix seg fault bug.
---
 doc/guides/nics/i40e.rst               |  8 ++++++
 doc/guides/rel_notes/release_18_11.rst |  4 +++
 drivers/net/i40e/i40e_ethdev.c         | 38 ++++++++++++++++++++++++++
 drivers/net/i40e/i40e_ethdev.h         |  1 +
 drivers/net/i40e/i40e_rxtx.c           | 27 ++++++++++++++++++
 5 files changed, 78 insertions(+)

diff --git a/doc/guides/nics/i40e.rst b/doc/guides/nics/i40e.rst
index 65d87f869..75605bdce 100644
--- a/doc/guides/nics/i40e.rst
+++ b/doc/guides/nics/i40e.rst
@@ -163,6 +163,14 @@ Runtime Config Options
   Currently hot-plugging of representor ports is not supported so all required
   representors must be specified on the creation of the PF.
 
+- ``Enable latest vector`` (default ``disable``)
+
+  AVX2 vector path cannot be chosen on platforms like HSW/BDW due to throttling
+  penalty. But users may want AVX2 vector path on HSW/BDW because it can get better
+  perf such as in VPP. So ``devargs`` parameter ``enable-latest-vec`` is introduced,
+  for example::
+    -w 84:00.0,enable-latest-vec=1
+
 Driver compilation and testing
 ------------------------------
 
diff --git a/doc/guides/rel_notes/release_18_11.rst b/doc/guides/rel_notes/release_18_11.rst
index 3ae6b3f58..f8b0f3189 100644
--- a/doc/guides/rel_notes/release_18_11.rst
+++ b/doc/guides/rel_notes/release_18_11.rst
@@ -54,6 +54,10 @@ New Features
      Also, make sure to start the actual text at the margin.
      =========================================================
 
+* **Added a devarg to eable the latest vector path.**
+  A new devarg ``enable-latest-vec`` was introduced to allow users to choose
+  the latest vector path that the platform supported. For example, VPP users
+  can use AVX2 vector path on BDW/HSW to get better performance.
 
 API Changes
 -----------
diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
index 85a6a867f..16b5345fb 100644
--- a/drivers/net/i40e/i40e_ethdev.c
+++ b/drivers/net/i40e/i40e_ethdev.c
@@ -12513,6 +12513,44 @@ i40e_config_rss_filter(struct i40e_pf *pf,
 	return 0;
 }
 
+#define ETH_I40E_ENABLE_LATEST_VEC	"enable-latest-vec"
+
+bool
+i40e_parse_latest_vec(struct rte_eth_dev *dev)
+{
+	static const char *const valid_keys[] = {
+		ETH_I40E_ENABLE_LATEST_VEC, NULL};
+	int enable_latest_vec;
+	struct rte_kvargs *kvlist;
+
+	if (!dev->device->devargs)
+		return 0;
+
+	kvlist = rte_kvargs_parse(dev->device->devargs->args, valid_keys);
+	if (!kvlist)
+		return -EINVAL;
+
+	if (!rte_kvargs_count(kvlist, ETH_I40E_ENABLE_LATEST_VEC))
+		return 0;
+
+	if (rte_kvargs_count(kvlist, ETH_I40E_ENABLE_LATEST_VEC) > 1)
+		PMD_DRV_LOG(WARNING, "More than one argument \"%s\" and only "
+			    "the first one is used !",
+			    ETH_I40E_ENABLE_LATEST_VEC);
+
+	enable_latest_vec = atoi((&kvlist->pairs[0])->value);
+
+	rte_kvargs_free(kvlist);
+
+	if (enable_latest_vec != 0 && enable_latest_vec != 1)
+		PMD_DRV_LOG(WARNING, "Value should be 0 or 1, set it as 1!");
+
+	if (enable_latest_vec)
+		return true;
+	else
+		return false;
+}
+
 RTE_INIT(i40e_init_log)
 {
 	i40e_logtype_init = rte_log_register("pmd.net.i40e.init");
diff --git a/drivers/net/i40e/i40e_ethdev.h b/drivers/net/i40e/i40e_ethdev.h
index 3fffe5a55..cdf68cd93 100644
--- a/drivers/net/i40e/i40e_ethdev.h
+++ b/drivers/net/i40e/i40e_ethdev.h
@@ -1243,6 +1243,7 @@ int i40e_config_rss_filter(struct i40e_pf *pf,
 		struct i40e_rte_flow_rss_conf *conf, bool add);
 int i40e_vf_representor_init(struct rte_eth_dev *ethdev, void *init_params);
 int i40e_vf_representor_uninit(struct rte_eth_dev *ethdev);
+bool i40e_parse_latest_vec(struct rte_eth_dev *dev);
 
 #define I40E_DEV_TO_PCI(eth_dev) \
 	RTE_DEV_TO_PCI((eth_dev)->device)
diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/i40e/i40e_rxtx.c
index 2a28ee348..75f8ec284 100644
--- a/drivers/net/i40e/i40e_rxtx.c
+++ b/drivers/net/i40e/i40e_rxtx.c
@@ -2960,6 +2960,15 @@ i40e_set_rx_function(struct rte_eth_dev *dev)
 			if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F))
 				dev->rx_pkt_burst =
 					i40e_recv_scattered_pkts_vec_avx2;
+			/*
+			 * Give users chance to use the latest vector path
+			 * that the platform supported.
+			 */
+			if (i40e_parse_latest_vec(dev)) {
+				if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX2))
+					dev->rx_pkt_burst =
+					i40e_recv_scattered_pkts_vec_avx2;
+			}
 #endif
 		} else {
 			PMD_INIT_LOG(DEBUG, "Using a Scattered with bulk "
@@ -2989,6 +2998,15 @@ i40e_set_rx_function(struct rte_eth_dev *dev)
 		 */
 		if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F))
 			dev->rx_pkt_burst = i40e_recv_pkts_vec_avx2;
+		/*
+		 * Give users chance to use the latest vector path
+		 * that the platform supported.
+		 */
+		if (i40e_parse_latest_vec(dev)) {
+			if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX2))
+				dev->rx_pkt_burst =
+					i40e_recv_pkts_vec_avx2;
+		}
 #endif
 	} else if (ad->rx_bulk_alloc_allowed) {
 		PMD_INIT_LOG(DEBUG, "Rx Burst Bulk Alloc Preconditions are "
@@ -3083,6 +3101,15 @@ i40e_set_tx_function(struct rte_eth_dev *dev)
 			 */
 			if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F))
 				dev->tx_pkt_burst = i40e_xmit_pkts_vec_avx2;
+			/*
+			 * Give users chance to use the latest vector path
+			 * that the platform supported.
+			 */
+			if (i40e_parse_latest_vec(dev)) {
+				if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX2))
+					dev->tx_pkt_burst =
+					i40e_xmit_pkts_vec_avx2;
+			}
 #endif
 		} else {
 			PMD_INIT_LOG(DEBUG, "Simple tx finally be used.");
-- 
2.17.1

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [dpdk-dev] [PATCH v3] net/i40e: add interface to choose latest vector path
  2018-08-30  2:16 [dpdk-dev] [PATCH] net/i40e: add interface to choose latest vector path Xiaoyun Li
  2018-08-31 11:24 ` [dpdk-dev] [PATCH v2] " Xiaoyun Li
@ 2018-09-04 11:39 ` Xiaoyun Li
  2018-09-05 12:21   ` Zhang, Qi Z
  2018-09-10 10:17 ` [dpdk-dev] [PATCH v4] " Xiaoyun Li
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 19+ messages in thread
From: Xiaoyun Li @ 2018-09-04 11:39 UTC (permalink / raw)
  To: beilei.xing, qi.z.zhang
  Cc: dev, zhiyong.yang, bruce.richardson, david.hunt, Xiaoyun Li

Right now, vector path is limited to only use on later platform.
This patch adds a devarg enable-latest-vec to allow the users to
use the latest vector path that the platform supported. Namely,
using AVX2 vector path on broadwell is possible.

Signed-off-by: Xiaoyun Li <xiaoyun.li@intel.com>
---
v3:
 * Polish the doc and commit log.
v2:
 * Correct the calling of the wrong function last time.
 * Fix seg fault bug.

 doc/guides/nics/i40e.rst               |  8 ++++++
 doc/guides/rel_notes/release_18_11.rst |  4 +++
 drivers/net/i40e/i40e_ethdev.c         | 38 ++++++++++++++++++++++++++
 drivers/net/i40e/i40e_ethdev.h         |  1 +
 drivers/net/i40e/i40e_rxtx.c           | 27 ++++++++++++++++++
 5 files changed, 78 insertions(+)

diff --git a/doc/guides/nics/i40e.rst b/doc/guides/nics/i40e.rst
index 65d87f869..6158e7c34 100644
--- a/doc/guides/nics/i40e.rst
+++ b/doc/guides/nics/i40e.rst
@@ -163,6 +163,14 @@ Runtime Config Options
   Currently hot-plugging of representor ports is not supported so all required
   representors must be specified on the creation of the PF.
 
+- ``Enable latest vector`` (default ``disable``)
+
+  Vector path was limited to use only on later platform. But users may want the
+  latest vector path. For example, VPP users may want to use AVX2 vector path on HSW/BDW
+  because it can get better perf. So ``devargs`` parameter ``enable-latest-vec``
+  is introduced, for example::
+    -w 84:00.0,enable-latest-vec=1
+
 Driver compilation and testing
 ------------------------------
 
diff --git a/doc/guides/rel_notes/release_18_11.rst b/doc/guides/rel_notes/release_18_11.rst
index 3ae6b3f58..f8b0f3189 100644
--- a/doc/guides/rel_notes/release_18_11.rst
+++ b/doc/guides/rel_notes/release_18_11.rst
@@ -54,6 +54,10 @@ New Features
      Also, make sure to start the actual text at the margin.
      =========================================================
 
+* **Added a devarg to eable the latest vector path.**
+  A new devarg ``enable-latest-vec`` was introduced to allow users to choose
+  the latest vector path that the platform supported. For example, VPP users
+  can use AVX2 vector path on BDW/HSW to get better performance.
 
 API Changes
 -----------
diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
index 85a6a867f..16b5345fb 100644
--- a/drivers/net/i40e/i40e_ethdev.c
+++ b/drivers/net/i40e/i40e_ethdev.c
@@ -12513,6 +12513,44 @@ i40e_config_rss_filter(struct i40e_pf *pf,
 	return 0;
 }
 
+#define ETH_I40E_ENABLE_LATEST_VEC	"enable-latest-vec"
+
+bool
+i40e_parse_latest_vec(struct rte_eth_dev *dev)
+{
+	static const char *const valid_keys[] = {
+		ETH_I40E_ENABLE_LATEST_VEC, NULL};
+	int enable_latest_vec;
+	struct rte_kvargs *kvlist;
+
+	if (!dev->device->devargs)
+		return 0;
+
+	kvlist = rte_kvargs_parse(dev->device->devargs->args, valid_keys);
+	if (!kvlist)
+		return -EINVAL;
+
+	if (!rte_kvargs_count(kvlist, ETH_I40E_ENABLE_LATEST_VEC))
+		return 0;
+
+	if (rte_kvargs_count(kvlist, ETH_I40E_ENABLE_LATEST_VEC) > 1)
+		PMD_DRV_LOG(WARNING, "More than one argument \"%s\" and only "
+			    "the first one is used !",
+			    ETH_I40E_ENABLE_LATEST_VEC);
+
+	enable_latest_vec = atoi((&kvlist->pairs[0])->value);
+
+	rte_kvargs_free(kvlist);
+
+	if (enable_latest_vec != 0 && enable_latest_vec != 1)
+		PMD_DRV_LOG(WARNING, "Value should be 0 or 1, set it as 1!");
+
+	if (enable_latest_vec)
+		return true;
+	else
+		return false;
+}
+
 RTE_INIT(i40e_init_log)
 {
 	i40e_logtype_init = rte_log_register("pmd.net.i40e.init");
diff --git a/drivers/net/i40e/i40e_ethdev.h b/drivers/net/i40e/i40e_ethdev.h
index 3fffe5a55..cdf68cd93 100644
--- a/drivers/net/i40e/i40e_ethdev.h
+++ b/drivers/net/i40e/i40e_ethdev.h
@@ -1243,6 +1243,7 @@ int i40e_config_rss_filter(struct i40e_pf *pf,
 		struct i40e_rte_flow_rss_conf *conf, bool add);
 int i40e_vf_representor_init(struct rte_eth_dev *ethdev, void *init_params);
 int i40e_vf_representor_uninit(struct rte_eth_dev *ethdev);
+bool i40e_parse_latest_vec(struct rte_eth_dev *dev);
 
 #define I40E_DEV_TO_PCI(eth_dev) \
 	RTE_DEV_TO_PCI((eth_dev)->device)
diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/i40e/i40e_rxtx.c
index 2a28ee348..75f8ec284 100644
--- a/drivers/net/i40e/i40e_rxtx.c
+++ b/drivers/net/i40e/i40e_rxtx.c
@@ -2960,6 +2960,15 @@ i40e_set_rx_function(struct rte_eth_dev *dev)
 			if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F))
 				dev->rx_pkt_burst =
 					i40e_recv_scattered_pkts_vec_avx2;
+			/*
+			 * Give users chance to use the latest vector path
+			 * that the platform supported.
+			 */
+			if (i40e_parse_latest_vec(dev)) {
+				if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX2))
+					dev->rx_pkt_burst =
+					i40e_recv_scattered_pkts_vec_avx2;
+			}
 #endif
 		} else {
 			PMD_INIT_LOG(DEBUG, "Using a Scattered with bulk "
@@ -2989,6 +2998,15 @@ i40e_set_rx_function(struct rte_eth_dev *dev)
 		 */
 		if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F))
 			dev->rx_pkt_burst = i40e_recv_pkts_vec_avx2;
+		/*
+		 * Give users chance to use the latest vector path
+		 * that the platform supported.
+		 */
+		if (i40e_parse_latest_vec(dev)) {
+			if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX2))
+				dev->rx_pkt_burst =
+					i40e_recv_pkts_vec_avx2;
+		}
 #endif
 	} else if (ad->rx_bulk_alloc_allowed) {
 		PMD_INIT_LOG(DEBUG, "Rx Burst Bulk Alloc Preconditions are "
@@ -3083,6 +3101,15 @@ i40e_set_tx_function(struct rte_eth_dev *dev)
 			 */
 			if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F))
 				dev->tx_pkt_burst = i40e_xmit_pkts_vec_avx2;
+			/*
+			 * Give users chance to use the latest vector path
+			 * that the platform supported.
+			 */
+			if (i40e_parse_latest_vec(dev)) {
+				if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX2))
+					dev->tx_pkt_burst =
+					i40e_xmit_pkts_vec_avx2;
+			}
 #endif
 		} else {
 			PMD_INIT_LOG(DEBUG, "Simple tx finally be used.");
-- 
2.17.1

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [dpdk-dev] [PATCH v3] net/i40e: add interface to choose latest vector path
  2018-09-04 11:39 ` [dpdk-dev] [PATCH v3] " Xiaoyun Li
@ 2018-09-05 12:21   ` Zhang, Qi Z
  2018-09-07  9:23     ` Li, Xiaoyun
  0 siblings, 1 reply; 19+ messages in thread
From: Zhang, Qi Z @ 2018-09-05 12:21 UTC (permalink / raw)
  To: Li, Xiaoyun, Xing, Beilei
  Cc: dev, Yang, Zhiyong, Richardson, Bruce, Hunt, David

Hi Xiaoyun:

> -----Original Message-----
> From: Li, Xiaoyun
> Sent: Tuesday, September 4, 2018 7:40 PM
> To: Xing, Beilei <beilei.xing@intel.com>; Zhang, Qi Z <qi.z.zhang@intel.com>
> Cc: dev@dpdk.org; Yang, Zhiyong <zhiyong.yang@intel.com>; Richardson,
> Bruce <bruce.richardson@intel.com>; Hunt, David <david.hunt@intel.com>; Li,
> Xiaoyun <xiaoyun.li@intel.com>
> Subject: [PATCH v3] net/i40e: add interface to choose latest vector path
> 
> Right now, vector path is limited to only use on later platform.
> This patch adds a devarg enable-latest-vec to allow the users to use the latest
> vector path that the platform supported. Namely, using AVX2 vector path on
> broadwell is possible.
> 
> Signed-off-by: Xiaoyun Li <xiaoyun.li@intel.com>
> ---
> v3:
>  * Polish the doc and commit log.
> v2:
>  * Correct the calling of the wrong function last time.
>  * Fix seg fault bug.
> 
>  doc/guides/nics/i40e.rst               |  8 ++++++
>  doc/guides/rel_notes/release_18_11.rst |  4 +++
>  drivers/net/i40e/i40e_ethdev.c         | 38
> ++++++++++++++++++++++++++
>  drivers/net/i40e/i40e_ethdev.h         |  1 +
>  drivers/net/i40e/i40e_rxtx.c           | 27 ++++++++++++++++++
>  5 files changed, 78 insertions(+)
> 
> diff --git a/doc/guides/nics/i40e.rst b/doc/guides/nics/i40e.rst index
> 65d87f869..6158e7c34 100644
> --- a/doc/guides/nics/i40e.rst
> +++ b/doc/guides/nics/i40e.rst
> @@ -163,6 +163,14 @@ Runtime Config Options
>    Currently hot-plugging of representor ports is not supported so all required
>    representors must be specified on the creation of the PF.
> 
> +- ``Enable latest vector`` (default ``disable``)
> +
> +  Vector path was limited to use only on later platform. But users may
> + want the  latest vector path. For example, VPP users may want to use
> + AVX2 vector path on HSW/BDW  because it can get better perf. So
> + ``devargs`` parameter ``enable-latest-vec``  is introduced, for example::
> +    -w 84:00.0,enable-latest-vec=1


How about "use_latest_vec" or "use-lastest-vpmd"?

> +
>  Driver compilation and testing
>  ------------------------------
> 
> diff --git a/doc/guides/rel_notes/release_18_11.rst
> b/doc/guides/rel_notes/release_18_11.rst
> index 3ae6b3f58..f8b0f3189 100644
> --- a/doc/guides/rel_notes/release_18_11.rst
> +++ b/doc/guides/rel_notes/release_18_11.rst
> @@ -54,6 +54,10 @@ New Features
>       Also, make sure to start the actual text at the margin.
>       =========================================================
> 
> +* **Added a devarg to eable the latest vector path.**
> +  A new devarg ``enable-latest-vec`` was introduced to allow users to
> +choose
> +  the latest vector path that the platform supported. For example, VPP
> +users
> +  can use AVX2 vector path on BDW/HSW to get better performance.
> 
>  API Changes
>  -----------
> diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
> index 85a6a867f..16b5345fb 100644
> --- a/drivers/net/i40e/i40e_ethdev.c
> +++ b/drivers/net/i40e/i40e_ethdev.c
> @@ -12513,6 +12513,44 @@ i40e_config_rss_filter(struct i40e_pf *pf,
>  	return 0;
>  }
> 
> +#define ETH_I40E_ENABLE_LATEST_VEC	"enable-latest-vec"

This should be defined along with other exist devargs, please check ETH_I40E_SUPPORT_MULTI_DRIVER for reference
Also it should be registered with RTE_PMD_REGISTER_PARAM_STRING.

> +
> +bool
> +i40e_parse_latest_vec(struct rte_eth_dev *dev) {
> +	static const char *const valid_keys[] = {
> +		ETH_I40E_ENABLE_LATEST_VEC, NULL};
> +	int enable_latest_vec;
> +	struct rte_kvargs *kvlist;
> +
> +	if (!dev->device->devargs)
> +		return 0;
> +
> +	kvlist = rte_kvargs_parse(dev->device->devargs->args, valid_keys);
> +	if (!kvlist)
> +		return -EINVAL;
> +
> +	if (!rte_kvargs_count(kvlist, ETH_I40E_ENABLE_LATEST_VEC))
> +		return 0;
> +
> +	if (rte_kvargs_count(kvlist, ETH_I40E_ENABLE_LATEST_VEC) > 1)
> +		PMD_DRV_LOG(WARNING, "More than one argument \"%s\" and
> only "
> +			    "the first one is used !",
> +			    ETH_I40E_ENABLE_LATEST_VEC);
> +
> +	enable_latest_vec = atoi((&kvlist->pairs[0])->value);
> +
> +	rte_kvargs_free(kvlist);
> +
> +	if (enable_latest_vec != 0 && enable_latest_vec != 1)
> +		PMD_DRV_LOG(WARNING, "Value should be 0 or 1, set it as 1!");
> +
> +	if (enable_latest_vec)
> +		return true;
> +	else
> +		return false;
> +}

We call rte_kvargs_parse in different place for different parameter which is not necessary.
it's better to merge them into one parse_devargs function at dev_init and then all corresponding field of i40e_adapter can be configured at the same place.
Though this is not this patch's scope, but it's better to introduce a field like i40e_adapter->use_latest_vec
and in i40e_parse_latest_vec, it just assign the value which could be used later.
This will make things easy for future code clean and also it is not necessary to call i40e_parse_latest_vec multiple time in set_rx/tx_function.

> +
>  RTE_INIT(i40e_init_log)
>  {
>  	i40e_logtype_init = rte_log_register("pmd.net.i40e.init");
> diff --git a/drivers/net/i40e/i40e_ethdev.h b/drivers/net/i40e/i40e_ethdev.h
> index 3fffe5a55..cdf68cd93 100644
> --- a/drivers/net/i40e/i40e_ethdev.h
> +++ b/drivers/net/i40e/i40e_ethdev.h
> @@ -1243,6 +1243,7 @@ int i40e_config_rss_filter(struct i40e_pf *pf,
>  		struct i40e_rte_flow_rss_conf *conf, bool add);  int
> i40e_vf_representor_init(struct rte_eth_dev *ethdev, void *init_params);  int
> i40e_vf_representor_uninit(struct rte_eth_dev *ethdev);
> +bool i40e_parse_latest_vec(struct rte_eth_dev *dev);
> 
>  #define I40E_DEV_TO_PCI(eth_dev) \
>  	RTE_DEV_TO_PCI((eth_dev)->device)
> diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/i40e/i40e_rxtx.c index
> 2a28ee348..75f8ec284 100644
> --- a/drivers/net/i40e/i40e_rxtx.c
> +++ b/drivers/net/i40e/i40e_rxtx.c
> @@ -2960,6 +2960,15 @@ i40e_set_rx_function(struct rte_eth_dev *dev)
>  			if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F))
>  				dev->rx_pkt_burst =
>  					i40e_recv_scattered_pkts_vec_avx2;
> +			/*
> +			 * Give users chance to use the latest vector path
> +			 * that the platform supported.
> +			 */
> +			if (i40e_parse_latest_vec(dev)) {
> +				if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX2))
> +					dev->rx_pkt_burst =
> +					i40e_recv_scattered_pkts_vec_avx2;
> +			}


How about: 
			
			if (adapter->use_latest_vec)
				dev->pkt_burst = get_latest_rx_vec();
			else
				dev->pkt_burst = get_recommended_rx_vec();

this make code easy to understand and in future, if we have avx512 vpmd, it's easy for us to modify the code.


>  #endif
>  		} else {
>  			PMD_INIT_LOG(DEBUG, "Using a Scattered with bulk "
> @@ -2989,6 +2998,15 @@ i40e_set_rx_function(struct rte_eth_dev *dev)
>  		 */
>  		if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F))
>  			dev->rx_pkt_burst = i40e_recv_pkts_vec_avx2;
> +		/*
> +		 * Give users chance to use the latest vector path
> +		 * that the platform supported.
> +		 */
> +		if (i40e_parse_latest_vec(dev)) {
> +			if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX2))
> +				dev->rx_pkt_burst =
> +					i40e_recv_pkts_vec_avx2;
> +		}

Same as above
>  #endif
>  	} else if (ad->rx_bulk_alloc_allowed) {
>  		PMD_INIT_LOG(DEBUG, "Rx Burst Bulk Alloc Preconditions are "
> @@ -3083,6 +3101,15 @@ i40e_set_tx_function(struct rte_eth_dev *dev)
>  			 */
>  			if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F))
>  				dev->tx_pkt_burst = i40e_xmit_pkts_vec_avx2;
> +			/*
> +			 * Give users chance to use the latest vector path
> +			 * that the platform supported.
> +			 */
> +			if (i40e_parse_latest_vec(dev)) {
> +				if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX2))
> +					dev->tx_pkt_burst =
> +					i40e_xmit_pkts_vec_avx2;
> +			}

Same as above

>  #endif
>  		} else {
>  			PMD_INIT_LOG(DEBUG, "Simple tx finally be used.");
> --
> 2.17.1

Regards
Qi

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [dpdk-dev] [PATCH v3] net/i40e: add interface to choose latest vector path
  2018-09-05 12:21   ` Zhang, Qi Z
@ 2018-09-07  9:23     ` Li, Xiaoyun
  0 siblings, 0 replies; 19+ messages in thread
From: Li, Xiaoyun @ 2018-09-07  9:23 UTC (permalink / raw)
  To: Zhang, Qi Z, Xing, Beilei
  Cc: dev, Yang, Zhiyong, Richardson, Bruce, Hunt, David

OK. Will send v4 later. Thanks.

> -----Original Message-----
> From: Zhang, Qi Z
> Sent: Wednesday, September 5, 2018 20:22
> To: Li, Xiaoyun <xiaoyun.li@intel.com>; Xing, Beilei <beilei.xing@intel.com>
> Cc: dev@dpdk.org; Yang, Zhiyong <zhiyong.yang@intel.com>; Richardson,
> Bruce <bruce.richardson@intel.com>; Hunt, David <david.hunt@intel.com>
> Subject: RE: [PATCH v3] net/i40e: add interface to choose latest vector path
> 
> Hi Xiaoyun:
> 
> > -----Original Message-----
> > From: Li, Xiaoyun
> > Sent: Tuesday, September 4, 2018 7:40 PM
> > To: Xing, Beilei <beilei.xing@intel.com>; Zhang, Qi Z
> > <qi.z.zhang@intel.com>
> > Cc: dev@dpdk.org; Yang, Zhiyong <zhiyong.yang@intel.com>; Richardson,
> > Bruce <bruce.richardson@intel.com>; Hunt, David
> > <david.hunt@intel.com>; Li, Xiaoyun <xiaoyun.li@intel.com>
> > Subject: [PATCH v3] net/i40e: add interface to choose latest vector
> > path
> >
> > Right now, vector path is limited to only use on later platform.
> > This patch adds a devarg enable-latest-vec to allow the users to use
> > the latest vector path that the platform supported. Namely, using AVX2
> > vector path on broadwell is possible.
> >
> > Signed-off-by: Xiaoyun Li <xiaoyun.li@intel.com>
> > ---
> > v3:
> >  * Polish the doc and commit log.
> > v2:
> >  * Correct the calling of the wrong function last time.
> >  * Fix seg fault bug.
> >
> >  doc/guides/nics/i40e.rst               |  8 ++++++
> >  doc/guides/rel_notes/release_18_11.rst |  4 +++
> >  drivers/net/i40e/i40e_ethdev.c         | 38
> > ++++++++++++++++++++++++++
> >  drivers/net/i40e/i40e_ethdev.h         |  1 +
> >  drivers/net/i40e/i40e_rxtx.c           | 27 ++++++++++++++++++
> >  5 files changed, 78 insertions(+)
> >
> > diff --git a/doc/guides/nics/i40e.rst b/doc/guides/nics/i40e.rst index
> > 65d87f869..6158e7c34 100644
> > --- a/doc/guides/nics/i40e.rst
> > +++ b/doc/guides/nics/i40e.rst
> > @@ -163,6 +163,14 @@ Runtime Config Options
> >    Currently hot-plugging of representor ports is not supported so all
> required
> >    representors must be specified on the creation of the PF.
> >
> > +- ``Enable latest vector`` (default ``disable``)
> > +
> > +  Vector path was limited to use only on later platform. But users
> > + may want the  latest vector path. For example, VPP users may want to
> > + use
> > + AVX2 vector path on HSW/BDW  because it can get better perf. So
> > + ``devargs`` parameter ``enable-latest-vec``  is introduced, for example::
> > +    -w 84:00.0,enable-latest-vec=1
> 
> 
> How about "use_latest_vec" or "use-lastest-vpmd"?
> 
> > +
> >  Driver compilation and testing
> >  ------------------------------
> >
> > diff --git a/doc/guides/rel_notes/release_18_11.rst
> > b/doc/guides/rel_notes/release_18_11.rst
> > index 3ae6b3f58..f8b0f3189 100644
> > --- a/doc/guides/rel_notes/release_18_11.rst
> > +++ b/doc/guides/rel_notes/release_18_11.rst
> > @@ -54,6 +54,10 @@ New Features
> >       Also, make sure to start the actual text at the margin.
> >       =========================================================
> >
> > +* **Added a devarg to eable the latest vector path.**
> > +  A new devarg ``enable-latest-vec`` was introduced to allow users to
> > +choose
> > +  the latest vector path that the platform supported. For example,
> > +VPP users
> > +  can use AVX2 vector path on BDW/HSW to get better performance.
> >
> >  API Changes
> >  -----------
> > diff --git a/drivers/net/i40e/i40e_ethdev.c
> > b/drivers/net/i40e/i40e_ethdev.c index 85a6a867f..16b5345fb 100644
> > --- a/drivers/net/i40e/i40e_ethdev.c
> > +++ b/drivers/net/i40e/i40e_ethdev.c
> > @@ -12513,6 +12513,44 @@ i40e_config_rss_filter(struct i40e_pf *pf,
> >  	return 0;
> >  }
> >
> > +#define ETH_I40E_ENABLE_LATEST_VEC	"enable-latest-vec"
> 
> This should be defined along with other exist devargs, please check
> ETH_I40E_SUPPORT_MULTI_DRIVER for reference Also it should be
> registered with RTE_PMD_REGISTER_PARAM_STRING.
> 
> > +
> > +bool
> > +i40e_parse_latest_vec(struct rte_eth_dev *dev) {
> > +	static const char *const valid_keys[] = {
> > +		ETH_I40E_ENABLE_LATEST_VEC, NULL};
> > +	int enable_latest_vec;
> > +	struct rte_kvargs *kvlist;
> > +
> > +	if (!dev->device->devargs)
> > +		return 0;
> > +
> > +	kvlist = rte_kvargs_parse(dev->device->devargs->args, valid_keys);
> > +	if (!kvlist)
> > +		return -EINVAL;
> > +
> > +	if (!rte_kvargs_count(kvlist, ETH_I40E_ENABLE_LATEST_VEC))
> > +		return 0;
> > +
> > +	if (rte_kvargs_count(kvlist, ETH_I40E_ENABLE_LATEST_VEC) > 1)
> > +		PMD_DRV_LOG(WARNING, "More than one argument \"%s\"
> and
> > only "
> > +			    "the first one is used !",
> > +			    ETH_I40E_ENABLE_LATEST_VEC);
> > +
> > +	enable_latest_vec = atoi((&kvlist->pairs[0])->value);
> > +
> > +	rte_kvargs_free(kvlist);
> > +
> > +	if (enable_latest_vec != 0 && enable_latest_vec != 1)
> > +		PMD_DRV_LOG(WARNING, "Value should be 0 or 1, set it as
> 1!");
> > +
> > +	if (enable_latest_vec)
> > +		return true;
> > +	else
> > +		return false;
> > +}
> 
> We call rte_kvargs_parse in different place for different parameter which is
> not necessary.
> it's better to merge them into one parse_devargs function at dev_init and
> then all corresponding field of i40e_adapter can be configured at the same
> place.
> Though this is not this patch's scope, but it's better to introduce a field like
> i40e_adapter->use_latest_vec and in i40e_parse_latest_vec, it just assign the
> value which could be used later.
> This will make things easy for future code clean and also it is not necessary to
> call i40e_parse_latest_vec multiple time in set_rx/tx_function.
> 
> > +
> >  RTE_INIT(i40e_init_log)
> >  {
> >  	i40e_logtype_init = rte_log_register("pmd.net.i40e.init");
> > diff --git a/drivers/net/i40e/i40e_ethdev.h
> > b/drivers/net/i40e/i40e_ethdev.h index 3fffe5a55..cdf68cd93 100644
> > --- a/drivers/net/i40e/i40e_ethdev.h
> > +++ b/drivers/net/i40e/i40e_ethdev.h
> > @@ -1243,6 +1243,7 @@ int i40e_config_rss_filter(struct i40e_pf *pf,
> >  		struct i40e_rte_flow_rss_conf *conf, bool add);  int
> > i40e_vf_representor_init(struct rte_eth_dev *ethdev, void
> > *init_params);  int i40e_vf_representor_uninit(struct rte_eth_dev
> > *ethdev);
> > +bool i40e_parse_latest_vec(struct rte_eth_dev *dev);
> >
> >  #define I40E_DEV_TO_PCI(eth_dev) \
> >  	RTE_DEV_TO_PCI((eth_dev)->device)
> > diff --git a/drivers/net/i40e/i40e_rxtx.c
> > b/drivers/net/i40e/i40e_rxtx.c index
> > 2a28ee348..75f8ec284 100644
> > --- a/drivers/net/i40e/i40e_rxtx.c
> > +++ b/drivers/net/i40e/i40e_rxtx.c
> > @@ -2960,6 +2960,15 @@ i40e_set_rx_function(struct rte_eth_dev *dev)
> >  			if
> (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F))
> >  				dev->rx_pkt_burst =
> >  					i40e_recv_scattered_pkts_vec_avx2;
> > +			/*
> > +			 * Give users chance to use the latest vector path
> > +			 * that the platform supported.
> > +			 */
> > +			if (i40e_parse_latest_vec(dev)) {
> > +				if
> (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX2))
> > +					dev->rx_pkt_burst =
> > +					i40e_recv_scattered_pkts_vec_avx2;
> > +			}
> 
> 
> How about:
> 
> 			if (adapter->use_latest_vec)
> 				dev->pkt_burst = get_latest_rx_vec();
> 			else
> 				dev->pkt_burst = get_recommended_rx_vec();
> 
> this make code easy to understand and in future, if we have avx512 vpmd,
> it's easy for us to modify the code.
> 
> 
> >  #endif
> >  		} else {
> >  			PMD_INIT_LOG(DEBUG, "Using a Scattered with bulk
> "
> > @@ -2989,6 +2998,15 @@ i40e_set_rx_function(struct rte_eth_dev *dev)
> >  		 */
> >  		if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F))
> >  			dev->rx_pkt_burst = i40e_recv_pkts_vec_avx2;
> > +		/*
> > +		 * Give users chance to use the latest vector path
> > +		 * that the platform supported.
> > +		 */
> > +		if (i40e_parse_latest_vec(dev)) {
> > +			if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX2))
> > +				dev->rx_pkt_burst =
> > +					i40e_recv_pkts_vec_avx2;
> > +		}
> 
> Same as above
> >  #endif
> >  	} else if (ad->rx_bulk_alloc_allowed) {
> >  		PMD_INIT_LOG(DEBUG, "Rx Burst Bulk Alloc Preconditions
> are "
> > @@ -3083,6 +3101,15 @@ i40e_set_tx_function(struct rte_eth_dev *dev)
> >  			 */
> >  			if
> (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F))
> >  				dev->tx_pkt_burst =
> i40e_xmit_pkts_vec_avx2;
> > +			/*
> > +			 * Give users chance to use the latest vector path
> > +			 * that the platform supported.
> > +			 */
> > +			if (i40e_parse_latest_vec(dev)) {
> > +				if
> (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX2))
> > +					dev->tx_pkt_burst =
> > +					i40e_xmit_pkts_vec_avx2;
> > +			}
> 
> Same as above
> 
> >  #endif
> >  		} else {
> >  			PMD_INIT_LOG(DEBUG, "Simple tx finally be used.");
> > --
> > 2.17.1
> 
> Regards
> Qi

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [dpdk-dev] [PATCH v4] net/i40e: add interface to choose latest vector path
  2018-08-30  2:16 [dpdk-dev] [PATCH] net/i40e: add interface to choose latest vector path Xiaoyun Li
  2018-08-31 11:24 ` [dpdk-dev] [PATCH v2] " Xiaoyun Li
  2018-09-04 11:39 ` [dpdk-dev] [PATCH v3] " Xiaoyun Li
@ 2018-09-10 10:17 ` Xiaoyun Li
  2018-09-12  7:45   ` Zhang, Qi Z
  2018-09-12 10:12 ` [dpdk-dev] [PATCH v5] " Xiaoyun Li
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 19+ messages in thread
From: Xiaoyun Li @ 2018-09-10 10:17 UTC (permalink / raw)
  To: beilei.xing, qi.z.zhang
  Cc: dev, zhiyong.yang, bruce.richardson, david.hunt, Xiaoyun Li

Right now, vector path is limited to only use on later platform.
This patch adds a devarg use-latest-vec to allow the users to
use the latest vector path that the platform supported. Namely,
using AVX2 vector path on broadwell is possible.

Signed-off-by: Xiaoyun Li <xiaoyun.li@intel.com>
---
v4:
 * Polish the codes.
v3:
 * Polish the doc and commit log.
v2:
 * Correct the calling of the wrong function last time.
 * Fix seg fault bug.
---
 doc/guides/nics/i40e.rst               |   8 ++
 doc/guides/rel_notes/release_18_11.rst |   4 +
 drivers/net/i40e/i40e_ethdev.c         |  46 ++++++++++-
 drivers/net/i40e/i40e_ethdev.h         |   3 +
 drivers/net/i40e/i40e_rxtx.c           | 103 ++++++++++++++++---------
 5 files changed, 128 insertions(+), 36 deletions(-)

diff --git a/doc/guides/nics/i40e.rst b/doc/guides/nics/i40e.rst
index 65d87f869..643e6a062 100644
--- a/doc/guides/nics/i40e.rst
+++ b/doc/guides/nics/i40e.rst
@@ -163,6 +163,14 @@ Runtime Config Options
   Currently hot-plugging of representor ports is not supported so all required
   representors must be specified on the creation of the PF.
 
+- ``Use latest vector`` (default ``disable``)
+
+  Vector path was limited to use only on later platform. But users may want the
+  latest vector path. For example, VPP users may want to use AVX2 vector path on HSW/BDW
+  because it can get better perf. So ``devargs`` parameter ``use-latest-vec`` is
+  introduced, for example::
+    -w 84:00.0,use-latest-vec=1
+
 Driver compilation and testing
 ------------------------------
 
diff --git a/doc/guides/rel_notes/release_18_11.rst b/doc/guides/rel_notes/release_18_11.rst
index 3ae6b3f58..34af591a2 100644
--- a/doc/guides/rel_notes/release_18_11.rst
+++ b/doc/guides/rel_notes/release_18_11.rst
@@ -54,6 +54,10 @@ New Features
      Also, make sure to start the actual text at the margin.
      =========================================================
 
+* **Added a devarg to use the latest vector path.**
+  A new devarg ``use-latest-vec`` was introduced to allow users to choose
+  the latest vector path that the platform supported. For example, VPP users
+  can use AVX2 vector path on BDW/HSW to get better performance.
 
 API Changes
 -----------
diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
index 85a6a867f..72377d0b6 100644
--- a/drivers/net/i40e/i40e_ethdev.c
+++ b/drivers/net/i40e/i40e_ethdev.c
@@ -44,6 +44,7 @@
 #define ETH_I40E_FLOATING_VEB_LIST_ARG	"floating_veb_list"
 #define ETH_I40E_SUPPORT_MULTI_DRIVER	"support-multi-driver"
 #define ETH_I40E_QUEUE_NUM_PER_VF_ARG	"queue-num-per-vf"
+#define ETH_I40E_USE_LATEST_VEC	"use-latest-vec"
 
 #define I40E_CLEAR_PXE_WAIT_MS     200
 
@@ -408,6 +409,7 @@ static const char *const valid_keys[] = {
 	ETH_I40E_FLOATING_VEB_LIST_ARG,
 	ETH_I40E_SUPPORT_MULTI_DRIVER,
 	ETH_I40E_QUEUE_NUM_PER_VF_ARG,
+	ETH_I40E_USE_LATEST_VEC,
 	NULL};
 
 static const struct rte_pci_id pci_id_i40e_map[] = {
@@ -1201,6 +1203,46 @@ i40e_aq_debug_write_global_register(struct i40e_hw *hw,
 	return i40e_aq_debug_write_register(hw, reg_addr, reg_val, cmd_details);
 }
 
+static int
+i40e_parse_latest_vec(struct rte_eth_dev *dev)
+{
+	struct i40e_adapter *ad =
+		I40E_DEV_PRIVATE_TO_ADAPTER(dev->data->dev_private);
+	int kvargs_count, use_latest_vec;
+	struct rte_kvargs *kvlist;
+
+	ad->use_latest_vec = false;
+
+	if (!dev->device->devargs)
+		return 0;
+
+	kvlist = rte_kvargs_parse(dev->device->devargs->args, valid_keys);
+	if (!kvlist)
+		return -EINVAL;
+
+	kvargs_count = rte_kvargs_count(kvlist, ETH_I40E_USE_LATEST_VEC);
+	if (!kvargs_count) {
+		rte_kvargs_free(kvlist);
+		return 0;
+	}
+
+	if (kvargs_count > 1)
+		PMD_DRV_LOG(WARNING, "More than one argument \"%s\" and only "
+			    "the first one is used !",
+			    ETH_I40E_USE_LATEST_VEC);
+
+	use_latest_vec = atoi((&kvlist->pairs[0])->value);
+
+	rte_kvargs_free(kvlist);
+
+	if (use_latest_vec != 0 && use_latest_vec != 1)
+		PMD_DRV_LOG(WARNING, "Value should be 0 or 1, set it as 1!");
+
+	ad->use_latest_vec = (bool)use_latest_vec;
+
+	return 0;
+}
+
 static int
 eth_i40e_dev_init(struct rte_eth_dev *dev, void *init_params __rte_unused)
 {
@@ -1263,6 +1305,7 @@ eth_i40e_dev_init(struct rte_eth_dev *dev, void *init_params __rte_unused)
 
 	/* Check if need to support multi-driver */
 	i40e_support_multi_driver(dev);
+	i40e_parse_latest_vec(dev);
 
 	/* Make sure all is clean before doing PF reset */
 	i40e_clear_hw(hw);
@@ -12527,4 +12570,5 @@ RTE_PMD_REGISTER_PARAM_STRING(net_i40e,
 			      ETH_I40E_FLOATING_VEB_ARG "=1"
 			      ETH_I40E_FLOATING_VEB_LIST_ARG "=<string>"
 			      ETH_I40E_QUEUE_NUM_PER_VF_ARG "=1|2|4|8|16"
-			      ETH_I40E_SUPPORT_MULTI_DRIVER "=1");
+			      ETH_I40E_SUPPORT_MULTI_DRIVER "=1"
+			      ETH_I40E_USE_LATEST_VEC "=1");
diff --git a/drivers/net/i40e/i40e_ethdev.h b/drivers/net/i40e/i40e_ethdev.h
index 3fffe5a55..140c92b84 100644
--- a/drivers/net/i40e/i40e_ethdev.h
+++ b/drivers/net/i40e/i40e_ethdev.h
@@ -1078,6 +1078,9 @@ struct i40e_adapter {
 	uint64_t pctypes_tbl[I40E_FLOW_TYPE_MAX] __rte_cache_min_aligned;
 	uint64_t flow_types_mask;
 	uint64_t pctypes_mask;
+
+	/* For devargs */
+	bool use_latest_vec;
 };
 
 /**
diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/i40e/i40e_rxtx.c
index 2a28ee348..e9fa7ed90 100644
--- a/drivers/net/i40e/i40e_rxtx.c
+++ b/drivers/net/i40e/i40e_rxtx.c
@@ -2909,6 +2909,34 @@ i40e_txq_info_get(struct rte_eth_dev *dev, uint16_t queue_id,
 	qinfo->conf.offloads = txq->offloads;
 }
 
+static eth_rx_burst_t
+i40e_get_latest_rx_vec(bool scatter)
+{
+#ifdef RTE_ARCH_X86
+	if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX2))
+		return scatter ? i40e_recv_scattered_pkts_vec_avx2 :
+				 i40e_recv_pkts_vec_avx2;
+#endif
+	return scatter ? i40e_recv_scattered_pkts_vec :
+			 i40e_recv_pkts_vec;
+}
+
+static eth_rx_burst_t
+i40e_get_recommend_rx_vec(bool scatter)
+{
+#ifdef RTE_ARCH_X86
+	/*
+	 * since AVX frequency can be different to base frequency, limit
+	 * use of AVX2 version to later plaforms, not all those that could
+	 * theoretically run it.
+	 */
+	if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F))
+		return scatter ? i40e_recv_scattered_pkts_vec_avx2 :
+				 i40e_recv_pkts_vec_avx2;
+#endif
+	return scatter ? i40e_recv_scattered_pkts_vec :
+			 i40e_recv_pkts_vec;
+}
 void __attribute__((cold))
 i40e_set_rx_function(struct rte_eth_dev *dev)
 {
@@ -2948,19 +2976,12 @@ i40e_set_rx_function(struct rte_eth_dev *dev)
 			PMD_INIT_LOG(DEBUG, "Using Vector Scattered Rx "
 					    "callback (port=%d).",
 				     dev->data->port_id);
-
-			dev->rx_pkt_burst = i40e_recv_scattered_pkts_vec;
-#ifdef RTE_ARCH_X86
-			/*
-			 * since AVX frequency can be different to base
-			 * frequency, limit use of AVX2 version to later
-			 * plaforms, not all those that could theoretically
-			 * run it.
-			 */
-			if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F))
+			if (ad->use_latest_vec)
 				dev->rx_pkt_burst =
-					i40e_recv_scattered_pkts_vec_avx2;
-#endif
+					i40e_get_latest_rx_vec(true);
+			else
+				dev->rx_pkt_burst =
+					i40e_get_recommend_rx_vec(true);
 		} else {
 			PMD_INIT_LOG(DEBUG, "Using a Scattered with bulk "
 					   "allocation callback (port=%d).",
@@ -2978,18 +2999,10 @@ i40e_set_rx_function(struct rte_eth_dev *dev)
 				    "burst size no less than %d (port=%d).",
 			     RTE_I40E_DESCS_PER_LOOP,
 			     dev->data->port_id);
-
-		dev->rx_pkt_burst = i40e_recv_pkts_vec;
-#ifdef RTE_ARCH_X86
-		/*
-		 * since AVX frequency can be different to base
-		 * frequency, limit use of AVX2 version to later
-		 * plaforms, not all those that could theoretically
-		 * run it.
-		 */
-		if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F))
-			dev->rx_pkt_burst = i40e_recv_pkts_vec_avx2;
-#endif
+		if (ad->use_latest_vec)
+			dev->rx_pkt_burst = i40e_get_latest_rx_vec(false);
+		else
+			dev->rx_pkt_burst = i40e_get_recommend_rx_vec(false);
 	} else if (ad->rx_bulk_alloc_allowed) {
 		PMD_INIT_LOG(DEBUG, "Rx Burst Bulk Alloc Preconditions are "
 				    "satisfied. Rx Burst Bulk Alloc function "
@@ -3049,6 +3062,31 @@ i40e_set_tx_function_flag(struct rte_eth_dev *dev, struct i40e_tx_queue *txq)
 				txq->queue_id);
 }
 
+static eth_tx_burst_t
+i40e_get_latest_tx_vec(void)
+{
+#ifdef RTE_ARCH_X86
+	if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX2))
+		return i40e_xmit_pkts_vec_avx2;
+#endif
+	return i40e_xmit_pkts_vec;
+}
+
+static eth_tx_burst_t
+i40e_get_recommend_tx_vec(void)
+{
+#ifdef RTE_ARCH_X86
+	/*
+	 * since AVX frequency can be different to base frequency, limit
+	 * use of AVX2 version to later plaforms, not all those that could
+	 * theoretically run it.
+	 */
+	if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F))
+		return i40e_xmit_pkts_vec_avx2;
+#endif
+	return i40e_xmit_pkts_vec;
+}
+
 void __attribute__((cold))
 i40e_set_tx_function(struct rte_eth_dev *dev)
 {
@@ -3073,17 +3111,12 @@ i40e_set_tx_function(struct rte_eth_dev *dev)
 	if (ad->tx_simple_allowed) {
 		if (ad->tx_vec_allowed) {
 			PMD_INIT_LOG(DEBUG, "Vector tx finally be used.");
-			dev->tx_pkt_burst = i40e_xmit_pkts_vec;
-#ifdef RTE_ARCH_X86
-			/*
-			 * since AVX frequency can be different to base
-			 * frequency, limit use of AVX2 version to later
-			 * plaforms, not all those that could theoretically
-			 * run it.
-			 */
-			if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F))
-				dev->tx_pkt_burst = i40e_xmit_pkts_vec_avx2;
-#endif
+			if (ad->use_latest_vec)
+				dev->tx_pkt_burst =
+					i40e_get_latest_tx_vec();
+			else
+				dev->tx_pkt_burst =
+					i40e_get_recommend_tx_vec();
 		} else {
 			PMD_INIT_LOG(DEBUG, "Simple tx finally be used.");
 			dev->tx_pkt_burst = i40e_xmit_pkts_simple;
-- 
2.17.1

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [dpdk-dev] [PATCH v4] net/i40e: add interface to choose latest vector path
  2018-09-10 10:17 ` [dpdk-dev] [PATCH v4] " Xiaoyun Li
@ 2018-09-12  7:45   ` Zhang, Qi Z
  2018-09-12  7:50     ` Zhang, Qi Z
  0 siblings, 1 reply; 19+ messages in thread
From: Zhang, Qi Z @ 2018-09-12  7:45 UTC (permalink / raw)
  To: Li, Xiaoyun, Xing, Beilei
  Cc: dev, Yang, Zhiyong, Richardson, Bruce, Hunt, David



> -----Original Message-----
> From: Li, Xiaoyun
> Sent: Monday, September 10, 2018 6:18 PM
> To: Xing, Beilei <beilei.xing@intel.com>; Zhang, Qi Z <qi.z.zhang@intel.com>
> Cc: dev@dpdk.org; Yang, Zhiyong <zhiyong.yang@intel.com>; Richardson,
> Bruce <bruce.richardson@intel.com>; Hunt, David <david.hunt@intel.com>; Li,
> Xiaoyun <xiaoyun.li@intel.com>
> Subject: [PATCH v4] net/i40e: add interface to choose latest vector path
> 
> Right now, vector path is limited to only use on later platform.
> This patch adds a devarg use-latest-vec to allow the users to use the latest
> vector path that the platform supported. Namely, using AVX2 vector path on
> broadwell is possible.
> 
> Signed-off-by: Xiaoyun Li <xiaoyun.li@intel.com>
> ---
> v4:
>  * Polish the codes.
> v3:
>  * Polish the doc and commit log.
> v2:
>  * Correct the calling of the wrong function last time.
>  * Fix seg fault bug.
> ---
>  doc/guides/nics/i40e.rst               |   8 ++
>  doc/guides/rel_notes/release_18_11.rst |   4 +
>  drivers/net/i40e/i40e_ethdev.c         |  46 ++++++++++-
>  drivers/net/i40e/i40e_ethdev.h         |   3 +
>  drivers/net/i40e/i40e_rxtx.c           | 103 ++++++++++++++++---------
>  5 files changed, 128 insertions(+), 36 deletions(-)
> 
> diff --git a/doc/guides/nics/i40e.rst b/doc/guides/nics/i40e.rst index
> 65d87f869..643e6a062 100644
> --- a/doc/guides/nics/i40e.rst
> +++ b/doc/guides/nics/i40e.rst
> @@ -163,6 +163,14 @@ Runtime Config Options
>    Currently hot-plugging of representor ports is not supported so all required
>    representors must be specified on the creation of the PF.
> 
> +- ``Use latest vector`` (default ``disable``)
> +
> +  Vector path was limited to use only on later platform. But users may
> + want the  latest vector path. For example, VPP users may want to use
> + AVX2 vector path on HSW/BDW  because it can get better perf. So
> + ``devargs`` parameter ``use-latest-vec`` is  introduced, for example::
> +    -w 84:00.0,use-latest-vec=1
> +
>  Driver compilation and testing
>  ------------------------------
> 
> diff --git a/doc/guides/rel_notes/release_18_11.rst
> b/doc/guides/rel_notes/release_18_11.rst
> index 3ae6b3f58..34af591a2 100644
> --- a/doc/guides/rel_notes/release_18_11.rst
> +++ b/doc/guides/rel_notes/release_18_11.rst
> @@ -54,6 +54,10 @@ New Features
>       Also, make sure to start the actual text at the margin.
>       =========================================================
> 
> +* **Added a devarg to use the latest vector path.**
> +  A new devarg ``use-latest-vec`` was introduced to allow users to
> +choose
> +  the latest vector path that the platform supported. For example, VPP
> +users
> +  can use AVX2 vector path on BDW/HSW to get better performance.
> 
>  API Changes
>  -----------
> diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
> index 85a6a867f..72377d0b6 100644
> --- a/drivers/net/i40e/i40e_ethdev.c
> +++ b/drivers/net/i40e/i40e_ethdev.c
> @@ -44,6 +44,7 @@
>  #define ETH_I40E_FLOATING_VEB_LIST_ARG	"floating_veb_list"
>  #define ETH_I40E_SUPPORT_MULTI_DRIVER	"support-multi-driver"
>  #define ETH_I40E_QUEUE_NUM_PER_VF_ARG	"queue-num-per-vf"
> +#define ETH_I40E_USE_LATEST_VEC	"use-latest-vec"
> 
>  #define I40E_CLEAR_PXE_WAIT_MS     200
> 
> @@ -408,6 +409,7 @@ static const char *const valid_keys[] = {
>  	ETH_I40E_FLOATING_VEB_LIST_ARG,
>  	ETH_I40E_SUPPORT_MULTI_DRIVER,
>  	ETH_I40E_QUEUE_NUM_PER_VF_ARG,
> +	ETH_I40E_USE_LATEST_VEC,
>  	NULL};
> 
>  static const struct rte_pci_id pci_id_i40e_map[] = { @@ -1201,6 +1203,46 @@
> i40e_aq_debug_write_global_register(struct i40e_hw *hw,
>  	return i40e_aq_debug_write_register(hw, reg_addr, reg_val,
> cmd_details);  }
> 
> +static int
> +i40e_parse_latest_vec(struct rte_eth_dev *dev) {
> +	struct i40e_adapter *ad =
> +		I40E_DEV_PRIVATE_TO_ADAPTER(dev->data->dev_private);
> +	int kvargs_count, use_latest_vec;
> +	struct rte_kvargs *kvlist;
> +
> +	ad->use_latest_vec = false;
> +
> +	if (!dev->device->devargs)
> +		return 0;
> +
> +	kvlist = rte_kvargs_parse(dev->device->devargs->args, valid_keys);
> +	if (!kvlist)
> +		return -EINVAL;
> +
> +	kvargs_count = rte_kvargs_count(kvlist, ETH_I40E_USE_LATEST_VEC);
> +	if (!kvargs_count) {
> +		rte_kvargs_free(kvlist);
> +		return 0;
> +	}
> +
> +	if (kvargs_count > 1)
> +		PMD_DRV_LOG(WARNING, "More than one argument \"%s\" and
> only "
> +			    "the first one is used !",
> +			    ETH_I40E_USE_LATEST_VEC);
> +
> +	use_latest_vec = atoi((&kvlist->pairs[0])->value);
> +
> +	rte_kvargs_free(kvlist);
> +
> +	if (use_latest_vec != 0 && use_latest_vec != 1)
> +		PMD_DRV_LOG(WARNING, "Value should be 0 or 1, set it as 1!");
> +
> +	ad->use_latest_vec = (bool)use_latest_vec;
> +
> +	return 0;
> +}
> +
>  static int
>  eth_i40e_dev_init(struct rte_eth_dev *dev, void *init_params __rte_unused)
> { @@ -1263,6 +1305,7 @@ eth_i40e_dev_init(struct rte_eth_dev *dev, void
> *init_params __rte_unused)
> 
>  	/* Check if need to support multi-driver */
>  	i40e_support_multi_driver(dev);
> +	i40e_parse_latest_vec(dev);
> 
>  	/* Make sure all is clean before doing PF reset */
>  	i40e_clear_hw(hw);
> @@ -12527,4 +12570,5 @@ RTE_PMD_REGISTER_PARAM_STRING(net_i40e,
>  			      ETH_I40E_FLOATING_VEB_ARG "=1"
>  			      ETH_I40E_FLOATING_VEB_LIST_ARG "=<string>"
>  			      ETH_I40E_QUEUE_NUM_PER_VF_ARG "=1|2|4|8|16"
> -			      ETH_I40E_SUPPORT_MULTI_DRIVER "=1");
> +			      ETH_I40E_SUPPORT_MULTI_DRIVER "=1"
> +			      ETH_I40E_USE_LATEST_VEC "=1");
> diff --git a/drivers/net/i40e/i40e_ethdev.h b/drivers/net/i40e/i40e_ethdev.h
> index 3fffe5a55..140c92b84 100644
> --- a/drivers/net/i40e/i40e_ethdev.h
> +++ b/drivers/net/i40e/i40e_ethdev.h
> @@ -1078,6 +1078,9 @@ struct i40e_adapter {
>  	uint64_t pctypes_tbl[I40E_FLOW_TYPE_MAX] __rte_cache_min_aligned;
>  	uint64_t flow_types_mask;
>  	uint64_t pctypes_mask;
> +
> +	/* For devargs */
> +	bool use_latest_vec;
>  };
> 
>  /**
> diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/i40e/i40e_rxtx.c index
> 2a28ee348..e9fa7ed90 100644
> --- a/drivers/net/i40e/i40e_rxtx.c
> +++ b/drivers/net/i40e/i40e_rxtx.c
> @@ -2909,6 +2909,34 @@ i40e_txq_info_get(struct rte_eth_dev *dev,
> uint16_t queue_id,
>  	qinfo->conf.offloads = txq->offloads;
>  }
> 
> +static eth_rx_burst_t
> +i40e_get_latest_rx_vec(bool scatter)
> +{
> +#ifdef RTE_ARCH_X86
> +	if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX2))
> +		return scatter ? i40e_recv_scattered_pkts_vec_avx2 :
> +				 i40e_recv_pkts_vec_avx2;
> +#endif
> +	return scatter ? i40e_recv_scattered_pkts_vec :
> +			 i40e_recv_pkts_vec;
> +}
> +
> +static eth_rx_burst_t
> +i40e_get_recommend_rx_vec(bool scatter) { #ifdef RTE_ARCH_X86
> +	/*
> +	 * since AVX frequency can be different to base frequency, limit
> +	 * use of AVX2 version to later plaforms, not all those that could
> +	 * theoretically run it.
> +	 */
> +	if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F))
> +		return scatter ? i40e_recv_scattered_pkts_vec_avx2 :
> +				 i40e_recv_pkts_vec_avx2;
> +#endif
> +	return scatter ? i40e_recv_scattered_pkts_vec :
> +			 i40e_recv_pkts_vec;
> +}
>  void __attribute__((cold))
>  i40e_set_rx_function(struct rte_eth_dev *dev)  { @@ -2948,19 +2976,12
> @@ i40e_set_rx_function(struct rte_eth_dev *dev)
>  			PMD_INIT_LOG(DEBUG, "Using Vector Scattered Rx "
>  					    "callback (port=%d).",
>  				     dev->data->port_id);
> -
> -			dev->rx_pkt_burst = i40e_recv_scattered_pkts_vec;
> -#ifdef RTE_ARCH_X86
> -			/*
> -			 * since AVX frequency can be different to base
> -			 * frequency, limit use of AVX2 version to later
> -			 * plaforms, not all those that could theoretically
> -			 * run it.
> -			 */
> -			if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F))
> +			if (ad->use_latest_vec)
>  				dev->rx_pkt_burst =
> -					i40e_recv_scattered_pkts_vec_avx2;
> -#endif
> +					i40e_get_latest_rx_vec(true);
> +			else
> +				dev->rx_pkt_burst =
> +					i40e_get_recommend_rx_vec(true);
>  		} else {
>  			PMD_INIT_LOG(DEBUG, "Using a Scattered with bulk "
>  					   "allocation callback (port=%d).", @@ -2978,18
> +2999,10 @@ i40e_set_rx_function(struct rte_eth_dev *dev)
>  				    "burst size no less than %d (port=%d).",
>  			     RTE_I40E_DESCS_PER_LOOP,
>  			     dev->data->port_id);
> -
> -		dev->rx_pkt_burst = i40e_recv_pkts_vec;
> -#ifdef RTE_ARCH_X86
> -		/*
> -		 * since AVX frequency can be different to base
> -		 * frequency, limit use of AVX2 version to later
> -		 * plaforms, not all those that could theoretically
> -		 * run it.
> -		 */
> -		if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F))
> -			dev->rx_pkt_burst = i40e_recv_pkts_vec_avx2;
> -#endif
> +		if (ad->use_latest_vec)
> +			dev->rx_pkt_burst = i40e_get_latest_rx_vec(false);
> +		else
> +			dev->rx_pkt_burst = i40e_get_recommend_rx_vec(false);


How about simplify the code as below?

/* default */
dev->rx_pkt_burst = dev->data->scattered_rx ?
		i40e_recv_scattered_pkts : i40e_recv_pkts;

if (ad->rx_vec_allowed) {
	/* overwrite by vec path*/
	if (ad->use_latest_vec)
		dev->rx_pkt_burst = i40e_get_latest_rx_vec(dev->data->scattered_rx);
	else
		dev->rx_pkt_burst = i40e_get_recommend_rx_vec(dev->data->scattered_rx);
} else if (ad->rx_bulk_alloc_allowed) {
	/* or overwrite by bulk alloc */
	dev->rx_pkt_burst = i40e_recv_pkts_bulk_alloc;
}


>  	} else if (ad->rx_bulk_alloc_allowed) {
>  		PMD_INIT_LOG(DEBUG, "Rx Burst Bulk Alloc Preconditions are "
>  				    "satisfied. Rx Burst Bulk Alloc function "
> @@ -3049,6 +3062,31 @@ i40e_set_tx_function_flag(struct rte_eth_dev *dev,
> struct i40e_tx_queue *txq)
>  				txq->queue_id);
>  }
> 
> +static eth_tx_burst_t
> +i40e_get_latest_tx_vec(void)
> +{
> +#ifdef RTE_ARCH_X86
> +	if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX2))
> +		return i40e_xmit_pkts_vec_avx2;
> +#endif
> +	return i40e_xmit_pkts_vec;
> +}
> +
> +static eth_tx_burst_t
> +i40e_get_recommend_tx_vec(void)
> +{
> +#ifdef RTE_ARCH_X86
> +	/*
> +	 * since AVX frequency can be different to base frequency, limit
> +	 * use of AVX2 version to later plaforms, not all those that could
> +	 * theoretically run it.
> +	 */
> +	if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F))
> +		return i40e_xmit_pkts_vec_avx2;
> +#endif
> +	return i40e_xmit_pkts_vec;
> +}
> +
>  void __attribute__((cold))
>  i40e_set_tx_function(struct rte_eth_dev *dev)  { @@ -3073,17 +3111,12
> @@ i40e_set_tx_function(struct rte_eth_dev *dev)
>  	if (ad->tx_simple_allowed) {
>  		if (ad->tx_vec_allowed) {
>  			PMD_INIT_LOG(DEBUG, "Vector tx finally be used.");
> -			dev->tx_pkt_burst = i40e_xmit_pkts_vec;
> -#ifdef RTE_ARCH_X86
> -			/*
> -			 * since AVX frequency can be different to base
> -			 * frequency, limit use of AVX2 version to later
> -			 * plaforms, not all those that could theoretically
> -			 * run it.
> -			 */
> -			if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F))
> -				dev->tx_pkt_burst = i40e_xmit_pkts_vec_avx2;
> -#endif
> +			if (ad->use_latest_vec)
> +				dev->tx_pkt_burst =
> +					i40e_get_latest_tx_vec();
> +			else
> +				dev->tx_pkt_burst =
> +					i40e_get_recommend_tx_vec();
>  		} else {
>  			PMD_INIT_LOG(DEBUG, "Simple tx finally be used.");
>  			dev->tx_pkt_burst = i40e_xmit_pkts_simple;
> --
> 2.17.1

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [dpdk-dev] [PATCH v4] net/i40e: add interface to choose latest vector path
  2018-09-12  7:45   ` Zhang, Qi Z
@ 2018-09-12  7:50     ` Zhang, Qi Z
  0 siblings, 0 replies; 19+ messages in thread
From: Zhang, Qi Z @ 2018-09-12  7:50 UTC (permalink / raw)
  To: Zhang, Qi Z, Li, Xiaoyun, Xing, Beilei
  Cc: dev, Yang, Zhiyong, Richardson, Bruce, Hunt, David



> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Zhang, Qi Z
> Sent: Wednesday, September 12, 2018 3:45 PM
> To: Li, Xiaoyun <xiaoyun.li@intel.com>; Xing, Beilei <beilei.xing@intel.com>
> Cc: dev@dpdk.org; Yang, Zhiyong <zhiyong.yang@intel.com>; Richardson,
> Bruce <bruce.richardson@intel.com>; Hunt, David <david.hunt@intel.com>
> Subject: Re: [dpdk-dev] [PATCH v4] net/i40e: add interface to choose latest
> vector path
> 
> 
> 
> > -----Original Message-----
> > From: Li, Xiaoyun
> > Sent: Monday, September 10, 2018 6:18 PM
> > To: Xing, Beilei <beilei.xing@intel.com>; Zhang, Qi Z
> > <qi.z.zhang@intel.com>
> > Cc: dev@dpdk.org; Yang, Zhiyong <zhiyong.yang@intel.com>; Richardson,
> > Bruce <bruce.richardson@intel.com>; Hunt, David
> > <david.hunt@intel.com>; Li, Xiaoyun <xiaoyun.li@intel.com>
> > Subject: [PATCH v4] net/i40e: add interface to choose latest vector
> > path
> >
> > Right now, vector path is limited to only use on later platform.
> > This patch adds a devarg use-latest-vec to allow the users to use the
> > latest vector path that the platform supported. Namely, using AVX2
> > vector path on broadwell is possible.
> >
> > Signed-off-by: Xiaoyun Li <xiaoyun.li@intel.com>
> > ---
> > v4:
> >  * Polish the codes.
> > v3:
> >  * Polish the doc and commit log.
> > v2:
> >  * Correct the calling of the wrong function last time.
> >  * Fix seg fault bug.
> > ---
> >  doc/guides/nics/i40e.rst               |   8 ++
> >  doc/guides/rel_notes/release_18_11.rst |   4 +
> >  drivers/net/i40e/i40e_ethdev.c         |  46 ++++++++++-
> >  drivers/net/i40e/i40e_ethdev.h         |   3 +
> >  drivers/net/i40e/i40e_rxtx.c           | 103 ++++++++++++++++---------
> >  5 files changed, 128 insertions(+), 36 deletions(-)
> >
> > diff --git a/doc/guides/nics/i40e.rst b/doc/guides/nics/i40e.rst index
> > 65d87f869..643e6a062 100644
> > --- a/doc/guides/nics/i40e.rst
> > +++ b/doc/guides/nics/i40e.rst
> > @@ -163,6 +163,14 @@ Runtime Config Options
> >    Currently hot-plugging of representor ports is not supported so all
> required
> >    representors must be specified on the creation of the PF.
> >
> > +- ``Use latest vector`` (default ``disable``)
> > +
> > +  Vector path was limited to use only on later platform. But users
> > + may want the  latest vector path. For example, VPP users may want to
> > + use
> > + AVX2 vector path on HSW/BDW  because it can get better perf. So
> > + ``devargs`` parameter ``use-latest-vec`` is  introduced, for example::
> > +    -w 84:00.0,use-latest-vec=1
> > +
> >  Driver compilation and testing
> >  ------------------------------
> >
> > diff --git a/doc/guides/rel_notes/release_18_11.rst
> > b/doc/guides/rel_notes/release_18_11.rst
> > index 3ae6b3f58..34af591a2 100644
> > --- a/doc/guides/rel_notes/release_18_11.rst
> > +++ b/doc/guides/rel_notes/release_18_11.rst
> > @@ -54,6 +54,10 @@ New Features
> >       Also, make sure to start the actual text at the margin.
> >       =========================================================
> >
> > +* **Added a devarg to use the latest vector path.**
> > +  A new devarg ``use-latest-vec`` was introduced to allow users to
> > +choose
> > +  the latest vector path that the platform supported. For example,
> > +VPP users
> > +  can use AVX2 vector path on BDW/HSW to get better performance.
> >
> >  API Changes
> >  -----------
> > diff --git a/drivers/net/i40e/i40e_ethdev.c
> > b/drivers/net/i40e/i40e_ethdev.c index 85a6a867f..72377d0b6 100644
> > --- a/drivers/net/i40e/i40e_ethdev.c
> > +++ b/drivers/net/i40e/i40e_ethdev.c
> > @@ -44,6 +44,7 @@
> >  #define ETH_I40E_FLOATING_VEB_LIST_ARG	"floating_veb_list"
> >  #define ETH_I40E_SUPPORT_MULTI_DRIVER	"support-multi-driver"
> >  #define ETH_I40E_QUEUE_NUM_PER_VF_ARG	"queue-num-per-vf"
> > +#define ETH_I40E_USE_LATEST_VEC	"use-latest-vec"
> >
> >  #define I40E_CLEAR_PXE_WAIT_MS     200
> >
> > @@ -408,6 +409,7 @@ static const char *const valid_keys[] = {
> >  	ETH_I40E_FLOATING_VEB_LIST_ARG,
> >  	ETH_I40E_SUPPORT_MULTI_DRIVER,
> >  	ETH_I40E_QUEUE_NUM_PER_VF_ARG,
> > +	ETH_I40E_USE_LATEST_VEC,
> >  	NULL};
> >
> >  static const struct rte_pci_id pci_id_i40e_map[] = { @@ -1201,6
> > +1203,46 @@ i40e_aq_debug_write_global_register(struct i40e_hw *hw,
> >  	return i40e_aq_debug_write_register(hw, reg_addr, reg_val,
> > cmd_details);  }
> >
> > +static int
> > +i40e_parse_latest_vec(struct rte_eth_dev *dev) {
> > +	struct i40e_adapter *ad =
> > +		I40E_DEV_PRIVATE_TO_ADAPTER(dev->data->dev_private);
> > +	int kvargs_count, use_latest_vec;
> > +	struct rte_kvargs *kvlist;
> > +
> > +	ad->use_latest_vec = false;
> > +
> > +	if (!dev->device->devargs)
> > +		return 0;
> > +
> > +	kvlist = rte_kvargs_parse(dev->device->devargs->args, valid_keys);
> > +	if (!kvlist)
> > +		return -EINVAL;
> > +
> > +	kvargs_count = rte_kvargs_count(kvlist, ETH_I40E_USE_LATEST_VEC);
> > +	if (!kvargs_count) {
> > +		rte_kvargs_free(kvlist);
> > +		return 0;
> > +	}
> > +
> > +	if (kvargs_count > 1)
> > +		PMD_DRV_LOG(WARNING, "More than one argument \"%s\" and
> > only "
> > +			    "the first one is used !",
> > +			    ETH_I40E_USE_LATEST_VEC);
> > +
> > +	use_latest_vec = atoi((&kvlist->pairs[0])->value);
> > +
> > +	rte_kvargs_free(kvlist);
> > +
> > +	if (use_latest_vec != 0 && use_latest_vec != 1)
> > +		PMD_DRV_LOG(WARNING, "Value should be 0 or 1, set it as 1!");
> > +
> > +	ad->use_latest_vec = (bool)use_latest_vec;
> > +
> > +	return 0;
> > +}
> > +
> >  static int
> >  eth_i40e_dev_init(struct rte_eth_dev *dev, void *init_params
> > __rte_unused) { @@ -1263,6 +1305,7 @@ eth_i40e_dev_init(struct
> > rte_eth_dev *dev, void *init_params __rte_unused)
> >
> >  	/* Check if need to support multi-driver */
> >  	i40e_support_multi_driver(dev);
> > +	i40e_parse_latest_vec(dev);
> >
> >  	/* Make sure all is clean before doing PF reset */
> >  	i40e_clear_hw(hw);
> > @@ -12527,4 +12570,5 @@ RTE_PMD_REGISTER_PARAM_STRING(net_i40e,
> >  			      ETH_I40E_FLOATING_VEB_ARG "=1"
> >  			      ETH_I40E_FLOATING_VEB_LIST_ARG "=<string>"
> >  			      ETH_I40E_QUEUE_NUM_PER_VF_ARG "=1|2|4|8|16"
> > -			      ETH_I40E_SUPPORT_MULTI_DRIVER "=1");
> > +			      ETH_I40E_SUPPORT_MULTI_DRIVER "=1"
> > +			      ETH_I40E_USE_LATEST_VEC "=1");
> > diff --git a/drivers/net/i40e/i40e_ethdev.h
> > b/drivers/net/i40e/i40e_ethdev.h index 3fffe5a55..140c92b84 100644
> > --- a/drivers/net/i40e/i40e_ethdev.h
> > +++ b/drivers/net/i40e/i40e_ethdev.h
> > @@ -1078,6 +1078,9 @@ struct i40e_adapter {
> >  	uint64_t pctypes_tbl[I40E_FLOW_TYPE_MAX] __rte_cache_min_aligned;
> >  	uint64_t flow_types_mask;
> >  	uint64_t pctypes_mask;
> > +
> > +	/* For devargs */
> > +	bool use_latest_vec;
> >  };
> >
> >  /**
> > diff --git a/drivers/net/i40e/i40e_rxtx.c
> > b/drivers/net/i40e/i40e_rxtx.c index
> > 2a28ee348..e9fa7ed90 100644
> > --- a/drivers/net/i40e/i40e_rxtx.c
> > +++ b/drivers/net/i40e/i40e_rxtx.c
> > @@ -2909,6 +2909,34 @@ i40e_txq_info_get(struct rte_eth_dev *dev,
> > uint16_t queue_id,
> >  	qinfo->conf.offloads = txq->offloads;  }
> >
> > +static eth_rx_burst_t
> > +i40e_get_latest_rx_vec(bool scatter)
> > +{
> > +#ifdef RTE_ARCH_X86
> > +	if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX2))
> > +		return scatter ? i40e_recv_scattered_pkts_vec_avx2 :
> > +				 i40e_recv_pkts_vec_avx2;
> > +#endif
> > +	return scatter ? i40e_recv_scattered_pkts_vec :
> > +			 i40e_recv_pkts_vec;
> > +}
> > +
> > +static eth_rx_burst_t
> > +i40e_get_recommend_rx_vec(bool scatter) { #ifdef RTE_ARCH_X86
> > +	/*
> > +	 * since AVX frequency can be different to base frequency, limit
> > +	 * use of AVX2 version to later plaforms, not all those that could
> > +	 * theoretically run it.
> > +	 */
> > +	if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F))
> > +		return scatter ? i40e_recv_scattered_pkts_vec_avx2 :
> > +				 i40e_recv_pkts_vec_avx2;
> > +#endif
> > +	return scatter ? i40e_recv_scattered_pkts_vec :
> > +			 i40e_recv_pkts_vec;
> > +}
> >  void __attribute__((cold))
> >  i40e_set_rx_function(struct rte_eth_dev *dev)  { @@ -2948,19 +2976,12
> > @@ i40e_set_rx_function(struct rte_eth_dev *dev)
> >  			PMD_INIT_LOG(DEBUG, "Using Vector Scattered Rx "
> >  					    "callback (port=%d).",
> >  				     dev->data->port_id);
> > -
> > -			dev->rx_pkt_burst = i40e_recv_scattered_pkts_vec;
> > -#ifdef RTE_ARCH_X86
> > -			/*
> > -			 * since AVX frequency can be different to base
> > -			 * frequency, limit use of AVX2 version to later
> > -			 * plaforms, not all those that could theoretically
> > -			 * run it.
> > -			 */
> > -			if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F))
> > +			if (ad->use_latest_vec)
> >  				dev->rx_pkt_burst =
> > -					i40e_recv_scattered_pkts_vec_avx2;
> > -#endif
> > +					i40e_get_latest_rx_vec(true);
> > +			else
> > +				dev->rx_pkt_burst =
> > +					i40e_get_recommend_rx_vec(true);
> >  		} else {
> >  			PMD_INIT_LOG(DEBUG, "Using a Scattered with bulk "
> >  					   "allocation callback (port=%d).", @@ -2978,18
> > +2999,10 @@ i40e_set_rx_function(struct rte_eth_dev *dev)
> >  				    "burst size no less than %d (port=%d).",
> >  			     RTE_I40E_DESCS_PER_LOOP,
> >  			     dev->data->port_id);
> > -
> > -		dev->rx_pkt_burst = i40e_recv_pkts_vec;
> > -#ifdef RTE_ARCH_X86
> > -		/*
> > -		 * since AVX frequency can be different to base
> > -		 * frequency, limit use of AVX2 version to later
> > -		 * plaforms, not all those that could theoretically
> > -		 * run it.
> > -		 */
> > -		if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F))
> > -			dev->rx_pkt_burst = i40e_recv_pkts_vec_avx2;
> > -#endif
> > +		if (ad->use_latest_vec)
> > +			dev->rx_pkt_burst = i40e_get_latest_rx_vec(false);
> > +		else
> > +			dev->rx_pkt_burst = i40e_get_recommend_rx_vec(false);
> 
> 
> How about simplify the code as below?
> 
> /* default */
> dev->rx_pkt_burst = dev->data->scattered_rx ?
> 		i40e_recv_scattered_pkts : i40e_recv_pkts;
> 
> if (ad->rx_vec_allowed) {
> 	/* overwrite by vec path*/
> 	if (ad->use_latest_vec)
> 		dev->rx_pkt_burst =
> i40e_get_latest_rx_vec(dev->data->scattered_rx);
> 	else
> 		dev->rx_pkt_burst =
> i40e_get_recommend_rx_vec(dev->data->scattered_rx);
> } else if (ad->rx_bulk_alloc_allowed) {

Sorry, I think it should be:
	else if (ad->rx_bulk_alloc_allowed && ! dev->data->scattered_rx) {

> 	/* or overwrite by bulk alloc */
> 	dev->rx_pkt_burst = i40e_recv_pkts_bulk_alloc; }
> 
> 
> >  	} else if (ad->rx_bulk_alloc_allowed) {
> >  		PMD_INIT_LOG(DEBUG, "Rx Burst Bulk Alloc Preconditions are "
> >  				    "satisfied. Rx Burst Bulk Alloc function "
> > @@ -3049,6 +3062,31 @@ i40e_set_tx_function_flag(struct rte_eth_dev
> > *dev, struct i40e_tx_queue *txq)
> >  				txq->queue_id);
> >  }
> >
> > +static eth_tx_burst_t
> > +i40e_get_latest_tx_vec(void)
> > +{
> > +#ifdef RTE_ARCH_X86
> > +	if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX2))
> > +		return i40e_xmit_pkts_vec_avx2;
> > +#endif
> > +	return i40e_xmit_pkts_vec;
> > +}
> > +
> > +static eth_tx_burst_t
> > +i40e_get_recommend_tx_vec(void)
> > +{
> > +#ifdef RTE_ARCH_X86
> > +	/*
> > +	 * since AVX frequency can be different to base frequency, limit
> > +	 * use of AVX2 version to later plaforms, not all those that could
> > +	 * theoretically run it.
> > +	 */
> > +	if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F))
> > +		return i40e_xmit_pkts_vec_avx2;
> > +#endif
> > +	return i40e_xmit_pkts_vec;
> > +}
> > +
> >  void __attribute__((cold))
> >  i40e_set_tx_function(struct rte_eth_dev *dev)  { @@ -3073,17 +3111,12
> > @@ i40e_set_tx_function(struct rte_eth_dev *dev)
> >  	if (ad->tx_simple_allowed) {
> >  		if (ad->tx_vec_allowed) {
> >  			PMD_INIT_LOG(DEBUG, "Vector tx finally be used.");
> > -			dev->tx_pkt_burst = i40e_xmit_pkts_vec;
> > -#ifdef RTE_ARCH_X86
> > -			/*
> > -			 * since AVX frequency can be different to base
> > -			 * frequency, limit use of AVX2 version to later
> > -			 * plaforms, not all those that could theoretically
> > -			 * run it.
> > -			 */
> > -			if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F))
> > -				dev->tx_pkt_burst = i40e_xmit_pkts_vec_avx2;
> > -#endif
> > +			if (ad->use_latest_vec)
> > +				dev->tx_pkt_burst =
> > +					i40e_get_latest_tx_vec();
> > +			else
> > +				dev->tx_pkt_burst =
> > +					i40e_get_recommend_tx_vec();
> >  		} else {
> >  			PMD_INIT_LOG(DEBUG, "Simple tx finally be used.");
> >  			dev->tx_pkt_burst = i40e_xmit_pkts_simple;
> > --
> > 2.17.1

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [dpdk-dev] [PATCH v5] net/i40e: add interface to choose latest vector path
  2018-08-30  2:16 [dpdk-dev] [PATCH] net/i40e: add interface to choose latest vector path Xiaoyun Li
                   ` (2 preceding siblings ...)
  2018-09-10 10:17 ` [dpdk-dev] [PATCH v4] " Xiaoyun Li
@ 2018-09-12 10:12 ` Xiaoyun Li
  2018-09-13  1:38   ` Zhang, Qi Z
  2018-09-13 13:27   ` Ferruh Yigit
  2018-09-17  9:58 ` [dpdk-dev] [PATCH v6] net/i40e: add interface to use latest vec path Xiaoyun Li
  2018-09-18  2:22 ` [dpdk-dev] [PATCH v7] " Xiaoyun Li
  5 siblings, 2 replies; 19+ messages in thread
From: Xiaoyun Li @ 2018-09-12 10:12 UTC (permalink / raw)
  To: beilei.xing, qi.z.zhang; +Cc: dev, zhiyong.yang, Xiaoyun Li

Right now, vector path is limited to only use on later platform.
This patch adds a devarg use-latest-vec to allow the users to
use the latest vector path that the platform supported. Namely,
using AVX2 vector path on broadwell is possible.

Signed-off-by: Xiaoyun Li <xiaoyun.li@intel.com>
---
v5:
 * Simpify the rx set function.
v4:
 * Polish the codes.
v3:
 * Polish the doc and commit log.
v2:
 * Correct the calling of the wrong function last time.
 * Fix seg fault bug.
---
 doc/guides/nics/i40e.rst               |   8 ++
 doc/guides/rel_notes/release_18_11.rst |   4 +
 drivers/net/i40e/i40e_ethdev.c         |  46 +++++++-
 drivers/net/i40e/i40e_ethdev.h         |   3 +
 drivers/net/i40e/i40e_rxtx.c           | 143 +++++++++++++------------
 5 files changed, 136 insertions(+), 68 deletions(-)

diff --git a/doc/guides/nics/i40e.rst b/doc/guides/nics/i40e.rst
index 65d87f869..643e6a062 100644
--- a/doc/guides/nics/i40e.rst
+++ b/doc/guides/nics/i40e.rst
@@ -163,6 +163,14 @@ Runtime Config Options
   Currently hot-plugging of representor ports is not supported so all required
   representors must be specified on the creation of the PF.
 
+- ``Use latest vector`` (default ``disable``)
+
+  Vector path was limited to use only on later platform. But users may want the
+  latest vector path. For example, VPP users may want to use AVX2 vector path on HSW/BDW
+  because it can get better perf. So ``devargs`` parameter ``use-latest-vec`` is
+  introduced, for example::
+    -w 84:00.0,use-latest-vec=1
+
 Driver compilation and testing
 ------------------------------
 
diff --git a/doc/guides/rel_notes/release_18_11.rst b/doc/guides/rel_notes/release_18_11.rst
index 3ae6b3f58..34af591a2 100644
--- a/doc/guides/rel_notes/release_18_11.rst
+++ b/doc/guides/rel_notes/release_18_11.rst
@@ -54,6 +54,10 @@ New Features
      Also, make sure to start the actual text at the margin.
      =========================================================
 
+* **Added a devarg to use the latest vector path.**
+  A new devarg ``use-latest-vec`` was introduced to allow users to choose
+  the latest vector path that the platform supported. For example, VPP users
+  can use AVX2 vector path on BDW/HSW to get better performance.
 
 API Changes
 -----------
diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
index 85a6a867f..72377d0b6 100644
--- a/drivers/net/i40e/i40e_ethdev.c
+++ b/drivers/net/i40e/i40e_ethdev.c
@@ -44,6 +44,7 @@
 #define ETH_I40E_FLOATING_VEB_LIST_ARG	"floating_veb_list"
 #define ETH_I40E_SUPPORT_MULTI_DRIVER	"support-multi-driver"
 #define ETH_I40E_QUEUE_NUM_PER_VF_ARG	"queue-num-per-vf"
+#define ETH_I40E_USE_LATEST_VEC	"use-latest-vec"
 
 #define I40E_CLEAR_PXE_WAIT_MS     200
 
@@ -408,6 +409,7 @@ static const char *const valid_keys[] = {
 	ETH_I40E_FLOATING_VEB_LIST_ARG,
 	ETH_I40E_SUPPORT_MULTI_DRIVER,
 	ETH_I40E_QUEUE_NUM_PER_VF_ARG,
+	ETH_I40E_USE_LATEST_VEC,
 	NULL};
 
 static const struct rte_pci_id pci_id_i40e_map[] = {
@@ -1201,6 +1203,46 @@ i40e_aq_debug_write_global_register(struct i40e_hw *hw,
 	return i40e_aq_debug_write_register(hw, reg_addr, reg_val, cmd_details);
 }
 
+static int
+i40e_parse_latest_vec(struct rte_eth_dev *dev)
+{
+	struct i40e_adapter *ad =
+		I40E_DEV_PRIVATE_TO_ADAPTER(dev->data->dev_private);
+	int kvargs_count, use_latest_vec;
+	struct rte_kvargs *kvlist;
+
+	ad->use_latest_vec = false;
+
+	if (!dev->device->devargs)
+		return 0;
+
+	kvlist = rte_kvargs_parse(dev->device->devargs->args, valid_keys);
+	if (!kvlist)
+		return -EINVAL;
+
+	kvargs_count = rte_kvargs_count(kvlist, ETH_I40E_USE_LATEST_VEC);
+	if (!kvargs_count) {
+		rte_kvargs_free(kvlist);
+		return 0;
+	}
+
+	if (kvargs_count > 1)
+		PMD_DRV_LOG(WARNING, "More than one argument \"%s\" and only "
+			    "the first one is used !",
+			    ETH_I40E_USE_LATEST_VEC);
+
+	use_latest_vec = atoi((&kvlist->pairs[0])->value);
+
+	rte_kvargs_free(kvlist);
+
+	if (use_latest_vec != 0 && use_latest_vec != 1)
+		PMD_DRV_LOG(WARNING, "Value should be 0 or 1, set it as 1!");
+
+	ad->use_latest_vec = (bool)use_latest_vec;
+
+	return 0;
+}
+
 static int
 eth_i40e_dev_init(struct rte_eth_dev *dev, void *init_params __rte_unused)
 {
@@ -1263,6 +1305,7 @@ eth_i40e_dev_init(struct rte_eth_dev *dev, void *init_params __rte_unused)
 
 	/* Check if need to support multi-driver */
 	i40e_support_multi_driver(dev);
+	i40e_parse_latest_vec(dev);
 
 	/* Make sure all is clean before doing PF reset */
 	i40e_clear_hw(hw);
@@ -12527,4 +12570,5 @@ RTE_PMD_REGISTER_PARAM_STRING(net_i40e,
 			      ETH_I40E_FLOATING_VEB_ARG "=1"
 			      ETH_I40E_FLOATING_VEB_LIST_ARG "=<string>"
 			      ETH_I40E_QUEUE_NUM_PER_VF_ARG "=1|2|4|8|16"
-			      ETH_I40E_SUPPORT_MULTI_DRIVER "=1");
+			      ETH_I40E_SUPPORT_MULTI_DRIVER "=1"
+			      ETH_I40E_USE_LATEST_VEC "=1");
diff --git a/drivers/net/i40e/i40e_ethdev.h b/drivers/net/i40e/i40e_ethdev.h
index 3fffe5a55..140c92b84 100644
--- a/drivers/net/i40e/i40e_ethdev.h
+++ b/drivers/net/i40e/i40e_ethdev.h
@@ -1078,6 +1078,9 @@ struct i40e_adapter {
 	uint64_t pctypes_tbl[I40E_FLOW_TYPE_MAX] __rte_cache_min_aligned;
 	uint64_t flow_types_mask;
 	uint64_t pctypes_mask;
+
+	/* For devargs */
+	bool use_latest_vec;
 };
 
 /**
diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/i40e/i40e_rxtx.c
index 2a28ee348..937dfdf3e 100644
--- a/drivers/net/i40e/i40e_rxtx.c
+++ b/drivers/net/i40e/i40e_rxtx.c
@@ -2909,6 +2909,35 @@ i40e_txq_info_get(struct rte_eth_dev *dev, uint16_t queue_id,
 	qinfo->conf.offloads = txq->offloads;
 }
 
+static eth_rx_burst_t
+i40e_get_latest_rx_vec(bool scatter)
+{
+#ifdef RTE_ARCH_X86
+	if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX2))
+		return scatter ? i40e_recv_scattered_pkts_vec_avx2 :
+				 i40e_recv_pkts_vec_avx2;
+#endif
+	return scatter ? i40e_recv_scattered_pkts_vec :
+			 i40e_recv_pkts_vec;
+}
+
+static eth_rx_burst_t
+i40e_get_recommend_rx_vec(bool scatter)
+{
+#ifdef RTE_ARCH_X86
+	/*
+	 * since AVX frequency can be different to base frequency, limit
+	 * use of AVX2 version to later plaforms, not all those that could
+	 * theoretically run it.
+	 */
+	if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F))
+		return scatter ? i40e_recv_scattered_pkts_vec_avx2 :
+				 i40e_recv_pkts_vec_avx2;
+#endif
+	return scatter ? i40e_recv_scattered_pkts_vec :
+			 i40e_recv_pkts_vec;
+}
+
 void __attribute__((cold))
 i40e_set_rx_function(struct rte_eth_dev *dev)
 {
@@ -2940,57 +2969,17 @@ i40e_set_rx_function(struct rte_eth_dev *dev)
 		}
 	}
 
-	if (dev->data->scattered_rx) {
-		/* Set the non-LRO scattered callback: there are Vector and
-		 * single allocation versions.
-		 */
-		if (ad->rx_vec_allowed) {
-			PMD_INIT_LOG(DEBUG, "Using Vector Scattered Rx "
-					    "callback (port=%d).",
-				     dev->data->port_id);
-
-			dev->rx_pkt_burst = i40e_recv_scattered_pkts_vec;
-#ifdef RTE_ARCH_X86
-			/*
-			 * since AVX frequency can be different to base
-			 * frequency, limit use of AVX2 version to later
-			 * plaforms, not all those that could theoretically
-			 * run it.
-			 */
-			if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F))
-				dev->rx_pkt_burst =
-					i40e_recv_scattered_pkts_vec_avx2;
-#endif
-		} else {
-			PMD_INIT_LOG(DEBUG, "Using a Scattered with bulk "
-					   "allocation callback (port=%d).",
-				     dev->data->port_id);
-			dev->rx_pkt_burst = i40e_recv_scattered_pkts;
-		}
-	/* If parameters allow we are going to choose between the following
-	 * callbacks:
-	 *    - Vector
-	 *    - Bulk Allocation
-	 *    - Single buffer allocation (the simplest one)
-	 */
-	} else if (ad->rx_vec_allowed) {
-		PMD_INIT_LOG(DEBUG, "Vector rx enabled, please make sure RX "
-				    "burst size no less than %d (port=%d).",
-			     RTE_I40E_DESCS_PER_LOOP,
-			     dev->data->port_id);
-
-		dev->rx_pkt_burst = i40e_recv_pkts_vec;
-#ifdef RTE_ARCH_X86
-		/*
-		 * since AVX frequency can be different to base
-		 * frequency, limit use of AVX2 version to later
-		 * plaforms, not all those that could theoretically
-		 * run it.
-		 */
-		if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F))
-			dev->rx_pkt_burst = i40e_recv_pkts_vec_avx2;
-#endif
-	} else if (ad->rx_bulk_alloc_allowed) {
+	if (ad->rx_vec_allowed) {
+		/* Vec Rx path */
+		PMD_INIT_LOG(DEBUG, "Vector Rx path will be used on port=%d.",
+				dev->data->port_id);
+		if (ad->use_latest_vec)
+			dev->rx_pkt_burst =
+			i40e_get_latest_rx_vec(dev->data->scattered_rx);
+		else
+			dev->rx_pkt_burst =
+			i40e_get_recommend_rx_vec(dev->data->scattered_rx);
+	} else if (!dev->data->scattered_rx && ad->rx_bulk_alloc_allowed) {
 		PMD_INIT_LOG(DEBUG, "Rx Burst Bulk Alloc Preconditions are "
 				    "satisfied. Rx Burst Bulk Alloc function "
 				    "will be used on port=%d.",
@@ -2998,12 +2987,12 @@ i40e_set_rx_function(struct rte_eth_dev *dev)
 
 		dev->rx_pkt_burst = i40e_recv_pkts_bulk_alloc;
 	} else {
-		PMD_INIT_LOG(DEBUG, "Rx Burst Bulk Alloc Preconditions are not "
-				    "satisfied, or Scattered Rx is requested "
-				    "(port=%d).",
+		/* Simple Rx Path. */
+		PMD_INIT_LOG(DEBUG, "Simple Rx path will be used on port=%d.",
 			     dev->data->port_id);
-
-		dev->rx_pkt_burst = i40e_recv_pkts;
+		dev->rx_pkt_burst = dev->data->scattered_rx ?
+					i40e_recv_scattered_pkts :
+					i40e_recv_pkts;
 	}
 
 	/* Propagate information about RX function choice through all queues. */
@@ -3049,6 +3038,31 @@ i40e_set_tx_function_flag(struct rte_eth_dev *dev, struct i40e_tx_queue *txq)
 				txq->queue_id);
 }
 
+static eth_tx_burst_t
+i40e_get_latest_tx_vec(void)
+{
+#ifdef RTE_ARCH_X86
+	if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX2))
+		return i40e_xmit_pkts_vec_avx2;
+#endif
+	return i40e_xmit_pkts_vec;
+}
+
+static eth_tx_burst_t
+i40e_get_recommend_tx_vec(void)
+{
+#ifdef RTE_ARCH_X86
+	/*
+	 * since AVX frequency can be different to base frequency, limit
+	 * use of AVX2 version to later plaforms, not all those that could
+	 * theoretically run it.
+	 */
+	if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F))
+		return i40e_xmit_pkts_vec_avx2;
+#endif
+	return i40e_xmit_pkts_vec;
+}
+
 void __attribute__((cold))
 i40e_set_tx_function(struct rte_eth_dev *dev)
 {
@@ -3073,17 +3087,12 @@ i40e_set_tx_function(struct rte_eth_dev *dev)
 	if (ad->tx_simple_allowed) {
 		if (ad->tx_vec_allowed) {
 			PMD_INIT_LOG(DEBUG, "Vector tx finally be used.");
-			dev->tx_pkt_burst = i40e_xmit_pkts_vec;
-#ifdef RTE_ARCH_X86
-			/*
-			 * since AVX frequency can be different to base
-			 * frequency, limit use of AVX2 version to later
-			 * plaforms, not all those that could theoretically
-			 * run it.
-			 */
-			if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F))
-				dev->tx_pkt_burst = i40e_xmit_pkts_vec_avx2;
-#endif
+			if (ad->use_latest_vec)
+				dev->tx_pkt_burst =
+					i40e_get_latest_tx_vec();
+			else
+				dev->tx_pkt_burst =
+					i40e_get_recommend_tx_vec();
 		} else {
 			PMD_INIT_LOG(DEBUG, "Simple tx finally be used.");
 			dev->tx_pkt_burst = i40e_xmit_pkts_simple;
-- 
2.17.1

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [dpdk-dev] [PATCH v5] net/i40e: add interface to choose latest vector path
  2018-09-12 10:12 ` [dpdk-dev] [PATCH v5] " Xiaoyun Li
@ 2018-09-13  1:38   ` Zhang, Qi Z
  2018-09-13 13:27   ` Ferruh Yigit
  1 sibling, 0 replies; 19+ messages in thread
From: Zhang, Qi Z @ 2018-09-13  1:38 UTC (permalink / raw)
  To: Li, Xiaoyun, Xing, Beilei; +Cc: dev, Yang, Zhiyong

> -----Original Message-----
> From: Li, Xiaoyun
> Sent: Wednesday, September 12, 2018 6:12 PM
> To: Xing, Beilei <beilei.xing@intel.com>; Zhang, Qi Z <qi.z.zhang@intel.com>
> Cc: dev@dpdk.org; Yang, Zhiyong <zhiyong.yang@intel.com>; Li, Xiaoyun
> <xiaoyun.li@intel.com>
> Subject: [PATCH v5] net/i40e: add interface to choose latest vector path
> 
> Right now, vector path is limited to only use on later platform.
> This patch adds a devarg use-latest-vec to allow the users to use the latest
> vector path that the platform supported. Namely, using AVX2 vector path on
> broadwell is possible.
> 
> Signed-off-by: Xiaoyun Li <xiaoyun.li@intel.com>

Acked-by: Qi Zhang <qi.z.zhang@intel.com>

Applied to dpdk-next-net-intel.

Regards
Qi

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [dpdk-dev] [PATCH v5] net/i40e: add interface to choose latest vector path
  2018-09-12 10:12 ` [dpdk-dev] [PATCH v5] " Xiaoyun Li
  2018-09-13  1:38   ` Zhang, Qi Z
@ 2018-09-13 13:27   ` Ferruh Yigit
  2018-09-17  7:12     ` Li, Xiaoyun
  1 sibling, 1 reply; 19+ messages in thread
From: Ferruh Yigit @ 2018-09-13 13:27 UTC (permalink / raw)
  To: Xiaoyun Li, beilei.xing, qi.z.zhang; +Cc: dev, zhiyong.yang

On 9/12/2018 11:12 AM, Xiaoyun Li wrote:
> Right now, vector path is limited to only use on later platform.

i40e supports vector instructions for intel, arm and powerpc,
this behavior is only for Intel vector drivers, can be good to clarify,
also it can better to explain a little more what "limited to only use on later
platform" means.

> This patch adds a devarg use-latest-vec to allow the users to
> use the latest vector path that the platform supported. Namely,
> using AVX2 vector path on broadwell is possible.

Again, this is for intel only, and can you put a matrix to clarify what is
supported:

no devarg:
Machine		PMD
avx512		avx2
avx2		sse4.2
sse4.2		sse4.2
< sse4.2	not supported

with devarg:
Machine		PMD
avx512		avx2
avx2		avx2
sse4.2		sse4.2
< sse4.2	not supported


And I am not sure about name of the devarg "use-latest-vec", I can see it has
been discussed already.
What about "use-latest-supported-vec"? Too verbose?
Do you have any other suggestion?

<...>

> @@ -163,6 +163,14 @@ Runtime Config Options
>    Currently hot-plugging of representor ports is not supported so all required
>    representors must be specified on the creation of the PF.
>  
> +- ``Use latest vector`` (default ``disable``)
> +
> +  Vector path was limited to use only on later platform. But users may want the
> +  latest vector path. For example, VPP users may want to use AVX2 vector path on HSW/BDW
> +  because it can get better perf. So ``devargs`` parameter ``use-latest-vec`` is
> +  introduced, for example::
> +    -w 84:00.0,use-latest-vec=1

Do we need to name a specific consumer of DPDK on i40e document? Why not say any
application?

> +
>  Driver compilation and testing
>  ------------------------------
>  
> diff --git a/doc/guides/rel_notes/release_18_11.rst b/doc/guides/rel_notes/release_18_11.rst
> index 3ae6b3f58..34af591a2 100644
> --- a/doc/guides/rel_notes/release_18_11.rst
> +++ b/doc/guides/rel_notes/release_18_11.rst
> @@ -54,6 +54,10 @@ New Features
>       Also, make sure to start the actual text at the margin.
>       =========================================================
>  
> +* **Added a devarg to use the latest vector path.**
> +  A new devarg ``use-latest-vec`` was introduced to allow users to choose
> +  the latest vector path that the platform supported. For example, VPP users
> +  can use AVX2 vector path on BDW/HSW to get better performance.

Same, do we need to name a specific consumer of DPDK on release notes?

<...>

> @@ -1201,6 +1203,46 @@ i40e_aq_debug_write_global_register(struct i40e_hw *hw,
>  	return i40e_aq_debug_write_register(hw, reg_addr, reg_val, cmd_details);
>  }
>  
> +static int
> +i40e_parse_latest_vec(struct rte_eth_dev *dev)
> +{
> +	struct i40e_adapter *ad =
> +		I40E_DEV_PRIVATE_TO_ADAPTER(dev->data->dev_private);
> +	int kvargs_count, use_latest_vec;
> +	struct rte_kvargs *kvlist;
> +
> +	ad->use_latest_vec = false;
> +
> +	if (!dev->device->devargs)
> +		return 0;
> +
> +	kvlist = rte_kvargs_parse(dev->device->devargs->args, valid_keys);
> +	if (!kvlist)
> +		return -EINVAL;

Agree with Qi to prevent rte_kvargs_parse() call for each devarg, in the future.

> +
> +	kvargs_count = rte_kvargs_count(kvlist, ETH_I40E_USE_LATEST_VEC);
> +	if (!kvargs_count) {
> +		rte_kvargs_free(kvlist);
> +		return 0;
> +	}
> +
> +	if (kvargs_count > 1)
> +		PMD_DRV_LOG(WARNING, "More than one argument \"%s\" and only "
> +			    "the first one is used !",
> +			    ETH_I40E_USE_LATEST_VEC);
> +
> +	use_latest_vec = atoi((&kvlist->pairs[0])->value);

Instead of accessing directly kvlist internals, please use rte_kvargs_process()

<...>

> @@ -12527,4 +12570,5 @@ RTE_PMD_REGISTER_PARAM_STRING(net_i40e,
>  			      ETH_I40E_FLOATING_VEB_ARG "=1"
>  			      ETH_I40E_FLOATING_VEB_LIST_ARG "=<string>"
>  			      ETH_I40E_QUEUE_NUM_PER_VF_ARG "=1|2|4|8|16"
> -			      ETH_I40E_SUPPORT_MULTI_DRIVER "=1");
> +			      ETH_I40E_SUPPORT_MULTI_DRIVER "=1"
> +			      ETH_I40E_USE_LATEST_VEC "=1");

= 0|1 ?

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [dpdk-dev] [PATCH v5] net/i40e: add interface to choose latest vector path
  2018-09-13 13:27   ` Ferruh Yigit
@ 2018-09-17  7:12     ` Li, Xiaoyun
  0 siblings, 0 replies; 19+ messages in thread
From: Li, Xiaoyun @ 2018-09-17  7:12 UTC (permalink / raw)
  To: Yigit, Ferruh, Xing, Beilei, Zhang, Qi Z; +Cc: dev, Yang, Zhiyong

Will send new version later. Thanks.

> -----Original Message-----
> From: Yigit, Ferruh
> Sent: Thursday, September 13, 2018 21:27
> To: Li, Xiaoyun <xiaoyun.li@intel.com>; Xing, Beilei <beilei.xing@intel.com>;
> Zhang, Qi Z <qi.z.zhang@intel.com>
> Cc: dev@dpdk.org; Yang, Zhiyong <zhiyong.yang@intel.com>
> Subject: Re: [dpdk-dev] [PATCH v5] net/i40e: add interface to choose latest
> vector path
> 
> On 9/12/2018 11:12 AM, Xiaoyun Li wrote:
> > Right now, vector path is limited to only use on later platform.
> 
> i40e supports vector instructions for intel, arm and powerpc, this behavior is
> only for Intel vector drivers, can be good to clarify, also it can better to
> explain a little more what "limited to only use on later platform" means.
OK. Will update the commit log.

> 
> > This patch adds a devarg use-latest-vec to allow the users to use the
> > latest vector path that the platform supported. Namely, using AVX2
> > vector path on broadwell is possible.
> 
> Again, this is for intel only, and can you put a matrix to clarify what is
> supported:
> 
> no devarg:
> Machine		PMD
> avx512		avx2
> avx2		sse4.2
> sse4.2		sse4.2
> < sse4.2	not supported
> 
> with devarg:
> Machine		PMD
> avx512		avx2
> avx2		avx2
> sse4.2		sse4.2
> < sse4.2	not supported
OK.

> 
> 
> And I am not sure about name of the devarg "use-latest-vec", I can see it has
> been discussed already.
> What about "use-latest-supported-vec"? Too verbose?
> Do you have any other suggestion?
OK. Will take that name. 
> 
> <...>
> 
> > @@ -163,6 +163,14 @@ Runtime Config Options
> >    Currently hot-plugging of representor ports is not supported so all
> required
> >    representors must be specified on the creation of the PF.
> >
> > +- ``Use latest vector`` (default ``disable``)
> > +
> > +  Vector path was limited to use only on later platform. But users
> > + may want the  latest vector path. For example, VPP users may want to
> > + use AVX2 vector path on HSW/BDW  because it can get better perf. So
> > + ``devargs`` parameter ``use-latest-vec`` is  introduced, for example::
> > +    -w 84:00.0,use-latest-vec=1
> 
> Do we need to name a specific consumer of DPDK on i40e document? Why
> not say any application?
OK. Will generalize it to users not VPP users.
> 
> > +
> >  Driver compilation and testing
> >  ------------------------------
> >
> > diff --git a/doc/guides/rel_notes/release_18_11.rst
> > b/doc/guides/rel_notes/release_18_11.rst
> > index 3ae6b3f58..34af591a2 100644
> > --- a/doc/guides/rel_notes/release_18_11.rst
> > +++ b/doc/guides/rel_notes/release_18_11.rst
> > @@ -54,6 +54,10 @@ New Features
> >       Also, make sure to start the actual text at the margin.
> >       =========================================================
> >
> > +* **Added a devarg to use the latest vector path.**
> > +  A new devarg ``use-latest-vec`` was introduced to allow users to
> > +choose
> > +  the latest vector path that the platform supported. For example,
> > +VPP users
> > +  can use AVX2 vector path on BDW/HSW to get better performance.
> 
> Same, do we need to name a specific consumer of DPDK on release notes?
> 
> <...>
> 
> > @@ -1201,6 +1203,46 @@ i40e_aq_debug_write_global_register(struct
> i40e_hw *hw,
> >  	return i40e_aq_debug_write_register(hw, reg_addr, reg_val,
> > cmd_details);  }
> >
> > +static int
> > +i40e_parse_latest_vec(struct rte_eth_dev *dev) {
> > +	struct i40e_adapter *ad =
> > +		I40E_DEV_PRIVATE_TO_ADAPTER(dev->data->dev_private);
> > +	int kvargs_count, use_latest_vec;
> > +	struct rte_kvargs *kvlist;
> > +
> > +	ad->use_latest_vec = false;
> > +
> > +	if (!dev->device->devargs)
> > +		return 0;
> > +
> > +	kvlist = rte_kvargs_parse(dev->device->devargs->args, valid_keys);
> > +	if (!kvlist)
> > +		return -EINVAL;
> 
> Agree with Qi to prevent rte_kvargs_parse() call for each devarg, in the
> future.
OK. I am preparing patch about that.

> 
> > +
> > +	kvargs_count = rte_kvargs_count(kvlist, ETH_I40E_USE_LATEST_VEC);
> > +	if (!kvargs_count) {
> > +		rte_kvargs_free(kvlist);
> > +		return 0;
> > +	}
> > +
> > +	if (kvargs_count > 1)
> > +		PMD_DRV_LOG(WARNING, "More than one argument \"%s\"
> and only "
> > +			    "the first one is used !",
> > +			    ETH_I40E_USE_LATEST_VEC);
> > +
> > +	use_latest_vec = atoi((&kvlist->pairs[0])->value);
> 
> Instead of accessing directly kvlist internals, please use rte_kvargs_process()
OK.

> 
> <...>
> 
> > @@ -12527,4 +12570,5 @@
> RTE_PMD_REGISTER_PARAM_STRING(net_i40e,
> >  			      ETH_I40E_FLOATING_VEB_ARG "=1"
> >  			      ETH_I40E_FLOATING_VEB_LIST_ARG "=<string>"
> >  			      ETH_I40E_QUEUE_NUM_PER_VF_ARG
> "=1|2|4|8|16"
> > -			      ETH_I40E_SUPPORT_MULTI_DRIVER "=1");
> > +			      ETH_I40E_SUPPORT_MULTI_DRIVER "=1"
> > +			      ETH_I40E_USE_LATEST_VEC "=1");
> 
> = 0|1 ?
Yes. Will correct it.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [dpdk-dev] [PATCH v6] net/i40e: add interface to use latest vec path
  2018-08-30  2:16 [dpdk-dev] [PATCH] net/i40e: add interface to choose latest vector path Xiaoyun Li
                   ` (3 preceding siblings ...)
  2018-09-12 10:12 ` [dpdk-dev] [PATCH v5] " Xiaoyun Li
@ 2018-09-17  9:58 ` Xiaoyun Li
  2018-09-17 14:14   ` Ferruh Yigit
  2018-09-18  2:22 ` [dpdk-dev] [PATCH v7] " Xiaoyun Li
  5 siblings, 1 reply; 19+ messages in thread
From: Xiaoyun Li @ 2018-09-17  9:58 UTC (permalink / raw)
  To: beilei.xing, qi.z.zhang, ferruh.yigit; +Cc: dev, zhiyong.yang, Xiaoyun Li

For IA, the AVX2 vector path is only recommended to be used on later
platforms (identified by AVX512 support, like SKL etc.) This is because
performance benchmark shows downgrade when running AVX2 vector path on
early platform (BDW/HSW) in some cases. But we still observe perf gain
with some real work loading.

So this patch introduced the new devarg use-latest-supported-vec to
force the driver always selecting the latest supported vec path. Then
apps are able to take AVX2 path on early platforms. And this logic can
be re-used if we will have AVX512 vec path in future.

This patch only affects IA platforms. The selected vec path would be
like the following:
  Without devarg/devarg = 0:
  Machine	vPMD
  AVX512F	AVX2
  AVX2	SSE4.2
  SSE4.2	SSE4.2
  <SSE4.2	Not Supported

  With devarg = 1
  Machine	vPMD
  AVX512F	AVX2
  AVX2	AVX2
  SSE4.2	SSE4.2
  <SSE4.2	Not Supported

Other platforms can also apply the same logic if necessary in future.

Signed-off-by: Xiaoyun Li <xiaoyun.li@intel.com>
---
v6:
 * Polish the doc and commit log.
 * Use rte_kvargs_process instead of directly kvlist internals.
v5:
 * Simpify the rx set function.
v4:
 * Polish the codes.
v3:
 * Polish the doc and commit log.
v2:
 * Correct the calling of the wrong function last time.
 * Fix seg fault bug.
---
 doc/guides/nics/i40e.rst               |   8 ++
 doc/guides/rel_notes/release_18_11.rst |   4 +
 drivers/net/i40e/i40e_ethdev.c         |  65 ++++++++++-
 drivers/net/i40e/i40e_ethdev.h         |   3 +
 drivers/net/i40e/i40e_rxtx.c           | 143 +++++++++++++------------
 5 files changed, 155 insertions(+), 68 deletions(-)

diff --git a/doc/guides/nics/i40e.rst b/doc/guides/nics/i40e.rst
index 5d8500cef..6393baf0b 100644
--- a/doc/guides/nics/i40e.rst
+++ b/doc/guides/nics/i40e.rst
@@ -163,6 +163,14 @@ Runtime Config Options
   Currently hot-plugging of representor ports is not supported so all required
   representors must be specified on the creation of the PF.
 
+- ``Use latest supported vector`` (default ``disable``)
+
+  Latest supported vector path may not always get the best perf so vector path was
+  recommended to use only on later platform. But users may want the latest vector path
+  since it can get better perf in some real work loading cases. So ``devargs`` param
+  ``use-latest-supported-vec`` is introduced, for example::
+    -w 84:00.0,use-latest-supported-vec=1
+
 Driver compilation and testing
 ------------------------------
 
diff --git a/doc/guides/rel_notes/release_18_11.rst b/doc/guides/rel_notes/release_18_11.rst
index 8c4bb5447..49fdcdfc8 100644
--- a/doc/guides/rel_notes/release_18_11.rst
+++ b/doc/guides/rel_notes/release_18_11.rst
@@ -67,6 +67,10 @@ New Features
   SR-IOV option in Hyper-V and Azure. This is an alternative to the previous
   vdev_netvsc, tap, and failsafe drivers combination.
 
+* **Added a devarg to use the latest supported vector path.**
+  A new devarg ``use-latest-supported-vec`` was introduced to allow users to
+  choose the latest vector path that the platform supported. For example, users
+  can use AVX2 vector path on BDW/HSW to get better performance.
 
 API Changes
 -----------
diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
index 24d73f2ff..e53ed5315 100644
--- a/drivers/net/i40e/i40e_ethdev.c
+++ b/drivers/net/i40e/i40e_ethdev.c
@@ -44,6 +44,7 @@
 #define ETH_I40E_FLOATING_VEB_LIST_ARG	"floating_veb_list"
 #define ETH_I40E_SUPPORT_MULTI_DRIVER	"support-multi-driver"
 #define ETH_I40E_QUEUE_NUM_PER_VF_ARG	"queue-num-per-vf"
+#define ETH_I40E_USE_LATEST_VEC	"use-latest-supported-vec"
 
 #define I40E_CLEAR_PXE_WAIT_MS     200
 
@@ -409,6 +410,7 @@ static const char *const valid_keys[] = {
 	ETH_I40E_FLOATING_VEB_LIST_ARG,
 	ETH_I40E_SUPPORT_MULTI_DRIVER,
 	ETH_I40E_QUEUE_NUM_PER_VF_ARG,
+	ETH_I40E_USE_LATEST_VEC,
 	NULL};
 
 static const struct rte_pci_id pci_id_i40e_map[] = {
@@ -1202,6 +1204,64 @@ i40e_aq_debug_write_global_register(struct i40e_hw *hw,
 	return i40e_aq_debug_write_register(hw, reg_addr, reg_val, cmd_details);
 }
 
+static int
+i40e_parse_latest_vec_handler(__rte_unused const char *key,
+				const char *value,
+				void *opaque)
+{
+	struct i40e_adapter *ad;
+	int use_latest_vec;
+
+	ad = (struct i40e_adapter *)opaque;
+
+	use_latest_vec = atoi(value);
+
+	if (use_latest_vec != 0 && use_latest_vec != 1)
+		PMD_DRV_LOG(WARNING, "Value should be 0 or 1, set it as 1!");
+
+	ad->use_latest_vec = (bool)use_latest_vec;
+
+	return 0;
+}
+
+static int
+i40e_use_latest_vec(struct rte_eth_dev *dev)
+{
+	struct i40e_adapter *ad =
+		I40E_DEV_PRIVATE_TO_ADAPTER(dev->data->dev_private);
+	struct rte_kvargs *kvlist;
+	int kvargs_count;
+
+	ad->use_latest_vec = false;
+
+	if (!dev->device->devargs)
+		return 0;
+
+	kvlist = rte_kvargs_parse(dev->device->devargs->args, valid_keys);
+	if (!kvlist)
+		return -EINVAL;
+
+	kvargs_count = rte_kvargs_count(kvlist, ETH_I40E_USE_LATEST_VEC);
+	if (!kvargs_count) {
+		rte_kvargs_free(kvlist);
+		return 0;
+	}
+
+	if (kvargs_count > 1)
+		PMD_DRV_LOG(WARNING, "More than one argument \"%s\" and only "
+			    "the first invalid or last valid one is used !",
+			    ETH_I40E_USE_LATEST_VEC);
+
+	if (rte_kvargs_process(kvlist, ETH_I40E_USE_LATEST_VEC,
+				i40e_parse_latest_vec_handler, ad) < 0) {
+		rte_kvargs_free(kvlist);
+		return -EINVAL;
+	}
+
+	rte_kvargs_free(kvlist);
+	return 0;
+}
+
 #define I40E_ALARM_INTERVAL 50000 /* us */
 
 static int
@@ -1266,6 +1326,8 @@ eth_i40e_dev_init(struct rte_eth_dev *dev, void *init_params __rte_unused)
 
 	/* Check if need to support multi-driver */
 	i40e_support_multi_driver(dev);
+	/* Check if users want the latest supported vec path */
+	i40e_use_latest_vec(dev);
 
 	/* Make sure all is clean before doing PF reset */
 	i40e_clear_hw(hw);
@@ -12599,4 +12661,5 @@ RTE_PMD_REGISTER_PARAM_STRING(net_i40e,
 			      ETH_I40E_FLOATING_VEB_ARG "=1"
 			      ETH_I40E_FLOATING_VEB_LIST_ARG "=<string>"
 			      ETH_I40E_QUEUE_NUM_PER_VF_ARG "=1|2|4|8|16"
-			      ETH_I40E_SUPPORT_MULTI_DRIVER "=1");
+			      ETH_I40E_SUPPORT_MULTI_DRIVER "=1"
+			      ETH_I40E_USE_LATEST_VEC "=0|1");
diff --git a/drivers/net/i40e/i40e_ethdev.h b/drivers/net/i40e/i40e_ethdev.h
index 3fffe5a55..140c92b84 100644
--- a/drivers/net/i40e/i40e_ethdev.h
+++ b/drivers/net/i40e/i40e_ethdev.h
@@ -1078,6 +1078,9 @@ struct i40e_adapter {
 	uint64_t pctypes_tbl[I40E_FLOW_TYPE_MAX] __rte_cache_min_aligned;
 	uint64_t flow_types_mask;
 	uint64_t pctypes_mask;
+
+	/* For devargs */
+	bool use_latest_vec;
 };
 
 /**
diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/i40e/i40e_rxtx.c
index 5e78d489e..7c986d535 100644
--- a/drivers/net/i40e/i40e_rxtx.c
+++ b/drivers/net/i40e/i40e_rxtx.c
@@ -2909,6 +2909,35 @@ i40e_txq_info_get(struct rte_eth_dev *dev, uint16_t queue_id,
 	qinfo->conf.offloads = txq->offloads;
 }
 
+static eth_rx_burst_t
+i40e_get_latest_rx_vec(bool scatter)
+{
+#ifdef RTE_ARCH_X86
+	if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX2))
+		return scatter ? i40e_recv_scattered_pkts_vec_avx2 :
+				 i40e_recv_pkts_vec_avx2;
+#endif
+	return scatter ? i40e_recv_scattered_pkts_vec :
+			 i40e_recv_pkts_vec;
+}
+
+static eth_rx_burst_t
+i40e_get_recommend_rx_vec(bool scatter)
+{
+#ifdef RTE_ARCH_X86
+	/*
+	 * since AVX frequency can be different to base frequency, limit
+	 * use of AVX2 version to later plaforms, not all those that could
+	 * theoretically run it.
+	 */
+	if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F))
+		return scatter ? i40e_recv_scattered_pkts_vec_avx2 :
+				 i40e_recv_pkts_vec_avx2;
+#endif
+	return scatter ? i40e_recv_scattered_pkts_vec :
+			 i40e_recv_pkts_vec;
+}
+
 void __attribute__((cold))
 i40e_set_rx_function(struct rte_eth_dev *dev)
 {
@@ -2940,57 +2969,17 @@ i40e_set_rx_function(struct rte_eth_dev *dev)
 		}
 	}
 
-	if (dev->data->scattered_rx) {
-		/* Set the non-LRO scattered callback: there are Vector and
-		 * single allocation versions.
-		 */
-		if (ad->rx_vec_allowed) {
-			PMD_INIT_LOG(DEBUG, "Using Vector Scattered Rx "
-					    "callback (port=%d).",
-				     dev->data->port_id);
-
-			dev->rx_pkt_burst = i40e_recv_scattered_pkts_vec;
-#ifdef RTE_ARCH_X86
-			/*
-			 * since AVX frequency can be different to base
-			 * frequency, limit use of AVX2 version to later
-			 * plaforms, not all those that could theoretically
-			 * run it.
-			 */
-			if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F))
-				dev->rx_pkt_burst =
-					i40e_recv_scattered_pkts_vec_avx2;
-#endif
-		} else {
-			PMD_INIT_LOG(DEBUG, "Using a Scattered with bulk "
-					   "allocation callback (port=%d).",
-				     dev->data->port_id);
-			dev->rx_pkt_burst = i40e_recv_scattered_pkts;
-		}
-	/* If parameters allow we are going to choose between the following
-	 * callbacks:
-	 *    - Vector
-	 *    - Bulk Allocation
-	 *    - Single buffer allocation (the simplest one)
-	 */
-	} else if (ad->rx_vec_allowed) {
-		PMD_INIT_LOG(DEBUG, "Vector rx enabled, please make sure RX "
-				    "burst size no less than %d (port=%d).",
-			     RTE_I40E_DESCS_PER_LOOP,
-			     dev->data->port_id);
-
-		dev->rx_pkt_burst = i40e_recv_pkts_vec;
-#ifdef RTE_ARCH_X86
-		/*
-		 * since AVX frequency can be different to base
-		 * frequency, limit use of AVX2 version to later
-		 * plaforms, not all those that could theoretically
-		 * run it.
-		 */
-		if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F))
-			dev->rx_pkt_burst = i40e_recv_pkts_vec_avx2;
-#endif
-	} else if (ad->rx_bulk_alloc_allowed) {
+	if (ad->rx_vec_allowed) {
+		/* Vec Rx path */
+		PMD_INIT_LOG(DEBUG, "Vector Rx path will be used on port=%d.",
+				dev->data->port_id);
+		if (ad->use_latest_vec)
+			dev->rx_pkt_burst =
+			i40e_get_latest_rx_vec(dev->data->scattered_rx);
+		else
+			dev->rx_pkt_burst =
+			i40e_get_recommend_rx_vec(dev->data->scattered_rx);
+	} else if (!dev->data->scattered_rx && ad->rx_bulk_alloc_allowed) {
 		PMD_INIT_LOG(DEBUG, "Rx Burst Bulk Alloc Preconditions are "
 				    "satisfied. Rx Burst Bulk Alloc function "
 				    "will be used on port=%d.",
@@ -2998,12 +2987,12 @@ i40e_set_rx_function(struct rte_eth_dev *dev)
 
 		dev->rx_pkt_burst = i40e_recv_pkts_bulk_alloc;
 	} else {
-		PMD_INIT_LOG(DEBUG, "Rx Burst Bulk Alloc Preconditions are not "
-				    "satisfied, or Scattered Rx is requested "
-				    "(port=%d).",
+		/* Simple Rx Path. */
+		PMD_INIT_LOG(DEBUG, "Simple Rx path will be used on port=%d.",
 			     dev->data->port_id);
-
-		dev->rx_pkt_burst = i40e_recv_pkts;
+		dev->rx_pkt_burst = dev->data->scattered_rx ?
+					i40e_recv_scattered_pkts :
+					i40e_recv_pkts;
 	}
 
 	/* Propagate information about RX function choice through all queues. */
@@ -3049,6 +3038,31 @@ i40e_set_tx_function_flag(struct rte_eth_dev *dev, struct i40e_tx_queue *txq)
 				txq->queue_id);
 }
 
+static eth_tx_burst_t
+i40e_get_latest_tx_vec(void)
+{
+#ifdef RTE_ARCH_X86
+	if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX2))
+		return i40e_xmit_pkts_vec_avx2;
+#endif
+	return i40e_xmit_pkts_vec;
+}
+
+static eth_tx_burst_t
+i40e_get_recommend_tx_vec(void)
+{
+#ifdef RTE_ARCH_X86
+	/*
+	 * since AVX frequency can be different to base frequency, limit
+	 * use of AVX2 version to later plaforms, not all those that could
+	 * theoretically run it.
+	 */
+	if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F))
+		return i40e_xmit_pkts_vec_avx2;
+#endif
+	return i40e_xmit_pkts_vec;
+}
+
 void __attribute__((cold))
 i40e_set_tx_function(struct rte_eth_dev *dev)
 {
@@ -3073,17 +3087,12 @@ i40e_set_tx_function(struct rte_eth_dev *dev)
 	if (ad->tx_simple_allowed) {
 		if (ad->tx_vec_allowed) {
 			PMD_INIT_LOG(DEBUG, "Vector tx finally be used.");
-			dev->tx_pkt_burst = i40e_xmit_pkts_vec;
-#ifdef RTE_ARCH_X86
-			/*
-			 * since AVX frequency can be different to base
-			 * frequency, limit use of AVX2 version to later
-			 * plaforms, not all those that could theoretically
-			 * run it.
-			 */
-			if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F))
-				dev->tx_pkt_burst = i40e_xmit_pkts_vec_avx2;
-#endif
+			if (ad->use_latest_vec)
+				dev->tx_pkt_burst =
+					i40e_get_latest_tx_vec();
+			else
+				dev->tx_pkt_burst =
+					i40e_get_recommend_tx_vec();
 		} else {
 			PMD_INIT_LOG(DEBUG, "Simple tx finally be used.");
 			dev->tx_pkt_burst = i40e_xmit_pkts_simple;
-- 
2.17.1

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [dpdk-dev] [PATCH v6] net/i40e: add interface to use latest vec path
  2018-09-17  9:58 ` [dpdk-dev] [PATCH v6] net/i40e: add interface to use latest vec path Xiaoyun Li
@ 2018-09-17 14:14   ` Ferruh Yigit
  2018-09-17 14:37     ` Thomas Monjalon
  0 siblings, 1 reply; 19+ messages in thread
From: Ferruh Yigit @ 2018-09-17 14:14 UTC (permalink / raw)
  To: Xiaoyun Li, beilei.xing, qi.z.zhang
  Cc: dev, zhiyong.yang, Thomas Monjalon, Bruce Richardson

On 9/17/2018 10:58 AM, Xiaoyun Li wrote:
> For IA, the AVX2 vector path is only recommended to be used on later
> platforms (identified by AVX512 support, like SKL etc.) This is because
> performance benchmark shows downgrade when running AVX2 vector path on
> early platform (BDW/HSW) in some cases. But we still observe perf gain
> with some real work loading.
> 
> So this patch introduced the new devarg use-latest-supported-vec to
> force the driver always selecting the latest supported vec path. Then
> apps are able to take AVX2 path on early platforms. And this logic can
> be re-used if we will have AVX512 vec path in future.
> 
> This patch only affects IA platforms. The selected vec path would be
> like the following:
>   Without devarg/devarg = 0:
>   Machine	vPMD
>   AVX512F	AVX2
>   AVX2	SSE4.2
>   SSE4.2	SSE4.2
>   <SSE4.2	Not Supported
> 
>   With devarg = 1
>   Machine	vPMD
>   AVX512F	AVX2
>   AVX2	AVX2
>   SSE4.2	SSE4.2
>   <SSE4.2	Not Supported
> 
> Other platforms can also apply the same logic if necessary in future.
> 
> Signed-off-by: Xiaoyun Li <xiaoyun.li@intel.com>
> ---
> v6:
>  * Polish the doc and commit log.
>  * Use rte_kvargs_process instead of directly kvlist internals.
> v5:
>  * Simpify the rx set function.
> v4:
>  * Polish the codes.
> v3:
>  * Polish the doc and commit log.
> v2:
>  * Correct the calling of the wrong function last time.
>  * Fix seg fault bug.

Thanks for the update looks good to me.

<...>

> @@ -1078,6 +1078,9 @@ struct i40e_adapter {
>  	uint64_t pctypes_tbl[I40E_FLOW_TYPE_MAX] __rte_cache_min_aligned;
>  	uint64_t flow_types_mask;
>  	uint64_t pctypes_mask;
> +
> +	/* For devargs */
> +	bool use_latest_vec;

For this one checkpatch is giving following warning:

CHECK:BOOL_MEMBER: Avoid using bool structure members because of possible
alignment issues - see: https://lkml.org/lkml/2017/11/21/384

The comment in the link seems valid. What do you think using a basic storage
type for the variable, like uint8_t?


And overall is there any objection to follow this new convention?

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [dpdk-dev] [PATCH v6] net/i40e: add interface to use latest vec path
  2018-09-17 14:14   ` Ferruh Yigit
@ 2018-09-17 14:37     ` Thomas Monjalon
  2018-09-18  1:28       ` Li, Xiaoyun
  0 siblings, 1 reply; 19+ messages in thread
From: Thomas Monjalon @ 2018-09-17 14:37 UTC (permalink / raw)
  To: Ferruh Yigit
  Cc: Xiaoyun Li, beilei.xing, qi.z.zhang, dev, zhiyong.yang, Bruce Richardson

17/09/2018 16:14, Ferruh Yigit:
> On 9/17/2018 10:58 AM, Xiaoyun Li wrote:
> > @@ -1078,6 +1078,9 @@ struct i40e_adapter {
> >  	uint64_t pctypes_tbl[I40E_FLOW_TYPE_MAX] __rte_cache_min_aligned;
> >  	uint64_t flow_types_mask;
> >  	uint64_t pctypes_mask;
> > +
> > +	/* For devargs */
> > +	bool use_latest_vec;
> 
> For this one checkpatch is giving following warning:
> 
> CHECK:BOOL_MEMBER: Avoid using bool structure members because of possible
> alignment issues - see: https://lkml.org/lkml/2017/11/21/384
> 
> The comment in the link seems valid. What do you think using a basic storage
> type for the variable, like uint8_t?
> 
> 
> And overall is there any objection to follow this new convention?

I agree with avoiding bool in structs.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [dpdk-dev] [PATCH v6] net/i40e: add interface to use latest vec path
  2018-09-17 14:37     ` Thomas Monjalon
@ 2018-09-18  1:28       ` Li, Xiaoyun
  0 siblings, 0 replies; 19+ messages in thread
From: Li, Xiaoyun @ 2018-09-18  1:28 UTC (permalink / raw)
  To: Thomas Monjalon, Yigit, Ferruh
  Cc: Xing, Beilei, Zhang, Qi Z, dev, Yang, Zhiyong, Richardson, Bruce


> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas@monjalon.net]
> Sent: Monday, September 17, 2018 22:37
> To: Yigit, Ferruh <ferruh.yigit@intel.com>
> Cc: Li, Xiaoyun <xiaoyun.li@intel.com>; Xing, Beilei <beilei.xing@intel.com>;
> Zhang, Qi Z <qi.z.zhang@intel.com>; dev@dpdk.org; Yang, Zhiyong
> <zhiyong.yang@intel.com>; Richardson, Bruce <bruce.richardson@intel.com>
> Subject: Re: [PATCH v6] net/i40e: add interface to use latest vec path
> 
> 17/09/2018 16:14, Ferruh Yigit:
> > On 9/17/2018 10:58 AM, Xiaoyun Li wrote:
> > > @@ -1078,6 +1078,9 @@ struct i40e_adapter {
> > >  	uint64_t pctypes_tbl[I40E_FLOW_TYPE_MAX]
> __rte_cache_min_aligned;
> > >  	uint64_t flow_types_mask;
> > >  	uint64_t pctypes_mask;
> > > +
> > > +	/* For devargs */
> > > +	bool use_latest_vec;
> >
> > For this one checkpatch is giving following warning:
> >
> > CHECK:BOOL_MEMBER: Avoid using bool structure members because of
> > possible alignment issues - see: https://lkml.org/lkml/2017/11/21/384
> >
> > The comment in the link seems valid. What do you think using a basic
> > storage type for the variable, like uint8_t?
> >
> >
> > And overall is there any objection to follow this new convention?
> 
> I agree with avoiding bool in structs.
> 
OK. Thanks!

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [dpdk-dev] [PATCH v7] net/i40e: add interface to use latest vec path
  2018-08-30  2:16 [dpdk-dev] [PATCH] net/i40e: add interface to choose latest vector path Xiaoyun Li
                   ` (4 preceding siblings ...)
  2018-09-17  9:58 ` [dpdk-dev] [PATCH v6] net/i40e: add interface to use latest vec path Xiaoyun Li
@ 2018-09-18  2:22 ` Xiaoyun Li
  2018-09-18 13:01   ` Ferruh Yigit
  5 siblings, 1 reply; 19+ messages in thread
From: Xiaoyun Li @ 2018-09-18  2:22 UTC (permalink / raw)
  To: beilei.xing, qi.z.zhang, ferruh.yigit
  Cc: dev, zhiyong.yang, thomas, bruce.richardson, Xiaoyun Li

For IA, the AVX2 vector path is only recommended to be used on later
platforms (identified by AVX512 support, like SKL etc.) This is because
performance benchmark shows downgrade when running AVX2 vector path on
early platform (BDW/HSW) in some cases. But we still observe perf gain
with some real work loading.

So this patch introduced the new devarg use-latest-supported-vec to
force the driver always selecting the latest supported vec path. Then
apps are able to take AVX2 path on early platforms. And this logic can
be re-used if we will have AVX512 vec path in future.

This patch only affects IA platforms. The selected vec path would be
like the following:
  Without devarg/devarg = 0:
  Machine	vPMD
  AVX512F	AVX2
  AVX2	SSE4.2
  SSE4.2	SSE4.2
  <SSE4.2	Not Supported

  With devarg = 1
  Machine	vPMD
  AVX512F	AVX2
  AVX2	AVX2
  SSE4.2	SSE4.2
  <SSE4.2	Not Supported

Other platforms can also apply the same logic if necessary in future.

Signed-off-by: Xiaoyun Li <xiaoyun.li@intel.com>
---
v7:
 * Use uint_8 instead of bool type for struct member.
v6:
 * Polish the doc and commit log.
 * Use rte_kvargs_process instead of directly kvlist internals.
v5:
 * Simpify the rx set function.
v4:
 * Polish the codes.
v3:
 * Polish the doc and commit log.
v2:
 * Correct the calling of the wrong function last time.
 * Fix seg fault bug.
---
 doc/guides/nics/i40e.rst               |   8 ++
 doc/guides/rel_notes/release_18_11.rst |   4 +
 drivers/net/i40e/i40e_ethdev.c         |  65 ++++++++++-
 drivers/net/i40e/i40e_ethdev.h         |   3 +
 drivers/net/i40e/i40e_rxtx.c           | 143 +++++++++++++------------
 5 files changed, 155 insertions(+), 68 deletions(-)

diff --git a/doc/guides/nics/i40e.rst b/doc/guides/nics/i40e.rst
index 5d8500cef..6393baf0b 100644
--- a/doc/guides/nics/i40e.rst
+++ b/doc/guides/nics/i40e.rst
@@ -163,6 +163,14 @@ Runtime Config Options
   Currently hot-plugging of representor ports is not supported so all required
   representors must be specified on the creation of the PF.
 
+- ``Use latest supported vector`` (default ``disable``)
+
+  Latest supported vector path may not always get the best perf so vector path was
+  recommended to use only on later platform. But users may want the latest vector path
+  since it can get better perf in some real work loading cases. So ``devargs`` param
+  ``use-latest-supported-vec`` is introduced, for example::
+    -w 84:00.0,use-latest-supported-vec=1
+
 Driver compilation and testing
 ------------------------------
 
diff --git a/doc/guides/rel_notes/release_18_11.rst b/doc/guides/rel_notes/release_18_11.rst
index 8c4bb5447..49fdcdfc8 100644
--- a/doc/guides/rel_notes/release_18_11.rst
+++ b/doc/guides/rel_notes/release_18_11.rst
@@ -67,6 +67,10 @@ New Features
   SR-IOV option in Hyper-V and Azure. This is an alternative to the previous
   vdev_netvsc, tap, and failsafe drivers combination.
 
+* **Added a devarg to use the latest supported vector path.**
+  A new devarg ``use-latest-supported-vec`` was introduced to allow users to
+  choose the latest vector path that the platform supported. For example, users
+  can use AVX2 vector path on BDW/HSW to get better performance.
 
 API Changes
 -----------
diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
index 24d73f2ff..51ad847e8 100644
--- a/drivers/net/i40e/i40e_ethdev.c
+++ b/drivers/net/i40e/i40e_ethdev.c
@@ -44,6 +44,7 @@
 #define ETH_I40E_FLOATING_VEB_LIST_ARG	"floating_veb_list"
 #define ETH_I40E_SUPPORT_MULTI_DRIVER	"support-multi-driver"
 #define ETH_I40E_QUEUE_NUM_PER_VF_ARG	"queue-num-per-vf"
+#define ETH_I40E_USE_LATEST_VEC	"use-latest-supported-vec"
 
 #define I40E_CLEAR_PXE_WAIT_MS     200
 
@@ -409,6 +410,7 @@ static const char *const valid_keys[] = {
 	ETH_I40E_FLOATING_VEB_LIST_ARG,
 	ETH_I40E_SUPPORT_MULTI_DRIVER,
 	ETH_I40E_QUEUE_NUM_PER_VF_ARG,
+	ETH_I40E_USE_LATEST_VEC,
 	NULL};
 
 static const struct rte_pci_id pci_id_i40e_map[] = {
@@ -1202,6 +1204,64 @@ i40e_aq_debug_write_global_register(struct i40e_hw *hw,
 	return i40e_aq_debug_write_register(hw, reg_addr, reg_val, cmd_details);
 }
 
+static int
+i40e_parse_latest_vec_handler(__rte_unused const char *key,
+				const char *value,
+				void *opaque)
+{
+	struct i40e_adapter *ad;
+	int use_latest_vec;
+
+	ad = (struct i40e_adapter *)opaque;
+
+	use_latest_vec = atoi(value);
+
+	if (use_latest_vec != 0 && use_latest_vec != 1)
+		PMD_DRV_LOG(WARNING, "Value should be 0 or 1, set it as 1!");
+
+	ad->use_latest_vec = (uint8_t)use_latest_vec;
+
+	return 0;
+}
+
+static int
+i40e_use_latest_vec(struct rte_eth_dev *dev)
+{
+	struct i40e_adapter *ad =
+		I40E_DEV_PRIVATE_TO_ADAPTER(dev->data->dev_private);
+	struct rte_kvargs *kvlist;
+	int kvargs_count;
+
+	ad->use_latest_vec = false;
+
+	if (!dev->device->devargs)
+		return 0;
+
+	kvlist = rte_kvargs_parse(dev->device->devargs->args, valid_keys);
+	if (!kvlist)
+		return -EINVAL;
+
+	kvargs_count = rte_kvargs_count(kvlist, ETH_I40E_USE_LATEST_VEC);
+	if (!kvargs_count) {
+		rte_kvargs_free(kvlist);
+		return 0;
+	}
+
+	if (kvargs_count > 1)
+		PMD_DRV_LOG(WARNING, "More than one argument \"%s\" and only "
+			    "the first invalid or last valid one is used !",
+			    ETH_I40E_USE_LATEST_VEC);
+
+	if (rte_kvargs_process(kvlist, ETH_I40E_USE_LATEST_VEC,
+				i40e_parse_latest_vec_handler, ad) < 0) {
+		rte_kvargs_free(kvlist);
+		return -EINVAL;
+	}
+
+	rte_kvargs_free(kvlist);
+	return 0;
+}
+
 #define I40E_ALARM_INTERVAL 50000 /* us */
 
 static int
@@ -1266,6 +1326,8 @@ eth_i40e_dev_init(struct rte_eth_dev *dev, void *init_params __rte_unused)
 
 	/* Check if need to support multi-driver */
 	i40e_support_multi_driver(dev);
+	/* Check if users want the latest supported vec path */
+	i40e_use_latest_vec(dev);
 
 	/* Make sure all is clean before doing PF reset */
 	i40e_clear_hw(hw);
@@ -12599,4 +12661,5 @@ RTE_PMD_REGISTER_PARAM_STRING(net_i40e,
 			      ETH_I40E_FLOATING_VEB_ARG "=1"
 			      ETH_I40E_FLOATING_VEB_LIST_ARG "=<string>"
 			      ETH_I40E_QUEUE_NUM_PER_VF_ARG "=1|2|4|8|16"
-			      ETH_I40E_SUPPORT_MULTI_DRIVER "=1");
+			      ETH_I40E_SUPPORT_MULTI_DRIVER "=1"
+			      ETH_I40E_USE_LATEST_VEC "=0|1");
diff --git a/drivers/net/i40e/i40e_ethdev.h b/drivers/net/i40e/i40e_ethdev.h
index 3fffe5a55..7d197f6aa 100644
--- a/drivers/net/i40e/i40e_ethdev.h
+++ b/drivers/net/i40e/i40e_ethdev.h
@@ -1078,6 +1078,9 @@ struct i40e_adapter {
 	uint64_t pctypes_tbl[I40E_FLOW_TYPE_MAX] __rte_cache_min_aligned;
 	uint64_t flow_types_mask;
 	uint64_t pctypes_mask;
+
+	/* For devargs */
+	uint8_t use_latest_vec;
 };
 
 /**
diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/i40e/i40e_rxtx.c
index 5e78d489e..7c986d535 100644
--- a/drivers/net/i40e/i40e_rxtx.c
+++ b/drivers/net/i40e/i40e_rxtx.c
@@ -2909,6 +2909,35 @@ i40e_txq_info_get(struct rte_eth_dev *dev, uint16_t queue_id,
 	qinfo->conf.offloads = txq->offloads;
 }
 
+static eth_rx_burst_t
+i40e_get_latest_rx_vec(bool scatter)
+{
+#ifdef RTE_ARCH_X86
+	if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX2))
+		return scatter ? i40e_recv_scattered_pkts_vec_avx2 :
+				 i40e_recv_pkts_vec_avx2;
+#endif
+	return scatter ? i40e_recv_scattered_pkts_vec :
+			 i40e_recv_pkts_vec;
+}
+
+static eth_rx_burst_t
+i40e_get_recommend_rx_vec(bool scatter)
+{
+#ifdef RTE_ARCH_X86
+	/*
+	 * since AVX frequency can be different to base frequency, limit
+	 * use of AVX2 version to later plaforms, not all those that could
+	 * theoretically run it.
+	 */
+	if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F))
+		return scatter ? i40e_recv_scattered_pkts_vec_avx2 :
+				 i40e_recv_pkts_vec_avx2;
+#endif
+	return scatter ? i40e_recv_scattered_pkts_vec :
+			 i40e_recv_pkts_vec;
+}
+
 void __attribute__((cold))
 i40e_set_rx_function(struct rte_eth_dev *dev)
 {
@@ -2940,57 +2969,17 @@ i40e_set_rx_function(struct rte_eth_dev *dev)
 		}
 	}
 
-	if (dev->data->scattered_rx) {
-		/* Set the non-LRO scattered callback: there are Vector and
-		 * single allocation versions.
-		 */
-		if (ad->rx_vec_allowed) {
-			PMD_INIT_LOG(DEBUG, "Using Vector Scattered Rx "
-					    "callback (port=%d).",
-				     dev->data->port_id);
-
-			dev->rx_pkt_burst = i40e_recv_scattered_pkts_vec;
-#ifdef RTE_ARCH_X86
-			/*
-			 * since AVX frequency can be different to base
-			 * frequency, limit use of AVX2 version to later
-			 * plaforms, not all those that could theoretically
-			 * run it.
-			 */
-			if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F))
-				dev->rx_pkt_burst =
-					i40e_recv_scattered_pkts_vec_avx2;
-#endif
-		} else {
-			PMD_INIT_LOG(DEBUG, "Using a Scattered with bulk "
-					   "allocation callback (port=%d).",
-				     dev->data->port_id);
-			dev->rx_pkt_burst = i40e_recv_scattered_pkts;
-		}
-	/* If parameters allow we are going to choose between the following
-	 * callbacks:
-	 *    - Vector
-	 *    - Bulk Allocation
-	 *    - Single buffer allocation (the simplest one)
-	 */
-	} else if (ad->rx_vec_allowed) {
-		PMD_INIT_LOG(DEBUG, "Vector rx enabled, please make sure RX "
-				    "burst size no less than %d (port=%d).",
-			     RTE_I40E_DESCS_PER_LOOP,
-			     dev->data->port_id);
-
-		dev->rx_pkt_burst = i40e_recv_pkts_vec;
-#ifdef RTE_ARCH_X86
-		/*
-		 * since AVX frequency can be different to base
-		 * frequency, limit use of AVX2 version to later
-		 * plaforms, not all those that could theoretically
-		 * run it.
-		 */
-		if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F))
-			dev->rx_pkt_burst = i40e_recv_pkts_vec_avx2;
-#endif
-	} else if (ad->rx_bulk_alloc_allowed) {
+	if (ad->rx_vec_allowed) {
+		/* Vec Rx path */
+		PMD_INIT_LOG(DEBUG, "Vector Rx path will be used on port=%d.",
+				dev->data->port_id);
+		if (ad->use_latest_vec)
+			dev->rx_pkt_burst =
+			i40e_get_latest_rx_vec(dev->data->scattered_rx);
+		else
+			dev->rx_pkt_burst =
+			i40e_get_recommend_rx_vec(dev->data->scattered_rx);
+	} else if (!dev->data->scattered_rx && ad->rx_bulk_alloc_allowed) {
 		PMD_INIT_LOG(DEBUG, "Rx Burst Bulk Alloc Preconditions are "
 				    "satisfied. Rx Burst Bulk Alloc function "
 				    "will be used on port=%d.",
@@ -2998,12 +2987,12 @@ i40e_set_rx_function(struct rte_eth_dev *dev)
 
 		dev->rx_pkt_burst = i40e_recv_pkts_bulk_alloc;
 	} else {
-		PMD_INIT_LOG(DEBUG, "Rx Burst Bulk Alloc Preconditions are not "
-				    "satisfied, or Scattered Rx is requested "
-				    "(port=%d).",
+		/* Simple Rx Path. */
+		PMD_INIT_LOG(DEBUG, "Simple Rx path will be used on port=%d.",
 			     dev->data->port_id);
-
-		dev->rx_pkt_burst = i40e_recv_pkts;
+		dev->rx_pkt_burst = dev->data->scattered_rx ?
+					i40e_recv_scattered_pkts :
+					i40e_recv_pkts;
 	}
 
 	/* Propagate information about RX function choice through all queues. */
@@ -3049,6 +3038,31 @@ i40e_set_tx_function_flag(struct rte_eth_dev *dev, struct i40e_tx_queue *txq)
 				txq->queue_id);
 }
 
+static eth_tx_burst_t
+i40e_get_latest_tx_vec(void)
+{
+#ifdef RTE_ARCH_X86
+	if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX2))
+		return i40e_xmit_pkts_vec_avx2;
+#endif
+	return i40e_xmit_pkts_vec;
+}
+
+static eth_tx_burst_t
+i40e_get_recommend_tx_vec(void)
+{
+#ifdef RTE_ARCH_X86
+	/*
+	 * since AVX frequency can be different to base frequency, limit
+	 * use of AVX2 version to later plaforms, not all those that could
+	 * theoretically run it.
+	 */
+	if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F))
+		return i40e_xmit_pkts_vec_avx2;
+#endif
+	return i40e_xmit_pkts_vec;
+}
+
 void __attribute__((cold))
 i40e_set_tx_function(struct rte_eth_dev *dev)
 {
@@ -3073,17 +3087,12 @@ i40e_set_tx_function(struct rte_eth_dev *dev)
 	if (ad->tx_simple_allowed) {
 		if (ad->tx_vec_allowed) {
 			PMD_INIT_LOG(DEBUG, "Vector tx finally be used.");
-			dev->tx_pkt_burst = i40e_xmit_pkts_vec;
-#ifdef RTE_ARCH_X86
-			/*
-			 * since AVX frequency can be different to base
-			 * frequency, limit use of AVX2 version to later
-			 * plaforms, not all those that could theoretically
-			 * run it.
-			 */
-			if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F))
-				dev->tx_pkt_burst = i40e_xmit_pkts_vec_avx2;
-#endif
+			if (ad->use_latest_vec)
+				dev->tx_pkt_burst =
+					i40e_get_latest_tx_vec();
+			else
+				dev->tx_pkt_burst =
+					i40e_get_recommend_tx_vec();
 		} else {
 			PMD_INIT_LOG(DEBUG, "Simple tx finally be used.");
 			dev->tx_pkt_burst = i40e_xmit_pkts_simple;
-- 
2.17.1

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [dpdk-dev] [PATCH v7] net/i40e: add interface to use latest vec path
  2018-09-18  2:22 ` [dpdk-dev] [PATCH v7] " Xiaoyun Li
@ 2018-09-18 13:01   ` Ferruh Yigit
  2018-09-18 13:52     ` Zhang, Qi Z
  0 siblings, 1 reply; 19+ messages in thread
From: Ferruh Yigit @ 2018-09-18 13:01 UTC (permalink / raw)
  To: Xiaoyun Li, beilei.xing, qi.z.zhang
  Cc: dev, zhiyong.yang, thomas, bruce.richardson

On 9/18/2018 3:22 AM, Xiaoyun Li wrote:
> For IA, the AVX2 vector path is only recommended to be used on later
> platforms (identified by AVX512 support, like SKL etc.) This is because
> performance benchmark shows downgrade when running AVX2 vector path on
> early platform (BDW/HSW) in some cases. But we still observe perf gain
> with some real work loading.
> 
> So this patch introduced the new devarg use-latest-supported-vec to
> force the driver always selecting the latest supported vec path. Then
> apps are able to take AVX2 path on early platforms. And this logic can
> be re-used if we will have AVX512 vec path in future.
> 
> This patch only affects IA platforms. The selected vec path would be
> like the following:
>   Without devarg/devarg = 0:
>   Machine	vPMD
>   AVX512F	AVX2
>   AVX2	SSE4.2
>   SSE4.2	SSE4.2
>   <SSE4.2	Not Supported
> 
>   With devarg = 1
>   Machine	vPMD
>   AVX512F	AVX2
>   AVX2	AVX2
>   SSE4.2	SSE4.2
>   <SSE4.2	Not Supported
> 
> Other platforms can also apply the same logic if necessary in future.
> 
> Signed-off-by: Xiaoyun Li <xiaoyun.li@intel.com>
> ---
> v7:
>  * Use uint_8 instead of bool type for struct member.
> v6:
>  * Polish the doc and commit log.
>  * Use rte_kvargs_process instead of directly kvlist internals.
> v5:
>  * Simpify the rx set function.
> v4:
>  * Polish the codes.
> v3:
>  * Polish the doc and commit log.
> v2:
>  * Correct the calling of the wrong function last time.
>  * Fix seg fault bug.
> ---
>  doc/guides/nics/i40e.rst               |   8 ++

Doc is causing warning:
doc/guides/nics/i40e.rst:172: WARNING: Unexpected indentation.

Except from that,
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [dpdk-dev] [PATCH v7] net/i40e: add interface to use latest vec path
  2018-09-18 13:01   ` Ferruh Yigit
@ 2018-09-18 13:52     ` Zhang, Qi Z
  0 siblings, 0 replies; 19+ messages in thread
From: Zhang, Qi Z @ 2018-09-18 13:52 UTC (permalink / raw)
  To: Yigit, Ferruh, Li, Xiaoyun, Xing, Beilei
  Cc: dev, Yang, Zhiyong, thomas, Richardson, Bruce



> -----Original Message-----
> From: Yigit, Ferruh
> Sent: Tuesday, September 18, 2018 9:01 PM
> To: Li, Xiaoyun <xiaoyun.li@intel.com>; Xing, Beilei <beilei.xing@intel.com>;
> Zhang, Qi Z <qi.z.zhang@intel.com>
> Cc: dev@dpdk.org; Yang, Zhiyong <zhiyong.yang@intel.com>;
> thomas@monjalon.net; Richardson, Bruce <bruce.richardson@intel.com>
> Subject: Re: [PATCH v7] net/i40e: add interface to use latest vec path
> 
> On 9/18/2018 3:22 AM, Xiaoyun Li wrote:
> > For IA, the AVX2 vector path is only recommended to be used on later
> > platforms (identified by AVX512 support, like SKL etc.) This is
> > because performance benchmark shows downgrade when running AVX2
> vector
> > path on early platform (BDW/HSW) in some cases. But we still observe
> > perf gain with some real work loading.
> >
> > So this patch introduced the new devarg use-latest-supported-vec to
> > force the driver always selecting the latest supported vec path. Then
> > apps are able to take AVX2 path on early platforms. And this logic can
> > be re-used if we will have AVX512 vec path in future.
> >
> > This patch only affects IA platforms. The selected vec path would be
> > like the following:
> >   Without devarg/devarg = 0:
> >   Machine	vPMD
> >   AVX512F	AVX2
> >   AVX2	SSE4.2
> >   SSE4.2	SSE4.2
> >   <SSE4.2	Not Supported
> >
> >   With devarg = 1
> >   Machine	vPMD
> >   AVX512F	AVX2
> >   AVX2	AVX2
> >   SSE4.2	SSE4.2
> >   <SSE4.2	Not Supported
> >
> > Other platforms can also apply the same logic if necessary in future.
> >
> > Signed-off-by: Xiaoyun Li <xiaoyun.li@intel.com>
> > ---
> > v7:
> >  * Use uint_8 instead of bool type for struct member.
> > v6:
> >  * Polish the doc and commit log.
> >  * Use rte_kvargs_process instead of directly kvlist internals.
> > v5:
> >  * Simpify the rx set function.
> > v4:
> >  * Polish the codes.
> > v3:
> >  * Polish the doc and commit log.
> > v2:
> >  * Correct the calling of the wrong function last time.
> >  * Fix seg fault bug.
> > ---
> >  doc/guides/nics/i40e.rst               |   8 ++
> 
> Doc is causing warning:
> doc/guides/nics/i40e.rst:172: WARNING: Unexpected indentation.
> 
> Except from that,
> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>

Applied to dpdk-next-net-intel.

Thanks
Qi

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2018-09-18 13:52 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-08-30  2:16 [dpdk-dev] [PATCH] net/i40e: add interface to choose latest vector path Xiaoyun Li
2018-08-31 11:24 ` [dpdk-dev] [PATCH v2] " Xiaoyun Li
2018-09-04 11:39 ` [dpdk-dev] [PATCH v3] " Xiaoyun Li
2018-09-05 12:21   ` Zhang, Qi Z
2018-09-07  9:23     ` Li, Xiaoyun
2018-09-10 10:17 ` [dpdk-dev] [PATCH v4] " Xiaoyun Li
2018-09-12  7:45   ` Zhang, Qi Z
2018-09-12  7:50     ` Zhang, Qi Z
2018-09-12 10:12 ` [dpdk-dev] [PATCH v5] " Xiaoyun Li
2018-09-13  1:38   ` Zhang, Qi Z
2018-09-13 13:27   ` Ferruh Yigit
2018-09-17  7:12     ` Li, Xiaoyun
2018-09-17  9:58 ` [dpdk-dev] [PATCH v6] net/i40e: add interface to use latest vec path Xiaoyun Li
2018-09-17 14:14   ` Ferruh Yigit
2018-09-17 14:37     ` Thomas Monjalon
2018-09-18  1:28       ` Li, Xiaoyun
2018-09-18  2:22 ` [dpdk-dev] [PATCH v7] " Xiaoyun Li
2018-09-18 13:01   ` Ferruh Yigit
2018-09-18 13:52     ` Zhang, Qi Z

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).