DPDK patches and discussions
* [PATCH v5 00/12] preparation for the rte_flow offload of nfp PMD
@ 2022-08-05  6:32 Chaoyong He
  2022-08-05  6:32 ` [PATCH v5 01/12] net/nfp: move app specific attributes to own struct Chaoyong He
                   ` (11 more replies)
  0 siblings, 12 replies; 29+ messages in thread
From: Chaoyong He @ 2022-08-05  6:32 UTC (permalink / raw)
  To: dev; +Cc: niklas.soderlund, Chaoyong He

This is the first patch series adding rte_flow offload support to the
nfp PMD. It includes:
- Support for the flower firmware
- Support for representor ports
- The flower service infrastructure
- The cmsg interactive channels between the PMD and the firmware

* Changes since v4
- Remove the unneeded '__rte_unused' attribute
- Fix a potential memory leak

* Changes since v3
- Add the 'Depends-on' tag

* Changes since v2
- Remove the use of rte_panic()

* Changes since v1
- Fix the compile error

Depends-on: series-23707 ("Add support of NFP3800 chip and firmware with NFDk")

Chaoyong He (12):
  net/nfp: move app specific attributes to own struct
  net/nfp: simplify initialization and remove dead code
  net/nfp: move app specific init logic to own function
  net/nfp: add initial flower firmware support
  net/nfp: add flower PF setup and mempool init logic
  net/nfp: add flower PF related routines
  net/nfp: add flower ctrl VNIC related logics
  net/nfp: move common rxtx function for flower use
  net/nfp: add flower ctrl VNIC rxtx logic
  net/nfp: add flower representor framework
  net/nfp: move rxtx function to header file
  net/nfp: add flower PF rxtx logic

 drivers/net/nfp/flower/nfp_flower.c             | 1551 +++++++++++++++++++++++
 drivers/net/nfp/flower/nfp_flower.h             |   71 ++
 drivers/net/nfp/flower/nfp_flower_cmsg.c        |  186 +++
 drivers/net/nfp/flower/nfp_flower_cmsg.h        |  173 +++
 drivers/net/nfp/flower/nfp_flower_ctrl.c        |  252 ++++
 drivers/net/nfp/flower/nfp_flower_ctrl.h        |   13 +
 drivers/net/nfp/flower/nfp_flower_ovs_compat.h  |  145 +++
 drivers/net/nfp/flower/nfp_flower_representor.c |  508 ++++++++
 drivers/net/nfp/flower/nfp_flower_representor.h |   39 +
 drivers/net/nfp/meson.build                     |    4 +
 drivers/net/nfp/nfp_common.c                    |    2 +-
 drivers/net/nfp/nfp_common.h                    |   37 +-
 drivers/net/nfp/nfp_cpp_bridge.c                |   88 +-
 drivers/net/nfp/nfp_cpp_bridge.h                |    6 +-
 drivers/net/nfp/nfp_ethdev.c                    |  359 ++++--
 drivers/net/nfp/nfp_ethdev_vf.c                 |    2 +-
 drivers/net/nfp/nfp_rxtx.c                      |  123 +-
 drivers/net/nfp/nfp_rxtx.h                      |  121 ++
 drivers/net/nfp/nfpcore/nfp_cpp_pcie_ops.c      |   31 +-
 19 files changed, 3426 insertions(+), 285 deletions(-)
 create mode 100644 drivers/net/nfp/flower/nfp_flower.c
 create mode 100644 drivers/net/nfp/flower/nfp_flower.h
 create mode 100644 drivers/net/nfp/flower/nfp_flower_cmsg.c
 create mode 100644 drivers/net/nfp/flower/nfp_flower_cmsg.h
 create mode 100644 drivers/net/nfp/flower/nfp_flower_ctrl.c
 create mode 100644 drivers/net/nfp/flower/nfp_flower_ctrl.h
 create mode 100644 drivers/net/nfp/flower/nfp_flower_ovs_compat.h
 create mode 100644 drivers/net/nfp/flower/nfp_flower_representor.c
 create mode 100644 drivers/net/nfp/flower/nfp_flower_representor.h

-- 
1.8.3.1



* [PATCH v5 01/12] net/nfp: move app specific attributes to own struct
  2022-08-05  6:32 [PATCH v5 00/12] preparation for the rte_flow offload of nfp PMD Chaoyong He
@ 2022-08-05  6:32 ` Chaoyong He
  2022-08-05 10:49   ` Andrew Rybchenko
  2022-08-05  6:32 ` [PATCH v5 02/12] net/nfp: simplify initialization and remove dead code Chaoyong He
                   ` (10 subsequent siblings)
  11 siblings, 1 reply; 29+ messages in thread
From: Chaoyong He @ 2022-08-05  6:32 UTC (permalink / raw)
  To: dev; +Cc: niklas.soderlund, Chaoyong He, Heinrich Kuhn

The NFP card can load different firmware applications. Currently
only the CoreNIC application is supported. This commit makes the
infrastructure changes needed to support other firmware
applications too.

The PF device is now clearly separated from any application-specific
concepts. The PF struct is generic regardless of the application
loaded, and a new struct holds the CoreNIC application state. Future
additions that support other applications should also add an
application-specific struct.
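
As a rough illustration of the split described above, the sketch below
keeps a generic PF struct that carries only an app ID and an opaque
pointer, with the CoreNIC state in its own struct. The type and function
names here (pf_dev, app_nic, init_app_nic, pf_to_app_nic) are trimmed-down
stand-ins chosen for the example, not the driver's actual definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, trimmed-down stand-ins for the driver's types. */
enum app_id {
	APP_CORE_NIC = 0x1,
	APP_FLOWER_NIC = 0x3,
};

struct pf_dev {
	enum app_id app_id; /* which firmware application is loaded */
	void *app_priv;     /* opaque pointer to the app-specific struct */
};

struct app_nic {
	struct pf_dev *pf_dev; /* backpointer to the generic PF */
	uint8_t total_phyports;
	int multiport;
};

/* Allocate the CoreNIC state and hang it off the generic PF struct. */
static int
init_app_nic(struct pf_dev *pf, uint8_t nports)
{
	struct app_nic *nic = calloc(1, sizeof(*nic));

	if (nic == NULL)
		return -1;

	nic->pf_dev = pf;
	nic->total_phyports = nports;
	nic->multiport = nports > 1;
	pf->app_priv = nic;
	return 0;
}

/* Only code that knows the app type casts app_priv back. */
static struct app_nic *
pf_to_app_nic(struct pf_dev *pf)
{
	assert(pf->app_id == APP_CORE_NIC);
	return pf->app_priv;
}
```

With this layout, a new application would only need its own struct and
init function; the PF struct itself would not have to change.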

Signed-off-by: Chaoyong He <chaoyong.he@corigine.com>
Signed-off-by: Heinrich Kuhn <heinrich.kuhn@corigine.com>
Reviewed-by: Niklas Söderlund <niklas.soderlund@corigine.com>
---
 drivers/net/nfp/nfp_common.h |  33 +++++++-
 drivers/net/nfp/nfp_ethdev.c | 196 +++++++++++++++++++++++++++----------------
 2 files changed, 154 insertions(+), 75 deletions(-)

diff --git a/drivers/net/nfp/nfp_common.h b/drivers/net/nfp/nfp_common.h
index 6d917e4..2aaf1d6 100644
--- a/drivers/net/nfp/nfp_common.h
+++ b/drivers/net/nfp/nfp_common.h
@@ -111,6 +111,14 @@
 #include <linux/types.h>
 #include <rte_io.h>
 
+/* Firmware application ID's */
+enum nfp_app_id {
+	NFP_APP_CORE_NIC               = 0x1,
+	NFP_APP_BPF_NIC                = 0x2,
+	NFP_APP_FLOWER_NIC             = 0x3,
+	NFP_APP_ACTIVE_BUFFER_MGMT_NIC = 0x4,
+};
+
 /* nfp_qcp_ptr - Read or Write Pointer of a queue */
 enum nfp_qcp_ptr {
 	NFP_QCP_READ_PTR = 0,
@@ -121,8 +129,10 @@ struct nfp_pf_dev {
 	/* Backpointer to associated pci device */
 	struct rte_pci_device *pci_dev;
 
-	/* Array of physical ports belonging to this PF */
-	struct nfp_net_hw *ports[NFP_MAX_PHYPORTS];
+	enum nfp_app_id app_id;
+
+	/* Pointer to the app running on the PF */
+	void *app_priv;
 
 	/* Current values for control */
 	uint32_t ctrl;
@@ -151,8 +161,6 @@ struct nfp_pf_dev {
 	struct nfp_cpp_area *msix_area;
 
 	uint8_t *hw_queues;
-	uint8_t total_phyports;
-	bool	multiport;
 
 	union eth_table_entry *eth_table;
 
@@ -161,6 +169,20 @@ struct nfp_pf_dev {
 	uint32_t nfp_cpp_service_id;
 };
 
+struct nfp_app_nic {
+	/* Backpointer to the PF device */
+	struct nfp_pf_dev *pf_dev;
+
+	/*
+	 * Array of physical ports belonging to the this CoreNIC app
+	 * This is really a list of vNIC's. One for each physical port
+	 */
+	struct nfp_net_hw *ports[NFP_MAX_PHYPORTS];
+
+	bool multiport;
+	uint8_t total_phyports;
+};
+
 struct nfp_net_hw {
 	/* Backpointer to the PF this port belongs to */
 	struct nfp_pf_dev *pf_dev;
@@ -424,6 +446,9 @@ int nfp_net_rss_hash_conf_get(struct rte_eth_dev *dev,
 #define NFP_NET_DEV_PRIVATE_TO_PF(dev_priv)\
 	(((struct nfp_net_hw *)dev_priv)->pf_dev)
 
+#define NFP_APP_PRIV_TO_APP_NIC(app_priv)\
+	((struct nfp_app_nic *)app_priv)
+
 #endif /* _NFP_COMMON_H_ */
 /*
  * Local variables:
diff --git a/drivers/net/nfp/nfp_ethdev.c b/drivers/net/nfp/nfp_ethdev.c
index 5cdd34e..3c4b0ac 100644
--- a/drivers/net/nfp/nfp_ethdev.c
+++ b/drivers/net/nfp/nfp_ethdev.c
@@ -39,15 +39,15 @@
 #include "nfp_cpp_bridge.h"
 
 static int
-nfp_net_pf_read_mac(struct nfp_pf_dev *pf_dev, int port)
+nfp_net_pf_read_mac(struct nfp_app_nic *app_nic, int port)
 {
 	struct nfp_eth_table *nfp_eth_table;
 	struct nfp_net_hw *hw = NULL;
 
 	/* Grab a pointer to the correct physical port */
-	hw = pf_dev->ports[port];
+	hw = app_nic->ports[port];
 
-	nfp_eth_table = nfp_eth_read_ports(pf_dev->cpp);
+	nfp_eth_table = nfp_eth_read_ports(app_nic->pf_dev->cpp);
 
 	nfp_eth_copy_mac((uint8_t *)&hw->mac_addr,
 			 (uint8_t *)&nfp_eth_table->ports[port].mac_addr);
@@ -64,6 +64,7 @@
 	uint32_t new_ctrl, update = 0;
 	struct nfp_net_hw *hw;
 	struct nfp_pf_dev *pf_dev;
+	struct nfp_app_nic *app_nic;
 	struct rte_eth_conf *dev_conf;
 	struct rte_eth_rxmode *rxmode;
 	uint32_t intr_vector;
@@ -71,6 +72,7 @@
 
 	hw = NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
 	pf_dev = NFP_NET_DEV_PRIVATE_TO_PF(dev->data->dev_private);
+	app_nic = NFP_APP_PRIV_TO_APP_NIC(pf_dev->app_priv);
 
 	PMD_INIT_LOG(DEBUG, "Start");
 
@@ -82,7 +84,7 @@
 
 	/* check and configure queue intr-vector mapping */
 	if (dev->data->dev_conf.intr_conf.rxq != 0) {
-		if (pf_dev->multiport) {
+		if (app_nic->multiport) {
 			PMD_INIT_LOG(ERR, "PMD rx interrupt is not supported "
 					  "with NFP multiport PF");
 				return -EINVAL;
@@ -250,6 +252,7 @@
 	struct nfp_net_hw *hw;
 	struct rte_pci_device *pci_dev;
 	struct nfp_pf_dev *pf_dev;
+	struct nfp_app_nic *app_nic;
 	int i;
 
 	if (rte_eal_process_type() != RTE_PROC_PRIMARY)
@@ -260,6 +263,7 @@
 	pf_dev = NFP_NET_DEV_PRIVATE_TO_PF(dev->data->dev_private);
 	hw = NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
 	pci_dev = RTE_ETH_DEV_TO_PCI(dev);
+	app_nic = NFP_APP_PRIV_TO_APP_NIC(pf_dev->app_priv);
 
 	/*
 	 * We assume that the DPDK application is stopping all the
@@ -280,12 +284,12 @@
 	/* Only free PF resources after all physical ports have been closed */
 	/* Mark this port as unused and free device priv resources*/
 	nn_cfg_writeb(hw, NFP_NET_CFG_LSC, 0xff);
-	pf_dev->ports[hw->idx] = NULL;
+	app_nic->ports[hw->idx] = NULL;
 	rte_eth_dev_release_port(dev);
 
-	for (i = 0; i < pf_dev->total_phyports; i++) {
+	for (i = 0; i < app_nic->total_phyports; i++) {
 		/* Check to see if ports are still in use */
-		if (pf_dev->ports[i])
+		if (app_nic->ports[i])
 			return 0;
 	}
 
@@ -296,6 +300,7 @@
 	free(pf_dev->hwinfo);
 	free(pf_dev->sym_tbl);
 	nfp_cpp_free(pf_dev->cpp);
+	rte_free(app_nic);
 	rte_free(pf_dev);
 
 	rte_intr_disable(pci_dev->intr_handle);
@@ -404,6 +409,7 @@
 {
 	struct rte_pci_device *pci_dev;
 	struct nfp_pf_dev *pf_dev;
+	struct nfp_app_nic *app_nic;
 	struct nfp_net_hw *hw;
 	struct rte_ether_addr *tmp_ether_addr;
 	uint64_t rx_bar_off = 0;
@@ -420,6 +426,9 @@
 	/* Use backpointer here to the PF of this eth_dev */
 	pf_dev = NFP_NET_DEV_PRIVATE_TO_PF(eth_dev->data->dev_private);
 
+	/* Use backpointer to the CoreNIC app struct */
+	app_nic = NFP_APP_PRIV_TO_APP_NIC(pf_dev->app_priv);
+
 	/* NFP can not handle DMA addresses requiring more than 40 bits */
 	if (rte_mem_check_dma_mask(40)) {
 		RTE_LOG(ERR, PMD,
@@ -438,7 +447,7 @@
 	 * Use PF array of physical ports to get pointer to
 	 * this specific port
 	 */
-	hw = pf_dev->ports[port];
+	hw = app_nic->ports[port];
 
 	PMD_INIT_LOG(DEBUG, "Working with physical port number: %d, "
 			"NFP internal port number: %d", port, hw->nfp_idx);
@@ -568,7 +577,7 @@
 		goto dev_err_queues_map;
 	}
 
-	nfp_net_pf_read_mac(pf_dev, port);
+	nfp_net_pf_read_mac(app_nic, port);
 	nfp_net_write_mac(hw, (uint8_t *)&hw->mac_addr);
 
 	tmp_ether_addr = (struct rte_ether_addr *)&hw->mac_addr;
@@ -718,25 +727,67 @@
 }
 
 static int
-nfp_init_phyports(struct nfp_pf_dev *pf_dev)
+nfp_init_app_nic(struct nfp_pf_dev *pf_dev,
+		struct nfp_eth_table *nfp_eth_table)
 {
 	int i;
-	int ret = 0;
+	int ret;
+	int err = 0;
+	int total_vnics;
 	struct nfp_net_hw *hw;
+	unsigned int numa_node;
 	struct rte_eth_dev *eth_dev;
-	struct nfp_eth_table *nfp_eth_table;
+	struct nfp_app_nic *app_nic;
+	char port_name[RTE_ETH_NAME_MAX_LEN];
 
-	nfp_eth_table = nfp_eth_read_ports(pf_dev->cpp);
-	if (nfp_eth_table == NULL) {
-		PMD_INIT_LOG(ERR, "Error reading NFP ethernet table");
-		return -EIO;
+	PMD_INIT_LOG(INFO, "Total physical ports: %d", nfp_eth_table->count);
+
+	/* Allocate memory for the CoreNIC app */
+	app_nic = rte_zmalloc("nfp_app_nic", sizeof(*app_nic), 0);
+	if (app_nic == NULL)
+		return -ENOMEM;
+
+	/* Point the app_priv pointer in the PF to the coreNIC app */
+	pf_dev->app_priv = app_nic;
+
+	/* Read the number of vNIC's created for the PF */
+	total_vnics = nfp_rtsym_read_le(pf_dev->sym_tbl, "nfd_cfg_pf0_num_ports", &err);
+	if (err || total_vnics <= 0 || total_vnics > 8) {
+		PMD_INIT_LOG(ERR, "nfd_cfg_pf0_num_ports symbol with wrong value");
+		ret = -ENODEV;
+		goto app_cleanup;
 	}
 
-	/* Loop through all physical ports on PF */
-	for (i = 0; i < pf_dev->total_phyports; i++) {
-		const unsigned int numa_node = rte_socket_id();
-		char port_name[RTE_ETH_NAME_MAX_LEN];
+	/*
+	 * For coreNIC the number of vNICs exposed should be the same as the
+	 * number of physical ports
+	 */
+	if (total_vnics != (int)nfp_eth_table->count) {
+		PMD_INIT_LOG(ERR, "Total physical ports do not match number of vNICs");
+		ret = -ENODEV;
+		goto app_cleanup;
+	}
 
+	/* Populate coreNIC app properties*/
+	app_nic->total_phyports = total_vnics;
+	app_nic->pf_dev = pf_dev;
+	if (total_vnics > 1)
+		app_nic->multiport = true;
+
+	/* Map the symbol table */
+	pf_dev->ctrl_bar = nfp_rtsym_map(pf_dev->sym_tbl, "_pf0_net_bar0",
+			app_nic->total_phyports * 32768, &pf_dev->ctrl_area);
+	if (pf_dev->ctrl_bar == NULL) {
+		PMD_INIT_LOG(ERR, "nfp_rtsym_map fails for _pf0_net_ctrl_bar");
+		ret = -EIO;
+		goto app_cleanup;
+	}
+
+	PMD_INIT_LOG(DEBUG, "ctrl bar: %p", pf_dev->ctrl_bar);
+
+	/* Loop through all physical ports on PF */
+	numa_node = rte_socket_id();
+	for (i = 0; i < app_nic->total_phyports; i++) {
 		snprintf(port_name, sizeof(port_name), "%s_port%d",
 			 pf_dev->pci_dev->device.name, i);
 
@@ -760,7 +811,7 @@
 		hw = NFP_NET_DEV_PRIVATE_TO_HW(eth_dev->data->dev_private);
 
 		/* Add this device to the PF's array of physical ports */
-		pf_dev->ports[i] = hw;
+		app_nic->ports[i] = hw;
 
 		hw->pf_dev = pf_dev;
 		hw->cpp = pf_dev->cpp;
@@ -783,20 +834,21 @@
 		rte_eth_dev_probing_finish(eth_dev);
 
 	} /* End loop, all ports on this PF */
-	ret = 0;
-	goto eth_table_cleanup;
+
+	return 0;
 
 port_cleanup:
-	for (i = 0; i < pf_dev->total_phyports; i++) {
-		if (pf_dev->ports[i] && pf_dev->ports[i]->eth_dev) {
+	for (i = 0; i < app_nic->total_phyports; i++) {
+		if (app_nic->ports[i] && app_nic->ports[i]->eth_dev) {
 			struct rte_eth_dev *tmp_dev;
-			tmp_dev = pf_dev->ports[i]->eth_dev;
+			tmp_dev = app_nic->ports[i]->eth_dev;
 			rte_eth_dev_release_port(tmp_dev);
-			pf_dev->ports[i] = NULL;
+			app_nic->ports[i] = NULL;
 		}
 	}
-eth_table_cleanup:
-	free(nfp_eth_table);
+	nfp_cpp_area_free(pf_dev->ctrl_area);
+app_cleanup:
+	rte_free(app_nic);
 
 	return ret;
 }
@@ -804,11 +856,11 @@
 static int
 nfp_pf_init(struct rte_pci_device *pci_dev)
 {
-	int err;
-	int ret = 0;
+	int ret;
+	int err = 0;
 	uint64_t addr;
-	int total_ports;
 	struct nfp_cpp *cpp;
+	enum nfp_app_id app_id;
 	struct nfp_pf_dev *pf_dev;
 	struct nfp_hwinfo *hwinfo;
 	char name[RTE_ETH_NAME_MAX_LEN];
@@ -840,9 +892,10 @@
 	if (hwinfo == NULL) {
 		PMD_INIT_LOG(ERR, "Error reading hwinfo table");
 		ret = -EIO;
-		goto error;
+		goto cpp_cleanup;
 	}
 
+	/* Read the number of physical ports from hardware */
 	nfp_eth_table = nfp_eth_read_ports(cpp);
 	if (nfp_eth_table == NULL) {
 		PMD_INIT_LOG(ERR, "Error reading NFP ethernet table");
@@ -865,20 +918,14 @@
 		goto eth_table_cleanup;
 	}
 
-	total_ports = nfp_rtsym_read_le(sym_tbl, "nfd_cfg_pf0_num_ports", &err);
-	if (total_ports != (int)nfp_eth_table->count) {
-		PMD_DRV_LOG(ERR, "Inconsistent number of ports");
+	/* Read the app ID of the firmware loaded */
+	app_id = nfp_rtsym_read_le(sym_tbl, "_pf0_net_app_id", &err);
+	if (err) {
+		PMD_INIT_LOG(ERR, "Couldn't read app_id from fw");
 		ret = -EIO;
 		goto sym_tbl_cleanup;
 	}
 
-	PMD_INIT_LOG(INFO, "Total physical ports: %d", total_ports);
-
-	if (total_ports <= 0 || total_ports > 8) {
-		PMD_INIT_LOG(ERR, "nfd_cfg_pf0_num_ports symbol with wrong value");
-		ret = -ENODEV;
-		goto sym_tbl_cleanup;
-	}
 	/* Allocate memory for the PF "device" */
 	snprintf(name, sizeof(name), "nfp_pf%d", 0);
 	pf_dev = rte_zmalloc(name, sizeof(*pf_dev), 0);
@@ -888,27 +935,12 @@
 	}
 
 	/* Populate the newly created PF device */
+	pf_dev->app_id = app_id;
 	pf_dev->cpp = cpp;
 	pf_dev->hwinfo = hwinfo;
 	pf_dev->sym_tbl = sym_tbl;
-	pf_dev->total_phyports = total_ports;
-
-	if (total_ports > 1)
-		pf_dev->multiport = true;
-
 	pf_dev->pci_dev = pci_dev;
 
-	/* Map the symbol table */
-	pf_dev->ctrl_bar = nfp_rtsym_map(pf_dev->sym_tbl, "_pf0_net_bar0",
-			pf_dev->total_phyports * 32768, &pf_dev->ctrl_area);
-	if (pf_dev->ctrl_bar == NULL) {
-		PMD_INIT_LOG(ERR, "nfp_rtsym_map fails for _pf0_net_ctrl_bar");
-		ret = -EIO;
-		goto pf_cleanup;
-	}
-
-	PMD_INIT_LOG(DEBUG, "ctrl bar: %p", pf_dev->ctrl_bar);
-
 	/* configure access to tx/rx vNIC BARs */
 	switch (pci_dev->id.device_id) {
 	case PCI_DEVICE_ID_NFP3800_PF_NIC:
@@ -923,7 +955,7 @@
 	default:
 		PMD_INIT_LOG(ERR, "nfp_net: no device ID matching");
 		err = -ENODEV;
-		goto ctrl_area_cleanup;
+		goto pf_cleanup;
 	}
 
 	pf_dev->hw_queues = nfp_cpp_map_area(pf_dev->cpp, 0, 0,
@@ -932,18 +964,27 @@
 	if (pf_dev->hw_queues == NULL) {
 		PMD_INIT_LOG(ERR, "nfp_rtsym_map fails for net.qc");
 		ret = -EIO;
-		goto ctrl_area_cleanup;
+		goto pf_cleanup;
 	}
 
 	PMD_INIT_LOG(DEBUG, "tx/rx bar address: 0x%p", pf_dev->hw_queues);
 
 	/*
-	 * Initialize and prep physical ports now
-	 * This will loop through all physical ports
+	 * PF initialization has been done at this point. Call app specific
+	 * init code now
 	 */
-	ret = nfp_init_phyports(pf_dev);
-	if (ret) {
-		PMD_INIT_LOG(ERR, "Could not create physical ports");
+	switch (pf_dev->app_id) {
+	case NFP_APP_CORE_NIC:
+		PMD_INIT_LOG(INFO, "Initializing coreNIC");
+		ret = nfp_init_app_nic(pf_dev, nfp_eth_table);
+		if (ret) {
+			PMD_INIT_LOG(ERR, "Could not initialize coreNIC!");
+			goto hwqueues_cleanup;
+		}
+		break;
+	default:
+		PMD_INIT_LOG(ERR, "Unsupported Firmware loaded");
+		ret = -EINVAL;
 		goto hwqueues_cleanup;
 	}
 
@@ -954,8 +995,6 @@
 
 hwqueues_cleanup:
 	nfp_cpp_area_free(pf_dev->hwqueues_area);
-ctrl_area_cleanup:
-	nfp_cpp_area_free(pf_dev->ctrl_area);
 pf_cleanup:
 	rte_free(pf_dev);
 sym_tbl_cleanup:
@@ -964,6 +1003,8 @@
 	free(nfp_eth_table);
 hwinfo_cleanup:
 	free(hwinfo);
+cpp_cleanup:
+	nfp_cpp_free(cpp);
 error:
 	return ret;
 }
@@ -977,7 +1018,8 @@
 nfp_pf_secondary_init(struct rte_pci_device *pci_dev)
 {
 	int i;
-	int err;
+	int err = 0;
+	int ret = 0;
 	int total_ports;
 	struct nfp_cpp *cpp;
 	struct nfp_net_hw *hw;
@@ -1015,6 +1057,11 @@
 	}
 
 	total_ports = nfp_rtsym_read_le(sym_tbl, "nfd_cfg_pf0_num_ports", &err);
+	if (err || total_ports <= 0 || total_ports > 8) {
+		PMD_INIT_LOG(ERR, "nfd_cfg_pf0_num_ports symbol with wrong value");
+		ret = -ENODEV;
+		goto sym_tbl_cleanup;
+	}
 
 	for (i = 0; i < total_ports; i++) {
 		struct rte_eth_dev *eth_dev;
@@ -1028,7 +1075,8 @@
 		if (eth_dev == NULL) {
 			RTE_LOG(ERR, EAL,
 				"secondary process attach failed, ethdev doesn't exist");
-			return -ENODEV;
+			ret = -ENODEV;
+			break;
 		}
 
 		hw = NFP_NET_DEV_PRIVATE_TO_HW(eth_dev->data->dev_private);
@@ -1041,10 +1089,16 @@
 		rte_eth_dev_probing_finish(eth_dev);
 	}
 
+	if (ret)
+		goto sym_tbl_cleanup;
+
 	/* Register the CPP bridge service for the secondary too */
 	nfp_register_cpp_service(cpp);
 
-	return 0;
+sym_tbl_cleanup:
+	free(sym_tbl);
+
+	return ret;
 }
 
 static int
-- 
1.8.3.1



* [PATCH v5 02/12] net/nfp: simplify initialization and remove dead code
  2022-08-05  6:32 [PATCH v5 00/12] preparation for the rte_flow offload of nfp PMD Chaoyong He
  2022-08-05  6:32 ` [PATCH v5 01/12] net/nfp: move app specific attributes to own struct Chaoyong He
@ 2022-08-05  6:32 ` Chaoyong He
  2022-08-05  6:32 ` [PATCH v5 03/12] net/nfp: move app specific init logic to own function Chaoyong He
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 29+ messages in thread
From: Chaoyong He @ 2022-08-05  6:32 UTC (permalink / raw)
  To: dev; +Cc: niklas.soderlund, Chaoyong He

nfp_net_init() is only called for the corenic firmware flavor, and
it is guaranteed to always be called from the primary process,
so the explicit check for RTE_PROC_PRIMARY can be dropped.

The callers of nfp_net_init() already guarantee that resources are
freed when it fails, so remove the unnecessary free logic inside it.
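
The caller-owned cleanup idea can be sketched as follows. This is a
minimal stand-alone example, not driver code: struct port, port_init()
and create_port() are hypothetical names, and malloc() stands in for
mapping a device area.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical port object; 'area' stands in for a mapped resource. */
struct port {
	void *area;
};

/* Per-port init: reports failure but leaves cleanup to its caller. */
static int
port_init(struct port *p, int fail)
{
	(void)p;
	return fail ? -ENODEV : 0;
}

/* The caller owns the resource, so it releases it on any failure. */
static int
create_port(struct port *p, int fail)
{
	int ret;

	p->area = malloc(64);
	if (p->area == NULL)
		return -ENOMEM;

	ret = port_init(p, fail);
	if (ret != 0)
		goto area_cleanup;

	return 0;

area_cleanup:
	free(p->area);
	p->area = NULL;
	return ret;
}
```

Because the release happens in exactly one place, the inner function can
return early on error without risking a double free.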

While at it, remove the unused member is_phyport from struct nfp_net_hw.

Signed-off-by: Chaoyong He <chaoyong.he@corigine.com>
Reviewed-by: Niklas Söderlund <niklas.soderlund@corigine.com>
---
 drivers/net/nfp/nfp_common.h |  1 -
 drivers/net/nfp/nfp_ethdev.c | 40 +++++++++++-----------------------------
 2 files changed, 11 insertions(+), 30 deletions(-)

diff --git a/drivers/net/nfp/nfp_common.h b/drivers/net/nfp/nfp_common.h
index 2aaf1d6..b28ebc9 100644
--- a/drivers/net/nfp/nfp_common.h
+++ b/drivers/net/nfp/nfp_common.h
@@ -238,7 +238,6 @@ struct nfp_net_hw {
 	uint8_t idx;
 	/* Internal port number as seen from NFP */
 	uint8_t nfp_idx;
-	bool	is_phyport;
 
 	union eth_table_entry *eth_table;
 
diff --git a/drivers/net/nfp/nfp_ethdev.c b/drivers/net/nfp/nfp_ethdev.c
index 3c4b0ac..2c5607c 100644
--- a/drivers/net/nfp/nfp_ethdev.c
+++ b/drivers/net/nfp/nfp_ethdev.c
@@ -417,7 +417,6 @@
 	uint32_t start_q;
 	int stride = 4;
 	int port = 0;
-	int err;
 
 	PMD_INIT_FUNC_TRACE();
 
@@ -452,10 +451,6 @@
 	PMD_INIT_LOG(DEBUG, "Working with physical port number: %d, "
 			"NFP internal port number: %d", port, hw->nfp_idx);
 
-	/* For secondary processes, the primary has done all the work */
-	if (rte_eal_process_type() != RTE_PROC_PRIMARY)
-		return 0;
-
 	rte_eth_copy_pci_info(eth_dev, pci_dev);
 
 	hw->device_id = pci_dev->id.device_id;
@@ -506,8 +501,7 @@
 		break;
 	default:
 		PMD_DRV_LOG(ERR, "nfp_net: no device ID matching");
-		err = -ENODEV;
-		goto dev_err_ctrl_map;
+		return -ENODEV;
 	}
 
 	PMD_INIT_LOG(DEBUG, "tx_bar_off: 0x%" PRIx64 "", tx_bar_off);
@@ -573,8 +567,7 @@
 					       RTE_ETHER_ADDR_LEN, 0);
 	if (eth_dev->data->mac_addrs == NULL) {
 		PMD_INIT_LOG(ERR, "Failed to space for MAC address");
-		err = -ENOMEM;
-		goto dev_err_queues_map;
+		return -ENOMEM;
 	}
 
 	nfp_net_pf_read_mac(app_nic, port);
@@ -604,24 +597,15 @@
 		     hw->mac_addr[0], hw->mac_addr[1], hw->mac_addr[2],
 		     hw->mac_addr[3], hw->mac_addr[4], hw->mac_addr[5]);
 
-	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
-		/* Registering LSC interrupt handler */
-		rte_intr_callback_register(pci_dev->intr_handle,
-				nfp_net_dev_interrupt_handler, (void *)eth_dev);
-		/* Telling the firmware about the LSC interrupt entry */
-		nn_cfg_writeb(hw, NFP_NET_CFG_LSC, NFP_NET_IRQ_LSC_IDX);
-		/* Recording current stats counters values */
-		nfp_net_stats_reset(eth_dev);
-	}
+	/* Registering LSC interrupt handler */
+	rte_intr_callback_register(pci_dev->intr_handle,
+			nfp_net_dev_interrupt_handler, (void *)eth_dev);
+	/* Telling the firmware about the LSC interrupt entry */
+	nn_cfg_writeb(hw, NFP_NET_CFG_LSC, NFP_NET_IRQ_LSC_IDX);
+	/* Recording current stats counters values */
+	nfp_net_stats_reset(eth_dev);
 
 	return 0;
-
-dev_err_queues_map:
-		nfp_cpp_area_free(hw->hwqueues_area);
-dev_err_ctrl_map:
-		nfp_cpp_area_free(hw->ctrl_area);
-
-	return err;
 }
 
 #define DEFAULT_FW_PATH       "/lib/firmware/netronome"
@@ -818,7 +802,6 @@
 		hw->eth_dev = eth_dev;
 		hw->idx = i;
 		hw->nfp_idx = nfp_eth_table->ports[i].index;
-		hw->is_phyport = true;
 
 		eth_dev->device = &pf_dev->pci_dev->device;
 
@@ -884,8 +867,7 @@
 
 	if (cpp == NULL) {
 		PMD_INIT_LOG(ERR, "A CPP handle can not be obtained");
-		ret = -EIO;
-		goto error;
+		return -EIO;
 	}
 
 	hwinfo = nfp_hwinfo_read(cpp);
@@ -1005,7 +987,7 @@
 	free(hwinfo);
 cpp_cleanup:
 	nfp_cpp_free(cpp);
-error:
+
 	return ret;
 }
 
-- 
1.8.3.1



* [PATCH v5 03/12] net/nfp: move app specific init logic to own function
  2022-08-05  6:32 [PATCH v5 00/12] preparation for the rte_flow offload of nfp PMD Chaoyong He
  2022-08-05  6:32 ` [PATCH v5 01/12] net/nfp: move app specific attributes to own struct Chaoyong He
  2022-08-05  6:32 ` [PATCH v5 02/12] net/nfp: simplify initialization and remove dead code Chaoyong He
@ 2022-08-05  6:32 ` Chaoyong He
  2022-08-05 10:53   ` Andrew Rybchenko
  2022-08-05  6:32 ` [PATCH v5 04/12] net/nfp: add initial flower firmware support Chaoyong He
                   ` (8 subsequent siblings)
  11 siblings, 1 reply; 29+ messages in thread
From: Chaoyong He @ 2022-08-05  6:32 UTC (permalink / raw)
  To: dev; +Cc: niklas.soderlund, Chaoyong He

The NFP card can load different firmware applications.
This commit moves the secondary-process init logic of the
corenic app into its own function.
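
A minimal sketch of this kind of dispatch is shown below, assuming the
app ID has already been read from the firmware. The function names and
the read_err parameter are made up for the example, and returning -EIO
when the app ID cannot be read is an illustrative choice rather than
something taken from the patch.

```c
#include <errno.h>

enum app_id {
	APP_CORE_NIC = 0x1,
	APP_FLOWER_NIC = 0x3,
};

/* Hypothetical per-application attach routine for a secondary process. */
static int
secondary_init_core_nic(int total_vnics)
{
	if (total_vnics <= 0 || total_vnics > 8)
		return -ENODEV;
	return 0;
}

/* Select the attach routine from the app ID; unknown IDs are rejected. */
static int
secondary_init(int app_id, int read_err, int total_vnics)
{
	if (read_err != 0)
		return -EIO; /* the app ID could not be read */

	switch (app_id) {
	case APP_CORE_NIC:
		return secondary_init_core_nic(total_vnics);
	default:
		return -EINVAL;
	}
}
```

Each new firmware application then adds one case and one attach routine,
and the common secondary-process path stays unchanged.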

Signed-off-by: Chaoyong He <chaoyong.he@corigine.com>
Reviewed-by: Niklas Söderlund <niklas.soderlund@corigine.com>
---
 drivers/net/nfp/nfp_ethdev.c | 93 +++++++++++++++++++++++++++++---------------
 1 file changed, 62 insertions(+), 31 deletions(-)

diff --git a/drivers/net/nfp/nfp_ethdev.c b/drivers/net/nfp/nfp_ethdev.c
index 2c5607c..90dd01e 100644
--- a/drivers/net/nfp/nfp_ethdev.c
+++ b/drivers/net/nfp/nfp_ethdev.c
@@ -936,7 +936,7 @@
 		break;
 	default:
 		PMD_INIT_LOG(ERR, "nfp_net: no device ID matching");
-		err = -ENODEV;
+		ret = -ENODEV;
 		goto pf_cleanup;
 	}
 
@@ -991,6 +991,50 @@
 	return ret;
 }
 
+static int
+nfp_secondary_init_app_nic(struct rte_pci_device *pci_dev,
+		struct nfp_rtsym_table *sym_tbl,
+		struct nfp_cpp *cpp)
+{
+	int i;
+	int err = 0;
+	int ret = 0;
+	int total_vnics;
+	struct nfp_net_hw *hw;
+
+	/* Read the number of vNIC's created for the PF */
+	total_vnics = nfp_rtsym_read_le(sym_tbl, "nfd_cfg_pf0_num_ports", &err);
+	if (err || total_vnics <= 0 || total_vnics > 8) {
+		PMD_INIT_LOG(ERR, "nfd_cfg_pf0_num_ports symbol with wrong value");
+		return -ENODEV;
+	}
+
+	for (i = 0; i < total_vnics; i++) {
+		struct rte_eth_dev *eth_dev;
+		char port_name[RTE_ETH_NAME_MAX_LEN];
+		snprintf(port_name, sizeof(port_name), "%s_port%d",
+				pci_dev->device.name, i);
+
+		PMD_DRV_LOG(DEBUG, "Secondary attaching to port %s", port_name);
+		eth_dev = rte_eth_dev_attach_secondary(port_name);
+		if (eth_dev == NULL) {
+			RTE_LOG(ERR, EAL,
+				"secondary process attach failed, ethdev doesn't exist");
+			ret = -ENODEV;
+			break;
+		}
+
+		eth_dev->process_private = cpp;
+		hw = NFP_NET_DEV_PRIVATE_TO_HW(eth_dev->data->dev_private);
+		if (nfp_net_ethdev_ops_mount(hw, eth_dev))
+			return -EINVAL;
+
+		rte_eth_dev_probing_finish(eth_dev);
+	}
+
+	return ret;
+}
+
 /*
  * When attaching to the NFP4000/6000 PF on a secondary process there
  * is no need to initialise the PF again. Only minimal work is required
@@ -999,12 +1043,10 @@
 static int
 nfp_pf_secondary_init(struct rte_pci_device *pci_dev)
 {
-	int i;
 	int err = 0;
 	int ret = 0;
-	int total_ports;
+	enum nfp_app_id app_id;
 	struct nfp_cpp *cpp;
-	struct nfp_net_hw *hw;
 	struct nfp_rtsym_table *sym_tbl;
 
 	if (pci_dev == NULL)
@@ -1038,37 +1080,26 @@
 		return -EIO;
 	}
 
-	total_ports = nfp_rtsym_read_le(sym_tbl, "nfd_cfg_pf0_num_ports", &err);
-	if (err || total_ports <= 0 || total_ports > 8) {
-		PMD_INIT_LOG(ERR, "nfd_cfg_pf0_num_ports symbol with wrong value");
-		ret = -ENODEV;
+	/* Read the app ID of the firmware loaded */
+	app_id = nfp_rtsym_read_le(sym_tbl, "_pf0_net_app_id", &err);
+	if (err) {
+		PMD_INIT_LOG(ERR, "Couldn't read app_id from fw");
 		goto sym_tbl_cleanup;
 	}
 
-	for (i = 0; i < total_ports; i++) {
-		struct rte_eth_dev *eth_dev;
-		char port_name[RTE_ETH_NAME_MAX_LEN];
-
-		snprintf(port_name, sizeof(port_name), "%s_port%d",
-			 pci_dev->device.name, i);
-
-		PMD_DRV_LOG(DEBUG, "Secondary attaching to port %s", port_name);
-		eth_dev = rte_eth_dev_attach_secondary(port_name);
-		if (eth_dev == NULL) {
-			RTE_LOG(ERR, EAL,
-				"secondary process attach failed, ethdev doesn't exist");
-			ret = -ENODEV;
-			break;
+	switch (app_id) {
+	case NFP_APP_CORE_NIC:
+		PMD_INIT_LOG(INFO, "Initializing coreNIC");
+		ret = nfp_secondary_init_app_nic(pci_dev, sym_tbl, cpp);
+		if (ret) {
+			PMD_INIT_LOG(ERR, "Could not initialize coreNIC!");
+			goto sym_tbl_cleanup;
 		}
-
-		hw = NFP_NET_DEV_PRIVATE_TO_HW(eth_dev->data->dev_private);
-
-		if (nfp_net_ethdev_ops_mount(hw, eth_dev))
-			return -EINVAL;
-
-		eth_dev->process_private = cpp;
-
-		rte_eth_dev_probing_finish(eth_dev);
+		break;
+	default:
+		PMD_INIT_LOG(ERR, "Unsupported Firmware loaded");
+		ret = -EINVAL;
+		goto sym_tbl_cleanup;
 	}
 
 	if (ret)
-- 
1.8.3.1



* [PATCH v5 04/12] net/nfp: add initial flower firmware support
  2022-08-05  6:32 [PATCH v5 00/12] preparation for the rte_flow offload of nfp PMD Chaoyong He
                   ` (2 preceding siblings ...)
  2022-08-05  6:32 ` [PATCH v5 03/12] net/nfp: move app specific init logic to own function Chaoyong He
@ 2022-08-05  6:32 ` Chaoyong He
  2022-08-05 11:00   ` Andrew Rybchenko
  2022-08-05  6:32 ` [PATCH v5 05/12] net/nfp: add flower PF setup and mempool init logic Chaoyong He
                   ` (7 subsequent siblings)
  11 siblings, 1 reply; 29+ messages in thread
From: Chaoyong He @ 2022-08-05  6:32 UTC (permalink / raw)
  To: dev; +Cc: niklas.soderlund, Chaoyong He, Heinrich Kuhn

This commit adds the basic probing infrastructure to support the flower
firmware. This firmware is geared towards offloading OVS and can
generally be found in /lib/firmware/netronome/flower. It is also used by
the NFP kernel driver when OVS offload with TC is desired.

This commit also adds the basic infrastructure needed by the flower
firmware to operate. The firmware requires threads to service both the
PF vNIC and the ctrl vNIC. The PF is responsible for handling any
fallback traffic and the ctrl vNIC is used to communicate OVS flows
and flow statistics to and from the smartNIC. rte_services are used to
facilitate this logic.
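
The registration pattern for the two services can be sketched roughly as
below. This is a self-contained mock and deliberately does not use the
rte_service API: struct service_spec, service_register() and the two
poll callbacks are hypothetical stand-ins for a PF vNIC service and a
ctrl vNIC service.

```c
#include <stddef.h>
#include <stdint.h>

#define SERVICE_MAX 2

/* Hypothetical stand-in for a service descriptor; not the rte_service API. */
struct service_spec {
	const char *name;
	int (*callback)(void *userdata);
	void *userdata;
};

static int pf_vnic_poll(void *userdata) { (void)userdata; return 0; }
static int ctrl_vnic_poll(void *userdata) { (void)userdata; return 0; }

static struct service_spec services[SERVICE_MAX] = {
	{ .name = "pf_vnic", .callback = pf_vnic_poll },
	{ .name = "ctrl_vnic", .callback = ctrl_vnic_poll },
};

/* Mock registry: hands out sequential IDs, rejects specs with no callback. */
static int
service_register(const struct service_spec *spec, uint32_t *id)
{
	static uint32_t next_id;

	if (spec->callback == NULL)
		return -1;
	*id = next_id++;
	return 0;
}

/* Register every service, stopping at the first failure. */
static int
enable_services(void *app, uint32_t ids[SERVICE_MAX])
{
	int i;
	int ret = 0;

	for (i = 0; i < SERVICE_MAX; i++) {
		/* Pass the app pointer to the service callback. */
		services[i].userdata = app;
		ret = service_register(&services[i], &ids[i]);
		if (ret != 0)
			break;
	}
	return ret;
}
```

Keeping the services in a table means the same loop covers both vNICs,
and later patches can add entries without touching the loop.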

This commit also adds the cpp service, used for some user tools.

Signed-off-by: Chaoyong He <chaoyong.he@corigine.com>
Signed-off-by: Heinrich Kuhn <heinrich.kuhn@corigine.com>
Reviewed-by: Niklas Söderlund <niklas.soderlund@corigine.com>
---
 drivers/net/nfp/flower/nfp_flower.c | 101 ++++++++++++++++++++++++++++++++++++
 drivers/net/nfp/flower/nfp_flower.h |  22 ++++++++
 drivers/net/nfp/meson.build         |   1 +
 drivers/net/nfp/nfp_cpp_bridge.c    |  88 ++++++++++++++++++++++++++-----
 drivers/net/nfp/nfp_cpp_bridge.h    |   6 ++-
 drivers/net/nfp/nfp_ethdev.c        |  40 ++++++++++++--
 6 files changed, 239 insertions(+), 19 deletions(-)
 create mode 100644 drivers/net/nfp/flower/nfp_flower.c
 create mode 100644 drivers/net/nfp/flower/nfp_flower.h

diff --git a/drivers/net/nfp/flower/nfp_flower.c b/drivers/net/nfp/flower/nfp_flower.c
new file mode 100644
index 0000000..1dddced
--- /dev/null
+++ b/drivers/net/nfp/flower/nfp_flower.c
@@ -0,0 +1,101 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2022 Corigine, Inc.
+ * All rights reserved.
+ */
+
+#include <rte_common.h>
+#include <ethdev_driver.h>
+#include <rte_service_component.h>
+#include <rte_malloc.h>
+#include <ethdev_pci.h>
+#include <ethdev_driver.h>
+
+#include "../nfp_common.h"
+#include "../nfp_logs.h"
+#include "../nfp_ctrl.h"
+#include "../nfp_cpp_bridge.h"
+#include "nfp_flower.h"
+
+static struct rte_service_spec flower_services[NFP_FLOWER_SERVICE_MAX] = {
+};
+
+static int
+nfp_flower_enable_services(struct nfp_app_flower *app_flower)
+{
+	int i;
+	int ret = 0;
+
+	for (i = 0; i < NFP_FLOWER_SERVICE_MAX; i++) {
+		/* Pass a pointer to the flower app to the service */
+		flower_services[i].callback_userdata = (void *)app_flower;
+
+		/* Register the flower services */
+		ret = rte_service_component_register(&flower_services[i],
+				&app_flower->flower_services_ids[i]);
+		if (ret) {
+			PMD_INIT_LOG(WARNING,
+				"Could not register Flower PF vNIC service");
+			break;
+		}
+
+		PMD_INIT_LOG(INFO, "Flower PF vNIC service registered");
+
+		/* Map them to available service cores*/
+		ret = nfp_map_service(app_flower->flower_services_ids[i]);
+		if (ret)
+			break;
+	}
+
+	return ret;
+}
+
+int
+nfp_init_app_flower(struct nfp_pf_dev *pf_dev)
+{
+	int ret;
+	unsigned int numa_node;
+	struct nfp_net_hw *pf_hw;
+	struct nfp_app_flower *app_flower;
+
+	numa_node = rte_socket_id();
+
+	/* Allocate memory for the Flower app */
+	app_flower = rte_zmalloc_socket("nfp_app_flower", sizeof(*app_flower),
+			RTE_CACHE_LINE_SIZE, numa_node);
+	if (app_flower == NULL) {
+		ret = -ENOMEM;
+		goto done;
+	}
+
+	pf_dev->app_priv = app_flower;
+
+	/* Allocate memory for the PF AND ctrl vNIC here (hence the * 2) */
+	pf_hw = rte_zmalloc_socket("nfp_pf_vnic", 2 * sizeof(struct nfp_net_adapter),
+			RTE_CACHE_LINE_SIZE, numa_node);
+	if (pf_hw == NULL) {
+		ret = -ENOMEM;
+		goto app_cleanup;
+	}
+
+	/* Start up flower services */
+	if (nfp_flower_enable_services(app_flower)) {
+		ret = -ESRCH;
+		goto vnic_cleanup;
+	}
+
+	return 0;
+
+vnic_cleanup:
+	rte_free(pf_hw);
+app_cleanup:
+	rte_free(app_flower);
+done:
+	return ret;
+}
+
+int
+nfp_secondary_init_app_flower(__rte_unused struct nfp_cpp *cpp)
+{
+	PMD_INIT_LOG(ERR, "Flower firmware not supported");
+	return -ENOTSUP;
+}
diff --git a/drivers/net/nfp/flower/nfp_flower.h b/drivers/net/nfp/flower/nfp_flower.h
new file mode 100644
index 0000000..4a9b302
--- /dev/null
+++ b/drivers/net/nfp/flower/nfp_flower.h
@@ -0,0 +1,22 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2022 Corigine, Inc.
+ * All rights reserved.
+ */
+
+#ifndef _NFP_FLOWER_H_
+#define _NFP_FLOWER_H_
+
+enum nfp_flower_service {
+	NFP_FLOWER_SERVICE_MAX
+};
+
+/* The flower application's private structure */
+struct nfp_app_flower {
+	/* List of rte_service ID's for the flower app */
+	uint32_t flower_services_ids[NFP_FLOWER_SERVICE_MAX];
+};
+
+int nfp_init_app_flower(struct nfp_pf_dev *pf_dev);
+int nfp_secondary_init_app_flower(struct nfp_cpp *cpp);
+
+#endif /* _NFP_FLOWER_H_ */
diff --git a/drivers/net/nfp/meson.build b/drivers/net/nfp/meson.build
index 810f02a..7ae3115 100644
--- a/drivers/net/nfp/meson.build
+++ b/drivers/net/nfp/meson.build
@@ -6,6 +6,7 @@ if not is_linux or not dpdk_conf.get('RTE_ARCH_64')
     reason = 'only supported on 64-bit Linux'
 endif
 sources = files(
+        'flower/nfp_flower.c',
         'nfpcore/nfp_cpp_pcie_ops.c',
         'nfpcore/nfp_nsp.c',
         'nfpcore/nfp_cppcore.c',
diff --git a/drivers/net/nfp/nfp_cpp_bridge.c b/drivers/net/nfp/nfp_cpp_bridge.c
index 0922ea9..9ac165a 100644
--- a/drivers/net/nfp/nfp_cpp_bridge.c
+++ b/drivers/net/nfp/nfp_cpp_bridge.c
@@ -28,22 +28,86 @@
 static int nfp_cpp_bridge_serve_write(int sockfd, struct nfp_cpp *cpp);
 static int nfp_cpp_bridge_serve_read(int sockfd, struct nfp_cpp *cpp);
 static int nfp_cpp_bridge_serve_ioctl(int sockfd, struct nfp_cpp *cpp);
+static int nfp_cpp_bridge_service_func(void *args);
 
-void nfp_register_cpp_service(struct nfp_cpp *cpp)
+static struct rte_service_spec cpp_service = {
+	.name         = "nfp_cpp_service",
+	.callback     = nfp_cpp_bridge_service_func,
+};
+
+int
+nfp_map_service(uint32_t service_id)
 {
-	uint32_t *cpp_service_id = NULL;
-	struct rte_service_spec service;
+	int32_t ret;
+	uint32_t slcore = 0;
+	int32_t slcore_count;
+	uint8_t service_count;
+	const char *service_name;
+	uint32_t slcore_array[RTE_MAX_LCORE];
+	uint8_t min_service_count = UINT8_MAX;
+
+	slcore_count = rte_service_lcore_list(slcore_array, RTE_MAX_LCORE);
+	if (slcore_count <= 0) {
+		PMD_INIT_LOG(DEBUG, "No service cores found");
+		return -ENOENT;
+	}
+
+	/*
+	 * Find a service core with the least number of services already
+	 * registered to it
+	 */
+	while (slcore_count--) {
+		service_count = rte_service_lcore_count_services(slcore_array[slcore_count]);
+		if (service_count < min_service_count) {
+			slcore = slcore_array[slcore_count];
+			min_service_count = service_count;
+		}
+	}
 
-	memset(&service, 0, sizeof(struct rte_service_spec));
-	snprintf(service.name, sizeof(service.name), "nfp_cpp_service");
-	service.callback = nfp_cpp_bridge_service_func;
-	service.callback_userdata = (void *)cpp;
+	service_name = rte_service_get_name(service_id);
+	PMD_INIT_LOG(INFO, "Mapping service %s to core %u", service_name, slcore);
+	ret = rte_service_map_lcore_set(service_id, slcore, 1);
+	if (ret) {
+		PMD_INIT_LOG(DEBUG, "Could not map flower service");
+		return -ENOENT;
+	}
 
-	if (rte_service_component_register(&service,
-					   cpp_service_id))
-		RTE_LOG(WARNING, PMD, "NFP CPP bridge service register() failed");
+	rte_service_runstate_set(service_id, 1);
+	rte_service_component_runstate_set(service_id, 1);
+	rte_service_lcore_start(slcore);
+	if (rte_service_may_be_active(slcore))
+		RTE_LOG(INFO, PMD, "The service %s is running", service_name);
 	else
-		RTE_LOG(DEBUG, PMD, "NFP CPP bridge service registered");
+		RTE_LOG(INFO, PMD, "The service %s is not running", service_name);
+
+	return 0;
+}
+
+int nfp_enable_cpp_service(struct nfp_cpp *cpp, enum nfp_app_id app_id)
+{
+	int ret = 0;
+	uint32_t id = 0;
+
+	cpp_service.callback_userdata = (void *)cpp;
+
+	/* Register the cpp service */
+	ret = rte_service_component_register(&cpp_service, &id);
+	if (ret) {
+		PMD_INIT_LOG(WARNING, "Could not register nfp cpp service");
+		return -EINVAL;
+	}
+
+	PMD_INIT_LOG(INFO, "NFP cpp service registered");
+
+	/* Map it to an available service core */
+	ret = nfp_map_service(id);
+	if (ret) {
+		PMD_INIT_LOG(DEBUG, "Could not map nfp cpp service");
+		if (app_id == NFP_APP_FLOWER_NIC)
+			return -EINVAL;
+	}
+
+	return 0;
 }
 
 /*
@@ -307,7 +371,7 @@ void nfp_register_cpp_service(struct nfp_cpp *cpp)
  * unaware of the CPP bridge performing the NFP kernel char driver for CPP
  * accesses.
  */
-int32_t
+static int
 nfp_cpp_bridge_service_func(void *args)
 {
 	struct sockaddr address;
diff --git a/drivers/net/nfp/nfp_cpp_bridge.h b/drivers/net/nfp/nfp_cpp_bridge.h
index aea5fdc..dde50d7 100644
--- a/drivers/net/nfp/nfp_cpp_bridge.h
+++ b/drivers/net/nfp/nfp_cpp_bridge.h
@@ -16,6 +16,8 @@
 #ifndef _NFP_CPP_BRIDGE_H_
 #define _NFP_CPP_BRIDGE_H_
 
+#include "nfp_common.h"
+
 #define NFP_CPP_MEMIO_BOUNDARY	(1 << 20)
 #define NFP_BRIDGE_OP_READ	20
 #define NFP_BRIDGE_OP_WRITE	30
@@ -24,8 +26,8 @@
 #define NFP_IOCTL 'n'
 #define NFP_IOCTL_CPP_IDENTIFICATION _IOW(NFP_IOCTL, 0x8f, uint32_t)
 
-void nfp_register_cpp_service(struct nfp_cpp *cpp);
-int32_t nfp_cpp_bridge_service_func(void *args);
+int nfp_map_service(uint32_t service_id);
+int nfp_enable_cpp_service(struct nfp_cpp *cpp, enum nfp_app_id app_id);
 
 #endif /* _NFP_CPP_BRIDGE_H_ */
 /*
diff --git a/drivers/net/nfp/nfp_ethdev.c b/drivers/net/nfp/nfp_ethdev.c
index 90dd01e..0b88749 100644
--- a/drivers/net/nfp/nfp_ethdev.c
+++ b/drivers/net/nfp/nfp_ethdev.c
@@ -38,6 +38,8 @@
 #include "nfp_ctrl.h"
 #include "nfp_cpp_bridge.h"
 
+#include "flower/nfp_flower.h"
+
 static int
 nfp_net_pf_read_mac(struct nfp_app_nic *app_nic, int port)
 {
@@ -837,7 +839,8 @@
 }
 
 static int
-nfp_pf_init(struct rte_pci_device *pci_dev)
+nfp_pf_init(struct rte_pci_device *pci_dev,
+		struct rte_pci_driver *pci_drv)
 {
 	int ret;
 	int err = 0;
@@ -964,6 +967,16 @@
 			goto hwqueues_cleanup;
 		}
 		break;
+	case NFP_APP_FLOWER_NIC:
+		PMD_INIT_LOG(INFO, "Initializing Flower");
+		pci_dev->device.driver = &pci_drv->driver;
+		ret = nfp_init_app_flower(pf_dev);
+		if (ret) {
+			PMD_INIT_LOG(ERR, "Could not initialize Flower!");
+			pci_dev->device.driver = NULL;
+			goto hwqueues_cleanup;
+		}
+		break;
 	default:
 		PMD_INIT_LOG(ERR, "Unsupported Firmware loaded");
 		ret = -EINVAL;
@@ -971,7 +984,12 @@
 	}
 
 	/* register the CPP bridge service here for primary use */
-	nfp_register_cpp_service(pf_dev->cpp);
+	ret = nfp_enable_cpp_service(pf_dev->cpp, pf_dev->app_id);
+	if (ret) {
+		PMD_INIT_LOG(ERR, "Enable cpp service failed.");
+		ret = -EINVAL;
+		goto hwqueues_cleanup;
+	}
 
 	return 0;
 
@@ -1096,6 +1114,14 @@
 			goto sym_tbl_cleanup;
 		}
 		break;
+	case NFP_APP_FLOWER_NIC:
+		PMD_INIT_LOG(INFO, "Initializing Flower");
+		ret = nfp_secondary_init_app_flower(cpp);
+		if (ret) {
+			PMD_INIT_LOG(ERR, "Could not initialize Flower!");
+			goto sym_tbl_cleanup;
+		}
+		break;
 	default:
 		PMD_INIT_LOG(ERR, "Unsupported Firmware loaded");
 		ret = -EINVAL;
@@ -1106,7 +1132,11 @@
 		goto sym_tbl_cleanup;
 
 	/* Register the CPP bridge service for the secondary too */
-	nfp_register_cpp_service(cpp);
+	ret = nfp_enable_cpp_service(cpp, app_id);
+	if (ret) {
+		PMD_INIT_LOG(ERR, "Enable cpp service failed.");
+		ret = -EINVAL;
+	}
 
 sym_tbl_cleanup:
 	free(sym_tbl);
@@ -1115,11 +1145,11 @@
 }
 
 static int
-nfp_pf_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
+nfp_pf_pci_probe(struct rte_pci_driver *pci_drv,
 		struct rte_pci_device *dev)
 {
 	if (rte_eal_process_type() == RTE_PROC_PRIMARY)
-		return nfp_pf_init(dev);
+		return nfp_pf_init(dev, pci_drv);
 	else
 		return nfp_pf_secondary_init(dev);
 }
-- 
1.8.3.1



* [PATCH v5 05/12] net/nfp: add flower PF setup and mempool init logic
  2022-08-05  6:32 [PATCH v5 00/12] preparation for the rte_flow offload of nfp PMD Chaoyong He
                   ` (3 preceding siblings ...)
  2022-08-05  6:32 ` [PATCH v5 04/12] net/nfp: add initial flower firmware support Chaoyong He
@ 2022-08-05  6:32 ` Chaoyong He
  2022-08-05 12:49   ` Andrew Rybchenko
  2022-08-05  6:32 ` [PATCH v5 06/12] net/nfp: add flower PF related routines Chaoyong He
                   ` (6 subsequent siblings)
  11 siblings, 1 reply; 29+ messages in thread
From: Chaoyong He @ 2022-08-05  6:32 UTC (permalink / raw)
  To: dev; +Cc: niklas.soderlund, Chaoyong He

This commit adds the vNIC initialization logic for the flower PF vNIC.
The flower firmware exposes this vNIC for the purposes of fallback
traffic in the switchdev use-case. The logic of setting up this vNIC is
similar to the logic seen in nfp_net_init() and nfp_net_start().

This commit also adds minimal dev_ops for this PF device. Because the
device is exposed externally to DPDK, it should also be configured
using DPDK helpers like rte_eth_dev_configure(). For these helpers to
work, the flower logic needs to implement a minimal set of dev_ops. The
Rx and Tx logic for this vNIC will be added in a subsequent commit.
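
As a rough illustration of why stub dev_ops are needed, the sketch below
models a generic helper that dispatches through an ops table and refuses to
proceed when the callback is missing. All toy_* names are hypothetical
stand-ins, not DPDK types or functions, and the dispatch logic is a
simplified assumption about how such helpers behave.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for an ethdev and its ops table (not DPDK types). */
struct toy_dev;

struct toy_dev_ops {
	int (*dev_configure)(struct toy_dev *dev);
};

struct toy_dev {
	const struct toy_dev_ops *dev_ops;
	int configured;
};

/* Stub callback: nothing to program in hardware, so just report success. */
static int
toy_pf_configure(struct toy_dev *dev)
{
	(void)dev;
	return 0;
}

static const struct toy_dev_ops toy_pf_dev_ops = {
	.dev_configure = toy_pf_configure,
};

/* Simplified model of a generic helper that dispatches through dev_ops. */
static int
toy_dev_configure(struct toy_dev *dev)
{
	int ret;

	/* Without a callback the helper cannot configure the device. */
	if (dev->dev_ops == NULL || dev->dev_ops->dev_configure == NULL)
		return -ENOTSUP;

	ret = dev->dev_ops->dev_configure(dev);
	if (ret == 0)
		dev->configured = 1;

	return ret;
}
```

With toy_pf_dev_ops installed the helper succeeds even though the stub does
no work; with an empty ops table it fails before touching the device.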

OVS expects packets entering its datapath to be allocated from a
mempool that contains objects of type "struct dp_packet". Because the
PF handles the slow path into OVS, it should use a mempool that is
compatible with OVS. This commit adds the logic to create the
OVS-compatible mempool, along with the OVS-specific structs needed to
instantiate it.
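
The private-area sizing used for the OVS-compatible mempool can be sketched
in isolation as below. This mirrors the arithmetic in
nfp_flower_pf_mp_create(), but the function name priv_len(), the 64-byte
cache line, and the sizes passed in are assumptions chosen for
illustration; the real values come from sizeof(struct dp_packet),
sizeof(struct rte_mbuf) and RTE_MBUF_DEFAULT_BUF_SIZE.

```c
#include <stdint.h>

/* Assumed cache line size; the patch uses RTE_CACHE_LINE_SIZE instead. */
#define TOY_CACHE_LINE 64u

/* Round x up to the nearest multiple of y. */
#define TOY_ROUND_UP(x, y) ((((x) + (y) - 1) / (y)) * (y))

/*
 * Compute the mbuf private-area length so that the whole object
 * (dp_packet plus data buffer) ends on a cache line boundary.
 */
static uint32_t
priv_len(uint32_t dp_packet_size, uint32_t mbuf_size, uint32_t buf_size)
{
	/* Part of dp_packet that lives after the embedded mbuf. */
	uint32_t priv = dp_packet_size - mbuf_size;
	/* Size of the entire object before alignment. */
	uint32_t pkt_size = dp_packet_size + buf_size;
	/* Object size rounded up to the cache line size. */
	uint32_t aligned = TOY_ROUND_UP(pkt_size, TOY_CACHE_LINE);

	/* The alignment slack is absorbed by the private area. */
	return priv + (aligned - pkt_size);
}
```

For example, with made-up sizes of 700 bytes for dp_packet, 128 bytes for
the mbuf and a 2176-byte buffer, the unaligned object is 2876 bytes, so 4
bytes of slack are added to the 572-byte private area.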

Signed-off-by: Chaoyong He <chaoyong.he@corigine.com>
Reviewed-by: Niklas Söderlund <niklas.soderlund@corigine.com>
---
 drivers/net/nfp/flower/nfp_flower.c            | 384 ++++++++++++++++++++++++-
 drivers/net/nfp/flower/nfp_flower.h            |   9 +
 drivers/net/nfp/flower/nfp_flower_ovs_compat.h | 145 ++++++++++
 drivers/net/nfp/nfp_common.h                   |   3 +
 4 files changed, 537 insertions(+), 4 deletions(-)
 create mode 100644 drivers/net/nfp/flower/nfp_flower_ovs_compat.h

diff --git a/drivers/net/nfp/flower/nfp_flower.c b/drivers/net/nfp/flower/nfp_flower.c
index 1dddced..c05d4ca 100644
--- a/drivers/net/nfp/flower/nfp_flower.c
+++ b/drivers/net/nfp/flower/nfp_flower.c
@@ -14,7 +14,35 @@
 #include "../nfp_logs.h"
 #include "../nfp_ctrl.h"
 #include "../nfp_cpp_bridge.h"
+#include "../nfp_rxtx.h"
+#include "../nfpcore/nfp_mip.h"
+#include "../nfpcore/nfp_rtsym.h"
+#include "../nfpcore/nfp_nsp.h"
 #include "nfp_flower.h"
+#include "nfp_flower_ovs_compat.h"
+
+#define MAX_PKT_BURST 32
+#define MEMPOOL_CACHE_SIZE 512
+#define DEFAULT_FLBUF_SIZE 9216
+
+/*
+ * Simple dev ops functions for the flower PF. Because a rte_device is exposed
+ * to DPDK the flower logic also makes use of helper functions like
+ * rte_eth_dev_configure() to set up the PF device. Stub functions are needed to
+ * use these helper functions
+ */
+static int
+nfp_flower_pf_configure(__rte_unused struct rte_eth_dev *dev)
+{
+	return 0;
+}
+
+static const struct eth_dev_ops nfp_flower_pf_dev_ops = {
+	.dev_configure          = nfp_flower_pf_configure,
+
+	/* Use the normal dev_infos_get functionality in the NFP PMD */
+	.dev_infos_get          = nfp_net_infos_get,
+};
 
 static struct rte_service_spec flower_services[NFP_FLOWER_SERVICE_MAX] = {
 };
@@ -49,6 +77,304 @@
 	return ret;
 }
 
+static void
+nfp_flower_pf_mp_init(__rte_unused struct rte_mempool *mp,
+		__rte_unused void *opaque_arg,
+		void *_p,
+		__rte_unused unsigned int i)
+{
+	struct dp_packet *pkt = _p;
+	pkt->source      = DPBUF_DPDK;
+	pkt->l2_pad_size = 0;
+	pkt->l2_5_ofs    = UINT16_MAX;
+	pkt->l3_ofs      = UINT16_MAX;
+	pkt->l4_ofs      = UINT16_MAX;
+	pkt->packet_type = 0; /* PT_ETH */
+}
+
+static struct rte_mempool *
+nfp_flower_pf_mp_create(void)
+{
+	uint32_t nb_mbufs;
+	uint32_t pkt_size;
+	uint32_t n_rxd = 1024;
+	uint32_t n_txd = 1024;
+	unsigned int numa_node;
+	uint32_t aligned_mbuf_size;
+	uint32_t mbuf_priv_data_len;
+	struct rte_mempool *pktmbuf_pool;
+
+	nb_mbufs = RTE_MAX(n_rxd + n_txd + MAX_PKT_BURST + MEMPOOL_CACHE_SIZE,
+			81920U);
+
+	/*
+	 * The size of the mbuf's private area (i.e. area that holds OvS'
+	 * dp_packet data)
+	 */
+	mbuf_priv_data_len = sizeof(struct dp_packet) - sizeof(struct rte_mbuf);
+	/* The size of the entire dp_packet. */
+	pkt_size = sizeof(struct dp_packet) + RTE_MBUF_DEFAULT_BUF_SIZE;
+	/* mbuf size, rounded up to cacheline size. */
+	aligned_mbuf_size = ROUND_UP(pkt_size, RTE_CACHE_LINE_SIZE);
+	mbuf_priv_data_len += (aligned_mbuf_size - pkt_size);
+
+	numa_node = rte_socket_id();
+	pktmbuf_pool = rte_pktmbuf_pool_create("flower_pf_mbuf_pool", nb_mbufs,
+			MEMPOOL_CACHE_SIZE, mbuf_priv_data_len,
+			RTE_MBUF_DEFAULT_BUF_SIZE, numa_node);
+	if (pktmbuf_pool == NULL) {
+		RTE_LOG(ERR, PMD, "Cannot init mbuf pool\n");
+		return NULL;
+	}
+
+	rte_mempool_obj_iter(pktmbuf_pool, nfp_flower_pf_mp_init, NULL);
+
+	return pktmbuf_pool;
+}
+
+static void
+nfp_flower_cleanup_pf_vnic(struct nfp_net_hw *hw)
+{
+	uint16_t i;
+	struct rte_eth_dev *eth_dev;
+	struct nfp_app_flower *app_flower;
+
+	eth_dev = hw->eth_dev;
+	app_flower = NFP_APP_PRIV_TO_APP_FLOWER(hw->pf_dev->app_priv);
+
+	for (i = 0; i < eth_dev->data->nb_tx_queues; i++)
+		nfp_net_tx_queue_release(eth_dev, i);
+
+	for (i = 0; i < eth_dev->data->nb_rx_queues; i++)
+		nfp_net_rx_queue_release(eth_dev, i);
+
+	rte_free(eth_dev->data->mac_addrs);
+	rte_mempool_free(app_flower->pf_pktmbuf_pool);
+	rte_free(eth_dev->data->dev_private);
+	rte_eth_dev_release_port(hw->eth_dev);
+}
+
+static int
+nfp_flower_init_vnic_common(struct nfp_net_hw *hw, const char *vnic_type)
+{
+	uint32_t start_q;
+	uint64_t rx_bar_off;
+	uint64_t tx_bar_off;
+	const int stride = 4;
+	struct nfp_pf_dev *pf_dev;
+	struct rte_pci_device *pci_dev;
+
+	pf_dev = hw->pf_dev;
+	pci_dev = hw->pf_dev->pci_dev;
+
+	/* NFP can not handle DMA addresses requiring more than 40 bits */
+	if (rte_mem_check_dma_mask(40)) {
+		RTE_LOG(ERR, PMD,
+			"device %s can not be used: restricted dma mask to 40 bits!\n",
+			pci_dev->device.name);
+		return -ENODEV;
+	}
+
+	hw->device_id = pci_dev->id.device_id;
+	hw->vendor_id = pci_dev->id.vendor_id;
+	hw->subsystem_device_id = pci_dev->id.subsystem_device_id;
+	hw->subsystem_vendor_id = pci_dev->id.subsystem_vendor_id;
+
+	PMD_INIT_LOG(DEBUG, "%s vNIC ctrl bar: %p", vnic_type, hw->ctrl_bar);
+
+	/* Read the number of available rx/tx queues from hardware */
+	hw->max_rx_queues = nn_cfg_readl(hw, NFP_NET_CFG_MAX_RXRINGS);
+	hw->max_tx_queues = nn_cfg_readl(hw, NFP_NET_CFG_MAX_TXRINGS);
+
+	/* Work out where in the BAR the queues start */
+	start_q = nn_cfg_readl(hw, NFP_NET_CFG_START_TXQ);
+	tx_bar_off = (uint64_t)start_q * NFP_QCP_QUEUE_ADDR_SZ;
+	start_q = nn_cfg_readl(hw, NFP_NET_CFG_START_RXQ);
+	rx_bar_off = (uint64_t)start_q * NFP_QCP_QUEUE_ADDR_SZ;
+
+	hw->tx_bar = pf_dev->hw_queues + tx_bar_off;
+	hw->rx_bar = pf_dev->hw_queues + rx_bar_off;
+
+	/* Get some of the read-only fields from the config BAR */
+	hw->ver = nn_cfg_readl(hw, NFP_NET_CFG_VERSION);
+	hw->cap = nn_cfg_readl(hw, NFP_NET_CFG_CAP);
+	hw->max_mtu = nn_cfg_readl(hw, NFP_NET_CFG_MAX_MTU);
+	/* Set the current MTU to the maximum supported */
+	hw->mtu = hw->max_mtu;
+	hw->flbufsz = DEFAULT_FLBUF_SIZE;
+
+	/* read the Rx offset configured from firmware */
+	if (NFD_CFG_MAJOR_VERSION_of(hw->ver) < 2)
+		hw->rx_offset = NFP_NET_RX_OFFSET;
+	else
+		hw->rx_offset = nn_cfg_readl(hw, NFP_NET_CFG_RX_OFFSET_ADDR);
+
+	hw->ctrl = 0;
+	hw->stride_rx = stride;
+	hw->stride_tx = stride;
+
+	/* Reuse cfg queue setup function */
+	nfp_net_cfg_queue_setup(hw);
+
+	PMD_INIT_LOG(INFO, "%s vNIC max_rx_queues: %u, max_tx_queues: %u",
+			vnic_type, hw->max_rx_queues, hw->max_tx_queues);
+
+	/* Initializing spinlock for reconfigs */
+	rte_spinlock_init(&hw->reconfig_lock);
+
+	return 0;
+}
+
+static int
+nfp_flower_init_pf_vnic(struct nfp_net_hw *hw)
+{
+	int ret;
+	uint16_t i;
+	uint16_t n_txq;
+	uint16_t n_rxq;
+	uint16_t port_id;
+	unsigned int numa_node;
+	struct rte_mempool *mp;
+	struct nfp_pf_dev *pf_dev;
+	struct rte_eth_dev *eth_dev;
+	struct nfp_app_flower *app_flower;
+
+	const struct rte_eth_rxconf rx_conf = {
+		.rx_free_thresh = DEFAULT_RX_FREE_THRESH,
+		.rx_drop_en = 1,
+	};
+
+	const struct rte_eth_txconf tx_conf = {
+		.tx_thresh = {
+			.pthresh  = DEFAULT_TX_PTHRESH,
+			.hthresh = DEFAULT_TX_HTHRESH,
+			.wthresh = DEFAULT_TX_WTHRESH,
+		},
+		.tx_free_thresh = DEFAULT_TX_FREE_THRESH,
+	};
+
+	static struct rte_eth_conf port_conf = {
+		.rxmode = {
+			.mq_mode  = RTE_ETH_MQ_RX_RSS,
+			.offloads = RTE_ETH_RX_OFFLOAD_CHECKSUM,
+		},
+		.txmode = {
+			.mq_mode = RTE_ETH_MQ_TX_NONE,
+		},
+	};
+
+	/* Set up some pointers here for ease of use */
+	pf_dev = hw->pf_dev;
+	app_flower = NFP_APP_PRIV_TO_APP_FLOWER(pf_dev->app_priv);
+
+	/*
+	 * Perform the "common" part of setting up a flower vNIC.
+	 * Mostly reading configuration from hardware.
+	 */
+	ret = nfp_flower_init_vnic_common(hw, "pf_vnic");
+	if (ret)
+		goto done;
+
+	hw->eth_dev = rte_eth_dev_allocate("pf_vnic_eth_dev");
+	if (hw->eth_dev == NULL) {
+		ret = -ENOMEM;
+		goto done;
+	}
+
+	/* Grab the pointer to the newly created rte_eth_dev here */
+	eth_dev = hw->eth_dev;
+
+	numa_node = rte_socket_id();
+	eth_dev->data->dev_private =
+		rte_zmalloc_socket("pf_vnic_eth_dev", sizeof(struct nfp_net_hw),
+				   RTE_CACHE_LINE_SIZE, numa_node);
+	if (eth_dev->data->dev_private == NULL) {
+		ret = -ENOMEM;
+		goto port_release;
+	}
+
+	/* Fill in some of the eth_dev fields */
+	eth_dev->device = &pf_dev->pci_dev->device;
+	eth_dev->data->nb_tx_queues = hw->max_tx_queues;
+	eth_dev->data->nb_rx_queues = hw->max_rx_queues;
+	eth_dev->data->dev_private = hw;
+
+	/* Create a mbuf pool for the PF */
+	app_flower->pf_pktmbuf_pool = nfp_flower_pf_mp_create();
+	if (app_flower->pf_pktmbuf_pool == NULL) {
+		ret = -ENOMEM;
+		goto private_cleanup;
+	}
+
+	mp = app_flower->pf_pktmbuf_pool;
+
+	/* Set the dev_ops; Rx/Tx functions are added in a later commit */
+	eth_dev->dev_ops = &nfp_flower_pf_dev_ops;
+
+	/* PF vNIC gets a random MAC */
+	eth_dev->data->mac_addrs = rte_zmalloc("mac_addr",
+			RTE_ETHER_ADDR_LEN, 0);
+	if (eth_dev->data->mac_addrs == NULL) {
+		ret = -ENOMEM;
+		goto mempool_cleanup;
+	}
+
+	rte_eth_random_addr(eth_dev->data->mac_addrs->addr_bytes);
+	rte_eth_dev_probing_finish(eth_dev);
+
+	/* Configure the PF device now */
+	n_rxq = hw->eth_dev->data->nb_rx_queues;
+	n_txq = hw->eth_dev->data->nb_tx_queues;
+	port_id = hw->eth_dev->data->port_id;
+
+	ret = rte_eth_dev_configure(port_id, n_rxq, n_txq, &port_conf);
+	if (ret) {
+		PMD_INIT_LOG(ERR, "Could not configure PF device %d", ret);
+		goto mac_cleanup;
+	}
+
+	/* Set up the Rx queues */
+	for (i = 0; i < n_rxq; i++) {
+		/* Hardcoded number of desc to 1024 */
+		ret = nfp_net_rx_queue_setup(eth_dev, i, 1024, numa_node,
+			&rx_conf, mp);
+		if (ret) {
+			PMD_INIT_LOG(ERR, "Configure flower PF vNIC Rx queue %d failed", i);
+			goto rx_queue_cleanup;
+		}
+	}
+
+	/* Set up the Tx queues */
+	for (i = 0; i < n_txq; i++) {
+		/* Hardcoded number of desc to 1024 */
+		ret = nfp_net_nfd3_tx_queue_setup(eth_dev, i, 1024, numa_node,
+			&tx_conf);
+		if (ret) {
+			PMD_INIT_LOG(ERR, "Configure flower PF vNIC Tx queue %d failed", i);
+			goto tx_queue_cleanup;
+		}
+	}
+
+	return 0;
+
+tx_queue_cleanup:
+	for (i = 0; i < n_txq; i++)
+		nfp_net_tx_queue_release(eth_dev, i);
+rx_queue_cleanup:
+	for (i = 0; i < n_rxq; i++)
+		nfp_net_rx_queue_release(eth_dev, i);
+mac_cleanup:
+	rte_free(eth_dev->data->mac_addrs);
+mempool_cleanup:
+	rte_mempool_free(mp);
+private_cleanup:
+	rte_free(eth_dev->data->dev_private);
+port_release:
+	rte_eth_dev_release_port(hw->eth_dev);
+done:
+	return ret;
+}
+
 int
 nfp_init_app_flower(struct nfp_pf_dev *pf_dev)
 {
@@ -77,14 +403,49 @@
 		goto app_cleanup;
 	}
 
+	/* Grab the number of physical ports present on hardware */
+	app_flower->nfp_eth_table = nfp_eth_read_ports(pf_dev->cpp);
+	if (app_flower->nfp_eth_table == NULL) {
+		PMD_INIT_LOG(ERR, "error reading nfp ethernet table");
+		ret = -EIO;
+		goto vnic_cleanup;
+	}
+
+	/* Map the PF ctrl bar */
+	pf_dev->ctrl_bar = nfp_rtsym_map(pf_dev->sym_tbl, "_pf0_net_bar0",
+			32768, &pf_dev->ctrl_area);
+	if (pf_dev->ctrl_bar == NULL) {
+		PMD_INIT_LOG(ERR, "Could not map the PF vNIC ctrl bar");
+		ret = -ENODEV;
+		goto eth_tbl_cleanup;
+	}
+
+	/* Fill in the PF vNIC and populate app struct */
+	app_flower->pf_hw = pf_hw;
+	pf_hw->ctrl_bar = pf_dev->ctrl_bar;
+	pf_hw->pf_dev = pf_dev;
+	pf_hw->cpp = pf_dev->cpp;
+
+	ret = nfp_flower_init_pf_vnic(app_flower->pf_hw);
+	if (ret) {
+		PMD_INIT_LOG(ERR, "Could not initialize flower PF vNIC");
+		goto pf_cpp_area_cleanup;
+	}
+
 	/* Start up flower services */
 	if (nfp_flower_enable_services(app_flower)) {
 		ret = -ESRCH;
-		goto vnic_cleanup;
+		goto pf_vnic_cleanup;
 	}
 
 	return 0;
 
+pf_vnic_cleanup:
+	nfp_flower_cleanup_pf_vnic(app_flower->pf_hw);
+pf_cpp_area_cleanup:
+	nfp_cpp_area_free(pf_dev->ctrl_area);
+eth_tbl_cleanup:
+	free(app_flower->nfp_eth_table);
 vnic_cleanup:
 	rte_free(pf_hw);
 app_cleanup:
@@ -94,8 +455,23 @@
 }
 
 int
-nfp_secondary_init_app_flower(__rte_unused struct nfp_cpp *cpp)
+nfp_secondary_init_app_flower(struct nfp_cpp *cpp)
 {
-	PMD_INIT_LOG(ERR, "Flower firmware not supported");
-	return -ENOTSUP;
+	struct rte_eth_dev *eth_dev;
+	const char *port_name = "pf_vnic_eth_dev";
+
+	PMD_DRV_LOG(DEBUG, "Secondary attaching to port %s", port_name);
+
+	eth_dev = rte_eth_dev_attach_secondary(port_name);
+	if (eth_dev == NULL) {
+		RTE_LOG(ERR, EAL, "secondary process attach failed, "
+			"ethdev doesn't exist");
+		return -ENODEV;
+	}
+
+	eth_dev->process_private = cpp;
+	eth_dev->dev_ops = &nfp_flower_pf_dev_ops;
+	rte_eth_dev_probing_finish(eth_dev);
+
+	return 0;
 }
diff --git a/drivers/net/nfp/flower/nfp_flower.h b/drivers/net/nfp/flower/nfp_flower.h
index 4a9b302..f6fd4eb 100644
--- a/drivers/net/nfp/flower/nfp_flower.h
+++ b/drivers/net/nfp/flower/nfp_flower.h
@@ -14,6 +14,15 @@ enum nfp_flower_service {
 struct nfp_app_flower {
 	/* List of rte_service ID's for the flower app */
 	uint32_t flower_services_ids[NFP_FLOWER_SERVICE_MAX];
+
+	/* Pointer to a mempool for the PF vNIC */
+	struct rte_mempool *pf_pktmbuf_pool;
+
+	/* Pointer to the PF vNIC */
+	struct nfp_net_hw *pf_hw;
+
+	/* the eth table as reported by firmware */
+	struct nfp_eth_table *nfp_eth_table;
 };
 
 int nfp_init_app_flower(struct nfp_pf_dev *pf_dev);
diff --git a/drivers/net/nfp/flower/nfp_flower_ovs_compat.h b/drivers/net/nfp/flower/nfp_flower_ovs_compat.h
new file mode 100644
index 0000000..f0fcbf2
--- /dev/null
+++ b/drivers/net/nfp/flower/nfp_flower_ovs_compat.h
@@ -0,0 +1,145 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright (c) 2022 Corigine, Inc.
+ * All rights reserved.
+ */
+
+#ifndef _NFP_FLOWER_OVS_COMPAT_H_
+#define _NFP_FLOWER_OVS_COMPAT_H_
+
+/* From ovs */
+#define PAD_PASTE2(x, y) x##y
+#define PAD_PASTE(x, y) PAD_PASTE2(x, y)
+#define PAD_ID PAD_PASTE(pad, __COUNTER__)
+
+/* Returns X rounded up to the nearest multiple of Y. */
+#define ROUND_UP(X, Y) (DIV_ROUND_UP(X, Y) * (Y))
+
+typedef uint8_t OVS_CACHE_LINE_MARKER[1];
+
+#ifndef __cplusplus
+#define PADDED_MEMBERS_CACHELINE_MARKER(UNIT, CACHELINE, MEMBERS)   \
+	union {                                                         \
+		OVS_CACHE_LINE_MARKER CACHELINE;                            \
+		struct { MEMBERS };                                         \
+		uint8_t PAD_ID[ROUND_UP(sizeof(struct { MEMBERS }), UNIT)]; \
+	}
+#else
+#define PADDED_MEMBERS_CACHELINE_MARKER(UNIT, CACHELINE, MEMBERS)           \
+	struct struct_##CACHELINE { MEMBERS };                                  \
+	union {                                                                 \
+		OVS_CACHE_LINE_MARKER CACHELINE;                                    \
+		struct { MEMBERS };                                                 \
+		uint8_t PAD_ID[ROUND_UP(sizeof(struct struct_##CACHELINE), UNIT)];  \
+	}
+#endif
+
+struct ovs_key_ct_tuple_ipv4 {
+	rte_be32_t ipv4_src;
+	rte_be32_t ipv4_dst;
+	rte_be16_t src_port;
+	rte_be16_t dst_port;
+	uint8_t    ipv4_proto;
+};
+
+struct ovs_key_ct_tuple_ipv6 {
+	rte_be32_t ipv6_src[4];
+	rte_be32_t ipv6_dst[4];
+	rte_be16_t src_port;
+	rte_be16_t dst_port;
+	uint8_t    ipv6_proto;
+};
+
+/* Tunnel information used in flow key and metadata. */
+struct flow_tnl {
+	uint32_t ip_dst;
+	struct in6_addr ipv6_dst;
+	uint32_t ip_src;
+	struct in6_addr ipv6_src;
+	uint64_t tun_id;
+	uint16_t flags;
+	uint8_t ip_tos;
+	uint8_t ip_ttl;
+	uint16_t tp_src;
+	uint16_t tp_dst;
+	uint16_t gbp_id;
+	uint8_t  gbp_flags;
+	uint8_t erspan_ver;
+	uint32_t erspan_idx;
+	uint8_t erspan_dir;
+	uint8_t erspan_hwid;
+	uint8_t gtpu_flags;
+	uint8_t gtpu_msgtype;
+	uint8_t pad1[4];     /* Pad to 64 bits. */
+};
+
+enum dp_packet_source {
+	DPBUF_MALLOC,              /* Obtained via malloc(). */
+	DPBUF_STACK,               /* Un-movable stack space or static buffer. */
+	DPBUF_STUB,                /* Starts on stack, may expand into heap. */
+	DPBUF_DPDK,                /* buffer data is from DPDK allocated memory. */
+	DPBUF_AFXDP,               /* Buffer data from XDP frame. */
+};
+
+/* Datapath packet metadata */
+struct pkt_metadata {
+PADDED_MEMBERS_CACHELINE_MARKER(RTE_CACHE_LINE_SIZE, cacheline0,
+	/* Recirculation id carried with the recirculating packets. */
+	uint32_t recirc_id;         /* 0 for packets received from the wire. */
+	uint32_t dp_hash;           /* hash value computed by the recirculation action. */
+	uint32_t skb_priority;      /* Packet priority for QoS. */
+	uint32_t pkt_mark;          /* Packet mark. */
+	uint8_t  ct_state;          /* Connection state. */
+	bool ct_orig_tuple_ipv6;
+	uint16_t ct_zone;           /* Connection zone. */
+	uint32_t ct_mark;           /* Connection mark. */
+	uint32_t ct_label[4];       /* Connection label. */
+	uint32_t in_port;           /* Input port. */
+	uint32_t orig_in_port;      /* Originating in_port for tunneled packets */
+	void *conn;                 /* Cached conntrack connection. */
+	bool reply;                 /* True if reply direction. */
+	bool icmp_related;          /* True if ICMP related. */
+);
+
+PADDED_MEMBERS_CACHELINE_MARKER(RTE_CACHE_LINE_SIZE, cacheline1,
+	union {                     /* Populated only for non-zero 'ct_state'. */
+		struct ovs_key_ct_tuple_ipv4 ipv4;
+		struct ovs_key_ct_tuple_ipv6 ipv6;   /* Used only if */
+	} ct_orig_tuple;                             /* 'ct_orig_tuple_ipv6' is set */
+);
+
+/*
+ * Encapsulating tunnel parameters. Note that if 'ip_dst' == 0,
+ * the rest of the fields may be uninitialized.
+ */
+PADDED_MEMBERS_CACHELINE_MARKER(RTE_CACHE_LINE_SIZE, cacheline2,
+	struct flow_tnl tunnel;);
+};
+
+#define DP_PACKET_CONTEXT_SIZE 64
+
+/*
+ * Buffer for holding packet data.  A dp_packet is automatically reallocated
+ * as necessary if it grows too large for the available memory.
+ * By default the packet type is set to Ethernet (PT_ETH).
+ */
+struct dp_packet {
+	struct rte_mbuf mbuf;          /* DPDK mbuf */
+	enum dp_packet_source source;  /* Source of memory allocated as 'base'. */
+
+	/*
+	 * All the following elements of this struct are copied in a single call
+	 * of memcpy in dp_packet_clone_with_headroom.
+	 */
+	uint16_t l2_pad_size;          /* Detected l2 padding size. Padding is non-pullable. */
+	uint16_t l2_5_ofs;             /* MPLS label stack offset, or UINT16_MAX */
+	uint16_t l3_ofs;               /* Network-level header offset, or UINT16_MAX. */
+	uint16_t l4_ofs;               /* Transport-level header offset, or UINT16_MAX. */
+	uint32_t cutlen;               /* length in bytes to cut from the end. */
+	uint32_t packet_type;          /* Packet type as defined in OpenFlow */
+	union {
+		struct pkt_metadata md;
+		uint64_t data[DP_PACKET_CONTEXT_SIZE / 8];
+	};
+};
+
+#endif /* _NFP_FLOWER_OVS_COMPAT_H_ */
diff --git a/drivers/net/nfp/nfp_common.h b/drivers/net/nfp/nfp_common.h
index b28ebc9..ab2e5c2 100644
--- a/drivers/net/nfp/nfp_common.h
+++ b/drivers/net/nfp/nfp_common.h
@@ -448,6 +448,9 @@ int nfp_net_rss_hash_conf_get(struct rte_eth_dev *dev,
 #define NFP_APP_PRIV_TO_APP_NIC(app_priv)\
 	((struct nfp_app_nic *)app_priv)
 
+#define NFP_APP_PRIV_TO_APP_FLOWER(app_priv)\
+	((struct nfp_app_flower *)app_priv)
+
 #endif /* _NFP_COMMON_H_ */
 /*
  * Local variables:
-- 
1.8.3.1



* [PATCH v5 06/12] net/nfp: add flower PF related routines
  2022-08-05  6:32 [PATCH v5 00/12] preparation for the rte_flow offload of nfp PMD Chaoyong He
                   ` (4 preceding siblings ...)
  2022-08-05  6:32 ` [PATCH v5 05/12] net/nfp: add flower PF setup and mempool init logic Chaoyong He
@ 2022-08-05  6:32 ` Chaoyong He
  2022-08-05 12:55   ` Andrew Rybchenko
  2022-08-05  6:32 ` [PATCH v5 07/12] net/nfp: add flower ctrl VNIC related logics Chaoyong He
                   ` (5 subsequent siblings)
  11 siblings, 1 reply; 29+ messages in thread
From: Chaoyong He @ 2022-08-05  6:32 UTC (permalink / raw)
  To: dev; +Cc: niklas.soderlund, Chaoyong He

This commit adds the start/stop/close routines of the
flower PF vNIC.

Signed-off-by: Chaoyong He <chaoyong.he@corigine.com>
Reviewed-by: Niklas Söderlund <niklas.soderlund@corigine.com>
---
 drivers/net/nfp/flower/nfp_flower.c | 193 ++++++++++++++++++++++++++++++++++++
 1 file changed, 193 insertions(+)

diff --git a/drivers/net/nfp/flower/nfp_flower.c b/drivers/net/nfp/flower/nfp_flower.c
index c05d4ca..2498020 100644
--- a/drivers/net/nfp/flower/nfp_flower.c
+++ b/drivers/net/nfp/flower/nfp_flower.c
@@ -7,6 +7,7 @@
 #include <ethdev_driver.h>
 #include <rte_service_component.h>
 #include <rte_malloc.h>
+#include <rte_alarm.h>
 #include <ethdev_pci.h>
 #include <ethdev_driver.h>
 
@@ -37,11 +38,178 @@
 	return 0;
 }
 
+static int
+nfp_flower_pf_start(struct rte_eth_dev *dev)
+{
+	int ret;
+	uint32_t new_ctrl;
+	uint32_t update = 0;
+	struct nfp_net_hw *hw;
+
+	hw = NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+	/* Disabling queues just in case... */
+	nfp_net_disable_queues(dev);
+
+	/* Enabling the required queues in the device */
+	nfp_net_enable_queues(dev);
+
+	new_ctrl = nfp_check_offloads(dev);
+
+	/* Writing configuration parameters in the device */
+	nfp_net_params_setup(hw);
+
+	nfp_net_rss_config_default(dev);
+	update |= NFP_NET_CFG_UPDATE_RSS;
+
+	if (hw->cap & NFP_NET_CFG_CTRL_RSS2)
+		new_ctrl |= NFP_NET_CFG_CTRL_RSS2;
+	else
+		new_ctrl |= NFP_NET_CFG_CTRL_RSS;
+
+	/* Enable device */
+	new_ctrl |= NFP_NET_CFG_CTRL_ENABLE;
+
+	update |= NFP_NET_CFG_UPDATE_GEN | NFP_NET_CFG_UPDATE_RING;
+
+	if (hw->cap & NFP_NET_CFG_CTRL_RINGCFG)
+		new_ctrl |= NFP_NET_CFG_CTRL_RINGCFG;
+
+	nn_cfg_writel(hw, NFP_NET_CFG_CTRL, new_ctrl);
+
+	/* If the reconfig fails, leave the hw state unchanged */
+	ret = nfp_net_reconfig(hw, new_ctrl, update);
+	if (ret) {
+		PMD_INIT_LOG(ERR, "Failed to reconfig PF vnic");
+		return -EIO;
+	}
+
+	hw->ctrl = new_ctrl;
+
+	/* Setup the freelist ring */
+	ret = nfp_net_rx_freelist_setup(dev);
+	if (ret) {
+		PMD_INIT_LOG(ERR, "Error with flower PF vNIC freelist setup");
+		return -EIO;
+	}
+
+	return 0;
+}
+
+/* Stop device: disable rx and tx functions to allow for reconfiguring. */
+static int
+nfp_flower_pf_stop(struct rte_eth_dev *dev)
+{
+	uint16_t i;
+	struct nfp_net_hw *hw;
+	struct nfp_net_txq *this_tx_q;
+	struct nfp_net_rxq *this_rx_q;
+
+	hw = NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+	nfp_net_disable_queues(dev);
+
+	/* Clear queues */
+	for (i = 0; i < dev->data->nb_tx_queues; i++) {
+		this_tx_q = (struct nfp_net_txq *)dev->data->tx_queues[i];
+		nfp_net_reset_tx_queue(this_tx_q);
+	}
+
+	for (i = 0; i < dev->data->nb_rx_queues; i++) {
+		this_rx_q = (struct nfp_net_rxq *)dev->data->rx_queues[i];
+		nfp_net_reset_rx_queue(this_rx_q);
+	}
+
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY)
+		/* Configure the physical port down */
+		nfp_eth_set_configured(hw->cpp, hw->nfp_idx, 0);
+	else
+		nfp_eth_set_configured(dev->process_private, hw->nfp_idx, 0);
+
+	return 0;
+}
+
+/* Reset and stop device. The device can not be restarted. */
+static int
+nfp_flower_pf_close(struct rte_eth_dev *dev)
+{
+	uint16_t i;
+	struct nfp_net_hw *hw;
+	struct nfp_pf_dev *pf_dev;
+	struct nfp_net_txq *this_tx_q;
+	struct nfp_net_rxq *this_rx_q;
+	struct rte_pci_device *pci_dev;
+	struct nfp_app_flower *app_flower;
+
+	if (rte_eal_process_type() != RTE_PROC_PRIMARY)
+		return 0;
+
+	pf_dev = NFP_NET_DEV_PRIVATE_TO_PF(dev->data->dev_private);
+	hw = NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+	pci_dev = RTE_ETH_DEV_TO_PCI(dev);
+	app_flower = NFP_APP_PRIV_TO_APP_FLOWER(pf_dev->app_priv);
+
+	/*
+	 * We assume that the DPDK application is stopping all the
+	 * threads/queues before calling the device close function.
+	 */
+
+	nfp_net_disable_queues(dev);
+
+	/* Clear queues */
+	for (i = 0; i < dev->data->nb_tx_queues; i++) {
+		this_tx_q = (struct nfp_net_txq *)dev->data->tx_queues[i];
+		nfp_net_reset_tx_queue(this_tx_q);
+	}
+
+	for (i = 0; i < dev->data->nb_rx_queues; i++) {
+		this_rx_q = (struct nfp_net_rxq *)dev->data->rx_queues[i];
+		nfp_net_reset_rx_queue(this_rx_q);
+	}
+
+	/* Cancel possible impending LSC work here before releasing the port */
+	rte_eal_alarm_cancel(nfp_net_dev_interrupt_delayed_handler, (void *)dev);
+
+	nn_cfg_writeb(hw, NFP_NET_CFG_LSC, 0xff);
+
+	rte_eth_dev_release_port(dev);
+
+	/* Now it is safe to free all PF resources */
+	PMD_INIT_LOG(INFO, "Freeing PF resources");
+	nfp_cpp_area_free(pf_dev->ctrl_area);
+	nfp_cpp_area_free(pf_dev->hwqueues_area);
+	free(pf_dev->hwinfo);
+	free(pf_dev->sym_tbl);
+	nfp_cpp_free(pf_dev->cpp);
+	rte_free(app_flower);
+	rte_free(pf_dev);
+
+	rte_intr_disable(pci_dev->intr_handle);
+
+	/* unregister callback func from eal lib */
+	rte_intr_callback_unregister(pci_dev->intr_handle,
+			nfp_net_dev_interrupt_handler, (void *)dev);
+
+	return 0;
+}
+
+static int
+nfp_flower_pf_link_update(__rte_unused struct rte_eth_dev *dev,
+		__rte_unused int wait_to_complete)
+{
+	return 0;
+}
+
 static const struct eth_dev_ops nfp_flower_pf_dev_ops = {
 	.dev_configure          = nfp_flower_pf_configure,
 
 	/* Use the normal dev_infos_get functionality in the NFP PMD */
 	.dev_infos_get          = nfp_net_infos_get,
+
+	.dev_start              = nfp_flower_pf_start,
+	.dev_stop               = nfp_flower_pf_stop,
+	.dev_close              = nfp_flower_pf_close,
+	.link_update            = nfp_flower_pf_link_update,
 };
 
 static struct rte_service_spec flower_services[NFP_FLOWER_SERVICE_MAX] = {
@@ -375,6 +543,24 @@
 	return ret;
 }
 
+static int
+nfp_flower_start_pf_vnic(struct nfp_net_hw *hw)
+{
+	int ret;
+	uint16_t port_id;
+
+	port_id = hw->eth_dev->data->port_id;
+
+	/* Start the device */
+	ret = rte_eth_dev_start(port_id);
+	if (ret) {
+		PMD_INIT_LOG(ERR, "Could not start PF device %d", port_id);
+		return ret;
+	}
+
+	return 0;
+}
+
 int
 nfp_init_app_flower(struct nfp_pf_dev *pf_dev)
 {
@@ -432,6 +618,13 @@
 		goto pf_cpp_area_cleanup;
 	}
 
+	/* Start the PF vNIC */
+	ret = nfp_flower_start_pf_vnic(app_flower->pf_hw);
+	if (ret) {
+		PMD_INIT_LOG(ERR, "Could not start flower PF vNIC");
+		goto pf_vnic_cleanup;
+	}
+
 	/* Start up flower services */
 	if (nfp_flower_enable_services(app_flower)) {
 		ret = -ESRCH;
-- 
1.8.3.1



* [PATCH v5 07/12] net/nfp: add flower ctrl VNIC related logics
  2022-08-05  6:32 [PATCH v5 00/12] preparation for the rte_flow offload of nfp PMD Chaoyong He
                   ` (5 preceding siblings ...)
  2022-08-05  6:32 ` [PATCH v5 06/12] net/nfp: add flower PF related routines Chaoyong He
@ 2022-08-05  6:32 ` Chaoyong He
  2022-08-05 13:05   ` Andrew Rybchenko
  2022-08-05  6:32 ` [PATCH v5 08/12] net/nfp: move common rxtx function for flower use Chaoyong He
                   ` (4 subsequent siblings)
  11 siblings, 1 reply; 29+ messages in thread
From: Chaoyong He @ 2022-08-05  6:32 UTC (permalink / raw)
  To: dev; +Cc: niklas.soderlund, Chaoyong He

This commit adds the setup/start logic for the ctrl vNIC. This vNIC
is used by the PMD and flower firmware as a communication channel
between driver and firmware. In the case of OVS it is also used to
communicate flow statistics from hardware to the driver.

A rte_eth device is not exposed to DPDK for this vNIC as it is strictly
used internally by flower logic. Rx and Tx logic will be added later for
this vNIC.

Signed-off-by: Chaoyong He <chaoyong.he@corigine.com>
Reviewed-by: Niklas Söderlund <niklas.soderlund@corigine.com>
---
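Reader note (not part of the commit): the per-queue index arithmetic used in
nfp_flower_init_ctrl_vnic() can be sketched on its own. This is an
illustrative sketch under stated assumptions: struct demo_rx_map,
demo_rx_queue_map() and demo_ring_size_log2() are hypothetical names, and the
log2 helper assumes a power-of-two descriptor count such as the 64 used by
CTRL_VNIC_NB_DESC.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical holder for the two queue-controller indexes of an Rx queue. */
struct demo_rx_map {
	uint32_t fl_qcidx; /* freelist queue index */
	uint32_t rx_qcidx; /* Rx completion queue index */
};

/*
 * Same arithmetic as the patch: the freelist queue sits at qidx * stride
 * and the Rx queue is the last slot of that stride.
 */
static inline struct demo_rx_map
demo_rx_queue_map(uint32_t qidx, uint32_t stride_rx)
{
	struct demo_rx_map map;

	assert(stride_rx >= 1);
	map.fl_qcidx = qidx * stride_rx;
	map.rx_qcidx = map.fl_qcidx + (stride_rx - 1);
	return map;
}

/* Ring size as a log2 value, assuming nb_desc is a power of two. */
static inline uint8_t
demo_ring_size_log2(uint32_t nb_desc)
{
	uint8_t shift = 0;

	assert(nb_desc != 0 && (nb_desc & (nb_desc - 1)) == 0);
	while ((UINT32_C(1) << shift) < nb_desc)
		shift++;
	return shift;
}
```

With a stride of 2, queue 1 maps to freelist index 2 and Rx index 3; with a
stride of 1 both indexes coincide. A 64-descriptor ring is reported to the
device as size 6.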
 drivers/net/nfp/flower/nfp_flower.c | 385 +++++++++++++++++++++++++++++++++++-
 drivers/net/nfp/flower/nfp_flower.h |   6 +
 2 files changed, 388 insertions(+), 3 deletions(-)

diff --git a/drivers/net/nfp/flower/nfp_flower.c b/drivers/net/nfp/flower/nfp_flower.c
index 2498020..51df504 100644
--- a/drivers/net/nfp/flower/nfp_flower.c
+++ b/drivers/net/nfp/flower/nfp_flower.c
@@ -26,6 +26,10 @@
 #define MEMPOOL_CACHE_SIZE 512
 #define DEFAULT_FLBUF_SIZE 9216
 
+#define CTRL_VNIC_NB_DESC 64
+#define CTRL_VNIC_RX_FREE_THRESH 32
+#define CTRL_VNIC_TX_FREE_THRESH 32
+
 /*
  * Simple dev ops functions for the flower PF. Because a rte_device is exposed
  * to DPDK the flower logic also makes use of helper functions like
@@ -543,6 +547,302 @@
 	return ret;
 }
 
+static void
+nfp_flower_cleanup_ctrl_vnic(struct nfp_net_hw *hw)
+{
+	uint32_t i;
+	struct nfp_net_rxq *rxq;
+	struct nfp_net_txq *txq;
+	struct rte_eth_dev *eth_dev;
+
+	eth_dev = hw->eth_dev;
+
+	for (i = 0; i < hw->max_tx_queues; i++) {
+		txq = eth_dev->data->tx_queues[i];
+		if (txq) {
+			rte_free(txq->txbufs);
+			rte_eth_dma_zone_free(eth_dev, "ctrl_tx_ring", i);
+			rte_free(txq);
+		}
+	}
+
+	for (i = 0; i < hw->max_rx_queues; i++) {
+		rxq = eth_dev->data->rx_queues[i];
+		if (rxq) {
+			rte_free(rxq->rxbufs);
+			rte_eth_dma_zone_free(eth_dev, "ctrl_rx_ring", i);
+			rte_free(rxq);
+		}
+	}
+
+	rte_free(eth_dev->data->tx_queues);
+	rte_free(eth_dev->data->rx_queues);
+	rte_free(eth_dev->data);
+	rte_free(eth_dev);
+}
+
+static int
+nfp_flower_init_ctrl_vnic(struct nfp_net_hw *hw)
+{
+	uint32_t i;
+	int ret = 0;
+	uint16_t nb_desc;
+	unsigned int numa_node;
+	struct rte_mempool *mp;
+	uint16_t rx_free_thresh;
+	uint16_t tx_free_thresh;
+	struct nfp_net_rxq *rxq;
+	struct nfp_net_txq *txq;
+	struct nfp_pf_dev *pf_dev;
+	struct rte_eth_dev *eth_dev;
+	const struct rte_memzone *tz;
+	struct nfp_app_flower *app_flower;
+
+	/* Hardcoded values for now */
+	nb_desc = CTRL_VNIC_NB_DESC;
+	rx_free_thresh = CTRL_VNIC_RX_FREE_THRESH;
+	tx_free_thresh = CTRL_VNIC_TX_FREE_THRESH;
+	numa_node = rte_socket_id();
+
+	/* Set up some pointers here for ease of use */
+	pf_dev = hw->pf_dev;
+	app_flower = NFP_APP_PRIV_TO_APP_FLOWER(pf_dev->app_priv);
+
+	ret = nfp_flower_init_vnic_common(hw, "ctrl_vnic");
+	if (ret)
+		goto done;
+
+	/* Allocate memory for the eth_dev of the vNIC */
+	hw->eth_dev = rte_zmalloc("ctrl_vnic_eth_dev",
+		sizeof(struct rte_eth_dev), RTE_CACHE_LINE_SIZE);
+	if (hw->eth_dev == NULL) {
+		ret = -ENOMEM;
+		goto done;
+	}
+
+	/* Grab the pointer to the newly created rte_eth_dev here */
+	eth_dev = hw->eth_dev;
+
+	/* Also allocate memory for the data part of the eth_dev */
+	eth_dev->data = rte_zmalloc("ctrl_vnic_eth_dev_data",
+		sizeof(struct rte_eth_dev_data), RTE_CACHE_LINE_SIZE);
+	if (eth_dev->data == NULL) {
+		ret = -ENOMEM;
+		goto eth_dev_cleanup;
+	}
+
+	eth_dev->data->rx_queues = rte_zmalloc("ethdev->rx_queues",
+		sizeof(eth_dev->data->rx_queues[0]) * hw->max_rx_queues,
+		RTE_CACHE_LINE_SIZE);
+	if (eth_dev->data->rx_queues == NULL) {
+		PMD_INIT_LOG(ERR, "rte_zmalloc failed for ctrl vnic rx queues");
+		ret = -ENOMEM;
+		goto dev_data_cleanup;
+	}
+
+	eth_dev->data->tx_queues = rte_zmalloc("ethdev->tx_queues",
+		sizeof(eth_dev->data->tx_queues[0]) * hw->max_tx_queues,
+		RTE_CACHE_LINE_SIZE);
+	if (eth_dev->data->tx_queues == NULL) {
+		PMD_INIT_LOG(ERR, "rte_zmalloc failed for ctrl vnic tx queues");
+		ret = -ENOMEM;
+		goto rx_queue_cleanup;
+	}
+
+	eth_dev->device = &pf_dev->pci_dev->device;
+	eth_dev->data->nb_tx_queues = hw->max_tx_queues;
+	eth_dev->data->nb_rx_queues = hw->max_rx_queues;
+	eth_dev->data->dev_private = hw;
+
+	/* Create a mbuf pool for the vNIC */
+	app_flower->ctrl_pktmbuf_pool = rte_pktmbuf_pool_create("ctrl_mbuf_pool",
+		4 * nb_desc, 64, 0, 9216, numa_node);
+	if (app_flower->ctrl_pktmbuf_pool == NULL) {
+		PMD_INIT_LOG(ERR, "create mbuf pool for ctrl vnic failed");
+		ret = -ENOMEM;
+		goto tx_queue_cleanup;
+	}
+
+	mp = app_flower->ctrl_pktmbuf_pool;
+
+	/* Set up the Rx queues */
+	PMD_INIT_LOG(INFO, "Configuring flower ctrl vNIC Rx queue");
+	for (i = 0; i < hw->max_rx_queues; i++) {
+		/* Hardcoded number of desc to 64 */
+		rxq = rte_zmalloc_socket("ethdev RX queue",
+			sizeof(struct nfp_net_rxq), RTE_CACHE_LINE_SIZE,
+			numa_node);
+		if (rxq == NULL) {
+			PMD_DRV_LOG(ERR, "Error allocating rxq");
+			ret = -ENOMEM;
+			goto rx_queue_setup_cleanup;
+		}
+
+		eth_dev->data->rx_queues[i] = rxq;
+
+		/* Hw queues mapping based on firmware configuration */
+		rxq->qidx = i;
+		rxq->fl_qcidx = i * hw->stride_rx;
+		rxq->rx_qcidx = rxq->fl_qcidx + (hw->stride_rx - 1);
+		rxq->qcp_fl = hw->rx_bar + NFP_QCP_QUEUE_OFF(rxq->fl_qcidx);
+		rxq->qcp_rx = hw->rx_bar + NFP_QCP_QUEUE_OFF(rxq->rx_qcidx);
+
+		/*
+		 * Tracking mbuf size for detecting a potential mbuf overflow due to
+		 * RX offset
+		 */
+		rxq->mem_pool = mp;
+		rxq->mbuf_size = rxq->mem_pool->elt_size;
+		rxq->mbuf_size -= (sizeof(struct rte_mbuf) + RTE_PKTMBUF_HEADROOM);
+		hw->flbufsz = rxq->mbuf_size;
+
+		rxq->rx_count = nb_desc;
+		rxq->rx_free_thresh = rx_free_thresh;
+		rxq->drop_en = 1;
+
+		/*
+		 * Allocate RX ring hardware descriptors. A memzone large enough to
+		 * handle the maximum ring size is allocated in order to allow for
+		 * resizing in later calls to the queue setup function.
+		 */
+		tz = rte_eth_dma_zone_reserve(eth_dev, "ctrl_rx_ring", i,
+			sizeof(struct nfp_net_rx_desc) * NFP_NET_MAX_RX_DESC,
+			NFP_MEMZONE_ALIGN, numa_node);
+		if (tz == NULL) {
+			PMD_DRV_LOG(ERR, "Error allocating rx dma");
+			rte_free(rxq);
+			ret = -ENOMEM;
+			goto rx_queue_setup_cleanup;
+		}
+
+		/* Saving physical and virtual addresses for the RX ring */
+		rxq->dma = (uint64_t)tz->iova;
+		rxq->rxds = (struct nfp_net_rx_desc *)tz->addr;
+
+		/* mbuf pointers array for referencing mbufs linked to RX descriptors */
+		rxq->rxbufs = rte_zmalloc_socket("rxq->rxbufs",
+			sizeof(*rxq->rxbufs) * nb_desc, RTE_CACHE_LINE_SIZE,
+			numa_node);
+		if (rxq->rxbufs == NULL) {
+			rte_eth_dma_zone_free(eth_dev, "ctrl_rx_ring", i);
+			rte_free(rxq);
+			ret = -ENOMEM;
+			goto rx_queue_setup_cleanup;
+		}
+
+		nfp_net_reset_rx_queue(rxq);
+
+		rxq->hw = hw;
+
+		/*
+		 * Telling the HW about the physical address of the RX ring and number
+		 * of descriptors in log2 format
+		 */
+		nn_cfg_writeq(hw, NFP_NET_CFG_RXR_ADDR(i), rxq->dma);
+		nn_cfg_writeb(hw, NFP_NET_CFG_RXR_SZ(i), rte_log2_u32(nb_desc));
+	}
+
+	/* Now the Tx queues */
+	PMD_INIT_LOG(INFO, "Configuring flower ctrl vNIC Tx queue");
+	for (i = 0; i < hw->max_tx_queues; i++) {
+		/* Hardcoded number of desc to 64 */
+		/* Allocating tx queue data structure */
+		txq = rte_zmalloc_socket("ethdev TX queue",
+			sizeof(struct nfp_net_txq), RTE_CACHE_LINE_SIZE,
+			numa_node);
+		if (txq == NULL) {
+			PMD_DRV_LOG(ERR, "Error allocating txq");
+			ret = -ENOMEM;
+			goto tx_queue_setup_cleanup;
+		}
+
+		eth_dev->data->tx_queues[i] = txq;
+
+		/*
+		 * Allocate TX ring hardware descriptors. A memzone large enough to
+		 * handle the maximum ring size is allocated in order to allow for
+		 * resizing in later calls to the queue setup function.
+		 */
+		tz = rte_eth_dma_zone_reserve(eth_dev, "ctrl_tx_ring", i,
+			sizeof(struct nfp_net_nfd3_tx_desc) * NFP_NET_MAX_TX_DESC,
+			NFP_MEMZONE_ALIGN, numa_node);
+		if (tz == NULL) {
+			PMD_DRV_LOG(ERR, "Error allocating tx dma");
+			rte_free(txq);
+			ret = -ENOMEM;
+			goto tx_queue_setup_cleanup;
+		}
+
+		txq->tx_count = nb_desc;
+		txq->tx_free_thresh = tx_free_thresh;
+		txq->tx_pthresh = DEFAULT_TX_PTHRESH;
+		txq->tx_hthresh = DEFAULT_TX_HTHRESH;
+		txq->tx_wthresh = DEFAULT_TX_WTHRESH;
+
+		/* queue mapping based on firmware configuration */
+		txq->qidx = i;
+		txq->tx_qcidx = i * hw->stride_tx;
+		txq->qcp_q = hw->tx_bar + NFP_QCP_QUEUE_OFF(txq->tx_qcidx);
+
+		/* Saving physical and virtual addresses for the TX ring */
+		txq->dma = (uint64_t)tz->iova;
+		txq->txds = (struct nfp_net_nfd3_tx_desc *)tz->addr;
+
+		/* mbuf pointers array for referencing mbufs linked to TX descriptors */
+		txq->txbufs = rte_zmalloc_socket("txq->txbufs",
+			sizeof(*txq->txbufs) * nb_desc, RTE_CACHE_LINE_SIZE,
+			numa_node);
+		if (txq->txbufs == NULL) {
+			rte_eth_dma_zone_free(eth_dev, "ctrl_tx_ring", i);
+			rte_free(txq);
+			ret = -ENOMEM;
+			goto tx_queue_setup_cleanup;
+		}
+
+		nfp_net_reset_tx_queue(txq);
+
+		txq->hw = hw;
+
+		/*
+		 * Telling the HW about the physical address of the TX ring and number
+		 * of descriptors in log2 format
+		 */
+		nn_cfg_writeq(hw, NFP_NET_CFG_TXR_ADDR(i), txq->dma);
+		nn_cfg_writeb(hw, NFP_NET_CFG_TXR_SZ(i), rte_log2_u32(nb_desc));
+	}
+
+	return 0;
+
+tx_queue_setup_cleanup:
+	for (i = 0; i < hw->max_tx_queues; i++) {
+		txq = eth_dev->data->tx_queues[i];
+		if (txq) {
+			rte_free(txq->txbufs);
+			rte_eth_dma_zone_free(eth_dev, "ctrl_tx_ring", i);
+			rte_free(txq);
+		}
+	}
+rx_queue_setup_cleanup:
+	for (i = 0; i < hw->max_rx_queues; i++) {
+		rxq = eth_dev->data->rx_queues[i];
+		if (rxq) {
+			rte_free(rxq->rxbufs);
+			rte_eth_dma_zone_free(eth_dev, "ctrl_rx_ring", i);
+			rte_free(rxq);
+		}
+	}
+tx_queue_cleanup:
+	rte_free(eth_dev->data->tx_queues);
+rx_queue_cleanup:
+	rte_free(eth_dev->data->rx_queues);
+dev_data_cleanup:
+	rte_free(eth_dev->data);
+eth_dev_cleanup:
+	rte_free(eth_dev);
+done:
+	return ret;
+}
+
 static int
 nfp_flower_start_pf_vnic(struct nfp_net_hw *hw)
 {
@@ -561,12 +861,57 @@
 	return 0;
 }
 
+static int
+nfp_flower_start_ctrl_vnic(struct nfp_net_hw *hw)
+{
+	int ret;
+	uint32_t update;
+	uint32_t new_ctrl;
+	struct rte_eth_dev *dev;
+
+	dev = hw->eth_dev;
+
+	/* Disabling queues just in case... */
+	nfp_net_disable_queues(dev);
+
+	/* Enabling the required queues in the device */
+	nfp_net_enable_queues(dev);
+
+	/* Writing configuration parameters in the device */
+	nfp_net_params_setup(hw);
+
+	new_ctrl = NFP_NET_CFG_CTRL_ENABLE;
+	update = NFP_NET_CFG_UPDATE_GEN | NFP_NET_CFG_UPDATE_RING |
+		 NFP_NET_CFG_UPDATE_MSIX;
+
+	rte_wmb();
+
+	/* If reconfig fails, leave the cached hw state unchanged */
+	ret = nfp_net_reconfig(hw, new_ctrl, update);
+	if (ret) {
+		PMD_INIT_LOG(ERR, "Failed to reconfig ctrl vnic");
+		return -EIO;
+	}
+
+	hw->ctrl = new_ctrl;
+
+	/* Setup the freelist ring */
+	ret = nfp_net_rx_freelist_setup(dev);
+	if (ret) {
+		PMD_INIT_LOG(ERR, "Error with flower ctrl vNIC freelist setup");
+		return -EIO;
+	}
+
+	return 0;
+}
+
 int
 nfp_init_app_flower(struct nfp_pf_dev *pf_dev)
 {
 	int ret;
 	unsigned int numa_node;
 	struct nfp_net_hw *pf_hw;
+	struct nfp_net_hw *ctrl_hw;
 	struct nfp_app_flower *app_flower;
 
 	numa_node = rte_socket_id();
@@ -612,29 +957,63 @@
 	pf_hw->pf_dev = pf_dev;
 	pf_hw->cpp = pf_dev->cpp;
 
+	/* The ctrl vNIC struct comes directly after the PF one */
+	app_flower->ctrl_hw = pf_hw + 1;
+	ctrl_hw = app_flower->ctrl_hw;
+
+	/* Map the ctrl vNIC ctrl bar */
+	ctrl_hw->ctrl_bar = nfp_rtsym_map(pf_dev->sym_tbl, "_pf0_net_ctrl_bar",
+		32768, &ctrl_hw->ctrl_area);
+	if (ctrl_hw->ctrl_bar == NULL) {
+		PMD_INIT_LOG(ERR, "Could not map the ctrl vNIC ctrl bar");
+		ret = -ENODEV;
+		goto pf_cpp_area_cleanup;
+	}
+
+	/* Now populate the ctrl vNIC */
+	ctrl_hw->pf_dev = pf_dev;
+	ctrl_hw->cpp = pf_dev->cpp;
+
 	ret = nfp_flower_init_pf_vnic(app_flower->pf_hw);
 	if (ret) {
 		PMD_INIT_LOG(ERR, "Could not initialize flower PF vNIC");
-		goto pf_cpp_area_cleanup;
+		goto ctrl_cpp_area_cleanup;
+	}
+
+	ret = nfp_flower_init_ctrl_vnic(app_flower->ctrl_hw);
+	if (ret) {
+		PMD_INIT_LOG(ERR, "Could not initialize flower ctrl vNIC");
+		goto pf_vnic_cleanup;
 	}
 
 	/* Start the PF vNIC */
 	ret = nfp_flower_start_pf_vnic(app_flower->pf_hw);
 	if (ret) {
 		PMD_INIT_LOG(ERR, "Could not start flower PF vNIC");
-		goto pf_vnic_cleanup;
+		goto ctrl_vnic_cleanup;
+	}
+
+	/* Start the ctrl vNIC */
+	ret = nfp_flower_start_ctrl_vnic(app_flower->ctrl_hw);
+	if (ret) {
+		PMD_INIT_LOG(ERR, "Could not start flower ctrl vNIC");
+		goto ctrl_vnic_cleanup;
 	}
 
 	/* Start up flower services */
 	if (nfp_flower_enable_services(app_flower)) {
 		ret = -ESRCH;
-		goto pf_vnic_cleanup;
+		goto ctrl_vnic_cleanup;
 	}
 
 	return 0;
 
+ctrl_vnic_cleanup:
+	nfp_flower_cleanup_ctrl_vnic(app_flower->ctrl_hw);
 pf_vnic_cleanup:
 	nfp_flower_cleanup_pf_vnic(app_flower->pf_hw);
+ctrl_cpp_area_cleanup:
+	nfp_cpp_area_free(ctrl_hw->ctrl_area);
 pf_cpp_area_cleanup:
 	nfp_cpp_area_free(pf_dev->ctrl_area);
 eth_tbl_cleanup:
diff --git a/drivers/net/nfp/flower/nfp_flower.h b/drivers/net/nfp/flower/nfp_flower.h
index f6fd4eb..f11ef6d 100644
--- a/drivers/net/nfp/flower/nfp_flower.h
+++ b/drivers/net/nfp/flower/nfp_flower.h
@@ -21,6 +21,12 @@ struct nfp_app_flower {
 	/* Pointer to the PF vNIC */
 	struct nfp_net_hw *pf_hw;
 
+	/* Pointer to a mempool for the ctrl vNIC */
+	struct rte_mempool *ctrl_pktmbuf_pool;
+
+	/* Pointer to the ctrl vNIC */
+	struct nfp_net_hw *ctrl_hw;
+
 	/* the eth table as reported by firmware */
 	struct nfp_eth_table *nfp_eth_table;
 };
-- 
1.8.3.1



* [PATCH v5 08/12] net/nfp: move common rxtx function for flower use
  2022-08-05  6:32 [PATCH v5 00/12] preparation for the rte_flow offload of nfp PMD Chaoyong He
                   ` (6 preceding siblings ...)
  2022-08-05  6:32 ` [PATCH v5 07/12] net/nfp: add flower ctrl VNIC related logics Chaoyong He
@ 2022-08-05  6:32 ` Chaoyong He
  2022-08-05  6:32 ` [PATCH v5 09/12] net/nfp: add flower ctrl VNIC rxtx logic Chaoyong He
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 29+ messages in thread
From: Chaoyong He @ 2022-08-05  6:32 UTC (permalink / raw)
  To: dev; +Cc: niklas.soderlund, Chaoyong He, Heinrich Kuhn

This commit moves some common Rx and Tx helpers to the rxtx header file
and exports nfp_net_tx_free_bufs() so that they can be reused by the
flower Rx and Tx logic.

Signed-off-by: Chaoyong He <chaoyong.he@corigine.com>
Signed-off-by: Heinrich Kuhn <heinrich.kuhn@corigine.com>
Reviewed-by: Niklas Söderlund <niklas.soderlund@corigine.com>
---
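Reader note (not part of the commit): the free-descriptor arithmetic of the
helpers moved into the header can be tried out standalone. The sketch below
mirrors nfp_net_nfd3_free_tx_desc() and nfp_net_nfd3_txq_full() on a
hypothetical struct demo_txq that carries only the four fields those helpers
read; the demo_* names are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the nfp_net_txq fields used by the helpers. */
struct demo_txq {
	uint32_t wr_p;           /* host copy of the write pointer */
	uint32_t rd_p;           /* host copy of the read pointer */
	uint32_t tx_count;       /* ring size in descriptors */
	uint32_t tx_free_thresh; /* reclaim threshold */
};

/* Free descriptors, always holding 8 back to avoid wrap ambiguity. */
static inline uint32_t
demo_free_tx_desc(const struct demo_txq *txq)
{
	/* Sketch-only check: the ring must be larger than the reserve. */
	assert(txq->tx_count > 8);

	if (txq->wr_p >= txq->rd_p)
		return txq->tx_count - (txq->wr_p - txq->rd_p) - 8;

	/* Write pointer has wrapped behind the read pointer. */
	return txq->rd_p - txq->wr_p - 8;
}

/* Non-zero when the free count drops below the reclaim threshold. */
static inline int
demo_txq_full(const struct demo_txq *txq)
{
	return demo_free_tx_desc(txq) < txq->tx_free_thresh;
}
```

For a 64-entry ring with wr_p = 10 and rd_p = 4, six descriptors are in
flight and 50 are reported free. The arithmetic is unsigned, so the sketch,
like the helpers it mirrors, relies on the caller never filling the ring into
the 8-descriptor reserve.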
 drivers/net/nfp/nfp_rxtx.c | 32 +-------------------------------
 drivers/net/nfp/nfp_rxtx.h | 31 +++++++++++++++++++++++++++++++
 2 files changed, 32 insertions(+), 31 deletions(-)

diff --git a/drivers/net/nfp/nfp_rxtx.c b/drivers/net/nfp/nfp_rxtx.c
index 8429b44..8d63a7b 100644
--- a/drivers/net/nfp/nfp_rxtx.c
+++ b/drivers/net/nfp/nfp_rxtx.c
@@ -116,12 +116,6 @@
 	return count;
 }
 
-static inline void
-nfp_net_mbuf_alloc_failed(struct nfp_net_rxq *rxq)
-{
-	rte_eth_devices[rxq->port_id].data->rx_mbuf_alloc_failed++;
-}
-
 /*
  * nfp_net_set_hash - Set mbuf hash data
  *
@@ -583,7 +577,7 @@
  * @txq: TX queue to work with
  * Returns number of descriptors freed
  */
-static int
+int
 nfp_net_tx_free_bufs(struct nfp_net_txq *txq)
 {
 	uint32_t qcp_rd_p;
@@ -774,30 +768,6 @@
 	return 0;
 }
 
-/* Leaving always free descriptors for avoiding wrapping confusion */
-static inline
-uint32_t nfp_net_nfd3_free_tx_desc(struct nfp_net_txq *txq)
-{
-	if (txq->wr_p >= txq->rd_p)
-		return txq->tx_count - (txq->wr_p - txq->rd_p) - 8;
-	else
-		return txq->rd_p - txq->wr_p - 8;
-}
-
-/*
- * nfp_net_txq_full - Check if the TX queue free descriptors
- * is below tx_free_threshold
- *
- * @txq: TX queue to check
- *
- * This function uses the host copy* of read/write pointers
- */
-static inline
-uint32_t nfp_net_nfd3_txq_full(struct nfp_net_txq *txq)
-{
-	return (nfp_net_nfd3_free_tx_desc(txq) < txq->tx_free_thresh);
-}
-
 /* nfp_net_tx_tso - Set TX descriptor for TSO */
 static inline void
 nfp_net_nfd3_tx_tso(struct nfp_net_txq *txq, struct nfp_net_nfd3_tx_desc *txd,
diff --git a/drivers/net/nfp/nfp_rxtx.h b/drivers/net/nfp/nfp_rxtx.h
index 5c005d7..a30171f 100644
--- a/drivers/net/nfp/nfp_rxtx.h
+++ b/drivers/net/nfp/nfp_rxtx.h
@@ -330,6 +330,36 @@ struct nfp_net_rxq {
 	int rx_qcidx;
 } __rte_aligned(64);
 
+static inline void
+nfp_net_mbuf_alloc_failed(struct nfp_net_rxq *rxq)
+{
+	rte_eth_devices[rxq->port_id].data->rx_mbuf_alloc_failed++;
+}
+
+/* Leaving always free descriptors for avoiding wrapping confusion */
+static inline uint32_t
+nfp_net_nfd3_free_tx_desc(struct nfp_net_txq *txq)
+{
+	if (txq->wr_p >= txq->rd_p)
+		return txq->tx_count - (txq->wr_p - txq->rd_p) - 8;
+	else
+		return txq->rd_p - txq->wr_p - 8;
+}
+
+/*
+ * nfp_net_nfd3_txq_full - Check if the TX queue free descriptors
+ * is below tx_free_threshold
+ *
+ * @txq: TX queue to check
+ *
+ * This function uses the host copy* of read/write pointers
+ */
+static inline uint32_t
+nfp_net_nfd3_txq_full(struct nfp_net_txq *txq)
+{
+	return (nfp_net_nfd3_free_tx_desc(txq) < txq->tx_free_thresh);
+}
+
 int nfp_net_rx_freelist_setup(struct rte_eth_dev *dev);
 uint32_t nfp_net_rx_queue_count(void *rx_queue);
 uint16_t nfp_net_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
@@ -355,6 +385,7 @@ int nfp_net_nfdk_tx_queue_setup(struct rte_eth_dev *dev,
 uint16_t nfp_net_nfdk_xmit_pkts(void *tx_queue,
 		struct rte_mbuf **tx_pkts,
 		uint16_t nb_pkts);
+int nfp_net_tx_free_bufs(struct nfp_net_txq *txq);
 
 #endif /* _NFP_RXTX_H_ */
 /*
-- 
1.8.3.1



* [PATCH v5 09/12] net/nfp: add flower ctrl VNIC rxtx logic
  2022-08-05  6:32 [PATCH v5 00/12] preparation for the rte_flow offload of nfp PMD Chaoyong He
                   ` (7 preceding siblings ...)
  2022-08-05  6:32 ` [PATCH v5 08/12] net/nfp: move common rxtx function for flower use Chaoyong He
@ 2022-08-05  6:32 ` Chaoyong He
  2022-08-05  6:32 ` [PATCH v5 10/12] net/nfp: add flower representor framework Chaoyong He
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 29+ messages in thread
From: Chaoyong He @ 2022-08-05  6:32 UTC (permalink / raw)
  To: dev; +Cc: niklas.soderlund, Chaoyong He, Heinrich Kuhn

Add Rx and Tx functions for the control vNIC. The logic is mostly
identical to the normal Rx and Tx functionality of the NFP PMD.

This commit also makes use of the ctrl vNIC service logic to
service the ctrl vNIC Rx path.

Signed-off-by: Chaoyong He <chaoyong.he@corigine.com>
Signed-off-by: Heinrich Kuhn <heinrich.kuhn@corigine.com>
Reviewed-by: Niklas Söderlund <niklas.soderlund@corigine.com>
---
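Reader note (not part of the commit): both the Rx refill and the Tx path in
this patch store the buffer address as an 8-bit high part plus a 32-bit low
part, and advance the ring index with an explicit wrap. A minimal sketch of
those two steps follows; struct demo_dma_parts and the demo_* functions are
hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical holder for the two address fields of a descriptor. */
struct demo_dma_parts {
	uint8_t hi;  /* bits 39..32 of the DMA address */
	uint32_t lo; /* bits 31..0 of the DMA address */
};

/* Split a DMA address the way the descriptors store it (40 bits kept). */
static inline struct demo_dma_parts
demo_dma_split(uint64_t dma_addr)
{
	struct demo_dma_parts p;

	p.hi = (uint8_t)((dma_addr >> 32) & 0xff);
	p.lo = (uint32_t)(dma_addr & 0xffffffff);
	return p;
}

/* Rebuild the 40-bit address from its two parts. */
static inline uint64_t
demo_dma_join(struct demo_dma_parts p)
{
	return ((uint64_t)p.hi << 32) | p.lo;
}

/* Advance a ring index by one slot, wrapping to 0 at the ring size. */
static inline uint32_t
demo_ring_advance(uint32_t idx, uint32_t ring_size)
{
	assert(idx < ring_size);
	idx++;
	if (idx == ring_size)
		idx = 0;
	return idx;
}
```

Any address bits above bit 39 are dropped by the split, so a round trip
through demo_dma_split() and demo_dma_join() only returns the original value
for addresses that fit in 40 bits.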
 drivers/net/nfp/flower/nfp_flower.c      |  15 ++
 drivers/net/nfp/flower/nfp_flower.h      |  15 ++
 drivers/net/nfp/flower/nfp_flower_ctrl.c | 252 +++++++++++++++++++++++++++++++
 drivers/net/nfp/flower/nfp_flower_ctrl.h |  13 ++
 drivers/net/nfp/meson.build              |   1 +
 5 files changed, 296 insertions(+)
 create mode 100644 drivers/net/nfp/flower/nfp_flower_ctrl.c
 create mode 100644 drivers/net/nfp/flower/nfp_flower_ctrl.h

diff --git a/drivers/net/nfp/flower/nfp_flower.c b/drivers/net/nfp/flower/nfp_flower.c
index 51df504..5e9c4ef 100644
--- a/drivers/net/nfp/flower/nfp_flower.c
+++ b/drivers/net/nfp/flower/nfp_flower.c
@@ -21,6 +21,7 @@
 #include "../nfpcore/nfp_nsp.h"
 #include "nfp_flower.h"
 #include "nfp_flower_ovs_compat.h"
+#include "nfp_flower_ctrl.h"
 
 #define MAX_PKT_BURST 32
 #define MEMPOOL_CACHE_SIZE 512
@@ -216,7 +217,21 @@
 	.link_update            = nfp_flower_pf_link_update,
 };
 
+static int
+nfp_flower_ctrl_vnic_service(void *arg)
+{
+	struct nfp_app_flower *app_flower = arg;
+
+	nfp_flower_ctrl_vnic_poll(app_flower);
+
+	return 0;
+}
+
 static struct rte_service_spec flower_services[NFP_FLOWER_SERVICE_MAX] = {
+	[NFP_FLOWER_SERVICE_CTRL] = {
+		.name         = "flower_ctrl_vnic_service",
+		.callback     = nfp_flower_ctrl_vnic_service,
+	},
 };
 
 static int
diff --git a/drivers/net/nfp/flower/nfp_flower.h b/drivers/net/nfp/flower/nfp_flower.h
index f11ef6d..bdc64e3 100644
--- a/drivers/net/nfp/flower/nfp_flower.h
+++ b/drivers/net/nfp/flower/nfp_flower.h
@@ -7,9 +7,18 @@
 #define _NFP_FLOWER_H_
 
 enum nfp_flower_service {
+	NFP_FLOWER_SERVICE_CTRL,
 	NFP_FLOWER_SERVICE_MAX
 };
 
+/*
+ * Flower fallback and ctrl path always adds and removes
+ * 8 bytes of prepended data. Tx descriptors must point
+ * to the correct packet data offset after metadata has
+ * been added
+ */
+#define FLOWER_PKT_DATA_OFFSET 8
+
 /* The flower application's private structure */
 struct nfp_app_flower {
 	/* List of rte_service ID's for the flower app */
@@ -29,6 +38,12 @@ struct nfp_app_flower {
 
 	/* the eth table as reported by firmware */
 	struct nfp_eth_table *nfp_eth_table;
+
+	/* Ctrl vNIC Rx counter */
+	uint64_t ctrl_vnic_rx_count;
+
+	/* Ctrl vNIC Tx counter */
+	uint64_t ctrl_vnic_tx_count;
 };
 
 int nfp_init_app_flower(struct nfp_pf_dev *pf_dev);
diff --git a/drivers/net/nfp/flower/nfp_flower_ctrl.c b/drivers/net/nfp/flower/nfp_flower_ctrl.c
new file mode 100644
index 0000000..e73054e
--- /dev/null
+++ b/drivers/net/nfp/flower/nfp_flower_ctrl.c
@@ -0,0 +1,252 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright (c) 2022 Corigine, Inc.
+ * All rights reserved.
+ */
+
+#include <rte_common.h>
+#include <ethdev_pci.h>
+
+#include "../nfp_common.h"
+#include "../nfp_logs.h"
+#include "../nfp_ctrl.h"
+#include "../nfp_rxtx.h"
+#include "nfp_flower.h"
+#include "nfp_flower_ctrl.h"
+
+#define MAX_PKT_BURST 32
+
+static uint16_t
+nfp_flower_ctrl_vnic_recv(void *rx_queue,
+		struct rte_mbuf **rx_pkts,
+		uint16_t nb_pkts)
+{
+	uint64_t dma_addr;
+	uint16_t avail = 0;
+	struct rte_mbuf *mb;
+	uint16_t nb_hold = 0;
+	struct nfp_net_hw *hw;
+	struct nfp_net_rxq *rxq;
+	struct rte_mbuf *new_mb;
+	struct nfp_net_rx_buff *rxb;
+	struct nfp_net_rx_desc *rxds;
+
+	rxq = rx_queue;
+	if (unlikely(rxq == NULL)) {
+		/*
+		 * DPDK just checks the queue is lower than max queues
+		 * enabled. But the queue needs to be configured
+		 */
+		PMD_RX_LOG(ERR, "RX Bad queue");
+		return 0;
+	}
+
+	hw = rxq->hw;
+	while (avail < nb_pkts) {
+		rxb = &rxq->rxbufs[rxq->rd_p];
+		if (unlikely(rxb == NULL)) {
+			PMD_RX_LOG(ERR, "rxb does not exist!");
+			break;
+		}
+
+		rxds = &rxq->rxds[rxq->rd_p];
+		if ((rxds->rxd.meta_len_dd & PCIE_DESC_RX_DD) == 0)
+			break;
+
+		/*
+		 * Memory barrier to ensure that we won't do other
+		 * reads before the DD bit.
+		 */
+		rte_rmb();
+
+		/*
+		 * We got a packet. Let's alloc a new mbuf for refilling the
+		 * free descriptor ring as soon as possible
+		 */
+		new_mb = rte_pktmbuf_alloc(rxq->mem_pool);
+		if (unlikely(new_mb == NULL)) {
+			PMD_RX_LOG(ERR,
+			"RX mbuf alloc failed port_id=%u queue_id=%u",
+				rxq->port_id, (unsigned int)rxq->qidx);
+			nfp_net_mbuf_alloc_failed(rxq);
+			break;
+		}
+
+		nb_hold++;
+
+		/*
+		 * Grab the mbuf and refill the descriptor with the
+		 * previously allocated mbuf
+		 */
+		mb = rxb->mbuf;
+		rxb->mbuf = new_mb;
+
+		/* Size of this segment */
+		mb->data_len = rxds->rxd.data_len - NFP_DESC_META_LEN(rxds);
+		/* Size of the whole packet. We just support 1 segment */
+		mb->pkt_len = mb->data_len;
+
+		if (unlikely((mb->data_len + hw->rx_offset) > rxq->mbuf_size)) {
+			rte_pktmbuf_free(mb);
+			/*
+			 * This should not happen and the user has the
+			 * responsibility of avoiding it. But we have
+			 * to give some info about the error
+			 */
+			RTE_LOG_DP(ERR, PMD,
+				"mbuf overflow likely due to the RX offset.\n"
+				"\t\tYour mbuf size should have extra space for"
+				" RX offset=%u bytes.\n"
+				"\t\tCurrently you just have %u bytes available"
+				" but the received packet is %u bytes long",
+				hw->rx_offset,
+				rxq->mbuf_size - hw->rx_offset,
+				mb->data_len);
+			break;
+		}
+
+		/* Filling the received mbuf with packet info */
+		if (hw->rx_offset)
+			mb->data_off = RTE_PKTMBUF_HEADROOM + hw->rx_offset;
+		else
+			mb->data_off = RTE_PKTMBUF_HEADROOM + NFP_DESC_META_LEN(rxds);
+
+		/* No scatter mode supported */
+		mb->nb_segs = 1;
+		mb->next = NULL;
+		mb->port = rxq->port_id;
+
+		rx_pkts[avail++] = mb;
+
+		/* Now resetting and updating the descriptor */
+		rxds->vals[0] = 0;
+		rxds->vals[1] = 0;
+		dma_addr = rte_cpu_to_le_64(RTE_MBUF_DMA_ADDR_DEFAULT(new_mb));
+		rxds->fld.dd = 0;
+		rxds->fld.dma_addr_hi = (dma_addr >> 32) & 0xff;
+		rxds->fld.dma_addr_lo = dma_addr & 0xffffffff;
+
+		rxq->rd_p++;
+		if (unlikely(rxq->rd_p == rxq->rx_count)) /* wrapping?*/
+			rxq->rd_p = 0;
+	}
+
+	if (nb_hold == 0)
+		return 0;
+
+	nb_hold += rxq->nb_rx_hold;
+
+	/*
+	 * FL descriptors need to be written before incrementing the
+	 * FL queue WR pointer
+	 */
+	rte_wmb();
+	if (nb_hold >= rxq->rx_free_thresh) {
+		PMD_RX_LOG(DEBUG, "port=%hu queue=%d nb_hold=%hu avail=%hu",
+			rxq->port_id, rxq->qidx, nb_hold, avail);
+		nfp_qcp_ptr_add(rxq->qcp_fl, NFP_QCP_WRITE_PTR, nb_hold);
+		nb_hold = 0;
+	}
+
+	rxq->nb_rx_hold = nb_hold;
+
+	return avail;
+}
+
+uint16_t
+nfp_flower_ctrl_vnic_xmit(struct nfp_app_flower *app_flower,
+		struct rte_mbuf *mbuf)
+{
+	uint16_t cnt = 0;
+	uint64_t dma_addr;
+	uint32_t free_descs;
+	struct rte_mbuf **lmbuf;
+	struct nfp_net_txq *txq;
+	struct nfp_net_hw *ctrl_hw;
+	struct rte_eth_dev *ctrl_dev;
+	struct nfp_net_nfd3_tx_desc *txds;
+
+	ctrl_hw = app_flower->ctrl_hw;
+	ctrl_dev = ctrl_hw->eth_dev;
+
+	/* Flower ctrl vNIC only has a single tx queue */
+	txq = ctrl_dev->data->tx_queues[0];
+	if (unlikely(txq == NULL)) {
+		/*
+		 * DPDK just checks the queue is lower than max queues
+		 * enabled. But the queue needs to be configured
+		 */
+		PMD_TX_LOG(ERR, "ctrl dev TX Bad queue");
+		return 0;
+	}
+
+	txds = &txq->txds[txq->wr_p];
+	txds->vals[0] = 0;
+	txds->vals[1] = 0;
+	txds->vals[2] = 0;
+	txds->vals[3] = 0;
+
+	if (nfp_net_nfd3_txq_full(txq))
+		nfp_net_tx_free_bufs(txq);
+
+	free_descs = nfp_net_nfd3_free_tx_desc(txq);
+	if (unlikely(free_descs == 0)) {
+		PMD_TX_LOG(ERR, "ctrl dev no free descs");
+		goto xmit_end;
+	}
+
+	lmbuf = &txq->txbufs[txq->wr_p].mbuf;
+	RTE_MBUF_PREFETCH_TO_FREE(*lmbuf);
+	if (*lmbuf)
+		rte_pktmbuf_free_seg(*lmbuf);
+
+	*lmbuf = mbuf;
+	dma_addr = rte_mbuf_data_iova(mbuf);
+
+	txds->data_len = mbuf->pkt_len;
+	txds->dma_len = txds->data_len;
+	txds->dma_addr_hi = (dma_addr >> 32) & 0xff;
+	txds->dma_addr_lo = (dma_addr & 0xffffffff);
+	txds->offset_eop = FLOWER_PKT_DATA_OFFSET | PCIE_DESC_TX_EOP;
+
+	txq->wr_p++;
+	if (unlikely(txq->wr_p == txq->tx_count)) /* wrapping?*/
+		txq->wr_p = 0;
+
+	cnt++;
+	app_flower->ctrl_vnic_tx_count++;
+
+xmit_end:
+	rte_wmb();
+	nfp_qcp_ptr_add(txq->qcp_q, NFP_QCP_WRITE_PTR, cnt);
+
+	return cnt;
+}
+
+void
+nfp_flower_ctrl_vnic_poll(struct nfp_app_flower *app_flower)
+{
+	uint16_t i;
+	uint16_t count;
+	struct nfp_net_rxq *rxq;
+	struct nfp_net_hw *ctrl_hw;
+	struct rte_eth_dev *ctrl_eth_dev;
+	struct rte_mbuf *pkts_burst[MAX_PKT_BURST];
+
+	ctrl_hw = app_flower->ctrl_hw;
+	ctrl_eth_dev = ctrl_hw->eth_dev;
+
+	/* ctrl vNIC only has a single Rx queue */
+	rxq = ctrl_eth_dev->data->rx_queues[0];
+	count = nfp_flower_ctrl_vnic_recv(rxq, pkts_burst, MAX_PKT_BURST);
+	if (count > MAX_PKT_BURST) {
+		PMD_RX_LOG(ERR, "nfp_flower_ctrl_vnic_recv failed!");
+		return;
+	}
+
+	if (count) {
+		app_flower->ctrl_vnic_rx_count += count;
+		/* Process cmsgs here, only free for now */
+		for (i = 0; i < count; i++)
+			rte_pktmbuf_free(pkts_burst[i]);
+	}
+}
diff --git a/drivers/net/nfp/flower/nfp_flower_ctrl.h b/drivers/net/nfp/flower/nfp_flower_ctrl.h
new file mode 100644
index 0000000..74765c9
--- /dev/null
+++ b/drivers/net/nfp/flower/nfp_flower_ctrl.h
@@ -0,0 +1,13 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright (c) 2022 Corigine, Inc.
+ * All rights reserved.
+ */
+
+#ifndef _NFP_FLOWER_CTRL_H_
+#define _NFP_FLOWER_CTRL_H_
+
+void nfp_flower_ctrl_vnic_poll(struct nfp_app_flower *app_flower);
+uint16_t nfp_flower_ctrl_vnic_xmit(struct nfp_app_flower *app_flower,
+		struct rte_mbuf *mbuf);
+
+#endif /* _NFP_FLOWER_CTRL_H_ */
diff --git a/drivers/net/nfp/meson.build b/drivers/net/nfp/meson.build
index 7ae3115..8710213 100644
--- a/drivers/net/nfp/meson.build
+++ b/drivers/net/nfp/meson.build
@@ -7,6 +7,7 @@ if not is_linux or not dpdk_conf.get('RTE_ARCH_64')
 endif
 sources = files(
         'flower/nfp_flower.c',
+        'flower/nfp_flower_ctrl.c',
         'nfpcore/nfp_cpp_pcie_ops.c',
         'nfpcore/nfp_nsp.c',
         'nfpcore/nfp_cppcore.c',
-- 
1.8.3.1



* [PATCH v5 10/12] net/nfp: add flower representor framework
  2022-08-05  6:32 [PATCH v5 00/12] preparation for the rte_flow offload of nfp PMD Chaoyong He
                   ` (8 preceding siblings ...)
  2022-08-05  6:32 ` [PATCH v5 09/12] net/nfp: add flower ctrl VNIC rxtx logic Chaoyong He
@ 2022-08-05  6:32 ` Chaoyong He
  2022-08-05 14:23   ` Andrew Rybchenko
  2022-08-05  6:32 ` [PATCH v5 11/12] net/nfp: move rxtx function to header file Chaoyong He
  2022-08-05  6:32 ` [PATCH v5 12/12] net/nfp: add flower PF rxtx logic Chaoyong He
  11 siblings, 1 reply; 29+ messages in thread
From: Chaoyong He @ 2022-08-05  6:32 UTC (permalink / raw)
  To: dev; +Cc: niklas.soderlund, Chaoyong He

This commit adds the framework to support flower representors.
The number of VF representors is parsed from the command line. For
physical port representors the current logic aims to create a
representor for each physical port present on the hardware.

An eth_dev is created for each phyport and VF, and flower firmware
requires a MAC repr cmsg to be transmitted to firmware with
info about the number of physical ports configured.

Reify messages are sent to hardware for each phyport representor.
An rte_ring is also created per representor so that traffic can be
pushed and pulled to this interface.

To bring the real device represented by a flower representor port up or
down, a port mod message conveys that state to the firmware. This
message will be used in the dev_ops callbacks of flower representors.
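
The port number carried in these messages is a packed 32-bit value. As a
minimal sketch, assuming the bit layout given by the NFP_FLOWER_CMSG_PORT_*
macros in this patch, the encoding can be reproduced in plain C. The helper
names below (phys_port_id, pcie_port_id, port_id_type, port_id_vnic) are
hypothetical and are not part of the driver:

```c
#include <stdint.h>

/* Port type values, mirroring enum nfp_flower_cmsg_port_type in the patch. */
enum port_type {
	PORT_TYPE_UNSPEC = 0,
	PORT_TYPE_PHYS   = 1,
	PORT_TYPE_PCIE   = 2,
};

/* Physical port: type in bits [28,31], chip-wide port number in bits [0,7]. */
static uint32_t
phys_port_id(uint8_t port)
{
	return ((uint32_t)PORT_TYPE_PHYS << 28) | port;
}

/*
 * PCIe port: type in bits [28,31], PCIe unit in [14,15], vNIC type in
 * [12,13], vNIC number in [6,11] and queue in [0,5].
 */
static uint32_t
pcie_port_id(uint8_t pcie, uint8_t vnic_type, uint8_t vnic, uint8_t queue)
{
	return ((uint32_t)PORT_TYPE_PCIE << 28) |
	       ((uint32_t)(pcie & 0x3) << 14) |
	       ((uint32_t)(vnic_type & 0x3) << 12) |
	       ((uint32_t)(vnic & 0x3f) << 6) |
	       (uint32_t)(queue & 0x3f);
}

/* Decoders for two of the fields, matching the shift/mask pairs above. */
static unsigned int
port_id_type(uint32_t id)
{
	return (id >> 28) & 0xf;
}

static unsigned int
port_id_vnic(uint32_t id)
{
	return (id >> 6) & 0x3f;
}
```

In the driver the resulting value is converted to big-endian before it is
written into the portnum field of the message.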

Each cmsg generated by the driver is prepended with a cmsg header.
This commit also adds the logic to fill in the header of cmsgs.
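
As an illustration of the resulting layout, the sketch below writes the
same 12 leading bytes that nfp_flower_cmsg_init() produces in this patch:
a 4-byte metadata type, a 4-byte control port id, then the 4-byte cmsg
header. It works on a plain byte buffer rather than an rte_mbuf, and the
names put_be32 and cmsg_write_header are hypothetical helpers, not driver
functions:

```c
#include <stddef.h>
#include <stdint.h>

#define META_TYPE_PORTID  5u          /* NFP_NET_META_PORTID in the patch */
#define META_PORT_ID_CTRL 0xffffffffu /* NFP_META_PORT_ID_CTRL (~0U) */
#define CMSG_VER1         1u          /* NFP_FLOWER_CMSG_VER1 */
#define CMSG_HLEN         4u          /* pad (2) + type (1) + version (1) */

/* Store a 32-bit value in big-endian order, whatever the host endianness. */
static void
put_be32(uint8_t *dst, uint32_t v)
{
	dst[0] = (uint8_t)(v >> 24);
	dst[1] = (uint8_t)(v >> 16);
	dst[2] = (uint8_t)(v >> 8);
	dst[3] = (uint8_t)v;
}

/*
 * Write the metadata words and the cmsg header at the start of buf and
 * return the offset at which the message payload begins. The caller must
 * provide at least 12 bytes.
 */
static size_t
cmsg_write_header(uint8_t *buf, uint8_t type)
{
	put_be32(buf, META_TYPE_PORTID);
	put_be32(buf + 4, META_PORT_ID_CTRL);
	buf[8]  = 0;            /* pad, high byte */
	buf[9]  = 0;            /* pad, low byte */
	buf[10] = type;         /* e.g. 8 for NFP_FLOWER_CMSG_TYPE_PORT_MOD */
	buf[11] = CMSG_VER1;
	return 4 + 4 + CMSG_HLEN;
}
```

The returned offset matches the "4 + 4 + NFP_FLOWER_CMSG_HLEN" computed by
nfp_flower_cmsg_get_data() in this patch.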

This commit also adds the Rx and Tx path for flower representors. For
Rx, packets are dequeued from the representor ring and passed to the
eth_dev. For Tx, the first queue of the PF vNIC is used. Metadata about
the representor is added before the packet is sent down to firmware.
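
The Tx metadata step can be sketched as follows, assuming the 8-byte
FLOWER_PKT_DATA_OFFSET used by this patch. The struct pkt type is a
hypothetical stand-in for an rte_mbuf with headroom, and the headroom
check is an addition for illustration: rte_pktmbuf_prepend() returns NULL
when the mbuf has too little headroom, which the Tx burst code in this
patch does not test for:

```c
#include <stddef.h>
#include <stdint.h>

#define PKT_DATA_OFFSET  8u /* FLOWER_PKT_DATA_OFFSET in the patch */
#define META_TYPE_PORTID 5u /* NFP_NET_META_PORTID in the patch */

/* Hypothetical stand-in for an mbuf: packet data starts at buf + data_off. */
struct pkt {
	uint8_t buf[64];
	size_t data_off; /* headroom in front of the packet data */
	size_t data_len; /* length of the packet data */
};

/* Store a 32-bit value in big-endian order, whatever the host endianness. */
static void
put_be32(uint8_t *dst, uint32_t v)
{
	dst[0] = (uint8_t)(v >> 24);
	dst[1] = (uint8_t)(v >> 16);
	dst[2] = (uint8_t)(v >> 8);
	dst[3] = (uint8_t)v;
}

/*
 * Prepend the metadata type and the representor port id to the packet.
 * Returns 0 on success, or -1 if the packet lacks the required headroom.
 */
static int
pkt_prepend_port_meta(struct pkt *p, uint32_t port_id)
{
	uint8_t *meta;

	if (p->data_off < PKT_DATA_OFFSET)
		return -1;

	p->data_off -= PKT_DATA_OFFSET;
	p->data_len += PKT_DATA_OFFSET;

	meta = p->buf + p->data_off;
	put_be32(meta, META_TYPE_PORTID);
	put_be32(meta + 4, port_id);

	return 0;
}
```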

Signed-off-by: Chaoyong He <chaoyong.he@corigine.com>
Reviewed-by: Niklas Söderlund <niklas.soderlund@corigine.com>
---
 drivers/net/nfp/flower/nfp_flower.c             |  59 +++
 drivers/net/nfp/flower/nfp_flower.h             |  18 +
 drivers/net/nfp/flower/nfp_flower_cmsg.c        | 186 +++++++++
 drivers/net/nfp/flower/nfp_flower_cmsg.h        | 173 ++++++++
 drivers/net/nfp/flower/nfp_flower_representor.c | 508 ++++++++++++++++++++++++
 drivers/net/nfp/flower/nfp_flower_representor.h |  39 ++
 drivers/net/nfp/meson.build                     |   2 +
 drivers/net/nfp/nfpcore/nfp_cpp_pcie_ops.c      |  31 +-
 8 files changed, 1004 insertions(+), 12 deletions(-)
 create mode 100644 drivers/net/nfp/flower/nfp_flower_cmsg.c
 create mode 100644 drivers/net/nfp/flower/nfp_flower_cmsg.h
 create mode 100644 drivers/net/nfp/flower/nfp_flower_representor.c
 create mode 100644 drivers/net/nfp/flower/nfp_flower_representor.h

diff --git a/drivers/net/nfp/flower/nfp_flower.c b/drivers/net/nfp/flower/nfp_flower.c
index 5e9c4ef..d7772d6 100644
--- a/drivers/net/nfp/flower/nfp_flower.c
+++ b/drivers/net/nfp/flower/nfp_flower.c
@@ -22,6 +22,7 @@
 #include "nfp_flower.h"
 #include "nfp_flower_ovs_compat.h"
 #include "nfp_flower_ctrl.h"
+#include "nfp_flower_representor.h"
 
 #define MAX_PKT_BURST 32
 #define MEMPOOL_CACHE_SIZE 512
@@ -927,8 +928,13 @@
 	unsigned int numa_node;
 	struct nfp_net_hw *pf_hw;
 	struct nfp_net_hw *ctrl_hw;
+	struct rte_pci_device *pci_dev;
 	struct nfp_app_flower *app_flower;
+	struct rte_eth_devargs eth_da = {
+		.nb_representor_ports = 0
+	};
 
+	pci_dev = pf_dev->pci_dev;
 	numa_node = rte_socket_id();
 
 	/* Allocate memory for the Flower app */
@@ -1021,6 +1027,59 @@
 		goto ctrl_vnic_cleanup;
 	}
 
+	/* Allocate a switch domain for the flower app */
+	if (app_flower->switch_domain_id == RTE_ETH_DEV_SWITCH_DOMAIN_ID_INVALID &&
+			rte_eth_switch_domain_alloc(&app_flower->switch_domain_id)) {
+		PMD_INIT_LOG(WARNING,
+				"failed to allocate switch domain for device");
+	}
+
+	/* Now parse PCI device args passed for representor info */
+	if (pci_dev->device.devargs) {
+		ret = rte_eth_devargs_parse(pci_dev->device.devargs->args,
+				&eth_da);
+		if (ret) {
+			PMD_INIT_LOG(ERR, "devarg parse failed");
+			goto ctrl_vnic_cleanup;
+		}
+	}
+
+	if (eth_da.nb_representor_ports == 0) {
+		PMD_INIT_LOG(DEBUG, "No representor ports need to be created.");
+		ret = 0;
+		goto done;
+	}
+
+	/* There always exist phy repr */
+	if (eth_da.nb_representor_ports < app_flower->nfp_eth_table->count) {
+		PMD_INIT_LOG(ERR, "Should also create phy representor port.");
+		ret = -ERANGE;
+		goto ctrl_vnic_cleanup;
+	}
+
+	/* Only support VF representor creation via the command line */
+	if (eth_da.type != RTE_ETH_REPRESENTOR_VF) {
+		PMD_DRV_LOG(ERR, "unsupported representor type: %s",
+				pci_dev->device.devargs->args);
+		ret = -ENOTSUP;
+		goto ctrl_vnic_cleanup;
+	}
+
+	/* Fill in flower app with repr counts */
+	app_flower->num_phyport_reprs = (uint8_t)app_flower->nfp_eth_table->count;
+	app_flower->num_vf_reprs = eth_da.nb_representor_ports -
+			app_flower->nfp_eth_table->count;
+
+	PMD_INIT_LOG(INFO, "%d number of VF reprs", app_flower->num_vf_reprs);
+	PMD_INIT_LOG(INFO, "%d number of phyport reprs", app_flower->num_phyport_reprs);
+
+	ret = nfp_flower_repr_alloc(app_flower);
+	if (ret) {
+		PMD_INIT_LOG(ERR,
+			"representors allocation for NFP_REPR_TYPE_VF error");
+		goto ctrl_vnic_cleanup;
+	}
+
 	return 0;
 
 ctrl_vnic_cleanup:
diff --git a/drivers/net/nfp/flower/nfp_flower.h b/drivers/net/nfp/flower/nfp_flower.h
index bdc64e3..24fced3 100644
--- a/drivers/net/nfp/flower/nfp_flower.h
+++ b/drivers/net/nfp/flower/nfp_flower.h
@@ -19,8 +19,20 @@ enum nfp_flower_service {
  */
 #define FLOWER_PKT_DATA_OFFSET 8
 
+#define MAX_FLOWER_PHYPORTS 8
+#define MAX_FLOWER_VFS 64
+
 /* The flower application's private structure */
 struct nfp_app_flower {
+	/* switch domain for this app */
+	uint16_t switch_domain_id;
+
+	/* Number of VF representors */
+	uint8_t num_vf_reprs;
+
+	/* Number of phyport representors */
+	uint8_t num_phyport_reprs;
+
 	/* List of rte_service ID's for the flower app */
 	uint32_t flower_services_ids[NFP_FLOWER_SERVICE_MAX];
 
@@ -44,6 +56,12 @@ struct nfp_app_flower {
 
 	/* Ctrl vNIC Tx counter */
 	uint64_t ctrl_vnic_tx_count;
+
+	/* Array of phyport representors */
+	struct nfp_flower_representor *phy_reprs[MAX_FLOWER_PHYPORTS];
+
+	/* Array of VF representors */
+	struct nfp_flower_representor *vf_reprs[MAX_FLOWER_VFS];
 };
 
 int nfp_init_app_flower(struct nfp_pf_dev *pf_dev);
diff --git a/drivers/net/nfp/flower/nfp_flower_cmsg.c b/drivers/net/nfp/flower/nfp_flower_cmsg.c
new file mode 100644
index 0000000..5ce547c
--- /dev/null
+++ b/drivers/net/nfp/flower/nfp_flower_cmsg.c
@@ -0,0 +1,186 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2022 Corigine, Inc.
+ * All rights reserved.
+ */
+
+#include "../nfpcore/nfp_nsp.h"
+#include "../nfp_logs.h"
+#include "../nfp_common.h"
+#include "nfp_flower.h"
+#include "nfp_flower_cmsg.h"
+#include "nfp_flower_ctrl.h"
+#include "nfp_flower_representor.h"
+
+static void *
+nfp_flower_cmsg_init(struct rte_mbuf *m,
+		enum nfp_flower_cmsg_type type,
+		uint32_t size)
+{
+	char *pkt;
+	uint32_t data;
+	uint32_t new_size = size;
+	struct nfp_flower_cmsg_hdr *hdr;
+
+	pkt = rte_pktmbuf_mtod(m, char *);
+	PMD_DRV_LOG(DEBUG, "flower_cmsg_init using pkt at %p", pkt);
+
+	data = rte_cpu_to_be_32(NFP_NET_META_PORTID);
+	rte_memcpy(pkt, &data, 4);
+	pkt += 4;
+	new_size += 4;
+
+	/* First the metadata as flower requires it */
+	data = rte_cpu_to_be_32(NFP_META_PORT_ID_CTRL);
+	rte_memcpy(pkt, &data, 4);
+	pkt += 4;
+	new_size += 4;
+
+	/* Now the ctrl header */
+	hdr = (struct nfp_flower_cmsg_hdr *)pkt;
+	hdr->pad     = 0;
+	hdr->type    = type;
+	hdr->version = NFP_FLOWER_CMSG_VER1;
+
+	pkt = (char *)hdr + NFP_FLOWER_CMSG_HLEN;
+	new_size += NFP_FLOWER_CMSG_HLEN;
+
+	m->pkt_len = new_size;
+	m->data_len = m->pkt_len;
+
+	return pkt;
+}
+
+static void
+nfp_flower_cmsg_mac_repr_init(struct rte_mbuf *m, int num_ports)
+{
+	uint32_t size;
+	struct nfp_flower_cmsg_mac_repr *msg;
+	enum nfp_flower_cmsg_type type = NFP_FLOWER_CMSG_TYPE_MAC_REPR;
+
+	size = sizeof(*msg) + (num_ports * sizeof(msg->ports[0]));
+	PMD_DRV_LOG(DEBUG, "mac repr cmsg init with size: %u", size);
+	msg = (struct nfp_flower_cmsg_mac_repr *)nfp_flower_cmsg_init(m,
+			type, size);
+
+	memset(msg->reserved, 0, sizeof(msg->reserved));
+	msg->num_ports = num_ports;
+}
+
+static void
+nfp_flower_cmsg_mac_repr_fill(struct rte_mbuf *m,
+		unsigned int idx,
+		unsigned int nbi,
+		unsigned int nbi_port,
+		unsigned int phys_port)
+{
+	struct nfp_flower_cmsg_mac_repr *msg;
+
+	msg = (struct nfp_flower_cmsg_mac_repr *)nfp_flower_cmsg_get_data(m);
+	msg->ports[idx].idx       = idx;
+	msg->ports[idx].info      = nbi & NFP_FLOWER_CMSG_MAC_REPR_NBI;
+	msg->ports[idx].nbi_port  = nbi_port;
+	msg->ports[idx].phys_port = phys_port;
+}
+
+int
+nfp_flower_cmsg_mac_repr(struct nfp_app_flower *app_flower)
+{
+	int i;
+	uint16_t cnt;
+	unsigned int nbi;
+	unsigned int nbi_port;
+	unsigned int phys_port;
+	struct rte_mbuf *mbuf;
+	struct nfp_eth_table *nfp_eth_table;
+
+	mbuf = rte_pktmbuf_alloc(app_flower->ctrl_pktmbuf_pool);
+	if (mbuf == NULL) {
+		PMD_DRV_LOG(ERR, "Could not allocate mac repr cmsg");
+		return -ENOMEM;
+	}
+
+	nfp_flower_cmsg_mac_repr_init(mbuf, app_flower->num_phyport_reprs);
+
+	/* Fill in the mac repr cmsg */
+	nfp_eth_table = app_flower->nfp_eth_table;
+	for (i = 0; i < app_flower->num_phyport_reprs; i++) {
+		nbi = nfp_eth_table->ports[i].nbi;
+		nbi_port = nfp_eth_table->ports[i].base;
+		phys_port = nfp_eth_table->ports[i].index;
+
+		nfp_flower_cmsg_mac_repr_fill(mbuf, i, nbi, nbi_port, phys_port);
+	}
+
+	/* Send the cmsg via the ctrl vNIC */
+	cnt = nfp_flower_ctrl_vnic_xmit(app_flower, mbuf);
+	if (cnt == 0) {
+		PMD_DRV_LOG(ERR, "Send cmsg through ctrl vnic failed.");
+		rte_pktmbuf_free(mbuf);
+		return -EIO;
+	}
+
+	return 0;
+}
+
+int
+nfp_flower_cmsg_repr_reify(struct nfp_app_flower *app_flower,
+		struct nfp_flower_representor *repr)
+{
+	uint16_t cnt;
+	struct rte_mbuf *mbuf;
+	struct nfp_flower_cmsg_port_reify *msg;
+
+	mbuf = rte_pktmbuf_alloc(app_flower->ctrl_pktmbuf_pool);
+	if (mbuf == NULL) {
+		PMD_DRV_LOG(DEBUG, "alloc mbuf for repr reify failed");
+		return -ENOMEM;
+	}
+
+	msg = (struct nfp_flower_cmsg_port_reify *)nfp_flower_cmsg_init(mbuf,
+			NFP_FLOWER_CMSG_TYPE_PORT_REIFY, sizeof(*msg));
+
+	msg->portnum  = rte_cpu_to_be_32(repr->port_id);
+	msg->reserved = 0;
+	msg->info     = rte_cpu_to_be_16(1);
+
+	cnt = nfp_flower_ctrl_vnic_xmit(app_flower, mbuf);
+	if (cnt == 0) {
+		PMD_DRV_LOG(ERR, "Send cmsg through ctrl vnic failed.");
+		rte_pktmbuf_free(mbuf);
+		return -EIO;
+	}
+
+	return 0;
+}
+
+int
+nfp_flower_cmsg_port_mod(struct nfp_app_flower *app_flower,
+		uint32_t port_id, bool carrier_ok)
+{
+	uint16_t cnt;
+	struct rte_mbuf *mbuf;
+	struct nfp_flower_cmsg_port_mod *msg;
+
+	mbuf = rte_pktmbuf_alloc(app_flower->ctrl_pktmbuf_pool);
+	if (mbuf == NULL) {
+		PMD_DRV_LOG(DEBUG, "alloc mbuf for repr portmod failed");
+		return -ENOMEM;
+	}
+
+	msg = (struct nfp_flower_cmsg_port_mod *)nfp_flower_cmsg_init(mbuf,
+			NFP_FLOWER_CMSG_TYPE_PORT_MOD, sizeof(*msg));
+
+	msg->portnum  = rte_cpu_to_be_32(port_id);
+	msg->reserved = 0;
+	msg->info     = carrier_ok;
+	msg->mtu      = rte_cpu_to_be_16(9000);
+
+	cnt = nfp_flower_ctrl_vnic_xmit(app_flower, mbuf);
+	if (cnt == 0) {
+		PMD_DRV_LOG(ERR, "Send cmsg through ctrl vnic failed.");
+		rte_pktmbuf_free(mbuf);
+		return -EIO;
+	}
+
+	return 0;
+}
diff --git a/drivers/net/nfp/flower/nfp_flower_cmsg.h b/drivers/net/nfp/flower/nfp_flower_cmsg.h
new file mode 100644
index 0000000..fce5163
--- /dev/null
+++ b/drivers/net/nfp/flower/nfp_flower_cmsg.h
@@ -0,0 +1,173 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2022 Corigine, Inc.
+ * All rights reserved.
+ */
+
+#ifndef _NFP_CMSG_H_
+#define _NFP_CMSG_H_
+
+#include <rte_byteorder.h>
+#include <rte_ether.h>
+
+struct nfp_flower_cmsg_hdr {
+	rte_be16_t pad;
+	uint8_t type;
+	uint8_t version;
+};
+
+/* Types defined for control messages */
+enum nfp_flower_cmsg_type {
+	NFP_FLOWER_CMSG_TYPE_FLOW_ADD       = 0,
+	NFP_FLOWER_CMSG_TYPE_FLOW_MOD       = 1,
+	NFP_FLOWER_CMSG_TYPE_FLOW_DEL       = 2,
+	NFP_FLOWER_CMSG_TYPE_LAG_CONFIG     = 4,
+	NFP_FLOWER_CMSG_TYPE_PORT_REIFY     = 6,
+	NFP_FLOWER_CMSG_TYPE_MAC_REPR       = 7,
+	NFP_FLOWER_CMSG_TYPE_PORT_MOD       = 8,
+	NFP_FLOWER_CMSG_TYPE_MERGE_HINT     = 9,
+	NFP_FLOWER_CMSG_TYPE_NO_NEIGH       = 10,
+	NFP_FLOWER_CMSG_TYPE_TUN_MAC        = 11,
+	NFP_FLOWER_CMSG_TYPE_ACTIVE_TUNS    = 12,
+	NFP_FLOWER_CMSG_TYPE_TUN_NEIGH      = 13,
+	NFP_FLOWER_CMSG_TYPE_TUN_IPS        = 14,
+	NFP_FLOWER_CMSG_TYPE_FLOW_STATS     = 15,
+	NFP_FLOWER_CMSG_TYPE_PORT_ECHO      = 16,
+	NFP_FLOWER_CMSG_TYPE_QOS_MOD        = 18,
+	NFP_FLOWER_CMSG_TYPE_QOS_DEL        = 19,
+	NFP_FLOWER_CMSG_TYPE_QOS_STATS      = 20,
+	NFP_FLOWER_CMSG_TYPE_PRE_TUN_RULE   = 21,
+	NFP_FLOWER_CMSG_TYPE_TUN_IPS_V6     = 22,
+	NFP_FLOWER_CMSG_TYPE_NO_NEIGH_V6    = 23,
+	NFP_FLOWER_CMSG_TYPE_TUN_NEIGH_V6   = 24,
+	NFP_FLOWER_CMSG_TYPE_ACTIVE_TUNS_V6 = 25,
+	NFP_FLOWER_CMSG_TYPE_MAX            = 32,
+};
+
+/*
+ * NFP_FLOWER_CMSG_TYPE_MAC_REPR
+ *    Bit    3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0
+ *    -----\ 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
+ *     Word +---------------+-----------+---+---------------+---------------+
+ *       0  |                  spare                        |Number of ports|
+ *          +---------------+-----------+---+---------------+---------------+
+ *       1  |    Index      |   spare   |NBI|  Port on NBI  | Chip-wide port|
+ *          +---------------+-----------+---+---------------+---------------+
+ *                                        ....
+ *          +---------------+-----------+---+---------------+---------------+
+ *     N-1  |    Index      |   spare   |NBI|  Port on NBI  | Chip-wide port|
+ *          +---------------+-----------+---+---------------+---------------+
+ *     N    |    Index      |   spare   |NBI|  Port on NBI  | Chip-wide port|
+ *          +---------------+-----------+---+---------------+---------------+
+ *
+ *          Index: index into the eth table
+ *          NBI (bits 17-16): NBI number (0-3)
+ *          Port on NBI (bits 15-8): "base" in the driver
+ *            this forms NBIX.PortY notation as the NSP eth table.
+ *          "Chip-wide" port (bits 7-0):
+ */
+struct nfp_flower_cmsg_mac_repr {
+	uint8_t reserved[3];
+	uint8_t num_ports;
+	struct {
+		uint8_t idx;
+		uint8_t info;
+		uint8_t nbi_port;
+		uint8_t phys_port;
+	} ports[0];
+};
+
+/*
+ * NFP_FLOWER_CMSG_TYPE_PORT_REIFY
+ *    Bit    3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0
+ *    -----\ 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
+ *    Word  +-------+-------+---+---+-------+---+---+-----------+-----------+
+ *       0  |Port Ty|Sys ID |NIC|Rsv| Spare |PCI|typ|    vNIC   |  queue    |
+ *          +-------+-----+-+---+---+-------+---+---+-----------+---------+-+
+ *       1  |                             Spare                           |E|
+ *          +-------------------------------------------------------------+-+
+ *          E: 1 = Representor exists, 0 = Representor does not exist
+ */
+struct nfp_flower_cmsg_port_reify {
+	rte_be32_t portnum;
+	rte_be16_t reserved;
+	rte_be16_t info;
+};
+
+/*
+ * NFP_FLOWER_CMSG_TYPE_PORT_MOD
+ *    Bit    3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0
+ *    -----\ 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
+ *    Word  +-------+-------+---+---+-------+---+---+-------+---+-----------+
+ *       0  |Port Ty|Sys ID |NIC|Rsv|       Reserved        |    Port       |
+ *          +-------+-------+---+---+-----+-+---+---+-------+---+-----------+
+ *       1  |            Spare            |L|              MTU              |
+ *          +-----------------------------+-+-------------------------------+
+ *        L: Link or Admin state bit. When message is generated by host, this
+ *           bit indicates the admin state (0=down, 1=up). When generated by
+ *           NFP, it indicates the link state (0=down, 1=up)
+ *
+ *        Port Type (word 1, bits 31 to 28) = 1 (Physical Network)
+ *        Port: "Chip-wide number" as assigned by BSP
+ *
+ *    Bit    3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0
+ *    -----\ 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
+ *    Word  +-------+-------+---+---+-------+---+---+-------+---+-----------+
+ *       0  |Port Ty|Sys ID |NIC|Rsv| Spare |PCI|typ|    vNIC   |  queue    |
+ *          +-------+-----+-+---+---+---+-+-+---+---+-------+---+-----------+
+ *       1  |            Spare            |L|              MTU              |
+ *          +-----------------------------+-+-------------------------------+
+ *        L: Link or Admin state bit. When message is generated by host, this
+ *           bit indicates the admin state (0=down, 1=up). When generated by
+ *           NFP, it indicates the link state (0=down, 1=up)
+ *
+ *        Port Type (word 1, bits 31 to 28) = 2 (PCIE)
+ */
+struct nfp_flower_cmsg_port_mod {
+	rte_be32_t portnum;
+	uint8_t reserved;
+	uint8_t info;
+	rte_be16_t mtu;
+};
+
+enum nfp_flower_cmsg_port_type {
+	NFP_FLOWER_CMSG_PORT_TYPE_UNSPEC,
+	NFP_FLOWER_CMSG_PORT_TYPE_PHYS_PORT,
+	NFP_FLOWER_CMSG_PORT_TYPE_PCIE_PORT,
+	NFP_FLOWER_CMSG_PORT_TYPE_OTHER_PORT,
+};
+
+enum nfp_flower_cmsg_port_vnic_type {
+	NFP_FLOWER_CMSG_PORT_VNIC_TYPE_VF,
+	NFP_FLOWER_CMSG_PORT_VNIC_TYPE_PF,
+	NFP_FLOWER_CMSG_PORT_VNIC_TYPE_CTRL,
+};
+
+#define NFP_FLOWER_CMSG_MAC_REPR_NBI            (0x3)
+
+#define NFP_FLOWER_CMSG_HLEN            sizeof(struct nfp_flower_cmsg_hdr)
+#define NFP_FLOWER_CMSG_VER1            1
+#define NFP_NET_META_PORTID             5
+#define NFP_META_PORT_ID_CTRL           (~0U)
+
+#define NFP_FLOWER_CMSG_PORT_TYPE(x)            (((x) >> 28) & 0xf)  /* [28,31] */
+#define NFP_FLOWER_CMSG_PORT_SYS_ID(x)          (((x) >> 24) & 0xf)  /* [24,27] */
+#define NFP_FLOWER_CMSG_PORT_NFP_ID(x)          (((x) >> 22) & 0x3)  /* [22,23] */
+#define NFP_FLOWER_CMSG_PORT_PCI(x)             (((x) >> 14) & 0x3)  /* [14,15] */
+#define NFP_FLOWER_CMSG_PORT_VNIC_TYPE(x)       (((x) >> 12) & 0x3)  /* [12,13] */
+#define NFP_FLOWER_CMSG_PORT_VNIC(x)            (((x) >> 6) & 0x3f)  /* [6,11] */
+#define NFP_FLOWER_CMSG_PORT_PCIE_Q(x)          ((x) & 0x3f)         /* [0,5] */
+#define NFP_FLOWER_CMSG_PORT_PHYS_PORT_NUM(x)   ((x) & 0xff)         /* [0,7] */
+
+static inline char*
+nfp_flower_cmsg_get_data(struct rte_mbuf *m)
+{
+	return rte_pktmbuf_mtod(m, char *) + 4 + 4 + NFP_FLOWER_CMSG_HLEN;
+}
+
+int nfp_flower_cmsg_mac_repr(struct nfp_app_flower *app_flower);
+int nfp_flower_cmsg_repr_reify(struct nfp_app_flower *app_flower,
+		struct nfp_flower_representor *repr);
+int nfp_flower_cmsg_port_mod(struct nfp_app_flower *app_flower,
+		uint32_t port_id, bool carrier_ok);
+
+#endif /* _NFP_CMSG_H_ */
diff --git a/drivers/net/nfp/flower/nfp_flower_representor.c b/drivers/net/nfp/flower/nfp_flower_representor.c
new file mode 100644
index 0000000..9f23a23
--- /dev/null
+++ b/drivers/net/nfp/flower/nfp_flower_representor.c
@@ -0,0 +1,508 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright (c) 2022 Corigine, Inc.
+ * All rights reserved.
+ */
+
+#include <rte_common.h>
+#include <ethdev_pci.h>
+
+#include "../nfp_common.h"
+#include "../nfp_logs.h"
+#include "../nfp_ctrl.h"
+#include "../nfp_rxtx.h"
+#include "../nfpcore/nfp_mip.h"
+#include "../nfpcore/nfp_rtsym.h"
+#include "../nfpcore/nfp_nsp.h"
+#include "nfp_flower.h"
+#include "nfp_flower_representor.h"
+#include "nfp_flower_ctrl.h"
+#include "nfp_flower_cmsg.h"
+
+static int
+nfp_flower_repr_link_update(__rte_unused struct rte_eth_dev *ethdev,
+		__rte_unused int wait_to_complete)
+{
+	return 0;
+}
+
+static int
+nfp_flower_repr_dev_infos_get(__rte_unused struct rte_eth_dev *dev,
+		struct rte_eth_dev_info *dev_info)
+{
+	/* Hardcoded pktlen and queues for now */
+	dev_info->max_rx_queues = 1;
+	dev_info->max_tx_queues = 1;
+	dev_info->min_rx_bufsize = RTE_ETHER_MIN_MTU;
+	dev_info->max_rx_pktlen = 9000;
+
+	dev_info->rx_offload_capa = RTE_ETH_RX_OFFLOAD_VLAN_STRIP;
+	dev_info->rx_offload_capa |= RTE_ETH_RX_OFFLOAD_IPV4_CKSUM |
+			RTE_ETH_RX_OFFLOAD_UDP_CKSUM |
+			RTE_ETH_RX_OFFLOAD_TCP_CKSUM;
+
+	dev_info->tx_offload_capa = RTE_ETH_TX_OFFLOAD_VLAN_INSERT;
+	dev_info->tx_offload_capa |= RTE_ETH_TX_OFFLOAD_IPV4_CKSUM |
+			RTE_ETH_TX_OFFLOAD_UDP_CKSUM |
+			RTE_ETH_TX_OFFLOAD_TCP_CKSUM;
+	dev_info->tx_offload_capa |= RTE_ETH_TX_OFFLOAD_TCP_TSO;
+	dev_info->tx_offload_capa |= RTE_ETH_TX_OFFLOAD_MULTI_SEGS;
+
+	dev_info->max_mac_addrs = 1;
+
+	return 0;
+}
+
+static int
+nfp_flower_repr_dev_configure(__rte_unused struct rte_eth_dev *dev)
+{
+	return 0;
+}
+
+static int
+nfp_flower_repr_dev_start(struct rte_eth_dev *dev)
+{
+	struct nfp_app_flower *app_flower;
+	struct nfp_flower_representor *repr;
+
+	repr = (struct nfp_flower_representor *)dev->data->dev_private;
+	app_flower = repr->app_flower;
+
+	if (repr->repr_type == NFP_REPR_TYPE_PHYS_PORT)
+		nfp_eth_set_configured(app_flower->pf_hw->pf_dev->cpp,
+				repr->nfp_idx, 1);
+
+	nfp_flower_cmsg_port_mod(app_flower, repr->port_id, true);
+
+	return 0;
+}
+
+static int
+nfp_flower_repr_dev_stop(struct rte_eth_dev *dev)
+{
+	struct nfp_app_flower *app_flower;
+	struct nfp_flower_representor *repr;
+
+	repr = (struct nfp_flower_representor *)dev->data->dev_private;
+	app_flower = repr->app_flower;
+
+	nfp_flower_cmsg_port_mod(app_flower, repr->port_id, false);
+
+	if (repr->repr_type == NFP_REPR_TYPE_PHYS_PORT)
+		nfp_eth_set_configured(app_flower->pf_hw->pf_dev->cpp,
+				repr->nfp_idx, 0);
+
+	return 0;
+}
+
+static int
+nfp_flower_repr_rx_queue_setup(struct rte_eth_dev *dev,
+		uint16_t rx_queue_id,
+		__rte_unused uint16_t nb_rx_desc,
+		unsigned int socket_id,
+		__rte_unused const struct rte_eth_rxconf *rx_conf,
+		__rte_unused struct rte_mempool *mb_pool)
+{
+	struct nfp_net_rxq *rxq;
+	struct nfp_net_hw *pf_hw;
+	struct nfp_flower_representor *repr;
+
+	repr = (struct nfp_flower_representor *)dev->data->dev_private;
+	pf_hw = repr->app_flower->pf_hw;
+
+	/* Allocating rx queue data structure */
+	rxq = rte_zmalloc_socket("ethdev RX queue", sizeof(struct nfp_net_rxq),
+			RTE_CACHE_LINE_SIZE, socket_id);
+	if (rxq == NULL)
+		return -ENOMEM;
+
+	rxq->hw = pf_hw;
+	rxq->qidx = rx_queue_id;
+	rxq->port_id = dev->data->port_id;
+	dev->data->rx_queues[rx_queue_id] = rxq;
+
+	return 0;
+}
+
+static int
+nfp_flower_repr_tx_queue_setup(struct rte_eth_dev *dev,
+		uint16_t tx_queue_id,
+		__rte_unused uint16_t nb_tx_desc,
+		unsigned int socket_id,
+		__rte_unused const struct rte_eth_txconf *tx_conf)
+{
+	struct nfp_net_txq *txq;
+	struct nfp_net_hw *pf_hw;
+	struct nfp_flower_representor *repr;
+
+	repr = (struct nfp_flower_representor *)dev->data->dev_private;
+	pf_hw = repr->app_flower->pf_hw;
+
+	/* Allocating tx queue data structure */
+	txq = rte_zmalloc_socket("ethdev TX queue", sizeof(struct nfp_net_txq),
+			RTE_CACHE_LINE_SIZE, socket_id);
+	if (txq == NULL)
+		return -ENOMEM;
+
+	txq->hw = pf_hw;
+	txq->qidx = tx_queue_id;
+	txq->port_id = dev->data->port_id;
+	dev->data->tx_queues[tx_queue_id] = txq;
+
+	return 0;
+}
+
+static int
+nfp_flower_repr_stats_get(struct rte_eth_dev *ethdev,
+		struct rte_eth_stats *stats)
+{
+	struct nfp_flower_representor *repr;
+
+	repr = (struct nfp_flower_representor *)ethdev->data->dev_private;
+	rte_memcpy(stats, &repr->repr_stats, sizeof(struct rte_eth_stats));
+
+	return 0;
+}
+
+static int
+nfp_flower_repr_stats_reset(__rte_unused struct rte_eth_dev *ethdev)
+{
+	return 0;
+}
+
+static int
+nfp_flower_repr_promiscuous_enable(__rte_unused struct rte_eth_dev *ethdev)
+{
+	return 0;
+}
+
+static int
+nfp_flower_repr_promiscuous_disable(__rte_unused struct rte_eth_dev *ethdev)
+{
+	return 0;
+}
+
+static void
+nfp_flower_repr_mac_addr_remove(__rte_unused struct rte_eth_dev *ethdev,
+		__rte_unused uint32_t index)
+{
+}
+
+static int
+nfp_flower_repr_mac_addr_set(__rte_unused struct rte_eth_dev *ethdev,
+		__rte_unused struct rte_ether_addr *mac_addr)
+{
+	return 0;
+}
+
+static uint16_t
+nfp_flower_repr_rx_burst(void *rx_queue,
+		struct rte_mbuf **rx_pkts,
+		uint16_t nb_pkts)
+{
+	unsigned int available = 0;
+	unsigned int total_dequeue;
+	struct nfp_net_rxq *rxq;
+	struct rte_eth_dev *dev;
+	struct nfp_flower_representor *repr;
+
+	rxq = rx_queue;
+	if (unlikely(rxq == NULL)) {
+		PMD_RX_LOG(ERR, "RX Bad queue");
+		return 0;
+	}
+
+	dev = &rte_eth_devices[rxq->port_id];
+	repr = dev->data->dev_private;
+	if (unlikely(repr->ring == NULL)) {
+		PMD_RX_LOG(ERR, "representor %s has no ring configured!",
+				repr->name);
+		return 0;
+	}
+
+	total_dequeue = rte_ring_dequeue_burst(repr->ring, (void *)rx_pkts,
+			nb_pkts, &available);
+	if (total_dequeue) {
+		PMD_RX_LOG(DEBUG, "Representor Rx burst for %s, port_id: 0x%x, "
+				"received: %u, available: %u", repr->name,
+				repr->port_id, total_dequeue, available);
+
+		repr->repr_stats.ipackets += total_dequeue;
+	}
+
+	return total_dequeue;
+}
+
+static uint16_t
+nfp_flower_repr_tx_burst(void *tx_queue,
+		struct rte_mbuf **tx_pkts,
+		uint16_t nb_pkts)
+{
+	uint16_t i;
+	uint16_t sent;
+	char *meta_offset;
+	struct nfp_net_txq *txq;
+	struct nfp_net_hw *pf_hw;
+	struct rte_eth_dev *dev;
+	struct rte_eth_dev *repr_dev;
+	struct nfp_flower_representor *repr;
+
+	txq = tx_queue;
+	if (unlikely(txq == NULL)) {
+		PMD_TX_LOG(ERR, "TX Bad queue");
+		return 0;
+	}
+
+	/* This points to the PF vNIC that owns this representor */
+	pf_hw = txq->hw;
+	dev = pf_hw->eth_dev;
+
+	/* Grab a handle to the representor struct */
+	repr_dev = &rte_eth_devices[txq->port_id];
+	repr = repr_dev->data->dev_private;
+
+	for (i = 0; i < nb_pkts; i++) {
+		meta_offset = rte_pktmbuf_prepend(tx_pkts[i], FLOWER_PKT_DATA_OFFSET);
+		*(uint32_t *)meta_offset = rte_cpu_to_be_32(NFP_NET_META_PORTID);
+		meta_offset += 4;
+		*(uint32_t *)meta_offset = rte_cpu_to_be_32(repr->port_id);
+	}
+
+	/* Only using Tx queue 0 for now. */
+	sent = rte_eth_tx_burst(dev->data->port_id, 0, tx_pkts, nb_pkts);
+	if (sent) {
+		PMD_TX_LOG(DEBUG, "Representor Tx burst for %s, port_id: 0x%x, "
+			"transmitted: %u", repr->name, repr->port_id, sent);
+		repr->repr_stats.opackets += sent;
+	}
+
+	return sent;
+}
+
+static const struct eth_dev_ops nfp_flower_repr_dev_ops = {
+	.dev_infos_get        = nfp_flower_repr_dev_infos_get,
+
+	.dev_start            = nfp_flower_repr_dev_start,
+	.dev_configure        = nfp_flower_repr_dev_configure,
+	.dev_stop             = nfp_flower_repr_dev_stop,
+
+	.rx_queue_setup       = nfp_flower_repr_rx_queue_setup,
+	.tx_queue_setup       = nfp_flower_repr_tx_queue_setup,
+
+	.link_update          = nfp_flower_repr_link_update,
+
+	.stats_get            = nfp_flower_repr_stats_get,
+	.stats_reset          = nfp_flower_repr_stats_reset,
+
+	.promiscuous_enable   = nfp_flower_repr_promiscuous_enable,
+	.promiscuous_disable  = nfp_flower_repr_promiscuous_disable,
+
+	.mac_addr_remove      = nfp_flower_repr_mac_addr_remove,
+	.mac_addr_set         = nfp_flower_repr_mac_addr_set,
+};
+
+static uint32_t
+nfp_flower_get_phys_port_id(uint8_t port)
+{
+	return (NFP_FLOWER_CMSG_PORT_TYPE_PHYS_PORT << 28) | port;
+}
+
+static uint32_t
+nfp_get_pcie_port_id(struct nfp_cpp *cpp,
+		int type,
+		uint8_t vnic,
+		uint8_t queue)
+{
+	uint8_t nfp_pcie;
+	uint32_t port_id;
+
+	nfp_pcie = NFP_CPP_INTERFACE_UNIT_of(nfp_cpp_interface(cpp));
+	port_id = ((nfp_pcie & 0x3) << 14) |
+		  ((type & 0x3) << 12) |
+		  ((vnic & 0x3f) << 6) |
+		  (queue & 0x3f) |
+		  ((NFP_FLOWER_CMSG_PORT_TYPE_PCIE_PORT & 0xf) << 28);
+
+	return port_id;
+}
+
+static int
+nfp_flower_repr_init(struct rte_eth_dev *eth_dev,
+		void *init_params)
+{
+	int ret;
+	unsigned int numa_node;
+	char ring_name[RTE_ETH_NAME_MAX_LEN];
+	struct nfp_app_flower *app_flower;
+	struct nfp_flower_representor *repr;
+	struct nfp_flower_representor *init_repr_data;
+
+	/* Cast the input representor data to the correct struct here */
+	init_repr_data = (struct nfp_flower_representor *)init_params;
+
+	app_flower = init_repr_data->app_flower;
+
+	/* Memory has been allocated in the eth_dev_create() function */
+	repr = eth_dev->data->dev_private;
+
+	/*
+	 * We need multi-producer rings as we can have multiple PF ports.
+	 * On the other hand, we need single consumer rings, as just one
+	 * representor PMD will try to read from the ring.
+	 */
+	snprintf(ring_name, sizeof(ring_name), "%s_%s",
+		init_repr_data->name, "ring");
+	numa_node = rte_socket_id();
+	repr->ring = rte_ring_create(ring_name, 256, numa_node, RING_F_SC_DEQ);
+	if (repr->ring == NULL) {
+		PMD_INIT_LOG(ERR, "rte_ring_create failed for %s", ring_name);
+		return -ENOMEM;
+	}
+
+	/* Copy data here from the input representor template */
+	repr->vf_id            = init_repr_data->vf_id;
+	repr->switch_domain_id = init_repr_data->switch_domain_id;
+	repr->port_id          = init_repr_data->port_id;
+	repr->nfp_idx          = init_repr_data->nfp_idx;
+	repr->repr_type        = init_repr_data->repr_type;
+	repr->app_flower       = init_repr_data->app_flower;
+
+	snprintf(repr->name, sizeof(repr->name), "%s", init_repr_data->name);
+
+	eth_dev->dev_ops = &nfp_flower_repr_dev_ops;
+
+	eth_dev->rx_pkt_burst = nfp_flower_repr_rx_burst;
+	eth_dev->tx_pkt_burst = nfp_flower_repr_tx_burst;
+
+	eth_dev->data->dev_flags |= RTE_ETH_DEV_REPRESENTOR;
+
+	if (repr->repr_type == NFP_REPR_TYPE_PHYS_PORT)
+		eth_dev->data->representor_id = repr->vf_id;
+	else
+		eth_dev->data->representor_id = repr->vf_id +
+				app_flower->num_phyport_reprs;
+
+	/* This backer port is that of the eth_device created for the PF vNIC */
+	eth_dev->data->backer_port_id = app_flower->pf_hw->eth_dev->data->port_id;
+
+	/* Only single queues for representor devices */
+	eth_dev->data->nb_rx_queues = 1;
+	eth_dev->data->nb_tx_queues = 1;
+
+	/* Allocating memory for mac addr */
+	eth_dev->data->mac_addrs = rte_zmalloc("mac_addr",
+		RTE_ETHER_ADDR_LEN, 0);
+	if (eth_dev->data->mac_addrs == NULL) {
+		PMD_INIT_LOG(ERR, "Failed to allocate memory for repr MAC");
+		ret = -ENOMEM;
+		goto ring_cleanup;
+	}
+
+	rte_ether_addr_copy(&init_repr_data->mac_addr, &repr->mac_addr);
+	rte_ether_addr_copy(&init_repr_data->mac_addr, eth_dev->data->mac_addrs);
+
+	/* Send reify message to hardware to inform it about the new repr */
+	ret = nfp_flower_cmsg_repr_reify(app_flower, repr);
+	if (ret) {
+		PMD_INIT_LOG(WARNING, "Failed to send repr reify message");
+		goto mac_cleanup;
+	}
+
+	/* Add repr to correct array */
+	if (repr->repr_type == NFP_REPR_TYPE_PHYS_PORT)
+		app_flower->phy_reprs[repr->nfp_idx] = repr;
+	else
+		app_flower->vf_reprs[repr->vf_id] = repr;
+
+	return 0;
+
+mac_cleanup:
+	rte_free(eth_dev->data->mac_addrs);
+ring_cleanup:
+	rte_ring_free(repr->ring);
+
+	return ret;
+}
+
+int
+nfp_flower_repr_alloc(struct nfp_app_flower *app_flower)
+{
+	int i;
+	int ret;
+	struct rte_eth_dev *eth_dev;
+	struct nfp_eth_table *nfp_eth_table;
+	struct nfp_eth_table_port *eth_port;
+	struct nfp_flower_representor flower_repr = {
+		.switch_domain_id = app_flower->switch_domain_id,
+		.app_flower       = app_flower,
+	};
+
+	nfp_eth_table = app_flower->nfp_eth_table;
+	eth_dev = app_flower->pf_hw->eth_dev;
+
+	/* Send a NFP_FLOWER_CMSG_TYPE_MAC_REPR cmsg to hardware */
+	ret = nfp_flower_cmsg_mac_repr(app_flower);
+	if (ret) {
+		PMD_INIT_LOG(ERR, "Could not send mac repr cmsgs");
+		return ret;
+	}
+
+	/* Create a rte_eth_dev for every phyport representor */
+	for (i = 0; i < app_flower->num_phyport_reprs; i++) {
+		eth_port = &nfp_eth_table->ports[i];
+		flower_repr.repr_type = NFP_REPR_TYPE_PHYS_PORT;
+		flower_repr.port_id = nfp_flower_get_phys_port_id(eth_port->index);
+		flower_repr.nfp_idx = eth_port->eth_index;
+		flower_repr.vf_id = i;
+
+		/* Copy the real mac of the interface to the representor struct */
+		rte_ether_addr_copy((struct rte_ether_addr *)eth_port->mac_addr,
+				&flower_repr.mac_addr);
+		snprintf(flower_repr.name, sizeof(flower_repr.name),
+				"flower_repr_p%d", i);
+
+		/*
+		 * Create a eth_dev for this representor
+		 * This will also allocate private memory for the device
+		 */
+		ret = rte_eth_dev_create(eth_dev->device, flower_repr.name,
+				sizeof(struct nfp_flower_representor),
+				NULL, NULL, nfp_flower_repr_init, &flower_repr);
+		if (ret) {
+			PMD_INIT_LOG(ERR, "Could not create eth_dev for repr");
+			break;
+		}
+	}
+
+	if (i < app_flower->num_phyport_reprs)
+		return ret;
+
+	/*
+	 * Now allocate eth_dev's for VF representors.
+	 * Also send reify messages
+	 */
+	for (i = 0; i < app_flower->num_vf_reprs; i++) {
+		flower_repr.repr_type = NFP_REPR_TYPE_VF;
+		flower_repr.port_id = nfp_get_pcie_port_id(app_flower->pf_hw->cpp,
+				NFP_FLOWER_CMSG_PORT_VNIC_TYPE_VF, i, 0);
+		flower_repr.nfp_idx = 0;
+		flower_repr.vf_id = i;
+
+		/* VF reprs get a random MAC address */
+		rte_eth_random_addr(flower_repr.mac_addr.addr_bytes);
+
+		snprintf(flower_repr.name, sizeof(flower_repr.name),
+				"flower_repr_vf%d", i);
+
+		/* This will also allocate private memory for the device */
+		ret = rte_eth_dev_create(eth_dev->device, flower_repr.name,
+				sizeof(struct nfp_flower_representor),
+				NULL, NULL, nfp_flower_repr_init, &flower_repr);
+		if (ret) {
+			PMD_INIT_LOG(ERR, "Could not create eth_dev for repr");
+			break;
+		}
+	}
+
+	if (i < app_flower->num_vf_reprs)
+		return ret;
+
+	return 0;
+}
diff --git a/drivers/net/nfp/flower/nfp_flower_representor.h b/drivers/net/nfp/flower/nfp_flower_representor.h
new file mode 100644
index 0000000..6ee54f1
--- /dev/null
+++ b/drivers/net/nfp/flower/nfp_flower_representor.h
@@ -0,0 +1,39 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2022 Corigine, Inc.
+ * All rights reserved.
+ */
+
+#ifndef _NFP_FLOWER_REPRESENTOR_H_
+#define _NFP_FLOWER_REPRESENTOR_H_
+
+/*
+ * enum nfp_repr_type - type of representor
+ * @NFP_REPR_TYPE_PHYS_PORT:   external NIC port
+ * @NFP_REPR_TYPE_PF:          physical function
+ * @NFP_REPR_TYPE_VF:          virtual function
+ * @NFP_REPR_TYPE_MAX:         number of representor types
+ */
+enum nfp_repr_type {
+	NFP_REPR_TYPE_PHYS_PORT = 0,
+	NFP_REPR_TYPE_PF,
+	NFP_REPR_TYPE_VF,
+	NFP_REPR_TYPE_MAX,
+};
+
+struct nfp_flower_representor {
+	uint16_t vf_id;
+	uint16_t switch_domain_id;
+	uint32_t repr_type;
+	uint32_t port_id;
+	uint32_t nfp_idx;    /* only valid for the repr of physical port */
+	char name[RTE_ETH_NAME_MAX_LEN];
+	struct rte_ether_addr mac_addr;
+	struct nfp_app_flower *app_flower;
+	struct rte_ring *ring;
+	struct rte_eth_link *link;
+	struct rte_eth_stats repr_stats;
+};
+
+int nfp_flower_repr_alloc(struct nfp_app_flower *app_flower);
+
+#endif /* _NFP_FLOWER_REPRESENTOR_H_ */
diff --git a/drivers/net/nfp/meson.build b/drivers/net/nfp/meson.build
index 8710213..8a63979 100644
--- a/drivers/net/nfp/meson.build
+++ b/drivers/net/nfp/meson.build
@@ -7,7 +7,9 @@ if not is_linux or not dpdk_conf.get('RTE_ARCH_64')
 endif
 sources = files(
         'flower/nfp_flower.c',
+        'flower/nfp_flower_cmsg.c',
         'flower/nfp_flower_ctrl.c',
+        'flower/nfp_flower_representor.c',
         'nfpcore/nfp_cpp_pcie_ops.c',
         'nfpcore/nfp_nsp.c',
         'nfpcore/nfp_cppcore.c',
diff --git a/drivers/net/nfp/nfpcore/nfp_cpp_pcie_ops.c b/drivers/net/nfp/nfpcore/nfp_cpp_pcie_ops.c
index 08bc4e8..22c8bc4 100644
--- a/drivers/net/nfp/nfpcore/nfp_cpp_pcie_ops.c
+++ b/drivers/net/nfp/nfpcore/nfp_cpp_pcie_ops.c
@@ -91,7 +91,10 @@
  * @refcnt:	number of current users
  * @iomem:	mapped IO memory
  */
+#define NFP_BAR_MIN 1
+#define NFP_BAR_MID 5
 #define NFP_BAR_MAX 7
+
 struct nfp_bar {
 	struct nfp_pcie_user *nfp;
 	uint32_t barcfg;
@@ -292,6 +295,7 @@ struct nfp_pcie_user {
  * BAR0.0: Reserved for General Mapping (for MSI-X access to PCIe SRAM)
  *
  *         Halving PCItoCPPBars for primary and secondary processes.
+ *         For CoreNIC firmware:
  *         NFP PMD just requires two fixed slots, one for configuration BAR,
  *         and another for accessing the hw queues. Another slot is needed
  *         for setting the link up or down. Secondary processes do not need
@@ -301,6 +305,9 @@ struct nfp_pcie_user {
  *         supported. Due to this requirement and future extensions requiring
  *         new slots per process, only one secondary process is supported by
  *         now.
+ *         For Flower firmware:
+ *         NFP PMD needs another fixed slot, used as the configuration BAR
+ *         for ctrl vNIC.
  */
 static int
 nfp_enable_bars(struct nfp_pcie_user *nfp)
@@ -309,11 +316,11 @@ struct nfp_pcie_user {
 	int x, start, end;
 
 	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
-		start = 4;
-		end = 1;
+		start = NFP_BAR_MID;
+		end = NFP_BAR_MIN;
 	} else {
-		start = 7;
-		end = 4;
+		start = NFP_BAR_MAX;
+		end = NFP_BAR_MID;
 	}
 	for (x = start; x > end; x--) {
 		bar = &nfp->bar[x - 1];
@@ -341,11 +348,11 @@ struct nfp_pcie_user {
 	int x, start, end;
 
 	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
-		start = 4;
-		end = 1;
+		start = NFP_BAR_MID;
+		end = NFP_BAR_MIN;
 	} else {
-		start = 7;
-		end = 4;
+		start = NFP_BAR_MAX;
+		end = NFP_BAR_MID;
 	}
 	for (x = start; x > end; x--) {
 		bar = &nfp->bar[x - 1];
@@ -364,11 +371,11 @@ struct nfp_pcie_user {
 	int x, start, end;
 
 	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
-		start = 4;
-		end = 1;
+		start = NFP_BAR_MID;
+		end = NFP_BAR_MIN;
 	} else {
-		start = 7;
-		end = 4;
+		start = NFP_BAR_MAX;
+		end = NFP_BAR_MID;
 	}
 
 	for (x = start; x > end; x--) {
-- 
1.8.3.1



* [PATCH v5 11/12] net/nfp: move rxtx function to header file
  2022-08-05  6:32 [PATCH v5 00/12] preparation for the rte_flow offload of nfp PMD Chaoyong He
                   ` (9 preceding siblings ...)
  2022-08-05  6:32 ` [PATCH v5 10/12] net/nfp: add flower representor framework Chaoyong He
@ 2022-08-05  6:32 ` Chaoyong He
  2022-08-05  6:32 ` [PATCH v5 12/12] net/nfp: add flower PF rxtx logic Chaoyong He
  11 siblings, 0 replies; 29+ messages in thread
From: Chaoyong He @ 2022-08-05  6:32 UTC (permalink / raw)
  To: dev; +Cc: niklas.soderlund, Chaoyong He, Heinrich Kuhn

Flower makes use of the same Rx and Tx checksum logic as the normal PMD.
Expose it so that flower can make use of it.
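
As a rough illustration of the Rx checksum logic being exposed, the
standalone sketch below mirrors the decision tree of nfp_net_rx_cksum().
The flag values and the rx_cksum_flags() helper are placeholders invented
for this example, not the driver's real definitions, and the check of the
RXCSUM control bit is left out:

```c
#include <stdint.h>

/* Hypothetical stand-ins for the PCIE_DESC_RX_* descriptor flags. */
#define DESC_RX_IP4_CSUM      (1u << 0)
#define DESC_RX_IP4_CSUM_OK   (1u << 1)
#define DESC_RX_TCP_CSUM      (1u << 2)
#define DESC_RX_UDP_CSUM      (1u << 3)
#define DESC_RX_L4_CSUM_OK    (1u << 4)

/* Hypothetical stand-ins for the RTE_MBUF_F_RX_*_CKSUM_* mbuf flags. */
#define MBUF_RX_IP_CKSUM_BAD  (1u << 0)
#define MBUF_RX_IP_CKSUM_GOOD (1u << 1)
#define MBUF_RX_L4_CKSUM_BAD  (1u << 2)
#define MBUF_RX_L4_CKSUM_GOOD (1u << 3)

/* Translate Rx descriptor checksum flags into mbuf offload flags. */
static inline uint32_t
rx_cksum_flags(uint32_t desc_flags)
{
	uint32_t ol_flags = 0;

	/* IPv4 checksum was verified by hardware and found to be wrong */
	if ((desc_flags & DESC_RX_IP4_CSUM) &&
			!(desc_flags & DESC_RX_IP4_CSUM_OK))
		ol_flags |= MBUF_RX_IP_CKSUM_BAD;
	else
		ol_flags |= MBUF_RX_IP_CKSUM_GOOD;

	/* Neither TCP nor UDP: there is no L4 verdict to report */
	if (!(desc_flags & (DESC_RX_TCP_CSUM | DESC_RX_UDP_CSUM)))
		return ol_flags;

	if (desc_flags & DESC_RX_L4_CSUM_OK)
		ol_flags |= MBUF_RX_L4_CKSUM_GOOD;
	else
		ol_flags |= MBUF_RX_L4_CKSUM_BAD;

	return ol_flags;
}
```

Keeping such helpers as static inline functions in a header lets both the
normal PMD and the flower Rx path reuse them without an extra call.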

Signed-off-by: Chaoyong He <chaoyong.he@corigine.com>
Signed-off-by: Heinrich Kuhn <heinrich.kuhn@corigine.com>
Reviewed-by: Niklas Söderlund <niklas.soderlund@corigine.com>
---
 drivers/net/nfp/nfp_common.c    |  2 +-
 drivers/net/nfp/nfp_ethdev.c    |  2 +-
 drivers/net/nfp/nfp_ethdev_vf.c |  2 +-
 drivers/net/nfp/nfp_rxtx.c      | 91 +----------------------------------------
 drivers/net/nfp/nfp_rxtx.h      | 90 ++++++++++++++++++++++++++++++++++++++++
 5 files changed, 94 insertions(+), 93 deletions(-)

diff --git a/drivers/net/nfp/nfp_common.c b/drivers/net/nfp/nfp_common.c
index 0e55f0c..e86929c 100644
--- a/drivers/net/nfp/nfp_common.c
+++ b/drivers/net/nfp/nfp_common.c
@@ -38,9 +38,9 @@
 #include "nfpcore/nfp_nsp.h"
 
 #include "nfp_common.h"
+#include "nfp_ctrl.h"
 #include "nfp_rxtx.h"
 #include "nfp_logs.h"
-#include "nfp_ctrl.h"
 #include "nfp_cpp_bridge.h"
 
 #include <sys/types.h>
diff --git a/drivers/net/nfp/nfp_ethdev.c b/drivers/net/nfp/nfp_ethdev.c
index 0b88749..f2571a0 100644
--- a/drivers/net/nfp/nfp_ethdev.c
+++ b/drivers/net/nfp/nfp_ethdev.c
@@ -33,9 +33,9 @@
 #include "nfpcore/nfp_nsp.h"
 
 #include "nfp_common.h"
+#include "nfp_ctrl.h"
 #include "nfp_rxtx.h"
 #include "nfp_logs.h"
-#include "nfp_ctrl.h"
 #include "nfp_cpp_bridge.h"
 
 #include "flower/nfp_flower.h"
diff --git a/drivers/net/nfp/nfp_ethdev_vf.c b/drivers/net/nfp/nfp_ethdev_vf.c
index d304d78..ceaf618 100644
--- a/drivers/net/nfp/nfp_ethdev_vf.c
+++ b/drivers/net/nfp/nfp_ethdev_vf.c
@@ -19,9 +19,9 @@
 #include "nfpcore/nfp_rtsym.h"
 
 #include "nfp_common.h"
+#include "nfp_ctrl.h"
 #include "nfp_rxtx.h"
 #include "nfp_logs.h"
-#include "nfp_ctrl.h"
 
 static void
 nfp_netvf_read_mac(struct nfp_net_hw *hw)
diff --git a/drivers/net/nfp/nfp_rxtx.c b/drivers/net/nfp/nfp_rxtx.c
index 8d63a7b..95403a3 100644
--- a/drivers/net/nfp/nfp_rxtx.c
+++ b/drivers/net/nfp/nfp_rxtx.c
@@ -17,9 +17,9 @@
 #include <ethdev_pci.h>
 
 #include "nfp_common.h"
+#include "nfp_ctrl.h"
 #include "nfp_rxtx.h"
 #include "nfp_logs.h"
-#include "nfp_ctrl.h"
 #include "nfpcore/nfp_mip.h"
 #include "nfpcore/nfp_rtsym.h"
 #include "nfpcore/nfp-common/nfp_platform.h"
@@ -208,34 +208,6 @@
 	}
 }
 
-/* nfp_net_rx_cksum - set mbuf checksum flags based on RX descriptor flags */
-static inline void
-nfp_net_rx_cksum(struct nfp_net_rxq *rxq, struct nfp_net_rx_desc *rxd,
-		 struct rte_mbuf *mb)
-{
-	struct nfp_net_hw *hw = rxq->hw;
-
-	if (!(hw->ctrl & NFP_NET_CFG_CTRL_RXCSUM))
-		return;
-
-	/* If IPv4 and IP checksum error, fail */
-	if (unlikely((rxd->rxd.flags & PCIE_DESC_RX_IP4_CSUM) &&
-	    !(rxd->rxd.flags & PCIE_DESC_RX_IP4_CSUM_OK)))
-		mb->ol_flags |= RTE_MBUF_F_RX_IP_CKSUM_BAD;
-	else
-		mb->ol_flags |= RTE_MBUF_F_RX_IP_CKSUM_GOOD;
-
-	/* If neither UDP nor TCP return */
-	if (!(rxd->rxd.flags & PCIE_DESC_RX_TCP_CSUM) &&
-	    !(rxd->rxd.flags & PCIE_DESC_RX_UDP_CSUM))
-		return;
-
-	if (likely(rxd->rxd.flags & PCIE_DESC_RX_L4_CSUM_OK))
-		mb->ol_flags |= RTE_MBUF_F_RX_L4_CKSUM_GOOD;
-	else
-		mb->ol_flags |= RTE_MBUF_F_RX_L4_CKSUM_BAD;
-}
-
 /*
  * RX path design:
  *
@@ -768,67 +740,6 @@
 	return 0;
 }
 
-/* nfp_net_tx_tso - Set TX descriptor for TSO */
-static inline void
-nfp_net_nfd3_tx_tso(struct nfp_net_txq *txq, struct nfp_net_nfd3_tx_desc *txd,
-	       struct rte_mbuf *mb)
-{
-	uint64_t ol_flags;
-	struct nfp_net_hw *hw = txq->hw;
-
-	if (!(hw->cap & NFP_NET_CFG_CTRL_LSO_ANY))
-		goto clean_txd;
-
-	ol_flags = mb->ol_flags;
-
-	if (!(ol_flags & RTE_MBUF_F_TX_TCP_SEG))
-		goto clean_txd;
-
-	txd->l3_offset = mb->l2_len;
-	txd->l4_offset = mb->l2_len + mb->l3_len;
-	txd->lso_hdrlen = mb->l2_len + mb->l3_len + mb->l4_len;
-	txd->mss = rte_cpu_to_le_16(mb->tso_segsz);
-	txd->flags = PCIE_DESC_TX_LSO;
-	return;
-
-clean_txd:
-	txd->flags = 0;
-	txd->l3_offset = 0;
-	txd->l4_offset = 0;
-	txd->lso_hdrlen = 0;
-	txd->mss = 0;
-}
-
-/* nfp_net_tx_cksum - Set TX CSUM offload flags in TX descriptor */
-static inline void
-nfp_net_nfd3_tx_cksum(struct nfp_net_txq *txq, struct nfp_net_nfd3_tx_desc *txd,
-		 struct rte_mbuf *mb)
-{
-	uint64_t ol_flags;
-	struct nfp_net_hw *hw = txq->hw;
-
-	if (!(hw->cap & NFP_NET_CFG_CTRL_TXCSUM))
-		return;
-
-	ol_flags = mb->ol_flags;
-
-	/* IPv6 does not need checksum */
-	if (ol_flags & RTE_MBUF_F_TX_IP_CKSUM)
-		txd->flags |= PCIE_DESC_TX_IP4_CSUM;
-
-	switch (ol_flags & RTE_MBUF_F_TX_L4_MASK) {
-	case RTE_MBUF_F_TX_UDP_CKSUM:
-		txd->flags |= PCIE_DESC_TX_UDP_CSUM;
-		break;
-	case RTE_MBUF_F_TX_TCP_CKSUM:
-		txd->flags |= PCIE_DESC_TX_TCP_CSUM;
-		break;
-	}
-
-	if (ol_flags & (RTE_MBUF_F_TX_IP_CKSUM | RTE_MBUF_F_TX_L4_MASK))
-		txd->flags |= PCIE_DESC_TX_CSUM;
-}
-
 uint16_t
 nfp_net_nfd3_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
 {
diff --git a/drivers/net/nfp/nfp_rxtx.h b/drivers/net/nfp/nfp_rxtx.h
index a30171f..cd70bdd 100644
--- a/drivers/net/nfp/nfp_rxtx.h
+++ b/drivers/net/nfp/nfp_rxtx.h
@@ -360,6 +360,96 @@ struct nfp_net_rxq {
 	return (nfp_net_nfd3_free_tx_desc(txq) < txq->tx_free_thresh);
 }
 
+/* Set mbuf checksum flags based on RX descriptor flags */
+static inline void
+nfp_net_rx_cksum(struct nfp_net_rxq *rxq, struct nfp_net_rx_desc *rxd,
+		 struct rte_mbuf *mb)
+{
+	struct nfp_net_hw *hw = rxq->hw;
+
+	if (!(hw->ctrl & NFP_NET_CFG_CTRL_RXCSUM))
+		return;
+
+	/* If IPv4 and IP checksum error, fail */
+	if (unlikely((rxd->rxd.flags & PCIE_DESC_RX_IP4_CSUM) &&
+			!(rxd->rxd.flags & PCIE_DESC_RX_IP4_CSUM_OK)))
+		mb->ol_flags |= RTE_MBUF_F_RX_IP_CKSUM_BAD;
+	else
+		mb->ol_flags |= RTE_MBUF_F_RX_IP_CKSUM_GOOD;
+
+	/* If neither UDP nor TCP return */
+	if (!(rxd->rxd.flags & PCIE_DESC_RX_TCP_CSUM) &&
+			!(rxd->rxd.flags & PCIE_DESC_RX_UDP_CSUM))
+		return;
+
+	if (likely(rxd->rxd.flags & PCIE_DESC_RX_L4_CSUM_OK))
+		mb->ol_flags |= RTE_MBUF_F_RX_L4_CKSUM_GOOD;
+	else
+		mb->ol_flags |= RTE_MBUF_F_RX_L4_CKSUM_BAD;
+}
+
+/* Set NFD3 TX descriptor for TSO */
+static inline void
+nfp_net_nfd3_tx_tso(struct nfp_net_txq *txq,
+		struct nfp_net_nfd3_tx_desc *txd,
+		struct rte_mbuf *mb)
+{
+	uint64_t ol_flags;
+	struct nfp_net_hw *hw = txq->hw;
+
+	if (!(hw->cap & NFP_NET_CFG_CTRL_LSO_ANY))
+		goto clean_txd;
+
+	ol_flags = mb->ol_flags;
+
+	if (!(ol_flags & RTE_MBUF_F_TX_TCP_SEG))
+		goto clean_txd;
+
+	txd->l3_offset = mb->l2_len;
+	txd->l4_offset = mb->l2_len + mb->l3_len;
+	txd->lso_hdrlen = mb->l2_len + mb->l3_len + mb->l4_len;
+	txd->mss = rte_cpu_to_le_16(mb->tso_segsz);
+	txd->flags = PCIE_DESC_TX_LSO;
+	return;
+
+clean_txd:
+	txd->flags = 0;
+	txd->l3_offset = 0;
+	txd->l4_offset = 0;
+	txd->lso_hdrlen = 0;
+	txd->mss = 0;
+}
+
+/* Set TX CSUM offload flags in NFD3 TX descriptor */
+static inline void
+nfp_net_nfd3_tx_cksum(struct nfp_net_txq *txq, struct nfp_net_nfd3_tx_desc *txd,
+		 struct rte_mbuf *mb)
+{
+	uint64_t ol_flags;
+	struct nfp_net_hw *hw = txq->hw;
+
+	if (!(hw->cap & NFP_NET_CFG_CTRL_TXCSUM))
+		return;
+
+	ol_flags = mb->ol_flags;
+
+	/* IPv6 does not need checksum */
+	if (ol_flags & RTE_MBUF_F_TX_IP_CKSUM)
+		txd->flags |= PCIE_DESC_TX_IP4_CSUM;
+
+	switch (ol_flags & RTE_MBUF_F_TX_L4_MASK) {
+	case RTE_MBUF_F_TX_UDP_CKSUM:
+		txd->flags |= PCIE_DESC_TX_UDP_CSUM;
+		break;
+	case RTE_MBUF_F_TX_TCP_CKSUM:
+		txd->flags |= PCIE_DESC_TX_TCP_CSUM;
+		break;
+	}
+
+	if (ol_flags & (RTE_MBUF_F_TX_IP_CKSUM | RTE_MBUF_F_TX_L4_MASK))
+		txd->flags |= PCIE_DESC_TX_CSUM;
+}
+
 int nfp_net_rx_freelist_setup(struct rte_eth_dev *dev);
 uint32_t nfp_net_rx_queue_count(void *rx_queue);
 uint16_t nfp_net_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
-- 
1.8.3.1



* [PATCH v5 12/12] net/nfp: add flower PF rxtx logic
  2022-08-05  6:32 [PATCH v5 00/12] preparation for the rte_flow offload of nfp PMD Chaoyong He
                   ` (10 preceding siblings ...)
  2022-08-05  6:32 ` [PATCH v5 11/12] net/nfp: move rxtx function to header file Chaoyong He
@ 2022-08-05  6:32 ` Chaoyong He
  11 siblings, 0 replies; 29+ messages in thread
From: Chaoyong He @ 2022-08-05  6:32 UTC (permalink / raw)
  To: dev; +Cc: niklas.soderlund, Chaoyong He

This commit implements the flower Rx logic. Fallback packets are
multiplexed to the correct representor port based on the prepended
metadata. The Rx poll is set to run on the existing service
infrastructure.

For Tx the existing NFP Tx logic is duplicated to keep the two Tx paths
distinct. Flower fallback also adds 8 bytes of metadata to the start of
the packet that has to be adjusted for in the Tx descriptor.
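
The Rx demultiplexing step can be sketched in isolation as follows. This
is only an illustration under stated assumptions: the META_* values and
the helper names are made up for the example, an 8-byte layout (one info
word followed by one port id) is assumed, and the length check is an
addition that the driver code does not perform in this form:

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed values for illustration only; the driver has its own defines. */
#define META_FIELD_SIZE 4    /* bits used per field type in the info word */
#define META_FIELD_MASK 0xfu
#define META_PORTID     5u

/* Read a 32-bit big-endian value from a byte buffer. */
static uint32_t
load_be32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
			((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Walk the prepended metadata: one info word listing the field types,
 * followed by one 32-bit value per field. Returns 0 and stores the port
 * id if a port id field was found, -1 otherwise.
 */
static int
parse_meta_portid(const uint8_t *meta, size_t meta_len, uint32_t *portid)
{
	uint32_t info;
	size_t off = 4;
	int found = -1;

	if (meta_len < 4)
		return -1;

	info = load_be32(meta);
	while (info != 0 && off + 4 <= meta_len) {
		/* Unknown field type: stop parsing */
		if ((info & META_FIELD_MASK) != META_PORTID)
			break;

		*portid = load_be32(meta + off);
		found = 0;
		off += 4;
		info >>= META_FIELD_SIZE;
	}

	return found;
}
```

The port id recovered this way is what selects the representor ring the
mbuf is enqueued to; packets without a matching representor are handed
back to the caller of the PF Rx burst function.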

Signed-off-by: Chaoyong He <chaoyong.he@corigine.com>
Reviewed-by: Niklas Söderlund <niklas.soderlund@corigine.com>
---
 drivers/net/nfp/flower/nfp_flower.c | 428 ++++++++++++++++++++++++++++++++++++
 drivers/net/nfp/flower/nfp_flower.h |   1 +
 2 files changed, 429 insertions(+)

diff --git a/drivers/net/nfp/flower/nfp_flower.c b/drivers/net/nfp/flower/nfp_flower.c
index d7772d6..bc0acff 100644
--- a/drivers/net/nfp/flower/nfp_flower.c
+++ b/drivers/net/nfp/flower/nfp_flower.c
@@ -23,6 +23,7 @@
 #include "nfp_flower_ovs_compat.h"
 #include "nfp_flower_ctrl.h"
 #include "nfp_flower_representor.h"
+#include "nfp_flower_cmsg.h"
 
 #define MAX_PKT_BURST 32
 #define MEMPOOL_CACHE_SIZE 512
@@ -218,6 +219,44 @@
 	.link_update            = nfp_flower_pf_link_update,
 };
 
+static void
+nfp_flower_pf_vnic_poll(struct nfp_app_flower *app_flower)
+{
+	uint16_t i;
+	uint16_t count;
+	uint16_t n_rxq;
+	uint16_t port_id;
+	struct nfp_net_hw *pf_hw;
+	struct rte_mbuf *pkts_burst[MAX_PKT_BURST];
+
+	pf_hw = app_flower->pf_hw;
+
+	n_rxq = pf_hw->eth_dev->data->nb_rx_queues;
+	port_id = pf_hw->eth_dev->data->port_id;
+
+	/* Add ability to run Rx queues on multiple service cores? */
+	for (i = 0; i < n_rxq; i++) {
+		count = rte_eth_rx_burst(port_id, i, pkts_burst, MAX_PKT_BURST);
+
+		/*
+		 * Don't expect packets here but free them in case they have
+		 * not been multiplexed to a representor
+		 */
+		if (unlikely(count > 0))
+			rte_pktmbuf_free_bulk(pkts_burst, count);
+	}
+}
+
+static int
+nfp_flower_pf_vnic_service(void *arg)
+{
+	struct nfp_app_flower *app_flower = arg;
+
+	nfp_flower_pf_vnic_poll(app_flower);
+
+	return 0;
+}
+
 static int
 nfp_flower_ctrl_vnic_service(void *arg)
 {
@@ -229,6 +268,10 @@
 }
 
 static struct rte_service_spec flower_services[NFP_FLOWER_SERVICE_MAX] = {
+	[NFP_FLOWER_SERVICE_PF] = {
+		.name         = "flower_pf_vnic_service",
+		.callback     = nfp_flower_pf_vnic_service,
+	},
 	[NFP_FLOWER_SERVICE_CTRL] = {
 		.name         = "flower_ctrl_vnic_service",
 		.callback     = nfp_flower_ctrl_vnic_service,
@@ -265,6 +308,387 @@
 	return ret;
 }
 
+static inline void
+nfp_flower_parse_metadata(struct nfp_net_rxq *rxq,
+		struct nfp_net_rx_desc *rxd,
+		struct rte_mbuf *mbuf,
+		uint32_t *portid)
+{
+	uint32_t meta_info;
+	uint8_t *meta_offset;
+	struct nfp_net_hw *hw;
+
+	hw = rxq->hw;
+	if (!((hw->ctrl & NFP_NET_CFG_CTRL_RSS) ||
+			(hw->ctrl & NFP_NET_CFG_CTRL_RSS2)))
+		return;
+
+	meta_offset = rte_pktmbuf_mtod(mbuf, uint8_t *);
+	meta_offset -= NFP_DESC_META_LEN(rxd);
+	meta_info = rte_be_to_cpu_32(*(uint32_t *)meta_offset);
+	meta_offset += 4;
+
+	while (meta_info) {
+		switch (meta_info & NFP_NET_META_FIELD_MASK) {
+		/* Expect flower firmware to only send packets with META_PORTID */
+		case NFP_NET_META_PORTID:
+			*portid = rte_be_to_cpu_32(*(uint32_t *)meta_offset);
+			meta_offset += 4;
+			meta_info >>= NFP_NET_META_FIELD_SIZE;
+			break;
+		default:
+			/* Unsupported metadata can be a performance issue */
+			return;
+		}
+	}
+}
+
+static inline struct nfp_flower_representor *
+nfp_flower_get_repr(struct nfp_net_hw *hw,
+		uint32_t port_id)
+{
+	uint8_t port;
+	struct nfp_app_flower *app_flower;
+
+	/* Obtain handle to app_flower here */
+	app_flower = NFP_APP_PRIV_TO_APP_FLOWER(hw->pf_dev->app_priv);
+
+	switch (NFP_FLOWER_CMSG_PORT_TYPE(port_id)) {
+	case NFP_FLOWER_CMSG_PORT_TYPE_PHYS_PORT:
+		port =  NFP_FLOWER_CMSG_PORT_PHYS_PORT_NUM(port_id);
+		return app_flower->phy_reprs[port];
+	case NFP_FLOWER_CMSG_PORT_TYPE_PCIE_PORT:
+		port = NFP_FLOWER_CMSG_PORT_VNIC(port_id);
+		return app_flower->vf_reprs[port];
+	default:
+		break;
+	}
+
+	return NULL;
+}
+
+static uint16_t
+nfp_flower_pf_recv_pkts(void *rx_queue,
+		struct rte_mbuf **rx_pkts,
+		uint16_t nb_pkts)
+{
+	/*
+	 * We need different counters for packets given to the caller
+	 * and packets sent to representors
+	 */
+	int avail = 0;
+	int avail_multiplexed = 0;
+	uint64_t dma_addr;
+	uint32_t meta_portid;
+	uint16_t nb_hold = 0;
+	struct rte_mbuf *mb;
+	struct nfp_net_hw *hw;
+	struct rte_mbuf *new_mb;
+	struct nfp_net_rxq *rxq;
+	struct nfp_net_rx_buff *rxb;
+	struct nfp_net_rx_desc *rxds;
+	struct nfp_flower_representor *repr;
+
+	rxq = rx_queue;
+	if (unlikely(rxq == NULL)) {
+		/*
+		 * DPDK just checks the queue is lower than max queues
+		 * enabled. But the queue needs to be configured
+		 */
+		RTE_LOG_DP(ERR, PMD, "RX Bad queue\n");
+		return 0;
+	}
+
+	hw = rxq->hw;
+
+	/*
+	 * This is tunable as we could allow to receive more packets than
+	 * requested if most are multiplexed.
+	 */
+	while (avail + avail_multiplexed < nb_pkts) {
+		rxb = &rxq->rxbufs[rxq->rd_p];
+		if (unlikely(rxb == NULL)) {
+			RTE_LOG_DP(ERR, PMD, "rxb does not exist!\n");
+			break;
+		}
+
+		rxds = &rxq->rxds[rxq->rd_p];
+		if ((rxds->rxd.meta_len_dd & PCIE_DESC_RX_DD) == 0)
+			break;
+
+		/*
+		 * Memory barrier to ensure that we won't do other
+		 * reads before the DD bit.
+		 */
+		rte_rmb();
+
+		/*
+		 * We got a packet. Let's alloc a new mbuf for refilling the
+		 * free descriptor ring as soon as possible
+		 */
+		new_mb = rte_pktmbuf_alloc(rxq->mem_pool);
+		if (unlikely(new_mb == NULL)) {
+			RTE_LOG_DP(DEBUG, PMD,
+			"RX mbuf alloc failed port_id=%u queue_id=%d\n",
+				rxq->port_id, rxq->qidx);
+			nfp_net_mbuf_alloc_failed(rxq);
+			break;
+		}
+
+		nb_hold++;
+
+		/*
+		 * Grab the mbuf and refill the descriptor with the
+		 * previously allocated mbuf
+		 */
+		mb = rxb->mbuf;
+		rxb->mbuf = new_mb;
+
+		PMD_RX_LOG(DEBUG, "Packet len: %u, mbuf_size: %u",
+			   rxds->rxd.data_len, rxq->mbuf_size);
+
+		/* Size of this segment */
+		mb->data_len = rxds->rxd.data_len - NFP_DESC_META_LEN(rxds);
+		/* Size of the whole packet. We just support 1 segment */
+		mb->pkt_len = rxds->rxd.data_len - NFP_DESC_META_LEN(rxds);
+
+		if (unlikely((mb->data_len + hw->rx_offset) > rxq->mbuf_size)) {
+			/*
+			 * This should not happen and the user has the
+			 * responsibility of avoiding it. But we have
+			 * to give some info about the error
+			 */
+			RTE_LOG_DP(ERR, PMD,
+				"mbuf overflow likely due to the RX offset.\n"
+				"\t\tYour mbuf size should have extra space for"
+				" RX offset=%u bytes.\n"
+				"\t\tCurrently you just have %u bytes available"
+				" but the received packet is %u bytes long",
+				hw->rx_offset,
+				rxq->mbuf_size - hw->rx_offset,
+				mb->data_len);
+			return 0;
+		}
+
+		/* Filling the received mbuf with packet info */
+		if (hw->rx_offset)
+			mb->data_off = RTE_PKTMBUF_HEADROOM + hw->rx_offset;
+		else
+			mb->data_off = RTE_PKTMBUF_HEADROOM +
+					NFP_DESC_META_LEN(rxds);
+
+		/* No scatter mode supported */
+		mb->nb_segs = 1;
+		mb->next = NULL;
+
+		mb->port = rxq->port_id;
+		meta_portid = 0;
+
+		/* Checking the RSS flag */
+		nfp_flower_parse_metadata(rxq, rxds, mb, &meta_portid);
+		PMD_RX_LOG(DEBUG, "Received from port %u type %u",
+				NFP_FLOWER_CMSG_PORT_VNIC(meta_portid),
+				NFP_FLOWER_CMSG_PORT_VNIC_TYPE(meta_portid));
+
+		/* Checking the checksum flag */
+		nfp_net_rx_cksum(rxq, rxds, mb);
+
+		if ((rxds->rxd.flags & PCIE_DESC_RX_VLAN) &&
+				(hw->ctrl & NFP_NET_CFG_CTRL_RXVLAN)) {
+			mb->vlan_tci = rte_cpu_to_le_32(rxds->rxd.vlan);
+			mb->ol_flags |= RTE_MBUF_F_RX_VLAN |
+					RTE_MBUF_F_RX_VLAN_STRIPPED;
+		}
+
+		repr = nfp_flower_get_repr(hw, meta_portid);
+		if (repr && repr->ring) {
+			PMD_RX_LOG(DEBUG, "Using representor %s", repr->name);
+			rte_ring_enqueue(repr->ring, (void *)mb);
+			avail_multiplexed++;
+		} else if (repr) {
+			PMD_RX_LOG(ERR, "[%u] No ring available for repr_port %s",
+					hw->idx, repr->name);
+			PMD_RX_LOG(DEBUG, "Adding the mbuf to the mbuf array passed by the app");
+			rx_pkts[avail++] = mb;
+		} else {
+			PMD_RX_LOG(DEBUG, "Adding the mbuf to the mbuf array passed by the app");
+			rx_pkts[avail++] = mb;
+		}
+
+		/* Now resetting and updating the descriptor */
+		rxds->vals[0] = 0;
+		rxds->vals[1] = 0;
+		dma_addr = rte_cpu_to_le_64(RTE_MBUF_DMA_ADDR_DEFAULT(new_mb));
+		rxds->fld.dd = 0;
+		rxds->fld.dma_addr_hi = (dma_addr >> 32) & 0xff;
+		rxds->fld.dma_addr_lo = dma_addr & 0xffffffff;
+
+		rxq->rd_p++;
+		if (unlikely(rxq->rd_p == rxq->rx_count)) /* wrapping?*/
+			rxq->rd_p = 0;
+	}
+
+	if (nb_hold == 0)
+		return nb_hold;
+
+	PMD_RX_LOG(DEBUG, "RX port_id=%u queue_id=%d, %d packets received",
+		   rxq->port_id, rxq->qidx, nb_hold);
+
+	nb_hold += rxq->nb_rx_hold;
+
+	/*
+	 * FL descriptors need to be written before incrementing the
+	 * FL queue WR pointer
+	 */
+	rte_wmb();
+	if (nb_hold > rxq->rx_free_thresh) {
+		PMD_RX_LOG(DEBUG, "port=%u queue=%d nb_hold=%u avail=%d",
+			   rxq->port_id, rxq->qidx, nb_hold, avail);
+		nfp_qcp_ptr_add(rxq->qcp_fl, NFP_QCP_WRITE_PTR, nb_hold);
+		nb_hold = 0;
+	}
+
+	rxq->nb_rx_hold = nb_hold;
+
+	return avail;
+}
+
+static uint16_t
+nfp_flower_pf_xmit_pkts(void *tx_queue,
+		struct rte_mbuf **tx_pkts,
+		uint16_t nb_pkts)
+{
+	int i;
+	int pkt_size;
+	int dma_size;
+	uint64_t dma_addr;
+	uint16_t free_descs;
+	uint16_t issued_descs;
+	struct rte_mbuf *pkt;
+	struct nfp_net_hw *hw;
+	struct rte_mbuf **lmbuf;
+	struct nfp_net_txq *txq;
+	struct nfp_net_nfd3_tx_desc txd;
+	struct nfp_net_nfd3_tx_desc *txds;
+
+	txq = tx_queue;
+	hw = txq->hw;
+	txds = &txq->txds[txq->wr_p];
+
+	PMD_TX_LOG(DEBUG, "working for queue %d at pos %u and %u packets",
+			txq->qidx, txq->wr_p, nb_pkts);
+
+	if ((nfp_net_nfd3_free_tx_desc(txq) < nb_pkts) || (nfp_net_nfd3_txq_full(txq)))
+		nfp_net_tx_free_bufs(txq);
+
+	free_descs = (uint16_t)nfp_net_nfd3_free_tx_desc(txq);
+	if (unlikely(free_descs == 0))
+		return 0;
+
+	pkt = *tx_pkts;
+
+	i = 0;
+	issued_descs = 0;
+
+	/* Sending packets */
+	while ((i < nb_pkts) && free_descs) {
+		/* Grabbing the mbuf linked to the current descriptor */
+		lmbuf = &txq->txbufs[txq->wr_p].mbuf;
+		/* Warming the cache for releasing the mbuf later on */
+		RTE_MBUF_PREFETCH_TO_FREE(*lmbuf);
+
+		pkt = *(tx_pkts + i);
+
+		if (unlikely(pkt->nb_segs > 1 &&
+				!(hw->cap & NFP_NET_CFG_CTRL_GATHER))) {
+			PMD_INIT_LOG(INFO, "NFP_NET_CFG_CTRL_GATHER not set");
+			PMD_INIT_LOG(INFO, "Multisegment packet unsupported");
+			goto xmit_end;
+		}
+
+		/* Checking if we have enough descriptors */
+		if (unlikely(pkt->nb_segs > free_descs))
+			goto xmit_end;
+
+		/*
+		 * Checksum and VLAN flags just in the first descriptor for a
+		 * multisegment packet, but TSO info needs to be in all of them.
+		 */
+		txd.data_len = pkt->pkt_len;
+		nfp_net_nfd3_tx_tso(txq, &txd, pkt);
+		nfp_net_nfd3_tx_cksum(txq, &txd, pkt);
+
+		if ((pkt->ol_flags & RTE_MBUF_F_TX_VLAN) &&
+				(hw->cap & NFP_NET_CFG_CTRL_TXVLAN)) {
+			txd.flags |= PCIE_DESC_TX_VLAN;
+			txd.vlan = pkt->vlan_tci;
+		}
+
+		/*
+		 * mbuf data_len is the data in one segment and pkt_len data
+		 * in the whole packet. When the packet is just one segment,
+		 * then data_len = pkt_len
+		 */
+		pkt_size = pkt->pkt_len;
+
+		while (pkt) {
+			/* Copying TSO, VLAN and cksum info */
+			*txds = txd;
+
+			/* Releasing mbuf used by this descriptor previously*/
+			if (*lmbuf)
+				rte_pktmbuf_free_seg(*lmbuf);
+
+			/*
+			 * Linking mbuf with descriptor for being released
+			 * next time descriptor is used
+			 */
+			*lmbuf = pkt;
+
+			dma_size = pkt->data_len;
+			dma_addr = rte_mbuf_data_iova(pkt);
+
+			/* Filling descriptors fields */
+			txds->dma_len = dma_size;
+			txds->data_len = txd.data_len;
+			txds->dma_addr_hi = (dma_addr >> 32) & 0xff;
+			txds->dma_addr_lo = (dma_addr & 0xffffffff);
+			ASSERT(free_descs > 0);
+			free_descs--;
+
+			txq->wr_p++;
+			if (unlikely(txq->wr_p == txq->tx_count)) /* wrapping?*/
+				txq->wr_p = 0;
+
+			pkt_size -= dma_size;
+
+			/*
+			 * Making the EOP, packets with just one segment
+			 * the priority
+			 */
+			if (likely(!pkt_size))
+				txds->offset_eop = PCIE_DESC_TX_EOP |
+						FLOWER_PKT_DATA_OFFSET;
+			else
+				txds->offset_eop = 0;
+
+			pkt = pkt->next;
+			/* Referencing next free TX descriptor */
+			txds = &txq->txds[txq->wr_p];
+			lmbuf = &txq->txbufs[txq->wr_p].mbuf;
+			issued_descs++;
+		}
+		i++;
+	}
+
+xmit_end:
+	/* Increment write pointers. Force memory write before we let HW know */
+	rte_wmb();
+	nfp_qcp_ptr_add(txq->qcp_q, NFP_QCP_WRITE_PTR, issued_descs);
+
+	return i;
+}
+
 static void
 nfp_flower_pf_mp_init(__rte_unused struct rte_mempool *mp,
 		__rte_unused void *opaque_arg,
@@ -498,6 +922,8 @@
 
 	/* Add Rx/Tx functions */
 	eth_dev->dev_ops = &nfp_flower_pf_dev_ops;
+	eth_dev->rx_pkt_burst = nfp_flower_pf_recv_pkts;
+	eth_dev->tx_pkt_burst = nfp_flower_pf_xmit_pkts;
 
 	/* PF vNIC gets a random MAC */
 	eth_dev->data->mac_addrs = rte_zmalloc("mac_addr",
@@ -1117,6 +1543,8 @@
 
 	eth_dev->process_private = cpp;
 	eth_dev->dev_ops = &nfp_flower_pf_dev_ops;
+	eth_dev->rx_pkt_burst = nfp_flower_pf_recv_pkts;
+	eth_dev->tx_pkt_burst = nfp_flower_pf_xmit_pkts;
 	rte_eth_dev_probing_finish(eth_dev);
 
 	return 0;
diff --git a/drivers/net/nfp/flower/nfp_flower.h b/drivers/net/nfp/flower/nfp_flower.h
index 24fced3..4ac89d1 100644
--- a/drivers/net/nfp/flower/nfp_flower.h
+++ b/drivers/net/nfp/flower/nfp_flower.h
@@ -7,6 +7,7 @@
 #define _NFP_FLOWER_H_
 
 enum nfp_flower_service {
+	NFP_FLOWER_SERVICE_PF,
 	NFP_FLOWER_SERVICE_CTRL,
 	NFP_FLOWER_SERVICE_MAX
 };
-- 
1.8.3.1



* Re: [PATCH v5 01/12] net/nfp: move app specific attributes to own struct
  2022-08-05  6:32 ` [PATCH v5 01/12] net/nfp: move app specific attributes to own struct Chaoyong He
@ 2022-08-05 10:49   ` Andrew Rybchenko
  0 siblings, 0 replies; 29+ messages in thread
From: Andrew Rybchenko @ 2022-08-05 10:49 UTC (permalink / raw)
  To: Chaoyong He, dev; +Cc: niklas.soderlund, Heinrich Kuhn

On 8/5/22 09:32, Chaoyong He wrote:
> The NFP Card can load different firmware applications. Currently
> only the CoreNIC application is supported. This commit makes
> needed infrastructure changes in order to support other firmware
> applications too.
> 
> Clearer separation is made between the PF device and any application
> specific concepts. The PF struct is now generic regardless of the
> application loaded. A new struct is also made for the CoreNIC
> application. Future additions to support other applications should
> also add an application-specific struct.
> 
> Signed-off-by: Chaoyong He <chaoyong.he@corigine.com>
> Signed-off-by: Heinrich Kuhn <heinrich.kuhn@corigine.com>
> Reviewed-by: Niklas Söderlund <niklas.soderlund@corigine.com>
> ---
>   drivers/net/nfp/nfp_common.h |  33 +++++++-
>   drivers/net/nfp/nfp_ethdev.c | 196 +++++++++++++++++++++++++++----------------
>   2 files changed, 154 insertions(+), 75 deletions(-)
> 
> diff --git a/drivers/net/nfp/nfp_common.h b/drivers/net/nfp/nfp_common.h
> index 6d917e4..2aaf1d6 100644
> --- a/drivers/net/nfp/nfp_common.h
> +++ b/drivers/net/nfp/nfp_common.h
> @@ -111,6 +111,14 @@
>   #include <linux/types.h>
>   #include <rte_io.h>
>   
> +/* Firmware application ID's */
> +enum nfp_app_id {
> +	NFP_APP_CORE_NIC               = 0x1,

[snip]

> +	NFP_APP_BPF_NIC                = 0x2,
> +	NFP_APP_FLOWER_NIC             = 0x3,
> +	NFP_APP_ACTIVE_BUFFER_MGMT_NIC = 0x4,

The three defines above are unused in this patch. I think subsequent
patches would be more self-contained if the corresponding defines were
added later, when they are actually required and used.

> +};
> +
>   /* nfp_qcp_ptr - Read or Write Pointer of a queue */
>   enum nfp_qcp_ptr {
>   	NFP_QCP_READ_PTR = 0,
> @@ -121,8 +129,10 @@ struct nfp_pf_dev {
>   	/* Backpointer to associated pci device */
>   	struct rte_pci_device *pci_dev;
>   
> -	/* Array of physical ports belonging to this PF */
> -	struct nfp_net_hw *ports[NFP_MAX_PHYPORTS];
> +	enum nfp_app_id app_id;
> +
> +	/* Pointer to the app running on the PF */
> +	void *app_priv;
>   
>   	/* Current values for control */
>   	uint32_t ctrl;
> @@ -151,8 +161,6 @@ struct nfp_pf_dev {
>   	struct nfp_cpp_area *msix_area;
>   
>   	uint8_t *hw_queues;
> -	uint8_t total_phyports;
> -	bool	multiport;
>   
>   	union eth_table_entry *eth_table;
>   
> @@ -161,6 +169,20 @@ struct nfp_pf_dev {
>   	uint32_t nfp_cpp_service_id;
>   };
>   
> +struct nfp_app_nic {
> +	/* Backpointer to the PF device */
> +	struct nfp_pf_dev *pf_dev;
> +
> +	/*
> +	 * Array of physical ports belonging to the this CoreNIC app
> +	 * This is really a list of vNIC's. One for each physical port
> +	 */
> +	struct nfp_net_hw *ports[NFP_MAX_PHYPORTS];
> +
> +	bool multiport;
> +	uint8_t total_phyports;
> +};
> +
>   struct nfp_net_hw {
>   	/* Backpointer to the PF this port belongs to */
>   	struct nfp_pf_dev *pf_dev;
> @@ -424,6 +446,9 @@ int nfp_net_rss_hash_conf_get(struct rte_eth_dev *dev,
>   #define NFP_NET_DEV_PRIVATE_TO_PF(dev_priv)\
>   	(((struct nfp_net_hw *)dev_priv)->pf_dev)
>   
> +#define NFP_APP_PRIV_TO_APP_NIC(app_priv)\
> +	((struct nfp_app_nic *)app_priv)
> +

Wouldn't it be better to use a tiny function instead?
It should accept a struct nfp_pf_dev pointer as its input argument.
That would allow it to validate that pf_dev->app_id is NFP_APP_CORE_NIC
and make the code more robust.
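
A rough, untested sketch of what I mean is below. The helper name
nfp_pf_dev_to_app_nic() is hypothetical, and the structs are cut down to
stand-ins so that the snippet compiles on its own; the real definitions
are the ones in nfp_common.h.

```c
#include <assert.h>
#include <stddef.h>

/* Reduced stand-ins for the definitions in nfp_common.h (illustration only). */
enum nfp_app_id {
	NFP_APP_CORE_NIC = 0x1,
	NFP_APP_FLOWER_NIC = 0x3,
};

struct nfp_app_nic {
	int total_phyports;
};

struct nfp_pf_dev {
	enum nfp_app_id app_id;
	void *app_priv;
};

/*
 * Hypothetical replacement for NFP_APP_PRIV_TO_APP_NIC(): return the CoreNIC
 * private data, or NULL if the PF is missing or runs another application.
 */
static inline struct nfp_app_nic *
nfp_pf_dev_to_app_nic(const struct nfp_pf_dev *pf_dev)
{
	if (pf_dev == NULL || pf_dev->app_id != NFP_APP_CORE_NIC)
		return NULL;

	return pf_dev->app_priv;
}
```

Callers would then check the result for NULL instead of casting app_priv
blindly, so a wrong app type is caught at the call site.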

>   #endif /* _NFP_COMMON_H_ */
>   /*
>    * Local variables:
> diff --git a/drivers/net/nfp/nfp_ethdev.c b/drivers/net/nfp/nfp_ethdev.c
> index 5cdd34e..3c4b0ac 100644
> --- a/drivers/net/nfp/nfp_ethdev.c
> +++ b/drivers/net/nfp/nfp_ethdev.c
> @@ -39,15 +39,15 @@
>   #include "nfp_cpp_bridge.h"
>   
>   static int
> -nfp_net_pf_read_mac(struct nfp_pf_dev *pf_dev, int port)
> +nfp_net_pf_read_mac(struct nfp_app_nic *app_nic, int port)
>   {
>   	struct nfp_eth_table *nfp_eth_table;
>   	struct nfp_net_hw *hw = NULL;
>   
>   	/* Grab a pointer to the correct physical port */
> -	hw = pf_dev->ports[port];
> +	hw = app_nic->ports[port];
>   
> -	nfp_eth_table = nfp_eth_read_ports(pf_dev->cpp);
> +	nfp_eth_table = nfp_eth_read_ports(app_nic->pf_dev->cpp);
>   
>   	nfp_eth_copy_mac((uint8_t *)&hw->mac_addr,
>   			 (uint8_t *)&nfp_eth_table->ports[port].mac_addr);
> @@ -64,6 +64,7 @@
>   	uint32_t new_ctrl, update = 0;
>   	struct nfp_net_hw *hw;
>   	struct nfp_pf_dev *pf_dev;
> +	struct nfp_app_nic *app_nic;
>   	struct rte_eth_conf *dev_conf;
>   	struct rte_eth_rxmode *rxmode;
>   	uint32_t intr_vector;
> @@ -71,6 +72,7 @@
>   
>   	hw = NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
>   	pf_dev = NFP_NET_DEV_PRIVATE_TO_PF(dev->data->dev_private);
> +	app_nic = NFP_APP_PRIV_TO_APP_NIC(pf_dev->app_priv);
>   
>   	PMD_INIT_LOG(DEBUG, "Start");
>   
> @@ -82,7 +84,7 @@
>   
>   	/* check and configure queue intr-vector mapping */
>   	if (dev->data->dev_conf.intr_conf.rxq != 0) {
> -		if (pf_dev->multiport) {
> +		if (app_nic->multiport) {
>   			PMD_INIT_LOG(ERR, "PMD rx interrupt is not supported "
>   					  "with NFP multiport PF");
>   				return -EINVAL;
> @@ -250,6 +252,7 @@
>   	struct nfp_net_hw *hw;
>   	struct rte_pci_device *pci_dev;
>   	struct nfp_pf_dev *pf_dev;
> +	struct nfp_app_nic *app_nic;
>   	int i;
>   
>   	if (rte_eal_process_type() != RTE_PROC_PRIMARY)
> @@ -260,6 +263,7 @@
>   	pf_dev = NFP_NET_DEV_PRIVATE_TO_PF(dev->data->dev_private);
>   	hw = NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
>   	pci_dev = RTE_ETH_DEV_TO_PCI(dev);
> +	app_nic = NFP_APP_PRIV_TO_APP_NIC(pf_dev->app_priv);
>   
>   	/*
>   	 * We assume that the DPDK application is stopping all the
> @@ -280,12 +284,12 @@
>   	/* Only free PF resources after all physical ports have been closed */
>   	/* Mark this port as unused and free device priv resources*/
>   	nn_cfg_writeb(hw, NFP_NET_CFG_LSC, 0xff);
> -	pf_dev->ports[hw->idx] = NULL;
> +	app_nic->ports[hw->idx] = NULL;
>   	rte_eth_dev_release_port(dev);
>   
> -	for (i = 0; i < pf_dev->total_phyports; i++) {
> +	for (i = 0; i < app_nic->total_phyports; i++) {
>   		/* Check to see if ports are still in use */
> -		if (pf_dev->ports[i])
> +		if (app_nic->ports[i])
>   			return 0;
>   	}
>   
> @@ -296,6 +300,7 @@
>   	free(pf_dev->hwinfo);
>   	free(pf_dev->sym_tbl);
>   	nfp_cpp_free(pf_dev->cpp);
> +	rte_free(app_nic);
>   	rte_free(pf_dev);
>   
>   	rte_intr_disable(pci_dev->intr_handle);
> @@ -404,6 +409,7 @@
>   {
>   	struct rte_pci_device *pci_dev;
>   	struct nfp_pf_dev *pf_dev;
> +	struct nfp_app_nic *app_nic;
>   	struct nfp_net_hw *hw;
>   	struct rte_ether_addr *tmp_ether_addr;
>   	uint64_t rx_bar_off = 0;
> @@ -420,6 +426,9 @@
>   	/* Use backpointer here to the PF of this eth_dev */
>   	pf_dev = NFP_NET_DEV_PRIVATE_TO_PF(eth_dev->data->dev_private);
>   
> +	/* Use backpointer to the CoreNIC app struct */
> +	app_nic = NFP_APP_PRIV_TO_APP_NIC(pf_dev->app_priv);
> +
>   	/* NFP can not handle DMA addresses requiring more than 40 bits */
>   	if (rte_mem_check_dma_mask(40)) {
>   		RTE_LOG(ERR, PMD,
> @@ -438,7 +447,7 @@
>   	 * Use PF array of physical ports to get pointer to
>   	 * this specific port
>   	 */
> -	hw = pf_dev->ports[port];
> +	hw = app_nic->ports[port];
>   
>   	PMD_INIT_LOG(DEBUG, "Working with physical port number: %d, "
>   			"NFP internal port number: %d", port, hw->nfp_idx);
> @@ -568,7 +577,7 @@
>   		goto dev_err_queues_map;
>   	}
>   
> -	nfp_net_pf_read_mac(pf_dev, port);
> +	nfp_net_pf_read_mac(app_nic, port);
>   	nfp_net_write_mac(hw, (uint8_t *)&hw->mac_addr);
>   
>   	tmp_ether_addr = (struct rte_ether_addr *)&hw->mac_addr;
> @@ -718,25 +727,67 @@
>   }
>   
>   static int
> -nfp_init_phyports(struct nfp_pf_dev *pf_dev)
> +nfp_init_app_nic(struct nfp_pf_dev *pf_dev,
> +		struct nfp_eth_table *nfp_eth_table)
>   {
>   	int i;
> -	int ret = 0;
> +	int ret;
> +	int err = 0;
> +	int total_vnics;
>   	struct nfp_net_hw *hw;
> +	unsigned int numa_node;
>   	struct rte_eth_dev *eth_dev;
> -	struct nfp_eth_table *nfp_eth_table;
> +	struct nfp_app_nic *app_nic;
> +	char port_name[RTE_ETH_NAME_MAX_LEN];
>   
> -	nfp_eth_table = nfp_eth_read_ports(pf_dev->cpp);
> -	if (nfp_eth_table == NULL) {
> -		PMD_INIT_LOG(ERR, "Error reading NFP ethernet table");
> -		return -EIO;
> +	PMD_INIT_LOG(INFO, "Total physical ports: %d", nfp_eth_table->count);
> +
> +	/* Allocate memory for the CoreNIC app */
> +	app_nic = rte_zmalloc("nfp_app_nic", sizeof(*app_nic), 0);
> +	if (app_nic == NULL)
> +		return -ENOMEM;
> +
> +	/* Point the app_priv pointer in the PF to the coreNIC app */
> +	pf_dev->app_priv = app_nic;
> +
> +	/* Read the number of vNIC's created for the PF */
> +	total_vnics = nfp_rtsym_read_le(pf_dev->sym_tbl, "nfd_cfg_pf0_num_ports", &err);
> +	if (err || total_vnics <= 0 || total_vnics > 8) {

The DPDK coding style says to compare integers with 0 explicitly.
Since both forms are already present in the net/nfp code and the driver
itself has no clear preference, please follow the DPDK coding style in
new code.
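
For example, the check above could be written along the following lines.
This is only a sketch: num_ports_valid() and NFP_EXAMPLE_MAX_VNICS are
made-up names, and the limit of 8 is simply the literal from the quoted
line.

```c
#include <assert.h>
#include <stdbool.h>

/* Made-up name for the literal 8 used in the quoted check. */
#define NFP_EXAMPLE_MAX_VNICS 8

/* Hypothetical helper: true when the symbol read succeeded and is in range. */
static bool
num_ports_valid(int err, int total_vnics)
{
	/* Integers are compared with 0 explicitly, not used as booleans. */
	if (err != 0 || total_vnics <= 0 || total_vnics > NFP_EXAMPLE_MAX_VNICS)
		return false;

	return true;
}
```

The same pattern applies to return codes: "if (ret != 0)" rather than
"if (ret)".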

> +		PMD_INIT_LOG(ERR, "nfd_cfg_pf0_num_ports symbol with wrong value");
> +		ret = -ENODEV;
> +		goto app_cleanup;
>   	}
>   
> -	/* Loop through all physical ports on PF */
> -	for (i = 0; i < pf_dev->total_phyports; i++) {
> -		const unsigned int numa_node = rte_socket_id();
> -		char port_name[RTE_ETH_NAME_MAX_LEN];
> +	/*
> +	 * For coreNIC the number of vNICs exposed should be the same as the
> +	 * number of physical ports
> +	 */
> +	if (total_vnics != (int)nfp_eth_table->count) {
> +		PMD_INIT_LOG(ERR, "Total physical ports do not match number of vNICs");
> +		ret = -ENODEV;
> +		goto app_cleanup;
> +	}
>   
> +	/* Populate coreNIC app properties*/
> +	app_nic->total_phyports = total_vnics;
> +	app_nic->pf_dev = pf_dev;
> +	if (total_vnics > 1)
> +		app_nic->multiport = true;
> +
> +	/* Map the symbol table */
> +	pf_dev->ctrl_bar = nfp_rtsym_map(pf_dev->sym_tbl, "_pf0_net_bar0",
> +			app_nic->total_phyports * 32768, &pf_dev->ctrl_area);
> +	if (pf_dev->ctrl_bar == NULL) {
> +		PMD_INIT_LOG(ERR, "nfp_rtsym_map fails for _pf0_net_ctrl_bar");
> +		ret = -EIO;
> +		goto app_cleanup;
> +	}
> +
> +	PMD_INIT_LOG(DEBUG, "ctrl bar: %p", pf_dev->ctrl_bar);
> +
> +	/* Loop through all physical ports on PF */
> +	numa_node = rte_socket_id();
> +	for (i = 0; i < app_nic->total_phyports; i++) {
>   		snprintf(port_name, sizeof(port_name), "%s_port%d",
>   			 pf_dev->pci_dev->device.name, i);
>   
> @@ -760,7 +811,7 @@
>   		hw = NFP_NET_DEV_PRIVATE_TO_HW(eth_dev->data->dev_private);
>   
>   		/* Add this device to the PF's array of physical ports */
> -		pf_dev->ports[i] = hw;
> +		app_nic->ports[i] = hw;
>   
>   		hw->pf_dev = pf_dev;
>   		hw->cpp = pf_dev->cpp;
> @@ -783,20 +834,21 @@
>   		rte_eth_dev_probing_finish(eth_dev);
>   
>   	} /* End loop, all ports on this PF */
> -	ret = 0;
> -	goto eth_table_cleanup;
> +
> +	return 0;
>   
>   port_cleanup:
> -	for (i = 0; i < pf_dev->total_phyports; i++) {
> -		if (pf_dev->ports[i] && pf_dev->ports[i]->eth_dev) {
> +	for (i = 0; i < app_nic->total_phyports; i++) {
> +		if (app_nic->ports[i] && app_nic->ports[i]->eth_dev) {
>   			struct rte_eth_dev *tmp_dev;
> -			tmp_dev = pf_dev->ports[i]->eth_dev;
> +			tmp_dev = app_nic->ports[i]->eth_dev;
>   			rte_eth_dev_release_port(tmp_dev);
> -			pf_dev->ports[i] = NULL;
> +			app_nic->ports[i] = NULL;
>   		}
>   	}
> -eth_table_cleanup:
> -	free(nfp_eth_table);
> +	nfp_cpp_area_free(pf_dev->ctrl_area);
> +app_cleanup:
> +	rte_free(app_nic);
>   
>   	return ret;
>   }
> @@ -804,11 +856,11 @@
>   static int
>   nfp_pf_init(struct rte_pci_device *pci_dev)
>   {
> -	int err;
> -	int ret = 0;
> +	int ret;
> +	int err = 0;
>   	uint64_t addr;
> -	int total_ports;
>   	struct nfp_cpp *cpp;
> +	enum nfp_app_id app_id;
>   	struct nfp_pf_dev *pf_dev;
>   	struct nfp_hwinfo *hwinfo;
>   	char name[RTE_ETH_NAME_MAX_LEN];
> @@ -840,9 +892,10 @@
>   	if (hwinfo == NULL) {
>   		PMD_INIT_LOG(ERR, "Error reading hwinfo table");
>   		ret = -EIO;
> -		goto error;
> +		goto cpp_cleanup;
>   	}
>   
> +	/* Read the number of physical ports from hardware */
>   	nfp_eth_table = nfp_eth_read_ports(cpp);
>   	if (nfp_eth_table == NULL) {
>   		PMD_INIT_LOG(ERR, "Error reading NFP ethernet table");
> @@ -865,20 +918,14 @@
>   		goto eth_table_cleanup;
>   	}
>   
> -	total_ports = nfp_rtsym_read_le(sym_tbl, "nfd_cfg_pf0_num_ports", &err);
> -	if (total_ports != (int)nfp_eth_table->count) {
> -		PMD_DRV_LOG(ERR, "Inconsistent number of ports");
> +	/* Read the app ID of the firmware loaded */
> +	app_id = nfp_rtsym_read_le(sym_tbl, "_pf0_net_app_id", &err);
> +	if (err) {

Compare vs 0

> +		PMD_INIT_LOG(ERR, "Couldn't read app_id from fw");
>   		ret = -EIO;
>   		goto sym_tbl_cleanup;
>   	}
>   
> -	PMD_INIT_LOG(INFO, "Total physical ports: %d", total_ports);
> -
> -	if (total_ports <= 0 || total_ports > 8) {
> -		PMD_INIT_LOG(ERR, "nfd_cfg_pf0_num_ports symbol with wrong value");
> -		ret = -ENODEV;
> -		goto sym_tbl_cleanup;
> -	}
>   	/* Allocate memory for the PF "device" */
>   	snprintf(name, sizeof(name), "nfp_pf%d", 0);
>   	pf_dev = rte_zmalloc(name, sizeof(*pf_dev), 0);
> @@ -888,27 +935,12 @@
>   	}
>   
>   	/* Populate the newly created PF device */
> +	pf_dev->app_id = app_id;
>   	pf_dev->cpp = cpp;
>   	pf_dev->hwinfo = hwinfo;
>   	pf_dev->sym_tbl = sym_tbl;
> -	pf_dev->total_phyports = total_ports;
> -
> -	if (total_ports > 1)
> -		pf_dev->multiport = true;
> -
>   	pf_dev->pci_dev = pci_dev;
>   
> -	/* Map the symbol table */
> -	pf_dev->ctrl_bar = nfp_rtsym_map(pf_dev->sym_tbl, "_pf0_net_bar0",
> -			pf_dev->total_phyports * 32768, &pf_dev->ctrl_area);
> -	if (pf_dev->ctrl_bar == NULL) {
> -		PMD_INIT_LOG(ERR, "nfp_rtsym_map fails for _pf0_net_ctrl_bar");
> -		ret = -EIO;
> -		goto pf_cleanup;
> -	}
> -
> -	PMD_INIT_LOG(DEBUG, "ctrl bar: %p", pf_dev->ctrl_bar);
> -
>   	/* configure access to tx/rx vNIC BARs */
>   	switch (pci_dev->id.device_id) {
>   	case PCI_DEVICE_ID_NFP3800_PF_NIC:
> @@ -923,7 +955,7 @@
>   	default:
>   		PMD_INIT_LOG(ERR, "nfp_net: no device ID matching");
>   		err = -ENODEV;
> -		goto ctrl_area_cleanup;
> +		goto pf_cleanup;
>   	}
>   
>   	pf_dev->hw_queues = nfp_cpp_map_area(pf_dev->cpp, 0, 0,
> @@ -932,18 +964,27 @@
>   	if (pf_dev->hw_queues == NULL) {
>   		PMD_INIT_LOG(ERR, "nfp_rtsym_map fails for net.qc");
>   		ret = -EIO;
> -		goto ctrl_area_cleanup;
> +		goto pf_cleanup;
>   	}
>   
>   	PMD_INIT_LOG(DEBUG, "tx/rx bar address: 0x%p", pf_dev->hw_queues);
>   
>   	/*
> -	 * Initialize and prep physical ports now
> -	 * This will loop through all physical ports
> +	 * PF initialization has been done at this point. Call app specific
> +	 * init code now
>   	 */
> -	ret = nfp_init_phyports(pf_dev);
> -	if (ret) {
> -		PMD_INIT_LOG(ERR, "Could not create physical ports");
> +	switch (pf_dev->app_id) {
> +	case NFP_APP_CORE_NIC:
> +		PMD_INIT_LOG(INFO, "Initializing coreNIC");
> +		ret = nfp_init_app_nic(pf_dev, nfp_eth_table);
> +		if (ret) {

Compare vs 0

> +			PMD_INIT_LOG(ERR, "Could not initialize coreNIC!");
> +			goto hwqueues_cleanup;
> +		}
> +		break;
> +	default:
> +		PMD_INIT_LOG(ERR, "Unsupported Firmware loaded");
> +		ret = -EINVAL;
>   		goto hwqueues_cleanup;
>   	}
>   
> @@ -954,8 +995,6 @@
>   
>   hwqueues_cleanup:
>   	nfp_cpp_area_free(pf_dev->hwqueues_area);
> -ctrl_area_cleanup:
> -	nfp_cpp_area_free(pf_dev->ctrl_area);
>   pf_cleanup:
>   	rte_free(pf_dev);
>   sym_tbl_cleanup:
> @@ -964,6 +1003,8 @@
>   	free(nfp_eth_table);
>   hwinfo_cleanup:
>   	free(hwinfo);
> +cpp_cleanup:
> +	nfp_cpp_free(cpp);
>   error:
>   	return ret;
>   }
> @@ -977,7 +1018,8 @@
>   nfp_pf_secondary_init(struct rte_pci_device *pci_dev)
>   {
>   	int i;
> -	int err;
> +	int err = 0;
> +	int ret = 0;
>   	int total_ports;
>   	struct nfp_cpp *cpp;
>   	struct nfp_net_hw *hw;
> @@ -1015,6 +1057,11 @@
>   	}
>   
>   	total_ports = nfp_rtsym_read_le(sym_tbl, "nfd_cfg_pf0_num_ports", &err);
> +	if (err || total_ports <= 0 || total_ports > 8) {

Compare err vs 0

> +		PMD_INIT_LOG(ERR, "nfd_cfg_pf0_num_ports symbol with wrong value");
> +		ret = -ENODEV;
> +		goto sym_tbl_cleanup;
> +	}
>   
>   	for (i = 0; i < total_ports; i++) {
>   		struct rte_eth_dev *eth_dev;
> @@ -1028,7 +1075,8 @@
>   		if (eth_dev == NULL) {
>   			RTE_LOG(ERR, EAL,
>   				"secondary process attach failed, ethdev doesn't exist");
> -			return -ENODEV;
> +			ret = -ENODEV;
> +			break;
>   		}
>   
>   		hw = NFP_NET_DEV_PRIVATE_TO_HW(eth_dev->data->dev_private);
> @@ -1041,10 +1089,16 @@
>   		rte_eth_dev_probing_finish(eth_dev);
>   	}
>   
> +	if (ret)

Compare vs 0

> +		goto sym_tbl_cleanup;
> +
>   	/* Register the CPP bridge service for the secondary too */
>   	nfp_register_cpp_service(cpp);
>   
> -	return 0;
> +sym_tbl_cleanup:
> +	free(sym_tbl);
> +
> +	return ret;
>   }
>   
>   static int



* Re: [PATCH v5 03/12] net/nfp: move app specific init logic to own function
  2022-08-05  6:32 ` [PATCH v5 03/12] net/nfp: move app specific init logic to own function Chaoyong He
@ 2022-08-05 10:53   ` Andrew Rybchenko
  0 siblings, 0 replies; 29+ messages in thread
From: Andrew Rybchenko @ 2022-08-05 10:53 UTC (permalink / raw)
  To: dev

On 8/5/22 09:32, Chaoyong He wrote:
> The NFP card can load different firmware applications.
> This commit move the init logic of corenic app of the
> secondary process into its own function.
> 
> Signed-off-by: Chaoyong He <chaoyong.he@corigine.com>
> Reviewed-by: Niklas Söderlund <niklas.soderlund@corigine.com>
> ---
>   drivers/net/nfp/nfp_ethdev.c | 93 +++++++++++++++++++++++++++++---------------
>   1 file changed, 62 insertions(+), 31 deletions(-)
> 
> diff --git a/drivers/net/nfp/nfp_ethdev.c b/drivers/net/nfp/nfp_ethdev.c
> index 2c5607c..90dd01e 100644
> --- a/drivers/net/nfp/nfp_ethdev.c
> +++ b/drivers/net/nfp/nfp_ethdev.c
> @@ -936,7 +936,7 @@
>   		break;
>   	default:
>   		PMD_INIT_LOG(ERR, "nfp_net: no device ID matching");
> -		err = -ENODEV;
> +		ret = -ENODEV;

This looks unrelated to the patch. It looks like a bug in the previous
code/patches that deserves a separate fix.
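
To spell out why I think it is a bug: the function returns ret, so an
error code stored in err never reaches the caller. A stripped-down
illustration with invented function names (this is not the driver code):

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Buggy shape: the error code lands in 'err', but 'ret' is what gets returned. */
static int
init_buggy(bool device_id_known)
{
	int err;
	int ret = 0;

	if (!device_id_known) {
		err = -ENODEV;	/* stored in the wrong variable */
		(void)err;
		goto cleanup;
	}

cleanup:
	return ret;		/* the -ENODEV above is lost */
}

/* Fixed shape: the error code is stored in the variable that is returned. */
static int
init_fixed(bool device_id_known)
{
	int ret = 0;

	if (!device_id_known) {
		ret = -ENODEV;
		goto cleanup;
	}

cleanup:
	return ret;
}
```

In the sketch ret starts at 0, so the buggy variant reports success; in
the driver the returned value is whatever ret happened to hold at that
point.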

>   		goto pf_cleanup;
>   	}
>   
> @@ -991,6 +991,50 @@
>   	return ret;
>   }
>   
> +static int
> +nfp_secondary_init_app_nic(struct rte_pci_device *pci_dev,
> +		struct nfp_rtsym_table *sym_tbl,
> +		struct nfp_cpp *cpp)
> +{
> +	int i;
> +	int err = 0;
> +	int ret = 0;
> +	int total_vnics;
> +	struct nfp_net_hw *hw;
> +
> +	/* Read the number of vNIC's created for the PF */
> +	total_vnics = nfp_rtsym_read_le(sym_tbl, "nfd_cfg_pf0_num_ports", &err);
> +	if (err || total_vnics <= 0 || total_vnics > 8) {

Compare err vs 0 explicitly

> +		PMD_INIT_LOG(ERR, "nfd_cfg_pf0_num_ports symbol with wrong value");
> +		return -ENODEV;
> +	}
> +
> +	for (i = 0; i < total_vnics; i++) {
> +		struct rte_eth_dev *eth_dev;
> +		char port_name[RTE_ETH_NAME_MAX_LEN];
> +		snprintf(port_name, sizeof(port_name), "%s_port%d",
> +				pci_dev->device.name, i);
> +
> +		PMD_DRV_LOG(DEBUG, "Secondary attaching to port %s", port_name);
> +		eth_dev = rte_eth_dev_attach_secondary(port_name);
> +		if (eth_dev == NULL) {
> +			RTE_LOG(ERR, EAL,
> +				"secondary process attach failed, ethdev doesn't exist");
> +			ret = -ENODEV;
> +			break;
> +		}
> +
> +		eth_dev->process_private = cpp;
> +		hw = NFP_NET_DEV_PRIVATE_TO_HW(eth_dev->data->dev_private);
> +		if (nfp_net_ethdev_ops_mount(hw, eth_dev))
> +			return -EINVAL;
> +
> +		rte_eth_dev_probing_finish(eth_dev);
> +	}
> +
> +	return ret;
> +}
> +
>   /*
>    * When attaching to the NFP4000/6000 PF on a secondary process there
>    * is no need to initialise the PF again. Only minimal work is required
> @@ -999,12 +1043,10 @@
>   static int
>   nfp_pf_secondary_init(struct rte_pci_device *pci_dev)
>   {
> -	int i;
>   	int err = 0;
>   	int ret = 0;
> -	int total_ports;
> +	enum nfp_app_id app_id;
>   	struct nfp_cpp *cpp;
> -	struct nfp_net_hw *hw;
>   	struct nfp_rtsym_table *sym_tbl;
>   
>   	if (pci_dev == NULL)
> @@ -1038,37 +1080,26 @@
>   		return -EIO;
>   	}
>   
> -	total_ports = nfp_rtsym_read_le(sym_tbl, "nfd_cfg_pf0_num_ports", &err);
> -	if (err || total_ports <= 0 || total_ports > 8) {
> -		PMD_INIT_LOG(ERR, "nfd_cfg_pf0_num_ports symbol with wrong value");
> -		ret = -ENODEV;
> +	/* Read the app ID of the firmware loaded */
> +	app_id = nfp_rtsym_read_le(sym_tbl, "_pf0_net_app_id", &err);
> +	if (err) {

Compare vs 0

> +		PMD_INIT_LOG(ERR, "Couldn't read app_id from fw");
>   		goto sym_tbl_cleanup;
>   	}
>   
> -	for (i = 0; i < total_ports; i++) {
> -		struct rte_eth_dev *eth_dev;
> -		char port_name[RTE_ETH_NAME_MAX_LEN];
> -
> -		snprintf(port_name, sizeof(port_name), "%s_port%d",
> -			 pci_dev->device.name, i);
> -
> -		PMD_DRV_LOG(DEBUG, "Secondary attaching to port %s", port_name);
> -		eth_dev = rte_eth_dev_attach_secondary(port_name);
> -		if (eth_dev == NULL) {
> -			RTE_LOG(ERR, EAL,
> -				"secondary process attach failed, ethdev doesn't exist");
> -			ret = -ENODEV;
> -			break;
> +	switch (app_id) {
> +	case NFP_APP_CORE_NIC:
> +		PMD_INIT_LOG(INFO, "Initializing coreNIC");
> +		ret = nfp_secondary_init_app_nic(pci_dev, sym_tbl, cpp);
> +		if (ret) {

Compare vs 0

> +			PMD_INIT_LOG(ERR, "Could not initialize coreNIC!");
> +			goto sym_tbl_cleanup;
>   		}
> -
> -		hw = NFP_NET_DEV_PRIVATE_TO_HW(eth_dev->data->dev_private);
> -
> -		if (nfp_net_ethdev_ops_mount(hw, eth_dev))
> -			return -EINVAL;
> -
> -		eth_dev->process_private = cpp;
> -
> -		rte_eth_dev_probing_finish(eth_dev);
> +		break;
> +	default:
> +		PMD_INIT_LOG(ERR, "Unsupported Firmware loaded");
> +		ret = -EINVAL;
> +		goto sym_tbl_cleanup;
>   	}
>   
>   	if (ret)



* Re: [PATCH v5 04/12] net/nfp: add initial flower firmware support
  2022-08-05  6:32 ` [PATCH v5 04/12] net/nfp: add initial flower firmware support Chaoyong He
@ 2022-08-05 11:00   ` Andrew Rybchenko
  0 siblings, 0 replies; 29+ messages in thread
From: Andrew Rybchenko @ 2022-08-05 11:00 UTC (permalink / raw)
  To: Chaoyong He, dev; +Cc: niklas.soderlund, Heinrich Kuhn

On 8/5/22 09:32, Chaoyong He wrote:
> This commits adds the basic probing infrastructure to support the flower

"This commits adds" -> "Add"
It is clear from the very beginning that this is the description of
the commit.

> firmware. This firmware is geared towards offloading OVS and can
> generally be found in /lib/firmware/netronome/flower. It is also used by
> the NFP kernel driver when OVS offload with TC is desired.
> 
> This commit also adds the basic infrastructure needed by the flower

Same here.

> firmware to operate. The firmware requires threads to service both the
> PF vNIC and the ctrl vNIC. The PF is responsible for handling any
> fallback traffic and the ctrl vNIC is used to communicate OVS flows
> and flow statistics to and from the smartNIC. rte_services are used to
> facilitate this logic.
> 
> This commit also adds the cpp service, used for some user tools.
> 
> Signed-off-by: Chaoyong He <chaoyong.he@corigine.com>
> Signed-off-by: Heinrich Kuhn <heinrich.kuhn@corigine.com>
> Reviewed-by: Niklas Söderlund <niklas.soderlund@corigine.com>
> ---
>   drivers/net/nfp/flower/nfp_flower.c | 101 ++++++++++++++++++++++++++++++++++++
>   drivers/net/nfp/flower/nfp_flower.h |  22 ++++++++
>   drivers/net/nfp/meson.build         |   1 +
>   drivers/net/nfp/nfp_cpp_bridge.c    |  88 ++++++++++++++++++++++++++-----
>   drivers/net/nfp/nfp_cpp_bridge.h    |   6 ++-
>   drivers/net/nfp/nfp_ethdev.c        |  40 ++++++++++++--
>   6 files changed, 239 insertions(+), 19 deletions(-)
>   create mode 100644 drivers/net/nfp/flower/nfp_flower.c
>   create mode 100644 drivers/net/nfp/flower/nfp_flower.h
> 
> diff --git a/drivers/net/nfp/flower/nfp_flower.c b/drivers/net/nfp/flower/nfp_flower.c
> new file mode 100644
> index 0000000..1dddced
> --- /dev/null
> +++ b/drivers/net/nfp/flower/nfp_flower.c
> @@ -0,0 +1,101 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2022 Corigine, Inc.
> + * All rights reserved.
> + */
> +
> +#include <rte_common.h>
> +#include <ethdev_driver.h>
> +#include <rte_service_component.h>
> +#include <rte_malloc.h>
> +#include <ethdev_pci.h>
> +#include <ethdev_driver.h>
> +
> +#include "../nfp_common.h"
> +#include "../nfp_logs.h"
> +#include "../nfp_ctrl.h"
> +#include "../nfp_cpp_bridge.h"
> +#include "nfp_flower.h"
> +
> +static struct rte_service_spec flower_services[NFP_FLOWER_SERVICE_MAX] = {
> +};
> +
> +static int
> +nfp_flower_enable_services(struct nfp_app_flower *app_flower)
> +{
> +	int i;
> +	int ret = 0;
> +
> +	for (i = 0; i < NFP_FLOWER_SERVICE_MAX; i++) {
> +		/* Pass a pointer to the flower app to the service */
> +		flower_services[i].callback_userdata = (void *)app_flower;
> +
> +		/* Register the flower services */
> +		ret = rte_service_component_register(&flower_services[i],
> +				&app_flower->flower_services_ids[i]);
> +		if (ret) {

Compare vs 0 explicitly

> +			PMD_INIT_LOG(WARNING,
> +				"Could not register Flower PF vNIC service");
> +			break;
> +		}
> +
> +		PMD_INIT_LOG(INFO, "Flower PF vNIC service registered");
> +
> +		/* Map them to available service cores*/
> +		ret = nfp_map_service(app_flower->flower_services_ids[i]);
> +		if (ret)

Compare vs 0 explicitly

> +			break;
> +	}
> +
> +	return ret;
> +}
> +
> +int
> +nfp_init_app_flower(struct nfp_pf_dev *pf_dev)
> +{
> +	int ret;
> +	unsigned int numa_node;
> +	struct nfp_net_hw *pf_hw;
> +	struct nfp_app_flower *app_flower;
> +
> +	numa_node = rte_socket_id();
> +
> +	/* Allocate memory for the Flower app */
> +	app_flower = rte_zmalloc_socket("nfp_app_flower", sizeof(*app_flower),
> +			RTE_CACHE_LINE_SIZE, numa_node);
> +	if (app_flower == NULL) {
> +		ret = -ENOMEM;
> +		goto done;
> +	}
> +
> +	pf_dev->app_priv = app_flower;
> +
> +	/* Allocate memory for the PF AND ctrl vNIC here (hence the * 2) */
> +	pf_hw = rte_zmalloc_socket("nfp_pf_vnic", 2 * sizeof(struct nfp_net_adapter),
> +			RTE_CACHE_LINE_SIZE, numa_node);
> +	if (pf_hw == NULL) {
> +		ret = -ENOMEM;
> +		goto app_cleanup;
> +	}
> +
> +	/* Start up flower services */
> +	if (nfp_flower_enable_services(app_flower)) {
> +		ret = -ESRCH;
> +		goto vnic_cleanup;
> +	}
> +
> +	return 0;
> +
> +vnic_cleanup:
> +	rte_free(pf_hw);
> +app_cleanup:
> +	rte_free(app_flower);
> +done:
> +	return ret;
> +}
> +
> +int
> +nfp_secondary_init_app_flower(__rte_unused struct nfp_cpp *cpp)
> +{
> +	PMD_INIT_LOG(ERR, "Flower firmware not supported");
> +	return -ENOTSUP;
> +}
> diff --git a/drivers/net/nfp/flower/nfp_flower.h b/drivers/net/nfp/flower/nfp_flower.h
> new file mode 100644
> index 0000000..4a9b302
> --- /dev/null
> +++ b/drivers/net/nfp/flower/nfp_flower.h
> @@ -0,0 +1,22 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2022 Corigine, Inc.
> + * All rights reserved.
> + */
> +
> +#ifndef _NFP_FLOWER_H_
> +#define _NFP_FLOWER_H_
> +
> +enum nfp_flower_service {
> +	NFP_FLOWER_SERVICE_MAX
> +};
> +
> +/* The flower application's private structure */
> +struct nfp_app_flower {
> +	/* List of rte_service ID's for the flower app */
> +	uint32_t flower_services_ids[NFP_FLOWER_SERVICE_MAX];
> +};
> +
> +int nfp_init_app_flower(struct nfp_pf_dev *pf_dev);
> +int nfp_secondary_init_app_flower(struct nfp_cpp *cpp);
> +
> +#endif /* _NFP_FLOWER_H_ */
> diff --git a/drivers/net/nfp/meson.build b/drivers/net/nfp/meson.build
> index 810f02a..7ae3115 100644
> --- a/drivers/net/nfp/meson.build
> +++ b/drivers/net/nfp/meson.build
> @@ -6,6 +6,7 @@ if not is_linux or not dpdk_conf.get('RTE_ARCH_64')
>       reason = 'only supported on 64-bit Linux'
>   endif
>   sources = files(
> +        'flower/nfp_flower.c',
>           'nfpcore/nfp_cpp_pcie_ops.c',
>           'nfpcore/nfp_nsp.c',
>           'nfpcore/nfp_cppcore.c',
> diff --git a/drivers/net/nfp/nfp_cpp_bridge.c b/drivers/net/nfp/nfp_cpp_bridge.c
> index 0922ea9..9ac165a 100644
> --- a/drivers/net/nfp/nfp_cpp_bridge.c
> +++ b/drivers/net/nfp/nfp_cpp_bridge.c
> @@ -28,22 +28,86 @@
>   static int nfp_cpp_bridge_serve_write(int sockfd, struct nfp_cpp *cpp);
>   static int nfp_cpp_bridge_serve_read(int sockfd, struct nfp_cpp *cpp);
>   static int nfp_cpp_bridge_serve_ioctl(int sockfd, struct nfp_cpp *cpp);
> +static int nfp_cpp_bridge_service_func(void *args);
>   
> -void nfp_register_cpp_service(struct nfp_cpp *cpp)
> +static struct rte_service_spec cpp_service = {
> +	.name         = "nfp_cpp_service",
> +	.callback     = nfp_cpp_bridge_service_func,
> +};
> +
> +int
> +nfp_map_service(uint32_t service_id)
>   {
> -	uint32_t *cpp_service_id = NULL;
> -	struct rte_service_spec service;
> +	int32_t ret;
> +	uint32_t slcore = 0;
> +	int32_t slcore_count;
> +	uint8_t service_count;
> +	const char *service_name;
> +	uint32_t slcore_array[RTE_MAX_LCORE];
> +	uint8_t min_service_count = UINT8_MAX;
> +
> +	slcore_count = rte_service_lcore_list(slcore_array, RTE_MAX_LCORE);
> +	if (slcore_count <= 0) {
> +		PMD_INIT_LOG(DEBUG, "No service cores found");
> +		return -ENOENT;
> +	}
> +
> +	/*
> +	 * Find a service core with the least number of services already
> +	 * registered to it
> +	 */
> +	while (slcore_count--) {
> +		service_count = rte_service_lcore_count_services(slcore_array[slcore_count]);
> +		if (service_count < min_service_count) {
> +			slcore = slcore_array[slcore_count];
> +			min_service_count = service_count;
> +		}
> +	}
>   
> -	memset(&service, 0, sizeof(struct rte_service_spec));
> -	snprintf(service.name, sizeof(service.name), "nfp_cpp_service");
> -	service.callback = nfp_cpp_bridge_service_func;
> -	service.callback_userdata = (void *)cpp;
> +	service_name = rte_service_get_name(service_id);
> +	PMD_INIT_LOG(INFO, "Mapping service %s to core %u", service_name, slcore);
> +	ret = rte_service_map_lcore_set(service_id, slcore, 1);
> +	if (ret) {

Compare vs 0 explicitly

> +		PMD_INIT_LOG(DEBUG, "Could not map flower service");
> +		return -ENOENT;
> +	}
>   
> -	if (rte_service_component_register(&service,
> -					   cpp_service_id))
> -		RTE_LOG(WARNING, PMD, "NFP CPP bridge service register() failed");
> +	rte_service_runstate_set(service_id, 1);
> +	rte_service_component_runstate_set(service_id, 1);
> +	rte_service_lcore_start(slcore);
> +	if (rte_service_may_be_active(slcore))
> +		RTE_LOG(INFO, PMD, "The service %s is running", service_name);
>   	else
> -		RTE_LOG(DEBUG, PMD, "NFP CPP bridge service registered");
> +		RTE_LOG(INFO, PMD, "The service %s is not running", service_name);
> +
> +	return 0;
> +}
> +
> +int nfp_enable_cpp_service(struct nfp_cpp *cpp, enum nfp_app_id app_id)
> +{
> +	int ret = 0;
> +	uint32_t id = 0;
> +
> +	cpp_service.callback_userdata = (void *)cpp;
> +
> +	/* Register the cpp service */
> +	ret = rte_service_component_register(&cpp_service, &id);
> +	if (ret) {

Compare vs 0 explicitly

> +		PMD_INIT_LOG(WARNING, "Could not register nfp cpp service");
> +		return -EINVAL;
> +	}
> +
> +	PMD_INIT_LOG(INFO, "NFP cpp service registered");
> +
> +	/* Map it to available service core*/
> +	ret = nfp_map_service(id);
> +	if (ret) {

Compare vs 0 explicitly

> +		PMD_INIT_LOG(DEBUG, "Could not map nfp cpp service");
> +		if (app_id == NFP_APP_FLOWER_NIC)
> +			return -EINVAL;
> +	}
> +
> +	return 0;
>   }
>   
>   /*
> @@ -307,7 +371,7 @@ void nfp_register_cpp_service(struct nfp_cpp *cpp)
>    * unaware of the CPP bridge performing the NFP kernel char driver for CPP
>    * accesses.
>    */
> -int32_t
> +static int
>   nfp_cpp_bridge_service_func(void *args)
>   {
>   	struct sockaddr address;
> diff --git a/drivers/net/nfp/nfp_cpp_bridge.h b/drivers/net/nfp/nfp_cpp_bridge.h
> index aea5fdc..dde50d7 100644
> --- a/drivers/net/nfp/nfp_cpp_bridge.h
> +++ b/drivers/net/nfp/nfp_cpp_bridge.h
> @@ -16,6 +16,8 @@
>   #ifndef _NFP_CPP_BRIDGE_H_
>   #define _NFP_CPP_BRIDGE_H_
>   
> +#include "nfp_common.h"
> +
>   #define NFP_CPP_MEMIO_BOUNDARY	(1 << 20)
>   #define NFP_BRIDGE_OP_READ	20
>   #define NFP_BRIDGE_OP_WRITE	30
> @@ -24,8 +26,8 @@
>   #define NFP_IOCTL 'n'
>   #define NFP_IOCTL_CPP_IDENTIFICATION _IOW(NFP_IOCTL, 0x8f, uint32_t)
>   
> -void nfp_register_cpp_service(struct nfp_cpp *cpp);
> -int32_t nfp_cpp_bridge_service_func(void *args);
> +int nfp_map_service(uint32_t service_id);
> +int nfp_enable_cpp_service(struct nfp_cpp *cpp, enum nfp_app_id app_id);
>   
>   #endif /* _NFP_CPP_BRIDGE_H_ */
>   /*
> diff --git a/drivers/net/nfp/nfp_ethdev.c b/drivers/net/nfp/nfp_ethdev.c
> index 90dd01e..0b88749 100644
> --- a/drivers/net/nfp/nfp_ethdev.c
> +++ b/drivers/net/nfp/nfp_ethdev.c
> @@ -38,6 +38,8 @@
>   #include "nfp_ctrl.h"
>   #include "nfp_cpp_bridge.h"
>   
> +#include "flower/nfp_flower.h"
> +
>   static int
>   nfp_net_pf_read_mac(struct nfp_app_nic *app_nic, int port)
>   {
> @@ -837,7 +839,8 @@
>   }
>   
>   static int
> -nfp_pf_init(struct rte_pci_device *pci_dev)
> +nfp_pf_init(struct rte_pci_device *pci_dev,
> +		struct rte_pci_driver *pci_drv)
>   {
>   	int ret;
>   	int err = 0;
> @@ -964,6 +967,16 @@
>   			goto hwqueues_cleanup;
>   		}
>   		break;
> +	case NFP_APP_FLOWER_NIC:
> +		PMD_INIT_LOG(INFO, "Initializing Flower");
> +		pci_dev->device.driver = &pci_drv->driver;
> +		ret = nfp_init_app_flower(pf_dev);
> +		if (ret) {

Compare vs 0 explicitly

> +			PMD_INIT_LOG(ERR, "Could not initialize Flower!");
> +			pci_dev->device.driver = NULL;
> +			goto hwqueues_cleanup;
> +		}
> +		break;
>   	default:
>   		PMD_INIT_LOG(ERR, "Unsupported Firmware loaded");
>   		ret = -EINVAL;
> @@ -971,7 +984,12 @@
>   	}
>   
>   	/* register the CPP bridge service here for primary use */
> -	nfp_register_cpp_service(pf_dev->cpp);
> +	ret = nfp_enable_cpp_service(pf_dev->cpp, pf_dev->app_id);
> +	if (ret) {

Compare vs 0 explicitly

> +		PMD_INIT_LOG(ERR, "Enable cpp service failed.");
> +		ret = -EINVAL;
> +		goto hwqueues_cleanup;
> +	}
>   
>   	return 0;
>   
> @@ -1096,6 +1114,14 @@
>   			goto sym_tbl_cleanup;
>   		}
>   		break;
> +	case NFP_APP_FLOWER_NIC:
> +		PMD_INIT_LOG(INFO, "Initializing Flower");
> +		ret = nfp_secondary_init_app_flower(cpp);
> +		if (ret) {

Compare vs 0 explicitly

> +			PMD_INIT_LOG(ERR, "Could not initialize Flower!");
> +			goto sym_tbl_cleanup;
> +		}
> +		break;
>   	default:
>   		PMD_INIT_LOG(ERR, "Unsupported Firmware loaded");
>   		ret = -EINVAL;
> @@ -1106,7 +1132,11 @@
>   		goto sym_tbl_cleanup;
>   
>   	/* Register the CPP bridge service for the secondary too */
> -	nfp_register_cpp_service(cpp);
> +	ret = nfp_enable_cpp_service(cpp, app_id);
> +	if (ret) {

Compare vs 0 explicitly

> +		PMD_INIT_LOG(ERR, "Enable cpp service failed.");
> +		ret = -EINVAL;
> +	}
>   
>   sym_tbl_cleanup:
>   	free(sym_tbl);
> @@ -1115,11 +1145,11 @@
>   }
>   
>   static int
> -nfp_pf_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
> +nfp_pf_pci_probe(struct rte_pci_driver *pci_drv,
>   		struct rte_pci_device *dev)
>   {
>   	if (rte_eal_process_type() == RTE_PROC_PRIMARY)
> -		return nfp_pf_init(dev);
> +		return nfp_pf_init(dev, pci_drv);
>   	else
>   		return nfp_pf_secondary_init(dev);
>   }



* Re: [PATCH v5 05/12] net/nfp: add flower PF setup and mempool init logic
  2022-08-05  6:32 ` [PATCH v5 05/12] net/nfp: add flower PF setup and mempool init logic Chaoyong He
@ 2022-08-05 12:49   ` Andrew Rybchenko
  0 siblings, 0 replies; 29+ messages in thread
From: Andrew Rybchenko @ 2022-08-05 12:49 UTC (permalink / raw)
  To: Chaoyong He, dev
  Cc: niklas.soderlund, Stephen Hemminger, Hemant Agrawal, Thomas Monjalon

@Thomas, @Stephen, @Hemant, please, see lines from OvS below.

On 8/5/22 09:32, Chaoyong He wrote:
> This commit adds the vNIC initialization logic for the flower PF vNIC.

"This commit adds" -> "Add"

> The flower firmware exposes this vNIC for the purposes of fallback
> traffic in the switchdev use-case. The logic of setting up this vNIC is
> similar to the logic seen in nfp_net_init() and nfp_net_start().
> 
> This commit also adds minimal dev_ops for this PF device. Because the

same here

> device is being exposed externally to DPDK it should also be configured
> using DPDK helpers like rte_eth_configure(). For these helpers to work
> the flower logic needs to implements a minimal set of dev_ops. The Rx
> and Tx logic for this vNIC will be added in a subsequent commit.
> 
> OVS expects incoming packets coming into the OVS datapath to be
> allocated from a mempool that contains objects of type "struct
> dp_packet". For the PF handling the slowpath into OVS it should
> use a mempool that is compatible with OVS. This commit adds the logic
> to create the OVS compatible mempool. It adds certain OVS specific
> structs to be able to instantiate the mempool.
> 
> Signed-off-by: Chaoyong He <chaoyong.he@corigine.com>
> Reviewed-by: Niklas Söderlund <niklas.soderlund@corigine.com>
> ---
>   drivers/net/nfp/flower/nfp_flower.c            | 384 ++++++++++++++++++++++++-
>   drivers/net/nfp/flower/nfp_flower.h            |   9 +
>   drivers/net/nfp/flower/nfp_flower_ovs_compat.h | 145 ++++++++++
>   drivers/net/nfp/nfp_common.h                   |   3 +
>   4 files changed, 537 insertions(+), 4 deletions(-)
>   create mode 100644 drivers/net/nfp/flower/nfp_flower_ovs_compat.h
> 
> diff --git a/drivers/net/nfp/flower/nfp_flower.c b/drivers/net/nfp/flower/nfp_flower.c
> index 1dddced..c05d4ca 100644
> --- a/drivers/net/nfp/flower/nfp_flower.c
> +++ b/drivers/net/nfp/flower/nfp_flower.c
> @@ -14,7 +14,35 @@
>   #include "../nfp_logs.h"
>   #include "../nfp_ctrl.h"
>   #include "../nfp_cpp_bridge.h"
> +#include "../nfp_rxtx.h"
> +#include "../nfpcore/nfp_mip.h"
> +#include "../nfpcore/nfp_rtsym.h"
> +#include "../nfpcore/nfp_nsp.h"
>   #include "nfp_flower.h"
> +#include "nfp_flower_ovs_compat.h"
> +
> +#define MAX_PKT_BURST 32
> +#define MEMPOOL_CACHE_SIZE 512
> +#define DEFAULT_FLBUF_SIZE 9216
> +
> +/*
> + * Simple dev ops functions for the flower PF. Because a rte_device is exposed
> + * to DPDK the flower logic also makes use of helper functions like
> + * rte_dev_configure() to set up the PF device. Stub functions are needed to
> + * use these helper functions
> + */
> +static int
> +nfp_flower_pf_configure(__rte_unused struct rte_eth_dev *dev)
> +{
> +	return 0;
> +}
> +
> +static const struct eth_dev_ops nfp_flower_pf_dev_ops = {
> +	.dev_configure          = nfp_flower_pf_configure,
> +
> +	/* Use the normal dev_infos_get functionality in the NFP PMD */
> +	.dev_infos_get          = nfp_net_infos_get,
> +};
>   
>   static struct rte_service_spec flower_services[NFP_FLOWER_SERVICE_MAX] = {
>   };
> @@ -49,6 +77,304 @@
>   	return ret;
>   }
>   
> +static void
> +nfp_flower_pf_mp_init(__rte_unused struct rte_mempool *mp,
> +		__rte_unused void *opaque_arg,
> +		void *_p,
> +		__rte_unused unsigned int i)
> +{
> +	struct dp_packet *pkt = _p;
> +	pkt->source      = DPBUF_DPDK;
> +	pkt->l2_pad_size = 0;
> +	pkt->l2_5_ofs    = UINT16_MAX;
> +	pkt->l3_ofs      = UINT16_MAX;
> +	pkt->l4_ofs      = UINT16_MAX;
> +	pkt->packet_type = 0; /* PT_ETH */
> +}
> +
> +static struct rte_mempool *
> +nfp_flower_pf_mp_create(void)
> +{
> +	uint32_t nb_mbufs;
> +	uint32_t pkt_size;
> +	uint32_t n_rxd = 1024;
> +	uint32_t n_txd = 1024;
> +	unsigned int numa_node;
> +	uint32_t aligned_mbuf_size;
> +	uint32_t mbuf_priv_data_len;
> +	struct rte_mempool *pktmbuf_pool;
> +
> +	nb_mbufs = RTE_MAX(n_rxd + n_txd + MAX_PKT_BURST + MEMPOOL_CACHE_SIZE,
> +			81920U);
> +
> +	/*
> +	 * The size of the mbuf's private area (i.e. area that holds OvS'
> +	 * dp_packet data)
> +	 */
> +	mbuf_priv_data_len = sizeof(struct dp_packet) - sizeof(struct rte_mbuf);
> +	/* The size of the entire dp_packet. */
> +	pkt_size = sizeof(struct dp_packet) + RTE_MBUF_DEFAULT_BUF_SIZE;
> +	/* mbuf size, rounded up to cacheline size. */
> +	aligned_mbuf_size = ROUND_UP(pkt_size, RTE_CACHE_LINE_SIZE);
> +	mbuf_priv_data_len += (aligned_mbuf_size - pkt_size);
> +
> +	numa_node = rte_socket_id();
> +	pktmbuf_pool = rte_pktmbuf_pool_create("flower_pf_mbuf_pool", nb_mbufs,
> +			MEMPOOL_CACHE_SIZE, mbuf_priv_data_len,
> +			RTE_MBUF_DEFAULT_BUF_SIZE, numa_node);
> +	if (pktmbuf_pool == NULL) {
> +		RTE_LOG(ERR, PMD, "Cannot init mbuf pool\n");
> +		return NULL;
> +	}
> +
> +	rte_mempool_obj_iter(pktmbuf_pool, nfp_flower_pf_mp_init, NULL);
> +
> +	return pktmbuf_pool;
> +}
> +
> +static void
> +nfp_flower_cleanup_pf_vnic(struct nfp_net_hw *hw)
> +{
> +	uint16_t i;
> +	struct rte_eth_dev *eth_dev;
> +	struct nfp_app_flower *app_flower;
> +
> +	eth_dev = hw->eth_dev;
> +	app_flower = NFP_APP_PRIV_TO_APP_FLOWER(hw->pf_dev->app_priv);
> +
> +	for (i = 0; i < eth_dev->data->nb_tx_queues; i++)
> +		nfp_net_tx_queue_release(eth_dev, i);
> +
> +	for (i = 0; i < eth_dev->data->nb_rx_queues; i++)
> +		nfp_net_rx_queue_release(eth_dev, i);
> +
> +	rte_free(eth_dev->data->mac_addrs);
> +	rte_mempool_free(app_flower->pf_pktmbuf_pool);
> +	rte_free(eth_dev->data->dev_private);
> +	rte_eth_dev_release_port(hw->eth_dev);
> +}
> +
> +static int
> +nfp_flower_init_vnic_common(struct nfp_net_hw *hw, const char *vnic_type)
> +{
> +	uint32_t start_q;
> +	uint64_t rx_bar_off;
> +	uint64_t tx_bar_off;
> +	const int stride = 4;
> +	struct nfp_pf_dev *pf_dev;
> +	struct rte_pci_device *pci_dev;
> +
> +	pf_dev = hw->pf_dev;
> +	pci_dev = hw->pf_dev->pci_dev;
> +
> +	/* NFP can not handle DMA addresses requiring more than 40 bits */
> +	if (rte_mem_check_dma_mask(40)) {
> +		RTE_LOG(ERR, PMD,
> +			"device %s can not be used: restricted dma mask to 40 bits!\n",
> +			pci_dev->device.name);
> +		return -ENODEV;
> +	};
> +
> +	hw->device_id = pci_dev->id.device_id;
> +	hw->vendor_id = pci_dev->id.vendor_id;
> +	hw->subsystem_device_id = pci_dev->id.subsystem_device_id;
> +	hw->subsystem_vendor_id = pci_dev->id.subsystem_vendor_id;
> +
> +	PMD_INIT_LOG(DEBUG, "%s vNIC ctrl bar: %p", vnic_type, hw->ctrl_bar);
> +
> +	/* Read the number of available rx/tx queues from hardware */
> +	hw->max_rx_queues = nn_cfg_readl(hw, NFP_NET_CFG_MAX_RXRINGS);
> +	hw->max_tx_queues = nn_cfg_readl(hw, NFP_NET_CFG_MAX_TXRINGS);
> +
> +	/* Work out where in the BAR the queues start */
> +	start_q = nn_cfg_readl(hw, NFP_NET_CFG_START_TXQ);
> +	tx_bar_off = (uint64_t)start_q * NFP_QCP_QUEUE_ADDR_SZ;
> +	start_q = nn_cfg_readl(hw, NFP_NET_CFG_START_RXQ);
> +	rx_bar_off = (uint64_t)start_q * NFP_QCP_QUEUE_ADDR_SZ;
> +
> +	hw->tx_bar = pf_dev->hw_queues + tx_bar_off;
> +	hw->rx_bar = pf_dev->hw_queues + rx_bar_off;
> +
> +	/* Get some of the read-only fields from the config BAR */
> +	hw->ver = nn_cfg_readl(hw, NFP_NET_CFG_VERSION);
> +	hw->cap = nn_cfg_readl(hw, NFP_NET_CFG_CAP);
> +	hw->max_mtu = nn_cfg_readl(hw, NFP_NET_CFG_MAX_MTU);
> +	/* Set the current MTU to the maximum supported */
> +	hw->mtu = hw->max_mtu;
> +	hw->flbufsz = DEFAULT_FLBUF_SIZE;
> +
> +	/* read the Rx offset configured from firmware */
> +	if (NFD_CFG_MAJOR_VERSION_of(hw->ver) < 2)
> +		hw->rx_offset = NFP_NET_RX_OFFSET;
> +	else
> +		hw->rx_offset = nn_cfg_readl(hw, NFP_NET_CFG_RX_OFFSET_ADDR);
> +
> +	hw->ctrl = 0;
> +	hw->stride_rx = stride;
> +	hw->stride_tx = stride;
> +
> +	/* Reuse cfg queue setup function */
> +	nfp_net_cfg_queue_setup(hw);
> +
> +	PMD_INIT_LOG(INFO, "%s vNIC max_rx_queues: %u, max_tx_queues: %u",
> +			vnic_type, hw->max_rx_queues, hw->max_tx_queues);
> +
> +	/* Initializing spinlock for reconfigs */
> +	rte_spinlock_init(&hw->reconfig_lock);
> +
> +	return 0;
> +}
> +
> +static int
> +nfp_flower_init_pf_vnic(struct nfp_net_hw *hw)
> +{
> +	int ret;
> +	uint16_t i;
> +	uint16_t n_txq;
> +	uint16_t n_rxq;
> +	uint16_t port_id;
> +	unsigned int numa_node;
> +	struct rte_mempool *mp;
> +	struct nfp_pf_dev *pf_dev;
> +	struct rte_eth_dev *eth_dev;
> +	struct nfp_app_flower *app_flower;
> +
> +	const struct rte_eth_rxconf rx_conf = {

static const ?

> +		.rx_free_thresh = DEFAULT_RX_FREE_THRESH,
> +		.rx_drop_en = 1,
> +	};
> +
> +	const struct rte_eth_txconf tx_conf = {

static const ?

> +		.tx_thresh = {
> +			.pthresh  = DEFAULT_TX_PTHRESH,
> +			.hthresh = DEFAULT_TX_HTHRESH,
> +			.wthresh = DEFAULT_TX_WTHRESH,
> +		},
> +		.tx_free_thresh = DEFAULT_TX_FREE_THRESH,
> +	};
> +
> +	static struct rte_eth_conf port_conf = {

I think it should be const as well

> +		.rxmode = {
> +			.mq_mode  = RTE_ETH_MQ_RX_RSS,
> +			.offloads = RTE_ETH_RX_OFFLOAD_CHECKSUM,
> +		},
> +		.txmode = {
> +			.mq_mode = RTE_ETH_MQ_TX_NONE,
> +		},
> +	};
> +
> +	/* Set up some pointers here for ease of use */
> +	pf_dev = hw->pf_dev;
> +	app_flower = NFP_APP_PRIV_TO_APP_FLOWER(pf_dev->app_priv);
> +
> +	/*
> +	 * Perform the "common" part of setting up a flower vNIC.
> +	 * Mostly reading configuration from hardware.
> +	 */
> +	ret = nfp_flower_init_vnic_common(hw, "pf_vnic");
> +	if (ret)

Compare vs 0

> +		goto done;
> +
> +	hw->eth_dev = rte_eth_dev_allocate("pf_vnic_eth_dev");

Shouldn't the name mention 'nfp'?

> +	if (hw->eth_dev == NULL) {
> +		ret = -ENOMEM;
> +		goto done;
> +	}
> +
> +	/* Grab the pointer to the newly created rte_eth_dev here */
> +	eth_dev = hw->eth_dev;
> +
> +	numa_node = rte_socket_id();
> +	eth_dev->data->dev_private =
> +		rte_zmalloc_socket("pf_vnic_eth_dev", sizeof(struct nfp_net_hw),
> +				   RTE_CACHE_LINE_SIZE, numa_node);
> +	if (eth_dev->data->dev_private == NULL) {
> +		ret = -ENOMEM;
> +		goto port_release;
> +	}
> +
> +	/* Fill in some of the eth_dev fields */
> +	eth_dev->device = &pf_dev->pci_dev->device;
> +	eth_dev->data->nb_tx_queues = hw->max_tx_queues;
> +	eth_dev->data->nb_rx_queues = hw->max_rx_queues;

The two assignments above look strange. Setting these fields is the job
of rte_eth_dev_configure(). I think the max values should simply be
passed to configure.

> +	eth_dev->data->dev_private = hw;
> +
> +	/* Create a mbuf pool for the PF */
> +	app_flower->pf_pktmbuf_pool = nfp_flower_pf_mp_create();
> +	if (app_flower->pf_pktmbuf_pool == NULL) {
> +		ret = -ENOMEM;
> +		goto private_cleanup;
> +	}
> +
> +	mp = app_flower->pf_pktmbuf_pool;
> +
> +	/* Add Rx/Tx functions */
> +	eth_dev->dev_ops = &nfp_flower_pf_dev_ops;
> +
> +	/* PF vNIC gets a random MAC */
> +	eth_dev->data->mac_addrs = rte_zmalloc("mac_addr",
> +			RTE_ETHER_ADDR_LEN, 0);
> +	if (eth_dev->data->mac_addrs == NULL) {
> +		ret = -ENOMEM;
> +		goto mempool_cleanup;
> +	}
> +
> +	rte_eth_random_addr(eth_dev->data->mac_addrs->addr_bytes);
> +	rte_eth_dev_probing_finish(eth_dev);
> +
> +	/* Configure the PF device now */
> +	n_rxq = hw->eth_dev->data->nb_rx_queues;
> +	n_txq = hw->eth_dev->data->nb_tx_queues;
> +	port_id = hw->eth_dev->data->port_id;
> +
> +	ret = rte_eth_dev_configure(port_id, n_rxq, n_txq, &port_conf);
> +	if (ret) {

Compare vs 0

> +		PMD_INIT_LOG(ERR, "Could not configure PF device %d", ret);
> +		goto mac_cleanup;
> +	}
> +
> +	/* Set up the Rx queues */
> +	for (i = 0; i < n_rxq; i++) {
> +		/* Hardcoded number of desc to 1024 */
> +		ret = nfp_net_rx_queue_setup(eth_dev, i, 1024, numa_node,
> +			&rx_conf, mp);
> +		if (ret) {
> +			PMD_INIT_LOG(ERR, "Configure flower PF vNIC Rx queue %d failed", i);
> +			goto rx_queue_cleanup;
> +		}
> +	}
> +
> +	/* Set up the Tx queues */
> +	for (i = 0; i < n_txq; i++) {
> +		/* Hardcoded number of desc to 1024 */
> +		ret = nfp_net_nfd3_tx_queue_setup(eth_dev, i, 1024, numa_node,
> +			&tx_conf);
> +		if (ret) {
> +			PMD_INIT_LOG(ERR, "Configure flower PF vNIC Tx queue %d failed", i);
> +			goto tx_queue_cleanup;
> +		}
> +	}
> +
> +	return 0;
> +
> +tx_queue_cleanup:
> +	for (i = 0; i < n_txq; i++)
> +		nfp_net_tx_queue_release(eth_dev, i);
> +rx_queue_cleanup:
> +	for (i = 0; i < n_rxq; i++)
> +		nfp_net_rx_queue_release(eth_dev, i);
> +mac_cleanup:
> +	rte_free(eth_dev->data->mac_addrs);
> +mempool_cleanup:
> +	rte_mempool_free(mp);
> +private_cleanup:
> +	rte_free(eth_dev->data->dev_private);
> +port_release:
> +	rte_eth_dev_release_port(hw->eth_dev);
> +done:
> +	return ret;
> +}
> +
>   int
>   nfp_init_app_flower(struct nfp_pf_dev *pf_dev)
>   {
> @@ -77,14 +403,49 @@
>   		goto app_cleanup;
>   	}
>   
> +	/* Grab the number of physical ports present on hardware */
> +	app_flower->nfp_eth_table = nfp_eth_read_ports(pf_dev->cpp);
> +	if (app_flower->nfp_eth_table == NULL) {
> +		PMD_INIT_LOG(ERR, "error reading nfp ethernet table");
> +		ret = -EIO;
> +		goto vnic_cleanup;
> +	}
> +
> +	/* Map the PF ctrl bar */
> +	pf_dev->ctrl_bar = nfp_rtsym_map(pf_dev->sym_tbl, "_pf0_net_bar0",
> +			32768, &pf_dev->ctrl_area);
> +	if (pf_dev->ctrl_bar == NULL) {
> +		PMD_INIT_LOG(ERR, "Cloud not map the PF vNIC ctrl bar");
> +		ret = -ENODEV;
> +		goto eth_tbl_cleanup;
> +	}
> +
> +	/* Fill in the PF vNIC and populate app struct */
> +	app_flower->pf_hw = pf_hw;
> +	pf_hw->ctrl_bar = pf_dev->ctrl_bar;
> +	pf_hw->pf_dev = pf_dev;
> +	pf_hw->cpp = pf_dev->cpp;
> +
> +	ret = nfp_flower_init_pf_vnic(app_flower->pf_hw);
> +	if (ret) {

Compare vs 0

> +		PMD_INIT_LOG(ERR, "Could not initialize flower PF vNIC");
> +		goto pf_cpp_area_cleanup;
> +	}
> +
>   	/* Start up flower services */
>   	if (nfp_flower_enable_services(app_flower)) {
>   		ret = -ESRCH;
> -		goto vnic_cleanup;
> +		goto pf_vnic_cleanup;
>   	}
>   
>   	return 0;
>   
> +pf_vnic_cleanup:
> +	nfp_flower_cleanup_pf_vnic(app_flower->pf_hw);
> +pf_cpp_area_cleanup:
> +	nfp_cpp_area_free(pf_dev->ctrl_area);
> +eth_tbl_cleanup:
> +	free(app_flower->nfp_eth_table);
>   vnic_cleanup:
>   	rte_free(pf_hw);
>   app_cleanup:
> @@ -94,8 +455,23 @@
>   }
>   
>   int
> -nfp_secondary_init_app_flower(__rte_unused struct nfp_cpp *cpp)
> +nfp_secondary_init_app_flower(struct nfp_cpp *cpp)
>   {
> -	PMD_INIT_LOG(ERR, "Flower firmware not supported");
> -	return -ENOTSUP;
> +	struct rte_eth_dev *eth_dev;
> +	const char *port_name = "pf_vnic_eth_dev";
> +
> +	PMD_DRV_LOG(DEBUG, "Secondary attaching to port %s", port_name);
> +
> +	eth_dev = rte_eth_dev_attach_secondary(port_name);
> +	if (eth_dev == NULL) {
> +		RTE_LOG(ERR, EAL, "secondary process attach failed, "
> +			"ethdev doesn't exist");
> +		return -ENODEV;
> +	}
> +
> +	eth_dev->process_private = cpp;
> +	eth_dev->dev_ops = &nfp_flower_pf_dev_ops;
> +	rte_eth_dev_probing_finish(eth_dev);
> +
> +	return 0;
>   }
> diff --git a/drivers/net/nfp/flower/nfp_flower.h b/drivers/net/nfp/flower/nfp_flower.h
> index 4a9b302..f6fd4eb 100644
> --- a/drivers/net/nfp/flower/nfp_flower.h
> +++ b/drivers/net/nfp/flower/nfp_flower.h
> @@ -14,6 +14,15 @@ enum nfp_flower_service {
>   struct nfp_app_flower {
>   	/* List of rte_service ID's for the flower app */
>   	uint32_t flower_services_ids[NFP_FLOWER_SERVICE_MAX];
> +
> +	/* Pointer to a mempool for the PF vNIC */
> +	struct rte_mempool *pf_pktmbuf_pool;
> +
> +	/* Pointer to the PF vNIC */
> +	struct nfp_net_hw *pf_hw;
> +
> +	/* the eth table as reported by firmware */
> +	struct nfp_eth_table *nfp_eth_table;
>   };
>   
>   int nfp_init_app_flower(struct nfp_pf_dev *pf_dev);
> diff --git a/drivers/net/nfp/flower/nfp_flower_ovs_compat.h b/drivers/net/nfp/flower/nfp_flower_ovs_compat.h
> new file mode 100644
> index 0000000..f0fcbf2
> --- /dev/null
> +++ b/drivers/net/nfp/flower/nfp_flower_ovs_compat.h
> @@ -0,0 +1,145 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright (c) 2022 Corigine, Inc.
> + * All rights reserved.
> + */
> +
> +#ifndef _NFP_FLOWER_OVS_COMPAT_H_
> +#define _NFP_FLOWER_OVS_COMPAT_H_
> +

The lines below come from OvS, and the corresponding file is licensed
under the Apache 2.0 license. It is just a few lines, but still.
I'm not sure that it is OK to change the license and drop the copyright.
I'm adding more people in Cc to tell me if my concerns are wrong.

@Thomas, @Stephen, @Hemant ?

> +/* From ovs */
> +#define PAD_PASTE2(x, y) x##y
> +#define PAD_PASTE(x, y) PAD_PASTE2(x, y)
> +#define PAD_ID PAD_PASTE(pad, __COUNTER__)
> +
> +/* Returns X rounded up to the nearest multiple of Y. */
> +#define ROUND_UP(X, Y) (DIV_ROUND_UP(X, Y) * (Y))
> +
> +typedef uint8_t OVS_CACHE_LINE_MARKER[1];
> +
> +#ifndef __cplusplus
> +#define PADDED_MEMBERS_CACHELINE_MARKER(UNIT, CACHELINE, MEMBERS)   \
> +	union {                                                         \
> +		OVS_CACHE_LINE_MARKER CACHELINE;                            \
> +		struct { MEMBERS };                                         \
> +		uint8_t PAD_ID[ROUND_UP(sizeof(struct { MEMBERS }), UNIT)]; \
> +	}
> +#else
> +#define PADDED_MEMBERS_CACHELINE_MARKER(UNIT, CACHELINE, MEMBERS)           \
> +	struct struct_##CACHELINE { MEMBERS };                                  \

Confused to see duplicate 'struct struct' above.

> +	union {                                                                 \
> +		OVS_CACHE_LINE_MARKER CACHELINE;                                    \
> +		struct { MEMBERS };                                                 \
> +		uint8_t PAD_ID[ROUND_UP(sizeof(struct struct_##CACHELINE), UNIT)];  \
> +	}
> +#endif
> +
> +struct ovs_key_ct_tuple_ipv4 {
> +	rte_be32_t ipv4_src;
> +	rte_be32_t ipv4_dst;
> +	rte_be16_t src_port;
> +	rte_be16_t dst_port;
> +	uint8_t    ipv4_proto;
> +};
> +
> +struct ovs_key_ct_tuple_ipv6 {
> +	rte_be32_t ipv6_src[4];
> +	rte_be32_t ipv6_dst[4];
> +	rte_be16_t src_port;
> +	rte_be16_t dst_port;
> +	uint8_t    ipv6_proto;
> +};
> +
> +/* Tunnel information used in flow key and metadata. */
> +struct flow_tnl {
> +	uint32_t ip_dst;
> +	struct in6_addr ipv6_dst;
> +	uint32_t ip_src;
> +	struct in6_addr ipv6_src;
> +	uint64_t tun_id;
> +	uint16_t flags;
> +	uint8_t ip_tos;
> +	uint8_t ip_ttl;
> +	uint16_t tp_src;
> +	uint16_t tp_dst;
> +	uint16_t gbp_id;
> +	uint8_t  gbp_flags;
> +	uint8_t erspan_ver;
> +	uint32_t erspan_idx;
> +	uint8_t erspan_dir;
> +	uint8_t erspan_hwid;
> +	uint8_t gtpu_flags;
> +	uint8_t gtpu_msgtype;
> +	uint8_t pad1[4];     /* Pad to 64 bits. */
> +};
> +
> +enum dp_packet_source {
> +	DPBUF_MALLOC,              /* Obtained via malloc(). */
> +	DPBUF_STACK,               /* Un-movable stack space or static buffer. */
> +	DPBUF_STUB,                /* Starts on stack, may expand into heap. */
> +	DPBUF_DPDK,                /* buffer data is from DPDK allocated memory. */
> +	DPBUF_AFXDP,               /* Buffer data from XDP frame. */
> +};
> +
> +/* Datapath packet metadata */
> +struct pkt_metadata {
> +PADDED_MEMBERS_CACHELINE_MARKER(RTE_CACHE_LINE_SIZE, cacheline0,

If it is DPDK-specific code, why do you prefer such macros instead
of the approach used for rte_mbuf?
	RTE_MARKER cacheline0;
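
A rough sketch of what the marker-based layout could look like is given below. It is self-contained, so it defines its own example_marker type, which is assumed to mirror how RTE_MARKER is declared in rte_common.h (a zero-length array typedef, a GNU extension). The struct name, the 64-byte line size, and the reduced field list are illustrative only.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed cache line size for this sketch. */
#define EXAMPLE_CACHE_LINE_SIZE 64

/* Local stand-in assumed to mirror RTE_MARKER: occupies no space. */
__extension__ typedef void *example_marker[0];

/* Hypothetical, trimmed-down metadata struct with one marker per cache line. */
struct example_pkt_metadata {
	example_marker cacheline0;
	uint32_t recirc_id;
	uint32_t dp_hash;

	/* The aligned marker pushes the following members to the next line. */
	example_marker cacheline1
		__attribute__((aligned(EXAMPLE_CACHE_LINE_SIZE)));
	uint32_t ct_mark;

	example_marker cacheline2
		__attribute__((aligned(EXAMPLE_CACHE_LINE_SIZE)));
	uint64_t tun_id;
} __attribute__((aligned(EXAMPLE_CACHE_LINE_SIZE)));
```

Whatever form is used, the resulting layout has to stay binary compatible with the struct OvS itself uses, so the offsets would need to be checked against the OvS definition.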

> +	/* Recirculation id carried with the recirculating packets. */
> +	uint32_t recirc_id;         /* 0 for packets received from the wire. */
> +	uint32_t dp_hash;           /* hash value computed by the recirculation action. */
> +	uint32_t skb_priority;      /* Packet priority for QoS. */
> +	uint32_t pkt_mark;          /* Packet mark. */
> +	uint8_t  ct_state;          /* Connection state. */
> +	bool ct_orig_tuple_ipv6;
> +	uint16_t ct_zone;           /* Connection zone. */
> +	uint32_t ct_mark;           /* Connection mark. */
> +	uint32_t ct_label[4];       /* Connection label. */
> +	uint32_t in_port;           /* Input port. */
> +	uint32_t orig_in_port;      /* Originating in_port for tunneled packets */
> +	void *conn;                 /* Cached conntrack connection. */
> +	bool reply;                 /* True if reply direction. */
> +	bool icmp_related;          /* True if ICMP related. */
> +);
> +
> +PADDED_MEMBERS_CACHELINE_MARKER(RTE_CACHE_LINE_SIZE, cacheline1,
> +	union {                     /* Populated only for non-zero 'ct_state'. */
> +		struct ovs_key_ct_tuple_ipv4 ipv4;
> +		struct ovs_key_ct_tuple_ipv6 ipv6;   /* Used only if */
> +	} ct_orig_tuple;                             /* 'ct_orig_tuple_ipv6' is set */
> +);
> +
> +/*
> + * Encapsulating tunnel parameters. Note that if 'ip_dst' == 0,
> + * the rest of the fields may be uninitialized.
> + */
> +PADDED_MEMBERS_CACHELINE_MARKER(RTE_CACHE_LINE_SIZE, cacheline2,
> +	struct flow_tnl tunnel;);
> +};
> +
> +#define DP_PACKET_CONTEXT_SIZE 64
> +
> +/*
> + * Buffer for holding packet data.  A dp_packet is automatically reallocated
> + * as necessary if it grows too large for the available memory.
> + * By default the packet type is set to Ethernet (PT_ETH).
> + */
> +struct dp_packet {
> +	struct rte_mbuf mbuf;          /* DPDK mbuf */
> +	enum dp_packet_source source;  /* Source of memory allocated as 'base'. */
> +
> +	/*
> +	 * All the following elements of this struct are copied in a single call
> +	 * of memcpy in dp_packet_clone_with_headroom.
> +	 */
> +	uint16_t l2_pad_size;          /* Detected l2 padding size. Padding is non-pullable. */
> +	uint16_t l2_5_ofs;             /* MPLS label stack offset, or UINT16_MAX */
> +	uint16_t l3_ofs;               /* Network-level header offset, or UINT16_MAX. */
> +	uint16_t l4_ofs;               /* Transport-level header offset, or UINT16_MAX. */
> +	uint32_t cutlen;               /* length in bytes to cut from the end. */
> +	uint32_t packet_type;          /* Packet type as defined in OpenFlow */
> +	union {
> +		struct pkt_metadata md;
> +		uint64_t data[DP_PACKET_CONTEXT_SIZE / 8];
> +	};
> +};
> +
> +#endif /* _NFP_FLOWER_OVS_COMPAT_ */
> diff --git a/drivers/net/nfp/nfp_common.h b/drivers/net/nfp/nfp_common.h
> index b28ebc9..ab2e5c2 100644
> --- a/drivers/net/nfp/nfp_common.h
> +++ b/drivers/net/nfp/nfp_common.h
> @@ -448,6 +448,9 @@ int nfp_net_rss_hash_conf_get(struct rte_eth_dev *dev,
>   #define NFP_APP_PRIV_TO_APP_NIC(app_priv)\
>   	((struct nfp_app_nic *)app_priv)
>   
> +#define NFP_APP_PRIV_TO_APP_FLOWER(app_priv)\
> +	((struct nfp_app_flower *)app_priv)
> +

As with NFP_APP_PRIV_TO_APP_NIC, it is better to make this a tiny
function that takes a struct nfp_pf_dev pointer as input and validates
app_id before the type cast.
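
Something along these lines, as a minimal sketch. The struct definitions are trimmed, hypothetical stand-ins holding only the fields used here; the helper name nfp_pf_dev_to_app_flower() and the enum values are assumptions made for illustration.

```c
#include <stddef.h>

/* Stand-in enum; the numeric values are assumed for illustration. */
enum nfp_app_id {
	NFP_APP_CORE_NIC = 1,
	NFP_APP_FLOWER_NIC = 3,
};

/* Stand-in for the flower app context. */
struct nfp_app_flower {
	int placeholder;
};

/* Stand-in for the PF device; only the fields used below. */
struct nfp_pf_dev {
	enum nfp_app_id app_id;
	void *app_priv;
};

/* Return the flower context, or NULL if the PF is running another app. */
static inline struct nfp_app_flower *
nfp_pf_dev_to_app_flower(const struct nfp_pf_dev *pf_dev)
{
	if (pf_dev == NULL || pf_dev->app_id != NFP_APP_FLOWER_NIC)
		return NULL;

	return (struct nfp_app_flower *)pf_dev->app_priv;
}
```

Compared with the macro, a function gives type checking on the argument and a single place to reject a mismatched app_id. Callers would then have to handle a NULL return.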

>   #endif /* _NFP_COMMON_H_ */
>   /*
>    * Local variables:



* Re: [PATCH v5 06/12] net/nfp: add flower PF related routines
  2022-08-05  6:32 ` [PATCH v5 06/12] net/nfp: add flower PF related routines Chaoyong He
@ 2022-08-05 12:55   ` Andrew Rybchenko
  0 siblings, 0 replies; 29+ messages in thread
From: Andrew Rybchenko @ 2022-08-05 12:55 UTC (permalink / raw)
  To: Chaoyong He, dev; +Cc: niklas.soderlund

On 8/5/22 09:32, Chaoyong He wrote:
> This commit adds the start/stop/close routine of the

"This commit adds" -> "Add"

Typically close is paired with configure.

> flower PF vNIC.
> 
> Signed-off-by: Chaoyong He <chaoyong.he@corigine.com>
> Reviewed-by: Niklas Söderlund <niklas.soderlund@corigine.com>
> ---
>   drivers/net/nfp/flower/nfp_flower.c | 193 ++++++++++++++++++++++++++++++++++++
>   1 file changed, 193 insertions(+)
> 
> diff --git a/drivers/net/nfp/flower/nfp_flower.c b/drivers/net/nfp/flower/nfp_flower.c
> index c05d4ca..2498020 100644
> --- a/drivers/net/nfp/flower/nfp_flower.c
> +++ b/drivers/net/nfp/flower/nfp_flower.c
> @@ -7,6 +7,7 @@
>   #include <ethdev_driver.h>
>   #include <rte_service_component.h>
>   #include <rte_malloc.h>
> +#include <rte_alarm.h>
>   #include <ethdev_pci.h>
>   #include <ethdev_driver.h>
>   
> @@ -37,11 +38,178 @@
>   	return 0;
>   }
>   
> +static int
> +nfp_flower_pf_start(struct rte_eth_dev *dev)
> +{
> +	int ret;
> +	uint32_t new_ctrl;
> +	uint32_t update = 0;
> +	struct nfp_net_hw *hw;
> +
> +	hw = NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
> +
> +	/* Disabling queues just in case... */
> +	nfp_net_disable_queues(dev);
> +
> +	/* Enabling the required queues in the device */
> +	nfp_net_enable_queues(dev);
> +
> +	new_ctrl = nfp_check_offloads(dev);
> +
> +	/* Writing configuration parameters in the device */
> +	nfp_net_params_setup(hw);
> +
> +	nfp_net_rss_config_default(dev);
> +	update |= NFP_NET_CFG_UPDATE_RSS;
> +
> +	if (hw->cap & NFP_NET_CFG_CTRL_RSS2)
> +		new_ctrl |= NFP_NET_CFG_CTRL_RSS2;
> +	else
> +		new_ctrl |= NFP_NET_CFG_CTRL_RSS;
> +
> +	/* Enable device */
> +	new_ctrl |= NFP_NET_CFG_CTRL_ENABLE;
> +
> +	update |= NFP_NET_CFG_UPDATE_GEN | NFP_NET_CFG_UPDATE_RING;
> +
> +	if (hw->cap & NFP_NET_CFG_CTRL_RINGCFG)
> +		new_ctrl |= NFP_NET_CFG_CTRL_RINGCFG;
> +
> +	nn_cfg_writel(hw, NFP_NET_CFG_CTRL, new_ctrl);
> +
> +	/* If an error when reconfig we avoid to change hw state */
> +	ret = nfp_net_reconfig(hw, new_ctrl, update);
> +	if (ret) {

Compare vs 0

> +		PMD_INIT_LOG(ERR, "Failed to reconfig PF vnic");
> +		return -EIO;
> +	}
> +
> +	hw->ctrl = new_ctrl;
> +
> +	/* Setup the freelist ring */
> +	ret = nfp_net_rx_freelist_setup(dev);
> +	if (ret) {

Compare vs 0

> +		PMD_INIT_LOG(ERR, "Error with flower PF vNIC freelist setup");
> +		return -EIO;
> +	}
> +
> +	return 0;
> +}
> +
> +/* Stop device: disable rx and tx functions to allow for reconfiguring. */
> +static int
> +nfp_flower_pf_stop(struct rte_eth_dev *dev)
> +{
> +	uint16_t i;
> +	struct nfp_net_hw *hw;
> +	struct nfp_net_txq *this_tx_q;
> +	struct nfp_net_rxq *this_rx_q;
> +
> +	hw = NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
> +
> +	nfp_net_disable_queues(dev);
> +
> +	/* Clear queues */
> +	for (i = 0; i < dev->data->nb_tx_queues; i++) {
> +		this_tx_q = (struct nfp_net_txq *)dev->data->tx_queues[i];
> +		nfp_net_reset_tx_queue(this_tx_q);
> +	}
> +
> +	for (i = 0; i < dev->data->nb_rx_queues; i++) {
> +		this_rx_q = (struct nfp_net_rxq *)dev->data->rx_queues[i];
> +		nfp_net_reset_rx_queue(this_rx_q);
> +	}
> +
> +	if (rte_eal_process_type() == RTE_PROC_PRIMARY)
> +		/* Configure the physical port down */
> +		nfp_eth_set_configured(hw->cpp, hw->nfp_idx, 0);
> +	else
> +		nfp_eth_set_configured(dev->process_private, hw->nfp_idx, 0);
> +
> +	return 0;
> +}
> +
> +/* Reset and stop device. The device can not be restarted. */
> +static int
> +nfp_flower_pf_close(struct rte_eth_dev *dev)
> +{
> +	uint16_t i;
> +	struct nfp_net_hw *hw;
> +	struct nfp_pf_dev *pf_dev;
> +	struct nfp_net_txq *this_tx_q;
> +	struct nfp_net_rxq *this_rx_q;
> +	struct rte_pci_device *pci_dev;
> +	struct nfp_app_flower *app_flower;
> +
> +	if (rte_eal_process_type() != RTE_PROC_PRIMARY)
> +		return 0;
> +
> +	pf_dev = NFP_NET_DEV_PRIVATE_TO_PF(dev->data->dev_private);
> +	hw = NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
> +	pci_dev = RTE_ETH_DEV_TO_PCI(dev);
> +	app_flower = NFP_APP_PRIV_TO_APP_FLOWER(pf_dev->app_priv);
> +
> +	/*
> +	 * We assume that the DPDK application is stopping all the
> +	 * threads/queues before calling the device close function.
> +	 */
> +
> +	nfp_net_disable_queues(dev);
> +
> +	/* Clear queues */
> +	for (i = 0; i < dev->data->nb_tx_queues; i++) {
> +		this_tx_q = (struct nfp_net_txq *)dev->data->tx_queues[i];
> +		nfp_net_reset_tx_queue(this_tx_q);
> +	}
> +
> +	for (i = 0; i < dev->data->nb_rx_queues; i++) {
> +		this_rx_q = (struct nfp_net_rxq *)dev->data->rx_queues[i];
> +		nfp_net_reset_rx_queue(this_rx_q);
> +	}
> +
> +	/* Cancel possible impending LSC work here before releasing the port*/
> +	rte_eal_alarm_cancel(nfp_net_dev_interrupt_delayed_handler, (void *)dev);
> +
> +	nn_cfg_writeb(hw, NFP_NET_CFG_LSC, 0xff);
> +
> +	rte_eth_dev_release_port(dev);
> +
> +	/* Now it is safe to free all PF resources */
> +	PMD_INIT_LOG(INFO, "Freeing PF resources");
> +	nfp_cpp_area_free(pf_dev->ctrl_area);
> +	nfp_cpp_area_free(pf_dev->hwqueues_area);
> +	free(pf_dev->hwinfo);
> +	free(pf_dev->sym_tbl);
> +	nfp_cpp_free(pf_dev->cpp);
> +	rte_free(app_flower);
> +	rte_free(pf_dev);
> +
> +	rte_intr_disable(pci_dev->intr_handle);
> +
> +	/* unregister callback func from eal lib */
> +	rte_intr_callback_unregister(pci_dev->intr_handle,
> +			nfp_net_dev_interrupt_handler, (void *)dev);
> +
> +	return 0;
> +}
> +
> +static int
> +nfp_flower_pf_link_update(__rte_unused struct rte_eth_dev *dev,
> +		__rte_unused int wait_to_complete)
> +{

This is a really confusing implementation of the operation.
Could you explain why a dummy implementation is OK? Why is it required?

> +	return 0;
> +}
> +
>   static const struct eth_dev_ops nfp_flower_pf_dev_ops = {
>   	.dev_configure          = nfp_flower_pf_configure,
>   
>   	/* Use the normal dev_infos_get functionality in the NFP PMD */
>   	.dev_infos_get          = nfp_net_infos_get,
> +
> +	.dev_start              = nfp_flower_pf_start,
> +	.dev_stop               = nfp_flower_pf_stop,
> +	.dev_close              = nfp_flower_pf_close,
> +	.link_update            = nfp_flower_pf_link_update,
>   };
>   
>   static struct rte_service_spec flower_services[NFP_FLOWER_SERVICE_MAX] = {
> @@ -375,6 +543,24 @@
>   	return ret;
>   }
>   
> +static int
> +nfp_flower_start_pf_vnic(struct nfp_net_hw *hw)
> +{
> +	int ret;
> +	uint16_t port_id;
> +
> +	port_id = hw->eth_dev->data->port_id;
> +
> +	/* Start the device */
> +	ret = rte_eth_dev_start(port_id);
> +	if (ret) {

Compare vs 0

> +		PMD_INIT_LOG(ERR, "Could not start PF device %d", port_id);
> +		return ret;
> +	}
> +
> +	return 0;
> +}
> +
>   int
>   nfp_init_app_flower(struct nfp_pf_dev *pf_dev)
>   {
> @@ -432,6 +618,13 @@
>   		goto pf_cpp_area_cleanup;
>   	}
>   
> +	/* Start the PF vNIC */
> +	ret = nfp_flower_start_pf_vnic(app_flower->pf_hw);
> +	if (ret) {

Compare vs 0

> +		PMD_INIT_LOG(ERR, "Could not start flower PF vNIC");
> +		goto pf_vnic_cleanup;
> +	}
> +
>   	/* Start up flower services */
>   	if (nfp_flower_enable_services(app_flower)) {
>   		ret = -ESRCH;



* Re: [PATCH v5 07/12] net/nfp: add flower ctrl VNIC related logics
  2022-08-05  6:32 ` [PATCH v5 07/12] net/nfp: add flower ctrl VNIC related logics Chaoyong He
@ 2022-08-05 13:05   ` Andrew Rybchenko
  2022-08-08 11:32     ` Chaoyong He
  0 siblings, 1 reply; 29+ messages in thread
From: Andrew Rybchenko @ 2022-08-05 13:05 UTC (permalink / raw)
  To: Chaoyong He, dev; +Cc: niklas.soderlund

On 8/5/22 09:32, Chaoyong He wrote:
> This commit adds the setup/start logic for the ctrl vNIC. This vNIC

"This commit adds" -> "Add"

> is used by the PMD and flower firmware as a communication channel
> between driver and firmware. In the case of OVS it is also used to
> communicate flow statistics from hardware to the driver.
> 
> A rte_eth device is not exposed to DPDK for this vNIC as it is strictly
> used internally by flower logic. Rx and Tx logic will be added later for
> this vNIC.
> 
> Signed-off-by: Chaoyong He <chaoyong.he@corigine.com>
> Reviewed-by: Niklas Söderlund <niklas.soderlund@corigine.com>
> ---
>   drivers/net/nfp/flower/nfp_flower.c | 385 +++++++++++++++++++++++++++++++++++-
>   drivers/net/nfp/flower/nfp_flower.h |   6 +
>   2 files changed, 388 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/nfp/flower/nfp_flower.c b/drivers/net/nfp/flower/nfp_flower.c
> index 2498020..51df504 100644
> --- a/drivers/net/nfp/flower/nfp_flower.c
> +++ b/drivers/net/nfp/flower/nfp_flower.c
> @@ -26,6 +26,10 @@
>   #define MEMPOOL_CACHE_SIZE 512
>   #define DEFAULT_FLBUF_SIZE 9216
>   
> +#define CTRL_VNIC_NB_DESC 64
> +#define CTRL_VNIC_RX_FREE_THRESH 32
> +#define CTRL_VNIC_TX_FREE_THRESH 32
> +
>   /*
>    * Simple dev ops functions for the flower PF. Because a rte_device is exposed
>    * to DPDK the flower logic also makes use of helper functions like
> @@ -543,6 +547,302 @@
>   	return ret;
>   }
>   
> +static void
> +nfp_flower_cleanup_ctrl_vnic(struct nfp_net_hw *hw)
> +{
> +	uint32_t i;
> +	struct nfp_net_rxq *rxq;
> +	struct nfp_net_txq *txq;
> +	struct rte_eth_dev *eth_dev;
> +
> +	eth_dev = hw->eth_dev;
> +
> +	for (i = 0; i < hw->max_tx_queues; i++) {
> +		txq = eth_dev->data->tx_queues[i];
> +		if (txq) {

Compare vs NULL, as you do in other cases and as the DPDK coding style
recommends.
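
A self-contained sketch of such a cleanup loop with an explicit NULL
comparison. struct demo_queue and demo_release_queues() are hypothetical,
and plain free() stands in for rte_free():

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a queue structure owning one buffer array. */
struct demo_queue {
	void *bufs;
};

/* Release every allocated queue in the array; returns how many were freed. */
static unsigned int
demo_release_queues(struct demo_queue **queues, unsigned int nb_queues)
{
	unsigned int i;
	unsigned int released = 0;

	for (i = 0; i < nb_queues; i++) {
		/* Explicit pointer comparison instead of "if (queues[i])" */
		if (queues[i] == NULL)
			continue;

		free(queues[i]->bufs);
		free(queues[i]);
		queues[i] = NULL;
		released++;
	}

	return released;
}
```

Resetting the slot to NULL after freeing is an extra choice made in this
sketch so that a second pass over the same array is harmless.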

> +			rte_free(txq->txbufs);
> +			rte_eth_dma_zone_free(eth_dev, "ctrl_tx_ring", i);
> +			rte_free(txq);
> +		}
> +	}
> +
> +	for (i = 0; i < hw->max_rx_queues; i++) {
> +		rxq = eth_dev->data->rx_queues[i];
> +		if (rxq) {

Compare vs NULL

> +			rte_free(rxq->rxbufs);
> +			rte_eth_dma_zone_free(eth_dev, "ctrl_rx_ring", i);
> +			rte_free(rxq);
> +		}
> +	}
> +
> +	rte_free(eth_dev->data->tx_queues);
> +	rte_free(eth_dev->data->rx_queues);
> +	rte_free(eth_dev->data);
> +	rte_free(eth_dev);
> +}
> +
> +static int
> +nfp_flower_init_ctrl_vnic(struct nfp_net_hw *hw)
> +{
> +	uint32_t i;
> +	int ret = 0;
> +	uint16_t nb_desc;
> +	unsigned int numa_node;
> +	struct rte_mempool *mp;
> +	uint16_t rx_free_thresh;
> +	uint16_t tx_free_thresh;
> +	struct nfp_net_rxq *rxq;
> +	struct nfp_net_txq *txq;
> +	struct nfp_pf_dev *pf_dev;
> +	struct rte_eth_dev *eth_dev;
> +	const struct rte_memzone *tz;
> +	struct nfp_app_flower *app_flower;
> +
> +	/* Hardcoded values for now */
> +	nb_desc = CTRL_VNIC_NB_DESC;
> +	rx_free_thresh = CTRL_VNIC_RX_FREE_THRESH;

What's the point of introducing a variable that is used only
once below?

> +	tx_free_thresh = CTRL_VNIC_TX_FREE_THRESH;

Same here.
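
A short sketch of the alternative, assuming the thresholds stay
hardcoded. The two macros are the ones added by this patch; the trimmed
structures and demo_set_thresholds() are hypothetical:

```c
#include <stdint.h>

#define CTRL_VNIC_RX_FREE_THRESH 32
#define CTRL_VNIC_TX_FREE_THRESH 32

/* Hypothetical, trimmed-down stand-ins for the driver's queue structures. */
struct demo_rxq {
	uint16_t rx_free_thresh;
};

struct demo_txq {
	uint16_t tx_free_thresh;
};

/* Use the constants at the point of assignment, with no intermediate locals. */
static void
demo_set_thresholds(struct demo_rxq *rxq, struct demo_txq *txq)
{
	rxq->rx_free_thresh = CTRL_VNIC_RX_FREE_THRESH;
	txq->tx_free_thresh = CTRL_VNIC_TX_FREE_THRESH;
}
```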

> +	numa_node = rte_socket_id();
> +
> +	/* Set up some pointers here for ease of use */
> +	pf_dev = hw->pf_dev;
> +	app_flower = NFP_APP_PRIV_TO_APP_FLOWER(pf_dev->app_priv);
> +
> +	ret = nfp_flower_init_vnic_common(hw, "ctrl_vnic");
> +	if (ret)

Compare vs 0

> +		goto done;
> +
> +	/* Allocate memory for the eth_dev of the vNIC */
> +	hw->eth_dev = rte_zmalloc("ctrl_vnic_eth_dev",

Why not rte_eth_dev_allocate()? Isn't it an ethdev?
Why do you bypass the ethdev layer completely in this case and do
everything yourself?

> +		sizeof(struct rte_eth_dev), RTE_CACHE_LINE_SIZE);
> +	if (hw->eth_dev == NULL) {
> +		ret = -ENOMEM;
> +		goto done;
> +	}
> +
> +	/* Grab the pointer to the newly created rte_eth_dev here */
> +	eth_dev = hw->eth_dev;
> +
> +	/* Also allocate memory for the data part of the eth_dev */
> +	eth_dev->data = rte_zmalloc("ctrl_vnic_eth_dev_data",
> +		sizeof(struct rte_eth_dev_data), RTE_CACHE_LINE_SIZE);
> +	if (eth_dev->data == NULL) {
> +		ret = -ENOMEM;
> +		goto eth_dev_cleanup;
> +	}
> +
> +	eth_dev->data->rx_queues = rte_zmalloc("ethdev->rx_queues",
> +		sizeof(eth_dev->data->rx_queues[0]) * hw->max_rx_queues,
> +		RTE_CACHE_LINE_SIZE);
> +	if (eth_dev->data->rx_queues == NULL) {
> +		PMD_INIT_LOG(ERR, "rte_zmalloc failed for ctrl vnic rx queues");
> +		ret = -ENOMEM;
> +		goto dev_data_cleanup;
> +	}
> +
> +	eth_dev->data->tx_queues = rte_zmalloc("ethdev->tx_queues",
> +		sizeof(eth_dev->data->tx_queues[0]) * hw->max_tx_queues,
> +		RTE_CACHE_LINE_SIZE);
> +	if (eth_dev->data->tx_queues == NULL) {
> +		PMD_INIT_LOG(ERR, "rte_zmalloc failed for ctrl vnic tx queues");
> +		ret = -ENOMEM;
> +		goto rx_queue_cleanup;
> +	}
> +
> +	eth_dev->device = &pf_dev->pci_dev->device;
> +	eth_dev->data->nb_tx_queues = hw->max_tx_queues;
> +	eth_dev->data->nb_rx_queues = hw->max_rx_queues;
> +	eth_dev->data->dev_private = hw;
> +
> +	/* Create a mbuf pool for the vNIC */
> +	app_flower->ctrl_pktmbuf_pool = rte_pktmbuf_pool_create("ctrl_mbuf_pool",
> +		4 * nb_desc, 64, 0, 9216, numa_node);
> +	if (app_flower->ctrl_pktmbuf_pool == NULL) {
> +		PMD_INIT_LOG(ERR, "create mbuf pool for ctrl vnic failed");
> +		ret = -ENOMEM;
> +		goto tx_queue_cleanup;
> +	}
> +
> +	mp = app_flower->ctrl_pktmbuf_pool;
> +
> +	/* Set up the Rx queues */
> +	PMD_INIT_LOG(INFO, "Configuring flower ctrl vNIC Rx queue");
> +	for (i = 0; i < hw->max_rx_queues; i++) {
> +		/* Hardcoded number of desc to 64 */
> +		rxq = rte_zmalloc_socket("ethdev RX queue",
> +			sizeof(struct nfp_net_rxq), RTE_CACHE_LINE_SIZE,
> +			numa_node);
> +		if (rxq == NULL) {
> +			PMD_DRV_LOG(ERR, "Error allocating rxq");
> +			ret = -ENOMEM;
> +			goto rx_queue_setup_cleanup;
> +		}
> +
> +		eth_dev->data->rx_queues[i] = rxq;
> +
> +		/* Hw queues mapping based on firmware configuration */
> +		rxq->qidx = i;
> +		rxq->fl_qcidx = i * hw->stride_rx;
> +		rxq->rx_qcidx = rxq->fl_qcidx + (hw->stride_rx - 1);
> +		rxq->qcp_fl = hw->rx_bar + NFP_QCP_QUEUE_OFF(rxq->fl_qcidx);
> +		rxq->qcp_rx = hw->rx_bar + NFP_QCP_QUEUE_OFF(rxq->rx_qcidx);
> +
> +		/*
> +		 * Tracking mbuf size for detecting a potential mbuf overflow due to
> +		 * RX offset
> +		 */
> +		rxq->mem_pool = mp;
> +		rxq->mbuf_size = rxq->mem_pool->elt_size;
> +		rxq->mbuf_size -= (sizeof(struct rte_mbuf) + RTE_PKTMBUF_HEADROOM);
> +		hw->flbufsz = rxq->mbuf_size;
> +
> +		rxq->rx_count = nb_desc;
> +		rxq->rx_free_thresh = rx_free_thresh;
> +		rxq->drop_en = 1;
> +
> +		/*
> +		 * Allocate RX ring hardware descriptors. A memzone large enough to
> +		 * handle the maximum ring size is allocated in order to allow for
> +		 * resizing in later calls to the queue setup function.
> +		 */
> +		tz = rte_eth_dma_zone_reserve(eth_dev, "ctrl_rx_ring", i,
> +			sizeof(struct nfp_net_rx_desc) * NFP_NET_MAX_RX_DESC,
> +			NFP_MEMZONE_ALIGN, numa_node);
> +		if (tz == NULL) {
> +			PMD_DRV_LOG(ERR, "Error allocating rx dma");
> +			rte_free(rxq);
> +			ret = -ENOMEM;
> +			goto rx_queue_setup_cleanup;
> +		}
> +
> +		/* Saving physical and virtual addresses for the RX ring */
> +		rxq->dma = (uint64_t)tz->iova;
> +		rxq->rxds = (struct nfp_net_rx_desc *)tz->addr;
> +
> +		/* mbuf pointers array for referencing mbufs linked to RX descriptors */
> +		rxq->rxbufs = rte_zmalloc_socket("rxq->rxbufs",
> +			sizeof(*rxq->rxbufs) * nb_desc, RTE_CACHE_LINE_SIZE,
> +			numa_node);
> +		if (rxq->rxbufs == NULL) {
> +			rte_eth_dma_zone_free(eth_dev, "ctrl_rx_ring", i);
> +			rte_free(rxq);
> +			ret = -ENOMEM;
> +			goto rx_queue_setup_cleanup;
> +		}
> +
> +		nfp_net_reset_rx_queue(rxq);
> +
> +		rxq->hw = hw;
> +
> +		/*
> +		 * Telling the HW about the physical address of the RX ring and number
> +		 * of descriptors in log2 format
> +		 */
> +		nn_cfg_writeq(hw, NFP_NET_CFG_RXR_ADDR(i), rxq->dma);
> +		nn_cfg_writeb(hw, NFP_NET_CFG_RXR_SZ(i), rte_log2_u32(nb_desc));
> +	}
> +
> +	/* Now the Tx queues */
> +	PMD_INIT_LOG(INFO, "Configuring flower ctrl vNIC Tx queue");
> +	for (i = 0; i < hw->max_tx_queues; i++) {
> +		/* Hardcoded number of desc to 64 */
> +		/* Allocating tx queue data structure */
> +		txq = rte_zmalloc_socket("ethdev TX queue",
> +			sizeof(struct nfp_net_txq), RTE_CACHE_LINE_SIZE,
> +			numa_node);
> +		if (txq == NULL) {
> +			PMD_DRV_LOG(ERR, "Error allocating txq");
> +			ret = -ENOMEM;
> +			goto tx_queue_setup_cleanup;
> +		}
> +
> +		eth_dev->data->tx_queues[i] = txq;
> +
> +		/*
> +		 * Allocate TX ring hardware descriptors. A memzone large enough to
> +		 * handle the maximum ring size is allocated in order to allow for
> +		 * resizing in later calls to the queue setup function.
> +		 */
> +		tz = rte_eth_dma_zone_reserve(eth_dev, "ctrl_tx_ring", i,
> +			sizeof(struct nfp_net_nfd3_tx_desc) * NFP_NET_MAX_TX_DESC,
> +			NFP_MEMZONE_ALIGN, numa_node);
> +		if (tz == NULL) {
> +			PMD_DRV_LOG(ERR, "Error allocating tx dma");
> +			rte_free(txq);
> +			ret = -ENOMEM;
> +			goto tx_queue_setup_cleanup;
> +		}
> +
> +		txq->tx_count = nb_desc;
> +		txq->tx_free_thresh = tx_free_thresh;
> +		txq->tx_pthresh = DEFAULT_TX_PTHRESH;
> +		txq->tx_hthresh = DEFAULT_TX_HTHRESH;
> +		txq->tx_wthresh = DEFAULT_TX_WTHRESH;
> +
> +		/* queue mapping based on firmware configuration */
> +		txq->qidx = i;
> +		txq->tx_qcidx = i * hw->stride_tx;
> +		txq->qcp_q = hw->tx_bar + NFP_QCP_QUEUE_OFF(txq->tx_qcidx);
> +
> +		/* Saving physical and virtual addresses for the TX ring */
> +		txq->dma = (uint64_t)tz->iova;
> +		txq->txds = (struct nfp_net_nfd3_tx_desc *)tz->addr;
> +
> +		/* mbuf pointers array for referencing mbufs linked to TX descriptors */
> +		txq->txbufs = rte_zmalloc_socket("txq->txbufs",
> +			sizeof(*txq->txbufs) * nb_desc, RTE_CACHE_LINE_SIZE,
> +			numa_node);
> +		if (txq->txbufs == NULL) {
> +			rte_eth_dma_zone_free(eth_dev, "ctrl_tx_ring", i);
> +			rte_free(txq);
> +			ret = -ENOMEM;
> +			goto tx_queue_setup_cleanup;
> +		}
> +
> +		nfp_net_reset_tx_queue(txq);
> +
> +		txq->hw = hw;
> +
> +		/*
> +		 * Telling the HW about the physical address of the TX ring and number
> +		 * of descriptors in log2 format
> +		 */
> +		nn_cfg_writeq(hw, NFP_NET_CFG_TXR_ADDR(i), txq->dma);
> +		nn_cfg_writeb(hw, NFP_NET_CFG_TXR_SZ(i), rte_log2_u32(nb_desc));
> +	}
> +
> +	return 0;
> +
> +tx_queue_setup_cleanup:
> +	for (i = 0; i < hw->max_tx_queues; i++) {
> +		txq = eth_dev->data->tx_queues[i];
> +		if (txq) {

Compare vs NULL

> +			rte_free(txq->txbufs);
> +			rte_eth_dma_zone_free(eth_dev, "ctrl_tx_ring", i);
> +			rte_free(txq);
> +		}
> +	}
> +rx_queue_setup_cleanup:
> +	for (i = 0; i < hw->max_rx_queues; i++) {
> +		rxq = eth_dev->data->rx_queues[i];
> +		if (rxq) {

Compare vs NULL

> +			rte_free(rxq->rxbufs);
> +			rte_eth_dma_zone_free(eth_dev, "ctrl_rx_ring", i);
> +			rte_free(rxq);
> +		}
> +	}
> +tx_queue_cleanup:
> +	rte_free(eth_dev->data->tx_queues);
> +rx_queue_cleanup:
> +	rte_free(eth_dev->data->rx_queues);
> +dev_data_cleanup:
> +	rte_free(eth_dev->data);
> +eth_dev_cleanup:
> +	rte_free(eth_dev);
> +done:
> +	return ret;
> +}
> +
>   static int
>   nfp_flower_start_pf_vnic(struct nfp_net_hw *hw)
>   {
> @@ -561,12 +861,57 @@
>   	return 0;
>   }
>   
> +static int
> +nfp_flower_start_ctrl_vnic(struct nfp_net_hw *hw)
> +{
> +	int ret;
> +	uint32_t update;
> +	uint32_t new_ctrl;
> +	struct rte_eth_dev *dev;
> +
> +	dev = hw->eth_dev;
> +
> +	/* Disabling queues just in case... */
> +	nfp_net_disable_queues(dev);
> +
> +	/* Enabling the required queues in the device */
> +	nfp_net_enable_queues(dev);
> +
> +	/* Writing configuration parameters in the device */
> +	nfp_net_params_setup(hw);
> +
> +	new_ctrl = NFP_NET_CFG_CTRL_ENABLE;
> +	update = NFP_NET_CFG_UPDATE_GEN | NFP_NET_CFG_UPDATE_RING |
> +		 NFP_NET_CFG_UPDATE_MSIX;
> +
> +	rte_wmb();
> +
> +	/* If an error when reconfig we avoid to change hw state */
> +	ret = nfp_net_reconfig(hw, new_ctrl, update);
> +	if (ret) {

Compare vs 0

> +		PMD_INIT_LOG(ERR, "Failed to reconfig ctrl vnic");
> +		return -EIO;
> +	}
> +
> +	hw->ctrl = new_ctrl;
> +
> +	/* Setup the freelist ring */
> +	ret = nfp_net_rx_freelist_setup(dev);
> +	if (ret) {

Compare vs 0

> +		PMD_INIT_LOG(ERR, "Error with flower ctrl vNIC freelist setup");
> +		return -EIO;
> +	}
> +
> +	return 0;
> +}
> +
>   int
>   nfp_init_app_flower(struct nfp_pf_dev *pf_dev)
>   {
>   	int ret;
>   	unsigned int numa_node;
>   	struct nfp_net_hw *pf_hw;
> +	struct nfp_net_hw *ctrl_hw;
>   	struct nfp_app_flower *app_flower;
>   
>   	numa_node = rte_socket_id();
> @@ -612,29 +957,63 @@
>   	pf_hw->pf_dev = pf_dev;
>   	pf_hw->cpp = pf_dev->cpp;
>   
> +	/* The ctrl vNIC struct comes directly after the PF one */
> +	app_flower->ctrl_hw = pf_hw + 1;
> +	ctrl_hw = app_flower->ctrl_hw;
> +
> +	/* Map the ctrl vNIC ctrl bar */
> +	ctrl_hw->ctrl_bar = nfp_rtsym_map(pf_dev->sym_tbl, "_pf0_net_ctrl_bar",
> +		32768, &ctrl_hw->ctrl_area);
> +	if (ctrl_hw->ctrl_bar == NULL) {
> +		PMD_INIT_LOG(ERR, "Cloud not map the ctrl vNIC ctrl bar");
> +		ret = -ENODEV;
> +		goto pf_cpp_area_cleanup;
> +	}
> +
> +	/* Now populate the ctrl vNIC */
> +	ctrl_hw->pf_dev = pf_dev;
> +	ctrl_hw->cpp = pf_dev->cpp;
> +
>   	ret = nfp_flower_init_pf_vnic(app_flower->pf_hw);
>   	if (ret) {
>   		PMD_INIT_LOG(ERR, "Could not initialize flower PF vNIC");
> -		goto pf_cpp_area_cleanup;
> +		goto ctrl_cpp_area_cleanup;
> +	}
> +
> +	ret = nfp_flower_init_ctrl_vnic(app_flower->ctrl_hw);
> +	if (ret) {

Compare vs 0

> +		PMD_INIT_LOG(ERR, "Could not initialize flower ctrl vNIC");
> +		goto pf_vnic_cleanup;
>   	}
>   
>   	/* Start the PF vNIC */
>   	ret = nfp_flower_start_pf_vnic(app_flower->pf_hw);
>   	if (ret) {
>   		PMD_INIT_LOG(ERR, "Could not start flower PF vNIC");
> -		goto pf_vnic_cleanup;
> +		goto ctrl_vnic_cleanup;
> +	}
> +
> +	/* Start the ctrl vNIC */
> +	ret = nfp_flower_start_ctrl_vnic(app_flower->ctrl_hw);
> +	if (ret) {

Compare vs 0

> +		PMD_INIT_LOG(ERR, "Could not start flower ctrl vNIC");
> +		goto ctrl_vnic_cleanup;
>   	}
>   
>   	/* Start up flower services */
>   	if (nfp_flower_enable_services(app_flower)) {
>   		ret = -ESRCH;
> -		goto pf_vnic_cleanup;
> +		goto ctrl_vnic_cleanup;
>   	}
>   
>   	return 0;
>   
> +ctrl_vnic_cleanup:
> +	nfp_flower_cleanup_ctrl_vnic(app_flower->ctrl_hw);
>   pf_vnic_cleanup:
>   	nfp_flower_cleanup_pf_vnic(app_flower->pf_hw);
> +ctrl_cpp_area_cleanup:
> +	nfp_cpp_area_free(ctrl_hw->ctrl_area);
>   pf_cpp_area_cleanup:
>   	nfp_cpp_area_free(pf_dev->ctrl_area);
>   eth_tbl_cleanup:
> diff --git a/drivers/net/nfp/flower/nfp_flower.h b/drivers/net/nfp/flower/nfp_flower.h
> index f6fd4eb..f11ef6d 100644
> --- a/drivers/net/nfp/flower/nfp_flower.h
> +++ b/drivers/net/nfp/flower/nfp_flower.h
> @@ -21,6 +21,12 @@ struct nfp_app_flower {
>   	/* Pointer to the PF vNIC */
>   	struct nfp_net_hw *pf_hw;
>   
> +	/* Pointer to a mempool for the ctrlvNIC */
> +	struct rte_mempool *ctrl_pktmbuf_pool;
> +
> +	/* Pointer to the ctrl vNIC */
> +	struct nfp_net_hw *ctrl_hw;
> +
>   	/* the eth table as reported by firmware */
>   	struct nfp_eth_table *nfp_eth_table;
>   };



* Re: [PATCH v5 10/12] net/nfp: add flower representor framework
  2022-08-05  6:32 ` [PATCH v5 10/12] net/nfp: add flower representor framework Chaoyong He
@ 2022-08-05 14:23   ` Andrew Rybchenko
  2022-08-08 11:56     ` Chaoyong He
  0 siblings, 1 reply; 29+ messages in thread
From: Andrew Rybchenko @ 2022-08-05 14:23 UTC (permalink / raw)
  To: Chaoyong He, dev; +Cc: niklas.soderlund

On 8/5/22 09:32, Chaoyong He wrote:
> This commit adds the framework to support flower representors.

"This commit adds" -> "Add"

> The number of VF representors are parsed from the command line. For
> physical port representors the current logic aims to create a
> representor for each physical port present on the hardware.
> 
> A eth_dev is created for each phyport and VF, and flower firmware

A -> An, phyport -> physical port

> requires a MAC repr cmsg to be transmitted to firmware with
> info about the number of physical ports configured.
> 
> Reify messages are sent to hardware for each phyport representor.
> A rte_ring is also created per representor so that traffic can be

A -> An

> pushed and pulled to this interface.
> 
> To up and down the real device represented by a flower representor port
> a port mod message is used to convey that info to the firmware. This
> message will be used in the dev_ops callbacks of flower representors.
> 
> Each cmsg generated by the driver is prepended with a cmsg header.
> This commit also adds the logic to fill in the header of cmsgs.
> 
> This commit also adds the Rx and Tx path for flower representors. For

"This commit also adds" -> "Also add"

> Rx packets are dequeued from the representor ring and passed to the
> eth_dev. For Tx the first queue of the PF vNIC is used. Metadata about
> the representor is added before the packet is sent down to firmware.
> 
> Signed-off-by: Chaoyong He <chaoyong.he@corigine.com>
> Reviewed-by: Niklas Söderlund <niklas.soderlund@corigine.com>
> ---
>   drivers/net/nfp/flower/nfp_flower.c             |  59 +++
>   drivers/net/nfp/flower/nfp_flower.h             |  18 +
>   drivers/net/nfp/flower/nfp_flower_cmsg.c        | 186 +++++++++
>   drivers/net/nfp/flower/nfp_flower_cmsg.h        | 173 ++++++++
>   drivers/net/nfp/flower/nfp_flower_representor.c | 508 ++++++++++++++++++++++++
>   drivers/net/nfp/flower/nfp_flower_representor.h |  39 ++
>   drivers/net/nfp/meson.build                     |   2 +
>   drivers/net/nfp/nfpcore/nfp_cpp_pcie_ops.c      |  31 +-
>   8 files changed, 1004 insertions(+), 12 deletions(-)
>   create mode 100644 drivers/net/nfp/flower/nfp_flower_cmsg.c
>   create mode 100644 drivers/net/nfp/flower/nfp_flower_cmsg.h
>   create mode 100644 drivers/net/nfp/flower/nfp_flower_representor.c
>   create mode 100644 drivers/net/nfp/flower/nfp_flower_representor.h
> 
> diff --git a/drivers/net/nfp/flower/nfp_flower.c b/drivers/net/nfp/flower/nfp_flower.c
> index 5e9c4ef..d7772d6 100644
> --- a/drivers/net/nfp/flower/nfp_flower.c
> +++ b/drivers/net/nfp/flower/nfp_flower.c
> @@ -22,6 +22,7 @@
>   #include "nfp_flower.h"
>   #include "nfp_flower_ovs_compat.h"
>   #include "nfp_flower_ctrl.h"
> +#include "nfp_flower_representor.h"
>   
>   #define MAX_PKT_BURST 32
>   #define MEMPOOL_CACHE_SIZE 512
> @@ -927,8 +928,13 @@
>   	unsigned int numa_node;
>   	struct nfp_net_hw *pf_hw;
>   	struct nfp_net_hw *ctrl_hw;
> +	struct rte_pci_device *pci_dev;
>   	struct nfp_app_flower *app_flower;
> +	struct rte_eth_devargs eth_da = {
> +		.nb_representor_ports = 0
> +	};
>   
> +	pci_dev = pf_dev->pci_dev;
>   	numa_node = rte_socket_id();
>   
>   	/* Allocate memory for the Flower app */
> @@ -1021,6 +1027,59 @@
>   		goto ctrl_vnic_cleanup;
>   	}
>   
> +	/* Allocate a switch domain for the flower app */
> +	if (app_flower->switch_domain_id == RTE_ETH_DEV_SWITCH_DOMAIN_ID_INVALID &&
> +			rte_eth_switch_domain_alloc(&app_flower->switch_domain_id)) {
> +		PMD_INIT_LOG(WARNING,
> +				"failed to allocate switch domain for device");
> +	}
> +
> +	/* Now parse PCI device args passed for representor info */
> +	if (pci_dev->device.devargs) {

Compare vs NULL

> +		ret = rte_eth_devargs_parse(pci_dev->device.devargs->args,
> +				&eth_da);
> +		if (ret) {

Compare vs 0

> +			PMD_INIT_LOG(ERR, "devarg parse failed");
> +			goto ctrl_vnic_cleanup;
> +		}
> +	}
> +

I see no representor_info_get implementation. How is an application
supposed to find out which representors can be instantiated?

> +	if (eth_da.nb_representor_ports == 0) {
> +		PMD_INIT_LOG(DEBUG, "No representor port need to create.");
> +		ret = 0;
> +		goto done;
> +	}
> +
> +	/* There always exist phy repr */
> +	if (eth_da.nb_representor_ports < app_flower->nfp_eth_table->count) {

This is a really strange check. First of all, I'd say it is very
inconvenient to oblige the user to create some representors.
Second, who said that these representors are physical port representors?
Maybe the number of representors is OK, but they are different representors...
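
To make the second point concrete, here is a self-contained sketch that
only mirrors the count arithmetic of the quoted hunk.
demo_split_repr_counts() is a hypothetical helper, not driver code, and
the zero-request branch is a simplification of the early return:

```c
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical helper mirroring the arithmetic in the quoted hunk: the
 * requested count is split into physical port and VF representors
 * without looking at which representors were actually requested.
 */
static int
demo_split_repr_counts(uint16_t nb_requested, uint8_t nb_phyports,
		uint8_t *num_phyport_reprs, uint8_t *num_vf_reprs)
{
	if (nb_requested == 0) {
		*num_phyport_reprs = 0;
		*num_vf_reprs = 0;
		return 0;
	}

	if (nb_requested < nb_phyports)
		return -ERANGE;

	*num_phyport_reprs = nb_phyports;
	*num_vf_reprs = (uint8_t)(nb_requested - nb_phyports);
	return 0;
}
```

With two physical ports, a request counted as two representors passes
the check and yields two physical port representors and zero VF
representors, even if the user meant VF representors 0 and 1 (assuming
such a request is parsed into a count of two).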

> +		PMD_INIT_LOG(DEBUG, "Should also create phy representor port.");
> +		ret = -ERANGE;
> +		goto ctrl_vnic_cleanup;
> +	}
> +
> +	/* Only support VF representor creation via the command line */
> +	if (eth_da.type != RTE_ETH_REPRESENTOR_VF) {

I'm confused. Above you're talking about phy representors, but here
VFs...

> +		PMD_DRV_LOG(ERR, "unsupported representor type: %s\n",

The macro already adds \n. So, you don't need it here.
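
A self-contained sketch of the effect. demo_log() is a made-up function
standing in for a logging macro that appends the newline itself; it is
not the PMD_DRV_LOG() implementation:

```c
#include <stdarg.h>
#include <stdio.h>

/*
 * Made-up stand-in for a logging macro: format the message into buf and
 * append the trailing newline on behalf of the caller.
 * Returns the resulting length, or -1 if the buffer is too small.
 */
static int
demo_log(char *buf, size_t len, const char *fmt, ...)
{
	va_list ap;
	int n;

	va_start(ap, fmt);
	n = vsnprintf(buf, len, fmt, ap);
	va_end(ap);

	/* Need room for the message, the added '\n' and the terminator */
	if (n < 0 || (size_t)n + 1 >= len)
		return -1;

	buf[n] = '\n';
	buf[n + 1] = '\0';

	return n + 1;
}
```

If the caller's format string also ends in "\n", the output ends with
two newlines, which shows up as an empty line in the log.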

> +				pci_dev->device.devargs->args);
> +		ret = -ENOTSUP;
> +		goto ctrl_vnic_cleanup;
> +	}
> +
> +	/* Fill in flower app with repr counts */
> +	app_flower->num_phyport_reprs = (uint8_t)app_flower->nfp_eth_table->count;
> +	app_flower->num_vf_reprs = eth_da.nb_representor_ports -
> +			app_flower->nfp_eth_table->count;
> +
> +	PMD_INIT_LOG(INFO, "%d number of VF reprs", app_flower->num_vf_reprs);
> +	PMD_INIT_LOG(INFO, "%d number of phyport reprs", app_flower->num_phyport_reprs);
> +
> +	ret = nfp_flower_repr_alloc(app_flower);
> +	if (ret) {

Compare vs 0

> +		PMD_INIT_LOG(ERR,
> +			"representors allocation for NFP_REPR_TYPE_VF error");
> +		goto ctrl_vnic_cleanup;
> +	}
> +
>   	return 0;
>   
>   ctrl_vnic_cleanup:
> diff --git a/drivers/net/nfp/flower/nfp_flower.h b/drivers/net/nfp/flower/nfp_flower.h
> index bdc64e3..24fced3 100644
> --- a/drivers/net/nfp/flower/nfp_flower.h
> +++ b/drivers/net/nfp/flower/nfp_flower.h
> @@ -19,8 +19,20 @@ enum nfp_flower_service {
>    */
>   #define FLOWER_PKT_DATA_OFFSET 8
>   
> +#define MAX_FLOWER_PHYPORTS 8
> +#define MAX_FLOWER_VFS 64
> +
>   /* The flower application's private structure */
>   struct nfp_app_flower {
> +	/* switch domain for this app */
> +	uint16_t switch_domain_id;
> +
> +	/* Number of VF representors */
> +	uint8_t num_vf_reprs;
> +
> +	/* Number of phyport representors */
> +	uint8_t num_phyport_reprs;
> +
>   	/* List of rte_service ID's for the flower app */
>   	uint32_t flower_services_ids[NFP_FLOWER_SERVICE_MAX];
>   
> @@ -44,6 +56,12 @@ struct nfp_app_flower {
>   
>   	/* Ctrl vNIC Tx counter */
>   	uint64_t ctrl_vnic_tx_count;
> +
> +	/* Array of phyport representors */
> +	struct nfp_flower_representor *phy_reprs[MAX_FLOWER_PHYPORTS];
> +
> +	/* Array of VF representors */
> +	struct nfp_flower_representor *vf_reprs[MAX_FLOWER_VFS];
>   };
>   
>   int nfp_init_app_flower(struct nfp_pf_dev *pf_dev);
> diff --git a/drivers/net/nfp/flower/nfp_flower_cmsg.c b/drivers/net/nfp/flower/nfp_flower_cmsg.c
> new file mode 100644
> index 0000000..5ce547c
> --- /dev/null
> +++ b/drivers/net/nfp/flower/nfp_flower_cmsg.c
> @@ -0,0 +1,186 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2022 Corigine, Inc.
> + * All rights reserved.
> + */
> +
> +#include "../nfpcore/nfp_nsp.h"
> +#include "../nfp_logs.h"
> +#include "../nfp_common.h"
> +#include "nfp_flower.h"
> +#include "nfp_flower_cmsg.h"
> +#include "nfp_flower_ctrl.h"
> +#include "nfp_flower_representor.h"
> +
> +static void *
> +nfp_flower_cmsg_init(struct rte_mbuf *m,
> +		enum nfp_flower_cmsg_type type,
> +		uint32_t size)
> +{
> +	char *pkt;
> +	uint32_t data;
> +	uint32_t new_size = size;
> +	struct nfp_flower_cmsg_hdr *hdr;
> +
> +	pkt = rte_pktmbuf_mtod(m, char *);
> +	PMD_DRV_LOG(DEBUG, "flower_cmsg_init using pkt at %p", pkt);
> +
> +	data = rte_cpu_to_be_32(NFP_NET_META_PORTID);
> +	rte_memcpy(pkt, &data, 4);
> +	pkt += 4;
> +	new_size += 4;
> +
> +	/* First the metadata as flower requires it */
> +	data = rte_cpu_to_be_32(NFP_META_PORT_ID_CTRL);
> +	rte_memcpy(pkt, &data, 4);
> +	pkt += 4;
> +	new_size += 4;
> +
> +	/* Now the ctrl header */
> +	hdr = (struct nfp_flower_cmsg_hdr *)pkt;
> +	hdr->pad     = 0;
> +	hdr->type    = type;
> +	hdr->version = NFP_FLOWER_CMSG_VER1;
> +
> +	pkt = (char *)hdr + NFP_FLOWER_CMSG_HLEN;
> +	new_size += NFP_FLOWER_CMSG_HLEN;
> +
> +	m->pkt_len = new_size;
> +	m->data_len = m->pkt_len;
> +
> +	return pkt;
> +}
> +
> +static void
> +nfp_flower_cmsg_mac_repr_init(struct rte_mbuf *m, int num_ports)
> +{
> +	uint32_t size;
> +	struct nfp_flower_cmsg_mac_repr *msg;
> +	enum nfp_flower_cmsg_type type = NFP_FLOWER_CMSG_TYPE_MAC_REPR;
> +
> +	size = sizeof(*msg) + (num_ports * sizeof(msg->ports[0]));
> +	PMD_DRV_LOG(DEBUG, "mac repr cmsg init with size: %u", size);
> +	msg = (struct nfp_flower_cmsg_mac_repr *)nfp_flower_cmsg_init(m,
> +			type, size);
> +
> +	memset(msg->reserved, 0, sizeof(msg->reserved));
> +	msg->num_ports = num_ports;
> +}
> +
> +static void
> +nfp_flower_cmsg_mac_repr_fill(struct rte_mbuf *m,
> +		unsigned int idx,
> +		unsigned int nbi,
> +		unsigned int nbi_port,
> +		unsigned int phys_port)
> +{
> +	struct nfp_flower_cmsg_mac_repr *msg;
> +
> +	msg = (struct nfp_flower_cmsg_mac_repr *)nfp_flower_cmsg_get_data(m);
> +	msg->ports[idx].idx       = idx;
> +	msg->ports[idx].info      = nbi & NFP_FLOWER_CMSG_MAC_REPR_NBI;
> +	msg->ports[idx].nbi_port  = nbi_port;
> +	msg->ports[idx].phys_port = phys_port;
> +}
> +
> +int
> +nfp_flower_cmsg_mac_repr(struct nfp_app_flower *app_flower)
> +{
> +	int i;
> +	uint16_t cnt;
> +	unsigned int nbi;
> +	unsigned int nbi_port;
> +	unsigned int phys_port;
> +	struct rte_mbuf *mbuf;
> +	struct nfp_eth_table *nfp_eth_table;
> +
> +	mbuf = rte_pktmbuf_alloc(app_flower->ctrl_pktmbuf_pool);
> +	if (mbuf == NULL) {
> +		PMD_DRV_LOG(ERR, "Could not allocate mac repr cmsg");
> +		return -ENOMEM;
> +	}
> +
> +	nfp_flower_cmsg_mac_repr_init(mbuf, app_flower->num_phyport_reprs);
> +
> +	/* Fill in the mac repr cmsg */
> +	nfp_eth_table = app_flower->nfp_eth_table;
> +	for (i = 0; i < app_flower->num_phyport_reprs; i++) {
> +		nbi = nfp_eth_table->ports[i].nbi;
> +		nbi_port = nfp_eth_table->ports[i].base;
> +		phys_port = nfp_eth_table->ports[i].index;
> +
> +		nfp_flower_cmsg_mac_repr_fill(mbuf, i, nbi, nbi_port, phys_port);
> +	}
> +
> +	/* Send the cmsg via the ctrl vNIC */
> +	cnt = nfp_flower_ctrl_vnic_xmit(app_flower, mbuf);
> +	if (cnt == 0) {
> +		PMD_DRV_LOG(ERR, "Send cmsg through ctrl vnic failed.");
> +		rte_pktmbuf_free(mbuf);
> +		return -EIO;
> +	}
> +
> +	return 0;
> +}
> +
> +int
> +nfp_flower_cmsg_repr_reify(struct nfp_app_flower *app_flower,
> +		struct nfp_flower_representor *repr)
> +{
> +	uint16_t cnt;
> +	struct rte_mbuf *mbuf;
> +	struct nfp_flower_cmsg_port_reify *msg;
> +
> +	mbuf = rte_pktmbuf_alloc(app_flower->ctrl_pktmbuf_pool);
> +	if (mbuf == NULL) {
> +		PMD_DRV_LOG(DEBUG, "alloc mbuf for repr reify failed");
> +		return -ENOMEM;
> +	}
> +
> +	msg = (struct nfp_flower_cmsg_port_reify *)nfp_flower_cmsg_init(mbuf,
> +			NFP_FLOWER_CMSG_TYPE_PORT_REIFY, sizeof(*msg));
> +
> +	msg->portnum  = rte_cpu_to_be_32(repr->port_id);
> +	msg->reserved = 0;
> +	msg->info     = rte_cpu_to_be_16(1);
> +
> +	cnt = nfp_flower_ctrl_vnic_xmit(app_flower, mbuf);
> +	if (cnt == 0) {
> +		PMD_DRV_LOG(ERR, "Send cmsg through ctrl vnic failed.");
> +		rte_pktmbuf_free(mbuf);
> +		return -EIO;
> +	}
> +
> +	return 0;
> +}
> +
> +int
> +nfp_flower_cmsg_port_mod(struct nfp_app_flower *app_flower,
> +		uint32_t port_id, bool carrier_ok)
> +{
> +	uint16_t cnt;
> +	struct rte_mbuf *mbuf;
> +	struct nfp_flower_cmsg_port_mod *msg;
> +
> +	mbuf = rte_pktmbuf_alloc(app_flower->ctrl_pktmbuf_pool);
> +	if (mbuf == NULL) {
> +		PMD_DRV_LOG(DEBUG, "alloc mbuf for repr portmod failed");
> +		return -ENOMEM;
> +	}
> +
> +	msg = (struct nfp_flower_cmsg_port_mod *)nfp_flower_cmsg_init(mbuf,
> +			NFP_FLOWER_CMSG_TYPE_PORT_MOD, sizeof(*msg));
> +
> +	msg->portnum  = rte_cpu_to_be_32(port_id);
> +	msg->reserved = 0;
> +	msg->info     = carrier_ok;
> +	msg->mtu      = 9000;
> +
> +	cnt = nfp_flower_ctrl_vnic_xmit(app_flower, mbuf);
> +	if (cnt == 0) {
> +		PMD_DRV_LOG(ERR, "Send cmsg through ctrl vnic failed.");
> +		rte_pktmbuf_free(mbuf);
> +		return -EIO;
> +	}
> +
> +	return 0;
> +}
> diff --git a/drivers/net/nfp/flower/nfp_flower_cmsg.h b/drivers/net/nfp/flower/nfp_flower_cmsg.h
> new file mode 100644
> index 0000000..fce5163
> --- /dev/null
> +++ b/drivers/net/nfp/flower/nfp_flower_cmsg.h
> @@ -0,0 +1,173 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2022 Corigine, Inc.
> + * All rights reserved.
> + */
> +
> +#ifndef _NFP_CMSG_H_
> +#define _NFP_CMSG_H_
> +
> +#include <rte_byteorder.h>
> +#include <rte_ether.h>
> +
> +struct nfp_flower_cmsg_hdr {
> +	rte_be16_t pad;
> +	uint8_t type;
> +	uint8_t version;
> +};
> +
> +/* Types defined for control messages */
> +enum nfp_flower_cmsg_type {
> +	NFP_FLOWER_CMSG_TYPE_FLOW_ADD       = 0,
> +	NFP_FLOWER_CMSG_TYPE_FLOW_MOD       = 1,
> +	NFP_FLOWER_CMSG_TYPE_FLOW_DEL       = 2,
> +	NFP_FLOWER_CMSG_TYPE_LAG_CONFIG     = 4,
> +	NFP_FLOWER_CMSG_TYPE_PORT_REIFY     = 6,
> +	NFP_FLOWER_CMSG_TYPE_MAC_REPR       = 7,
> +	NFP_FLOWER_CMSG_TYPE_PORT_MOD       = 8,
> +	NFP_FLOWER_CMSG_TYPE_MERGE_HINT     = 9,
> +	NFP_FLOWER_CMSG_TYPE_NO_NEIGH       = 10,
> +	NFP_FLOWER_CMSG_TYPE_TUN_MAC        = 11,
> +	NFP_FLOWER_CMSG_TYPE_ACTIVE_TUNS    = 12,
> +	NFP_FLOWER_CMSG_TYPE_TUN_NEIGH      = 13,
> +	NFP_FLOWER_CMSG_TYPE_TUN_IPS        = 14,
> +	NFP_FLOWER_CMSG_TYPE_FLOW_STATS     = 15,
> +	NFP_FLOWER_CMSG_TYPE_PORT_ECHO      = 16,
> +	NFP_FLOWER_CMSG_TYPE_QOS_MOD        = 18,
> +	NFP_FLOWER_CMSG_TYPE_QOS_DEL        = 19,
> +	NFP_FLOWER_CMSG_TYPE_QOS_STATS      = 20,
> +	NFP_FLOWER_CMSG_TYPE_PRE_TUN_RULE   = 21,
> +	NFP_FLOWER_CMSG_TYPE_TUN_IPS_V6     = 22,
> +	NFP_FLOWER_CMSG_TYPE_NO_NEIGH_V6    = 23,
> +	NFP_FLOWER_CMSG_TYPE_TUN_NEIGH_V6   = 24,
> +	NFP_FLOWER_CMSG_TYPE_ACTIVE_TUNS_V6 = 25,
> +	NFP_FLOWER_CMSG_TYPE_MAX            = 32,
> +};
> +
> +/*
> + * NFP_FLOWER_CMSG_TYPE_MAC_REPR
> + *    Bit    3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0
> + *    -----\ 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
> + *     Word +---------------+-----------+---+---------------+---------------+
> + *       0  |                  spare                        |Number of ports|
> + *          +---------------+-----------+---+---------------+---------------+
> + *       1  |    Index      |   spare   |NBI|  Port on NBI  | Chip-wide port|
> + *          +---------------+-----------+---+---------------+---------------+
> + *                                        ....
> + *          +---------------+-----------+---+---------------+---------------+
> + *     N-1  |    Index      |   spare   |NBI|  Port on NBI  | Chip-wide port|
> + *          +---------------+-----------+---+---------------+---------------+
> + *     N    |    Index      |   spare   |NBI|  Port on NBI  | Chip-wide port|
> + *          +---------------+-----------+---+---------------+---------------+
> + *
> + *          Index: index into the eth table
> + *          NBI (bits 17-16): NBI number (0-3)
> + *          Port on NBI (bits 15-8): “base” in the driver
> + *            this forms NBIX.PortY notation as the NSP eth table.
> + *          "Chip-wide" port (bits 7-0):
> + */
> +struct nfp_flower_cmsg_mac_repr {
> +	uint8_t reserved[3];
> +	uint8_t num_ports;
> +	struct {
> +		uint8_t idx;
> +		uint8_t info;
> +		uint8_t nbi_port;
> +		uint8_t phys_port;
> +	} ports[0];
> +};
> +
> +/*
> + * NFP_FLOWER_CMSG_TYPE_PORT_REIFY
> + *    Bit    3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0
> + *    -----\ 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
> + *    Word  +-------+-------+---+---+-------+---+---+-----------+-----------+
> + *       0  |Port Ty|Sys ID |NIC|Rsv| Spare |PCI|typ|    vNIC   |  queue    |
> + *          +-------+-----+-+---+---+-------+---+---+-----------+---------+-+
> + *       1  |                             Spare                           |E|
> + *          +-------------------------------------------------------------+-+
> + *          E: 1 = Representor exists, 0 = Representor does not exist
> + */
> +struct nfp_flower_cmsg_port_reify {
> +	rte_be32_t portnum;
> +	rte_be16_t reserved;
> +	rte_be16_t info;
> +};
> +
> +/*
> + * NFP_FLOWER_CMSG_TYPE_PORT_MOD
> + *    Bit    3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0
> + *    -----\ 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
> + *    Word  +-------+-------+---+---+-------+---+---+-------+---+-----------+
> + *       0  |Port Ty|Sys ID |NIC|Rsv|       Reserved        |    Port       |
> + *          +-------+-------+---+---+-----+-+---+---+-------+---+-----------+
> + *       1  |            Spare            |L|              MTU              |
> + *          +-----------------------------+-+-------------------------------+
> + *        L: Link or Admin state bit. When message is generated by host, this
> + *           bit indicates the admin state (0=down, 1=up). When generated by
> + *           NFP, it indicates the link state (0=down, 1=up)
> + *
> + *        Port Type (word 0, bits 31 to 28) = 1 (Physical Network)
> + *        Port: “Chip-wide number” as assigned by BSP
> + *
> + *    Bit    3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0
> + *    -----\ 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
> + *    Word  +-------+-------+---+---+-------+---+---+-------+---+-----------+
> + *       0  |Port Ty|Sys ID |NIC|Rsv| Spare |PCI|typ|    vNIC   |  queue    |
> + *          +-------+-----+-+---+---+---+-+-+---+---+-------+---+-----------+
> + *       1  |            Spare            |L|              MTU              |
> + *          +-----------------------------+-+-------------------------------+
> + *        L: Link or Admin state bit. When message is generated by host, this
> + *           bit indicates the admin state (0=down, 1=up). When generated by
> + *           NFP, it indicates the link state (0=down, 1=up)
> + *
> + *        Port Type (word 0, bits 31 to 28) = 2 (PCIE)
> + */
> +struct nfp_flower_cmsg_port_mod {
> +	rte_be32_t portnum;
> +	uint8_t reserved;
> +	uint8_t info;
> +	rte_be16_t mtu;
> +};
> +
> +enum nfp_flower_cmsg_port_type {
> +	NFP_FLOWER_CMSG_PORT_TYPE_UNSPEC,
> +	NFP_FLOWER_CMSG_PORT_TYPE_PHYS_PORT,
> +	NFP_FLOWER_CMSG_PORT_TYPE_PCIE_PORT,
> +	NFP_FLOWER_CMSG_PORT_TYPE_OTHER_PORT,
> +};
> +
> +enum nfp_flower_cmsg_port_vnic_type {
> +	NFP_FLOWER_CMSG_PORT_VNIC_TYPE_VF,
> +	NFP_FLOWER_CMSG_PORT_VNIC_TYPE_PF,
> +	NFP_FLOWER_CMSG_PORT_VNIC_TYPE_CTRL,
> +};
> +
> +#define NFP_FLOWER_CMSG_MAC_REPR_NBI            (0x3)
> +
> +#define NFP_FLOWER_CMSG_HLEN            sizeof(struct nfp_flower_cmsg_hdr)
> +#define NFP_FLOWER_CMSG_VER1            1
> +#define NFP_NET_META_PORTID             5
> +#define NFP_META_PORT_ID_CTRL           ~0U
> +
> +#define NFP_FLOWER_CMSG_PORT_TYPE(x)            (((x) >> 28) & 0xf)  /* [31,28] */
> +#define NFP_FLOWER_CMSG_PORT_SYS_ID(x)          (((x) >> 24) & 0xf)  /* [24,27] */
> +#define NFP_FLOWER_CMSG_PORT_NFP_ID(x)          (((x) >> 22) & 0x3)  /* [22,23] */
> +#define NFP_FLOWER_CMSG_PORT_PCI(x)             (((x) >> 14) & 0x3)  /* [14,15] */
> +#define NFP_FLOWER_CMSG_PORT_VNIC_TYPE(x)       (((x) >> 12) & 0x3)  /* [12,13] */
> +#define NFP_FLOWER_CMSG_PORT_VNIC(x)            (((x) >> 6) & 0x3f)  /* [6,11] */
> +#define NFP_FLOWER_CMSG_PORT_PCIE_Q(x)          ((x) & 0x3f)         /* [0,5] */
> +#define NFP_FLOWER_CMSG_PORT_PHYS_PORT_NUM(x)   ((x) & 0xff)         /* [0,7] */
> +
> +static inline char*
> +nfp_flower_cmsg_get_data(struct rte_mbuf *m)
> +{
> +	return rte_pktmbuf_mtod(m, char *) + 4 + 4 + NFP_FLOWER_CMSG_HLEN;
> +}
> +
> +int nfp_flower_cmsg_mac_repr(struct nfp_app_flower *app_flower);
> +int nfp_flower_cmsg_repr_reify(struct nfp_app_flower *app_flower,
> +		struct nfp_flower_representor *repr);
> +int nfp_flower_cmsg_port_mod(struct nfp_app_flower *app_flower,
> +		uint32_t port_id, bool carrier_ok);
> +
> +#endif /* _NFP_CMSG_H_ */
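
For anyone checking the port-number layout by hand, a small round-trip
sketch follows. The extractor macros are copied from the header above; the
two example_* packers and the enum constants are hypothetical stand-ins
(the PCIe unit is passed in as a parameter instead of being read from the
CPP handle), so treat this as an illustration of the bit layout, not as
driver code.

```c
#include <assert.h>
#include <stdint.h>

/* Field extractors copied from the quoted header. */
#define NFP_FLOWER_CMSG_PORT_TYPE(x)          (((x) >> 28) & 0xf)
#define NFP_FLOWER_CMSG_PORT_PCI(x)           (((x) >> 14) & 0x3)
#define NFP_FLOWER_CMSG_PORT_VNIC_TYPE(x)     (((x) >> 12) & 0x3)
#define NFP_FLOWER_CMSG_PORT_VNIC(x)          (((x) >> 6) & 0x3f)
#define NFP_FLOWER_CMSG_PORT_PCIE_Q(x)        ((x) & 0x3f)
#define NFP_FLOWER_CMSG_PORT_PHYS_PORT_NUM(x) ((x) & 0xff)

/* Hypothetical constants; values follow the enum order in the header. */
enum { EXAMPLE_PORT_TYPE_PHYS = 1, EXAMPLE_PORT_TYPE_PCIE = 2 };
enum { EXAMPLE_VNIC_TYPE_VF = 0, EXAMPLE_VNIC_TYPE_PF = 1 };

/* Hypothetical packer with the same field positions as nfp_get_pcie_port_id(). */
static uint32_t
example_pcie_port_id(uint8_t pcie, uint8_t type, uint8_t vnic, uint8_t queue)
{
	return ((uint32_t)EXAMPLE_PORT_TYPE_PCIE << 28) |
	       ((uint32_t)(pcie & 0x3) << 14) |
	       ((uint32_t)(type & 0x3) << 12) |
	       ((uint32_t)(vnic & 0x3f) << 6) |
	       (uint32_t)(queue & 0x3f);
}

/* Hypothetical packer mirroring nfp_flower_get_phys_port_id(). */
static uint32_t
example_phys_port_id(uint8_t port)
{
	return ((uint32_t)EXAMPLE_PORT_TYPE_PHYS << 28) | port;
}

/* Pack a VF port number and read every field back with the macros. */
static void
example_port_id_self_check(void)
{
	uint32_t id = example_pcie_port_id(1, EXAMPLE_VNIC_TYPE_VF, 5, 0);

	assert(NFP_FLOWER_CMSG_PORT_TYPE(id) == EXAMPLE_PORT_TYPE_PCIE);
	assert(NFP_FLOWER_CMSG_PORT_PCI(id) == 1);
	assert(NFP_FLOWER_CMSG_PORT_VNIC_TYPE(id) == EXAMPLE_VNIC_TYPE_VF);
	assert(NFP_FLOWER_CMSG_PORT_VNIC(id) == 5);
	assert(NFP_FLOWER_CMSG_PORT_PCIE_Q(id) == 0);
}
```

The shifts cast to uint32_t before shifting by 28 so the packing does not
depend on signed int arithmetic.
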
> diff --git a/drivers/net/nfp/flower/nfp_flower_representor.c b/drivers/net/nfp/flower/nfp_flower_representor.c
> new file mode 100644
> index 0000000..9f23a23
> --- /dev/null
> +++ b/drivers/net/nfp/flower/nfp_flower_representor.c
> @@ -0,0 +1,508 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright (c) 2022 Corigine, Inc.
> + * All rights reserved.
> + */
> +
> +#include <rte_common.h>
> +#include <ethdev_pci.h>
> +
> +#include "../nfp_common.h"
> +#include "../nfp_logs.h"
> +#include "../nfp_ctrl.h"
> +#include "../nfp_rxtx.h"
> +#include "../nfpcore/nfp_mip.h"
> +#include "../nfpcore/nfp_rtsym.h"
> +#include "../nfpcore/nfp_nsp.h"
> +#include "nfp_flower.h"
> +#include "nfp_flower_representor.h"
> +#include "nfp_flower_ctrl.h"
> +#include "nfp_flower_cmsg.h"
> +
> +static int
> +nfp_flower_repr_link_update(__rte_unused struct rte_eth_dev *ethdev,
> +		__rte_unused int wait_to_complete)
> +{

Why is a dummy implementation OK?

> +	return 0;
> +}
> +
> +static int
> +nfp_flower_repr_dev_infos_get(__rte_unused struct rte_eth_dev *dev,
> +		struct rte_eth_dev_info *dev_info)
> +{
> +	/* Hardcoded pktlen and queues for now */
> +	dev_info->max_rx_queues = 1;
> +	dev_info->max_tx_queues = 1;
> +	dev_info->min_rx_bufsize = RTE_ETHER_MIN_MTU;
> +	dev_info->max_rx_pktlen = 9000;
> +
> +	dev_info->rx_offload_capa = RTE_ETH_RX_OFFLOAD_VLAN_STRIP;
> +	dev_info->rx_offload_capa |= RTE_ETH_RX_OFFLOAD_IPV4_CKSUM |
> +			RTE_ETH_RX_OFFLOAD_UDP_CKSUM |
> +			RTE_ETH_RX_OFFLOAD_TCP_CKSUM;
> +
> +	dev_info->tx_offload_capa = RTE_ETH_TX_OFFLOAD_VLAN_INSERT;
> +	dev_info->tx_offload_capa |= RTE_ETH_TX_OFFLOAD_IPV4_CKSUM |
> +			RTE_ETH_TX_OFFLOAD_UDP_CKSUM |
> +			RTE_ETH_TX_OFFLOAD_TCP_CKSUM;
> +	dev_info->tx_offload_capa |= RTE_ETH_TX_OFFLOAD_TCP_TSO;
> +	dev_info->tx_offload_capa |= RTE_ETH_TX_OFFLOAD_MULTI_SEGS;
> +
> +	dev_info->max_mac_addrs = 1;
> +
> +	return 0;
> +}
> +
> +static int
> +nfp_flower_repr_dev_configure(__rte_unused struct rte_eth_dev *dev)
> +{
> +	return 0;
> +}
> +
> +static int
> +nfp_flower_repr_dev_start(struct rte_eth_dev *dev)
> +{
> +	struct nfp_app_flower *app_flower;
> +	struct nfp_flower_representor *repr;
> +
> +	repr = (struct nfp_flower_representor *)dev->data->dev_private;
> +	app_flower = repr->app_flower;
> +
> +	if (repr->repr_type == NFP_REPR_TYPE_PHYS_PORT)
> +		nfp_eth_set_configured(app_flower->pf_hw->pf_dev->cpp,
> +				repr->nfp_idx, 1);
> +
> +	nfp_flower_cmsg_port_mod(app_flower, repr->port_id, true);
> +
> +	return 0;
> +}
> +
> +static int
> +nfp_flower_repr_dev_stop(struct rte_eth_dev *dev)
> +{
> +	struct nfp_app_flower *app_flower;
> +	struct nfp_flower_representor *repr;
> +
> +	repr = (struct nfp_flower_representor *)dev->data->dev_private;
> +	app_flower = repr->app_flower;
> +
> +	nfp_flower_cmsg_port_mod(app_flower, repr->port_id, false);
> +
> +	if (repr->repr_type == NFP_REPR_TYPE_PHYS_PORT)
> +		nfp_eth_set_configured(app_flower->pf_hw->pf_dev->cpp,
> +				repr->nfp_idx, 0);
> +
> +	return 0;
> +}
> +
> +static int
> +nfp_flower_repr_rx_queue_setup(struct rte_eth_dev *dev,
> +		uint16_t rx_queue_id,
> +		__rte_unused uint16_t nb_rx_desc,
> +		unsigned int socket_id,
> +		__rte_unused const struct rte_eth_rxconf *rx_conf,
> +		__rte_unused struct rte_mempool *mb_pool)
> +{
> +	struct nfp_net_rxq *rxq;
> +	struct nfp_net_hw *pf_hw;
> +	struct nfp_flower_representor *repr;
> +
> +	repr = (struct nfp_flower_representor *)dev->data->dev_private;
> +	pf_hw = repr->app_flower->pf_hw;
> +
> +	/* Allocating rx queue data structure */
> +	rxq = rte_zmalloc_socket("ethdev RX queue", sizeof(struct nfp_net_rxq),
> +			RTE_CACHE_LINE_SIZE, socket_id);
> +	if (rxq == NULL)
> +		return -ENOMEM;
> +
> +	rxq->hw = pf_hw;
> +	rxq->qidx = rx_queue_id;
> +	rxq->port_id = dev->data->port_id;
> +	dev->data->rx_queues[rx_queue_id] = rxq;
> +
> +	return 0;
> +}
> +
> +static int
> +nfp_flower_repr_tx_queue_setup(struct rte_eth_dev *dev,
> +		uint16_t tx_queue_id,
> +		__rte_unused uint16_t nb_tx_desc,
> +		unsigned int socket_id,
> +		__rte_unused const struct rte_eth_txconf *tx_conf)
> +{
> +	struct nfp_net_txq *txq;
> +	struct nfp_net_hw *pf_hw;
> +	struct nfp_flower_representor *repr;
> +
> +	repr = (struct nfp_flower_representor *)dev->data->dev_private;
> +	pf_hw = repr->app_flower->pf_hw;
> +
> +	/* Allocating tx queue data structure */
> +	txq = rte_zmalloc_socket("ethdev TX queue", sizeof(struct nfp_net_txq),
> +			RTE_CACHE_LINE_SIZE, socket_id);
> +	if (txq == NULL)
> +		return -ENOMEM;
> +
> +	txq->hw = pf_hw;
> +	txq->qidx = tx_queue_id;
> +	txq->port_id = dev->data->port_id;
> +	dev->data->tx_queues[tx_queue_id] = txq;
> +
> +	return 0;
> +}
> +
> +static int
> +nfp_flower_repr_stats_get(struct rte_eth_dev *ethdev,
> +		struct rte_eth_stats *stats)
> +{
> +	struct nfp_flower_representor *repr;
> +
> +	repr = (struct nfp_flower_representor *)ethdev->data->dev_private;
> +	rte_memcpy(stats, &repr->repr_stats, sizeof(struct rte_eth_stats));
> +
> +	return 0;
> +}
> +
> +static int
> +nfp_flower_repr_stats_reset(__rte_unused struct rte_eth_dev *ethdev)
> +{

Why is it OK to have a dummy implementation of the reset?

> +	return 0;
> +}
> +
> +static int
> +nfp_flower_repr_promiscuous_enable(__rte_unused struct rte_eth_dev *ethdev)
> +{

Why is it OK to have a dummy implementation? Same question for all
dummy callbacks below.
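
To illustrate the concern: as far as I know, the generic ethdev layer
reports -ENOTSUP when an optional callback is left unset, whereas a stub
that returns 0 tells the application the operation succeeded. The sketch
below models that with a mock ops table; every example_* name is
hypothetical and none of this is the real ethdev code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for a driver ops table and its device. */
struct example_dev;

struct example_ops {
	int (*promiscuous_enable)(struct example_dev *dev);
};

struct example_dev {
	const struct example_ops *ops;
};

/* Mock generic-layer wrapper: a missing callback is reported as unsupported. */
static int
example_promiscuous_enable(struct example_dev *dev)
{
	if (dev->ops->promiscuous_enable == NULL)
		return -ENOTSUP;

	return dev->ops->promiscuous_enable(dev);
}

/* Stub in the style of the patch: claims success without doing anything. */
static int
example_stub_enable(struct example_dev *dev)
{
	(void)dev;
	return 0;
}

/* Compare what the caller sees with and without the stub. */
static void
example_ops_self_check(void)
{
	const struct example_ops with_stub = { .promiscuous_enable = example_stub_enable };
	const struct example_ops without = { .promiscuous_enable = NULL };
	struct example_dev a = { .ops = &with_stub };
	struct example_dev b = { .ops = &without };

	assert(example_promiscuous_enable(&a) == 0);
	assert(example_promiscuous_enable(&b) == -ENOTSUP);
}
```

If the representor really cannot do the operation, leaving the callback
unset is usually the more honest answer to the application.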

> +	return 0;
> +}
> +
> +static int
> +nfp_flower_repr_promiscuous_disable(__rte_unused struct rte_eth_dev *ethdev)
> +{
> +	return 0;
> +}
> +
> +static void
> +nfp_flower_repr_mac_addr_remove(__rte_unused struct rte_eth_dev *ethdev,
> +		__rte_unused uint32_t index)
> +{
> +}
> +
> +static int
> +nfp_flower_repr_mac_addr_set(__rte_unused struct rte_eth_dev *ethdev,
> +		__rte_unused struct rte_ether_addr *mac_addr)
> +{
> +	return 0;
> +}
> +
> +static uint16_t
> +nfp_flower_repr_rx_burst(void *rx_queue,
> +		struct rte_mbuf **rx_pkts,
> +		uint16_t nb_pkts)
> +{
> +	unsigned int available = 0;
> +	unsigned int total_dequeue;
> +	struct nfp_net_rxq *rxq;
> +	struct rte_eth_dev *dev;
> +	struct nfp_flower_representor *repr;
> +
> +	rxq = rx_queue;
> +	if (unlikely(rxq == NULL)) {
> +		PMD_RX_LOG(ERR, "RX Bad queue");
> +		return 0;
> +	}
> +
> +	dev = &rte_eth_devices[rxq->port_id];
> +	repr = dev->data->dev_private;
> +	if (unlikely(repr->ring == NULL)) {
> +		PMD_RX_LOG(ERR, "representor %s has no ring configured!",
> +				repr->name);
> +		return 0;
> +	}
> +
> +	total_dequeue = rte_ring_dequeue_burst(repr->ring, (void *)rx_pkts,
> +			nb_pkts, &available);
> +	if (total_dequeue) {

Compare vs 0
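
That is, spell the comparison out for integers and pointers instead of
relying on implicit truthiness. A minimal sketch of the style; the
example_account_rx() helper is hypothetical and only mirrors the shape of
the Rx burst tail:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: add a dequeue count to a packet counter. */
static unsigned int
example_account_rx(unsigned int total_dequeue, unsigned long *ipackets)
{
	if (ipackets == NULL)		/* pointer: compare against NULL */
		return 0;

	if (total_dequeue != 0)		/* integer: compare against 0 */
		*ipackets += total_dequeue;

	return total_dequeue;
}

/* The counter only moves when something was dequeued. */
static void
example_account_self_check(void)
{
	unsigned long ipackets = 0;

	assert(example_account_rx(3, &ipackets) == 3);
	assert(example_account_rx(0, &ipackets) == 0);
	assert(ipackets == 3);
}
```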

> +		PMD_RX_LOG(DEBUG, "Representor Rx burst for %s, port_id: 0x%x, "
> +				"received: %u, available: %u", repr->name,
> +				repr->port_id, total_dequeue, available);
> +
> +		repr->repr_stats.ipackets += total_dequeue;
> +	}
> +
> +	return total_dequeue;
> +}
> +
> +static uint16_t
> +nfp_flower_repr_tx_burst(void *tx_queue,
> +		struct rte_mbuf **tx_pkts,
> +		uint16_t nb_pkts)
> +{
> +	uint16_t i;
> +	uint16_t sent;
> +	char *meta_offset;
> +	struct nfp_net_txq *txq;
> +	struct nfp_net_hw *pf_hw;
> +	struct rte_eth_dev *dev;
> +	struct rte_eth_dev *repr_dev;
> +	struct nfp_flower_representor *repr;
> +
> +	txq = tx_queue;
> +	if (unlikely(txq == NULL)) {
> +		PMD_TX_LOG(ERR, "TX Bad queue");
> +		return 0;
> +	}
> +
> +	/* This points to the PF vNIC that owns this representor */
> +	pf_hw = txq->hw;
> +	dev = pf_hw->eth_dev;
> +
> +	/* Grab a handle to the representor struct */
> +	repr_dev = &rte_eth_devices[txq->port_id];
> +	repr = repr_dev->data->dev_private;
> +
> +	for (i = 0; i < nb_pkts; i++) {
> +		meta_offset = rte_pktmbuf_prepend(tx_pkts[i], FLOWER_PKT_DATA_OFFSET);
> +		*(uint32_t *)meta_offset = rte_cpu_to_be_32(NFP_NET_META_PORTID);
> +		meta_offset += 4;
> +		*(uint32_t *)meta_offset = rte_cpu_to_be_32(repr->port_id);
> +	}
> +
> +	/* Only using Tx queue 0 for now. */
> +	sent = rte_eth_tx_burst(dev->data->port_id, 0, tx_pkts, nb_pkts);
> +	if (sent) {

Compare vs 0
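
As an aside on the loop above: it prepends two big-endian 32-bit words,
the metadata type and then the representor port id. A plain-buffer sketch
of that layout follows; it assumes FLOWER_PKT_DATA_OFFSET is 8 (the macro
is not defined in this hunk) and every example_* name is hypothetical.
Note also that rte_pktmbuf_prepend() returns NULL when the mbuf lacks
headroom, and the loop does not check for that.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define EXAMPLE_META_PORTID 5u	/* NFP_NET_META_PORTID in the quoted header */
#define EXAMPLE_META_LEN    8u	/* assumed value of FLOWER_PKT_DATA_OFFSET */

/* Store a 32-bit value in network (big-endian) byte order. */
static void
example_put_be32(uint8_t *dst, uint32_t v)
{
	dst[0] = (uint8_t)(v >> 24);
	dst[1] = (uint8_t)(v >> 16);
	dst[2] = (uint8_t)(v >> 8);
	dst[3] = (uint8_t)v;
}

/* Fill the metadata words that precede the packet data. */
static void
example_write_meta(uint8_t hdr[EXAMPLE_META_LEN], uint32_t port_id)
{
	example_put_be32(hdr, EXAMPLE_META_PORTID);
	example_put_be32(hdr + 4, port_id);
}

/* Check the byte layout for one PCIe-type port id. */
static void
example_meta_self_check(void)
{
	static const uint8_t expect[EXAMPLE_META_LEN] = {
		0x00, 0x00, 0x00, 0x05, 0x20, 0x00, 0x41, 0x40
	};
	uint8_t hdr[EXAMPLE_META_LEN];

	example_write_meta(hdr, 0x20004140u);
	assert(memcmp(hdr, expect, sizeof(expect)) == 0);
}
```

Writing byte by byte sidesteps the alignment question that a direct
uint32_t store through a char pointer would raise.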

> +		PMD_TX_LOG(DEBUG, "Representor Tx burst for %s, port_id: 0x%x "
> +			"transmitted: %u\n", repr->name, repr->port_id, sent);
> +		repr->repr_stats.opackets += sent;
> +	}
> +
> +	return sent;
> +}
> +
> +static const struct eth_dev_ops nfp_flower_repr_dev_ops = {
> +	.dev_infos_get        = nfp_flower_repr_dev_infos_get,
> +
> +	.dev_start            = nfp_flower_repr_dev_start,
> +	.dev_configure        = nfp_flower_repr_dev_configure,
> +	.dev_stop             = nfp_flower_repr_dev_stop,
> +
> +	.rx_queue_setup       = nfp_flower_repr_rx_queue_setup,
> +	.tx_queue_setup       = nfp_flower_repr_tx_queue_setup,
> +
> +	.link_update          = nfp_flower_repr_link_update,
> +
> +	.stats_get            = nfp_flower_repr_stats_get,
> +	.stats_reset          = nfp_flower_repr_stats_reset,
> +
> +	.promiscuous_enable   = nfp_flower_repr_promiscuous_enable,
> +	.promiscuous_disable  = nfp_flower_repr_promiscuous_disable,
> +
> +	.mac_addr_remove      = nfp_flower_repr_mac_addr_remove,
> +	.mac_addr_set         = nfp_flower_repr_mac_addr_set,
> +};
> +
> +static uint32_t
> +nfp_flower_get_phys_port_id(uint8_t port)
> +{
> +	return (NFP_FLOWER_CMSG_PORT_TYPE_PHYS_PORT << 28) | port;
> +}
> +
> +static uint32_t
> +nfp_get_pcie_port_id(struct nfp_cpp *cpp,
> +		int type,
> +		uint8_t vnic,
> +		uint8_t queue)
> +{
> +	uint8_t nfp_pcie;
> +	uint32_t port_id;
> +
> +	nfp_pcie = NFP_CPP_INTERFACE_UNIT_of(nfp_cpp_interface(cpp));
> +	port_id = ((nfp_pcie & 0x3) << 14) |
> +		  ((type & 0x3) << 12) |
> +		  ((vnic & 0x3f) << 6) |
> +		  (queue & 0x3f) |
> +		  ((NFP_FLOWER_CMSG_PORT_TYPE_PCIE_PORT & 0xf) << 28);
> +
> +	return port_id;
> +}
> +
> +static int
> +nfp_flower_repr_init(struct rte_eth_dev *eth_dev,
> +		void *init_params)
> +{
> +	int ret;
> +	unsigned int numa_node;
> +	char ring_name[RTE_ETH_NAME_MAX_LEN];
> +	struct nfp_app_flower *app_flower;
> +	struct nfp_flower_representor *repr;
> +	struct nfp_flower_representor *init_repr_data;
> +
> +	/* Cast the input representor data to the correct struct here */
> +	init_repr_data = (struct nfp_flower_representor *)init_params;
> +
> +	app_flower = init_repr_data->app_flower;
> +
> +	/* Memory has been allocated in the eth_dev_create() function */
> +	repr = eth_dev->data->dev_private;
> +
> +	/*
> +	 * We need multi-producer rings as we can have multiple PF ports.
> +	 * On the other hand, we need single consumer rings, as just one
> +	 * representor PMD will try to read from the ring.
> +	 */
> +	snprintf(ring_name, sizeof(ring_name), "%s_%s",
> +		init_repr_data->name, "ring");
> +	numa_node = rte_socket_id();
> +	repr->ring = rte_ring_create(ring_name, 256, numa_node, RING_F_SC_DEQ);
> +	if (repr->ring == NULL) {
> +		PMD_INIT_LOG(ERR, "rte_ring_create failed for %s\n", ring_name);
> +		return -ENOMEM;
> +	}
> +
> +	/* Copy data here from the input representor template*/
> +	repr->vf_id            = init_repr_data->vf_id;
> +	repr->switch_domain_id = init_repr_data->switch_domain_id;
> +	repr->port_id          = init_repr_data->port_id;
> +	repr->nfp_idx          = init_repr_data->nfp_idx;
> +	repr->repr_type        = init_repr_data->repr_type;
> +	repr->app_flower       = init_repr_data->app_flower;
> +
> +	snprintf(repr->name, sizeof(repr->name), "%s", init_repr_data->name);
> +
> +	eth_dev->dev_ops = &nfp_flower_repr_dev_ops;
> +
> +	eth_dev->rx_pkt_burst = nfp_flower_repr_rx_burst;
> +	eth_dev->tx_pkt_burst = nfp_flower_repr_tx_burst;
> +
> +	eth_dev->data->dev_flags |= RTE_ETH_DEV_REPRESENTOR;
> +
> +	if (repr->repr_type == NFP_REPR_TYPE_PHYS_PORT)
> +		eth_dev->data->representor_id = repr->vf_id;
> +	else
> +		eth_dev->data->representor_id = repr->vf_id +
> +				app_flower->num_phyport_reprs;
> +
> +	/* This backer port is that of the eth_device created for the PF vNIC */
> +	eth_dev->data->backer_port_id = app_flower->pf_hw->eth_dev->data->port_id;
> +
> +	/* Only single queues for representor devices */
> +	eth_dev->data->nb_rx_queues = 1;
> +	eth_dev->data->nb_tx_queues = 1;
> +
> +	/* Allocating memory for mac addr */
> +	eth_dev->data->mac_addrs = rte_zmalloc("mac_addr",
> +		RTE_ETHER_ADDR_LEN, 0);
> +	if (eth_dev->data->mac_addrs == NULL) {
> +		PMD_INIT_LOG(ERR, "Failed to allocate memory for repr MAC");
> +		ret = -ENOMEM;
> +		goto ring_cleanup;
> +	}
> +
> +	rte_ether_addr_copy(&init_repr_data->mac_addr, &repr->mac_addr);
> +	rte_ether_addr_copy(&init_repr_data->mac_addr, eth_dev->data->mac_addrs);
> +
> +	/* Send reify message to hardware to inform it about the new repr */
> +	ret = nfp_flower_cmsg_repr_reify(app_flower, repr);
> +	if (ret) {

Compare vs 0

> +		PMD_INIT_LOG(WARNING, "Failed to send repr reify message");
> +		goto mac_cleanup;
> +	}
> +
> +	/* Add repr to correct array */
> +	if (repr->repr_type == NFP_REPR_TYPE_PHYS_PORT)
> +		app_flower->phy_reprs[repr->nfp_idx] = repr;
> +	else
> +		app_flower->vf_reprs[repr->vf_id] = repr;
> +
> +	return 0;
> +
> +mac_cleanup:
> +	rte_free(eth_dev->data->mac_addrs);
> +ring_cleanup:
> +	rte_ring_free(repr->ring);
> +
> +	return ret;
> +}
> +
> +int
> +nfp_flower_repr_alloc(struct nfp_app_flower *app_flower)
> +{
> +	int i;
> +	int ret;
> +	struct rte_eth_dev *eth_dev;
> +	struct nfp_eth_table *nfp_eth_table;
> +	struct nfp_eth_table_port *eth_port;
> +	struct nfp_flower_representor flower_repr = {
> +		.switch_domain_id = app_flower->switch_domain_id,
> +		.app_flower       = app_flower,
> +	};
> +
> +	nfp_eth_table = app_flower->nfp_eth_table;
> +	eth_dev = app_flower->pf_hw->eth_dev;
> +
> +	/* Send a NFP_FLOWER_CMSG_TYPE_MAC_REPR cmsg to hardware*/
> +	ret = nfp_flower_cmsg_mac_repr(app_flower);
> +	if (ret) {

Compare vs 0

> +		PMD_INIT_LOG(ERR, "Could not send mac repr cmsgs");
> +		return ret;
> +	}
> +
> +	/* Create a rte_eth_dev for every phyport representor */
> +	for (i = 0; i < app_flower->num_phyport_reprs; i++) {
> +		eth_port = &nfp_eth_table->ports[i];
> +		flower_repr.repr_type = NFP_REPR_TYPE_PHYS_PORT;
> +		flower_repr.port_id = nfp_flower_get_phys_port_id(eth_port->index);
> +		flower_repr.nfp_idx = eth_port->eth_index;
> +		flower_repr.vf_id = i;
> +
> +		/* Copy the real mac of the interface to the representor struct */
> +		rte_ether_addr_copy((struct rte_ether_addr *)eth_port->mac_addr,
> +				&flower_repr.mac_addr);
> +		sprintf(flower_repr.name, "flower_repr_p%d", i);
> +
> +		/*
> +		 * Create a eth_dev for this representor
> +		 * This will also allocate private memory for the device
> +		 */
> +		ret = rte_eth_dev_create(eth_dev->device, flower_repr.name,
> +				sizeof(struct nfp_flower_representor),
> +				NULL, NULL, nfp_flower_repr_init, &flower_repr);
> +		if (ret) {

Compare vs 0

> +			PMD_INIT_LOG(ERR, "Could not create eth_dev for repr");
> +			break;
> +		}
> +	}
> +
> +	if (i < app_flower->num_phyport_reprs)
> +		return ret;
> +
> +	/*
> +	 * Now allocate eth_dev's for VF representors.
> +	 * Also send reify messages
> +	 */
> +	for (i = 0; i < app_flower->num_vf_reprs; i++) {
> +		flower_repr.repr_type = NFP_REPR_TYPE_VF;
> +		flower_repr.port_id = nfp_get_pcie_port_id(app_flower->pf_hw->cpp,
> +				NFP_FLOWER_CMSG_PORT_VNIC_TYPE_VF, i, 0);
> +		flower_repr.nfp_idx = 0;
> +		flower_repr.vf_id = i;
> +
> +		/* VF reprs get a random MAC address */
> +		rte_eth_random_addr(flower_repr.mac_addr.addr_bytes);
> +
> +		sprintf(flower_repr.name, "flower_repr_vf%d", i);
> +
> +		 /* This will also allocate private memory for the device*/
> +		ret = rte_eth_dev_create(eth_dev->device, flower_repr.name,
> +				sizeof(struct nfp_flower_representor),
> +				NULL, NULL, nfp_flower_repr_init, &flower_repr);
> +		if (ret) {

Compare vs 0

> +			PMD_INIT_LOG(ERR, "Could not create eth_dev for repr");
> +			break;
> +		}
> +	}
> +
> +	if (i < app_flower->num_vf_reprs)
> +		return ret;
> +
> +	return 0;
> +}
> diff --git a/drivers/net/nfp/flower/nfp_flower_representor.h b/drivers/net/nfp/flower/nfp_flower_representor.h
> new file mode 100644
> index 0000000..6ee54f1
> --- /dev/null
> +++ b/drivers/net/nfp/flower/nfp_flower_representor.h
> @@ -0,0 +1,39 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2022 Corigine, Inc.
> + * All rights reserved.
> + */
> +
> +#ifndef _NFP_FLOWER_REPRESENTOR_H_
> +#define _NFP_FLOWER_REPRESENTOR_H_
> +
> +/*
> + * enum nfp_repr_type - type of representor
> + * @NFP_REPR_TYPE_PHYS_PORT:   external NIC port
> + * @NFP_REPR_TYPE_PF:          physical function
> + * @NFP_REPR_TYPE_VF:          virtual function
> + * @NFP_REPR_TYPE_MAX:         number of representor types
> + */
> +enum nfp_repr_type {
> +	NFP_REPR_TYPE_PHYS_PORT = 0,
> +	NFP_REPR_TYPE_PF,
> +	NFP_REPR_TYPE_VF,
> +	NFP_REPR_TYPE_MAX,
> +};
> +
> +struct nfp_flower_representor {
> +	uint16_t vf_id;
> +	uint16_t switch_domain_id;
> +	uint32_t repr_type;
> +	uint32_t port_id;
> +	uint32_t nfp_idx;    /* only valid for the repr of physical port */
> +	char name[RTE_ETH_NAME_MAX_LEN];
> +	struct rte_ether_addr mac_addr;
> +	struct nfp_app_flower *app_flower;
> +	struct rte_ring *ring;
> +	struct rte_eth_link *link;
> +	struct rte_eth_stats repr_stats;
> +};
> +
> +int nfp_flower_repr_alloc(struct nfp_app_flower *app_flower);
> +
> +#endif /* _NFP_FLOWER_REPRESENTOR_H_ */
> diff --git a/drivers/net/nfp/meson.build b/drivers/net/nfp/meson.build
> index 8710213..8a63979 100644
> --- a/drivers/net/nfp/meson.build
> +++ b/drivers/net/nfp/meson.build
> @@ -7,7 +7,9 @@ if not is_linux or not dpdk_conf.get('RTE_ARCH_64')
>   endif
>   sources = files(
>           'flower/nfp_flower.c',
> +        'flower/nfp_flower_cmsg.c',
>           'flower/nfp_flower_ctrl.c',
> +        'flower/nfp_flower_representor.c',
>           'nfpcore/nfp_cpp_pcie_ops.c',
>           'nfpcore/nfp_nsp.c',
>           'nfpcore/nfp_cppcore.c',
> diff --git a/drivers/net/nfp/nfpcore/nfp_cpp_pcie_ops.c b/drivers/net/nfp/nfpcore/nfp_cpp_pcie_ops.c
> index 08bc4e8..22c8bc4 100644
> --- a/drivers/net/nfp/nfpcore/nfp_cpp_pcie_ops.c
> +++ b/drivers/net/nfp/nfpcore/nfp_cpp_pcie_ops.c
> @@ -91,7 +91,10 @@
>    * @refcnt:	number of current users
>    * @iomem:	mapped IO memory
>    */
> +#define NFP_BAR_MIN 1
> +#define NFP_BAR_MID 5
>   #define NFP_BAR_MAX 7
> +
>   struct nfp_bar {
>   	struct nfp_pcie_user *nfp;
>   	uint32_t barcfg;
> @@ -292,6 +295,7 @@ struct nfp_pcie_user {
>    * BAR0.0: Reserved for General Mapping (for MSI-X access to PCIe SRAM)
>    *
>    *         Halving PCItoCPPBars for primary and secondary processes.
> + *         For CoreNIC firmware:
>    *         NFP PMD just requires two fixed slots, one for configuration BAR,
>    *         and another for accessing the hw queues. Another slot is needed
>    *         for setting the link up or down. Secondary processes do not need
> @@ -301,6 +305,9 @@ struct nfp_pcie_user {
>    *         supported. Due to this requirement and future extensions requiring
>    *         new slots per process, only one secondary process is supported by
>    *         now.
> + *         For Flower firmware:
> + *         NFP PMD needs another fixed slot, used as the configuration BAR
> + *         for ctrl vNIC.
>    */
>   static int
>   nfp_enable_bars(struct nfp_pcie_user *nfp)
> @@ -309,11 +316,11 @@ struct nfp_pcie_user {
>   	int x, start, end;
>   
>   	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
> -		start = 4;
> -		end = 1;
> +		start = NFP_BAR_MID;
> +		end = NFP_BAR_MIN;

These and similar changes below look unrelated to the patch.
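
Note that the rename also changes behaviour, since NFP_BAR_MID is 5 where
the old literal was 4. A small sketch that lists which bar[] indices each
process walks under the loop "for (x = start; x > end; x--) bar[x - 1]";
the example_bar_slots() helper is hypothetical:

```c
#include <assert.h>

/* Record the bar[] indices visited for the given start/end bounds. */
static int
example_bar_slots(int start, int end, int out[], int cap)
{
	int n = 0;
	int x;

	for (x = start; x > end && n < cap; x--)
		out[n++] = x - 1;

	return n;
}

/* Old literals versus the new NFP_BAR_MIN/MID/MAX values (1, 5, 7). */
static void
example_bar_self_check(void)
{
	int slots[8];

	/* Before: primary 4..1 and secondary 7..4, three slots each. */
	assert(example_bar_slots(4, 1, slots, 8) == 3);
	assert(example_bar_slots(7, 4, slots, 8) == 3);

	/* After: primary gets four slots, secondary is left with two. */
	assert(example_bar_slots(5, 1, slots, 8) == 4);
	assert(example_bar_slots(7, 5, slots, 8) == 2);
}
```

So the primary process gains bar[4] and the secondary process loses it,
which deserves its own patch and commit message.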

>   	} else {
> -		start = 7;
> -		end = 4;
> +		start = NFP_BAR_MAX;
> +		end = NFP_BAR_MID;
>   	}
>   	for (x = start; x > end; x--) {
>   		bar = &nfp->bar[x - 1];
> @@ -341,11 +348,11 @@ struct nfp_pcie_user {
>   	int x, start, end;
>   
>   	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
> -		start = 4;
> -		end = 1;
> +		start = NFP_BAR_MID;
> +		end = NFP_BAR_MIN;
>   	} else {
> -		start = 7;
> -		end = 4;
> +		start = NFP_BAR_MAX;
> +		end = NFP_BAR_MID;
>   	}
>   	for (x = start; x > end; x--) {
>   		bar = &nfp->bar[x - 1];
> @@ -364,11 +371,11 @@ struct nfp_pcie_user {
>   	int x, start, end;
>   
>   	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
> -		start = 4;
> -		end = 1;
> +		start = NFP_BAR_MID;
> +		end = NFP_BAR_MIN;
>   	} else {
> -		start = 7;
> -		end = 4;
> +		start = NFP_BAR_MAX;
> +		end = NFP_BAR_MID;
>   	}
>   
>   	for (x = start; x > end; x--) {


^ permalink raw reply	[flat|nested] 29+ messages in thread

* RE: [PATCH v5 07/12] net/nfp: add flower ctrl VNIC related logics
  2022-08-05 13:05   ` Andrew Rybchenko
@ 2022-08-08 11:32     ` Chaoyong He
  2022-08-08 14:45       ` Stephen Hemminger
  0 siblings, 1 reply; 29+ messages in thread
From: Chaoyong He @ 2022-08-08 11:32 UTC (permalink / raw)
  To: Andrew Rybchenko; +Cc: Niklas Soderlund, dev

> On 8/5/22 09:32, Chaoyong He wrote:
> > This commit adds the setup/start logic for the ctrl vNIC. This vNIC
> 
> "This commit adds" -> "Add"
> 
> > is used by the PMD and flower firmware as a communication channel
> > between driver and firmware. In the case of OVS it is also used to
> > communicate flow statistics from hardware to the driver.
> >
> > A rte_eth device is not exposed to DPDK for this vNIC as it is
> > strictly used internally by flower logic. Rx and Tx logic will be
> > added later for this vNIC.
> >
> > Signed-off-by: Chaoyong He <chaoyong.he@corigine.com>
> > Reviewed-by: Niklas Söderlund <niklas.soderlund@corigine.com>
> > ---
> >   drivers/net/nfp/flower/nfp_flower.c | 385 +++++++++++++++++++++++++++++++++++-
> >   drivers/net/nfp/flower/nfp_flower.h |   6 +
> >   2 files changed, 388 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/net/nfp/flower/nfp_flower.c b/drivers/net/nfp/flower/nfp_flower.c
> > index 2498020..51df504 100644
> > --- a/drivers/net/nfp/flower/nfp_flower.c
> > +++ b/drivers/net/nfp/flower/nfp_flower.c
> > @@ -26,6 +26,10 @@
> >   #define MEMPOOL_CACHE_SIZE 512
> >   #define DEFAULT_FLBUF_SIZE 9216
> >
> > +#define CTRL_VNIC_NB_DESC 64
> > +#define CTRL_VNIC_RX_FREE_THRESH 32
> > +#define CTRL_VNIC_TX_FREE_THRESH 32
> > +
> >   /*
> >    * Simple dev ops functions for the flower PF. Because a rte_device is exposed
> >    * to DPDK the flower logic also makes use of helper functions like
> > @@ -543,6 +547,302 @@
> >   	return ret;
> >   }
> >
> > +static void
> > +nfp_flower_cleanup_ctrl_vnic(struct nfp_net_hw *hw)
> > +{
> > +	uint32_t i;
> > +	struct nfp_net_rxq *rxq;
> > +	struct nfp_net_txq *txq;
> > +	struct rte_eth_dev *eth_dev;
> > +
> > +	eth_dev = hw->eth_dev;
> > +
> > +	for (i = 0; i < hw->max_tx_queues; i++) {
> > +		txq = eth_dev->data->tx_queues[i];
> > +		if (txq) {
> 
> Compare vs NULL as you do in other cases and as DPDK coding style
> recommends.
> 
> > +			rte_free(txq->txbufs);
> > +			rte_eth_dma_zone_free(eth_dev, "ctrl_tx_ring", i);
> > +			rte_free(txq);
> > +		}
> > +	}
> > +
> > +	for (i = 0; i < hw->max_rx_queues; i++) {
> > +		rxq = eth_dev->data->rx_queues[i];
> > +		if (rxq) {
> 
> Compare vs NULL
> 
> > +			rte_free(rxq->rxbufs);
> > +			rte_eth_dma_zone_free(eth_dev, "ctrl_rx_ring", i);
> > +			rte_free(rxq);
> > +		}
> > +	}
> > +
> > +	rte_free(eth_dev->data->tx_queues);
> > +	rte_free(eth_dev->data->rx_queues);
> > +	rte_free(eth_dev->data);
> > +	rte_free(eth_dev);
> > +}
> > +
> > +static int
> > +nfp_flower_init_ctrl_vnic(struct nfp_net_hw *hw)
> > +{
> > +	uint32_t i;
> > +	int ret = 0;
> > +	uint16_t nb_desc;
> > +	unsigned int numa_node;
> > +	struct rte_mempool *mp;
> > +	uint16_t rx_free_thresh;
> > +	uint16_t tx_free_thresh;
> > +	struct nfp_net_rxq *rxq;
> > +	struct nfp_net_txq *txq;
> > +	struct nfp_pf_dev *pf_dev;
> > +	struct rte_eth_dev *eth_dev;
> > +	const struct rte_memzone *tz;
> > +	struct nfp_app_flower *app_flower;
> > +
> > +	/* Hardcoded values for now */
> > +	nb_desc = CTRL_VNIC_NB_DESC;
> > +	rx_free_thresh = CTRL_VNIC_RX_FREE_THRESH;
> 
> What's the point to introduce the variable and use it only once below?
> 
> > +	tx_free_thresh = CTRL_VNIC_TX_FREE_THRESH;
> 
> Same here.
> 
> > +	numa_node = rte_socket_id();
> > +
> > +	/* Set up some pointers here for ease of use */
> > +	pf_dev = hw->pf_dev;
> > +	app_flower = NFP_APP_PRIV_TO_APP_FLOWER(pf_dev->app_priv);
> > +
> > +	ret = nfp_flower_init_vnic_common(hw, "ctrl_vnic");
> > +	if (ret)
> 
> Compare vs 0
> 
> > +		goto done;
> > +
> > +	/* Allocate memory for the eth_dev of the vNIC */
> > +	hw->eth_dev = rte_zmalloc("ctrl_vnic_eth_dev",
> 
> Why not rte_eth_dev_allocate()? Isn't it an ethdev?
> Why do you bypass the ethdev layer completely in this case and do everything
> yourself?

Here we create an ethdev that is local to the nfp PMD; we do not want the user to be aware of it at all.
If we used rte_eth_dev_allocate() to create it, it would appear in the 'rte_eth_devices[]' array, which is not what we want.

> 
> > +		sizeof(struct rte_eth_dev), RTE_CACHE_LINE_SIZE);
> > +	if (hw->eth_dev == NULL) {
> > +		ret = -ENOMEM;
> > +		goto done;
> > +	}
> > +
> > +	/* Grab the pointer to the newly created rte_eth_dev here */
> > +	eth_dev = hw->eth_dev;
> > +
> > +	/* Also allocate memory for the data part of the eth_dev */
> > +	eth_dev->data = rte_zmalloc("ctrl_vnic_eth_dev_data",
> > +		sizeof(struct rte_eth_dev_data), RTE_CACHE_LINE_SIZE);
> > +	if (eth_dev->data == NULL) {
> > +		ret = -ENOMEM;
> > +		goto eth_dev_cleanup;
> > +	}
> > +
> > +	eth_dev->data->rx_queues = rte_zmalloc("ethdev->rx_queues",
> > +		sizeof(eth_dev->data->rx_queues[0]) * hw->max_rx_queues,
> > +		RTE_CACHE_LINE_SIZE);
> > +	if (eth_dev->data->rx_queues == NULL) {
> > +		PMD_INIT_LOG(ERR, "rte_zmalloc failed for ctrl vnic rx queues");
> > +		ret = -ENOMEM;
> > +		goto dev_data_cleanup;
> > +	}
> > +
> > +	eth_dev->data->tx_queues = rte_zmalloc("ethdev->tx_queues",
> > +		sizeof(eth_dev->data->tx_queues[0]) * hw->max_tx_queues,
> > +		RTE_CACHE_LINE_SIZE);
> > +	if (eth_dev->data->tx_queues == NULL) {
> > +		PMD_INIT_LOG(ERR, "rte_zmalloc failed for ctrl vnic tx queues");
> > +		ret = -ENOMEM;
> > +		goto rx_queue_cleanup;
> > +	}
> > +
> > +	eth_dev->device = &pf_dev->pci_dev->device;
> > +	eth_dev->data->nb_tx_queues = hw->max_tx_queues;
> > +	eth_dev->data->nb_rx_queues = hw->max_rx_queues;
> > +	eth_dev->data->dev_private = hw;
> > +
> > +	/* Create a mbuf pool for the vNIC */
> > +	app_flower->ctrl_pktmbuf_pool = rte_pktmbuf_pool_create("ctrl_mbuf_pool",
> > +		4 * nb_desc, 64, 0, 9216, numa_node);
> > +	if (app_flower->ctrl_pktmbuf_pool == NULL) {
> > +		PMD_INIT_LOG(ERR, "create mbuf pool for ctrl vnic failed");
> > +		ret = -ENOMEM;
> > +		goto tx_queue_cleanup;
> > +	}
> > +
> > +	mp = app_flower->ctrl_pktmbuf_pool;
> > +
> > +	/* Set up the Rx queues */
> > +	PMD_INIT_LOG(INFO, "Configuring flower ctrl vNIC Rx queue");
> > +	for (i = 0; i < hw->max_rx_queues; i++) {
> > +		/* Hardcoded number of desc to 64 */
> > +		rxq = rte_zmalloc_socket("ethdev RX queue",
> > +			sizeof(struct nfp_net_rxq), RTE_CACHE_LINE_SIZE,
> > +			numa_node);
> > +		if (rxq == NULL) {
> > +			PMD_DRV_LOG(ERR, "Error allocating rxq");
> > +			ret = -ENOMEM;
> > +			goto rx_queue_setup_cleanup;
> > +		}
> > +
> > +		eth_dev->data->rx_queues[i] = rxq;
> > +
> > +		/* Hw queues mapping based on firmware configuration */
> > +		rxq->qidx = i;
> > +		rxq->fl_qcidx = i * hw->stride_rx;
> > +		rxq->rx_qcidx = rxq->fl_qcidx + (hw->stride_rx - 1);
> > +		rxq->qcp_fl = hw->rx_bar + NFP_QCP_QUEUE_OFF(rxq->fl_qcidx);
> > +		rxq->qcp_rx = hw->rx_bar + NFP_QCP_QUEUE_OFF(rxq->rx_qcidx);
> > +
> > +		/*
> > +		 * Tracking mbuf size for detecting a potential mbuf overflow due to
> > +		 * RX offset
> > +		 */
> > +		rxq->mem_pool = mp;
> > +		rxq->mbuf_size = rxq->mem_pool->elt_size;
> > +		rxq->mbuf_size -= (sizeof(struct rte_mbuf) + RTE_PKTMBUF_HEADROOM);
> > +		hw->flbufsz = rxq->mbuf_size;
> > +
> > +		rxq->rx_count = nb_desc;
> > +		rxq->rx_free_thresh = rx_free_thresh;
> > +		rxq->drop_en = 1;
> > +
> > +		/*
> > +		 * Allocate RX ring hardware descriptors. A memzone large enough to
> > +		 * handle the maximum ring size is allocated in order to allow for
> > +		 * resizing in later calls to the queue setup function.
> > +		 */
> > +		tz = rte_eth_dma_zone_reserve(eth_dev, "ctrl_rx_ring", i,
> > +			sizeof(struct nfp_net_rx_desc) * NFP_NET_MAX_RX_DESC,
> > +			NFP_MEMZONE_ALIGN, numa_node);
> > +		if (tz == NULL) {
> > +			PMD_DRV_LOG(ERR, "Error allocating rx dma");
> > +			rte_free(rxq);
> > +			ret = -ENOMEM;
> > +			goto rx_queue_setup_cleanup;
> > +		}
> > +
> > +		/* Saving physical and virtual addresses for the RX ring */
> > +		rxq->dma = (uint64_t)tz->iova;
> > +		rxq->rxds = (struct nfp_net_rx_desc *)tz->addr;
> > +
> > +		/* mbuf pointers array for referencing mbufs linked to RX descriptors */
> > +		rxq->rxbufs = rte_zmalloc_socket("rxq->rxbufs",
> > +			sizeof(*rxq->rxbufs) * nb_desc, RTE_CACHE_LINE_SIZE,
> > +			numa_node);
> > +		if (rxq->rxbufs == NULL) {
> > +			rte_eth_dma_zone_free(eth_dev, "ctrl_rx_ring", i);
> > +			rte_free(rxq);
> > +			ret = -ENOMEM;
> > +			goto rx_queue_setup_cleanup;
> > +		}
> > +
> > +		nfp_net_reset_rx_queue(rxq);
> > +
> > +		rxq->hw = hw;
> > +
> > +		/*
> > +		 * Telling the HW about the physical address of the RX ring and number
> > +		 * of descriptors in log2 format
> > +		 */
> > +		nn_cfg_writeq(hw, NFP_NET_CFG_RXR_ADDR(i), rxq->dma);
> > +		nn_cfg_writeb(hw, NFP_NET_CFG_RXR_SZ(i), rte_log2_u32(nb_desc));
> > +	}
> > +
> > +	/* Now the Tx queues */
> > +	PMD_INIT_LOG(INFO, "Configuring flower ctrl vNIC Tx queue");
> > +	for (i = 0; i < hw->max_tx_queues; i++) {
> > +		/* Hardcoded number of desc to 64 */
> > +		/* Allocating tx queue data structure */
> > +		txq = rte_zmalloc_socket("ethdev TX queue",
> > +			sizeof(struct nfp_net_txq), RTE_CACHE_LINE_SIZE,
> > +			numa_node);
> > +		if (txq == NULL) {
> > +			PMD_DRV_LOG(ERR, "Error allocating txq");
> > +			ret = -ENOMEM;
> > +			goto tx_queue_setup_cleanup;
> > +		}
> > +
> > +		eth_dev->data->tx_queues[i] = txq;
> > +
> > +		/*
> > +		 * Allocate TX ring hardware descriptors. A memzone large enough to
> > +		 * handle the maximum ring size is allocated in order to allow for
> > +		 * resizing in later calls to the queue setup function.
> > +		 */
> > +		tz = rte_eth_dma_zone_reserve(eth_dev, "ctrl_tx_ring", i,
> > +			sizeof(struct nfp_net_nfd3_tx_desc) * NFP_NET_MAX_TX_DESC,
> > +			NFP_MEMZONE_ALIGN, numa_node);
> > +		if (tz == NULL) {
> > +			PMD_DRV_LOG(ERR, "Error allocating tx dma");
> > +			rte_free(txq);
> > +			ret = -ENOMEM;
> > +			goto tx_queue_setup_cleanup;
> > +		}
> > +
> > +		txq->tx_count = nb_desc;
> > +		txq->tx_free_thresh = tx_free_thresh;
> > +		txq->tx_pthresh = DEFAULT_TX_PTHRESH;
> > +		txq->tx_hthresh = DEFAULT_TX_HTHRESH;
> > +		txq->tx_wthresh = DEFAULT_TX_WTHRESH;
> > +
> > +		/* queue mapping based on firmware configuration */
> > +		txq->qidx = i;
> > +		txq->tx_qcidx = i * hw->stride_tx;
> > +		txq->qcp_q = hw->tx_bar + NFP_QCP_QUEUE_OFF(txq->tx_qcidx);
> > +
> > +		/* Saving physical and virtual addresses for the TX ring */
> > +		txq->dma = (uint64_t)tz->iova;
> > +		txq->txds = (struct nfp_net_nfd3_tx_desc *)tz->addr;
> > +
> > +		/* mbuf pointers array for referencing mbufs linked to TX descriptors */
> > +		txq->txbufs = rte_zmalloc_socket("txq->txbufs",
> > +			sizeof(*txq->txbufs) * nb_desc, RTE_CACHE_LINE_SIZE,
> > +			numa_node);
> > +		if (txq->txbufs == NULL) {
> > +			rte_eth_dma_zone_free(eth_dev, "ctrl_tx_ring", i);
> > +			rte_free(txq);
> > +			ret = -ENOMEM;
> > +			goto tx_queue_setup_cleanup;
> > +		}
> > +
> > +		nfp_net_reset_tx_queue(txq);
> > +
> > +		txq->hw = hw;
> > +
> > +		/*
> > +		 * Telling the HW about the physical address of the TX ring and number
> > +		 * of descriptors in log2 format
> > +		 */
> > +		nn_cfg_writeq(hw, NFP_NET_CFG_TXR_ADDR(i), txq->dma);
> > +		nn_cfg_writeb(hw, NFP_NET_CFG_TXR_SZ(i), rte_log2_u32(nb_desc));
> > +	}
> > +
> > +	return 0;
> > +
> > +tx_queue_setup_cleanup:
> > +	for (i = 0; i < hw->max_tx_queues; i++) {
> > +		txq = eth_dev->data->tx_queues[i];
> > +		if (txq) {
> 
> Compare vs NULL
> 
> > +			rte_free(txq->txbufs);
> > +			rte_eth_dma_zone_free(eth_dev, "ctrl_tx_ring", i);
> > +			rte_free(txq);
> > +		}
> > +	}
> > +rx_queue_setup_cleanup:
> > +	for (i = 0; i < hw->max_rx_queues; i++) {
> > +		rxq = eth_dev->data->rx_queues[i];
> > +		if (rxq) {
> 
> Compare vs NULL
> 
> > +			rte_free(rxq->rxbufs);
> > +			rte_eth_dma_zone_free(eth_dev, "ctrl_rx_ring", i);
> > +			rte_free(rxq);
> > +		}
> > +	}
> > +tx_queue_cleanup:
> > +	rte_free(eth_dev->data->tx_queues);
> > +rx_queue_cleanup:
> > +	rte_free(eth_dev->data->rx_queues);
> > +dev_data_cleanup:
> > +	rte_free(eth_dev->data);
> > +eth_dev_cleanup:
> > +	rte_free(eth_dev);
> > +done:
> > +	return ret;
> > +}
> > +
> >   static int
> >   nfp_flower_start_pf_vnic(struct nfp_net_hw *hw)
> >   {
> > @@ -561,12 +861,57 @@
> >   	return 0;
> >   }
> >
> > +static int
> > +nfp_flower_start_ctrl_vnic(struct nfp_net_hw *hw)
> > +{
> > +	int ret;
> > +	uint32_t update;
> > +	uint32_t new_ctrl;
> > +	struct rte_eth_dev *dev;
> > +
> > +	dev = hw->eth_dev;
> > +
> > +	/* Disabling queues just in case... */
> > +	nfp_net_disable_queues(dev);
> > +
> > +	/* Enabling the required queues in the device */
> > +	nfp_net_enable_queues(dev);
> > +
> > +	/* Writing configuration parameters in the device */
> > +	nfp_net_params_setup(hw);
> > +
> > +	new_ctrl = NFP_NET_CFG_CTRL_ENABLE;
> > +	update = NFP_NET_CFG_UPDATE_GEN | NFP_NET_CFG_UPDATE_RING |
> > +		 NFP_NET_CFG_UPDATE_MSIX;
> > +
> > +	rte_wmb();
> > +
> > +	/* If an error when reconfig we avoid to change hw state */
> > +	ret = nfp_net_reconfig(hw, new_ctrl, update);
> > +	if (ret) {
> 
> Compare vs 0
> 
> > +		PMD_INIT_LOG(ERR, "Failed to reconfig ctrl vnic");
> > +		return -EIO;
> > +	}
> > +
> > +	hw->ctrl = new_ctrl;
> > +
> > +	/* Setup the freelist ring */
> > +	ret = nfp_net_rx_freelist_setup(dev);
> > +	if (ret) {
> 
> Compare vs 0
> 
> > +		PMD_INIT_LOG(ERR, "Error with flower ctrl vNIC freelist setup");
> > +		return -EIO;
> > +	}
> > +
> > +	return 0;
> > +}
> > +
> >   int
> >   nfp_init_app_flower(struct nfp_pf_dev *pf_dev)
> >   {
> >   	int ret;
> >   	unsigned int numa_node;
> >   	struct nfp_net_hw *pf_hw;
> > +	struct nfp_net_hw *ctrl_hw;
> >   	struct nfp_app_flower *app_flower;
> >
> >   	numa_node = rte_socket_id();
> > @@ -612,29 +957,63 @@
> >   	pf_hw->pf_dev = pf_dev;
> >   	pf_hw->cpp = pf_dev->cpp;
> >
> > +	/* The ctrl vNIC struct comes directly after the PF one */
> > +	app_flower->ctrl_hw = pf_hw + 1;
> > +	ctrl_hw = app_flower->ctrl_hw;
> > +
> > +	/* Map the ctrl vNIC ctrl bar */
> > +	ctrl_hw->ctrl_bar = nfp_rtsym_map(pf_dev->sym_tbl, "_pf0_net_ctrl_bar",
> > +		32768, &ctrl_hw->ctrl_area);
> > +	if (ctrl_hw->ctrl_bar == NULL) {
> > +		PMD_INIT_LOG(ERR, "Cloud not map the ctrl vNIC ctrl bar");
> > +		ret = -ENODEV;
> > +		goto pf_cpp_area_cleanup;
> > +	}
> > +
> > +	/* Now populate the ctrl vNIC */
> > +	ctrl_hw->pf_dev = pf_dev;
> > +	ctrl_hw->cpp = pf_dev->cpp;
> > +
> >   	ret = nfp_flower_init_pf_vnic(app_flower->pf_hw);
> >   	if (ret) {
> >   		PMD_INIT_LOG(ERR, "Could not initialize flower PF vNIC");
> > -		goto pf_cpp_area_cleanup;
> > +		goto ctrl_cpp_area_cleanup;
> > +	}
> > +
> > +	ret = nfp_flower_init_ctrl_vnic(app_flower->ctrl_hw);
> > +	if (ret) {
> 
> Compare vs 0
> 
> > +		PMD_INIT_LOG(ERR, "Could not initialize flower ctrl vNIC");
> > +		goto pf_vnic_cleanup;
> >   	}
> >
> >   	/* Start the PF vNIC */
> >   	ret = nfp_flower_start_pf_vnic(app_flower->pf_hw);
> >   	if (ret) {
> >   		PMD_INIT_LOG(ERR, "Could not start flower PF vNIC");
> > -		goto pf_vnic_cleanup;
> > +		goto ctrl_vnic_cleanup;
> > +	}
> > +
> > +	/* Start the ctrl vNIC */
> > +	ret = nfp_flower_start_ctrl_vnic(app_flower->ctrl_hw);
> > +	if (ret) {
> 
> Compare vs 0
> 
> > +		PMD_INIT_LOG(ERR, "Could not start flower ctrl vNIC");
> > +		goto ctrl_vnic_cleanup;
> >   	}
> >
> >   	/* Start up flower services */
> >   	if (nfp_flower_enable_services(app_flower)) {
> >   		ret = -ESRCH;
> > -		goto pf_vnic_cleanup;
> > +		goto ctrl_vnic_cleanup;
> >   	}
> >
> >   	return 0;
> >
> > +ctrl_vnic_cleanup:
> > +	nfp_flower_cleanup_ctrl_vnic(app_flower->ctrl_hw);
> >   pf_vnic_cleanup:
> >   	nfp_flower_cleanup_pf_vnic(app_flower->pf_hw);
> > +ctrl_cpp_area_cleanup:
> > +	nfp_cpp_area_free(ctrl_hw->ctrl_area);
> >   pf_cpp_area_cleanup:
> >   	nfp_cpp_area_free(pf_dev->ctrl_area);
> >   eth_tbl_cleanup:
> > diff --git a/drivers/net/nfp/flower/nfp_flower.h
> > b/drivers/net/nfp/flower/nfp_flower.h
> > index f6fd4eb..f11ef6d 100644
> > --- a/drivers/net/nfp/flower/nfp_flower.h
> > +++ b/drivers/net/nfp/flower/nfp_flower.h
> > @@ -21,6 +21,12 @@ struct nfp_app_flower {
> >   	/* Pointer to the PF vNIC */
> >   	struct nfp_net_hw *pf_hw;
> >
> > +	/* Pointer to a mempool for the ctrlvNIC */
> > +	struct rte_mempool *ctrl_pktmbuf_pool;
> > +
> > +	/* Pointer to the ctrl vNIC */
> > +	struct nfp_net_hw *ctrl_hw;
> > +
> >   	/* the eth table as reported by firmware */
> >   	struct nfp_eth_table *nfp_eth_table;
> >   };


^ permalink raw reply	[flat|nested] 29+ messages in thread

* RE: [PATCH v5 10/12] net/nfp: add flower representor framework
  2022-08-05 14:23   ` Andrew Rybchenko
@ 2022-08-08 11:56     ` Chaoyong He
  0 siblings, 0 replies; 29+ messages in thread
From: Chaoyong He @ 2022-08-08 11:56 UTC (permalink / raw)
  To: Andrew Rybchenko, dev; +Cc: Niklas Soderlund

> On 8/5/22 09:32, Chaoyong He wrote:
> > This commit adds the framework to support flower representors.
> 
> "This commit adds" -> "Add"
> 
> > The number of VF representors are parsed from the command line. For
> > physical port representors the current logic aims to create a
> > representor for each physical port present on the hardware.
> >
> > A eth_dev is created for each phyport and VF, and flower firmware
> 
> A -> An, phyport -> physical port
> 
> > requires a MAC repr cmsg to be transmitted to firmware with info about
> > the number of physical ports configured.
> >
> > Reify messages are sent to hardware for each phyport representor.
> > A rte_ring is also created per representor so that traffic can be
> 
> A -> An
> 
> > pushed and pulled to this interface.
> >
> > To up and down the real device represented by a flower representor
> > port a port mod message is used to convey that info to the firmware.
> > This message will be used in the dev_ops callbacks of flower representors.
> >
> > Each cmsg generated by the driver is prepended with a cmsg header.
> > This commit also adds the logic to fill in the header of cmsgs.
> >
> > This commit also adds the Rx and Tx path for flower representors. For
> 
> "This commit also adds" -> "Also add"
> 
> > Rx packets are dequeued from the representor ring and passed to the
> > eth_dev. For Tx the first queue of the PF vNIC is used. Metadata about
> > the representor is added before the packet is sent down to firmware.
> >
> > Signed-off-by: Chaoyong He <chaoyong.he@corigine.com>
> > Reviewed-by: Niklas Söderlund <niklas.soderlund@corigine.com>
> > ---
> >   drivers/net/nfp/flower/nfp_flower.c             |  59 +++
> >   drivers/net/nfp/flower/nfp_flower.h             |  18 +
> >   drivers/net/nfp/flower/nfp_flower_cmsg.c        | 186 +++++++++
> >   drivers/net/nfp/flower/nfp_flower_cmsg.h        | 173 ++++++++
> >   drivers/net/nfp/flower/nfp_flower_representor.c | 508 ++++++++++++++++++++++++
> >   drivers/net/nfp/flower/nfp_flower_representor.h |  39 ++
> >   drivers/net/nfp/meson.build                     |   2 +
> >   drivers/net/nfp/nfpcore/nfp_cpp_pcie_ops.c      |  31 +-
> >   8 files changed, 1004 insertions(+), 12 deletions(-)
> >   create mode 100644 drivers/net/nfp/flower/nfp_flower_cmsg.c
> >   create mode 100644 drivers/net/nfp/flower/nfp_flower_cmsg.h
> >   create mode 100644 drivers/net/nfp/flower/nfp_flower_representor.c
> >   create mode 100644 drivers/net/nfp/flower/nfp_flower_representor.h
> >
> > diff --git a/drivers/net/nfp/flower/nfp_flower.c
> > b/drivers/net/nfp/flower/nfp_flower.c
> > index 5e9c4ef..d7772d6 100644
> > --- a/drivers/net/nfp/flower/nfp_flower.c
> > +++ b/drivers/net/nfp/flower/nfp_flower.c
> > @@ -22,6 +22,7 @@
> >   #include "nfp_flower.h"
> >   #include "nfp_flower_ovs_compat.h"
> >   #include "nfp_flower_ctrl.h"
> > +#include "nfp_flower_representor.h"
> >
> >   #define MAX_PKT_BURST 32
> >   #define MEMPOOL_CACHE_SIZE 512
> > @@ -927,8 +928,13 @@
> >   	unsigned int numa_node;
> >   	struct nfp_net_hw *pf_hw;
> >   	struct nfp_net_hw *ctrl_hw;
> > +	struct rte_pci_device *pci_dev;
> >   	struct nfp_app_flower *app_flower;
> > +	struct rte_eth_devargs eth_da = {
> > +		.nb_representor_ports = 0
> > +	};
> >
> > +	pci_dev = pf_dev->pci_dev;
> >   	numa_node = rte_socket_id();
> >
> >   	/* Allocate memory for the Flower app */
> > @@ -1021,6 +1027,59 @@
> >   		goto ctrl_vnic_cleanup;
> >   	}
> >
> > +	/* Allocate a switch domain for the flower app */
> > +	if (app_flower->switch_domain_id == RTE_ETH_DEV_SWITCH_DOMAIN_ID_INVALID &&
> > +			rte_eth_switch_domain_alloc(&app_flower->switch_domain_id)) {
> > +		PMD_INIT_LOG(WARNING,
> > +				"failed to allocate switch domain for device");
> > +	}
> > +
> > +	/* Now parse PCI device args passed for representor info */
> > +	if (pci_dev->device.devargs) {
> 
> Compare vs NULL
> 
> > +		ret = rte_eth_devargs_parse(pci_dev->device.devargs->args,
> > +				&eth_da);
> > +		if (ret) {
> 
> Compare vs 0
> 
> > +			PMD_INIT_LOG(ERR, "devarg parse failed");
> > +			goto ctrl_vnic_cleanup;
> > +		}
> > +	}
> > +
> 
> I see no representor_info_get implementation. How is the application supposed
> to find out which representors can be instantiated?
> 
> > +	if (eth_da.nb_representor_ports == 0) {
> > +		PMD_INIT_LOG(DEBUG, "No representor port need to create.");
> > +		ret = 0;
> > +		goto done;
> > +	}
> > +
> > +	/* There always exist phy repr */
> > +	if (eth_da.nb_representor_ports < app_flower->nfp_eth_table->count) {
> 
> It is a really strange check. First of all, I'd say that it is very inconvenient to
> oblige the user to create some representors.
> Second, who said that these representors are physical port representors?
> Maybe the number of representors is OK, but they are different representors...
> 
> > +		PMD_INIT_LOG(DEBUG, "Should also create phy representor port.");
> > +		ret = -ERANGE;
> > +		goto ctrl_vnic_cleanup;
> > +	}
> > +
> > +	/* Only support VF representor creation via the command line */
> > +	if (eth_da.type != RTE_ETH_REPRESENTOR_VF) {
> 
> I'm confused. Above you're talking about phy representors, but here VFs...

This check and the one above are a little complicated to explain:
Our NFP card has two physical ports but only one PF. It also supports VFs.
To offload rte_flow with ovs-dpdk, what we really need are representors for
the physical ports and the VFs.

The problem is that DPDK does not support physical port representors
for now.
So we use a small trick here:
1. If the user wants to create any VF representor, the two physical port representors
must be created as well.
2. We pretend that the type of these two physical port representors is 'RTE_ETH_REPRESENTOR_VF'.

This is the reason for this 'strange' logic, and we will update it as soon as DPDK supports
physical port representors.
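
To make the counting concrete, here is a minimal standalone sketch of how the
requested representor count is split into physical port and VF representors.
The helper split_repr_counts() and struct repr_counts are made up for
illustration and are not part of the patch; the sketch only mirrors the checks
in the quoted hunk and assumes the physical port count comes from the firmware
eth table.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical container for the result, not a structure from the patch. */
struct repr_counts {
	uint8_t num_phyport_reprs;
	uint8_t num_vf_reprs;
};

/*
 * Hypothetical helper mirroring the checks in the quoted hunk.
 * nb_requested: number of representors given in the devargs
 * nb_phyports:  number of physical ports reported by the firmware
 * Returns 0 on success, -ERANGE if fewer representors were requested
 * than there are physical ports.
 */
static int
split_repr_counts(uint16_t nb_requested, uint8_t nb_phyports,
		struct repr_counts *out)
{
	out->num_phyport_reprs = 0;
	out->num_vf_reprs = 0;

	/* No representor requested: nothing to create. */
	if (nb_requested == 0)
		return 0;

	/* The physical port representors must always be included. */
	if (nb_requested < nb_phyports)
		return -ERANGE;

	/* The first nb_phyports entries stand in for the physical ports. */
	out->num_phyport_reprs = nb_phyports;
	out->num_vf_reprs = (uint8_t)(nb_requested - nb_phyports);

	return 0;
}
```

With two physical ports, a request for five representors would give two
physical port representors and three VF representors, while a request for
only one would be rejected with -ERANGE.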

> 
> > +		PMD_DRV_LOG(ERR, "unsupported representor type: %s\n",
> 
> The macro already adds \n. So, you don't need it here.
> 
> > +				pci_dev->device.devargs->args);
> > +		ret = -ENOTSUP;
> > +		goto ctrl_vnic_cleanup;
> > +	}
> > +
> > +	/* Fill in flower app with repr counts */
> > +	app_flower->num_phyport_reprs = (uint8_t)app_flower->nfp_eth_table->count;
> > +	app_flower->num_vf_reprs = eth_da.nb_representor_ports -
> > +			app_flower->nfp_eth_table->count;
> > +
> > +	PMD_INIT_LOG(INFO, "%d number of VF reprs", app_flower->num_vf_reprs);
> > +	PMD_INIT_LOG(INFO, "%d number of phyport reprs", app_flower->num_phyport_reprs);
> > +
> > +	ret = nfp_flower_repr_alloc(app_flower);
> > +	if (ret) {
> 
> Compare vs 0
> 
> > +		PMD_INIT_LOG(ERR,
> > +			"representors allocation for NFP_REPR_TYPE_VF error");
> > +		goto ctrl_vnic_cleanup;
> > +	}
> > +
> >   	return 0;
> >
> >   ctrl_vnic_cleanup:
> > diff --git a/drivers/net/nfp/flower/nfp_flower.h
> > b/drivers/net/nfp/flower/nfp_flower.h
> > index bdc64e3..24fced3 100644
> > --- a/drivers/net/nfp/flower/nfp_flower.h
> > +++ b/drivers/net/nfp/flower/nfp_flower.h
> > @@ -19,8 +19,20 @@ enum nfp_flower_service {
> >    */
> >   #define FLOWER_PKT_DATA_OFFSET 8
> >
> > +#define MAX_FLOWER_PHYPORTS 8
> > +#define MAX_FLOWER_VFS 64
> > +
> >   /* The flower application's private structure */
> >   struct nfp_app_flower {
> > +	/* switch domain for this app */
> > +	uint16_t switch_domain_id;
> > +
> > +	/* Number of VF representors */
> > +	uint8_t num_vf_reprs;
> > +
> > +	/* Number of phyport representors */
> > +	uint8_t num_phyport_reprs;
> > +
> >   	/* List of rte_service ID's for the flower app */
> >   	uint32_t flower_services_ids[NFP_FLOWER_SERVICE_MAX];
> >
> > @@ -44,6 +56,12 @@ struct nfp_app_flower {
> >
> >   	/* Ctrl vNIC Tx counter */
> >   	uint64_t ctrl_vnic_tx_count;
> > +
> > +	/* Array of phyport representors */
> > +	struct nfp_flower_representor *phy_reprs[MAX_FLOWER_PHYPORTS];
> > +
> > +	/* Array of VF representors */
> > +	struct nfp_flower_representor *vf_reprs[MAX_FLOWER_VFS];
> >   };
> >
> >   int nfp_init_app_flower(struct nfp_pf_dev *pf_dev);
> > diff --git a/drivers/net/nfp/flower/nfp_flower_cmsg.c
> > b/drivers/net/nfp/flower/nfp_flower_cmsg.c
> > new file mode 100644
> > index 0000000..5ce547c
> > --- /dev/null
> > +++ b/drivers/net/nfp/flower/nfp_flower_cmsg.c
> > @@ -0,0 +1,186 @@
> > +/* SPDX-License-Identifier: BSD-3-Clause
> > + * Copyright(c) 2022 Corigine, Inc.
> > + * All rights reserved.
> > + */
> > +
> > +#include "../nfpcore/nfp_nsp.h"
> > +#include "../nfp_logs.h"
> > +#include "../nfp_common.h"
> > +#include "nfp_flower.h"
> > +#include "nfp_flower_cmsg.h"
> > +#include "nfp_flower_ctrl.h"
> > +#include "nfp_flower_representor.h"
> > +
> > +static void *
> > +nfp_flower_cmsg_init(struct rte_mbuf *m,
> > +		enum nfp_flower_cmsg_type type,
> > +		uint32_t size)
> > +{
> > +	char *pkt;
> > +	uint32_t data;
> > +	uint32_t new_size = size;
> > +	struct nfp_flower_cmsg_hdr *hdr;
> > +
> > +	pkt = rte_pktmbuf_mtod(m, char *);
> > +	PMD_DRV_LOG(DEBUG, "flower_cmsg_init using pkt at %p", pkt);
> > +
> > +	data = rte_cpu_to_be_32(NFP_NET_META_PORTID);
> > +	rte_memcpy(pkt, &data, 4);
> > +	pkt += 4;
> > +	new_size += 4;
> > +
> > +	/* First the metadata as flower requires it */
> > +	data = rte_cpu_to_be_32(NFP_META_PORT_ID_CTRL);
> > +	rte_memcpy(pkt, &data, 4);
> > +	pkt += 4;
> > +	new_size += 4;
> > +
> > +	/* Now the ctrl header */
> > +	hdr = (struct nfp_flower_cmsg_hdr *)pkt;
> > +	hdr->pad     = 0;
> > +	hdr->type    = type;
> > +	hdr->version = NFP_FLOWER_CMSG_VER1;
> > +
> > +	pkt = (char *)hdr + NFP_FLOWER_CMSG_HLEN;
> > +	new_size += NFP_FLOWER_CMSG_HLEN;
> > +
> > +	m->pkt_len = new_size;
> > +	m->data_len = m->pkt_len;
> > +
> > +	return pkt;
> > +}
> > +
> > +static void
> > +nfp_flower_cmsg_mac_repr_init(struct rte_mbuf *m, int num_ports)
> > +{
> > +	uint32_t size;
> > +	struct nfp_flower_cmsg_mac_repr *msg;
> > +	enum nfp_flower_cmsg_type type = NFP_FLOWER_CMSG_TYPE_MAC_REPR;
> > +
> > +	size = sizeof(*msg) + (num_ports * sizeof(msg->ports[0]));
> > +	PMD_DRV_LOG(DEBUG, "mac repr cmsg init with size: %u", size);
> > +	msg = (struct nfp_flower_cmsg_mac_repr *)nfp_flower_cmsg_init(m,
> > +			type, size);
> > +
> > +	memset(msg->reserved, 0, sizeof(msg->reserved));
> > +	msg->num_ports = num_ports;
> > +}
> > +
> > +static void
> > +nfp_flower_cmsg_mac_repr_fill(struct rte_mbuf *m,
> > +		unsigned int idx,
> > +		unsigned int nbi,
> > +		unsigned int nbi_port,
> > +		unsigned int phys_port)
> > +{
> > +	struct nfp_flower_cmsg_mac_repr *msg;
> > +
> > +	msg = (struct nfp_flower_cmsg_mac_repr *)nfp_flower_cmsg_get_data(m);
> > +	msg->ports[idx].idx       = idx;
> > +	msg->ports[idx].info      = nbi & NFP_FLOWER_CMSG_MAC_REPR_NBI;
> > +	msg->ports[idx].nbi_port  = nbi_port;
> > +	msg->ports[idx].phys_port = phys_port;
> > +}
> > +
> > +int
> > +nfp_flower_cmsg_mac_repr(struct nfp_app_flower *app_flower)
> > +{
> > +	int i;
> > +	uint16_t cnt;
> > +	unsigned int nbi;
> > +	unsigned int nbi_port;
> > +	unsigned int phys_port;
> > +	struct rte_mbuf *mbuf;
> > +	struct nfp_eth_table *nfp_eth_table;
> > +
> > +	mbuf = rte_pktmbuf_alloc(app_flower->ctrl_pktmbuf_pool);
> > +	if (mbuf == NULL) {
> > +		PMD_DRV_LOG(ERR, "Could not allocate mac repr cmsg");
> > +		return -ENOMEM;
> > +	}
> > +
> > +	nfp_flower_cmsg_mac_repr_init(mbuf, app_flower->num_phyport_reprs);
> > +
> > +	/* Fill in the mac repr cmsg */
> > +	nfp_eth_table = app_flower->nfp_eth_table;
> > +	for (i = 0; i < app_flower->num_phyport_reprs; i++) {
> > +		nbi = nfp_eth_table->ports[i].nbi;
> > +		nbi_port = nfp_eth_table->ports[i].base;
> > +		phys_port = nfp_eth_table->ports[i].index;
> > +
> > +		nfp_flower_cmsg_mac_repr_fill(mbuf, i, nbi, nbi_port, phys_port);
> > +	}
> > +
> > +	/* Send the cmsg via the ctrl vNIC */
> > +	cnt = nfp_flower_ctrl_vnic_xmit(app_flower, mbuf);
> > +	if (cnt == 0) {
> > +		PMD_DRV_LOG(ERR, "Send cmsg through ctrl vnic failed.");
> > +		rte_pktmbuf_free(mbuf);
> > +		return -EIO;
> > +	}
> > +
> > +	return 0;
> > +}
> > +
> > +int
> > +nfp_flower_cmsg_repr_reify(struct nfp_app_flower *app_flower,
> > +		struct nfp_flower_representor *repr)
> > +{
> > +	uint16_t cnt;
> > +	struct rte_mbuf *mbuf;
> > +	struct nfp_flower_cmsg_port_reify *msg;
> > +
> > +	mbuf = rte_pktmbuf_alloc(app_flower->ctrl_pktmbuf_pool);
> > +	if (mbuf == NULL) {
> > +		PMD_DRV_LOG(DEBUG, "alloc mbuf for repr reify failed");
> > +		return -ENOMEM;
> > +	}
> > +
> > +	msg = (struct nfp_flower_cmsg_port_reify *)nfp_flower_cmsg_init(mbuf,
> > +			NFP_FLOWER_CMSG_TYPE_PORT_REIFY, sizeof(*msg));
> > +
> > +	msg->portnum  = rte_cpu_to_be_32(repr->port_id);
> > +	msg->reserved = 0;
> > +	msg->info     = rte_cpu_to_be_16(1);
> > +
> > +	cnt = nfp_flower_ctrl_vnic_xmit(app_flower, mbuf);
> > +	if (cnt == 0) {
> > +		PMD_DRV_LOG(ERR, "Send cmsg through ctrl vnic failed.");
> > +		rte_pktmbuf_free(mbuf);
> > +		return -EIO;
> > +	}
> > +
> > +	return 0;
> > +}
> > +
> > +int
> > +nfp_flower_cmsg_port_mod(struct nfp_app_flower *app_flower,
> > +		uint32_t port_id, bool carrier_ok)
> > +{
> > +	uint16_t cnt;
> > +	struct rte_mbuf *mbuf;
> > +	struct nfp_flower_cmsg_port_mod *msg;
> > +
> > +	mbuf = rte_pktmbuf_alloc(app_flower->ctrl_pktmbuf_pool);
> > +	if (mbuf == NULL) {
> > +		PMD_DRV_LOG(DEBUG, "alloc mbuf for repr portmod failed");
> > +		return -ENOMEM;
> > +	}
> > +
> > +	msg = (struct nfp_flower_cmsg_port_mod *)nfp_flower_cmsg_init(mbuf,
> > +			NFP_FLOWER_CMSG_TYPE_PORT_MOD, sizeof(*msg));
> > +
> > +	msg->portnum  = rte_cpu_to_be_32(port_id);
> > +	msg->reserved = 0;
> > +	msg->info     = carrier_ok;
> > +	msg->mtu      = 9000;
> > +
> > +	cnt = nfp_flower_ctrl_vnic_xmit(app_flower, mbuf);
> > +	if (cnt == 0) {
> > +		PMD_DRV_LOG(ERR, "Send cmsg through ctrl vnic failed.");
> > +		rte_pktmbuf_free(mbuf);
> > +		return -EIO;
> > +	}
> > +
> > +	return 0;
> > +}
> > diff --git a/drivers/net/nfp/flower/nfp_flower_cmsg.h
> > b/drivers/net/nfp/flower/nfp_flower_cmsg.h
> > new file mode 100644
> > index 0000000..fce5163
> > --- /dev/null
> > +++ b/drivers/net/nfp/flower/nfp_flower_cmsg.h
> > @@ -0,0 +1,173 @@
> > +/* SPDX-License-Identifier: BSD-3-Clause
> > + * Copyright(c) 2022 Corigine, Inc.
> > + * All rights reserved.
> > + */
> > +
> > +#ifndef _NFP_CMSG_H_
> > +#define _NFP_CMSG_H_
> > +
> > +#include <rte_byteorder.h>
> > +#include <rte_ether.h>
> > +
> > +struct nfp_flower_cmsg_hdr {
> > +	rte_be16_t pad;
> > +	uint8_t type;
> > +	uint8_t version;
> > +};
> > +
> > +/* Types defined for control messages */
> > +enum nfp_flower_cmsg_type {
> > +	NFP_FLOWER_CMSG_TYPE_FLOW_ADD       = 0,
> > +	NFP_FLOWER_CMSG_TYPE_FLOW_MOD       = 1,
> > +	NFP_FLOWER_CMSG_TYPE_FLOW_DEL       = 2,
> > +	NFP_FLOWER_CMSG_TYPE_LAG_CONFIG     = 4,
> > +	NFP_FLOWER_CMSG_TYPE_PORT_REIFY     = 6,
> > +	NFP_FLOWER_CMSG_TYPE_MAC_REPR       = 7,
> > +	NFP_FLOWER_CMSG_TYPE_PORT_MOD       = 8,
> > +	NFP_FLOWER_CMSG_TYPE_MERGE_HINT     = 9,
> > +	NFP_FLOWER_CMSG_TYPE_NO_NEIGH       = 10,
> > +	NFP_FLOWER_CMSG_TYPE_TUN_MAC        = 11,
> > +	NFP_FLOWER_CMSG_TYPE_ACTIVE_TUNS    = 12,
> > +	NFP_FLOWER_CMSG_TYPE_TUN_NEIGH      = 13,
> > +	NFP_FLOWER_CMSG_TYPE_TUN_IPS        = 14,
> > +	NFP_FLOWER_CMSG_TYPE_FLOW_STATS     = 15,
> > +	NFP_FLOWER_CMSG_TYPE_PORT_ECHO      = 16,
> > +	NFP_FLOWER_CMSG_TYPE_QOS_MOD        = 18,
> > +	NFP_FLOWER_CMSG_TYPE_QOS_DEL        = 19,
> > +	NFP_FLOWER_CMSG_TYPE_QOS_STATS      = 20,
> > +	NFP_FLOWER_CMSG_TYPE_PRE_TUN_RULE   = 21,
> > +	NFP_FLOWER_CMSG_TYPE_TUN_IPS_V6     = 22,
> > +	NFP_FLOWER_CMSG_TYPE_NO_NEIGH_V6    = 23,
> > +	NFP_FLOWER_CMSG_TYPE_TUN_NEIGH_V6   = 24,
> > +	NFP_FLOWER_CMSG_TYPE_ACTIVE_TUNS_V6 = 25,
> > +	NFP_FLOWER_CMSG_TYPE_MAX            = 32,
> > +};
> > +
> > +/*
> > + * NFP_FLOWER_CMSG_TYPE_MAC_REPR
> > + *    Bit    3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0
> > + *    -----\ 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
> > + *     Word +---------------+-----------+---+---------------+---------------+
> > + *       0  |                  spare                        |Number of ports|
> > + *          +---------------+-----------+---+---------------+---------------+
> > + *       1  |    Index      |   spare   |NBI|  Port on NBI  | Chip-wide port|
> > + *          +---------------+-----------+---+---------------+---------------+
> > + *                                        ....
> > + *          +---------------+-----------+---+---------------+---------------+
> > + *     N-1  |    Index      |   spare   |NBI|  Port on NBI  | Chip-wide port|
> > + *          +---------------+-----------+---+---------------+---------------+
> > + *     N    |    Index      |   spare   |NBI|  Port on NBI  | Chip-wide port|
> > + *          +---------------+-----------+---+---------------+---------------+
> > + *
> > + *          Index: index into the eth table
> > + *          NBI (bits 17-16): NBI number (0-3)
> > + *          Port on NBI (bits 15-8): “base” in the driver
> > + *            this forms NBIX.PortY notation as the NSP eth table.
> > + *          "Chip-wide" port (bits 7-0):
> > + */
> > +struct nfp_flower_cmsg_mac_repr {
> > +	uint8_t reserved[3];
> > +	uint8_t num_ports;
> > +	struct {
> > +		uint8_t idx;
> > +		uint8_t info;
> > +		uint8_t nbi_port;
> > +		uint8_t phys_port;
> > +	} ports[0];
> > +};
> > +
> > +/*
> > + * NFP_FLOWER_CMSG_TYPE_PORT_REIFY
> > + *    Bit    3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0
> > + *    -----\ 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
> > + *    Word  +-------+-------+---+---+-------+---+---+-----------+-----------+
> > + *       0  |Port Ty|Sys ID |NIC|Rsv| Spare |PCI|typ|    vNIC   |  queue    |
> > + *          +-------+-----+-+---+---+-------+---+---+-----------+---------+-+
> > + *       1  |                             Spare                           |E|
> > + *          +-------------------------------------------------------------+-+
> > + *          E: 1 = Representor exists, 0 = Representor does not exist
> > + */
> > +struct nfp_flower_cmsg_port_reify {
> > +	rte_be32_t portnum;
> > +	rte_be16_t reserved;
> > +	rte_be16_t info;
> > +};
> > +
> > +/*
> > + * NFP_FLOWER_CMSG_TYPE_PORT_MOD
> > + *    Bit    3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0
> > + *    -----\ 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
> > + *    Word  +-------+-------+---+---+-------+---+---+-------+---+-----------+
> > + *       0  |Port Ty|Sys ID |NIC|Rsv|       Reserved        |    Port       |
> > + *          +-------+-------+---+---+-----+-+---+---+-------+---+-----------+
> > + *       1  |            Spare            |L|              MTU              |
> > + *          +-----------------------------+-+-------------------------------+
> > + *        L: Link or Admin state bit. When message is generated by host, this
> > + *           bit indicates the admin state (0=down, 1=up). When generated by
> > + *           NFP, it indicates the link state (0=down, 1=up)
> > + *
> > + *        Port Type (word 1, bits 31 to 28) = 1 (Physical Network)
> > + *        Port: “Chip-wide number” as assigned by BSP
> > + *
> > + *    Bit    3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0
> > + *    -----\ 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
> > + *    Word  +-------+-------+---+---+-------+---+---+-------+---+-----------+
> > + *       0  |Port Ty|Sys ID |NIC|Rsv| Spare |PCI|typ|    vNIC   |  queue    |
> > + *          +-------+-----+-+---+---+---+-+-+---+---+-------+---+-----------+
> > + *       1  |            Spare            |L|              MTU              |
> > + *          +-----------------------------+-+-------------------------------+
> > + *        L: Link or Admin state bit. When message is generated by host, this
> > + *           bit indicates the admin state (0=down, 1=up). When generated by
> > + *           NFP, it indicates the link state (0=down, 1=up)
> > + *
> > + *        Port Type (word 1, bits 31 to 28) = 2 (PCIE)
> > + */
> > +struct nfp_flower_cmsg_port_mod {
> > +	rte_be32_t portnum;
> > +	uint8_t reserved;
> > +	uint8_t info;
> > +	rte_be16_t mtu;
> > +};
> > +
> > +enum nfp_flower_cmsg_port_type {
> > +	NFP_FLOWER_CMSG_PORT_TYPE_UNSPEC,
> > +	NFP_FLOWER_CMSG_PORT_TYPE_PHYS_PORT,
> > +	NFP_FLOWER_CMSG_PORT_TYPE_PCIE_PORT,
> > +	NFP_FLOWER_CMSG_PORT_TYPE_OTHER_PORT,
> > +};
> > +
> > +enum nfp_flower_cmsg_port_vnic_type {
> > +	NFP_FLOWER_CMSG_PORT_VNIC_TYPE_VF,
> > +	NFP_FLOWER_CMSG_PORT_VNIC_TYPE_PF,
> > +	NFP_FLOWER_CMSG_PORT_VNIC_TYPE_CTRL,
> > +};
> > +
> > +#define NFP_FLOWER_CMSG_MAC_REPR_NBI            (0x3)
> > +
> > +#define NFP_FLOWER_CMSG_HLEN            sizeof(struct nfp_flower_cmsg_hdr)
> > +#define NFP_FLOWER_CMSG_VER1            1
> > +#define NFP_NET_META_PORTID             5
> > +#define NFP_META_PORT_ID_CTRL           ~0U
> > +
> > +#define NFP_FLOWER_CMSG_PORT_TYPE(x)            (((x) >> 28) & 0xf)  /* [31,28] */
> > +#define NFP_FLOWER_CMSG_PORT_SYS_ID(x)          (((x) >> 24) & 0xf)  /* [24,27] */
> > +#define NFP_FLOWER_CMSG_PORT_NFP_ID(x)          (((x) >> 22) & 0x3)  /* [22,23] */
> > +#define NFP_FLOWER_CMSG_PORT_PCI(x)             (((x) >> 14) & 0x3)  /* [14,15] */
> > +#define NFP_FLOWER_CMSG_PORT_VNIC_TYPE(x)       (((x) >> 12) & 0x3)  /* [12,13] */
> > +#define NFP_FLOWER_CMSG_PORT_VNIC(x)            (((x) >> 6) & 0x3f)  /* [6,11] */
> > +#define NFP_FLOWER_CMSG_PORT_PCIE_Q(x)          ((x) & 0x3f)         /* [0,5] */
> > +#define NFP_FLOWER_CMSG_PORT_PHYS_PORT_NUM(x)   ((x) & 0xff)         /* [0,7] */
> > +
> > +static inline char*
> > +nfp_flower_cmsg_get_data(struct rte_mbuf *m)
> > +{
> > +	return rte_pktmbuf_mtod(m, char *) + 4 + 4 + NFP_FLOWER_CMSG_HLEN;
> > +}
> > +
> > +int nfp_flower_cmsg_mac_repr(struct nfp_app_flower *app_flower);
> > +int nfp_flower_cmsg_repr_reify(struct nfp_app_flower *app_flower,
> > +		struct nfp_flower_representor *repr);
> > +int nfp_flower_cmsg_port_mod(struct nfp_app_flower *app_flower,
> > +		uint32_t port_id, bool carrier_ok);
> > +
> > +#endif /* _NFP_CMSG_H_ */
> > diff --git a/drivers/net/nfp/flower/nfp_flower_representor.c
> > b/drivers/net/nfp/flower/nfp_flower_representor.c
> > new file mode 100644
> > index 0000000..9f23a23
> > --- /dev/null
> > +++ b/drivers/net/nfp/flower/nfp_flower_representor.c
> > @@ -0,0 +1,508 @@
> > +/* SPDX-License-Identifier: BSD-3-Clause
> > + * Copyright (c) 2022 Corigine, Inc.
> > + * All rights reserved.
> > + */
> > +
> > +#include <rte_common.h>
> > +#include <ethdev_pci.h>
> > +
> > +#include "../nfp_common.h"
> > +#include "../nfp_logs.h"
> > +#include "../nfp_ctrl.h"
> > +#include "../nfp_rxtx.h"
> > +#include "../nfpcore/nfp_mip.h"
> > +#include "../nfpcore/nfp_rtsym.h"
> > +#include "../nfpcore/nfp_nsp.h"
> > +#include "nfp_flower.h"
> > +#include "nfp_flower_representor.h"
> > +#include "nfp_flower_ctrl.h"
> > +#include "nfp_flower_cmsg.h"
> > +
> > +static int
> > +nfp_flower_repr_link_update(__rte_unused struct rte_eth_dev *ethdev,
> > +		__rte_unused int wait_to_complete)
> > +{
> 
> Why is a dummy implementation OK?
> 
> > +	return 0;
> > +}
> > +
> > +static int
> > +nfp_flower_repr_dev_infos_get(__rte_unused struct rte_eth_dev *dev,
> > +		struct rte_eth_dev_info *dev_info)
> > +{
> > +	/* Hardcoded pktlen and queues for now */
> > +	dev_info->max_rx_queues = 1;
> > +	dev_info->max_tx_queues = 1;
> > +	dev_info->min_rx_bufsize = RTE_ETHER_MIN_MTU;
> > +	dev_info->max_rx_pktlen = 9000;
> > +
> > +	dev_info->rx_offload_capa = RTE_ETH_RX_OFFLOAD_VLAN_STRIP;
> > +	dev_info->rx_offload_capa |= RTE_ETH_RX_OFFLOAD_IPV4_CKSUM
> |
> > +			RTE_ETH_RX_OFFLOAD_UDP_CKSUM |
> > +			RTE_ETH_RX_OFFLOAD_TCP_CKSUM;
> > +
> > +	dev_info->tx_offload_capa = RTE_ETH_TX_OFFLOAD_VLAN_INSERT;
> > +	dev_info->tx_offload_capa |= RTE_ETH_TX_OFFLOAD_IPV4_CKSUM
> |
> > +			RTE_ETH_TX_OFFLOAD_UDP_CKSUM |
> > +			RTE_ETH_TX_OFFLOAD_TCP_CKSUM;
> > +	dev_info->tx_offload_capa |= RTE_ETH_TX_OFFLOAD_TCP_TSO;
> > +	dev_info->tx_offload_capa |= RTE_ETH_TX_OFFLOAD_MULTI_SEGS;
> > +
> > +	dev_info->max_mac_addrs = 1;
> > +
> > +	return 0;
> > +}
> > +
> > +static int
> > +nfp_flower_repr_dev_configure(__rte_unused struct rte_eth_dev *dev)
> {
> > +	return 0;
> > +}
> > +
> > +static int
> > +nfp_flower_repr_dev_start(struct rte_eth_dev *dev) {
> > +	struct nfp_app_flower *app_flower;
> > +	struct nfp_flower_representor *repr;
> > +
> > +	repr = (struct nfp_flower_representor *)dev->data->dev_private;
> > +	app_flower = repr->app_flower;
> > +
> > +	if (repr->repr_type == NFP_REPR_TYPE_PHYS_PORT)
> > +		nfp_eth_set_configured(app_flower->pf_hw->pf_dev->cpp,
> > +				repr->nfp_idx, 1);
> > +
> > +	nfp_flower_cmsg_port_mod(app_flower, repr->port_id, true);
> > +
> > +	return 0;
> > +}
> > +
> > +static int
> > +nfp_flower_repr_dev_stop(struct rte_eth_dev *dev) {
> > +	struct nfp_app_flower *app_flower;
> > +	struct nfp_flower_representor *repr;
> > +
> > +	repr = (struct nfp_flower_representor *)dev->data->dev_private;
> > +	app_flower = repr->app_flower;
> > +
> > +	nfp_flower_cmsg_port_mod(app_flower, repr->port_id, false);
> > +
> > +	if (repr->repr_type == NFP_REPR_TYPE_PHYS_PORT)
> > +		nfp_eth_set_configured(app_flower->pf_hw->pf_dev->cpp,
> > +				repr->nfp_idx, 0);
> > +
> > +	return 0;
> > +}
> > +
> > +static int
> > +nfp_flower_repr_rx_queue_setup(struct rte_eth_dev *dev,
> > +		uint16_t rx_queue_id,
> > +		__rte_unused uint16_t nb_rx_desc,
> > +		unsigned int socket_id,
> > +		__rte_unused const struct rte_eth_rxconf *rx_conf,
> > +		__rte_unused struct rte_mempool *mb_pool) {
> > +	struct nfp_net_rxq *rxq;
> > +	struct nfp_net_hw *pf_hw;
> > +	struct nfp_flower_representor *repr;
> > +
> > +	repr = (struct nfp_flower_representor *)dev->data->dev_private;
> > +	pf_hw = repr->app_flower->pf_hw;
> > +
> > +	/* Allocating rx queue data structure */
> > +	rxq = rte_zmalloc_socket("ethdev RX queue", sizeof(struct
> nfp_net_rxq),
> > +			RTE_CACHE_LINE_SIZE, socket_id);
> > +	if (rxq == NULL)
> > +		return -ENOMEM;
> > +
> > +	rxq->hw = pf_hw;
> > +	rxq->qidx = rx_queue_id;
> > +	rxq->port_id = dev->data->port_id;
> > +	dev->data->rx_queues[rx_queue_id] = rxq;
> > +
> > +	return 0;
> > +}
> > +
> > +static int
> > +nfp_flower_repr_tx_queue_setup(struct rte_eth_dev *dev,
> > +		uint16_t tx_queue_id,
> > +		__rte_unused uint16_t nb_tx_desc,
> > +		unsigned int socket_id,
> > +		__rte_unused const struct rte_eth_txconf *tx_conf) {
> > +	struct nfp_net_txq *txq;
> > +	struct nfp_net_hw *pf_hw;
> > +	struct nfp_flower_representor *repr;
> > +
> > +	repr = (struct nfp_flower_representor *)dev->data->dev_private;
> > +	pf_hw = repr->app_flower->pf_hw;
> > +
> > +	/* Allocating tx queue data structure */
> > +	txq = rte_zmalloc_socket("ethdev TX queue", sizeof(struct
> nfp_net_txq),
> > +			RTE_CACHE_LINE_SIZE, socket_id);
> > +	if (txq == NULL)
> > +		return -ENOMEM;
> > +
> > +	txq->hw = pf_hw;
> > +	txq->qidx = tx_queue_id;
> > +	txq->port_id = dev->data->port_id;
> > +	dev->data->tx_queues[tx_queue_id] = txq;
> > +
> > +	return 0;
> > +}
> > +
> > +static int
> > +nfp_flower_repr_stats_get(struct rte_eth_dev *ethdev,
> > +		struct rte_eth_stats *stats)
> > +{
> > +	struct nfp_flower_representor *repr;
> > +
> > +	repr = (struct nfp_flower_representor *)ethdev->data->dev_private;
> > +	rte_memcpy(stats, &repr->repr_stats, sizeof(struct rte_eth_stats));
> > +
> > +	return 0;
> > +}
> > +
> > +static int
> > +nfp_flower_repr_stats_reset(__rte_unused struct rte_eth_dev *ethdev)
> > +{
> 
> Why is it OK to have a dummy implementation of the reset?
> 
> > +	return 0;
> > +}
> > +
> > +static int
> > +nfp_flower_repr_promiscuous_enable(__rte_unused struct rte_eth_dev
> > +*ethdev) {
> 
> Why is it OK to have a dummy implementation? Same question for all dummy
> callbacks below.
> 
> > +	return 0;
> > +}
> > +
> > +static int
> > +nfp_flower_repr_promiscuous_disable(__rte_unused struct
> rte_eth_dev
> > +*ethdev) {
> > +	return 0;
> > +}
> > +
> > +static void
> > +nfp_flower_repr_mac_addr_remove(__rte_unused struct rte_eth_dev
> *ethdev,
> > +		__rte_unused uint32_t index)
> > +{
> > +}
> > +
> > +static int
> > +nfp_flower_repr_mac_addr_set(__rte_unused struct rte_eth_dev
> *ethdev,
> > +		__rte_unused struct rte_ether_addr *mac_addr) {
> > +	return 0;
> > +}
> > +
> > +static uint16_t
> > +nfp_flower_repr_rx_burst(void *rx_queue,
> > +		struct rte_mbuf **rx_pkts,
> > +		uint16_t nb_pkts)
> > +{
> > +	unsigned int available = 0;
> > +	unsigned int total_dequeue;
> > +	struct nfp_net_rxq *rxq;
> > +	struct rte_eth_dev *dev;
> > +	struct nfp_flower_representor *repr;
> > +
> > +	rxq = rx_queue;
> > +	if (unlikely(rxq == NULL)) {
> > +		PMD_RX_LOG(ERR, "RX Bad queue");
> > +		return 0;
> > +	}
> > +
> > +	dev = &rte_eth_devices[rxq->port_id];
> > +	repr = dev->data->dev_private;
> > +	if (unlikely(repr->ring == NULL)) {
> > +		PMD_RX_LOG(ERR, "representor %s has no ring configured!",
> > +				repr->name);
> > +		return 0;
> > +	}
> > +
> > +	total_dequeue = rte_ring_dequeue_burst(repr->ring, (void
> *)rx_pkts,
> > +			nb_pkts, &available);
> > +	if (total_dequeue) {
> 
> Compare vs 0 explicitly, i.e. (total_dequeue != 0)
> 
> > +		PMD_RX_LOG(DEBUG, "Representor Rx burst for %s, port_id:
> 0x%x, "
> > +				"received: %u, available: %u", repr->name,
> > +				repr->port_id, total_dequeue, available);
> > +
> > +		repr->repr_stats.ipackets += total_dequeue;
> > +	}
> > +
> > +	return total_dequeue;
> > +}
> > +
> > +static uint16_t
> > +nfp_flower_repr_tx_burst(void *tx_queue,
> > +		struct rte_mbuf **tx_pkts,
> > +		uint16_t nb_pkts)
> > +{
> > +	uint16_t i;
> > +	uint16_t sent;
> > +	char *meta_offset;
> > +	struct nfp_net_txq *txq;
> > +	struct nfp_net_hw *pf_hw;
> > +	struct rte_eth_dev *dev;
> > +	struct rte_eth_dev *repr_dev;
> > +	struct nfp_flower_representor *repr;
> > +
> > +	txq = tx_queue;
> > +	if (unlikely(txq == NULL)) {
> > +		PMD_RX_LOG(ERR, "TX Bad queue");
> > +		return 0;
> > +	}
> > +
> > +	/* This points to the PF vNIC that owns this representor */
> > +	pf_hw = txq->hw;
> > +	dev = pf_hw->eth_dev;
> > +
> > +	/* Grab a handle to the representor struct */
> > +	repr_dev = &rte_eth_devices[txq->port_id];
> > +	repr = repr_dev->data->dev_private;
> > +
> > +	for (i = 0; i < nb_pkts; i++) {
> > +		meta_offset = rte_pktmbuf_prepend(tx_pkts[i], FLOWER_PKT_DATA_OFFSET);
> > +		*(uint32_t *)meta_offset = rte_cpu_to_be_32(NFP_NET_META_PORTID);
> > +		meta_offset += 4;
> > +		*(uint32_t *)meta_offset = rte_cpu_to_be_32(repr->port_id);
> > +	}
> > +
> > +	/* Only using Tx queue 0 for now. */
> > +	sent = rte_eth_tx_burst(dev->data->port_id, 0, tx_pkts, nb_pkts);
> > +	if (sent) {
> 
> Compare vs 0
> 
> > +		PMD_TX_LOG(DEBUG, "Representor Tx burst for %s, port_id:
> 0x%x "
> > +			"transmitted: %u\n", repr->name, repr->port_id,
> sent);
> > +		repr->repr_stats.opackets += sent;
> > +	}
> > +
> > +	return sent;
> > +}
> > +
> > +static const struct eth_dev_ops nfp_flower_repr_dev_ops = {
> > +	.dev_infos_get        = nfp_flower_repr_dev_infos_get,
> > +
> > +	.dev_start            = nfp_flower_repr_dev_start,
> > +	.dev_configure        = nfp_flower_repr_dev_configure,
> > +	.dev_stop             = nfp_flower_repr_dev_stop,
> > +
> > +	.rx_queue_setup       = nfp_flower_repr_rx_queue_setup,
> > +	.tx_queue_setup       = nfp_flower_repr_tx_queue_setup,
> > +
> > +	.link_update          = nfp_flower_repr_link_update,
> > +
> > +	.stats_get            = nfp_flower_repr_stats_get,
> > +	.stats_reset          = nfp_flower_repr_stats_reset,
> > +
> > +	.promiscuous_enable   = nfp_flower_repr_promiscuous_enable,
> > +	.promiscuous_disable  = nfp_flower_repr_promiscuous_disable,
> > +
> > +	.mac_addr_remove      = nfp_flower_repr_mac_addr_remove,
> > +	.mac_addr_set         = nfp_flower_repr_mac_addr_set,
> > +};
> > +
> > +static uint32_t
> > +nfp_flower_get_phys_port_id(uint8_t port)
> > +{
> > +	return (NFP_FLOWER_CMSG_PORT_TYPE_PHYS_PORT << 28) | port;
> > +}
> > +
> > +static uint32_t
> > +nfp_get_pcie_port_id(struct nfp_cpp *cpp,
> > +		int type,
> > +		uint8_t vnic,
> > +		uint8_t queue)
> > +{
> > +	uint8_t nfp_pcie;
> > +	uint32_t port_id;
> > +
> > +	nfp_pcie = NFP_CPP_INTERFACE_UNIT_of(nfp_cpp_interface(cpp));
> > +	port_id = ((nfp_pcie & 0x3) << 14) |
> > +		  ((type & 0x3) << 12) |
> > +		  ((vnic & 0x3f) << 6) |
> > +		  (queue & 0x3f) |
> > +		  ((NFP_FLOWER_CMSG_PORT_TYPE_PCIE_PORT & 0xf) <<
> 28);
> > +
> > +	return port_id;
> > +}
> > +
> > +static int
> > +nfp_flower_repr_init(struct rte_eth_dev *eth_dev,
> > +		void *init_params)
> > +{
> > +	int ret;
> > +	unsigned int numa_node;
> > +	char ring_name[RTE_ETH_NAME_MAX_LEN];
> > +	struct nfp_app_flower *app_flower;
> > +	struct nfp_flower_representor *repr;
> > +	struct nfp_flower_representor *init_repr_data;
> > +
> > +	/* Cast the input representor data to the correct struct here */
> > +	init_repr_data = (struct nfp_flower_representor *)init_params;
> > +
> > +	app_flower = init_repr_data->app_flower;
> > +
> > +	/* Memory has been allocated in the eth_dev_create() function */
> > +	repr = eth_dev->data->dev_private;
> > +
> > +	/*
> > +	 * We need multi-producer rings as we can have multiple PF ports.
> > +	 * On the other hand, we need single consumer rings, as just one
> > +	 * representor PMD will try to read from the ring.
> > +	 */
> > +	snprintf(ring_name, sizeof(ring_name), "%s_%s",
> > +		init_repr_data->name, "ring");
> > +	numa_node = rte_socket_id();
> > +	repr->ring = rte_ring_create(ring_name, 256, numa_node,
> RING_F_SC_DEQ);
> > +	if (repr->ring == NULL) {
> > +		PMD_INIT_LOG(ERR, "rte_ring_create failed for %s\n",
> ring_name);
> > +		return -ENOMEM;
> > +	}
> > +
> > +	/* Copy data here from the input representor template*/
> > +	repr->vf_id            = init_repr_data->vf_id;
> > +	repr->switch_domain_id = init_repr_data->switch_domain_id;
> > +	repr->port_id          = init_repr_data->port_id;
> > +	repr->nfp_idx          = init_repr_data->nfp_idx;
> > +	repr->repr_type        = init_repr_data->repr_type;
> > +	repr->app_flower       = init_repr_data->app_flower;
> > +
> > +	snprintf(repr->name, sizeof(repr->name), "%s",
> > +init_repr_data->name);
> > +
> > +	eth_dev->dev_ops = &nfp_flower_repr_dev_ops;
> > +
> > +	eth_dev->rx_pkt_burst = nfp_flower_repr_rx_burst;
> > +	eth_dev->tx_pkt_burst = nfp_flower_repr_tx_burst;
> > +
> > +	eth_dev->data->dev_flags |= RTE_ETH_DEV_REPRESENTOR;
> > +
> > +	if (repr->repr_type == NFP_REPR_TYPE_PHYS_PORT)
> > +		eth_dev->data->representor_id = repr->vf_id;
> > +	else
> > +		eth_dev->data->representor_id = repr->vf_id +
> > +				app_flower->num_phyport_reprs;
> > +
> > +	/* This backer port is that of the eth_device created for the PF vNIC
> */
> > +	eth_dev->data->backer_port_id =
> > +app_flower->pf_hw->eth_dev->data->port_id;
> > +
> > +	/* Only single queues for representor devices */
> > +	eth_dev->data->nb_rx_queues = 1;
> > +	eth_dev->data->nb_tx_queues = 1;
> > +
> > +	/* Allocating memory for mac addr */
> > +	eth_dev->data->mac_addrs = rte_zmalloc("mac_addr",
> > +		RTE_ETHER_ADDR_LEN, 0);
> > +	if (eth_dev->data->mac_addrs == NULL) {
> > +		PMD_INIT_LOG(ERR, "Failed to allocate memory for repr
> MAC");
> > +		ret = -ENOMEM;
> > +		goto ring_cleanup;
> > +	}
> > +
> > +	rte_ether_addr_copy(&init_repr_data->mac_addr, &repr->mac_addr);
> > +	rte_ether_addr_copy(&init_repr_data->mac_addr, eth_dev->data->mac_addrs);
> > +
> > +	/* Send reify message to hardware to inform it about the new repr
> */
> > +	ret = nfp_flower_cmsg_repr_reify(app_flower, repr);
> > +	if (ret) {
> 
> Compare vs 0
> 
> > +		PMD_INIT_LOG(WARNING, "Failed to send repr reify
> message");
> > +		goto mac_cleanup;
> > +	}
> > +
> > +	/* Add repr to correct array */
> > +	if (repr->repr_type == NFP_REPR_TYPE_PHYS_PORT)
> > +		app_flower->phy_reprs[repr->nfp_idx] = repr;
> > +	else
> > +		app_flower->vf_reprs[repr->vf_id] = repr;
> > +
> > +	return 0;
> > +
> > +mac_cleanup:
> > +	rte_free(eth_dev->data->mac_addrs);
> > +ring_cleanup:
> > +	rte_ring_free(repr->ring);
> > +
> > +	return ret;
> > +}
> > +
> > +int
> > +nfp_flower_repr_alloc(struct nfp_app_flower *app_flower) {
> > +	int i;
> > +	int ret;
> > +	struct rte_eth_dev *eth_dev;
> > +	struct nfp_eth_table *nfp_eth_table;
> > +	struct nfp_eth_table_port *eth_port;
> > +	struct nfp_flower_representor flower_repr = {
> > +		.switch_domain_id = app_flower->switch_domain_id,
> > +		.app_flower       = app_flower,
> > +	};
> > +
> > +	nfp_eth_table = app_flower->nfp_eth_table;
> > +	eth_dev = app_flower->pf_hw->eth_dev;
> > +
> > +	/* Send a NFP_FLOWER_CMSG_TYPE_MAC_REPR cmsg to
> hardware*/
> > +	ret = nfp_flower_cmsg_mac_repr(app_flower);
> > +	if (ret) {
> 
> Compare vs 0
> 
> > +		PMD_INIT_LOG(ERR, "Could not send mac repr cmsgs");
> > +		return ret;
> > +	}
> > +
> > +	/* Create a rte_eth_dev for every phyport representor */
> > +	for (i = 0; i < app_flower->num_phyport_reprs; i++) {
> > +		eth_port = &nfp_eth_table->ports[i];
> > +		flower_repr.repr_type = NFP_REPR_TYPE_PHYS_PORT;
> > +		flower_repr.port_id =
> nfp_flower_get_phys_port_id(eth_port->index);
> > +		flower_repr.nfp_idx = eth_port->eth_index;
> > +		flower_repr.vf_id = i;
> > +
> > +		/* Copy the real mac of the interface to the representor struct */
> > +		rte_ether_addr_copy((struct rte_ether_addr *)eth_port->mac_addr,
> > +				&flower_repr.mac_addr);
> > +		sprintf(flower_repr.name, "flower_repr_p%d", i);
> > +
> > +		/*
> > +		 * Create a eth_dev for this representor
> > +		 * This will also allocate private memory for the device
> > +		 */
> > +		ret = rte_eth_dev_create(eth_dev->device,
> flower_repr.name,
> > +				sizeof(struct nfp_flower_representor),
> > +				NULL, NULL, nfp_flower_repr_init,
> &flower_repr);
> > +		if (ret) {
> 
> Compare vs 0
> 
> > +			PMD_INIT_LOG(ERR, "Could not create eth_dev for repr");
> > +			break;
> > +		}
> > +	}
> > +
> > +	if (i < app_flower->num_phyport_reprs)
> > +		return ret;
> > +
> > +	/*
> > +	 * Now allocate eth_dev's for VF representors.
> > +	 * Also send reify messages
> > +	 */
> > +	for (i = 0; i < app_flower->num_vf_reprs; i++) {
> > +		flower_repr.repr_type = NFP_REPR_TYPE_VF;
> > +		flower_repr.port_id = nfp_get_pcie_port_id(app_flower->pf_hw->cpp,
> > +				NFP_FLOWER_CMSG_PORT_VNIC_TYPE_VF, i, 0);
> > +		flower_repr.nfp_idx = 0;
> > +		flower_repr.vf_id = i;
> > +
> > +		/* VF reprs get a random MAC address */
> > +		rte_eth_random_addr(flower_repr.mac_addr.addr_bytes);
> > +
> > +		sprintf(flower_repr.name, "flower_repr_vf%d", i);
> > +
> > +		 /* This will also allocate private memory for the device*/
> > +		ret = rte_eth_dev_create(eth_dev->device,
> flower_repr.name,
> > +				sizeof(struct nfp_flower_representor),
> > +				NULL, NULL, nfp_flower_repr_init,
> &flower_repr);
> > +		if (ret) {
> 
> Compare vs 0
> 
> > +			PMD_INIT_LOG(ERR, "Could not create eth_dev for repr");
> > +			break;
> > +		}
> > +	}
> > +
> > +	if (i < app_flower->num_vf_reprs)
> > +		return ret;
> > +
> > +	return 0;
> > +}
> > diff --git a/drivers/net/nfp/flower/nfp_flower_representor.h
> > b/drivers/net/nfp/flower/nfp_flower_representor.h
> > new file mode 100644
> > index 0000000..6ee54f1
> > --- /dev/null
> > +++ b/drivers/net/nfp/flower/nfp_flower_representor.h
> > @@ -0,0 +1,39 @@
> > +/* SPDX-License-Identifier: BSD-3-Clause
> > + * Copyright(c) 2022 Corigine, Inc.
> > + * All rights reserved.
> > + */
> > +
> > +#ifndef _NFP_FLOWER_REPRESENTOR_H_
> > +#define _NFP_FLOWER_REPRESENTOR_H_
> > +
> > +/*
> > + * enum nfp_repr_type - type of representor
> > + * @NFP_REPR_TYPE_PHYS_PORT:   external NIC port
> > + * @NFP_REPR_TYPE_PF:          physical function
> > + * @NFP_REPR_TYPE_VF:          virtual function
> > + * @NFP_REPR_TYPE_MAX:         number of representor types
> > + */
> > +enum nfp_repr_type {
> > +	NFP_REPR_TYPE_PHYS_PORT = 0,
> > +	NFP_REPR_TYPE_PF,
> > +	NFP_REPR_TYPE_VF,
> > +	NFP_REPR_TYPE_MAX,
> > +};
> > +
> > +struct nfp_flower_representor {
> > +	uint16_t vf_id;
> > +	uint16_t switch_domain_id;
> > +	uint32_t repr_type;
> > +	uint32_t port_id;
> > +	uint32_t nfp_idx;    /* only valid for the repr of physical port */
> > +	char name[RTE_ETH_NAME_MAX_LEN];
> > +	struct rte_ether_addr mac_addr;
> > +	struct nfp_app_flower *app_flower;
> > +	struct rte_ring *ring;
> > +	struct rte_eth_link *link;
> > +	struct rte_eth_stats repr_stats;
> > +};
> > +
> > +int nfp_flower_repr_alloc(struct nfp_app_flower *app_flower);
> > +
> > +#endif /* _NFP_FLOWER_REPRESENTOR_H_ */
> > diff --git a/drivers/net/nfp/meson.build b/drivers/net/nfp/meson.build
> > index 8710213..8a63979 100644
> > --- a/drivers/net/nfp/meson.build
> > +++ b/drivers/net/nfp/meson.build
> > @@ -7,7 +7,9 @@ if not is_linux or not dpdk_conf.get('RTE_ARCH_64')
> >   endif
> >   sources = files(
> >           'flower/nfp_flower.c',
> > +        'flower/nfp_flower_cmsg.c',
> >           'flower/nfp_flower_ctrl.c',
> > +        'flower/nfp_flower_representor.c',
> >           'nfpcore/nfp_cpp_pcie_ops.c',
> >           'nfpcore/nfp_nsp.c',
> >           'nfpcore/nfp_cppcore.c',
> > diff --git a/drivers/net/nfp/nfpcore/nfp_cpp_pcie_ops.c
> > b/drivers/net/nfp/nfpcore/nfp_cpp_pcie_ops.c
> > index 08bc4e8..22c8bc4 100644
> > --- a/drivers/net/nfp/nfpcore/nfp_cpp_pcie_ops.c
> > +++ b/drivers/net/nfp/nfpcore/nfp_cpp_pcie_ops.c
> > @@ -91,7 +91,10 @@
> >    * @refcnt:	number of current users
> >    * @iomem:	mapped IO memory
> >    */
> > +#define NFP_BAR_MIN 1
> > +#define NFP_BAR_MID 5
> >   #define NFP_BAR_MAX 7
> > +
> >   struct nfp_bar {
> >   	struct nfp_pcie_user *nfp;
> >   	uint32_t barcfg;
> > @@ -292,6 +295,7 @@ struct nfp_pcie_user {
> >    * BAR0.0: Reserved for General Mapping (for MSI-X access to PCIe SRAM)
> >    *
> >    *         Halving PCItoCPPBars for primary and secondary processes.
> > + *         For CoreNIC firmware:
> >    *         NFP PMD just requires two fixed slots, one for configuration BAR,
> >    *         and another for accessing the hw queues. Another slot is needed
> >    *         for setting the link up or down. Secondary processes do not need
> > @@ -301,6 +305,9 @@ struct nfp_pcie_user {
> >    *         supported. Due to this requirement and future extensions requiring
> >    *         new slots per process, only one secondary process is supported by
> >    *         now.
> > + *         For Flower firmware:
> > + *         NFP PMD needs another fixed slot, used as the configuration BAR
> > + *         for ctrl vNIC.
> >    */
> >   static int
> >   nfp_enable_bars(struct nfp_pcie_user *nfp)
> > @@ -309,11 +316,11 @@ struct nfp_pcie_user {
> >   	int x, start, end;
> >
> >   	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
> > -		start = 4;
> > -		end = 1;
> > +		start = NFP_BAR_MID;
> > +		end = NFP_BAR_MIN;
> 
> These and similar changes below look unrelated to the patch.
> 
> >   	} else {
> > -		start = 7;
> > -		end = 4;
> > +		start = NFP_BAR_MAX;
> > +		end = NFP_BAR_MID;
> >   	}
> >   	for (x = start; x > end; x--) {
> >   		bar = &nfp->bar[x - 1];
> > @@ -341,11 +348,11 @@ struct nfp_pcie_user {
> >   	int x, start, end;
> >
> >   	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
> > -		start = 4;
> > -		end = 1;
> > +		start = NFP_BAR_MID;
> > +		end = NFP_BAR_MIN;
> >   	} else {
> > -		start = 7;
> > -		end = 4;
> > +		start = NFP_BAR_MAX;
> > +		end = NFP_BAR_MID;
> >   	}
> >   	for (x = start; x > end; x--) {
> >   		bar = &nfp->bar[x - 1];
> > @@ -364,11 +371,11 @@ struct nfp_pcie_user {
> >   	int x, start, end;
> >
> >   	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
> > -		start = 4;
> > -		end = 1;
> > +		start = NFP_BAR_MID;
> > +		end = NFP_BAR_MIN;
> >   	} else {
> > -		start = 7;
> > -		end = 4;
> > +		start = NFP_BAR_MAX;
> > +		end = NFP_BAR_MID;
> >   	}
> >
> >   	for (x = start; x > end; x--) {
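
For readers following the review, here is a minimal sketch of how the 32-bit flower port ID discussed above is packed and unpacked. It reuses the field macros from the quoted nfp_flower_cmsg.h and the packing from nfp_get_pcie_port_id(). The `DEMO_*` constants and the `demo_pcie_port_id()` helper are hypothetical: the numeric values of NFP_FLOWER_CMSG_PORT_TYPE_PCIE_PORT and NFP_FLOWER_CMSG_PORT_VNIC_TYPE_VF are not shown in this excerpt, so placeholder values are assumed.

```c
#include <stdint.h>

/* Extraction macros as posted in the patch (nfp_flower_cmsg.h). */
#define NFP_FLOWER_CMSG_PORT_TYPE(x)       (((x) >> 28) & 0xf)  /* bits 28..31 */
#define NFP_FLOWER_CMSG_PORT_PCI(x)        (((x) >> 14) & 0x3)  /* bits 14..15 */
#define NFP_FLOWER_CMSG_PORT_VNIC_TYPE(x)  (((x) >> 12) & 0x3)  /* bits 12..13 */
#define NFP_FLOWER_CMSG_PORT_VNIC(x)       (((x) >> 6) & 0x3f)  /* bits 6..11 */
#define NFP_FLOWER_CMSG_PORT_PCIE_Q(x)     ((x) & 0x3f)         /* bits 0..5 */

/* ASSUMPTION: placeholder values, not taken from the patch excerpt. */
#define DEMO_PORT_TYPE_PCIE_PORT  0x2u
#define DEMO_VNIC_TYPE_VF         0x0u

/*
 * Same packing as nfp_get_pcie_port_id() in the patch, except that the
 * PCIe unit number is passed in directly instead of being read from cpp.
 */
static uint32_t
demo_pcie_port_id(uint8_t nfp_pcie, uint32_t type, uint8_t vnic, uint8_t queue)
{
	return ((uint32_t)(nfp_pcie & 0x3) << 14) |
	       ((type & 0x3) << 12) |
	       ((uint32_t)(vnic & 0x3f) << 6) |
	       (uint32_t)(queue & 0x3f) |
	       ((DEMO_PORT_TYPE_PCIE_PORT & 0xf) << 28);
}
```

Under these assumed values, demo_pcie_port_id(1, DEMO_VNIC_TYPE_VF, 5, 3) packs to 0x20004143, and each NFP_FLOWER_CMSG_PORT_* macro recovers the field that was put in.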



* Re: [PATCH v5 07/12] net/nfp: add flower ctrl VNIC related logics
  2022-08-08 11:32     ` Chaoyong He
@ 2022-08-08 14:45       ` Stephen Hemminger
  2022-08-10  1:51         ` Chaoyong He
  0 siblings, 1 reply; 29+ messages in thread
From: Stephen Hemminger @ 2022-08-08 14:45 UTC (permalink / raw)
  To: Chaoyong He; +Cc: Andrew Rybchenko, Niklas Soderlund, dev

On Mon, 8 Aug 2022 11:32:30 +0000
Chaoyong He <chaoyong.he@corigine.com> wrote:

> > > +		goto done;
> > > +
> > > +	/* Allocate memory for the eth_dev of the vNIC */
> > > +	hw->eth_dev = rte_zmalloc("ctrl_vnic_eth_dev",  
> > 
> > Why not rte_eth_dev_allocate()? Isn't it an ethdev?
> > Why do you bypass the ethdev layer completely in this case and do everything
> > yourself?
> 
> Here we create an ethdev local to the nfp PMD; we want the user to be completely unaware of it.
> If we used rte_eth_dev_allocate() to create it, it would be in the 'rte_ethdev_devices[]' array, which is not what we want.

Having a floating ethdev does open the code and users up to a number of potential bugs.
What is the value of port_id on that ethdev? What is the mechanism to ensure it doesn't
conflict with other ones in the system?
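
To make the concern concrete, here is a toy model of one way a floating ethdev could collide with a registered port. It is only a sketch under stated assumptions: the `demo_*` types are hypothetical stand-ins, not the DPDK structures, and it assumes the private ethdev's data is zero-filled (as rte_zmalloc() returns it) and that its port_id is never assigned afterwards. Whether the patch assigns it later is not visible in this excerpt. The reverse lookup mirrors the `&rte_eth_devices[rxq->port_id]` pattern used in the representor burst functions quoted earlier.

```c
#include <stdint.h>
#include <string.h>

#define DEMO_MAX_PORTS 4

/* Hypothetical stand-ins for rte_eth_dev_data / rte_eth_dev. */
struct demo_dev_data {
	uint16_t port_id;
	char name[32];
};

struct demo_dev {
	struct demo_dev_data *data;
};

static struct demo_dev_data demo_data[DEMO_MAX_PORTS];
static struct demo_dev demo_devices[DEMO_MAX_PORTS]; /* models the global port array */

/* Register a port in the global array; its port_id equals the slot index. */
static struct demo_dev *
demo_register(uint16_t slot, const char *name)
{
	demo_data[slot].port_id = slot;
	strncpy(demo_data[slot].name, name, sizeof(demo_data[slot].name) - 1);
	demo_devices[slot].data = &demo_data[slot];
	return &demo_devices[slot];
}

/* Reverse lookup from a port_id to a device through the global array. */
static struct demo_dev *
demo_lookup(uint16_t port_id)
{
	return &demo_devices[port_id];
}
```

In this model, a device that lives outside `demo_devices[]` with a zero-filled `demo_dev_data` reports port_id 0, so demo_lookup() on that value returns whichever device was registered in slot 0 rather than the floating one. Any code path that goes from a queue back to its device by port_id would silently land on the wrong device.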


* RE: [PATCH v5 07/12] net/nfp: add flower ctrl VNIC related logics
  2022-08-08 14:45       ` Stephen Hemminger
@ 2022-08-10  1:51         ` Chaoyong He
  2022-08-10 19:39           ` Stephen Hemminger
  0 siblings, 1 reply; 29+ messages in thread
From: Chaoyong He @ 2022-08-10  1:51 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: Andrew Rybchenko, Niklas Soderlund, dev

> On Mon, 8 Aug 2022 11:32:30 +0000
> Chaoyong He <chaoyong.he@corigine.com> wrote:
> 
> > > > +		goto done;
> > > > +
> > > > +	/* Allocate memory for the eth_dev of the vNIC */
> > > > +	hw->eth_dev = rte_zmalloc("ctrl_vnic_eth_dev",
> > >
> > > Why not rte_eth_dev_allocate()? Isn't an ethdev?
> > > Why do you bypass the ethdev layer completely in this case and do
> > > everything yourself?
> >
> > Here we created an ethdev locally to nfp PMD, we want the user totally
> won't be aware of it.
> > If we use rte_eth_dev_allocate() to create it, it will be in array
> 'rte_ethdev_devices[]', that's not we want.
> 
> Having a floating ethdev does open the code and users up to a number of
> potential bugs.
> What is the value of port_id on that ethdev? What is the mechanism to
> ensure it doesn't conflict with other ones in the system.

I think the 'port_id' is the 'Device [external] port identifier', which is related to the
'rte_ethdev_devices[]' array.
The ethdev we create here is not exposed to the user and is not in the 'rte_ethdev_devices[]'
array, so the user cannot invoke it at all.
We only access this ethdev through a pointer in `struct nfp_net_hw`,
so I think there should be no conflict with other ports in the system.


* Re: [PATCH v5 07/12] net/nfp: add flower ctrl VNIC related logics
  2022-08-10  1:51         ` Chaoyong He
@ 2022-08-10 19:39           ` Stephen Hemminger
  2022-08-11  1:26             ` Chaoyong He
  0 siblings, 1 reply; 29+ messages in thread
From: Stephen Hemminger @ 2022-08-10 19:39 UTC (permalink / raw)
  To: Chaoyong He; +Cc: Andrew Rybchenko, Niklas Soderlund, dev

On Wed, 10 Aug 2022 01:51:55 +0000
Chaoyong He <chaoyong.he@corigine.com> wrote:

> > On Mon, 8 Aug 2022 11:32:30 +0000
> > Chaoyong He <chaoyong.he@corigine.com> wrote:
> >   
> > > > > +		goto done;
> > > > > +
> > > > > +	/* Allocate memory for the eth_dev of the vNIC */
> > > > > +	hw->eth_dev = rte_zmalloc("ctrl_vnic_eth_dev",  
> > > >
> > > > Why not rte_eth_dev_allocate()? Isn't an ethdev?
> > > > Why do you bypass the ethdev layer completely in this case and do
> > > > everything yourself?
> > >
> > > Here we created an ethdev locally to nfp PMD, we want the user totally  
> > won't be aware of it.  
> > > If we use rte_eth_dev_allocate() to create it, it will be in array  
> > 'rte_ethdev_devices[]', that's not we want.
> > 
> > Having a floating ethdev does open the code and users up to a number of
> > potential bugs.
> > What is the value of port_id on that ethdev? What is the mechanism to
> > ensure it doesn't conflict with other ones in the system.  
> 
> The 'port_id' is the 'Device [external] port identifier', which related with the
> 'rte_ethdev_devices[]' I think.
> Here the ethdev we created is not exposed to the user and is not in the 'rte_ethdev_devices[]'
> array, so it can't be invoked by the user at all.
> And we invoke this ethdev through a pointer in the `struct nfp_net_hw`,
> so I think there should no conflict with other ones in the system.

DPDK already has a port ownership framework to deal with internal
ethernet device ports. Why was this not used?


* RE: [PATCH v5 07/12] net/nfp: add flower ctrl VNIC related logics
  2022-08-10 19:39           ` Stephen Hemminger
@ 2022-08-11  1:26             ` Chaoyong He
  2022-08-11  4:24               ` Stephen Hemminger
  0 siblings, 1 reply; 29+ messages in thread
From: Chaoyong He @ 2022-08-11  1:26 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: Andrew Rybchenko, Niklas Soderlund, dev

> On Wed, 10 Aug 2022 01:51:55 +0000
> Chaoyong He <chaoyong.he@corigine.com> wrote:
> 
> > > On Mon, 8 Aug 2022 11:32:30 +0000
> > > Chaoyong He <chaoyong.he@corigine.com> wrote:
> > >
> > > > > > +		goto done;
> > > > > > +
> > > > > > +	/* Allocate memory for the eth_dev of the vNIC */
> > > > > > +	hw->eth_dev = rte_zmalloc("ctrl_vnic_eth_dev",
> > > > >
> > > > > Why not rte_eth_dev_allocate()? Isn't an ethdev?
> > > > > Why do you bypass the ethdev layer completely in this case and do
> > > > > everything yourself?
> > > >
> > > > Here we created an ethdev locally to nfp PMD, we want the user
> > > > totally
> > > won't be aware of it.
> > > > If we use rte_eth_dev_allocate() to create it, it will be in array
> > > 'rte_ethdev_devices[]', that's not we want.
> > >
> > > Having a floating ethdev does open the code and users up to a number
> > > of potential bugs.
> > > What is the value of port_id on that ethdev? What is the mechanism
> > > to ensure it doesn't conflict with other ones in the system.
> >
> > The 'port_id' is the 'Device [external] port identifier', which
> > related with the 'rte_ethdev_devices[]' I think.
> > Here the ethdev we created is not exposed to the user and is not in the
> 'rte_ethdev_devices[]'
> > array, so it can't be invoked by the user at all.
> > And we invoke this ethdev through a pointer in the `struct
> > nfp_net_hw`, so I think there should no conflict with other ones in the
> system.
> 
> DPDK already has a port ownership framework to deal with internal ethernet
> device ports. Why was this not used?

Sorry, I was not aware of this framework before. Any documentation link or explanation of
how it works would be greatly appreciated. Thanks!


* Re: [PATCH v5 07/12] net/nfp: add flower ctrl VNIC related logics
  2022-08-11  1:26             ` Chaoyong He
@ 2022-08-11  4:24               ` Stephen Hemminger
  2022-08-11  6:31                 ` Chaoyong He
  0 siblings, 1 reply; 29+ messages in thread
From: Stephen Hemminger @ 2022-08-11  4:24 UTC (permalink / raw)
  To: Chaoyong He; +Cc: Andrew Rybchenko, Niklas Soderlund, dev

On Thu, 11 Aug 2022 01:26:49 +0000
Chaoyong He <chaoyong.he@corigine.com> wrote:

> > > The 'port_id' is the 'Device [external] port identifier', which
> > > related with the 'rte_ethdev_devices[]' I think.
> > > Here the ethdev we created is not exposed to the user and is not in the  
> > 'rte_ethdev_devices[]'  
> > > array, so it can't be invoked by the user at all.
> > > And we invoke this ethdev through a pointer in the `struct
> > > nfp_net_hw`, so I think there should no conflict with other ones in the  
> > system.
> > 
> > DPDK already has a port ownership framework to deal with internal ethernet
> > device ports. Why was this not used?  
> 
> Sorry I have no knowledge about this framework before. Any document link or logic about
> this framework will be greatly appreciated. Thanks!

It is part of ethdev https://doc.dpdk.org/api/rte__ethdev_8h.html

See rte_eth_dev_owner_new, rte_eth_dev_owner_set, etc
https://doc.dpdk.org/api/rte__ethdev_8h.html#ad6817cc801bf0faa566f52d382214457
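
As a rough illustration of the idea behind the ownership framework, here is a self-contained toy model, not the DPDK implementation. All `demo_*` names are hypothetical. In DPDK itself the corresponding entry points are rte_eth_dev_owner_new() and rte_eth_dev_owner_set(), and the RTE_ETH_FOREACH_DEV iterator only visits ports that have no owner, which is what hides an owned port from applications that enumerate ports that way.

```c
#include <stdint.h>

#define DEMO_MAX_PORTS 4
#define DEMO_NO_OWNER  0   /* plays the role of RTE_ETH_DEV_NO_OWNER */

struct demo_port {
	int in_use;
	uint64_t owner_id;
};

static struct demo_port demo_ports[DEMO_MAX_PORTS];
static uint64_t demo_next_owner = 1;

/* Hand out a fresh owner id (the role rte_eth_dev_owner_new() plays). */
static uint64_t
demo_owner_new(void)
{
	return demo_next_owner++;
}

/* Claim a port; fails if it is unused or already held by another owner. */
static int
demo_owner_set(uint16_t port, uint64_t owner)
{
	if (port >= DEMO_MAX_PORTS || !demo_ports[port].in_use)
		return -1;
	if (demo_ports[port].owner_id != DEMO_NO_OWNER &&
	    demo_ports[port].owner_id != owner)
		return -1;
	demo_ports[port].owner_id = owner;
	return 0;
}

/* Next in-use port at or after 'start' held by 'owner', else DEMO_MAX_PORTS. */
static uint16_t
demo_find_next_owned_by(uint16_t start, uint64_t owner)
{
	uint16_t p = start;

	while (p < DEMO_MAX_PORTS &&
	       (!demo_ports[p].in_use || demo_ports[p].owner_id != owner))
		p++;
	return p;
}
```

Iterating with demo_find_next_owned_by(p, DEMO_NO_OWNER) skips every port a PMD has claimed, while iterating with the PMD's own id finds exactly those ports. Note that this only protects against enumeration: code that indexes a port directly by port_id is not stopped by ownership alone.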


* RE: [PATCH v5 07/12] net/nfp: add flower ctrl VNIC related logics
  2022-08-11  4:24               ` Stephen Hemminger
@ 2022-08-11  6:31                 ` Chaoyong He
  2022-08-11 15:07                   ` Stephen Hemminger
  0 siblings, 1 reply; 29+ messages in thread
From: Chaoyong He @ 2022-08-11  6:31 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: Andrew Rybchenko, Niklas Soderlund, dev



> -----Original Message-----
> From: Stephen Hemminger <stephen@networkplumber.org>
> Sent: Thursday, August 11, 2022 12:25 PM
> To: Chaoyong He <chaoyong.he@corigine.com>
> Cc: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>; Niklas
> Soderlund <niklas.soderlund@corigine.com>; dev@dpdk.org
> Subject: Re: [PATCH v5 07/12] net/nfp: add flower ctrl VNIC related logics
> 
> On Thu, 11 Aug 2022 01:26:49 +0000
> Chaoyong He <chaoyong.he@corigine.com> wrote:
> 
> > > > The 'port_id' is the 'Device [external] port identifier', which
> > > > related with the 'rte_ethdev_devices[]' I think.
> > > > Here the ethdev we created is not exposed to the user and is not
> > > > in the
> > > 'rte_ethdev_devices[]'
> > > > array, so it can't be invoked by the user at all.
> > > > And we invoke this ethdev through a pointer in the `struct
> > > > nfp_net_hw`, so I think there should no conflict with other ones
> > > > in the
> > > system.
> > >
> > > DPDK already has a port ownership framework to deal with internal
> > > ethernet device ports. Why was this not used?
> >
> > Sorry I have no knowledge about this framework before. Any document
> > link or logic about this framework will be greatly appreciated. Thanks!
> 
> It is part of ethdev https://doc.dpdk.org/api/rte__ethdev_8h.html
> 
> See rte_eth_dev_owner_new, rte_eth_dev_owner_set, etc
> https://doc.dpdk.org/api/rte__ethdev_8h.html#ad6817cc801bf0faa566f52d3
> 82214457

Thank you very much!

If the app uses the rte_eth_dev_owner_* APIs to check the ownership first, they can indeed
protect the internal ethdev ports.
But right now ovs-dpdk does not seem to use these APIs at all, and it can use 'port_id' to
get any ethdev port in the rte_ethdev_devices[] array.
So maybe it is a good idea to keep our original logic and keep an eye on this area; once
ovs-dpdk uses the rte_eth_dev_owner_* APIs, we will update the logic here accordingly.

Thanks again!


* Re: [PATCH v5 07/12] net/nfp: add flower ctrl VNIC related logics
  2022-08-11  6:31                 ` Chaoyong He
@ 2022-08-11 15:07                   ` Stephen Hemminger
  0 siblings, 0 replies; 29+ messages in thread
From: Stephen Hemminger @ 2022-08-11 15:07 UTC (permalink / raw)
  To: Chaoyong He; +Cc: Andrew Rybchenko, Niklas Soderlund, dev

On Thu, 11 Aug 2022 06:31:31 +0000
Chaoyong He <chaoyong.he@corigine.com> wrote:

> > -----Original Message-----
> > From: Stephen Hemminger <stephen@networkplumber.org>
> > Sent: Thursday, August 11, 2022 12:25 PM
> > To: Chaoyong He <chaoyong.he@corigine.com>
> > Cc: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>; Niklas
> > Soderlund <niklas.soderlund@corigine.com>; dev@dpdk.org
> > Subject: Re: [PATCH v5 07/12] net/nfp: add flower ctrl VNIC related logics
> > 
> > On Thu, 11 Aug 2022 01:26:49 +0000
> > Chaoyong He <chaoyong.he@corigine.com> wrote:
> >   
> > > > > The 'port_id' is the 'Device [external] port identifier', which
> > > > > I think relates to the 'rte_eth_devices[]' array.
> > > > > The ethdev we create here is not exposed to the user and is not
> > > > > in the 'rte_eth_devices[]' array, so the user cannot invoke it
> > > > > at all.
> > > > > We reach this ethdev through a pointer in `struct nfp_net_hw`,
> > > > > so I think there should be no conflict with other ports in the
> > > > > system.
> > > >
> > > > DPDK already has a port ownership framework to deal with internal
> > > > ethernet device ports. Why was this not used?  
> > >
> > > Sorry, I had no knowledge of this framework before. Any documentation
> > > link or explanation of how it works would be greatly appreciated. Thanks!
> > 
> > It is part of ethdev https://doc.dpdk.org/api/rte__ethdev_8h.html
> > 
> > See rte_eth_dev_owner_new, rte_eth_dev_owner_set, etc.
> > https://doc.dpdk.org/api/rte__ethdev_8h.html#ad6817cc801bf0faa566f52d382214457
> 
> Thank you very much!
> 
> If the app uses the rte_eth_dev_owner_* APIs to check ownership first, that can indeed
> protect the internal ethdev ports.
> But right now ovs-dpdk does not seem to use these APIs at all, and it can use 'port_id' to
> get any ethdev port in the rte_eth_devices[] array.
> So it may be a good idea to keep our original logic and keep an eye on this area; once
> ovs-dpdk uses the rte_eth_dev_owner_* APIs, we'll update the logic here accordingly.
> 
> Thanks again!

Once a device is owned by something, it no longer shows up in the FOREACH and other
iterators, so ovs-dpdk should be OK. This mechanism is how the bonding, failsafe, and netvsc
drivers handle sub-devices. Therefore OVS should be smart enough to handle it.
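
To make the ownership point concrete, here is a minimal standalone sketch of the
behaviour described above. It is a toy model only and does not link against DPDK:
every toy_* name is hypothetical and invented for illustration. In real code the
owner would be created and assigned with rte_eth_dev_owner_new() and
rte_eth_dev_owner_set(), and the application would iterate with
RTE_ETH_FOREACH_DEV. The sketch assumes the iterator visits only unowned ports,
and that an application indexing by 'port_id' directly has to check the owner
itself.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_MAX_PORTS 4u
#define TOY_NO_OWNER  0u   /* hypothetical stand-in for "port has no owner" */

/* Hypothetical port table entry; not a DPDK structure. */
struct toy_port {
	int      attached;   /* 1 if the slot holds a device */
	uint64_t owner_id;   /* TOY_NO_OWNER while unowned */
};

static struct toy_port toy_ports[TOY_MAX_PORTS];
static uint64_t toy_next_owner = 1;

/* Hand out a fresh, non-zero owner id. */
static uint64_t toy_owner_new(void)
{
	return toy_next_owner++;
}

/* Claim an attached, unowned port. Returns 0 on success, -1 otherwise. */
static int toy_owner_set(unsigned int port, uint64_t owner)
{
	if (port >= TOY_MAX_PORTS || !toy_ports[port].attached)
		return -1;
	if (owner == TOY_NO_OWNER || toy_ports[port].owner_id != TOY_NO_OWNER)
		return -1;
	toy_ports[port].owner_id = owner;
	return 0;
}

/* First attached port >= start with the given owner, else TOY_MAX_PORTS. */
static unsigned int toy_find_next_owned_by(unsigned int start, uint64_t owner)
{
	while (start < TOY_MAX_PORTS &&
	       (!toy_ports[start].attached || toy_ports[start].owner_id != owner))
		start++;
	return start;
}

/* Count the ports a generic application iterator would visit (unowned only). */
static unsigned int toy_count_visible(void)
{
	unsigned int n = 0;
	unsigned int p;

	for (p = toy_find_next_owned_by(0, TOY_NO_OWNER); p < TOY_MAX_PORTS;
	     p = toy_find_next_owned_by(p + 1, TOY_NO_OWNER))
		n++;
	return n;
}

/* Check an application would need before using a raw port_id directly. */
static int toy_port_usable(unsigned int port)
{
	return port < TOY_MAX_PORTS && toy_ports[port].attached &&
	       toy_ports[port].owner_id == TOY_NO_OWNER;
}

/* Attach ports 0..2, let a "driver" claim port 1, return the visible count. */
static unsigned int toy_demo(void)
{
	uint64_t owner;
	int rc;

	memset(toy_ports, 0, sizeof(toy_ports));
	toy_next_owner = 1;
	toy_ports[0].attached = 1;
	toy_ports[1].attached = 1;
	toy_ports[2].attached = 1;

	owner = toy_owner_new();
	rc = toy_owner_set(1, owner);   /* internal port, hidden from iteration */
	assert(rc == 0);
	(void)rc;

	return toy_count_visible();
}
```

In this model, toy_demo() leaves ports 0 and 2 visible to the iterator while port 1
is skipped. A direct lookup of port 1 is rejected only because toy_port_usable()
checks the owner, which is the gap raised earlier in the thread for applications
that index by 'port_id' without such a check.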


Thread overview: 29+ messages
2022-08-05  6:32 [PATCH v5 00/12] preparation for the rte_flow offload of nfp PMD Chaoyong He
2022-08-05  6:32 ` [PATCH v5 01/12] net/nfp: move app specific attributes to own struct Chaoyong He
2022-08-05 10:49   ` Andrew Rybchenko
2022-08-05  6:32 ` [PATCH v5 02/12] net/nfp: simplify initialization and remove dead code Chaoyong He
2022-08-05  6:32 ` [PATCH v5 03/12] net/nfp: move app specific init logic to own function Chaoyong He
2022-08-05 10:53   ` Andrew Rybchenko
2022-08-05  6:32 ` [PATCH v5 04/12] net/nfp: add initial flower firmware support Chaoyong He
2022-08-05 11:00   ` Andrew Rybchenko
2022-08-05  6:32 ` [PATCH v5 05/12] net/nfp: add flower PF setup and mempool init logic Chaoyong He
2022-08-05 12:49   ` Andrew Rybchenko
2022-08-05  6:32 ` [PATCH v5 06/12] net/nfp: add flower PF related routines Chaoyong He
2022-08-05 12:55   ` Andrew Rybchenko
2022-08-05  6:32 ` [PATCH v5 07/12] net/nfp: add flower ctrl VNIC related logics Chaoyong He
2022-08-05 13:05   ` Andrew Rybchenko
2022-08-08 11:32     ` Chaoyong He
2022-08-08 14:45       ` Stephen Hemminger
2022-08-10  1:51         ` Chaoyong He
2022-08-10 19:39           ` Stephen Hemminger
2022-08-11  1:26             ` Chaoyong He
2022-08-11  4:24               ` Stephen Hemminger
2022-08-11  6:31                 ` Chaoyong He
2022-08-11 15:07                   ` Stephen Hemminger
2022-08-05  6:32 ` [PATCH v5 08/12] net/nfp: move common rxtx function for flower use Chaoyong He
2022-08-05  6:32 ` [PATCH v5 09/12] net/nfp: add flower ctrl VNIC rxtx logic Chaoyong He
2022-08-05  6:32 ` [PATCH v5 10/12] net/nfp: add flower representor framework Chaoyong He
2022-08-05 14:23   ` Andrew Rybchenko
2022-08-08 11:56     ` Chaoyong He
2022-08-05  6:32 ` [PATCH v5 11/12] net/nfp: move rxtx function to header file Chaoyong He
2022-08-05  6:32 ` [PATCH v5 12/12] net/nfp: add flower PF rxtx logic Chaoyong He
