DPDK patches and discussions
* [PATCH v9 00/12] preparation for the rte_flow offload of nfp PMD
@ 2022-09-15 10:44 Chaoyong He
  2022-09-15 10:44 ` [PATCH v9 01/12] net/nfp: move app specific attributes to own struct Chaoyong He
                   ` (12 more replies)
  0 siblings, 13 replies; 22+ messages in thread
From: Chaoyong He @ 2022-09-15 10:44 UTC
  To: dev; +Cc: oss-drivers, niklas.soderlund, Chaoyong He

* Changes since v8
- Update nfp.rst
- Rename 'app_hw' to 'app_fw'
- Remove the ovs compatible header file
- Remove the use of rte_eth_dev_configure()/rte_eth_rx_burst()/rte_eth_dev_start() API

* Changes since v7
- Adjust the logic so that the PCI probe process is not broken
- Rename 'app' to 'app_fw' throughout to avoid confusion
- Fix the log level problem

* Changes since v6
- Fix the compile error

* Changes since v5
- Compare integer with 0 explicitly
- Change helper macro to function
- Implement the dummy functions
- Remove some unnecessary logic

* Changes since v4
- Remove the unneeded '__rte_unused' attribute
- Fix a potential memory leak

* Changes since v3
- Add the 'Depends-on' tag

* Changes since v2
- Remove the use of rte_panic()

* Changes since v1
- Fix the compile error

Depends-on: series-23707 ("Add support of NFP3800 chip and firmware with NFDk")

Chaoyong He (12):
  net/nfp: move app specific attributes to own struct
  net/nfp: simplify initialization and remove dead code
  net/nfp: move app specific init logic to own function
  net/nfp: add initial flower firmware support
  net/nfp: add flower PF setup logic
  net/nfp: add flower PF related routines
  net/nfp: add flower ctrl VNIC related logics
  net/nfp: move common rxtx function for flower use
  net/nfp: add flower ctrl VNIC rxtx logic
  net/nfp: add flower representor framework
  net/nfp: move rxtx function to header file
  net/nfp: add flower PF rxtx logic

 doc/guides/nics/nfp.rst                         |   32 +
 doc/guides/rel_notes/release_22_11.rst          |    9 +
 drivers/net/nfp/flower/nfp_flower.c             | 1310 +++++++++++++++++++++++
 drivers/net/nfp/flower/nfp_flower.h             |   70 ++
 drivers/net/nfp/flower/nfp_flower_cmsg.c        |  185 ++++
 drivers/net/nfp/flower/nfp_flower_cmsg.h        |  173 +++
 drivers/net/nfp/flower/nfp_flower_ctrl.c        |  250 +++++
 drivers/net/nfp/flower/nfp_flower_ctrl.h        |   13 +
 drivers/net/nfp/flower/nfp_flower_representor.c |  664 ++++++++++++
 drivers/net/nfp/flower/nfp_flower_representor.h |   39 +
 drivers/net/nfp/meson.build                     |    4 +
 drivers/net/nfp/nfp_common.c                    |    2 +-
 drivers/net/nfp/nfp_common.h                    |   35 +-
 drivers/net/nfp/nfp_cpp_bridge.c                |   88 +-
 drivers/net/nfp/nfp_cpp_bridge.h                |    6 +-
 drivers/net/nfp/nfp_ethdev.c                    |  347 +++---
 drivers/net/nfp/nfp_ethdev_vf.c                 |    2 +-
 drivers/net/nfp/nfp_rxtx.c                      |  123 +--
 drivers/net/nfp/nfp_rxtx.h                      |  121 +++
 drivers/net/nfp/nfpcore/nfp_cpp_pcie_ops.c      |   31 +-
 20 files changed, 3222 insertions(+), 282 deletions(-)
 create mode 100644 drivers/net/nfp/flower/nfp_flower.c
 create mode 100644 drivers/net/nfp/flower/nfp_flower.h
 create mode 100644 drivers/net/nfp/flower/nfp_flower_cmsg.c
 create mode 100644 drivers/net/nfp/flower/nfp_flower_cmsg.h
 create mode 100644 drivers/net/nfp/flower/nfp_flower_ctrl.c
 create mode 100644 drivers/net/nfp/flower/nfp_flower_ctrl.h
 create mode 100644 drivers/net/nfp/flower/nfp_flower_representor.c
 create mode 100644 drivers/net/nfp/flower/nfp_flower_representor.h

-- 
1.8.3.1



* [PATCH v9 01/12] net/nfp: move app specific attributes to own struct
  2022-09-15 10:44 [PATCH v9 00/12] preparation for the rte_flow offload of nfp PMD Chaoyong He
@ 2022-09-15 10:44 ` Chaoyong He
  2022-09-15 10:44 ` [PATCH v9 02/12] net/nfp: simplify initialization and remove dead code Chaoyong He
                   ` (11 subsequent siblings)
  12 siblings, 0 replies; 22+ messages in thread
From: Chaoyong He @ 2022-09-15 10:44 UTC
  To: dev; +Cc: oss-drivers, niklas.soderlund, Chaoyong He, Heinrich Kuhn

The NFP card can load different application firmwares. Currently
only the CoreNIC application firmware is supported. This commit makes
the infrastructure changes needed to support other application
firmwares as well.

Clearer separation is made between the PF device and any application
firmware specific concepts. The PF struct is now generic regardless
of the application firmware loaded. A new struct is also made for the
CoreNIC application firmware. Future additions to support other
application firmwares should also add an application firmware specific
struct.
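
The split described above can be sketched roughly as follows. This is a
minimal illustration with simplified stand-in types, not the driver's
real definitions: the names pf_dev, app_fw_nic, init_app_fw_nic() and
pf_init() are hypothetical, and calloc() stands in for rte_zmalloc().
The generic PF struct only records which firmware is loaded and holds
an opaque pointer; the app-specific init code owns the private struct.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical firmware IDs; only one flavor exists in this sketch. */
enum app_fw_id {
	APP_FW_CORE_NIC = 0x1,
};

/* Generic PF state: knows nothing about the loaded application. */
struct pf_dev {
	enum app_fw_id app_fw_id;
	void *app_fw_priv;	/* owned by the app-specific init code */
};

/* State that only makes sense for the CoreNIC-style firmware. */
struct app_fw_nic {
	struct pf_dev *pf_dev;	/* backpointer to the PF */
	uint8_t total_phyports;
	bool multiport;
};

static int
init_app_fw_nic(struct pf_dev *pf, int total_vnics)
{
	struct app_fw_nic *nic;

	assert(pf != NULL);

	/* Validate before allocating so there is nothing to unwind. */
	if (total_vnics <= 0 || total_vnics > 8)
		return -EINVAL;

	nic = calloc(1, sizeof(*nic));
	if (nic == NULL)
		return -ENOMEM;

	nic->pf_dev = pf;
	nic->total_phyports = (uint8_t)total_vnics;
	nic->multiport = total_vnics > 1;
	pf->app_fw_priv = nic;

	return 0;
}

/* Generic PF init dispatches on the firmware ID. */
static int
pf_init(struct pf_dev *pf, int total_vnics)
{
	switch (pf->app_fw_id) {
	case APP_FW_CORE_NIC:
		return init_app_fw_nic(pf, total_vnics);
	default:
		return -EINVAL;	/* unsupported firmware */
	}
}
```

Supporting another firmware in this sketch would mean adding an enum
value, a private struct and one more case in pf_init(), leaving the
generic PF struct unchanged.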

Signed-off-by: Chaoyong He <chaoyong.he@corigine.com>
Signed-off-by: Heinrich Kuhn <heinrich.kuhn@corigine.com>
Reviewed-by: Niklas Söderlund <niklas.soderlund@corigine.com>
---
 drivers/net/nfp/nfp_common.h |  30 ++++++-
 drivers/net/nfp/nfp_ethdev.c | 194 +++++++++++++++++++++++++++----------------
 2 files changed, 149 insertions(+), 75 deletions(-)

diff --git a/drivers/net/nfp/nfp_common.h b/drivers/net/nfp/nfp_common.h
index 6d917e4..bea5f95 100644
--- a/drivers/net/nfp/nfp_common.h
+++ b/drivers/net/nfp/nfp_common.h
@@ -111,6 +111,11 @@
 #include <linux/types.h>
 #include <rte_io.h>
 
+/* Firmware application ID's */
+enum nfp_app_fw_id {
+	NFP_APP_FW_CORE_NIC               = 0x1,
+};
+
 /* nfp_qcp_ptr - Read or Write Pointer of a queue */
 enum nfp_qcp_ptr {
 	NFP_QCP_READ_PTR = 0,
@@ -121,8 +126,10 @@ struct nfp_pf_dev {
 	/* Backpointer to associated pci device */
 	struct rte_pci_device *pci_dev;
 
-	/* Array of physical ports belonging to this PF */
-	struct nfp_net_hw *ports[NFP_MAX_PHYPORTS];
+	enum nfp_app_fw_id app_fw_id;
+
+	/* Pointer to the app running on the PF */
+	void *app_fw_priv;
 
 	/* Current values for control */
 	uint32_t ctrl;
@@ -151,8 +158,6 @@ struct nfp_pf_dev {
 	struct nfp_cpp_area *msix_area;
 
 	uint8_t *hw_queues;
-	uint8_t total_phyports;
-	bool	multiport;
 
 	union eth_table_entry *eth_table;
 
@@ -161,6 +166,20 @@ struct nfp_pf_dev {
 	uint32_t nfp_cpp_service_id;
 };
 
+struct nfp_app_fw_nic {
+	/* Backpointer to the PF device */
+	struct nfp_pf_dev *pf_dev;
+
+	/*
+	 * Array of physical ports belonging to this CoreNIC app
+	 * This is really a list of vNIC's. One for each physical port
+	 */
+	struct nfp_net_hw *ports[NFP_MAX_PHYPORTS];
+
+	bool multiport;
+	uint8_t total_phyports;
+};
+
 struct nfp_net_hw {
 	/* Backpointer to the PF this port belongs to */
 	struct nfp_pf_dev *pf_dev;
@@ -424,6 +443,9 @@ int nfp_net_rss_hash_conf_get(struct rte_eth_dev *dev,
 #define NFP_NET_DEV_PRIVATE_TO_PF(dev_priv)\
 	(((struct nfp_net_hw *)dev_priv)->pf_dev)
 
+#define NFP_PRIV_TO_APP_FW_NIC(app_fw_priv)\
+	((struct nfp_app_fw_nic *)app_fw_priv)
+
 #endif /* _NFP_COMMON_H_ */
 /*
  * Local variables:
diff --git a/drivers/net/nfp/nfp_ethdev.c b/drivers/net/nfp/nfp_ethdev.c
index e9d01f4..40d21e4 100644
--- a/drivers/net/nfp/nfp_ethdev.c
+++ b/drivers/net/nfp/nfp_ethdev.c
@@ -39,15 +39,15 @@
 #include "nfp_cpp_bridge.h"
 
 static int
-nfp_net_pf_read_mac(struct nfp_pf_dev *pf_dev, int port)
+nfp_net_pf_read_mac(struct nfp_app_fw_nic *app_fw_nic, int port)
 {
 	struct nfp_eth_table *nfp_eth_table;
 	struct nfp_net_hw *hw = NULL;
 
 	/* Grab a pointer to the correct physical port */
-	hw = pf_dev->ports[port];
+	hw = app_fw_nic->ports[port];
 
-	nfp_eth_table = nfp_eth_read_ports(pf_dev->cpp);
+	nfp_eth_table = nfp_eth_read_ports(app_fw_nic->pf_dev->cpp);
 
 	nfp_eth_copy_mac((uint8_t *)&hw->mac_addr,
 			 (uint8_t *)&nfp_eth_table->ports[port].mac_addr);
@@ -64,6 +64,7 @@
 	uint32_t new_ctrl, update = 0;
 	struct nfp_net_hw *hw;
 	struct nfp_pf_dev *pf_dev;
+	struct nfp_app_fw_nic *app_fw_nic;
 	struct rte_eth_conf *dev_conf;
 	struct rte_eth_rxmode *rxmode;
 	uint32_t intr_vector;
@@ -71,6 +72,7 @@
 
 	hw = NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
 	pf_dev = NFP_NET_DEV_PRIVATE_TO_PF(dev->data->dev_private);
+	app_fw_nic = NFP_PRIV_TO_APP_FW_NIC(pf_dev->app_fw_priv);
 
 	PMD_INIT_LOG(DEBUG, "Start");
 
@@ -82,7 +84,7 @@
 
 	/* check and configure queue intr-vector mapping */
 	if (dev->data->dev_conf.intr_conf.rxq != 0) {
-		if (pf_dev->multiport) {
+		if (app_fw_nic->multiport) {
 			PMD_INIT_LOG(ERR, "PMD rx interrupt is not supported "
 					  "with NFP multiport PF");
 				return -EINVAL;
@@ -250,6 +252,7 @@
 	struct nfp_net_hw *hw;
 	struct rte_pci_device *pci_dev;
 	struct nfp_pf_dev *pf_dev;
+	struct nfp_app_fw_nic *app_fw_nic;
 	int i;
 
 	if (rte_eal_process_type() != RTE_PROC_PRIMARY)
@@ -260,6 +263,7 @@
 	pf_dev = NFP_NET_DEV_PRIVATE_TO_PF(dev->data->dev_private);
 	hw = NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
 	pci_dev = RTE_ETH_DEV_TO_PCI(dev);
+	app_fw_nic = NFP_PRIV_TO_APP_FW_NIC(pf_dev->app_fw_priv);
 
 	/*
 	 * We assume that the DPDK application is stopping all the
@@ -280,12 +284,12 @@
 	/* Only free PF resources after all physical ports have been closed */
 	/* Mark this port as unused and free device priv resources*/
 	nn_cfg_writeb(hw, NFP_NET_CFG_LSC, 0xff);
-	pf_dev->ports[hw->idx] = NULL;
+	app_fw_nic->ports[hw->idx] = NULL;
 	rte_eth_dev_release_port(dev);
 
-	for (i = 0; i < pf_dev->total_phyports; i++) {
+	for (i = 0; i < app_fw_nic->total_phyports; i++) {
 		/* Check to see if ports are still in use */
-		if (pf_dev->ports[i])
+		if (app_fw_nic->ports[i])
 			return 0;
 	}
 
@@ -296,6 +300,7 @@
 	free(pf_dev->hwinfo);
 	free(pf_dev->sym_tbl);
 	nfp_cpp_free(pf_dev->cpp);
+	rte_free(app_fw_nic);
 	rte_free(pf_dev);
 
 	rte_intr_disable(pci_dev->intr_handle);
@@ -404,6 +409,7 @@
 {
 	struct rte_pci_device *pci_dev;
 	struct nfp_pf_dev *pf_dev;
+	struct nfp_app_fw_nic *app_fw_nic;
 	struct nfp_net_hw *hw;
 	struct rte_ether_addr *tmp_ether_addr;
 	uint64_t rx_bar_off = 0;
@@ -420,6 +426,9 @@
 	/* Use backpointer here to the PF of this eth_dev */
 	pf_dev = NFP_NET_DEV_PRIVATE_TO_PF(eth_dev->data->dev_private);
 
+	/* Use backpointer to the CoreNIC app struct */
+	app_fw_nic = NFP_PRIV_TO_APP_FW_NIC(pf_dev->app_fw_priv);
+
 	/* NFP can not handle DMA addresses requiring more than 40 bits */
 	if (rte_mem_check_dma_mask(40)) {
 		RTE_LOG(ERR, PMD,
@@ -438,7 +447,7 @@
 	 * Use PF array of physical ports to get pointer to
 	 * this specific port
 	 */
-	hw = pf_dev->ports[port];
+	hw = app_fw_nic->ports[port];
 
 	PMD_INIT_LOG(DEBUG, "Working with physical port number: %d, "
 			"NFP internal port number: %d", port, hw->nfp_idx);
@@ -568,7 +577,7 @@
 		goto dev_err_queues_map;
 	}
 
-	nfp_net_pf_read_mac(pf_dev, port);
+	nfp_net_pf_read_mac(app_fw_nic, port);
 	nfp_net_write_mac(hw, (uint8_t *)&hw->mac_addr);
 
 	tmp_ether_addr = (struct rte_ether_addr *)&hw->mac_addr;
@@ -720,25 +729,67 @@
 }
 
 static int
-nfp_init_phyports(struct nfp_pf_dev *pf_dev)
+nfp_init_app_fw_nic(struct nfp_pf_dev *pf_dev,
+		struct nfp_eth_table *nfp_eth_table)
 {
 	int i;
-	int ret = 0;
+	int ret;
+	int err = 0;
+	int total_vnics;
 	struct nfp_net_hw *hw;
+	unsigned int numa_node;
 	struct rte_eth_dev *eth_dev;
-	struct nfp_eth_table *nfp_eth_table;
+	struct nfp_app_fw_nic *app_fw_nic;
+	char port_name[RTE_ETH_NAME_MAX_LEN];
 
-	nfp_eth_table = nfp_eth_read_ports(pf_dev->cpp);
-	if (nfp_eth_table == NULL) {
-		PMD_INIT_LOG(ERR, "Error reading NFP ethernet table");
-		return -EIO;
+	PMD_INIT_LOG(INFO, "Total physical ports: %d", nfp_eth_table->count);
+
+	/* Allocate memory for the CoreNIC app */
+	app_fw_nic = rte_zmalloc("nfp_app_fw_nic", sizeof(*app_fw_nic), 0);
+	if (app_fw_nic == NULL)
+		return -ENOMEM;
+
+	/* Point the app_fw_priv pointer in the PF to the coreNIC app */
+	pf_dev->app_fw_priv = app_fw_nic;
+
+	/* Read the number of vNIC's created for the PF */
+	total_vnics = nfp_rtsym_read_le(pf_dev->sym_tbl, "nfd_cfg_pf0_num_ports", &err);
+	if (err != 0 || total_vnics <= 0 || total_vnics > 8) {
+		PMD_INIT_LOG(ERR, "nfd_cfg_pf0_num_ports symbol with wrong value");
+		ret = -ENODEV;
+		goto app_cleanup;
 	}
 
-	/* Loop through all physical ports on PF */
-	for (i = 0; i < pf_dev->total_phyports; i++) {
-		const unsigned int numa_node = rte_socket_id();
-		char port_name[RTE_ETH_NAME_MAX_LEN];
+	/*
+	 * For coreNIC the number of vNICs exposed should be the same as the
+	 * number of physical ports
+	 */
+	if (total_vnics != (int)nfp_eth_table->count) {
+		PMD_INIT_LOG(ERR, "Total physical ports do not match number of vNICs");
+		ret = -ENODEV;
+		goto app_cleanup;
+	}
+
+	/* Populate coreNIC app properties*/
+	app_fw_nic->total_phyports = total_vnics;
+	app_fw_nic->pf_dev = pf_dev;
+	if (total_vnics > 1)
+		app_fw_nic->multiport = true;
 
+	/* Map the symbol table */
+	pf_dev->ctrl_bar = nfp_rtsym_map(pf_dev->sym_tbl, "_pf0_net_bar0",
+			app_fw_nic->total_phyports * 32768, &pf_dev->ctrl_area);
+	if (pf_dev->ctrl_bar == NULL) {
+		PMD_INIT_LOG(ERR, "nfp_rtsym_map fails for _pf0_net_ctrl_bar");
+		ret = -EIO;
+		goto app_cleanup;
+	}
+
+	PMD_INIT_LOG(DEBUG, "ctrl bar: %p", pf_dev->ctrl_bar);
+
+	/* Loop through all physical ports on PF */
+	numa_node = rte_socket_id();
+	for (i = 0; i < app_fw_nic->total_phyports; i++) {
 		snprintf(port_name, sizeof(port_name), "%s_port%d",
 			 pf_dev->pci_dev->device.name, i);
 
@@ -762,7 +813,7 @@
 		hw = NFP_NET_DEV_PRIVATE_TO_HW(eth_dev->data->dev_private);
 
 		/* Add this device to the PF's array of physical ports */
-		pf_dev->ports[i] = hw;
+		app_fw_nic->ports[i] = hw;
 
 		hw->pf_dev = pf_dev;
 		hw->cpp = pf_dev->cpp;
@@ -785,20 +836,21 @@
 		rte_eth_dev_probing_finish(eth_dev);
 
 	} /* End loop, all ports on this PF */
-	ret = 0;
-	goto eth_table_cleanup;
+
+	return 0;
 
 port_cleanup:
-	for (i = 0; i < pf_dev->total_phyports; i++) {
-		if (pf_dev->ports[i] && pf_dev->ports[i]->eth_dev) {
+	for (i = 0; i < app_fw_nic->total_phyports; i++) {
+		if (app_fw_nic->ports[i] && app_fw_nic->ports[i]->eth_dev) {
 			struct rte_eth_dev *tmp_dev;
-			tmp_dev = pf_dev->ports[i]->eth_dev;
+			tmp_dev = app_fw_nic->ports[i]->eth_dev;
 			rte_eth_dev_release_port(tmp_dev);
-			pf_dev->ports[i] = NULL;
+			app_fw_nic->ports[i] = NULL;
 		}
 	}
-eth_table_cleanup:
-	free(nfp_eth_table);
+	nfp_cpp_area_free(pf_dev->ctrl_area);
+app_cleanup:
+	rte_free(app_fw_nic);
 
 	return ret;
 }
@@ -806,11 +858,11 @@
 static int
 nfp_pf_init(struct rte_pci_device *pci_dev)
 {
-	int err;
-	int ret = 0;
+	int ret;
+	int err = 0;
 	uint64_t addr;
-	int total_ports;
 	struct nfp_cpp *cpp;
+	enum nfp_app_fw_id app_fw_id;
 	struct nfp_pf_dev *pf_dev;
 	struct nfp_hwinfo *hwinfo;
 	char name[RTE_ETH_NAME_MAX_LEN];
@@ -842,9 +894,10 @@
 	if (hwinfo == NULL) {
 		PMD_INIT_LOG(ERR, "Error reading hwinfo table");
 		ret = -EIO;
-		goto error;
+		goto cpp_cleanup;
 	}
 
+	/* Read the number of physical ports from hardware */
 	nfp_eth_table = nfp_eth_read_ports(cpp);
 	if (nfp_eth_table == NULL) {
 		PMD_INIT_LOG(ERR, "Error reading NFP ethernet table");
@@ -867,20 +920,14 @@
 		goto eth_table_cleanup;
 	}
 
-	total_ports = nfp_rtsym_read_le(sym_tbl, "nfd_cfg_pf0_num_ports", &err);
-	if (total_ports != (int)nfp_eth_table->count) {
-		PMD_DRV_LOG(ERR, "Inconsistent number of ports");
+	/* Read the app ID of the firmware loaded */
+	app_fw_id = nfp_rtsym_read_le(sym_tbl, "_pf0_net_app_id", &err);
+	if (err != 0) {
+		PMD_INIT_LOG(ERR, "Couldn't read app_fw_id from fw");
 		ret = -EIO;
 		goto sym_tbl_cleanup;
 	}
 
-	PMD_INIT_LOG(INFO, "Total physical ports: %d", total_ports);
-
-	if (total_ports <= 0 || total_ports > 8) {
-		PMD_INIT_LOG(ERR, "nfd_cfg_pf0_num_ports symbol with wrong value");
-		ret = -ENODEV;
-		goto sym_tbl_cleanup;
-	}
 	/* Allocate memory for the PF "device" */
 	snprintf(name, sizeof(name), "nfp_pf%d", 0);
 	pf_dev = rte_zmalloc(name, sizeof(*pf_dev), 0);
@@ -890,27 +937,12 @@
 	}
 
 	/* Populate the newly created PF device */
+	pf_dev->app_fw_id = app_fw_id;
 	pf_dev->cpp = cpp;
 	pf_dev->hwinfo = hwinfo;
 	pf_dev->sym_tbl = sym_tbl;
-	pf_dev->total_phyports = total_ports;
-
-	if (total_ports > 1)
-		pf_dev->multiport = true;
-
 	pf_dev->pci_dev = pci_dev;
 
-	/* Map the symbol table */
-	pf_dev->ctrl_bar = nfp_rtsym_map(pf_dev->sym_tbl, "_pf0_net_bar0",
-			pf_dev->total_phyports * 32768, &pf_dev->ctrl_area);
-	if (pf_dev->ctrl_bar == NULL) {
-		PMD_INIT_LOG(ERR, "nfp_rtsym_map fails for _pf0_net_ctrl_bar");
-		ret = -EIO;
-		goto pf_cleanup;
-	}
-
-	PMD_INIT_LOG(DEBUG, "ctrl bar: %p", pf_dev->ctrl_bar);
-
 	/* configure access to tx/rx vNIC BARs */
 	switch (pci_dev->id.device_id) {
 	case PCI_DEVICE_ID_NFP3800_PF_NIC:
@@ -925,7 +957,8 @@
 	default:
 		PMD_INIT_LOG(ERR, "nfp_net: no device ID matching");
 		err = -ENODEV;
-		goto ctrl_area_cleanup;
+		ret = -ENODEV;
+		goto pf_cleanup;
 	}
 
 	pf_dev->hw_queues = nfp_cpp_map_area(pf_dev->cpp, 0, 0,
@@ -934,18 +967,27 @@
 	if (pf_dev->hw_queues == NULL) {
 		PMD_INIT_LOG(ERR, "nfp_rtsym_map fails for net.qc");
 		ret = -EIO;
-		goto ctrl_area_cleanup;
+		goto pf_cleanup;
 	}
 
 	PMD_INIT_LOG(DEBUG, "tx/rx bar address: 0x%p", pf_dev->hw_queues);
 
 	/*
-	 * Initialize and prep physical ports now
-	 * This will loop through all physical ports
+	 * PF initialization has been done at this point. Call app specific
+	 * init code now
 	 */
-	ret = nfp_init_phyports(pf_dev);
-	if (ret) {
-		PMD_INIT_LOG(ERR, "Could not create physical ports");
+	switch (pf_dev->app_fw_id) {
+	case NFP_APP_FW_CORE_NIC:
+		PMD_INIT_LOG(INFO, "Initializing coreNIC");
+		ret = nfp_init_app_fw_nic(pf_dev, nfp_eth_table);
+		if (ret != 0) {
+			PMD_INIT_LOG(ERR, "Could not initialize coreNIC!");
+			goto hwqueues_cleanup;
+		}
+		break;
+	default:
+		PMD_INIT_LOG(ERR, "Unsupported Firmware loaded");
+		ret = -EINVAL;
 		goto hwqueues_cleanup;
 	}
 
@@ -956,8 +998,6 @@
 
 hwqueues_cleanup:
 	nfp_cpp_area_free(pf_dev->hwqueues_area);
-ctrl_area_cleanup:
-	nfp_cpp_area_free(pf_dev->ctrl_area);
 pf_cleanup:
 	rte_free(pf_dev);
 sym_tbl_cleanup:
@@ -966,6 +1006,8 @@
 	free(nfp_eth_table);
 hwinfo_cleanup:
 	free(hwinfo);
+cpp_cleanup:
+	nfp_cpp_free(cpp);
 error:
 	return ret;
 }
@@ -979,7 +1021,8 @@
 nfp_pf_secondary_init(struct rte_pci_device *pci_dev)
 {
 	int i;
-	int err;
+	int err = 0;
+	int ret = 0;
 	int total_ports;
 	struct nfp_cpp *cpp;
 	struct nfp_net_hw *hw;
@@ -1017,6 +1060,11 @@
 	}
 
 	total_ports = nfp_rtsym_read_le(sym_tbl, "nfd_cfg_pf0_num_ports", &err);
+	if (err != 0 || total_ports <= 0 || total_ports > 8) {
+		PMD_INIT_LOG(ERR, "nfd_cfg_pf0_num_ports symbol with wrong value");
+		ret = -ENODEV;
+		goto sym_tbl_cleanup;
+	}
 
 	for (i = 0; i < total_ports; i++) {
 		struct rte_eth_dev *eth_dev;
@@ -1030,7 +1078,8 @@
 		if (eth_dev == NULL) {
 			RTE_LOG(ERR, EAL,
 				"secondary process attach failed, ethdev doesn't exist");
-			return -ENODEV;
+			ret = -ENODEV;
+			break;
 		}
 
 		hw = NFP_NET_DEV_PRIVATE_TO_HW(eth_dev->data->dev_private);
@@ -1046,7 +1095,10 @@
 	/* Register the CPP bridge service for the secondary too */
 	nfp_register_cpp_service(cpp);
 
-	return 0;
+sym_tbl_cleanup:
+	free(sym_tbl);
+
+	return ret;
 }
 
 static int
-- 
1.8.3.1



* [PATCH v9 02/12] net/nfp: simplify initialization and remove dead code
  2022-09-15 10:44 [PATCH v9 00/12] preparation for the rte_flow offload of nfp PMD Chaoyong He
  2022-09-15 10:44 ` [PATCH v9 01/12] net/nfp: move app specific attributes to own struct Chaoyong He
@ 2022-09-15 10:44 ` Chaoyong He
  2022-09-15 10:44 ` [PATCH v9 03/12] net/nfp: move app specific init logic to own function Chaoyong He
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 22+ messages in thread
From: Chaoyong He @ 2022-09-15 10:44 UTC
  To: dev; +Cc: oss-drivers, niklas.soderlund, Chaoyong He

Calling nfp_net_init() is only done for the corenic firmware flavor
and it is guaranteed to always be called from the primary process,
so the explicit check for RTE_PROC_PRIMARY can be dropped.

The callers of nfp_net_init() already free the resources when it
fails, so remove the unnecessary cleanup logic inside it.
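
As a rough illustration of that ownership rule (a sketch only; struct
port, port_init() and init_all_ports() are hypothetical names and
calloc()/free() stand in for the DPDK allocators): the per-port init
simply returns an error, and the caller that allocated the ports
releases all of them in a single cleanup path.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_PORTS 8

struct port {
	int idx;
};

/*
 * Hypothetical per-port init. It frees nothing on failure and just
 * reports the error. fail_at simulates a failure on one port.
 */
static int
port_init(struct port *p, int fail_at)
{
	if (p->idx == fail_at)
		return -ENOMEM;

	return 0;
}

/*
 * The caller owns every allocation, so a failure anywhere is unwound
 * here. The ports array must be NULL-initialized by the caller.
 */
static int
init_all_ports(struct port *ports[MAX_PORTS], int count, int fail_at)
{
	int i;
	int ret;

	assert(count >= 0 && count <= MAX_PORTS);

	for (i = 0; i < count; i++) {
		ports[i] = calloc(1, sizeof(*ports[i]));
		if (ports[i] == NULL) {
			ret = -ENOMEM;
			goto port_cleanup;
		}

		ports[i]->idx = i;
		ret = port_init(ports[i], fail_at);
		if (ret != 0)
			goto port_cleanup;
	}

	return 0;

port_cleanup:
	/* free(NULL) is a no-op, so untouched slots are safe to visit. */
	for (i = 0; i < count; i++) {
		free(ports[i]);
		ports[i] = NULL;
	}

	return ret;
}
```

Keeping the release in one place avoids a double free when both the
callee and the caller try to unwind the same resource.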

While at it, remove the unused member is_phyport from struct nfp_net_hw.

Signed-off-by: Chaoyong He <chaoyong.he@corigine.com>
Reviewed-by: Niklas Söderlund <niklas.soderlund@corigine.com>
---
 drivers/net/nfp/nfp_common.h |  1 -
 drivers/net/nfp/nfp_ethdev.c | 40 +++++++++++-----------------------------
 2 files changed, 11 insertions(+), 30 deletions(-)

diff --git a/drivers/net/nfp/nfp_common.h b/drivers/net/nfp/nfp_common.h
index bea5f95..6af8481 100644
--- a/drivers/net/nfp/nfp_common.h
+++ b/drivers/net/nfp/nfp_common.h
@@ -235,7 +235,6 @@ struct nfp_net_hw {
 	uint8_t idx;
 	/* Internal port number as seen from NFP */
 	uint8_t nfp_idx;
-	bool	is_phyport;
 
 	union eth_table_entry *eth_table;
 
diff --git a/drivers/net/nfp/nfp_ethdev.c b/drivers/net/nfp/nfp_ethdev.c
index 40d21e4..5c96f0b 100644
--- a/drivers/net/nfp/nfp_ethdev.c
+++ b/drivers/net/nfp/nfp_ethdev.c
@@ -417,7 +417,6 @@
 	uint32_t start_q;
 	int stride = 4;
 	int port = 0;
-	int err;
 
 	PMD_INIT_FUNC_TRACE();
 
@@ -452,10 +451,6 @@
 	PMD_INIT_LOG(DEBUG, "Working with physical port number: %d, "
 			"NFP internal port number: %d", port, hw->nfp_idx);
 
-	/* For secondary processes, the primary has done all the work */
-	if (rte_eal_process_type() != RTE_PROC_PRIMARY)
-		return 0;
-
 	rte_eth_copy_pci_info(eth_dev, pci_dev);
 
 	hw->device_id = pci_dev->id.device_id;
@@ -506,8 +501,7 @@
 		break;
 	default:
 		PMD_DRV_LOG(ERR, "nfp_net: no device ID matching");
-		err = -ENODEV;
-		goto dev_err_ctrl_map;
+		return -ENODEV;
 	}
 
 	PMD_INIT_LOG(DEBUG, "tx_bar_off: 0x%" PRIx64 "", tx_bar_off);
@@ -573,8 +567,7 @@
 					       RTE_ETHER_ADDR_LEN, 0);
 	if (eth_dev->data->mac_addrs == NULL) {
 		PMD_INIT_LOG(ERR, "Failed to space for MAC address");
-		err = -ENOMEM;
-		goto dev_err_queues_map;
+		return -ENOMEM;
 	}
 
 	nfp_net_pf_read_mac(app_fw_nic, port);
@@ -604,24 +597,15 @@
 		     hw->mac_addr[0], hw->mac_addr[1], hw->mac_addr[2],
 		     hw->mac_addr[3], hw->mac_addr[4], hw->mac_addr[5]);
 
-	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
-		/* Registering LSC interrupt handler */
-		rte_intr_callback_register(pci_dev->intr_handle,
-				nfp_net_dev_interrupt_handler, (void *)eth_dev);
-		/* Telling the firmware about the LSC interrupt entry */
-		nn_cfg_writeb(hw, NFP_NET_CFG_LSC, NFP_NET_IRQ_LSC_IDX);
-		/* Recording current stats counters values */
-		nfp_net_stats_reset(eth_dev);
-	}
+	/* Registering LSC interrupt handler */
+	rte_intr_callback_register(pci_dev->intr_handle,
+			nfp_net_dev_interrupt_handler, (void *)eth_dev);
+	/* Telling the firmware about the LSC interrupt entry */
+	nn_cfg_writeb(hw, NFP_NET_CFG_LSC, NFP_NET_IRQ_LSC_IDX);
+	/* Recording current stats counters values */
+	nfp_net_stats_reset(eth_dev);
 
 	return 0;
-
-dev_err_queues_map:
-		nfp_cpp_area_free(hw->hwqueues_area);
-dev_err_ctrl_map:
-		nfp_cpp_area_free(hw->ctrl_area);
-
-	return err;
 }
 
 #define DEFAULT_FW_PATH       "/lib/firmware/netronome"
@@ -820,7 +804,6 @@
 		hw->eth_dev = eth_dev;
 		hw->idx = i;
 		hw->nfp_idx = nfp_eth_table->ports[i].index;
-		hw->is_phyport = true;
 
 		eth_dev->device = &pf_dev->pci_dev->device;
 
@@ -886,8 +869,7 @@
 
 	if (cpp == NULL) {
 		PMD_INIT_LOG(ERR, "A CPP handle can not be obtained");
-		ret = -EIO;
-		goto error;
+		return -EIO;
 	}
 
 	hwinfo = nfp_hwinfo_read(cpp);
@@ -1008,7 +990,7 @@
 	free(hwinfo);
 cpp_cleanup:
 	nfp_cpp_free(cpp);
-error:
+
 	return ret;
 }
 
-- 
1.8.3.1



* [PATCH v9 03/12] net/nfp: move app specific init logic to own function
  2022-09-15 10:44 [PATCH v9 00/12] preparation for the rte_flow offload of nfp PMD Chaoyong He
  2022-09-15 10:44 ` [PATCH v9 01/12] net/nfp: move app specific attributes to own struct Chaoyong He
  2022-09-15 10:44 ` [PATCH v9 02/12] net/nfp: simplify initialization and remove dead code Chaoyong He
@ 2022-09-15 10:44 ` Chaoyong He
  2022-09-15 10:44 ` [PATCH v9 04/12] net/nfp: add initial flower firmware support Chaoyong He
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 22+ messages in thread
From: Chaoyong He @ 2022-09-15 10:44 UTC
  To: dev; +Cc: oss-drivers, niklas.soderlund, Chaoyong He

The NFP card can load different firmware applications.
This commit moves the secondary-process init logic of the
CoreNIC app into its own function.

Signed-off-by: Chaoyong He <chaoyong.he@corigine.com>
Reviewed-by: Niklas Söderlund <niklas.soderlund@corigine.com>
---
 drivers/net/nfp/nfp_ethdev.c | 90 +++++++++++++++++++++++++++++---------------
 1 file changed, 60 insertions(+), 30 deletions(-)

diff --git a/drivers/net/nfp/nfp_ethdev.c b/drivers/net/nfp/nfp_ethdev.c
index 5c96f0b..0d09a69 100644
--- a/drivers/net/nfp/nfp_ethdev.c
+++ b/drivers/net/nfp/nfp_ethdev.c
@@ -994,6 +994,49 @@
 	return ret;
 }
 
+static int
+nfp_secondary_init_app_fw_nic(struct rte_pci_device *pci_dev,
+		struct nfp_rtsym_table *sym_tbl,
+		struct nfp_cpp *cpp)
+{
+	int i;
+	int err = 0;
+	int ret = 0;
+	int total_vnics;
+	struct nfp_net_hw *hw;
+
+	/* Read the number of vNIC's created for the PF */
+	total_vnics = nfp_rtsym_read_le(sym_tbl, "nfd_cfg_pf0_num_ports", &err);
+	if (err != 0 || total_vnics <= 0 || total_vnics > 8) {
+		PMD_INIT_LOG(ERR, "nfd_cfg_pf0_num_ports symbol with wrong value");
+		return -ENODEV;
+	}
+
+	for (i = 0; i < total_vnics; i++) {
+		struct rte_eth_dev *eth_dev;
+		char port_name[RTE_ETH_NAME_MAX_LEN];
+		snprintf(port_name, sizeof(port_name), "%s_port%d",
+				pci_dev->device.name, i);
+
+		PMD_INIT_LOG(DEBUG, "Secondary attaching to port %s", port_name);
+		eth_dev = rte_eth_dev_attach_secondary(port_name);
+		if (eth_dev == NULL) {
+			PMD_INIT_LOG(ERR, "Secondary process attach to port %s failed", port_name);
+			ret = -ENODEV;
+			break;
+		}
+
+		eth_dev->process_private = cpp;
+		hw = NFP_NET_DEV_PRIVATE_TO_HW(eth_dev->data->dev_private);
+		if (nfp_net_ethdev_ops_mount(hw, eth_dev))
+			return -EINVAL;
+
+		rte_eth_dev_probing_finish(eth_dev);
+	}
+
+	return ret;
+}
+
 /*
  * When attaching to the NFP4000/6000 PF on a secondary process there
  * is no need to initialise the PF again. Only minimal work is required
@@ -1002,12 +1045,10 @@
 static int
 nfp_pf_secondary_init(struct rte_pci_device *pci_dev)
 {
-	int i;
 	int err = 0;
 	int ret = 0;
-	int total_ports;
 	struct nfp_cpp *cpp;
-	struct nfp_net_hw *hw;
+	enum nfp_app_fw_id app_fw_id;
 	struct nfp_rtsym_table *sym_tbl;
 
 	if (pci_dev == NULL)
@@ -1041,37 +1082,26 @@
 		return -EIO;
 	}
 
-	total_ports = nfp_rtsym_read_le(sym_tbl, "nfd_cfg_pf0_num_ports", &err);
-	if (err != 0 || total_ports <= 0 || total_ports > 8) {
-		PMD_INIT_LOG(ERR, "nfd_cfg_pf0_num_ports symbol with wrong value");
-		ret = -ENODEV;
+	/* Read the app ID of the firmware loaded */
+	app_fw_id = nfp_rtsym_read_le(sym_tbl, "_pf0_net_app_id", &err);
+	if (err != 0) {
+		PMD_INIT_LOG(ERR, "Couldn't read app_fw_id from fw");
 		goto sym_tbl_cleanup;
 	}
 
-	for (i = 0; i < total_ports; i++) {
-		struct rte_eth_dev *eth_dev;
-		char port_name[RTE_ETH_NAME_MAX_LEN];
-
-		snprintf(port_name, sizeof(port_name), "%s_port%d",
-			 pci_dev->device.name, i);
-
-		PMD_DRV_LOG(DEBUG, "Secondary attaching to port %s", port_name);
-		eth_dev = rte_eth_dev_attach_secondary(port_name);
-		if (eth_dev == NULL) {
-			RTE_LOG(ERR, EAL,
-				"secondary process attach failed, ethdev doesn't exist");
-			ret = -ENODEV;
-			break;
+	switch (app_fw_id) {
+	case NFP_APP_FW_CORE_NIC:
+		PMD_INIT_LOG(INFO, "Initializing coreNIC");
+		ret = nfp_secondary_init_app_fw_nic(pci_dev, sym_tbl, cpp);
+		if (ret != 0) {
+			PMD_INIT_LOG(ERR, "Could not initialize coreNIC!");
+			goto sym_tbl_cleanup;
 		}
-
-		hw = NFP_NET_DEV_PRIVATE_TO_HW(eth_dev->data->dev_private);
-
-		if (nfp_net_ethdev_ops_mount(hw, eth_dev))
-			return -EINVAL;
-
-		eth_dev->process_private = cpp;
-
-		rte_eth_dev_probing_finish(eth_dev);
+		break;
+	default:
+		PMD_INIT_LOG(ERR, "Unsupported Firmware loaded");
+		ret = -EINVAL;
+		goto sym_tbl_cleanup;
 	}
 
 	/* Register the CPP bridge service for the secondary too */
-- 
1.8.3.1



* [PATCH v9 04/12] net/nfp: add initial flower firmware support
  2022-09-15 10:44 [PATCH v9 00/12] preparation for the rte_flow offload of nfp PMD Chaoyong He
                   ` (2 preceding siblings ...)
  2022-09-15 10:44 ` [PATCH v9 03/12] net/nfp: move app specific init logic to own function Chaoyong He
@ 2022-09-15 10:44 ` Chaoyong He
  2022-09-15 10:44 ` [PATCH v9 05/12] net/nfp: add flower PF setup logic Chaoyong He
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 22+ messages in thread
From: Chaoyong He @ 2022-09-15 10:44 UTC
  To: dev; +Cc: oss-drivers, niklas.soderlund, Chaoyong He, Heinrich Kuhn

Add the basic probing infrastructure to support the flower firmware
application.

Add the CPP service, which is used by some user tools.

Signed-off-by: Chaoyong He <chaoyong.he@corigine.com>
Signed-off-by: Heinrich Kuhn <heinrich.kuhn@corigine.com>
Reviewed-by: Niklas Söderlund <niklas.soderlund@corigine.com>
---
 doc/guides/nics/nfp.rst                | 13 +++++
 doc/guides/rel_notes/release_22_11.rst |  7 +++
 drivers/net/nfp/flower/nfp_flower.c    | 45 +++++++++++++++++
 drivers/net/nfp/flower/nfp_flower.h    | 16 +++++++
 drivers/net/nfp/meson.build            |  1 +
 drivers/net/nfp/nfp_common.h           |  1 +
 drivers/net/nfp/nfp_cpp_bridge.c       | 88 +++++++++++++++++++++++++++++-----
 drivers/net/nfp/nfp_cpp_bridge.h       |  6 ++-
 drivers/net/nfp/nfp_ethdev.c           | 31 +++++++++++-
 9 files changed, 192 insertions(+), 16 deletions(-)
 create mode 100644 drivers/net/nfp/flower/nfp_flower.c
 create mode 100644 drivers/net/nfp/flower/nfp_flower.h

diff --git a/doc/guides/nics/nfp.rst b/doc/guides/nics/nfp.rst
index 55539ac..4faab39 100644
--- a/doc/guides/nics/nfp.rst
+++ b/doc/guides/nics/nfp.rst
@@ -181,3 +181,16 @@ System configuration
    -k option shows the device driver, if any, that devices are bound to.
    Depending on the modules loaded at this point the new PCI devices may be
    bound to nfp_netvf driver.
+
+
+Flow offload
+------------
+
+Using the flower firmware application, some types of Netronome SmartNICs can
+offload flows to the card.
+
+The flower firmware application requires the PMD to run two services:
+
+	* PF vNIC service: handles the feedback traffic.
+	* ctrl vNIC service: carries control messages between the PMD and the
+	  firmware.
diff --git a/doc/guides/rel_notes/release_22_11.rst b/doc/guides/rel_notes/release_22_11.rst
index f601617..6a666aa 100644
--- a/doc/guides/rel_notes/release_22_11.rst
+++ b/doc/guides/rel_notes/release_22_11.rst
@@ -55,6 +55,13 @@ New Features
      Also, make sure to start the actual text at the margin.
      =======================================================
 
+* **Updated Netronome nfp driver.**
+
+  Added the data structures and logic needed to support rte_flow offload:
+
+    * Added the support of flower firmware.
+    * Added the flower service infrastructure.
+
 
 Removed Items
 -------------
diff --git a/drivers/net/nfp/flower/nfp_flower.c b/drivers/net/nfp/flower/nfp_flower.c
new file mode 100644
index 0000000..87cb922
--- /dev/null
+++ b/drivers/net/nfp/flower/nfp_flower.c
@@ -0,0 +1,45 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2022 Corigine, Inc.
+ * All rights reserved.
+ */
+
+#include <rte_common.h>
+#include <ethdev_driver.h>
+#include <rte_service_component.h>
+#include <rte_malloc.h>
+#include <ethdev_pci.h>
+#include <ethdev_driver.h>
+
+#include "../nfp_common.h"
+#include "../nfp_logs.h"
+#include "../nfp_ctrl.h"
+#include "../nfp_cpp_bridge.h"
+#include "nfp_flower.h"
+
+int
+nfp_init_app_fw_flower(struct nfp_pf_dev *pf_dev)
+{
+	unsigned int numa_node;
+	struct nfp_app_fw_flower *app_fw_flower;
+
+	numa_node = rte_socket_id();
+
+	/* Allocate memory for the Flower app */
+	app_fw_flower = rte_zmalloc_socket("nfp_app_fw_flower", sizeof(*app_fw_flower),
+			RTE_CACHE_LINE_SIZE, numa_node);
+	if (app_fw_flower == NULL) {
+		PMD_INIT_LOG(ERR, "Could not malloc app fw flower");
+		return -ENOMEM;
+	}
+
+	pf_dev->app_fw_priv = app_fw_flower;
+
+	return 0;
+}
+
+int
+nfp_secondary_init_app_fw_flower(__rte_unused struct nfp_cpp *cpp)
+{
+	PMD_INIT_LOG(ERR, "Flower firmware not supported");
+	return -ENOTSUP;
+}
diff --git a/drivers/net/nfp/flower/nfp_flower.h b/drivers/net/nfp/flower/nfp_flower.h
new file mode 100644
index 0000000..8b9ef95
--- /dev/null
+++ b/drivers/net/nfp/flower/nfp_flower.h
@@ -0,0 +1,16 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2022 Corigine, Inc.
+ * All rights reserved.
+ */
+
+#ifndef _NFP_FLOWER_H_
+#define _NFP_FLOWER_H_
+
+/* The flower application's private structure */
+struct nfp_app_fw_flower {
+};
+
+int nfp_init_app_fw_flower(struct nfp_pf_dev *pf_dev);
+int nfp_secondary_init_app_fw_flower(struct nfp_cpp *cpp);
+
+#endif /* _NFP_FLOWER_H_ */
diff --git a/drivers/net/nfp/meson.build b/drivers/net/nfp/meson.build
index 810f02a..7ae3115 100644
--- a/drivers/net/nfp/meson.build
+++ b/drivers/net/nfp/meson.build
@@ -6,6 +6,7 @@ if not is_linux or not dpdk_conf.get('RTE_ARCH_64')
     reason = 'only supported on 64-bit Linux'
 endif
 sources = files(
+        'flower/nfp_flower.c',
         'nfpcore/nfp_cpp_pcie_ops.c',
         'nfpcore/nfp_nsp.c',
         'nfpcore/nfp_cppcore.c',
diff --git a/drivers/net/nfp/nfp_common.h b/drivers/net/nfp/nfp_common.h
index 6af8481..cefe717 100644
--- a/drivers/net/nfp/nfp_common.h
+++ b/drivers/net/nfp/nfp_common.h
@@ -114,6 +114,7 @@
 /* Firmware application ID's */
 enum nfp_app_fw_id {
 	NFP_APP_FW_CORE_NIC               = 0x1,
+	NFP_APP_FW_FLOWER_NIC             = 0x3,
 };
 
 /* nfp_qcp_ptr - Read or Write Pointer of a queue */
diff --git a/drivers/net/nfp/nfp_cpp_bridge.c b/drivers/net/nfp/nfp_cpp_bridge.c
index 0922ea9..155628d 100644
--- a/drivers/net/nfp/nfp_cpp_bridge.c
+++ b/drivers/net/nfp/nfp_cpp_bridge.c
@@ -28,22 +28,86 @@
 static int nfp_cpp_bridge_serve_write(int sockfd, struct nfp_cpp *cpp);
 static int nfp_cpp_bridge_serve_read(int sockfd, struct nfp_cpp *cpp);
 static int nfp_cpp_bridge_serve_ioctl(int sockfd, struct nfp_cpp *cpp);
+static int nfp_cpp_bridge_service_func(void *args);
 
-void nfp_register_cpp_service(struct nfp_cpp *cpp)
+int
+nfp_map_service(uint32_t service_id)
 {
-	uint32_t *cpp_service_id = NULL;
-	struct rte_service_spec service;
+	int32_t ret;
+	uint32_t slcore = 0;
+	int32_t slcore_count;
+	uint8_t service_count;
+	const char *service_name;
+	uint32_t slcore_array[RTE_MAX_LCORE];
+	uint8_t min_service_count = UINT8_MAX;
+
+	slcore_count = rte_service_lcore_list(slcore_array, RTE_MAX_LCORE);
+	if (slcore_count <= 0) {
+		PMD_INIT_LOG(DEBUG, "No service cores found");
+		return -ENOENT;
+	}
+
+	/*
+	 * Find a service core with the least number of services already
+	 * registered to it
+	 */
+	while (slcore_count--) {
+		service_count = rte_service_lcore_count_services(slcore_array[slcore_count]);
+		if (service_count < min_service_count) {
+			slcore = slcore_array[slcore_count];
+			min_service_count = service_count;
+		}
+	}
+
+	service_name = rte_service_get_name(service_id);
+	PMD_INIT_LOG(INFO, "Mapping service %s to core %u", service_name, slcore);
 
-	memset(&service, 0, sizeof(struct rte_service_spec));
-	snprintf(service.name, sizeof(service.name), "nfp_cpp_service");
-	service.callback = nfp_cpp_bridge_service_func;
-	service.callback_userdata = (void *)cpp;
+	ret = rte_service_map_lcore_set(service_id, slcore, 1);
+	if (ret != 0) {
+		PMD_INIT_LOG(DEBUG, "Could not map service %s", service_name);
+		return -ENOENT;
+	}
 
-	if (rte_service_component_register(&service,
-					   cpp_service_id))
-		RTE_LOG(WARNING, PMD, "NFP CPP bridge service register() failed");
+	rte_service_runstate_set(service_id, 1);
+	rte_service_component_runstate_set(service_id, 1);
+	rte_service_lcore_start(slcore);
+	if (rte_service_may_be_active(slcore))
+		PMD_INIT_LOG(INFO, "The service %s is running", service_name);
 	else
-		RTE_LOG(DEBUG, PMD, "NFP CPP bridge service registered");
+		PMD_INIT_LOG(ERR, "The service %s is not running", service_name);
+
+	return 0;
+}
+
+int
+nfp_enable_cpp_service(struct nfp_cpp *cpp)
+{
+	int ret;
+	uint32_t service_id = 0;
+	struct rte_service_spec cpp_service = {
+		.name         = "nfp_cpp_service",
+		.callback     = nfp_cpp_bridge_service_func,
+	};
+
+	cpp_service.callback_userdata = (void *)cpp;
+
+	/* Register the cpp service */
+	ret = rte_service_component_register(&cpp_service, &service_id);
+	if (ret != 0) {
+		PMD_INIT_LOG(WARNING, "Could not register nfp cpp service");
+		return -EINVAL;
+	}
+
+	PMD_INIT_LOG(INFO, "NFP cpp service registered");
+
+	/* Map it to an available service core */
+	ret = nfp_map_service(service_id);
+	if (ret != 0) {
+		PMD_INIT_LOG(DEBUG, "Could not map nfp cpp service");
+		return -EINVAL;
+	}
+
+	return 0;
 }
 
 /*
@@ -307,7 +371,7 @@ void nfp_register_cpp_service(struct nfp_cpp *cpp)
  * unaware of the CPP bridge performing the NFP kernel char driver for CPP
  * accesses.
  */
-int32_t
+static int
 nfp_cpp_bridge_service_func(void *args)
 {
 	struct sockaddr address;
diff --git a/drivers/net/nfp/nfp_cpp_bridge.h b/drivers/net/nfp/nfp_cpp_bridge.h
index aea5fdc..7fee3a9 100644
--- a/drivers/net/nfp/nfp_cpp_bridge.h
+++ b/drivers/net/nfp/nfp_cpp_bridge.h
@@ -16,6 +16,8 @@
 #ifndef _NFP_CPP_BRIDGE_H_
 #define _NFP_CPP_BRIDGE_H_
 
+#include "nfp_common.h"
+
 #define NFP_CPP_MEMIO_BOUNDARY	(1 << 20)
 #define NFP_BRIDGE_OP_READ	20
 #define NFP_BRIDGE_OP_WRITE	30
@@ -24,8 +26,8 @@
 #define NFP_IOCTL 'n'
 #define NFP_IOCTL_CPP_IDENTIFICATION _IOW(NFP_IOCTL, 0x8f, uint32_t)
 
-void nfp_register_cpp_service(struct nfp_cpp *cpp);
-int32_t nfp_cpp_bridge_service_func(void *args);
+int nfp_enable_cpp_service(struct nfp_cpp *cpp);
+int nfp_map_service(uint32_t service_id);
 
 #endif /* _NFP_CPP_BRIDGE_H_ */
 /*
diff --git a/drivers/net/nfp/nfp_ethdev.c b/drivers/net/nfp/nfp_ethdev.c
index 0d09a69..ddfe495 100644
--- a/drivers/net/nfp/nfp_ethdev.c
+++ b/drivers/net/nfp/nfp_ethdev.c
@@ -38,6 +38,8 @@
 #include "nfp_ctrl.h"
 #include "nfp_cpp_bridge.h"
 
+#include "flower/nfp_flower.h"
+
 static int
 nfp_net_pf_read_mac(struct nfp_app_fw_nic *app_fw_nic, int port)
 {
@@ -967,6 +969,14 @@
 			goto hwqueues_cleanup;
 		}
 		break;
+	case NFP_APP_FW_FLOWER_NIC:
+		PMD_INIT_LOG(INFO, "Initializing Flower");
+		ret = nfp_init_app_fw_flower(pf_dev);
+		if (ret != 0) {
+			PMD_INIT_LOG(ERR, "Could not initialize Flower!");
+			goto hwqueues_cleanup;
+		}
+		break;
 	default:
 		PMD_INIT_LOG(ERR, "Unsupported Firmware loaded");
 		ret = -EINVAL;
@@ -974,7 +984,12 @@
 	}
 
 	/* register the CPP bridge service here for primary use */
-	nfp_register_cpp_service(pf_dev->cpp);
+	ret = nfp_enable_cpp_service(pf_dev->cpp);
+	if (ret != 0) {
+		PMD_INIT_LOG(ERR, "Enable cpp service failed.");
+		ret = -EINVAL;
+		goto hwqueues_cleanup;
+	}
 
 	return 0;
 
@@ -1098,6 +1113,14 @@
 			goto sym_tbl_cleanup;
 		}
 		break;
+	case NFP_APP_FW_FLOWER_NIC:
+		PMD_INIT_LOG(INFO, "Initializing Flower");
+		ret = nfp_secondary_init_app_fw_flower(cpp);
+		if (ret != 0) {
+			PMD_INIT_LOG(ERR, "Could not initialize Flower!");
+			goto sym_tbl_cleanup;
+		}
+		break;
 	default:
 		PMD_INIT_LOG(ERR, "Unsupported Firmware loaded");
 		ret = -EINVAL;
@@ -1105,7 +1128,11 @@
 	}
 
 	/* Register the CPP bridge service for the secondary too */
-	nfp_register_cpp_service(cpp);
+	ret = nfp_enable_cpp_service(cpp);
+	if (ret != 0) {
+		PMD_INIT_LOG(ERR, "Enable cpp service failed.");
+		ret = -EINVAL;
+	}
 
 sym_tbl_cleanup:
 	free(sym_tbl);
-- 
1.8.3.1
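
The service-core selection performed by nfp_map_service() in the patch above can be sketched without DPDK as follows. This is only an illustration under stated assumptions: `struct slcore_load` and `pick_least_loaded_slcore()` are hypothetical names that stand in for the array filled by rte_service_lcore_list() and the per-core counts returned by rte_service_lcore_count_services().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the data the patch obtains from the service
 * core API: the caller supplies the per-core counts directly.
 */
struct slcore_load {
	uint32_t lcore_id;      /* service core identifier */
	uint8_t service_count;  /* services already mapped to this core */
};

/*
 * Return the id of the service core with the fewest mapped services,
 * scanning from the end of the array as the patch does. Returns -1 when
 * no core qualifies (empty list, or every count equals UINT8_MAX).
 */
int64_t
pick_least_loaded_slcore(const struct slcore_load *cores, size_t count)
{
	int64_t best = -1;
	uint8_t min_service_count = UINT8_MAX;

	assert(cores != NULL || count == 0);

	while (count-- > 0) {
		/* Strict '<' keeps the first minimum seen, i.e. the last in the array */
		if (cores[count].service_count < min_service_count) {
			best = cores[count].lcore_id;
			min_service_count = cores[count].service_count;
		}
	}

	return best;
}
```

Unlike the patch, which leaves slcore at its initial value of 0 when no count is below UINT8_MAX, the sketch reports that case explicitly with -1 so a caller could refuse to map the service.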



* [PATCH v9 05/12] net/nfp: add flower PF setup logic
  2022-09-15 10:44 [PATCH v9 00/12] preparation for the rte_flow offload of nfp PMD Chaoyong He
                   ` (3 preceding siblings ...)
  2022-09-15 10:44 ` [PATCH v9 04/12] net/nfp: add initial flower firmware support Chaoyong He
@ 2022-09-15 10:44 ` Chaoyong He
  2022-09-20 14:57   ` Ferruh Yigit
  2022-09-15 10:44 ` [PATCH v9 06/12] net/nfp: add flower PF related routines Chaoyong He
                   ` (7 subsequent siblings)
  12 siblings, 1 reply; 22+ messages in thread
From: Chaoyong He @ 2022-09-15 10:44 UTC (permalink / raw)
  To: dev; +Cc: oss-drivers, niklas.soderlund, Chaoyong He

Adds the vNIC initialization logic for the flower PF vNIC. The flower
firmware application exposes this vNIC for the purposes of fallback
traffic in the switchdev use-case.

Adds minimal dev_ops for this PF vNIC device. Because the device is
being exposed externally to DPDK, it needs to implement a minimal set
of dev_ops.

Signed-off-by: Chaoyong He <chaoyong.he@corigine.com>
Reviewed-by: Niklas Söderlund <niklas.soderlund@corigine.com>
---
 drivers/net/nfp/flower/nfp_flower.c | 369 +++++++++++++++++++++++++++++++++++-
 drivers/net/nfp/flower/nfp_flower.h |   8 +
 drivers/net/nfp/nfp_common.h        |   3 +
 3 files changed, 377 insertions(+), 3 deletions(-)

diff --git a/drivers/net/nfp/flower/nfp_flower.c b/drivers/net/nfp/flower/nfp_flower.c
index 87cb922..34e60f8 100644
--- a/drivers/net/nfp/flower/nfp_flower.c
+++ b/drivers/net/nfp/flower/nfp_flower.c
@@ -14,12 +14,312 @@
 #include "../nfp_logs.h"
 #include "../nfp_ctrl.h"
 #include "../nfp_cpp_bridge.h"
+#include "../nfp_rxtx.h"
+#include "../nfpcore/nfp_mip.h"
+#include "../nfpcore/nfp_rtsym.h"
+#include "../nfpcore/nfp_nsp.h"
 #include "nfp_flower.h"
 
+#define MAX_PKT_BURST 32
+#define MBUF_PRIV_SIZE 128
+#define MEMPOOL_CACHE_SIZE 512
+#define DEFAULT_FLBUF_SIZE 9216
+
+#define PF_VNIC_NB_DESC 1024
+
+static const struct rte_eth_rxconf rx_conf = {
+	.rx_free_thresh = DEFAULT_RX_FREE_THRESH,
+	.rx_drop_en = 1,
+};
+
+static const struct rte_eth_txconf tx_conf = {
+	.tx_thresh = {
+		.pthresh  = DEFAULT_TX_PTHRESH,
+		.hthresh = DEFAULT_TX_HTHRESH,
+		.wthresh = DEFAULT_TX_WTHRESH,
+	},
+	.tx_free_thresh = DEFAULT_TX_FREE_THRESH,
+};
+
+static const struct eth_dev_ops nfp_flower_pf_vnic_ops = {
+	.dev_infos_get          = nfp_net_infos_get,
+};
+
+struct dp_packet {
+	struct rte_mbuf mbuf;
+	uint32_t source;
+};
+
+static void
+nfp_flower_pf_mp_init(__rte_unused struct rte_mempool *mp,
+		__rte_unused void *opaque_arg,
+		void *packet,
+		__rte_unused unsigned int i)
+{
+	struct dp_packet *pkt = packet;
+	/* Indicate that this pkt is from DPDK */
+	pkt->source = 3;
+}
+
+static struct rte_mempool *
+nfp_flower_pf_mp_create(void)
+{
+	uint32_t nb_mbufs;
+	unsigned int numa_node;
+	struct rte_mempool *pktmbuf_pool;
+	uint32_t n_rxd = PF_VNIC_NB_DESC;
+	uint32_t n_txd = PF_VNIC_NB_DESC;
+
+	nb_mbufs = RTE_MAX(n_rxd + n_txd + MAX_PKT_BURST + MEMPOOL_CACHE_SIZE, 81920U);
+
+	numa_node = rte_socket_id();
+	pktmbuf_pool = rte_pktmbuf_pool_create("flower_pf_mbuf_pool", nb_mbufs,
+			MEMPOOL_CACHE_SIZE, MBUF_PRIV_SIZE,
+			RTE_MBUF_DEFAULT_BUF_SIZE, numa_node);
+	if (pktmbuf_pool == NULL) {
+		PMD_INIT_LOG(ERR, "Cannot init pf vnic mbuf pool");
+		return NULL;
+	}
+
+	rte_mempool_obj_iter(pktmbuf_pool, nfp_flower_pf_mp_init, NULL);
+
+	return pktmbuf_pool;
+}
+
+static int
+nfp_flower_init_vnic_common(struct nfp_net_hw *hw, const char *vnic_type)
+{
+	uint32_t start_q;
+	uint64_t rx_bar_off;
+	uint64_t tx_bar_off;
+	const int stride = 4;
+	struct nfp_pf_dev *pf_dev;
+	struct rte_pci_device *pci_dev;
+
+	pf_dev = hw->pf_dev;
+	pci_dev = hw->pf_dev->pci_dev;
+
+	/* NFP can not handle DMA addresses requiring more than 40 bits */
+	if (rte_mem_check_dma_mask(40) != 0) {
+		PMD_INIT_LOG(ERR, "Device %s can not be used: restricted dma mask to 40 bits!",
+				pci_dev->device.name);
+		return -ENODEV;
+	}
+
+	hw->device_id = pci_dev->id.device_id;
+	hw->vendor_id = pci_dev->id.vendor_id;
+	hw->subsystem_device_id = pci_dev->id.subsystem_device_id;
+	hw->subsystem_vendor_id = pci_dev->id.subsystem_vendor_id;
+
+	PMD_INIT_LOG(DEBUG, "%s vNIC ctrl bar: %p", vnic_type, hw->ctrl_bar);
+
+	/* Read the number of available rx/tx queues from hardware */
+	hw->max_rx_queues = nn_cfg_readl(hw, NFP_NET_CFG_MAX_RXRINGS);
+	hw->max_tx_queues = nn_cfg_readl(hw, NFP_NET_CFG_MAX_TXRINGS);
+
+	/* Work out where in the BAR the queues start */
+	start_q = nn_cfg_readl(hw, NFP_NET_CFG_START_TXQ);
+	tx_bar_off = (uint64_t)start_q * NFP_QCP_QUEUE_ADDR_SZ;
+	start_q = nn_cfg_readl(hw, NFP_NET_CFG_START_RXQ);
+	rx_bar_off = (uint64_t)start_q * NFP_QCP_QUEUE_ADDR_SZ;
+
+	hw->tx_bar = pf_dev->hw_queues + tx_bar_off;
+	hw->rx_bar = pf_dev->hw_queues + rx_bar_off;
+
+	/* Get some of the read-only fields from the config BAR */
+	hw->ver = nn_cfg_readl(hw, NFP_NET_CFG_VERSION);
+	hw->cap = nn_cfg_readl(hw, NFP_NET_CFG_CAP);
+	hw->max_mtu = nn_cfg_readl(hw, NFP_NET_CFG_MAX_MTU);
+	/* Set the current MTU to the maximum supported */
+	hw->mtu = hw->max_mtu;
+	hw->flbufsz = DEFAULT_FLBUF_SIZE;
+
+	/* read the Rx offset configured from firmware */
+	if (NFD_CFG_MAJOR_VERSION_of(hw->ver) < 2)
+		hw->rx_offset = NFP_NET_RX_OFFSET;
+	else
+		hw->rx_offset = nn_cfg_readl(hw, NFP_NET_CFG_RX_OFFSET_ADDR);
+
+	hw->ctrl = 0;
+	hw->stride_rx = stride;
+	hw->stride_tx = stride;
+
+	/* Reuse cfg queue setup function */
+	nfp_net_cfg_queue_setup(hw);
+
+	PMD_INIT_LOG(INFO, "%s vNIC max_rx_queues: %u, max_tx_queues: %u",
+			vnic_type, hw->max_rx_queues, hw->max_tx_queues);
+
+	/* Initializing spinlock for reconfigs */
+	rte_spinlock_init(&hw->reconfig_lock);
+
+	return 0;
+}
+
+static int
+nfp_flower_init_pf_vnic(struct nfp_net_hw *hw)
+{
+	int ret;
+	uint16_t i;
+	uint16_t n_txq;
+	uint16_t n_rxq;
+	unsigned int numa_node;
+	struct rte_mempool *mp;
+	struct nfp_pf_dev *pf_dev;
+	struct rte_eth_dev *eth_dev;
+	struct nfp_app_fw_flower *app_fw_flower;
+
+	static const struct rte_eth_conf port_conf = {
+		.rxmode = {
+			.mq_mode  = RTE_ETH_MQ_RX_RSS,
+			.offloads = RTE_ETH_RX_OFFLOAD_CHECKSUM,
+		},
+	};
+
+	/* Set up some pointers here for ease of use */
+	pf_dev = hw->pf_dev;
+	app_fw_flower = NFP_PRIV_TO_APP_FW_FLOWER(pf_dev->app_fw_priv);
+
+	/*
+	 * Perform the "common" part of setting up a flower vNIC.
+	 * Mostly reading configuration from hardware.
+	 */
+	ret = nfp_flower_init_vnic_common(hw, "pf_vnic");
+	if (ret != 0) {
+		PMD_INIT_LOG(ERR, "Could not init pf vnic");
+		return -EINVAL;
+	}
+
+	hw->eth_dev = rte_eth_dev_allocate("nfp_pf_vnic");
+	if (hw->eth_dev == NULL) {
+		PMD_INIT_LOG(ERR, "Could not allocate pf vnic");
+		return -ENOMEM;
+	}
+
+	/* Grab the pointer to the newly created rte_eth_dev here */
+	eth_dev = hw->eth_dev;
+
+	numa_node = rte_socket_id();
+
+	/* Create a mbuf pool for the PF */
+	app_fw_flower->pf_pktmbuf_pool = nfp_flower_pf_mp_create();
+	if (app_fw_flower->pf_pktmbuf_pool == NULL) {
+		PMD_INIT_LOG(ERR, "Could not create mempool for pf vnic");
+		ret = -ENOMEM;
+		goto port_release;
+	}
+
+	mp = app_fw_flower->pf_pktmbuf_pool;
+
+	/* Add Rx/Tx functions */
+	eth_dev->dev_ops = &nfp_flower_pf_vnic_ops;
+
+	/* PF vNIC gets a random MAC */
+	eth_dev->data->mac_addrs = rte_zmalloc("mac_addr", RTE_ETHER_ADDR_LEN, 0);
+	if (eth_dev->data->mac_addrs == NULL) {
+		PMD_INIT_LOG(ERR, "Could not allocate mac addr");
+		ret = -ENOMEM;
+		goto mempool_cleanup;
+	}
+
+	rte_eth_random_addr(eth_dev->data->mac_addrs->addr_bytes);
+	rte_eth_dev_probing_finish(eth_dev);
+
+	/* Configure the PF device now */
+	n_rxq = hw->max_rx_queues;
+	n_txq = hw->max_tx_queues;
+	memcpy(&eth_dev->data->dev_conf, &port_conf, sizeof(struct rte_eth_conf));
+	eth_dev->data->rx_queues = rte_zmalloc("ethdev->rx_queues",
+		sizeof(eth_dev->data->rx_queues[0]) * n_rxq, RTE_CACHE_LINE_SIZE);
+	if (eth_dev->data->rx_queues == NULL) {
+		PMD_INIT_LOG(ERR, "rte_zmalloc failed for PF vNIC rx queues");
+		ret = -ENOMEM;
+		goto mac_cleanup;
+	}
+
+	eth_dev->data->tx_queues = rte_zmalloc("ethdev->tx_queues",
+		sizeof(eth_dev->data->tx_queues[0]) * n_txq, RTE_CACHE_LINE_SIZE);
+	if (eth_dev->data->tx_queues == NULL) {
+		PMD_INIT_LOG(ERR, "rte_zmalloc failed for PF vNIC tx queues");
+		ret = -ENOMEM;
+		goto rx_queue_free;
+	}
+
+	/* Fill in some of the eth_dev fields */
+	eth_dev->device = &pf_dev->pci_dev->device;
+	eth_dev->data->nb_tx_queues = n_txq;
+	eth_dev->data->nb_rx_queues = n_rxq;
+	eth_dev->data->dev_private = hw;
+	eth_dev->data->dev_configured = 1;
+
+	/* Set up the Rx queues */
+	for (i = 0; i < n_rxq; i++) {
+		ret = nfp_net_rx_queue_setup(eth_dev, i, PF_VNIC_NB_DESC, numa_node,
+				&rx_conf, mp);
+		if (ret != 0) {
+			PMD_INIT_LOG(ERR, "Configure flower PF vNIC Rx queue %d failed", i);
+			goto rx_queue_cleanup;
+		}
+	}
+
+	/* Set up the Tx queues */
+	for (i = 0; i < n_txq; i++) {
+		ret = nfp_net_nfd3_tx_queue_setup(eth_dev, i, PF_VNIC_NB_DESC, numa_node,
+				&tx_conf);
+		if (ret != 0) {
+			PMD_INIT_LOG(ERR, "Configure flower PF vNIC Tx queue %d failed", i);
+			goto tx_queue_cleanup;
+		}
+	}
+
+	return 0;
+
+tx_queue_cleanup:
+	for (i = 0; i < n_txq; i++)
+		nfp_net_tx_queue_release(eth_dev, i);
+rx_queue_cleanup:
+	for (i = 0; i < n_rxq; i++)
+		nfp_net_rx_queue_release(eth_dev, i);
+	rte_free(eth_dev->data->tx_queues);
+rx_queue_free:
+	rte_free(eth_dev->data->rx_queues);
+mac_cleanup:
+	rte_free(eth_dev->data->mac_addrs);
+mempool_cleanup:
+	rte_mempool_free(mp);
+port_release:
+	rte_eth_dev_release_port(hw->eth_dev);
+
+	return ret;
+}
+
+__rte_unused static void
+nfp_flower_cleanup_pf_vnic(struct nfp_net_hw *hw)
+{
+	uint16_t i;
+	struct nfp_app_fw_flower *app_fw_flower;
+
+	app_fw_flower = NFP_PRIV_TO_APP_FW_FLOWER(hw->pf_dev->app_fw_priv);
+
+	for (i = 0; i < hw->max_tx_queues; i++)
+		nfp_net_tx_queue_release(hw->eth_dev, i);
+
+	for (i = 0; i < hw->max_rx_queues; i++)
+		nfp_net_rx_queue_release(hw->eth_dev, i);
+
+	rte_free(hw->eth_dev->data->tx_queues);
+	rte_free(hw->eth_dev->data->rx_queues);
+	rte_free(hw->eth_dev->data->mac_addrs);
+	rte_mempool_free(app_fw_flower->pf_pktmbuf_pool);
+	rte_eth_dev_release_port(hw->eth_dev);
+}
+
 int
 nfp_init_app_fw_flower(struct nfp_pf_dev *pf_dev)
 {
+	int ret;
 	unsigned int numa_node;
+	struct nfp_net_hw *pf_hw;
 	struct nfp_app_fw_flower *app_fw_flower;
 
 	numa_node = rte_socket_id();
@@ -34,12 +334,75 @@
 
 	pf_dev->app_fw_priv = app_fw_flower;
 
+	/* Allocate memory for the PF AND ctrl vNIC here (hence the * 2) */
+	pf_hw = rte_zmalloc_socket("nfp_pf_vnic", 2 * sizeof(struct nfp_net_adapter),
+			RTE_CACHE_LINE_SIZE, numa_node);
+	if (pf_hw == NULL) {
+		PMD_INIT_LOG(ERR, "Could not malloc nfp pf vnic");
+		ret = -ENOMEM;
+		goto app_cleanup;
+	}
+
+	/* Grab the number of physical ports present on hardware */
+	app_fw_flower->nfp_eth_table = nfp_eth_read_ports(pf_dev->cpp);
+	if (app_fw_flower->nfp_eth_table == NULL) {
+		PMD_INIT_LOG(ERR, "error reading nfp ethernet table");
+		ret = -EIO;
+		goto vnic_cleanup;
+	}
+
+	/* Map the PF ctrl bar */
+	pf_dev->ctrl_bar = nfp_rtsym_map(pf_dev->sym_tbl, "_pf0_net_bar0",
+			32768, &pf_dev->ctrl_area);
+	if (pf_dev->ctrl_bar == NULL) {
+		PMD_INIT_LOG(ERR, "Could not map the PF vNIC ctrl bar");
+		ret = -ENODEV;
+		goto eth_tbl_cleanup;
+	}
+
+	/* Fill in the PF vNIC and populate app struct */
+	app_fw_flower->pf_hw = pf_hw;
+	pf_hw->ctrl_bar = pf_dev->ctrl_bar;
+	pf_hw->pf_dev = pf_dev;
+	pf_hw->cpp = pf_dev->cpp;
+
+	ret = nfp_flower_init_pf_vnic(app_fw_flower->pf_hw);
+	if (ret != 0) {
+		PMD_INIT_LOG(ERR, "Could not initialize flower PF vNIC");
+		goto pf_cpp_area_cleanup;
+	}
+
 	return 0;
+
+pf_cpp_area_cleanup:
+	nfp_cpp_area_free(pf_dev->ctrl_area);
+eth_tbl_cleanup:
+	free(app_fw_flower->nfp_eth_table);
+vnic_cleanup:
+	rte_free(pf_hw);
+app_cleanup:
+	rte_free(app_fw_flower);
+
+	return ret;
 }
 
 int
-nfp_secondary_init_app_fw_flower(__rte_unused struct nfp_cpp *cpp)
+nfp_secondary_init_app_fw_flower(struct nfp_cpp *cpp)
 {
-	PMD_INIT_LOG(ERR, "Flower firmware not supported");
-	return -ENOTSUP;
+	struct rte_eth_dev *eth_dev;
+	const char *port_name = "pf_vnic_eth_dev";
+
+	PMD_INIT_LOG(DEBUG, "Secondary attaching to port %s", port_name);
+
+	eth_dev = rte_eth_dev_attach_secondary(port_name);
+	if (eth_dev == NULL) {
+		PMD_INIT_LOG(ERR, "Secondary process attach to port %s failed", port_name);
+		return -ENODEV;
+	}
+
+	eth_dev->process_private = cpp;
+	eth_dev->dev_ops = &nfp_flower_pf_vnic_ops;
+	rte_eth_dev_probing_finish(eth_dev);
+
+	return 0;
 }
diff --git a/drivers/net/nfp/flower/nfp_flower.h b/drivers/net/nfp/flower/nfp_flower.h
index 8b9ef95..981d88d 100644
--- a/drivers/net/nfp/flower/nfp_flower.h
+++ b/drivers/net/nfp/flower/nfp_flower.h
@@ -8,6 +8,14 @@
 
 /* The flower application's private structure */
 struct nfp_app_fw_flower {
+	/* Pointer to a mempool for the PF vNIC */
+	struct rte_mempool *pf_pktmbuf_pool;
+
+	/* Pointer to the PF vNIC */
+	struct nfp_net_hw *pf_hw;
+
+	/* the eth table as reported by firmware */
+	struct nfp_eth_table *nfp_eth_table;
 };
 
 int nfp_init_app_fw_flower(struct nfp_pf_dev *pf_dev);
diff --git a/drivers/net/nfp/nfp_common.h b/drivers/net/nfp/nfp_common.h
index cefe717..aa6fdd4 100644
--- a/drivers/net/nfp/nfp_common.h
+++ b/drivers/net/nfp/nfp_common.h
@@ -446,6 +446,9 @@ int nfp_net_rss_hash_conf_get(struct rte_eth_dev *dev,
 #define NFP_PRIV_TO_APP_FW_NIC(app_fw_priv)\
 	((struct nfp_app_fw_nic *)app_fw_priv)
 
+#define NFP_PRIV_TO_APP_FW_FLOWER(app_fw_priv)\
+	((struct nfp_app_fw_flower *)app_fw_priv)
+
 #endif /* _NFP_COMMON_H_ */
 /*
  * Local variables:
-- 
1.8.3.1



* [PATCH v9 06/12] net/nfp: add flower PF related routines
  2022-09-15 10:44 [PATCH v9 00/12] preparation for the rte_flow offload of nfp PMD Chaoyong He
                   ` (4 preceding siblings ...)
  2022-09-15 10:44 ` [PATCH v9 05/12] net/nfp: add flower PF setup logic Chaoyong He
@ 2022-09-15 10:44 ` Chaoyong He
  2022-09-15 10:44 ` [PATCH v9 07/12] net/nfp: add flower ctrl VNIC related logics Chaoyong He
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 22+ messages in thread
From: Chaoyong He @ 2022-09-15 10:44 UTC (permalink / raw)
  To: dev; +Cc: oss-drivers, niklas.soderlund, Chaoyong He

Adds the start/stop/close routines for the flower PF vNIC, and reuses
the existing configure/link_update routines.

Signed-off-by: Chaoyong He <chaoyong.he@corigine.com>
Reviewed-by: Niklas Söderlund <niklas.soderlund@corigine.com>
---
 drivers/net/nfp/flower/nfp_flower.c | 193 +++++++++++++++++++++++++++++++++++-
 1 file changed, 192 insertions(+), 1 deletion(-)

diff --git a/drivers/net/nfp/flower/nfp_flower.c b/drivers/net/nfp/flower/nfp_flower.c
index 34e60f8..24aa288 100644
--- a/drivers/net/nfp/flower/nfp_flower.c
+++ b/drivers/net/nfp/flower/nfp_flower.c
@@ -7,6 +7,7 @@
 #include <ethdev_driver.h>
 #include <rte_service_component.h>
 #include <rte_malloc.h>
+#include <rte_alarm.h>
 #include <ethdev_pci.h>
 #include <ethdev_driver.h>
 
@@ -41,8 +42,168 @@
 	.tx_free_thresh = DEFAULT_TX_FREE_THRESH,
 };
 
+static int
+nfp_flower_pf_start(struct rte_eth_dev *dev)
+{
+	int ret;
+	uint32_t new_ctrl;
+	uint32_t update = 0;
+	struct nfp_net_hw *hw;
+
+	hw = NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+	/* Disabling queues just in case... */
+	nfp_net_disable_queues(dev);
+
+	/* Enabling the required queues in the device */
+	nfp_net_enable_queues(dev);
+
+	new_ctrl = nfp_check_offloads(dev);
+
+	/* Writing configuration parameters in the device */
+	nfp_net_params_setup(hw);
+
+	nfp_net_rss_config_default(dev);
+	update |= NFP_NET_CFG_UPDATE_RSS;
+
+	if (hw->cap & NFP_NET_CFG_CTRL_RSS2)
+		new_ctrl |= NFP_NET_CFG_CTRL_RSS2;
+	else
+		new_ctrl |= NFP_NET_CFG_CTRL_RSS;
+
+	/* Enable device */
+	new_ctrl |= NFP_NET_CFG_CTRL_ENABLE;
+
+	update |= NFP_NET_CFG_UPDATE_GEN | NFP_NET_CFG_UPDATE_RING;
+
+	if (hw->cap & NFP_NET_CFG_CTRL_RINGCFG)
+		new_ctrl |= NFP_NET_CFG_CTRL_RINGCFG;
+
+	nn_cfg_writel(hw, NFP_NET_CFG_CTRL, new_ctrl);
+
+	/* If the reconfig fails, avoid changing the hw state */
+	ret = nfp_net_reconfig(hw, new_ctrl, update);
+	if (ret != 0) {
+		PMD_INIT_LOG(ERR, "Failed to reconfig PF vnic");
+		return -EIO;
+	}
+
+	hw->ctrl = new_ctrl;
+
+	/* Setup the freelist ring */
+	ret = nfp_net_rx_freelist_setup(dev);
+	if (ret != 0) {
+		PMD_INIT_LOG(ERR, "Error with flower PF vNIC freelist setup");
+		return -EIO;
+	}
+
+	return 0;
+}
+
+/* Stop device: disable rx and tx functions to allow for reconfiguring. */
+static int
+nfp_flower_pf_stop(struct rte_eth_dev *dev)
+{
+	uint16_t i;
+	struct nfp_net_hw *hw;
+	struct nfp_net_txq *this_tx_q;
+	struct nfp_net_rxq *this_rx_q;
+
+	hw = NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+	nfp_net_disable_queues(dev);
+
+	/* Clear queues */
+	for (i = 0; i < dev->data->nb_tx_queues; i++) {
+		this_tx_q = (struct nfp_net_txq *)dev->data->tx_queues[i];
+		nfp_net_reset_tx_queue(this_tx_q);
+	}
+
+	for (i = 0; i < dev->data->nb_rx_queues; i++) {
+		this_rx_q = (struct nfp_net_rxq *)dev->data->rx_queues[i];
+		nfp_net_reset_rx_queue(this_rx_q);
+	}
+
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY)
+		/* Configure the physical port down */
+		nfp_eth_set_configured(hw->cpp, hw->nfp_idx, 0);
+	else
+		nfp_eth_set_configured(dev->process_private, hw->nfp_idx, 0);
+
+	return 0;
+}
+
+/* Reset and stop device. The device can not be restarted. */
+static int
+nfp_flower_pf_close(struct rte_eth_dev *dev)
+{
+	uint16_t i;
+	struct nfp_net_hw *hw;
+	struct nfp_pf_dev *pf_dev;
+	struct nfp_net_txq *this_tx_q;
+	struct nfp_net_rxq *this_rx_q;
+	struct rte_pci_device *pci_dev;
+	struct nfp_app_fw_flower *app_fw_flower;
+
+	if (rte_eal_process_type() != RTE_PROC_PRIMARY)
+		return 0;
+
+	pf_dev = NFP_NET_DEV_PRIVATE_TO_PF(dev->data->dev_private);
+	hw = NFP_NET_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+	pci_dev = RTE_ETH_DEV_TO_PCI(dev);
+	app_fw_flower = NFP_PRIV_TO_APP_FW_FLOWER(pf_dev->app_fw_priv);
+
+	/*
+	 * We assume that the DPDK application is stopping all the
+	 * threads/queues before calling the device close function.
+	 */
+	nfp_net_disable_queues(dev);
+
+	/* Clear queues */
+	for (i = 0; i < dev->data->nb_tx_queues; i++) {
+		this_tx_q = (struct nfp_net_txq *)dev->data->tx_queues[i];
+		nfp_net_reset_tx_queue(this_tx_q);
+	}
+
+	for (i = 0; i < dev->data->nb_rx_queues; i++) {
+		this_rx_q = (struct nfp_net_rxq *)dev->data->rx_queues[i];
+		nfp_net_reset_rx_queue(this_rx_q);
+	}
+
+	/* Cancel possible impending LSC work here before releasing the port */
+	rte_eal_alarm_cancel(nfp_net_dev_interrupt_delayed_handler, (void *)dev);
+
+	nn_cfg_writeb(hw, NFP_NET_CFG_LSC, 0xff);
+
+	rte_eth_dev_release_port(dev);
+
+	/* Now it is safe to free all PF resources */
+	PMD_DRV_LOG(INFO, "Freeing PF resources");
+	nfp_cpp_area_free(pf_dev->ctrl_area);
+	nfp_cpp_area_free(pf_dev->hwqueues_area);
+	free(pf_dev->hwinfo);
+	free(pf_dev->sym_tbl);
+	nfp_cpp_free(pf_dev->cpp);
+	rte_free(app_fw_flower);
+	rte_free(pf_dev);
+
+	rte_intr_disable(pci_dev->intr_handle);
+
+	/* unregister callback func from eal lib */
+	rte_intr_callback_unregister(pci_dev->intr_handle,
+			nfp_net_dev_interrupt_handler, (void *)dev);
+
+	return 0;
+}
+
 static const struct eth_dev_ops nfp_flower_pf_vnic_ops = {
 	.dev_infos_get          = nfp_net_infos_get,
+	.link_update            = nfp_net_link_update,
+	.dev_configure          = nfp_net_configure,
+
+	.dev_start              = nfp_flower_pf_start,
+	.dev_stop               = nfp_flower_pf_stop,
+	.dev_close              = nfp_flower_pf_close,
 };
 
 struct dp_packet {
@@ -293,7 +454,7 @@ struct dp_packet {
 	return ret;
 }
 
-__rte_unused static void
+static void
 nfp_flower_cleanup_pf_vnic(struct nfp_net_hw *hw)
 {
 	uint16_t i;
@@ -314,6 +475,27 @@ struct dp_packet {
 	rte_eth_dev_release_port(hw->eth_dev);
 }
 
+static int
+nfp_flower_start_pf_vnic(struct nfp_net_hw *hw)
+{
+	int ret;
+	struct rte_eth_dev *dev;
+
+	dev = hw->eth_dev;
+
+	/* Start the device */
+	ret = nfp_flower_pf_start(dev);
+	if (ret != 0) {
+		PMD_INIT_LOG(ERR, "Could not start pf vnic");
+		return -EINVAL;
+	}
+
+	dev->data->dev_started = 1;
+	nfp_net_link_update(dev, 0);
+
+	return 0;
+}
+
 int
 nfp_init_app_fw_flower(struct nfp_pf_dev *pf_dev)
 {
@@ -372,8 +554,17 @@ struct dp_packet {
 		goto pf_cpp_area_cleanup;
 	}
 
+	/* Start the PF vNIC */
+	ret = nfp_flower_start_pf_vnic(app_fw_flower->pf_hw);
+	if (ret != 0) {
+		PMD_INIT_LOG(ERR, "Could not start flower PF vNIC");
+		goto pf_vnic_cleanup;
+	}
+
 	return 0;
 
+pf_vnic_cleanup:
+	nfp_flower_cleanup_pf_vnic(app_fw_flower->pf_hw);
 pf_cpp_area_cleanup:
 	nfp_cpp_area_free(pf_dev->ctrl_area);
 eth_tbl_cleanup:
-- 
1.8.3.1



* [PATCH v9 07/12] net/nfp: add flower ctrl VNIC related logics
  2022-09-15 10:44 [PATCH v9 00/12] preparation for the rte_flow offload of nfp PMD Chaoyong He
                   ` (5 preceding siblings ...)
  2022-09-15 10:44 ` [PATCH v9 06/12] net/nfp: add flower PF related routines Chaoyong He
@ 2022-09-15 10:44 ` Chaoyong He
  2022-09-20 14:56   ` Ferruh Yigit
  2022-09-15 10:44 ` [PATCH v9 08/12] net/nfp: move common rxtx function for flower use Chaoyong He
                   ` (5 subsequent siblings)
  12 siblings, 1 reply; 22+ messages in thread
From: Chaoyong He @ 2022-09-15 10:44 UTC (permalink / raw)
  To: dev; +Cc: oss-drivers, niklas.soderlund, Chaoyong He

Adds the setup/start logic for the ctrl vNIC. This vNIC is used by
the PMD and flower firmware application as a communication channel
between driver and firmware. In the case of OVS it is also used to
communicate flow statistics from hardware to the driver.

A rte_eth device is not exposed to DPDK for this vNIC as it is strictly
used internally by flower logic.

Because the ctrl vNIC is added, a new PCItoCPPBar is needed. Modify the
related logic accordingly.

Signed-off-by: Chaoyong He <chaoyong.he@corigine.com>
Reviewed-by: Niklas Söderlund <niklas.soderlund@corigine.com>
---
 drivers/net/nfp/flower/nfp_flower.c        | 220 +++++++++++++++++++++++++++++
 drivers/net/nfp/flower/nfp_flower.h        |   6 +
 drivers/net/nfp/nfpcore/nfp_cpp_pcie_ops.c |  31 ++--
 3 files changed, 245 insertions(+), 12 deletions(-)

diff --git a/drivers/net/nfp/flower/nfp_flower.c b/drivers/net/nfp/flower/nfp_flower.c
index 24aa288..18ffa5c 100644
--- a/drivers/net/nfp/flower/nfp_flower.c
+++ b/drivers/net/nfp/flower/nfp_flower.c
@@ -27,6 +27,7 @@
 #define DEFAULT_FLBUF_SIZE 9216
 
 #define PF_VNIC_NB_DESC 1024
+#define CTRL_VNIC_NB_DESC 512
 
 static const struct rte_eth_rxconf rx_conf = {
 	.rx_free_thresh = DEFAULT_RX_FREE_THRESH,
@@ -206,6 +207,11 @@
 	.dev_close              = nfp_flower_pf_close,
 };
 
+static const struct eth_dev_ops nfp_flower_ctrl_vnic_ops = {
+	.dev_infos_get          = nfp_net_infos_get,
+	.dev_configure          = nfp_net_configure,
+};
+
 struct dp_packet {
 	struct rte_mbuf mbuf;
 	uint32_t source;
@@ -496,12 +502,192 @@ struct dp_packet {
 	return 0;
 }
 
+static int
+nfp_flower_init_ctrl_vnic(struct nfp_net_hw *hw)
+{
+	uint32_t i;
+	int ret = 0;
+	uint16_t n_txq;
+	uint16_t n_rxq;
+	unsigned int numa_node;
+	struct rte_mempool *mp;
+	struct nfp_pf_dev *pf_dev;
+	struct rte_eth_dev *eth_dev;
+	struct nfp_app_fw_flower *app_fw_flower;
+
+	/* Set up some pointers here for ease of use */
+	pf_dev = hw->pf_dev;
+	app_fw_flower = NFP_PRIV_TO_APP_FW_FLOWER(pf_dev->app_fw_priv);
+
+	ret = nfp_flower_init_vnic_common(hw, "ctrl_vnic");
+	if (ret != 0) {
+		PMD_INIT_LOG(ERR, "Could not init ctrl vnic");
+		return -EINVAL;
+	}
+
+	/* Allocate memory for the eth_dev of the vNIC */
+	hw->eth_dev = rte_eth_dev_allocate("nfp_ctrl_vnic");
+	if (hw->eth_dev == NULL) {
+		PMD_INIT_LOG(ERR, "Could not allocate ctrl vnic");
+		return -ENOMEM;
+	}
+
+	/* Grab the pointer to the newly created rte_eth_dev here */
+	eth_dev = hw->eth_dev;
+
+	numa_node = rte_socket_id();
+
+	/* Create a mbuf pool for the ctrl vNIC */
+	app_fw_flower->ctrl_pktmbuf_pool = rte_pktmbuf_pool_create("ctrl_mbuf_pool",
+			4 * CTRL_VNIC_NB_DESC, 64, 0, 9216, numa_node);
+	if (app_fw_flower->ctrl_pktmbuf_pool == NULL) {
+		PMD_INIT_LOG(ERR, "create mbuf pool for ctrl vnic failed");
+		ret = -ENOMEM;
+		goto port_release;
+	}
+
+	mp = app_fw_flower->ctrl_pktmbuf_pool;
+
+	eth_dev->dev_ops = &nfp_flower_ctrl_vnic_ops;
+	rte_eth_dev_probing_finish(eth_dev);
+
+	/* Configure the ctrl vNIC device */
+	n_rxq = hw->max_rx_queues;
+	n_txq = hw->max_tx_queues;
+	eth_dev->data->rx_queues = rte_zmalloc("ethdev->rx_queues",
+		sizeof(eth_dev->data->rx_queues[0]) * n_rxq,
+		RTE_CACHE_LINE_SIZE);
+	if (eth_dev->data->rx_queues == NULL) {
+		PMD_INIT_LOG(ERR, "rte_zmalloc failed for ctrl vNIC rx queues");
+		ret = -ENOMEM;
+		goto mempool_cleanup;
+	}
+
+	eth_dev->data->tx_queues = rte_zmalloc("ethdev->tx_queues",
+		sizeof(eth_dev->data->tx_queues[0]) * n_txq,
+		RTE_CACHE_LINE_SIZE);
+	if (eth_dev->data->tx_queues == NULL) {
+		PMD_INIT_LOG(ERR, "rte_zmalloc failed for ctrl vNIC tx queues");
+		ret = -ENOMEM;
+		goto rx_queue_free;
+	}
+
+	/* Fill in some of the eth_dev fields */
+	eth_dev->device = &pf_dev->pci_dev->device;
+	eth_dev->data->nb_tx_queues = n_txq;
+	eth_dev->data->nb_rx_queues = n_rxq;
+	eth_dev->data->dev_private = hw;
+
+	/* Set up the Rx queues */
+	for (i = 0; i < n_rxq; i++) {
+		ret = nfp_net_rx_queue_setup(eth_dev, i, CTRL_VNIC_NB_DESC, numa_node,
+				&rx_conf, mp);
+		if (ret != 0) {
+			PMD_INIT_LOG(ERR, "Configure ctrl vNIC Rx queue %d failed", i);
+			goto rx_queue_cleanup;
+		}
+	}
+
+	/* Set up the Tx queues */
+	for (i = 0; i < n_txq; i++) {
+		ret = nfp_net_nfd3_tx_queue_setup(eth_dev, i, CTRL_VNIC_NB_DESC, numa_node,
+				&tx_conf);
+		if (ret != 0) {
+			PMD_INIT_LOG(ERR, "Configure ctrl vNIC Tx queue %d failed", i);
+			goto tx_queue_cleanup;
+		}
+	}
+
+	return 0;
+
+tx_queue_cleanup:
+	for (i = 0; i < n_txq; i++)
+		nfp_net_tx_queue_release(eth_dev, i);
+rx_queue_cleanup:
+	for (i = 0; i < n_rxq; i++)
+		nfp_net_rx_queue_release(eth_dev, i);
+	rte_free(eth_dev->data->tx_queues);
+rx_queue_free:
+	rte_free(eth_dev->data->rx_queues);
+mempool_cleanup:
+	rte_mempool_free(mp);
+port_release:
+	rte_eth_dev_release_port(hw->eth_dev);
+
+	return ret;
+}
+
+static void
+nfp_flower_cleanup_ctrl_vnic(struct nfp_net_hw *hw)
+{
+	uint32_t i;
+	struct nfp_app_fw_flower *app_fw_flower;
+
+	app_fw_flower = NFP_PRIV_TO_APP_FW_FLOWER(hw->pf_dev->app_fw_priv);
+
+	for (i = 0; i < hw->max_tx_queues; i++)
+		nfp_net_tx_queue_release(hw->eth_dev, i);
+
+	for (i = 0; i < hw->max_rx_queues; i++)
+		nfp_net_rx_queue_release(hw->eth_dev, i);
+
+	rte_free(hw->eth_dev->data->tx_queues);
+	rte_free(hw->eth_dev->data->rx_queues);
+	rte_mempool_free(app_fw_flower->ctrl_pktmbuf_pool);
+	rte_eth_dev_release_port(hw->eth_dev);
+}
+
+static int
+nfp_flower_start_ctrl_vnic(struct nfp_net_hw *hw)
+{
+	int ret;
+	uint32_t update;
+	uint32_t new_ctrl;
+	struct rte_eth_dev *dev;
+
+	dev = hw->eth_dev;
+
+	/* Disabling queues just in case... */
+	nfp_net_disable_queues(dev);
+
+	/* Enabling the required queues in the device */
+	nfp_net_enable_queues(dev);
+
+	/* Writing configuration parameters in the device */
+	nfp_net_params_setup(hw);
+
+	new_ctrl = NFP_NET_CFG_CTRL_ENABLE;
+	update = NFP_NET_CFG_UPDATE_GEN | NFP_NET_CFG_UPDATE_RING |
+			NFP_NET_CFG_UPDATE_MSIX;
+
+	rte_wmb();
+
+	/* If reconfig fails, avoid changing the hw state */
+	ret = nfp_net_reconfig(hw, new_ctrl, update);
+	if (ret != 0) {
+		PMD_INIT_LOG(ERR, "Failed to reconfig ctrl vnic");
+		return -EIO;
+	}
+
+	hw->ctrl = new_ctrl;
+
+	/* Setup the freelist ring */
+	ret = nfp_net_rx_freelist_setup(dev);
+	if (ret != 0) {
+		PMD_INIT_LOG(ERR, "Error with flower ctrl vNIC freelist setup");
+		return -EIO;
+	}
+
+	return 0;
+}
+
 int
 nfp_init_app_fw_flower(struct nfp_pf_dev *pf_dev)
 {
 	int ret;
 	unsigned int numa_node;
 	struct nfp_net_hw *pf_hw;
+	struct nfp_net_hw *ctrl_hw;
 	struct nfp_app_fw_flower *app_fw_flower;
 
 	numa_node = rte_socket_id();
@@ -561,8 +747,42 @@ struct dp_packet {
 		goto pf_vnic_cleanup;
 	}
 
+	/* The ctrl vNIC struct comes directly after the PF one */
+	app_fw_flower->ctrl_hw = pf_hw + 1;
+	ctrl_hw = app_fw_flower->ctrl_hw;
+
+	/* Map the ctrl vNIC ctrl bar */
+	ctrl_hw->ctrl_bar = nfp_rtsym_map(pf_dev->sym_tbl, "_pf0_net_ctrl_bar",
+		32768, &ctrl_hw->ctrl_area);
+	if (ctrl_hw->ctrl_bar == NULL) {
+		PMD_INIT_LOG(ERR, "Could not map the ctrl vNIC ctrl bar");
+		ret = -ENODEV;
+		goto pf_vnic_cleanup;
+	}
+
+	/* Now populate the ctrl vNIC */
+	ctrl_hw->pf_dev = pf_dev;
+	ctrl_hw->cpp = pf_dev->cpp;
+
+	ret = nfp_flower_init_ctrl_vnic(app_fw_flower->ctrl_hw);
+	if (ret != 0) {
+		PMD_INIT_LOG(ERR, "Could not initialize flower ctrl vNIC");
+		goto ctrl_cpp_area_cleanup;
+	}
+
+	/* Start the ctrl vNIC */
+	ret = nfp_flower_start_ctrl_vnic(app_fw_flower->ctrl_hw);
+	if (ret != 0) {
+		PMD_INIT_LOG(ERR, "Could not start flower ctrl vNIC");
+		goto ctrl_vnic_cleanup;
+	}
+
 	return 0;
 
+ctrl_vnic_cleanup:
+	nfp_flower_cleanup_ctrl_vnic(app_fw_flower->ctrl_hw);
+ctrl_cpp_area_cleanup:
+	nfp_cpp_area_free(ctrl_hw->ctrl_area);
 pf_vnic_cleanup:
 	nfp_flower_cleanup_pf_vnic(app_fw_flower->pf_hw);
 pf_cpp_area_cleanup:
diff --git a/drivers/net/nfp/flower/nfp_flower.h b/drivers/net/nfp/flower/nfp_flower.h
index 981d88d..e18703e 100644
--- a/drivers/net/nfp/flower/nfp_flower.h
+++ b/drivers/net/nfp/flower/nfp_flower.h
@@ -14,6 +14,12 @@ struct nfp_app_fw_flower {
 	/* Pointer to the PF vNIC */
 	struct nfp_net_hw *pf_hw;
 
+	/* Pointer to a mempool for the ctrl vNIC */
+	struct rte_mempool *ctrl_pktmbuf_pool;
+
+	/* Pointer to the ctrl vNIC */
+	struct nfp_net_hw *ctrl_hw;
+
 	/* the eth table as reported by firmware */
 	struct nfp_eth_table *nfp_eth_table;
 };
diff --git a/drivers/net/nfp/nfpcore/nfp_cpp_pcie_ops.c b/drivers/net/nfp/nfpcore/nfp_cpp_pcie_ops.c
index 08bc4e8..22c8bc4 100644
--- a/drivers/net/nfp/nfpcore/nfp_cpp_pcie_ops.c
+++ b/drivers/net/nfp/nfpcore/nfp_cpp_pcie_ops.c
@@ -91,7 +91,10 @@
  * @refcnt:	number of current users
  * @iomem:	mapped IO memory
  */
+#define NFP_BAR_MIN 1
+#define NFP_BAR_MID 5
 #define NFP_BAR_MAX 7
+
 struct nfp_bar {
 	struct nfp_pcie_user *nfp;
 	uint32_t barcfg;
@@ -292,6 +295,7 @@ struct nfp_pcie_user {
  * BAR0.0: Reserved for General Mapping (for MSI-X access to PCIe SRAM)
  *
  *         Halving PCItoCPPBars for primary and secondary processes.
+ *         For CoreNIC firmware:
  *         NFP PMD just requires two fixed slots, one for configuration BAR,
  *         and another for accessing the hw queues. Another slot is needed
  *         for setting the link up or down. Secondary processes do not need
@@ -301,6 +305,9 @@ struct nfp_pcie_user {
  *         supported. Due to this requirement and future extensions requiring
  *         new slots per process, only one secondary process is supported by
  *         now.
+ *         For Flower firmware:
+ *         NFP PMD needs another fixed slot, used as the configuration BAR
+ *         for the ctrl vNIC.
  */
 static int
 nfp_enable_bars(struct nfp_pcie_user *nfp)
@@ -309,11 +316,11 @@ struct nfp_pcie_user {
 	int x, start, end;
 
 	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
-		start = 4;
-		end = 1;
+		start = NFP_BAR_MID;
+		end = NFP_BAR_MIN;
 	} else {
-		start = 7;
-		end = 4;
+		start = NFP_BAR_MAX;
+		end = NFP_BAR_MID;
 	}
 	for (x = start; x > end; x--) {
 		bar = &nfp->bar[x - 1];
@@ -341,11 +348,11 @@ struct nfp_pcie_user {
 	int x, start, end;
 
 	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
-		start = 4;
-		end = 1;
+		start = NFP_BAR_MID;
+		end = NFP_BAR_MIN;
 	} else {
-		start = 7;
-		end = 4;
+		start = NFP_BAR_MAX;
+		end = NFP_BAR_MID;
 	}
 	for (x = start; x > end; x--) {
 		bar = &nfp->bar[x - 1];
@@ -364,11 +371,11 @@ struct nfp_pcie_user {
 	int x, start, end;
 
 	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
-		start = 4;
-		end = 1;
+		start = NFP_BAR_MID;
+		end = NFP_BAR_MIN;
 	} else {
-		start = 7;
-		end = 4;
+		start = NFP_BAR_MAX;
+		end = NFP_BAR_MID;
 	}
 
 	for (x = start; x > end; x--) {
-- 
1.8.3.1


* [PATCH v9 08/12] net/nfp: move common rxtx function for flower use
  2022-09-15 10:44 [PATCH v9 00/12] preparation for the rte_flow offload of nfp PMD Chaoyong He
                   ` (6 preceding siblings ...)
  2022-09-15 10:44 ` [PATCH v9 07/12] net/nfp: add flower ctrl VNIC related logics Chaoyong He
@ 2022-09-15 10:44 ` Chaoyong He
  2022-09-15 10:44 ` [PATCH v9 09/12] net/nfp: add flower ctrl VNIC rxtx logic Chaoyong He
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 22+ messages in thread
From: Chaoyong He @ 2022-09-15 10:44 UTC (permalink / raw)
  To: dev; +Cc: oss-drivers, niklas.soderlund, Chaoyong He, Heinrich Kuhn

Move some common Rx and Tx logic to the header file so that
it can be reused by the flower Rx and Tx logic.

Signed-off-by: Chaoyong He <chaoyong.he@corigine.com>
Signed-off-by: Heinrich Kuhn <heinrich.kuhn@corigine.com>
Reviewed-by: Niklas Söderlund <niklas.soderlund@corigine.com>
---
 drivers/net/nfp/nfp_rxtx.c | 32 +-------------------------------
 drivers/net/nfp/nfp_rxtx.h | 31 +++++++++++++++++++++++++++++++
 2 files changed, 32 insertions(+), 31 deletions(-)
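
Note on the moved helpers: the free-descriptor arithmetic is easy to check
outside the driver. The sketch below is a stand-alone illustration under
assumptions: struct txq_sketch is a hypothetical stand-in holding only the
four fields the helpers read (the field types are guessed, not taken from
nfp_rxtx.h), and the sketch_* names are not driver symbols. The arithmetic
itself is copied from the patch, including the 8 descriptors that are
always kept in reserve.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the nfp_net_txq fields used by the helpers */
struct txq_sketch {
	uint32_t wr_p;           /* host copy of the write pointer */
	uint32_t rd_p;           /* host copy of the read pointer */
	uint32_t tx_count;       /* ring size in descriptors */
	uint32_t tx_free_thresh; /* "full" threshold */
};

/* Free descriptors, always leaving 8 unused to avoid wrap ambiguity */
static uint32_t
sketch_free_tx_desc(const struct txq_sketch *txq)
{
	assert(txq != NULL);

	if (txq->wr_p >= txq->rd_p)
		return txq->tx_count - (txq->wr_p - txq->rd_p) - 8;

	return txq->rd_p - txq->wr_p - 8;
}

/* Non-zero when the free count has dropped below the threshold */
static uint32_t
sketch_txq_full(const struct txq_sketch *txq)
{
	return sketch_free_tx_desc(txq) < txq->tx_free_thresh;
}
```

For a 512-entry ring with wr_p = 10 and rd_p = 4, six descriptors are in
flight and the helper reports 512 - 6 - 8 = 498 free.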

diff --git a/drivers/net/nfp/nfp_rxtx.c b/drivers/net/nfp/nfp_rxtx.c
index 8429b44..8d63a7b 100644
--- a/drivers/net/nfp/nfp_rxtx.c
+++ b/drivers/net/nfp/nfp_rxtx.c
@@ -116,12 +116,6 @@
 	return count;
 }
 
-static inline void
-nfp_net_mbuf_alloc_failed(struct nfp_net_rxq *rxq)
-{
-	rte_eth_devices[rxq->port_id].data->rx_mbuf_alloc_failed++;
-}
-
 /*
  * nfp_net_set_hash - Set mbuf hash data
  *
@@ -583,7 +577,7 @@
  * @txq: TX queue to work with
  * Returns number of descriptors freed
  */
-static int
+int
 nfp_net_tx_free_bufs(struct nfp_net_txq *txq)
 {
 	uint32_t qcp_rd_p;
@@ -774,30 +768,6 @@
 	return 0;
 }
 
-/* Leaving always free descriptors for avoiding wrapping confusion */
-static inline
-uint32_t nfp_net_nfd3_free_tx_desc(struct nfp_net_txq *txq)
-{
-	if (txq->wr_p >= txq->rd_p)
-		return txq->tx_count - (txq->wr_p - txq->rd_p) - 8;
-	else
-		return txq->rd_p - txq->wr_p - 8;
-}
-
-/*
- * nfp_net_txq_full - Check if the TX queue free descriptors
- * is below tx_free_threshold
- *
- * @txq: TX queue to check
- *
- * This function uses the host copy* of read/write pointers
- */
-static inline
-uint32_t nfp_net_nfd3_txq_full(struct nfp_net_txq *txq)
-{
-	return (nfp_net_nfd3_free_tx_desc(txq) < txq->tx_free_thresh);
-}
-
 /* nfp_net_tx_tso - Set TX descriptor for TSO */
 static inline void
 nfp_net_nfd3_tx_tso(struct nfp_net_txq *txq, struct nfp_net_nfd3_tx_desc *txd,
diff --git a/drivers/net/nfp/nfp_rxtx.h b/drivers/net/nfp/nfp_rxtx.h
index 5c005d7..a30171f 100644
--- a/drivers/net/nfp/nfp_rxtx.h
+++ b/drivers/net/nfp/nfp_rxtx.h
@@ -330,6 +330,36 @@ struct nfp_net_rxq {
 	int rx_qcidx;
 } __rte_aligned(64);
 
+static inline void
+nfp_net_mbuf_alloc_failed(struct nfp_net_rxq *rxq)
+{
+	rte_eth_devices[rxq->port_id].data->rx_mbuf_alloc_failed++;
+}
+
+/* Leaving always free descriptors for avoiding wrapping confusion */
+static inline uint32_t
+nfp_net_nfd3_free_tx_desc(struct nfp_net_txq *txq)
+{
+	if (txq->wr_p >= txq->rd_p)
+		return txq->tx_count - (txq->wr_p - txq->rd_p) - 8;
+	else
+		return txq->rd_p - txq->wr_p - 8;
+}
+
+/*
+ * nfp_net_nfd3_txq_full - Check if the TX queue free descriptors
+ * is below tx_free_threshold
+ *
+ * @txq: TX queue to check
+ *
+ * This function uses the host copy* of read/write pointers
+ */
+static inline uint32_t
+nfp_net_nfd3_txq_full(struct nfp_net_txq *txq)
+{
+	return (nfp_net_nfd3_free_tx_desc(txq) < txq->tx_free_thresh);
+}
+
 int nfp_net_rx_freelist_setup(struct rte_eth_dev *dev);
 uint32_t nfp_net_rx_queue_count(void *rx_queue);
 uint16_t nfp_net_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
@@ -355,6 +385,7 @@ int nfp_net_nfdk_tx_queue_setup(struct rte_eth_dev *dev,
 uint16_t nfp_net_nfdk_xmit_pkts(void *tx_queue,
 		struct rte_mbuf **tx_pkts,
 		uint16_t nb_pkts);
+int nfp_net_tx_free_bufs(struct nfp_net_txq *txq);
 
 #endif /* _NFP_RXTX_H_ */
 /*
-- 
1.8.3.1


* [PATCH v9 09/12] net/nfp: add flower ctrl VNIC rxtx logic
  2022-09-15 10:44 [PATCH v9 00/12] preparation for the rte_flow offload of nfp PMD Chaoyong He
                   ` (7 preceding siblings ...)
  2022-09-15 10:44 ` [PATCH v9 08/12] net/nfp: move common rxtx function for flower use Chaoyong He
@ 2022-09-15 10:44 ` Chaoyong He
  2022-09-15 10:44 ` [PATCH v9 10/12] net/nfp: add flower representor framework Chaoyong He
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 22+ messages in thread
From: Chaoyong He @ 2022-09-15 10:44 UTC (permalink / raw)
  To: dev; +Cc: oss-drivers, niklas.soderlund, Chaoyong He, Heinrich Kuhn

Add the Rx and Tx functions for the ctrl vNIC. The logic is mostly
identical to the normal Rx and Tx functionality of the NFP PMD.

Register a ctrl vNIC service to poll the ctrl vNIC Rx path.

Signed-off-by: Chaoyong He <chaoyong.he@corigine.com>
Signed-off-by: Heinrich Kuhn <heinrich.kuhn@corigine.com>
Reviewed-by: Niklas Söderlund <niklas.soderlund@corigine.com>
---
 doc/guides/rel_notes/release_22_11.rst   |   1 +
 drivers/net/nfp/flower/nfp_flower.c      |  59 ++++++++
 drivers/net/nfp/flower/nfp_flower.h      |  21 +++
 drivers/net/nfp/flower/nfp_flower_ctrl.c | 250 +++++++++++++++++++++++++++++++
 drivers/net/nfp/flower/nfp_flower_ctrl.h |  13 ++
 drivers/net/nfp/meson.build              |   1 +
 6 files changed, 345 insertions(+)
 create mode 100644 drivers/net/nfp/flower/nfp_flower_ctrl.c
 create mode 100644 drivers/net/nfp/flower/nfp_flower_ctrl.h
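
Note on the Rx freelist accounting: nfp_flower_ctrl_vnic_recv() does not
advance the freelist write pointer on every call. It carries the count of
refilled buffers in nb_rx_hold and only announces them once the total
reaches rx_free_thresh. The sketch below models just that bookkeeping.
It is a stand-alone illustration: struct rxq_sketch and rxq_account() are
hypothetical, and fl_wr_ptr is a plain counter standing in for the
nfp_qcp_ptr_add() doorbell write.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the nfp_net_rxq fields involved */
struct rxq_sketch {
	uint16_t nb_rx_hold;     /* refilled but not yet announced */
	uint16_t rx_free_thresh; /* announce once this many are pending */
	uint32_t fl_wr_ptr;      /* stand-in for the freelist write pointer */
};

/* Account for 'refilled' descriptors handled in one receive burst */
static void
rxq_account(struct rxq_sketch *q, uint16_t refilled)
{
	uint16_t nb_hold;

	assert(q != NULL);

	/* Nothing received: leave the pending count untouched */
	if (refilled == 0)
		return;

	nb_hold = (uint16_t)(refilled + q->nb_rx_hold);

	/* Batch the pointer update until the threshold is reached */
	if (nb_hold >= q->rx_free_thresh) {
		q->fl_wr_ptr += nb_hold;
		nb_hold = 0;
	}

	q->nb_rx_hold = nb_hold;
}
```

With a threshold of 32, bursts of 10, 10 and 12 packets leave the pointer
untouched twice and then advance it by 32 in a single update.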

diff --git a/doc/guides/rel_notes/release_22_11.rst b/doc/guides/rel_notes/release_22_11.rst
index 6a666aa..c5a8c08 100644
--- a/doc/guides/rel_notes/release_22_11.rst
+++ b/doc/guides/rel_notes/release_22_11.rst
@@ -61,6 +61,7 @@ New Features
 
     * Added the support of flower firmware.
     * Added the flower service infrastructure.
+    * Added the control message interactive channels between PMD and firmware.
 
 
 Removed Items
diff --git a/drivers/net/nfp/flower/nfp_flower.c b/drivers/net/nfp/flower/nfp_flower.c
index 18ffa5c..e935821 100644
--- a/drivers/net/nfp/flower/nfp_flower.c
+++ b/drivers/net/nfp/flower/nfp_flower.c
@@ -20,6 +20,7 @@
 #include "../nfpcore/nfp_rtsym.h"
 #include "../nfpcore/nfp_nsp.h"
 #include "nfp_flower.h"
+#include "nfp_flower_ctrl.h"
 
 #define MAX_PKT_BURST 32
 #define MBUF_PRIV_SIZE 128
@@ -681,6 +682,56 @@ struct dp_packet {
 	return 0;
 }
 
+static int
+nfp_flower_ctrl_vnic_service(void *arg)
+{
+	struct nfp_app_fw_flower *app_fw_flower = arg;
+
+	nfp_flower_ctrl_vnic_poll(app_fw_flower);
+
+	return 0;
+}
+
+static struct rte_service_spec flower_services[NFP_FLOWER_SERVICE_MAX] = {
+	[NFP_FLOWER_SERVICE_CTRL] = {
+		.name         = "flower_ctrl_vnic_service",
+		.callback     = nfp_flower_ctrl_vnic_service,
+	},
+};
+
+static int
+nfp_flower_enable_services(struct nfp_app_fw_flower *app_fw_flower)
+{
+	int i;
+	int ret;
+
+	for (i = 0; i < NFP_FLOWER_SERVICE_MAX; i++) {
+		/* Pass a pointer to the flower app to the service */
+		flower_services[i].callback_userdata = (void *)app_fw_flower;
+
+		/* Register the flower services */
+		ret = rte_service_component_register(&flower_services[i],
+				&app_fw_flower->service_ids[i]);
+		if (ret != 0) {
+			PMD_INIT_LOG(ERR, "Could not register %s",
+					flower_services[i].name);
+			return -EINVAL;
+		}
+
+		PMD_INIT_LOG(INFO, "%s registered", flower_services[i].name);
+
+		/* Map them to available service cores */
+		ret = nfp_map_service(app_fw_flower->service_ids[i]);
+		if (ret != 0) {
+			PMD_INIT_LOG(ERR, "Could not map %s",
+					flower_services[i].name);
+			return -EINVAL;
+		}
+	}
+
+	return 0;
+}
+
 int
 nfp_init_app_fw_flower(struct nfp_pf_dev *pf_dev)
 {
@@ -777,6 +828,14 @@ struct dp_packet {
 		goto ctrl_vnic_cleanup;
 	}
 
+	/* Start up flower services */
+	ret = nfp_flower_enable_services(app_fw_flower);
+	if (ret != 0) {
+		PMD_INIT_LOG(ERR, "Could not enable flower services");
+		ret = -ESRCH;
+		goto ctrl_vnic_cleanup;
+	}
+
 	return 0;
 
 ctrl_vnic_cleanup:
diff --git a/drivers/net/nfp/flower/nfp_flower.h b/drivers/net/nfp/flower/nfp_flower.h
index e18703e..b9379e5 100644
--- a/drivers/net/nfp/flower/nfp_flower.h
+++ b/drivers/net/nfp/flower/nfp_flower.h
@@ -6,8 +6,23 @@
 #ifndef _NFP_FLOWER_H_
 #define _NFP_FLOWER_H_
 
+enum nfp_flower_service {
+	NFP_FLOWER_SERVICE_CTRL,
+	NFP_FLOWER_SERVICE_MAX,
+};
+/*
+ * Flower fallback and ctrl path always adds and removes
+ * 8 bytes of prepended data. Tx descriptors must point
+ * to the correct packet data offset after metadata has
+ * been added
+ */
+#define FLOWER_PKT_DATA_OFFSET 8
+
 /* The flower application's private structure */
 struct nfp_app_fw_flower {
+	/* List of rte_service ID's */
+	uint32_t service_ids[NFP_FLOWER_SERVICE_MAX];
+
 	/* Pointer to a mempool for the PF vNIC */
 	struct rte_mempool *pf_pktmbuf_pool;
 
@@ -22,6 +37,12 @@ struct nfp_app_fw_flower {
 
 	/* the eth table as reported by firmware */
 	struct nfp_eth_table *nfp_eth_table;
+
+	/* Ctrl vNIC Rx counter */
+	uint64_t ctrl_vnic_rx_count;
+
+	/* Ctrl vNIC Tx counter */
+	uint64_t ctrl_vnic_tx_count;
 };
 
 int nfp_init_app_fw_flower(struct nfp_pf_dev *pf_dev);
diff --git a/drivers/net/nfp/flower/nfp_flower_ctrl.c b/drivers/net/nfp/flower/nfp_flower_ctrl.c
new file mode 100644
index 0000000..df908ef
--- /dev/null
+++ b/drivers/net/nfp/flower/nfp_flower_ctrl.c
@@ -0,0 +1,250 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright (c) 2022 Corigine, Inc.
+ * All rights reserved.
+ */
+
+#include <rte_common.h>
+#include <ethdev_pci.h>
+
+#include "../nfp_common.h"
+#include "../nfp_logs.h"
+#include "../nfp_ctrl.h"
+#include "../nfp_rxtx.h"
+#include "nfp_flower.h"
+#include "nfp_flower_ctrl.h"
+
+#define MAX_PKT_BURST 32
+
+static uint16_t
+nfp_flower_ctrl_vnic_recv(void *rx_queue,
+		struct rte_mbuf **rx_pkts,
+		uint16_t nb_pkts)
+{
+	uint64_t dma_addr;
+	uint16_t avail = 0;
+	struct rte_mbuf *mb;
+	uint16_t nb_hold = 0;
+	struct nfp_net_hw *hw;
+	struct nfp_net_rxq *rxq;
+	struct rte_mbuf *new_mb;
+	struct nfp_net_rx_buff *rxb;
+	struct nfp_net_rx_desc *rxds;
+
+	rxq = rx_queue;
+	if (unlikely(rxq == NULL)) {
+		/*
+		 * DPDK just checks the queue is lower than max queues
+		 * enabled. But the queue needs to be configured
+		 */
+		PMD_RX_LOG(ERR, "RX Bad queue");
+		return 0;
+	}
+
+	hw = rxq->hw;
+	while (avail < nb_pkts) {
+		rxb = &rxq->rxbufs[rxq->rd_p];
+		if (unlikely(rxb == NULL)) {
+			PMD_RX_LOG(ERR, "rxb does not exist!");
+			break;
+		}
+
+		rxds = &rxq->rxds[rxq->rd_p];
+		if ((rxds->rxd.meta_len_dd & PCIE_DESC_RX_DD) == 0)
+			break;
+
+		/*
+		 * Memory barrier to ensure that we won't do other
+		 * reads before the DD bit.
+		 */
+		rte_rmb();
+
+		/*
+		 * We got a packet. Let's alloc a new mbuf for refilling the
+		 * free descriptor ring as soon as possible
+		 */
+		new_mb = rte_pktmbuf_alloc(rxq->mem_pool);
+		if (unlikely(new_mb == NULL)) {
+			PMD_RX_LOG(ERR,
+				"RX mbuf alloc failed port_id=%u queue_id=%u",
+				rxq->port_id, (unsigned int)rxq->qidx);
+			nfp_net_mbuf_alloc_failed(rxq);
+			break;
+		}
+
+		nb_hold++;
+
+		/*
+		 * Grab the mbuf and refill the descriptor with the
+		 * previously allocated mbuf
+		 */
+		mb = rxb->mbuf;
+		rxb->mbuf = new_mb;
+
+		/* Size of this segment */
+		mb->data_len = rxds->rxd.data_len - NFP_DESC_META_LEN(rxds);
+		/* Size of the whole packet. We just support 1 segment */
+		mb->pkt_len = mb->data_len;
+
+		if (unlikely((mb->data_len + hw->rx_offset) > rxq->mbuf_size)) {
+			/*
+			 * This should not happen and the user has the
+			 * responsibility of avoiding it. But we have
+			 * to give some info about the error
+			 */
+			RTE_LOG_DP(ERR, PMD,
+				"mbuf overflow likely due to the RX offset.\n"
+				"\t\tYour mbuf size should have extra space for"
+				" RX offset=%u bytes.\n"
+				"\t\tCurrently you just have %u bytes available"
+				" but the received packet is %u bytes long",
+				hw->rx_offset,
+				rxq->mbuf_size - hw->rx_offset,
+				mb->data_len);
+			rte_pktmbuf_free(mb);
+			break;
+		}
+
+		/* Filling the received mbuf with packet info */
+		if (hw->rx_offset)
+			mb->data_off = RTE_PKTMBUF_HEADROOM + hw->rx_offset;
+		else
+			mb->data_off = RTE_PKTMBUF_HEADROOM + NFP_DESC_META_LEN(rxds);
+
+		/* No scatter mode supported */
+		mb->nb_segs = 1;
+		mb->next = NULL;
+		mb->port = rxq->port_id;
+
+		rx_pkts[avail++] = mb;
+
+		/* Now resetting and updating the descriptor */
+		rxds->vals[0] = 0;
+		rxds->vals[1] = 0;
+		dma_addr = rte_cpu_to_le_64(RTE_MBUF_DMA_ADDR_DEFAULT(new_mb));
+		rxds->fld.dd = 0;
+		rxds->fld.dma_addr_hi = (dma_addr >> 32) & 0xff;
+		rxds->fld.dma_addr_lo = dma_addr & 0xffffffff;
+
+		rxq->rd_p++;
+		if (unlikely(rxq->rd_p == rxq->rx_count)) /* wrapping? */
+			rxq->rd_p = 0;
+	}
+
+	if (nb_hold == 0)
+		return 0;
+
+	nb_hold += rxq->nb_rx_hold;
+
+	/*
+	 * FL descriptors needs to be written before incrementing the
+	 * FL queue WR pointer
+	 */
+	rte_wmb();
+	if (nb_hold >= rxq->rx_free_thresh) {
+		PMD_RX_LOG(DEBUG, "port=%hu queue=%d nb_hold=%hu avail=%hu",
+			rxq->port_id, rxq->qidx, nb_hold, avail);
+		nfp_qcp_ptr_add(rxq->qcp_fl, NFP_QCP_WRITE_PTR, nb_hold);
+		nb_hold = 0;
+	}
+
+	rxq->nb_rx_hold = nb_hold;
+
+	return avail;
+}
+
+uint16_t
+nfp_flower_ctrl_vnic_xmit(struct nfp_app_fw_flower *app_fw_flower,
+		struct rte_mbuf *mbuf)
+{
+	uint16_t cnt = 0;
+	uint64_t dma_addr;
+	uint32_t free_descs;
+	struct rte_mbuf **lmbuf;
+	struct nfp_net_txq *txq;
+	struct nfp_net_hw *ctrl_hw;
+	struct rte_eth_dev *ctrl_dev;
+	struct nfp_net_nfd3_tx_desc *txds;
+
+	ctrl_hw = app_fw_flower->ctrl_hw;
+	ctrl_dev = ctrl_hw->eth_dev;
+
+	/* Flower ctrl vNIC only has a single tx queue */
+	txq = ctrl_dev->data->tx_queues[0];
+	if (unlikely(txq == NULL)) {
+		/*
+		 * DPDK just checks the queue is lower than max queues
+		 * enabled. But the queue needs to be configured
+		 */
+		PMD_TX_LOG(ERR, "ctrl dev TX Bad queue");
+		goto xmit_end;
+	}
+
+	txds = &txq->txds[txq->wr_p];
+	txds->vals[0] = 0;
+	txds->vals[1] = 0;
+	txds->vals[2] = 0;
+	txds->vals[3] = 0;
+
+	if (nfp_net_nfd3_txq_full(txq))
+		nfp_net_tx_free_bufs(txq);
+
+	free_descs = nfp_net_nfd3_free_tx_desc(txq);
+	if (unlikely(free_descs == 0)) {
+		PMD_TX_LOG(ERR, "ctrl dev no free descs");
+		goto xmit_end;
+	}
+
+	lmbuf = &txq->txbufs[txq->wr_p].mbuf;
+	RTE_MBUF_PREFETCH_TO_FREE(*lmbuf);
+	if (*lmbuf)
+		rte_pktmbuf_free_seg(*lmbuf);
+
+	*lmbuf = mbuf;
+	dma_addr = rte_mbuf_data_iova(mbuf);
+
+	txds->data_len = mbuf->pkt_len;
+	txds->dma_len = txds->data_len;
+	txds->dma_addr_hi = (dma_addr >> 32) & 0xff;
+	txds->dma_addr_lo = (dma_addr & 0xffffffff);
+	txds->offset_eop = FLOWER_PKT_DATA_OFFSET | PCIE_DESC_TX_EOP;
+
+	txq->wr_p++;
+	if (unlikely(txq->wr_p == txq->tx_count)) /* wrapping? */
+		txq->wr_p = 0;
+
+	cnt++;
+	app_fw_flower->ctrl_vnic_tx_count++;
+	rte_wmb();
+	nfp_qcp_ptr_add(txq->qcp_q, NFP_QCP_WRITE_PTR, 1);
+
+xmit_end:
+
+	return cnt;
+}
+
+void
+nfp_flower_ctrl_vnic_poll(struct nfp_app_fw_flower *app_fw_flower)
+{
+	uint16_t i;
+	uint16_t count;
+	struct nfp_net_rxq *rxq;
+	struct nfp_net_hw *ctrl_hw;
+	struct rte_eth_dev *ctrl_eth_dev;
+	struct rte_mbuf *pkts_burst[MAX_PKT_BURST];
+
+	ctrl_hw = app_fw_flower->ctrl_hw;
+	ctrl_eth_dev = ctrl_hw->eth_dev;
+
+	/* ctrl vNIC only has a single Rx queue */
+	rxq = ctrl_eth_dev->data->rx_queues[0];
+
+	while (true) {
+		count = nfp_flower_ctrl_vnic_recv(rxq, pkts_burst, MAX_PKT_BURST);
+		if (count != 0) {
+			app_fw_flower->ctrl_vnic_rx_count += count;
+			/* Process cmsgs here, only free for now */
+			for (i = 0; i < count; i++)
+				rte_pktmbuf_free(pkts_burst[i]);
+		}
+	}
+}
diff --git a/drivers/net/nfp/flower/nfp_flower_ctrl.h b/drivers/net/nfp/flower/nfp_flower_ctrl.h
new file mode 100644
index 0000000..1e38578
--- /dev/null
+++ b/drivers/net/nfp/flower/nfp_flower_ctrl.h
@@ -0,0 +1,13 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright (c) 2022 Corigine, Inc.
+ * All rights reserved.
+ */
+
+#ifndef _NFP_FLOWER_CTRL_H_
+#define _NFP_FLOWER_CTRL_H_
+
+void nfp_flower_ctrl_vnic_poll(struct nfp_app_fw_flower *app_fw_flower);
+uint16_t nfp_flower_ctrl_vnic_xmit(struct nfp_app_fw_flower *app_fw_flower,
+		struct rte_mbuf *mbuf);
+
+#endif /* _NFP_FLOWER_CTRL_H_ */
diff --git a/drivers/net/nfp/meson.build b/drivers/net/nfp/meson.build
index 7ae3115..8710213 100644
--- a/drivers/net/nfp/meson.build
+++ b/drivers/net/nfp/meson.build
@@ -7,6 +7,7 @@ if not is_linux or not dpdk_conf.get('RTE_ARCH_64')
 endif
 sources = files(
         'flower/nfp_flower.c',
+        'flower/nfp_flower_ctrl.c',
         'nfpcore/nfp_cpp_pcie_ops.c',
         'nfpcore/nfp_nsp.c',
         'nfpcore/nfp_cppcore.c',
-- 
1.8.3.1


* [PATCH v9 10/12] net/nfp: add flower representor framework
  2022-09-15 10:44 [PATCH v9 00/12] preparation for the rte_flow offload of nfp PMD Chaoyong He
                   ` (8 preceding siblings ...)
  2022-09-15 10:44 ` [PATCH v9 09/12] net/nfp: add flower ctrl VNIC rxtx logic Chaoyong He
@ 2022-09-15 10:44 ` Chaoyong He
  2022-09-15 10:44 ` [PATCH v9 11/12] net/nfp: move rxtx function to header file Chaoyong He
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 22+ messages in thread
From: Chaoyong He @ 2022-09-15 10:44 UTC (permalink / raw)
  To: dev; +Cc: oss-drivers, niklas.soderlund, Chaoyong He

Add the framework to support flower representors. The number of VF
representors is parsed from the command line. For physical port
representors, the current logic creates a representor for each
physical port present on the hardware.

An eth_dev is created for each physical port and VF. The flower
firmware requires a MAC repr cmsg to be sent with info about the
number of physical ports configured.

Reify messages are sent to hardware for each physical port representor.
An rte_ring is also created per representor so that traffic can be
pushed to and pulled from this interface.

To bring the real device represented by a flower representor port up
or down, a port mod message conveys that state to the firmware. This
message will be used in the dev_ops callbacks of flower representors.

Each cmsg generated by the driver is prepended with a cmsg header.
This commit also adds the logic to fill in the header of cmsgs.

Also add the Rx and Tx paths for flower representors. For Rx, packets
are dequeued from the representor ring and passed to the eth_dev. For
Tx, the first queue of the PF vNIC is used. Metadata about the
representor is added before the packet is sent down to firmware.

Signed-off-by: Chaoyong He <chaoyong.he@corigine.com>
Reviewed-by: Niklas Söderlund <niklas.soderlund@corigine.com>
---
 doc/guides/nics/nfp.rst                         |   6 +
 doc/guides/rel_notes/release_22_11.rst          |   1 +
 drivers/net/nfp/flower/nfp_flower.c             |   7 +
 drivers/net/nfp/flower/nfp_flower.h             |  18 +
 drivers/net/nfp/flower/nfp_flower_cmsg.c        | 185 +++++++
 drivers/net/nfp/flower/nfp_flower_cmsg.h        | 173 ++++++
 drivers/net/nfp/flower/nfp_flower_representor.c | 664 ++++++++++++++++++++++++
 drivers/net/nfp/flower/nfp_flower_representor.h |  39 ++
 drivers/net/nfp/meson.build                     |   2 +
 9 files changed, 1095 insertions(+)
 create mode 100644 drivers/net/nfp/flower/nfp_flower_cmsg.c
 create mode 100644 drivers/net/nfp/flower/nfp_flower_cmsg.h
 create mode 100644 drivers/net/nfp/flower/nfp_flower_representor.c
 create mode 100644 drivers/net/nfp/flower/nfp_flower_representor.h
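
Note on the cmsg layout: nfp_flower_cmsg_init() writes two big-endian
32-bit metadata words, then the cmsg header, and returns a pointer to the
payload that follows. The two words add up to the 8 bytes of
FLOWER_PKT_DATA_OFFSET. The sketch below reproduces that layout on a plain
byte buffer. It is an illustration under assumptions: the SKETCH_* values
are placeholders rather than the definitions from the driver headers, and
the header is assumed to be two pad bytes followed by one type byte and one
version byte, which this excerpt does not show.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder values for illustration only, not the driver's constants */
#define SKETCH_META_PORTID   0x5u
#define SKETCH_PORT_ID_CTRL  0xffffffffu
#define SKETCH_CMSG_HLEN     4u /* assumed header size */
#define SKETCH_META_LEN      8u /* two 32-bit metadata words */

/* Store a 32-bit value in big-endian byte order */
static void
put_be32(uint8_t *p, uint32_t v)
{
	p[0] = (uint8_t)(v >> 24);
	p[1] = (uint8_t)(v >> 16);
	p[2] = (uint8_t)(v >> 8);
	p[3] = (uint8_t)v;
}

/*
 * Hypothetical helper: lay out metadata and header in buf and return the
 * total message length for a payload of payload_size bytes. The payload
 * itself starts at buf + SKETCH_META_LEN + SKETCH_CMSG_HLEN.
 */
static uint32_t
sketch_cmsg_init(uint8_t *buf, uint8_t type, uint8_t version,
		uint32_t payload_size)
{
	assert(buf != NULL);

	/* Metadata words first, as the flower firmware expects them */
	put_be32(buf, SKETCH_META_PORTID);
	put_be32(buf + 4, SKETCH_PORT_ID_CTRL);

	/* Then the cmsg header: pad, type, version (assumed layout) */
	buf[8] = 0;
	buf[9] = 0;
	buf[10] = type;
	buf[11] = version;

	return payload_size + SKETCH_META_LEN + SKETCH_CMSG_HLEN;
}
```

The returned length corresponds to what the patch stores in both pkt_len
and data_len, since control messages use a single segment.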

diff --git a/doc/guides/nics/nfp.rst b/doc/guides/nics/nfp.rst
index 4faab39..c62b1fb 100644
--- a/doc/guides/nics/nfp.rst
+++ b/doc/guides/nics/nfp.rst
@@ -194,3 +194,9 @@ The flower firmware application requires the PMD running two services:
 	* PF vNIC service: handling the feedback traffic.
 	* ctrl vNIC service: communicate between PMD and firmware through
 	  control message.
+
+To offload flows, the representor ports are exposed to OVS.
+The flower firmware application supports representor ports for VFs and
+physical ports. A representor port always exists for each physical port,
+and the number of VF representor ports is specified by the user through
+a parameter.
diff --git a/doc/guides/rel_notes/release_22_11.rst b/doc/guides/rel_notes/release_22_11.rst
index c5a8c08..98425e9 100644
--- a/doc/guides/rel_notes/release_22_11.rst
+++ b/doc/guides/rel_notes/release_22_11.rst
@@ -62,6 +62,7 @@ New Features
     * Added the support of flower firmware.
     * Added the flower service infrastructure.
     * Added the control message interactive channels between PMD and firmware.
+    * Added the support of representor port.
 
 
 Removed Items
diff --git a/drivers/net/nfp/flower/nfp_flower.c b/drivers/net/nfp/flower/nfp_flower.c
index e935821..c52785c 100644
--- a/drivers/net/nfp/flower/nfp_flower.c
+++ b/drivers/net/nfp/flower/nfp_flower.c
@@ -21,6 +21,7 @@
 #include "../nfpcore/nfp_nsp.h"
 #include "nfp_flower.h"
 #include "nfp_flower_ctrl.h"
+#include "nfp_flower_representor.h"
 
 #define MAX_PKT_BURST 32
 #define MBUF_PRIV_SIZE 128
@@ -836,6 +837,12 @@ struct dp_packet {
 		goto ctrl_vnic_cleanup;
 	}
 
+	ret = nfp_flower_repr_create(app_fw_flower);
+	if (ret != 0) {
+		PMD_INIT_LOG(ERR, "Could not create representor ports");
+		goto ctrl_vnic_cleanup;
+	}
+
 	return 0;
 
 ctrl_vnic_cleanup:
diff --git a/drivers/net/nfp/flower/nfp_flower.h b/drivers/net/nfp/flower/nfp_flower.h
index b9379e5..c7d673f 100644
--- a/drivers/net/nfp/flower/nfp_flower.h
+++ b/drivers/net/nfp/flower/nfp_flower.h
@@ -18,8 +18,20 @@ enum nfp_flower_service {
  */
 #define FLOWER_PKT_DATA_OFFSET 8
 
+#define MAX_FLOWER_PHYPORTS 8
+#define MAX_FLOWER_VFS 64
+
 /* The flower application's private structure */
 struct nfp_app_fw_flower {
+	/* switch domain for this app */
+	uint16_t switch_domain_id;
+
+	/* Number of VF representors */
+	uint8_t num_vf_reprs;
+
+	/* Number of phyport representors */
+	uint8_t num_phyport_reprs;
+
 	/* List of rte_service ID's */
 	uint32_t service_ids[NFP_FLOWER_SERVICE_MAX];
 
@@ -43,6 +55,12 @@ struct nfp_app_fw_flower {
 
 	/* Ctrl vNIC Tx counter */
 	uint64_t ctrl_vnic_tx_count;
+
+	/* Array of phyport representors */
+	struct nfp_flower_representor *phy_reprs[MAX_FLOWER_PHYPORTS];
+
+	/* Array of VF representors */
+	struct nfp_flower_representor *vf_reprs[MAX_FLOWER_VFS];
 };
 
 int nfp_init_app_fw_flower(struct nfp_pf_dev *pf_dev);
diff --git a/drivers/net/nfp/flower/nfp_flower_cmsg.c b/drivers/net/nfp/flower/nfp_flower_cmsg.c
new file mode 100644
index 0000000..9d67609
--- /dev/null
+++ b/drivers/net/nfp/flower/nfp_flower_cmsg.c
@@ -0,0 +1,185 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2022 Corigine, Inc.
+ * All rights reserved.
+ */
+
+#include "../nfpcore/nfp_nsp.h"
+#include "../nfp_logs.h"
+#include "../nfp_common.h"
+#include "nfp_flower.h"
+#include "nfp_flower_cmsg.h"
+#include "nfp_flower_ctrl.h"
+#include "nfp_flower_representor.h"
+
+static void *
+nfp_flower_cmsg_init(struct rte_mbuf *m,
+		enum nfp_flower_cmsg_type type,
+		uint32_t size)
+{
+	char *pkt;
+	uint32_t data;
+	uint32_t new_size = size;
+	struct nfp_flower_cmsg_hdr *hdr;
+
+	pkt = rte_pktmbuf_mtod(m, char *);
+	PMD_DRV_LOG(DEBUG, "flower_cmsg_init using pkt at %p", pkt);
+
+	data = rte_cpu_to_be_32(NFP_NET_META_PORTID);
+	rte_memcpy(pkt, &data, 4);
+	pkt += 4;
+	new_size += 4;
+
+	/* First the metadata as flower requires it */
+	data = rte_cpu_to_be_32(NFP_META_PORT_ID_CTRL);
+	rte_memcpy(pkt, &data, 4);
+	pkt += 4;
+	new_size += 4;
+
+	/* Now the ctrl header */
+	hdr = (struct nfp_flower_cmsg_hdr *)pkt;
+	hdr->pad     = 0;
+	hdr->type    = type;
+	hdr->version = NFP_FLOWER_CMSG_VER1;
+
+	pkt = (char *)hdr + NFP_FLOWER_CMSG_HLEN;
+	new_size += NFP_FLOWER_CMSG_HLEN;
+
+	m->pkt_len = new_size;
+	m->data_len = m->pkt_len;
+
+	return pkt;
+}
+
+static void
+nfp_flower_cmsg_mac_repr_init(struct rte_mbuf *m, int num_ports)
+{
+	uint32_t size;
+	struct nfp_flower_cmsg_mac_repr *msg;
+	enum nfp_flower_cmsg_type type = NFP_FLOWER_CMSG_TYPE_MAC_REPR;
+
+	size = sizeof(*msg) + (num_ports * sizeof(msg->ports[0]));
+	msg = (struct nfp_flower_cmsg_mac_repr *)nfp_flower_cmsg_init(m,
+			type, size);
+
+	memset(msg->reserved, 0, sizeof(msg->reserved));
+	msg->num_ports = num_ports;
+}
+
+static void
+nfp_flower_cmsg_mac_repr_fill(struct rte_mbuf *m,
+		unsigned int idx,
+		unsigned int nbi,
+		unsigned int nbi_port,
+		unsigned int phys_port)
+{
+	struct nfp_flower_cmsg_mac_repr *msg;
+
+	msg = (struct nfp_flower_cmsg_mac_repr *)nfp_flower_cmsg_get_data(m);
+	msg->ports[idx].idx       = idx;
+	msg->ports[idx].info      = nbi & NFP_FLOWER_CMSG_MAC_REPR_NBI;
+	msg->ports[idx].nbi_port  = nbi_port;
+	msg->ports[idx].phys_port = phys_port;
+}
+
+int
+nfp_flower_cmsg_mac_repr(struct nfp_app_fw_flower *app_fw_flower)
+{
+	int i;
+	uint16_t cnt;
+	unsigned int nbi;
+	unsigned int nbi_port;
+	unsigned int phys_port;
+	struct rte_mbuf *mbuf;
+	struct nfp_eth_table *nfp_eth_table;
+
+	mbuf = rte_pktmbuf_alloc(app_fw_flower->ctrl_pktmbuf_pool);
+	if (mbuf == NULL) {
+		PMD_DRV_LOG(ERR, "Could not allocate mac repr cmsg");
+		return -ENOMEM;
+	}
+
+	nfp_flower_cmsg_mac_repr_init(mbuf, app_fw_flower->num_phyport_reprs);
+
+	/* Fill in the mac repr cmsg */
+	nfp_eth_table = app_fw_flower->nfp_eth_table;
+	for (i = 0; i < app_fw_flower->num_phyport_reprs; i++) {
+		nbi = nfp_eth_table->ports[i].nbi;
+		nbi_port = nfp_eth_table->ports[i].base;
+		phys_port = nfp_eth_table->ports[i].index;
+
+		nfp_flower_cmsg_mac_repr_fill(mbuf, i, nbi, nbi_port, phys_port);
+	}
+
+	/* Send the cmsg via the ctrl vNIC */
+	cnt = nfp_flower_ctrl_vnic_xmit(app_fw_flower, mbuf);
+	if (cnt == 0) {
+		PMD_DRV_LOG(ERR, "Failed to send cmsg through ctrl vNIC.");
+		rte_pktmbuf_free(mbuf);
+		return -EIO;
+	}
+
+	return 0;
+}
+
+int
+nfp_flower_cmsg_repr_reify(struct nfp_app_fw_flower *app_fw_flower,
+		struct nfp_flower_representor *repr)
+{
+	uint16_t cnt;
+	struct rte_mbuf *mbuf;
+	struct nfp_flower_cmsg_port_reify *msg;
+
+	mbuf = rte_pktmbuf_alloc(app_fw_flower->ctrl_pktmbuf_pool);
+	if (mbuf == NULL) {
+		PMD_DRV_LOG(DEBUG, "alloc mbuf for repr reify failed");
+		return -ENOMEM;
+	}
+
+	msg = (struct nfp_flower_cmsg_port_reify *)nfp_flower_cmsg_init(mbuf,
+			NFP_FLOWER_CMSG_TYPE_PORT_REIFY, sizeof(*msg));
+
+	msg->portnum  = rte_cpu_to_be_32(repr->port_id);
+	msg->reserved = 0;
+	msg->info     = rte_cpu_to_be_16(1);
+
+	cnt = nfp_flower_ctrl_vnic_xmit(app_fw_flower, mbuf);
+	if (cnt == 0) {
+		PMD_DRV_LOG(ERR, "Failed to send cmsg through ctrl vNIC.");
+		rte_pktmbuf_free(mbuf);
+		return -EIO;
+	}
+
+	return 0;
+}
+
+int
+nfp_flower_cmsg_port_mod(struct nfp_app_fw_flower *app_fw_flower,
+		uint32_t port_id, bool carrier_ok)
+{
+	uint16_t cnt;
+	struct rte_mbuf *mbuf;
+	struct nfp_flower_cmsg_port_mod *msg;
+
+	mbuf = rte_pktmbuf_alloc(app_fw_flower->ctrl_pktmbuf_pool);
+	if (mbuf == NULL) {
+		PMD_DRV_LOG(DEBUG, "alloc mbuf for repr portmod failed");
+		return -ENOMEM;
+	}
+
+	msg = (struct nfp_flower_cmsg_port_mod *)nfp_flower_cmsg_init(mbuf,
+			NFP_FLOWER_CMSG_TYPE_PORT_MOD, sizeof(*msg));
+
+	msg->portnum  = rte_cpu_to_be_32(port_id);
+	msg->reserved = 0;
+	msg->info     = carrier_ok;
+	msg->mtu      = rte_cpu_to_be_16(9000);
+
+	cnt = nfp_flower_ctrl_vnic_xmit(app_fw_flower, mbuf);
+	if (cnt == 0) {
+		PMD_DRV_LOG(ERR, "Failed to send cmsg through ctrl vNIC.");
+		rte_pktmbuf_free(mbuf);
+		return -EIO;
+	}
+
+	return 0;
+}
diff --git a/drivers/net/nfp/flower/nfp_flower_cmsg.h b/drivers/net/nfp/flower/nfp_flower_cmsg.h
new file mode 100644
index 0000000..0bf8fc8
--- /dev/null
+++ b/drivers/net/nfp/flower/nfp_flower_cmsg.h
@@ -0,0 +1,173 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2022 Corigine, Inc.
+ * All rights reserved.
+ */
+
+#ifndef _NFP_CMSG_H_
+#define _NFP_CMSG_H_
+
+#include <rte_byteorder.h>
+#include <rte_ether.h>
+
+struct nfp_flower_cmsg_hdr {
+	rte_be16_t pad;
+	uint8_t type;
+	uint8_t version;
+};
+
+/* Types defined for control messages */
+enum nfp_flower_cmsg_type {
+	NFP_FLOWER_CMSG_TYPE_FLOW_ADD       = 0,
+	NFP_FLOWER_CMSG_TYPE_FLOW_MOD       = 1,
+	NFP_FLOWER_CMSG_TYPE_FLOW_DEL       = 2,
+	NFP_FLOWER_CMSG_TYPE_LAG_CONFIG     = 4,
+	NFP_FLOWER_CMSG_TYPE_PORT_REIFY     = 6,
+	NFP_FLOWER_CMSG_TYPE_MAC_REPR       = 7,
+	NFP_FLOWER_CMSG_TYPE_PORT_MOD       = 8,
+	NFP_FLOWER_CMSG_TYPE_MERGE_HINT     = 9,
+	NFP_FLOWER_CMSG_TYPE_NO_NEIGH       = 10,
+	NFP_FLOWER_CMSG_TYPE_TUN_MAC        = 11,
+	NFP_FLOWER_CMSG_TYPE_ACTIVE_TUNS    = 12,
+	NFP_FLOWER_CMSG_TYPE_TUN_NEIGH      = 13,
+	NFP_FLOWER_CMSG_TYPE_TUN_IPS        = 14,
+	NFP_FLOWER_CMSG_TYPE_FLOW_STATS     = 15,
+	NFP_FLOWER_CMSG_TYPE_PORT_ECHO      = 16,
+	NFP_FLOWER_CMSG_TYPE_QOS_MOD        = 18,
+	NFP_FLOWER_CMSG_TYPE_QOS_DEL        = 19,
+	NFP_FLOWER_CMSG_TYPE_QOS_STATS      = 20,
+	NFP_FLOWER_CMSG_TYPE_PRE_TUN_RULE   = 21,
+	NFP_FLOWER_CMSG_TYPE_TUN_IPS_V6     = 22,
+	NFP_FLOWER_CMSG_TYPE_NO_NEIGH_V6    = 23,
+	NFP_FLOWER_CMSG_TYPE_TUN_NEIGH_V6   = 24,
+	NFP_FLOWER_CMSG_TYPE_ACTIVE_TUNS_V6 = 25,
+	NFP_FLOWER_CMSG_TYPE_MAX            = 32,
+};
+
+/*
+ * NFP_FLOWER_CMSG_TYPE_MAC_REPR
+ *    Bit    3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0
+ *    -----\ 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
+ *     Word +---------------+-----------+---+---------------+---------------+
+ *       0  |                  spare                        |Number of ports|
+ *          +---------------+-----------+---+---------------+---------------+
+ *       1  |    Index      |   spare   |NBI|  Port on NBI  | Chip-wide port|
+ *          +---------------+-----------+---+---------------+---------------+
+ *                                        ....
+ *          +---------------+-----------+---+---------------+---------------+
+ *     N-1  |    Index      |   spare   |NBI|  Port on NBI  | Chip-wide port|
+ *          +---------------+-----------+---+---------------+---------------+
+ *     N    |    Index      |   spare   |NBI|  Port on NBI  | Chip-wide port|
+ *          +---------------+-----------+---+---------------+---------------+
+ *
+ *          Index: index into the eth table
+ *          NBI (bits 17-16): NBI number (0-3)
+ *          Port on NBI (bits 15-8): "base" in the driver;
+ *            this forms the NBIX.PortY notation used in the NSP eth table.
+ *          "Chip-wide" port (bits 7-0):
+ */
+struct nfp_flower_cmsg_mac_repr {
+	uint8_t reserved[3];
+	uint8_t num_ports;
+	struct {
+		uint8_t idx;
+		uint8_t info;
+		uint8_t nbi_port;
+		uint8_t phys_port;
+	} ports[0];
+};
+
+/*
+ * NFP_FLOWER_CMSG_TYPE_PORT_REIFY
+ *    Bit    3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0
+ *    -----\ 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
+ *    Word  +-------+-------+---+---+-------+---+---+-----------+-----------+
+ *       0  |Port Ty|Sys ID |NIC|Rsv| Spare |PCI|typ|    vNIC   |  queue    |
+ *          +-------+-----+-+---+---+-------+---+---+-----------+---------+-+
+ *       1  |                             Spare                           |E|
+ *          +-------------------------------------------------------------+-+
+ *          E: 1 = Representor exists, 0 = Representor does not exist
+ */
+struct nfp_flower_cmsg_port_reify {
+	rte_be32_t portnum;
+	rte_be16_t reserved;
+	rte_be16_t info;
+};
+
+/*
+ * NFP_FLOWER_CMSG_TYPE_PORT_MOD
+ *    Bit    3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0
+ *    -----\ 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
+ *    Word  +-------+-------+---+---+-------+---+---+-------+---+-----------+
+ *       0  |Port Ty|Sys ID |NIC|Rsv|       Reserved        |    Port       |
+ *          +-------+-------+---+---+-----+-+---+---+-------+---+-----------+
+ *       1  |            Spare            |L|              MTU              |
+ *          +-----------------------------+-+-------------------------------+
+ *        L: Link or Admin state bit. When message is generated by host, this
+ *           bit indicates the admin state (0=down, 1=up). When generated by
+ *           NFP, it indicates the link state (0=down, 1=up)
+ *
+ *        Port Type (word 0, bits 31 to 28) = 1 (Physical Network)
+ *        Port: "Chip-wide number" as assigned by BSP
+ *
+ *    Bit    3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0
+ *    -----\ 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
+ *    Word  +-------+-------+---+---+-------+---+---+-------+---+-----------+
+ *       0  |Port Ty|Sys ID |NIC|Rsv| Spare |PCI|typ|    vNIC   |  queue    |
+ *          +-------+-----+-+---+---+---+-+-+---+---+-------+---+-----------+
+ *       1  |            Spare            |L|              MTU              |
+ *          +-----------------------------+-+-------------------------------+
+ *        L: Link or Admin state bit. When message is generated by host, this
+ *           bit indicates the admin state (0=down, 1=up). When generated by
+ *           NFP, it indicates the link state (0=down, 1=up)
+ *
+ *        Port Type (word 0, bits 31 to 28) = 2 (PCIE)
+ */
+struct nfp_flower_cmsg_port_mod {
+	rte_be32_t portnum;
+	uint8_t reserved;
+	uint8_t info;
+	rte_be16_t mtu;
+};
+
+enum nfp_flower_cmsg_port_type {
+	NFP_FLOWER_CMSG_PORT_TYPE_UNSPEC,
+	NFP_FLOWER_CMSG_PORT_TYPE_PHYS_PORT,
+	NFP_FLOWER_CMSG_PORT_TYPE_PCIE_PORT,
+	NFP_FLOWER_CMSG_PORT_TYPE_OTHER_PORT,
+};
+
+enum nfp_flower_cmsg_port_vnic_type {
+	NFP_FLOWER_CMSG_PORT_VNIC_TYPE_VF,
+	NFP_FLOWER_CMSG_PORT_VNIC_TYPE_PF,
+	NFP_FLOWER_CMSG_PORT_VNIC_TYPE_CTRL,
+};
+
+#define NFP_FLOWER_CMSG_MAC_REPR_NBI            (0x3)
+
+#define NFP_FLOWER_CMSG_HLEN            sizeof(struct nfp_flower_cmsg_hdr)
+#define NFP_FLOWER_CMSG_VER1            1
+#define NFP_NET_META_PORTID             5
+#define NFP_META_PORT_ID_CTRL           (~0U)
+
+#define NFP_FLOWER_CMSG_PORT_TYPE(x)            (((x) >> 28) & 0xf)  /* [31,28] */
+#define NFP_FLOWER_CMSG_PORT_SYS_ID(x)          (((x) >> 24) & 0xf)  /* [24,27] */
+#define NFP_FLOWER_CMSG_PORT_NFP_ID(x)          (((x) >> 22) & 0x3)  /* [22,23] */
+#define NFP_FLOWER_CMSG_PORT_PCI(x)             (((x) >> 14) & 0x3)  /* [14,15] */
+#define NFP_FLOWER_CMSG_PORT_VNIC_TYPE(x)       (((x) >> 12) & 0x3)  /* [12,13] */
+#define NFP_FLOWER_CMSG_PORT_VNIC(x)            (((x) >> 6) & 0x3f)  /* [6,11] */
+#define NFP_FLOWER_CMSG_PORT_PCIE_Q(x)          ((x) & 0x3f)         /* [0,5] */
+#define NFP_FLOWER_CMSG_PORT_PHYS_PORT_NUM(x)   ((x) & 0xff)         /* [0,7] */
+
+static inline char *
+nfp_flower_cmsg_get_data(struct rte_mbuf *m)
+{
+	return rte_pktmbuf_mtod(m, char *) + 4 + 4 + NFP_FLOWER_CMSG_HLEN;
+}
+
+int nfp_flower_cmsg_mac_repr(struct nfp_app_fw_flower *app_fw_flower);
+int nfp_flower_cmsg_repr_reify(struct nfp_app_fw_flower *app_fw_flower,
+		struct nfp_flower_representor *repr);
+int nfp_flower_cmsg_port_mod(struct nfp_app_fw_flower *app_fw_flower,
+		uint32_t port_id, bool carrier_ok);
+
+#endif /* _NFP_CMSG_H_ */
diff --git a/drivers/net/nfp/flower/nfp_flower_representor.c b/drivers/net/nfp/flower/nfp_flower_representor.c
new file mode 100644
index 0000000..745fde3
--- /dev/null
+++ b/drivers/net/nfp/flower/nfp_flower_representor.c
@@ -0,0 +1,664 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright (c) 2022 Corigine, Inc.
+ * All rights reserved.
+ */
+
+#include <rte_common.h>
+#include <ethdev_pci.h>
+
+#include "../nfp_common.h"
+#include "../nfp_logs.h"
+#include "../nfp_ctrl.h"
+#include "../nfp_rxtx.h"
+#include "../nfpcore/nfp_mip.h"
+#include "../nfpcore/nfp_rtsym.h"
+#include "../nfpcore/nfp_nsp.h"
+#include "nfp_flower.h"
+#include "nfp_flower_representor.h"
+#include "nfp_flower_ctrl.h"
+#include "nfp_flower_cmsg.h"
+
+static int
+nfp_flower_repr_link_update(struct rte_eth_dev *dev,
+		__rte_unused int wait_to_complete)
+{
+	int ret;
+	uint32_t nn_link_status;
+	struct nfp_net_hw *pf_hw;
+	struct rte_eth_link *link;
+	struct nfp_flower_representor *repr;
+
+	static const uint32_t ls_to_ethtool[] = {
+		[NFP_NET_CFG_STS_LINK_RATE_UNSUPPORTED] = RTE_ETH_SPEED_NUM_NONE,
+		[NFP_NET_CFG_STS_LINK_RATE_UNKNOWN]     = RTE_ETH_SPEED_NUM_NONE,
+		[NFP_NET_CFG_STS_LINK_RATE_1G]          = RTE_ETH_SPEED_NUM_1G,
+		[NFP_NET_CFG_STS_LINK_RATE_10G]         = RTE_ETH_SPEED_NUM_10G,
+		[NFP_NET_CFG_STS_LINK_RATE_25G]         = RTE_ETH_SPEED_NUM_25G,
+		[NFP_NET_CFG_STS_LINK_RATE_40G]         = RTE_ETH_SPEED_NUM_40G,
+		[NFP_NET_CFG_STS_LINK_RATE_50G]         = RTE_ETH_SPEED_NUM_50G,
+		[NFP_NET_CFG_STS_LINK_RATE_100G]        = RTE_ETH_SPEED_NUM_100G,
+	};
+
+	repr = (struct nfp_flower_representor *)dev->data->dev_private;
+	link = &repr->link;
+	pf_hw = repr->app_fw_flower->pf_hw;
+
+	memset(link, 0, sizeof(struct rte_eth_link));
+	nn_link_status = nn_cfg_readl(pf_hw, NFP_NET_CFG_STS);
+
+	if (nn_link_status & NFP_NET_CFG_STS_LINK)
+		link->link_status = RTE_ETH_LINK_UP;
+
+	link->link_duplex = RTE_ETH_LINK_FULL_DUPLEX;
+
+	nn_link_status = (nn_link_status >> NFP_NET_CFG_STS_LINK_RATE_SHIFT) &
+			 NFP_NET_CFG_STS_LINK_RATE_MASK;
+
+	if (nn_link_status >= RTE_DIM(ls_to_ethtool))
+		link->link_speed = RTE_ETH_SPEED_NUM_NONE;
+	else
+		link->link_speed = ls_to_ethtool[nn_link_status];
+
+	ret = rte_eth_linkstatus_set(dev, link);
+	if (ret == 0) {
+		if (link->link_status)
+			PMD_DRV_LOG(INFO, "NIC Link is Up");
+		else
+			PMD_DRV_LOG(INFO, "NIC Link is Down");
+	}
+
+	return ret;
+}
+
+static int
+nfp_flower_repr_dev_infos_get(__rte_unused struct rte_eth_dev *dev,
+		struct rte_eth_dev_info *dev_info)
+{
+	/* Hardcoded pktlen and queues for now */
+	dev_info->max_rx_queues = 1;
+	dev_info->max_tx_queues = 1;
+	dev_info->min_rx_bufsize = RTE_ETHER_MIN_MTU;
+	dev_info->max_rx_pktlen = 9000;
+
+	dev_info->rx_offload_capa = RTE_ETH_RX_OFFLOAD_VLAN_STRIP;
+	dev_info->rx_offload_capa |= RTE_ETH_RX_OFFLOAD_IPV4_CKSUM |
+			RTE_ETH_RX_OFFLOAD_UDP_CKSUM |
+			RTE_ETH_RX_OFFLOAD_TCP_CKSUM;
+
+	dev_info->tx_offload_capa = RTE_ETH_TX_OFFLOAD_VLAN_INSERT;
+	dev_info->tx_offload_capa |= RTE_ETH_TX_OFFLOAD_IPV4_CKSUM |
+			RTE_ETH_TX_OFFLOAD_UDP_CKSUM |
+			RTE_ETH_TX_OFFLOAD_TCP_CKSUM;
+	dev_info->tx_offload_capa |= RTE_ETH_TX_OFFLOAD_TCP_TSO;
+	dev_info->tx_offload_capa |= RTE_ETH_TX_OFFLOAD_MULTI_SEGS;
+
+	dev_info->max_mac_addrs = 1;
+
+	return 0;
+}
+
+static int
+nfp_flower_repr_dev_configure(struct rte_eth_dev *dev)
+{
+	struct nfp_net_hw *pf_hw;
+	struct rte_eth_conf *dev_conf;
+	struct rte_eth_rxmode *rxmode;
+	struct nfp_flower_representor *repr;
+
+	repr = (struct nfp_flower_representor *)dev->data->dev_private;
+	pf_hw = repr->app_fw_flower->pf_hw;
+
+	dev_conf = &dev->data->dev_conf;
+	rxmode = &dev_conf->rxmode;
+
+	/* Checking MTU set */
+	if (rxmode->mtu > pf_hw->flbufsz) {
+		PMD_DRV_LOG(INFO, "MTU (%u) larger than current mbufsize (%u) not supported",
+				rxmode->mtu, pf_hw->flbufsz);
+		return -ERANGE;
+	}
+
+	return 0;
+}
+
+static int
+nfp_flower_repr_dev_start(struct rte_eth_dev *dev)
+{
+	struct nfp_flower_representor *repr;
+	struct nfp_app_fw_flower *app_fw_flower;
+
+	repr = (struct nfp_flower_representor *)dev->data->dev_private;
+	app_fw_flower = repr->app_fw_flower;
+
+	if (repr->repr_type == NFP_REPR_TYPE_PHYS_PORT) {
+		nfp_eth_set_configured(app_fw_flower->pf_hw->pf_dev->cpp,
+				repr->nfp_idx, 1);
+	}
+
+	nfp_flower_cmsg_port_mod(app_fw_flower, repr->port_id, true);
+
+	return 0;
+}
+
+static int
+nfp_flower_repr_dev_stop(struct rte_eth_dev *dev)
+{
+	struct nfp_flower_representor *repr;
+	struct nfp_app_fw_flower *app_fw_flower;
+
+	repr = (struct nfp_flower_representor *)dev->data->dev_private;
+	app_fw_flower = repr->app_fw_flower;
+
+	nfp_flower_cmsg_port_mod(app_fw_flower, repr->port_id, false);
+
+	if (repr->repr_type == NFP_REPR_TYPE_PHYS_PORT) {
+		nfp_eth_set_configured(app_fw_flower->pf_hw->pf_dev->cpp,
+				repr->nfp_idx, 0);
+	}
+
+	return 0;
+}
+
+static int
+nfp_flower_repr_rx_queue_setup(struct rte_eth_dev *dev,
+		uint16_t rx_queue_id,
+		__rte_unused uint16_t nb_rx_desc,
+		unsigned int socket_id,
+		__rte_unused const struct rte_eth_rxconf *rx_conf,
+		__rte_unused struct rte_mempool *mb_pool)
+{
+	struct nfp_net_rxq *rxq;
+	struct nfp_net_hw *pf_hw;
+	struct nfp_flower_representor *repr;
+
+	repr = (struct nfp_flower_representor *)dev->data->dev_private;
+	pf_hw = repr->app_fw_flower->pf_hw;
+
+	/* Allocating rx queue data structure */
+	rxq = rte_zmalloc_socket("ethdev RX queue", sizeof(struct nfp_net_rxq),
+			RTE_CACHE_LINE_SIZE, socket_id);
+	if (rxq == NULL)
+		return -ENOMEM;
+
+	rxq->hw = pf_hw;
+	rxq->qidx = rx_queue_id;
+	rxq->port_id = dev->data->port_id;
+	dev->data->rx_queues[rx_queue_id] = rxq;
+
+	return 0;
+}
+
+static int
+nfp_flower_repr_tx_queue_setup(struct rte_eth_dev *dev,
+		uint16_t tx_queue_id,
+		__rte_unused uint16_t nb_tx_desc,
+		unsigned int socket_id,
+		__rte_unused const struct rte_eth_txconf *tx_conf)
+{
+	struct nfp_net_txq *txq;
+	struct nfp_net_hw *pf_hw;
+	struct nfp_flower_representor *repr;
+
+	repr = (struct nfp_flower_representor *)dev->data->dev_private;
+	pf_hw = repr->app_fw_flower->pf_hw;
+
+	/* Allocating tx queue data structure */
+	txq = rte_zmalloc_socket("ethdev TX queue", sizeof(struct nfp_net_txq),
+			RTE_CACHE_LINE_SIZE, socket_id);
+	if (txq == NULL)
+		return -ENOMEM;
+
+	txq->hw = pf_hw;
+	txq->qidx = tx_queue_id;
+	txq->port_id = dev->data->port_id;
+	dev->data->tx_queues[tx_queue_id] = txq;
+
+	return 0;
+}
+
+static int
+nfp_flower_repr_stats_get(struct rte_eth_dev *ethdev,
+		struct rte_eth_stats *stats)
+{
+	struct nfp_flower_representor *repr;
+
+	repr = (struct nfp_flower_representor *)ethdev->data->dev_private;
+	rte_memcpy(stats, &repr->repr_stats, sizeof(struct rte_eth_stats));
+
+	return 0;
+}
+
+static int
+nfp_flower_repr_stats_reset(struct rte_eth_dev *ethdev)
+{
+	struct nfp_flower_representor *repr;
+
+	repr = (struct nfp_flower_representor *)ethdev->data->dev_private;
+	memset(&repr->repr_stats, 0, sizeof(struct rte_eth_stats));
+
+	return 0;
+}
+
+static int
+nfp_flower_repr_promiscuous_enable(struct rte_eth_dev *dev)
+{
+	struct nfp_net_hw *pf_hw;
+	struct nfp_flower_representor *repr;
+
+	repr = (struct nfp_flower_representor *)dev->data->dev_private;
+	pf_hw = repr->app_fw_flower->pf_hw;
+
+	if (!(pf_hw->cap & NFP_NET_CFG_CTRL_PROMISC)) {
+		PMD_DRV_LOG(INFO, "Promiscuous mode not supported");
+		return -ENOTSUP;
+	}
+
+	if (pf_hw->ctrl & NFP_NET_CFG_CTRL_PROMISC) {
+		PMD_DRV_LOG(INFO, "Promiscuous mode already enabled");
+		return 0;
+	}
+
+	return nfp_net_promisc_enable(pf_hw->eth_dev);
+}
+
+static int
+nfp_flower_repr_promiscuous_disable(struct rte_eth_dev *dev)
+{
+	struct nfp_net_hw *pf_hw;
+	struct nfp_flower_representor *repr;
+
+	repr = (struct nfp_flower_representor *)dev->data->dev_private;
+	pf_hw = repr->app_fw_flower->pf_hw;
+
+	if ((pf_hw->ctrl & NFP_NET_CFG_CTRL_PROMISC) == 0) {
+		PMD_DRV_LOG(INFO, "Promiscuous mode already disabled");
+		return 0;
+	}
+
+	return nfp_net_promisc_disable(pf_hw->eth_dev);
+}
+
+static int
+nfp_flower_repr_mac_addr_set(struct rte_eth_dev *ethdev,
+		struct rte_ether_addr *mac_addr)
+{
+	struct nfp_flower_representor *repr;
+
+	repr = (struct nfp_flower_representor *)ethdev->data->dev_private;
+	rte_ether_addr_copy(mac_addr, &repr->mac_addr);
+	rte_ether_addr_copy(mac_addr, ethdev->data->mac_addrs);
+
+	return 0;
+}
+
+static uint16_t
+nfp_flower_repr_rx_burst(void *rx_queue,
+		struct rte_mbuf **rx_pkts,
+		uint16_t nb_pkts)
+{
+	unsigned int available = 0;
+	unsigned int total_dequeue;
+	struct nfp_net_rxq *rxq;
+	struct rte_eth_dev *dev;
+	struct nfp_flower_representor *repr;
+
+	rxq = rx_queue;
+	if (unlikely(rxq == NULL)) {
+		PMD_RX_LOG(ERR, "RX Bad queue");
+		return 0;
+	}
+
+	dev = &rte_eth_devices[rxq->port_id];
+	repr = dev->data->dev_private;
+	if (unlikely(repr->ring == NULL)) {
+		PMD_RX_LOG(ERR, "representor %s has no ring configured!",
+				repr->name);
+		return 0;
+	}
+
+	total_dequeue = rte_ring_dequeue_burst(repr->ring, (void *)rx_pkts,
+			nb_pkts, &available);
+	if (total_dequeue != 0) {
+		PMD_RX_LOG(DEBUG, "Representor Rx burst for %s, port_id: 0x%x, "
+				"received: %u, available: %u", repr->name,
+				repr->port_id, total_dequeue, available);
+
+		repr->repr_stats.ipackets += total_dequeue;
+	}
+
+	return total_dequeue;
+}
+
+static uint16_t
+nfp_flower_repr_tx_burst(void *tx_queue,
+		struct rte_mbuf **tx_pkts,
+		uint16_t nb_pkts)
+{
+	uint16_t i;
+	uint16_t sent;
+	char *meta_offset;
+	struct nfp_net_txq *txq;
+	struct nfp_net_hw *pf_hw;
+	struct rte_eth_dev *dev;
+	struct rte_eth_dev *repr_dev;
+	struct nfp_flower_representor *repr;
+
+	txq = tx_queue;
+	if (unlikely(txq == NULL)) {
+		PMD_TX_LOG(ERR, "TX Bad queue");
+		return 0;
+	}
+
+	/* This points to the PF vNIC that owns this representor */
+	pf_hw = txq->hw;
+	dev = pf_hw->eth_dev;
+
+	/* Grab a handle to the representor struct */
+	repr_dev = &rte_eth_devices[txq->port_id];
+	repr = repr_dev->data->dev_private;
+
+	for (i = 0; i < nb_pkts; i++) {
+		meta_offset = rte_pktmbuf_prepend(tx_pkts[i], FLOWER_PKT_DATA_OFFSET);
+		*(uint32_t *)meta_offset = rte_cpu_to_be_32(NFP_NET_META_PORTID);
+		meta_offset += 4;
+		*(uint32_t *)meta_offset = rte_cpu_to_be_32(repr->port_id);
+	}
+
+	/* Only using Tx queue 0 for now. */
+	sent = rte_eth_tx_burst(dev->data->port_id, 0, tx_pkts, nb_pkts);
+	if (sent != 0) {
+		PMD_TX_LOG(DEBUG, "Representor Tx burst for %s, port_id: 0x%x transmitted: %u",
+				repr->name, repr->port_id, sent);
+		repr->repr_stats.opackets += sent;
+	}
+
+	return sent;
+}
+
+static const struct eth_dev_ops nfp_flower_repr_dev_ops = {
+	.dev_infos_get        = nfp_flower_repr_dev_infos_get,
+
+	.dev_start            = nfp_flower_repr_dev_start,
+	.dev_configure        = nfp_flower_repr_dev_configure,
+	.dev_stop             = nfp_flower_repr_dev_stop,
+
+	.rx_queue_setup       = nfp_flower_repr_rx_queue_setup,
+	.tx_queue_setup       = nfp_flower_repr_tx_queue_setup,
+
+	.link_update          = nfp_flower_repr_link_update,
+
+	.stats_get            = nfp_flower_repr_stats_get,
+	.stats_reset          = nfp_flower_repr_stats_reset,
+
+	.promiscuous_enable   = nfp_flower_repr_promiscuous_enable,
+	.promiscuous_disable  = nfp_flower_repr_promiscuous_disable,
+
+	.mac_addr_set         = nfp_flower_repr_mac_addr_set,
+};
+
+static uint32_t
+nfp_flower_get_phys_port_id(uint8_t port)
+{
+	return (NFP_FLOWER_CMSG_PORT_TYPE_PHYS_PORT << 28) | port;
+}
+
+static uint32_t
+nfp_get_pcie_port_id(struct nfp_cpp *cpp,
+		int type,
+		uint8_t vnic,
+		uint8_t queue)
+{
+	uint8_t nfp_pcie;
+	uint32_t port_id;
+
+	nfp_pcie = NFP_CPP_INTERFACE_UNIT_of(nfp_cpp_interface(cpp));
+	port_id = ((nfp_pcie & 0x3) << 14) |
+			((type & 0x3) << 12) |
+			((vnic & 0x3f) << 6) |
+			(queue & 0x3f) |
+			((NFP_FLOWER_CMSG_PORT_TYPE_PCIE_PORT & 0xf) << 28);
+
+	return port_id;
+}
+
+static int
+nfp_flower_repr_init(struct rte_eth_dev *eth_dev,
+		void *init_params)
+{
+	int ret;
+	unsigned int numa_node;
+	char ring_name[RTE_ETH_NAME_MAX_LEN];
+	struct nfp_app_fw_flower *app_fw_flower;
+	struct nfp_flower_representor *repr;
+	struct nfp_flower_representor *init_repr_data;
+
+	/* Cast the input representor data to the correct struct here */
+	init_repr_data = (struct nfp_flower_representor *)init_params;
+
+	app_fw_flower = init_repr_data->app_fw_flower;
+
+	/* Memory has been allocated in the eth_dev_create() function */
+	repr = eth_dev->data->dev_private;
+
+	/*
+	 * We need multi-producer rings as we can have multiple PF ports.
+	 * On the other hand, we need single consumer rings, as just one
+	 * representor PMD will try to read from the ring.
+	 */
+	snprintf(ring_name, sizeof(ring_name), "%s_%s", init_repr_data->name, "ring");
+	numa_node = rte_socket_id();
+	repr->ring = rte_ring_create(ring_name, 256, numa_node, RING_F_SC_DEQ);
+	if (repr->ring == NULL) {
+		PMD_DRV_LOG(ERR, "rte_ring_create failed for %s", ring_name);
+		return -ENOMEM;
+	}
+
+	/* Copy data here from the input representor template */
+	repr->vf_id            = init_repr_data->vf_id;
+	repr->switch_domain_id = init_repr_data->switch_domain_id;
+	repr->port_id          = init_repr_data->port_id;
+	repr->nfp_idx          = init_repr_data->nfp_idx;
+	repr->repr_type        = init_repr_data->repr_type;
+	repr->app_fw_flower    = init_repr_data->app_fw_flower;
+
+	snprintf(repr->name, sizeof(repr->name), "%s", init_repr_data->name);
+
+	eth_dev->dev_ops = &nfp_flower_repr_dev_ops;
+
+	eth_dev->rx_pkt_burst = nfp_flower_repr_rx_burst;
+	eth_dev->tx_pkt_burst = nfp_flower_repr_tx_burst;
+
+	eth_dev->data->dev_flags |= RTE_ETH_DEV_REPRESENTOR;
+
+	if (repr->repr_type == NFP_REPR_TYPE_PHYS_PORT)
+		eth_dev->data->representor_id = repr->vf_id;
+	else
+		eth_dev->data->representor_id = repr->vf_id +
+				app_fw_flower->num_phyport_reprs;
+
+	/* This backer port is that of the eth_device created for the PF vNIC */
+	eth_dev->data->backer_port_id = app_fw_flower->pf_hw->eth_dev->data->port_id;
+
+	/* Only single queues for representor devices */
+	eth_dev->data->nb_rx_queues = 1;
+	eth_dev->data->nb_tx_queues = 1;
+
+	/* Allocating memory for mac addr */
+	eth_dev->data->mac_addrs = rte_zmalloc("mac_addr", RTE_ETHER_ADDR_LEN, 0);
+	if (eth_dev->data->mac_addrs == NULL) {
+		PMD_INIT_LOG(ERR, "Failed to allocate memory for repr MAC");
+		ret = -ENOMEM;
+		goto ring_cleanup;
+	}
+
+	rte_ether_addr_copy(&init_repr_data->mac_addr, &repr->mac_addr);
+	rte_ether_addr_copy(&init_repr_data->mac_addr, eth_dev->data->mac_addrs);
+
+	/* Send reify message to hardware to inform it about the new repr */
+	ret = nfp_flower_cmsg_repr_reify(app_fw_flower, repr);
+	if (ret != 0) {
+		PMD_INIT_LOG(WARNING, "Failed to send repr reify message");
+		goto mac_cleanup;
+	}
+
+	/* Add repr to correct array */
+	if (repr->repr_type == NFP_REPR_TYPE_PHYS_PORT)
+		app_fw_flower->phy_reprs[repr->nfp_idx] = repr;
+	else
+		app_fw_flower->vf_reprs[repr->vf_id] = repr;
+
+	return 0;
+
+mac_cleanup:
+	rte_free(eth_dev->data->mac_addrs);
+ring_cleanup:
+	rte_ring_free(repr->ring);
+
+	return ret;
+}
+
+static int
+nfp_flower_repr_alloc(struct nfp_app_fw_flower *app_fw_flower)
+{
+	int i;
+	int ret;
+	struct rte_eth_dev *eth_dev;
+	struct nfp_eth_table *nfp_eth_table;
+	struct nfp_eth_table_port *eth_port;
+	struct nfp_flower_representor flower_repr = {
+		.switch_domain_id = app_fw_flower->switch_domain_id,
+		.app_fw_flower    = app_fw_flower,
+	};
+
+	nfp_eth_table = app_fw_flower->nfp_eth_table;
+	eth_dev = app_fw_flower->pf_hw->eth_dev;
+
+	/* Send a NFP_FLOWER_CMSG_TYPE_MAC_REPR cmsg to hardware */
+	ret = nfp_flower_cmsg_mac_repr(app_fw_flower);
+	if (ret != 0) {
+		PMD_INIT_LOG(ERR, "Could not send mac repr cmsgs");
+		return ret;
+	}
+
+	/* Create a rte_eth_dev for every phyport representor */
+	for (i = 0; i < app_fw_flower->num_phyport_reprs; i++) {
+		eth_port = &nfp_eth_table->ports[i];
+		flower_repr.repr_type = NFP_REPR_TYPE_PHYS_PORT;
+		flower_repr.port_id = nfp_flower_get_phys_port_id(eth_port->index);
+		flower_repr.nfp_idx = eth_port->eth_index;
+		flower_repr.vf_id = i;
+
+		/* Copy the real mac of the interface to the representor struct */
+		rte_ether_addr_copy((struct rte_ether_addr *)eth_port->mac_addr,
+				&flower_repr.mac_addr);
+		snprintf(flower_repr.name, sizeof(flower_repr.name), "flower_repr_p%d", i);
+
+		/*
+		 * Create a eth_dev for this representor
+		 * This will also allocate private memory for the device
+		 */
+		ret = rte_eth_dev_create(eth_dev->device, flower_repr.name,
+				sizeof(struct nfp_flower_representor),
+				NULL, NULL, nfp_flower_repr_init, &flower_repr);
+		if (ret != 0) {
+			PMD_INIT_LOG(ERR, "Could not create eth_dev for repr");
+			break;
+		}
+	}
+
+	if (i < app_fw_flower->num_phyport_reprs)
+		return ret;
+
+	/*
+	 * Now allocate eth_dev's for VF representors.
+	 * Also send reify messages
+	 */
+	for (i = 0; i < app_fw_flower->num_vf_reprs; i++) {
+		flower_repr.repr_type = NFP_REPR_TYPE_VF;
+		flower_repr.port_id = nfp_get_pcie_port_id(app_fw_flower->pf_hw->cpp,
+				NFP_FLOWER_CMSG_PORT_VNIC_TYPE_VF, i, 0);
+		flower_repr.nfp_idx = 0;
+		flower_repr.vf_id = i;
+
+		/* VF reprs get a random MAC address */
+		rte_eth_random_addr(flower_repr.mac_addr.addr_bytes);
+
+		snprintf(flower_repr.name, sizeof(flower_repr.name), "flower_repr_vf%d", i);
+
+		/* This will also allocate private memory for the device */
+		ret = rte_eth_dev_create(eth_dev->device, flower_repr.name,
+				sizeof(struct nfp_flower_representor),
+				NULL, NULL, nfp_flower_repr_init, &flower_repr);
+		if (ret != 0) {
+			PMD_INIT_LOG(ERR, "Could not create eth_dev for repr");
+			break;
+		}
+	}
+
+	if (i < app_fw_flower->num_vf_reprs)
+		return ret;
+
+	return 0;
+}
+
+int
+nfp_flower_repr_create(struct nfp_app_fw_flower *app_fw_flower)
+{
+	int ret;
+	struct nfp_pf_dev *pf_dev;
+	struct rte_pci_device *pci_dev;
+	struct rte_eth_devargs eth_da = {
+		.nb_representor_ports = 0
+	};
+
+	pf_dev = app_fw_flower->pf_hw->pf_dev;
+	pci_dev = pf_dev->pci_dev;
+
+	/* Allocate a switch domain for the flower app */
+	if (app_fw_flower->switch_domain_id == RTE_ETH_DEV_SWITCH_DOMAIN_ID_INVALID &&
+			rte_eth_switch_domain_alloc(&app_fw_flower->switch_domain_id)) {
+		PMD_INIT_LOG(WARNING, "failed to allocate switch domain for device");
+	}
+
+	/* Now parse PCI device args passed for representor info */
+	if (pci_dev->device.devargs != NULL) {
+		ret = rte_eth_devargs_parse(pci_dev->device.devargs->args, &eth_da);
+		if (ret != 0) {
+			PMD_INIT_LOG(ERR, "devarg parse failed");
+			return -EINVAL;
+		}
+	}
+
+	if (eth_da.nb_representor_ports == 0) {
+		PMD_INIT_LOG(DEBUG, "No representor ports need to be created.");
+		return 0;
+	}
+
+	/* Phy representors always exist */
+	if (eth_da.nb_representor_ports < app_fw_flower->nfp_eth_table->count) {
+		PMD_INIT_LOG(ERR, "Should also create phy representor port.");
+		return -ERANGE;
+	}
+
+	/* Only support VF representor creation via the command line */
+	if (eth_da.type != RTE_ETH_REPRESENTOR_VF) {
+		PMD_INIT_LOG(ERR, "Unsupported representor type: %d", eth_da.type);
+		return -ENOTSUP;
+	}
+
+	/* Fill in flower app with repr counts */
+	app_fw_flower->num_phyport_reprs = (uint8_t)app_fw_flower->nfp_eth_table->count;
+	app_fw_flower->num_vf_reprs = eth_da.nb_representor_ports -
+			app_fw_flower->nfp_eth_table->count;
+
+	PMD_INIT_LOG(INFO, "Number of VF reprs: %d", app_fw_flower->num_vf_reprs);
+	PMD_INIT_LOG(INFO, "Number of phyport reprs: %d", app_fw_flower->num_phyport_reprs);
+
+	ret = nfp_flower_repr_alloc(app_fw_flower);
+	if (ret != 0) {
+		PMD_INIT_LOG(ERR, "representors allocation failed");
+		return -EINVAL;
+	}
+
+	return 0;
+}
diff --git a/drivers/net/nfp/flower/nfp_flower_representor.h b/drivers/net/nfp/flower/nfp_flower_representor.h
new file mode 100644
index 0000000..af44ef3
--- /dev/null
+++ b/drivers/net/nfp/flower/nfp_flower_representor.h
@@ -0,0 +1,39 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2022 Corigine, Inc.
+ * All rights reserved.
+ */
+
+#ifndef _NFP_FLOWER_REPRESENTOR_H_
+#define _NFP_FLOWER_REPRESENTOR_H_
+
+/*
+ * enum nfp_repr_type - type of representor
+ * @NFP_REPR_TYPE_PHYS_PORT:   external NIC port
+ * @NFP_REPR_TYPE_PF:          physical function
+ * @NFP_REPR_TYPE_VF:          virtual function
+ * @NFP_REPR_TYPE_MAX:         number of representor types
+ */
+enum nfp_repr_type {
+	NFP_REPR_TYPE_PHYS_PORT = 0,
+	NFP_REPR_TYPE_PF,
+	NFP_REPR_TYPE_VF,
+	NFP_REPR_TYPE_MAX,
+};
+
+struct nfp_flower_representor {
+	uint16_t vf_id;
+	uint16_t switch_domain_id;
+	uint32_t repr_type;
+	uint32_t port_id;
+	uint32_t nfp_idx;    /* only valid for the repr of physical port */
+	char name[RTE_ETH_NAME_MAX_LEN];
+	struct rte_ether_addr mac_addr;
+	struct nfp_app_fw_flower *app_fw_flower;
+	struct rte_ring *ring;
+	struct rte_eth_link link;
+	struct rte_eth_stats repr_stats;
+};
+
+int nfp_flower_repr_create(struct nfp_app_fw_flower *app_fw_flower);
+
+#endif /* _NFP_FLOWER_REPRESENTOR_H_ */
diff --git a/drivers/net/nfp/meson.build b/drivers/net/nfp/meson.build
index 8710213..8a63979 100644
--- a/drivers/net/nfp/meson.build
+++ b/drivers/net/nfp/meson.build
@@ -7,7 +7,9 @@ if not is_linux or not dpdk_conf.get('RTE_ARCH_64')
 endif
 sources = files(
         'flower/nfp_flower.c',
+        'flower/nfp_flower_cmsg.c',
         'flower/nfp_flower_ctrl.c',
+        'flower/nfp_flower_representor.c',
         'nfpcore/nfp_cpp_pcie_ops.c',
         'nfpcore/nfp_nsp.c',
         'nfpcore/nfp_cppcore.c',
-- 
1.8.3.1



* [PATCH v9 11/12] net/nfp: move rxtx function to header file
  2022-09-15 10:44 [PATCH v9 00/12] preparation for the rte_flow offload of nfp PMD Chaoyong He
                   ` (9 preceding siblings ...)
  2022-09-15 10:44 ` [PATCH v9 10/12] net/nfp: add flower representor framework Chaoyong He
@ 2022-09-15 10:44 ` Chaoyong He
  2022-09-15 10:44 ` [PATCH v9 12/12] net/nfp: add flower PF rxtx logic Chaoyong He
  2022-09-20 14:56 ` [PATCH v9 00/12] preparation for the rte_flow offload of nfp PMD Ferruh Yigit
  12 siblings, 0 replies; 22+ messages in thread
From: Chaoyong He @ 2022-09-15 10:44 UTC (permalink / raw)
  To: dev; +Cc: oss-drivers, niklas.soderlund, Chaoyong He, Heinrich Kuhn

The flower firmware application makes use of the same Rx
and Tx checksum logic as the normal PMD. Expose it so that
the flower firmware application can also make use of it.

Signed-off-by: Chaoyong He <chaoyong.he@corigine.com>
Signed-off-by: Heinrich Kuhn <heinrich.kuhn@corigine.com>
Reviewed-by: Niklas Söderlund <niklas.soderlund@corigine.com>
---
 drivers/net/nfp/nfp_common.c    |  2 +-
 drivers/net/nfp/nfp_ethdev.c    |  2 +-
 drivers/net/nfp/nfp_ethdev_vf.c |  2 +-
 drivers/net/nfp/nfp_rxtx.c      | 91 +----------------------------------------
 drivers/net/nfp/nfp_rxtx.h      | 90 ++++++++++++++++++++++++++++++++++++++++
 5 files changed, 94 insertions(+), 93 deletions(-)
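
Note (illustrative only, not part of the patch): the Rx checksum helper that
moves into nfp_rxtx.h maps descriptor flags to mbuf offload flags. A minimal
standalone sketch of that mapping is shown below. The SKETCH_* constants and
sketch_rx_cksum_flags() are hypothetical stand-ins, not the real
PCIE_DESC_RX_* or RTE_MBUF_F_RX_* values, and the check of
NFP_NET_CFG_CTRL_RXCSUM done by the real helper is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor flag bits (placeholders for PCIE_DESC_RX_*). */
#define SKETCH_RX_IP4_CSUM     (1u << 0)
#define SKETCH_RX_IP4_CSUM_OK  (1u << 1)
#define SKETCH_RX_TCP_CSUM     (1u << 2)
#define SKETCH_RX_UDP_CSUM     (1u << 3)
#define SKETCH_RX_L4_CSUM_OK   (1u << 4)

/* Hypothetical mbuf offload bits (placeholders for RTE_MBUF_F_RX_*). */
#define SKETCH_IP_CKSUM_BAD    (1ull << 0)
#define SKETCH_IP_CKSUM_GOOD   (1ull << 1)
#define SKETCH_L4_CKSUM_BAD    (1ull << 2)
#define SKETCH_L4_CKSUM_GOOD   (1ull << 3)

/* OR the checksum status derived from desc_flags into *ol_flags. */
static inline void
sketch_rx_cksum_flags(uint16_t desc_flags, uint64_t *ol_flags)
{
	assert(ol_flags != NULL);

	/* IPv4 packet whose IP checksum was not validated: mark as bad */
	if ((desc_flags & SKETCH_RX_IP4_CSUM) != 0 &&
			(desc_flags & SKETCH_RX_IP4_CSUM_OK) == 0)
		*ol_flags |= SKETCH_IP_CKSUM_BAD;
	else
		*ol_flags |= SKETCH_IP_CKSUM_GOOD;

	/* Neither TCP nor UDP: no L4 status to report */
	if ((desc_flags & (SKETCH_RX_TCP_CSUM | SKETCH_RX_UDP_CSUM)) == 0)
		return;

	if ((desc_flags & SKETCH_RX_L4_CSUM_OK) != 0)
		*ol_flags |= SKETCH_L4_CKSUM_GOOD;
	else
		*ol_flags |= SKETCH_L4_CKSUM_BAD;
}
```

Keeping such helpers 'static inline' in the header lets both the normal PMD
and the flower code include them without adding an exported symbol.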

diff --git a/drivers/net/nfp/nfp_common.c b/drivers/net/nfp/nfp_common.c
index 0e55f0c..e86929c 100644
--- a/drivers/net/nfp/nfp_common.c
+++ b/drivers/net/nfp/nfp_common.c
@@ -38,9 +38,9 @@
 #include "nfpcore/nfp_nsp.h"
 
 #include "nfp_common.h"
+#include "nfp_ctrl.h"
 #include "nfp_rxtx.h"
 #include "nfp_logs.h"
-#include "nfp_ctrl.h"
 #include "nfp_cpp_bridge.h"
 
 #include <sys/types.h>
diff --git a/drivers/net/nfp/nfp_ethdev.c b/drivers/net/nfp/nfp_ethdev.c
index ddfe495..c37a92d 100644
--- a/drivers/net/nfp/nfp_ethdev.c
+++ b/drivers/net/nfp/nfp_ethdev.c
@@ -33,9 +33,9 @@
 #include "nfpcore/nfp_nsp.h"
 
 #include "nfp_common.h"
+#include "nfp_ctrl.h"
 #include "nfp_rxtx.h"
 #include "nfp_logs.h"
-#include "nfp_ctrl.h"
 #include "nfp_cpp_bridge.h"
 
 #include "flower/nfp_flower.h"
diff --git a/drivers/net/nfp/nfp_ethdev_vf.c b/drivers/net/nfp/nfp_ethdev_vf.c
index d304d78..ceaf618 100644
--- a/drivers/net/nfp/nfp_ethdev_vf.c
+++ b/drivers/net/nfp/nfp_ethdev_vf.c
@@ -19,9 +19,9 @@
 #include "nfpcore/nfp_rtsym.h"
 
 #include "nfp_common.h"
+#include "nfp_ctrl.h"
 #include "nfp_rxtx.h"
 #include "nfp_logs.h"
-#include "nfp_ctrl.h"
 
 static void
 nfp_netvf_read_mac(struct nfp_net_hw *hw)
diff --git a/drivers/net/nfp/nfp_rxtx.c b/drivers/net/nfp/nfp_rxtx.c
index 8d63a7b..95403a3 100644
--- a/drivers/net/nfp/nfp_rxtx.c
+++ b/drivers/net/nfp/nfp_rxtx.c
@@ -17,9 +17,9 @@
 #include <ethdev_pci.h>
 
 #include "nfp_common.h"
+#include "nfp_ctrl.h"
 #include "nfp_rxtx.h"
 #include "nfp_logs.h"
-#include "nfp_ctrl.h"
 #include "nfpcore/nfp_mip.h"
 #include "nfpcore/nfp_rtsym.h"
 #include "nfpcore/nfp-common/nfp_platform.h"
@@ -208,34 +208,6 @@
 	}
 }
 
-/* nfp_net_rx_cksum - set mbuf checksum flags based on RX descriptor flags */
-static inline void
-nfp_net_rx_cksum(struct nfp_net_rxq *rxq, struct nfp_net_rx_desc *rxd,
-		 struct rte_mbuf *mb)
-{
-	struct nfp_net_hw *hw = rxq->hw;
-
-	if (!(hw->ctrl & NFP_NET_CFG_CTRL_RXCSUM))
-		return;
-
-	/* If IPv4 and IP checksum error, fail */
-	if (unlikely((rxd->rxd.flags & PCIE_DESC_RX_IP4_CSUM) &&
-	    !(rxd->rxd.flags & PCIE_DESC_RX_IP4_CSUM_OK)))
-		mb->ol_flags |= RTE_MBUF_F_RX_IP_CKSUM_BAD;
-	else
-		mb->ol_flags |= RTE_MBUF_F_RX_IP_CKSUM_GOOD;
-
-	/* If neither UDP nor TCP return */
-	if (!(rxd->rxd.flags & PCIE_DESC_RX_TCP_CSUM) &&
-	    !(rxd->rxd.flags & PCIE_DESC_RX_UDP_CSUM))
-		return;
-
-	if (likely(rxd->rxd.flags & PCIE_DESC_RX_L4_CSUM_OK))
-		mb->ol_flags |= RTE_MBUF_F_RX_L4_CKSUM_GOOD;
-	else
-		mb->ol_flags |= RTE_MBUF_F_RX_L4_CKSUM_BAD;
-}
-
 /*
  * RX path design:
  *
@@ -768,67 +740,6 @@
 	return 0;
 }
 
-/* nfp_net_tx_tso - Set TX descriptor for TSO */
-static inline void
-nfp_net_nfd3_tx_tso(struct nfp_net_txq *txq, struct nfp_net_nfd3_tx_desc *txd,
-	       struct rte_mbuf *mb)
-{
-	uint64_t ol_flags;
-	struct nfp_net_hw *hw = txq->hw;
-
-	if (!(hw->cap & NFP_NET_CFG_CTRL_LSO_ANY))
-		goto clean_txd;
-
-	ol_flags = mb->ol_flags;
-
-	if (!(ol_flags & RTE_MBUF_F_TX_TCP_SEG))
-		goto clean_txd;
-
-	txd->l3_offset = mb->l2_len;
-	txd->l4_offset = mb->l2_len + mb->l3_len;
-	txd->lso_hdrlen = mb->l2_len + mb->l3_len + mb->l4_len;
-	txd->mss = rte_cpu_to_le_16(mb->tso_segsz);
-	txd->flags = PCIE_DESC_TX_LSO;
-	return;
-
-clean_txd:
-	txd->flags = 0;
-	txd->l3_offset = 0;
-	txd->l4_offset = 0;
-	txd->lso_hdrlen = 0;
-	txd->mss = 0;
-}
-
-/* nfp_net_tx_cksum - Set TX CSUM offload flags in TX descriptor */
-static inline void
-nfp_net_nfd3_tx_cksum(struct nfp_net_txq *txq, struct nfp_net_nfd3_tx_desc *txd,
-		 struct rte_mbuf *mb)
-{
-	uint64_t ol_flags;
-	struct nfp_net_hw *hw = txq->hw;
-
-	if (!(hw->cap & NFP_NET_CFG_CTRL_TXCSUM))
-		return;
-
-	ol_flags = mb->ol_flags;
-
-	/* IPv6 does not need checksum */
-	if (ol_flags & RTE_MBUF_F_TX_IP_CKSUM)
-		txd->flags |= PCIE_DESC_TX_IP4_CSUM;
-
-	switch (ol_flags & RTE_MBUF_F_TX_L4_MASK) {
-	case RTE_MBUF_F_TX_UDP_CKSUM:
-		txd->flags |= PCIE_DESC_TX_UDP_CSUM;
-		break;
-	case RTE_MBUF_F_TX_TCP_CKSUM:
-		txd->flags |= PCIE_DESC_TX_TCP_CSUM;
-		break;
-	}
-
-	if (ol_flags & (RTE_MBUF_F_TX_IP_CKSUM | RTE_MBUF_F_TX_L4_MASK))
-		txd->flags |= PCIE_DESC_TX_CSUM;
-}
-
 uint16_t
 nfp_net_nfd3_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
 {
diff --git a/drivers/net/nfp/nfp_rxtx.h b/drivers/net/nfp/nfp_rxtx.h
index a30171f..cd70bdd 100644
--- a/drivers/net/nfp/nfp_rxtx.h
+++ b/drivers/net/nfp/nfp_rxtx.h
@@ -360,6 +360,96 @@ struct nfp_net_rxq {
 	return (nfp_net_nfd3_free_tx_desc(txq) < txq->tx_free_thresh);
 }
 
+/* set mbuf checksum flags based on RX descriptor flags */
+static inline void
+nfp_net_rx_cksum(struct nfp_net_rxq *rxq, struct nfp_net_rx_desc *rxd,
+		 struct rte_mbuf *mb)
+{
+	struct nfp_net_hw *hw = rxq->hw;
+
+	if (!(hw->ctrl & NFP_NET_CFG_CTRL_RXCSUM))
+		return;
+
+	/* If IPv4 and IP checksum error, fail */
+	if (unlikely((rxd->rxd.flags & PCIE_DESC_RX_IP4_CSUM) &&
+			!(rxd->rxd.flags & PCIE_DESC_RX_IP4_CSUM_OK)))
+		mb->ol_flags |= RTE_MBUF_F_RX_IP_CKSUM_BAD;
+	else
+		mb->ol_flags |= RTE_MBUF_F_RX_IP_CKSUM_GOOD;
+
+	/* If neither UDP nor TCP return */
+	if (!(rxd->rxd.flags & PCIE_DESC_RX_TCP_CSUM) &&
+			!(rxd->rxd.flags & PCIE_DESC_RX_UDP_CSUM))
+		return;
+
+	if (likely(rxd->rxd.flags & PCIE_DESC_RX_L4_CSUM_OK))
+		mb->ol_flags |= RTE_MBUF_F_RX_L4_CKSUM_GOOD;
+	else
+		mb->ol_flags |= RTE_MBUF_F_RX_L4_CKSUM_BAD;
+}
+
+/* Set NFD3 TX descriptor for TSO */
+static inline void
+nfp_net_nfd3_tx_tso(struct nfp_net_txq *txq,
+		struct nfp_net_nfd3_tx_desc *txd,
+		struct rte_mbuf *mb)
+{
+	uint64_t ol_flags;
+	struct nfp_net_hw *hw = txq->hw;
+
+	if (!(hw->cap & NFP_NET_CFG_CTRL_LSO_ANY))
+		goto clean_txd;
+
+	ol_flags = mb->ol_flags;
+
+	if (!(ol_flags & RTE_MBUF_F_TX_TCP_SEG))
+		goto clean_txd;
+
+	txd->l3_offset = mb->l2_len;
+	txd->l4_offset = mb->l2_len + mb->l3_len;
+	txd->lso_hdrlen = mb->l2_len + mb->l3_len + mb->l4_len;
+	txd->mss = rte_cpu_to_le_16(mb->tso_segsz);
+	txd->flags = PCIE_DESC_TX_LSO;
+	return;
+
+clean_txd:
+	txd->flags = 0;
+	txd->l3_offset = 0;
+	txd->l4_offset = 0;
+	txd->lso_hdrlen = 0;
+	txd->mss = 0;
+}
+
+/* Set TX CSUM offload flags in NFD3 TX descriptor */
+static inline void
+nfp_net_nfd3_tx_cksum(struct nfp_net_txq *txq, struct nfp_net_nfd3_tx_desc *txd,
+		 struct rte_mbuf *mb)
+{
+	uint64_t ol_flags;
+	struct nfp_net_hw *hw = txq->hw;
+
+	if (!(hw->cap & NFP_NET_CFG_CTRL_TXCSUM))
+		return;
+
+	ol_flags = mb->ol_flags;
+
+	/* IPv6 does not need checksum */
+	if (ol_flags & RTE_MBUF_F_TX_IP_CKSUM)
+		txd->flags |= PCIE_DESC_TX_IP4_CSUM;
+
+	switch (ol_flags & RTE_MBUF_F_TX_L4_MASK) {
+	case RTE_MBUF_F_TX_UDP_CKSUM:
+		txd->flags |= PCIE_DESC_TX_UDP_CSUM;
+		break;
+	case RTE_MBUF_F_TX_TCP_CKSUM:
+		txd->flags |= PCIE_DESC_TX_TCP_CSUM;
+		break;
+	}
+
+	if (ol_flags & (RTE_MBUF_F_TX_IP_CKSUM | RTE_MBUF_F_TX_L4_MASK))
+		txd->flags |= PCIE_DESC_TX_CSUM;
+}
+
 int nfp_net_rx_freelist_setup(struct rte_eth_dev *dev);
 uint32_t nfp_net_rx_queue_count(void *rx_queue);
 uint16_t nfp_net_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
-- 
1.8.3.1



* [PATCH v9 12/12] net/nfp: add flower PF rxtx logic
  2022-09-15 10:44 [PATCH v9 00/12] preparation for the rte_flow offload of nfp PMD Chaoyong He
                   ` (10 preceding siblings ...)
  2022-09-15 10:44 ` [PATCH v9 11/12] net/nfp: move rxtx function to header file Chaoyong He
@ 2022-09-15 10:44 ` Chaoyong He
  2022-09-20 14:56 ` [PATCH v9 00/12] preparation for the rte_flow offload of nfp PMD Ferruh Yigit
  12 siblings, 0 replies; 22+ messages in thread
From: Chaoyong He @ 2022-09-15 10:44 UTC (permalink / raw)
  To: dev; +Cc: oss-drivers, niklas.soderlund, Chaoyong He

Implement the flower Rx logic. Fallback packets are multiplexed to the
correct representor port based on the prepended metadata. The Rx poll
is set to run on the existing service infrastructure.

For Tx, the existing NFP Tx logic is duplicated to keep the two Tx paths
distinct. Flower fallback also adds 8 bytes of metadata to the start of
the packet, which has to be accounted for in the Tx descriptor.

Signed-off-by: Chaoyong He <chaoyong.he@corigine.com>
Reviewed-by: Niklas Söderlund <niklas.soderlund@corigine.com>
---
 doc/guides/nics/nfp.rst             |  13 ++
 drivers/net/nfp/flower/nfp_flower.c | 425 ++++++++++++++++++++++++++++++++++++
 drivers/net/nfp/flower/nfp_flower.h |   1 +
 3 files changed, 439 insertions(+)
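
Note (illustrative only, not part of the patch): the Rx multiplexing described
above can be sketched as two steps, reading the port id from the prepended
metadata and then decoding it into a representor type and index. The field
width, field type value and bit layout below are assumptions made for the
sketch; the real values come from the NFP_NET_META_* and
NFP_FLOWER_CMSG_PORT_* definitions. All sketch_* and SKETCH_* names are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed metadata encoding, for illustration only. */
#define SKETCH_META_FIELD_BITS    4
#define SKETCH_META_FIELD_MASK    0xfu
#define SKETCH_META_PORTID        5u

/* Assumed port id layout, for illustration only. */
#define SKETCH_PORT_TYPE(id)      (((id) >> 28) & 0xfu)
#define SKETCH_PORT_TYPE_PHYS     1u
#define SKETCH_PORT_TYPE_PCIE     2u
#define SKETCH_PHYS_PORT_NUM(id)  ((id) & 0xffu)
#define SKETCH_VNIC(id)           (((id) >> 6) & 0x3fu)

enum sketch_repr_kind {
	SKETCH_REPR_NONE,
	SKETCH_REPR_PHYS,
	SKETCH_REPR_VF,
};

/* Read a big-endian 32-bit value without alignment assumptions. */
static uint32_t
sketch_load_be32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
			((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Return 0 and fill *port_id if a port id field is found, -1 otherwise. */
static int
sketch_parse_port_id(const uint8_t *meta, size_t meta_len, uint32_t *port_id)
{
	uint32_t info;
	size_t off = 4;
	int found = 0;

	assert(meta != NULL && port_id != NULL);
	if (meta_len < 4)
		return -1;

	/* Word 0 lists the field types, lowest field first */
	info = sketch_load_be32(meta);
	while (info != 0) {
		/* Only the port id field is expected; stop on anything else */
		if ((info & SKETCH_META_FIELD_MASK) != SKETCH_META_PORTID)
			return -1;
		/* Bounds check before reading the 32-bit field value */
		if (meta_len - off < 4)
			return -1;
		*port_id = sketch_load_be32(meta + off);
		off += 4;
		info >>= SKETCH_META_FIELD_BITS;
		found = 1;
	}

	return found ? 0 : -1;
}

/* Decode the port id into a representor kind and an array index. */
static enum sketch_repr_kind
sketch_select_repr(uint32_t port_id, unsigned int *index)
{
	assert(index != NULL);

	switch (SKETCH_PORT_TYPE(port_id)) {
	case SKETCH_PORT_TYPE_PHYS:
		*index = SKETCH_PHYS_PORT_NUM(port_id);
		return SKETCH_REPR_PHYS;
	case SKETCH_PORT_TYPE_PCIE:
		*index = SKETCH_VNIC(port_id);
		return SKETCH_REPR_VF;
	default:
		return SKETCH_REPR_NONE;
	}
}
```

A caller would still have to check the returned index against the number of
allocated representors before using it to look up a ring; packets that match
no representor stay in the array handed back to the caller.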

diff --git a/doc/guides/nics/nfp.rst b/doc/guides/nics/nfp.rst
index c62b1fb..b74067c 100644
--- a/doc/guides/nics/nfp.rst
+++ b/doc/guides/nics/nfp.rst
@@ -200,3 +200,16 @@ The flower firmware application support representor port for VF and physical
 port. There will always exist a representor port for each physical port,
 and the number of the representor port for VF is specified by the user through
 parameter.
+
+In the Rx direction, the flower firmware application will prepend the input
+port information into metadata for each packet which can't be offloaded. The
+PF vNIC service will keep polling packets from the firmware, and multiplex them
+to the corresponding representor port.
+
+In the Tx direction, the representor port will prepend the output port
+information into metadata for each packet, and then send it to firmware through
+PF vNIC.
+
+The ctrl vNIC service handles various control messages, like the creation and
+configuration of representor ports, the pattern and action of flow rules, the
+statistics of flow rules, and so on.
diff --git a/drivers/net/nfp/flower/nfp_flower.c b/drivers/net/nfp/flower/nfp_flower.c
index c52785c..d2bda15 100644
--- a/drivers/net/nfp/flower/nfp_flower.c
+++ b/drivers/net/nfp/flower/nfp_flower.c
@@ -22,6 +22,7 @@
 #include "nfp_flower.h"
 #include "nfp_flower_ctrl.h"
 #include "nfp_flower_representor.h"
+#include "nfp_flower_cmsg.h"
 
 #define MAX_PKT_BURST 32
 #define MBUF_PRIV_SIZE 128
@@ -214,6 +215,383 @@
 	.dev_configure          = nfp_net_configure,
 };
 
+static inline void
+nfp_flower_parse_metadata(struct nfp_net_rxq *rxq,
+		struct nfp_net_rx_desc *rxd,
+		struct rte_mbuf *mbuf,
+		uint32_t *portid)
+{
+	uint32_t meta_info;
+	uint8_t *meta_offset;
+	struct nfp_net_hw *hw;
+
+	hw = rxq->hw;
+	if (!((hw->ctrl & NFP_NET_CFG_CTRL_RSS) ||
+			(hw->ctrl & NFP_NET_CFG_CTRL_RSS2)))
+		return;
+
+	meta_offset = rte_pktmbuf_mtod(mbuf, uint8_t *);
+	meta_offset -= NFP_DESC_META_LEN(rxd);
+	meta_info = rte_be_to_cpu_32(*(uint32_t *)meta_offset);
+	meta_offset += 4;
+
+	while (meta_info) {
+		switch (meta_info & NFP_NET_META_FIELD_MASK) {
+		/* Expect flower firmware to only send packets with META_PORTID */
+		case NFP_NET_META_PORTID:
+			*portid = rte_be_to_cpu_32(*(uint32_t *)meta_offset);
+			meta_offset += 4;
+			meta_info >>= NFP_NET_META_FIELD_SIZE;
+			break;
+		default:
+			/* Unsupported metadata can be a performance issue */
+			return;
+		}
+	}
+}
+
+static inline struct nfp_flower_representor *
+nfp_flower_get_repr(struct nfp_net_hw *hw,
+		uint32_t port_id)
+{
+	uint8_t port;
+	struct nfp_app_fw_flower *app_fw_flower;
+
+	/* Obtain handle to app_fw_flower here */
+	app_fw_flower = NFP_PRIV_TO_APP_FW_FLOWER(hw->pf_dev->app_fw_priv);
+
+	switch (NFP_FLOWER_CMSG_PORT_TYPE(port_id)) {
+	case NFP_FLOWER_CMSG_PORT_TYPE_PHYS_PORT:
+		port =  NFP_FLOWER_CMSG_PORT_PHYS_PORT_NUM(port_id);
+		return app_fw_flower->phy_reprs[port];
+	case NFP_FLOWER_CMSG_PORT_TYPE_PCIE_PORT:
+		port = NFP_FLOWER_CMSG_PORT_VNIC(port_id);
+		return app_fw_flower->vf_reprs[port];
+	default:
+		break;
+	}
+
+	return NULL;
+}
+
+static uint16_t
+nfp_flower_pf_recv_pkts(void *rx_queue,
+		struct rte_mbuf **rx_pkts,
+		uint16_t nb_pkts)
+{
+	/*
+	 * We need different counters for packets given to the caller
+	 * and packets sent to representors
+	 */
+	int avail = 0;
+	int avail_multiplexed = 0;
+	uint64_t dma_addr;
+	uint32_t meta_portid;
+	uint16_t nb_hold = 0;
+	struct rte_mbuf *mb;
+	struct nfp_net_hw *hw;
+	struct rte_mbuf *new_mb;
+	struct nfp_net_rxq *rxq;
+	struct nfp_net_rx_buff *rxb;
+	struct nfp_net_rx_desc *rxds;
+	struct nfp_flower_representor *repr;
+
+	rxq = rx_queue;
+	if (unlikely(rxq == NULL)) {
+		/*
+		 * DPDK just checks the queue is lower than max queues
+		 * enabled. But the queue needs to be configured
+		 */
+		RTE_LOG_DP(ERR, PMD, "RX Bad queue\n");
+		return 0;
+	}
+
+	hw = rxq->hw;
+
+	/*
+	 * This is tunable as we could allow to receive more packets than
+	 * requested if most are multiplexed.
+	 */
+	while (avail + avail_multiplexed < nb_pkts) {
+		rxb = &rxq->rxbufs[rxq->rd_p];
+		if (unlikely(rxb == NULL)) {
+			RTE_LOG_DP(ERR, PMD, "rxb does not exist!\n");
+			break;
+		}
+
+		rxds = &rxq->rxds[rxq->rd_p];
+		if ((rxds->rxd.meta_len_dd & PCIE_DESC_RX_DD) == 0)
+			break;
+
+		/*
+		 * Memory barrier to ensure that we won't do other
+		 * reads before the DD bit.
+		 */
+		rte_rmb();
+
+		/*
+		 * We got a packet. Let's alloc a new mbuf for refilling the
+		 * free descriptor ring as soon as possible
+		 */
+		new_mb = rte_pktmbuf_alloc(rxq->mem_pool);
+		if (unlikely(new_mb == NULL)) {
+			RTE_LOG_DP(DEBUG, PMD,
+			"RX mbuf alloc failed port_id=%u queue_id=%d\n",
+				rxq->port_id, rxq->qidx);
+			nfp_net_mbuf_alloc_failed(rxq);
+			break;
+		}
+
+		nb_hold++;
+
+		/*
+		 * Grab the mbuf and refill the descriptor with the
+		 * previously allocated mbuf
+		 */
+		mb = rxb->mbuf;
+		rxb->mbuf = new_mb;
+
+		PMD_RX_LOG(DEBUG, "Packet len: %u, mbuf_size: %u",
+				rxds->rxd.data_len, rxq->mbuf_size);
+
+		/* Size of this segment */
+		mb->data_len = rxds->rxd.data_len - NFP_DESC_META_LEN(rxds);
+		/* Size of the whole packet. We just support 1 segment */
+		mb->pkt_len = rxds->rxd.data_len - NFP_DESC_META_LEN(rxds);
+
+		if (unlikely((mb->data_len + hw->rx_offset) > rxq->mbuf_size)) {
+			/*
+			 * This should not happen and the user has the
+			 * responsibility of avoiding it. But we have
+			 * to give some info about the error
+			 */
+			RTE_LOG_DP(ERR, PMD,
+				"mbuf overflow likely due to the RX offset.\n"
+				"\t\tYour mbuf size should have extra space for"
+				" RX offset=%u bytes.\n"
+				"\t\tCurrently you just have %u bytes available"
+				" but the received packet is %u bytes long\n",
+				hw->rx_offset,
+				rxq->mbuf_size - hw->rx_offset,
+				mb->data_len);
+			rte_pktmbuf_free(mb);
+			break;
+		}
+
+		/* Filling the received mbuf with packet info */
+		if (hw->rx_offset)
+			mb->data_off = RTE_PKTMBUF_HEADROOM + hw->rx_offset;
+		else
+			mb->data_off = RTE_PKTMBUF_HEADROOM + NFP_DESC_META_LEN(rxds);
+
+		/* No scatter mode supported */
+		mb->nb_segs = 1;
+		mb->next = NULL;
+
+		mb->port = rxq->port_id;
+		meta_portid = 0;
+
+		/* Checking the RSS flag */
+		nfp_flower_parse_metadata(rxq, rxds, mb, &meta_portid);
+		PMD_RX_LOG(DEBUG, "Received from port %u type %u",
+				NFP_FLOWER_CMSG_PORT_VNIC(meta_portid),
+				NFP_FLOWER_CMSG_PORT_VNIC_TYPE(meta_portid));
+
+		/* Checking the checksum flag */
+		nfp_net_rx_cksum(rxq, rxds, mb);
+
+		if ((rxds->rxd.flags & PCIE_DESC_RX_VLAN) &&
+				(hw->ctrl & NFP_NET_CFG_CTRL_RXVLAN)) {
+			mb->vlan_tci = rte_cpu_to_le_32(rxds->rxd.vlan);
+			mb->ol_flags |= RTE_MBUF_F_RX_VLAN | RTE_MBUF_F_RX_VLAN_STRIPPED;
+		}
+
+		repr = nfp_flower_get_repr(hw, meta_portid);
+		if (repr && repr->ring) {
+			PMD_RX_LOG(DEBUG, "Using representor %s", repr->name);
+			rte_ring_enqueue(repr->ring, (void *)mb);
+			avail_multiplexed++;
+		} else if (repr) {
+			PMD_RX_LOG(ERR, "[%u] No ring available for repr_port %s",
+					hw->idx, repr->name);
+			PMD_RX_LOG(DEBUG, "Adding the mbuf to the mbuf array passed by the app");
+			rx_pkts[avail++] = mb;
+		} else {
+			PMD_RX_LOG(DEBUG, "Adding the mbuf to the mbuf array passed by the app");
+			rx_pkts[avail++] = mb;
+		}
+
+		/* Now resetting and updating the descriptor */
+		rxds->vals[0] = 0;
+		rxds->vals[1] = 0;
+		dma_addr = rte_cpu_to_le_64(RTE_MBUF_DMA_ADDR_DEFAULT(new_mb));
+		rxds->fld.dd = 0;
+		rxds->fld.dma_addr_hi = (dma_addr >> 32) & 0xff;
+		rxds->fld.dma_addr_lo = dma_addr & 0xffffffff;
+
+		rxq->rd_p++;
+		if (unlikely(rxq->rd_p == rxq->rx_count))
+			rxq->rd_p = 0;
+	}
+
+	if (nb_hold == 0)
+		return nb_hold;
+
+	PMD_RX_LOG(DEBUG, "RX port_id=%u queue_id=%d, %d packets received",
+			rxq->port_id, rxq->qidx, nb_hold);
+
+	nb_hold += rxq->nb_rx_hold;
+
+	/*
+	 * FL descriptors need to be written before incrementing the
+	 * FL queue WR pointer
+	 */
+	rte_wmb();
+	if (nb_hold > rxq->rx_free_thresh) {
+		PMD_RX_LOG(DEBUG, "port=%u queue=%d nb_hold=%u avail=%d",
+				rxq->port_id, rxq->qidx, nb_hold, avail);
+		nfp_qcp_ptr_add(rxq->qcp_fl, NFP_QCP_WRITE_PTR, nb_hold);
+		nb_hold = 0;
+	}
+
+	rxq->nb_rx_hold = nb_hold;
+
+	return avail;
+}
+
+static uint16_t
+nfp_flower_pf_xmit_pkts(void *tx_queue,
+		struct rte_mbuf **tx_pkts,
+		uint16_t nb_pkts)
+{
+	int i = 0;
+	int pkt_size;
+	int dma_size;
+	uint64_t dma_addr;
+	uint16_t free_descs;
+	uint16_t issued_descs;
+	struct rte_mbuf *pkt;
+	struct nfp_net_hw *hw;
+	struct rte_mbuf **lmbuf;
+	struct nfp_net_txq *txq;
+	struct nfp_net_nfd3_tx_desc txd;
+	struct nfp_net_nfd3_tx_desc *txds;
+
+	txq = tx_queue;
+	hw = txq->hw;
+	txds = &txq->txds[txq->wr_p];
+
+	PMD_TX_LOG(DEBUG, "working for queue %d at pos %u and %u packets",
+			txq->qidx, txq->wr_p, nb_pkts);
+
+	if ((nfp_net_nfd3_free_tx_desc(txq) < nb_pkts) || (nfp_net_nfd3_txq_full(txq)))
+		nfp_net_tx_free_bufs(txq);
+
+	free_descs = (uint16_t)nfp_net_nfd3_free_tx_desc(txq);
+	if (unlikely(free_descs == 0))
+		return 0;
+
+	pkt = *tx_pkts;
+	issued_descs = 0;
+
+	/* Sending packets */
+	while ((i < nb_pkts) && free_descs) {
+		/* Grabbing the mbuf linked to the current descriptor */
+		lmbuf = &txq->txbufs[txq->wr_p].mbuf;
+		/* Warming the cache for releasing the mbuf later on */
+		RTE_MBUF_PREFETCH_TO_FREE(*lmbuf);
+
+		pkt = *(tx_pkts + i);
+
+		if (unlikely(pkt->nb_segs > 1 &&
+				!(hw->cap & NFP_NET_CFG_CTRL_GATHER))) {
+			PMD_INIT_LOG(INFO, "NFP_NET_CFG_CTRL_GATHER not set");
+			PMD_INIT_LOG(INFO, "Multisegment packet unsupported");
+			goto xmit_end;
+		}
+
+		/* Checking if we have enough descriptors */
+		if (unlikely(pkt->nb_segs > free_descs))
+			goto xmit_end;
+
+		/*
+		 * Checksum and VLAN flags just in the first descriptor for a
+		 * multisegment packet, but TSO info needs to be in all of them.
+		 */
+		txd.data_len = pkt->pkt_len;
+		nfp_net_nfd3_tx_tso(txq, &txd, pkt);
+		nfp_net_nfd3_tx_cksum(txq, &txd, pkt);
+
+		if ((pkt->ol_flags & RTE_MBUF_F_TX_VLAN) &&
+				(hw->cap & NFP_NET_CFG_CTRL_TXVLAN)) {
+			txd.flags |= PCIE_DESC_TX_VLAN;
+			txd.vlan = pkt->vlan_tci;
+		}
+
+		/*
+		 * mbuf data_len is the data in one segment and pkt_len data
+		 * in the whole packet. When the packet is just one segment,
+		 * then data_len = pkt_len
+		 */
+		pkt_size = pkt->pkt_len;
+
+		while (pkt) {
+			/* Copying TSO, VLAN and cksum info */
+			*txds = txd;
+
+			/* Releasing mbuf used by this descriptor previously*/
+			if (*lmbuf)
+				rte_pktmbuf_free_seg(*lmbuf);
+
+			/*
+			 * Linking mbuf with descriptor for being released
+			 * next time descriptor is used
+			 */
+			*lmbuf = pkt;
+
+			dma_size = pkt->data_len;
+			dma_addr = rte_mbuf_data_iova(pkt);
+
+			/* Filling descriptors fields */
+			txds->dma_len = dma_size;
+			txds->data_len = txd.data_len;
+			txds->dma_addr_hi = (dma_addr >> 32) & 0xff;
+			txds->dma_addr_lo = (dma_addr & 0xffffffff);
+			ASSERT(free_descs > 0);
+			free_descs--;
+
+			txq->wr_p++;
+			if (unlikely(txq->wr_p == txq->tx_count)) /* wrapping?*/
+				txq->wr_p = 0;
+
+			pkt_size -= dma_size;
+
+			/*
+			 * Making the EOP, packets with just one segment
+			 * the priority
+			 */
+			if (likely(!pkt_size))
+				txds->offset_eop = PCIE_DESC_TX_EOP | FLOWER_PKT_DATA_OFFSET;
+			else
+				txds->offset_eop = 0;
+
+			pkt = pkt->next;
+			/* Referencing next free TX descriptor */
+			txds = &txq->txds[txq->wr_p];
+			lmbuf = &txq->txbufs[txq->wr_p].mbuf;
+			issued_descs++;
+		}
+		i++;
+	}
+
+xmit_end:
+	/* Increment write pointers. Force memory write before we let HW know */
+	rte_wmb();
+	nfp_qcp_ptr_add(txq->qcp_q, NFP_QCP_WRITE_PTR, issued_descs);
+
+	return i;
+}
+
 struct dp_packet {
 	struct rte_mbuf mbuf;
 	uint32_t source;
@@ -382,6 +760,8 @@ struct dp_packet {
 
 	/* Add Rx/Tx functions */
 	eth_dev->dev_ops = &nfp_flower_pf_vnic_ops;
+	eth_dev->rx_pkt_burst = nfp_flower_pf_recv_pkts;
+	eth_dev->tx_pkt_burst = nfp_flower_pf_xmit_pkts;
 
 	/* PF vNIC gets a random MAC */
 	eth_dev->data->mac_addrs = rte_zmalloc("mac_addr", RTE_ETHER_ADDR_LEN, 0);
@@ -504,6 +884,45 @@ struct dp_packet {
 	return 0;
 }
 
+static void
+nfp_flower_pf_vnic_poll(struct nfp_app_fw_flower *app_fw_flower)
+{
+	uint16_t i;
+	void *rx_queue;
+	uint16_t count;
+	uint16_t n_rxq;
+	struct rte_eth_dev *dev;
+	struct rte_mbuf *pkts_burst[MAX_PKT_BURST];
+
+	dev = app_fw_flower->pf_hw->eth_dev;
+	n_rxq = dev->data->nb_rx_queues;
+
+	while (true) {
+		/* Add ability to run Rx queues on multiple service cores? */
+		for (i = 0; i < n_rxq; i++) {
+			rx_queue = dev->data->rx_queues[i];
+			count = nfp_flower_pf_recv_pkts(rx_queue, pkts_burst, MAX_PKT_BURST);
+
+			/*
+			 * Don't expect packets here but free them in case they have
+			 * not been multiplexed to a representor
+			 */
+			if (unlikely(count > 0))
+				rte_pktmbuf_free_bulk(pkts_burst, count);
+		}
+	}
+}
+
+static int
+nfp_flower_pf_vnic_service(void *arg)
+{
+	struct nfp_app_fw_flower *app_fw_flower = arg;
+
+	nfp_flower_pf_vnic_poll(app_fw_flower);
+
+	return 0;
+}
+
 static int
 nfp_flower_init_ctrl_vnic(struct nfp_net_hw *hw)
 {
@@ -694,6 +1113,10 @@ struct dp_packet {
 }
 
 static struct rte_service_spec flower_services[NFP_FLOWER_SERVICE_MAX] = {
+	[NFP_FLOWER_SERVICE_PF] = {
+		.name         = "flower_pf_vnic_service",
+		.callback     = nfp_flower_pf_vnic_service,
+	},
 	[NFP_FLOWER_SERVICE_CTRL] = {
 		.name         = "flower_ctrl_vnic_service",
 		.callback     = nfp_flower_ctrl_vnic_service,
@@ -879,6 +1302,8 @@ struct dp_packet {
 
 	eth_dev->process_private = cpp;
 	eth_dev->dev_ops = &nfp_flower_pf_vnic_ops;
+	eth_dev->rx_pkt_burst = nfp_flower_pf_recv_pkts;
+	eth_dev->tx_pkt_burst = nfp_flower_pf_xmit_pkts;
 	rte_eth_dev_probing_finish(eth_dev);
 
 	return 0;
diff --git a/drivers/net/nfp/flower/nfp_flower.h b/drivers/net/nfp/flower/nfp_flower.h
index c7d673f..01c3f7f 100644
--- a/drivers/net/nfp/flower/nfp_flower.h
+++ b/drivers/net/nfp/flower/nfp_flower.h
@@ -7,6 +7,7 @@
 #define _NFP_FLOWER_H_
 
 enum nfp_flower_service {
+	NFP_FLOWER_SERVICE_PF,
 	NFP_FLOWER_SERVICE_CTRL,
 	NFP_FLOWER_SERVICE_MAX,
 };
-- 
1.8.3.1



* Re: [PATCH v9 00/12] preparation for the rte_flow offload of nfp PMD
  2022-09-15 10:44 [PATCH v9 00/12] preparation for the rte_flow offload of nfp PMD Chaoyong He
                   ` (11 preceding siblings ...)
  2022-09-15 10:44 ` [PATCH v9 12/12] net/nfp: add flower PF rxtx logic Chaoyong He
@ 2022-09-20 14:56 ` Ferruh Yigit
  12 siblings, 0 replies; 22+ messages in thread
From: Ferruh Yigit @ 2022-09-20 14:56 UTC (permalink / raw)
  To: Chaoyong He, dev; +Cc: oss-drivers, niklas.soderlund

On 9/15/2022 11:44 AM, Chaoyong He wrote:
> * Changes since v8
> - Update the nfp.rst
> - Fix the 'app_hw' to 'app_fw'
> - Remove the ovs compatible header file
> - Remove the use of rte_eth_dev_configure()/rte_eth_rx_burst()/rte_eth_dev_start() API
> 
> * Changes since v7
> - Adjust the logics to make sure not break the pci probe process
> - Change 'app' to 'app_fw' in all logics to avoid confuse
> - Fix problem about log level
> 
> * Changes since v6
> - Fix the compile error
> 
> * Changes since v5
> - Compare integer with 0 explicitly
> - Change helper macro to function
> - Implement the dummy functions
> - Remove some unnecessary logics
> 
> * Changes since v4
> - Remove the unneeded '__rte_unused' attribute
> - Fixup a potential memory leak problem
> 
> * Changes since v3
> - Add the 'Depends-on' tag
> 
> * Changes since v2
> - Remove the use of rte_panic()
> 
> * Changes since v1
> - Fix the compile error
> 
> Depends-on: series-23707 ("Add support of NFP3800 chip and firmware with NFDk")
> 
> Chaoyong He (12):
>    net/nfp: move app specific attributes to own struct
>    net/nfp: simplify initialization and remove dead code
>    net/nfp: move app specific init logic to own function
>    net/nfp: add initial flower firmware support
>    net/nfp: add flower PF setup logic
>    net/nfp: add flower PF related routines
>    net/nfp: add flower ctrl VNIC related logics
>    net/nfp: move common rxtx function for flower use
>    net/nfp: add flower ctrl VNIC rxtx logic
>    net/nfp: add flower representor framework
>    net/nfp: move rxtx function to header file
>    net/nfp: add flower PF rxtx logic
> 

Hi Chaoyong,

The patchset looks good, except for two issues we have discussed before.
Those issues are:

* Creating a new ethdev just for driver-FW control communication
* Application (OvS) specific code in the driver

I commented on them separately and cc'ed more folks; we can proceed once
they are resolved.

Thanks,
ferruh



* Re: [PATCH v9 07/12] net/nfp: add flower ctrl VNIC related logics
  2022-09-15 10:44 ` [PATCH v9 07/12] net/nfp: add flower ctrl VNIC related logics Chaoyong He
@ 2022-09-20 14:56   ` Ferruh Yigit
  2022-09-21  2:02     ` Chaoyong He
  0 siblings, 1 reply; 22+ messages in thread
From: Ferruh Yigit @ 2022-09-20 14:56 UTC (permalink / raw)
  To: Chaoyong He, dev
  Cc: oss-drivers, niklas.soderlund, Jerin Jacob Kollanukkaran,
	Andrew Rybchenko, Thomas Monjalon, David Marchand

On 9/15/2022 11:44 AM, Chaoyong He wrote:
> Adds the setup/start logic for the ctrl vNIC. This vNIC is used by
> the PMD and flower firmware application as a communication channel
> between driver and firmware. In the case of OVS it is also used to
> communicate flow statistics from hardware to the driver.
> 
> A rte_eth device is not exposed to DPDK for this vNIC as it is strictly
> used internally by flower logic.
> 

Hi Chaoyong,

Similar comment to previous versions: the interface is created using the
regular 'rte_eth_dev_allocate()' API, so I think it will be visible to the
application. I don't understand the need to create an interface for
control.

What is the communication method between driver and FW?
Since one of the following patches (09/12) introduces Rx/Tx for the ctrl
interface, is the device interface based on control packets (similar to
network data packets)?

> Because of the add of ctrl vNIC, a new PCItoCPPBar is needed. Modify the
> related logics.
> 
> Signed-off-by: Chaoyong He <chaoyong.he@corigine.com>
> Reviewed-by: Niklas Söderlund <niklas.soderlund@corigine.com>

<...>



* Re: [PATCH v9 05/12] net/nfp: add flower PF setup logic
  2022-09-15 10:44 ` [PATCH v9 05/12] net/nfp: add flower PF setup logic Chaoyong He
@ 2022-09-20 14:57   ` Ferruh Yigit
  2022-09-21  2:50     ` Chaoyong He
  0 siblings, 1 reply; 22+ messages in thread
From: Ferruh Yigit @ 2022-09-20 14:57 UTC (permalink / raw)
  To: Chaoyong He, dev, Ian Stokes, David Marchand
  Cc: oss-drivers, niklas.soderlund, Thomas Monjalon, Andrew Rybchenko,
	Jerin Jacob Kollanukkaran

On 9/15/2022 11:44 AM, Chaoyong He wrote:
> Adds the vNIC initialization logic for the flower PF vNIC. The flower
> firmware application exposes this vNIC for the purposes of fallback
> traffic in the switchdev use-case.
> 
> Adds minimal dev_ops for this PF vNIC device. Because the device is
> being exposed externally to DPDK it needs to implements a minimal set
> of dev_ops.
> 
> Signed-off-by: Chaoyong He <chaoyong.he@corigine.com>
> Reviewed-by: Niklas Söderlund <niklas.soderlund@corigine.com>

<...>

> +
> +struct dp_packet {
> +	struct rte_mbuf mbuf;
> +	uint32_t source;
> +};
> +
> +static void
> +nfp_flower_pf_mp_init(__rte_unused struct rte_mempool *mp,
> +		__rte_unused void *opaque_arg,
> +		void *packet,
> +		__rte_unused unsigned int i)
> +{
> +	struct dp_packet *pkt = packet;
> +	/* Indicate that this pkt is from DPDK */
> +	pkt->source = 3;
> +}
> +
> +static struct rte_mempool *
> +nfp_flower_pf_mp_create(void)
> +{
> +	uint32_t nb_mbufs;
> +	unsigned int numa_node;
> +	struct rte_mempool *pktmbuf_pool;
> +	uint32_t n_rxd = PF_VNIC_NB_DESC;
> +	uint32_t n_txd = PF_VNIC_NB_DESC;
> +
> +	nb_mbufs = RTE_MAX(n_rxd + n_txd + MAX_PKT_BURST + MEMPOOL_CACHE_SIZE, 81920U);
> +
> +	numa_node = rte_socket_id();
> +	pktmbuf_pool = rte_pktmbuf_pool_create("flower_pf_mbuf_pool", nb_mbufs,
> +			MEMPOOL_CACHE_SIZE, MBUF_PRIV_SIZE,
> +			RTE_MBUF_DEFAULT_BUF_SIZE, numa_node);
> +	if (pktmbuf_pool == NULL) {
> +		PMD_INIT_LOG(ERR, "Cannot init pf vnic mbuf pool");
> +		return NULL;
> +	}
> +
> +	rte_mempool_obj_iter(pktmbuf_pool, nfp_flower_pf_mp_init, NULL);
> +
> +	return pktmbuf_pool;
> +}
> +

Hi Chaoyong,

Again, similar comment to previous versions: what I understand is that this
new flower FW supports HW flow filtering and the intended use case is OvS
HW acceleration.
But does the DPDK driver need to know OvS data structures, like "struct
dp_packet"? Can it be transparent to the application? I am sure there are
other devices offloading some OvS tasks to HW.

@Ian, @David,

Can you please comment on the above usage? Do you see any way to avoid
OvS specific code in the driver?



* RE: [PATCH v9 07/12] net/nfp: add flower ctrl VNIC related logics
  2022-09-20 14:56   ` Ferruh Yigit
@ 2022-09-21  2:02     ` Chaoyong He
  2022-09-21  7:29       ` Thomas Monjalon
  0 siblings, 1 reply; 22+ messages in thread
From: Chaoyong He @ 2022-09-21  2:02 UTC (permalink / raw)
  To: Ferruh Yigit, dev
  Cc: oss-drivers, Niklas Soderlund, Jerin Jacob Kollanukkaran,
	Andrew Rybchenko, Thomas Monjalon, David Marchand

> On 9/15/2022 11:44 AM, Chaoyong He wrote:
> > Adds the setup/start logic for the ctrl vNIC. This vNIC is used by the
> > PMD and flower firmware application as a communication channel
> between
> > driver and firmware. In the case of OVS it is also used to communicate
> > flow statistics from hardware to the driver.
> >
> > A rte_eth device is not exposed to DPDK for this vNIC as it is
> > strictly used internally by flower logic.
> >
> 
> Hi Chaoyong,
> 
> Similar comment with previous versions, interface is created using regular
> 'rte_eth_dev_allocate()' API, I think interface will be visible to application, I
> can't understand the need of creating an interface for control.
> 
> What is the communication method between driver and FW?
> Since one of the following patches (09/12) introduces Rx/Tx for ctrl interface,
> is device interface is control packets (similar to network data packets)?
> 

Basically, a 'control message' takes the form of a normal data packet.

When we use the flower firmware application, there are currently two types of packets,
and they are distinguished only by the prepended meta-data.

Bit    3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0
-----\ 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
Word  +---------------+---------------+---------------+---------------+
   0  |    type       |     type      |     type      |     type      |
      +---------------+---------------+---------------+---------------+
The 'control message' packets are processed by the ctrl vNIC.
The 'normal' packets are processed by the pf vNIC.

The communication method between driver and firmware is determined by the
design of the hardware and firmware.

The kernel driver also has the same ctrl vNIC and pf vNIC ethdev, and the usage is the same.

> > Because of the add of ctrl vNIC, a new PCItoCPPBar is needed. Modify
> > the related logics.
> >
> > Signed-off-by: Chaoyong He <chaoyong.he@corigine.com>
> > Reviewed-by: Niklas Söderlund <niklas.soderlund@corigine.com>
> 
> <...>



* RE: [PATCH v9 05/12] net/nfp: add flower PF setup logic
  2022-09-20 14:57   ` Ferruh Yigit
@ 2022-09-21  2:50     ` Chaoyong He
  2022-09-21  7:35       ` Thomas Monjalon
  0 siblings, 1 reply; 22+ messages in thread
From: Chaoyong He @ 2022-09-21  2:50 UTC (permalink / raw)
  To: Ferruh Yigit, dev, Ian Stokes, David Marchand
  Cc: oss-drivers, Niklas Soderlund, Thomas Monjalon, Andrew Rybchenko,
	Jerin Jacob Kollanukkaran

> On 9/15/2022 11:44 AM, Chaoyong He wrote:
> > Adds the vNIC initialization logic for the flower PF vNIC. The flower
> > firmware application exposes this vNIC for the purposes of fallback
> > traffic in the switchdev use-case.
> >
> > Adds minimal dev_ops for this PF vNIC device. Because the device is
> > being exposed externally to DPDK it needs to implements a minimal set
> > of dev_ops.
> >
> > Signed-off-by: Chaoyong He <chaoyong.he@corigine.com>
> > Reviewed-by: Niklas Söderlund <niklas.soderlund@corigine.com>
> 
> <...>
> 
> > +
> > +struct dp_packet {
> > +	struct rte_mbuf mbuf;
> > +	uint32_t source;
> > +};
> > +
> > +static void
> > +nfp_flower_pf_mp_init(__rte_unused struct rte_mempool *mp,
> > +		__rte_unused void *opaque_arg,
> > +		void *packet,
> > +		__rte_unused unsigned int i)
> > +{
> > +	struct dp_packet *pkt = packet;
> > +	/* Indicate that this pkt is from DPDK */
> > +	pkt->source = 3;
> > +}
> > +
> > +static struct rte_mempool *
> > +nfp_flower_pf_mp_create(void)
> > +{
> > +	uint32_t nb_mbufs;
> > +	unsigned int numa_node;
> > +	struct rte_mempool *pktmbuf_pool;
> > +	uint32_t n_rxd = PF_VNIC_NB_DESC;
> > +	uint32_t n_txd = PF_VNIC_NB_DESC;
> > +
> > +	nb_mbufs = RTE_MAX(n_rxd + n_txd + MAX_PKT_BURST +
> > +MEMPOOL_CACHE_SIZE, 81920U);
> > +
> > +	numa_node = rte_socket_id();
> > +	pktmbuf_pool = rte_pktmbuf_pool_create("flower_pf_mbuf_pool",
> nb_mbufs,
> > +			MEMPOOL_CACHE_SIZE, MBUF_PRIV_SIZE,
> > +			RTE_MBUF_DEFAULT_BUF_SIZE, numa_node);
> > +	if (pktmbuf_pool == NULL) {
> > +		PMD_INIT_LOG(ERR, "Cannot init pf vnic mbuf pool");
> > +		return NULL;
> > +	}
> > +
> > +	rte_mempool_obj_iter(pktmbuf_pool, nfp_flower_pf_mp_init,
> NULL);
> > +
> > +	return pktmbuf_pool;
> > +}
> > +
> 
> Hi Chaoyong,
> 
> Again, similar comment to previous versions, what I understand is this new
> flower FW supports HW flow filter and intended use case is for OvS HW
> acceleration.
> But is DPDK driver need to know OvS data structures, like "struct dp_packet",
> can it be transparent to application, I am sure there are other devices
> offloading some OvS task to HW.
> 
> @Ian, @David,
> 
> Can you please comment on above usage, do you guys see any way to
> escape from OvS specific code in the driver?

First, I'll explain why we must include some OvS-specific code in the driver.
If we don't set `pkt->source = 3`, OvS will dump core like this:
```
(gdb) bt
#0  0x00007fe1d48fd387 in raise () from /lib64/libc.so.6
#1  0x00007fe1d48fea78 in abort () from /lib64/libc.so.6
#2  0x00007fe1d493ff67 in __libc_message () from /lib64/libc.so.6
#3  0x00007fe1d4948329 in _int_free () from /lib64/libc.so.6
#4  0x000000000049c006 in dp_packet_uninit (b=0x1f262db80) at lib/dp-packet.c:135
#5  0x000000000061440a in dp_packet_delete (b=0x1f262db80) at lib/dp-packet.h:261
#6  0x0000000000619aa0 in dpdk_copy_batch_to_mbuf (netdev=0x1f0a04a80, batch=0x7fe1b40050c0) at lib/netdev-dpdk.c:274
#7  0x0000000000619b46 in netdev_dpdk_common_send (netdev=0x1f0a04a80, batch=0x7fe1b40050c0, stats=0x7fe1be7321f0) at
#8  0x000000000061a0ba in netdev_dpdk_eth_send (netdev=0x1f0a04a80, qid=0, batch=0x7fe1b40050c0, concurrent_txq=true)
#9  0x00000000004fbd10 in netdev_send (netdev=0x1f0a04a80, qid=0, batch=0x7fe1b40050c0, concurrent_txq=true) at lib/n
#10 0x00000000004aa663 in dp_netdev_pmd_flush_output_on_port (pmd=0x7fe1be735010, p=0x7fe1b4005090) at lib/dpif-netde
#11 0x00000000004aa85d in dp_netdev_pmd_flush_output_packets (pmd=0x7fe1be735010, force=false) at lib/dpif-netdev.c:5
#12 0x00000000004aaaef in dp_netdev_process_rxq_port (pmd=0x7fe1be735010, rxq=0x16f3f80, port_no=3) at lib/dpif-netde
#13 0x00000000004af17a in pmd_thread_main (f_=0x7fe1be735010) at lib/dpif-netdev.c:6958
#14 0x000000000057da80 in ovsthread_wrapper (aux_=0x1608b30) at lib/ovs-thread.c:422
#15 0x00007fe1d51a6ea5 in start_thread () from /lib64/libpthread.so.0
#16 0x00007fe1d49c5b0d in clone () from /lib64/libc.so.6
```
The logic in function `dp_packet_delete()` runs into the wrong branch.
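
To illustrate the branch in question, here is a minimal model, not the OvS source. It assumes that OvS releases a buffer tagged as DPDK memory (value 3, the value the patch writes) back to its mempool and passes everything else to free(), which is what the backtrace suggests. All names in the sketch are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified model of the buffer origin tag; 3 is the value the patch writes. */
enum model_source { MODEL_MALLOC = 0, MODEL_STACK, MODEL_STUB, MODEL_DPDK };

/* The two release paths this sketch distinguishes. */
enum model_release { RELEASE_TO_MEMPOOL, RELEASE_WITH_FREE };

/* Hypothetical stand-in for the packet wrapper; only the tag matters here. */
struct model_packet {
	uint32_t source;
};

/* Decide how a buffer would be released, based only on its origin tag. */
static enum model_release
model_packet_release_path(const struct model_packet *pkt)
{
	assert(pkt != NULL);

	if (pkt->source == MODEL_DPDK)
		return RELEASE_TO_MEMPOOL;  /* hand the mbuf back to its mempool */

	return RELEASE_WITH_FREE;           /* free(): invalid for mempool memory */
}
```

In this model an untagged mempool object (source left at 0) takes the free() path, which would match the abort inside `_int_free()` shown above.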

Then, why does only our PMD need to do this, while other PMDs don't?
Generally, it depends heavily on the hardware.

The Netronome Network Flow Processor 4xxx (NFP-4xxx) card is the target card of this patch series.
It has only one PF but 2 physical ports, and the NFP PMD can work with up to 8 ports on the same PF device.
The hardware of other PMDs all seems to be 'one PF <--> one physical port'.

For the OvS use case, we should add the representor port of the 'physical port' to the bridge, not the representor port of the PF like other PMDs.

We use a two-layer poll mode architecture. (Other PMDs use a simple poll mode architecture.)
In the RX direction:
1. When the physical port or VF receives pkts, the firmware prepends meta-data (indicating the input port) to the pkt.
2. We use the PF vNIC as a multiplexer, which keeps polling pkts from the firmware.
3. The PF vNIC parses the meta-data and enqueues the pkt into the corresponding rte_ring of the representor port of the physical port or VF.
4. OVS polls pkts from the RX function of the representor port, which dequeues pkts from the rte_ring.
In the TX direction:
1. OVS sends the pkts through the TX function of the representor port.
2. The representor port prepends meta-data (indicating the output port) to the pkt and sends the pkt to the firmware through queue 0 of the PF vNIC.
3. The firmware parses the meta-data and forwards the pkt to the corresponding physical port or VF.
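
The RX and TX steps above can be sketched roughly as follows. This is a self-contained model in plain C, not the driver code: the packet layout, the ring (a simple FIFO standing in for an rte_ring), the sizes, and all function names are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NB_REPR   8u    /* representor ports behind one PF (assumed) */
#define RING_SIZE 16u   /* per-representor ring capacity (assumed) */

/* Stand-in for a packet: one meta-data word followed by payload. */
struct model_pkt {
	uint32_t port_meta;  /* input/output port carried as meta-data */
	uint32_t payload;
};

/* Minimal single-threaded FIFO standing in for an rte_ring. */
struct model_ring {
	struct model_pkt *slots[RING_SIZE];
	unsigned int head;   /* next slot to dequeue */
	unsigned int count;  /* packets currently queued */
};

static int
ring_enqueue(struct model_ring *r, struct model_pkt *pkt)
{
	if (r->count == RING_SIZE)
		return -1;
	r->slots[(r->head + r->count) % RING_SIZE] = pkt;
	r->count++;
	return 0;
}

static struct model_pkt *
ring_dequeue(struct model_ring *r)
{
	struct model_pkt *pkt;

	if (r->count == 0)
		return NULL;
	pkt = r->slots[r->head];
	r->head = (r->head + 1) % RING_SIZE;
	r->count--;
	return pkt;
}

/* RX layer 1: the PF vNIC reads the meta-data and fans out to the rings. */
static unsigned int
pf_vnic_demux(struct model_ring rings[NB_REPR], struct model_pkt *burst[],
		unsigned int nb_pkts)
{
	unsigned int i, queued = 0;

	for (i = 0; i < nb_pkts; i++) {
		uint32_t port = burst[i]->port_meta;

		/* Unknown port or full ring: drop (a real driver would free the mbuf). */
		if (port >= NB_REPR || ring_enqueue(&rings[port], burst[i]) != 0)
			continue;
		queued++;
	}
	return queued;
}

/* RX layer 2: the representor RX function drains its own ring. */
static unsigned int
repr_rx_burst(struct model_ring *ring, struct model_pkt *out[], unsigned int max)
{
	unsigned int n = 0;

	while (n < max && (out[n] = ring_dequeue(ring)) != NULL)
		n++;
	return n;
}

/* TX: the representor tags the packet; PF vNIC queue 0 would then send it. */
static void
repr_tx_tag(struct model_pkt *pkt, uint32_t repr_port)
{
	assert(repr_port < NB_REPR);
	pkt->port_meta = repr_port;
}
```

The point of the model is that the application only ever calls the representor RX/TX functions; the PF vNIC polling and the meta-data handling stay inside the PMD.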

So OvS won't create the mempool for us, and we must create it ourselves for the PF vNIC to use.

Hopefully, I explained things clearly. Thanks.


* Re: [PATCH v9 07/12] net/nfp: add flower ctrl VNIC related logics
  2022-09-21  2:02     ` Chaoyong He
@ 2022-09-21  7:29       ` Thomas Monjalon
  2022-09-21  7:42         ` Chaoyong He
  0 siblings, 1 reply; 22+ messages in thread
From: Thomas Monjalon @ 2022-09-21  7:29 UTC (permalink / raw)
  To: Ferruh Yigit, dev, Chaoyong He
  Cc: oss-drivers, Niklas Soderlund, Jerin Jacob Kollanukkaran,
	Andrew Rybchenko, David Marchand

21/09/2022 04:02, Chaoyong He:
> > On 9/15/2022 11:44 AM, Chaoyong He wrote:
> > > Adds the setup/start logic for the ctrl vNIC. This vNIC is used by the
> > > PMD and flower firmware application as a communication channel
> > between
> > > driver and firmware. In the case of OVS it is also used to communicate
> > > flow statistics from hardware to the driver.
> > >
> > > A rte_eth device is not exposed to DPDK for this vNIC as it is
> > > strictly used internally by flower logic.
> > >
> > 
> > Hi Chaoyong,
> > 
> > Similar comment with previous versions, interface is created using regular
> > 'rte_eth_dev_allocate()' API, I think interface will be visible to application, I
> > can't understand the need of creating an interface for control.

You didn't reply to this.
Why should the control port be exposed to the application?
We recommend not using ethdev for this.


> > What is the communication method between driver and FW?
> > Since one of the following patches (09/12) introduces Rx/Tx for ctrl interface,
> > is device interface is control packets (similar to network data packets)?
> > 
> 
> Basically, the 'control message' is exist in the form of normal data packets.
> 
> When we use the flower firmware application, there exist two types of packets for now,
> and they are identified only from the prepend meta-data.
> 
> Bit    3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0
> -----\ 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
> Word  +---------------+---------------+---------------+---------------+
>    0  |    type       |     type      |     type      |     type      |
>       +---------------+---------------+---------------+---------------+
> The 'control message' packets are processed by the ctrl vNIC.
> The 'normal' packets are processed by the pf vNIC.
> 
> The communication method between driver and firmware is decided by the
> designment of hardware and firmware.
> 
> The kernel driver also has the same ctrl vNIC and pf vNIC ethdev and the usage is same.
> 
> > > Because of the add of ctrl vNIC, a new PCItoCPPBar is needed. Modify
> > > the related logics.
> > >
> > > Signed-off-by: Chaoyong He <chaoyong.he@corigine.com>
> > > Reviewed-by: Niklas Söderlund <niklas.soderlund@corigine.com>





* Re: [PATCH v9 05/12] net/nfp: add flower PF setup logic
  2022-09-21  2:50     ` Chaoyong He
@ 2022-09-21  7:35       ` Thomas Monjalon
  2022-09-21  7:47         ` Chaoyong He
  0 siblings, 1 reply; 22+ messages in thread
From: Thomas Monjalon @ 2022-09-21  7:35 UTC (permalink / raw)
  To: Ferruh Yigit, dev, Ian Stokes, David Marchand, Chaoyong He
  Cc: oss-drivers, Niklas Soderlund, Andrew Rybchenko,
	Jerin Jacob Kollanukkaran

I don't fully understand your logic,
but I understand you need special code to make your hardware work with OvS,
meaning:
	- OvS must have special handling for your HW
	- other applications won't work
Tell me if I misunderstand,
but I feel we should not accept this patch;
there is probably a better way to manage the specifics of your HW.

You said "NFP PMD can work with up to 8 ports on the same PF device."
Let's imagine you have 8 ports for 1 PF device.
Do you allocate 8 ethdev ports?
If yes, then each ethdev should do the internal work,
and nothing is needed at application level.


21/09/2022 04:50, Chaoyong He:
> > On 9/15/2022 11:44 AM, Chaoyong He wrote:
> > Hi Chaoyong,
> > 
> > Again, similar comment to previous versions, what I understand is this new
> > flower FW supports HW flow filter and intended use case is for OvS HW
> > acceleration.
> > But is DPDK driver need to know OvS data structures, like "struct dp_packet",
> > can it be transparent to application, I am sure there are other devices
> > offloading some OvS task to HW.
> > 
> > @Ian, @David,
> > 
> > Can you please comment on above usage, do you guys see any way to
> > escape from OvS specific code in the driver?
> 
> Firstly, I'll explain why we must include some OvS specific code in the driver.
> If we don't set the `pkt->source = 3`, the OvS will coredump like this:
> ```
> (gdb) bt
> #0  0x00007fe1d48fd387 in raise () from /lib64/libc.so.6
> #1  0x00007fe1d48fea78 in abort () from /lib64/libc.so.6
> #2  0x00007fe1d493ff67 in __libc_message () from /lib64/libc.so.6
> #3  0x00007fe1d4948329 in _int_free () from /lib64/libc.so.6
> #4  0x000000000049c006 in dp_packet_uninit (b=0x1f262db80) at lib/dp-packet.c:135
> #5  0x000000000061440a in dp_packet_delete (b=0x1f262db80) at lib/dp-packet.h:261
> #6  0x0000000000619aa0 in dpdk_copy_batch_to_mbuf (netdev=0x1f0a04a80, batch=0x7fe1b40050c0) at lib/netdev-dpdk.c:274
> #7  0x0000000000619b46 in netdev_dpdk_common_send (netdev=0x1f0a04a80, batch=0x7fe1b40050c0, stats=0x7fe1be7321f0) at
> #8  0x000000000061a0ba in netdev_dpdk_eth_send (netdev=0x1f0a04a80, qid=0, batch=0x7fe1b40050c0, concurrent_txq=true)
> #9  0x00000000004fbd10 in netdev_send (netdev=0x1f0a04a80, qid=0, batch=0x7fe1b40050c0, concurrent_txq=true) at lib/n
> #10 0x00000000004aa663 in dp_netdev_pmd_flush_output_on_port (pmd=0x7fe1be735010, p=0x7fe1b4005090) at lib/dpif-netde
> #11 0x00000000004aa85d in dp_netdev_pmd_flush_output_packets (pmd=0x7fe1be735010, force=false) at lib/dpif-netdev.c:5
> #12 0x00000000004aaaef in dp_netdev_process_rxq_port (pmd=0x7fe1be735010, rxq=0x16f3f80, port_no=3) at lib/dpif-netde
> #13 0x00000000004af17a in pmd_thread_main (f_=0x7fe1be735010) at lib/dpif-netdev.c:6958
> #14 0x000000000057da80 in ovsthread_wrapper (aux_=0x1608b30) at lib/ovs-thread.c:422
> #15 0x00007fe1d51a6ea5 in start_thread () from /lib64/libpthread.so.0
> #16 0x00007fe1d49c5b0d in clone () from /lib64/libc.so.6
> ```
> The logic in function `dp_packet_delete()` run into the wrong branch.
> 
> Then, why just our PMD need do this, and other PMDs don't?
> Generally, it's greatly dependent on the hardware.
> 
> The Netronome's Network Flow Processor 4xxx (NFP-4xxx) card is the target card of these series patches.
> Which only has one PF but has 2 physical ports, and the NFP PMD can work with up to 8 ports on the same PF device. 
> Other PMDs hardware seems all 'one PF <--> one physical port'.
> 
> For the use case of OvS, we should add the representor port of 'physical port' to the bridge, not the representor port of PF like other PMDs.
> 
> We use a two-layer poll mode architecture. (Other PMDs are simple poll mode architecture)
> In the RX direction:
> 1. When the physical port or vf receives pkts, the firmware will prepend a meta-data(indicating the input port) into the pkt.
> 2. We use the PF vNIC as a multiplexer, which keeps polling pkts from the firmware.
> 3. The PF vNIC will parse the meta-data, and enqueue the pkt into the corresponding rte_ring of the representor port of physical port or vf.
> 4. The OVS will polling pkts from the RX function of representor port, which dequeue pkts from the rte_ring.
> In the TX direction:
> 1. The OVS send the pkts from the TX functions of representor port.
> 2. The representor port will prepend a meta-data(indicating the output port) into the pkt and send the pkt to firmware through the queue 0 of PF vNIC.
> 3. The firmware will parse the meta-data, and forward the pkt to the corresponding physical port or vf.
> 
> So the OvS won't create the mempool for us and we must create it ourselves for the PF vNIC to use.
> 
> Hopefully, I explained the things clearly. Thanks.





* RE: [PATCH v9 07/12] net/nfp: add flower ctrl VNIC related logics
  2022-09-21  7:29       ` Thomas Monjalon
@ 2022-09-21  7:42         ` Chaoyong He
  0 siblings, 0 replies; 22+ messages in thread
From: Chaoyong He @ 2022-09-21  7:42 UTC (permalink / raw)
  To: Thomas Monjalon, Ferruh Yigit, dev
  Cc: oss-drivers, Niklas Soderlund, Jerin Jacob Kollanukkaran,
	Andrew Rybchenko, David Marchand

> 21/09/2022 04:02, Chaoyong He:
> > > On 9/15/2022 11:44 AM, Chaoyong He wrote:
> > > > Adds the setup/start logic for the ctrl vNIC. This vNIC is used by
> > > > the PMD and flower firmware application as a communication channel
> > > between
> > > > driver and firmware. In the case of OVS it is also used to
> > > > communicate flow statistics from hardware to the driver.
> > > >
> > > > A rte_eth device is not exposed to DPDK for this vNIC as it is
> > > > strictly used internally by flower logic.
> > > >
> > >
> > > Hi Chaoyong,
> > >
> > > Similar comment with previous versions, interface is created using
> > > regular 'rte_eth_dev_allocate()' API, I think interface will be
> > > visible to application, I can't understand the need of creating an interface
> for control.
> 
> You didn't reply to this.
> Why the control port should be exposed to the application?
> We recommend not using ethdev for this.
> 

Actually, in v1--v5 of this patch series, we did create a control port which is not
in the rte_eth_devices[] array, so it won't be exposed to the application.

> 
> > > What is the communication method between driver and FW?
> > > Since one of the following patches (09/12) introduces Rx/Tx for ctrl
> > > interface, is device interface is control packets (similar to network data
> packets)?
> > >
> >
> > Basically, the 'control message' is exist in the form of normal data packets.
> >
> > When we use the flower firmware application, there exist two types of
> > packets for now, and they are identified only from the prepend meta-data.
> >
> > Bit    3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0
> > -----\ 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
> > Word  +---------------+---------------+---------------+---------------+
> >    0  |    type       |     type      |     type      |     type      |
> >
> > +---------------+---------------+---------------+---------------+
> > The 'control message' packets are processed by the ctrl vNIC.
> > The 'normal' packets are processed by the pf vNIC.
> >
> > The communication method between driver and firmware is decided by
> the
> > designment of hardware and firmware.
> >
> > The kernel driver also has the same ctrl vNIC and pf vNIC ethdev and the
> usage is same.
> >
> > > > Because of the add of ctrl vNIC, a new PCItoCPPBar is needed.
> > > > Modify the related logics.
> > > >
> > > > Signed-off-by: Chaoyong He <chaoyong.he@corigine.com>
> > > > Reviewed-by: Niklas Söderlund <niklas.soderlund@corigine.com>
> 
> 



* RE: [PATCH v9 05/12] net/nfp: add flower PF setup logic
  2022-09-21  7:35       ` Thomas Monjalon
@ 2022-09-21  7:47         ` Chaoyong He
  0 siblings, 0 replies; 22+ messages in thread
From: Chaoyong He @ 2022-09-21  7:47 UTC (permalink / raw)
  To: Thomas Monjalon, Ferruh Yigit, dev, Ian Stokes, David Marchand
  Cc: oss-drivers, Niklas Soderlund, Andrew Rybchenko,
	Jerin Jacob Kollanukkaran

> Subject: Re: [PATCH v9 05/12] net/nfp: add flower PF setup logic
> 
> I don't understand your logic fully,
> but I understand you need special code to make your hardware work with
> OvS,
> meaning:
> 	- OvS must have a special handling for your HW
> 	- other applications won't work
> Tell me I misunderstand,
> but I feel we should not accept this patch, there is probably a better way to
> manage the specific of your HW.

OvS does not need any special handling for our HW.
"Other applications won't work" -- sorry, I don't understand what you mean by this point.

> You said "NFP PMD can work with up to 8 ports on the same PF device."
> Let's imagine you have 8 ports for 1 PF device.
> Do you allocate 8 ethdev ports?
> If yes, then each ethdev should do the internal work, and nothing is needed
> at application level.

No, we still create just 1 PF vNIC to handle the feedback traffic.
Of course we will create 8 representor ports for the physical ports.
 
> 21/09/2022 04:50, Chaoyong He:
> > > On 9/15/2022 11:44 AM, Chaoyong He wrote:
> > > Hi Chaoyong,
> > >
> > > Again, similar comment to previous versions, what I understand is
> > > this new flower FW supports HW flow filter and intended use case is
> > > for OvS HW acceleration.
> > > But is DPDK driver need to know OvS data structures, like "struct
> > > dp_packet", can it be transparent to application, I am sure there
> > > are other devices offloading some OvS task to HW.
> > >
> > > @Ian, @David,
> > >
> > > Can you please comment on above usage, do you guys see any way to
> > > escape from OvS specific code in the driver?
> >
> > Firstly, I'll explain why we must include some OvS specific code in the driver.
> > If we don't set the `pkt->source = 3`, the OvS will coredump like this:
> > ```
> > (gdb) bt
> > #0  0x00007fe1d48fd387 in raise () from /lib64/libc.so.6
> > #1  0x00007fe1d48fea78 in abort () from /lib64/libc.so.6
> > #2  0x00007fe1d493ff67 in __libc_message () from /lib64/libc.so.6
> > #3  0x00007fe1d4948329 in _int_free () from /lib64/libc.so.6
> > #4  0x000000000049c006 in dp_packet_uninit (b=0x1f262db80) at
> > lib/dp-packet.c:135
> > #5  0x000000000061440a in dp_packet_delete (b=0x1f262db80) at
> > lib/dp-packet.h:261
> > #6  0x0000000000619aa0 in dpdk_copy_batch_to_mbuf
> (netdev=0x1f0a04a80,
> > batch=0x7fe1b40050c0) at lib/netdev-dpdk.c:274
> > #7  0x0000000000619b46 in netdev_dpdk_common_send
> (netdev=0x1f0a04a80,
> > batch=0x7fe1b40050c0, stats=0x7fe1be7321f0) at
> > #8  0x000000000061a0ba in netdev_dpdk_eth_send (netdev=0x1f0a04a80,
> > qid=0, batch=0x7fe1b40050c0, concurrent_txq=true)
> > #9  0x00000000004fbd10 in netdev_send (netdev=0x1f0a04a80, qid=0,
> > batch=0x7fe1b40050c0, concurrent_txq=true) at lib/n
> > #10 0x00000000004aa663 in dp_netdev_pmd_flush_output_on_port
> > (pmd=0x7fe1be735010, p=0x7fe1b4005090) at lib/dpif-netde
> > #11 0x00000000004aa85d in dp_netdev_pmd_flush_output_packets
> > (pmd=0x7fe1be735010, force=false) at lib/dpif-netdev.c:5
> > #12 0x00000000004aaaef in dp_netdev_process_rxq_port
> > (pmd=0x7fe1be735010, rxq=0x16f3f80, port_no=3) at lib/dpif-netde
> > #13 0x00000000004af17a in pmd_thread_main (f_=0x7fe1be735010) at
> > lib/dpif-netdev.c:6958
> > #14 0x000000000057da80 in ovsthread_wrapper (aux_=0x1608b30) at
> > lib/ovs-thread.c:422
> > #15 0x00007fe1d51a6ea5 in start_thread () from /lib64/libpthread.so.0
> > #16 0x00007fe1d49c5b0d in clone () from /lib64/libc.so.6 ``` The logic
> > in function `dp_packet_delete()` run into the wrong branch.
> >
> > Then, why just our PMD need do this, and other PMDs don't?
> > Generally, it's greatly dependent on the hardware.
> >
> > The Netronome's Network Flow Processor 4xxx (NFP-4xxx) card is the
> target card of these series patches.
> > Which only has one PF but has 2 physical ports, and the NFP PMD can work
> with up to 8 ports on the same PF device.
> > Other PMDs hardware seems all 'one PF <--> one physical port'.
> >
> > For the use case of OvS, we should add the representor port of 'physical
> port' to the bridge, not the representor port of PF like other PMDs.
> >
> > We use a two-layer poll mode architecture. (Other PMDs are simple poll
> > mode architecture) In the RX direction:
> > 1. When the physical port or vf receives pkts, the firmware will prepend a
> meta-data(indicating the input port) into the pkt.
> > 2. We use the PF vNIC as a multiplexer, which keeps polling pkts from the
> firmware.
> > 3. The PF vNIC will parse the meta-data, and enqueue the pkt into the
> corresponding rte_ring of the representor port of physical port or vf.
> > 4. The OVS will polling pkts from the RX function of representor port, which
> dequeue pkts from the rte_ring.
> > In the TX direction:
> > 1. The OVS send the pkts from the TX functions of representor port.
> > 2. The representor port will prepend a meta-data(indicating the output
> port) into the pkt and send the pkt to firmware through the queue 0 of PF
> vNIC.
> > 3. The firmware will parse the meta-data, and forward the pkt to the
> corresponding physical port or vf.
> >
> > So the OvS won't create the mempool for us and we must create it
> ourselves for the PF vNIC to use.
> >
> > Hopefully, I explained the things clearly. Thanks.
> 
> 



end of thread, other threads:[~2022-09-21  7:47 UTC | newest]

Thread overview: 22+ messages
2022-09-15 10:44 [PATCH v9 00/12] preparation for the rte_flow offload of nfp PMD Chaoyong He
2022-09-15 10:44 ` [PATCH v9 01/12] net/nfp: move app specific attributes to own struct Chaoyong He
2022-09-15 10:44 ` [PATCH v9 02/12] net/nfp: simplify initialization and remove dead code Chaoyong He
2022-09-15 10:44 ` [PATCH v9 03/12] net/nfp: move app specific init logic to own function Chaoyong He
2022-09-15 10:44 ` [PATCH v9 04/12] net/nfp: add initial flower firmware support Chaoyong He
2022-09-15 10:44 ` [PATCH v9 05/12] net/nfp: add flower PF setup logic Chaoyong He
2022-09-20 14:57   ` Ferruh Yigit
2022-09-21  2:50     ` Chaoyong He
2022-09-21  7:35       ` Thomas Monjalon
2022-09-21  7:47         ` Chaoyong He
2022-09-15 10:44 ` [PATCH v9 06/12] net/nfp: add flower PF related routines Chaoyong He
2022-09-15 10:44 ` [PATCH v9 07/12] net/nfp: add flower ctrl VNIC related logics Chaoyong He
2022-09-20 14:56   ` Ferruh Yigit
2022-09-21  2:02     ` Chaoyong He
2022-09-21  7:29       ` Thomas Monjalon
2022-09-21  7:42         ` Chaoyong He
2022-09-15 10:44 ` [PATCH v9 08/12] net/nfp: move common rxtx function for flower use Chaoyong He
2022-09-15 10:44 ` [PATCH v9 09/12] net/nfp: add flower ctrl VNIC rxtx logic Chaoyong He
2022-09-15 10:44 ` [PATCH v9 10/12] net/nfp: add flower representor framework Chaoyong He
2022-09-15 10:44 ` [PATCH v9 11/12] net/nfp: move rxtx function to header file Chaoyong He
2022-09-15 10:44 ` [PATCH v9 12/12] net/nfp: add flower PF rxtx logic Chaoyong He
2022-09-20 14:56 ` [PATCH v9 00/12] preparation for the rte_flow offload of nfp PMD Ferruh Yigit
