* [dpdk-dev] [PATCH 0/2] net/softnic: sw fall-back for traffic management
@ 2017-05-26 18:11 Jasvinder Singh
2017-05-26 18:11 ` [dpdk-dev] [PATCH 1/2] net/softnic: add softnic PMD " Jasvinder Singh
` (3 more replies)
0 siblings, 4 replies; 79+ messages in thread
From: Jasvinder Singh @ 2017-05-26 18:11 UTC (permalink / raw)
To: dev
Cc: cristian.dumitrescu, ferruh.yigit, hemant.agrawal,
Jerin.JacobKollanukkaran, wenzhuo.lu
The SoftNIC PMD provides a SW fall-back option for NICs that do not support
the Traffic Management (TM) features.
SoftNIC PMD overview:
- The SW fall-back is based on the existing librte_sched DPDK library.
- The TM-agnostic port (the underlay device) is wrapped into a TM-aware
softnic port (the overlay device).
- Once the overlay device (virtual device) is created, the configuration of
the underlay device takes place through the overlay device.
- The SoftNIC PMD is generic, i.e. it works for any underlay device PMD that
implements the ethdev API.
Similarly to the Ring PMD, the SoftNIC virtual device can be created in two
different ways:
1. Through EAL command line (--vdev option)
2. Through the rte_eth_softnic_create() API function called by the application
SoftNIC PMD params:
- iface (mandatory): the ethdev port name (i.e. PCI address or vdev name) for
the underlay device
- txq_id (optional, default = 0): tx queue id of the underlay device
- deq_bsz (optional, default = 24): traffic manager dequeue burst size
- Example: --vdev 'net_softnic0,iface=0000:04:00.1,txq_id=0,deq_bsz=28'
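The same overlay port can also be created from application code. A minimal
sketch based on the rte_eth_softnic_params structure added by this patch set
(the port name and PCI address below are placeholders):

  #include "rte_eth_softnic.h"

  static int
  app_create_softnic_port(void)
  {
      /* Equivalent of the --vdev example above, done through the API;
       * "net_softnic0" and "0000:04:00.1" are placeholder names.
       */
      struct rte_eth_softnic_params params = {
          .oname = "net_softnic0",  /* overlay (virtual) device to create */
          .uname = "0000:04:00.1",  /* existing underlay ethdev */
          .txq_id = 0,              /* underlay tx queue used on egress */
          .deq_bsz = 28,            /* traffic manager dequeue burst size */
      };

      return rte_eth_softnic_create(&params);
  }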
SoftNIC PMD build instructions:
- To build the SoftNIC PMD, the following option needs to be set in the
config/common_base file: CONFIG_RTE_LIBRTE_PMD_SOFTNIC=y
- The SoftNIC PMD depends on the TM API [1] and therefore is initially
targeted for the tm-next repository
Patch 1 adds softnic device PMD for traffic management.
Patch 2 adds the traffic management ops suggested in the generic ethdev API
for traffic management [1] to the softnic device.
[1] TM API version 4:
http://www.dpdk.org/dev/patchwork/patch/24411/,
http://www.dpdk.org/dev/patchwork/patch/24412/
Jasvinder Singh (2):
net/softnic: add softnic PMD for traffic management
net/softnic: add traffic management ops
MAINTAINERS | 5 +
config/common_base | 5 +
drivers/net/Makefile | 5 +
drivers/net/softnic/Makefile | 58 ++
drivers/net/softnic/rte_eth_softnic.c | 578 ++++++++++++
drivers/net/softnic/rte_eth_softnic.h | 99 ++
drivers/net/softnic/rte_eth_softnic_default.c | 1104 +++++++++++++++++++++++
drivers/net/softnic/rte_eth_softnic_internals.h | 93 ++
drivers/net/softnic/rte_eth_softnic_tm.c | 235 +++++
drivers/net/softnic/rte_eth_softnic_version.map | 7 +
mk/rte.app.mk | 5 +-
11 files changed, 2193 insertions(+), 1 deletion(-)
create mode 100644 drivers/net/softnic/Makefile
create mode 100644 drivers/net/softnic/rte_eth_softnic.c
create mode 100644 drivers/net/softnic/rte_eth_softnic.h
create mode 100644 drivers/net/softnic/rte_eth_softnic_default.c
create mode 100644 drivers/net/softnic/rte_eth_softnic_internals.h
create mode 100644 drivers/net/softnic/rte_eth_softnic_tm.c
create mode 100644 drivers/net/softnic/rte_eth_softnic_version.map
--
2.9.3
* [dpdk-dev] [PATCH 1/2] net/softnic: add softnic PMD for traffic management
2017-05-26 18:11 [dpdk-dev] [PATCH 0/2] net/softnic: sw fall-back for traffic management Jasvinder Singh
@ 2017-05-26 18:11 ` Jasvinder Singh
2017-06-26 16:43 ` [dpdk-dev] [PATCH v2 0/2] net/softnic: sw fall-back " Jasvinder Singh
2017-05-26 18:11 ` [dpdk-dev] [PATCH " Jasvinder Singh
` (2 subsequent siblings)
3 siblings, 1 reply; 79+ messages in thread
From: Jasvinder Singh @ 2017-05-26 18:11 UTC (permalink / raw)
To: dev
Cc: cristian.dumitrescu, ferruh.yigit, hemant.agrawal,
Jerin.JacobKollanukkaran, wenzhuo.lu
The Softnic PMD implements the HQoS scheduler as a software fallback solution
for hardware with no HQoS support. When the application calls the rx function
on this device, it simply invokes the underlay device rx function. On the
egress path, the softnic tx function enqueues the packets into the QoS
scheduler. The packets are dequeued from the QoS scheduler and sent to the
underlay device through the rte_eth_softnic_run() API.
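For clarity, a minimal sketch of the intended application datapath loop
(overlay port id, queue id and burst size are placeholders; in this patch
rte_eth_softnic_run() is still a stub that gets wired to the scheduler by
patch 2):

  #include <rte_mbuf.h>
  #include <rte_ethdev.h>
  #include "rte_eth_softnic.h"

  static void
  softnic_fwd_loop(uint8_t port) /* overlay (softnic) port id */
  {
      struct rte_mbuf *pkts[32];
      uint16_t n;

      for ( ; ; ) {
          /* rx on the overlay port simply invokes the underlay rx */
          n = rte_eth_rx_burst(port, 0, pkts, 32);

          /* tx on the overlay port enqueues into the QoS scheduler;
           * packets not accepted are ignored in this sketch
           */
          if (n)
              rte_eth_tx_burst(port, 0, pkts, n);

          /* dequeue from the scheduler and send to the underlay device */
          rte_eth_softnic_run(port);
      }
  }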
Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
---
MAINTAINERS | 5 +
config/common_base | 5 +
drivers/net/Makefile | 5 +
drivers/net/softnic/Makefile | 57 ++
drivers/net/softnic/rte_eth_softnic.c | 535 +++++++++++
drivers/net/softnic/rte_eth_softnic.h | 99 ++
drivers/net/softnic/rte_eth_softnic_default.c | 1104 +++++++++++++++++++++++
drivers/net/softnic/rte_eth_softnic_internals.h | 67 ++
drivers/net/softnic/rte_eth_softnic_version.map | 7 +
mk/rte.app.mk | 5 +-
10 files changed, 1888 insertions(+), 1 deletion(-)
create mode 100644 drivers/net/softnic/Makefile
create mode 100644 drivers/net/softnic/rte_eth_softnic.c
create mode 100644 drivers/net/softnic/rte_eth_softnic.h
create mode 100644 drivers/net/softnic/rte_eth_softnic_default.c
create mode 100644 drivers/net/softnic/rte_eth_softnic_internals.h
create mode 100644 drivers/net/softnic/rte_eth_softnic_version.map
diff --git a/MAINTAINERS b/MAINTAINERS
index cdaf2ac..16a53de 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -490,6 +490,11 @@ M: Tetsuya Mukawa <mtetsuyah@gmail.com>
F: drivers/net/null/
F: doc/guides/nics/features/null.ini
+Softnic PMD
+M: Jasvinder Singh <jasvinder.singh@intel.com>
+M: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
+F: drivers/net/softnic
+
Crypto Drivers
--------------
diff --git a/config/common_base b/config/common_base
index 8907bea..f526b90 100644
--- a/config/common_base
+++ b/config/common_base
@@ -273,6 +273,11 @@ CONFIG_RTE_LIBRTE_SFC_EFX_PMD=y
CONFIG_RTE_LIBRTE_SFC_EFX_DEBUG=n
#
+# Compile SOFTNIC PMD
+#
+CONFIG_RTE_LIBRTE_PMD_SOFTNIC=y
+
+#
# Compile software PMD backed by SZEDATA2 device
#
CONFIG_RTE_LIBRTE_PMD_SZEDATA2=n
diff --git a/drivers/net/Makefile b/drivers/net/Makefile
index 35ed813..db73d33 100644
--- a/drivers/net/Makefile
+++ b/drivers/net/Makefile
@@ -108,4 +108,9 @@ DIRS-$(CONFIG_RTE_LIBRTE_PMD_VHOST) += vhost
endif # $(CONFIG_RTE_LIBRTE_VHOST)
DEPDIRS-vhost = $(core-libs) librte_vhost
+ifeq ($(CONFIG_RTE_LIBRTE_SCHED),y)
+DIRS-$(CONFIG_RTE_LIBRTE_PMD_SOFTNIC) += softnic
+endif # $(CONFIG_RTE_LIBRTE_SCHED)
+DEPDIRS-softnic = $(core-libs) librte_sched
+
include $(RTE_SDK)/mk/rte.subdir.mk
diff --git a/drivers/net/softnic/Makefile b/drivers/net/softnic/Makefile
new file mode 100644
index 0000000..c0374fa
--- /dev/null
+++ b/drivers/net/softnic/Makefile
@@ -0,0 +1,57 @@
+# BSD LICENSE
+#
+# Copyright(c) 2017 Intel Corporation. All rights reserved.
+# All rights reserved.
+#
+# Redistribution and use in source and binary forms, with or without
+# modification, are permitted provided that the following conditions
+# are met:
+#
+# * Redistributions of source code must retain the above copyright
+# notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+# notice, this list of conditions and the following disclaimer in
+# the documentation and/or other materials provided with the
+# distribution.
+# * Neither the name of Intel Corporation nor the names of its
+# contributors may be used to endorse or promote products derived
+# from this software without specific prior written permission.
+#
+# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+#
+# library name
+#
+LIB = librte_pmd_softnic.a
+
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS)
+
+EXPORT_MAP := rte_eth_softnic_version.map
+
+LIBABIVER := 1
+
+#
+# all source are stored in SRCS-y
+#
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_SOFTNIC) += rte_eth_softnic.c
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_SOFTNIC) += rte_eth_softnic_default.c
+
+#
+# Export include files
+#
+SYMLINK-y-include += rte_eth_softnic.h
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/drivers/net/softnic/rte_eth_softnic.c b/drivers/net/softnic/rte_eth_softnic.c
new file mode 100644
index 0000000..529200e
--- /dev/null
+++ b/drivers/net/softnic/rte_eth_softnic.c
@@ -0,0 +1,535 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright(c) 2017 Intel Corporation. All rights reserved.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdint.h>
+#include <stdlib.h>
+#include <string.h>
+
+#include <rte_ethdev.h>
+#include <rte_ethdev_vdev.h>
+#include <rte_malloc.h>
+#include <rte_vdev.h>
+#include <rte_kvargs.h>
+#include <rte_errno.h>
+
+#include "rte_eth_softnic.h"
+#include "rte_eth_softnic_internals.h"
+
+#define PMD_PARAM_IFACE_NAME "iface"
+#define PMD_PARAM_IFACE_QUEUE "txq_id"
+#define PMD_PARAM_DEQ_BSZ "deq_bsz"
+
+static const char *pmd_valid_args[] = {
+ PMD_PARAM_IFACE_NAME,
+ PMD_PARAM_IFACE_QUEUE,
+ PMD_PARAM_DEQ_BSZ,
+ NULL
+};
+
+static struct rte_vdev_driver pmd_drv;
+
+static int
+pmd_eth_dev_configure(struct rte_eth_dev *dev)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ int status;
+
+ /* Stop the underlay device */
+ rte_eth_dev_stop(p->uport_id);
+
+ /* Call the current function for the underlay device */
+ status = rte_eth_dev_configure(p->uport_id,
+ dev->data->nb_rx_queues,
+ dev->data->nb_tx_queues,
+ &dev->data->dev_conf);
+ if (status)
+ return status;
+
+ /* Rework on the RX queues of the overlay device */
+ if (dev->data->rx_queues)
+ rte_free(dev->data->rx_queues);
+ dev->data->rx_queues = p->udev->data->rx_queues;
+
+ return 0;
+}
+
+static int
+pmd_eth_dev_tx_queue_setup(struct rte_eth_dev *dev,
+ uint16_t tx_queue_id,
+ uint16_t nb_tx_desc,
+ unsigned int socket_id,
+ const struct rte_eth_txconf *tx_conf)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ int status;
+
+ /* Call the current function for the underlay device */
+ status = rte_eth_tx_queue_setup(p->uport_id,
+ tx_queue_id,
+ nb_tx_desc,
+ socket_id,
+ tx_conf);
+ if (status)
+ return status;
+
+ /* Handle TX queue of the overlay device */
+ dev->data->tx_queues[tx_queue_id] = (void *) p;
+
+ return 0;
+}
+
+static int
+pmd_eth_dev_start(struct rte_eth_dev *dev)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ /* Clone dev->data from underlay to overlay */
+ memcpy(dev->data->mac_pool_sel,
+ p->udev->data->mac_pool_sel,
+ sizeof(dev->data->mac_pool_sel));
+ dev->data->promiscuous = p->udev->data->promiscuous;
+ dev->data->all_multicast = p->udev->data->all_multicast;
+
+ /* Call the current function for the underlay device */
+ return rte_eth_dev_start(p->uport_id);
+}
+
+static void
+pmd_eth_dev_stop(struct rte_eth_dev *dev)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ /* Call the current function for the underlay device */
+ rte_eth_dev_stop(p->uport_id);
+
+}
+
+static void
+pmd_eth_dev_close(struct rte_eth_dev *dev)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ /* Call the current function for the underlay device */
+ rte_eth_dev_close(p->uport_id);
+
+ /* Cleanup on the overlay device */
+ dev->data->rx_queues = NULL;
+ dev->data->tx_queues = NULL;
+
+ return;
+}
+
+static void
+pmd_eth_dev_promiscuous_enable(struct rte_eth_dev *dev)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ /* Call the current function for the underlay device */
+ rte_eth_promiscuous_enable(p->uport_id);
+}
+
+static void
+pmd_eth_dev_promiscuous_disable(struct rte_eth_dev *dev)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ /* Call the current function for the underlay device */
+ rte_eth_promiscuous_disable(p->uport_id);
+}
+
+static void
+pmd_eth_dev_all_multicast_enable(struct rte_eth_dev *dev)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ /* Call the current function for the underlay device */
+ rte_eth_allmulticast_enable(p->uport_id);
+}
+
+static void
+pmd_eth_dev_all_multicast_disable(struct rte_eth_dev *dev)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ /* Call the current function for the underlay device */
+ rte_eth_allmulticast_disable(p->uport_id);
+}
+
+static int
+pmd_eth_dev_link_update(struct rte_eth_dev *dev,
+ int wait_to_complete __rte_unused)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct rte_eth_link dev_link;
+
+ /* Call the current function for the underlay device */
+ rte_eth_link_get(p->uport_id, &dev_link);
+
+ /* Overlay device update */
+ dev->data->dev_link = dev_link;
+
+ return 0;
+}
+
+static int
+pmd_eth_dev_mtu_set(struct rte_eth_dev *dev, uint16_t mtu)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ int status;
+
+ /* Call the current function for the underlay device */
+ status = rte_eth_dev_set_mtu(p->uport_id, mtu);
+ if (status)
+ return status;
+
+ /* Overlay device update */
+ dev->data->mtu = mtu;
+
+ return 0;
+}
+
+static void
+pmd_eth_dev_mac_addr_set(struct rte_eth_dev *dev,
+ struct ether_addr *mac_addr)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ /* Call the current function for the underlay device */
+ rte_eth_dev_default_mac_addr_set(p->uport_id, mac_addr);
+}
+
+static int
+pmd_eth_dev_mac_addr_add(struct rte_eth_dev *dev,
+ struct ether_addr *mac_addr,
+ uint32_t index __rte_unused,
+ uint32_t vmdq)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ /* Call the current function for the underlay device */
+ return rte_eth_dev_mac_addr_add(p->uport_id, mac_addr, vmdq);
+}
+
+static void
+pmd_eth_dev_mac_addr_remove(struct rte_eth_dev *dev, uint32_t index)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ /* Call the current function for the underlay device */
+ rte_eth_dev_mac_addr_remove(p->uport_id, &dev->data->mac_addrs[index]);
+}
+
+static uint16_t
+pmd_eth_dev_tx_burst(void *txq,
+ struct rte_mbuf **tx_pkts,
+ uint16_t nb_pkts)
+{
+ struct pmd_internals *p = txq;
+
+ return rte_eth_tx_burst(p->uport_id, p->txq_id, tx_pkts, nb_pkts);
+
+}
+
+int
+rte_eth_softnic_run(uint8_t port_id __rte_unused)
+{
+ return 0;
+}
+
+static void
+pmd_ops_build(struct eth_dev_ops *o, const struct eth_dev_ops *u)
+{
+ /* Inherited functionality */
+ pmd_ops_inherit(o, u);
+
+ /* Derived functionality */
+ o->dev_configure = pmd_eth_dev_configure;
+ o->tx_queue_setup = pmd_eth_dev_tx_queue_setup;
+ o->dev_start = pmd_eth_dev_start;
+ o->dev_stop = pmd_eth_dev_stop;
+ o->dev_close = pmd_eth_dev_close;
+ o->promiscuous_enable = pmd_eth_dev_promiscuous_enable;
+ o->promiscuous_disable = pmd_eth_dev_promiscuous_disable;
+ o->allmulticast_enable = pmd_eth_dev_all_multicast_enable;
+ o->allmulticast_disable = pmd_eth_dev_all_multicast_disable;
+ o->link_update = pmd_eth_dev_link_update;
+ o->mtu_set = pmd_eth_dev_mtu_set;
+ o->mac_addr_set = pmd_eth_dev_mac_addr_set;
+ o->mac_addr_add = pmd_eth_dev_mac_addr_add;
+ o->mac_addr_remove = pmd_eth_dev_mac_addr_remove;
+}
+
+int
+rte_eth_softnic_create(struct rte_eth_softnic_params *params)
+{
+ struct rte_eth_dev_info uinfo;
+ struct rte_eth_dev *odev, *udev;
+ struct rte_eth_dev_data *odata, *udata;
+ struct eth_dev_ops *odev_ops;
+ const struct eth_dev_ops *udev_ops;
+ void **otx_queues;
+ struct pmd_internals *p;
+ int numa_node;
+ uint8_t oport_id, uport_id;
+
+ /* Check input arguments */
+ if ((params == NULL) ||
+ (params->oname == NULL) ||
+ (params->uname == NULL) ||
+ (params->deq_bsz > RTE_ETH_SOFTNIC_DEQ_BSZ_MAX))
+ return -EINVAL;
+
+ if (rte_eth_dev_get_port_by_name(params->uname, &uport_id))
+ return -EINVAL;
+ udev = &rte_eth_devices[uport_id];
+ udata = udev->data;
+ udev_ops = udev->dev_ops;
+ numa_node = udata->numa_node;
+
+ rte_eth_dev_info_get(uport_id, &uinfo);
+ if (params->txq_id >= uinfo.max_tx_queues)
+ return -EINVAL;
+
+ RTE_LOG(INFO, PMD, "Creating overlay device %s for underlay device %s\n",
+ params->oname, params->uname);
+
+ /* Overlay device ethdev entry: entry allocation */
+ odev = rte_eth_dev_allocate(params->oname);
+ if (!odev)
+ return -ENOMEM;
+ oport_id = odev - rte_eth_devices;
+
+ /* Overlay device ethdev entry: memory allocation */
+ odev_ops = rte_zmalloc_socket(params->oname,
+ sizeof(*odev_ops), 0, numa_node);
+ if (!odev_ops) {
+ rte_eth_dev_release_port(odev);
+ return -ENOMEM;
+ }
+ odev->dev_ops = odev_ops;
+
+ odata = rte_zmalloc_socket(params->oname,
+ sizeof(*odata), 0, numa_node);
+ if (!odata) {
+ rte_free(odev_ops);
+ rte_eth_dev_release_port(odev);
+ return -ENOMEM;
+ }
+ memmove(odata->name, odev->data->name, sizeof(odata->name));
+ odata->port_id = odev->data->port_id;
+ odata->mtu = odev->data->mtu;
+ odev->data = odata;
+
+ otx_queues = rte_zmalloc_socket(params->oname,
+ uinfo.max_tx_queues * sizeof(void *), 0, numa_node);
+ if (!otx_queues) {
+ rte_free(odata);
+ rte_free(odev_ops);
+ rte_eth_dev_release_port(odev);
+ return -ENOMEM;
+ }
+ odev->data->tx_queues = otx_queues;
+
+ p = rte_zmalloc_socket(params->oname,
+ sizeof(struct pmd_internals), 0, numa_node);
+ if (!p) {
+ rte_free(otx_queues);
+ rte_free(odata);
+ rte_free(odev_ops);
+ rte_eth_dev_release_port(odev);
+ return -ENOMEM;
+ }
+ odev->data->dev_private = p;
+
+ /* Overlay device ethdev entry: fill in dev */
+ odev->rx_pkt_burst = udev->rx_pkt_burst;
+ odev->tx_pkt_burst = pmd_eth_dev_tx_burst;
+ odev->tx_pkt_prepare = udev->tx_pkt_prepare;
+
+ /* Overlay device ethdev entry: fill in dev->data */
+ odev->data->dev_link = udev->data->dev_link;
+ odev->data->mtu = udev->data->mtu;
+ odev->data->min_rx_buf_size = udev->data->min_rx_buf_size;
+ odev->data->mac_addrs = udev->data->mac_addrs;
+ odev->data->hash_mac_addrs = udev->data->hash_mac_addrs;
+ odev->data->promiscuous = udev->data->promiscuous;
+ odev->data->all_multicast = udev->data->all_multicast;
+ odev->data->dev_flags = udev->data->dev_flags;
+ odev->data->kdrv = RTE_KDRV_NONE;
+ odev->data->numa_node = numa_node;
+ odev->data->drv_name = pmd_drv.driver.name;
+
+ /* Overlay device ethdev entry: fill in dev->dev_ops */
+ pmd_ops_build(odev_ops, udev_ops);
+
+ /* Overlay device ethdev entry: fill in dev->data->dev_private */
+ p->odev = odev;
+ p->udev = udev;
+ p->odata = odata;
+ p->udata = udata;
+ p->odev_ops = odev_ops;
+ p->udev_ops = udev_ops;
+ p->oport_id = oport_id;
+ p->uport_id = uport_id;
+ p->deq_bsz = params->deq_bsz;
+ p->txq_id = params->txq_id;
+
+ return 0;
+}
+
+static int
+get_string_arg(const char *key __rte_unused,
+ const char *value, void *extra_args)
+{
+ if (!value || !extra_args)
+ return -EINVAL;
+
+ *(char **)extra_args = strdup(value);
+
+ if (!*(char **)extra_args)
+ return -ENOMEM;
+
+ return 0;
+}
+
+static int
+get_int_arg(const char *key __rte_unused,
+ const char *value, void *extra_args)
+{
+ if (!value || !extra_args)
+ return -EINVAL;
+
+ *(uint32_t *)extra_args = strtoull(value, NULL, 0);
+
+ return 0;
+}
+
+static int
+pmd_probe(struct rte_vdev_device *dev)
+{
+ struct rte_eth_softnic_params p;
+ const char *params;
+ struct rte_kvargs *kvlist;
+ int ret;
+
+ if (!dev)
+ return -EINVAL;
+
+ RTE_LOG(INFO, PMD, "Probing device %s\n", rte_vdev_device_name(dev));
+
+ params = rte_vdev_device_args(dev);
+ if (!params)
+ return -EINVAL;
+
+ kvlist = rte_kvargs_parse(params, pmd_valid_args);
+ if (kvlist == NULL)
+ return -EINVAL;
+
+ p.oname = rte_vdev_device_name(dev);
+
+ /* Interface: Mandatory */
+ if (rte_kvargs_count(kvlist, PMD_PARAM_IFACE_NAME) != 1) {
+ ret = -EINVAL;
+ goto out_free;
+ }
+ ret = rte_kvargs_process(kvlist, PMD_PARAM_IFACE_NAME, &get_string_arg,
+ &p.uname);
+ if (ret < 0)
+ goto out_free;
+
+ /* Interface Queue ID: Optional */
+ if (rte_kvargs_count(kvlist, PMD_PARAM_IFACE_QUEUE) == 1) {
+ ret = rte_kvargs_process(kvlist, PMD_PARAM_IFACE_QUEUE, &get_int_arg,
+ &p.txq_id);
+ if (ret < 0)
+ goto out_free;
+ } else
+ p.txq_id = RTE_ETH_SOFTNIC_TXQ_ID_DEFAULT;
+
+ /* Dequeue Burst Size: Optional */
+ if (rte_kvargs_count(kvlist, PMD_PARAM_DEQ_BSZ) == 1) {
+ ret = rte_kvargs_process(kvlist, PMD_PARAM_DEQ_BSZ,
+ &get_int_arg, &p.deq_bsz);
+ if (ret < 0)
+ goto out_free;
+ } else
+ p.deq_bsz = RTE_ETH_SOFTNIC_DEQ_BSZ_DEFAULT;
+
+ ret = rte_eth_softnic_create(&p);
+
+out_free:
+ rte_kvargs_free(kvlist);
+ return ret;
+}
+
+static int
+pmd_remove(struct rte_vdev_device *dev)
+{
+ struct rte_eth_dev *eth_dev = NULL;
+ struct pmd_internals *p;
+ struct eth_dev_ops *dev_ops;
+
+ if (!dev)
+ return -EINVAL;
+
+ RTE_LOG(INFO, PMD, "Removing device %s\n", rte_vdev_device_name(dev));
+
+ /* Find the ethdev entry */
+ eth_dev = rte_eth_dev_allocated(rte_vdev_device_name(dev));
+ if (eth_dev == NULL)
+ return -ENODEV;
+ p = eth_dev->data->dev_private;
+ dev_ops = p->odev_ops;
+
+ pmd_eth_dev_stop(eth_dev);
+
+ /* Free device data structures*/
+ rte_free(eth_dev->data->dev_private);
+ rte_free(eth_dev->data->tx_queues);
+ rte_free(eth_dev->data);
+ rte_free(dev_ops);
+ rte_eth_dev_release_port(eth_dev);
+
+ return 0;
+}
+
+static struct rte_vdev_driver pmd_drv = {
+ .probe = pmd_probe,
+ .remove = pmd_remove,
+};
+
+RTE_PMD_REGISTER_VDEV(net_softnic, pmd_drv);
+RTE_PMD_REGISTER_PARAM_STRING(net_softnic,
+ PMD_PARAM_IFACE_NAME "=<string> "
+ PMD_PARAM_IFACE_QUEUE "=<int> "
+ PMD_PARAM_DEQ_BSZ "=<int>");
diff --git a/drivers/net/softnic/rte_eth_softnic.h b/drivers/net/softnic/rte_eth_softnic.h
new file mode 100644
index 0000000..cb7e391
--- /dev/null
+++ b/drivers/net/softnic/rte_eth_softnic.h
@@ -0,0 +1,99 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright(c) 2017 Intel Corporation. All rights reserved.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef __INCLUDE_RTE_ETH_SOFTNIC_H__
+#define __INCLUDE_RTE_ETH_SOFTNIC_H__
+
+#include <stdint.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#ifndef RTE_ETH_SOFTNIC_TXQ_ID_DEFAULT
+#define RTE_ETH_SOFTNIC_TXQ_ID_DEFAULT 0
+#endif
+
+#ifndef RTE_ETH_SOFTNIC_DEQ_BSZ_MAX
+#define RTE_ETH_SOFTNIC_DEQ_BSZ_MAX 256
+#endif
+
+#ifndef RTE_ETH_SOFTNIC_DEQ_BSZ_DEFAULT
+#define RTE_ETH_SOFTNIC_DEQ_BSZ_DEFAULT 24
+#endif
+
+struct rte_eth_softnic_params {
+ /**< Name of the overlay network interface (to be created) */
+ const char *oname;
+
+ /**< Name for the underlay network interface (existing) */
+ const char *uname;
+
+ /**< TX queue ID for the underlay device */
+ uint32_t txq_id;
+
+ /**< Dequeue burst size */
+ uint32_t deq_bsz;
+};
+
+/**
+ * Create a new overlay device
+ *
+ * @param params
+ * a pointer to a structure rte_eth_softnic_params which contains
+ * all the arguments required for creating the overlay device.
+ * @return
+ * 0 if device is created successfully, negative error code otherwise.
+ */
+int
+rte_eth_softnic_create(struct rte_eth_softnic_params *params);
+
+/**
+ * Run the traffic management on the overlay device
+ *
+ * This function dequeues the scheduled packets from the HQoS scheduler
+ * and transmits them to the underlay device interface.
+ *
+ * @param port_id
+ * Port id of the overlay device.
+ * @return
+ * 0.
+ */
+int
+rte_eth_softnic_run(uint8_t port_id);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* __INCLUDE_RTE_ETH_SOFTNIC_H__ */
diff --git a/drivers/net/softnic/rte_eth_softnic_default.c b/drivers/net/softnic/rte_eth_softnic_default.c
new file mode 100644
index 0000000..e889df1
--- /dev/null
+++ b/drivers/net/softnic/rte_eth_softnic_default.c
@@ -0,0 +1,1104 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright(c) 2017 Intel Corporation. All rights reserved.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdint.h>
+
+#include <rte_ethdev.h>
+
+#include "rte_eth_softnic_internals.h"
+#include "rte_eth_softnic.h"
+
+#define UDEV(odev) \
+ ((struct pmd_internals *) ((odev)->data->dev_private))->udev
+
+static int
+pmd_dev_configure(struct rte_eth_dev *dev)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->dev_configure(dev);
+}
+
+static int
+pmd_dev_start(struct rte_eth_dev *dev)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->dev_start(dev);
+}
+
+static void
+pmd_dev_stop(struct rte_eth_dev *dev)
+{
+ dev = UDEV(dev);
+
+ dev->dev_ops->dev_stop(dev);
+}
+
+static int
+pmd_dev_set_link_up(struct rte_eth_dev *dev)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->dev_set_link_up(dev);
+}
+
+static int
+pmd_dev_set_link_down(struct rte_eth_dev *dev)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->dev_set_link_down(dev);
+}
+
+static void
+pmd_dev_close(struct rte_eth_dev *dev)
+{
+ dev = UDEV(dev);
+
+ dev->dev_ops->dev_close(dev);
+}
+
+static void
+pmd_promiscuous_enable(struct rte_eth_dev *dev)
+{
+ dev = UDEV(dev);
+
+ dev->dev_ops->promiscuous_enable(dev);
+}
+
+static void
+pmd_promiscuous_disable(struct rte_eth_dev *dev)
+{
+ dev = UDEV(dev);
+
+ dev->dev_ops->promiscuous_disable(dev);
+}
+
+static void
+pmd_allmulticast_enable(struct rte_eth_dev *dev)
+{
+ dev = UDEV(dev);
+
+ dev->dev_ops->allmulticast_enable(dev);
+}
+
+static void
+pmd_allmulticast_disable(struct rte_eth_dev *dev)
+{
+ dev = UDEV(dev);
+
+ dev->dev_ops->allmulticast_disable(dev);
+}
+
+static int
+pmd_link_update(struct rte_eth_dev *dev, int wait_to_complete)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->link_update(dev, wait_to_complete);
+}
+
+static void
+pmd_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *igb_stats)
+{
+ dev = UDEV(dev);
+
+ dev->dev_ops->stats_get(dev, igb_stats);
+}
+
+static void
+pmd_stats_reset(struct rte_eth_dev *dev)
+{
+ dev = UDEV(dev);
+
+ dev->dev_ops->stats_reset(dev);
+}
+
+static int
+pmd_xstats_get(struct rte_eth_dev *dev,
+ struct rte_eth_xstat *stats, unsigned n)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->xstats_get(dev, stats, n);
+}
+
+static int
+pmd_xstats_get_by_id(struct rte_eth_dev *dev,
+ const uint64_t *ids, uint64_t *values, unsigned int n)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->xstats_get_by_id(dev, ids, values, n);
+}
+
+static void
+pmd_xstats_reset(struct rte_eth_dev *dev)
+{
+ dev = UDEV(dev);
+
+ dev->dev_ops->xstats_reset(dev);
+}
+
+static int
+pmd_xstats_get_names(struct rte_eth_dev *dev,
+ struct rte_eth_xstat_name *xstats_names, unsigned size)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->xstats_get_names(dev, xstats_names, size);
+}
+
+static int
+pmd_xstats_get_names_by_id(struct rte_eth_dev *dev,
+ struct rte_eth_xstat_name *xstats_names, const uint64_t *ids,
+ unsigned int size)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->xstats_get_names_by_id(dev, xstats_names, ids, size);
+}
+
+static int
+pmd_queue_stats_mapping_set(struct rte_eth_dev *dev,
+ uint16_t queue_id, uint8_t stat_idx, uint8_t is_rx)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->queue_stats_mapping_set(dev, queue_id,
+ stat_idx, is_rx);
+}
+
+static void
+pmd_dev_infos_get(struct rte_eth_dev *dev,
+ struct rte_eth_dev_info *dev_info)
+{
+ dev = UDEV(dev);
+
+ dev->dev_ops->dev_infos_get(dev, dev_info);
+}
+
+static const uint32_t *
+pmd_dev_supported_ptypes_get(struct rte_eth_dev *dev)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->dev_supported_ptypes_get(dev);
+}
+
+static int
+pmd_rx_queue_start(struct rte_eth_dev *dev, uint16_t queue_id)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->rx_queue_start(dev, queue_id);
+}
+
+static int
+pmd_rx_queue_stop(struct rte_eth_dev *dev, uint16_t queue_id)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->rx_queue_stop(dev, queue_id);
+}
+
+static int
+pmd_tx_queue_start(struct rte_eth_dev *dev, uint16_t queue_id)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->tx_queue_start(dev, queue_id);
+}
+
+static int
+pmd_tx_queue_stop(struct rte_eth_dev *dev, uint16_t queue_id)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->tx_queue_stop(dev, queue_id);
+}
+
+static int
+pmd_rx_queue_setup(struct rte_eth_dev *dev,
+ uint16_t rx_queue_id, uint16_t nb_rx_desc, unsigned int socket_id,
+ const struct rte_eth_rxconf *rx_conf, struct rte_mempool *mb_pool)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->rx_queue_setup(dev, rx_queue_id, nb_rx_desc,
+ socket_id, rx_conf, mb_pool);
+}
+
+static int
+pmd_tx_queue_setup(struct rte_eth_dev *dev,
+ uint16_t tx_queue_id, uint16_t nb_tx_desc, unsigned int socket_id,
+ const struct rte_eth_txconf *tx_conf)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->tx_queue_setup(dev, tx_queue_id, nb_tx_desc,
+ socket_id, tx_conf);
+}
+
+static int
+pmd_rx_queue_intr_enable(struct rte_eth_dev *dev, uint16_t rx_queue_id)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->rx_queue_intr_enable(dev, rx_queue_id);
+}
+
+static int
+pmd_rx_queue_intr_disable(struct rte_eth_dev *dev, uint16_t rx_queue_id)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->rx_queue_intr_disable(dev, rx_queue_id);
+}
+
+static uint32_t
+pmd_rx_queue_count(struct rte_eth_dev *dev, uint16_t rx_queue_id)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->rx_queue_count(dev, rx_queue_id);
+}
+
+static int
+pmd_fw_version_get(struct rte_eth_dev *dev,
+ char *fw_version, size_t fw_size)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->fw_version_get(dev, fw_version, fw_size);
+}
+
+static void
+pmd_rxq_info_get(struct rte_eth_dev *dev,
+ uint16_t rx_queue_id, struct rte_eth_rxq_info *qinfo)
+{
+ dev = UDEV(dev);
+
+ dev->dev_ops->rxq_info_get(dev, rx_queue_id, qinfo);
+}
+
+static void
+pmd_txq_info_get(struct rte_eth_dev *dev,
+ uint16_t tx_queue_id, struct rte_eth_txq_info *qinfo)
+{
+ dev = UDEV(dev);
+
+ dev->dev_ops->txq_info_get(dev, tx_queue_id, qinfo);
+}
+
+static int
+pmd_mtu_set(struct rte_eth_dev *dev, uint16_t mtu)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->mtu_set(dev, mtu);
+}
+
+static int
+pmd_vlan_filter_set(struct rte_eth_dev *dev, uint16_t vlan_id, int on)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->vlan_filter_set(dev, vlan_id, on);
+}
+
+static int
+pmd_vlan_tpid_set(struct rte_eth_dev *dev,
+ enum rte_vlan_type type, uint16_t tpid)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->vlan_tpid_set(dev, type, tpid);
+}
+
+static void
+pmd_vlan_offload_set(struct rte_eth_dev *dev, int mask)
+{
+ dev = UDEV(dev);
+
+ dev->dev_ops->vlan_offload_set(dev, mask);
+}
+
+static int
+pmd_vlan_pvid_set(struct rte_eth_dev *dev, uint16_t vlan_id, int on)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->vlan_pvid_set(dev, vlan_id, on);
+}
+
+static void
+pmd_vlan_strip_queue_set(struct rte_eth_dev *dev,
+ uint16_t rx_queue_id, int on)
+{
+ dev = UDEV(dev);
+
+ dev->dev_ops->vlan_strip_queue_set(dev, rx_queue_id, on);
+}
+
+static int
+pmd_flow_ctrl_get(struct rte_eth_dev *dev, struct rte_eth_fc_conf *fc_conf)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->flow_ctrl_get(dev, fc_conf);
+}
+
+static int
+pmd_flow_ctrl_set(struct rte_eth_dev *dev, struct rte_eth_fc_conf *fc_conf)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->flow_ctrl_set(dev, fc_conf);
+}
+
+static int
+pmd_priority_flow_ctrl_set(struct rte_eth_dev *dev,
+ struct rte_eth_pfc_conf *pfc_conf)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->priority_flow_ctrl_set(dev, pfc_conf);
+}
+
+static int
+pmd_reta_update(struct rte_eth_dev *dev,
+ struct rte_eth_rss_reta_entry64 *reta_conf, uint16_t reta_size)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->reta_update(dev, reta_conf, reta_size);
+}
+
+static int
+pmd_reta_query(struct rte_eth_dev *dev,
+ struct rte_eth_rss_reta_entry64 *reta_conf, uint16_t reta_size)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->reta_query(dev, reta_conf, reta_size);
+}
+
+static int
+pmd_rss_hash_update(struct rte_eth_dev *dev,
+ struct rte_eth_rss_conf *rss_conf)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->rss_hash_update(dev, rss_conf);
+}
+
+static int
+pmd_rss_hash_conf_get(struct rte_eth_dev *dev,
+ struct rte_eth_rss_conf *rss_conf)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->rss_hash_conf_get(dev, rss_conf);
+}
+
+static int
+pmd_dev_led_on(struct rte_eth_dev *dev)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->dev_led_on(dev);
+}
+
+static int
+pmd_dev_led_off(struct rte_eth_dev *dev)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->dev_led_off(dev);
+}
+
+static void
+pmd_mac_addr_remove(struct rte_eth_dev *dev, uint32_t index)
+{
+ dev = UDEV(dev);
+
+ dev->dev_ops->mac_addr_remove(dev, index);
+}
+
+static int
+pmd_mac_addr_add(struct rte_eth_dev *dev,
+ struct ether_addr *mac_addr, uint32_t index, uint32_t vmdq)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->mac_addr_add(dev, mac_addr, index, vmdq);
+}
+
+static void
+pmd_mac_addr_set(struct rte_eth_dev *dev, struct ether_addr *mac_addr)
+{
+ dev = UDEV(dev);
+
+ dev->dev_ops->mac_addr_set(dev, mac_addr);
+}
+
+static int
+pmd_uc_hash_table_set(struct rte_eth_dev *dev,
+ struct ether_addr *mac_addr, uint8_t on)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->uc_hash_table_set(dev, mac_addr, on);
+}
+
+static int
+pmd_uc_all_hash_table_set(struct rte_eth_dev *dev, uint8_t on)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->uc_all_hash_table_set(dev, on);
+}
+
+static int
+pmd_set_queue_rate_limit(struct rte_eth_dev *dev,
+ uint16_t queue_idx, uint16_t tx_rate)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->set_queue_rate_limit(dev, queue_idx, tx_rate);
+}
+
+static int
+pmd_mirror_rule_set(struct rte_eth_dev *dev,
+ struct rte_eth_mirror_conf *mirror_conf, uint8_t rule_id, uint8_t on)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->mirror_rule_set(dev, mirror_conf, rule_id, on);
+}
+
+static int
+pmd_mirror_rule_reset(struct rte_eth_dev *dev, uint8_t rule_id)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->mirror_rule_reset(dev, rule_id);
+}
+
+static int
+pmd_udp_tunnel_port_add(struct rte_eth_dev *dev,
+ struct rte_eth_udp_tunnel *tunnel_udp)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->udp_tunnel_port_add(dev, tunnel_udp);
+}
+
+static int
+pmd_udp_tunnel_port_del(struct rte_eth_dev *dev,
+ struct rte_eth_udp_tunnel *tunnel_udp)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->udp_tunnel_port_del(dev, tunnel_udp);
+}
+
+static int
+pmd_set_mc_addr_list(struct rte_eth_dev *dev,
+ struct ether_addr *mc_addr_set, uint32_t nb_mc_addr)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->set_mc_addr_list(dev, mc_addr_set, nb_mc_addr);
+}
+
+static int
+pmd_timesync_enable(struct rte_eth_dev *dev)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->timesync_enable(dev);
+}
+
+static int
+pmd_timesync_disable(struct rte_eth_dev *dev)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->timesync_disable(dev);
+}
+
+static int
+pmd_timesync_read_rx_timestamp(struct rte_eth_dev *dev,
+ struct timespec *timestamp, uint32_t flags)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->timesync_read_rx_timestamp(dev, timestamp, flags);
+}
+
+static int
+pmd_timesync_read_tx_timestamp(struct rte_eth_dev *dev,
+ struct timespec *timestamp)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->timesync_read_tx_timestamp(dev, timestamp);
+}
+
+static int
+pmd_timesync_adjust_time(struct rte_eth_dev *dev, int64_t delta)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->timesync_adjust_time(dev, delta);
+}
+
+static int
+pmd_timesync_read_time(struct rte_eth_dev *dev, struct timespec *timestamp)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->timesync_read_time(dev, timestamp);
+}
+
+static int
+pmd_timesync_write_time(struct rte_eth_dev *dev,
+ const struct timespec *timestamp)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->timesync_write_time(dev, timestamp);
+}
+
+static int
+pmd_get_reg(struct rte_eth_dev *dev, struct rte_dev_reg_info *info)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->get_reg(dev, info);
+}
+
+static int
+pmd_get_eeprom_length(struct rte_eth_dev *dev)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->get_eeprom_length(dev);
+}
+
+static int
+pmd_get_eeprom(struct rte_eth_dev *dev, struct rte_dev_eeprom_info *info)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->get_eeprom(dev, info);
+}
+
+static int
+pmd_set_eeprom(struct rte_eth_dev *dev, struct rte_dev_eeprom_info *info)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->set_eeprom(dev, info);
+}
+
+static int
+pmd_l2_tunnel_eth_type_conf(struct rte_eth_dev *dev,
+ struct rte_eth_l2_tunnel_conf *l2_tunnel)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->l2_tunnel_eth_type_conf(dev, l2_tunnel);
+}
+
+static int
+pmd_l2_tunnel_offload_set(struct rte_eth_dev *dev,
+ struct rte_eth_l2_tunnel_conf *l2_tunnel,
+ uint32_t mask,
+ uint8_t en)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->l2_tunnel_offload_set(dev, l2_tunnel, mask, en);
+}
+
+#ifdef RTE_NIC_BYPASS
+
+static void
+pmd_bypass_init(struct rte_eth_dev *dev)
+{
+ dev = UDEV(dev);
+
+ dev->dev_ops->bypass_init(dev);
+}
+
+static int32_t
+pmd_bypass_state_set(struct rte_eth_dev *dev, uint32_t *new_state)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->bypass_state_set(dev, new_state);
+}
+
+static int32_t
+pmd_bypass_state_show(struct rte_eth_dev *dev, uint32_t *state)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->bypass_state_show(dev, state);
+}
+
+static int32_t
+pmd_bypass_event_set(struct rte_eth_dev *dev,
+ uint32_t state, uint32_t event)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->bypass_event_set(dev, state, event);
+}
+
+static int32_t
+pmd_bypass_event_show(struct rte_eth_dev *dev,
+ uint32_t event_shift, uint32_t *event)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->bypass_event_show(dev, event_shift, event);
+}
+
+static int32_t
+pmd_bypass_wd_timeout_set(struct rte_eth_dev *dev, uint32_t timeout)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->bypass_wd_timeout_set(dev, timeout);
+}
+
+static int32_t
+pmd_bypass_wd_timeout_show(struct rte_eth_dev *dev, uint32_t *wd_timeout)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->bypass_wd_timeout_show(dev, wd_timeout);
+}
+
+static int32_t
+pmd_bypass_ver_show(struct rte_eth_dev *dev, uint32_t *ver)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->bypass_ver_show(dev, ver);
+}
+
+static int32_t
+pmd_bypass_wd_reset(struct rte_eth_dev *dev)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->bypass_wd_reset(dev);
+}
+
+#endif
+
+static int
+pmd_filter_ctrl(struct rte_eth_dev *dev, enum rte_filter_type filter_type,
+ enum rte_filter_op filter_op, void *arg)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->filter_ctrl(dev, filter_type, filter_op, arg);
+}
+
+static int
+pmd_get_dcb_info(struct rte_eth_dev *dev,
+ struct rte_eth_dcb_info *dcb_info)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->get_dcb_info(dev, dcb_info);
+}
+
+static int
+pmd_tm_ops_get(struct rte_eth_dev *dev, void *ops)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->tm_ops_get(dev, ops);
+}
+
+static const struct eth_dev_ops pmd_ops_default = {
+ .dev_configure = pmd_dev_configure,
+ .dev_start = pmd_dev_start,
+ .dev_stop = pmd_dev_stop,
+ .dev_set_link_up = pmd_dev_set_link_up,
+ .dev_set_link_down = pmd_dev_set_link_down,
+ .dev_close = pmd_dev_close,
+ .link_update = pmd_link_update,
+
+ .promiscuous_enable = pmd_promiscuous_enable,
+ .promiscuous_disable = pmd_promiscuous_disable,
+ .allmulticast_enable = pmd_allmulticast_enable,
+ .allmulticast_disable = pmd_allmulticast_disable,
+ .mac_addr_remove = pmd_mac_addr_remove,
+ .mac_addr_add = pmd_mac_addr_add,
+ .mac_addr_set = pmd_mac_addr_set,
+ .set_mc_addr_list = pmd_set_mc_addr_list,
+ .mtu_set = pmd_mtu_set,
+
+ .stats_get = pmd_stats_get,
+ .stats_reset = pmd_stats_reset,
+ .xstats_get = pmd_xstats_get,
+ .xstats_reset = pmd_xstats_reset,
+ .xstats_get_names = pmd_xstats_get_names,
+ .queue_stats_mapping_set = pmd_queue_stats_mapping_set,
+
+ .dev_infos_get = pmd_dev_infos_get,
+ .rxq_info_get = pmd_rxq_info_get,
+ .txq_info_get = pmd_txq_info_get,
+ .fw_version_get = pmd_fw_version_get,
+ .dev_supported_ptypes_get = pmd_dev_supported_ptypes_get,
+
+ .vlan_filter_set = pmd_vlan_filter_set,
+ .vlan_tpid_set = pmd_vlan_tpid_set,
+ .vlan_strip_queue_set = pmd_vlan_strip_queue_set,
+ .vlan_offload_set = pmd_vlan_offload_set,
+ .vlan_pvid_set = pmd_vlan_pvid_set,
+
+ .rx_queue_start = pmd_rx_queue_start,
+ .rx_queue_stop = pmd_rx_queue_stop,
+ .tx_queue_start = pmd_tx_queue_start,
+ .tx_queue_stop = pmd_tx_queue_stop,
+ .rx_queue_setup = pmd_rx_queue_setup,
+ .rx_queue_release = NULL,
+ .rx_queue_count = pmd_rx_queue_count,
+ .rx_descriptor_done = NULL,
+ .rx_descriptor_status = NULL,
+ .tx_descriptor_status = NULL,
+ .rx_queue_intr_enable = pmd_rx_queue_intr_enable,
+ .rx_queue_intr_disable = pmd_rx_queue_intr_disable,
+ .tx_queue_setup = pmd_tx_queue_setup,
+ .tx_queue_release = NULL,
+ .tx_done_cleanup = NULL,
+
+ .dev_led_on = pmd_dev_led_on,
+ .dev_led_off = pmd_dev_led_off,
+
+ .flow_ctrl_get = pmd_flow_ctrl_get,
+ .flow_ctrl_set = pmd_flow_ctrl_set,
+ .priority_flow_ctrl_set = pmd_priority_flow_ctrl_set,
+
+ .uc_hash_table_set = pmd_uc_hash_table_set,
+ .uc_all_hash_table_set = pmd_uc_all_hash_table_set,
+
+ .mirror_rule_set = pmd_mirror_rule_set,
+ .mirror_rule_reset = pmd_mirror_rule_reset,
+
+ .udp_tunnel_port_add = pmd_udp_tunnel_port_add,
+ .udp_tunnel_port_del = pmd_udp_tunnel_port_del,
+ .l2_tunnel_eth_type_conf = pmd_l2_tunnel_eth_type_conf,
+ .l2_tunnel_offload_set = pmd_l2_tunnel_offload_set,
+
+ .set_queue_rate_limit = pmd_set_queue_rate_limit,
+
+ .rss_hash_update = pmd_rss_hash_update,
+ .rss_hash_conf_get = pmd_rss_hash_conf_get,
+ .reta_update = pmd_reta_update,
+ .reta_query = pmd_reta_query,
+
+ .get_reg = pmd_get_reg,
+ .get_eeprom_length = pmd_get_eeprom_length,
+ .get_eeprom = pmd_get_eeprom,
+ .set_eeprom = pmd_set_eeprom,
+
+#ifdef RTE_NIC_BYPASS
+ .bypass_init = pmd_bypass_init,
+ .bypass_state_set = pmd_bypass_state_set,
+ .bypass_state_show = pmd_bypass_state_show,
+ .bypass_event_set = pmd_bypass_event_set,
+ .bypass_event_show = pmd_bypass_event_show,
+ .bypass_wd_timeout_set = pmd_bypass_wd_timeout_set,
+ .bypass_wd_timeout_show = pmd_bypass_wd_timeout_show,
+ .bypass_ver_show = pmd_bypass_ver_show,
+ .bypass_wd_reset = pmd_bypass_wd_reset,
+#endif
+
+ .filter_ctrl = pmd_filter_ctrl,
+
+ .get_dcb_info = pmd_get_dcb_info,
+
+ .timesync_enable = pmd_timesync_enable,
+ .timesync_disable = pmd_timesync_disable,
+ .timesync_read_rx_timestamp = pmd_timesync_read_rx_timestamp,
+ .timesync_read_tx_timestamp = pmd_timesync_read_tx_timestamp,
+ .timesync_adjust_time = pmd_timesync_adjust_time,
+ .timesync_read_time = pmd_timesync_read_time,
+ .timesync_write_time = pmd_timesync_write_time,
+
+ .xstats_get_by_id = pmd_xstats_get_by_id,
+ .xstats_get_names_by_id = pmd_xstats_get_names_by_id,
+
+ .tm_ops_get = pmd_tm_ops_get,
+};
+
+#define CHECK_AND_SET_NULL(o, u, func) \
+ if ((u)->func == NULL) \
+ (o)->func = NULL
+
+#define CHECK_AND_SET_NONNULL(o, u, func) \
+ if ((u)->func != NULL) \
+ (o)->func = (u)->func
+
+static void
+pmd_ops_check_and_set_null(struct eth_dev_ops *o,
+ const struct eth_dev_ops *u)
+{
+ CHECK_AND_SET_NULL(o, u, dev_configure);
+ CHECK_AND_SET_NULL(o, u, dev_start);
+ CHECK_AND_SET_NULL(o, u, dev_stop);
+ CHECK_AND_SET_NULL(o, u, dev_set_link_up);
+ CHECK_AND_SET_NULL(o, u, dev_set_link_down);
+ CHECK_AND_SET_NULL(o, u, dev_close);
+ CHECK_AND_SET_NULL(o, u, link_update);
+ CHECK_AND_SET_NULL(o, u, promiscuous_enable);
+ CHECK_AND_SET_NULL(o, u, promiscuous_disable);
+ CHECK_AND_SET_NULL(o, u, allmulticast_enable);
+ CHECK_AND_SET_NULL(o, u, allmulticast_disable);
+ CHECK_AND_SET_NULL(o, u, mac_addr_remove);
+ CHECK_AND_SET_NULL(o, u, mac_addr_add);
+ CHECK_AND_SET_NULL(o, u, mac_addr_set);
+ CHECK_AND_SET_NULL(o, u, set_mc_addr_list);
+ CHECK_AND_SET_NULL(o, u, mtu_set);
+ CHECK_AND_SET_NULL(o, u, stats_get);
+ CHECK_AND_SET_NULL(o, u, stats_reset);
+ CHECK_AND_SET_NULL(o, u, xstats_get);
+ CHECK_AND_SET_NULL(o, u, xstats_reset);
+ CHECK_AND_SET_NULL(o, u, xstats_get_names);
+ CHECK_AND_SET_NULL(o, u, queue_stats_mapping_set);
+ CHECK_AND_SET_NULL(o, u, dev_infos_get);
+ CHECK_AND_SET_NULL(o, u, rxq_info_get);
+ CHECK_AND_SET_NULL(o, u, txq_info_get);
+ CHECK_AND_SET_NULL(o, u, fw_version_get);
+ CHECK_AND_SET_NULL(o, u, dev_supported_ptypes_get);
+ CHECK_AND_SET_NULL(o, u, vlan_filter_set);
+ CHECK_AND_SET_NULL(o, u, vlan_tpid_set);
+ CHECK_AND_SET_NULL(o, u, vlan_strip_queue_set);
+ CHECK_AND_SET_NULL(o, u, vlan_offload_set);
+ CHECK_AND_SET_NULL(o, u, vlan_pvid_set);
+ CHECK_AND_SET_NULL(o, u, rx_queue_start);
+ CHECK_AND_SET_NULL(o, u, rx_queue_stop);
+ CHECK_AND_SET_NULL(o, u, tx_queue_start);
+ CHECK_AND_SET_NULL(o, u, tx_queue_stop);
+ CHECK_AND_SET_NULL(o, u, rx_queue_setup);
+ CHECK_AND_SET_NULL(o, u, rx_queue_release);
+ CHECK_AND_SET_NULL(o, u, rx_queue_count);
+ CHECK_AND_SET_NULL(o, u, rx_descriptor_done);
+ CHECK_AND_SET_NULL(o, u, rx_descriptor_status);
+ CHECK_AND_SET_NULL(o, u, tx_descriptor_status);
+ CHECK_AND_SET_NULL(o, u, rx_queue_intr_enable);
+ CHECK_AND_SET_NULL(o, u, rx_queue_intr_disable);
+ CHECK_AND_SET_NULL(o, u, tx_queue_setup);
+ CHECK_AND_SET_NULL(o, u, tx_queue_release);
+ CHECK_AND_SET_NULL(o, u, tx_done_cleanup);
+ CHECK_AND_SET_NULL(o, u, dev_led_on);
+ CHECK_AND_SET_NULL(o, u, dev_led_off);
+ CHECK_AND_SET_NULL(o, u, flow_ctrl_get);
+ CHECK_AND_SET_NULL(o, u, flow_ctrl_set);
+ CHECK_AND_SET_NULL(o, u, priority_flow_ctrl_set);
+ CHECK_AND_SET_NULL(o, u, uc_hash_table_set);
+ CHECK_AND_SET_NULL(o, u, uc_all_hash_table_set);
+ CHECK_AND_SET_NULL(o, u, mirror_rule_set);
+ CHECK_AND_SET_NULL(o, u, mirror_rule_reset);
+ CHECK_AND_SET_NULL(o, u, udp_tunnel_port_add);
+ CHECK_AND_SET_NULL(o, u, udp_tunnel_port_del);
+ CHECK_AND_SET_NULL(o, u, l2_tunnel_eth_type_conf);
+ CHECK_AND_SET_NULL(o, u, l2_tunnel_offload_set);
+ CHECK_AND_SET_NULL(o, u, set_queue_rate_limit);
+ CHECK_AND_SET_NULL(o, u, rss_hash_update);
+ CHECK_AND_SET_NULL(o, u, rss_hash_conf_get);
+ CHECK_AND_SET_NULL(o, u, reta_update);
+ CHECK_AND_SET_NULL(o, u, reta_query);
+ CHECK_AND_SET_NULL(o, u, get_reg);
+ CHECK_AND_SET_NULL(o, u, get_eeprom_length);
+ CHECK_AND_SET_NULL(o, u, get_eeprom);
+ CHECK_AND_SET_NULL(o, u, set_eeprom);
+
+ #ifdef RTE_NIC_BYPASS
+
+ CHECK_AND_SET_NULL(o, u, bypass_init);
+ CHECK_AND_SET_NULL(o, u, bypass_state_set);
+ CHECK_AND_SET_NULL(o, u, bypass_state_show);
+ CHECK_AND_SET_NULL(o, u, bypass_event_set);
+ CHECK_AND_SET_NULL(o, u, bypass_event_show);
+ CHECK_AND_SET_NULL(o, u, bypass_wd_timeout_set);
+ CHECK_AND_SET_NULL(o, u, bypass_wd_timeout_show);
+ CHECK_AND_SET_NULL(o, u, bypass_ver_show);
+ CHECK_AND_SET_NULL(o, u, bypass_wd_reset);
+
+ #endif
+
+ CHECK_AND_SET_NULL(o, u, filter_ctrl);
+ CHECK_AND_SET_NULL(o, u, get_dcb_info);
+ CHECK_AND_SET_NULL(o, u, timesync_enable);
+ CHECK_AND_SET_NULL(o, u, timesync_disable);
+ CHECK_AND_SET_NULL(o, u, timesync_read_rx_timestamp);
+ CHECK_AND_SET_NULL(o, u, timesync_read_tx_timestamp);
+ CHECK_AND_SET_NULL(o, u, timesync_adjust_time);
+ CHECK_AND_SET_NULL(o, u, timesync_read_time);
+ CHECK_AND_SET_NULL(o, u, timesync_write_time);
+ CHECK_AND_SET_NULL(o, u, xstats_get_by_id);
+ CHECK_AND_SET_NULL(o, u, xstats_get_names_by_id);
+ CHECK_AND_SET_NULL(o, u, tm_ops_get);
+}
+
+void
+pmd_ops_inherit(struct eth_dev_ops *o, const struct eth_dev_ops *u)
+{
+ /* Rules:
+ * 1. u->func == NULL => o->func = NULL
+ * 2. u->func != NULL => o->func = pmd_ops_default.func
+ * 3. queue related func => o->func = u->func
+ */
+
+ memcpy(o, &pmd_ops_default, sizeof(struct eth_dev_ops));
+ pmd_ops_check_and_set_null(o, u);
+
+ /* Copy queue related functions */
+ o->rx_queue_release = u->rx_queue_release;
+ o->tx_queue_release = u->tx_queue_release;
+ o->rx_descriptor_done = u->rx_descriptor_done;
+ o->rx_descriptor_status = u->rx_descriptor_status;
+ o->tx_descriptor_status = u->tx_descriptor_status;
+ o->tx_done_cleanup = u->tx_done_cleanup;
+}
+
+void
+pmd_ops_derive(struct eth_dev_ops *o, const struct eth_dev_ops *u)
+{
+ CHECK_AND_SET_NONNULL(o, u, dev_configure);
+ CHECK_AND_SET_NONNULL(o, u, dev_start);
+ CHECK_AND_SET_NONNULL(o, u, dev_stop);
+ CHECK_AND_SET_NONNULL(o, u, dev_set_link_up);
+ CHECK_AND_SET_NONNULL(o, u, dev_set_link_down);
+ CHECK_AND_SET_NONNULL(o, u, dev_close);
+ CHECK_AND_SET_NONNULL(o, u, link_update);
+ CHECK_AND_SET_NONNULL(o, u, promiscuous_enable);
+ CHECK_AND_SET_NONNULL(o, u, promiscuous_disable);
+ CHECK_AND_SET_NONNULL(o, u, allmulticast_enable);
+ CHECK_AND_SET_NONNULL(o, u, allmulticast_disable);
+ CHECK_AND_SET_NONNULL(o, u, mac_addr_remove);
+ CHECK_AND_SET_NONNULL(o, u, mac_addr_add);
+ CHECK_AND_SET_NONNULL(o, u, mac_addr_set);
+ CHECK_AND_SET_NONNULL(o, u, set_mc_addr_list);
+ CHECK_AND_SET_NONNULL(o, u, mtu_set);
+ CHECK_AND_SET_NONNULL(o, u, stats_get);
+ CHECK_AND_SET_NONNULL(o, u, stats_reset);
+ CHECK_AND_SET_NONNULL(o, u, xstats_get);
+ CHECK_AND_SET_NONNULL(o, u, xstats_reset);
+ CHECK_AND_SET_NONNULL(o, u, xstats_get_names);
+ CHECK_AND_SET_NONNULL(o, u, queue_stats_mapping_set);
+ CHECK_AND_SET_NONNULL(o, u, dev_infos_get);
+ CHECK_AND_SET_NONNULL(o, u, rxq_info_get);
+ CHECK_AND_SET_NONNULL(o, u, txq_info_get);
+ CHECK_AND_SET_NONNULL(o, u, fw_version_get);
+ CHECK_AND_SET_NONNULL(o, u, dev_supported_ptypes_get);
+ CHECK_AND_SET_NONNULL(o, u, vlan_filter_set);
+ CHECK_AND_SET_NONNULL(o, u, vlan_tpid_set);
+ CHECK_AND_SET_NONNULL(o, u, vlan_strip_queue_set);
+ CHECK_AND_SET_NONNULL(o, u, vlan_offload_set);
+ CHECK_AND_SET_NONNULL(o, u, vlan_pvid_set);
+ CHECK_AND_SET_NONNULL(o, u, rx_queue_start);
+ CHECK_AND_SET_NONNULL(o, u, rx_queue_stop);
+ CHECK_AND_SET_NONNULL(o, u, tx_queue_start);
+ CHECK_AND_SET_NONNULL(o, u, tx_queue_stop);
+ CHECK_AND_SET_NONNULL(o, u, rx_queue_setup);
+ CHECK_AND_SET_NONNULL(o, u, rx_queue_release);
+ CHECK_AND_SET_NONNULL(o, u, rx_queue_count);
+ CHECK_AND_SET_NONNULL(o, u, rx_descriptor_done);
+ CHECK_AND_SET_NONNULL(o, u, rx_descriptor_status);
+ CHECK_AND_SET_NONNULL(o, u, tx_descriptor_status);
+ CHECK_AND_SET_NONNULL(o, u, rx_queue_intr_enable);
+ CHECK_AND_SET_NONNULL(o, u, rx_queue_intr_disable);
+ CHECK_AND_SET_NONNULL(o, u, tx_queue_setup);
+ CHECK_AND_SET_NONNULL(o, u, tx_queue_release);
+ CHECK_AND_SET_NONNULL(o, u, tx_done_cleanup);
+ CHECK_AND_SET_NONNULL(o, u, dev_led_on);
+ CHECK_AND_SET_NONNULL(o, u, dev_led_off);
+ CHECK_AND_SET_NONNULL(o, u, flow_ctrl_get);
+ CHECK_AND_SET_NONNULL(o, u, flow_ctrl_set);
+ CHECK_AND_SET_NONNULL(o, u, priority_flow_ctrl_set);
+ CHECK_AND_SET_NONNULL(o, u, uc_hash_table_set);
+ CHECK_AND_SET_NONNULL(o, u, uc_all_hash_table_set);
+ CHECK_AND_SET_NONNULL(o, u, mirror_rule_set);
+ CHECK_AND_SET_NONNULL(o, u, mirror_rule_reset);
+ CHECK_AND_SET_NONNULL(o, u, udp_tunnel_port_add);
+ CHECK_AND_SET_NONNULL(o, u, udp_tunnel_port_del);
+ CHECK_AND_SET_NONNULL(o, u, l2_tunnel_eth_type_conf);
+ CHECK_AND_SET_NONNULL(o, u, l2_tunnel_offload_set);
+ CHECK_AND_SET_NONNULL(o, u, set_queue_rate_limit);
+ CHECK_AND_SET_NONNULL(o, u, rss_hash_update);
+ CHECK_AND_SET_NONNULL(o, u, rss_hash_conf_get);
+ CHECK_AND_SET_NONNULL(o, u, reta_update);
+ CHECK_AND_SET_NONNULL(o, u, reta_query);
+ CHECK_AND_SET_NONNULL(o, u, get_reg);
+ CHECK_AND_SET_NONNULL(o, u, get_eeprom_length);
+ CHECK_AND_SET_NONNULL(o, u, get_eeprom);
+ CHECK_AND_SET_NONNULL(o, u, set_eeprom);
+
+ #ifdef RTE_NIC_BYPASS
+
+ CHECK_AND_SET_NONNULL(o, u, bypass_init);
+ CHECK_AND_SET_NONNULL(o, u, bypass_state_set);
+ CHECK_AND_SET_NONNULL(o, u, bypass_state_show);
+ CHECK_AND_SET_NONNULL(o, u, bypass_event_set);
+ CHECK_AND_SET_NONNULL(o, u, bypass_event_show);
+ CHECK_AND_SET_NONNULL(o, u, bypass_wd_timeout_set);
+ CHECK_AND_SET_NONNULL(o, u, bypass_wd_timeout_show);
+ CHECK_AND_SET_NONNULL(o, u, bypass_ver_show);
+ CHECK_AND_SET_NONNULL(o, u, bypass_wd_reset);
+
+ #endif
+
+ CHECK_AND_SET_NONNULL(o, u, filter_ctrl);
+ CHECK_AND_SET_NONNULL(o, u, get_dcb_info);
+ CHECK_AND_SET_NONNULL(o, u, timesync_enable);
+ CHECK_AND_SET_NONNULL(o, u, timesync_disable);
+ CHECK_AND_SET_NONNULL(o, u, timesync_read_rx_timestamp);
+ CHECK_AND_SET_NONNULL(o, u, timesync_read_tx_timestamp);
+ CHECK_AND_SET_NONNULL(o, u, timesync_adjust_time);
+ CHECK_AND_SET_NONNULL(o, u, timesync_read_time);
+ CHECK_AND_SET_NONNULL(o, u, timesync_write_time);
+ CHECK_AND_SET_NONNULL(o, u, xstats_get_by_id);
+ CHECK_AND_SET_NONNULL(o, u, xstats_get_names_by_id);
+ CHECK_AND_SET_NONNULL(o, u, tm_ops_get);
+}
diff --git a/drivers/net/softnic/rte_eth_softnic_internals.h b/drivers/net/softnic/rte_eth_softnic_internals.h
new file mode 100644
index 0000000..d456a54
--- /dev/null
+++ b/drivers/net/softnic/rte_eth_softnic_internals.h
@@ -0,0 +1,67 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright(c) 2017 Intel Corporation. All rights reserved.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef __INCLUDE_RTE_ETH_SOFTNIC_INTERNALS_H__
+#define __INCLUDE_RTE_ETH_SOFTNIC_INTERNALS_H__
+
+#include <stdint.h>
+
+#include <rte_mbuf.h>
+#include <rte_ethdev.h>
+
+#include "rte_eth_softnic.h"
+
+struct pmd_internals {
+ /* Devices */
+ struct rte_eth_dev *odev;
+ struct rte_eth_dev *udev;
+ struct rte_eth_dev_data *odata;
+ struct rte_eth_dev_data *udata;
+ struct eth_dev_ops *odev_ops;
+ const struct eth_dev_ops *udev_ops;
+ uint8_t oport_id;
+ uint8_t uport_id;
+
+ /* Operation */
+ struct rte_mbuf *pkts[RTE_ETH_SOFTNIC_DEQ_BSZ_MAX];
+ uint32_t deq_bsz;
+ uint32_t txq_id;
+};
+
+void
+pmd_ops_inherit(struct eth_dev_ops *o, const struct eth_dev_ops *u);
+
+void
+pmd_ops_derive(struct eth_dev_ops *o, const struct eth_dev_ops *u);
+
+#endif /* __INCLUDE_RTE_ETH_SOFTNIC_INTERNALS_H__ */
diff --git a/drivers/net/softnic/rte_eth_softnic_version.map b/drivers/net/softnic/rte_eth_softnic_version.map
new file mode 100644
index 0000000..bb730e5
--- /dev/null
+++ b/drivers/net/softnic/rte_eth_softnic_version.map
@@ -0,0 +1,7 @@
+DPDK_17.08 {
+ global:
+ rte_eth_softnic_create;
+ rte_eth_softnic_run;
+
+ local: *;
+};
diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index bcaf1b3..be13730 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -66,7 +66,6 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_PDUMP) += -lrte_pdump
_LDLIBS-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR) += -lrte_distributor
_LDLIBS-$(CONFIG_RTE_LIBRTE_IP_FRAG) += -lrte_ip_frag
_LDLIBS-$(CONFIG_RTE_LIBRTE_METER) += -lrte_meter
-_LDLIBS-$(CONFIG_RTE_LIBRTE_SCHED) += -lrte_sched
_LDLIBS-$(CONFIG_RTE_LIBRTE_LPM) += -lrte_lpm
# librte_acl needs --whole-archive because of weak functions
_LDLIBS-$(CONFIG_RTE_LIBRTE_ACL) += --whole-archive
@@ -98,6 +97,7 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_RING) += -lrte_ring
_LDLIBS-$(CONFIG_RTE_LIBRTE_EAL) += -lrte_eal
_LDLIBS-$(CONFIG_RTE_LIBRTE_CMDLINE) += -lrte_cmdline
_LDLIBS-$(CONFIG_RTE_LIBRTE_REORDER) += -lrte_reorder
+_LDLIBS-$(CONFIG_RTE_LIBRTE_SCHED) += -lrte_sched
ifeq ($(CONFIG_RTE_EXEC_ENV_LINUXAPP),y)
_LDLIBS-$(CONFIG_RTE_LIBRTE_KNI) += -lrte_kni
@@ -133,6 +133,9 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_NULL) += -lrte_pmd_null
_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_PCAP) += -lrte_pmd_pcap -lpcap
_LDLIBS-$(CONFIG_RTE_LIBRTE_QEDE_PMD) += -lrte_pmd_qede
_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_RING) += -lrte_pmd_ring
+ifeq ($(CONFIG_RTE_LIBRTE_SCHED),y)
+_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_SOFTNIC) += -lrte_pmd_softnic
+endif
_LDLIBS-$(CONFIG_RTE_LIBRTE_SFC_EFX_PMD) += -lrte_pmd_sfc_efx
_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_SZEDATA2) += -lrte_pmd_szedata2 -lsze2
_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_TAP) += -lrte_pmd_tap
--
2.9.3
^ permalink raw reply [flat|nested] 79+ messages in thread
* [dpdk-dev] [PATCH 2/2] net/softnic: add traffic management ops
2017-05-26 18:11 [dpdk-dev] [PATCH 0/2] net/softnic: sw fall-back for traffic management Jasvinder Singh
2017-05-26 18:11 ` [dpdk-dev] [PATCH 1/2] net/softnic: add softnic PMD " Jasvinder Singh
@ 2017-05-26 18:11 ` Jasvinder Singh
2017-06-07 14:32 ` [dpdk-dev] [PATCH 0/2] net/softnic: sw fall-back for traffic management Thomas Monjalon
2017-08-11 15:28 ` Stephen Hemminger
3 siblings, 0 replies; 79+ messages in thread
From: Jasvinder Singh @ 2017-05-26 18:11 UTC (permalink / raw)
To: dev
Cc: cristian.dumitrescu, ferruh.yigit, hemant.agrawal,
Jerin.JacobKollanukkaran, wenzhuo.lu
The traffic management specific functions of the softnic driver are supplied
through a set of function pointers contained in the generic structure of type
'rte_tm_ops'. These functions are used to build and manage the hierarchical
QoS scheduler for traffic management.
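For reference, the only ethdev-side wiring needed is the tm_ops_get callback
added by this patch: pmd_eth_dev_tm_ops_get() hands back &pmd_tm_ops, and the
generic rte_tm layer dispatches through it. A rough editorial sketch of that
lookup (the helper name below is made up for illustration; only the
tm_ops_get callback signature matches the patch):

static inline const struct rte_tm_ops *
tm_ops_lookup(uint8_t port_id)
{
	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
	const struct rte_tm_ops *ops = NULL;

	if (dev->dev_ops->tm_ops_get == NULL)
		return NULL;

	/* For softnic this resolves to pmd_eth_dev_tm_ops_get(),
	 * which returns &pmd_tm_ops through the opaque argument. */
	dev->dev_ops->tm_ops_get(dev, &ops);
	return ops;
}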
Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
---
drivers/net/softnic/Makefile | 1 +
drivers/net/softnic/rte_eth_softnic.c | 47 ++++-
drivers/net/softnic/rte_eth_softnic_internals.h | 26 +++
drivers/net/softnic/rte_eth_softnic_tm.c | 235 ++++++++++++++++++++++++
4 files changed, 307 insertions(+), 2 deletions(-)
create mode 100644 drivers/net/softnic/rte_eth_softnic_tm.c
diff --git a/drivers/net/softnic/Makefile b/drivers/net/softnic/Makefile
index c0374fa..e847907 100644
--- a/drivers/net/softnic/Makefile
+++ b/drivers/net/softnic/Makefile
@@ -47,6 +47,7 @@ LIBABIVER := 1
# all source are stored in SRCS-y
#
SRCS-$(CONFIG_RTE_LIBRTE_PMD_SOFTNIC) += rte_eth_softnic.c
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_SOFTNIC) += rte_eth_softnic_tm.c
SRCS-$(CONFIG_RTE_LIBRTE_PMD_SOFTNIC) += rte_eth_softnic_default.c
#
diff --git a/drivers/net/softnic/rte_eth_softnic.c b/drivers/net/softnic/rte_eth_softnic.c
index 529200e..af0e4b0 100644
--- a/drivers/net/softnic/rte_eth_softnic.c
+++ b/drivers/net/softnic/rte_eth_softnic.c
@@ -41,6 +41,8 @@
#include <rte_vdev.h>
#include <rte_kvargs.h>
#include <rte_errno.h>
+#include <rte_tm_driver.h>
+#include <rte_sched.h>
#include "rte_eth_softnic.h"
#include "rte_eth_softnic_internals.h"
@@ -58,6 +60,10 @@ static const char *pmd_valid_args[] = {
static struct rte_vdev_driver pmd_drv;
+#ifndef TM
+#define TM 1
+#endif
+
static int
pmd_eth_dev_configure(struct rte_eth_dev *dev)
{
@@ -113,6 +119,13 @@ pmd_eth_dev_start(struct rte_eth_dev *dev)
{
struct pmd_internals *p = dev->data->dev_private;
+#if TM
+ /* Initialize the Traffic Manager for the overlay device */
+ int status = tm_init(p);
+ if (status)
+ return status;
+#endif
+
/* Clone dev->data from underlay to overlay */
memcpy(dev->data->mac_pool_sel,
p->udev->data->mac_pool_sel,
@@ -132,6 +145,10 @@ pmd_eth_dev_stop(struct rte_eth_dev *dev)
/* Call the current function for the underlay device */
rte_eth_dev_stop(p->uport_id);
+#if TM
+ /* Free the Traffic Manager for the overlay device */
+ tm_free(p);
+#endif
}
static void
@@ -249,6 +266,14 @@ pmd_eth_dev_mac_addr_remove(struct rte_eth_dev *dev, uint32_t index)
rte_eth_dev_mac_addr_remove(p->uport_id, &dev->data->mac_addrs[index]);
}
+static int
+pmd_eth_dev_tm_ops_get(struct rte_eth_dev *dev __rte_unused, void *arg)
+{
+ *(const struct rte_tm_ops **) arg = &pmd_tm_ops;
+
+ return 0;
+}
+
static uint16_t
pmd_eth_dev_tx_burst(void *txq,
struct rte_mbuf **tx_pkts,
@@ -256,13 +281,30 @@ pmd_eth_dev_tx_burst(void *txq,
{
struct pmd_internals *p = txq;
- return rte_eth_tx_burst(p->uport_id, p->txq_id, tx_pkts, nb_pkts);
+#if TM
+ rte_sched_port_enqueue(p->sched, tx_pkts, nb_pkts);
+ return nb_pkts;
+#else
+ return rte_eth_tx_burst(p->uport_id, p->txq_id, tx_pkts, nb_pkts);
+#endif
}
int
-rte_eth_softnic_run(uint8_t port_id __rte_unused)
+rte_eth_softnic_run(uint8_t port_id)
{
+ struct rte_eth_dev *odev = &rte_eth_devices[port_id];
+ struct pmd_internals *p = odev->data->dev_private;
+ uint32_t n_pkts, n_pkts_deq;
+
+ n_pkts_deq = rte_sched_port_dequeue(p->sched, p->pkts, p->deq_bsz);
+
+ for (n_pkts = 0; n_pkts < n_pkts_deq;)
+ n_pkts += rte_eth_tx_burst(p->uport_id,
+ p->txq_id,
+ &p->pkts[n_pkts],
+ (uint16_t) (n_pkts_deq - n_pkts));
+
return 0;
}
@@ -287,6 +329,7 @@ pmd_ops_build(struct eth_dev_ops *o, const struct eth_dev_ops *u)
o->mac_addr_set = pmd_eth_dev_mac_addr_set;
o->mac_addr_add = pmd_eth_dev_mac_addr_add;
o->mac_addr_remove = pmd_eth_dev_mac_addr_remove;
+ o->tm_ops_get = pmd_eth_dev_tm_ops_get;
}
int
diff --git a/drivers/net/softnic/rte_eth_softnic_internals.h b/drivers/net/softnic/rte_eth_softnic_internals.h
index d456a54..2d725b4 100644
--- a/drivers/net/softnic/rte_eth_softnic_internals.h
+++ b/drivers/net/softnic/rte_eth_softnic_internals.h
@@ -38,9 +38,25 @@
#include <rte_mbuf.h>
#include <rte_ethdev.h>
+#include <rte_sched.h>
#include "rte_eth_softnic.h"
+#ifndef TM_MAX_SUBPORTS
+#define TM_MAX_SUBPORTS 8
+#endif
+
+#ifndef TM_MAX_PIPES_PER_SUBPORT
+#define TM_MAX_PIPES_PER_SUBPORT 4096
+#endif
+
+struct tm_params {
+ struct rte_sched_port_params port_params;
+ struct rte_sched_subport_params subport_params[TM_MAX_SUBPORTS];
+ struct rte_sched_pipe_params pipe_profiles[RTE_SCHED_PIPE_PROFILES_PER_PORT];
+ int pipe_to_profile[TM_MAX_SUBPORTS * TM_MAX_PIPES_PER_SUBPORT];
+};
+
struct pmd_internals {
/* Devices */
struct rte_eth_dev *odev;
@@ -54,10 +70,20 @@ struct pmd_internals {
/* Operation */
struct rte_mbuf *pkts[RTE_ETH_SOFTNIC_DEQ_BSZ_MAX];
+ struct tm_params tm_params;
+ struct rte_sched_port *sched;
uint32_t deq_bsz;
uint32_t txq_id;
};
+extern const struct rte_tm_ops pmd_tm_ops;
+
+int
+tm_init(struct pmd_internals *p);
+
+void
+tm_free(struct pmd_internals *p);
+
void
pmd_ops_inherit(struct eth_dev_ops *o, const struct eth_dev_ops *u);
diff --git a/drivers/net/softnic/rte_eth_softnic_tm.c b/drivers/net/softnic/rte_eth_softnic_tm.c
new file mode 100644
index 0000000..0354657
--- /dev/null
+++ b/drivers/net/softnic/rte_eth_softnic_tm.c
@@ -0,0 +1,235 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright(c) 2017 Intel Corporation. All rights reserved.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdint.h>
+
+#include <rte_tm_driver.h>
+#include <rte_sched.h>
+
+#include "rte_eth_softnic_internals.h"
+#include "rte_eth_softnic.h"
+
+int
+tm_init(struct pmd_internals *p)
+{
+ struct tm_params *t = &p->tm_params;
+ uint32_t n_subports, subport_id;
+ int status;
+
+ /* Port */
+ t->port_params.name = p->odev->data->name;
+ t->port_params.socket = p->udev->data->numa_node;
+ t->port_params.rate = p->udev->data->dev_link.link_speed;
+
+ p->sched = rte_sched_port_config(&t->port_params);
+ if (p->sched == NULL)
+ return -1;
+
+ /* Subport */
+ n_subports = t->port_params.n_subports_per_port;
+ for (subport_id = 0; subport_id < n_subports; subport_id++) {
+ uint32_t n_pipes_per_subport = t->port_params.n_pipes_per_subport;
+ uint32_t pipe_id;
+
+ status = rte_sched_subport_config(p->sched,
+ subport_id,
+ &t->subport_params[subport_id]);
+ if (status) {
+ rte_sched_port_free(p->sched);
+ return -1;
+ }
+
+ /* Pipe */
+ n_pipes_per_subport = t->port_params.n_pipes_per_subport;
+ for (pipe_id = 0; pipe_id < n_pipes_per_subport; pipe_id++) {
+ int pos = subport_id * TM_MAX_PIPES_PER_SUBPORT + pipe_id;
+ int profile_id = t->pipe_to_profile[pos];
+
+ if (profile_id < 0)
+ continue;
+
+ status = rte_sched_pipe_config(p->sched,
+ subport_id,
+ pipe_id,
+ profile_id);
+ if (status) {
+ rte_sched_port_free(p->sched);
+ return -1;
+ }
+ }
+ }
+
+ return 0;
+}
+
+void
+tm_free(struct pmd_internals *p)
+{
+ if (p->sched)
+ rte_sched_port_free(p->sched);
+}
+
+/* Traffic manager node type get */
+static int
+pmd_tm_node_type_get(struct rte_eth_dev *dev __rte_unused,
+ uint32_t node_id __rte_unused,
+ int *is_leaf __rte_unused,
+ struct rte_tm_error *error __rte_unused)
+{
+ return 0;
+}
+
+/* Traffic manager capabilities get */
+static int
+pmd_tm_capabilities_get(struct rte_eth_dev *dev __rte_unused,
+ struct rte_tm_capabilities *cap __rte_unused,
+ struct rte_tm_error *error __rte_unused)
+{
+ return 0;
+}
+
+/* Traffic manager level capabilities get */
+static int
+pmd_tm_level_capabilities_get(struct rte_eth_dev *dev __rte_unused,
+ uint32_t level_id __rte_unused,
+ struct rte_tm_level_capabilities *cap __rte_unused,
+ struct rte_tm_error *error __rte_unused)
+{
+ return 0;
+}
+
+/* Traffic manager node capabilities get */
+static int
+pmd_tm_node_capabilities_get(struct rte_eth_dev *dev __rte_unused,
+ uint32_t node_id __rte_unused,
+ struct rte_tm_node_capabilities *cap __rte_unused,
+ struct rte_tm_error *error __rte_unused)
+{
+ return 0;
+}
+
+/* Traffic manager shaper profile add */
+static int
+pmd_tm_shaper_profile_add(struct rte_eth_dev *dev __rte_unused,
+ uint32_t shaper_profile_id __rte_unused,
+ struct rte_tm_shaper_params *profile __rte_unused,
+ struct rte_tm_error *error __rte_unused)
+{
+ return 0;
+}
+
+/* Traffic manager shaper profile delete */
+static int
+pmd_tm_shaper_profile_delete(struct rte_eth_dev *dev __rte_unused,
+ uint32_t shaper_profile_id __rte_unused,
+ struct rte_tm_error *error __rte_unused)
+{
+ return 0;
+}
+
+/* Traffic manager node add */
+static int
+pmd_tm_node_add(struct rte_eth_dev *dev __rte_unused,
+ uint32_t node_id __rte_unused,
+ uint32_t parent_node_id __rte_unused,
+ uint32_t priority __rte_unused,
+ uint32_t weight __rte_unused,
+ struct rte_tm_node_params *params __rte_unused,
+ struct rte_tm_error *error __rte_unused)
+{
+ return 0;
+}
+
+/* Traffic manager node delete */
+static int
+pmd_tm_node_delete(struct rte_eth_dev *dev __rte_unused,
+ uint32_t node_id __rte_unused,
+ struct rte_tm_error *error __rte_unused)
+{
+ return 0;
+}
+
+/* Traffic manager hierarchy commit */
+static int
+pmd_tm_hierarchy_commit(struct rte_eth_dev *dev __rte_unused,
+ int clear_on_fail __rte_unused,
+ struct rte_tm_error *error __rte_unused)
+{
+ return 0;
+}
+
+/* Traffic manager read stats counters for specific node */
+static int
+pmd_tm_node_stats_read(struct rte_eth_dev *dev __rte_unused,
+ uint32_t node_id __rte_unused,
+ struct rte_tm_node_stats *stats __rte_unused,
+ uint64_t *stats_mask __rte_unused,
+ int clear __rte_unused,
+ struct rte_tm_error *error __rte_unused)
+{
+ return 0;
+}
+
+const struct rte_tm_ops pmd_tm_ops = {
+ .node_type_get = pmd_tm_node_type_get,
+ .capabilities_get = pmd_tm_capabilities_get,
+ .level_capabilities_get = pmd_tm_level_capabilities_get,
+ .node_capabilities_get = pmd_tm_node_capabilities_get,
+
+ .wred_profile_add = NULL,
+ .wred_profile_delete = NULL,
+ .shared_wred_context_add_update = NULL,
+ .shared_wred_context_delete = NULL,
+
+ .shaper_profile_add = pmd_tm_shaper_profile_add,
+ .shaper_profile_delete = pmd_tm_shaper_profile_delete,
+ .shared_shaper_add_update = NULL,
+ .shared_shaper_delete = NULL,
+
+ .node_add = pmd_tm_node_add,
+ .node_delete = pmd_tm_node_delete,
+ .node_suspend = NULL,
+ .node_resume = NULL,
+ .hierarchy_commit = pmd_tm_hierarchy_commit,
+
+ .node_parent_update = NULL,
+ .node_shaper_update = NULL,
+ .node_shared_shaper_update = NULL,
+ .node_stats_update = NULL,
+ .node_wfq_weight_mode_update = NULL,
+ .node_cman_update = NULL,
+ .node_wred_context_update = NULL,
+ .node_shared_wred_context_update = NULL,
+
+ .node_stats_read = pmd_tm_node_stats_read,
+};
--
2.9.3
^ permalink raw reply [flat|nested] 79+ messages in thread
* Re: [dpdk-dev] [PATCH 0/2] net/softnic: sw fall-back for traffic management
2017-05-26 18:11 [dpdk-dev] [PATCH 0/2] net/softnic: sw fall-back for traffic management Jasvinder Singh
2017-05-26 18:11 ` [dpdk-dev] [PATCH 1/2] net/softnic: add softnic PMD " Jasvinder Singh
2017-05-26 18:11 ` [dpdk-dev] [PATCH " Jasvinder Singh
@ 2017-06-07 14:32 ` Thomas Monjalon
2017-06-08 13:27 ` Dumitrescu, Cristian
2017-08-11 15:28 ` Stephen Hemminger
3 siblings, 1 reply; 79+ messages in thread
From: Thomas Monjalon @ 2017-06-07 14:32 UTC (permalink / raw)
To: Jasvinder Singh
Cc: dev, cristian.dumitrescu, ferruh.yigit, hemant.agrawal,
Jerin.JacobKollanukkaran, wenzhuo.lu
Hi Jasvinder,
26/05/2017 20:11, Jasvinder Singh:
> The SoftNIC PMD provides SW fall-back option for the NICs not supporting
> the Traffic Management (TM) features.
Do you mean that you want to stack PMDs in order to offer some fallbacks?
It means the user needs to instantiate this PMD for each HW which does
not support traffic management, instead of normal hardware probing?
> SoftNIC PMD overview:
> - The SW fall-back is based on the existing librte_sched DPDK library.
> - The TM-agnostic port (the underlay device) is wrapped into a TM-aware
> softnic port (the overlay device).
> - Once the overlay device (virtual device) is created, the configuration of
> the underlay device is taking place through the overlay device.
> - The SoftNIC PMD is generic, i.e. it works for any underlay device PMD that
> implements the ethdev API.
Why not calling librte_sched directly in ethdev for PMDs which do not
implement hardware offload?
Am I missing something obvious?
^ permalink raw reply [flat|nested] 79+ messages in thread
* Re: [dpdk-dev] [PATCH 0/2] net/softnic: sw fall-back for traffic management
2017-06-07 14:32 ` [dpdk-dev] [PATCH 0/2] net/softnic: sw fall-back for traffic management Thomas Monjalon
@ 2017-06-08 13:27 ` Dumitrescu, Cristian
2017-06-08 13:59 ` Thomas Monjalon
0 siblings, 1 reply; 79+ messages in thread
From: Dumitrescu, Cristian @ 2017-06-08 13:27 UTC (permalink / raw)
To: Thomas Monjalon, Singh, Jasvinder
Cc: dev, Yigit, Ferruh, hemant.agrawal, Jerin.JacobKollanukkaran, Lu,
Wenzhuo
Hi Thomas,
Thanks for reviewing this patch set!
> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas@monjalon.net]
> Sent: Wednesday, June 7, 2017 3:32 PM
> To: Singh, Jasvinder <jasvinder.singh@intel.com>
> Cc: dev@dpdk.org; Dumitrescu, Cristian <cristian.dumitrescu@intel.com>;
> Yigit, Ferruh <ferruh.yigit@intel.com>; hemant.agrawal@nxp.com;
> Jerin.JacobKollanukkaran@cavium.com; Lu, Wenzhuo
> <wenzhuo.lu@intel.com>
> Subject: Re: [dpdk-dev] [PATCH 0/2] net/softnic: sw fall-back for traffic
> management
>
> Hi Jasvinder,
>
> 26/05/2017 20:11, Jasvinder Singh:
> > The SoftNIC PMD provides SW fall-back option for the NICs not supporting
> > the Traffic Management (TM) features.
>
> Do you mean that you want to stack PMDs in order to offer some fallbacks?
> It means the user needs to instantiate this PMD for each HW which does
> not support traffic management, instead of normal hardware probing?
>
No, the normal HW probing still takes place for the HW device. Then if QoS "probing" fails, the user can decide to create a new virtual device on top of the HW device.
> > SoftNIC PMD overview:
> > - The SW fall-back is based on the existing librte_sched DPDK library.
> > - The TM-agnostic port (the underlay device) is wrapped into a TM-aware
> > softnic port (the overlay device).
> > - Once the overlay device (virtual device) is created, the configuration of
> > the underlay device is taking place through the overlay device.
> > - The SoftNIC PMD is generic, i.e. it works for any underlay device PMD that
> > implements the ethdev API.
>
> Why not calling librte_sched directly in ethdev for PMDs which do not
> implement hardware offload?
> Am I missing something obvious?
Yes, we are calling librte_sched in ethdev, but how else can we do it?
- We cannot change the ethdev ops of the HW device PMD, because the same ops might be used by other HW devices in the system where the TM feature is not required.
- We cannot change the ethdev ops of the current HW device, as on-the-fly changes of the ops structure are not allowed, right?
- We can create a new virtual device on top of the existing HW device to inherit most of the ethdev ops of the HW device and patch some specific ethdev ops with librte_sched.
IMHO there aren't two different ways to do this.
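For illustration, the ops-cloning pattern from patch 1 looks roughly like this
(simplified sketch, not the exact code):

static void
pmd_ops_build(struct eth_dev_ops *o, const struct eth_dev_ops *u)
{
	/* Inherit everything the underlay device already provides ... */
	pmd_ops_inherit(o, u);

	/* ... then patch only the entries the SW fall-back needs. */
	o->dev_start = pmd_eth_dev_start;               /* sets up librte_sched */
	o->tx_queue_setup = pmd_eth_dev_tx_queue_setup;
	o->tm_ops_get = pmd_eth_dev_tm_ops_get;         /* exposes the TM ops */
}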
Regards,
Cristian
^ permalink raw reply [flat|nested] 79+ messages in thread
* Re: [dpdk-dev] [PATCH 0/2] net/softnic: sw fall-back for traffic management
2017-06-08 13:27 ` Dumitrescu, Cristian
@ 2017-06-08 13:59 ` Thomas Monjalon
2017-06-08 15:27 ` Dumitrescu, Cristian
0 siblings, 1 reply; 79+ messages in thread
From: Thomas Monjalon @ 2017-06-08 13:59 UTC (permalink / raw)
To: Dumitrescu, Cristian
Cc: Singh, Jasvinder, dev, Yigit, Ferruh, hemant.agrawal,
Jerin.JacobKollanukkaran, Lu, Wenzhuo
08/06/2017 15:27, Dumitrescu, Cristian:
> Hi Thomas,
>
> Thanks for reviewing this patch set!
>
> From: Thomas Monjalon [mailto:thomas@monjalon.net]
> >
> > Hi Jasvinder,
> >
> > 26/05/2017 20:11, Jasvinder Singh:
> > > The SoftNIC PMD provides SW fall-back option for the NICs not supporting
> > > the Traffic Management (TM) features.
> >
> > Do you mean that you want to stack PMDs in order to offer some fallbacks?
> > It means the user needs to instantiate this PMD for each HW which does
> > not support traffic management, instead of normal hardware probing?
> >
>
> No, the normal HW probing still takes place for the HW device. Then if QoS "probing" fails, the user can decide to create a new virtual device on top of the HW device.
What do you mean by "QoS probing"?
> > > SoftNIC PMD overview:
> > > - The SW fall-back is based on the existing librte_sched DPDK library.
> > > - The TM-agnostic port (the underlay device) is wrapped into a TM-aware
> > > softnic port (the overlay device).
> > > - Once the overlay device (virtual device) is created, the configuration of
> > > the underlay device is taking place through the overlay device.
> > > - The SoftNIC PMD is generic, i.e. it works for any underlay device PMD that
> > > implements the ethdev API.
> >
> > Why not calling librte_sched directly in ethdev for PMDs which do not
> > implement hardware offload?
> > Am I missing something obvious?
>
> Yes, we are calling the librte_sched in ethdev, but how else can we do it?
If you call librte_sched from ethdev, that's fine.
We don't need more, do we?
> - We cannot change the ethdev ops of the HW device PMD because same might be used by other HW devices in the system where TM feature is not required.
> - We cannot change the ethdev ops of the current HW device, as on-the-fly changes of the ops structure are not allowed, right?
Right
> - We can create a new virtual device on top of existing HW device to inherit most of the ethdev ops of the HW device and patch some specific ethdev ops with librte_sched.
>
> IMHO there aren't two different ways to do this.
When initializing a HW device, it can (should) report its TM capabilities.
Then ethdev can decide to use a SW fallback if a capability is missing.
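Something along these lines, using the capability query proposed in the TM API
(sketch only; the helper below is hypothetical and the capability field name
may differ in the final API):

static int
needs_sw_fallback(uint8_t port_id, uint32_t needed_levels)
{
	struct rte_tm_capabilities cap;
	struct rte_tm_error err;

	/* No TM ops at all: SW fallback is the only option */
	if (rte_tm_capabilities_get(port_id, &cap, &err) != 0)
		return 1;

	/* HW present but hierarchy too shallow: fall back as well */
	return cap.n_levels_max < needed_levels;
}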
^ permalink raw reply [flat|nested] 79+ messages in thread
* Re: [dpdk-dev] [PATCH 0/2] net/softnic: sw fall-back for traffic management
2017-06-08 13:59 ` Thomas Monjalon
@ 2017-06-08 15:27 ` Dumitrescu, Cristian
2017-06-08 16:16 ` Thomas Monjalon
0 siblings, 1 reply; 79+ messages in thread
From: Dumitrescu, Cristian @ 2017-06-08 15:27 UTC (permalink / raw)
To: Thomas Monjalon
Cc: Singh, Jasvinder, dev, Yigit, Ferruh, hemant.agrawal,
Jerin.JacobKollanukkaran, Lu, Wenzhuo
> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas@monjalon.net]
> Sent: Thursday, June 8, 2017 3:00 PM
> To: Dumitrescu, Cristian <cristian.dumitrescu@intel.com>
> Cc: Singh, Jasvinder <jasvinder.singh@intel.com>; dev@dpdk.org; Yigit,
> Ferruh <ferruh.yigit@intel.com>; hemant.agrawal@nxp.com;
> Jerin.JacobKollanukkaran@cavium.com; Lu, Wenzhuo
> <wenzhuo.lu@intel.com>
> Subject: Re: [dpdk-dev] [PATCH 0/2] net/softnic: sw fall-back for traffic
> management
>
> 08/06/2017 15:27, Dumitrescu, Cristian:
> > Hi Thomas,
> >
> > Thanks for reviewing this patch set!
> >
> > From: Thomas Monjalon [mailto:thomas@monjalon.net]
> > >
> > > Hi Jasvinder,
> > >
> > > 26/05/2017 20:11, Jasvinder Singh:
> > > > The SoftNIC PMD provides SW fall-back option for the NICs not
> supporting
> > > > the Traffic Management (TM) features.
> > >
> > > Do you mean that you want to stack PMDs in order to offer some
> fallbacks?
> > > It means the user needs to instantiate this PMD for each HW which does
> > > not support traffic management, instead of normal hardware probing?
> > >
> >
> > No, the normal HW probing still takes place for the HW device. Then if QoS
> "probing" fails, the user can decide to create a new virtual device on top of
> the HW device.
>
> What do you mean by "QoS probing"?
Check if the hierarchy specified by the user can be met by the HW device.
>
> > > > SoftNIC PMD overview:
> > > > - The SW fall-back is based on the existing librte_sched DPDK library.
> > > > - The TM-agnostic port (the underlay device) is wrapped into a TM-
> aware
> > > > softnic port (the overlay device).
> > > > - Once the overlay device (virtual device) is created, the configuration
> of
> > > > the underlay device is taking place through the overlay device.
> > > > - The SoftNIC PMD is generic, i.e. it works for any underlay device PMD
> that
> > > > implements the ethdev API.
> > >
> > > Why not calling librte_sched directly in ethdev for PMDs which do not
> > > implement hardware offload?
> > > Am I missing something obvious?
> >
> > Yes, we are calling the librte_sched in ethdev, but how else can we do it?
>
> If you call librte_sched from ethdev, that's fine.
> We don't need more, do we?
>
We also need to make sure the other non-patched functionality is working as provided by the underlying HW device. E.g. we patch TX to enable TM, but we don't patch RX and RX should still be working.
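For illustration, this is roughly how patch 1 wires the fast path: RX is
inherited from the underlay device unchanged, only TX is overridden
(simplified sketch):

static void
fast_path_wire(struct rte_eth_dev *odev, const struct rte_eth_dev *udev)
{
	/* RX path: taken from the underlay device as-is */
	odev->rx_pkt_burst = udev->rx_pkt_burst;

	/* TX path: pmd_eth_dev_tx_burst() enqueues into the rte_sched
	 * port instead of transmitting directly */
	odev->tx_pkt_burst = pmd_eth_dev_tx_burst;
	odev->tx_pkt_prepare = udev->tx_pkt_prepare;
}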
> > - We cannot change the ethdev ops of the HW device PMD because
> same might be used by other HW devices in the system where TM feature is
> not required.
> > - We cannot change the ethdev ops of the current HW device, as on-
> the-fly changes of the ops structure are not allowed, right?
>
> Right
>
> > - We can create a new virtual device on top of existing HW device to
> inherit most of the ethdev ops of the HW device and patch some specific
> ethdev ops with librte_sched.
> >
> > IMHO there aren't two different ways to do this.
>
> When initializing a HW device, it can (should) reports its TM capabilities.
> Then ethdev can decide to use a SW fallback if a capability is missing.
Unfortunately, having the ethdev take this decision is not possible with the current librte_ether, as this means changing the ethdev ops on the fly, which you also agreed is currently not allowed.
This is why we have to leave this decision to the application, which creates the virtual device on top of the existing HW when it wants the SW fall-back to be enabled.
^ permalink raw reply [flat|nested] 79+ messages in thread
* Re: [dpdk-dev] [PATCH 0/2] net/softnic: sw fall-back for traffic management
2017-06-08 15:27 ` Dumitrescu, Cristian
@ 2017-06-08 16:16 ` Thomas Monjalon
2017-06-08 16:43 ` Dumitrescu, Cristian
0 siblings, 1 reply; 79+ messages in thread
From: Thomas Monjalon @ 2017-06-08 16:16 UTC (permalink / raw)
To: Dumitrescu, Cristian
Cc: Singh, Jasvinder, dev, Yigit, Ferruh, hemant.agrawal,
Jerin.JacobKollanukkaran, Lu, Wenzhuo
08/06/2017 17:27, Dumitrescu, Cristian:
> From: Thomas Monjalon [mailto:thomas@monjalon.net]
> > 08/06/2017 15:27, Dumitrescu, Cristian:
> > > Hi Thomas,
> > >
> > > Thanks for reviewing this patch set!
> > >
> > > From: Thomas Monjalon [mailto:thomas@monjalon.net]
> > > >
> > > > Hi Jasvinder,
> > > >
> > > > 26/05/2017 20:11, Jasvinder Singh:
> > > > > The SoftNIC PMD provides SW fall-back option for the NICs not
> > supporting
> > > > > the Traffic Management (TM) features.
> > > >
> > > > Do you mean that you want to stack PMDs in order to offer some
> > fallbacks?
> > > > It means the user needs to instantiate this PMD for each HW which does
> > > > not support traffic management, instead of normal hardware probing?
> > > >
> > >
> > > No, the normal HW probing still takes place for the HW device. Then if QoS
> > "probing" fails, the user can decide to create a new virtual device on top of
> > the HW device.
> >
> > What do you mean by "QoS probing"?
>
> Check if the hierarchy specified by the user can be met by the HW device.
>
> >
> > > > > SoftNIC PMD overview:
> > > > > - The SW fall-back is based on the existing librte_sched DPDK library.
> > > > > - The TM-agnostic port (the underlay device) is wrapped into a TM-
> > aware
> > > > > softnic port (the overlay device).
> > > > > - Once the overlay device (virtual device) is created, the configuration
> > of
> > > > > the underlay device is taking place through the overlay device.
> > > > > - The SoftNIC PMD is generic, i.e. it works for any underlay device PMD
> > that
> > > > > implements the ethdev API.
> > > >
> > > > Why not calling librte_sched directly in ethdev for PMDs which do not
> > > > implement hardware offload?
> > > > Am I missing something obvious?
> > >
> > > Yes, we are calling the librte_sched in ethdev, but how else can we do it?
> >
> > If you call librte_sched from ethdev, that's fine.
> > We don't need more, do we?
> >
>
> We also need to make sure the other non-patched functionality is working as provided by the underlying HW device. E.g. we patch TX to enable TM, but we don't patch RX and RX should still be working.
>
> > > - We cannot change the ethdev ops of the HW device PMD because
> > same might be used by other HW devices in the system where TM feature is
> > not required.
> > > - We cannot change the ethdev ops of the current HW device, as on-
> > the-fly changes of the ops structure are not allowed, right?
> >
> > Right
> >
> > > - We can create a new virtual device on top of existing HW device to
> > inherit most of the ethdev ops of the HW device and patch some specific
> > ethdev ops with librte_sched.
> > >
> > > IMHO there aren't two different ways to do this.
> >
> > When initializing a HW device, it can (should) reports its TM capabilities.
> > Then ethdev can decide to use a SW fallback if a capability is missing.
>
> Unfortunately, having the ethdev taking this decision is not possible with the current librte_ether, as this means the changing the ethdev ops on the fly, which you also agreed is currently not allowed.
I'm sure I'm missing something.
In my understanding, we do not need to change the ops:
- if the device offers the capability, let's call the ops
- else call the software fallback function
> This is why we have to leave this decision to the application, which creates the virtual device on top of the existing HW when it wants the SW fall-back to be enabled.
^ permalink raw reply [flat|nested] 79+ messages in thread
* Re: [dpdk-dev] [PATCH 0/2] net/softnic: sw fall-back for traffic management
2017-06-08 16:16 ` Thomas Monjalon
@ 2017-06-08 16:43 ` Dumitrescu, Cristian
2017-07-04 23:48 ` Thomas Monjalon
0 siblings, 1 reply; 79+ messages in thread
From: Dumitrescu, Cristian @ 2017-06-08 16:43 UTC (permalink / raw)
To: Thomas Monjalon
Cc: Singh, Jasvinder, dev, Yigit, Ferruh, hemant.agrawal,
Jerin.JacobKollanukkaran, Lu, Wenzhuo
<snip> ...
>
> I'm sure I'm missing something.
> In my understanding, we do not need to change the ops:
> - if the device offers the capability, let's call the ops
> - else call the software fallback function
>
What you might be missing is the observation that the approach you're describing requires changing each and every PMD. The changes are also intrusive: we need to change the ops that need the SW fall-back patching, and we also need to change the private data of each PMD (assigned to the opaque dev->data->dev_private) to add the context data needed by the patched ops. Therefore, this approach is a no-go.
We are looking for a generic approach that can gracefully and transparently work with any PMD.
^ permalink raw reply [flat|nested] 79+ messages in thread
* [dpdk-dev] [PATCH v2 0/2] net/softnic: sw fall-back for traffic management
2017-05-26 18:11 ` [dpdk-dev] [PATCH 1/2] net/softnic: add softnic PMD " Jasvinder Singh
@ 2017-06-26 16:43 ` Jasvinder Singh
2017-06-26 16:43 ` [dpdk-dev] [PATCH v2 1/2] net/softnic: add softnic PMD " Jasvinder Singh
2017-06-26 16:43 ` [dpdk-dev] [PATCH v2 2/2] net/softnic: add traffic management ops Jasvinder Singh
0 siblings, 2 replies; 79+ messages in thread
From: Jasvinder Singh @ 2017-06-26 16:43 UTC (permalink / raw)
To: dev
Cc: cristian.dumitrescu, ferruh.yigit, hemant.agrawal,
Jerin.JacobKollanukkaran, wenzhuo.lu
The SoftNIC PMD provides a SW fall-back option for NICs that do not support
the Traffic Management (TM) features.
SoftNIC PMD overview:
- The SW fall-back is based on the existing librte_sched DPDK library.
- The TM-agnostic port (the underlay device) is wrapped into a TM-aware
softnic port (the overlay device).
- Once the overlay device (virtual device) is created, the configuration
of the underlay device takes place through the overlay device.
- The SoftNIC PMD is generic, i.e. it works for any underlay device PMD
that implements the ethdev API.
Similarly to Ring PMD, the SoftNIC virtual device can be created in two
different ways:
1. Through EAL command line (--vdev option)
2. Through the rte_eth_softnic_create() API function called by the
application.
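For the second option, a minimal sketch of the call (the values below are
just examples; the params struct and default macro come from
rte_eth_softnic.h in patch 1):

static void
softnic_port_create(void)
{
	struct rte_eth_softnic_params sp = {
		.oname = "net_softnic0",    /* overlay device to be created */
		.uname = "0000:04:00.1",    /* existing underlay device (example) */
		.txq_id = 0,
		.deq_bsz = RTE_ETH_SOFTNIC_DEQ_BSZ_DEFAULT,
	};

	if (rte_eth_softnic_create(&sp) != 0)
		rte_exit(EXIT_FAILURE, "softnic port creation failed\n");
}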
SoftNIC PMD params:
- iface (mandatory): the ethdev port name (i.e. PCI address or vdev name)
for the underlay device
- txq_id (optional, default = 0): tx queue id of the underlay device
- deq_bsz (optional, default = 24): traffic manager dequeue burst size
- Example: --vdev 'net_softnic0,iface=0000:04:00.1,txq_id=0,deq_bsz=28'
SoftNIC PMD build instructions:
- To build the SoftNIC PMD, the following parameter needs to be set in the
config/common_base file: CONFIG_RTE_LIBRTE_PMD_SOFTNIC=y
- The SoftNIC PMD depends on the TM API [1] and therefore is initially
targeted for the tm-next repository
Patch 1 adds softnic device PMD for traffic management.
Patch 2 adds the traffic management ops suggested in the generic ethdev API
for traffic management [1] to the softnic device.
[1] TM API version 6:
http://dpdk.org/dev/patchwork/patch/25275/
http://dpdk.org/dev/patchwork/patch/25276/
Jasvinder Singh (2):
net/softnic: add softnic PMD for traffic management
net/softnic: add traffic management ops
MAINTAINERS | 5 +
config/common_base | 5 +
drivers/net/Makefile | 5 +
drivers/net/softnic/Makefile | 58 ++
drivers/net/softnic/rte_eth_softnic.c | 580 ++++++++++++
drivers/net/softnic/rte_eth_softnic.h | 99 ++
drivers/net/softnic/rte_eth_softnic_default.c | 1108 ++++++++++++++++++++++
drivers/net/softnic/rte_eth_softnic_internals.h | 148 +++
drivers/net/softnic/rte_eth_softnic_tm.c | 1145 +++++++++++++++++++++++
drivers/net/softnic/rte_eth_softnic_version.map | 7 +
mk/rte.app.mk | 5 +-
11 files changed, 3164 insertions(+), 1 deletion(-)
create mode 100644 drivers/net/softnic/Makefile
create mode 100644 drivers/net/softnic/rte_eth_softnic.c
create mode 100644 drivers/net/softnic/rte_eth_softnic.h
create mode 100644 drivers/net/softnic/rte_eth_softnic_default.c
create mode 100644 drivers/net/softnic/rte_eth_softnic_internals.h
create mode 100644 drivers/net/softnic/rte_eth_softnic_tm.c
create mode 100644 drivers/net/softnic/rte_eth_softnic_version.map
--
2.9.3
^ permalink raw reply [flat|nested] 79+ messages in thread
* [dpdk-dev] [PATCH v2 1/2] net/softnic: add softnic PMD for traffic management
2017-06-26 16:43 ` [dpdk-dev] [PATCH v2 0/2] net/softnic: sw fall-back " Jasvinder Singh
@ 2017-06-26 16:43 ` Jasvinder Singh
2017-08-11 12:49 ` [dpdk-dev] [PATCH v3 0/4] net/softnic: sw fall-back pmd for traffic mgmt and others Jasvinder Singh
2017-06-26 16:43 ` [dpdk-dev] [PATCH v2 2/2] net/softnic: add traffic management ops Jasvinder Singh
1 sibling, 1 reply; 79+ messages in thread
From: Jasvinder Singh @ 2017-06-26 16:43 UTC (permalink / raw)
To: dev
Cc: cristian.dumitrescu, ferruh.yigit, hemant.agrawal,
Jerin.JacobKollanukkaran, wenzhuo.lu
The SoftNIC PMD implements the HQoS scheduler as a software fallback solution
for hardware with no HQoS support. When the application calls the rx function
on this device, it simply invokes the underlay device rx function. On the
egress path, the softnic tx function enqueues the packets into the QoS
scheduler. The packets are dequeued from the QoS scheduler and sent to the
underlay device through the rte_eth_softnic_run() API.
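A typical application loop around this API might look as follows (sketch
only; the port id, queue id, burst size and quit flag are placeholders):

static void
softnic_fwd_loop(uint8_t softnic_port_id, volatile int *quit)
{
	struct rte_mbuf *pkts[32];

	while (!*quit) {
		/* RX on the overlay port goes straight to the underlay */
		uint16_t n = rte_eth_rx_burst(softnic_port_id, 0, pkts, 32);

		/* TX on the overlay port enqueues into the QoS scheduler */
		uint16_t sent = rte_eth_tx_burst(softnic_port_id, 0, pkts, n);
		while (sent < n)
			rte_pktmbuf_free(pkts[sent++]);

		/* Dequeue from the scheduler and push to the underlay device */
		rte_eth_softnic_run(softnic_port_id);
	}
}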
Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
---
v2 changes:
- fix build errors
- rebased to TM APIs v6 plus dpdk master
MAINTAINERS | 5 +
config/common_base | 5 +
drivers/net/Makefile | 5 +
drivers/net/softnic/Makefile | 57 ++
drivers/net/softnic/rte_eth_softnic.c | 534 +++++++++++
drivers/net/softnic/rte_eth_softnic.h | 99 ++
drivers/net/softnic/rte_eth_softnic_default.c | 1108 +++++++++++++++++++++++
drivers/net/softnic/rte_eth_softnic_internals.h | 67 ++
drivers/net/softnic/rte_eth_softnic_version.map | 7 +
mk/rte.app.mk | 5 +-
10 files changed, 1891 insertions(+), 1 deletion(-)
create mode 100644 drivers/net/softnic/Makefile
create mode 100644 drivers/net/softnic/rte_eth_softnic.c
create mode 100644 drivers/net/softnic/rte_eth_softnic.h
create mode 100644 drivers/net/softnic/rte_eth_softnic_default.c
create mode 100644 drivers/net/softnic/rte_eth_softnic_internals.h
create mode 100644 drivers/net/softnic/rte_eth_softnic_version.map
diff --git a/MAINTAINERS b/MAINTAINERS
index 3c7414f..7136fde 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -491,6 +491,11 @@ M: Tetsuya Mukawa <mtetsuyah@gmail.com>
F: drivers/net/null/
F: doc/guides/nics/features/null.ini
+Softnic PMD
+M: Jasvinder Singh <jasvinder.singh@intel.com>
+M: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
+F: drivers/net/softnic
+
Crypto Drivers
--------------
diff --git a/config/common_base b/config/common_base
index f6aafd1..f4d33e0 100644
--- a/config/common_base
+++ b/config/common_base
@@ -271,6 +271,11 @@ CONFIG_RTE_LIBRTE_SFC_EFX_PMD=y
CONFIG_RTE_LIBRTE_SFC_EFX_DEBUG=n
#
+# Compile SOFTNIC PMD
+#
+CONFIG_RTE_LIBRTE_PMD_SOFTNIC=y
+
+#
# Compile software PMD backed by SZEDATA2 device
#
CONFIG_RTE_LIBRTE_PMD_SZEDATA2=n
diff --git a/drivers/net/Makefile b/drivers/net/Makefile
index 35ed813..d3cba67 100644
--- a/drivers/net/Makefile
+++ b/drivers/net/Makefile
@@ -108,4 +108,9 @@ DIRS-$(CONFIG_RTE_LIBRTE_PMD_VHOST) += vhost
endif # $(CONFIG_RTE_LIBRTE_VHOST)
DEPDIRS-vhost = $(core-libs) librte_vhost
+ifeq ($(CONFIG_RTE_LIBRTE_SCHED),y)
+DIRS-$(CONFIG_RTE_LIBRTE_PMD_SOFTNIC) += softnic
+endif # $(CONFIG_RTE_LIBRTE_SCHED)
+DEPDIRS-softnic = $(core-libs) librte_sched
+
include $(RTE_SDK)/mk/rte.subdir.mk
diff --git a/drivers/net/softnic/Makefile b/drivers/net/softnic/Makefile
new file mode 100644
index 0000000..809112c
--- /dev/null
+++ b/drivers/net/softnic/Makefile
@@ -0,0 +1,57 @@
+# BSD LICENSE
+#
+# Copyright(c) 2017 Intel Corporation. All rights reserved.
+# All rights reserved.
+#
+# Redistribution and use in source and binary forms, with or without
+# modification, are permitted provided that the following conditions
+# are met:
+#
+# * Redistributions of source code must retain the above copyright
+# notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+# notice, this list of conditions and the following disclaimer in
+# the documentation and/or other materials provided with the
+# distribution.
+# * Neither the name of Intel Corporation nor the names of its
+# contributors may be used to endorse or promote products derived
+# from this software without specific prior written permission.
+#
+# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+#
+# library name
+#
+LIB = librte_pmd_softnic.a
+
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS)
+
+EXPORT_MAP := rte_eth_softnic_version.map
+
+LIBABIVER := 1
+
+#
+# all source are stored in SRCS-y
+#
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_SOFTNIC) += rte_eth_softnic.c
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_SOFTNIC) += rte_eth_softnic_default.c
+
+#
+# Export include files
+#
+SYMLINK-y-include += rte_eth_softnic.h
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/drivers/net/softnic/rte_eth_softnic.c b/drivers/net/softnic/rte_eth_softnic.c
new file mode 100644
index 0000000..d4ac100
--- /dev/null
+++ b/drivers/net/softnic/rte_eth_softnic.c
@@ -0,0 +1,534 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright(c) 2017 Intel Corporation. All rights reserved.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdint.h>
+#include <stdlib.h>
+#include <string.h>
+
+#include <rte_ethdev.h>
+#include <rte_ethdev_vdev.h>
+#include <rte_malloc.h>
+#include <rte_vdev.h>
+#include <rte_kvargs.h>
+#include <rte_errno.h>
+
+#include "rte_eth_softnic.h"
+#include "rte_eth_softnic_internals.h"
+
+#define PMD_PARAM_IFACE_NAME "iface"
+#define PMD_PARAM_IFACE_QUEUE "txq_id"
+#define PMD_PARAM_DEQ_BSZ "deq_bsz"
+
+static const char *pmd_valid_args[] = {
+ PMD_PARAM_IFACE_NAME,
+ PMD_PARAM_IFACE_QUEUE,
+ PMD_PARAM_DEQ_BSZ,
+ NULL
+};
+
+static struct rte_vdev_driver pmd_drv;
+static struct rte_device *device;
+
+static int
+pmd_eth_dev_configure(struct rte_eth_dev *dev)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ int status;
+
+ /* Stop the underlay device */
+ rte_eth_dev_stop(p->uport_id);
+
+ /* Call the current function for the underlay device */
+ status = rte_eth_dev_configure(p->uport_id,
+ dev->data->nb_rx_queues,
+ dev->data->nb_tx_queues,
+ &dev->data->dev_conf);
+ if (status)
+ return status;
+
+ /* Rework on the RX queues of the overlay device */
+ if (dev->data->rx_queues)
+ rte_free(dev->data->rx_queues);
+ dev->data->rx_queues = p->udev->data->rx_queues;
+
+ return 0;
+}
+
+static int
+pmd_eth_dev_tx_queue_setup(struct rte_eth_dev *dev,
+ uint16_t tx_queue_id,
+ uint16_t nb_tx_desc,
+ unsigned int socket_id,
+ const struct rte_eth_txconf *tx_conf)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ int status;
+
+ /* Call the current function for the underlay device */
+ status = rte_eth_tx_queue_setup(p->uport_id,
+ tx_queue_id,
+ nb_tx_desc,
+ socket_id,
+ tx_conf);
+ if (status)
+ return status;
+
+ /* Handle TX queue of the overlay device */
+ dev->data->tx_queues[tx_queue_id] = (void *)p;
+
+ return 0;
+}
+
+static int
+pmd_eth_dev_start(struct rte_eth_dev *dev)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ /* Clone dev->data from underlay to overlay */
+ memcpy(dev->data->mac_pool_sel,
+ p->udev->data->mac_pool_sel,
+ sizeof(dev->data->mac_pool_sel));
+ dev->data->promiscuous = p->udev->data->promiscuous;
+ dev->data->all_multicast = p->udev->data->all_multicast;
+
+ /* Call the current function for the underlay device */
+ return rte_eth_dev_start(p->uport_id);
+}
+
+static void
+pmd_eth_dev_stop(struct rte_eth_dev *dev)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ /* Call the current function for the underlay device */
+ rte_eth_dev_stop(p->uport_id);
+}
+
+static void
+pmd_eth_dev_close(struct rte_eth_dev *dev)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ /* Call the current function for the underlay device */
+ rte_eth_dev_close(p->uport_id);
+
+ /* Cleanup on the overlay device */
+ dev->data->rx_queues = NULL;
+ dev->data->tx_queues = NULL;
+}
+
+static void
+pmd_eth_dev_promiscuous_enable(struct rte_eth_dev *dev)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ /* Call the current function for the underlay device */
+ rte_eth_promiscuous_enable(p->uport_id);
+}
+
+static void
+pmd_eth_dev_promiscuous_disable(struct rte_eth_dev *dev)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ /* Call the current function for the underlay device */
+ rte_eth_promiscuous_disable(p->uport_id);
+}
+
+static void
+pmd_eth_dev_all_multicast_enable(struct rte_eth_dev *dev)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ /* Call the current function for the underlay device */
+ rte_eth_allmulticast_enable(p->uport_id);
+}
+
+static void
+pmd_eth_dev_all_multicast_disable(struct rte_eth_dev *dev)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ /* Call the current function for the underlay device */
+ rte_eth_allmulticast_disable(p->uport_id);
+}
+
+static int
+pmd_eth_dev_link_update(struct rte_eth_dev *dev,
+ int wait_to_complete __rte_unused)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct rte_eth_link dev_link;
+
+ /* Call the current function for the underlay device */
+ rte_eth_link_get(p->uport_id, &dev_link);
+
+ /* Overlay device update */
+ dev->data->dev_link = dev_link;
+
+ return 0;
+}
+
+static int
+pmd_eth_dev_mtu_set(struct rte_eth_dev *dev, uint16_t mtu)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ int status;
+
+ /* Call the current function for the underlay device */
+ status = rte_eth_dev_set_mtu(p->uport_id, mtu);
+ if (status)
+ return status;
+
+ /* Overlay device update */
+ dev->data->mtu = mtu;
+
+ return 0;
+}
+
+static void
+pmd_eth_dev_mac_addr_set(struct rte_eth_dev *dev,
+ struct ether_addr *mac_addr)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ /* Call the current function for the underlay device */
+ rte_eth_dev_default_mac_addr_set(p->uport_id, mac_addr);
+}
+
+static int
+pmd_eth_dev_mac_addr_add(struct rte_eth_dev *dev,
+ struct ether_addr *mac_addr,
+ uint32_t index __rte_unused,
+ uint32_t vmdq)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ /* Call the current function for the underlay device */
+ return rte_eth_dev_mac_addr_add(p->uport_id, mac_addr, vmdq);
+}
+
+static void
+pmd_eth_dev_mac_addr_remove(struct rte_eth_dev *dev, uint32_t index)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ /* Call the current function for the underlay device */
+ rte_eth_dev_mac_addr_remove(p->uport_id, &dev->data->mac_addrs[index]);
+}
+
+static uint16_t
+pmd_eth_dev_tx_burst(void *txq,
+ struct rte_mbuf **tx_pkts,
+ uint16_t nb_pkts)
+{
+ struct pmd_internals *p = txq;
+
+ return rte_eth_tx_burst(p->uport_id, p->txq_id, tx_pkts, nb_pkts);
+}
+
+int
+rte_eth_softnic_run(uint8_t port_id __rte_unused)
+{
+ return 0;
+}
+
+static void
+pmd_ops_build(struct eth_dev_ops *o, const struct eth_dev_ops *u)
+{
+ /* Inherited functionality */
+ pmd_ops_inherit(o, u);
+
+ /* Derived functionality */
+ o->dev_configure = pmd_eth_dev_configure;
+ o->tx_queue_setup = pmd_eth_dev_tx_queue_setup;
+ o->dev_start = pmd_eth_dev_start;
+ o->dev_stop = pmd_eth_dev_stop;
+ o->dev_close = pmd_eth_dev_close;
+ o->promiscuous_enable = pmd_eth_dev_promiscuous_enable;
+ o->promiscuous_disable = pmd_eth_dev_promiscuous_disable;
+ o->allmulticast_enable = pmd_eth_dev_all_multicast_enable;
+ o->allmulticast_disable = pmd_eth_dev_all_multicast_disable;
+ o->link_update = pmd_eth_dev_link_update;
+ o->mtu_set = pmd_eth_dev_mtu_set;
+ o->mac_addr_set = pmd_eth_dev_mac_addr_set;
+ o->mac_addr_add = pmd_eth_dev_mac_addr_add;
+ o->mac_addr_remove = pmd_eth_dev_mac_addr_remove;
+}
+
+int
+rte_eth_softnic_create(struct rte_eth_softnic_params *params)
+{
+ struct rte_eth_dev_info uinfo;
+ struct rte_eth_dev *odev, *udev;
+ struct rte_eth_dev_data *odata, *udata;
+ struct eth_dev_ops *odev_ops;
+ const struct eth_dev_ops *udev_ops;
+ void **otx_queues;
+ struct pmd_internals *p;
+ int numa_node;
+ uint8_t oport_id, uport_id;
+
+ /* Check input arguments */
+ if ((!params) || (!params->oname) || (!params->uname) ||
+ (params->deq_bsz > RTE_ETH_SOFTNIC_DEQ_BSZ_MAX))
+ return -EINVAL;
+
+ if (rte_eth_dev_get_port_by_name(params->uname, &uport_id))
+ return -EINVAL;
+ udev = &rte_eth_devices[uport_id];
+ udata = udev->data;
+ udev_ops = udev->dev_ops;
+ numa_node = udata->numa_node;
+
+ rte_eth_dev_info_get(uport_id, &uinfo);
+ if (params->txq_id >= uinfo.max_tx_queues)
+ return -EINVAL;
+
+ RTE_LOG(INFO, PMD,
+ "Creating overlay device %s for underlay device %s\n",
+ params->oname, params->uname);
+
+ /* Overlay device ethdev entry: entry allocation */
+ odev = rte_eth_dev_allocate(params->oname);
+ if (!odev)
+ return -ENOMEM;
+ oport_id = odev - rte_eth_devices;
+
+ /* Overlay device ethdev entry: memory allocation */
+ odev_ops = rte_zmalloc_socket(params->oname,
+ sizeof(*odev_ops), 0, numa_node);
+ if (!odev_ops) {
+ rte_eth_dev_release_port(odev);
+ return -ENOMEM;
+ }
+ odev->dev_ops = odev_ops;
+
+ odata = rte_zmalloc_socket(params->oname,
+ sizeof(*odata), 0, numa_node);
+ if (!odata) {
+ rte_free(odev_ops);
+ rte_eth_dev_release_port(odev);
+ return -ENOMEM;
+ }
+ memmove(odata->name, odev->data->name, sizeof(odata->name));
+ odata->port_id = odev->data->port_id;
+ odata->mtu = odev->data->mtu;
+ odev->data = odata;
+
+ otx_queues = rte_zmalloc_socket(params->oname,
+ uinfo.max_tx_queues * sizeof(void *), 0, numa_node);
+ if (!otx_queues) {
+ rte_free(odata);
+ rte_free(odev_ops);
+ rte_eth_dev_release_port(odev);
+ return -ENOMEM;
+ }
+ odev->data->tx_queues = otx_queues;
+
+ p = rte_zmalloc_socket(params->oname,
+ sizeof(struct pmd_internals), 0, numa_node);
+ if (!p) {
+ rte_free(otx_queues);
+ rte_free(odata);
+ rte_free(odev_ops);
+ rte_eth_dev_release_port(odev);
+ return -ENOMEM;
+ }
+ odev->data->dev_private = p;
+
+ /* Overlay device ethdev entry: fill in dev */
+ odev->rx_pkt_burst = udev->rx_pkt_burst;
+ odev->tx_pkt_burst = pmd_eth_dev_tx_burst;
+ odev->tx_pkt_prepare = udev->tx_pkt_prepare;
+
+ /* Overlay device ethdev entry: fill in dev->data */
+ odev->data->dev_link = udev->data->dev_link;
+ odev->data->mtu = udev->data->mtu;
+ odev->data->min_rx_buf_size = udev->data->min_rx_buf_size;
+ odev->data->mac_addrs = udev->data->mac_addrs;
+ odev->data->hash_mac_addrs = udev->data->hash_mac_addrs;
+ odev->data->promiscuous = udev->data->promiscuous;
+ odev->data->all_multicast = udev->data->all_multicast;
+ odev->data->dev_flags = udev->data->dev_flags;
+ odev->data->kdrv = RTE_KDRV_NONE;
+ odev->data->numa_node = numa_node;
+ odev->device = device;
+
+ /* Overlay device ethdev entry: fill in dev->dev_ops */
+ pmd_ops_build(odev_ops, udev_ops);
+
+ /* Overlay device ethdev entry: fill in dev->data->dev_private */
+ p->odev = odev;
+ p->udev = udev;
+ p->odata = odata;
+ p->udata = udata;
+ p->odev_ops = odev_ops;
+ p->udev_ops = udev_ops;
+ p->oport_id = oport_id;
+ p->uport_id = uport_id;
+ p->deq_bsz = params->deq_bsz;
+ p->txq_id = params->txq_id;
+
+ return 0;
+}
+
+static int
+get_string_arg(const char *key __rte_unused,
+ const char *value, void *extra_args)
+{
+ if (!value || !extra_args)
+ return -EINVAL;
+
+ *(char **)extra_args = strdup(value);
+
+ if (!*(char **)extra_args)
+ return -ENOMEM;
+
+ return 0;
+}
+
+static int
+get_int_arg(const char *key __rte_unused,
+ const char *value, void *extra_args)
+{
+ if (!value || !extra_args)
+ return -EINVAL;
+
+ *(uint32_t *)extra_args = strtoull(value, NULL, 0);
+
+ return 0;
+}
+
+static int
+pmd_probe(struct rte_vdev_device *dev)
+{
+ struct rte_eth_softnic_params p;
+ const char *params;
+ struct rte_kvargs *kvlist;
+ int ret;
+
+ if (!dev)
+ return -EINVAL;
+
+ RTE_LOG(INFO, PMD, "Probing device %s\n", rte_vdev_device_name(dev));
+
+ params = rte_vdev_device_args(dev);
+ if (!params)
+ return -EINVAL;
+
+ kvlist = rte_kvargs_parse(params, pmd_valid_args);
+ if (!kvlist)
+ return -EINVAL;
+
+ p.oname = rte_vdev_device_name(dev);
+ device = &dev->device;
+
+ /* Interface: Mandatory */
+ if (rte_kvargs_count(kvlist, PMD_PARAM_IFACE_NAME) != 1) {
+ ret = -EINVAL;
+ goto out_free;
+ }
+ ret = rte_kvargs_process(kvlist, PMD_PARAM_IFACE_NAME, &get_string_arg,
+ &p.uname);
+ if (ret < 0)
+ goto out_free;
+
+ /* Interface Queue ID: Optional */
+ if (rte_kvargs_count(kvlist, PMD_PARAM_IFACE_QUEUE) == 1) {
+ ret = rte_kvargs_process(kvlist, PMD_PARAM_IFACE_QUEUE,
+ &get_int_arg, &p.txq_id);
+ if (ret < 0)
+ goto out_free;
+ } else {
+ p.txq_id = RTE_ETH_SOFTNIC_TXQ_ID_DEFAULT;
+ }
+
+ /* Dequeue Burst Size: Optional */
+ if (rte_kvargs_count(kvlist, PMD_PARAM_DEQ_BSZ) == 1) {
+ ret = rte_kvargs_process(kvlist, PMD_PARAM_DEQ_BSZ,
+ &get_int_arg, &p.deq_bsz);
+ if (ret < 0)
+ goto out_free;
+ } else {
+ p.deq_bsz = RTE_ETH_SOFTNIC_DEQ_BSZ_DEFAULT;
+ }
+
+ return rte_eth_softnic_create(&p);
+
+out_free:
+ rte_kvargs_free(kvlist);
+ return ret;
+}
+
+static int
+pmd_remove(struct rte_vdev_device *dev)
+{
+ struct rte_eth_dev *eth_dev = NULL;
+ struct pmd_internals *p;
+ struct eth_dev_ops *dev_ops;
+
+ if (!dev)
+ return -EINVAL;
+
+ RTE_LOG(INFO, PMD, "Removing device %s\n", rte_vdev_device_name(dev));
+
+ /* Find the ethdev entry */
+ eth_dev = rte_eth_dev_allocated(rte_vdev_device_name(dev));
+ if (!eth_dev)
+ return -ENODEV;
+ p = eth_dev->data->dev_private;
+ dev_ops = p->odev_ops;
+
+ pmd_eth_dev_stop(eth_dev);
+
+ /* Free device data structures*/
+ rte_free(eth_dev->data->dev_private);
+ rte_free(eth_dev->data->tx_queues);
+ rte_free(eth_dev->data);
+ rte_free(dev_ops);
+ rte_eth_dev_release_port(eth_dev);
+
+ return 0;
+}
+
+static struct rte_vdev_driver pmd_drv = {
+ .probe = pmd_probe,
+ .remove = pmd_remove,
+};
+
+RTE_PMD_REGISTER_VDEV(net_softnic, pmd_drv);
+RTE_PMD_REGISTER_PARAM_STRING(net_softnic,
+ PMD_PARAM_IFACE_NAME "=<string> "
+ PMD_PARAM_IFACE_QUEUE "=<int> "
+ PMD_PARAM_DEQ_BSZ "=<int>");
diff --git a/drivers/net/softnic/rte_eth_softnic.h b/drivers/net/softnic/rte_eth_softnic.h
new file mode 100644
index 0000000..f2a21ab
--- /dev/null
+++ b/drivers/net/softnic/rte_eth_softnic.h
@@ -0,0 +1,99 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright(c) 2017 Intel Corporation. All rights reserved.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef __INCLUDE_RTE_ETH_SOFTNIC_H__
+#define __INCLUDE_RTE_ETH_SOFTNIC_H__
+
+#include <stdint.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#ifndef RTE_ETH_SOFTNIC_TXQ_ID_DEFAULT
+#define RTE_ETH_SOFTNIC_TXQ_ID_DEFAULT 0
+#endif
+
+#ifndef RTE_ETH_SOFTNIC_DEQ_BSZ_MAX
+#define RTE_ETH_SOFTNIC_DEQ_BSZ_MAX 256
+#endif
+
+#ifndef RTE_ETH_SOFTNIC_DEQ_BSZ_DEFAULT
+#define RTE_ETH_SOFTNIC_DEQ_BSZ_DEFAULT 24
+#endif
+
+struct rte_eth_softnic_params {
+	/** Name of the overlay network interface (to be created) */
+	const char *oname;
+
+	/** Name of the underlay network interface (existing) */
+	const char *uname;
+
+	/** TX queue ID for the underlay device */
+	uint32_t txq_id;
+
+	/** Dequeue burst size */
+	uint32_t deq_bsz;
+};
+
+/**
+ * Create a new overlay device
+ *
+ * @param params
+ *   A pointer to a struct rte_eth_softnic_params that contains
+ *   all the arguments needed to create the overlay device.
+ * @return
+ *   0 if the device is created successfully, -1 on error.
+ */
+int
+rte_eth_softnic_create(struct rte_eth_softnic_params *params);
+
+/**
+ * Run the traffic management on the overlay device
+ *
+ * This function dequeues the scheduled packets from the scheduler
+ * and transmits them on the underlay device interface.
+ *
+ * @param port_id
+ *   Port id of the overlay device.
+ * @return
+ *   0.
+ */
+int
+rte_eth_softnic_run(uint8_t port_id);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* __INCLUDE_RTE_ETH_SOFTNIC_H__ */
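For illustration only, a minimal sketch of how an application could consume
this public API: the underlay name and the name-to-port-id lookup through
rte_eth_dev_get_port_by_name() are placeholders/assumptions, not part of
this patch.

#include <stdint.h>

#include <rte_ethdev.h>

#include "rte_eth_softnic.h"

static int
app_softnic_setup(void)
{
	struct rte_eth_softnic_params params = {
		.oname = "net_softnic0",	/* overlay vdev to be created */
		.uname = "0000:04:00.1",	/* existing underlay port (placeholder) */
		.txq_id = RTE_ETH_SOFTNIC_TXQ_ID_DEFAULT,
		.deq_bsz = RTE_ETH_SOFTNIC_DEQ_BSZ_DEFAULT,
	};
	uint8_t oport_id;

	/* Create the overlay (softnic) device on top of the underlay port */
	if (rte_eth_softnic_create(&params) != 0)
		return -1;

	/* Assumption: resolve the overlay port id by name through ethdev */
	if (rte_eth_dev_get_port_by_name(params.oname, &oport_id) != 0)
		return -1;

	/* Call periodically (e.g. from the TX core loop) to dequeue the
	 * scheduled packets and transmit them on the underlay device.
	 */
	rte_eth_softnic_run(oport_id);

	return 0;
}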
diff --git a/drivers/net/softnic/rte_eth_softnic_default.c b/drivers/net/softnic/rte_eth_softnic_default.c
new file mode 100644
index 0000000..138e8a1
--- /dev/null
+++ b/drivers/net/softnic/rte_eth_softnic_default.c
@@ -0,0 +1,1108 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright(c) 2017 Intel Corporation. All rights reserved.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdint.h>
+
+#include <rte_ethdev.h>
+
+#include "rte_eth_softnic_internals.h"
+#include "rte_eth_softnic.h"
+
+#define UDEV(odev) ({ \
+ ((struct pmd_internals *)((odev)->data->dev_private))->udev; \
+})
+
+static int
+pmd_dev_configure(struct rte_eth_dev *dev)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->dev_configure(dev);
+}
+
+static int
+pmd_dev_start(struct rte_eth_dev *dev)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->dev_start(dev);
+}
+
+static void
+pmd_dev_stop(struct rte_eth_dev *dev)
+{
+ dev = UDEV(dev);
+
+ dev->dev_ops->dev_stop(dev);
+}
+
+static int
+pmd_dev_set_link_up(struct rte_eth_dev *dev)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->dev_set_link_up(dev);
+}
+
+static int
+pmd_dev_set_link_down(struct rte_eth_dev *dev)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->dev_set_link_down(dev);
+}
+
+static void
+pmd_dev_close(struct rte_eth_dev *dev)
+{
+ dev = UDEV(dev);
+
+ dev->dev_ops->dev_close(dev);
+}
+
+static void
+pmd_promiscuous_enable(struct rte_eth_dev *dev)
+{
+ dev = UDEV(dev);
+
+ dev->dev_ops->promiscuous_enable(dev);
+}
+
+static void
+pmd_promiscuous_disable(struct rte_eth_dev *dev)
+{
+ dev = UDEV(dev);
+
+ dev->dev_ops->promiscuous_disable(dev);
+}
+
+static void
+pmd_allmulticast_enable(struct rte_eth_dev *dev)
+{
+ dev = UDEV(dev);
+
+ dev->dev_ops->allmulticast_enable(dev);
+}
+
+static void
+pmd_allmulticast_disable(struct rte_eth_dev *dev)
+{
+ dev = UDEV(dev);
+
+ dev->dev_ops->allmulticast_disable(dev);
+}
+
+static int
+pmd_link_update(struct rte_eth_dev *dev, int wait_to_complete)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->link_update(dev, wait_to_complete);
+}
+
+static void
+pmd_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *stats)
+{
+	dev = UDEV(dev);
+
+	dev->dev_ops->stats_get(dev, stats);
+}
+
+static void
+pmd_stats_reset(struct rte_eth_dev *dev)
+{
+ dev = UDEV(dev);
+
+ dev->dev_ops->stats_reset(dev);
+}
+
+static int
+pmd_xstats_get(struct rte_eth_dev *dev,
+ struct rte_eth_xstat *stats, unsigned int n)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->xstats_get(dev, stats, n);
+}
+
+static int
+pmd_xstats_get_by_id(struct rte_eth_dev *dev,
+ const uint64_t *ids, uint64_t *values, unsigned int n)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->xstats_get_by_id(dev, ids, values, n);
+}
+
+static void
+pmd_xstats_reset(struct rte_eth_dev *dev)
+{
+ dev = UDEV(dev);
+
+ dev->dev_ops->xstats_reset(dev);
+}
+
+static int
+pmd_xstats_get_names(struct rte_eth_dev *dev,
+ struct rte_eth_xstat_name *xstats_names, unsigned int size)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->xstats_get_names(dev, xstats_names, size);
+}
+
+static int
+pmd_xstats_get_names_by_id(struct rte_eth_dev *dev,
+ struct rte_eth_xstat_name *xstats_names, const uint64_t *ids,
+ unsigned int size)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->xstats_get_names_by_id(dev, xstats_names,
+ ids, size);
+}
+
+static int
+pmd_queue_stats_mapping_set(struct rte_eth_dev *dev,
+ uint16_t queue_id, uint8_t stat_idx, uint8_t is_rx)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->queue_stats_mapping_set(dev, queue_id,
+ stat_idx, is_rx);
+}
+
+static void
+pmd_dev_infos_get(struct rte_eth_dev *dev,
+ struct rte_eth_dev_info *dev_info)
+{
+ dev = UDEV(dev);
+
+ dev->dev_ops->dev_infos_get(dev, dev_info);
+}
+
+static const uint32_t *
+pmd_dev_supported_ptypes_get(struct rte_eth_dev *dev)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->dev_supported_ptypes_get(dev);
+}
+
+static int
+pmd_rx_queue_start(struct rte_eth_dev *dev, uint16_t queue_id)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->rx_queue_start(dev, queue_id);
+}
+
+static int
+pmd_rx_queue_stop(struct rte_eth_dev *dev, uint16_t queue_id)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->rx_queue_stop(dev, queue_id);
+}
+
+static int
+pmd_tx_queue_start(struct rte_eth_dev *dev, uint16_t queue_id)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->tx_queue_start(dev, queue_id);
+}
+
+static int
+pmd_tx_queue_stop(struct rte_eth_dev *dev, uint16_t queue_id)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->tx_queue_stop(dev, queue_id);
+}
+
+static int
+pmd_rx_queue_setup(struct rte_eth_dev *dev,
+ uint16_t rx_queue_id, uint16_t nb_rx_desc, unsigned int socket_id,
+ const struct rte_eth_rxconf *rx_conf, struct rte_mempool *mb_pool)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->rx_queue_setup(dev, rx_queue_id, nb_rx_desc,
+ socket_id, rx_conf, mb_pool);
+}
+
+static int
+pmd_tx_queue_setup(struct rte_eth_dev *dev,
+ uint16_t tx_queue_id, uint16_t nb_tx_desc, unsigned int socket_id,
+ const struct rte_eth_txconf *tx_conf)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->tx_queue_setup(dev, tx_queue_id, nb_tx_desc,
+ socket_id, tx_conf);
+}
+
+static int
+pmd_rx_queue_intr_enable(struct rte_eth_dev *dev, uint16_t rx_queue_id)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->rx_queue_intr_enable(dev, rx_queue_id);
+}
+
+static int
+pmd_rx_queue_intr_disable(struct rte_eth_dev *dev, uint16_t rx_queue_id)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->rx_queue_intr_disable(dev, rx_queue_id);
+}
+
+static uint32_t
+pmd_rx_queue_count(struct rte_eth_dev *dev, uint16_t rx_queue_id)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->rx_queue_count(dev, rx_queue_id);
+}
+
+static int
+pmd_fw_version_get(struct rte_eth_dev *dev,
+ char *fw_version, size_t fw_size)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->fw_version_get(dev, fw_version, fw_size);
+}
+
+static void
+pmd_rxq_info_get(struct rte_eth_dev *dev,
+ uint16_t rx_queue_id, struct rte_eth_rxq_info *qinfo)
+{
+ dev = UDEV(dev);
+
+ dev->dev_ops->rxq_info_get(dev, rx_queue_id, qinfo);
+}
+
+static void
+pmd_txq_info_get(struct rte_eth_dev *dev,
+ uint16_t tx_queue_id, struct rte_eth_txq_info *qinfo)
+{
+ dev = UDEV(dev);
+
+ dev->dev_ops->txq_info_get(dev, tx_queue_id, qinfo);
+}
+
+static int
+pmd_mtu_set(struct rte_eth_dev *dev, uint16_t mtu)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->mtu_set(dev, mtu);
+}
+
+static int
+pmd_vlan_filter_set(struct rte_eth_dev *dev, uint16_t vlan_id, int on)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->vlan_filter_set(dev, vlan_id, on);
+}
+
+static int
+pmd_vlan_tpid_set(struct rte_eth_dev *dev,
+ enum rte_vlan_type type, uint16_t tpid)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->vlan_tpid_set(dev, type, tpid);
+}
+
+static void
+pmd_vlan_offload_set(struct rte_eth_dev *dev, int mask)
+{
+ dev = UDEV(dev);
+
+ dev->dev_ops->vlan_offload_set(dev, mask);
+}
+
+static int
+pmd_vlan_pvid_set(struct rte_eth_dev *dev, uint16_t vlan_id, int on)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->vlan_pvid_set(dev, vlan_id, on);
+}
+
+static void
+pmd_vlan_strip_queue_set(struct rte_eth_dev *dev,
+ uint16_t rx_queue_id, int on)
+{
+ dev = UDEV(dev);
+
+ dev->dev_ops->vlan_strip_queue_set(dev, rx_queue_id, on);
+}
+
+static int
+pmd_flow_ctrl_get(struct rte_eth_dev *dev, struct rte_eth_fc_conf *fc_conf)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->flow_ctrl_get(dev, fc_conf);
+}
+
+static int
+pmd_flow_ctrl_set(struct rte_eth_dev *dev, struct rte_eth_fc_conf *fc_conf)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->flow_ctrl_set(dev, fc_conf);
+}
+
+static int
+pmd_priority_flow_ctrl_set(struct rte_eth_dev *dev,
+ struct rte_eth_pfc_conf *pfc_conf)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->priority_flow_ctrl_set(dev, pfc_conf);
+}
+
+static int
+pmd_reta_update(struct rte_eth_dev *dev,
+ struct rte_eth_rss_reta_entry64 *reta_conf, uint16_t reta_size)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->reta_update(dev, reta_conf, reta_size);
+}
+
+static int
+pmd_reta_query(struct rte_eth_dev *dev,
+ struct rte_eth_rss_reta_entry64 *reta_conf, uint16_t reta_size)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->reta_query(dev, reta_conf, reta_size);
+}
+
+static int
+pmd_rss_hash_update(struct rte_eth_dev *dev,
+ struct rte_eth_rss_conf *rss_conf)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->rss_hash_update(dev, rss_conf);
+}
+
+static int
+pmd_rss_hash_conf_get(struct rte_eth_dev *dev,
+ struct rte_eth_rss_conf *rss_conf)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->rss_hash_conf_get(dev, rss_conf);
+}
+
+static int
+pmd_dev_led_on(struct rte_eth_dev *dev)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->dev_led_on(dev);
+}
+
+static int
+pmd_dev_led_off(struct rte_eth_dev *dev)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->dev_led_off(dev);
+}
+
+static void
+pmd_mac_addr_remove(struct rte_eth_dev *dev, uint32_t index)
+{
+ dev = UDEV(dev);
+
+ dev->dev_ops->mac_addr_remove(dev, index);
+}
+
+static int
+pmd_mac_addr_add(struct rte_eth_dev *dev,
+ struct ether_addr *mac_addr, uint32_t index, uint32_t vmdq)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->mac_addr_add(dev, mac_addr, index, vmdq);
+}
+
+static void
+pmd_mac_addr_set(struct rte_eth_dev *dev, struct ether_addr *mac_addr)
+{
+ dev = UDEV(dev);
+
+ dev->dev_ops->mac_addr_set(dev, mac_addr);
+}
+
+static int
+pmd_uc_hash_table_set(struct rte_eth_dev *dev,
+ struct ether_addr *mac_addr, uint8_t on)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->uc_hash_table_set(dev, mac_addr, on);
+}
+
+static int
+pmd_uc_all_hash_table_set(struct rte_eth_dev *dev, uint8_t on)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->uc_all_hash_table_set(dev, on);
+}
+
+static int
+pmd_set_queue_rate_limit(struct rte_eth_dev *dev,
+ uint16_t queue_idx, uint16_t tx_rate)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->set_queue_rate_limit(dev, queue_idx, tx_rate);
+}
+
+static int
+pmd_mirror_rule_set(struct rte_eth_dev *dev,
+ struct rte_eth_mirror_conf *mirror_conf, uint8_t rule_id, uint8_t on)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->mirror_rule_set(dev, mirror_conf, rule_id, on);
+}
+
+static int
+pmd_mirror_rule_reset(struct rte_eth_dev *dev, uint8_t rule_id)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->mirror_rule_reset(dev, rule_id);
+}
+
+static int
+pmd_udp_tunnel_port_add(struct rte_eth_dev *dev,
+ struct rte_eth_udp_tunnel *tunnel_udp)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->udp_tunnel_port_add(dev, tunnel_udp);
+}
+
+static int
+pmd_udp_tunnel_port_del(struct rte_eth_dev *dev,
+ struct rte_eth_udp_tunnel *tunnel_udp)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->udp_tunnel_port_del(dev, tunnel_udp);
+}
+
+static int
+pmd_set_mc_addr_list(struct rte_eth_dev *dev,
+ struct ether_addr *mc_addr_set, uint32_t nb_mc_addr)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->set_mc_addr_list(dev, mc_addr_set, nb_mc_addr);
+}
+
+static int
+pmd_timesync_enable(struct rte_eth_dev *dev)
+{
+	dev = UDEV(dev);
+
+	return dev->dev_ops->timesync_enable(dev);
+}
+
+static int
+pmd_timesync_disable(struct rte_eth_dev *dev)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->timesync_disable(dev);
+}
+
+static int
+pmd_timesync_read_rx_timestamp(struct rte_eth_dev *dev,
+ struct timespec *timestamp, uint32_t flags)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->timesync_read_rx_timestamp(dev, timestamp, flags);
+}
+
+static int
+pmd_timesync_read_tx_timestamp(struct rte_eth_dev *dev,
+ struct timespec *timestamp)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->timesync_read_tx_timestamp(dev, timestamp);
+}
+
+static int
+pmd_timesync_adjust_time(struct rte_eth_dev *dev, int64_t delta)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->timesync_adjust_time(dev, delta);
+}
+
+static int
+pmd_timesync_read_time(struct rte_eth_dev *dev, struct timespec *timestamp)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->timesync_read_time(dev, timestamp);
+}
+
+static int
+pmd_timesync_write_time(struct rte_eth_dev *dev,
+ const struct timespec *timestamp)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->timesync_write_time(dev, timestamp);
+}
+
+static int
+pmd_get_reg(struct rte_eth_dev *dev, struct rte_dev_reg_info *info)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->get_reg(dev, info);
+}
+
+static int
+pmd_get_eeprom_length(struct rte_eth_dev *dev)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->get_eeprom_length(dev);
+}
+
+static int
+pmd_get_eeprom(struct rte_eth_dev *dev, struct rte_dev_eeprom_info *info)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->get_eeprom(dev, info);
+}
+
+static int
+pmd_set_eeprom(struct rte_eth_dev *dev, struct rte_dev_eeprom_info *info)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->set_eeprom(dev, info);
+}
+
+static int
+pmd_l2_tunnel_eth_type_conf(struct rte_eth_dev *dev,
+ struct rte_eth_l2_tunnel_conf *l2_tunnel)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->l2_tunnel_eth_type_conf(dev, l2_tunnel);
+}
+
+static int
+pmd_l2_tunnel_offload_set(struct rte_eth_dev *dev,
+ struct rte_eth_l2_tunnel_conf *l2_tunnel,
+ uint32_t mask,
+ uint8_t en)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->l2_tunnel_offload_set(dev, l2_tunnel, mask, en);
+}
+
+#ifdef RTE_NIC_BYPASS
+
+static void
+pmd_bypass_init(struct rte_eth_dev *dev)
+{
+ dev = UDEV(dev);
+
+ dev->dev_ops->bypass_init(dev);
+}
+
+static int32_t
+pmd_bypass_state_set(struct rte_eth_dev *dev, uint32_t *new_state)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->bypass_state_set(dev, new_state);
+}
+
+static int32_t
+pmd_bypass_state_show(struct rte_eth_dev *dev, uint32_t *state)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->bypass_state_show(dev, state);
+}
+
+static int32_t
+pmd_bypass_event_set(struct rte_eth_dev *dev,
+ uint32_t state, uint32_t event)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->bypass_event_set(dev, state, event);
+}
+
+static int32_t
+pmd_bypass_event_show(struct rte_eth_dev *dev,
+ uint32_t event_shift, uint32_t *event)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->bypass_event_show(dev, event_shift, event);
+}
+
+static int32_t
+pmd_bypass_wd_timeout_set(struct rte_eth_dev *dev, uint32_t timeout)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->bypass_wd_timeout_set(dev, timeout);
+}
+
+static int32_t
+pmd_bypass_wd_timeout_show(struct rte_eth_dev *dev, uint32_t *wd_timeout)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->bypass_wd_timeout_show(dev, wd_timeout);
+}
+
+static int32_t
+pmd_bypass_ver_show(struct rte_eth_dev *dev, uint32_t *ver)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->bypass_ver_show(dev, ver);
+}
+
+static int32_t
+pmd_bypass_wd_reset(struct rte_eth_dev *dev)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->bypass_wd_reset(dev);
+}
+
+#endif
+
+static int
+pmd_filter_ctrl(struct rte_eth_dev *dev, enum rte_filter_type filter_type,
+ enum rte_filter_op filter_op, void *arg)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->filter_ctrl(dev, filter_type, filter_op, arg);
+}
+
+static int
+pmd_get_dcb_info(struct rte_eth_dev *dev,
+ struct rte_eth_dcb_info *dcb_info)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->get_dcb_info(dev, dcb_info);
+}
+
+static int
+pmd_tm_ops_get(struct rte_eth_dev *dev, void *ops)
+{
+ dev = UDEV(dev);
+
+ return dev->dev_ops->tm_ops_get(dev, ops);
+}
+
+static const struct eth_dev_ops pmd_ops_default = {
+ .dev_configure = pmd_dev_configure,
+ .dev_start = pmd_dev_start,
+ .dev_stop = pmd_dev_stop,
+ .dev_set_link_up = pmd_dev_set_link_up,
+ .dev_set_link_down = pmd_dev_set_link_down,
+ .dev_close = pmd_dev_close,
+ .link_update = pmd_link_update,
+
+ .promiscuous_enable = pmd_promiscuous_enable,
+ .promiscuous_disable = pmd_promiscuous_disable,
+ .allmulticast_enable = pmd_allmulticast_enable,
+ .allmulticast_disable = pmd_allmulticast_disable,
+ .mac_addr_remove = pmd_mac_addr_remove,
+ .mac_addr_add = pmd_mac_addr_add,
+ .mac_addr_set = pmd_mac_addr_set,
+ .set_mc_addr_list = pmd_set_mc_addr_list,
+ .mtu_set = pmd_mtu_set,
+
+ .stats_get = pmd_stats_get,
+ .stats_reset = pmd_stats_reset,
+ .xstats_get = pmd_xstats_get,
+ .xstats_reset = pmd_xstats_reset,
+ .xstats_get_names = pmd_xstats_get_names,
+ .queue_stats_mapping_set = pmd_queue_stats_mapping_set,
+
+ .dev_infos_get = pmd_dev_infos_get,
+ .rxq_info_get = pmd_rxq_info_get,
+ .txq_info_get = pmd_txq_info_get,
+ .fw_version_get = pmd_fw_version_get,
+ .dev_supported_ptypes_get = pmd_dev_supported_ptypes_get,
+
+ .vlan_filter_set = pmd_vlan_filter_set,
+ .vlan_tpid_set = pmd_vlan_tpid_set,
+ .vlan_strip_queue_set = pmd_vlan_strip_queue_set,
+ .vlan_offload_set = pmd_vlan_offload_set,
+ .vlan_pvid_set = pmd_vlan_pvid_set,
+
+ .rx_queue_start = pmd_rx_queue_start,
+ .rx_queue_stop = pmd_rx_queue_stop,
+ .tx_queue_start = pmd_tx_queue_start,
+ .tx_queue_stop = pmd_tx_queue_stop,
+ .rx_queue_setup = pmd_rx_queue_setup,
+ .rx_queue_release = NULL,
+ .rx_queue_count = pmd_rx_queue_count,
+ .rx_descriptor_done = NULL,
+ .rx_descriptor_status = NULL,
+ .tx_descriptor_status = NULL,
+ .rx_queue_intr_enable = pmd_rx_queue_intr_enable,
+ .rx_queue_intr_disable = pmd_rx_queue_intr_disable,
+ .tx_queue_setup = pmd_tx_queue_setup,
+ .tx_queue_release = NULL,
+ .tx_done_cleanup = NULL,
+
+ .dev_led_on = pmd_dev_led_on,
+ .dev_led_off = pmd_dev_led_off,
+
+ .flow_ctrl_get = pmd_flow_ctrl_get,
+ .flow_ctrl_set = pmd_flow_ctrl_set,
+ .priority_flow_ctrl_set = pmd_priority_flow_ctrl_set,
+
+ .uc_hash_table_set = pmd_uc_hash_table_set,
+ .uc_all_hash_table_set = pmd_uc_all_hash_table_set,
+
+ .mirror_rule_set = pmd_mirror_rule_set,
+ .mirror_rule_reset = pmd_mirror_rule_reset,
+
+ .udp_tunnel_port_add = pmd_udp_tunnel_port_add,
+ .udp_tunnel_port_del = pmd_udp_tunnel_port_del,
+ .l2_tunnel_eth_type_conf = pmd_l2_tunnel_eth_type_conf,
+ .l2_tunnel_offload_set = pmd_l2_tunnel_offload_set,
+
+ .set_queue_rate_limit = pmd_set_queue_rate_limit,
+
+ .rss_hash_update = pmd_rss_hash_update,
+ .rss_hash_conf_get = pmd_rss_hash_conf_get,
+ .reta_update = pmd_reta_update,
+ .reta_query = pmd_reta_query,
+
+ .get_reg = pmd_get_reg,
+ .get_eeprom_length = pmd_get_eeprom_length,
+ .get_eeprom = pmd_get_eeprom,
+ .set_eeprom = pmd_set_eeprom,
+
+#ifdef RTE_NIC_BYPASS
+ .bypass_init = pmd_bypass_init,
+ .bypass_state_set = pmd_bypass_state_set,
+ .bypass_state_show = pmd_bypass_state_show,
+ .bypass_event_set = pmd_bypass_event_set,
+ .bypass_event_show = pmd_bypass_event_show,
+ .bypass_wd_timeout_set = pmd_bypass_wd_timeout_set,
+ .bypass_wd_timeout_show = pmd_bypass_wd_timeout_show,
+ .bypass_ver_show = pmd_bypass_ver_show,
+ .bypass_wd_reset = pmd_bypass_wd_reset,
+#endif
+
+ .filter_ctrl = pmd_filter_ctrl,
+
+ .get_dcb_info = pmd_get_dcb_info,
+
+ .timesync_enable = pmd_timesync_enable,
+ .timesync_disable = pmd_timesync_disable,
+ .timesync_read_rx_timestamp = pmd_timesync_read_rx_timestamp,
+ .timesync_read_tx_timestamp = pmd_timesync_read_tx_timestamp,
+ .timesync_adjust_time = pmd_timesync_adjust_time,
+ .timesync_read_time = pmd_timesync_read_time,
+ .timesync_write_time = pmd_timesync_write_time,
+
+ .xstats_get_by_id = pmd_xstats_get_by_id,
+ .xstats_get_names_by_id = pmd_xstats_get_names_by_id,
+
+ .tm_ops_get = pmd_tm_ops_get,
+};
+
+#define CHECK_AND_SET_NULL(o, u, func) ({ \
+ if (!(u)->func) \
+ (o)->func = NULL; \
+})
+
+#define CHECK_AND_SET_NONNULL(o, u, func) ({ \
+ if ((u)->func) \
+ (o)->func = (u)->func; \
+})
+
+static void
+pmd_ops_check_and_set_null(struct eth_dev_ops *o,
+ const struct eth_dev_ops *u)
+{
+ CHECK_AND_SET_NULL(o, u, dev_configure);
+ CHECK_AND_SET_NULL(o, u, dev_start);
+ CHECK_AND_SET_NULL(o, u, dev_stop);
+ CHECK_AND_SET_NULL(o, u, dev_set_link_up);
+ CHECK_AND_SET_NULL(o, u, dev_set_link_down);
+ CHECK_AND_SET_NULL(o, u, dev_close);
+ CHECK_AND_SET_NULL(o, u, link_update);
+ CHECK_AND_SET_NULL(o, u, promiscuous_enable);
+ CHECK_AND_SET_NULL(o, u, promiscuous_disable);
+ CHECK_AND_SET_NULL(o, u, allmulticast_enable);
+ CHECK_AND_SET_NULL(o, u, allmulticast_disable);
+ CHECK_AND_SET_NULL(o, u, mac_addr_remove);
+ CHECK_AND_SET_NULL(o, u, mac_addr_add);
+ CHECK_AND_SET_NULL(o, u, mac_addr_set);
+ CHECK_AND_SET_NULL(o, u, set_mc_addr_list);
+ CHECK_AND_SET_NULL(o, u, mtu_set);
+ CHECK_AND_SET_NULL(o, u, stats_get);
+ CHECK_AND_SET_NULL(o, u, stats_reset);
+ CHECK_AND_SET_NULL(o, u, xstats_get);
+ CHECK_AND_SET_NULL(o, u, xstats_reset);
+ CHECK_AND_SET_NULL(o, u, xstats_get_names);
+ CHECK_AND_SET_NULL(o, u, queue_stats_mapping_set);
+ CHECK_AND_SET_NULL(o, u, dev_infos_get);
+ CHECK_AND_SET_NULL(o, u, rxq_info_get);
+ CHECK_AND_SET_NULL(o, u, txq_info_get);
+ CHECK_AND_SET_NULL(o, u, fw_version_get);
+ CHECK_AND_SET_NULL(o, u, dev_supported_ptypes_get);
+ CHECK_AND_SET_NULL(o, u, vlan_filter_set);
+ CHECK_AND_SET_NULL(o, u, vlan_tpid_set);
+ CHECK_AND_SET_NULL(o, u, vlan_strip_queue_set);
+ CHECK_AND_SET_NULL(o, u, vlan_offload_set);
+ CHECK_AND_SET_NULL(o, u, vlan_pvid_set);
+ CHECK_AND_SET_NULL(o, u, rx_queue_start);
+ CHECK_AND_SET_NULL(o, u, rx_queue_stop);
+ CHECK_AND_SET_NULL(o, u, tx_queue_start);
+ CHECK_AND_SET_NULL(o, u, tx_queue_stop);
+ CHECK_AND_SET_NULL(o, u, rx_queue_setup);
+ CHECK_AND_SET_NULL(o, u, rx_queue_release);
+ CHECK_AND_SET_NULL(o, u, rx_queue_count);
+ CHECK_AND_SET_NULL(o, u, rx_descriptor_done);
+ CHECK_AND_SET_NULL(o, u, rx_descriptor_status);
+ CHECK_AND_SET_NULL(o, u, tx_descriptor_status);
+ CHECK_AND_SET_NULL(o, u, rx_queue_intr_enable);
+ CHECK_AND_SET_NULL(o, u, rx_queue_intr_disable);
+ CHECK_AND_SET_NULL(o, u, tx_queue_setup);
+ CHECK_AND_SET_NULL(o, u, tx_queue_release);
+ CHECK_AND_SET_NULL(o, u, tx_done_cleanup);
+ CHECK_AND_SET_NULL(o, u, dev_led_on);
+ CHECK_AND_SET_NULL(o, u, dev_led_off);
+ CHECK_AND_SET_NULL(o, u, flow_ctrl_get);
+ CHECK_AND_SET_NULL(o, u, flow_ctrl_set);
+ CHECK_AND_SET_NULL(o, u, priority_flow_ctrl_set);
+ CHECK_AND_SET_NULL(o, u, uc_hash_table_set);
+ CHECK_AND_SET_NULL(o, u, uc_all_hash_table_set);
+ CHECK_AND_SET_NULL(o, u, mirror_rule_set);
+ CHECK_AND_SET_NULL(o, u, mirror_rule_reset);
+ CHECK_AND_SET_NULL(o, u, udp_tunnel_port_add);
+ CHECK_AND_SET_NULL(o, u, udp_tunnel_port_del);
+ CHECK_AND_SET_NULL(o, u, l2_tunnel_eth_type_conf);
+ CHECK_AND_SET_NULL(o, u, l2_tunnel_offload_set);
+ CHECK_AND_SET_NULL(o, u, set_queue_rate_limit);
+ CHECK_AND_SET_NULL(o, u, rss_hash_update);
+ CHECK_AND_SET_NULL(o, u, rss_hash_conf_get);
+ CHECK_AND_SET_NULL(o, u, reta_update);
+ CHECK_AND_SET_NULL(o, u, reta_query);
+ CHECK_AND_SET_NULL(o, u, get_reg);
+ CHECK_AND_SET_NULL(o, u, get_eeprom_length);
+ CHECK_AND_SET_NULL(o, u, get_eeprom);
+ CHECK_AND_SET_NULL(o, u, set_eeprom);
+
+ #ifdef RTE_NIC_BYPASS
+
+ CHECK_AND_SET_NULL(o, u, bypass_init);
+ CHECK_AND_SET_NULL(o, u, bypass_state_set);
+ CHECK_AND_SET_NULL(o, u, bypass_state_show);
+ CHECK_AND_SET_NULL(o, u, bypass_event_set);
+ CHECK_AND_SET_NULL(o, u, bypass_event_show);
+ CHECK_AND_SET_NULL(o, u, bypass_wd_timeout_set);
+ CHECK_AND_SET_NULL(o, u, bypass_wd_timeout_show);
+ CHECK_AND_SET_NULL(o, u, bypass_ver_show);
+ CHECK_AND_SET_NULL(o, u, bypass_wd_reset);
+
+ #endif
+
+ CHECK_AND_SET_NULL(o, u, filter_ctrl);
+ CHECK_AND_SET_NULL(o, u, get_dcb_info);
+ CHECK_AND_SET_NULL(o, u, timesync_enable);
+ CHECK_AND_SET_NULL(o, u, timesync_disable);
+ CHECK_AND_SET_NULL(o, u, timesync_read_rx_timestamp);
+ CHECK_AND_SET_NULL(o, u, timesync_read_tx_timestamp);
+ CHECK_AND_SET_NULL(o, u, timesync_adjust_time);
+ CHECK_AND_SET_NULL(o, u, timesync_read_time);
+ CHECK_AND_SET_NULL(o, u, timesync_write_time);
+ CHECK_AND_SET_NULL(o, u, xstats_get_by_id);
+ CHECK_AND_SET_NULL(o, u, xstats_get_names_by_id);
+ CHECK_AND_SET_NULL(o, u, tm_ops_get);
+}
+
+void
+pmd_ops_inherit(struct eth_dev_ops *o, const struct eth_dev_ops *u)
+{
+ /* Rules:
+ * 1. u->func == NULL => o->func = NULL
+ * 2. u->func != NULL => o->func = pmd_ops_default.func
+ * 3. queue related func => o->func = u->func
+ */
+
+ memcpy(o, &pmd_ops_default, sizeof(struct eth_dev_ops));
+ pmd_ops_check_and_set_null(o, u);
+
+ /* Copy queue related functions */
+ o->rx_queue_release = u->rx_queue_release;
+ o->tx_queue_release = u->tx_queue_release;
+ o->rx_descriptor_done = u->rx_descriptor_done;
+ o->rx_descriptor_status = u->rx_descriptor_status;
+ o->tx_descriptor_status = u->tx_descriptor_status;
+ o->tx_done_cleanup = u->tx_done_cleanup;
+}
+
+void
+pmd_ops_derive(struct eth_dev_ops *o, const struct eth_dev_ops *u)
+{
+ CHECK_AND_SET_NONNULL(o, u, dev_configure);
+ CHECK_AND_SET_NONNULL(o, u, dev_start);
+ CHECK_AND_SET_NONNULL(o, u, dev_stop);
+ CHECK_AND_SET_NONNULL(o, u, dev_set_link_up);
+ CHECK_AND_SET_NONNULL(o, u, dev_set_link_down);
+ CHECK_AND_SET_NONNULL(o, u, dev_close);
+ CHECK_AND_SET_NONNULL(o, u, link_update);
+ CHECK_AND_SET_NONNULL(o, u, promiscuous_enable);
+ CHECK_AND_SET_NONNULL(o, u, promiscuous_disable);
+ CHECK_AND_SET_NONNULL(o, u, allmulticast_enable);
+ CHECK_AND_SET_NONNULL(o, u, allmulticast_disable);
+ CHECK_AND_SET_NONNULL(o, u, mac_addr_remove);
+ CHECK_AND_SET_NONNULL(o, u, mac_addr_add);
+ CHECK_AND_SET_NONNULL(o, u, mac_addr_set);
+ CHECK_AND_SET_NONNULL(o, u, set_mc_addr_list);
+ CHECK_AND_SET_NONNULL(o, u, mtu_set);
+ CHECK_AND_SET_NONNULL(o, u, stats_get);
+ CHECK_AND_SET_NONNULL(o, u, stats_reset);
+ CHECK_AND_SET_NONNULL(o, u, xstats_get);
+ CHECK_AND_SET_NONNULL(o, u, xstats_reset);
+ CHECK_AND_SET_NONNULL(o, u, xstats_get_names);
+ CHECK_AND_SET_NONNULL(o, u, queue_stats_mapping_set);
+ CHECK_AND_SET_NONNULL(o, u, dev_infos_get);
+ CHECK_AND_SET_NONNULL(o, u, rxq_info_get);
+ CHECK_AND_SET_NONNULL(o, u, txq_info_get);
+ CHECK_AND_SET_NONNULL(o, u, fw_version_get);
+ CHECK_AND_SET_NONNULL(o, u, dev_supported_ptypes_get);
+ CHECK_AND_SET_NONNULL(o, u, vlan_filter_set);
+ CHECK_AND_SET_NONNULL(o, u, vlan_tpid_set);
+ CHECK_AND_SET_NONNULL(o, u, vlan_strip_queue_set);
+ CHECK_AND_SET_NONNULL(o, u, vlan_offload_set);
+ CHECK_AND_SET_NONNULL(o, u, vlan_pvid_set);
+ CHECK_AND_SET_NONNULL(o, u, rx_queue_start);
+ CHECK_AND_SET_NONNULL(o, u, rx_queue_stop);
+ CHECK_AND_SET_NONNULL(o, u, tx_queue_start);
+ CHECK_AND_SET_NONNULL(o, u, tx_queue_stop);
+ CHECK_AND_SET_NONNULL(o, u, rx_queue_setup);
+ CHECK_AND_SET_NONNULL(o, u, rx_queue_release);
+ CHECK_AND_SET_NONNULL(o, u, rx_queue_count);
+ CHECK_AND_SET_NONNULL(o, u, rx_descriptor_done);
+ CHECK_AND_SET_NONNULL(o, u, rx_descriptor_status);
+ CHECK_AND_SET_NONNULL(o, u, tx_descriptor_status);
+ CHECK_AND_SET_NONNULL(o, u, rx_queue_intr_enable);
+ CHECK_AND_SET_NONNULL(o, u, rx_queue_intr_disable);
+ CHECK_AND_SET_NONNULL(o, u, tx_queue_setup);
+ CHECK_AND_SET_NONNULL(o, u, tx_queue_release);
+ CHECK_AND_SET_NONNULL(o, u, tx_done_cleanup);
+ CHECK_AND_SET_NONNULL(o, u, dev_led_on);
+ CHECK_AND_SET_NONNULL(o, u, dev_led_off);
+ CHECK_AND_SET_NONNULL(o, u, flow_ctrl_get);
+ CHECK_AND_SET_NONNULL(o, u, flow_ctrl_set);
+ CHECK_AND_SET_NONNULL(o, u, priority_flow_ctrl_set);
+ CHECK_AND_SET_NONNULL(o, u, uc_hash_table_set);
+ CHECK_AND_SET_NONNULL(o, u, uc_all_hash_table_set);
+ CHECK_AND_SET_NONNULL(o, u, mirror_rule_set);
+ CHECK_AND_SET_NONNULL(o, u, mirror_rule_reset);
+ CHECK_AND_SET_NONNULL(o, u, udp_tunnel_port_add);
+ CHECK_AND_SET_NONNULL(o, u, udp_tunnel_port_del);
+ CHECK_AND_SET_NONNULL(o, u, l2_tunnel_eth_type_conf);
+ CHECK_AND_SET_NONNULL(o, u, l2_tunnel_offload_set);
+ CHECK_AND_SET_NONNULL(o, u, set_queue_rate_limit);
+ CHECK_AND_SET_NONNULL(o, u, rss_hash_update);
+ CHECK_AND_SET_NONNULL(o, u, rss_hash_conf_get);
+ CHECK_AND_SET_NONNULL(o, u, reta_update);
+ CHECK_AND_SET_NONNULL(o, u, reta_query);
+ CHECK_AND_SET_NONNULL(o, u, get_reg);
+ CHECK_AND_SET_NONNULL(o, u, get_eeprom_length);
+ CHECK_AND_SET_NONNULL(o, u, get_eeprom);
+ CHECK_AND_SET_NONNULL(o, u, set_eeprom);
+
+ #ifdef RTE_NIC_BYPASS
+
+ CHECK_AND_SET_NONNULL(o, u, bypass_init);
+ CHECK_AND_SET_NONNULL(o, u, bypass_state_set);
+ CHECK_AND_SET_NONNULL(o, u, bypass_state_show);
+ CHECK_AND_SET_NONNULL(o, u, bypass_event_set);
+ CHECK_AND_SET_NONNULL(o, u, bypass_event_show);
+ CHECK_AND_SET_NONNULL(o, u, bypass_wd_timeout_set);
+ CHECK_AND_SET_NONNULL(o, u, bypass_wd_timeout_show);
+ CHECK_AND_SET_NONNULL(o, u, bypass_ver_show);
+ CHECK_AND_SET_NONNULL(o, u, bypass_wd_reset);
+
+ #endif
+
+ CHECK_AND_SET_NONNULL(o, u, filter_ctrl);
+ CHECK_AND_SET_NONNULL(o, u, get_dcb_info);
+ CHECK_AND_SET_NONNULL(o, u, timesync_enable);
+ CHECK_AND_SET_NONNULL(o, u, timesync_disable);
+ CHECK_AND_SET_NONNULL(o, u, timesync_read_rx_timestamp);
+ CHECK_AND_SET_NONNULL(o, u, timesync_read_tx_timestamp);
+ CHECK_AND_SET_NONNULL(o, u, timesync_adjust_time);
+ CHECK_AND_SET_NONNULL(o, u, timesync_read_time);
+ CHECK_AND_SET_NONNULL(o, u, timesync_write_time);
+ CHECK_AND_SET_NONNULL(o, u, xstats_get_by_id);
+ CHECK_AND_SET_NONNULL(o, u, xstats_get_names_by_id);
+ CHECK_AND_SET_NONNULL(o, u, tm_ops_get);
+}
diff --git a/drivers/net/softnic/rte_eth_softnic_internals.h b/drivers/net/softnic/rte_eth_softnic_internals.h
new file mode 100644
index 0000000..d456a54
--- /dev/null
+++ b/drivers/net/softnic/rte_eth_softnic_internals.h
@@ -0,0 +1,67 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright(c) 2017 Intel Corporation. All rights reserved.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef __INCLUDE_RTE_ETH_SOFTNIC_INTERNALS_H__
+#define __INCLUDE_RTE_ETH_SOFTNIC_INTERNALS_H__
+
+#include <stdint.h>
+
+#include <rte_mbuf.h>
+#include <rte_ethdev.h>
+
+#include "rte_eth_softnic.h"
+
+struct pmd_internals {
+ /* Devices */
+ struct rte_eth_dev *odev;
+ struct rte_eth_dev *udev;
+ struct rte_eth_dev_data *odata;
+ struct rte_eth_dev_data *udata;
+ struct eth_dev_ops *odev_ops;
+ const struct eth_dev_ops *udev_ops;
+ uint8_t oport_id;
+ uint8_t uport_id;
+
+ /* Operation */
+ struct rte_mbuf *pkts[RTE_ETH_SOFTNIC_DEQ_BSZ_MAX];
+ uint32_t deq_bsz;
+ uint32_t txq_id;
+};
+
+void
+pmd_ops_inherit(struct eth_dev_ops *o, const struct eth_dev_ops *u);
+
+void
+pmd_ops_derive(struct eth_dev_ops *o, const struct eth_dev_ops *u);
+
+#endif /* __INCLUDE_RTE_ETH_SOFTNIC_INTERNALS_H__ */
diff --git a/drivers/net/softnic/rte_eth_softnic_version.map b/drivers/net/softnic/rte_eth_softnic_version.map
new file mode 100644
index 0000000..bb730e5
--- /dev/null
+++ b/drivers/net/softnic/rte_eth_softnic_version.map
@@ -0,0 +1,7 @@
+DPDK_17.08 {
+ global:
+ rte_eth_softnic_create;
+ rte_eth_softnic_run;
+
+ local: *;
+};
diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index bcaf1b3..de633cd 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -66,7 +66,6 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_PDUMP) += -lrte_pdump
_LDLIBS-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR) += -lrte_distributor
_LDLIBS-$(CONFIG_RTE_LIBRTE_IP_FRAG) += -lrte_ip_frag
_LDLIBS-$(CONFIG_RTE_LIBRTE_METER) += -lrte_meter
-_LDLIBS-$(CONFIG_RTE_LIBRTE_SCHED) += -lrte_sched
_LDLIBS-$(CONFIG_RTE_LIBRTE_LPM) += -lrte_lpm
# librte_acl needs --whole-archive because of weak functions
_LDLIBS-$(CONFIG_RTE_LIBRTE_ACL) += --whole-archive
@@ -98,6 +97,7 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_RING) += -lrte_ring
_LDLIBS-$(CONFIG_RTE_LIBRTE_EAL) += -lrte_eal
_LDLIBS-$(CONFIG_RTE_LIBRTE_CMDLINE) += -lrte_cmdline
_LDLIBS-$(CONFIG_RTE_LIBRTE_REORDER) += -lrte_reorder
+_LDLIBS-$(CONFIG_RTE_LIBRTE_SCHED) += -lrte_sched
ifeq ($(CONFIG_RTE_EXEC_ENV_LINUXAPP),y)
_LDLIBS-$(CONFIG_RTE_LIBRTE_KNI) += -lrte_kni
@@ -133,6 +133,9 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_NULL) += -lrte_pmd_null
_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_PCAP) += -lrte_pmd_pcap -lpcap
_LDLIBS-$(CONFIG_RTE_LIBRTE_QEDE_PMD) += -lrte_pmd_qede
_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_RING) += -lrte_pmd_ring
+ifeq ($(CONFIG_RTE_LIBRTE_SCHED),y)
+_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_SOFTNIC) += -lrte_pmd_softnic
+endif
_LDLIBS-$(CONFIG_RTE_LIBRTE_SFC_EFX_PMD) += -lrte_pmd_sfc_efx
_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_SZEDATA2) += -lrte_pmd_szedata2 -lsze2
_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_TAP) += -lrte_pmd_tap
--
2.9.3
^ permalink raw reply [flat|nested] 79+ messages in thread
* [dpdk-dev] [PATCH v2 2/2] net/softnic: add traffic management ops
2017-06-26 16:43 ` [dpdk-dev] [PATCH v2 0/2] net/softnic: sw fall-back " Jasvinder Singh
2017-06-26 16:43 ` [dpdk-dev] [PATCH v2 1/2] net/softnic: add softnic PMD " Jasvinder Singh
@ 2017-06-26 16:43 ` Jasvinder Singh
1 sibling, 0 replies; 79+ messages in thread
From: Jasvinder Singh @ 2017-06-26 16:43 UTC (permalink / raw)
To: dev
Cc: cristian.dumitrescu, ferruh.yigit, hemant.agrawal,
Jerin.JacobKollanukkaran, wenzhuo.lu
The traffic management specific functions of the softnic driver are
supplied through a set of pointers contained in the generic structure
of type 'rte_tm_ops'. These functions are used to build and manage the
hierarchical QoS scheduler for traffic management.
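For context, a minimal sketch of how an application reaches these ops on the
softnic port, assuming the generic ethdev TM API (rte_tm.h) that this series
depends on: the rte_tm layer resolves the driver's 'rte_tm_ops' through the
tm_ops_get callback and dispatches to the matching function. The helper below
is hypothetical and only illustrates the dispatch path.

#include <rte_tm.h>

/* Hypothetical helper: query the TM capabilities of the softnic overlay
 * port. rte_tm_capabilities_get() asks the PMD for its rte_tm_ops (i.e.
 * pmd_tm_ops here) and invokes the corresponding callback.
 */
static int
app_query_softnic_tm_caps(uint8_t softnic_port_id)
{
	struct rte_tm_capabilities cap;
	struct rte_tm_error error;

	return rte_tm_capabilities_get(softnic_port_id, &cap, &error);
}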
Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
---
v2 changes:
- add TM functions for hierarchical QoS scheduler
drivers/net/softnic/Makefile | 1 +
drivers/net/softnic/rte_eth_softnic.c | 48 +-
drivers/net/softnic/rte_eth_softnic_internals.h | 81 ++
drivers/net/softnic/rte_eth_softnic_tm.c | 1145 +++++++++++++++++++++++
4 files changed, 1274 insertions(+), 1 deletion(-)
create mode 100644 drivers/net/softnic/rte_eth_softnic_tm.c
diff --git a/drivers/net/softnic/Makefile b/drivers/net/softnic/Makefile
index 809112c..e59766d 100644
--- a/drivers/net/softnic/Makefile
+++ b/drivers/net/softnic/Makefile
@@ -47,6 +47,7 @@ LIBABIVER := 1
# all source are stored in SRCS-y
#
SRCS-$(CONFIG_RTE_LIBRTE_PMD_SOFTNIC) += rte_eth_softnic.c
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_SOFTNIC) += rte_eth_softnic_tm.c
SRCS-$(CONFIG_RTE_LIBRTE_PMD_SOFTNIC) += rte_eth_softnic_default.c
#
diff --git a/drivers/net/softnic/rte_eth_softnic.c b/drivers/net/softnic/rte_eth_softnic.c
index d4ac100..24abb8e 100644
--- a/drivers/net/softnic/rte_eth_softnic.c
+++ b/drivers/net/softnic/rte_eth_softnic.c
@@ -41,6 +41,8 @@
#include <rte_vdev.h>
#include <rte_kvargs.h>
#include <rte_errno.h>
+#include <rte_tm_driver.h>
+#include <rte_sched.h>
#include "rte_eth_softnic.h"
#include "rte_eth_softnic_internals.h"
@@ -59,6 +61,10 @@ static const char *pmd_valid_args[] = {
static struct rte_vdev_driver pmd_drv;
static struct rte_device *device;
+#ifndef TM
+#define TM 0
+#endif
+
static int
pmd_eth_dev_configure(struct rte_eth_dev *dev)
{
@@ -114,6 +120,14 @@ pmd_eth_dev_start(struct rte_eth_dev *dev)
{
struct pmd_internals *p = dev->data->dev_private;
+#if TM
+ /* Initialize the Traffic Manager for the overlay device */
+ int status = tm_init(p);
+
+ if (status)
+ return status;
+#endif
+
/* Clone dev->data from underlay to overlay */
memcpy(dev->data->mac_pool_sel,
p->udev->data->mac_pool_sel,
@@ -132,6 +146,11 @@ pmd_eth_dev_stop(struct rte_eth_dev *dev)
/* Call the current function for the underlay device */
rte_eth_dev_stop(p->uport_id);
+
+#if TM
+ /* Free the Traffic Manager for the overlay device */
+ tm_free(p);
+#endif
}
static void
@@ -247,6 +266,14 @@ pmd_eth_dev_mac_addr_remove(struct rte_eth_dev *dev, uint32_t index)
rte_eth_dev_mac_addr_remove(p->uport_id, &dev->data->mac_addrs[index]);
}
+static int
+pmd_eth_dev_tm_ops_get(struct rte_eth_dev *dev __rte_unused, void *arg)
+{
+ *(const struct rte_tm_ops **)arg = &pmd_tm_ops;
+
+ return 0;
+}
+
static uint16_t
pmd_eth_dev_tx_burst(void *txq,
struct rte_mbuf **tx_pkts,
@@ -254,12 +281,30 @@ pmd_eth_dev_tx_burst(void *txq,
{
struct pmd_internals *p = txq;
+#if TM
+ rte_sched_port_enqueue(p->sched, tx_pkts, nb_pkts);
+ rte_eth_softnic_run(p->oport_id);
+ return nb_pkts;
+#else
return rte_eth_tx_burst(p->uport_id, p->txq_id, tx_pkts, nb_pkts);
+#endif
}
int
-rte_eth_softnic_run(uint8_t port_id __rte_unused)
+rte_eth_softnic_run(uint8_t port_id)
{
+ struct rte_eth_dev *odev = &rte_eth_devices[port_id];
+ struct pmd_internals *p = odev->data->dev_private;
+ uint32_t n_pkts, n_pkts_deq;
+
+ n_pkts_deq = rte_sched_port_dequeue(p->sched, p->pkts, p->deq_bsz);
+
+ for (n_pkts = 0; n_pkts < n_pkts_deq;)
+ n_pkts += rte_eth_tx_burst(p->uport_id,
+ p->txq_id,
+ &p->pkts[n_pkts],
+ (uint16_t)(n_pkts_deq - n_pkts));
+
return 0;
}
@@ -284,6 +329,7 @@ pmd_ops_build(struct eth_dev_ops *o, const struct eth_dev_ops *u)
o->mac_addr_set = pmd_eth_dev_mac_addr_set;
o->mac_addr_add = pmd_eth_dev_mac_addr_add;
o->mac_addr_remove = pmd_eth_dev_mac_addr_remove;
+ o->tm_ops_get = pmd_eth_dev_tm_ops_get;
}
int
diff --git a/drivers/net/softnic/rte_eth_softnic_internals.h b/drivers/net/softnic/rte_eth_softnic_internals.h
index d456a54..5ca5121 100644
--- a/drivers/net/softnic/rte_eth_softnic_internals.h
+++ b/drivers/net/softnic/rte_eth_softnic_internals.h
@@ -38,9 +38,73 @@
#include <rte_mbuf.h>
#include <rte_ethdev.h>
+#include <rte_sched.h>
+#include <rte_tm_driver.h>
#include "rte_eth_softnic.h"
+#ifndef TM_MAX_SUBPORTS
+#define TM_MAX_SUBPORTS 8
+#endif
+
+#ifndef TM_MAX_PIPES_PER_SUBPORT
+#define TM_MAX_PIPES_PER_SUBPORT 4096
+#endif
+
+#ifndef TM_MAX_QUEUE_SIZE
+#define TM_MAX_QUEUE_SIZE 64
+#endif
+
+/* TM Shaper Profile. */
+struct tm_shaper_profile {
+ TAILQ_ENTRY(tm_shaper_profile) node;
+ uint32_t shaper_profile_id;
+ uint32_t shared_shaper_id;
+ uint32_t n_users;
+ struct rte_tm_shaper_params params;
+};
+
+/* TM Node */
+struct tm_node {
+ TAILQ_ENTRY(tm_node) node;
+ uint32_t id;
+ uint32_t priority;
+ uint32_t weight;
+ uint32_t level;
+ uint32_t n_child;
+ struct tm_node *parent_node;
+ struct tm_shaper_profile *shaper_profile;
+ struct rte_tm_node_params params;
+};
+
+TAILQ_HEAD(tm_nodes, tm_node);
+TAILQ_HEAD(tm_shaper_profiles, tm_shaper_profile);
+
+/* TM node levels */
+enum tm_node_level {
+ TM_NODE_LEVEL_PORT = 0,
+ TM_NODE_LEVEL_SUBPORT,
+ TM_NODE_LEVEL_PIPE,
+ TM_NODE_LEVEL_TC,
+ TM_NODE_LEVEL_QUEUE,
+ TM_NODE_LEVEL_MAX,
+};
+
+/* TM Configuration */
+struct tm_conf {
+	struct tm_shaper_profiles shaper_profiles;	/**< TM shaper profiles */
+	struct tm_nodes tm_nodes;	/**< TM nodes */
+	uint32_t n_tm_nodes[TM_NODE_LEVEL_MAX];	/**< TM nodes per level */
+};
+
+struct tm_params {
+ struct rte_sched_port_params port_params;
+ struct rte_sched_subport_params subport_params[TM_MAX_SUBPORTS];
+ struct rte_sched_pipe_params
+ pipe_profiles[RTE_SCHED_PIPE_PROFILES_PER_PORT];
+ int pipe_to_profile[TM_MAX_SUBPORTS * TM_MAX_PIPES_PER_SUBPORT];
+};
+
struct pmd_internals {
/* Devices */
struct rte_eth_dev *odev;
@@ -54,10 +118,27 @@ struct pmd_internals {
/* Operation */
struct rte_mbuf *pkts[RTE_ETH_SOFTNIC_DEQ_BSZ_MAX];
+ struct tm_params tm_params;
+ struct rte_sched_port *sched;
+ struct tm_conf tm_conf;
uint32_t deq_bsz;
uint32_t txq_id;
};
+extern const struct rte_tm_ops pmd_tm_ops;
+
+void
+tm_conf_init(struct rte_eth_dev *dev);
+
+void
+tm_conf_uninit(struct rte_eth_dev *dev);
+
+int
+tm_init(struct pmd_internals *p);
+
+void
+tm_free(struct pmd_internals *p);
+
void
pmd_ops_inherit(struct eth_dev_ops *o, const struct eth_dev_ops *u);
diff --git a/drivers/net/softnic/rte_eth_softnic_tm.c b/drivers/net/softnic/rte_eth_softnic_tm.c
new file mode 100644
index 0000000..7c55cfd
--- /dev/null
+++ b/drivers/net/softnic/rte_eth_softnic_tm.c
@@ -0,0 +1,1145 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright(c) 2017 Intel Corporation. All rights reserved.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdint.h>
+
+#include <rte_malloc.h>
+#include <rte_sched.h>
+
+#include "rte_eth_softnic_internals.h"
+#include "rte_eth_softnic.h"
+
+void
+tm_conf_init(struct rte_eth_dev *dev)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ /* Initialize shaper profiles list */
+ TAILQ_INIT(&p->tm_conf.shaper_profiles);
+
+ /* Initialize TM nodes */
+ TAILQ_INIT(&p->tm_conf.tm_nodes);
+
+	memset(p->tm_conf.n_tm_nodes, 0, sizeof(p->tm_conf.n_tm_nodes));
+}
+
+void
+tm_conf_uninit(struct rte_eth_dev *dev)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_shaper_profile *shaper_profile;
+ struct tm_node *tm_node;
+
+ /* Remove all tm shaper profiles */
+ while ((shaper_profile =
+ TAILQ_FIRST(&p->tm_conf.shaper_profiles))) {
+ TAILQ_REMOVE(&p->tm_conf.shaper_profiles,
+ shaper_profile, node);
+ rte_free(shaper_profile);
+ }
+
+	/* Remove all tm nodes */
+ while ((tm_node =
+ TAILQ_FIRST(&p->tm_conf.tm_nodes))) {
+ TAILQ_REMOVE(&p->tm_conf.tm_nodes,
+ tm_node, node);
+ rte_free(tm_node);
+ }
+
+	memset(p->tm_conf.n_tm_nodes, 0, sizeof(p->tm_conf.n_tm_nodes));
+}
+
+static struct tm_shaper_profile *
+tm_shaper_profile_search(struct rte_eth_dev *dev, uint32_t shaper_profile_id)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_shaper_profiles *shaper_profile_list =
+ &p->tm_conf.shaper_profiles;
+ struct tm_shaper_profile *sp;
+
+ TAILQ_FOREACH(sp, shaper_profile_list, node) {
+ if (shaper_profile_id == sp->shaper_profile_id)
+ return sp;
+ }
+
+ return NULL;
+}
+
+static int
+tm_shaper_profile_count(struct rte_eth_dev *dev)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_shaper_profiles *shaper_profile_list =
+ &p->tm_conf.shaper_profiles;
+ struct tm_shaper_profile *sp;
+ int n_shapers = 0;
+
+ /* Private Shaper Profile */
+ TAILQ_FOREACH(sp, shaper_profile_list, node) {
+ if (sp->shared_shaper_id == RTE_TM_SHAPER_PROFILE_ID_NONE)
+ n_shapers++;
+ }
+
+ return n_shapers;
+}
+
+static int
+tm_shared_shaper_count(struct rte_eth_dev *dev)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_shaper_profiles *shaper_profile_list =
+ &p->tm_conf.shaper_profiles;
+ struct tm_shaper_profile *sp;
+ int n_shapers = 0;
+
+ /* Shared Shaper */
+ TAILQ_FOREACH(sp, shaper_profile_list, node) {
+ if (sp->shared_shaper_id != RTE_TM_SHAPER_PROFILE_ID_NONE)
+ n_shapers++;
+ }
+
+ return n_shapers;
+}
+
+static struct tm_shaper_profile *
+tm_shared_shaper_search(struct rte_eth_dev *dev, uint32_t shared_shaper_id)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_shaper_profiles *shaper_profile_list =
+ &p->tm_conf.shaper_profiles;
+ struct tm_shaper_profile *sp;
+
+ TAILQ_FOREACH(sp, shaper_profile_list, node) {
+ if (shared_shaper_id == sp->shared_shaper_id)
+ return sp;
+ }
+
+ return NULL;
+}
+
+static struct tm_node *
+tm_node_search(struct rte_eth_dev *dev, uint32_t node_id)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_nodes *tm_nodes_list = &p->tm_conf.tm_nodes;
+ struct tm_node *tm_node;
+
+ TAILQ_FOREACH(tm_node, tm_nodes_list, node) {
+ if (tm_node->id == node_id)
+ return tm_node;
+ }
+
+ return NULL;
+}
+
+static int
+tm_node_get_child(struct rte_eth_dev *dev, uint32_t parent_id)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_nodes *tm_nodes_list = &p->tm_conf.tm_nodes;
+ struct tm_node *tm_node;
+ int n_child = 0;
+
+ TAILQ_FOREACH(tm_node, tm_nodes_list, node) {
+ if (tm_node->parent_node->id == parent_id)
+ n_child++;
+ }
+
+ return n_child;
+}
+
+static struct tm_node *
+tm_root_node_present(struct rte_eth_dev *dev)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_nodes *tm_nodes_list = &p->tm_conf.tm_nodes;
+ struct tm_node *tm_node;
+
+ TAILQ_FOREACH(tm_node, tm_nodes_list, node) {
+ if (tm_node->parent_node->id == RTE_TM_NODE_ID_NULL)
+ return tm_node;
+ }
+ return NULL;
+}
+
+int
+tm_init(struct pmd_internals *p)
+{
+ struct tm_params *t = &p->tm_params;
+ struct tm_nodes *tm_nodes_list = &p->tm_conf.tm_nodes;
+ uint32_t n_subports, subport_id, n_pipes;
+ struct tm_node *tm_node;
+ int status;
+
+ /* Port */
+ t->port_params.name = p->odev->data->name;
+ t->port_params.socket = p->udev->data->numa_node;
+ t->port_params.rate = p->udev->data->dev_link.link_speed;
+ t->port_params.mtu = p->udev->data->mtu;
+
+ p->sched = rte_sched_port_config(&t->port_params);
+ if (!p->sched)
+ return -EINVAL;
+
+ /* Subport */
+ n_subports = t->port_params.n_subports_per_port;
+ for (subport_id = 0; subport_id < n_subports; subport_id++) {
+ uint32_t n_pipes_per_subport
+ = t->port_params.n_pipes_per_subport;
+ uint32_t pipe_id;
+
+ status = rte_sched_subport_config(p->sched,
+ subport_id,
+ &t->subport_params[subport_id]);
+ if (status) {
+ rte_sched_port_free(p->sched);
+ return -EINVAL;
+ }
+
+ /* Pipe */
+ n_pipes = 0;
+ pipe_id = n_subports + 1;
+ for (; pipe_id < n_pipes_per_subport; pipe_id++) {
+ TAILQ_FOREACH(tm_node, tm_nodes_list, node) {
+ if (tm_node->parent_node->id == subport_id)
+ n_pipes++;
+ }
+
+ uint32_t pos = subport_id * n_pipes + pipe_id;
+ int profile_id = t->pipe_to_profile[pos];
+
+ if (profile_id < 0)
+ continue;
+
+ status = rte_sched_pipe_config(p->sched,
+ subport_id,
+ pipe_id,
+ profile_id);
+ if (status) {
+ rte_sched_port_free(p->sched);
+ return -EINVAL;
+ }
+ }
+ }
+
+ return 0;
+}
+
+void
+tm_free(struct pmd_internals *p)
+{
+ if (p->sched)
+ rte_sched_port_free(p->sched);
+}
+
+/* Traffic manager node type get */
+static int
+pmd_tm_node_type_get(struct rte_eth_dev *dev __rte_unused,
+ uint32_t node_id,
+ int *is_leaf,
+ struct rte_tm_error *error)
+{
+ struct tm_node *tm_node;
+
+ if (!is_leaf || !error)
+ return -EINVAL;
+
+ /* Check: node id */
+ if (node_id == RTE_TM_NODE_ID_NULL) {
+ error->type = RTE_TM_ERROR_TYPE_NODE_ID;
+ error->message = "node id invalid!";
+ return -EINVAL;
+ }
+
+ tm_node = tm_node_search(dev, node_id);
+ if (!tm_node) {
+ error->type = RTE_TM_ERROR_TYPE_NODE_ID;
+		error->message = "node doesn't exist!";
+ return -EINVAL;
+ }
+
+ if (tm_node->n_child)
+ *is_leaf = 0;
+ else
+ *is_leaf = 1;
+
+ return 0;
+}
+
+/* Traffic manager capabilities get */
+static int
+pmd_tm_capabilities_get(struct rte_eth_dev *dev,
+ struct rte_tm_capabilities *cap,
+ struct rte_tm_error *error __rte_unused)
+{
+ uint64_t n_nodes_level1 = TM_MAX_SUBPORTS;
+ uint64_t n_nodes_level2 = n_nodes_level1 * TM_MAX_PIPES_PER_SUBPORT;
+ uint64_t n_nodes_level3 = n_nodes_level2 * RTE_SCHED_QUEUES_PER_PIPE;
+ struct pmd_internals *p = dev->data->dev_private;
+ uint32_t ls = p->udev->data->dev_link.link_speed;
+ int i;
+
+ if (!cap || !error)
+ return -EINVAL;
+
+ memset(cap, 0, sizeof(struct rte_tm_capabilities));
+
+ /* TM capabilities */
+ cap->n_nodes_max = n_nodes_level1 + n_nodes_level2 + n_nodes_level3 + 1;
+ cap->n_levels_max = TM_NODE_LEVEL_MAX;
+ cap->non_leaf_nodes_identical = 0;
+ cap->leaf_nodes_identical = 1;
+ cap->shaper_n_max = n_nodes_level1 + n_nodes_level2;
+ cap->shaper_private_n_max = n_nodes_level2;
+ cap->shaper_private_dual_rate_n_max = 0;
+ cap->shaper_private_rate_min = 0;
+	cap->shaper_private_rate_max = ((uint64_t)ls * 1000000) / 8;
+ cap->shaper_shared_n_max = n_nodes_level1;
+ cap->shaper_shared_n_nodes_per_shaper_max = TM_MAX_PIPES_PER_SUBPORT;
+ cap->shaper_shared_n_shapers_per_node_max = n_nodes_level1;
+ cap->shaper_shared_dual_rate_n_max = 0;
+ cap->shaper_shared_rate_min = 0;
+	cap->shaper_shared_rate_max = ((uint64_t)ls * 1000000) / 8;
+ cap->shaper_pkt_length_adjust_min = 0;
+ cap->shaper_pkt_length_adjust_max = RTE_SCHED_FRAME_OVERHEAD_DEFAULT;
+ cap->sched_n_children_max = TM_MAX_PIPES_PER_SUBPORT;
+ cap->sched_sp_n_priorities_max = RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE;
+ cap->sched_wfq_n_children_per_group_max = TM_MAX_PIPES_PER_SUBPORT;
+ cap->sched_wfq_n_groups_max = RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE;
+ cap->sched_wfq_weight_max = 1;
+ cap->cman_head_drop_supported = 0;
+ cap->cman_wred_context_n_max = n_nodes_level3;
+ cap->cman_wred_context_private_n_max = n_nodes_level3;
+ cap->cman_wred_context_shared_n_max = 0;
+ cap->cman_wred_context_shared_n_nodes_per_context_max = 0;
+ cap->cman_wred_context_shared_n_contexts_per_node_max = 0;
+
+ for (i = 0; i < RTE_TM_COLORS; i++) {
+ cap->mark_vlan_dei_supported[i] = 0;
+ cap->mark_ip_ecn_tcp_supported[i] = 0;
+ cap->mark_ip_ecn_sctp_supported[i] = 0;
+ cap->mark_ip_dscp_supported[i] = 0;
+ }
+
+ cap->dynamic_update_mask = 0;
+ cap->stats_mask = 0;
+
+ return 0;
+}
+
+/* Traffic manager level capabilities get */
+static int
+pmd_tm_level_capabilities_get(struct rte_eth_dev *dev,
+ uint32_t level_id,
+ struct rte_tm_level_capabilities *cap,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ uint32_t ls = p->udev->data->dev_link.link_speed;
+
+ if (!cap || !error)
+ return -EINVAL;
+
+ if (level_id >= TM_NODE_LEVEL_MAX) {
+ error->type = RTE_TM_ERROR_TYPE_LEVEL_ID;
+ error->message = "level id invalid!";
+ return -EINVAL;
+ }
+
+	memset(cap, 0, sizeof(struct rte_tm_level_capabilities));
+
+ if (level_id == TM_NODE_LEVEL_PORT) {
+ /* Root node */
+ cap->n_nodes_max = 1;
+ cap->n_nodes_nonleaf_max = 1;
+ cap->n_nodes_leaf_max = 0;
+ cap->non_leaf_nodes_identical = 1;
+ cap->leaf_nodes_identical = 0;
+ cap->nonleaf.shaper_private_supported = 1;
+ cap->nonleaf.shaper_private_dual_rate_supported = 0;
+ cap->nonleaf.shaper_private_rate_min = 0;
+
+		cap->nonleaf.shaper_private_rate_max = ((uint64_t)ls * 1000000) / 8;
+ cap->nonleaf.shaper_shared_n_max = 0;
+ cap->nonleaf.sched_n_children_max = TM_MAX_SUBPORTS;
+ cap->nonleaf.sched_sp_n_priorities_max = 0;
+ cap->nonleaf.sched_wfq_n_children_per_group_max = 0;
+ cap->nonleaf.sched_wfq_n_groups_max = 0;
+ cap->nonleaf.sched_wfq_weight_max = 0;
+ cap->nonleaf.stats_mask = 0;
+
+ } else if (level_id == TM_NODE_LEVEL_SUBPORT) {
+ /* Subport */
+ cap->n_nodes_max = TM_MAX_SUBPORTS;
+ cap->n_nodes_nonleaf_max = TM_MAX_SUBPORTS;
+ cap->n_nodes_leaf_max = 0;
+ cap->non_leaf_nodes_identical = 1;
+ cap->leaf_nodes_identical = 0;
+ cap->nonleaf.shaper_private_supported = 0;
+ cap->nonleaf.shaper_private_dual_rate_supported = 0;
+ cap->nonleaf.shaper_private_rate_min = 0;
+ cap->nonleaf.shaper_private_rate_max = ((uint64_t)ls * 1000000) / 8;
+ cap->nonleaf.shaper_shared_n_max = 1;
+ cap->nonleaf.sched_n_children_max = TM_MAX_PIPES_PER_SUBPORT;
+ cap->nonleaf.sched_sp_n_priorities_max
+ = RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE;
+ cap->nonleaf.sched_wfq_n_children_per_group_max
+ = TM_MAX_PIPES_PER_SUBPORT;
+ cap->nonleaf.sched_wfq_n_groups_max
+ = RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE;
+ cap->nonleaf.sched_wfq_weight_max = 1;
+ cap->nonleaf.stats_mask = 0;
+
+ } else if (level_id == TM_NODE_LEVEL_PIPE) {
+ /* Pipe */
+ cap->n_nodes_max
+ = TM_MAX_PIPES_PER_SUBPORT * TM_MAX_SUBPORTS;
+ cap->n_nodes_nonleaf_max
+ = TM_MAX_PIPES_PER_SUBPORT * TM_MAX_SUBPORTS;
+ cap->n_nodes_leaf_max = 0;
+ cap->non_leaf_nodes_identical = 1;
+ cap->leaf_nodes_identical = 0;
+ cap->nonleaf.shaper_private_supported = 1;
+ cap->nonleaf.shaper_private_dual_rate_supported = 0;
+ cap->nonleaf.shaper_private_rate_min = 0;
+ cap->nonleaf.shaper_private_rate_max
+ = ((uint64_t)ls * 1000000) / (8 * TM_MAX_PIPES_PER_SUBPORT);
+ cap->nonleaf.shaper_shared_n_max = 1;
+ cap->nonleaf.sched_n_children_max
+ = RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE;
+ cap->nonleaf.sched_sp_n_priorities_max
+ = RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE;
+ cap->nonleaf.sched_wfq_n_children_per_group_max = 1;
+ cap->nonleaf.sched_wfq_n_groups_max
+ = RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE;
+ cap->nonleaf.sched_wfq_weight_max = 1;
+ cap->nonleaf.stats_mask = 0;
+
+ } else if (level_id == TM_NODE_LEVEL_TC) {
+ /* Traffic Class */
+ cap->n_nodes_max
+ = TM_MAX_SUBPORTS
+ * TM_MAX_PIPES_PER_SUBPORT
+ * RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE;
+ cap->n_nodes_nonleaf_max
+ = TM_MAX_SUBPORTS
+ * TM_MAX_PIPES_PER_SUBPORT
+ * RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE;
+ cap->n_nodes_leaf_max = 0;
+ cap->non_leaf_nodes_identical = 1;
+ cap->leaf_nodes_identical = 0;
+ cap->nonleaf.shaper_private_supported = 1;
+ cap->nonleaf.shaper_private_dual_rate_supported = 0;
+ cap->nonleaf.shaper_private_rate_min = 0;
+ cap->nonleaf.shaper_private_rate_max
+ = ((uint64_t)ls * 1000000) / (8 * TM_MAX_PIPES_PER_SUBPORT);
+ cap->nonleaf.shaper_shared_n_max = 0;
+ cap->nonleaf.sched_n_children_max = RTE_SCHED_QUEUES_PER_PIPE;
+ cap->nonleaf.sched_sp_n_priorities_max
+ = RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE;
+ cap->nonleaf.sched_wfq_n_children_per_group_max
+ = RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS;
+ cap->nonleaf.sched_wfq_n_groups_max
+ = RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE;
+ cap->nonleaf.sched_wfq_weight_max = 1;
+ cap->nonleaf.stats_mask = 0;
+
+ } else {
+ /* TM Queues */
+ cap->n_nodes_max
+ = TM_MAX_SUBPORTS
+ * TM_MAX_PIPES_PER_SUBPORT
+ * RTE_SCHED_QUEUES_PER_PIPE;
+ cap->n_nodes_nonleaf_max = 0;
+ cap->n_nodes_leaf_max = cap->n_nodes_max;
+ cap->non_leaf_nodes_identical = 0;
+ cap->leaf_nodes_identical = 1;
+ cap->leaf.shaper_private_supported = 0;
+ cap->leaf.shaper_private_dual_rate_supported = 0;
+ cap->leaf.shaper_private_rate_min = 0;
+ cap->leaf.shaper_private_rate_max = 0;
+ cap->leaf.shaper_shared_n_max = 0;
+ cap->leaf.cman_head_drop_supported = 0;
+ cap->leaf.cman_wred_context_private_supported = 1;
+ cap->leaf.cman_wred_context_shared_n_max = 0;
+ cap->leaf.stats_mask = 0;
+ }
+ return 0;
+}
+
+/* Traffic manager node capabilities get */
+static int
+pmd_tm_node_capabilities_get(struct rte_eth_dev *dev,
+ uint32_t node_id,
+ struct rte_tm_node_capabilities *cap,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ uint32_t ls = p->udev->data->dev_link.link_speed;
+ struct tm_node *tm_node;
+
+ if (!cap || !error)
+ return -EINVAL;
+
+ tm_node = tm_node_search(dev, node_id);
+
+ /* Check: node validity */
+ if ((node_id == RTE_TM_NODE_ID_NULL) || (!tm_node)) {
+ error->type = RTE_TM_ERROR_TYPE_NODE_ID;
+ error->message = "node id invalid!";
+ return -EINVAL;
+ }
+
+ memset(cap, 0, sizeof(struct rte_tm_node_capabilities));
+
+ /* Check: node level */
+ if (tm_node->level == 0) {
+ /* Root node */
+ cap->shaper_private_supported = 1;
+ cap->shaper_private_dual_rate_supported = 0;
+ cap->shaper_private_rate_min = 0;
+ cap->shaper_private_rate_max = ((uint64_t)ls * 1000000) / 8;
+ cap->shaper_shared_n_max = 0;
+ cap->nonleaf.sched_n_children_max = TM_MAX_SUBPORTS;
+ cap->nonleaf.sched_sp_n_priorities_max = 0;
+ cap->nonleaf.sched_wfq_n_children_per_group_max = 0;
+ cap->nonleaf.sched_wfq_n_groups_max = 0;
+ cap->nonleaf.sched_wfq_weight_max = 0;
+
+ } else if (tm_node->level == 1) {
+ /* Subport */
+ cap->shaper_private_supported = 0;
+ cap->shaper_private_dual_rate_supported = 0;
+ cap->shaper_private_rate_min = 0;
+ cap->shaper_private_rate_max = ((uint64_t)ls * 1000000) / 8;
+ cap->shaper_shared_n_max = 1;
+ cap->nonleaf.sched_n_children_max = TM_MAX_PIPES_PER_SUBPORT;
+ cap->nonleaf.sched_sp_n_priorities_max
+ = RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE;
+ cap->nonleaf.sched_wfq_n_children_per_group_max
+ = TM_MAX_PIPES_PER_SUBPORT;
+ cap->nonleaf.sched_wfq_n_groups_max
+ = RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE;
+ cap->nonleaf.sched_wfq_weight_max = 1;
+
+ } else if (tm_node->level == 2) {
+ /* Pipe */
+ cap->shaper_private_supported = 1;
+ cap->shaper_private_dual_rate_supported = 0;
+ cap->shaper_private_rate_min = 0;
+ cap->shaper_private_rate_max
+ = ((uint64_t)ls * 1000000) / (8 * TM_MAX_PIPES_PER_SUBPORT);
+ cap->shaper_shared_n_max = 0;
+ cap->nonleaf.sched_n_children_max
+ = RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE;
+ cap->nonleaf.sched_sp_n_priorities_max
+ = RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE;
+ cap->nonleaf.sched_wfq_n_children_per_group_max = 1;
+ cap->nonleaf.sched_wfq_n_groups_max
+ = RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE;
+ cap->nonleaf.sched_wfq_weight_max = 1;
+
+ } else if (tm_node->level == 3) {
+ /* Traffic Class */
+ cap->shaper_private_supported = 0;
+ cap->shaper_private_dual_rate_supported = 0;
+ cap->shaper_private_rate_min = 0;
+ cap->shaper_private_rate_max
+ = ((uint64_t)ls * 1000000) / (8 * TM_MAX_PIPES_PER_SUBPORT);
+ cap->shaper_shared_n_max = 0;
+ cap->nonleaf.sched_n_children_max
+ = RTE_SCHED_QUEUES_PER_PIPE;
+ cap->nonleaf.sched_sp_n_priorities_max
+ = RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE;
+ cap->nonleaf.sched_wfq_n_children_per_group_max
+ = RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS;
+ cap->nonleaf.sched_wfq_n_groups_max
+ = RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE;
+ cap->nonleaf.sched_wfq_weight_max = 1;
+ } else {
+ /* Queue */
+ cap->shaper_private_supported = 1;
+ cap->shaper_private_dual_rate_supported = 0;
+ cap->shaper_private_rate_min = 0;
+ cap->shaper_private_rate_max = 0;
+ cap->shaper_shared_n_max = 0;
+ cap->leaf.cman_head_drop_supported = 1;
+ cap->leaf.cman_wred_context_private_supported = 0;
+ cap->leaf.cman_wred_context_shared_n_max = 0;
+ }
+ cap->stats_mask = 0;
+
+ return 0;
+}
+
+/* Traffic manager shaper profile add */
+static int
+pmd_tm_shaper_profile_add(struct rte_eth_dev *dev,
+ uint32_t shaper_profile_id,
+ struct rte_tm_shaper_params *profile,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_shaper_profile *sp;
+ struct tm_shaper_profiles *spl = &p->tm_conf.shaper_profiles;
+ char shaper_name[256];
+
+ if (!profile || !error)
+ return -EINVAL;
+
+ /* Shaper Rate */
+ if (!profile->peak.rate) {
+ error->type = RTE_TM_ERROR_TYPE_SHAPER_PROFILE_PEAK_RATE;
+ error->message = "rate not specified!";
+ return -EINVAL;
+ }
+
+ /* Shaper Bucket Size */
+ if (!profile->peak.size) {
+ error->type = RTE_TM_ERROR_TYPE_SHAPER_PROFILE_PEAK_SIZE;
+ error->message = "bucket size not specified!";
+ return -EINVAL;
+ }
+
+ /* Shaper Committed Rate */
+ if (profile->committed.rate) {
+ error->type = RTE_TM_ERROR_TYPE_SHAPER_PROFILE_COMMITTED_RATE;
+ error->message = "dual rate shaper not supported!";
+ return -EINVAL;
+ }
+
+ /* Shaper Committed Size */
+ if (profile->committed.size) {
+ error->type = RTE_TM_ERROR_TYPE_SHAPER_PROFILE_COMMITTED_SIZE;
+ error->message = "dual rate shaper not supported!";
+ return -EINVAL;
+ }
+
+ sp = tm_shaper_profile_search(dev, shaper_profile_id);
+
+ if (sp) {
+ error->type = RTE_TM_ERROR_TYPE_SHAPER_PROFILE_ID;
+ error->message = "profile ID already exists!";
+ return -EINVAL;
+ }
+
+ snprintf(shaper_name, sizeof(shaper_name),
+ "tm_shaper_profile_%u", shaper_profile_id);
+
+ sp = rte_zmalloc(shaper_name, sizeof(struct tm_shaper_profile), 0);
+ if (!sp)
+ return -ENOMEM;
+
+ sp->shaper_profile_id = shaper_profile_id;
+ sp->shared_shaper_id = RTE_TM_SHAPER_PROFILE_ID_NONE;
+
+ (void)rte_memcpy(&sp->params, profile,
+ sizeof(struct rte_tm_shaper_params));
+
+ if (!spl->tqh_first)
+ tm_conf_init(dev);
+
+ TAILQ_INSERT_TAIL(spl, sp, node);
+
+ return 0;
+}
+
+/* Traffic manager shaper profile delete */
+static int
+pmd_tm_shaper_profile_delete(struct rte_eth_dev *dev,
+ uint32_t shaper_profile_id,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_shaper_profile *sp;
+
+ if (!error)
+ return -EINVAL;
+
+ sp = tm_shaper_profile_search(dev, shaper_profile_id);
+
+ if (!sp) {
+ error->type = RTE_TM_ERROR_TYPE_SHAPER_PROFILE_ID;
+ error->message = "profile ID does not exist!";
+ return -EINVAL;
+ }
+
+ /* Check: profile usage */
+ if (sp->n_users) {
+ error->type = RTE_TM_ERROR_TYPE_SHAPER_PROFILE;
+ error->message = "profile in use!";
+ return -EINVAL;
+ }
+
+ TAILQ_REMOVE(&p->tm_conf.shaper_profiles, sp, node);
+ rte_free(sp);
+
+ return 0;
+}
+
+/* Traffic manager shared shaper add/update */
+static int
+pmd_tm_shared_shaper_add_update(struct rte_eth_dev *dev,
+ uint32_t shared_shaper_id,
+ uint32_t shaper_profile_id,
+ struct rte_tm_error *error)
+{
+ struct tm_shaper_profile *sp;
+ uint32_t n_shared_shapers;
+
+ if (!error)
+ return -EINVAL;
+
+ sp = tm_shaper_profile_search(dev, shaper_profile_id);
+
+ if (!sp) {
+ error->type = RTE_TM_ERROR_TYPE_SHAPER_PROFILE_ID;
+ error->message = "shaper profile doesn't exist!";
+ return -EINVAL;
+ }
+
+ /* Shared shaper add/update */
+ n_shared_shapers = tm_shared_shaper_count(dev);
+ if (sp->shared_shaper_id == RTE_TM_SHAPER_PROFILE_ID_NONE)
+ sp->shared_shaper_id = n_shared_shapers;
+ else
+ sp->shared_shaper_id = shared_shaper_id;
+
+ return 0;
+}
+
+/* Traffic manager shared shaper delete */
+static int
+pmd_tm_shared_shaper_delete(struct rte_eth_dev *dev,
+ uint32_t shared_shaper_id,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_shaper_profile *sp;
+
+ if (!error)
+ return -EINVAL;
+
+ sp = tm_shared_shaper_search(dev, shared_shaper_id);
+
+ if (!sp) {
+ error->type = RTE_TM_ERROR_TYPE_SHAPER_PROFILE_ID;
+ error->message = "shared shaper does not exist!";
+ return -EINVAL;
+ }
+
+ /* Check: profile usage */
+ if (sp->n_users) {
+ error->type = RTE_TM_ERROR_TYPE_SHAPER_PROFILE;
+ error->message = "profile in use!";
+ return -EINVAL;
+ }
+
+ TAILQ_REMOVE(&p->tm_conf.shaper_profiles, sp, node);
+ rte_free(sp);
+
+ return 0;
+}
+
+/* Traffic manager node add */
+static int
+pmd_tm_node_add(struct rte_eth_dev *dev, uint32_t node_id,
+ uint32_t parent_node_id, uint32_t priority, uint32_t weight,
+ uint32_t level_id, struct rte_tm_node_params *params,
+ struct rte_tm_error *error)
+{
+ struct tm_node *tm_node, *parent_node;
+ struct tm_shaper_profile *sp;
+ uint64_t nodes_l1 = TM_MAX_SUBPORTS;
+ uint64_t nodes_l2 = nodes_l1 * TM_MAX_PIPES_PER_SUBPORT;
+ uint64_t nodes_l3 = nodes_l2 * RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE;
+ uint64_t nodes_l4 = nodes_l2 * RTE_SCHED_QUEUES_PER_PIPE;
+ char node_name[256];
+
+ struct pmd_internals *p = dev->data->dev_private;
+
+ if (!params || !error)
+ return -EINVAL;
+
+ /* Check: node id NULL*/
+ if (node_id == RTE_TM_NODE_ID_NULL) {
+ error->type = RTE_TM_ERROR_TYPE_NODE_ID;
+ error->message = "invalid node id!";
+ return -EINVAL;
+ }
+
+ /* Check: node priority */
+ if (!priority) {
+ error->type = RTE_TM_ERROR_TYPE_NODE_PRIORITY;
+ error->message = "priority not supported!";
+ return -EINVAL;
+ }
+
+ /* Check: node weight */
+ if (weight < 1) {
+ error->type = RTE_TM_ERROR_TYPE_NODE_WEIGHT;
+ error->message = "weight not supported!";
+ return -EINVAL;
+ }
+
+ /* Check: node ID used */
+ if (tm_node_search(dev, node_id)) {
+ error->type = RTE_TM_ERROR_TYPE_NODE_ID;
+ error->message = "node id already used!";
+ return -EINVAL;
+ }
+
+ /* Check: level */
+ if ((level_id != RTE_TM_NODE_LEVEL_ID_ANY) &&
+ (level_id >= TM_NODE_LEVEL_MAX)) {
+ error->type = RTE_TM_ERROR_TYPE_LEVEL_ID;
+ error->message = "level ID exceeds the maximum allowed!";
+ return -EINVAL;
+ }
+
+ /* Check: number of nodes at levels*/
+ if (((level_id == TM_NODE_LEVEL_PORT) &&
+ (p->tm_conf.n_tm_nodes[TM_NODE_LEVEL_PORT] > 1)) ||
+ ((level_id == TM_NODE_LEVEL_SUBPORT) &&
+ (p->tm_conf.n_tm_nodes[TM_NODE_LEVEL_SUBPORT] > nodes_l1)) ||
+ ((level_id == TM_NODE_LEVEL_PIPE) &&
+ (p->tm_conf.n_tm_nodes[TM_NODE_LEVEL_PIPE] > nodes_l2)) ||
+ ((level_id == TM_NODE_LEVEL_TC) &&
+ (p->tm_conf.n_tm_nodes[TM_NODE_LEVEL_TC] > nodes_l3)) ||
+ ((level_id == TM_NODE_LEVEL_QUEUE) &&
+ (p->tm_conf.n_tm_nodes[TM_NODE_LEVEL_QUEUE] > nodes_l4))) {
+ error->type = RTE_TM_ERROR_TYPE_LEVEL_ID;
+ error->message = "number of nodes exceeds the maximum at this level!";
+ return -EINVAL;
+ }
+
+ /* Check: node shaper profile */
+ sp = tm_shaper_profile_search(dev, params->shaper_profile_id);
+ if (!sp) {
+ error->type = RTE_TM_ERROR_TYPE_NODE_PARAMS_SHAPER_PROFILE_ID;
+ error->message = "shaper profile invalid! ";
+ return -EINVAL;
+ }
+
+ /* Check: root node */
+ if (parent_node_id == RTE_TM_NODE_ID_NULL) {
+ /* Check: level id */
+ if (level_id) {
+ error->type = RTE_TM_ERROR_TYPE_LEVEL_ID;
+ error->message = "level id invalid! ";
+ return -EINVAL;
+ }
+
+ /* Check: root node shaper params */
+ if ((!sp) || (sp->params.committed.size > 0) ||
+ (sp->params.committed.rate > 0) ||
+ (params->n_shared_shapers > 0) ||
+ (params->shared_shaper_id) ||
+ (params->shaper_profile_id
+ == RTE_TM_SHAPER_PROFILE_ID_NONE)) {
+ error->type
+ = RTE_TM_ERROR_TYPE_NODE_PARAMS_SHAPER_PROFILE_ID;
+ error->message = "root node shaper invalid! ";
+ return -EINVAL;
+ }
+
+ /* Check: root node */
+ if (tm_root_node_present(dev)) {
+ error->type = RTE_TM_ERROR_TYPE_NODE_PARENT_NODE_ID;
+ error->message = "root node already present!";
+ return -EINVAL;
+ }
+ }
+
+ /* Node add */
+ snprintf(node_name, sizeof(node_name), "node_%u", node_id);
+ tm_node = rte_zmalloc(node_name, sizeof(struct tm_node), 0);
+ if (!tm_node)
+ return -ENOMEM;
+
+ /* Check: parent node */
+ if (parent_node_id != RTE_TM_NODE_ID_NULL) {
+ parent_node = tm_node_search(dev, parent_node_id);
+
+ if (!parent_node) {
+ error->type = RTE_TM_ERROR_TYPE_NODE_PARENT_NODE_ID;
+ error->message = "parent node does not exist!";
+ rte_free(tm_node);
+ return -EINVAL;
+ }
+
+ if ((level_id != RTE_TM_NODE_LEVEL_ID_ANY) &&
+ (level_id != parent_node->level + 1)) {
+ error->type = RTE_TM_ERROR_TYPE_LEVEL_ID;
+ error->message = "level id invalid!";
+ rte_free(tm_node);
+ return -EINVAL;
+ }
+
+ tm_node->parent_node = parent_node;
+ parent_node->n_child += 1;
+
+ } else {
+ tm_node->parent_node = NULL;
+ }
+
+ tm_node->id = node_id;
+ tm_node->priority = priority;
+ tm_node->weight = weight;
+ tm_node->n_child = 0;
+ tm_node->level = level_id;
+ tm_node->shaper_profile = sp;
+
+ (void)rte_memcpy(&tm_node->params,
+ params, sizeof(struct rte_tm_node_params));
+
+ /* Update: shaper profile users */
+ sp->n_users++;
+
+ /* Update: node list and number of nodes per level */
+ TAILQ_INSERT_TAIL(&p->tm_conf.tm_nodes, tm_node, node);
+ p->tm_conf.n_tm_nodes[tm_node->level] += 1;
+
+ return 0;
+}
+
+/* Traffic manager node delete */
+static int
+pmd_tm_node_delete(struct rte_eth_dev *dev,
+ uint32_t node_id,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_node *tm_node;
+
+ if (!error)
+ return -EINVAL;
+
+ if (node_id == RTE_TM_NODE_ID_NULL) {
+ error->type = RTE_TM_ERROR_TYPE_NODE_ID;
+ error->message = "invalid node id";
+ return -EINVAL;
+ }
+
+ /* Check: node id */
+ tm_node = tm_node_search(dev, node_id);
+ if (!tm_node) {
+ error->type = RTE_TM_ERROR_TYPE_NODE_ID;
+ error->message = "node id invalid!";
+ return -EINVAL;
+ }
+
+ /* Check: node child */
+ if (tm_node->n_child) {
+ error->type = RTE_TM_ERROR_TYPE_NODE_ID;
+ error->message = "node children exist!";
+ return -EINVAL;
+ }
+
+ /* Delete node */
+ if (tm_node->shaper_profile)
+ tm_node->shaper_profile->n_users--;
+ if (tm_node->parent_node)
+ tm_node->parent_node->n_child--;
+ TAILQ_REMOVE(&p->tm_conf.tm_nodes, tm_node, node);
+ p->tm_conf.n_tm_nodes[tm_node->level]--;
+ rte_free(tm_node);
+
+ return 0;
+}
+
+/* Traffic manager hierarchy commit */
+static int
+pmd_tm_hierarchy_commit(struct rte_eth_dev *dev,
+ int clear_on_fail,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_params *t = &p->tm_params;
+ struct tm_conf *tm_conf = &p->tm_conf;
+ struct tm_node *tm_node;
+ struct tm_shaper_profile *sp;
+ uint32_t i, j, pid = 0, subport_id, pipe_id, n_subports;
+ uint32_t n_subports_per_port, n_pipes_per_subport, n_pipe_profiles;
+ struct tm_shaper_profiles *sp_list = &tm_conf->shaper_profiles;
+
+ if (!error) {
+ if (clear_on_fail)
+ goto fail_clear;
+ return -EINVAL;
+ }
+
+ /* TM Port */
+ tm_node = tm_root_node_present(dev);
+ if (!tm_node) {
+ error->type = RTE_TM_ERROR_TYPE_NODE_PARENT_NODE_ID;
+ error->message = "root node does not exist!";
+ if (clear_on_fail)
+ goto fail_clear;
+ return -EINVAL;
+ }
+
+ n_subports_per_port = tm_conf->n_tm_nodes[TM_NODE_LEVEL_SUBPORT];
+ if (n_subports_per_port > TM_MAX_SUBPORTS) {
+ error->type = RTE_TM_ERROR_TYPE_NODE_PARAMS;
+ error->message = "Number of subports exceeded!";
+ if (clear_on_fail)
+ goto fail_clear;
+ return -EINVAL;
+ }
+
+ n_pipes_per_subport = tm_conf->n_tm_nodes[TM_NODE_LEVEL_PIPE];
+ if (n_pipes_per_subport > TM_MAX_PIPES_PER_SUBPORT) {
+ error->type = RTE_TM_ERROR_TYPE_NODE_PARAMS;
+ error->message = "Number of pipes exceeded!";
+ if (clear_on_fail)
+ goto fail_clear;
+ return -EINVAL;
+ }
+ n_pipe_profiles = tm_shaper_profile_count(dev);
+ if (n_pipe_profiles > RTE_SCHED_PIPE_PROFILES_PER_PORT) {
+ error->type = RTE_TM_ERROR_TYPE_NODE_PARAMS;
+ error->message = "Number of pipe profiles exceeded!";
+ if (clear_on_fail)
+ goto fail_clear;
+ return -EINVAL;
+ }
+
+ t->port_params.n_subports_per_port = n_subports_per_port;
+ t->port_params.n_pipes_per_subport = n_pipes_per_subport;
+ t->port_params.n_pipe_profiles = n_pipe_profiles;
+ t->port_params.frame_overhead = RTE_TM_ETH_FRAMING_OVERHEAD_FCS;
+ for (i = 0; i < RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS; i++)
+ t->port_params.qsize[i] = TM_MAX_QUEUE_SIZE;
+
+ TAILQ_FOREACH(sp, sp_list, node) {
+ if (sp->shared_shaper_id == RTE_TM_SHAPER_PROFILE_ID_NONE) {
+ t->port_params.pipe_profiles[pid].tb_rate
+ = sp->params.peak.rate;
+ t->port_params.pipe_profiles[pid].tb_size
+ = sp->params.peak.size;
+
+ for (i = 0; i < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; i++)
+ t->port_params.pipe_profiles[pid].tc_rate[i]
+ = sp->params.peak.rate;
+
+ t->port_params.pipe_profiles[pid].tc_period = 40;
+
+ for (i = 0; i < RTE_SCHED_QUEUES_PER_PIPE; i++)
+ t->port_params.pipe_profiles[pid].wrr_weights[i]
+ = 1;
+
+ pid++;
+ }
+ }
+
+ /* TM Subport */
+ n_subports = t->port_params.n_subports_per_port;
+ for (subport_id = 0; subport_id < n_subports; subport_id++) {
+ struct tm_node *subport = tm_node_search(dev, subport_id + 1);
+ uint32_t n_shapers = subport->params.n_shared_shapers;
+
+ for (i = 0; i < n_shapers; i++) {
+ struct tm_shaper_profile *sp
+ = tm_shared_shaper_search(dev,
+ subport->params.shared_shaper_id[i]);
+
+ t->subport_params[subport_id].tb_rate
+ = sp->params.peak.rate;
+ t->subport_params[subport_id].tb_size
+ = sp->params.peak.size;
+ for (j = 0; j < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; j++)
+ t->subport_params[subport_id].tc_rate[j]
+ = sp->params.peak.rate;
+
+ t->subport_params[subport_id].tc_period = 10;
+ }
+
+ /* TM Pipe */
+ n_pipes_per_subport = t->port_params.n_pipes_per_subport;
+ pipe_id = n_subports + 1;
+ for (; pipe_id < n_pipes_per_subport; pipe_id++) {
+ uint32_t n_max_pipes
+ = tm_node_get_child(dev, subport_id);
+ uint32_t pos = subport_id * n_max_pipes + pipe_id;
+ struct tm_node *pipe = tm_node_search(dev, pos);
+
+ t->pipe_to_profile[pos]
+ = pipe->shaper_profile->shaper_profile_id;
+ }
+ }
+
+ return 0;
+
+fail_clear:
+ if (clear_on_fail) {
+ tm_conf_uninit(dev);
+ tm_conf_init(dev);
+ }
+ return -EINVAL;
+}
+
+/* Traffic manager read stats counters for specific node */
+static int
+pmd_tm_node_stats_read(struct rte_eth_dev *dev __rte_unused,
+ uint32_t node_id __rte_unused,
+ struct rte_tm_node_stats *stats __rte_unused,
+ uint64_t *stats_mask __rte_unused,
+ int clear __rte_unused,
+ struct rte_tm_error *error __rte_unused)
+{
+ return 0;
+}
+
+const struct rte_tm_ops pmd_tm_ops = {
+ .node_type_get = pmd_tm_node_type_get,
+ .capabilities_get = pmd_tm_capabilities_get,
+ .level_capabilities_get = pmd_tm_level_capabilities_get,
+ .node_capabilities_get = pmd_tm_node_capabilities_get,
+
+ .wred_profile_add = NULL,
+ .wred_profile_delete = NULL,
+ .shared_wred_context_add_update = NULL,
+ .shared_wred_context_delete = NULL,
+
+ .shaper_profile_add = pmd_tm_shaper_profile_add,
+ .shaper_profile_delete = pmd_tm_shaper_profile_delete,
+ .shared_shaper_add_update = pmd_tm_shared_shaper_add_update,
+ .shared_shaper_delete = pmd_tm_shared_shaper_delete,
+
+ .node_add = pmd_tm_node_add,
+ .node_delete = pmd_tm_node_delete,
+ .node_suspend = NULL,
+ .node_resume = NULL,
+ .hierarchy_commit = pmd_tm_hierarchy_commit,
+
+ .node_parent_update = NULL,
+ .node_shaper_update = NULL,
+ .node_shared_shaper_update = NULL,
+ .node_stats_update = NULL,
+ .node_wfq_weight_mode_update = NULL,
+ .node_cman_update = NULL,
+ .node_wred_context_update = NULL,
+ .node_shared_wred_context_update = NULL,
+
+ .node_stats_read = pmd_tm_node_stats_read,
+};
--
2.9.3
^ permalink raw reply [flat|nested] 79+ messages in thread
* Re: [dpdk-dev] [PATCH 0/2] net/softnic: sw fall-back for traffic management
2017-06-08 16:43 ` Dumitrescu, Cristian
@ 2017-07-04 23:48 ` Thomas Monjalon
2017-07-05 9:32 ` Dumitrescu, Cristian
0 siblings, 1 reply; 79+ messages in thread
From: Thomas Monjalon @ 2017-07-04 23:48 UTC (permalink / raw)
To: Dumitrescu, Cristian
Cc: dev, Singh, Jasvinder, Yigit, Ferruh, hemant.agrawal,
Jerin.JacobKollanukkaran, Lu, Wenzhuo, techboard
08/06/2017 18:43, Dumitrescu, Cristian:
> <snip> ...
> >
> > I'm sure I'm missing something.
> > In my understanding, we do not need to change the ops:
> > - if the device offers the capability, let's call the ops
> > - else call the software fallback function
> >
>
> What you might be missing is the observation that the approach you're describing requires changing each and every PMD. The changes are also intrusive: need to change the ops that need the SW fall-back patching, also need to change the private data of each PMD (as assigned to the opaque dev->data->dev_private) to add the context data needed by the patched ops. Therefore, this approach is a no-go.
>
> We are looking for a generic approach that can gracefully and transparently work with any PMD.
Nobody is participating in this discussion.
Can we discuss how to proceed in the technical board meeting?
^ permalink raw reply [flat|nested] 79+ messages in thread
* Re: [dpdk-dev] [PATCH 0/2] net/softnic: sw fall-back for traffic management
2017-07-04 23:48 ` Thomas Monjalon
@ 2017-07-05 9:32 ` Dumitrescu, Cristian
2017-07-05 10:17 ` Thomas Monjalon
0 siblings, 1 reply; 79+ messages in thread
From: Dumitrescu, Cristian @ 2017-07-05 9:32 UTC (permalink / raw)
To: Thomas Monjalon
Cc: dev, Singh, Jasvinder, Yigit, Ferruh, hemant.agrawal,
Jerin.JacobKollanukkaran, Lu, Wenzhuo, techboard
> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas@monjalon.net]
> Sent: Wednesday, July 5, 2017 12:48 AM
> To: Dumitrescu, Cristian <cristian.dumitrescu@intel.com>
> Cc: dev@dpdk.org; Singh, Jasvinder <jasvinder.singh@intel.com>; Yigit,
> Ferruh <ferruh.yigit@intel.com>; hemant.agrawal@nxp.com;
> Jerin.JacobKollanukkaran@cavium.com; Lu, Wenzhuo
> <wenzhuo.lu@intel.com>; techboard@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 0/2] net/softnic: sw fall-back for traffic
> management
>
> 08/06/2017 18:43, Dumitrescu, Cristian:
> > <snip> ...
> > >
> > > I'm sure I'm missing something.
> > > In my understanding, we do not need to change the ops:
> > > - if the device offers the capability, let's call the ops
> > > - else call the software fallback function
> > >
> >
> > What you might be missing is the observation that the approach you're
> describing requires changing each and every PMD. The changes are also
> intrusive: need to change the ops that need the SW fall-back patching, also
> need to change the private data of each PMD (as assigned to the opaque
> dev->data->dev_private) to add the context data needed by the patched
> ops. Therefore, this approach is a no-go.
> >
> > We are looking for a generic approach that can gracefully and transparently
> work with any PMD.
>
> Nobody is participating in this discussion.
> Can we discuss how to proceed in the technical board meeting?
Hi Thomas,
We are working to finalize a new version of the Soft NIC PMD which has a much simpler and more straightforward design (we'll explain it in the cover letter). We expect to send it in the next few days; hopefully we can target RC2.
I propose you take another look at this version and then decide whether we need TB involvement.
Regards,
Cristian
^ permalink raw reply [flat|nested] 79+ messages in thread
* Re: [dpdk-dev] [PATCH 0/2] net/softnic: sw fall-back for traffic management
2017-07-05 9:32 ` Dumitrescu, Cristian
@ 2017-07-05 10:17 ` Thomas Monjalon
0 siblings, 0 replies; 79+ messages in thread
From: Thomas Monjalon @ 2017-07-05 10:17 UTC (permalink / raw)
To: Dumitrescu, Cristian
Cc: dev, Singh, Jasvinder, Yigit, Ferruh, hemant.agrawal,
Jerin.JacobKollanukkaran, Lu, Wenzhuo, techboard
05/07/2017 11:32, Dumitrescu, Cristian:
>
> > -----Original Message-----
> > From: Thomas Monjalon [mailto:thomas@monjalon.net]
> > Sent: Wednesday, July 5, 2017 12:48 AM
> > To: Dumitrescu, Cristian <cristian.dumitrescu@intel.com>
> > Cc: dev@dpdk.org; Singh, Jasvinder <jasvinder.singh@intel.com>; Yigit,
> > Ferruh <ferruh.yigit@intel.com>; hemant.agrawal@nxp.com;
> > Jerin.JacobKollanukkaran@cavium.com; Lu, Wenzhuo
> > <wenzhuo.lu@intel.com>; techboard@dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH 0/2] net/softnic: sw fall-back for traffic
> > management
> >
> > 08/06/2017 18:43, Dumitrescu, Cristian:
> > > <snip> ...
> > > >
> > > > I'm sure I'm missing something.
> > > > In my understanding, we do not need to change the ops:
> > > > - if the device offers the capability, let's call the ops
> > > > - else call the software fallback function
> > > >
> > >
> > > What you might be missing is the observation that the approach you're
> > describing requires changing each and every PMD. The changes are also
> > intrusive: need to change the ops that need the SW fall-back patching, also
> > need to change the private data of each PMD (as assigned to the opaque
> > dev->data->dev_private) to add the context data needed by the patched
> > ops. Therefore, this approach is a no-go.
> > >
> > > We are looking for a generic approach that can gracefully and transparently
> > work with any PMD.
> >
> > Nobody is participating in this discussion.
> > Can we discuss how to proceed in the technical board meeting?
>
> Hi Thomas,
>
> We are working to finalize a new version of the Soft NIC PMD which has a much simplified/straightforward design (we'll explain in the cover letter). We expect to send it in the next few days, hopefully we can target RC2.
Thanks for continuing to work on it.
We will probably avoid a last-minute integration of this new design in 17.08.
So it could become a priority topic for 17.11.
> I propose you take another look at this version and then decide if we need TB involvement or not?
OK, let's wait for this version.
^ permalink raw reply [flat|nested] 79+ messages in thread
* [dpdk-dev] [PATCH v3 0/4] net/softnic: sw fall-back pmd for traffic mgmt and others
2017-06-26 16:43 ` [dpdk-dev] [PATCH v2 1/2] net/softnic: add softnic PMD " Jasvinder Singh
@ 2017-08-11 12:49 ` Jasvinder Singh
2017-08-11 12:49 ` [dpdk-dev] [PATCH v3 1/4] net/softnic: add softnic PMD Jasvinder Singh
` (4 more replies)
0 siblings, 5 replies; 79+ messages in thread
From: Jasvinder Singh @ 2017-08-11 12:49 UTC (permalink / raw)
To: dev; +Cc: cristian.dumitrescu, ferruh.yigit, thomas
The SoftNIC PMD is intended to provide SW fall-back options for specific
ethdev APIs in a generic way for NICs that do not support those features.
Currently, the only implemented ethdev API is Traffic Management (TM),
but other ethdev APIs such as rte_flow, traffic metering & policing, etc.
can easily be implemented.
Overview:
* Generic: The SoftNIC PMD works with any "hard" PMD that implements the
ethdev API. It does not change the "hard" PMD in any way.
* Creation: For any given "hard" ethdev port, the user can decide to
create an associated "soft" ethdev port to drive the "hard" port. The
"soft" port is a virtual device that can be created at app start-up
through EAL vdev arg or later through the virtual device API.
* Configuration: The app explicitly decides which features are to be
enabled on the "soft" port and which features are still to be used from
the "hard" port. The app continues to explicitly configure both the
"hard" and the "soft" ports after the creation of the "soft" port.
* RX/TX: The app reads packets from/writes packets to the "soft" port
instead of the "hard" port. The RX and TX queues of the "soft" port are
thread safe, like those of any ethdev.
* Execution: The "soft" port is a feature-rich NIC implemented by the CPU,
so the run function of the "soft" port has to be executed by the CPU in
order to get packets moving between the "hard" port and the app.
* Meets the NFV vision: The app should be (almost) agnostic about the NIC
implementation (different vendors/models, HW-SW mix), the app should not
require changes to use different NICs, the app should use the same API
for all NICs. If a NIC does not implement a specific feature, the HW
should be augmented with SW to meet the functionality while still
preserving the same API.
Traffic Management SW fall-back overview:
* Implements the ethdev traffic management API (rte_tm.h).
* Based on the existing librte_sched DPDK library.
Example: Create "soft" port for "hard" port "0000:04:00.1", enable the TM
feature with default settings:
--vdev 'net_softnic0,hard_name=0000:04:00.1,soft_tm=on'
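For illustration, here is a minimal sketch (not part of the patch set) of the
second creation path and of the CPU thread that drives the "soft" port. The
device names, queue ids, burst size and the rte_vdev_init() entry point used
as the virtual device API are assumptions for the example; only
rte_pmd_softnic_run() comes from this PMD:

    /* Sketch only: create the "soft" port at run time and drive it from a
     * CPU core. Device names, queue ids and burst size are illustrative.
     */
    #include <rte_ethdev.h>
    #include <rte_mbuf.h>
    #include <rte_vdev.h>
    #include <rte_launch.h>
    #include "rte_eth_softnic.h"

    static uint8_t soft_port_id;

    static int
    softnic_thread(void *arg __rte_unused)
    {
        struct rte_mbuf *pkts[32];
        uint16_t n_rx;

        for ( ; ; ) {
            /* Execute the run function of the "soft" port on the CPU */
            rte_pmd_softnic_run(soft_port_id);

            /* The app does its RX/TX on the "soft" port, not the "hard" one */
            n_rx = rte_eth_rx_burst(soft_port_id, 0, pkts, 32);
            if (n_rx)
                rte_eth_tx_burst(soft_port_id, 0, pkts, n_rx);
        }

        return 0;
    }

    static int
    softnic_port_create(unsigned int lcore_id)
    {
        /* Equivalent of the --vdev EAL argument shown above */
        if (rte_vdev_init("net_softnic0", "hard_name=0000:04:00.1,soft_tm=on"))
            return -1;

        if (rte_eth_dev_get_port_by_name("net_softnic0", &soft_port_id))
            return -1;

        /* Configure and start the "soft" port as any other ethdev, then
         * launch the run loop on a dedicated core.
         */
        return rte_eal_remote_launch(softnic_thread, NULL, lcore_id);
    }

The "soft" port still has to be configured and started through the usual
ethdev calls (device configure, queue setup, device start) before the thread
above is launched.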
Q1: Why generic name, if only TM is supported (for now)?
A1: The intention is to have SoftNIC PMD implement many other (all?)
ethdev APIs under a single "ideal" ethdev, hence the generic name.
The initial motivation is TM API, but the mechanism is generic and can
be used for many other ethdev APIs. Somebody looking to provide SW
fall-back for another ethdev API is likely to end up inventing the same,
hence it would be good to consolidate all under a single PMD and have
the user explicitly enable/disable the features it needs for each
"soft" device.
Q2: Are there any performance requirements for SoftNIC?
A2: Yes, performance should be great/decent for every feature, otherwise
the SW fall-back is unusable, thus useless.
Q3: Why not change the "hard" device (and keep a single device) instead of
creating a new "soft" device (and thus having two devices)?
A3: This is not possible with the current librte_ether ethdev
implementation. The ethdev->dev_ops are defined as constant structure,
so it cannot be changed per device (nor per PMD). The new ops also
need memory space to store their context data structures, which
requires updating the ethdev->data->dev_private of the existing
device; at best, maybe a resize of ethdev->data->dev_private could be
done, assuming that librte_ether will introduce a way to find out its
size, but this cannot be done while the device is running. Other side
effects might exist, as the changes are very intrusive, plus it likely
needs more changes in librte_ether.
Q4: Why not call the SW fall-back dev_ops directly in librte_ether for
devices which do not support the specific feature? If the device
supports the capability, let's call its dev_ops, otherwise call the
SW fall-back dev_ops.
A4: First, for reasons similar to A3. This removes the need to change
ethdev->dev_ops of the device, but it does not do anything to fix the
other significant issue of where to store the context data structures
needed by the SW fall-back functions (which, in this approach, are
called implicitly by librte_ether).
Second, the SW fall-back options should not be restricted arbitrarily
by the librte_ether library, the decision should belong to the app.
For example, the TM SW fall-back should not be limited to only
librte_sched, which (like any SW fall-back) is limited to a specific
hierarchy and feature set, it cannot do any possible hierarchy. If
alternatives exist, the one to use should be picked by the app, not by
the ethdev layer.
Q5: Why is the app required to continue to configure both the "hard" and
the "soft" devices even after the "soft" device has been created? Why
not hiding the "hard" device under the "soft" device and have the
"soft" device configure the "hard" device under the hood?
A5: This was the approach tried in the V2 of this patch set (overlay
"soft" device taking over the configuration of the underlay "hard"
device) and eventually dropped due to increased complexity of having
to keep the configuration of two distinct devices in sync with
a librte_ether implementation that is not friendly towards such an
approach. Basically, each ethdev API call for the overlay device
needs to configure the overlay device, invoke the same configuration
with possibly modified parameters for the underlay device, then resume
the configuration of overlay device, turning this into a device
emulation project.
V2 minuses: increased complexity (deal with two devices at same time);
need to implement every ethdev API, even those not needed for the scope
of SW fall-back; intrusive; sometimes have to silently take decisions
that should be left to the app.
V3 pluses: lower complexity (only one device); only need to implement
those APIs that are in scope of the SW fall-back; non-intrusive (deal
with "hard" device through ethdev API); app decisions taken by the app
in an explicit way.
Q6: Why expose the SW fall-back in a PMD and not in a SW library?
A6: The SW fall-back for an ethdev API has to implement that specific
ethdev API, (hence expose an ethdev object through a PMD), as opposed
to providing a different API. This approach allows the app to use the
same API (NFV vision). For example, we already have a library for TM
SW fall-back (librte_sched) that can be called directly by the apps
that need to call it outside of ethdev context (use-cases exist), but
an app that works with TM-aware NICs through the ethdev TM API would
have to be changed significantly in order to work with different
TM-agnostic NICs through the librte_sched API.
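To make this concrete, below is a hedged sketch (illustration only, not taken
from the patch set) of the app-facing usage: the same generic rte_tm.h calls
are issued against the "soft" port, exactly as they would be against a
TM-aware NIC. The node ids, rates, sizes and the truncated hierarchy are made
up; a real hierarchy must extend down to TC and queue nodes and satisfy the
constraints advertised by the PMD capability ops:

    /* Sketch only: drive the TM SW fall-back through the generic ethdev
     * TM API; node ids, rates and sizes are illustrative.
     */
    #include <rte_tm.h>

    static int
    soft_tm_setup(uint8_t port_id)
    {
        struct rte_tm_error err;
        struct rte_tm_shaper_params sp = {
            .peak = { .rate = 1250000000, .size = 1000000 }, /* 10 Gbps, bytes/s */
        };
        struct rte_tm_node_params np = { .shaper_profile_id = 0 };

        /* One shaper profile, reused by every node in this tiny example */
        if (rte_tm_shaper_profile_add(port_id, 0, &sp, &err))
            return -1;

        /* Port (root), subport and pipe nodes; TC/queue nodes omitted */
        if (rte_tm_node_add(port_id, 100, RTE_TM_NODE_ID_NULL, 0, 1,
                RTE_TM_NODE_LEVEL_ID_ANY, &np, &err) ||
            rte_tm_node_add(port_id, 200, 100, 0, 1,
                RTE_TM_NODE_LEVEL_ID_ANY, &np, &err) ||
            rte_tm_node_add(port_id, 300, 200, 0, 1,
                RTE_TM_NODE_LEVEL_ID_ANY, &np, &err))
            return -1;

        /* Freeze the hierarchy; the PMD maps it onto librte_sched */
        return rte_tm_hierarchy_commit(port_id, 1 /* clear_on_fail */, &err);
    }

The point is that the code above does not change when the underlying NIC does
implement TM in HW; only the port it is applied to changes.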
Q7: Why have all the SW fall-backs in a single PMD? Why not develop
the SW fall-back for each different ethdev API in a separate PMD, then
create a chain of "soft" devices for each "hard" device? Potentially,
this results in smaller size PMDs that are easier to maintain.
A7: Arguments for single ethdev/PMD and against chain of ethdevs/PMDs:
1. All the existing PMDs for HW NICs implement a lot of features under
the same PMD, so there is no reason for single PMD approach to break
code modularity. See the V3 code, a lot of care has been taken for
code modularity.
2. We should avoid the proliferation of SW PMDs.
3. A single device should be handled by a single PMD.
4. People are used to feature-rich PMDs, not to single-feature
PMDs, so why change the mindset?
5. [Configuration nightmare] A chain of "soft" devices attached to
single "hard" device requires the app to be aware that the N "soft"
devices in the chain plus the "hard" device refer to the same HW
device, and which device should be invoked to configure which
feature. Also the length of the chain and functionality of each
link is different for each HW device. This breaks the requirement
of preserving the same API while working with different NICs (NFV).
This most likely results in a configuration nightmare, nobody is
going to seriously use this.
6. [Feature inter-dependency] Sometimes different features need to be
configured and executed together (e.g. share the same set of
resources, are inter-dependent, etc), so it is better and more
performant to do them in the same ethdev/PMD.
7. [Code duplication] There is a lot of duplication in the
configuration code for the chain of ethdevs approach. The ethdev
dev_configure, rx_queue_setup, tx_queue_setup API functions have to
be implemented per device, and they become meaningless/inconsistent
with the chain approach.
8. [Data structure duplication] The per device data structures have to
be duplicated and read repeatedly for each "soft" ethdev. The
ethdev device, dev_private, data, per RX/TX queue data structures
have to be replicated per "soft" device. They have to be re-read for
each stage, so the same cache misses are now multiplied with the
number of stages in the chain.
9. [rte_ring proliferation] Thread safety requirements for ethdev
RX/TX queues require an rte_ring to be used for every RX/TX queue
of each "soft" ethdev. This rte_ring proliferation unnecessarily
increases the memory footprint and lowers performance, especially
when each "soft" ethdev ends up on a different CPU core (ping-pong
of cache lines).
10. [Meta-data proliferation] A chain of ethdevs is likely to result
in proliferation of meta-data that has to be passed between the
ethdevs (e.g. policing needs the output of flow classification),
which results in more cache line ping-pong between cores, hence
performance drops.
Cristian Dumitrescu (4):
Jasvinder Singh (4):
net/softnic: add softnic PMD
net/softnic: add traffic management support
net/softnic: add TM capabilities ops
net/softnic: add TM hierarchy related ops
MAINTAINERS | 5 +
config/common_base | 5 +
drivers/net/Makefile | 5 +
drivers/net/softnic/Makefile | 57 +
drivers/net/softnic/rte_eth_softnic.c | 867 ++++++
drivers/net/softnic/rte_eth_softnic.h | 70 +
drivers/net/softnic/rte_eth_softnic_internals.h | 291 ++
drivers/net/softnic/rte_eth_softnic_tm.c | 3446 +++++++++++++++++++++++
drivers/net/softnic/rte_eth_softnic_version.map | 7 +
mk/rte.app.mk | 5 +-
10 files changed, 4757 insertions(+), 1 deletion(-)
create mode 100644 drivers/net/softnic/Makefile
create mode 100644 drivers/net/softnic/rte_eth_softnic.c
create mode 100644 drivers/net/softnic/rte_eth_softnic.h
create mode 100644 drivers/net/softnic/rte_eth_softnic_internals.h
create mode 100644 drivers/net/softnic/rte_eth_softnic_tm.c
create mode 100644 drivers/net/softnic/rte_eth_softnic_version.map
--
2.9.3
^ permalink raw reply [flat|nested] 79+ messages in thread
* [dpdk-dev] [PATCH v3 1/4] net/softnic: add softnic PMD
2017-08-11 12:49 ` [dpdk-dev] [PATCH v3 0/4] net/softnic: sw fall-back pmd for traffic mgmt and others Jasvinder Singh
@ 2017-08-11 12:49 ` Jasvinder Singh
2017-09-05 14:53 ` Ferruh Yigit
2017-09-18 9:10 ` [dpdk-dev] [PATCH v4 0/4] net/softnic: sw fall-back pmd for traffic mgmt and others Jasvinder Singh
2017-08-11 12:49 ` [dpdk-dev] [PATCH v3 2/4] net/softnic: add traffic management support Jasvinder Singh
` (3 subsequent siblings)
4 siblings, 2 replies; 79+ messages in thread
From: Jasvinder Singh @ 2017-08-11 12:49 UTC (permalink / raw)
To: dev; +Cc: cristian.dumitrescu, ferruh.yigit, thomas
Add SoftNIC PMD to provide SW fall-back for ethdev APIs.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
---
v3 changes:
- rebase to dpdk17.08 release
v2 changes:
- fix build errors
- rebased to TM APIs v6 plus dpdk master
MAINTAINERS | 5 +
config/common_base | 5 +
drivers/net/Makefile | 5 +
drivers/net/softnic/Makefile | 56 +++
drivers/net/softnic/rte_eth_softnic.c | 609 ++++++++++++++++++++++++
drivers/net/softnic/rte_eth_softnic.h | 54 +++
drivers/net/softnic/rte_eth_softnic_internals.h | 114 +++++
drivers/net/softnic/rte_eth_softnic_version.map | 7 +
mk/rte.app.mk | 5 +-
9 files changed, 859 insertions(+), 1 deletion(-)
create mode 100644 drivers/net/softnic/Makefile
create mode 100644 drivers/net/softnic/rte_eth_softnic.c
create mode 100644 drivers/net/softnic/rte_eth_softnic.h
create mode 100644 drivers/net/softnic/rte_eth_softnic_internals.h
create mode 100644 drivers/net/softnic/rte_eth_softnic_version.map
diff --git a/MAINTAINERS b/MAINTAINERS
index a0cd75e..b6b738d 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -511,6 +511,11 @@ M: Gaetan Rivet <gaetan.rivet@6wind.com>
F: drivers/net/failsafe/
F: doc/guides/nics/fail_safe.rst
+Softnic PMD
+M: Jasvinder Singh <jasvinder.singh@intel.com>
+M: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
+F: drivers/net/softnic
+
Crypto Drivers
--------------
diff --git a/config/common_base b/config/common_base
index 5e97a08..1a0c77d 100644
--- a/config/common_base
+++ b/config/common_base
@@ -273,6 +273,11 @@ CONFIG_RTE_LIBRTE_SFC_EFX_PMD=y
CONFIG_RTE_LIBRTE_SFC_EFX_DEBUG=n
#
+# Compile SOFTNIC PMD
+#
+CONFIG_RTE_LIBRTE_PMD_SOFTNIC=y
+
+#
# Compile software PMD backed by SZEDATA2 device
#
CONFIG_RTE_LIBRTE_PMD_SZEDATA2=n
diff --git a/drivers/net/Makefile b/drivers/net/Makefile
index d33c959..b552a51 100644
--- a/drivers/net/Makefile
+++ b/drivers/net/Makefile
@@ -110,4 +110,9 @@ DIRS-$(CONFIG_RTE_LIBRTE_PMD_VHOST) += vhost
endif # $(CONFIG_RTE_LIBRTE_VHOST)
DEPDIRS-vhost = $(core-libs) librte_vhost
+ifeq ($(CONFIG_RTE_LIBRTE_SCHED),y)
+DIRS-$(CONFIG_RTE_LIBRTE_PMD_SOFTNIC) += softnic
+endif # $(CONFIG_RTE_LIBRTE_SCHED)
+DEPDIRS-softnic = $(core-libs) librte_sched
+
include $(RTE_SDK)/mk/rte.subdir.mk
diff --git a/drivers/net/softnic/Makefile b/drivers/net/softnic/Makefile
new file mode 100644
index 0000000..8d00656
--- /dev/null
+++ b/drivers/net/softnic/Makefile
@@ -0,0 +1,56 @@
+# BSD LICENSE
+#
+# Copyright(c) 2017 Intel Corporation. All rights reserved.
+# All rights reserved.
+#
+# Redistribution and use in source and binary forms, with or without
+# modification, are permitted provided that the following conditions
+# are met:
+#
+# * Redistributions of source code must retain the above copyright
+# notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+# notice, this list of conditions and the following disclaimer in
+# the documentation and/or other materials provided with the
+# distribution.
+# * Neither the name of Intel Corporation nor the names of its
+# contributors may be used to endorse or promote products derived
+# from this software without specific prior written permission.
+#
+# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+#
+# library name
+#
+LIB = librte_pmd_softnic.a
+
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS)
+
+EXPORT_MAP := rte_eth_softnic_version.map
+
+LIBABIVER := 1
+
+#
+# all source are stored in SRCS-y
+#
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_SOFTNIC) += rte_eth_softnic.c
+
+#
+# Export include files
+#
+SYMLINK-y-include += rte_eth_softnic.h
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/drivers/net/softnic/rte_eth_softnic.c b/drivers/net/softnic/rte_eth_softnic.c
new file mode 100644
index 0000000..35cb93c
--- /dev/null
+++ b/drivers/net/softnic/rte_eth_softnic.c
@@ -0,0 +1,609 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright(c) 2017 Intel Corporation. All rights reserved.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdint.h>
+#include <stdlib.h>
+#include <string.h>
+
+#include <rte_ethdev.h>
+#include <rte_ethdev_vdev.h>
+#include <rte_malloc.h>
+#include <rte_vdev.h>
+#include <rte_kvargs.h>
+#include <rte_errno.h>
+#include <rte_ring.h>
+
+#include "rte_eth_softnic.h"
+#include "rte_eth_softnic_internals.h"
+
+#define PMD_PARAM_HARD_NAME "hard_name"
+#define PMD_PARAM_HARD_TX_QUEUE_ID "hard_tx_queue_id"
+
+static const char *pmd_valid_args[] = {
+ PMD_PARAM_HARD_NAME,
+ PMD_PARAM_HARD_TX_QUEUE_ID,
+ NULL
+};
+
+static struct rte_vdev_driver pmd_drv;
+
+static const struct rte_eth_dev_info pmd_dev_info = {
+ .min_rx_bufsize = 0,
+ .max_rx_pktlen = UINT32_MAX,
+ .max_rx_queues = UINT16_MAX,
+ .max_tx_queues = UINT16_MAX,
+ .rx_desc_lim = {
+ .nb_max = UINT16_MAX,
+ .nb_min = 0,
+ .nb_align = 1,
+ },
+ .tx_desc_lim = {
+ .nb_max = UINT16_MAX,
+ .nb_min = 0,
+ .nb_align = 1,
+ },
+};
+
+static void
+pmd_dev_infos_get(struct rte_eth_dev *dev __rte_unused,
+ struct rte_eth_dev_info *dev_info)
+{
+ memcpy(dev_info, &pmd_dev_info, sizeof(*dev_info));
+}
+
+static int
+pmd_dev_configure(struct rte_eth_dev *dev)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct rte_eth_dev *hard_dev = &rte_eth_devices[p->hard.port_id];
+
+ if (dev->data->nb_rx_queues > hard_dev->data->nb_rx_queues)
+ return -1;
+
+ if (p->params.hard.tx_queue_id >= hard_dev->data->nb_tx_queues)
+ return -1;
+
+ return 0;
+}
+
+static int
+pmd_rx_queue_setup(struct rte_eth_dev *dev,
+ uint16_t rx_queue_id,
+ uint16_t nb_rx_desc __rte_unused,
+ unsigned int socket_id,
+ const struct rte_eth_rxconf *rx_conf __rte_unused,
+ struct rte_mempool *mb_pool __rte_unused)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ if (p->params.soft.intrusive == 0) {
+ struct pmd_rx_queue *rxq;
+
+ rxq = rte_zmalloc_socket(p->params.soft.name,
+ sizeof(struct pmd_rx_queue), 0, socket_id);
+ if (rxq == NULL)
+ return -1;
+
+ rxq->hard.port_id = p->hard.port_id;
+ rxq->hard.rx_queue_id = rx_queue_id;
+ dev->data->rx_queues[rx_queue_id] = rxq;
+ } else {
+ struct rte_eth_dev *hard_dev =
+ &rte_eth_devices[p->hard.port_id];
+ void *rxq = hard_dev->data->rx_queues[rx_queue_id];
+
+ if (rxq == NULL)
+ return -1;
+
+ dev->data->rx_queues[rx_queue_id] = rxq;
+ }
+ return 0;
+}
+
+static int
+pmd_tx_queue_setup(struct rte_eth_dev *dev,
+ uint16_t tx_queue_id,
+ uint16_t nb_tx_desc,
+ unsigned int socket_id,
+ const struct rte_eth_txconf *tx_conf __rte_unused)
+{
+ uint32_t size = RTE_ETH_NAME_MAX_LEN + strlen("_txq") + 4;
+ char name[size];
+ struct rte_ring *r;
+
+ snprintf(name, sizeof(name), "%s_txq%04x",
+ dev->data->name, tx_queue_id);
+ r = rte_ring_create(name, nb_tx_desc, socket_id,
+ RING_F_SP_ENQ | RING_F_SC_DEQ);
+ if (r == NULL)
+ return -1;
+
+ dev->data->tx_queues[tx_queue_id] = r;
+ return 0;
+}
+
+static int
+pmd_dev_start(struct rte_eth_dev *dev)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ dev->data->dev_link.link_status = ETH_LINK_UP;
+
+ if (p->params.soft.intrusive) {
+ struct rte_eth_dev *hard_dev =
+ &rte_eth_devices[p->hard.port_id];
+
+ /* The hard_dev->rx_pkt_burst should be stable by now */
+ dev->rx_pkt_burst = hard_dev->rx_pkt_burst;
+ }
+
+ return 0;
+}
+
+static void
+pmd_dev_stop(struct rte_eth_dev *dev)
+{
+ dev->data->dev_link.link_status = ETH_LINK_DOWN;
+}
+
+static void
+pmd_dev_close(struct rte_eth_dev *dev)
+{
+ uint32_t i;
+
+ /* TX queues */
+ for (i = 0; i < dev->data->nb_tx_queues; i++)
+ rte_ring_free((struct rte_ring *) dev->data->tx_queues[i]);
+}
+
+static int
+pmd_link_update(struct rte_eth_dev *dev __rte_unused,
+ int wait_to_complete __rte_unused)
+{
+ return 0;
+}
+
+static const struct eth_dev_ops pmd_ops = {
+ .dev_configure = pmd_dev_configure,
+ .dev_start = pmd_dev_start,
+ .dev_stop = pmd_dev_stop,
+ .dev_close = pmd_dev_close,
+ .link_update = pmd_link_update,
+ .dev_infos_get = pmd_dev_infos_get,
+ .rx_queue_setup = pmd_rx_queue_setup,
+ .tx_queue_setup = pmd_tx_queue_setup,
+ .tm_ops_get = NULL,
+};
+
+static uint16_t
+pmd_rx_pkt_burst(void *rxq,
+ struct rte_mbuf **rx_pkts,
+ uint16_t nb_pkts)
+{
+ struct pmd_rx_queue *rx_queue = rxq;
+
+ return rte_eth_rx_burst(rx_queue->hard.port_id,
+ rx_queue->hard.rx_queue_id,
+ rx_pkts,
+ nb_pkts);
+}
+
+static uint16_t
+pmd_tx_pkt_burst(void *txq,
+ struct rte_mbuf **tx_pkts,
+ uint16_t nb_pkts)
+{
+ return (uint16_t) rte_ring_enqueue_burst(txq,
+ (void **) tx_pkts,
+ nb_pkts,
+ NULL);
+}
+
+static __rte_always_inline int
+rte_pmd_softnic_run_default(struct rte_eth_dev *dev)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ /* Persistent context: Read Only (update not required) */
+ struct rte_mbuf **pkts = p->soft.def.pkts;
+ uint16_t nb_tx_queues = dev->data->nb_tx_queues;
+
+ /* Persistent context: Read - Write (update required) */
+ uint32_t txq_pos = p->soft.def.txq_pos;
+ uint32_t pkts_len = p->soft.def.pkts_len;
+ uint32_t flush_count = p->soft.def.flush_count;
+
+ /* Not part of the persistent context */
+ uint32_t pos;
+ uint16_t i;
+
+ /* Soft device TXQ read, Hard device TXQ write */
+ for (i = 0; i < nb_tx_queues; i++) {
+ struct rte_ring *txq = dev->data->tx_queues[txq_pos];
+
+ /* Read soft device TXQ burst to packet enqueue buffer */
+ pkts_len += rte_ring_sc_dequeue_burst(txq,
+ (void **) &pkts[pkts_len],
+ DEFAULT_BURST_SIZE,
+ NULL);
+
+ /* Increment soft device TXQ */
+ txq_pos++;
+ if (txq_pos >= nb_tx_queues)
+ txq_pos = 0;
+
+ /* Hard device TXQ write when complete burst is available */
+ if (pkts_len >= DEFAULT_BURST_SIZE) {
+ for (pos = 0; pos < pkts_len; )
+ pos += rte_eth_tx_burst(p->hard.port_id,
+ p->params.hard.tx_queue_id,
+ &pkts[pos],
+ (uint16_t) (pkts_len - pos));
+
+ pkts_len = 0;
+ flush_count = 0;
+ break;
+ }
+ }
+
+ if (flush_count >= FLUSH_COUNT_THRESHOLD) {
+ for (pos = 0; pos < pkts_len; )
+ pos += rte_eth_tx_burst(p->hard.port_id,
+ p->params.hard.tx_queue_id,
+ &pkts[pos],
+ (uint16_t) (pkts_len - pos));
+
+ pkts_len = 0;
+ flush_count = 0;
+ }
+
+ p->soft.def.txq_pos = txq_pos;
+ p->soft.def.pkts_len = pkts_len;
+ p->soft.def.flush_count = flush_count + 1;
+
+ return 0;
+}
+
+int
+rte_pmd_softnic_run(uint8_t port_id)
+{
+ struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+
+#ifdef RTE_LIBRTE_ETHDEV_DEBUG
+ RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, 0);
+#endif
+
+ return rte_pmd_softnic_run_default(dev);
+}
+
+static struct ether_addr eth_addr = { .addr_bytes = {0} };
+
+static uint32_t
+eth_dev_speed_max_mbps(uint32_t speed_capa)
+{
+ uint32_t rate_mbps[32] = {
+ ETH_SPEED_NUM_NONE,
+ ETH_SPEED_NUM_10M,
+ ETH_SPEED_NUM_10M,
+ ETH_SPEED_NUM_100M,
+ ETH_SPEED_NUM_100M,
+ ETH_SPEED_NUM_1G,
+ ETH_SPEED_NUM_2_5G,
+ ETH_SPEED_NUM_5G,
+ ETH_SPEED_NUM_10G,
+ ETH_SPEED_NUM_20G,
+ ETH_SPEED_NUM_25G,
+ ETH_SPEED_NUM_40G,
+ ETH_SPEED_NUM_50G,
+ ETH_SPEED_NUM_56G,
+ ETH_SPEED_NUM_100G,
+ };
+
+ uint32_t pos = (speed_capa) ? (31 - __builtin_clz(speed_capa)) : 0;
+ return rate_mbps[pos];
+}
+
+static int
+default_init(struct pmd_internals *p,
+ struct pmd_params *params,
+ int numa_node)
+{
+ p->soft.def.pkts = rte_zmalloc_socket(params->soft.name,
+ 2 * DEFAULT_BURST_SIZE * sizeof(struct rte_mbuf *),
+ 0,
+ numa_node);
+
+ if (p->soft.def.pkts == NULL)
+ return -ENOMEM;
+
+ return 0;
+}
+
+static void
+default_free(struct pmd_internals *p)
+{
+ rte_free(p->soft.def.pkts);
+}
+
+static void *
+pmd_init(struct pmd_params *params, int numa_node)
+{
+ struct pmd_internals *p;
+ int status;
+
+ p = rte_zmalloc_socket(params->soft.name,
+ sizeof(struct pmd_internals),
+ 0,
+ numa_node);
+ if (p == NULL)
+ return NULL;
+
+ memcpy(&p->params, params, sizeof(p->params));
+ rte_eth_dev_get_port_by_name(params->hard.name, &p->hard.port_id);
+
+ /* Default */
+ status = default_init(p, params, numa_node);
+ if (status) {
+ rte_free(p);
+ return NULL;
+ }
+
+ return p;
+}
+
+static void
+pmd_free(struct pmd_internals *p)
+{
+ default_free(p);
+
+ rte_free(p);
+}
+
+static int
+pmd_ethdev_register(struct rte_vdev_device *vdev,
+ struct pmd_params *params,
+ void *dev_private)
+{
+ struct rte_eth_dev_info hard_info;
+ struct rte_eth_dev *soft_dev;
+ struct rte_eth_dev_data *soft_data;
+ uint32_t hard_speed;
+ int numa_node;
+ uint8_t hard_port_id;
+
+ rte_eth_dev_get_port_by_name(params->hard.name, &hard_port_id);
+ rte_eth_dev_info_get(hard_port_id, &hard_info);
+ hard_speed = eth_dev_speed_max_mbps(hard_info.speed_capa);
+ numa_node = rte_eth_dev_socket_id(hard_port_id);
+
+ /* Memory allocation */
+ soft_data = rte_zmalloc_socket(params->soft.name,
+ sizeof(*soft_data), 0, numa_node);
+ if (!soft_data)
+ return -ENOMEM;
+
+ /* Ethdev entry allocation */
+ soft_dev = rte_eth_dev_allocate(params->soft.name);
+ if (!soft_dev) {
+ rte_free(soft_data);
+ return -ENOMEM;
+ }
+
+ /* Connect dev->data */
+ memmove(soft_data->name,
+ soft_dev->data->name,
+ sizeof(soft_data->name));
+ soft_data->port_id = soft_dev->data->port_id;
+ soft_data->mtu = soft_dev->data->mtu;
+ soft_dev->data = soft_data;
+
+ /* dev */
+ soft_dev->rx_pkt_burst = (params->soft.intrusive) ?
+ NULL : /* set up later */
+ pmd_rx_pkt_burst;
+ soft_dev->tx_pkt_burst = pmd_tx_pkt_burst;
+ soft_dev->tx_pkt_prepare = NULL;
+ soft_dev->dev_ops = &pmd_ops;
+ soft_dev->device = &vdev->device;
+
+ /* dev->data */
+ soft_dev->data->dev_private = dev_private;
+ soft_dev->data->dev_link.link_speed = hard_speed;
+ soft_dev->data->dev_link.link_duplex = ETH_LINK_FULL_DUPLEX;
+ soft_dev->data->dev_link.link_autoneg = ETH_LINK_SPEED_FIXED;
+ soft_dev->data->dev_link.link_status = ETH_LINK_DOWN;
+ soft_dev->data->mac_addrs = &eth_addr;
+ soft_dev->data->promiscuous = 1;
+ soft_dev->data->kdrv = RTE_KDRV_NONE;
+ soft_dev->data->numa_node = numa_node;
+
+ return 0;
+}
+
+static int
+get_string(const char *key __rte_unused, const char *value, void *extra_args)
+{
+ if (!value || !extra_args)
+ return -EINVAL;
+
+ *(char **)extra_args = strdup(value);
+
+ if (!*(char **)extra_args)
+ return -ENOMEM;
+
+ return 0;
+}
+
+static int
+get_uint32(const char *key __rte_unused, const char *value, void *extra_args)
+{
+ if (!value || !extra_args)
+ return -EINVAL;
+
+ *(uint32_t *)extra_args = strtoull(value, NULL, 0);
+
+ return 0;
+}
+
+static int
+pmd_parse_args(struct pmd_params *p, const char *name, const char *params)
+{
+ struct rte_kvargs *kvlist;
+ int ret;
+
+ kvlist = rte_kvargs_parse(params, pmd_valid_args);
+ if (kvlist == NULL)
+ return -EINVAL;
+
+ /* Set default values */
+ memset(p, 0, sizeof(*p));
+ p->soft.name = name;
+ p->soft.intrusive = INTRUSIVE;
+ p->hard.tx_queue_id = SOFTNIC_HARD_TX_QUEUE_ID;
+
+ /* HARD: name (mandatory) */
+ if (rte_kvargs_count(kvlist, PMD_PARAM_HARD_NAME) == 1) {
+ ret = rte_kvargs_process(kvlist, PMD_PARAM_HARD_NAME,
+ &get_string, &p->hard.name);
+ if (ret < 0)
+ goto out_free;
+ } else {
+ ret = -EINVAL;
+ goto out_free;
+ }
+
+ /* HARD: tx_queue_id (optional) */
+ if (rte_kvargs_count(kvlist, PMD_PARAM_HARD_TX_QUEUE_ID) == 1) {
+ ret = rte_kvargs_process(kvlist, PMD_PARAM_HARD_TX_QUEUE_ID,
+ &get_uint32, &p->hard.tx_queue_id);
+ if (ret < 0)
+ goto out_free;
+ }
+
+out_free:
+ rte_kvargs_free(kvlist);
+ return ret;
+}
+
+static int
+pmd_probe(struct rte_vdev_device *vdev)
+{
+ struct pmd_params p;
+ const char *params;
+ int status;
+
+ struct rte_eth_dev_info hard_info;
+ uint8_t hard_port_id;
+ int numa_node;
+ void *dev_private;
+
+ if (!vdev)
+ return -EINVAL;
+
+ RTE_LOG(INFO, PMD,
+ "Probing device \"%s\"\n",
+ rte_vdev_device_name(vdev));
+
+ /* Parse input arguments */
+ params = rte_vdev_device_args(vdev);
+ if (!params)
+ return -EINVAL;
+
+ status = pmd_parse_args(&p, rte_vdev_device_name(vdev), params);
+ if (status)
+ return status;
+
+ /* Check input arguments */
+ if (rte_eth_dev_get_port_by_name(p.hard.name, &hard_port_id))
+ return -EINVAL;
+
+ rte_eth_dev_info_get(hard_port_id, &hard_info);
+ numa_node = rte_eth_dev_socket_id(hard_port_id);
+
+ if (p.hard.tx_queue_id >= hard_info.max_tx_queues)
+ return -EINVAL;
+
+ /* Allocate and initialize soft ethdev private data */
+ dev_private = pmd_init(&p, numa_node);
+ if (dev_private == NULL)
+ return -ENOMEM;
+
+ /* Register soft ethdev */
+ RTE_LOG(INFO, PMD,
+ "Creating soft ethdev \"%s\" for hard ethdev \"%s\"\n",
+ p.soft.name, p.hard.name);
+
+ status = pmd_ethdev_register(vdev, &p, dev_private);
+ if (status) {
+ pmd_free(dev_private);
+ return status;
+ }
+
+ return 0;
+}
+
+static int
+pmd_remove(struct rte_vdev_device *vdev)
+{
+ struct rte_eth_dev *dev = NULL;
+ struct pmd_internals *p;
+
+ if (!vdev)
+ return -EINVAL;
+
+ RTE_LOG(INFO, PMD, "Removing device \"%s\"\n",
+ rte_vdev_device_name(vdev));
+
+ /* Find the ethdev entry */
+ dev = rte_eth_dev_allocated(rte_vdev_device_name(vdev));
+ if (dev == NULL)
+ return -ENODEV;
+ p = dev->data->dev_private;
+
+ /* Free device data structures*/
+ pmd_free(p);
+ rte_free(dev->data);
+ rte_eth_dev_release_port(dev);
+
+ return 0;
+}
+
+static struct rte_vdev_driver pmd_drv = {
+ .probe = pmd_probe,
+ .remove = pmd_remove,
+};
+
+RTE_PMD_REGISTER_VDEV(net_softnic, pmd_drv);
+RTE_PMD_REGISTER_PARAM_STRING(net_softnic,
+ PMD_PARAM_HARD_NAME "=<string> "
+ PMD_PARAM_HARD_TX_QUEUE_ID "=<int>");
diff --git a/drivers/net/softnic/rte_eth_softnic.h b/drivers/net/softnic/rte_eth_softnic.h
new file mode 100644
index 0000000..f840345
--- /dev/null
+++ b/drivers/net/softnic/rte_eth_softnic.h
@@ -0,0 +1,54 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright(c) 2017 Intel Corporation. All rights reserved.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef __INCLUDE_RTE_ETH_SOFTNIC_H__
+#define __INCLUDE_RTE_ETH_SOFTNIC_H__
+
+#include <stdint.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#ifndef SOFTNIC_HARD_TX_QUEUE_ID
+#define SOFTNIC_HARD_TX_QUEUE_ID 0
+#endif
+
+int
+rte_pmd_softnic_run(uint8_t port_id);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* __INCLUDE_RTE_ETH_SOFTNIC_H__ */
diff --git a/drivers/net/softnic/rte_eth_softnic_internals.h b/drivers/net/softnic/rte_eth_softnic_internals.h
new file mode 100644
index 0000000..dfb7fab
--- /dev/null
+++ b/drivers/net/softnic/rte_eth_softnic_internals.h
@@ -0,0 +1,114 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright(c) 2017 Intel Corporation. All rights reserved.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef __INCLUDE_RTE_ETH_SOFTNIC_INTERNALS_H__
+#define __INCLUDE_RTE_ETH_SOFTNIC_INTERNALS_H__
+
+#include <stdint.h>
+
+#include <rte_mbuf.h>
+#include <rte_ethdev.h>
+
+#include "rte_eth_softnic.h"
+
+#ifndef INTRUSIVE
+#define INTRUSIVE 0
+#endif
+
+struct pmd_params {
+ /** Parameters for the soft device (to be created) */
+ struct {
+ const char *name; /**< Name */
+ uint32_t flags; /**< Flags */
+
+ /** 0 = Access hard device through API only (potentially slower,
+ * but safer);
+ * 1 = Access to hard device private data structures is allowed
+ * (potentially faster).
+ */
+ int intrusive;
+ } soft;
+
+ /** Parameters for the hard device (existing) */
+ struct {
+ const char *name; /**< Name */
+ uint16_t tx_queue_id; /**< TX queue ID */
+ } hard;
+};
+
+/**
+ * Default Internals
+ */
+
+#ifndef DEFAULT_BURST_SIZE
+#define DEFAULT_BURST_SIZE 32
+#endif
+
+#ifndef FLUSH_COUNT_THRESHOLD
+#define FLUSH_COUNT_THRESHOLD (1 << 17)
+#endif
+
+struct default_internals {
+ struct rte_mbuf **pkts;
+ uint32_t pkts_len;
+ uint32_t txq_pos;
+ uint32_t flush_count;
+};
+
+/**
+ * PMD Internals
+ */
+struct pmd_internals {
+ /** Params */
+ struct pmd_params params;
+
+ /** Soft device */
+ struct {
+ struct default_internals def; /**< Default */
+ } soft;
+
+ /** Hard device */
+ struct {
+ uint8_t port_id;
+ } hard;
+};
+
+struct pmd_rx_queue {
+ /** Hard device */
+ struct {
+ uint8_t port_id;
+ uint16_t rx_queue_id;
+ } hard;
+};
+
+#endif /* __INCLUDE_RTE_ETH_SOFTNIC_INTERNALS_H__ */
diff --git a/drivers/net/softnic/rte_eth_softnic_version.map b/drivers/net/softnic/rte_eth_softnic_version.map
new file mode 100644
index 0000000..fb2cb68
--- /dev/null
+++ b/drivers/net/softnic/rte_eth_softnic_version.map
@@ -0,0 +1,7 @@
+DPDK_17.11 {
+ global:
+
+ rte_pmd_softnic_run;
+
+ local: *;
+};
diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index c25fdd9..3dc82fb 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -67,7 +67,6 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR) += -lrte_distributor
_LDLIBS-$(CONFIG_RTE_LIBRTE_IP_FRAG) += -lrte_ip_frag
_LDLIBS-$(CONFIG_RTE_LIBRTE_GRO) += -lrte_gro
_LDLIBS-$(CONFIG_RTE_LIBRTE_METER) += -lrte_meter
-_LDLIBS-$(CONFIG_RTE_LIBRTE_SCHED) += -lrte_sched
_LDLIBS-$(CONFIG_RTE_LIBRTE_LPM) += -lrte_lpm
# librte_acl needs --whole-archive because of weak functions
_LDLIBS-$(CONFIG_RTE_LIBRTE_ACL) += --whole-archive
@@ -99,6 +98,7 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_RING) += -lrte_ring
_LDLIBS-$(CONFIG_RTE_LIBRTE_EAL) += -lrte_eal
_LDLIBS-$(CONFIG_RTE_LIBRTE_CMDLINE) += -lrte_cmdline
_LDLIBS-$(CONFIG_RTE_LIBRTE_REORDER) += -lrte_reorder
+_LDLIBS-$(CONFIG_RTE_LIBRTE_SCHED) += -lrte_sched
ifeq ($(CONFIG_RTE_EXEC_ENV_LINUXAPP),y)
_LDLIBS-$(CONFIG_RTE_LIBRTE_KNI) += -lrte_kni
@@ -135,6 +135,9 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_NULL) += -lrte_pmd_null
_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_PCAP) += -lrte_pmd_pcap -lpcap
_LDLIBS-$(CONFIG_RTE_LIBRTE_QEDE_PMD) += -lrte_pmd_qede
_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_RING) += -lrte_pmd_ring
+ifeq ($(CONFIG_RTE_LIBRTE_SCHED),y)
+_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_SOFTNIC) += -lrte_pmd_softnic
+endif
_LDLIBS-$(CONFIG_RTE_LIBRTE_SFC_EFX_PMD) += -lrte_pmd_sfc_efx
_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_SZEDATA2) += -lrte_pmd_szedata2 -lsze2
_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_TAP) += -lrte_pmd_tap
--
2.9.3
* [dpdk-dev] [PATCH v3 2/4] net/softnic: add traffic management support
2017-08-11 12:49 ` [dpdk-dev] [PATCH v3 0/4] net/softnic: sw fall-back pmd for traffic mgmt and others Jasvinder Singh
2017-08-11 12:49 ` [dpdk-dev] [PATCH v3 1/4] net/softnic: add softnic PMD Jasvinder Singh
@ 2017-08-11 12:49 ` Jasvinder Singh
2017-08-11 12:49 ` [dpdk-dev] [PATCH v3 3/4] net/softnic: add TM capabilities ops Jasvinder Singh
` (2 subsequent siblings)
4 siblings, 0 replies; 79+ messages in thread
From: Jasvinder Singh @ 2017-08-11 12:49 UTC (permalink / raw)
To: dev; +Cc: cristian.dumitrescu, ferruh.yigit, thomas
Add ethdev Traffic Management API support to SoftNIC PMD.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
---
v3 changes:
- add more configuration parameters (tm rate, tm queue sizes)
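As a usage illustration (not part of this patch), the softnic port could be
created with the new soft_tm parameters and then driven from a dedicated
core. The vdev string, port id and thread layout below are assumptions, not
values taken from this series:
  --vdev 'net_softnic0,hard_name=0000:02:00.0,soft_tm=on,soft_tm_rate=1250000000,soft_tm_deq_bsz=24'
/* Minimal polling-loop sketch; softnic_port_id is hypothetical. */
#include <rte_ethdev.h>
#include "rte_eth_softnic.h"
static int
softnic_thread(void *arg)
{
	uint8_t softnic_port_id = *(uint8_t *)arg;
	for ( ; ; ) {
		/* One iteration: soft TXQ read, TM enqueue, TM dequeue,
		 * hard device TX burst.
		 */
		rte_pmd_softnic_run(softnic_port_id);
	}
	return 0;
}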
drivers/net/softnic/Makefile | 1 +
drivers/net/softnic/rte_eth_softnic.c | 252 +++++++++++++++++++++++-
drivers/net/softnic/rte_eth_softnic.h | 16 ++
drivers/net/softnic/rte_eth_softnic_internals.h | 106 +++++++++-
drivers/net/softnic/rte_eth_softnic_tm.c | 180 +++++++++++++++++
5 files changed, 552 insertions(+), 3 deletions(-)
create mode 100644 drivers/net/softnic/rte_eth_softnic_tm.c
diff --git a/drivers/net/softnic/Makefile b/drivers/net/softnic/Makefile
index 8d00656..eea76ca 100644
--- a/drivers/net/softnic/Makefile
+++ b/drivers/net/softnic/Makefile
@@ -47,6 +47,7 @@ LIBABIVER := 1
# all source are stored in SRCS-y
#
SRCS-$(CONFIG_RTE_LIBRTE_PMD_SOFTNIC) += rte_eth_softnic.c
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_SOFTNIC) += rte_eth_softnic_tm.c
#
# Export include files
diff --git a/drivers/net/softnic/rte_eth_softnic.c b/drivers/net/softnic/rte_eth_softnic.c
index 35cb93c..f1906af 100644
--- a/drivers/net/softnic/rte_eth_softnic.c
+++ b/drivers/net/softnic/rte_eth_softnic.c
@@ -42,14 +42,34 @@
#include <rte_kvargs.h>
#include <rte_errno.h>
#include <rte_ring.h>
+#include <rte_sched.h>
#include "rte_eth_softnic.h"
#include "rte_eth_softnic_internals.h"
+#define PMD_PARAM_SOFT_TM "soft_tm"
+#define PMD_PARAM_SOFT_TM_RATE "soft_tm_rate"
+#define PMD_PARAM_SOFT_TM_NB_QUEUES "soft_tm_nb_queues"
+#define PMD_PARAM_SOFT_TM_QSIZE0 "soft_tm_qsize0"
+#define PMD_PARAM_SOFT_TM_QSIZE1 "soft_tm_qsize1"
+#define PMD_PARAM_SOFT_TM_QSIZE2 "soft_tm_qsize2"
+#define PMD_PARAM_SOFT_TM_QSIZE3 "soft_tm_qsize3"
+#define PMD_PARAM_SOFT_TM_ENQ_BSZ "soft_tm_enq_bsz"
+#define PMD_PARAM_SOFT_TM_DEQ_BSZ "soft_tm_deq_bsz"
+
#define PMD_PARAM_HARD_NAME "hard_name"
#define PMD_PARAM_HARD_TX_QUEUE_ID "hard_tx_queue_id"
static const char *pmd_valid_args[] = {
+ PMD_PARAM_SOFT_TM,
+ PMD_PARAM_SOFT_TM_RATE,
+ PMD_PARAM_SOFT_TM_NB_QUEUES,
+ PMD_PARAM_SOFT_TM_QSIZE0,
+ PMD_PARAM_SOFT_TM_QSIZE1,
+ PMD_PARAM_SOFT_TM_QSIZE2,
+ PMD_PARAM_SOFT_TM_QSIZE3,
+ PMD_PARAM_SOFT_TM_ENQ_BSZ,
+ PMD_PARAM_SOFT_TM_DEQ_BSZ,
PMD_PARAM_HARD_NAME,
PMD_PARAM_HARD_TX_QUEUE_ID,
NULL
@@ -157,6 +177,13 @@ pmd_dev_start(struct rte_eth_dev *dev)
{
struct pmd_internals *p = dev->data->dev_private;
+ if (tm_used(dev)) {
+ int status = tm_start(p);
+
+ if (status)
+ return status;
+ }
+
dev->data->dev_link.link_status = ETH_LINK_UP;
if (p->params.soft.intrusive) {
@@ -173,7 +200,12 @@ pmd_dev_start(struct rte_eth_dev *dev)
static void
pmd_dev_stop(struct rte_eth_dev *dev)
{
+ struct pmd_internals *p = dev->data->dev_private;
+
dev->data->dev_link.link_status = ETH_LINK_DOWN;
+
+ if (tm_used(dev))
+ tm_stop(p);
}
static void
@@ -294,6 +326,77 @@ rte_pmd_softnic_run_default(struct rte_eth_dev *dev)
return 0;
}
+static __rte_always_inline int
+rte_pmd_softnic_run_tm(struct rte_eth_dev *dev)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ /* Persistent context: Read Only (update not required) */
+ struct rte_sched_port *sched = p->soft.tm.sched;
+ struct rte_mbuf **pkts_enq = p->soft.tm.pkts_enq;
+ struct rte_mbuf **pkts_deq = p->soft.tm.pkts_deq;
+ uint32_t enq_bsz = p->params.soft.tm.enq_bsz;
+ uint32_t deq_bsz = p->params.soft.tm.deq_bsz;
+ uint16_t nb_tx_queues = dev->data->nb_tx_queues;
+
+ /* Persistent context: Read - Write (update required) */
+ uint32_t txq_pos = p->soft.tm.txq_pos;
+ uint32_t pkts_enq_len = p->soft.tm.pkts_enq_len;
+ uint32_t flush_count = p->soft.tm.flush_count;
+
+ /* Not part of the persistent context */
+ uint32_t pkts_deq_len, pos;
+ uint16_t i;
+
+ /* Soft device TXQ read, TM enqueue */
+ for (i = 0; i < nb_tx_queues; i++) {
+ struct rte_ring *txq = dev->data->tx_queues[txq_pos];
+
+ /* Read TXQ burst to packet enqueue buffer */
+ pkts_enq_len += rte_ring_sc_dequeue_burst(txq,
+ (void **) &pkts_enq[pkts_enq_len],
+ enq_bsz,
+ NULL);
+
+ /* Increment TXQ */
+ txq_pos++;
+ if (txq_pos >= nb_tx_queues)
+ txq_pos = 0;
+
+ /* TM enqueue when complete burst is available */
+ if (pkts_enq_len >= enq_bsz) {
+ rte_sched_port_enqueue(sched, pkts_enq, pkts_enq_len);
+
+ pkts_enq_len = 0;
+ flush_count = 0;
+ break;
+ }
+ }
+
+ if (flush_count >= FLUSH_COUNT_THRESHOLD) {
+ if (pkts_enq_len)
+ rte_sched_port_enqueue(sched, pkts_enq, pkts_enq_len);
+
+ pkts_enq_len = 0;
+ flush_count = 0;
+ }
+
+ p->soft.tm.txq_pos = txq_pos;
+ p->soft.tm.pkts_enq_len = pkts_enq_len;
+ p->soft.tm.flush_count = flush_count + 1;
+
+ /* TM dequeue, Hard device TXQ write */
+ pkts_deq_len = rte_sched_port_dequeue(sched, pkts_deq, deq_bsz);
+
+ for (pos = 0; pos < pkts_deq_len; )
+ pos += rte_eth_tx_burst(p->hard.port_id,
+ p->params.hard.tx_queue_id,
+ &pkts_deq[pos],
+ (uint16_t) (pkts_deq_len - pos));
+
+ return 0;
+}
+
int
rte_pmd_softnic_run(uint8_t port_id)
{
@@ -303,7 +406,9 @@ rte_pmd_softnic_run(uint8_t port_id)
RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, 0);
#endif
- return rte_pmd_softnic_run_default(dev);
+ return (tm_used(dev)) ?
+ rte_pmd_softnic_run_tm(dev) :
+ rte_pmd_softnic_run_default(dev);
}
static struct ether_addr eth_addr = { .addr_bytes = {0} };
@@ -378,12 +483,25 @@ pmd_init(struct pmd_params *params, int numa_node)
return NULL;
}
+ /* Traffic Management (TM)*/
+ if (params->soft.flags & PMD_FEATURE_TM) {
+ status = tm_init(p, params, numa_node);
+ if (status) {
+ default_free(p);
+ rte_free(p);
+ return NULL;
+ }
+ }
+
return p;
}
static void
pmd_free(struct pmd_internals *p)
{
+ if (p->params.soft.flags & PMD_FEATURE_TM)
+ tm_free(p);
+
default_free(p);
rte_free(p);
@@ -479,7 +597,7 @@ static int
pmd_parse_args(struct pmd_params *p, const char *name, const char *params)
{
struct rte_kvargs *kvlist;
- int ret;
+ int i, ret;
kvlist = rte_kvargs_parse(params, pmd_valid_args);
if (kvlist == NULL)
@@ -489,8 +607,120 @@ pmd_parse_args(struct pmd_params *p, const char *name, const char *params)
memset(p, 0, sizeof(*p));
p->soft.name = name;
p->soft.intrusive = INTRUSIVE;
+ p->soft.tm.rate = 0;
+ p->soft.tm.nb_queues = SOFTNIC_SOFT_TM_NB_QUEUES;
+ for (i = 0; i < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; i++)
+ p->soft.tm.qsize[i] = SOFTNIC_SOFT_TM_QUEUE_SIZE;
+ p->soft.tm.enq_bsz = SOFTNIC_SOFT_TM_ENQ_BSZ;
+ p->soft.tm.deq_bsz = SOFTNIC_SOFT_TM_DEQ_BSZ;
p->hard.tx_queue_id = SOFTNIC_HARD_TX_QUEUE_ID;
+ /* SOFT: TM (optional) */
+ if (rte_kvargs_count(kvlist, PMD_PARAM_SOFT_TM) == 1) {
+ char *s;
+
+ ret = rte_kvargs_process(kvlist, PMD_PARAM_SOFT_TM,
+ &get_string, &s);
+ if (ret < 0)
+ goto out_free;
+
+ if (strcmp(s, "on") == 0)
+ p->soft.flags |= PMD_FEATURE_TM;
+ else if (strcmp(s, "off") == 0)
+ p->soft.flags &= ~PMD_FEATURE_TM;
+ else
+ goto out_free;
+ }
+
+ /* SOFT: TM rate (measured in bytes/second) (optional) */
+ if (rte_kvargs_count(kvlist, PMD_PARAM_SOFT_TM_RATE) == 1) {
+ ret = rte_kvargs_process(kvlist, PMD_PARAM_SOFT_TM_RATE,
+ &get_uint32, &p->soft.tm.rate);
+ if (ret < 0)
+ goto out_free;
+
+ p->soft.flags |= PMD_FEATURE_TM;
+ }
+
+ /* SOFT: TM number of queues (optional) */
+ if (rte_kvargs_count(kvlist, PMD_PARAM_SOFT_TM_NB_QUEUES) == 1) {
+ ret = rte_kvargs_process(kvlist, PMD_PARAM_SOFT_TM_NB_QUEUES,
+ &get_uint32, &p->soft.tm.nb_queues);
+ if (ret < 0)
+ goto out_free;
+
+ p->soft.flags |= PMD_FEATURE_TM;
+ }
+
+ /* SOFT: TM queue size 0 .. 3 (optional) */
+ if (rte_kvargs_count(kvlist, PMD_PARAM_SOFT_TM_QSIZE0) == 1) {
+ uint32_t qsize;
+
+ ret = rte_kvargs_process(kvlist, PMD_PARAM_SOFT_TM_QSIZE0,
+ &get_uint32, &qsize);
+ if (ret < 0)
+ goto out_free;
+
+ p->soft.tm.qsize[0] = (uint16_t) qsize;
+ p->soft.flags |= PMD_FEATURE_TM;
+ }
+
+ if (rte_kvargs_count(kvlist, PMD_PARAM_SOFT_TM_QSIZE1) == 1) {
+ uint32_t qsize;
+
+ ret = rte_kvargs_process(kvlist, PMD_PARAM_SOFT_TM_QSIZE1,
+ &get_uint32, &qsize);
+ if (ret < 0)
+ goto out_free;
+
+ p->soft.tm.qsize[1] = (uint16_t) qsize;
+ p->soft.flags |= PMD_FEATURE_TM;
+ }
+
+ if (rte_kvargs_count(kvlist, PMD_PARAM_SOFT_TM_QSIZE2) == 1) {
+ uint32_t qsize;
+
+ ret = rte_kvargs_process(kvlist, PMD_PARAM_SOFT_TM_QSIZE2,
+ &get_uint32, &qsize);
+ if (ret < 0)
+ goto out_free;
+
+ p->soft.tm.qsize[2] = (uint16_t) qsize;
+ p->soft.flags |= PMD_FEATURE_TM;
+ }
+
+ if (rte_kvargs_count(kvlist, PMD_PARAM_SOFT_TM_QSIZE3) == 1) {
+ uint32_t qsize;
+
+ ret = rte_kvargs_process(kvlist, PMD_PARAM_SOFT_TM_QSIZE3,
+ &get_uint32, &qsize);
+ if (ret < 0)
+ goto out_free;
+
+ p->soft.tm.qsize[3] = (uint16_t) qsize;
+ p->soft.flags |= PMD_FEATURE_TM;
+ }
+
+ /* SOFT: TM enqueue burst size (optional) */
+ if (rte_kvargs_count(kvlist, PMD_PARAM_SOFT_TM_ENQ_BSZ) == 1) {
+ ret = rte_kvargs_process(kvlist, PMD_PARAM_SOFT_TM_ENQ_BSZ,
+ &get_uint32, &p->soft.tm.enq_bsz);
+ if (ret < 0)
+ goto out_free;
+
+ p->soft.flags |= PMD_FEATURE_TM;
+ }
+
+ /* SOFT: TM dequeue burst size (optional) */
+ if (rte_kvargs_count(kvlist, PMD_PARAM_SOFT_TM_DEQ_BSZ) == 1) {
+ ret = rte_kvargs_process(kvlist, PMD_PARAM_SOFT_TM_DEQ_BSZ,
+ &get_uint32, &p->soft.tm.deq_bsz);
+ if (ret < 0)
+ goto out_free;
+
+ p->soft.flags |= PMD_FEATURE_TM;
+ }
+
/* HARD: name (mandatory) */
if (rte_kvargs_count(kvlist, PMD_PARAM_HARD_NAME) == 1) {
ret = rte_kvargs_process(kvlist, PMD_PARAM_HARD_NAME,
@@ -523,6 +753,7 @@ pmd_probe(struct rte_vdev_device *vdev)
int status;
struct rte_eth_dev_info hard_info;
+ uint32_t hard_speed;
uint8_t hard_port_id;
int numa_node;
void *dev_private;
@@ -548,11 +779,19 @@ pmd_probe(struct rte_vdev_device *vdev)
return -EINVAL;
rte_eth_dev_info_get(hard_port_id, &hard_info);
+ hard_speed = eth_dev_speed_max_mbps(hard_info.speed_capa);
numa_node = rte_eth_dev_socket_id(hard_port_id);
if (p.hard.tx_queue_id >= hard_info.max_tx_queues)
return -EINVAL;
+ if (p.soft.flags & PMD_FEATURE_TM) {
+ status = tm_params_check(&p, hard_speed);
+
+ if (status)
+ return status;
+ }
+
/* Allocate and initialize soft ethdev private data */
dev_private = pmd_init(&p, numa_node);
if (dev_private == NULL)
@@ -605,5 +844,14 @@ static struct rte_vdev_driver pmd_drv = {
RTE_PMD_REGISTER_VDEV(net_softnic, pmd_drv);
RTE_PMD_REGISTER_PARAM_STRING(net_softnic,
+ PMD_PARAM_SOFT_TM "=on|off "
+ PMD_PARAM_SOFT_TM_RATE "=<int> "
+ PMD_PARAM_SOFT_TM_NB_QUEUES "=<int> "
+ PMD_PARAM_SOFT_TM_QSIZE0 "=<int> "
+ PMD_PARAM_SOFT_TM_QSIZE1 "=<int> "
+ PMD_PARAM_SOFT_TM_QSIZE2 "=<int> "
+ PMD_PARAM_SOFT_TM_QSIZE3 "=<int> "
+ PMD_PARAM_SOFT_TM_ENQ_BSZ "=<int> "
+ PMD_PARAM_SOFT_TM_DEQ_BSZ "=<int> "
PMD_PARAM_HARD_NAME "=<string> "
PMD_PARAM_HARD_TX_QUEUE_ID "=<int>");
diff --git a/drivers/net/softnic/rte_eth_softnic.h b/drivers/net/softnic/rte_eth_softnic.h
index f840345..4388dc5 100644
--- a/drivers/net/softnic/rte_eth_softnic.h
+++ b/drivers/net/softnic/rte_eth_softnic.h
@@ -40,6 +40,22 @@
extern "C" {
#endif
+#ifndef SOFTNIC_SOFT_TM_NB_QUEUES
+#define SOFTNIC_SOFT_TM_NB_QUEUES 65536
+#endif
+
+#ifndef SOFTNIC_SOFT_TM_QUEUE_SIZE
+#define SOFTNIC_SOFT_TM_QUEUE_SIZE 64
+#endif
+
+#ifndef SOFTNIC_SOFT_TM_ENQ_BSZ
+#define SOFTNIC_SOFT_TM_ENQ_BSZ 32
+#endif
+
+#ifndef SOFTNIC_SOFT_TM_DEQ_BSZ
+#define SOFTNIC_SOFT_TM_DEQ_BSZ 24
+#endif
+
#ifndef SOFTNIC_HARD_TX_QUEUE_ID
#define SOFTNIC_HARD_TX_QUEUE_ID 0
#endif
diff --git a/drivers/net/softnic/rte_eth_softnic_internals.h b/drivers/net/softnic/rte_eth_softnic_internals.h
index dfb7fab..c3c41cc 100644
--- a/drivers/net/softnic/rte_eth_softnic_internals.h
+++ b/drivers/net/softnic/rte_eth_softnic_internals.h
@@ -37,10 +37,19 @@
#include <stdint.h>
#include <rte_mbuf.h>
+#include <rte_sched.h>
#include <rte_ethdev.h>
#include "rte_eth_softnic.h"
+/**
+ * PMD Parameters
+ */
+
+enum pmd_feature {
+ PMD_FEATURE_TM = 1, /**< Traffic Management (TM) */
+};
+
#ifndef INTRUSIVE
#define INTRUSIVE 0
#endif
@@ -57,6 +66,16 @@ struct pmd_params {
* (potentially faster).
*/
int intrusive;
+
+ /** Traffic Management (TM) */
+ struct {
+ uint32_t rate; /**< Rate (bytes/second) */
+ uint32_t nb_queues; /**< Number of queues */
+ uint16_t qsize[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
+ /**< Queue size per traffic class */
+ uint32_t enq_bsz; /**< Enqueue burst size */
+ uint32_t deq_bsz; /**< Dequeue burst size */
+ } tm;
} soft;
/** Parameters for the hard device (existing) */
@@ -75,7 +94,7 @@ struct pmd_params {
#endif
#ifndef FLUSH_COUNT_THRESHOLD
-#define FLUSH_COUNT_THRESHOLD (1 << 17)
+#define FLUSH_COUNT_THRESHOLD (1 << 17)
#endif
struct default_internals {
@@ -86,6 +105,66 @@ struct default_internals {
};
/**
+ * Traffic Management (TM) Internals
+ */
+
+#ifndef TM_MAX_SUBPORTS
+#define TM_MAX_SUBPORTS 8
+#endif
+
+#ifndef TM_MAX_PIPES_PER_SUBPORT
+#define TM_MAX_PIPES_PER_SUBPORT 4096
+#endif
+
+struct tm_params {
+ struct rte_sched_port_params port_params;
+
+ struct rte_sched_subport_params subport_params[TM_MAX_SUBPORTS];
+
+ struct rte_sched_pipe_params
+ pipe_profiles[RTE_SCHED_PIPE_PROFILES_PER_PORT];
+ uint32_t n_pipe_profiles;
+ uint32_t pipe_to_profile[TM_MAX_SUBPORTS * TM_MAX_PIPES_PER_SUBPORT];
+};
+
+/* TM Levels */
+enum tm_node_level {
+ TM_NODE_LEVEL_PORT = 0,
+ TM_NODE_LEVEL_SUBPORT,
+ TM_NODE_LEVEL_PIPE,
+ TM_NODE_LEVEL_TC,
+ TM_NODE_LEVEL_QUEUE,
+ TM_NODE_LEVEL_MAX,
+};
+
+/* TM Hierarchy Specification */
+struct tm_hierarchy {
+ uint32_t n_tm_nodes[TM_NODE_LEVEL_MAX];
+};
+
+struct tm_internals {
+ /** Hierarchy specification
+ *
+ * -Hierarchy is unfrozen at init and when port is stopped.
+ * -Hierarchy is frozen on successful hierarchy commit.
+ * -Run-time hierarchy changes are not allowed, therefore it makes
+ * sense to keep the hierarchy frozen after the port is started.
+ */
+ struct tm_hierarchy h;
+
+ /** Blueprints */
+ struct tm_params params;
+
+ /** Run-time */
+ struct rte_sched_port *sched;
+ struct rte_mbuf **pkts_enq;
+ struct rte_mbuf **pkts_deq;
+ uint32_t pkts_enq_len;
+ uint32_t txq_pos;
+ uint32_t flush_count;
+};
+
+/**
* PMD Internals
*/
struct pmd_internals {
@@ -95,6 +174,7 @@ struct pmd_internals {
/** Soft device */
struct {
struct default_internals def; /**< Default */
+ struct tm_internals tm; /**< Traffic Management */
} soft;
/** Hard device */
@@ -111,4 +191,28 @@ struct pmd_rx_queue {
} hard;
};
+int
+tm_params_check(struct pmd_params *params, uint32_t hard_rate);
+
+int
+tm_init(struct pmd_internals *p, struct pmd_params *params, int numa_node);
+
+void
+tm_free(struct pmd_internals *p);
+
+int
+tm_start(struct pmd_internals *p);
+
+void
+tm_stop(struct pmd_internals *p);
+
+static inline int
+tm_used(struct rte_eth_dev *dev)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ return (p->params.soft.flags & PMD_FEATURE_TM) &&
+ p->soft.tm.h.n_tm_nodes[TM_NODE_LEVEL_PORT];
+}
+
#endif /* __INCLUDE_RTE_ETH_SOFTNIC_INTERNALS_H__ */
diff --git a/drivers/net/softnic/rte_eth_softnic_tm.c b/drivers/net/softnic/rte_eth_softnic_tm.c
new file mode 100644
index 0000000..7fbea46
--- /dev/null
+++ b/drivers/net/softnic/rte_eth_softnic_tm.c
@@ -0,0 +1,180 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright(c) 2017 Intel Corporation. All rights reserved.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdint.h>
+#include <stdlib.h>
+#include <string.h>
+
+#include <rte_malloc.h>
+
+#include "rte_eth_softnic_internals.h"
+#include "rte_eth_softnic.h"
+
+#define BYTES_IN_MBPS (1000 * 1000 / 8)
+
+int
+tm_params_check(struct pmd_params *params, uint32_t hard_rate)
+{
+ uint64_t hard_rate_bytes_per_sec = (uint64_t)hard_rate * BYTES_IN_MBPS;
+ uint32_t i;
+
+ /* rate */
+ if (params->soft.tm.rate) {
+ if (params->soft.tm.rate > hard_rate_bytes_per_sec)
+ return -EINVAL;
+ } else
+ params->soft.tm.rate =
+ (hard_rate_bytes_per_sec > UINT32_MAX) ?
+ UINT32_MAX : hard_rate_bytes_per_sec;
+
+ /* nb_queues */
+ if (params->soft.tm.nb_queues == 0)
+ return -EINVAL;
+
+ if (params->soft.tm.nb_queues < RTE_SCHED_QUEUES_PER_PIPE)
+ params->soft.tm.nb_queues = RTE_SCHED_QUEUES_PER_PIPE;
+
+ params->soft.tm.nb_queues =
+ rte_align32pow2(params->soft.tm.nb_queues);
+
+ /* qsize */
+ for (i = 0; i < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; i++) {
+ if (params->soft.tm.qsize[i] == 0)
+ return -EINVAL;
+
+ params->soft.tm.qsize[i] =
+ rte_align32pow2(params->soft.tm.qsize[i]);
+ }
+
+ /* enq_bsz, deq_bsz */
+ if ((params->soft.tm.enq_bsz == 0) ||
+ (params->soft.tm.deq_bsz == 0) ||
+ (params->soft.tm.deq_bsz >= params->soft.tm.enq_bsz))
+ return -EINVAL;
+
+ return 0;
+}
+
+int
+tm_init(struct pmd_internals *p,
+ struct pmd_params *params,
+ int numa_node)
+{
+ uint32_t enq_bsz = params->soft.tm.enq_bsz;
+ uint32_t deq_bsz = params->soft.tm.deq_bsz;
+
+ p->soft.tm.pkts_enq = rte_zmalloc_socket(params->soft.name,
+ 2 * enq_bsz * sizeof(struct rte_mbuf *),
+ 0,
+ numa_node);
+
+ if (p->soft.tm.pkts_enq == NULL)
+ return -ENOMEM;
+
+ p->soft.tm.pkts_deq = rte_zmalloc_socket(params->soft.name,
+ deq_bsz * sizeof(struct rte_mbuf *),
+ 0,
+ numa_node);
+
+ if (p->soft.tm.pkts_deq == NULL) {
+ rte_free(p->soft.tm.pkts_enq);
+ return -ENOMEM;
+ }
+
+ return 0;
+}
+
+void
+tm_free(struct pmd_internals *p)
+{
+ rte_free(p->soft.tm.pkts_enq);
+ rte_free(p->soft.tm.pkts_deq);
+}
+
+int
+tm_start(struct pmd_internals *p)
+{
+ struct tm_params *t = &p->soft.tm.params;
+ uint32_t n_subports, subport_id;
+ int status;
+
+ /* Port */
+ p->soft.tm.sched = rte_sched_port_config(&t->port_params);
+ if (p->soft.tm.sched == NULL)
+ return -1;
+
+ /* Subport */
+ n_subports = t->port_params.n_subports_per_port;
+ for (subport_id = 0; subport_id < n_subports; subport_id++) {
+ uint32_t n_pipes_per_subport =
+ t->port_params.n_pipes_per_subport;
+ uint32_t pipe_id;
+
+ status = rte_sched_subport_config(p->soft.tm.sched,
+ subport_id,
+ &t->subport_params[subport_id]);
+ if (status) {
+ rte_sched_port_free(p->soft.tm.sched);
+ return -1;
+ }
+
+ /* Pipe */
+ n_pipes_per_subport = t->port_params.n_pipes_per_subport;
+ for (pipe_id = 0; pipe_id < n_pipes_per_subport; pipe_id++) {
+ int pos = subport_id * TM_MAX_PIPES_PER_SUBPORT +
+ pipe_id;
+ int profile_id = t->pipe_to_profile[pos];
+
+ if (profile_id < 0)
+ continue;
+
+ status = rte_sched_pipe_config(p->soft.tm.sched,
+ subport_id,
+ pipe_id,
+ profile_id);
+ if (status) {
+ rte_sched_port_free(p->soft.tm.sched);
+ return -1;
+ }
+ }
+ }
+
+ return 0;
+}
+
+void
+tm_stop(struct pmd_internals *p)
+{
+ if (p->soft.tm.sched)
+ rte_sched_port_free(p->soft.tm.sched);
+}
--
2.9.3
* [dpdk-dev] [PATCH v3 3/4] net/softnic: add TM capabilities ops
2017-08-11 12:49 ` [dpdk-dev] [PATCH v3 0/4] net/softnic: sw fall-back pmd for traffic mgmt and others Jasvinder Singh
2017-08-11 12:49 ` [dpdk-dev] [PATCH v3 1/4] net/softnic: add softnic PMD Jasvinder Singh
2017-08-11 12:49 ` [dpdk-dev] [PATCH v3 2/4] net/softnic: add traffic management support Jasvinder Singh
@ 2017-08-11 12:49 ` Jasvinder Singh
2017-08-11 12:49 ` [dpdk-dev] [PATCH v3 4/4] net/softnic: add TM hierarchy related ops Jasvinder Singh
2017-09-08 17:08 ` [dpdk-dev] [PATCH v3 0/4] net/softnic: sw fall-back pmd for traffic mgmt and others Dumitrescu, Cristian
4 siblings, 0 replies; 79+ messages in thread
From: Jasvinder Singh @ 2017-08-11 12:49 UTC (permalink / raw)
To: dev; +Cc: cristian.dumitrescu, ferruh.yigit, thomas
Implement ethdev TM capability APIs in SoftNIC PMD.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
---
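For reference, a hedged sketch (the port id and the minimal error handling
are assumptions) of how an application reaches these ops through the generic
rte_tm API once the softnic port is created:
#include <stdio.h>
#include <string.h>
#include <rte_tm.h>
static void
softnic_tm_caps_dump(uint8_t port_id)
{
	struct rte_tm_capabilities cap;
	struct rte_tm_error error;
	memset(&cap, 0, sizeof(cap));
	memset(&error, 0, sizeof(error));
	/* Dispatched to pmd_tm_capabilities_get() via pmd_tm_ops */
	if (rte_tm_capabilities_get(port_id, &cap, &error) != 0)
		return;
	printf("TM: %u levels, up to %u nodes, %u private shapers\n",
		cap.n_levels_max, cap.n_nodes_max, cap.shaper_private_n_max);
}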
drivers/net/softnic/rte_eth_softnic.c | 12 +-
drivers/net/softnic/rte_eth_softnic_internals.h | 32 ++
drivers/net/softnic/rte_eth_softnic_tm.c | 500 ++++++++++++++++++++++++
3 files changed, 543 insertions(+), 1 deletion(-)
diff --git a/drivers/net/softnic/rte_eth_softnic.c b/drivers/net/softnic/rte_eth_softnic.c
index f1906af..f113f32 100644
--- a/drivers/net/softnic/rte_eth_softnic.c
+++ b/drivers/net/softnic/rte_eth_softnic.c
@@ -43,6 +43,7 @@
#include <rte_errno.h>
#include <rte_ring.h>
#include <rte_sched.h>
+#include <rte_tm_driver.h>
#include "rte_eth_softnic.h"
#include "rte_eth_softnic_internals.h"
@@ -225,6 +226,15 @@ pmd_link_update(struct rte_eth_dev *dev __rte_unused,
return 0;
}
+static int
+pmd_tm_ops_get(struct rte_eth_dev *dev, void *arg)
+{
+ *(const struct rte_tm_ops **)arg =
+ (tm_enabled(dev)) ? &pmd_tm_ops : NULL;
+
+ return 0;
+}
+
static const struct eth_dev_ops pmd_ops = {
.dev_configure = pmd_dev_configure,
.dev_start = pmd_dev_start,
@@ -234,7 +244,7 @@ static const struct eth_dev_ops pmd_ops = {
.dev_infos_get = pmd_dev_infos_get,
.rx_queue_setup = pmd_rx_queue_setup,
.tx_queue_setup = pmd_tx_queue_setup,
- .tm_ops_get = NULL,
+ .tm_ops_get = pmd_tm_ops_get,
};
static uint16_t
diff --git a/drivers/net/softnic/rte_eth_softnic_internals.h b/drivers/net/softnic/rte_eth_softnic_internals.h
index c3c41cc..a43aef9 100644
--- a/drivers/net/softnic/rte_eth_softnic_internals.h
+++ b/drivers/net/softnic/rte_eth_softnic_internals.h
@@ -39,6 +39,7 @@
#include <rte_mbuf.h>
#include <rte_sched.h>
#include <rte_ethdev.h>
+#include <rte_tm_driver.h>
#include "rte_eth_softnic.h"
@@ -137,8 +138,26 @@ enum tm_node_level {
TM_NODE_LEVEL_MAX,
};
+/* TM Node */
+struct tm_node {
+ TAILQ_ENTRY(tm_node) node;
+ uint32_t node_id;
+ uint32_t parent_node_id;
+ uint32_t priority;
+ uint32_t weight;
+ uint32_t level;
+ struct tm_node *parent_node;
+ struct rte_tm_node_params params;
+ struct rte_tm_node_stats stats;
+ uint32_t n_children;
+};
+
+TAILQ_HEAD(tm_node_list, tm_node);
+
/* TM Hierarchy Specification */
struct tm_hierarchy {
+ struct tm_node_list nodes;
+
uint32_t n_tm_nodes[TM_NODE_LEVEL_MAX];
};
@@ -191,6 +210,11 @@ struct pmd_rx_queue {
} hard;
};
+/**
+ * Traffic Management (TM) Operation
+ */
+extern const struct rte_tm_ops pmd_tm_ops;
+
int
tm_params_check(struct pmd_params *params, uint32_t hard_rate);
@@ -207,6 +231,14 @@ void
tm_stop(struct pmd_internals *p);
static inline int
+tm_enabled(struct rte_eth_dev *dev)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ return (p->params.soft.flags & PMD_FEATURE_TM);
+}
+
+static inline int
tm_used(struct rte_eth_dev *dev)
{
struct pmd_internals *p = dev->data->dev_private;
diff --git a/drivers/net/softnic/rte_eth_softnic_tm.c b/drivers/net/softnic/rte_eth_softnic_tm.c
index 7fbea46..35bb084 100644
--- a/drivers/net/softnic/rte_eth_softnic_tm.c
+++ b/drivers/net/softnic/rte_eth_softnic_tm.c
@@ -178,3 +178,503 @@ tm_stop(struct pmd_internals *p)
if (p->soft.tm.sched)
rte_sched_port_free(p->soft.tm.sched);
}
+
+static struct tm_node *
+tm_node_search(struct rte_eth_dev *dev, uint32_t node_id)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_node_list *nl = &p->soft.tm.h.nodes;
+ struct tm_node *n;
+
+ TAILQ_FOREACH(n, nl, node)
+ if (n->node_id == node_id)
+ return n;
+
+ return NULL;
+}
+
+static uint32_t
+tm_level_get_max_nodes(struct rte_eth_dev *dev, enum tm_node_level level)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ uint32_t n_queues_max = p->params.soft.tm.nb_queues;
+ uint32_t n_tc_max = n_queues_max / RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS;
+ uint32_t n_pipes_max = n_tc_max / RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE;
+ uint32_t n_subports_max = n_pipes_max;
+ uint32_t n_root_max = 1;
+
+ switch (level) {
+ case TM_NODE_LEVEL_PORT:
+ return n_root_max;
+ case TM_NODE_LEVEL_SUBPORT:
+ return n_subports_max;
+ case TM_NODE_LEVEL_PIPE:
+ return n_pipes_max;
+ case TM_NODE_LEVEL_TC:
+ return n_tc_max;
+ case TM_NODE_LEVEL_QUEUE:
+ default:
+ return n_queues_max;
+ }
+}
+
+#ifdef RTE_SCHED_RED
+#define WRED_SUPPORTED 1
+#else
+#define WRED_SUPPORTED 0
+#endif
+
+#define STATS_MASK_DEFAULT \
+ (RTE_TM_STATS_N_PKTS | \
+ RTE_TM_STATS_N_BYTES | \
+ RTE_TM_STATS_N_PKTS_GREEN_DROPPED | \
+ RTE_TM_STATS_N_BYTES_GREEN_DROPPED)
+
+#define STATS_MASK_QUEUE \
+ (STATS_MASK_DEFAULT | \
+ RTE_TM_STATS_N_PKTS_QUEUED)
+
+static const struct rte_tm_capabilities tm_cap = {
+ .n_nodes_max = UINT32_MAX,
+ .n_levels_max = TM_NODE_LEVEL_MAX,
+
+ .non_leaf_nodes_identical = 0,
+ .leaf_nodes_identical = 1,
+
+ .shaper_n_max = UINT32_MAX,
+ .shaper_private_n_max = UINT32_MAX,
+ .shaper_private_dual_rate_n_max = 0,
+ .shaper_private_rate_min = 1,
+ .shaper_private_rate_max = UINT32_MAX,
+
+ .shaper_shared_n_max = UINT32_MAX,
+ .shaper_shared_n_nodes_per_shaper_max = UINT32_MAX,
+ .shaper_shared_n_shapers_per_node_max = 1,
+ .shaper_shared_dual_rate_n_max = 0,
+ .shaper_shared_rate_min = 1,
+ .shaper_shared_rate_max = UINT32_MAX,
+
+ .shaper_pkt_length_adjust_min = RTE_TM_ETH_FRAMING_OVERHEAD_FCS,
+ .shaper_pkt_length_adjust_max = RTE_TM_ETH_FRAMING_OVERHEAD_FCS,
+
+ .sched_n_children_max = UINT32_MAX,
+ .sched_sp_n_priorities_max = RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE,
+ .sched_wfq_n_children_per_group_max = UINT32_MAX,
+ .sched_wfq_n_groups_max = 1,
+ .sched_wfq_weight_max = UINT32_MAX,
+
+ .cman_head_drop_supported = 0,
+ .cman_wred_context_n_max = 0,
+ .cman_wred_context_private_n_max = 0,
+ .cman_wred_context_shared_n_max = 0,
+ .cman_wred_context_shared_n_nodes_per_context_max = 0,
+ .cman_wred_context_shared_n_contexts_per_node_max = 0,
+
+ .mark_vlan_dei_supported = {0, 0, 0},
+ .mark_ip_ecn_tcp_supported = {0, 0, 0},
+ .mark_ip_ecn_sctp_supported = {0, 0, 0},
+ .mark_ip_dscp_supported = {0, 0, 0},
+
+ .dynamic_update_mask = 0,
+
+ .stats_mask = STATS_MASK_QUEUE,
+};
+
+/* Traffic manager capabilities get */
+static int
+pmd_tm_capabilities_get(struct rte_eth_dev *dev __rte_unused,
+ struct rte_tm_capabilities *cap,
+ struct rte_tm_error *error)
+{
+ if (cap == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_CAPABILITIES,
+ NULL,
+ rte_strerror(EINVAL));
+
+ memcpy(cap, &tm_cap, sizeof(*cap));
+
+ cap->n_nodes_max = tm_level_get_max_nodes(dev, TM_NODE_LEVEL_PORT) +
+ tm_level_get_max_nodes(dev, TM_NODE_LEVEL_SUBPORT) +
+ tm_level_get_max_nodes(dev, TM_NODE_LEVEL_PIPE) +
+ tm_level_get_max_nodes(dev, TM_NODE_LEVEL_TC) +
+ tm_level_get_max_nodes(dev, TM_NODE_LEVEL_QUEUE);
+
+ cap->shaper_private_n_max =
+ tm_level_get_max_nodes(dev, TM_NODE_LEVEL_PORT) +
+ tm_level_get_max_nodes(dev, TM_NODE_LEVEL_SUBPORT) +
+ tm_level_get_max_nodes(dev, TM_NODE_LEVEL_PIPE) +
+ tm_level_get_max_nodes(dev, TM_NODE_LEVEL_TC);
+
+ cap->shaper_shared_n_max = RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE *
+ tm_level_get_max_nodes(dev, TM_NODE_LEVEL_SUBPORT);
+
+ cap->shaper_n_max = cap->shaper_private_n_max +
+ cap->shaper_shared_n_max;
+
+ cap->shaper_shared_n_nodes_per_shaper_max =
+ tm_level_get_max_nodes(dev, TM_NODE_LEVEL_PIPE);
+
+ cap->sched_n_children_max = RTE_MAX(
+ tm_level_get_max_nodes(dev, TM_NODE_LEVEL_PIPE),
+ (uint32_t) RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE);
+
+ cap->sched_wfq_n_children_per_group_max = cap->sched_n_children_max;
+
+ if (WRED_SUPPORTED)
+ cap->cman_wred_context_private_n_max =
+ tm_level_get_max_nodes(dev, TM_NODE_LEVEL_QUEUE);
+
+ cap->cman_wred_context_n_max = cap->cman_wred_context_private_n_max +
+ cap->cman_wred_context_shared_n_max;
+
+ return 0;
+}
+
+static const struct rte_tm_level_capabilities tm_level_cap[] = {
+ [TM_NODE_LEVEL_PORT] = {
+ .n_nodes_max = 1,
+ .n_nodes_nonleaf_max = 1,
+ .n_nodes_leaf_max = 0,
+ .non_leaf_nodes_identical = 1,
+ .leaf_nodes_identical = 0,
+
+ .nonleaf = {
+ .shaper_private_supported = 1,
+ .shaper_private_dual_rate_supported = 0,
+ .shaper_private_rate_min = 1,
+ .shaper_private_rate_max = UINT32_MAX,
+ .shaper_shared_n_max = 0,
+
+ .sched_n_children_max = UINT32_MAX,
+ .sched_sp_n_priorities_max = 1,
+ .sched_wfq_n_children_per_group_max = UINT32_MAX,
+ .sched_wfq_n_groups_max = 1,
+ .sched_wfq_weight_max = 1,
+
+ .stats_mask = STATS_MASK_DEFAULT,
+ },
+ },
+
+ [TM_NODE_LEVEL_SUBPORT] = {
+ .n_nodes_max = UINT32_MAX,
+ .n_nodes_nonleaf_max = UINT32_MAX,
+ .n_nodes_leaf_max = 0,
+ .non_leaf_nodes_identical = 1,
+ .leaf_nodes_identical = 0,
+
+ .nonleaf = {
+ .shaper_private_supported = 1,
+ .shaper_private_dual_rate_supported = 0,
+ .shaper_private_rate_min = 1,
+ .shaper_private_rate_max = UINT32_MAX,
+ .shaper_shared_n_max = 0,
+
+ .sched_n_children_max = UINT32_MAX,
+ .sched_sp_n_priorities_max = 1,
+ .sched_wfq_n_children_per_group_max = UINT32_MAX,
+ .sched_wfq_n_groups_max = 1,
+#ifdef RTE_SCHED_SUBPORT_TC_OV
+ .sched_wfq_weight_max = UINT32_MAX,
+#else
+ .sched_wfq_weight_max = 1,
+#endif
+ .stats_mask = STATS_MASK_DEFAULT,
+ },
+ },
+
+ [TM_NODE_LEVEL_PIPE] = {
+ .n_nodes_max = UINT32_MAX,
+ .n_nodes_nonleaf_max = UINT32_MAX,
+ .n_nodes_leaf_max = 0,
+ .non_leaf_nodes_identical = 1,
+ .leaf_nodes_identical = 0,
+
+ .nonleaf = {
+ .shaper_private_supported = 1,
+ .shaper_private_dual_rate_supported = 0,
+ .shaper_private_rate_min = 1,
+ .shaper_private_rate_max = UINT32_MAX,
+ .shaper_shared_n_max = 0,
+
+ .sched_n_children_max =
+ RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE,
+ .sched_sp_n_priorities_max =
+ RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE,
+ .sched_wfq_n_children_per_group_max = 1,
+ .sched_wfq_n_groups_max = 0,
+ .sched_wfq_weight_max = 1,
+
+ .stats_mask = STATS_MASK_DEFAULT,
+ },
+ },
+
+ [TM_NODE_LEVEL_TC] = {
+ .n_nodes_max = UINT32_MAX,
+ .n_nodes_nonleaf_max = UINT32_MAX,
+ .n_nodes_leaf_max = 0,
+ .non_leaf_nodes_identical = 1,
+ .leaf_nodes_identical = 0,
+
+ .nonleaf = {
+ .shaper_private_supported = 1,
+ .shaper_private_dual_rate_supported = 0,
+ .shaper_private_rate_min = 1,
+ .shaper_private_rate_max = UINT32_MAX,
+ .shaper_shared_n_max = 1,
+
+ .sched_n_children_max =
+ RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS,
+ .sched_sp_n_priorities_max = 1,
+ .sched_wfq_n_children_per_group_max =
+ RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS,
+ .sched_wfq_n_groups_max = 1,
+ .sched_wfq_weight_max = UINT32_MAX,
+
+ .stats_mask = STATS_MASK_DEFAULT,
+ },
+ },
+
+ [TM_NODE_LEVEL_QUEUE] = {
+ .n_nodes_max = UINT32_MAX,
+ .n_nodes_nonleaf_max = 0,
+ .n_nodes_leaf_max = UINT32_MAX,
+ .non_leaf_nodes_identical = 0,
+ .leaf_nodes_identical = 1,
+
+ .leaf = {
+ .shaper_private_supported = 0,
+ .shaper_private_dual_rate_supported = 0,
+ .shaper_private_rate_min = 0,
+ .shaper_private_rate_max = 0,
+ .shaper_shared_n_max = 0,
+
+ .cman_head_drop_supported = 0,
+ .cman_wred_context_private_supported = WRED_SUPPORTED,
+ .cman_wred_context_shared_n_max = 0,
+
+ .stats_mask = STATS_MASK_QUEUE,
+ },
+ },
+};
+
+/* Traffic manager level capabilities get */
+static int
+pmd_tm_level_capabilities_get(struct rte_eth_dev *dev __rte_unused,
+ uint32_t level_id,
+ struct rte_tm_level_capabilities *cap,
+ struct rte_tm_error *error)
+{
+ if (cap == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_CAPABILITIES,
+ NULL,
+ rte_strerror(EINVAL));
+
+ if (level_id >= TM_NODE_LEVEL_MAX)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_LEVEL_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ memcpy(cap, &tm_level_cap[level_id], sizeof(*cap));
+
+ switch (level_id) {
+ case TM_NODE_LEVEL_PORT:
+ cap->nonleaf.sched_n_children_max =
+ tm_level_get_max_nodes(dev,
+ TM_NODE_LEVEL_SUBPORT);
+ cap->nonleaf.sched_wfq_n_children_per_group_max =
+ cap->nonleaf.sched_n_children_max;
+ break;
+
+ case TM_NODE_LEVEL_SUBPORT:
+ cap->n_nodes_max = tm_level_get_max_nodes(dev,
+ TM_NODE_LEVEL_SUBPORT);
+ cap->n_nodes_nonleaf_max = cap->n_nodes_max;
+ cap->nonleaf.sched_n_children_max =
+ tm_level_get_max_nodes(dev,
+ TM_NODE_LEVEL_PIPE);
+ cap->nonleaf.sched_wfq_n_children_per_group_max =
+ cap->nonleaf.sched_n_children_max;
+ break;
+
+ case TM_NODE_LEVEL_PIPE:
+ cap->n_nodes_max = tm_level_get_max_nodes(dev,
+ TM_NODE_LEVEL_PIPE);
+ cap->n_nodes_nonleaf_max = cap->n_nodes_max;
+ break;
+
+ case TM_NODE_LEVEL_TC:
+ cap->n_nodes_max = tm_level_get_max_nodes(dev,
+ TM_NODE_LEVEL_TC);
+ cap->n_nodes_nonleaf_max = cap->n_nodes_max;
+ break;
+
+ case TM_NODE_LEVEL_QUEUE:
+ default:
+ cap->n_nodes_max = tm_level_get_max_nodes(dev,
+ TM_NODE_LEVEL_QUEUE);
+ cap->n_nodes_leaf_max = cap->n_nodes_max;
+ break;
+ }
+
+ return 0;
+}
+
+static const struct rte_tm_node_capabilities tm_node_cap[] = {
+ [TM_NODE_LEVEL_PORT] = {
+ .shaper_private_supported = 1,
+ .shaper_private_dual_rate_supported = 0,
+ .shaper_private_rate_min = 1,
+ .shaper_private_rate_max = UINT32_MAX,
+ .shaper_shared_n_max = 0,
+
+ .nonleaf = {
+ .sched_n_children_max = UINT32_MAX,
+ .sched_sp_n_priorities_max = 1,
+ .sched_wfq_n_children_per_group_max = UINT32_MAX,
+ .sched_wfq_n_groups_max = 1,
+ .sched_wfq_weight_max = 1,
+ },
+
+ .stats_mask = STATS_MASK_DEFAULT,
+ },
+
+ [TM_NODE_LEVEL_SUBPORT] = {
+ .shaper_private_supported = 1,
+ .shaper_private_dual_rate_supported = 0,
+ .shaper_private_rate_min = 1,
+ .shaper_private_rate_max = UINT32_MAX,
+ .shaper_shared_n_max = 0,
+
+ .nonleaf = {
+ .sched_n_children_max = UINT32_MAX,
+ .sched_sp_n_priorities_max = 1,
+ .sched_wfq_n_children_per_group_max = UINT32_MAX,
+ .sched_wfq_n_groups_max = 1,
+ .sched_wfq_weight_max = UINT32_MAX,
+ },
+
+ .stats_mask = STATS_MASK_DEFAULT,
+ },
+
+ [TM_NODE_LEVEL_PIPE] = {
+ .shaper_private_supported = 1,
+ .shaper_private_dual_rate_supported = 0,
+ .shaper_private_rate_min = 1,
+ .shaper_private_rate_max = UINT32_MAX,
+ .shaper_shared_n_max = 0,
+
+ .nonleaf = {
+ .sched_n_children_max =
+ RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE,
+ .sched_sp_n_priorities_max =
+ RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE,
+ .sched_wfq_n_children_per_group_max = 1,
+ .sched_wfq_n_groups_max = 0,
+ .sched_wfq_weight_max = 1,
+ },
+
+ .stats_mask = STATS_MASK_DEFAULT,
+ },
+
+ [TM_NODE_LEVEL_TC] = {
+ .shaper_private_supported = 1,
+ .shaper_private_dual_rate_supported = 0,
+ .shaper_private_rate_min = 1,
+ .shaper_private_rate_max = UINT32_MAX,
+ .shaper_shared_n_max = 1,
+
+ .nonleaf = {
+ .sched_n_children_max =
+ RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS,
+ .sched_sp_n_priorities_max = 1,
+ .sched_wfq_n_children_per_group_max =
+ RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS,
+ .sched_wfq_n_groups_max = 1,
+ .sched_wfq_weight_max = UINT32_MAX,
+ },
+
+ .stats_mask = STATS_MASK_DEFAULT,
+ },
+
+ [TM_NODE_LEVEL_QUEUE] = {
+ .shaper_private_supported = 0,
+ .shaper_private_dual_rate_supported = 0,
+ .shaper_private_rate_min = 0,
+ .shaper_private_rate_max = 0,
+ .shaper_shared_n_max = 0,
+
+
+ .leaf = {
+ .cman_head_drop_supported = 0,
+ .cman_wred_context_private_supported = WRED_SUPPORTED,
+ .cman_wred_context_shared_n_max = 0,
+ },
+
+ .stats_mask = STATS_MASK_QUEUE,
+ },
+};
+
+/* Traffic manager node capabilities get */
+static int
+pmd_tm_node_capabilities_get(struct rte_eth_dev *dev __rte_unused,
+ uint32_t node_id,
+ struct rte_tm_node_capabilities *cap,
+ struct rte_tm_error *error)
+{
+ struct tm_node *tm_node;
+
+ if (cap == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_CAPABILITIES,
+ NULL,
+ rte_strerror(EINVAL));
+
+ tm_node = tm_node_search(dev, node_id);
+ if (tm_node == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ memcpy(cap, &tm_node_cap[tm_node->level], sizeof(*cap));
+
+ switch (tm_node->level) {
+ case TM_NODE_LEVEL_PORT:
+ cap->nonleaf.sched_n_children_max =
+ tm_level_get_max_nodes(dev,
+ TM_NODE_LEVEL_SUBPORT);
+ cap->nonleaf.sched_wfq_n_children_per_group_max =
+ cap->nonleaf.sched_n_children_max;
+ break;
+
+ case TM_NODE_LEVEL_SUBPORT:
+ cap->nonleaf.sched_n_children_max =
+ tm_level_get_max_nodes(dev,
+ TM_NODE_LEVEL_PIPE);
+ cap->nonleaf.sched_wfq_n_children_per_group_max =
+ cap->nonleaf.sched_n_children_max;
+ break;
+
+ case TM_NODE_LEVEL_PIPE:
+ case TM_NODE_LEVEL_TC:
+ case TM_NODE_LEVEL_QUEUE:
+ default:
+ break;
+ }
+
+ return 0;
+}
+
+const struct rte_tm_ops pmd_tm_ops = {
+ .capabilities_get = pmd_tm_capabilities_get,
+ .level_capabilities_get = pmd_tm_level_capabilities_get,
+ .node_capabilities_get = pmd_tm_node_capabilities_get,
+};
--
2.9.3
* [dpdk-dev] [PATCH v3 4/4] net/softnic: add TM hierarchy related ops
2017-08-11 12:49 ` [dpdk-dev] [PATCH v3 0/4] net/softnic: sw fall-back pmd for traffic mgmt and others Jasvinder Singh
` (2 preceding siblings ...)
2017-08-11 12:49 ` [dpdk-dev] [PATCH v3 3/4] net/softnic: add TM capabilities ops Jasvinder Singh
@ 2017-08-11 12:49 ` Jasvinder Singh
2017-09-08 17:08 ` [dpdk-dev] [PATCH v3 0/4] net/softnic: sw fall-back pmd for traffic mgmt and others Dumitrescu, Cristian
4 siblings, 0 replies; 79+ messages in thread
From: Jasvinder Singh @ 2017-08-11 12:49 UTC (permalink / raw)
To: dev; +Cc: cristian.dumitrescu, ferruh.yigit, thomas
Implement ethdev TM hierarchy related APIs in SoftNIC PMD.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
---
v3 changes:
- add dynamic update ops, tm stats ops
- add wred congestion management ops
- improve hierarchy commit ops implementation
- add more checks to all the ops
v2 changes:
- add TM functions for hierarchical QoS scheduler
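As an illustration of the intended flow, here is a hedged sketch (the node
and profile ids, the token-bucket size and the root-only hierarchy are
assumptions; a real setup adds subport, pipe, TC and queue nodes before
committing) that drives the new ops through the generic rte_tm calls:
#include <string.h>
#include <rte_tm.h>
static int
softnic_tm_root_add(uint8_t port_id, uint64_t port_rate_bytes_per_sec)
{
	struct rte_tm_error error;
	struct rte_tm_shaper_params sp;
	struct rte_tm_node_params np;
	int status;
	/* Private shaper profile for the port-level (root) node */
	memset(&sp, 0, sizeof(sp));
	sp.peak.rate = port_rate_bytes_per_sec;
	sp.peak.size = 1000000;
	sp.pkt_length_adjust = RTE_TM_ETH_FRAMING_OVERHEAD_FCS;
	status = rte_tm_shaper_profile_add(port_id, 0, &sp, &error);
	if (status)
		return status;
	/* Root node: no parent, level 0 (port level in this driver) */
	memset(&np, 0, sizeof(np));
	np.shaper_profile_id = 0;
	np.nonleaf.n_sp_priorities = 1;
	status = rte_tm_node_add(port_id, 1000000, RTE_TM_NODE_ID_NULL,
		0, 1, 0, &np, &error);
	if (status)
		return status;
	/* Subport/pipe/TC/queue nodes must be added the same way before
	 * the commit below is accepted by the driver.
	 */
	return rte_tm_hierarchy_commit(port_id, 1, &error);
}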
drivers/net/softnic/rte_eth_softnic_internals.h | 41 +
drivers/net/softnic/rte_eth_softnic_tm.c | 2774 ++++++++++++++++++++++-
2 files changed, 2811 insertions(+), 4 deletions(-)
diff --git a/drivers/net/softnic/rte_eth_softnic_internals.h b/drivers/net/softnic/rte_eth_softnic_internals.h
index a43aef9..4c3cac8 100644
--- a/drivers/net/softnic/rte_eth_softnic_internals.h
+++ b/drivers/net/softnic/rte_eth_softnic_internals.h
@@ -138,6 +138,36 @@ enum tm_node_level {
TM_NODE_LEVEL_MAX,
};
+/* TM Shaper Profile */
+struct tm_shaper_profile {
+ TAILQ_ENTRY(tm_shaper_profile) node;
+ uint32_t shaper_profile_id;
+ uint32_t n_users;
+ struct rte_tm_shaper_params params;
+};
+
+TAILQ_HEAD(tm_shaper_profile_list, tm_shaper_profile);
+
+/* TM Shared Shaper */
+struct tm_shared_shaper {
+ TAILQ_ENTRY(tm_shared_shaper) node;
+ uint32_t shared_shaper_id;
+ uint32_t n_users;
+ uint32_t shaper_profile_id;
+};
+
+TAILQ_HEAD(tm_shared_shaper_list, tm_shared_shaper);
+
+/* TM WRED Profile */
+struct tm_wred_profile {
+ TAILQ_ENTRY(tm_wred_profile) node;
+ uint32_t wred_profile_id;
+ uint32_t n_users;
+ struct rte_tm_wred_params params;
+};
+
+TAILQ_HEAD(tm_wred_profile_list, tm_wred_profile);
+
/* TM Node */
struct tm_node {
TAILQ_ENTRY(tm_node) node;
@@ -147,6 +177,8 @@ struct tm_node {
uint32_t weight;
uint32_t level;
struct tm_node *parent_node;
+ struct tm_shaper_profile *shaper_profile;
+ struct tm_wred_profile *wred_profile;
struct rte_tm_node_params params;
struct rte_tm_node_stats stats;
uint32_t n_children;
@@ -156,8 +188,16 @@ TAILQ_HEAD(tm_node_list, tm_node);
/* TM Hierarchy Specification */
struct tm_hierarchy {
+ struct tm_shaper_profile_list shaper_profiles;
+ struct tm_shared_shaper_list shared_shapers;
+ struct tm_wred_profile_list wred_profiles;
struct tm_node_list nodes;
+ uint32_t n_shaper_profiles;
+ uint32_t n_shared_shapers;
+ uint32_t n_wred_profiles;
+ uint32_t n_nodes;
+
uint32_t n_tm_nodes[TM_NODE_LEVEL_MAX];
};
@@ -170,6 +210,7 @@ struct tm_internals {
* sense to keep the hierarchy frozen after the port is started.
*/
struct tm_hierarchy h;
+ int hierarchy_frozen;
/** Blueprints */
struct tm_params params;
diff --git a/drivers/net/softnic/rte_eth_softnic_tm.c b/drivers/net/softnic/rte_eth_softnic_tm.c
index 35bb084..8cf7618 100644
--- a/drivers/net/softnic/rte_eth_softnic_tm.c
+++ b/drivers/net/softnic/rte_eth_softnic_tm.c
@@ -85,6 +85,79 @@ tm_params_check(struct pmd_params *params, uint32_t hard_rate)
return 0;
}
+static void
+tm_hierarchy_init(struct pmd_internals *p)
+{
+ memset(&p->soft.tm.h, 0, sizeof(p->soft.tm.h));
+
+ /* Initialize shaper profile list */
+ TAILQ_INIT(&p->soft.tm.h.shaper_profiles);
+
+ /* Initialize shared shaper list */
+ TAILQ_INIT(&p->soft.tm.h.shared_shapers);
+
+ /* Initialize wred profile list */
+ TAILQ_INIT(&p->soft.tm.h.wred_profiles);
+
+ /* Initialize TM node list */
+ TAILQ_INIT(&p->soft.tm.h.nodes);
+}
+
+static void
+tm_hierarchy_uninit(struct pmd_internals *p)
+{
+ /* Remove all nodes*/
+ for ( ; ; ) {
+ struct tm_node *tm_node;
+
+ tm_node = TAILQ_FIRST(&p->soft.tm.h.nodes);
+ if (tm_node == NULL)
+ break;
+
+ TAILQ_REMOVE(&p->soft.tm.h.nodes, tm_node, node);
+ free(tm_node);
+ }
+
+ /* Remove all WRED profiles */
+ for ( ; ; ) {
+ struct tm_wred_profile *wred_profile;
+
+ wred_profile = TAILQ_FIRST(&p->soft.tm.h.wred_profiles);
+ if (wred_profile == NULL)
+ break;
+
+ TAILQ_REMOVE(&p->soft.tm.h.wred_profiles, wred_profile, node);
+ free(wred_profile);
+ }
+
+ /* Remove all shared shapers */
+ for ( ; ; ) {
+ struct tm_shared_shaper *shared_shaper;
+
+ shared_shaper = TAILQ_FIRST(&p->soft.tm.h.shared_shapers);
+ if (shared_shaper == NULL)
+ break;
+
+ TAILQ_REMOVE(&p->soft.tm.h.shared_shapers, shared_shaper, node);
+ free(shared_shaper);
+ }
+
+ /* Remove all shaper profiles */
+ for ( ; ; ) {
+ struct tm_shaper_profile *shaper_profile;
+
+ shaper_profile = TAILQ_FIRST(&p->soft.tm.h.shaper_profiles);
+ if (shaper_profile == NULL)
+ break;
+
+ TAILQ_REMOVE(&p->soft.tm.h.shaper_profiles,
+ shaper_profile, node);
+ free(shaper_profile);
+ }
+
+ memset(&p->soft.tm.h, 0, sizeof(p->soft.tm.h));
+}
+
int
tm_init(struct pmd_internals *p,
struct pmd_params *params,
@@ -111,12 +184,15 @@ tm_init(struct pmd_internals *p,
return -ENOMEM;
}
+ tm_hierarchy_init(p);
+
return 0;
}
void
tm_free(struct pmd_internals *p)
{
+ tm_hierarchy_uninit(p);
rte_free(p->soft.tm.pkts_enq);
rte_free(p->soft.tm.pkts_deq);
}
@@ -128,6 +204,10 @@ tm_start(struct pmd_internals *p)
uint32_t n_subports, subport_id;
int status;
+ /* Is hierarchy frozen? */
+ if (p->soft.tm.hierarchy_frozen == 0)
+ return -1;
+
/* Port */
p->soft.tm.sched = rte_sched_port_config(&t->port_params);
if (p->soft.tm.sched == NULL)
@@ -177,6 +257,51 @@ tm_stop(struct pmd_internals *p)
{
if (p->soft.tm.sched)
rte_sched_port_free(p->soft.tm.sched);
+
+ /* Unfreeze hierarchy */
+ p->soft.tm.hierarchy_frozen = 0;
+}
+
+static struct tm_shaper_profile *
+tm_shaper_profile_search(struct rte_eth_dev *dev, uint32_t shaper_profile_id)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_shaper_profile_list *spl = &p->soft.tm.h.shaper_profiles;
+ struct tm_shaper_profile *sp;
+
+ TAILQ_FOREACH(sp, spl, node)
+ if (shaper_profile_id == sp->shaper_profile_id)
+ return sp;
+
+ return NULL;
+}
+
+static struct tm_shared_shaper *
+tm_shared_shaper_search(struct rte_eth_dev *dev, uint32_t shared_shaper_id)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_shared_shaper_list *ssl = &p->soft.tm.h.shared_shapers;
+ struct tm_shared_shaper *ss;
+
+ TAILQ_FOREACH(ss, ssl, node)
+ if (shared_shaper_id == ss->shared_shaper_id)
+ return ss;
+
+ return NULL;
+}
+
+static struct tm_wred_profile *
+tm_wred_profile_search(struct rte_eth_dev *dev, uint32_t wred_profile_id)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_wred_profile_list *wpl = &p->soft.tm.h.wred_profiles;
+ struct tm_wred_profile *wp;
+
+ TAILQ_FOREACH(wp, wpl, node)
+ if (wred_profile_id == wp->wred_profile_id)
+ return wp;
+
+ return NULL;
}
static struct tm_node *
@@ -193,6 +318,94 @@ tm_node_search(struct rte_eth_dev *dev, uint32_t node_id)
return NULL;
}
+static struct tm_node *
+tm_root_node_present(struct rte_eth_dev *dev)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_node_list *nl = &p->soft.tm.h.nodes;
+ struct tm_node *n;
+
+ TAILQ_FOREACH(n, nl, node)
+ if (n->parent_node_id == RTE_TM_NODE_ID_NULL)
+ return n;
+
+ return NULL;
+}
+
+static uint32_t
+tm_node_subport_id(struct rte_eth_dev *dev, struct tm_node *subport_node)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_node_list *nl = &p->soft.tm.h.nodes;
+ struct tm_node *ns;
+ uint32_t subport_id;
+
+ subport_id = 0;
+ TAILQ_FOREACH(ns, nl, node) {
+ if (ns->level != TM_NODE_LEVEL_SUBPORT)
+ continue;
+
+ if (ns->node_id == subport_node->node_id)
+ return subport_id;
+
+ subport_id++;
+ }
+
+ return UINT32_MAX;
+}
+
+static uint32_t
+tm_node_pipe_id(struct rte_eth_dev *dev, struct tm_node *pipe_node)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_node_list *nl = &p->soft.tm.h.nodes;
+ struct tm_node *np;
+ uint32_t pipe_id;
+
+ pipe_id = 0;
+ TAILQ_FOREACH(np, nl, node) {
+ if ((np->level != TM_NODE_LEVEL_PIPE) ||
+ (np->parent_node_id != pipe_node->parent_node_id))
+ continue;
+
+ if (np->node_id == pipe_node->node_id)
+ return pipe_id;
+
+ pipe_id++;
+ }
+
+ return UINT32_MAX;
+}
+
+static uint32_t
+tm_node_tc_id(struct rte_eth_dev *dev __rte_unused, struct tm_node *tc_node)
+{
+ return tc_node->priority;
+}
+
+static uint32_t
+tm_node_queue_id(struct rte_eth_dev *dev, struct tm_node *queue_node)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_node_list *nl = &p->soft.tm.h.nodes;
+ struct tm_node *nq;
+ uint32_t queue_id;
+
+ queue_id = 0;
+ TAILQ_FOREACH(nq, nl, node) {
+ if ((nq->level != TM_NODE_LEVEL_QUEUE) ||
+ (nq->parent_node_id != queue_node->parent_node_id))
+ continue;
+
+ if (nq->node_id == queue_node->node_id)
+ return queue_id;
+
+ queue_id++;
+ }
+
+ return UINT32_MAX;
+}
+
static uint32_t
tm_level_get_max_nodes(struct rte_eth_dev *dev, enum tm_node_level level)
{
@@ -218,6 +431,35 @@ tm_level_get_max_nodes(struct rte_eth_dev *dev, enum tm_node_level level)
}
}
+/* Traffic manager node type get */
+static int
+pmd_tm_node_type_get(struct rte_eth_dev *dev,
+ uint32_t node_id,
+ int *is_leaf,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ if (is_leaf == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+
+ if ((node_id == RTE_TM_NODE_ID_NULL) ||
+ (tm_node_search(dev, node_id) == NULL))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ *is_leaf = node_id < p->params.soft.tm.nb_queues;
+
+ return 0;
+}
+
#ifdef RTE_SCHED_RED
#define WRED_SUPPORTED 1
#else
@@ -673,8 +915,2532 @@ pmd_tm_node_capabilities_get(struct rte_eth_dev *dev __rte_unused,
return 0;
}
-const struct rte_tm_ops pmd_tm_ops = {
- .capabilities_get = pmd_tm_capabilities_get,
- .level_capabilities_get = pmd_tm_level_capabilities_get,
- .node_capabilities_get = pmd_tm_node_capabilities_get,
+static int
+shaper_profile_check(struct rte_eth_dev *dev,
+ uint32_t shaper_profile_id,
+ struct rte_tm_shaper_params *profile,
+ struct rte_tm_error *error)
+{
+ struct tm_shaper_profile *sp;
+
+ /* Shaper profile ID must not be NONE. */
+ if (shaper_profile_id == RTE_TM_SHAPER_PROFILE_ID_NONE)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_SHAPER_PROFILE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Shaper profile must not exist. */
+ sp = tm_shaper_profile_search(dev, shaper_profile_id);
+ if (sp)
+ return -rte_tm_error_set(error,
+ EEXIST,
+ RTE_TM_ERROR_TYPE_SHAPER_PROFILE_ID,
+ NULL,
+ rte_strerror(EEXIST));
+
+ /* Profile must not be NULL. */
+ if (profile == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_SHAPER_PROFILE,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Peak rate: non-zero, 32-bit */
+ if ((profile->peak.rate == 0) ||
+ (profile->peak.rate >= UINT32_MAX))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_SHAPER_PROFILE_PEAK_RATE,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Peak size: non-zero, 32-bit */
+ if ((profile->peak.size == 0) ||
+ (profile->peak.size >= UINT32_MAX))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_SHAPER_PROFILE_PEAK_SIZE,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Dual-rate profiles are not supported. */
+ if (profile->committed.rate != 0)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_SHAPER_PROFILE_COMMITTED_RATE,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Packet length adjust: 24 bytes */
+ if (profile->pkt_length_adjust != RTE_TM_ETH_FRAMING_OVERHEAD_FCS)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_SHAPER_PROFILE_PKT_ADJUST_LEN,
+ NULL,
+ rte_strerror(EINVAL));
+
+ return 0;
+}
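+
+/*
+ * Illustrative example (values are arbitrary, not part of the driver):
+ * a profile that passes shaper_profile_check() above -- single rate
+ * (committed rate left at zero), peak rate and bucket size below 2^32,
+ * and the packet length adjust set to the Ethernet framing overhead
+ * plus FCS (24 bytes), as required by the check:
+ *
+ *	struct rte_tm_shaper_params sp_example = {
+ *		.committed = {.rate = 0, .size = 0},
+ *		.peak = {.rate = 1250000000, .size = 1000000},
+ *		.pkt_length_adjust = RTE_TM_ETH_FRAMING_OVERHEAD_FCS,
+ *	};
+ */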
+
+/* Traffic manager shaper profile add */
+static int
+pmd_tm_shaper_profile_add(struct rte_eth_dev *dev,
+ uint32_t shaper_profile_id,
+ struct rte_tm_shaper_params *profile,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_shaper_profile_list *spl = &p->soft.tm.h.shaper_profiles;
+ struct tm_shaper_profile *sp;
+ int status;
+
+ /* Check input params */
+ status = shaper_profile_check(dev, shaper_profile_id, profile, error);
+ if (status)
+ return status;
+
+ /* Memory allocation */
+ sp = calloc(1, sizeof(struct tm_shaper_profile));
+ if (sp == NULL)
+ return -rte_tm_error_set(error,
+ ENOMEM,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(ENOMEM));
+
+ /* Fill in */
+ sp->shaper_profile_id = shaper_profile_id;
+ memcpy(&sp->params, profile, sizeof(sp->params));
+
+ /* Add to list */
+ TAILQ_INSERT_TAIL(spl, sp, node);
+ p->soft.tm.h.n_shaper_profiles++;
+
+ return 0;
+}
+
+/* Traffic manager shaper profile delete */
+static int
+pmd_tm_shaper_profile_delete(struct rte_eth_dev *dev,
+ uint32_t shaper_profile_id,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_shaper_profile *sp;
+
+ /* Check existing */
+ sp = tm_shaper_profile_search(dev, shaper_profile_id);
+ if (sp == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_SHAPER_PROFILE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Check unused */
+ if (sp->n_users)
+ return -rte_tm_error_set(error,
+ EBUSY,
+ RTE_TM_ERROR_TYPE_SHAPER_PROFILE_ID,
+ NULL,
+ rte_strerror(EBUSY));
+
+ /* Remove from list */
+ TAILQ_REMOVE(&p->soft.tm.h.shaper_profiles, sp, node);
+ p->soft.tm.h.n_shaper_profiles--;
+ free(sp);
+
+ return 0;
+}
+
+static struct tm_node *
+tm_shared_shaper_get_tc(struct rte_eth_dev *dev,
+ struct tm_shared_shaper *ss)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_node_list *nl = &p->soft.tm.h.nodes;
+ struct tm_node *n;
+
+ TAILQ_FOREACH(n, nl, node) {
+ if ((n->level != TM_NODE_LEVEL_TC) ||
+ (n->params.n_shared_shapers == 0) ||
+ (n->params.shared_shaper_id[0] != ss->shared_shaper_id))
+ continue;
+
+ return n;
+ }
+
+ return NULL;
+}
+
+static int
+update_subport_tc_rate(struct rte_eth_dev *dev,
+ struct tm_node *nt,
+ struct tm_shared_shaper *ss,
+ struct tm_shaper_profile *sp_new)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ uint32_t tc_id = tm_node_tc_id(dev, nt);
+
+ struct tm_node *np = nt->parent_node;
+
+ struct tm_node *ns = np->parent_node;
+ uint32_t subport_id = tm_node_subport_id(dev, ns);
+
+ struct rte_sched_subport_params subport_params;
+
+ struct tm_shaper_profile *sp_old = tm_shaper_profile_search(dev,
+ ss->shaper_profile_id);
+
+ /* Derive new subport configuration. */
+ memcpy(&subport_params,
+ &p->soft.tm.params.subport_params[subport_id],
+ sizeof(subport_params));
+ subport_params.tc_rate[tc_id] = sp_new->params.peak.rate;
+
+ /* Update the subport configuration. */
+ if (rte_sched_subport_config(p->soft.tm.sched,
+ subport_id, &subport_params))
+ return -1;
+
+ /* Commit changes. */
+ sp_old->n_users--;
+
+ ss->shaper_profile_id = sp_new->shaper_profile_id;
+ sp_new->n_users++;
+
+ memcpy(&p->soft.tm.params.subport_params[subport_id],
+ &subport_params,
+ sizeof(subport_params));
+
+ return 0;
+}
+
+/* Traffic manager shared shaper add/update */
+static int
+pmd_tm_shared_shaper_add_update(struct rte_eth_dev *dev,
+ uint32_t shared_shaper_id,
+ uint32_t shaper_profile_id,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_shared_shaper *ss;
+ struct tm_shaper_profile *sp;
+ struct tm_node *nt;
+
+ /* Shaper profile must be valid. */
+ sp = tm_shaper_profile_search(dev, shaper_profile_id);
+ if (sp == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_SHAPER_PROFILE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /**
+ * Add new shared shaper
+ */
+ ss = tm_shared_shaper_search(dev, shared_shaper_id);
+ if (ss == NULL) {
+ struct tm_shared_shaper_list *ssl =
+ &p->soft.tm.h.shared_shapers;
+
+ /* Hierarchy must not be frozen */
+ if (p->soft.tm.hierarchy_frozen)
+ return -rte_tm_error_set(error,
+ EBUSY,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EBUSY));
+
+ /* Memory allocation */
+ ss = calloc(1, sizeof(struct tm_shared_shaper));
+ if (ss == NULL)
+ return -rte_tm_error_set(error,
+ ENOMEM,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(ENOMEM));
+
+ /* Fill in */
+ ss->shared_shaper_id = shared_shaper_id;
+ ss->shaper_profile_id = shaper_profile_id;
+
+ /* Add to list */
+ TAILQ_INSERT_TAIL(ssl, ss, node);
+ p->soft.tm.h.n_shared_shapers++;
+
+ return 0;
+ }
+
+ /**
+ * Update existing shared shaper
+ */
+ /* Hierarchy must be frozen (run-time update) */
+ if (p->soft.tm.hierarchy_frozen == 0)
+ return -rte_tm_error_set(error,
+ EBUSY,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EBUSY));
+
+ /* Propagate change. */
+ nt = tm_shared_shaper_get_tc(dev, ss);
+ if (update_subport_tc_rate(dev, nt, ss, sp))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+
+ return 0;
+}
+
+/* Traffic manager shared shaper delete */
+static int
+pmd_tm_shared_shaper_delete(struct rte_eth_dev *dev,
+ uint32_t shared_shaper_id,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_shared_shaper *ss;
+
+ /* Check existing */
+ ss = tm_shared_shaper_search(dev, shared_shaper_id);
+ if (ss == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_SHARED_SHAPER_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Check unused */
+ if (ss->n_users)
+ return -rte_tm_error_set(error,
+ EBUSY,
+ RTE_TM_ERROR_TYPE_SHARED_SHAPER_ID,
+ NULL,
+ rte_strerror(EBUSY));
+
+ /* Remove from list */
+ TAILQ_REMOVE(&p->soft.tm.h.shared_shapers, ss, node);
+ p->soft.tm.h.n_shared_shapers--;
+ free(ss);
+
+ return 0;
+}
+
+static int
+wred_profile_check(struct rte_eth_dev *dev,
+ uint32_t wred_profile_id,
+ struct rte_tm_wred_params *profile,
+ struct rte_tm_error *error)
+{
+ struct tm_wred_profile *wp;
+ enum rte_tm_color color;
+
+ /* WRED profile ID must not be NONE. */
+ if (wred_profile_id == RTE_TM_WRED_PROFILE_ID_NONE)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_WRED_PROFILE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* WRED profile must not exist. */
+ wp = tm_wred_profile_search(dev, wred_profile_id);
+ if (wp)
+ return -rte_tm_error_set(error,
+ EEXIST,
+ RTE_TM_ERROR_TYPE_WRED_PROFILE_ID,
+ NULL,
+ rte_strerror(EEXIST));
+
+ /* Profile must not be NULL. */
+ if (profile == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_WRED_PROFILE,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* min_th <= max_th, max_th > 0 */
+ for (color = RTE_TM_GREEN; color < RTE_TM_COLORS; color++) {
+ uint16_t min_th = profile->red_params[color].min_th;
+ uint16_t max_th = profile->red_params[color].max_th;
+
+ if ((min_th > max_th) || (max_th == 0))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_WRED_PROFILE,
+ NULL,
+ rte_strerror(EINVAL));
+ }
+
+ return 0;
+}
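+
+/*
+ * Illustrative thresholds (arbitrary values) that pass the check above:
+ * for every color, min_th <= max_th and max_th is non-zero, e.g.
+ *
+ *	.red_params[RTE_TM_GREEN]  = {.min_th = 48, .max_th = 64, ... },
+ *	.red_params[RTE_TM_YELLOW] = {.min_th = 40, .max_th = 64, ... },
+ *	.red_params[RTE_TM_RED]    = {.min_th = 32, .max_th = 64, ... },
+ *
+ * The remaining RED fields are not validated by this check.
+ */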
+
+/* Traffic manager WRED profile add */
+static int
+pmd_tm_wred_profile_add(struct rte_eth_dev *dev,
+ uint32_t wred_profile_id,
+ struct rte_tm_wred_params *profile,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_wred_profile_list *wpl = &p->soft.tm.h.wred_profiles;
+ struct tm_wred_profile *wp;
+ int status;
+
+ /* Check input params */
+ status = wred_profile_check(dev, wred_profile_id, profile, error);
+ if (status)
+ return status;
+
+ /* Memory allocation */
+ wp = calloc(1, sizeof(struct tm_wred_profile));
+ if (wp == NULL)
+ return -rte_tm_error_set(error,
+ ENOMEM,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(ENOMEM));
+
+ /* Fill in */
+ wp->wred_profile_id = wred_profile_id;
+ memcpy(&wp->params, profile, sizeof(wp->params));
+
+ /* Add to list */
+ TAILQ_INSERT_TAIL(wpl, wp, node);
+ p->soft.tm.h.n_wred_profiles++;
+
+ return 0;
+}
+
+/* Traffic manager WRED profile delete */
+static int
+pmd_tm_wred_profile_delete(struct rte_eth_dev *dev,
+ uint32_t wred_profile_id,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_wred_profile *wp;
+
+ /* Check existing */
+ wp = tm_wred_profile_search(dev, wred_profile_id);
+ if (wp == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_WRED_PROFILE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Check unused */
+ if (wp->n_users)
+ return -rte_tm_error_set(error,
+ EBUSY,
+ RTE_TM_ERROR_TYPE_WRED_PROFILE_ID,
+ NULL,
+ rte_strerror(EBUSY));
+
+ /* Remove from list */
+ TAILQ_REMOVE(&p->soft.tm.h.wred_profiles, wp, node);
+ p->soft.tm.h.n_wred_profiles--;
+ free(wp);
+
+ return 0;
+}
+
+static int
+node_add_check_port(struct rte_eth_dev *dev,
+ uint32_t node_id,
+ uint32_t parent_node_id __rte_unused,
+ uint32_t priority,
+ uint32_t weight,
+ uint32_t level_id __rte_unused,
+ struct rte_tm_node_params *params,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_shaper_profile *sp = tm_shaper_profile_search(dev,
+ params->shaper_profile_id);
+
+ /* node type: non-leaf */
+ if (node_id < p->params.soft.tm.nb_queues)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Priority must be 0 */
+ if (priority != 0)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PRIORITY,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Weight must be 1 */
+ if (weight != 1)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_WEIGHT,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Shaper must be valid.
+ * Shaper profile peak rate must fit the configured port rate.
+ */
+ if ((params->shaper_profile_id == RTE_TM_SHAPER_PROFILE_ID_NONE) ||
+ (sp == NULL) ||
+ (sp->params.peak.rate > p->params.soft.tm.rate))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_SHAPER_PROFILE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* No shared shapers */
+ if (params->n_shared_shapers != 0)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_N_SHARED_SHAPERS,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Number of SP priorities must be 1 */
+ if (params->nonleaf.n_sp_priorities != 1)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_N_SP_PRIORITIES,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Stats */
+ if (params->stats_mask & (~STATS_MASK_DEFAULT))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_STATS,
+ NULL,
+ rte_strerror(EINVAL));
+
+ return 0;
+}
+
+static int
+node_add_check_subport(struct rte_eth_dev *dev,
+ uint32_t node_id,
+ uint32_t parent_node_id __rte_unused,
+ uint32_t priority,
+ uint32_t weight,
+ uint32_t level_id __rte_unused,
+ struct rte_tm_node_params *params,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ /* node type: non-leaf */
+ if (node_id < p->params.soft.tm.nb_queues)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Priority must be 0 */
+ if (priority != 0)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PRIORITY,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Weight must be 1 */
+ if (weight != 1)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_WEIGHT,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Shaper must be valid */
+ if ((params->shaper_profile_id == RTE_TM_SHAPER_PROFILE_ID_NONE) ||
+ (!tm_shaper_profile_search(dev, params->shaper_profile_id)))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_SHAPER_PROFILE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* No shared shapers */
+ if (params->n_shared_shapers != 0)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_N_SHARED_SHAPERS,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Number of SP priorities must be 1 */
+ if (params->nonleaf.n_sp_priorities != 1)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_N_SP_PRIORITIES,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Stats */
+ if (params->stats_mask & (~STATS_MASK_DEFAULT))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_STATS,
+ NULL,
+ rte_strerror(EINVAL));
+
+ return 0;
+}
+
+static int
+node_add_check_pipe(struct rte_eth_dev *dev,
+ uint32_t node_id,
+ uint32_t parent_node_id __rte_unused,
+ uint32_t priority,
+ uint32_t weight __rte_unused,
+ uint32_t level_id __rte_unused,
+ struct rte_tm_node_params *params,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ /* node type: non-leaf */
+ if (node_id < p->params.soft.tm.nb_queues)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Priority must be 0 */
+ if (priority != 0)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PRIORITY,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Shaper must be valid */
+ if ((params->shaper_profile_id == RTE_TM_SHAPER_PROFILE_ID_NONE) ||
+ (!tm_shaper_profile_search(dev, params->shaper_profile_id)))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_SHAPER_PROFILE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* No shared shapers */
+ if (params->n_shared_shapers != 0)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_N_SHARED_SHAPERS,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Number of SP priorities must be 4 */
+ if (params->nonleaf.n_sp_priorities !=
+ RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_N_SP_PRIORITIES,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* WFQ mode must be byte mode */
+ if ((params->nonleaf.wfq_weight_mode != NULL) &&
+ (params->nonleaf.wfq_weight_mode[0] != 0) &&
+ (params->nonleaf.wfq_weight_mode[1] != 0) &&
+ (params->nonleaf.wfq_weight_mode[2] != 0) &&
+ (params->nonleaf.wfq_weight_mode[3] != 0))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_WFQ_WEIGHT_MODE,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Stats */
+ if (params->stats_mask & (~STATS_MASK_DEFAULT))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_STATS,
+ NULL,
+ rte_strerror(EINVAL));
+
+ return 0;
+}
+
+static int
+node_add_check_tc(struct rte_eth_dev *dev,
+ uint32_t node_id,
+ uint32_t parent_node_id __rte_unused,
+ uint32_t priority __rte_unused,
+ uint32_t weight,
+ uint32_t level_id __rte_unused,
+ struct rte_tm_node_params *params,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ /* node type: non-leaf */
+ if (node_id < p->params.soft.tm.nb_queues)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Weight must be 1 */
+ if (weight != 1)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_WEIGHT,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Shaper must be valid */
+ if ((params->shaper_profile_id == RTE_TM_SHAPER_PROFILE_ID_NONE) ||
+ (!tm_shaper_profile_search(dev, params->shaper_profile_id)))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_SHAPER_PROFILE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Single valid shared shaper */
+ if (params->n_shared_shapers > 1)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_N_SHARED_SHAPERS,
+ NULL,
+ rte_strerror(EINVAL));
+
+ if ((params->n_shared_shapers == 1) &&
+ ((params->shared_shaper_id == NULL) ||
+ (!tm_shared_shaper_search(dev, params->shared_shaper_id[0]))))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_SHARED_SHAPER_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+	/* Number of SP priorities must be 1 */
+ if (params->nonleaf.n_sp_priorities != 1)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_N_SP_PRIORITIES,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Stats */
+ if (params->stats_mask & (~STATS_MASK_DEFAULT))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_STATS,
+ NULL,
+ rte_strerror(EINVAL));
+
+ return 0;
+}
+
+static int
+node_add_check_queue(struct rte_eth_dev *dev,
+ uint32_t node_id,
+ uint32_t parent_node_id __rte_unused,
+ uint32_t priority,
+ uint32_t weight __rte_unused,
+ uint32_t level_id __rte_unused,
+ struct rte_tm_node_params *params,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ /* node type: leaf */
+ if (node_id >= p->params.soft.tm.nb_queues)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Priority must be 0 */
+ if (priority != 0)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PRIORITY,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* No shaper */
+ if (params->shaper_profile_id != RTE_TM_SHAPER_PROFILE_ID_NONE)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_SHAPER_PROFILE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* No shared shapers */
+ if (params->n_shared_shapers != 0)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_N_SHARED_SHAPERS,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Congestion management must not be head drop */
+ if (params->leaf.cman == RTE_TM_CMAN_HEAD_DROP)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_CMAN,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Congestion management set to WRED */
+ if (params->leaf.cman == RTE_TM_CMAN_WRED) {
+ uint32_t wred_profile_id = params->leaf.wred.wred_profile_id;
+ struct tm_wred_profile *wp = tm_wred_profile_search(dev,
+ wred_profile_id);
+
+ /* WRED profile (for private WRED context) must be valid */
+ if ((wred_profile_id == RTE_TM_WRED_PROFILE_ID_NONE) ||
+ (wp == NULL))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_WRED_PROFILE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* No shared WRED contexts */
+ if (params->leaf.wred.n_shared_wred_contexts != 0)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_N_SHARED_WRED_CONTEXTS,
+ NULL,
+ rte_strerror(EINVAL));
+ }
+
+ /* Stats */
+ if (params->stats_mask & (~STATS_MASK_QUEUE))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_STATS,
+ NULL,
+ rte_strerror(EINVAL));
+
+ return 0;
+}
+
+static int
+node_add_check(struct rte_eth_dev *dev,
+ uint32_t node_id,
+ uint32_t parent_node_id,
+ uint32_t priority,
+ uint32_t weight,
+ uint32_t level_id,
+ struct rte_tm_node_params *params,
+ struct rte_tm_error *error)
+{
+ struct tm_node *pn;
+ uint32_t level;
+ int status;
+
+ /* node_id, parent_node_id:
+ * -node_id must not be RTE_TM_NODE_ID_NULL
+ * -node_id must not be in use
+ * -root node add (parent_node_id is RTE_TM_NODE_ID_NULL):
+ * -root node must not exist
+ * -non-root node add (parent_node_id is not RTE_TM_NODE_ID_NULL):
+ * -parent_node_id must be valid
+ */
+ if (node_id == RTE_TM_NODE_ID_NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ if (tm_node_search(dev, node_id))
+ return -rte_tm_error_set(error,
+ EEXIST,
+ RTE_TM_ERROR_TYPE_NODE_ID,
+ NULL,
+ rte_strerror(EEXIST));
+
+ if (parent_node_id == RTE_TM_NODE_ID_NULL) {
+ pn = NULL;
+ if (tm_root_node_present(dev))
+ return -rte_tm_error_set(error,
+ EEXIST,
+ RTE_TM_ERROR_TYPE_NODE_PARENT_NODE_ID,
+ NULL,
+ rte_strerror(EEXIST));
+ } else {
+ pn = tm_node_search(dev, parent_node_id);
+ if (pn == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARENT_NODE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+ }
+
+ /* priority: must be 0 .. 3 */
+ if (priority >= RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PRIORITY,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* weight: must be 1 .. 255 */
+ if ((weight == 0) || (weight >= UINT8_MAX))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_WEIGHT,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* level_id: if valid, then
+ * -root node add (parent_node_id is RTE_TM_NODE_ID_NULL):
+ * -level_id must be zero
+ * -non-root node add (parent_node_id is not RTE_TM_NODE_ID_NULL):
+ * -level_id must be parent level ID plus one
+ */
+ level = (pn == NULL) ? 0 : pn->level + 1;
+ if ((level_id != RTE_TM_NODE_LEVEL_ID_ANY) && (level_id != level))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_LEVEL_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* params: must not be NULL */
+ if (params == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* params: per level checks */
+ switch (level) {
+ case TM_NODE_LEVEL_PORT:
+ status = node_add_check_port(dev, node_id,
+ parent_node_id, priority, weight, level_id,
+ params, error);
+ if (status)
+ return status;
+ break;
+
+ case TM_NODE_LEVEL_SUBPORT:
+ status = node_add_check_subport(dev, node_id,
+ parent_node_id, priority, weight, level_id,
+ params, error);
+ if (status)
+ return status;
+ break;
+
+ case TM_NODE_LEVEL_PIPE:
+ status = node_add_check_pipe(dev, node_id,
+ parent_node_id, priority, weight, level_id,
+ params, error);
+ if (status)
+ return status;
+ break;
+
+ case TM_NODE_LEVEL_TC:
+ status = node_add_check_tc(dev, node_id,
+ parent_node_id, priority, weight, level_id,
+ params, error);
+ if (status)
+ return status;
+ break;
+
+ case TM_NODE_LEVEL_QUEUE:
+ status = node_add_check_queue(dev, node_id,
+ parent_node_id, priority, weight, level_id,
+ params, error);
+ if (status)
+ return status;
+ break;
+
+ default:
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_LEVEL_ID,
+ NULL,
+ rte_strerror(EINVAL));
+ }
+
+ return 0;
+}
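+
+/*
+ * For illustration, the hierarchy shape accepted by the per-level checks
+ * above: leaf (queue) nodes use IDs 0 .. nb_queues - 1, all non-leaf
+ * nodes use IDs outside that range, and the level of a new node is
+ * always its parent's level plus one:
+ *
+ *	port (root)
+ *	  +-- subport
+ *	        +-- pipe
+ *	              +-- traffic class (4 per pipe, one per priority)
+ *	                    +-- queue (4 per traffic class, leaf)
+ */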
+
+/* Traffic manager node add */
+static int
+pmd_tm_node_add(struct rte_eth_dev *dev,
+ uint32_t node_id,
+ uint32_t parent_node_id,
+ uint32_t priority,
+ uint32_t weight,
+ uint32_t level_id,
+ struct rte_tm_node_params *params,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_node_list *nl = &p->soft.tm.h.nodes;
+ struct tm_node *n;
+ uint32_t i;
+ int status;
+
+ /* Checks */
+ if (p->soft.tm.hierarchy_frozen)
+ return -rte_tm_error_set(error,
+ EBUSY,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EBUSY));
+
+ status = node_add_check(dev, node_id, parent_node_id, priority, weight,
+ level_id, params, error);
+ if (status)
+ return status;
+
+ /* Memory allocation */
+ n = calloc(1, sizeof(struct tm_node));
+ if (n == NULL)
+ return -rte_tm_error_set(error,
+ ENOMEM,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(ENOMEM));
+
+ /* Fill in */
+ n->node_id = node_id;
+ n->parent_node_id = parent_node_id;
+ n->priority = priority;
+ n->weight = weight;
+
+ if (parent_node_id != RTE_TM_NODE_ID_NULL) {
+ n->parent_node = tm_node_search(dev, parent_node_id);
+ n->level = n->parent_node->level + 1;
+ }
+
+ if (params->shaper_profile_id != RTE_TM_SHAPER_PROFILE_ID_NONE)
+ n->shaper_profile = tm_shaper_profile_search(dev,
+ params->shaper_profile_id);
+
+ if ((n->level == TM_NODE_LEVEL_QUEUE) &&
+ (params->leaf.cman == RTE_TM_CMAN_WRED))
+ n->wred_profile = tm_wred_profile_search(dev,
+ params->leaf.wred.wred_profile_id);
+
+ memcpy(&n->params, params, sizeof(n->params));
+
+ /* Add to list */
+ TAILQ_INSERT_TAIL(nl, n, node);
+ p->soft.tm.h.n_nodes++;
+
+ /* Update dependencies */
+ if (n->parent_node)
+ n->parent_node->n_children++;
+
+ if (n->shaper_profile)
+ n->shaper_profile->n_users++;
+
+ for (i = 0; i < params->n_shared_shapers; i++) {
+ struct tm_shared_shaper *ss;
+
+ ss = tm_shared_shaper_search(dev, params->shared_shaper_id[i]);
+ ss->n_users++;
+ }
+
+ if (n->wred_profile)
+ n->wred_profile->n_users++;
+
+ p->soft.tm.h.n_tm_nodes[n->level]++;
+
+ return 0;
+}
+
+/* Traffic manager node delete */
+static int
+pmd_tm_node_delete(struct rte_eth_dev *dev,
+ uint32_t node_id,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_node *n;
+ uint32_t i;
+
+ /* Check hierarchy changes are currently allowed */
+ if (p->soft.tm.hierarchy_frozen)
+ return -rte_tm_error_set(error,
+ EBUSY,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EBUSY));
+
+ /* Check existing */
+ n = tm_node_search(dev, node_id);
+ if (n == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Check unused */
+ if (n->n_children)
+ return -rte_tm_error_set(error,
+ EBUSY,
+ RTE_TM_ERROR_TYPE_NODE_ID,
+ NULL,
+ rte_strerror(EBUSY));
+
+ /* Update dependencies */
+ p->soft.tm.h.n_tm_nodes[n->level]--;
+
+ if (n->wred_profile)
+ n->wred_profile->n_users--;
+
+ for (i = 0; i < n->params.n_shared_shapers; i++) {
+ struct tm_shared_shaper *ss;
+
+ ss = tm_shared_shaper_search(dev,
+ n->params.shared_shaper_id[i]);
+ ss->n_users--;
+ }
+
+ if (n->shaper_profile)
+ n->shaper_profile->n_users--;
+
+ if (n->parent_node)
+ n->parent_node->n_children--;
+
+ /* Remove from list */
+ TAILQ_REMOVE(&p->soft.tm.h.nodes, n, node);
+ p->soft.tm.h.n_nodes--;
+ free(n);
+
+ return 0;
+}
+
+static void
+pipe_profile_build(struct rte_eth_dev *dev,
+ struct tm_node *np,
+ struct rte_sched_pipe_params *pp)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_hierarchy *h = &p->soft.tm.h;
+ struct tm_node_list *nl = &h->nodes;
+ struct tm_node *nt, *nq;
+
+ memset(pp, 0, sizeof(*pp));
+
+ /* Pipe */
+ pp->tb_rate = np->shaper_profile->params.peak.rate;
+ pp->tb_size = np->shaper_profile->params.peak.size;
+
+ /* Traffic Class (TC) */
+ pp->tc_period = 40;
+
+#ifdef RTE_SCHED_SUBPORT_TC_OV
+ pp->tc_ov_weight = np->weight;
+#endif
+
+ TAILQ_FOREACH(nt, nl, node) {
+ uint32_t queue_id = 0;
+
+ if ((nt->level != TM_NODE_LEVEL_TC) ||
+ (nt->parent_node_id != np->node_id))
+ continue;
+
+ pp->tc_rate[nt->priority] =
+ nt->shaper_profile->params.peak.rate;
+
+ /* Queue */
+ TAILQ_FOREACH(nq, nl, node) {
+ uint32_t pipe_queue_id;
+
+ if ((nq->level != TM_NODE_LEVEL_QUEUE) ||
+ (nq->parent_node_id != nt->node_id))
+ continue;
+
+ pipe_queue_id = nt->priority *
+ RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS + queue_id;
+ pp->wrr_weights[pipe_queue_id] = nq->weight;
+
+ queue_id++;
+ }
+ }
+}
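+
+/*
+ * Worked example for the wrr_weights indexing above (illustrative):
+ * with 4 queues per traffic class, queue child #1 of the TC with
+ * priority 2 lands at pipe_queue_id = 2 * 4 + 1 = 9, i.e. its weight
+ * is written to pp->wrr_weights[9].
+ */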
+
+static int
+pipe_profile_free_exists(struct rte_eth_dev *dev,
+ uint32_t *pipe_profile_id)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_params *t = &p->soft.tm.params;
+
+ if (t->n_pipe_profiles < RTE_SCHED_PIPE_PROFILES_PER_PORT) {
+ *pipe_profile_id = t->n_pipe_profiles;
+ return 1;
+ }
+
+ return 0;
+}
+
+static int
+pipe_profile_exists(struct rte_eth_dev *dev,
+ struct rte_sched_pipe_params *pp,
+ uint32_t *pipe_profile_id)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_params *t = &p->soft.tm.params;
+ uint32_t i;
+
+ for (i = 0; i < t->n_pipe_profiles; i++)
+ if (memcmp(&t->pipe_profiles[i], pp, sizeof(*pp)) == 0) {
+ if (pipe_profile_id)
+ *pipe_profile_id = i;
+ return 1;
+ }
+
+ return 0;
+}
+
+static void
+pipe_profile_install(struct rte_eth_dev *dev,
+ struct rte_sched_pipe_params *pp,
+ uint32_t pipe_profile_id)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_params *t = &p->soft.tm.params;
+
+ memcpy(&t->pipe_profiles[pipe_profile_id], pp, sizeof(*pp));
+ t->n_pipe_profiles++;
+}
+
+static void
+pipe_profile_mark(struct rte_eth_dev *dev,
+ uint32_t subport_id,
+ uint32_t pipe_id,
+ uint32_t pipe_profile_id)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_hierarchy *h = &p->soft.tm.h;
+ struct tm_params *t = &p->soft.tm.params;
+ uint32_t n_pipes_per_subport, pos;
+
+ n_pipes_per_subport = h->n_tm_nodes[TM_NODE_LEVEL_PIPE] /
+ h->n_tm_nodes[TM_NODE_LEVEL_SUBPORT];
+ pos = subport_id * n_pipes_per_subport + pipe_id;
+
+ t->pipe_to_profile[pos] = pipe_profile_id;
+}
+
+static struct rte_sched_pipe_params *
+pipe_profile_get(struct rte_eth_dev *dev, struct tm_node *np)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_hierarchy *h = &p->soft.tm.h;
+ struct tm_params *t = &p->soft.tm.params;
+ uint32_t n_pipes_per_subport = h->n_tm_nodes[TM_NODE_LEVEL_PIPE] /
+ h->n_tm_nodes[TM_NODE_LEVEL_SUBPORT];
+
+ uint32_t subport_id = tm_node_subport_id(dev, np->parent_node);
+ uint32_t pipe_id = tm_node_pipe_id(dev, np);
+
+ uint32_t pos = subport_id * n_pipes_per_subport + pipe_id;
+ uint32_t pipe_profile_id = t->pipe_to_profile[pos];
+
+ return &t->pipe_profiles[pipe_profile_id];
+}
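+
+/*
+ * Worked example for the pipe_to_profile indexing used above
+ * (illustrative figures): with 2 subports of 3 pipes each,
+ * n_pipes_per_subport = 6 / 2 = 3, so pipe #1 of subport #1 maps to
+ * pos = 1 * 3 + 1 = 4, i.e. pipe_to_profile[4] holds the pipe profile
+ * ID installed for that pipe.
+ */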
+
+static int
+pipe_profiles_generate(struct rte_eth_dev *dev)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_hierarchy *h = &p->soft.tm.h;
+ struct tm_node_list *nl = &h->nodes;
+ struct tm_node *ns, *np;
+ uint32_t subport_id;
+
+ /* Objective: Fill in the following fields in struct tm_params:
+ * - pipe_profiles
+ * - n_pipe_profiles
+ * - pipe_to_profile
+ */
+
+ subport_id = 0;
+ TAILQ_FOREACH(ns, nl, node) {
+ uint32_t pipe_id;
+
+ if (ns->level != TM_NODE_LEVEL_SUBPORT)
+ continue;
+
+ pipe_id = 0;
+ TAILQ_FOREACH(np, nl, node) {
+ struct rte_sched_pipe_params pp;
+ uint32_t pos;
+
+ if ((np->level != TM_NODE_LEVEL_PIPE) ||
+ (np->parent_node_id != ns->node_id))
+ continue;
+
+ pipe_profile_build(dev, np, &pp);
+
+ if (!pipe_profile_exists(dev, &pp, &pos)) {
+ if (!pipe_profile_free_exists(dev, &pos))
+ return -1;
+
+ pipe_profile_install(dev, &pp, pos);
+ }
+
+ pipe_profile_mark(dev, subport_id, pipe_id, pos);
+
+ pipe_id++;
+ }
+
+ subport_id++;
+ }
+
+ return 0;
+}
+
+static struct tm_wred_profile *
+tm_tc_wred_profile_get(struct rte_eth_dev *dev, uint32_t tc_id)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_hierarchy *h = &p->soft.tm.h;
+ struct tm_node_list *nl = &h->nodes;
+ struct tm_node *nq;
+
+ TAILQ_FOREACH(nq, nl, node) {
+ if ((nq->level != TM_NODE_LEVEL_QUEUE) ||
+ (nq->parent_node->priority != tc_id))
+ continue;
+
+ return nq->wred_profile;
+ }
+
+ return NULL;
+}
+
+#ifdef RTE_SCHED_RED
+
+static void
+wred_profiles_set(struct rte_eth_dev *dev)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct rte_sched_port_params *pp = &p->soft.tm.params.port_params;
+ uint32_t tc_id;
+ enum rte_tm_color color;
+
+ for (tc_id = 0; tc_id < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; tc_id++)
+ for (color = RTE_TM_GREEN; color < RTE_TM_COLORS; color++) {
+ struct rte_red_params *dst =
+ &pp->red_params[tc_id][color];
+ struct tm_wred_profile *src_wp =
+ tm_tc_wred_profile_get(dev, tc_id);
+ struct rte_tm_red_params *src =
+ &src_wp->params.red_params[color];
+
+ memcpy(dst, src, sizeof(*dst));
+ }
+}
+
+#else
+
+#define wred_profiles_set(dev)
+
+#endif
+
+static struct tm_shared_shaper *
+tm_tc_shared_shaper_get(struct rte_eth_dev *dev, struct tm_node *tc_node)
+{
+ return (tc_node->params.n_shared_shapers) ?
+ tm_shared_shaper_search(dev,
+ tc_node->params.shared_shaper_id[0]) :
+ NULL;
+}
+
+static struct tm_shared_shaper *
+tm_subport_tc_shared_shaper_get(struct rte_eth_dev *dev,
+ struct tm_node *subport_node,
+ uint32_t tc_id)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_node_list *nl = &p->soft.tm.h.nodes;
+ struct tm_node *n;
+
+ TAILQ_FOREACH(n, nl, node) {
+ if ((n->level != TM_NODE_LEVEL_TC) ||
+ (n->parent_node->parent_node_id !=
+ subport_node->node_id) ||
+ (n->priority != tc_id))
+ continue;
+
+ return tm_tc_shared_shaper_get(dev, n);
+ }
+
+ return NULL;
+}
+
+static int
+hierarchy_commit_check(struct rte_eth_dev *dev, struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_hierarchy *h = &p->soft.tm.h;
+ struct tm_node_list *nl = &h->nodes;
+ struct tm_shared_shaper_list *ssl = &h->shared_shapers;
+ struct tm_wred_profile_list *wpl = &h->wred_profiles;
+ struct tm_node *nr = tm_root_node_present(dev), *ns, *np, *nt, *nq;
+ struct tm_shared_shaper *ss;
+
+ uint32_t n_pipes_per_subport;
+
+ /* Root node exists. */
+ if (nr == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_LEVEL_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* There is at least one subport, max is not exceeded. */
+ if ((nr->n_children == 0) || (nr->n_children > TM_MAX_SUBPORTS))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_LEVEL_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* There is at least one pipe. */
+ if (h->n_tm_nodes[TM_NODE_LEVEL_PIPE] == 0)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_LEVEL_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Number of pipes is the same for all subports. Maximum number of pipes
+ * per subport is not exceeded.
+ */
+ n_pipes_per_subport = h->n_tm_nodes[TM_NODE_LEVEL_PIPE] /
+ h->n_tm_nodes[TM_NODE_LEVEL_SUBPORT];
+
+ if (n_pipes_per_subport > TM_MAX_PIPES_PER_SUBPORT)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+
+ TAILQ_FOREACH(ns, nl, node) {
+ if (ns->level != TM_NODE_LEVEL_SUBPORT)
+ continue;
+
+ if (ns->n_children != n_pipes_per_subport)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ }
+
+ /* Each pipe has exactly 4 TCs, with exactly one TC for each priority */
+ TAILQ_FOREACH(np, nl, node) {
+ uint32_t mask = 0, mask_expected =
+ RTE_LEN2MASK(RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE,
+ uint32_t);
+
+ if (np->level != TM_NODE_LEVEL_PIPE)
+ continue;
+
+ if (np->n_children != RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+
+ TAILQ_FOREACH(nt, nl, node) {
+ if ((nt->level != TM_NODE_LEVEL_TC) ||
+ (nt->parent_node_id != np->node_id))
+ continue;
+
+ mask |= 1 << nt->priority;
+ }
+
+ if (mask != mask_expected)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ }
+
+ /* Each TC has exactly 4 packet queues. */
+ TAILQ_FOREACH(nt, nl, node) {
+ if (nt->level != TM_NODE_LEVEL_TC)
+ continue;
+
+ if (nt->n_children != RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ }
+
+ /**
+ * Shared shapers:
+ * -For each TC #i, all pipes in the same subport use the same
+ * shared shaper (or no shared shaper) for their TC#i.
+	 * -Each shared shaper needs to have at least one user. All its
+	 * users have to be TC nodes with the same priority, belonging to
+	 * the same subport.
+ */
+ TAILQ_FOREACH(ns, nl, node) {
+ struct tm_shared_shaper *s[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
+ uint32_t id;
+
+ if (ns->level != TM_NODE_LEVEL_SUBPORT)
+ continue;
+
+ for (id = 0; id < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; id++)
+ s[id] = tm_subport_tc_shared_shaper_get(dev, ns, id);
+
+ TAILQ_FOREACH(nt, nl, node) {
+ struct tm_shared_shaper *subport_ss, *tc_ss;
+
+ if ((nt->level != TM_NODE_LEVEL_TC) ||
+ (nt->parent_node->parent_node_id !=
+ ns->node_id))
+ continue;
+
+ subport_ss = s[nt->priority];
+ tc_ss = tm_tc_shared_shaper_get(dev, nt);
+
+ if ((subport_ss == NULL) && (tc_ss == NULL))
+ continue;
+
+ if (((subport_ss == NULL) && (tc_ss != NULL)) ||
+ ((subport_ss != NULL) && (tc_ss == NULL)) ||
+ (subport_ss->shared_shaper_id !=
+ tc_ss->shared_shaper_id))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ }
+ }
+
+ TAILQ_FOREACH(ss, ssl, node) {
+ struct tm_node *nt_any = tm_shared_shaper_get_tc(dev, ss);
+ uint32_t n_users = 0;
+
+ if (nt_any != NULL)
+ TAILQ_FOREACH(nt, nl, node) {
+ if ((nt->level != TM_NODE_LEVEL_TC) ||
+ (nt->priority != nt_any->priority) ||
+ (nt->parent_node->parent_node_id !=
+ nt_any->parent_node->parent_node_id))
+ continue;
+
+ n_users++;
+ }
+
+ if ((ss->n_users == 0) || (ss->n_users != n_users))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ }
+
+ /* Not too many pipe profiles. */
+ if (pipe_profiles_generate(dev))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /**
+ * WRED (when used, i.e. at least one WRED profile defined):
+ * -Each WRED profile must have at least one user.
+ * -All leaf nodes must have their private WRED context enabled.
+ * -For each TC #i, all leaf nodes must use the same WRED profile
+ * for their private WRED context.
+ */
+ if (h->n_wred_profiles) {
+ struct tm_wred_profile *wp;
+ struct tm_wred_profile *w[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
+ uint32_t id;
+
+ TAILQ_FOREACH(wp, wpl, node)
+ if (wp->n_users == 0)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+
+ for (id = 0; id < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; id++) {
+ w[id] = tm_tc_wred_profile_get(dev, id);
+
+ if (w[id] == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ }
+
+ TAILQ_FOREACH(nq, nl, node) {
+ uint32_t id;
+
+ if (nq->level != TM_NODE_LEVEL_QUEUE)
+ continue;
+
+ id = nq->parent_node->priority;
+
+ if ((nq->wred_profile == NULL) ||
+ (nq->wred_profile->wred_profile_id !=
+ w[id]->wred_profile_id))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ }
+ }
+
+ return 0;
+}
+
+static void
+hierarchy_blueprints_create(struct rte_eth_dev *dev)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_params *t = &p->soft.tm.params;
+ struct tm_hierarchy *h = &p->soft.tm.h;
+
+ struct tm_node_list *nl = &h->nodes;
+ struct tm_node *root = tm_root_node_present(dev), *n;
+
+ uint32_t subport_id;
+
+ t->port_params = (struct rte_sched_port_params) {
+ .name = dev->data->name,
+ .socket = dev->data->numa_node,
+ .rate = root->shaper_profile->params.peak.rate,
+ .mtu = dev->data->mtu,
+ .frame_overhead =
+ root->shaper_profile->params.pkt_length_adjust,
+ .n_subports_per_port = root->n_children,
+ .n_pipes_per_subport = h->n_tm_nodes[TM_NODE_LEVEL_PIPE] /
+ h->n_tm_nodes[TM_NODE_LEVEL_SUBPORT],
+ .qsize = {p->params.soft.tm.qsize[0],
+ p->params.soft.tm.qsize[1],
+ p->params.soft.tm.qsize[2],
+ p->params.soft.tm.qsize[3],
+ },
+ .pipe_profiles = t->pipe_profiles,
+ .n_pipe_profiles = t->n_pipe_profiles,
+ };
+
+ wred_profiles_set(dev);
+
+ subport_id = 0;
+ TAILQ_FOREACH(n, nl, node) {
+ uint64_t tc_rate[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
+ uint32_t i;
+
+ if (n->level != TM_NODE_LEVEL_SUBPORT)
+ continue;
+
+ for (i = 0; i < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; i++) {
+ struct tm_shared_shaper *ss;
+ struct tm_shaper_profile *sp;
+
+ ss = tm_subport_tc_shared_shaper_get(dev, n, i);
+ sp = (ss) ? tm_shaper_profile_search(dev,
+ ss->shaper_profile_id) :
+ n->shaper_profile;
+ tc_rate[i] = sp->params.peak.rate;
+ }
+
+ t->subport_params[subport_id] =
+ (struct rte_sched_subport_params) {
+ .tb_rate = n->shaper_profile->params.peak.rate,
+ .tb_size = n->shaper_profile->params.peak.size,
+
+ .tc_rate = {tc_rate[0],
+ tc_rate[1],
+ tc_rate[2],
+ tc_rate[3],
+ },
+ .tc_period = 10,
+ };
+
+ subport_id++;
+ }
+}
+
+/* Traffic manager hierarchy commit */
+static int
+pmd_tm_hierarchy_commit(struct rte_eth_dev *dev,
+ int clear_on_fail,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ int status;
+
+ /* Checks */
+ if (p->soft.tm.hierarchy_frozen)
+ return -rte_tm_error_set(error,
+ EBUSY,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EBUSY));
+
+ status = hierarchy_commit_check(dev, error);
+ if (status) {
+ if (clear_on_fail) {
+ tm_hierarchy_uninit(p);
+ tm_hierarchy_init(p);
+ }
+
+ return status;
+ }
+
+ /* Create blueprints */
+ hierarchy_blueprints_create(dev);
+
+ /* Freeze hierarchy */
+ p->soft.tm.hierarchy_frozen = 1;
+
+ return 0;
+}
+
+#ifdef RTE_SCHED_SUBPORT_TC_OV
+
+static int
+update_pipe_weight(struct rte_eth_dev *dev, struct tm_node *np, uint32_t weight)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ uint32_t pipe_id = tm_node_pipe_id(dev, np);
+
+ struct tm_node *ns = np->parent_node;
+ uint32_t subport_id = tm_node_subport_id(dev, ns);
+
+ struct rte_sched_pipe_params *profile0 = pipe_profile_get(dev, np);
+ struct rte_sched_pipe_params profile1;
+ uint32_t pipe_profile_id;
+
+ /* Derive new pipe profile. */
+ memcpy(&profile1, profile0, sizeof(profile1));
+ profile1.tc_ov_weight = (uint8_t) weight;
+
+	/* Since the implementation does not allow adding more pipe profiles
+	 * after port configuration, the pipe configuration can be updated
+	 * successfully only if the new profile is already part of the
+	 * existing set of pipe profiles.
+	 */
+ if (pipe_profile_exists(dev, &profile1, &pipe_profile_id) == 0)
+ return -1;
+
+ /* Update the pipe profile used by the current pipe. */
+ if (rte_sched_pipe_config(p->soft.tm.sched, subport_id, pipe_id,
+ (int32_t) pipe_profile_id))
+ return -1;
+
+ /* Commit changes. */
+ pipe_profile_mark(dev, subport_id, pipe_id, pipe_profile_id);
+ np->weight = weight;
+
+ return 0;
+}
+
+#endif
+
+static int
+update_queue_weight(struct rte_eth_dev *dev,
+ struct tm_node *nq, uint32_t weight)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ uint32_t queue_id = tm_node_queue_id(dev, nq);
+
+ struct tm_node *nt = nq->parent_node;
+ uint32_t tc_id = tm_node_tc_id(dev, nt);
+
+ struct tm_node *np = nt->parent_node;
+ uint32_t pipe_id = tm_node_pipe_id(dev, np);
+
+ struct tm_node *ns = np->parent_node;
+ uint32_t subport_id = tm_node_subport_id(dev, ns);
+
+ uint32_t pipe_queue_id =
+ tc_id * RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS + queue_id;
+
+ struct rte_sched_pipe_params *profile0 = pipe_profile_get(dev, np);
+ struct rte_sched_pipe_params profile1;
+ uint32_t pipe_profile_id;
+
+ /* Derive new pipe profile. */
+ memcpy(&profile1, profile0, sizeof(profile1));
+ profile1.wrr_weights[pipe_queue_id] = (uint8_t) weight;
+
+	/* Since the implementation does not allow adding more pipe profiles
+	 * after port configuration, the pipe configuration can be updated
+	 * successfully only if the new profile is already part of the
+	 * existing set of pipe profiles.
+	 */
+ if (pipe_profile_exists(dev, &profile1, &pipe_profile_id) == 0)
+ return -1;
+
+ /* Update the pipe profile used by the current pipe. */
+ if (rte_sched_pipe_config(p->soft.tm.sched, subport_id, pipe_id,
+ (int32_t) pipe_profile_id))
+ return -1;
+
+ /* Commit changes. */
+ pipe_profile_mark(dev, subport_id, pipe_id, pipe_profile_id);
+ nq->weight = weight;
+
+ return 0;
+}
+
+/* Traffic manager node parent update */
+static int
+pmd_tm_node_parent_update(struct rte_eth_dev *dev,
+ uint32_t node_id,
+ uint32_t parent_node_id,
+ uint32_t priority,
+ uint32_t weight,
+ struct rte_tm_error *error)
+{
+ struct tm_node *n;
+
+ /* Port must be started and TM used. */
+ if ((dev->data->dev_started == 0) && (tm_used(dev) == 0))
+ return -rte_tm_error_set(error,
+ EBUSY,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EBUSY));
+
+ /* Node must be valid */
+ n = tm_node_search(dev, node_id);
+ if (n == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Parent node must be the same */
+ if (n->parent_node_id != parent_node_id)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARENT_NODE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Priority must be the same */
+ if (n->priority != priority)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PRIORITY,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* weight: must be 1 .. 255 */
+ if ((weight == 0) || (weight >= UINT8_MAX))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_WEIGHT,
+ NULL,
+ rte_strerror(EINVAL));
+
+ switch (n->level) {
+ case TM_NODE_LEVEL_PORT:
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_WEIGHT,
+ NULL,
+ rte_strerror(EINVAL));
+
+ case TM_NODE_LEVEL_SUBPORT:
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_WEIGHT,
+ NULL,
+ rte_strerror(EINVAL));
+
+ case TM_NODE_LEVEL_PIPE:
+#ifdef RTE_SCHED_SUBPORT_TC_OV
+ if (update_pipe_weight(dev, n, weight))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ return 0;
+#else
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_WEIGHT,
+ NULL,
+ rte_strerror(EINVAL));
+#endif
+
+ case TM_NODE_LEVEL_TC:
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_WEIGHT,
+ NULL,
+ rte_strerror(EINVAL));
+
+ case TM_NODE_LEVEL_QUEUE:
+ default:
+ if (update_queue_weight(dev, n, weight))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ return 0;
+ }
+}
+
+static int
+update_subport_rate(struct rte_eth_dev *dev,
+ struct tm_node *ns,
+ struct tm_shaper_profile *sp)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ uint32_t subport_id = tm_node_subport_id(dev, ns);
+
+ struct rte_sched_subport_params subport_params;
+
+ /* Derive new subport configuration. */
+ memcpy(&subport_params,
+ &p->soft.tm.params.subport_params[subport_id],
+ sizeof(subport_params));
+ subport_params.tb_rate = sp->params.peak.rate;
+ subport_params.tb_size = sp->params.peak.size;
+
+ /* Update the subport configuration. */
+ if (rte_sched_subport_config(p->soft.tm.sched, subport_id,
+ &subport_params))
+ return -1;
+
+ /* Commit changes. */
+ ns->shaper_profile->n_users--;
+
+ ns->shaper_profile = sp;
+ ns->params.shaper_profile_id = sp->shaper_profile_id;
+ sp->n_users++;
+
+ memcpy(&p->soft.tm.params.subport_params[subport_id],
+ &subport_params,
+ sizeof(subport_params));
+
+ return 0;
+}
+
+static int
+update_pipe_rate(struct rte_eth_dev *dev,
+ struct tm_node *np,
+ struct tm_shaper_profile *sp)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ uint32_t pipe_id = tm_node_pipe_id(dev, np);
+
+ struct tm_node *ns = np->parent_node;
+ uint32_t subport_id = tm_node_subport_id(dev, ns);
+
+ struct rte_sched_pipe_params *profile0 = pipe_profile_get(dev, np);
+ struct rte_sched_pipe_params profile1;
+ uint32_t pipe_profile_id;
+
+ /* Derive new pipe profile. */
+ memcpy(&profile1, profile0, sizeof(profile1));
+ profile1.tb_rate = sp->params.peak.rate;
+ profile1.tb_size = sp->params.peak.size;
+
+	/* Since the implementation does not allow adding more pipe profiles
+	 * after port configuration, the pipe configuration can be updated
+	 * successfully only if the new profile is already part of the
+	 * existing set of pipe profiles.
+	 */
+ if (pipe_profile_exists(dev, &profile1, &pipe_profile_id) == 0)
+ return -1;
+
+ /* Update the pipe profile used by the current pipe. */
+ if (rte_sched_pipe_config(p->soft.tm.sched, subport_id, pipe_id,
+ (int32_t) pipe_profile_id))
+ return -1;
+
+ /* Commit changes. */
+ pipe_profile_mark(dev, subport_id, pipe_id, pipe_profile_id);
+ np->shaper_profile->n_users--;
+ np->shaper_profile = sp;
+ np->params.shaper_profile_id = sp->shaper_profile_id;
+ sp->n_users++;
+
+ return 0;
+}
+
+static int
+update_tc_rate(struct rte_eth_dev *dev,
+ struct tm_node *nt,
+ struct tm_shaper_profile *sp)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ uint32_t tc_id = tm_node_tc_id(dev, nt);
+
+ struct tm_node *np = nt->parent_node;
+ uint32_t pipe_id = tm_node_pipe_id(dev, np);
+
+ struct tm_node *ns = np->parent_node;
+ uint32_t subport_id = tm_node_subport_id(dev, ns);
+
+ struct rte_sched_pipe_params *profile0 = pipe_profile_get(dev, np);
+ struct rte_sched_pipe_params profile1;
+ uint32_t pipe_profile_id;
+
+ /* Derive new pipe profile. */
+ memcpy(&profile1, profile0, sizeof(profile1));
+ profile1.tc_rate[tc_id] = sp->params.peak.rate;
+
+	/* Since the implementation does not allow adding more pipe profiles
+	 * after port configuration, the pipe configuration can be updated
+	 * successfully only if the new profile is already part of the
+	 * existing set of pipe profiles.
+	 */
+ if (pipe_profile_exists(dev, &profile1, &pipe_profile_id) == 0)
+ return -1;
+
+ /* Update the pipe profile used by the current pipe. */
+ if (rte_sched_pipe_config(p->soft.tm.sched, subport_id, pipe_id,
+ (int32_t) pipe_profile_id))
+ return -1;
+
+ /* Commit changes. */
+ pipe_profile_mark(dev, subport_id, pipe_id, pipe_profile_id);
+ nt->shaper_profile->n_users--;
+ nt->shaper_profile = sp;
+ nt->params.shaper_profile_id = sp->shaper_profile_id;
+ sp->n_users++;
+
+ return 0;
+}
+
+/* Traffic manager node shaper update */
+static int
+pmd_tm_node_shaper_update(struct rte_eth_dev *dev,
+ uint32_t node_id,
+ uint32_t shaper_profile_id,
+ struct rte_tm_error *error)
+{
+ struct tm_node *n;
+ struct tm_shaper_profile *sp;
+
+ /* Port must be started and TM used. */
+ if ((dev->data->dev_started == 0) && (tm_used(dev) == 0))
+ return -rte_tm_error_set(error,
+ EBUSY,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EBUSY));
+
+ /* Node must be valid */
+ n = tm_node_search(dev, node_id);
+ if (n == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Shaper profile must be valid. */
+ sp = tm_shaper_profile_search(dev, shaper_profile_id);
+ if (sp == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_SHAPER_PROFILE,
+ NULL,
+ rte_strerror(EINVAL));
+
+ switch (n->level) {
+ case TM_NODE_LEVEL_PORT:
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+
+ case TM_NODE_LEVEL_SUBPORT:
+ if (update_subport_rate(dev, n, sp))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ return 0;
+
+ case TM_NODE_LEVEL_PIPE:
+ if (update_pipe_rate(dev, n, sp))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ return 0;
+
+ case TM_NODE_LEVEL_TC:
+ if (update_tc_rate(dev, n, sp))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ return 0;
+
+ case TM_NODE_LEVEL_QUEUE:
+ default:
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ }
+}
+
+static inline uint32_t
+tm_port_queue_id(struct rte_eth_dev *dev,
+ uint32_t port_subport_id,
+ uint32_t subport_pipe_id,
+ uint32_t pipe_tc_id,
+ uint32_t tc_queue_id)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_hierarchy *h = &p->soft.tm.h;
+ uint32_t n_pipes_per_subport = h->n_tm_nodes[TM_NODE_LEVEL_PIPE] /
+ h->n_tm_nodes[TM_NODE_LEVEL_SUBPORT];
+
+ uint32_t port_pipe_id =
+ port_subport_id * n_pipes_per_subport + subport_pipe_id;
+ uint32_t port_tc_id =
+ port_pipe_id * RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE + pipe_tc_id;
+ uint32_t port_queue_id =
+ port_tc_id * RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS + tc_queue_id;
+
+ return port_queue_id;
+}
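+
+/*
+ * Worked example (illustrative figures): with 2 pipes per subport,
+ * 4 traffic classes per pipe and 4 queues per traffic class, the queue
+ * (subport 1, pipe 0, tc 2, queue 3) maps to:
+ *	port_pipe_id  = 1 * 2 + 0  = 2
+ *	port_tc_id    = 2 * 4 + 2  = 10
+ *	port_queue_id = 10 * 4 + 3 = 43
+ */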
+
+static int
+read_port_stats(struct rte_eth_dev *dev,
+ struct tm_node *nr,
+ struct rte_tm_node_stats *stats,
+ uint64_t *stats_mask,
+ int clear)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_hierarchy *h = &p->soft.tm.h;
+ uint32_t n_subports_per_port = h->n_tm_nodes[TM_NODE_LEVEL_SUBPORT];
+ uint32_t subport_id;
+
+ for (subport_id = 0; subport_id < n_subports_per_port; subport_id++) {
+ struct rte_sched_subport_stats s;
+ uint32_t tc_ov, id;
+
+ /* Stats read */
+ int status = rte_sched_subport_read_stats(
+ p->soft.tm.sched,
+ subport_id,
+ &s,
+ &tc_ov);
+ if (status)
+ return status;
+
+ /* Stats accumulate */
+ for (id = 0; id < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; id++) {
+ nr->stats.n_pkts +=
+ s.n_pkts_tc[id] - s.n_pkts_tc_dropped[id];
+ nr->stats.n_bytes +=
+ s.n_bytes_tc[id] - s.n_bytes_tc_dropped[id];
+ nr->stats.leaf.n_pkts_dropped[RTE_TM_GREEN] +=
+ s.n_pkts_tc_dropped[id];
+ nr->stats.leaf.n_bytes_dropped[RTE_TM_GREEN] +=
+ s.n_bytes_tc_dropped[id];
+ }
+ }
+
+ /* Stats copy */
+ if (stats)
+ memcpy(stats, &nr->stats, sizeof(*stats));
+
+ if (stats_mask)
+ *stats_mask = STATS_MASK_DEFAULT;
+
+ /* Stats clear */
+ if (clear)
+ memset(&nr->stats, 0, sizeof(nr->stats));
+
+ return 0;
+}
+
+static int
+read_subport_stats(struct rte_eth_dev *dev,
+ struct tm_node *ns,
+ struct rte_tm_node_stats *stats,
+ uint64_t *stats_mask,
+ int clear)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ uint32_t subport_id = tm_node_subport_id(dev, ns);
+ struct rte_sched_subport_stats s;
+ uint32_t tc_ov, tc_id;
+
+ /* Stats read */
+ int status = rte_sched_subport_read_stats(
+ p->soft.tm.sched,
+ subport_id,
+ &s,
+ &tc_ov);
+ if (status)
+ return status;
+
+ /* Stats accumulate */
+ for (tc_id = 0; tc_id < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; tc_id++) {
+ ns->stats.n_pkts +=
+ s.n_pkts_tc[tc_id] - s.n_pkts_tc_dropped[tc_id];
+ ns->stats.n_bytes +=
+ s.n_bytes_tc[tc_id] - s.n_bytes_tc_dropped[tc_id];
+ ns->stats.leaf.n_pkts_dropped[RTE_TM_GREEN] +=
+ s.n_pkts_tc_dropped[tc_id];
+ ns->stats.leaf.n_bytes_dropped[RTE_TM_GREEN] +=
+ s.n_bytes_tc_dropped[tc_id];
+ }
+
+ /* Stats copy */
+ if (stats)
+ memcpy(stats, &ns->stats, sizeof(*stats));
+
+ if (stats_mask)
+ *stats_mask = STATS_MASK_DEFAULT;
+
+ /* Stats clear */
+ if (clear)
+ memset(&ns->stats, 0, sizeof(ns->stats));
+
+ return 0;
+}
+
+static int
+read_pipe_stats(struct rte_eth_dev *dev,
+ struct tm_node *np,
+ struct rte_tm_node_stats *stats,
+ uint64_t *stats_mask,
+ int clear)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ uint32_t pipe_id = tm_node_pipe_id(dev, np);
+
+ struct tm_node *ns = np->parent_node;
+ uint32_t subport_id = tm_node_subport_id(dev, ns);
+
+ uint32_t i;
+
+ /* Stats read */
+ for (i = 0; i < RTE_SCHED_QUEUES_PER_PIPE; i++) {
+ struct rte_sched_queue_stats s;
+ uint16_t qlen;
+
+ uint32_t qid = tm_port_queue_id(dev,
+ subport_id,
+ pipe_id,
+ i / RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS,
+ i % RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS);
+
+ int status = rte_sched_queue_read_stats(
+ p->soft.tm.sched,
+ qid,
+ &s,
+ &qlen);
+ if (status)
+ return status;
+
+ /* Stats accumulate */
+ np->stats.n_pkts += s.n_pkts - s.n_pkts_dropped;
+ np->stats.n_bytes += s.n_bytes - s.n_bytes_dropped;
+ np->stats.leaf.n_pkts_dropped[RTE_TM_GREEN] += s.n_pkts_dropped;
+ np->stats.leaf.n_bytes_dropped[RTE_TM_GREEN] +=
+ s.n_bytes_dropped;
+ np->stats.leaf.n_pkts_queued = qlen;
+ }
+
+ /* Stats copy */
+ if (stats)
+ memcpy(stats, &np->stats, sizeof(*stats));
+
+ if (stats_mask)
+ *stats_mask = STATS_MASK_DEFAULT;
+
+ /* Stats clear */
+ if (clear)
+ memset(&np->stats, 0, sizeof(np->stats));
+
+ return 0;
+}
+
+static int
+read_tc_stats(struct rte_eth_dev *dev,
+ struct tm_node *nt,
+ struct rte_tm_node_stats *stats,
+ uint64_t *stats_mask,
+ int clear)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ uint32_t tc_id = tm_node_tc_id(dev, nt);
+
+ struct tm_node *np = nt->parent_node;
+ uint32_t pipe_id = tm_node_pipe_id(dev, np);
+
+ struct tm_node *ns = np->parent_node;
+ uint32_t subport_id = tm_node_subport_id(dev, ns);
+
+ uint32_t i;
+
+ /* Stats read */
+ for (i = 0; i < RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS; i++) {
+ struct rte_sched_queue_stats s;
+ uint16_t qlen;
+
+ uint32_t qid = tm_port_queue_id(dev,
+ subport_id,
+ pipe_id,
+ tc_id,
+ i);
+
+ int status = rte_sched_queue_read_stats(
+ p->soft.tm.sched,
+ qid,
+ &s,
+ &qlen);
+ if (status)
+ return status;
+
+ /* Stats accumulate */
+ nt->stats.n_pkts += s.n_pkts - s.n_pkts_dropped;
+ nt->stats.n_bytes += s.n_bytes - s.n_bytes_dropped;
+ nt->stats.leaf.n_pkts_dropped[RTE_TM_GREEN] += s.n_pkts_dropped;
+ nt->stats.leaf.n_bytes_dropped[RTE_TM_GREEN] +=
+ s.n_bytes_dropped;
+ nt->stats.leaf.n_pkts_queued = qlen;
+ }
+
+ /* Stats copy */
+ if (stats)
+ memcpy(stats, &nt->stats, sizeof(*stats));
+
+ if (stats_mask)
+ *stats_mask = STATS_MASK_DEFAULT;
+
+ /* Stats clear */
+ if (clear)
+ memset(&nt->stats, 0, sizeof(nt->stats));
+
+ return 0;
+}
+
+static int
+read_queue_stats(struct rte_eth_dev *dev,
+ struct tm_node *nq,
+ struct rte_tm_node_stats *stats,
+ uint64_t *stats_mask,
+ int clear)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct rte_sched_queue_stats s;
+ uint16_t qlen;
+
+ uint32_t queue_id = tm_node_queue_id(dev, nq);
+
+ struct tm_node *nt = nq->parent_node;
+ uint32_t tc_id = tm_node_tc_id(dev, nt);
+
+ struct tm_node *np = nt->parent_node;
+ uint32_t pipe_id = tm_node_pipe_id(dev, np);
+
+ struct tm_node *ns = np->parent_node;
+ uint32_t subport_id = tm_node_subport_id(dev, ns);
+
+ /* Stats read */
+ uint32_t qid = tm_port_queue_id(dev,
+ subport_id,
+ pipe_id,
+ tc_id,
+ queue_id);
+
+ int status = rte_sched_queue_read_stats(
+ p->soft.tm.sched,
+ qid,
+ &s,
+ &qlen);
+ if (status)
+ return status;
+
+ /* Stats accumulate */
+ nq->stats.n_pkts += s.n_pkts - s.n_pkts_dropped;
+ nq->stats.n_bytes += s.n_bytes - s.n_bytes_dropped;
+ nq->stats.leaf.n_pkts_dropped[RTE_TM_GREEN] += s.n_pkts_dropped;
+ nq->stats.leaf.n_bytes_dropped[RTE_TM_GREEN] +=
+ s.n_bytes_dropped;
+ nq->stats.leaf.n_pkts_queued = qlen;
+
+ /* Stats copy */
+ if (stats)
+ memcpy(stats, &nq->stats, sizeof(*stats));
+
+ if (stats_mask)
+ *stats_mask = STATS_MASK_QUEUE;
+
+ /* Stats clear */
+ if (clear)
+ memset(&nq->stats, 0, sizeof(nq->stats));
+
+ return 0;
+}
+
+/* Traffic manager read stats counters for specific node */
+static int
+pmd_tm_node_stats_read(struct rte_eth_dev *dev,
+ uint32_t node_id,
+ struct rte_tm_node_stats *stats,
+ uint64_t *stats_mask,
+ int clear,
+ struct rte_tm_error *error)
+{
+ struct tm_node *n;
+
+ /* Port must be started and TM used. */
+ if ((dev->data->dev_started == 0) && (tm_used(dev) == 0))
+ return -rte_tm_error_set(error,
+ EBUSY,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EBUSY));
+
+ /* Node must be valid */
+ n = tm_node_search(dev, node_id);
+ if (n == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ switch (n->level) {
+ case TM_NODE_LEVEL_PORT:
+ if (read_port_stats(dev, n, stats, stats_mask, clear))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ return 0;
+
+ case TM_NODE_LEVEL_SUBPORT:
+ if (read_subport_stats(dev, n, stats, stats_mask, clear))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ return 0;
+
+ case TM_NODE_LEVEL_PIPE:
+ if (read_pipe_stats(dev, n, stats, stats_mask, clear))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ return 0;
+
+ case TM_NODE_LEVEL_TC:
+ if (read_tc_stats(dev, n, stats, stats_mask, clear))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ return 0;
+
+ case TM_NODE_LEVEL_QUEUE:
+ default:
+ if (read_queue_stats(dev, n, stats, stats_mask, clear))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ return 0;
+ }
+}
+
+const struct rte_tm_ops pmd_tm_ops = {
+ .node_type_get = pmd_tm_node_type_get,
+ .capabilities_get = pmd_tm_capabilities_get,
+ .level_capabilities_get = pmd_tm_level_capabilities_get,
+ .node_capabilities_get = pmd_tm_node_capabilities_get,
+
+ .wred_profile_add = pmd_tm_wred_profile_add,
+ .wred_profile_delete = pmd_tm_wred_profile_delete,
+ .shared_wred_context_add_update = NULL,
+ .shared_wred_context_delete = NULL,
+
+ .shaper_profile_add = pmd_tm_shaper_profile_add,
+ .shaper_profile_delete = pmd_tm_shaper_profile_delete,
+ .shared_shaper_add_update = pmd_tm_shared_shaper_add_update,
+ .shared_shaper_delete = pmd_tm_shared_shaper_delete,
+
+ .node_add = pmd_tm_node_add,
+ .node_delete = pmd_tm_node_delete,
+ .node_suspend = NULL,
+ .node_resume = NULL,
+ .hierarchy_commit = pmd_tm_hierarchy_commit,
+
+ .node_parent_update = pmd_tm_node_parent_update,
+ .node_shaper_update = pmd_tm_node_shaper_update,
+ .node_shared_shaper_update = NULL,
+ .node_stats_update = NULL,
+ .node_wfq_weight_mode_update = NULL,
+ .node_cman_update = NULL,
+ .node_wred_context_update = NULL,
+ .node_shared_wred_context_update = NULL,
+
+ .node_stats_read = pmd_tm_node_stats_read,
};
--
2.9.3
^ permalink raw reply [flat|nested] 79+ messages in thread
* Re: [dpdk-dev] [PATCH 0/2] net/softnic: sw fall-back for traffic management
2017-05-26 18:11 [dpdk-dev] [PATCH 0/2] net/softnic: sw fall-back for traffic management Jasvinder Singh
` (2 preceding siblings ...)
2017-06-07 14:32 ` [dpdk-dev] [PATCH 0/2] net/softnic: sw fall-back for traffic management Thomas Monjalon
@ 2017-08-11 15:28 ` Stephen Hemminger
2017-08-11 16:22 ` Dumitrescu, Cristian
3 siblings, 1 reply; 79+ messages in thread
From: Stephen Hemminger @ 2017-08-11 15:28 UTC (permalink / raw)
To: Jasvinder Singh
Cc: dev, cristian.dumitrescu, ferruh.yigit, hemant.agrawal,
Jerin.JacobKollanukkaran, wenzhuo.lu
On Fri, 26 May 2017 19:11:47 +0100
Jasvinder Singh <jasvinder.singh@intel.com> wrote:
> The SoftNIC PMD provides SW fall-back option for the NICs not supporting
> the Traffic Management (TM) features.
>
> SoftNIC PMD overview:
> - The SW fall-back is based on the existing librte_sched DPDK library.
> - The TM-agnostic port (the underlay device) is wrapped into a TM-aware
> softnic port (the overlay device).
> - Once the overlay device (virtual device) is created, the configuration of
> the underlay device is taking place through the overlay device.
> - The SoftNIC PMD is generic, i.e. it works for any underlay device PMD that
> implements the ethdev API.
>
> Similarly to Ring PMD, the SoftNIC virtual device can be created in two
> different ways:
> 1. Through EAL command line (--vdev option)
> 2. Through the rte_eth_softnic_create() API function called by the application
>
> SoftNIC PMD params:
> - iface (mandatory): the ethdev port name (i.e. PCI address or vdev name) for
> the underlay device
> - txq_id (optional, default = 0): tx queue id of the underlay device
> - deq_bsz (optional, default = 24): traffic manager dequeue burst size
> - Example: --vdev 'net_softnic0,iface=0000:04:00.1,txq_id=0,deq_bsz=28'
>
> SoftNIC PMD build instructions:
> - To build SoftNIC PMD, the following parameter needs to be set on
> config/common_base file: CONFIG_RTE_LIBRTE_PMD_SOFTNIC=y
> - The SoftNIC PMD depends on the TM API [1] and therefore is initially
> targeted for the tm-next repository
>
>
> Patch 1 adds softnic device PMD for traffic management.
> Patch 2 adds traffic management ops to the softnic device suggested in
> generic ethdev API for traffic management[1].
>
> [1] TM API version 4:
> http://www.dpdk.org/dev/patchwork/patch/24411/,
> http://www.dpdk.org/dev/patchwork/patch/24412/
>
>
> Jasvinder Singh (2):
> net/softnic: add softnic PMD for traffic management
> net/softnic: add traffic management ops
>
> MAINTAINERS | 5 +
> config/common_base | 5 +
> drivers/net/Makefile | 5 +
> drivers/net/softnic/Makefile | 58 ++
> drivers/net/softnic/rte_eth_softnic.c | 578 ++++++++++++
> drivers/net/softnic/rte_eth_softnic.h | 99 ++
> drivers/net/softnic/rte_eth_softnic_default.c | 1104 +++++++++++++++++++++++
> drivers/net/softnic/rte_eth_softnic_internals.h | 93 ++
> drivers/net/softnic/rte_eth_softnic_tm.c | 235 +++++
> drivers/net/softnic/rte_eth_softnic_version.map | 7 +
> mk/rte.app.mk | 5 +-
> 11 files changed, 2193 insertions(+), 1 deletion(-)
> create mode 100644 drivers/net/softnic/Makefile
> create mode 100644 drivers/net/softnic/rte_eth_softnic.c
> create mode 100644 drivers/net/softnic/rte_eth_softnic.h
> create mode 100644 drivers/net/softnic/rte_eth_softnic_default.c
> create mode 100644 drivers/net/softnic/rte_eth_softnic_internals.h
> create mode 100644 drivers/net/softnic/rte_eth_softnic_tm.c
> create mode 100644 drivers/net/softnic/rte_eth_softnic_version.map
>
Setting up a softnic plus hardware NIC is significantly more effort for applications
than just using ethdev. Also, it puts the burden on the application to decide which
hardware device needs softnic and which does not; putting hardware knowledge in the
application is the wrong architectural direction.
Why not just the simple method of putting a new field in ethdev_ops for TM?
If it is NULL, then rte_ethdev TM would just fall back to doing the SoftNIC processing.
Also, eth_dev_ops doesn't always have to be const. Aren't there some PMDs that
insert different values based on configuration or CPU?
^ permalink raw reply [flat|nested] 79+ messages in thread
* Re: [dpdk-dev] [PATCH 0/2] net/softnic: sw fall-back for traffic management
2017-08-11 15:28 ` Stephen Hemminger
@ 2017-08-11 16:22 ` Dumitrescu, Cristian
0 siblings, 0 replies; 79+ messages in thread
From: Dumitrescu, Cristian @ 2017-08-11 16:22 UTC (permalink / raw)
To: Stephen Hemminger, Singh, Jasvinder
Cc: dev, Yigit, Ferruh, hemant.agrawal, Jerin.JacobKollanukkaran, Lu,
Wenzhuo
> -----Original Message-----
> From: Stephen Hemminger [mailto:stephen@networkplumber.org]
> Sent: Friday, August 11, 2017 4:28 PM
> To: Singh, Jasvinder <jasvinder.singh@intel.com>
> Cc: dev@dpdk.org; Dumitrescu, Cristian <cristian.dumitrescu@intel.com>;
> Yigit, Ferruh <ferruh.yigit@intel.com>; hemant.agrawal@nxp.com;
> Jerin.JacobKollanukkaran@cavium.com; Lu, Wenzhuo
> <wenzhuo.lu@intel.com>
> Subject: Re: [dpdk-dev] [PATCH 0/2] net/softnic: sw fall-back for traffic
> management
>
> On Fri, 26 May 2017 19:11:47 +0100
> Jasvinder Singh <jasvinder.singh@intel.com> wrote:
>
> > The SoftNIC PMD provides SW fall-back option for the NICs not supporting
> > the Traffic Management (TM) features.
> >
> > SoftNIC PMD overview:
> > - The SW fall-back is based on the existing librte_sched DPDK library.
> > - The TM-agnostic port (the underlay device) is wrapped into a TM-aware
> > softnic port (the overlay device).
> > - Once the overlay device (virtual device) is created, the configuration of
> > the underlay device is taking place through the overlay device.
> > - The SoftNIC PMD is generic, i.e. it works for any underlay device PMD that
> > implements the ethdev API.
> >
> > Similarly to Ring PMD, the SoftNIC virtual device can be created in two
> > different ways:
> > 1. Through EAL command line (--vdev option)
> > 2. Through the rte_eth_softnic_create() API function called by the
> application
> >
> > SoftNIC PMD params:
> > - iface (mandatory): the ethdev port name (i.e. PCI address or vdev name)
> for
> > the underlay device
> > - txq_id (optional, default = 0): tx queue id of the underlay device
> > - deq_bsz (optional, default = 24): traffic manager dequeue burst size
> > - Example: --vdev 'net_softnic0,iface=0000:04:00.1,txq_id=0,deq_bsz=28'
> >
> > SoftNIC PMD build instructions:
> > - To build SoftNIC PMD, the following parameter needs to be set on
> > config/common_base file: CONFIG_RTE_LIBRTE_PMD_SOFTNIC=y
> > - The SoftNIC PMD depends on the TM API [1] and therefore is initially
> > targeted for the tm-next repository
> >
> >
> > Patch 1 adds softnic device PMD for traffic management.
> > Patch 2 adds traffic management ops to the softnic device suggested in
> > generic ethdev API for traffic management[1].
> >
> > [1] TM API version 4:
> > http://www.dpdk.org/dev/patchwork/patch/24411/,
> > http://www.dpdk.org/dev/patchwork/patch/24412/
> >
> >
> > Jasvinder Singh (2):
> > net/softnic: add softnic PMD for traffic management
> > net/softnic: add traffic management ops
> >
> > MAINTAINERS | 5 +
> > config/common_base | 5 +
> > drivers/net/Makefile | 5 +
> > drivers/net/softnic/Makefile | 58 ++
> > drivers/net/softnic/rte_eth_softnic.c | 578 ++++++++++++
> > drivers/net/softnic/rte_eth_softnic.h | 99 ++
> > drivers/net/softnic/rte_eth_softnic_default.c | 1104
> +++++++++++++++++++++++
> > drivers/net/softnic/rte_eth_softnic_internals.h | 93 ++
> > drivers/net/softnic/rte_eth_softnic_tm.c | 235 +++++
> > drivers/net/softnic/rte_eth_softnic_version.map | 7 +
> > mk/rte.app.mk | 5 +-
> > 11 files changed, 2193 insertions(+), 1 deletion(-)
> > create mode 100644 drivers/net/softnic/Makefile
> > create mode 100644 drivers/net/softnic/rte_eth_softnic.c
> > create mode 100644 drivers/net/softnic/rte_eth_softnic.h
> > create mode 100644 drivers/net/softnic/rte_eth_softnic_default.c
> > create mode 100644 drivers/net/softnic/rte_eth_softnic_internals.h
> > create mode 100644 drivers/net/softnic/rte_eth_softnic_tm.c
> > create mode 100644 drivers/net/softnic/rte_eth_softnic_version.map
> >
>
> Setting up a softnic plus hardware NIC is significantly more effort for
> applications
> than just using ethdev. Also, it puts the burden on the application to decide
> which
> hardware device needs softnic and which does not; putting hardware
> knowledge in the
> application is the wrong architectural direction.
>
> Why not just the simple method of putting a new field in ethdev_ops for
> TM?
> If it is NULL, then rte_ethdev TM would just fall back to doing the SoftNIC
> processing.
>
> Also, eth_dev_ops doesn't always have to be const. Aren't there some
> PMDs that
> insert different values based on configuration or CPU?
Hi Stephen,
Can you please review V3 and not V1?
Code is very much improved, and we have a list of Q&As that address your questions straight away.
Thanks,
Cristian
^ permalink raw reply [flat|nested] 79+ messages in thread
* Re: [dpdk-dev] [PATCH v3 1/4] net/softnic: add softnic PMD
2017-08-11 12:49 ` [dpdk-dev] [PATCH v3 1/4] net/softnic: add softnic PMD Jasvinder Singh
@ 2017-09-05 14:53 ` Ferruh Yigit
2017-09-08 9:30 ` Singh, Jasvinder
2017-09-18 9:10 ` [dpdk-dev] [PATCH v4 0/4] net/softnic: sw fall-back pmd for traffic mgmt and others Jasvinder Singh
1 sibling, 1 reply; 79+ messages in thread
From: Ferruh Yigit @ 2017-09-05 14:53 UTC (permalink / raw)
To: Jasvinder Singh, dev; +Cc: cristian.dumitrescu, thomas
On 8/11/2017 1:49 PM, Jasvinder Singh wrote:
> Add SoftNIC PMD to provide SW fall-back for ethdev APIs.
>
> Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
> Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
> ---
> v3 changes:
> - rebase to dpdk17.08 release
>
> v2 changes:
> - fix build errors
> - rebased to TM APIs v6 plus dpdk master
>
> MAINTAINERS | 5 +
> config/common_base | 5 +
> drivers/net/Makefile | 5 +
> drivers/net/softnic/Makefile | 56 +++
> drivers/net/softnic/rte_eth_softnic.c | 609 ++++++++++++++++++++++++
> drivers/net/softnic/rte_eth_softnic.h | 54 +++
> drivers/net/softnic/rte_eth_softnic_internals.h | 114 +++++
> drivers/net/softnic/rte_eth_softnic_version.map | 7 +
> mk/rte.app.mk | 5 +-
Also documentation updates are required:
- .ini file
- PMD documentation .rst file
- I believe it is good to update release note about new PMD
- release notes library version info, since this has public API
<...>
> +EXPORT_MAP := rte_eth_softnic_version.map
rte_pmd_... to be consistent.
<...>
> +#
> +# Export include files
> +#
> +SYMLINK-y-include +=rte_eth_softnic.h
space after +=
<...>
> diff --git a/drivers/net/softnic/rte_eth_softnic.c b/drivers/net/softnic/rte_eth_softnic.c
<...>
> +
> +static struct rte_vdev_driver pmd_drv;
Why is this required? It is already defined below.
And for naming, pmd = poll mode driver, drv = driver, which makes "poll mode
driver driver".
<...>
> +static int
> +pmd_rx_queue_setup(struct rte_eth_dev *dev,
> + uint16_t rx_queue_id,
> + uint16_t nb_rx_desc __rte_unused,
> + unsigned int socket_id,
> + const struct rte_eth_rxconf *rx_conf __rte_unused,
> + struct rte_mempool *mb_pool __rte_unused)
> +{
> + struct pmd_internals *p = dev->data->dev_private;
> +
> + if (p->params.soft.intrusive == 0) {
> + struct pmd_rx_queue *rxq;
> +
> + rxq = rte_zmalloc_socket(p->params.soft.name,
> + sizeof(struct pmd_rx_queue), 0, socket_id);
> + if (rxq == NULL)
> + return -1;
return -ENOMEM ?
> +
> + rxq->hard.port_id = p->hard.port_id;
> + rxq->hard.rx_queue_id = rx_queue_id;
> + dev->data->rx_queues[rx_queue_id] = rxq;
> + } else {
> + struct rte_eth_dev *hard_dev =
> + &rte_eth_devices[p->hard.port_id];
> + void *rxq = hard_dev->data->rx_queues[rx_queue_id];
> +
> + if (rxq == NULL)
> + return -1;
> +
> + dev->data->rx_queues[rx_queue_id] = rxq;
This assigns the underlying hw queue as this soft PMD queue. What happens if
two different cores are used, one polling the actual hw device and the other
polling this virtual device, since both are indeed the same queues?
> + }
> + return 0;
> +}
> +
<...>
> +static __rte_always_inline int
> +rte_pmd_softnic_run_default(struct rte_eth_dev *dev)
> +{
> + struct pmd_internals *p = dev->data->dev_private;
> +
> + /* Persistent context: Read Only (update not required) */
> + struct rte_mbuf **pkts = p->soft.def.pkts;
> + uint16_t nb_tx_queues = dev->data->nb_tx_queues;
> +
> + /* Persistent context: Read - Write (update required) */
> + uint32_t txq_pos = p->soft.def.txq_pos;
> + uint32_t pkts_len = p->soft.def.pkts_len;
> + uint32_t flush_count = p->soft.def.flush_count;
> +
> + /* Not part of the persistent context */
> + uint32_t pos;
> + uint16_t i;
> +
> + /* Soft device TXQ read, Hard device TXQ write */
> + for (i = 0; i < nb_tx_queues; i++) {
> + struct rte_ring *txq = dev->data->tx_queues[txq_pos];
> +
> + /* Read soft device TXQ burst to packet enqueue buffer */
> + pkts_len += rte_ring_sc_dequeue_burst(txq,
> + (void **) &pkts[pkts_len],
> + DEFAULT_BURST_SIZE,
> + NULL);
> +
> + /* Increment soft device TXQ */
> + txq_pos++;
> + if (txq_pos >= nb_tx_queues)
> + txq_pos = 0;
> +
> + /* Hard device TXQ write when complete burst is available */
> + if (pkts_len >= DEFAULT_BURST_SIZE) {
Three questions:
1- When there are multiple tx_queues on the softnic, and assuming all will be
processed by one core, this core will be reading from all of them into a single
HW queue; won't this create a bottleneck?
2- This logic reads from all queues as BURST_SIZE and merges them; if the
queues are split with RSS or similar, that classification will be lost,
will it be a problem?
3- If there are not enough packets in the queues (< DEFAULT_BURST_SIZE),
those packets won't be transmitted unless more are coming; will this
create latency for those cases?
> + for (pos = 0; pos < pkts_len; )
> + pos += rte_eth_tx_burst(p->hard.port_id,
> + p->params.hard.tx_queue_id,
> + &pkts[pos],
> + (uint16_t) (pkts_len - pos));
> +
> + pkts_len = 0;
> + flush_count = 0;
> + break;
> + }
> + }
> +
> + if (flush_count >= FLUSH_COUNT_THRESHOLD) {
FLUSH_COUNT_THRESHOLD is (1 << 17), and if no packet is sent, the flush
count is incremented by one; just want to confirm the threshold value?
And why does this flush exist?
> + for (pos = 0; pos < pkts_len; )
> + pos += rte_eth_tx_burst(p->hard.port_id,
> + p->params.hard.tx_queue_id,
> + &pkts[pos],
> + (uint16_t) (pkts_len - pos));
> +
> + pkts_len = 0;
> + flush_count = 0;
> + }
> +
> + p->soft.def.txq_pos = txq_pos;
> + p->soft.def.pkts_len = pkts_len;
> + p->soft.def.flush_count = flush_count + 1;
> +
> + return 0;
> +}
> +
> +int
> +rte_pmd_softnic_run(uint8_t port_id)
> +{
> + struct rte_eth_dev *dev = &rte_eth_devices[port_id];
It can be possible to create a macro for this.
<...>
> +static int
> +default_init(struct pmd_internals *p,
default_mbufs_init()? default_init() on its own is not that clear.
<...>
> +static void
> +default_free(struct pmd_internals *p)
default_mbufs_free()?
<...>
> +static void *
> +pmd_init(struct pmd_params *params, int numa_node)
> +{
> + struct pmd_internals *p;
> + int status;
> +
> + p = rte_zmalloc_socket(params->soft.name,
> + sizeof(struct pmd_internals),
> + 0,
> + numa_node);
> + if (p == NULL)
> + return NULL;
> +
> + memcpy(&p->params, params, sizeof(p->params));
> + rte_eth_dev_get_port_by_name(params->hard.name, &p->hard.port_id);
You may want to check return value of this.
> +
> + /* Default */
> + status = default_init(p, params, numa_node);
> + if (status) {
> + rte_free(p);
> + return NULL;
> + }
> +
> + return p;
> +}
> +
> +static void
> +pmd_free(struct pmd_internals *p)
> +{
> + default_free(p);
p->hard.name also needs to be freed here.
> +
> + rte_free(p);
> +}
> +
> +static int
> +pmd_ethdev_register(struct rte_vdev_device *vdev,
> + struct pmd_params *params,
> + void *dev_private)
> +{
> + struct rte_eth_dev_info hard_info;
> + struct rte_eth_dev *soft_dev;
> + struct rte_eth_dev_data *soft_data;
> + uint32_t hard_speed;
> + int numa_node;
> + uint8_t hard_port_id;
> +
> + rte_eth_dev_get_port_by_name(params->hard.name, &hard_port_id);
> + rte_eth_dev_info_get(hard_port_id, &hard_info);
> + hard_speed = eth_dev_speed_max_mbps(hard_info.speed_capa);
> + numa_node = rte_eth_dev_socket_id(hard_port_id);
> +
> + /* Memory allocation */
> + soft_data = rte_zmalloc_socket(params->soft.name,
> + sizeof(*soft_data), 0, numa_node);
> + if (!soft_data)
> + return -ENOMEM;
> +
> + /* Ethdev entry allocation */
> + soft_dev = rte_eth_dev_allocate(params->soft.name);
> + if (!soft_dev) {
> + rte_free(soft_data);
> + return -ENOMEM;
> + }
> +
> + /* Connect dev->data */
> + memmove(soft_data->name,
> + soft_dev->data->name,
> + sizeof(soft_data->name));
I guess this is redundant here, allocating soft_data and rest, it is
possible to use soft_dev->data directly.
> + soft_data->port_id = soft_dev->data->port_id;
> + soft_data->mtu = soft_dev->data->mtu;
> + soft_dev->data = soft_data;
> +
> + /* dev */
> + soft_dev->rx_pkt_burst = (params->soft.intrusive) ?
> + NULL : /* set up later */
> + pmd_rx_pkt_burst;
> + soft_dev->tx_pkt_burst = pmd_tx_pkt_burst;
> + soft_dev->tx_pkt_prepare = NULL;
> + soft_dev->dev_ops = &pmd_ops;
> + soft_dev->device = &vdev->device;
> +
> + /* dev->data */
> + soft_dev->data->dev_private = dev_private;
> + soft_dev->data->dev_link.link_speed = hard_speed;
> + soft_dev->data->dev_link.link_duplex = ETH_LINK_FULL_DUPLEX;
> + soft_dev->data->dev_link.link_autoneg = ETH_LINK_SPEED_FIXED;
> + soft_dev->data->dev_link.link_status = ETH_LINK_DOWN;
For simplicity, it is possible to have a static struct rte_eth_link, and
assign it to data->dev_link, as done in the null pmd.
> + soft_dev->data->mac_addrs = ð_addr;
> + soft_dev->data->promiscuous = 1;
> + soft_dev->data->kdrv = RTE_KDRV_NONE;
> + soft_dev->data->numa_node = numa_node;
If pmd is detachable, need following flag:
data->dev_flags = RTE_ETH_DEV_DETACHABLE;
> +
> + return 0;
> +}
> +
<...>
> +static int
> +pmd_probe(struct rte_vdev_device *vdev)
> +{
> + struct pmd_params p;
> + const char *params;
> + int status;
> +
> + struct rte_eth_dev_info hard_info;
> + uint8_t hard_port_id;
> + int numa_node;
> + void *dev_private;
> +
> + if (!vdev)
> + return -EINVAL;
This check is not required, eal won't call this function with NULL vdev.
<...>
> diff --git a/drivers/net/softnic/rte_eth_softnic.h b/drivers/net/softnic/rte_eth_softnic.h
<...>
> +int
> +rte_pmd_softnic_run(uint8_t port_id);
Since this is a public API, it needs to be commented properly, with a
doxygen comment.
Btw, since there is an API in this PMD, perhaps the API documentation also
needs to be updated to include this.
<...>
^ permalink raw reply [flat|nested] 79+ messages in thread
* Re: [dpdk-dev] [PATCH v3 1/4] net/softnic: add softnic PMD
2017-09-05 14:53 ` Ferruh Yigit
@ 2017-09-08 9:30 ` Singh, Jasvinder
2017-09-08 9:48 ` Ferruh Yigit
0 siblings, 1 reply; 79+ messages in thread
From: Singh, Jasvinder @ 2017-09-08 9:30 UTC (permalink / raw)
To: Yigit, Ferruh, dev; +Cc: Dumitrescu, Cristian, thomas
Hi Ferruh,
Thank you for the review and feedback. Please see inline response;
> -----Original Message-----
> From: Yigit, Ferruh
> Sent: Tuesday, September 5, 2017 3:53 PM
> To: Singh, Jasvinder <jasvinder.singh@intel.com>; dev@dpdk.org
> Cc: Dumitrescu, Cristian <cristian.dumitrescu@intel.com>;
> thomas@monjalon.net
> Subject: Re: [PATCH v3 1/4] net/softnic: add softnic PMD
>
> On 8/11/2017 1:49 PM, Jasvinder Singh wrote:
> > Add SoftNIC PMD to provide SW fall-back for ethdev APIs.
> >
> > Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
> > Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
> > ---
> > v3 changes:
> > - rebase to dpdk17.08 release
> >
> > v2 changes:
> > - fix build errors
> > - rebased to TM APIs v6 plus dpdk master
> >
> > MAINTAINERS | 5 +
> > config/common_base | 5 +
> > drivers/net/Makefile | 5 +
> > drivers/net/softnic/Makefile | 56 +++
> > drivers/net/softnic/rte_eth_softnic.c | 609
> ++++++++++++++++++++++++
> > drivers/net/softnic/rte_eth_softnic.h | 54 +++
> > drivers/net/softnic/rte_eth_softnic_internals.h | 114 +++++
> > drivers/net/softnic/rte_eth_softnic_version.map | 7 +
> > mk/rte.app.mk | 5 +-
>
> Also documentation updates are required:
> - .ini file
> - PMD documentation .rst file
> - I believe it is good to update release note about new PMD
> - release notes library version info, since this has public API
Will send documentation patch.
> <...>
>
> > +EXPORT_MAP := rte_eth_softnic_version.map
>
> rte_pmd_... to be consistent.
>
> <...>
Will do.
> > +#
> > +# Export include files
> > +#
> > +SYMLINK-y-include +=rte_eth_softnic.h
>
> space after +=
>
Will add space.
>
> > diff --git a/drivers/net/softnic/rte_eth_softnic.c
> > b/drivers/net/softnic/rte_eth_softnic.c
> <...>
> > +
> > +static struct rte_vdev_driver pmd_drv;
>
> Why is this required? It is already defined below.
> And for naming, pmd = poll mode driver, drv = driver, which makes "poll mode driver
> driver"
>
Ok. will correct this.
> <...>
>
> > +static int
> > +pmd_rx_queue_setup(struct rte_eth_dev *dev,
> > + uint16_t rx_queue_id,
> > + uint16_t nb_rx_desc __rte_unused,
> > + unsigned int socket_id,
> > + const struct rte_eth_rxconf *rx_conf __rte_unused,
> > + struct rte_mempool *mb_pool __rte_unused) {
> > + struct pmd_internals *p = dev->data->dev_private;
> > +
> > + if (p->params.soft.intrusive == 0) {
> > + struct pmd_rx_queue *rxq;
> > +
> > + rxq = rte_zmalloc_socket(p->params.soft.name,
> > + sizeof(struct pmd_rx_queue), 0, socket_id);
> > + if (rxq == NULL)
> > + return -1;
>
> return -ENOMEM ?
Ok.
> > +
> > + rxq->hard.port_id = p->hard.port_id;
> > + rxq->hard.rx_queue_id = rx_queue_id;
> > + dev->data->rx_queues[rx_queue_id] = rxq;
> > + } else {
> > + struct rte_eth_dev *hard_dev =
> > + &rte_eth_devices[p->hard.port_id];
> > + void *rxq = hard_dev->data->rx_queues[rx_queue_id];
> > +
> > + if (rxq == NULL)
> > + return -1;
> > +
> > + dev->data->rx_queues[rx_queue_id] = rxq;
>
> This assigns the underlying hw queue as this soft PMD queue. What happens if
> two different cores are used, one polling the actual hw device and the other
> polling this virtual device, since both are indeed the same queues?
Once the soft device is created and attached to the hard device, the application has to read packets from/write packets to the "soft" port instead of the "hard" port, as the soft device is a feature-rich
version of the hard device (see cover letter notes). The RX and TX queues of the "soft" port are thread safe, as for any ethdev.
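To illustrate the intended usage (a rough sketch only, not code from the patch set; queue 0 and a burst size of 32 are arbitrary choices here):

#include <rte_ethdev.h>
#include <rte_mbuf.h>
#include "rte_eth_softnic.h"

/* Sketch: soft_port_id is the port id of the "soft" vdev */
static void
app_fwd_loop(uint8_t soft_port_id)
{
	struct rte_mbuf *pkts[32];

	for ( ; ; ) {
		/* RX/TX go through the "soft" port, as with any ethdev */
		uint16_t n = rte_eth_rx_burst(soft_port_id, 0, pkts, 32);
		uint16_t sent = rte_eth_tx_burst(soft_port_id, 0, pkts, n);

		/* Free any packets the TX burst could not accept */
		while (sent < n)
			rte_pktmbuf_free(pkts[sent++]);

		/* Drive the SW fall-back so that packets actually move
		 * between the "soft" and the "hard" port */
		rte_pmd_softnic_run(soft_port_id);
	}
}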
> > + }
> > + return 0;
> > +}
> > +
>
> <...>
>
> > +static __rte_always_inline int
> > +rte_pmd_softnic_run_default(struct rte_eth_dev *dev) {
> > + struct pmd_internals *p = dev->data->dev_private;
> > +
> > + /* Persistent context: Read Only (update not required) */
> > + struct rte_mbuf **pkts = p->soft.def.pkts;
> > + uint16_t nb_tx_queues = dev->data->nb_tx_queues;
> > +
> > + /* Persistent context: Read - Write (update required) */
> > + uint32_t txq_pos = p->soft.def.txq_pos;
> > + uint32_t pkts_len = p->soft.def.pkts_len;
> > + uint32_t flush_count = p->soft.def.flush_count;
> > +
> > + /* Not part of the persistent context */
> > + uint32_t pos;
> > + uint16_t i;
> > +
> > + /* Soft device TXQ read, Hard device TXQ write */
> > + for (i = 0; i < nb_tx_queues; i++) {
> > + struct rte_ring *txq = dev->data->tx_queues[txq_pos];
> > +
> > + /* Read soft device TXQ burst to packet enqueue buffer */
> > + pkts_len += rte_ring_sc_dequeue_burst(txq,
> > + (void **) &pkts[pkts_len],
> > + DEFAULT_BURST_SIZE,
> > + NULL);
> > +
> > + /* Increment soft device TXQ */
> > + txq_pos++;
> > + if (txq_pos >= nb_tx_queues)
> > + txq_pos = 0;
> > +
> > + /* Hard device TXQ write when complete burst is available */
> > + if (pkts_len >= DEFAULT_BURST_SIZE) {
>
> Three questions:
> 1- When there are multiple tx_queues on the softnic, and assuming all will be
> processed by one core, this core will be reading from all of them into a single HW queue;
> won't this create a bottleneck?
I am not sure if I understand correctly. As per the QoS sched library implementation, the number of tx queues of the softnic depends upon the number of users sending their traffic, and it is configurable via one of the input
arguments at device creation. There is no mapping between the softnic tx queues and the hard device tx queues. The softnic device receives the packets in its scheduling queues (tx queues), prioritizes their transmission,
and transmits them to a specific queue of the hard device (which can be specified as an input argument). It would be redundant for the thread implementing the QoS scheduler to distribute the packets among the hard device tx queues, as that wouldn't serve any purpose.
> 2- This logic reads from all queues as BURST_SIZE and merges them; if the
> queues are split with RSS or similar, that classification will be lost, will it be a
> problem?
I don't think so. The QoS scheduler sits on the tx side just before the transmission stage and receives the packet bursts destined for the specific network interface to which it is attached.
Thus, it schedules the packets egressing through that specific port instead of merging packets going to different interfaces.
> 3- If there are not enough packets in the queues (< DEFAULT_BURST_SIZE),
> those packets won't be transmitted unless more are coming; will this create
> latency for those cases?
In low traffic rate situations, packets will be automatically flushed at a specific interval, as discussed below.
>
> > + for (pos = 0; pos < pkts_len; )
> > + pos += rte_eth_tx_burst(p->hard.port_id,
> > + p->params.hard.tx_queue_id,
> > + &pkts[pos],
> > + (uint16_t) (pkts_len - pos));
> > +
> > + pkts_len = 0;
> > + flush_count = 0;
> > + break;
> > + }
> > + }
> > +
> > + if (flush_count >= FLUSH_COUNT_THRESHOLD) {
>
> FLUSH_COUNT_THRESHOLD is (1 << 17), and if no packet is sent, the flush count
> is incremented by one; just want to confirm the threshold value?
>
> And why this flush exists?
The flush mechanism comes into play when the traffic rate is very low. In such an instance, the packet flush is triggered once the threshold value is reached. For example, for a CPU core spinning at 2.0 GHz, as per the current
setting the packet flush will happen at a ~65 us interval when the packet burst size is less than the set value (roughly: 1 << 17 = 131072 iterations, and assuming about one run-loop iteration per cycle, 131072 / 2.0 GHz is ~65.5 us).
> > + for (pos = 0; pos < pkts_len; )
> > + pos += rte_eth_tx_burst(p->hard.port_id,
> > + p->params.hard.tx_queue_id,
> > + &pkts[pos],
> > + (uint16_t) (pkts_len - pos));
> > +
> > + pkts_len = 0;
> > + flush_count = 0;
> > + }
> > +
> > + p->soft.def.txq_pos = txq_pos;
> > + p->soft.def.pkts_len = pkts_len;
> > + p->soft.def.flush_count = flush_count + 1;
> > +
> > + return 0;
> > +}
> > +
> > +int
> > +rte_pmd_softnic_run(uint8_t port_id)
> > +{
> > + struct rte_eth_dev *dev = &rte_eth_devices[port_id];
>
> It can be possible to create a macro for this.
Ok. Will do.
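Something along these lines, for instance (the macro name here is purely illustrative, not from the patch):

/* Illustrative only; assumes rte_ethdev.h (rte_eth_devices) is visible */
#define ETHDEV_FROM_PORT(port_id) (&rte_eth_devices[(port_id)])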
> <...>
>
> > +static int
> > +default_init(struct pmd_internals *p,
>
> default_mbufs_init()? default_init() on its own is not that clear.
>
> <...>
>
> > +static void
> > +default_free(struct pmd_internals *p)
>
> default_mbufs_free()?
The generic name is chosen in case we initialize and free more than just the mbufs.
> <...>
>
> > +static void *
> > +pmd_init(struct pmd_params *params, int numa_node) {
> > + struct pmd_internals *p;
> > + int status;
> > +
> > + p = rte_zmalloc_socket(params->soft.name,
> > + sizeof(struct pmd_internals),
> > + 0,
> > + numa_node);
> > + if (p == NULL)
> > + return NULL;
> > +
> > + memcpy(&p->params, params, sizeof(p->params));
> > + rte_eth_dev_get_port_by_name(params->hard.name, &p-
> >hard.port_id);
>
> You may want to check return value of this.
Will add check.
> > +
> > + /* Default */
> > + status = default_init(p, params, numa_node);
> > + if (status) {
> > + rte_free(p);
> > + return NULL;
> > + }
> > +
> > + return p;
> > +}
> > +
> > +static void
> > +pmd_free(struct pmd_internals *p)
> > +{
> > + default_free(p);
>
> p->hard.name also needs to be freed here.
No, we don't allocate any memory to this variable, as it points to the value retrieved from rte_eth_dev_get_port_by_name();
> > +
> > + rte_free(p);
> > +}
> > +
> > +static int
> > +pmd_ethdev_register(struct rte_vdev_device *vdev,
> > + struct pmd_params *params,
> > + void *dev_private)
> > +{
> > + struct rte_eth_dev_info hard_info;
> > + struct rte_eth_dev *soft_dev;
> > + struct rte_eth_dev_data *soft_data;
> > + uint32_t hard_speed;
> > + int numa_node;
> > + uint8_t hard_port_id;
> > +
> > + rte_eth_dev_get_port_by_name(params->hard.name,
> &hard_port_id);
> > + rte_eth_dev_info_get(hard_port_id, &hard_info);
> > + hard_speed = eth_dev_speed_max_mbps(hard_info.speed_capa);
> > + numa_node = rte_eth_dev_socket_id(hard_port_id);
> > +
> > + /* Memory allocation */
> > + soft_data = rte_zmalloc_socket(params->soft.name,
> > + sizeof(*soft_data), 0, numa_node);
> > + if (!soft_data)
> > + return -ENOMEM;
> > +
> > + /* Ethdev entry allocation */
> > + soft_dev = rte_eth_dev_allocate(params->soft.name);
> > + if (!soft_dev) {
> > + rte_free(soft_data);
> > + return -ENOMEM;
> > + }
> > +
> > + /* Connect dev->data */
> > + memmove(soft_data->name,
> > + soft_dev->data->name,
> > + sizeof(soft_data->name));
>
> I guess this is redundant here, allocating soft_data and rest, it is possible to
> use soft_dev->data directly.
Yes, will correct this.
> > + soft_data->port_id = soft_dev->data->port_id;
> > + soft_data->mtu = soft_dev->data->mtu;
> > + soft_dev->data = soft_data;
> > +
> > + /* dev */
> > + soft_dev->rx_pkt_burst = (params->soft.intrusive) ?
> > + NULL : /* set up later */
> > + pmd_rx_pkt_burst;
> > + soft_dev->tx_pkt_burst = pmd_tx_pkt_burst;
> > + soft_dev->tx_pkt_prepare = NULL;
> > + soft_dev->dev_ops = &pmd_ops;
> > + soft_dev->device = &vdev->device;
> > +
> > + /* dev->data */
> > + soft_dev->data->dev_private = dev_private;
> > + soft_dev->data->dev_link.link_speed = hard_speed;
> > + soft_dev->data->dev_link.link_duplex = ETH_LINK_FULL_DUPLEX;
> > + soft_dev->data->dev_link.link_autoneg = ETH_LINK_SPEED_FIXED;
> > + soft_dev->data->dev_link.link_status = ETH_LINK_DOWN;
>
> For simplicity, it is possible to have a static struct rte_eth_link, and assign it to
> data->dev_link, as done in the null pmd.
The device speed is determined from that of the hard device, so I thought to assign the value explicitly here.
> > + soft_dev->data->mac_addrs = ð_addr;
> > + soft_dev->data->promiscuous = 1;
> > + soft_dev->data->kdrv = RTE_KDRV_NONE;
> > + soft_dev->data->numa_node = numa_node;
>
> If pmd is detachable, need following flag:
> data->dev_flags = RTE_ETH_DEV_DETACHABLE;
Ok. Will do that.
> > +
> > + return 0;
> > +}
> > +
>
> <...>
>
> > +static int
> > +pmd_probe(struct rte_vdev_device *vdev) {
> > + struct pmd_params p;
> > + const char *params;
> > + int status;
> > +
> > + struct rte_eth_dev_info hard_info;
> > + uint8_t hard_port_id;
> > + int numa_node;
> > + void *dev_private;
> > +
> > + if (!vdev)
> > + return -EINVAL;
>
> This check is not required, eal won't call this function with NULL vdev.
Ok. Will correct this.
> <...>
>
> > diff --git a/drivers/net/softnic/rte_eth_softnic.h
> > b/drivers/net/softnic/rte_eth_softnic.h
> <...>
> > +int
> > +rte_pmd_softnic_run(uint8_t port_id);
>
> Since this is a public API, it needs to be commented properly, with a doxygen
> comment.
>
> Btw, since there is an API in this PMD, perhaps the API documentation also needs to
> be updated to include this.
Yes, will add documentation.
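For instance, something along these lines (the wording is only a suggestion):

/**
 * Run the SW fall-back processing for the "soft" port.
 *
 * The CPU has to call this function repeatedly for the "soft" port in
 * order to get packets moving between the "hard" port and the application.
 *
 * @param port_id
 *   Port id of the "soft" device.
 * @return
 *   Zero on success, non-zero otherwise.
 */
int
rte_pmd_softnic_run(uint8_t port_id);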
> <...>
^ permalink raw reply [flat|nested] 79+ messages in thread
* Re: [dpdk-dev] [PATCH v3 1/4] net/softnic: add softnic PMD
2017-09-08 9:30 ` Singh, Jasvinder
@ 2017-09-08 9:48 ` Ferruh Yigit
2017-09-08 10:42 ` Singh, Jasvinder
0 siblings, 1 reply; 79+ messages in thread
From: Ferruh Yigit @ 2017-09-08 9:48 UTC (permalink / raw)
To: Singh, Jasvinder, dev; +Cc: Dumitrescu, Cristian, thomas
On 9/8/2017 10:30 AM, Singh, Jasvinder wrote:
> Hi Ferruh,
>
> Thank you for the review and feedback. Please see inline response;
>
>> -----Original Message-----
>> From: Yigit, Ferruh
>> Sent: Tuesday, September 5, 2017 3:53 PM
>> To: Singh, Jasvinder <jasvinder.singh@intel.com>; dev@dpdk.org
>> Cc: Dumitrescu, Cristian <cristian.dumitrescu@intel.com>;
>> thomas@monjalon.net
>> Subject: Re: [PATCH v3 1/4] net/softnic: add softnic PMD
>>
>> On 8/11/2017 1:49 PM, Jasvinder Singh wrote:
>>> Add SoftNIC PMD to provide SW fall-back for ethdev APIs.
>>>
>>> Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
>>> Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
<...>
>>> +
>>> + /* Default */
>>> + status = default_init(p, params, numa_node);
>>> + if (status) {
>>> + rte_free(p);
>>> + return NULL;
>>> + }
>>> +
>>> + return p;
>>> +}
>>> +
>>> +static void
>>> +pmd_free(struct pmd_internals *p)
>>> +{
>>> + default_free(p);
>>
>> p->hard.name also needs to be freed here.
>
> No, we don't allocate any memory to this variable, as it points to the value retrieved from rte_eth_dev_get_port_by_name();
I guess it is the other way around: rte_eth_dev_get_port_by_name() uses
hard.name to get and store the port_id of the underlying hw.
How is hard.name set? If I don't miss anything, it is strdup'd from devargs:
--
ret = rte_kvargs_process(kvlist, PMD_PARAM_HARD_NAME, &get_string,
&p->hard.name);
--
get_string()
*(char **)extra_args = strdup(value);
--
^ permalink raw reply [flat|nested] 79+ messages in thread
* Re: [dpdk-dev] [PATCH v3 1/4] net/softnic: add softnic PMD
2017-09-08 9:48 ` Ferruh Yigit
@ 2017-09-08 10:42 ` Singh, Jasvinder
0 siblings, 0 replies; 79+ messages in thread
From: Singh, Jasvinder @ 2017-09-08 10:42 UTC (permalink / raw)
To: Yigit, Ferruh, dev; +Cc: Dumitrescu, Cristian, thomas
> -----Original Message-----
> From: Yigit, Ferruh
> Sent: Friday, September 8, 2017 10:49 AM
> To: Singh, Jasvinder <jasvinder.singh@intel.com>; dev@dpdk.org
> Cc: Dumitrescu, Cristian <cristian.dumitrescu@intel.com>;
> thomas@monjalon.net
> Subject: Re: [PATCH v3 1/4] net/softnic: add softnic PMD
>
> On 9/8/2017 10:30 AM, Singh, Jasvinder wrote:
> > Hi Ferruh,
> >
> > Thank you for the review and feedback. Please see inline response;
> >
> >> -----Original Message-----
> >> From: Yigit, Ferruh
> >> Sent: Tuesday, September 5, 2017 3:53 PM
> >> To: Singh, Jasvinder <jasvinder.singh@intel.com>; dev@dpdk.org
> >> Cc: Dumitrescu, Cristian <cristian.dumitrescu@intel.com>;
> >> thomas@monjalon.net
> >> Subject: Re: [PATCH v3 1/4] net/softnic: add softnic PMD
> >>
> >> On 8/11/2017 1:49 PM, Jasvinder Singh wrote:
> >>> Add SoftNIC PMD to provide SW fall-back for ethdev APIs.
> >>>
> >>> Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
> >>> Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
>
> <...>
>
> >>> +
> >>> + /* Default */
> >>> + status = default_init(p, params, numa_node);
> >>> + if (status) {
> >>> + rte_free(p);
> >>> + return NULL;
> >>> + }
> >>> +
> >>> + return p;
> >>> +}
> >>> +
> >>> +static void
> >>> +pmd_free(struct pmd_internals *p)
> >>> +{
> >>> + default_free(p);
> >>
> >> p->hard.name also needs to be freed here.
> >
> > No, we don't allocate any memory to this varibale as it points to the
> > value retrieved from the rte_eth_dev_get_port_by_name();
>
> I guess it is the other way around: rte_eth_dev_get_port_by_name() uses
> hard.name to get and store the port_id of the underlying hw.
>
> How is hard.name set? If I don't miss anything, it is strdup'd from devargs:
>
> --
> ret = rte_kvargs_process(kvlist, PMD_PARAM_HARD_NAME, &get_string,
> &p->hard.name);
> --
> get_string()
> *(char **)extra_args = strdup(value);
> --
Yes, it is set as above; will correct that. Thanks.
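For reference, a minimal sketch of the fix being discussed (the exact member path is assumed here, not taken from the patch):

static void
pmd_free(struct pmd_internals *p)
{
	default_free(p);

	/* hard.name is strdup()'d from the devargs by get_string(), so it
	 * has to be released with libc free(), not rte_free() */
	free(p->params.hard.name);

	rte_free(p);
}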
^ permalink raw reply [flat|nested] 79+ messages in thread
* Re: [dpdk-dev] [PATCH v3 0/4] net/softnic: sw fall-back pmd for traffic mgmt and others
2017-08-11 12:49 ` [dpdk-dev] [PATCH v3 0/4] net/softnic: sw fall-back pmd for traffic mgmt and others Jasvinder Singh
` (3 preceding siblings ...)
2017-08-11 12:49 ` [dpdk-dev] [PATCH v3 4/4] net/softnic: add TM hierarchy related ops Jasvinder Singh
@ 2017-09-08 17:08 ` Dumitrescu, Cristian
4 siblings, 0 replies; 79+ messages in thread
From: Dumitrescu, Cristian @ 2017-09-08 17:08 UTC (permalink / raw)
To: Singh, Jasvinder, dev, thomas, stephen; +Cc: Yigit, Ferruh
> -----Original Message-----
> From: Singh, Jasvinder
> Sent: Friday, August 11, 2017 1:49 PM
> To: dev@dpdk.org
> Cc: Dumitrescu, Cristian <cristian.dumitrescu@intel.com>; Yigit, Ferruh
> <ferruh.yigit@intel.com>; thomas@monjalon.net
> Subject: [PATCH v3 0/4] net/softnic: sw fall-back pmd for traffic mgmt and
> others
>
> The SoftNIC PMD is intended to provide SW fall-back options for specific
> ethdev APIs in a generic way to the NICs not supporting those features.
>
> Currently, the only implemented ethdev API is Traffic Management (TM),
> but other ethdev APIs such as rte_flow, traffic metering & policing, etc
> can be easily implemented.
>
> Overview:
> * Generic: The SoftNIC PMD works with any "hard" PMD that implements
> the
> ethdev API. It does not change the "hard" PMD in any way.
> * Creation: For any given "hard" ethdev port, the user can decide to
> create an associated "soft" ethdev port to drive the "hard" port. The
> "soft" port is a virtual device that can be created at app start-up
> through EAL vdev arg or later through the virtual device API.
> * Configuration: The app explicitly decides which features are to be
> enabled on the "soft" port and which features are still to be used from
> the "hard" port. The app continues to explicitly configure both the
> "hard" and the "soft" ports after the creation of the "soft" port.
> * RX/TX: The app reads packets from/writes packets to the "soft" port
> instead of the "hard" port. The RX and TX queues of the "soft" port are
> thread safe, as any ethdev.
> * Execution: The "soft" port is a feature-rich NIC implemented by the CPU,
> so the run function of the "soft" port has to be executed by the CPU in
> order to get packets moving between "hard" port and the app.
> * Meets the NFV vision: The app should be (almost) agnostic about the NIC
> implementation (different vendors/models, HW-SW mix), the app should
> not
> require changes to use different NICs, the app should use the same API
> for all NICs. If a NIC does not implement a specific feature, the HW
> should be augmented with SW to meet the functionality while still
> preserving the same API.
>
> Traffic Management SW fall-back overview:
> * Implements the ethdev traffic management API (rte_tm.h).
> * Based on the existing librte_sched DPDK library.
>
> Example: Create "soft" port for "hard" port "0000:04:00.1", enable the TM
> feature with default settings:
> --vdev 'net_softnic0,hard_name=0000:04:00.1,soft_tm=on'
>
> Q1: Why generic name, if only TM is supported (for now)?
> A1: The intention is to have SoftNIC PMD implement many other (all?)
> ethdev APIs under a single "ideal" ethdev, hence the generic name.
> The initial motivation is TM API, but the mechanism is generic and can
> be used for many other ethdev APIs. Somebody looking to provide SW
> fall-back for other ethdev API is likely to end up inventing the same,
> hence it would be good to consolidate all under a single PMD and have
> the user explicitly enable/disable the features it needs for each
> "soft" device.
>
> Q2: Are there any performance requirements for SoftNIC?
> A2: Yes, performance should be great/decent for every feature, otherwise
> the SW fall-back is unusable, thus useless.
>
> Q3: Why not change the "hard" device (and keep a single device) instead of
> creating a new "soft" device (and thus having two devices)?
> A3: This is not possible with the current librte_ether ethdev
> implementation. The ethdev->dev_ops are defined as constant structure,
> so it cannot be changed per device (nor per PMD). The new ops also
> need memory space to store their context data structures, which
> requires updating the ethdev->data->dev_private of the existing
> device; at best, maybe a resize of ethdev->data->dev_private could be
> done, assuming that librte_ether will introduce a way to find out its
> size, but this cannot be done while device is running. Other side
> effects might exist, as the changes are very intrusive, plus it likely
> needs more changes in librte_ether.
>
> Q4: Why not call the SW fall-back dev_ops directly in librte_ether for
> devices which do not support the specific feature? If the device
> supports the capability, let's call its dev_ops, otherwise call the
> SW fall-back dev_ops.
> A4: First, similar reasons to Q&A3. This fixes the need to change
> ethdev->dev_ops of the device, but it does not do anything to fix the
> other significant issue of where to store the context data structures
> needed by the SW fall-back functions (which, in this approach, are
> called implicitly by librte_ether).
> Second, the SW fall-back options should not be restricted arbitrarily
> by the librte_ether library, the decision should belong to the app.
> For example, the TM SW fall-back should not be limited to only
> librte_sched, which (like any SW fall-back) is limited to a specific
> hierarchy and feature set, it cannot do any possible hierarchy. If
> alternatives exist, the one to use should be picked by the app, not by
> the ethdev layer.
>
> Q5: Why is the app required to continue to configure both the "hard" and
> the "soft" devices even after the "soft" device has been created? Why
> not hiding the "hard" device under the "soft" device and have the
> "soft" device configure the "hard" device under the hood?
> A5: This was the approach tried in the V2 of this patch set (overlay
> "soft" device taking over the configuration of the underlay "hard"
> device) and eventually dropped due to increased complexity of having
> to keep the configuration of two distinct devices in sync with
> librte_ether implementation that is not friendly towards such
> approach. Basically, each ethdev API call for the overlay device
> needs to configure the overlay device, invoke the same configuration
> with possibly modified parameters for the underlay device, then resume
> the configuration of overlay device, turning this into a device
> emulation project.
> V2 minuses: increased complexity (deal with two devices at same time);
> need to implement every ethdev API, even those not needed for the
> scope
> of SW fall-back; intrusive; sometimes have to silently take decisions
> that should be left to the app.
> V3 pluses: lower complexity (only one device); only need to implement
> those APIs that are in scope of the SW fall-back; non-intrusive (deal
> with "hard" device through ethdev API); app decisions taken by the app
> in an explicit way.
>
> Q6: Why expose the SW fall-back in a PMD and not in a SW library?
> A6: The SW fall-back for an ethdev API has to implement that specific
> ethdev API, (hence expose an ethdev object through a PMD), as opposed
> to providing a different API. This approach allows the app to use the
> same API (NFV vision). For example, we already have a library for TM
> SW fall-back (librte_sched) that can be called directly by the apps
> that need to call it outside of ethdev context (use-cases exist), but
> an app that works with TM-aware NICs through the ethdev TM API would
> have to be changed significantly in order to work with different
> TM-agnostic NICs through the librte_sched API.
>
> Q7: Why have all the SW fall-backs in a single PMD? Why not develop
> the SW fall-back for each different ethdev API in a separate PMD, then
> create a chain of "soft" devices for each "hard" device? Potentially,
> this results in smaller size PMDs that are easier to maintain.
> A7: Arguments for single ethdev/PMD and against chain of ethdevs/PMDs:
> 1. All the existing PMDs for HW NICs implement a lot of features under
> the same PMD, so there is no reason for single PMD approach to break
> code modularity. See the V3 code, a lot of care has been taken for
> code modularity.
> 2. We should avoid the proliferation of SW PMDs.
> 3. A single device should be handled by a single PMD.
> 4. People are used to feature-rich PMDs, not single-feature
> PMDs, so why change the mindset?
> 5. [Configuration nightmare] A chain of "soft" devices attached to
> single "hard" device requires the app to be aware that the N "soft"
> devices in the chain plus the "hard" device refer to the same HW
> device, and which device should be invoked to configure which
> feature. Also the length of the chain and functionality of each
> link is different for each HW device. This breaks the requirement
> of preserving the same API while working with different NICs (NFV).
> This most likely results in a configuration nightmare, nobody is
> going to seriously use this.
> 6. [Feature inter-dependency] Sometimes different features need to be
> configured and executed together (e.g. share the same set of
> resources, are inter-dependent, etc), so it is better and more
> performant to do them in the same ethdev/PMD.
> 7. [Code duplication] There is a lot of duplication in the
> configuration code for the chain of ethdevs approach. The ethdev
> dev_configure, rx_queue_setup, tx_queue_setup API functions have to
> be implemented per device, and they become meaningless/inconsistent
> with the chain approach.
> 8. [Data structure duplication] The per device data structures have to
> be duplicated and read repeatedly for each "soft" ethdev. The
> ethdev device, dev_private, data, per RX/TX queue data structures
> have to be replicated per "soft" device. They have to be re-read for
> each stage, so the same cache misses are now multiplied with the
> number of stages in the chain.
> 9. [rte_ring proliferation] Thread safety requirements for ethdev
> RX/TXqueues require an rte_ring to be used for every RX/TX queue
> of each "soft" ethdev. This rte_ring proliferation unnecessarily
> increases the memory footprint and lowers performance, especially
> when each "soft" ethdev ends up on a different CPU core (ping-pong
> of cache lines).
> 10.[Meta-data proliferation] A chain of ethdevs is likely to result
> in proliferation of meta-data that has to be passed between the
> ethdevs (e.g. policing needs the output of flow classification),
> which results in more cache line ping-pong between cores, hence
> performance drops.
>
> Cristian Dumitrescu (4):
> Jasvinder Singh (4):
> net/softnic: add softnic PMD
> net/softnic: add traffic management support
> net/softnic: add TM capabilities ops
> net/softnic: add TM hierarchy related ops
>
> MAINTAINERS | 5 +
> config/common_base | 5 +
> drivers/net/Makefile | 5 +
> drivers/net/softnic/Makefile | 57 +
> drivers/net/softnic/rte_eth_softnic.c | 867 ++++++
> drivers/net/softnic/rte_eth_softnic.h | 70 +
> drivers/net/softnic/rte_eth_softnic_internals.h | 291 ++
> drivers/net/softnic/rte_eth_softnic_tm.c | 3446
> +++++++++++++++++++++++
> drivers/net/softnic/rte_eth_softnic_version.map | 7 +
> mk/rte.app.mk | 5 +-
> 10 files changed, 4757 insertions(+), 1 deletion(-)
> create mode 100644 drivers/net/softnic/Makefile
> create mode 100644 drivers/net/softnic/rte_eth_softnic.c
> create mode 100644 drivers/net/softnic/rte_eth_softnic.h
> create mode 100644 drivers/net/softnic/rte_eth_softnic_internals.h
> create mode 100644 drivers/net/softnic/rte_eth_softnic_tm.c
> create mode 100644 drivers/net/softnic/rte_eth_softnic_version.map
>
> --
> 2.9.3
Ping Thomas and Stephen.
You guys previously had some comments suggesting the approach of augmenting the "hard" device as opposed to creating a "soft" device for the "hard" device. We tried to address these comments in the Q&A list below (see Q&As 3 and 4).
Have your comments been addressed or do we need to talk more on them?
Thanks,
Cristian
^ permalink raw reply [flat|nested] 79+ messages in thread
* [dpdk-dev] [PATCH v4 0/4] net/softnic: sw fall-back pmd for traffic mgmt and others
2017-08-11 12:49 ` [dpdk-dev] [PATCH v3 1/4] net/softnic: add softnic PMD Jasvinder Singh
2017-09-05 14:53 ` Ferruh Yigit
@ 2017-09-18 9:10 ` Jasvinder Singh
2017-09-18 9:10 ` [dpdk-dev] [PATCH v4 1/4] net/softnic: add softnic PMD Jasvinder Singh
` (4 more replies)
1 sibling, 5 replies; 79+ messages in thread
From: Jasvinder Singh @ 2017-09-18 9:10 UTC (permalink / raw)
To: dev; +Cc: cristian.dumitrescu, ferruh.yigit, thomas
The SoftNIC PMD is intended to provide SW fall-back options for specific
ethdev APIs in a generic way to the NICs not supporting those features.
Currently, the only implemented ethdev API is Traffic Management (TM),
but other ethdev APIs such as rte_flow, traffic metering & policing, etc
can be easily implemented.
Overview:
* Generic: The SoftNIC PMD works with any "hard" PMD that implements the
ethdev API. It does not change the "hard" PMD in any way.
* Creation: For any given "hard" ethdev port, the user can decide to
create an associated "soft" ethdev port to drive the "hard" port. The
"soft" port is a virtual device that can be created at app start-up
through EAL vdev arg or later through the virtual device API.
* Configuration: The app explicitly decides which features are to be
enabled on the "soft" port and which features are still to be used from
the "hard" port. The app continues to explicitly configure both the
"hard" and the "soft" ports after the creation of the "soft" port.
* RX/TX: The app reads packets from/writes packets to the "soft" port
instead of the "hard" port. The RX and TX queues of the "soft" port are
thread safe, as any ethdev.
* Execution: The "soft" port is a feature-rich NIC implemented by the CPU,
so the run function of the "soft" port has to be executed by the CPU in
order to get packets moving between "hard" port and the app.
* Meets the NFV vision: The app should be (almost) agnostic about the NIC
implementation (different vendors/models, HW-SW mix), the app should not
require changes to use different NICs, the app should use the same API
for all NICs. If a NIC does not implement a specific feature, the HW
should be augmented with SW to meet the functionality while still
preserving the same API.
Traffic Management SW fall-back overview:
* Implements the ethdev traffic management API (rte_tm.h).
* Based on the existing librte_sched DPDK library.
Example: Create "soft" port for "hard" port "0000:04:00.1", enable the TM
feature with default settings:
--vdev 'net_softnic0,hard_name=0000:04:00.1,soft_tm=on'
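The same "soft" port can also be created at run time through the virtual device API (a minimal sketch; error handling reduced, and the vdev header name may differ across releases):

#include <rte_vdev.h>      /* vdev bus API */
#include <rte_ethdev.h>

/* Run-time equivalent of the --vdev example above */
static int
create_soft_port(uint8_t *soft_port_id)
{
	int ret = rte_vdev_init("net_softnic0",
		"hard_name=0000:04:00.1,soft_tm=on");
	if (ret)
		return ret;

	return rte_eth_dev_get_port_by_name("net_softnic0", soft_port_id);
}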
Q1: Why generic name, if only TM is supported (for now)?
A1: The intention is to have SoftNIC PMD implement many other (all?)
ethdev APIs under a single "ideal" ethdev, hence the generic name.
The initial motivation is TM API, but the mechanism is generic and can
be used for many other ethdev APIs. Somebody looking to provide SW
fall-back for other ethdev API is likely to end up inventing the same,
hence it would be good to consolidate all under a single PMD and have
the user explicitly enable/disable the features it needs for each
"soft" device.
Q2: Are there any performance requirements for SoftNIC?
A2: Yes, performance should be great/decent for every feature, otherwise
the SW fall-back is unusable, thus useless.
Q3: Why not change the "hard" device (and keep a single device) instead of
creating a new "soft" device (and thus having two devices)?
A3: This is not possible with the current librte_ether ethdev
implementation. The ethdev->dev_ops are defined as a constant structure,
so they cannot be changed per device (nor per PMD). The new ops also
need memory space to store their context data structures, which
requires updating the ethdev->data->dev_private of the existing
device; at best, maybe a resize of ethdev->data->dev_private could be
done, assuming that librte_ether will introduce a way to find out its
size, but this cannot be done while the device is running. Other side
effects might exist, as the changes are very intrusive, plus it likely
needs more changes in librte_ether.
Q4: Why not call the SW fall-back dev_ops directly in librte_ether for
devices which do not support the specific feature? If the device
supports the capability, let's call its dev_ops, otherwise call the
SW fall-back dev_ops.
A4: First, similar reasons to Q&A3. This avoids the need to change
ethdev->dev_ops of the device, but it does not do anything to fix the
other significant issue of where to store the context data structures
needed by the SW fall-back functions (which, in this approach, are
called implicitly by librte_ether).
Second, the SW fall-back options should not be restricted arbitrarily
by the librte_ether library; the decision should belong to the app.
For example, the TM SW fall-back should not be limited to librte_sched
only, which (like any SW fall-back) supports a specific hierarchy and
feature set and cannot implement every possible hierarchy. If
alternatives exist, the one to use should be picked by the app, not by
the ethdev layer.
Q5: Why is the app required to continue to configure both the "hard" and
the "soft" devices even after the "soft" device has been created? Why
not hide the "hard" device under the "soft" device and have the
"soft" device configure the "hard" device under the hood?
A5: This was the approach tried in the V2 of this patch set (overlay
"soft" device taking over the configuration of the underlay "hard"
device) and eventually dropped due to increased complexity of having
to keep the configuration of two distinct devices in sync with a
librte_ether implementation that is not friendly towards such an
approach. Basically, each ethdev API call for the overlay device
needs to configure the overlay device, invoke the same configuration
with possibly modified parameters for the underlay device, then resume
the configuration of overlay device, turning this into a device
emulation project.
V2 minuses: increased complexity (dealing with two devices at the same
time); need to implement every ethdev API, even those not needed for the
scope of the SW fall-back; intrusive; sometimes decisions that should be
left to the app have to be taken silently.
V3 pluses: lower complexity (only one device); only need to implement
those APIs that are in scope of the SW fall-back; non-intrusive (deal
with "hard" device through ethdev API); app decisions taken by the app
in an explicit way.
Q6: Why expose the SW fall-back in a PMD and not in a SW library?
A6: The SW fall-back for an ethdev API has to implement that specific
ethdev API (hence expose an ethdev object through a PMD), as opposed
to providing a different API. This approach allows the app to use the
same API (NFV vision). For example, we already have a library for TM
SW fall-back (librte_sched) that can be called directly by the apps
that need to call it outside of ethdev context (use-cases exist), but
an app that works with TM-aware NICs through the ethdev TM API would
have to be changed significantly in order to work with different
TM-agnostic NICs through the librte_sched API.
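For reference, a minimal sketch of what direct librte_sched usage looks like
(the library underneath the TM fall-back); the port IDs, burst size and the
pre-filled port_params structure are illustrative assumptions, and the mbuf
sched field is expected to be written by classification before the enqueue:

  #include <rte_sched.h>
  #include <rte_ethdev.h>

  struct rte_sched_port_params port_params; /* filled in by the app */
  struct rte_sched_port *sched = rte_sched_port_config(&port_params);
  struct rte_mbuf *pkts[32];
  int n;

  n = rte_eth_rx_burst(rx_port_id, 0, pkts, 32);
  /* ... classification writes the mbuf sched field here ... */
  rte_sched_port_enqueue(sched, pkts, n);
  n = rte_sched_port_dequeue(sched, pkts, 32);
  rte_eth_tx_burst(tx_port_id, 0, pkts, (uint16_t)n);

An app written against this API cannot move to a TM-aware NIC without rework,
whereas the same app written against the ethdev TM API works unchanged on top
of either the "soft" or the "hard" port.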
Q7: Why have all the SW fall-backs in a single PMD? Why not develop
the SW fall-back for each different ethdev API in a separate PMD, then
create a chain of "soft" devices for each "hard" device? Potentially,
this results in smaller size PMDs that are easier to maintain.
A7: Arguments for single ethdev/PMD and against chain of ethdevs/PMDs:
1. All the existing PMDs for HW NICs implement a lot of features under
the same PMD, so there is no reason for single PMD approach to break
code modularity. See the V3 code, a lot of care has been taken for
code modularity.
2. We should avoid the proliferation of SW PMDs.
3. A single device should be handled by a single PMD.
4. People are used to feature-rich PMDs, not to single-feature PMDs,
so why force a change of mindset?
5. [Configuration nightmare] A chain of "soft" devices attached to
single "hard" device requires the app to be aware that the N "soft"
devices in the chain plus the "hard" device refer to the same HW
device, and which device should be invoked to configure which
feature. Also the length of the chain and functionality of each
link is different for each HW device. This breaks the requirement
of preserving the same API while working with different NICs (NFV).
This most likely results in a configuration nightmare, nobody is
going to seriously use this.
6. [Feature inter-dependency] Sometimes different features need to be
configured and executed together (e.g. share the same set of
resources, are inter-dependent, etc), so it is better and more
performant to do them in the same ethdev/PMD.
7. [Code duplication] There is a lot of duplication in the
configuration code for the chain of ethdevs approach. The ethdev
dev_configure, rx_queue_setup, tx_queue_setup API functions have to
be implemented per device, and they become meaningless/inconsistent
with the chain approach.
8. [Data structure duplication] The per device data structures have to
be duplicated and read repeatedly for each "soft" ethdev. The
ethdev device, dev_private, data, per RX/TX queue data structures
have to be replicated per "soft" device. They have to be re-read for
each stage, so the same cache misses are now multiplied with the
number of stages in the chain.
9. [rte_ring proliferation] Thread safety requirements for ethdev
RX/TX queues require an rte_ring to be used for every RX/TX queue
of each "soft" ethdev. This rte_ring proliferation unnecessarily
increases the memory footprint and lowers performance, especially
when each "soft" ethdev ends up on a different CPU core (ping-pong
of cache lines).
10.[Meta-data proliferation] A chain of ethdevs is likely to result
in proliferation of meta-data that has to be passed between the
ethdevs (e.g. policing needs the output of flow classification),
which results in more cache line ping-pong between cores, hence
performance drops.
Jasvinder Singh (4):
net/softnic: add softnic PMD
net/softnic: add traffic management support
net/softnic: add TM capabilities ops
net/softnic: add TM hierarchy related ops
MAINTAINERS | 5 +
config/common_base | 5 +
doc/api/doxy-api-index.md | 3 +-
doc/api/doxy-api.conf | 1 +
doc/guides/rel_notes/release_17_11.rst | 6 +
drivers/net/Makefile | 5 +
drivers/net/softnic/Makefile | 57 +
drivers/net/softnic/rte_eth_softnic.c | 853 +++++
drivers/net/softnic/rte_eth_softnic.h | 83 +
drivers/net/softnic/rte_eth_softnic_internals.h | 291 ++
drivers/net/softnic/rte_eth_softnic_tm.c | 3449 ++++++++++++++++++++
.../net/softnic/rte_pmd_eth_softnic_version.map | 7 +
mk/rte.app.mk | 5 +-
13 files changed, 4768 insertions(+), 2 deletions(-)
create mode 100644 drivers/net/softnic/Makefile
create mode 100644 drivers/net/softnic/rte_eth_softnic.c
create mode 100644 drivers/net/softnic/rte_eth_softnic.h
create mode 100644 drivers/net/softnic/rte_eth_softnic_internals.h
create mode 100644 drivers/net/softnic/rte_eth_softnic_tm.c
create mode 100644 drivers/net/softnic/rte_pmd_eth_softnic_version.map
--
2.9.3
^ permalink raw reply [flat|nested] 79+ messages in thread
* [dpdk-dev] [PATCH v4 1/4] net/softnic: add softnic PMD
2017-09-18 9:10 ` [dpdk-dev] [PATCH v4 0/4] net/softnic: sw fall-back pmd for traffic mgmt and others Jasvinder Singh
@ 2017-09-18 9:10 ` Jasvinder Singh
2017-09-18 16:58 ` Singh, Jasvinder
2017-09-18 9:10 ` [dpdk-dev] [PATCH v4 2/4] net/softnic: add traffic management support Jasvinder Singh
` (3 subsequent siblings)
4 siblings, 1 reply; 79+ messages in thread
From: Jasvinder Singh @ 2017-09-18 9:10 UTC (permalink / raw)
To: dev; +Cc: cristian.dumitrescu, ferruh.yigit, thomas
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset=UTF-8, Size: 30410 bytes --]
Add SoftNIC PMD to provide SW fall-back for ethdev APIs.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
---
v4 changes:
- Implemented feedback from Ferruh [1]
- rename map file to rte_pmd_eth_softnic_version.map
- add release notes library version info
- doxygen: fix hooks in doc/api/doxy-api-index.md
- add doxygen comment for rte_pmd_softnic_run()
- free device name memory
- remove soft_dev param in pmd_ethdev_register()
- fix checkpatch warnings
v3 changes:
- rebase to dpdk17.08 release
v2 changes:
- fix build errors
- rebased to TM APIs v6 plus dpdk master
[1] Ferruh's feedback on v3: http://dpdk.org/ml/archives/dev/2017-September/074576.html
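Usage reminder for the parameters added in this patch (the PCI address below
is illustrative; hard_tx_queue_id is optional and defaults to 0):

  --vdev 'net_softnic0,hard_name=0000:02:00.0,hard_tx_queue_id=0'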
MAINTAINERS | 5 +
config/common_base | 5 +
doc/api/doxy-api-index.md | 3 +-
doc/api/doxy-api.conf | 1 +
doc/guides/rel_notes/release_17_11.rst | 6 +
drivers/net/Makefile | 5 +
drivers/net/softnic/Makefile | 56 ++
drivers/net/softnic/rte_eth_softnic.c | 595 +++++++++++++++++++++
drivers/net/softnic/rte_eth_softnic.h | 67 +++
drivers/net/softnic/rte_eth_softnic_internals.h | 114 ++++
.../net/softnic/rte_pmd_eth_softnic_version.map | 7 +
mk/rte.app.mk | 5 +-
12 files changed, 867 insertions(+), 2 deletions(-)
create mode 100644 drivers/net/softnic/Makefile
create mode 100644 drivers/net/softnic/rte_eth_softnic.c
create mode 100644 drivers/net/softnic/rte_eth_softnic.h
create mode 100644 drivers/net/softnic/rte_eth_softnic_internals.h
create mode 100644 drivers/net/softnic/rte_pmd_eth_softnic_version.map
diff --git a/MAINTAINERS b/MAINTAINERS
index a0cd75e..b6b738d 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -511,6 +511,11 @@ M: Gaetan Rivet <gaetan.rivet@6wind.com>
F: drivers/net/failsafe/
F: doc/guides/nics/fail_safe.rst
+Softnic PMD
+M: Jasvinder Singh <jasvinder.singh@intel.com>
+M: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
+F: drivers/net/softnic
+
Crypto Drivers
--------------
diff --git a/config/common_base b/config/common_base
index 5e97a08..1a0c77d 100644
--- a/config/common_base
+++ b/config/common_base
@@ -273,6 +273,11 @@ CONFIG_RTE_LIBRTE_SFC_EFX_PMD=y
CONFIG_RTE_LIBRTE_SFC_EFX_DEBUG=n
#
+# Compile SOFTNIC PMD
+#
+CONFIG_RTE_LIBRTE_PMD_SOFTNIC=y
+
+#
# Compile software PMD backed by SZEDATA2 device
#
CONFIG_RTE_LIBRTE_PMD_SZEDATA2=n
diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
index 19e0d4f..626ab51 100644
--- a/doc/api/doxy-api-index.md
+++ b/doc/api/doxy-api-index.md
@@ -55,7 +55,8 @@ The public API headers are grouped by topics:
[KNI] (@ref rte_kni.h),
[ixgbe] (@ref rte_pmd_ixgbe.h),
[i40e] (@ref rte_pmd_i40e.h),
- [crypto_scheduler] (@ref rte_cryptodev_scheduler.h)
+ [crypto_scheduler] (@ref rte_cryptodev_scheduler.h),
+ [softnic] (@ref rte_eth_softnic.h)
- **memory**:
[memseg] (@ref rte_memory.h),
diff --git a/doc/api/doxy-api.conf b/doc/api/doxy-api.conf
index 823554f..b27755d 100644
--- a/doc/api/doxy-api.conf
+++ b/doc/api/doxy-api.conf
@@ -32,6 +32,7 @@ PROJECT_NAME = DPDK
INPUT = doc/api/doxy-api-index.md \
drivers/crypto/scheduler \
drivers/net/bonding \
+ drivers/net/softnic \
drivers/net/i40e \
drivers/net/ixgbe \
lib/librte_eal/common/include \
diff --git a/doc/guides/rel_notes/release_17_11.rst b/doc/guides/rel_notes/release_17_11.rst
index 170f4f9..d5a760b 100644
--- a/doc/guides/rel_notes/release_17_11.rst
+++ b/doc/guides/rel_notes/release_17_11.rst
@@ -41,6 +41,11 @@ New Features
Also, make sure to start the actual text at the margin.
=========================================================
+* **Added SoftNIC PMD.**
+
+ Added new SoftNIC PMD. This virtual device offers applications a software
+ fallback support for traffic management.
+
Resolved Issues
---------------
@@ -170,6 +175,7 @@ The libraries prepended with a plus sign were incremented in this version.
librte_pipeline.so.3
librte_pmd_bond.so.1
librte_pmd_ring.so.2
+ + librte_pmd_softnic.so.1
librte_port.so.3
librte_power.so.1
librte_reorder.so.1
diff --git a/drivers/net/Makefile b/drivers/net/Makefile
index d33c959..b552a51 100644
--- a/drivers/net/Makefile
+++ b/drivers/net/Makefile
@@ -110,4 +110,9 @@ DIRS-$(CONFIG_RTE_LIBRTE_PMD_VHOST) += vhost
endif # $(CONFIG_RTE_LIBRTE_VHOST)
DEPDIRS-vhost = $(core-libs) librte_vhost
+ifeq ($(CONFIG_RTE_LIBRTE_SCHED),y)
+DIRS-$(CONFIG_RTE_LIBRTE_PMD_SOFTNIC) += softnic
+endif # $(CONFIG_RTE_LIBRTE_SCHED)
+DEPDIRS-softnic = $(core-libs) librte_sched
+
include $(RTE_SDK)/mk/rte.subdir.mk
diff --git a/drivers/net/softnic/Makefile b/drivers/net/softnic/Makefile
new file mode 100644
index 0000000..c2f42ef
--- /dev/null
+++ b/drivers/net/softnic/Makefile
@@ -0,0 +1,56 @@
+# BSD LICENSE
+#
+# Copyright(c) 2017 Intel Corporation. All rights reserved.
+# All rights reserved.
+#
+# Redistribution and use in source and binary forms, with or without
+# modification, are permitted provided that the following conditions
+# are met:
+#
+# * Redistributions of source code must retain the above copyright
+# notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+# notice, this list of conditions and the following disclaimer in
+# the documentation and/or other materials provided with the
+# distribution.
+# * Neither the name of Intel Corporation nor the names of its
+# contributors may be used to endorse or promote products derived
+# from this software without specific prior written permission.
+#
+# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+#
+# library name
+#
+LIB = librte_pmd_softnic.a
+
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS)
+
+EXPORT_MAP := rte_pmd_eth_softnic_version.map
+
+LIBABIVER := 1
+
+#
+# all source are stored in SRCS-y
+#
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_SOFTNIC) += rte_eth_softnic.c
+
+#
+# Export include files
+#
+SYMLINK-y-include += rte_eth_softnic.h
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/drivers/net/softnic/rte_eth_softnic.c b/drivers/net/softnic/rte_eth_softnic.c
new file mode 100644
index 0000000..792e7ea
--- /dev/null
+++ b/drivers/net/softnic/rte_eth_softnic.c
@@ -0,0 +1,595 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright(c) 2017 Intel Corporation. All rights reserved.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdint.h>
+#include <stdlib.h>
+#include <string.h>
+
+#include <rte_ethdev.h>
+#include <rte_ethdev_vdev.h>
+#include <rte_malloc.h>
+#include <rte_vdev.h>
+#include <rte_kvargs.h>
+#include <rte_errno.h>
+#include <rte_ring.h>
+
+#include "rte_eth_softnic.h"
+#include "rte_eth_softnic_internals.h"
+
+#define PRIV_TO_HARD_DEV(p) \
+ (&rte_eth_devices[p->hard.port_id])
+
+#define PMD_PARAM_HARD_NAME "hard_name"
+#define PMD_PARAM_HARD_TX_QUEUE_ID "hard_tx_queue_id"
+
+static const char *pmd_valid_args[] = {
+ PMD_PARAM_HARD_NAME,
+ PMD_PARAM_HARD_TX_QUEUE_ID,
+ NULL
+};
+
+static const struct rte_eth_dev_info pmd_dev_info = {
+ .min_rx_bufsize = 0,
+ .max_rx_pktlen = UINT32_MAX,
+ .max_rx_queues = UINT16_MAX,
+ .max_tx_queues = UINT16_MAX,
+ .rx_desc_lim = {
+ .nb_max = UINT16_MAX,
+ .nb_min = 0,
+ .nb_align = 1,
+ },
+ .tx_desc_lim = {
+ .nb_max = UINT16_MAX,
+ .nb_min = 0,
+ .nb_align = 1,
+ },
+};
+
+static void
+pmd_dev_infos_get(struct rte_eth_dev *dev __rte_unused,
+ struct rte_eth_dev_info *dev_info)
+{
+ memcpy(dev_info, &pmd_dev_info, sizeof(*dev_info));
+}
+
+static int
+pmd_dev_configure(struct rte_eth_dev *dev)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct rte_eth_dev *hard_dev = PRIV_TO_HARD_DEV(p);
+
+ if (dev->data->nb_rx_queues > hard_dev->data->nb_rx_queues)
+ return -1;
+
+ if (p->params.hard.tx_queue_id >= hard_dev->data->nb_tx_queues)
+ return -1;
+
+ return 0;
+}
+
+static int
+pmd_rx_queue_setup(struct rte_eth_dev *dev,
+ uint16_t rx_queue_id,
+ uint16_t nb_rx_desc __rte_unused,
+ unsigned int socket_id,
+ const struct rte_eth_rxconf *rx_conf __rte_unused,
+ struct rte_mempool *mb_pool __rte_unused)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ if (p->params.soft.intrusive == 0) {
+ struct pmd_rx_queue *rxq;
+
+ rxq = rte_zmalloc_socket(p->params.soft.name,
+ sizeof(struct pmd_rx_queue), 0, socket_id);
+ if (rxq == NULL)
+ return -ENOMEM;
+
+ rxq->hard.port_id = p->hard.port_id;
+ rxq->hard.rx_queue_id = rx_queue_id;
+ dev->data->rx_queues[rx_queue_id] = rxq;
+ } else {
+ struct rte_eth_dev *hard_dev = PRIV_TO_HARD_DEV(p);
+ void *rxq = hard_dev->data->rx_queues[rx_queue_id];
+
+ if (rxq == NULL)
+ return -1;
+
+ dev->data->rx_queues[rx_queue_id] = rxq;
+ }
+ return 0;
+}
+
+static int
+pmd_tx_queue_setup(struct rte_eth_dev *dev,
+ uint16_t tx_queue_id,
+ uint16_t nb_tx_desc,
+ unsigned int socket_id,
+ const struct rte_eth_txconf *tx_conf __rte_unused)
+{
+ uint32_t size = RTE_ETH_NAME_MAX_LEN + strlen("_txq") + 4;
+ char name[size];
+ struct rte_ring *r;
+
+ snprintf(name, sizeof(name), "%s_txq%04x",
+ dev->data->name, tx_queue_id);
+ r = rte_ring_create(name, nb_tx_desc, socket_id,
+ RING_F_SP_ENQ | RING_F_SC_DEQ);
+ if (r == NULL)
+ return -1;
+
+ dev->data->tx_queues[tx_queue_id] = r;
+ return 0;
+}
+
+static int
+pmd_dev_start(struct rte_eth_dev *dev)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ dev->data->dev_link.link_status = ETH_LINK_UP;
+
+ if (p->params.soft.intrusive) {
+ struct rte_eth_dev *hard_dev = PRIV_TO_HARD_DEV(p);
+
+ /* The hard_dev->rx_pkt_burst should be stable by now */
+ dev->rx_pkt_burst = hard_dev->rx_pkt_burst;
+ }
+
+ return 0;
+}
+
+static void
+pmd_dev_stop(struct rte_eth_dev *dev)
+{
+ dev->data->dev_link.link_status = ETH_LINK_DOWN;
+}
+
+static void
+pmd_dev_close(struct rte_eth_dev *dev)
+{
+ uint32_t i;
+
+ /* TX queues */
+ for (i = 0; i < dev->data->nb_tx_queues; i++)
+ rte_ring_free((struct rte_ring *)dev->data->tx_queues[i]);
+}
+
+static int
+pmd_link_update(struct rte_eth_dev *dev __rte_unused,
+ int wait_to_complete __rte_unused)
+{
+ return 0;
+}
+
+static const struct eth_dev_ops pmd_ops = {
+ .dev_configure = pmd_dev_configure,
+ .dev_start = pmd_dev_start,
+ .dev_stop = pmd_dev_stop,
+ .dev_close = pmd_dev_close,
+ .link_update = pmd_link_update,
+ .dev_infos_get = pmd_dev_infos_get,
+ .rx_queue_setup = pmd_rx_queue_setup,
+ .tx_queue_setup = pmd_tx_queue_setup,
+ .tm_ops_get = NULL,
+};
+
+static uint16_t
+pmd_rx_pkt_burst(void *rxq,
+ struct rte_mbuf **rx_pkts,
+ uint16_t nb_pkts)
+{
+ struct pmd_rx_queue *rx_queue = rxq;
+
+ return rte_eth_rx_burst(rx_queue->hard.port_id,
+ rx_queue->hard.rx_queue_id,
+ rx_pkts,
+ nb_pkts);
+}
+
+static uint16_t
+pmd_tx_pkt_burst(void *txq,
+ struct rte_mbuf **tx_pkts,
+ uint16_t nb_pkts)
+{
+ return (uint16_t)rte_ring_enqueue_burst(txq,
+ (void **)tx_pkts,
+ nb_pkts,
+ NULL);
+}
+
+static __rte_always_inline int
+rte_pmd_softnic_run_default(struct rte_eth_dev *dev)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ /* Persistent context: Read Only (update not required) */
+ struct rte_mbuf **pkts = p->soft.def.pkts;
+ uint16_t nb_tx_queues = dev->data->nb_tx_queues;
+
+ /* Persistent context: Read - Write (update required) */
+ uint32_t txq_pos = p->soft.def.txq_pos;
+ uint32_t pkts_len = p->soft.def.pkts_len;
+ uint32_t flush_count = p->soft.def.flush_count;
+
+ /* Not part of the persistent context */
+ uint32_t pos;
+ uint16_t i;
+
+ /* Soft device TXQ read, Hard device TXQ write */
+ for (i = 0; i < nb_tx_queues; i++) {
+ struct rte_ring *txq = dev->data->tx_queues[txq_pos];
+
+ /* Read soft device TXQ burst to packet enqueue buffer */
+ pkts_len += rte_ring_sc_dequeue_burst(txq,
+ (void **)&pkts[pkts_len],
+ DEFAULT_BURST_SIZE,
+ NULL);
+
+ /* Increment soft device TXQ */
+ txq_pos++;
+ if (txq_pos >= nb_tx_queues)
+ txq_pos = 0;
+
+ /* Hard device TXQ write when complete burst is available */
+ if (pkts_len >= DEFAULT_BURST_SIZE) {
+ for (pos = 0; pos < pkts_len; )
+ pos += rte_eth_tx_burst(p->hard.port_id,
+ p->params.hard.tx_queue_id,
+ &pkts[pos],
+ (uint16_t)(pkts_len - pos));
+
+ pkts_len = 0;
+ flush_count = 0;
+ break;
+ }
+ }
+
+ if (flush_count >= FLUSH_COUNT_THRESHOLD) {
+ for (pos = 0; pos < pkts_len; )
+ pos += rte_eth_tx_burst(p->hard.port_id,
+ p->params.hard.tx_queue_id,
+ &pkts[pos],
+ (uint16_t)(pkts_len - pos));
+
+ pkts_len = 0;
+ flush_count = 0;
+ }
+
+ p->soft.def.txq_pos = txq_pos;
+ p->soft.def.pkts_len = pkts_len;
+ p->soft.def.flush_count = flush_count + 1;
+
+ return 0;
+}
+
+int
+rte_pmd_softnic_run(uint8_t port_id)
+{
+ struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+
+#ifdef RTE_LIBRTE_ETHDEV_DEBUG
+ RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, 0);
+#endif
+
+ return rte_pmd_softnic_run_default(dev);
+}
+
+static struct ether_addr eth_addr = { .addr_bytes = {0} };
+
+static uint32_t
+eth_dev_speed_max_mbps(uint32_t speed_capa)
+{
+ uint32_t rate_mbps[32] = {
+ ETH_SPEED_NUM_NONE,
+ ETH_SPEED_NUM_10M,
+ ETH_SPEED_NUM_10M,
+ ETH_SPEED_NUM_100M,
+ ETH_SPEED_NUM_100M,
+ ETH_SPEED_NUM_1G,
+ ETH_SPEED_NUM_2_5G,
+ ETH_SPEED_NUM_5G,
+ ETH_SPEED_NUM_10G,
+ ETH_SPEED_NUM_20G,
+ ETH_SPEED_NUM_25G,
+ ETH_SPEED_NUM_40G,
+ ETH_SPEED_NUM_50G,
+ ETH_SPEED_NUM_56G,
+ ETH_SPEED_NUM_100G,
+ };
+
+ uint32_t pos = (speed_capa) ? (31 - __builtin_clz(speed_capa)) : 0;
+ return rate_mbps[pos];
+}
+
+static int
+default_init(struct pmd_internals *p,
+ struct pmd_params *params,
+ int numa_node)
+{
+ p->soft.def.pkts = rte_zmalloc_socket(params->soft.name,
+ 2 * DEFAULT_BURST_SIZE * sizeof(struct rte_mbuf *),
+ 0,
+ numa_node);
+
+ if (p->soft.def.pkts == NULL)
+ return -ENOMEM;
+
+ return 0;
+}
+
+static void
+default_free(struct pmd_internals *p)
+{
+ free((void *)p->params.hard.name);
+ rte_free(p->soft.def.pkts);
+}
+
+static void *
+pmd_init(struct pmd_params *params, int numa_node)
+{
+ struct pmd_internals *p;
+ int status;
+
+ p = rte_zmalloc_socket(params->soft.name,
+ sizeof(struct pmd_internals),
+ 0,
+ numa_node);
+ if (p == NULL)
+ return NULL;
+
+ memcpy(&p->params, params, sizeof(p->params));
+ status = rte_eth_dev_get_port_by_name(params->hard.name,
+ &p->hard.port_id);
+ if (status) {
+ rte_free(p);
+ return NULL;
+ }
+
+ /* Default */
+ status = default_init(p, params, numa_node);
+ if (status) {
+ rte_free(p);
+ return NULL;
+ }
+
+ return p;
+}
+
+static void
+pmd_free(struct pmd_internals *p)
+{
+ default_free(p);
+
+ rte_free(p);
+}
+
+static int
+pmd_ethdev_register(struct rte_vdev_device *vdev,
+ struct pmd_params *params,
+ void *dev_private)
+{
+ struct rte_eth_dev_info hard_info;
+ struct rte_eth_dev *soft_dev;
+ uint32_t hard_speed;
+ int numa_node;
+ uint8_t hard_port_id;
+
+ rte_eth_dev_get_port_by_name(params->hard.name, &hard_port_id);
+ rte_eth_dev_info_get(hard_port_id, &hard_info);
+ hard_speed = eth_dev_speed_max_mbps(hard_info.speed_capa);
+ numa_node = rte_eth_dev_socket_id(hard_port_id);
+
+ /* Ethdev entry allocation */
+ soft_dev = rte_eth_dev_allocate(params->soft.name);
+ if (!soft_dev)
+ return -ENOMEM;
+
+ /* dev */
+ soft_dev->rx_pkt_burst = (params->soft.intrusive) ?
+ NULL : /* set up later */
+ pmd_rx_pkt_burst;
+ soft_dev->tx_pkt_burst = pmd_tx_pkt_burst;
+ soft_dev->tx_pkt_prepare = NULL;
+ soft_dev->dev_ops = &pmd_ops;
+ soft_dev->device = &vdev->device;
+
+ /* dev->data */
+ soft_dev->data->dev_private = dev_private;
+ soft_dev->data->dev_link.link_speed = hard_speed;
+ soft_dev->data->dev_link.link_duplex = ETH_LINK_FULL_DUPLEX;
+ soft_dev->data->dev_link.link_autoneg = ETH_LINK_SPEED_FIXED;
+ soft_dev->data->dev_link.link_status = ETH_LINK_DOWN;
+ soft_dev->data->mac_addrs = ð_addr;
+ soft_dev->data->promiscuous = 1;
+ soft_dev->data->kdrv = RTE_KDRV_NONE;
+ soft_dev->data->numa_node = numa_node;
+ soft_dev->data->dev_flags = RTE_ETH_DEV_DETACHABLE;
+
+ return 0;
+}
+
+static int
+get_string(const char *key __rte_unused, const char *value, void *extra_args)
+{
+ if (!value || !extra_args)
+ return -EINVAL;
+
+ *(char **)extra_args = strdup(value);
+
+ if (!*(char **)extra_args)
+ return -ENOMEM;
+
+ return 0;
+}
+
+static int
+get_uint32(const char *key __rte_unused, const char *value, void *extra_args)
+{
+ if (!value || !extra_args)
+ return -EINVAL;
+
+ *(uint32_t *)extra_args = strtoull(value, NULL, 0);
+
+ return 0;
+}
+
+static int
+pmd_parse_args(struct pmd_params *p, const char *name, const char *params)
+{
+ struct rte_kvargs *kvlist;
+ int ret;
+
+ kvlist = rte_kvargs_parse(params, pmd_valid_args);
+ if (kvlist == NULL)
+ return -EINVAL;
+
+ /* Set default values */
+ memset(p, 0, sizeof(*p));
+ p->soft.name = name;
+ p->soft.intrusive = INTRUSIVE;
+ p->hard.tx_queue_id = SOFTNIC_HARD_TX_QUEUE_ID;
+
+ /* HARD: name (mandatory) */
+ if (rte_kvargs_count(kvlist, PMD_PARAM_HARD_NAME) == 1) {
+ ret = rte_kvargs_process(kvlist, PMD_PARAM_HARD_NAME,
+ &get_string, &p->hard.name);
+ if (ret < 0)
+ goto out_free;
+ } else {
+ ret = -EINVAL;
+ goto out_free;
+ }
+
+ /* HARD: tx_queue_id (optional) */
+ if (rte_kvargs_count(kvlist, PMD_PARAM_HARD_TX_QUEUE_ID) == 1) {
+ ret = rte_kvargs_process(kvlist, PMD_PARAM_HARD_TX_QUEUE_ID,
+ &get_uint32, &p->hard.tx_queue_id);
+ if (ret < 0)
+ goto out_free;
+ }
+
+out_free:
+ rte_kvargs_free(kvlist);
+ return ret;
+}
+
+static int
+pmd_probe(struct rte_vdev_device *vdev)
+{
+ struct pmd_params p;
+ const char *params;
+ int status;
+
+ struct rte_eth_dev_info hard_info;
+ uint8_t hard_port_id;
+ int numa_node;
+ void *dev_private;
+
+ RTE_LOG(INFO, PMD,
+ "Probing device \"%s\"\n",
+ rte_vdev_device_name(vdev));
+
+ /* Parse input arguments */
+ params = rte_vdev_device_args(vdev);
+ if (!params)
+ return -EINVAL;
+
+ status = pmd_parse_args(&p, rte_vdev_device_name(vdev), params);
+ if (status)
+ return status;
+
+ /* Check input arguments */
+ if (rte_eth_dev_get_port_by_name(p.hard.name, &hard_port_id))
+ return -EINVAL;
+
+ rte_eth_dev_info_get(hard_port_id, &hard_info);
+ numa_node = rte_eth_dev_socket_id(hard_port_id);
+
+ if (p.hard.tx_queue_id >= hard_info.max_tx_queues)
+ return -EINVAL;
+
+ /* Allocate and initialize soft ethdev private data */
+ dev_private = pmd_init(&p, numa_node);
+ if (dev_private == NULL)
+ return -ENOMEM;
+
+ /* Register soft ethdev */
+ RTE_LOG(INFO, PMD,
+ "Creating soft ethdev \"%s\" for hard ethdev \"%s\"\n",
+ p.soft.name, p.hard.name);
+
+ status = pmd_ethdev_register(vdev, &p, dev_private);
+ if (status) {
+ pmd_free(dev_private);
+ return status;
+ }
+
+ return 0;
+}
+
+static int
+pmd_remove(struct rte_vdev_device *vdev)
+{
+ struct rte_eth_dev *dev = NULL;
+ struct pmd_internals *p;
+
+ if (!vdev)
+ return -EINVAL;
+
+ RTE_LOG(INFO, PMD, "Removing device \"%s\"\n",
+ rte_vdev_device_name(vdev));
+
+ /* Find the ethdev entry */
+ dev = rte_eth_dev_allocated(rte_vdev_device_name(vdev));
+ if (dev == NULL)
+ return -ENODEV;
+ p = dev->data->dev_private;
+
+ /* Free device data structures*/
+ pmd_free(p);
+ rte_free(dev->data);
+ rte_eth_dev_release_port(dev);
+
+ return 0;
+}
+
+static struct rte_vdev_driver pmd_softnic_drv = {
+ .probe = pmd_probe,
+ .remove = pmd_remove,
+};
+
+RTE_PMD_REGISTER_VDEV(net_softnic, pmd_softnic_drv);
+RTE_PMD_REGISTER_PARAM_STRING(net_softnic,
+ PMD_PARAM_HARD_NAME "=<string> "
+ PMD_PARAM_HARD_TX_QUEUE_ID "=<int>");
diff --git a/drivers/net/softnic/rte_eth_softnic.h b/drivers/net/softnic/rte_eth_softnic.h
new file mode 100644
index 0000000..e6996f3
--- /dev/null
+++ b/drivers/net/softnic/rte_eth_softnic.h
@@ -0,0 +1,67 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright(c) 2017 Intel Corporation. All rights reserved.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef __INCLUDE_RTE_ETH_SOFTNIC_H__
+#define __INCLUDE_RTE_ETH_SOFTNIC_H__
+
+#include <stdint.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#ifndef SOFTNIC_HARD_TX_QUEUE_ID
+#define SOFTNIC_HARD_TX_QUEUE_ID 0
+#endif
+
+/**
+ * Run the traffic management function on the softnic device
+ *
+ * This function reads packets from the softnic input queues, inserts them
+ * into the QoS scheduler queues based on the mbuf sched field value, and
+ * transmits the scheduled packets out through the hard device interface.
+ *
+ * @param port_id
+ *   Port ID of the soft device.
+ * @return
+ *   Zero.
+ */
+
+int
+rte_pmd_softnic_run(uint8_t port_id);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* __INCLUDE_RTE_ETH_SOFTNIC_H__ */
diff --git a/drivers/net/softnic/rte_eth_softnic_internals.h b/drivers/net/softnic/rte_eth_softnic_internals.h
new file mode 100644
index 0000000..96995b5
--- /dev/null
+++ b/drivers/net/softnic/rte_eth_softnic_internals.h
@@ -0,0 +1,114 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright(c) 2017 Intel Corporation. All rights reserved.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef __INCLUDE_RTE_ETH_SOFTNIC_INTERNALS_H__
+#define __INCLUDE_RTE_ETH_SOFTNIC_INTERNALS_H__
+
+#include <stdint.h>
+
+#include <rte_mbuf.h>
+#include <rte_ethdev.h>
+
+#include "rte_eth_softnic.h"
+
+#ifndef INTRUSIVE
+#define INTRUSIVE 0
+#endif
+
+struct pmd_params {
+ /** Parameters for the soft device (to be created) */
+ struct {
+ const char *name; /**< Name */
+ uint32_t flags; /**< Flags */
+
+ /** 0 = Access hard device through API only (potentially slower,
+ * but safer);
+ * 1 = Access hard device private data structures is allowed
+ * (potentially faster).
+ */
+ int intrusive;
+ } soft;
+
+ /** Parameters for the hard device (existing) */
+ struct {
+ char *name; /**< Name */
+ uint16_t tx_queue_id; /**< TX queue ID */
+ } hard;
+};
+
+/**
+ * Default Internals
+ */
+
+#ifndef DEFAULT_BURST_SIZE
+#define DEFAULT_BURST_SIZE 32
+#endif
+
+#ifndef FLUSH_COUNT_THRESHOLD
+#define FLUSH_COUNT_THRESHOLD (1 << 17)
+#endif
+
+struct default_internals {
+ struct rte_mbuf **pkts;
+ uint32_t pkts_len;
+ uint32_t txq_pos;
+ uint32_t flush_count;
+};
+
+/**
+ * PMD Internals
+ */
+struct pmd_internals {
+ /** Params */
+ struct pmd_params params;
+
+ /** Soft device */
+ struct {
+ struct default_internals def; /**< Default */
+ } soft;
+
+ /** Hard device */
+ struct {
+ uint8_t port_id;
+ } hard;
+};
+
+struct pmd_rx_queue {
+ /** Hard device */
+ struct {
+ uint8_t port_id;
+ uint16_t rx_queue_id;
+ } hard;
+};
+
+#endif /* __INCLUDE_RTE_ETH_SOFTNIC_INTERNALS_H__ */
diff --git a/drivers/net/softnic/rte_pmd_eth_softnic_version.map b/drivers/net/softnic/rte_pmd_eth_softnic_version.map
new file mode 100644
index 0000000..fb2cb68
--- /dev/null
+++ b/drivers/net/softnic/rte_pmd_eth_softnic_version.map
@@ -0,0 +1,7 @@
+DPDK_17.11 {
+ global:
+
+ rte_pmd_softnic_run;
+
+ local: *;
+};
diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index c25fdd9..3dc82fb 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -67,7 +67,6 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR) += -lrte_distributor
_LDLIBS-$(CONFIG_RTE_LIBRTE_IP_FRAG) += -lrte_ip_frag
_LDLIBS-$(CONFIG_RTE_LIBRTE_GRO) += -lrte_gro
_LDLIBS-$(CONFIG_RTE_LIBRTE_METER) += -lrte_meter
-_LDLIBS-$(CONFIG_RTE_LIBRTE_SCHED) += -lrte_sched
_LDLIBS-$(CONFIG_RTE_LIBRTE_LPM) += -lrte_lpm
# librte_acl needs --whole-archive because of weak functions
_LDLIBS-$(CONFIG_RTE_LIBRTE_ACL) += --whole-archive
@@ -99,6 +98,7 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_RING) += -lrte_ring
_LDLIBS-$(CONFIG_RTE_LIBRTE_EAL) += -lrte_eal
_LDLIBS-$(CONFIG_RTE_LIBRTE_CMDLINE) += -lrte_cmdline
_LDLIBS-$(CONFIG_RTE_LIBRTE_REORDER) += -lrte_reorder
+_LDLIBS-$(CONFIG_RTE_LIBRTE_SCHED) += -lrte_sched
ifeq ($(CONFIG_RTE_EXEC_ENV_LINUXAPP),y)
_LDLIBS-$(CONFIG_RTE_LIBRTE_KNI) += -lrte_kni
@@ -135,6 +135,9 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_NULL) += -lrte_pmd_null
_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_PCAP) += -lrte_pmd_pcap -lpcap
_LDLIBS-$(CONFIG_RTE_LIBRTE_QEDE_PMD) += -lrte_pmd_qede
_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_RING) += -lrte_pmd_ring
+ifeq ($(CONFIG_RTE_LIBRTE_SCHED),y)
+_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_SOFTNIC) += -lrte_pmd_softnic
+endif
_LDLIBS-$(CONFIG_RTE_LIBRTE_SFC_EFX_PMD) += -lrte_pmd_sfc_efx
_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_SZEDATA2) += -lrte_pmd_szedata2 -lsze2
_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_TAP) += -lrte_pmd_tap
--
2.9.3
^ permalink raw reply [flat|nested] 79+ messages in thread
* [dpdk-dev] [PATCH v4 2/4] net/softnic: add traffic management support
2017-09-18 9:10 ` [dpdk-dev] [PATCH v4 0/4] net/softnic: sw fall-back pmd for traffic mgmt and others Jasvinder Singh
2017-09-18 9:10 ` [dpdk-dev] [PATCH v4 1/4] net/softnic: add softnic PMD Jasvinder Singh
@ 2017-09-18 9:10 ` Jasvinder Singh
2017-09-25 1:58 ` Lu, Wenzhuo
2017-09-29 14:04 ` [dpdk-dev] [PATCH v5 0/5] net/softnic: sw fall-back pmd for traffic mgmt and others Jasvinder Singh
2017-09-18 9:10 ` [dpdk-dev] [PATCH v4 3/4] net/softnic: add TM capabilities ops Jasvinder Singh
` (2 subsequent siblings)
4 siblings, 2 replies; 79+ messages in thread
From: Jasvinder Singh @ 2017-09-18 9:10 UTC (permalink / raw)
To: dev; +Cc: cristian.dumitrescu, ferruh.yigit, thomas
Add ethdev Traffic Management API support to SoftNIC PMD.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
---
v3 changes:
- add more configuration parameters (tm rate, tm queue sizes)
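For reference, a vdev argument string exercising the parameters added in this
patch (values are illustrative: 1250000000 bytes/s matches a 10 GbE hard port,
queue sizes are rounded up to powers of two, and soft_tm_deq_bsz must be
smaller than soft_tm_enq_bsz):

  --vdev 'net_softnic0,hard_name=0000:04:00.1,soft_tm=on,soft_tm_rate=1250000000,soft_tm_nb_queues=65536,soft_tm_qsize0=64,soft_tm_qsize1=64,soft_tm_qsize2=64,soft_tm_qsize3=64,soft_tm_enq_bsz=32,soft_tm_deq_bsz=24,hard_tx_queue_id=0'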
drivers/net/softnic/Makefile | 1 +
drivers/net/softnic/rte_eth_softnic.c | 252 +++++++++++++++++++++++-
drivers/net/softnic/rte_eth_softnic.h | 16 ++
drivers/net/softnic/rte_eth_softnic_internals.h | 106 +++++++++-
drivers/net/softnic/rte_eth_softnic_tm.c | 181 +++++++++++++++++
5 files changed, 553 insertions(+), 3 deletions(-)
create mode 100644 drivers/net/softnic/rte_eth_softnic_tm.c
diff --git a/drivers/net/softnic/Makefile b/drivers/net/softnic/Makefile
index c2f42ef..8b848a9 100644
--- a/drivers/net/softnic/Makefile
+++ b/drivers/net/softnic/Makefile
@@ -47,6 +47,7 @@ LIBABIVER := 1
# all source are stored in SRCS-y
#
SRCS-$(CONFIG_RTE_LIBRTE_PMD_SOFTNIC) += rte_eth_softnic.c
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_SOFTNIC) += rte_eth_softnic_tm.c
#
# Export include files
diff --git a/drivers/net/softnic/rte_eth_softnic.c b/drivers/net/softnic/rte_eth_softnic.c
index 792e7ea..28b5155 100644
--- a/drivers/net/softnic/rte_eth_softnic.c
+++ b/drivers/net/softnic/rte_eth_softnic.c
@@ -42,6 +42,7 @@
#include <rte_kvargs.h>
#include <rte_errno.h>
#include <rte_ring.h>
+#include <rte_sched.h>
#include "rte_eth_softnic.h"
#include "rte_eth_softnic_internals.h"
@@ -49,10 +50,29 @@
#define PRIV_TO_HARD_DEV(p) \
(&rte_eth_devices[p->hard.port_id])
+#define PMD_PARAM_SOFT_TM "soft_tm"
+#define PMD_PARAM_SOFT_TM_RATE "soft_tm_rate"
+#define PMD_PARAM_SOFT_TM_NB_QUEUES "soft_tm_nb_queues"
+#define PMD_PARAM_SOFT_TM_QSIZE0 "soft_tm_qsize0"
+#define PMD_PARAM_SOFT_TM_QSIZE1 "soft_tm_qsize1"
+#define PMD_PARAM_SOFT_TM_QSIZE2 "soft_tm_qsize2"
+#define PMD_PARAM_SOFT_TM_QSIZE3 "soft_tm_qsize3"
+#define PMD_PARAM_SOFT_TM_ENQ_BSZ "soft_tm_enq_bsz"
+#define PMD_PARAM_SOFT_TM_DEQ_BSZ "soft_tm_deq_bsz"
+
#define PMD_PARAM_HARD_NAME "hard_name"
#define PMD_PARAM_HARD_TX_QUEUE_ID "hard_tx_queue_id"
static const char *pmd_valid_args[] = {
+ PMD_PARAM_SOFT_TM,
+ PMD_PARAM_SOFT_TM_RATE,
+ PMD_PARAM_SOFT_TM_NB_QUEUES,
+ PMD_PARAM_SOFT_TM_QSIZE0,
+ PMD_PARAM_SOFT_TM_QSIZE1,
+ PMD_PARAM_SOFT_TM_QSIZE2,
+ PMD_PARAM_SOFT_TM_QSIZE3,
+ PMD_PARAM_SOFT_TM_ENQ_BSZ,
+ PMD_PARAM_SOFT_TM_DEQ_BSZ,
PMD_PARAM_HARD_NAME,
PMD_PARAM_HARD_TX_QUEUE_ID,
NULL
@@ -157,6 +177,13 @@ pmd_dev_start(struct rte_eth_dev *dev)
{
struct pmd_internals *p = dev->data->dev_private;
+ if (tm_used(dev)) {
+ int status = tm_start(p);
+
+ if (status)
+ return status;
+ }
+
dev->data->dev_link.link_status = ETH_LINK_UP;
if (p->params.soft.intrusive) {
@@ -172,7 +199,12 @@ pmd_dev_start(struct rte_eth_dev *dev)
static void
pmd_dev_stop(struct rte_eth_dev *dev)
{
+ struct pmd_internals *p = dev->data->dev_private;
+
dev->data->dev_link.link_status = ETH_LINK_DOWN;
+
+ if (tm_used(dev))
+ tm_stop(p);
}
static void
@@ -293,6 +325,77 @@ rte_pmd_softnic_run_default(struct rte_eth_dev *dev)
return 0;
}
+static __rte_always_inline int
+rte_pmd_softnic_run_tm(struct rte_eth_dev *dev)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ /* Persistent context: Read Only (update not required) */
+ struct rte_sched_port *sched = p->soft.tm.sched;
+ struct rte_mbuf **pkts_enq = p->soft.tm.pkts_enq;
+ struct rte_mbuf **pkts_deq = p->soft.tm.pkts_deq;
+ uint32_t enq_bsz = p->params.soft.tm.enq_bsz;
+ uint32_t deq_bsz = p->params.soft.tm.deq_bsz;
+ uint16_t nb_tx_queues = dev->data->nb_tx_queues;
+
+ /* Persistent context: Read - Write (update required) */
+ uint32_t txq_pos = p->soft.tm.txq_pos;
+ uint32_t pkts_enq_len = p->soft.tm.pkts_enq_len;
+ uint32_t flush_count = p->soft.tm.flush_count;
+
+ /* Not part of the persistent context */
+ uint32_t pkts_deq_len, pos;
+ uint16_t i;
+
+ /* Soft device TXQ read, TM enqueue */
+ for (i = 0; i < nb_tx_queues; i++) {
+ struct rte_ring *txq = dev->data->tx_queues[txq_pos];
+
+ /* Read TXQ burst to packet enqueue buffer */
+ pkts_enq_len += rte_ring_sc_dequeue_burst(txq,
+ (void **)&pkts_enq[pkts_enq_len],
+ enq_bsz,
+ NULL);
+
+ /* Increment TXQ */
+ txq_pos++;
+ if (txq_pos >= nb_tx_queues)
+ txq_pos = 0;
+
+ /* TM enqueue when complete burst is available */
+ if (pkts_enq_len >= enq_bsz) {
+ rte_sched_port_enqueue(sched, pkts_enq, pkts_enq_len);
+
+ pkts_enq_len = 0;
+ flush_count = 0;
+ break;
+ }
+ }
+
+ if (flush_count >= FLUSH_COUNT_THRESHOLD) {
+ if (pkts_enq_len)
+ rte_sched_port_enqueue(sched, pkts_enq, pkts_enq_len);
+
+ pkts_enq_len = 0;
+ flush_count = 0;
+ }
+
+ p->soft.tm.txq_pos = txq_pos;
+ p->soft.tm.pkts_enq_len = pkts_enq_len;
+ p->soft.tm.flush_count = flush_count + 1;
+
+ /* TM dequeue, Hard device TXQ write */
+ pkts_deq_len = rte_sched_port_dequeue(sched, pkts_deq, deq_bsz);
+
+ for (pos = 0; pos < pkts_deq_len; )
+ pos += rte_eth_tx_burst(p->hard.port_id,
+ p->params.hard.tx_queue_id,
+ &pkts_deq[pos],
+ (uint16_t)(pkts_deq_len - pos));
+
+ return 0;
+}
+
int
rte_pmd_softnic_run(uint8_t port_id)
{
@@ -302,7 +405,9 @@ rte_pmd_softnic_run(uint8_t port_id)
RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, 0);
#endif
- return rte_pmd_softnic_run_default(dev);
+ return (tm_used(dev)) ?
+ rte_pmd_softnic_run_tm(dev) :
+ rte_pmd_softnic_run_default(dev);
}
static struct ether_addr eth_addr = { .addr_bytes = {0} };
@@ -383,12 +488,25 @@ pmd_init(struct pmd_params *params, int numa_node)
return NULL;
}
+ /* Traffic Management (TM)*/
+ if (params->soft.flags & PMD_FEATURE_TM) {
+ status = tm_init(p, params, numa_node);
+ if (status) {
+ default_free(p);
+ rte_free(p);
+ return NULL;
+ }
+ }
+
return p;
}
static void
pmd_free(struct pmd_internals *p)
{
+ if (p->params.soft.flags & PMD_FEATURE_TM)
+ tm_free(p);
+
default_free(p);
rte_free(p);
@@ -468,7 +586,7 @@ static int
pmd_parse_args(struct pmd_params *p, const char *name, const char *params)
{
struct rte_kvargs *kvlist;
- int ret;
+ int i, ret;
kvlist = rte_kvargs_parse(params, pmd_valid_args);
if (kvlist == NULL)
@@ -478,8 +596,120 @@ pmd_parse_args(struct pmd_params *p, const char *name, const char *params)
memset(p, 0, sizeof(*p));
p->soft.name = name;
p->soft.intrusive = INTRUSIVE;
+ p->soft.tm.rate = 0;
+ p->soft.tm.nb_queues = SOFTNIC_SOFT_TM_NB_QUEUES;
+ for (i = 0; i < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; i++)
+ p->soft.tm.qsize[i] = SOFTNIC_SOFT_TM_QUEUE_SIZE;
+ p->soft.tm.enq_bsz = SOFTNIC_SOFT_TM_ENQ_BSZ;
+ p->soft.tm.deq_bsz = SOFTNIC_SOFT_TM_DEQ_BSZ;
p->hard.tx_queue_id = SOFTNIC_HARD_TX_QUEUE_ID;
+ /* SOFT: TM (optional) */
+ if (rte_kvargs_count(kvlist, PMD_PARAM_SOFT_TM) == 1) {
+ char *s;
+
+ ret = rte_kvargs_process(kvlist, PMD_PARAM_SOFT_TM,
+ &get_string, &s);
+ if (ret < 0)
+ goto out_free;
+
+ if (strcmp(s, "on") == 0)
+ p->soft.flags |= PMD_FEATURE_TM;
+ else if (strcmp(s, "off") == 0)
+ p->soft.flags &= ~PMD_FEATURE_TM;
+ else
+ goto out_free;
+ }
+
+ /* SOFT: TM rate (measured in bytes/second) (optional) */
+ if (rte_kvargs_count(kvlist, PMD_PARAM_SOFT_TM_RATE) == 1) {
+ ret = rte_kvargs_process(kvlist, PMD_PARAM_SOFT_TM_RATE,
+ &get_uint32, &p->soft.tm.rate);
+ if (ret < 0)
+ goto out_free;
+
+ p->soft.flags |= PMD_FEATURE_TM;
+ }
+
+ /* SOFT: TM number of queues (optional) */
+ if (rte_kvargs_count(kvlist, PMD_PARAM_SOFT_TM_NB_QUEUES) == 1) {
+ ret = rte_kvargs_process(kvlist, PMD_PARAM_SOFT_TM_NB_QUEUES,
+ &get_uint32, &p->soft.tm.nb_queues);
+ if (ret < 0)
+ goto out_free;
+
+ p->soft.flags |= PMD_FEATURE_TM;
+ }
+
+ /* SOFT: TM queue size 0 .. 3 (optional) */
+ if (rte_kvargs_count(kvlist, PMD_PARAM_SOFT_TM_QSIZE0) == 1) {
+ uint32_t qsize;
+
+ ret = rte_kvargs_process(kvlist, PMD_PARAM_SOFT_TM_QSIZE0,
+ &get_uint32, &qsize);
+ if (ret < 0)
+ goto out_free;
+
+ p->soft.tm.qsize[0] = (uint16_t)qsize;
+ p->soft.flags |= PMD_FEATURE_TM;
+ }
+
+ if (rte_kvargs_count(kvlist, PMD_PARAM_SOFT_TM_QSIZE1) == 1) {
+ uint32_t qsize;
+
+ ret = rte_kvargs_process(kvlist, PMD_PARAM_SOFT_TM_QSIZE1,
+ &get_uint32, &qsize);
+ if (ret < 0)
+ goto out_free;
+
+ p->soft.tm.qsize[1] = (uint16_t)qsize;
+ p->soft.flags |= PMD_FEATURE_TM;
+ }
+
+ if (rte_kvargs_count(kvlist, PMD_PARAM_SOFT_TM_QSIZE2) == 1) {
+ uint32_t qsize;
+
+ ret = rte_kvargs_process(kvlist, PMD_PARAM_SOFT_TM_QSIZE2,
+ &get_uint32, &qsize);
+ if (ret < 0)
+ goto out_free;
+
+ p->soft.tm.qsize[2] = (uint16_t)qsize;
+ p->soft.flags |= PMD_FEATURE_TM;
+ }
+
+ if (rte_kvargs_count(kvlist, PMD_PARAM_SOFT_TM_QSIZE3) == 1) {
+ uint32_t qsize;
+
+ ret = rte_kvargs_process(kvlist, PMD_PARAM_SOFT_TM_QSIZE3,
+ &get_uint32, &qsize);
+ if (ret < 0)
+ goto out_free;
+
+ p->soft.tm.qsize[3] = (uint16_t)qsize;
+ p->soft.flags |= PMD_FEATURE_TM;
+ }
+
+ /* SOFT: TM enqueue burst size (optional) */
+ if (rte_kvargs_count(kvlist, PMD_PARAM_SOFT_TM_ENQ_BSZ) == 1) {
+ ret = rte_kvargs_process(kvlist, PMD_PARAM_SOFT_TM_ENQ_BSZ,
+ &get_uint32, &p->soft.tm.enq_bsz);
+ if (ret < 0)
+ goto out_free;
+
+ p->soft.flags |= PMD_FEATURE_TM;
+ }
+
+ /* SOFT: TM dequeue burst size (optional) */
+ if (rte_kvargs_count(kvlist, PMD_PARAM_SOFT_TM_DEQ_BSZ) == 1) {
+ ret = rte_kvargs_process(kvlist, PMD_PARAM_SOFT_TM_DEQ_BSZ,
+ &get_uint32, &p->soft.tm.deq_bsz);
+ if (ret < 0)
+ goto out_free;
+
+ p->soft.flags |= PMD_FEATURE_TM;
+ }
+
/* HARD: name (mandatory) */
if (rte_kvargs_count(kvlist, PMD_PARAM_HARD_NAME) == 1) {
ret = rte_kvargs_process(kvlist, PMD_PARAM_HARD_NAME,
@@ -512,6 +742,7 @@ pmd_probe(struct rte_vdev_device *vdev)
int status;
struct rte_eth_dev_info hard_info;
+ uint32_t hard_speed;
uint8_t hard_port_id;
int numa_node;
void *dev_private;
@@ -534,11 +765,19 @@ pmd_probe(struct rte_vdev_device *vdev)
return -EINVAL;
rte_eth_dev_info_get(hard_port_id, &hard_info);
+ hard_speed = eth_dev_speed_max_mbps(hard_info.speed_capa);
numa_node = rte_eth_dev_socket_id(hard_port_id);
if (p.hard.tx_queue_id >= hard_info.max_tx_queues)
return -EINVAL;
+ if (p.soft.flags & PMD_FEATURE_TM) {
+ status = tm_params_check(&p, hard_speed);
+
+ if (status)
+ return status;
+ }
+
/* Allocate and initialize soft ethdev private data */
dev_private = pmd_init(&p, numa_node);
if (dev_private == NULL)
@@ -591,5 +830,14 @@ static struct rte_vdev_driver pmd_softnic_drv = {
RTE_PMD_REGISTER_VDEV(net_softnic, pmd_softnic_drv);
RTE_PMD_REGISTER_PARAM_STRING(net_softnic,
+ PMD_PARAM_SOFT_TM "=on|off "
+ PMD_PARAM_SOFT_TM_RATE "=<int> "
+ PMD_PARAM_SOFT_TM_NB_QUEUES "=<int> "
+ PMD_PARAM_SOFT_TM_QSIZE0 "=<int> "
+ PMD_PARAM_SOFT_TM_QSIZE1 "=<int> "
+ PMD_PARAM_SOFT_TM_QSIZE2 "=<int> "
+ PMD_PARAM_SOFT_TM_QSIZE3 "=<int> "
+ PMD_PARAM_SOFT_TM_ENQ_BSZ "=<int> "
+ PMD_PARAM_SOFT_TM_DEQ_BSZ "=<int> "
PMD_PARAM_HARD_NAME "=<string> "
PMD_PARAM_HARD_TX_QUEUE_ID "=<int>");
diff --git a/drivers/net/softnic/rte_eth_softnic.h b/drivers/net/softnic/rte_eth_softnic.h
index e6996f3..517b96a 100644
--- a/drivers/net/softnic/rte_eth_softnic.h
+++ b/drivers/net/softnic/rte_eth_softnic.h
@@ -40,6 +40,22 @@
extern "C" {
#endif
+#ifndef SOFTNIC_SOFT_TM_NB_QUEUES
+#define SOFTNIC_SOFT_TM_NB_QUEUES 65536
+#endif
+
+#ifndef SOFTNIC_SOFT_TM_QUEUE_SIZE
+#define SOFTNIC_SOFT_TM_QUEUE_SIZE 64
+#endif
+
+#ifndef SOFTNIC_SOFT_TM_ENQ_BSZ
+#define SOFTNIC_SOFT_TM_ENQ_BSZ 32
+#endif
+
+#ifndef SOFTNIC_SOFT_TM_DEQ_BSZ
+#define SOFTNIC_SOFT_TM_DEQ_BSZ 24
+#endif
+
#ifndef SOFTNIC_HARD_TX_QUEUE_ID
#define SOFTNIC_HARD_TX_QUEUE_ID 0
#endif
diff --git a/drivers/net/softnic/rte_eth_softnic_internals.h b/drivers/net/softnic/rte_eth_softnic_internals.h
index 96995b5..f9e2177 100644
--- a/drivers/net/softnic/rte_eth_softnic_internals.h
+++ b/drivers/net/softnic/rte_eth_softnic_internals.h
@@ -37,10 +37,19 @@
#include <stdint.h>
#include <rte_mbuf.h>
+#include <rte_sched.h>
#include <rte_ethdev.h>
#include "rte_eth_softnic.h"
+/**
+ * PMD Parameters
+ */
+
+enum pmd_feature {
+ PMD_FEATURE_TM = 1, /**< Traffic Management (TM) */
+};
+
#ifndef INTRUSIVE
#define INTRUSIVE 0
#endif
@@ -57,6 +66,16 @@ struct pmd_params {
* (potentially faster).
*/
int intrusive;
+
+ /** Traffic Management (TM) */
+ struct {
+ uint32_t rate; /**< Rate (bytes/second) */
+ uint32_t nb_queues; /**< Number of queues */
+ uint16_t qsize[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
+ /**< Queue size per traffic class */
+ uint32_t enq_bsz; /**< Enqueue burst size */
+ uint32_t deq_bsz; /**< Dequeue burst size */
+ } tm;
} soft;
/** Parameters for the hard device (existing) */
@@ -75,7 +94,7 @@ struct pmd_params {
#endif
#ifndef FLUSH_COUNT_THRESHOLD
-#define FLUSH_COUNT_THRESHOLD (1 << 17)
+#define FLUSH_COUNT_THRESHOLD (1 << 17)
#endif
struct default_internals {
@@ -86,6 +105,66 @@ struct default_internals {
};
/**
+ * Traffic Management (TM) Internals
+ */
+
+#ifndef TM_MAX_SUBPORTS
+#define TM_MAX_SUBPORTS 8
+#endif
+
+#ifndef TM_MAX_PIPES_PER_SUBPORT
+#define TM_MAX_PIPES_PER_SUBPORT 4096
+#endif
+
+struct tm_params {
+ struct rte_sched_port_params port_params;
+
+ struct rte_sched_subport_params subport_params[TM_MAX_SUBPORTS];
+
+ struct rte_sched_pipe_params
+ pipe_profiles[RTE_SCHED_PIPE_PROFILES_PER_PORT];
+ uint32_t n_pipe_profiles;
+ uint32_t pipe_to_profile[TM_MAX_SUBPORTS * TM_MAX_PIPES_PER_SUBPORT];
+};
+
+/* TM Levels */
+enum tm_node_level {
+ TM_NODE_LEVEL_PORT = 0,
+ TM_NODE_LEVEL_SUBPORT,
+ TM_NODE_LEVEL_PIPE,
+ TM_NODE_LEVEL_TC,
+ TM_NODE_LEVEL_QUEUE,
+ TM_NODE_LEVEL_MAX,
+};
+
+/* TM Hierarchy Specification */
+struct tm_hierarchy {
+ uint32_t n_tm_nodes[TM_NODE_LEVEL_MAX];
+};
+
+struct tm_internals {
+ /** Hierarchy specification
+ *
+ * -Hierarchy is unfrozen at init and when port is stopped.
+ * -Hierarchy is frozen on successful hierarchy commit.
+ * -Run-time hierarchy changes are not allowed, therefore it makes
+ * sense to keep the hierarchy frozen after the port is started.
+ */
+ struct tm_hierarchy h;
+
+ /** Blueprints */
+ struct tm_params params;
+
+ /** Run-time */
+ struct rte_sched_port *sched;
+ struct rte_mbuf **pkts_enq;
+ struct rte_mbuf **pkts_deq;
+ uint32_t pkts_enq_len;
+ uint32_t txq_pos;
+ uint32_t flush_count;
+};
+
+/**
* PMD Internals
*/
struct pmd_internals {
@@ -95,6 +174,7 @@ struct pmd_internals {
/** Soft device */
struct {
struct default_internals def; /**< Default */
+ struct tm_internals tm; /**< Traffic Management */
} soft;
/** Hard device */
@@ -111,4 +191,28 @@ struct pmd_rx_queue {
} hard;
};
+int
+tm_params_check(struct pmd_params *params, uint32_t hard_rate);
+
+int
+tm_init(struct pmd_internals *p, struct pmd_params *params, int numa_node);
+
+void
+tm_free(struct pmd_internals *p);
+
+int
+tm_start(struct pmd_internals *p);
+
+void
+tm_stop(struct pmd_internals *p);
+
+static inline int
+tm_used(struct rte_eth_dev *dev)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ return (p->params.soft.flags & PMD_FEATURE_TM) &&
+ p->soft.tm.h.n_tm_nodes[TM_NODE_LEVEL_PORT];
+}
+
#endif /* __INCLUDE_RTE_ETH_SOFTNIC_INTERNALS_H__ */
diff --git a/drivers/net/softnic/rte_eth_softnic_tm.c b/drivers/net/softnic/rte_eth_softnic_tm.c
new file mode 100644
index 0000000..bb28798
--- /dev/null
+++ b/drivers/net/softnic/rte_eth_softnic_tm.c
@@ -0,0 +1,181 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright(c) 2017 Intel Corporation. All rights reserved.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdint.h>
+#include <stdlib.h>
+#include <string.h>
+
+#include <rte_malloc.h>
+
+#include "rte_eth_softnic_internals.h"
+#include "rte_eth_softnic.h"
+
+#define BYTES_IN_MBPS (1000 * 1000 / 8)
+
+int
+tm_params_check(struct pmd_params *params, uint32_t hard_rate)
+{
+ uint64_t hard_rate_bytes_per_sec = hard_rate * BYTES_IN_MBPS;
+ uint32_t i;
+
+ /* rate */
+ if (params->soft.tm.rate) {
+ if (params->soft.tm.rate > hard_rate_bytes_per_sec)
+ return -EINVAL;
+ } else {
+ params->soft.tm.rate =
+ (hard_rate_bytes_per_sec > UINT32_MAX) ?
+ UINT32_MAX : hard_rate_bytes_per_sec;
+ }
+
+ /* nb_queues */
+ if (params->soft.tm.nb_queues == 0)
+ return -EINVAL;
+
+ if (params->soft.tm.nb_queues < RTE_SCHED_QUEUES_PER_PIPE)
+ params->soft.tm.nb_queues = RTE_SCHED_QUEUES_PER_PIPE;
+
+ params->soft.tm.nb_queues =
+ rte_align32pow2(params->soft.tm.nb_queues);
+
+ /* qsize */
+ for (i = 0; i < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; i++) {
+ if (params->soft.tm.qsize[i] == 0)
+ return -EINVAL;
+
+ params->soft.tm.qsize[i] =
+ rte_align32pow2(params->soft.tm.qsize[i]);
+ }
+
+ /* enq_bsz, deq_bsz */
+ if ((params->soft.tm.enq_bsz == 0) ||
+ (params->soft.tm.deq_bsz == 0) ||
+ (params->soft.tm.deq_bsz >= params->soft.tm.enq_bsz))
+ return -EINVAL;
+
+ return 0;
+}
+
+int
+tm_init(struct pmd_internals *p,
+ struct pmd_params *params,
+ int numa_node)
+{
+ uint32_t enq_bsz = params->soft.tm.enq_bsz;
+ uint32_t deq_bsz = params->soft.tm.deq_bsz;
+
+ p->soft.tm.pkts_enq = rte_zmalloc_socket(params->soft.name,
+ 2 * enq_bsz * sizeof(struct rte_mbuf *),
+ 0,
+ numa_node);
+
+ if (p->soft.tm.pkts_enq == NULL)
+ return -ENOMEM;
+
+ p->soft.tm.pkts_deq = rte_zmalloc_socket(params->soft.name,
+ deq_bsz * sizeof(struct rte_mbuf *),
+ 0,
+ numa_node);
+
+ if (p->soft.tm.pkts_deq == NULL) {
+ rte_free(p->soft.tm.pkts_enq);
+ return -ENOMEM;
+ }
+
+ return 0;
+}
+
+void
+tm_free(struct pmd_internals *p)
+{
+ rte_free(p->soft.tm.pkts_enq);
+ rte_free(p->soft.tm.pkts_deq);
+}
+
+int
+tm_start(struct pmd_internals *p)
+{
+ struct tm_params *t = &p->soft.tm.params;
+ uint32_t n_subports, subport_id;
+ int status;
+
+ /* Port */
+ p->soft.tm.sched = rte_sched_port_config(&t->port_params);
+ if (p->soft.tm.sched == NULL)
+ return -1;
+
+ /* Subport */
+ n_subports = t->port_params.n_subports_per_port;
+ for (subport_id = 0; subport_id < n_subports; subport_id++) {
+ uint32_t n_pipes_per_subport =
+ t->port_params.n_pipes_per_subport;
+ uint32_t pipe_id;
+
+ status = rte_sched_subport_config(p->soft.tm.sched,
+ subport_id,
+ &t->subport_params[subport_id]);
+ if (status) {
+ rte_sched_port_free(p->soft.tm.sched);
+ return -1;
+ }
+
+ /* Pipe */
+ n_pipes_per_subport = t->port_params.n_pipes_per_subport;
+ for (pipe_id = 0; pipe_id < n_pipes_per_subport; pipe_id++) {
+ int pos = subport_id * TM_MAX_PIPES_PER_SUBPORT +
+ pipe_id;
+ int profile_id = t->pipe_to_profile[pos];
+
+ if (profile_id < 0)
+ continue;
+
+ status = rte_sched_pipe_config(p->soft.tm.sched,
+ subport_id,
+ pipe_id,
+ profile_id);
+ if (status) {
+ rte_sched_port_free(p->soft.tm.sched);
+ return -1;
+ }
+ }
+ }
+
+ return 0;
+}
+
+void
+tm_stop(struct pmd_internals *p)
+{
+ if (p->soft.tm.sched)
+ rte_sched_port_free(p->soft.tm.sched);
+}
--
2.9.3
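For reference, a minimal sketch of soft TM parameter values that would pass the
tm_params_check() routine above (illustrative only, not part of the patch),
assuming a 10 GbE underlay port: hard_rate = 10000 Mbps, i.e.
10000 * 125000 = 1,250,000,000 bytes/sec, with p standing for a
struct pmd_params instance:

    p.soft.tm.rate = 1250000000;  /* bytes/sec; must not exceed the hard rate */
    p.soft.tm.nb_queues = 65536;  /* >= RTE_SCHED_QUEUES_PER_PIPE, power of 2 */
    p.soft.tm.qsize[0] = 64;      /* per-TC queue size: non-zero power of 2 */
    p.soft.tm.qsize[1] = 64;
    p.soft.tm.qsize[2] = 64;
    p.soft.tm.qsize[3] = 64;
    p.soft.tm.enq_bsz = 32;       /* enqueue burst size */
    p.soft.tm.deq_bsz = 24;       /* non-zero and strictly less than enq_bsz */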
^ permalink raw reply [flat|nested] 79+ messages in thread
* [dpdk-dev] [PATCH v4 3/4] net/softnic: add TM capabilities ops
2017-09-18 9:10 ` [dpdk-dev] [PATCH v4 0/4] net/softnic: sw fall-back pmd for traffic mgmt and others Jasvinder Singh
2017-09-18 9:10 ` [dpdk-dev] [PATCH v4 1/4] net/softnic: add softnic PMD Jasvinder Singh
2017-09-18 9:10 ` [dpdk-dev] [PATCH v4 2/4] net/softnic: add traffic management support Jasvinder Singh
@ 2017-09-18 9:10 ` Jasvinder Singh
2017-09-25 2:33 ` Lu, Wenzhuo
2017-09-18 9:10 ` [dpdk-dev] [PATCH v4 4/4] net/softnic: add TM hierarchy related ops Jasvinder Singh
2017-09-20 15:35 ` [dpdk-dev] [PATCH v4 0/4] net/softnic: sw fall-back pmd for traffic mgmt and others Thomas Monjalon
4 siblings, 1 reply; 79+ messages in thread
From: Jasvinder Singh @ 2017-09-18 9:10 UTC (permalink / raw)
To: dev; +Cc: cristian.dumitrescu, ferruh.yigit, thomas
Implement ethdev TM capability APIs in SoftNIC PMD.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
---
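A minimal usage sketch, assuming a SoftNIC port created with the TM feature
enabled (the port_id value and the tm_caps_print() wrapper are hypothetical,
and error handling is kept to a minimum); it queries the port-level
capabilities that this patch reports through the tm_ops_get callback:

#include <stdio.h>
#include <string.h>

#include <rte_tm.h>

/* Sketch only: print a few of the TM capabilities exposed by the PMD. */
static int
tm_caps_print(uint16_t port_id)
{
	struct rte_tm_capabilities cap;
	struct rte_tm_error error;
	int status;

	memset(&cap, 0, sizeof(cap));
	status = rte_tm_capabilities_get(port_id, &cap, &error);
	if (status)
		return status;

	printf("TM: %u nodes max, %u levels, leaf nodes identical = %d\n",
		cap.n_nodes_max, cap.n_levels_max, cap.leaf_nodes_identical);

	return 0;
}

The same pattern applies to rte_tm_level_capabilities_get() and
rte_tm_node_capabilities_get(), which map to the per-level and per-node
capability tables added below.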
drivers/net/softnic/rte_eth_softnic.c | 12 +-
drivers/net/softnic/rte_eth_softnic_internals.h | 32 ++
drivers/net/softnic/rte_eth_softnic_tm.c | 500 ++++++++++++++++++++++++
3 files changed, 543 insertions(+), 1 deletion(-)
diff --git a/drivers/net/softnic/rte_eth_softnic.c b/drivers/net/softnic/rte_eth_softnic.c
index 28b5155..d0b5550 100644
--- a/drivers/net/softnic/rte_eth_softnic.c
+++ b/drivers/net/softnic/rte_eth_softnic.c
@@ -43,6 +43,7 @@
#include <rte_errno.h>
#include <rte_ring.h>
#include <rte_sched.h>
+#include <rte_tm_driver.h>
#include "rte_eth_softnic.h"
#include "rte_eth_softnic_internals.h"
@@ -224,6 +225,15 @@ pmd_link_update(struct rte_eth_dev *dev __rte_unused,
return 0;
}
+static int
+pmd_tm_ops_get(struct rte_eth_dev *dev, void *arg)
+{
+ *(const struct rte_tm_ops **)arg =
+ (tm_enabled(dev)) ? &pmd_tm_ops : NULL;
+
+ return 0;
+}
+
static const struct eth_dev_ops pmd_ops = {
.dev_configure = pmd_dev_configure,
.dev_start = pmd_dev_start,
@@ -233,7 +243,7 @@ static const struct eth_dev_ops pmd_ops = {
.dev_infos_get = pmd_dev_infos_get,
.rx_queue_setup = pmd_rx_queue_setup,
.tx_queue_setup = pmd_tx_queue_setup,
- .tm_ops_get = NULL,
+ .tm_ops_get = pmd_tm_ops_get,
};
static uint16_t
diff --git a/drivers/net/softnic/rte_eth_softnic_internals.h b/drivers/net/softnic/rte_eth_softnic_internals.h
index f9e2177..5e93f71 100644
--- a/drivers/net/softnic/rte_eth_softnic_internals.h
+++ b/drivers/net/softnic/rte_eth_softnic_internals.h
@@ -39,6 +39,7 @@
#include <rte_mbuf.h>
#include <rte_sched.h>
#include <rte_ethdev.h>
+#include <rte_tm_driver.h>
#include "rte_eth_softnic.h"
@@ -137,8 +138,26 @@ enum tm_node_level {
TM_NODE_LEVEL_MAX,
};
+/* TM Node */
+struct tm_node {
+ TAILQ_ENTRY(tm_node) node;
+ uint32_t node_id;
+ uint32_t parent_node_id;
+ uint32_t priority;
+ uint32_t weight;
+ uint32_t level;
+ struct tm_node *parent_node;
+ struct rte_tm_node_params params;
+ struct rte_tm_node_stats stats;
+ uint32_t n_children;
+};
+
+TAILQ_HEAD(tm_node_list, tm_node);
+
/* TM Hierarchy Specification */
struct tm_hierarchy {
+ struct tm_node_list nodes;
+
uint32_t n_tm_nodes[TM_NODE_LEVEL_MAX];
};
@@ -191,6 +210,11 @@ struct pmd_rx_queue {
} hard;
};
+/**
+ * Traffic Management (TM) Operation
+ */
+extern const struct rte_tm_ops pmd_tm_ops;
+
int
tm_params_check(struct pmd_params *params, uint32_t hard_rate);
@@ -207,6 +231,14 @@ void
tm_stop(struct pmd_internals *p);
static inline int
+tm_enabled(struct rte_eth_dev *dev)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ return (p->params.soft.flags & PMD_FEATURE_TM);
+}
+
+static inline int
tm_used(struct rte_eth_dev *dev)
{
struct pmd_internals *p = dev->data->dev_private;
diff --git a/drivers/net/softnic/rte_eth_softnic_tm.c b/drivers/net/softnic/rte_eth_softnic_tm.c
index bb28798..a552006 100644
--- a/drivers/net/softnic/rte_eth_softnic_tm.c
+++ b/drivers/net/softnic/rte_eth_softnic_tm.c
@@ -179,3 +179,503 @@ tm_stop(struct pmd_internals *p)
if (p->soft.tm.sched)
rte_sched_port_free(p->soft.tm.sched);
}
+
+static struct tm_node *
+tm_node_search(struct rte_eth_dev *dev, uint32_t node_id)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_node_list *nl = &p->soft.tm.h.nodes;
+ struct tm_node *n;
+
+ TAILQ_FOREACH(n, nl, node)
+ if (n->node_id == node_id)
+ return n;
+
+ return NULL;
+}
+
+static uint32_t
+tm_level_get_max_nodes(struct rte_eth_dev *dev, enum tm_node_level level)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ uint32_t n_queues_max = p->params.soft.tm.nb_queues;
+ uint32_t n_tc_max = n_queues_max / RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS;
+ uint32_t n_pipes_max = n_tc_max / RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE;
+ uint32_t n_subports_max = n_pipes_max;
+ uint32_t n_root_max = 1;
+
+ switch (level) {
+ case TM_NODE_LEVEL_PORT:
+ return n_root_max;
+ case TM_NODE_LEVEL_SUBPORT:
+ return n_subports_max;
+ case TM_NODE_LEVEL_PIPE:
+ return n_pipes_max;
+ case TM_NODE_LEVEL_TC:
+ return n_tc_max;
+ case TM_NODE_LEVEL_QUEUE:
+ default:
+ return n_queues_max;
+ }
+}
+
+#ifdef RTE_SCHED_RED
+#define WRED_SUPPORTED 1
+#else
+#define WRED_SUPPORTED 0
+#endif
+
+#define STATS_MASK_DEFAULT \
+ (RTE_TM_STATS_N_PKTS | \
+ RTE_TM_STATS_N_BYTES | \
+ RTE_TM_STATS_N_PKTS_GREEN_DROPPED | \
+ RTE_TM_STATS_N_BYTES_GREEN_DROPPED)
+
+#define STATS_MASK_QUEUE \
+ (STATS_MASK_DEFAULT | \
+ RTE_TM_STATS_N_PKTS_QUEUED)
+
+static const struct rte_tm_capabilities tm_cap = {
+ .n_nodes_max = UINT32_MAX,
+ .n_levels_max = TM_NODE_LEVEL_MAX,
+
+ .non_leaf_nodes_identical = 0,
+ .leaf_nodes_identical = 1,
+
+ .shaper_n_max = UINT32_MAX,
+ .shaper_private_n_max = UINT32_MAX,
+ .shaper_private_dual_rate_n_max = 0,
+ .shaper_private_rate_min = 1,
+ .shaper_private_rate_max = UINT32_MAX,
+
+ .shaper_shared_n_max = UINT32_MAX,
+ .shaper_shared_n_nodes_per_shaper_max = UINT32_MAX,
+ .shaper_shared_n_shapers_per_node_max = 1,
+ .shaper_shared_dual_rate_n_max = 0,
+ .shaper_shared_rate_min = 1,
+ .shaper_shared_rate_max = UINT32_MAX,
+
+ .shaper_pkt_length_adjust_min = RTE_TM_ETH_FRAMING_OVERHEAD_FCS,
+ .shaper_pkt_length_adjust_max = RTE_TM_ETH_FRAMING_OVERHEAD_FCS,
+
+ .sched_n_children_max = UINT32_MAX,
+ .sched_sp_n_priorities_max = RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE,
+ .sched_wfq_n_children_per_group_max = UINT32_MAX,
+ .sched_wfq_n_groups_max = 1,
+ .sched_wfq_weight_max = UINT32_MAX,
+
+ .cman_head_drop_supported = 0,
+ .cman_wred_context_n_max = 0,
+ .cman_wred_context_private_n_max = 0,
+ .cman_wred_context_shared_n_max = 0,
+ .cman_wred_context_shared_n_nodes_per_context_max = 0,
+ .cman_wred_context_shared_n_contexts_per_node_max = 0,
+
+ .mark_vlan_dei_supported = {0, 0, 0},
+ .mark_ip_ecn_tcp_supported = {0, 0, 0},
+ .mark_ip_ecn_sctp_supported = {0, 0, 0},
+ .mark_ip_dscp_supported = {0, 0, 0},
+
+ .dynamic_update_mask = 0,
+
+ .stats_mask = STATS_MASK_QUEUE,
+};
+
+/* Traffic manager capabilities get */
+static int
+pmd_tm_capabilities_get(struct rte_eth_dev *dev __rte_unused,
+ struct rte_tm_capabilities *cap,
+ struct rte_tm_error *error)
+{
+ if (cap == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_CAPABILITIES,
+ NULL,
+ rte_strerror(EINVAL));
+
+ memcpy(cap, &tm_cap, sizeof(*cap));
+
+ cap->n_nodes_max = tm_level_get_max_nodes(dev, TM_NODE_LEVEL_PORT) +
+ tm_level_get_max_nodes(dev, TM_NODE_LEVEL_SUBPORT) +
+ tm_level_get_max_nodes(dev, TM_NODE_LEVEL_PIPE) +
+ tm_level_get_max_nodes(dev, TM_NODE_LEVEL_TC) +
+ tm_level_get_max_nodes(dev, TM_NODE_LEVEL_QUEUE);
+
+ cap->shaper_private_n_max =
+ tm_level_get_max_nodes(dev, TM_NODE_LEVEL_PORT) +
+ tm_level_get_max_nodes(dev, TM_NODE_LEVEL_SUBPORT) +
+ tm_level_get_max_nodes(dev, TM_NODE_LEVEL_PIPE) +
+ tm_level_get_max_nodes(dev, TM_NODE_LEVEL_TC);
+
+ cap->shaper_shared_n_max = RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE *
+ tm_level_get_max_nodes(dev, TM_NODE_LEVEL_SUBPORT);
+
+ cap->shaper_n_max = cap->shaper_private_n_max +
+ cap->shaper_shared_n_max;
+
+ cap->shaper_shared_n_nodes_per_shaper_max =
+ tm_level_get_max_nodes(dev, TM_NODE_LEVEL_PIPE);
+
+ cap->sched_n_children_max = RTE_MAX(
+ tm_level_get_max_nodes(dev, TM_NODE_LEVEL_PIPE),
+ (uint32_t)RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE);
+
+ cap->sched_wfq_n_children_per_group_max = cap->sched_n_children_max;
+
+ if (WRED_SUPPORTED)
+ cap->cman_wred_context_private_n_max =
+ tm_level_get_max_nodes(dev, TM_NODE_LEVEL_QUEUE);
+
+ cap->cman_wred_context_n_max = cap->cman_wred_context_private_n_max +
+ cap->cman_wred_context_shared_n_max;
+
+ return 0;
+}
+
+static const struct rte_tm_level_capabilities tm_level_cap[] = {
+ [TM_NODE_LEVEL_PORT] = {
+ .n_nodes_max = 1,
+ .n_nodes_nonleaf_max = 1,
+ .n_nodes_leaf_max = 0,
+ .non_leaf_nodes_identical = 1,
+ .leaf_nodes_identical = 0,
+
+ .nonleaf = {
+ .shaper_private_supported = 1,
+ .shaper_private_dual_rate_supported = 0,
+ .shaper_private_rate_min = 1,
+ .shaper_private_rate_max = UINT32_MAX,
+ .shaper_shared_n_max = 0,
+
+ .sched_n_children_max = UINT32_MAX,
+ .sched_sp_n_priorities_max = 1,
+ .sched_wfq_n_children_per_group_max = UINT32_MAX,
+ .sched_wfq_n_groups_max = 1,
+ .sched_wfq_weight_max = 1,
+
+ .stats_mask = STATS_MASK_DEFAULT,
+ },
+ },
+
+ [TM_NODE_LEVEL_SUBPORT] = {
+ .n_nodes_max = UINT32_MAX,
+ .n_nodes_nonleaf_max = UINT32_MAX,
+ .n_nodes_leaf_max = 0,
+ .non_leaf_nodes_identical = 1,
+ .leaf_nodes_identical = 0,
+
+ .nonleaf = {
+ .shaper_private_supported = 1,
+ .shaper_private_dual_rate_supported = 0,
+ .shaper_private_rate_min = 1,
+ .shaper_private_rate_max = UINT32_MAX,
+ .shaper_shared_n_max = 0,
+
+ .sched_n_children_max = UINT32_MAX,
+ .sched_sp_n_priorities_max = 1,
+ .sched_wfq_n_children_per_group_max = UINT32_MAX,
+ .sched_wfq_n_groups_max = 1,
+#ifdef RTE_SCHED_SUBPORT_TC_OV
+ .sched_wfq_weight_max = UINT32_MAX,
+#else
+ .sched_wfq_weight_max = 1,
+#endif
+ .stats_mask = STATS_MASK_DEFAULT,
+ },
+ },
+
+ [TM_NODE_LEVEL_PIPE] = {
+ .n_nodes_max = UINT32_MAX,
+ .n_nodes_nonleaf_max = UINT32_MAX,
+ .n_nodes_leaf_max = 0,
+ .non_leaf_nodes_identical = 1,
+ .leaf_nodes_identical = 0,
+
+ .nonleaf = {
+ .shaper_private_supported = 1,
+ .shaper_private_dual_rate_supported = 0,
+ .shaper_private_rate_min = 1,
+ .shaper_private_rate_max = UINT32_MAX,
+ .shaper_shared_n_max = 0,
+
+ .sched_n_children_max =
+ RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE,
+ .sched_sp_n_priorities_max =
+ RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE,
+ .sched_wfq_n_children_per_group_max = 1,
+ .sched_wfq_n_groups_max = 0,
+ .sched_wfq_weight_max = 1,
+
+ .stats_mask = STATS_MASK_DEFAULT,
+ },
+ },
+
+ [TM_NODE_LEVEL_TC] = {
+ .n_nodes_max = UINT32_MAX,
+ .n_nodes_nonleaf_max = UINT32_MAX,
+ .n_nodes_leaf_max = 0,
+ .non_leaf_nodes_identical = 1,
+ .leaf_nodes_identical = 0,
+
+ .nonleaf = {
+ .shaper_private_supported = 1,
+ .shaper_private_dual_rate_supported = 0,
+ .shaper_private_rate_min = 1,
+ .shaper_private_rate_max = UINT32_MAX,
+ .shaper_shared_n_max = 1,
+
+ .sched_n_children_max =
+ RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS,
+ .sched_sp_n_priorities_max = 1,
+ .sched_wfq_n_children_per_group_max =
+ RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS,
+ .sched_wfq_n_groups_max = 1,
+ .sched_wfq_weight_max = UINT32_MAX,
+
+ .stats_mask = STATS_MASK_DEFAULT,
+ },
+ },
+
+ [TM_NODE_LEVEL_QUEUE] = {
+ .n_nodes_max = UINT32_MAX,
+ .n_nodes_nonleaf_max = 0,
+ .n_nodes_leaf_max = UINT32_MAX,
+ .non_leaf_nodes_identical = 0,
+ .leaf_nodes_identical = 1,
+
+ .leaf = {
+ .shaper_private_supported = 0,
+ .shaper_private_dual_rate_supported = 0,
+ .shaper_private_rate_min = 0,
+ .shaper_private_rate_max = 0,
+ .shaper_shared_n_max = 0,
+
+ .cman_head_drop_supported = 0,
+ .cman_wred_context_private_supported = WRED_SUPPORTED,
+ .cman_wred_context_shared_n_max = 0,
+
+ .stats_mask = STATS_MASK_QUEUE,
+ },
+ },
+};
+
+/* Traffic manager level capabilities get */
+static int
+pmd_tm_level_capabilities_get(struct rte_eth_dev *dev __rte_unused,
+ uint32_t level_id,
+ struct rte_tm_level_capabilities *cap,
+ struct rte_tm_error *error)
+{
+ if (cap == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_CAPABILITIES,
+ NULL,
+ rte_strerror(EINVAL));
+
+ if (level_id >= TM_NODE_LEVEL_MAX)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_LEVEL_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ memcpy(cap, &tm_level_cap[level_id], sizeof(*cap));
+
+ switch (level_id) {
+ case TM_NODE_LEVEL_PORT:
+ cap->nonleaf.sched_n_children_max =
+ tm_level_get_max_nodes(dev,
+ TM_NODE_LEVEL_SUBPORT);
+ cap->nonleaf.sched_wfq_n_children_per_group_max =
+ cap->nonleaf.sched_n_children_max;
+ break;
+
+ case TM_NODE_LEVEL_SUBPORT:
+ cap->n_nodes_max = tm_level_get_max_nodes(dev,
+ TM_NODE_LEVEL_SUBPORT);
+ cap->n_nodes_nonleaf_max = cap->n_nodes_max;
+ cap->nonleaf.sched_n_children_max =
+ tm_level_get_max_nodes(dev,
+ TM_NODE_LEVEL_PIPE);
+ cap->nonleaf.sched_wfq_n_children_per_group_max =
+ cap->nonleaf.sched_n_children_max;
+ break;
+
+ case TM_NODE_LEVEL_PIPE:
+ cap->n_nodes_max = tm_level_get_max_nodes(dev,
+ TM_NODE_LEVEL_PIPE);
+ cap->n_nodes_nonleaf_max = cap->n_nodes_max;
+ break;
+
+ case TM_NODE_LEVEL_TC:
+ cap->n_nodes_max = tm_level_get_max_nodes(dev,
+ TM_NODE_LEVEL_TC);
+ cap->n_nodes_nonleaf_max = cap->n_nodes_max;
+ break;
+
+ case TM_NODE_LEVEL_QUEUE:
+ default:
+ cap->n_nodes_max = tm_level_get_max_nodes(dev,
+ TM_NODE_LEVEL_QUEUE);
+ cap->n_nodes_leaf_max = cap->n_nodes_max;
+ break;
+ }
+
+ return 0;
+}
+
+static const struct rte_tm_node_capabilities tm_node_cap[] = {
+ [TM_NODE_LEVEL_PORT] = {
+ .shaper_private_supported = 1,
+ .shaper_private_dual_rate_supported = 0,
+ .shaper_private_rate_min = 1,
+ .shaper_private_rate_max = UINT32_MAX,
+ .shaper_shared_n_max = 0,
+
+ .nonleaf = {
+ .sched_n_children_max = UINT32_MAX,
+ .sched_sp_n_priorities_max = 1,
+ .sched_wfq_n_children_per_group_max = UINT32_MAX,
+ .sched_wfq_n_groups_max = 1,
+ .sched_wfq_weight_max = 1,
+ },
+
+ .stats_mask = STATS_MASK_DEFAULT,
+ },
+
+ [TM_NODE_LEVEL_SUBPORT] = {
+ .shaper_private_supported = 1,
+ .shaper_private_dual_rate_supported = 0,
+ .shaper_private_rate_min = 1,
+ .shaper_private_rate_max = UINT32_MAX,
+ .shaper_shared_n_max = 0,
+
+ .nonleaf = {
+ .sched_n_children_max = UINT32_MAX,
+ .sched_sp_n_priorities_max = 1,
+ .sched_wfq_n_children_per_group_max = UINT32_MAX,
+ .sched_wfq_n_groups_max = 1,
+ .sched_wfq_weight_max = UINT32_MAX,
+ },
+
+ .stats_mask = STATS_MASK_DEFAULT,
+ },
+
+ [TM_NODE_LEVEL_PIPE] = {
+ .shaper_private_supported = 1,
+ .shaper_private_dual_rate_supported = 0,
+ .shaper_private_rate_min = 1,
+ .shaper_private_rate_max = UINT32_MAX,
+ .shaper_shared_n_max = 0,
+
+ .nonleaf = {
+ .sched_n_children_max =
+ RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE,
+ .sched_sp_n_priorities_max =
+ RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE,
+ .sched_wfq_n_children_per_group_max = 1,
+ .sched_wfq_n_groups_max = 0,
+ .sched_wfq_weight_max = 1,
+ },
+
+ .stats_mask = STATS_MASK_DEFAULT,
+ },
+
+ [TM_NODE_LEVEL_TC] = {
+ .shaper_private_supported = 1,
+ .shaper_private_dual_rate_supported = 0,
+ .shaper_private_rate_min = 1,
+ .shaper_private_rate_max = UINT32_MAX,
+ .shaper_shared_n_max = 1,
+
+ .nonleaf = {
+ .sched_n_children_max =
+ RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS,
+ .sched_sp_n_priorities_max = 1,
+ .sched_wfq_n_children_per_group_max =
+ RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS,
+ .sched_wfq_n_groups_max = 1,
+ .sched_wfq_weight_max = UINT32_MAX,
+ },
+
+ .stats_mask = STATS_MASK_DEFAULT,
+ },
+
+ [TM_NODE_LEVEL_QUEUE] = {
+ .shaper_private_supported = 0,
+ .shaper_private_dual_rate_supported = 0,
+ .shaper_private_rate_min = 0,
+ .shaper_private_rate_max = 0,
+ .shaper_shared_n_max = 0,
+
+
+ .leaf = {
+ .cman_head_drop_supported = 0,
+ .cman_wred_context_private_supported = WRED_SUPPORTED,
+ .cman_wred_context_shared_n_max = 0,
+ },
+
+ .stats_mask = STATS_MASK_QUEUE,
+ },
+};
+
+/* Traffic manager node capabilities get */
+static int
+pmd_tm_node_capabilities_get(struct rte_eth_dev *dev __rte_unused,
+ uint32_t node_id,
+ struct rte_tm_node_capabilities *cap,
+ struct rte_tm_error *error)
+{
+ struct tm_node *tm_node;
+
+ if (cap == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_CAPABILITIES,
+ NULL,
+ rte_strerror(EINVAL));
+
+ tm_node = tm_node_search(dev, node_id);
+ if (tm_node == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ memcpy(cap, &tm_node_cap[tm_node->level], sizeof(*cap));
+
+ switch (tm_node->level) {
+ case TM_NODE_LEVEL_PORT:
+ cap->nonleaf.sched_n_children_max =
+ tm_level_get_max_nodes(dev,
+ TM_NODE_LEVEL_SUBPORT);
+ cap->nonleaf.sched_wfq_n_children_per_group_max =
+ cap->nonleaf.sched_n_children_max;
+ break;
+
+ case TM_NODE_LEVEL_SUBPORT:
+ cap->nonleaf.sched_n_children_max =
+ tm_level_get_max_nodes(dev,
+ TM_NODE_LEVEL_PIPE);
+ cap->nonleaf.sched_wfq_n_children_per_group_max =
+ cap->nonleaf.sched_n_children_max;
+ break;
+
+ case TM_NODE_LEVEL_PIPE:
+ case TM_NODE_LEVEL_TC:
+ case TM_NODE_LEVEL_QUEUE:
+ default:
+ break;
+ }
+
+ return 0;
+}
+
+const struct rte_tm_ops pmd_tm_ops = {
+ .capabilities_get = pmd_tm_capabilities_get,
+ .level_capabilities_get = pmd_tm_level_capabilities_get,
+ .node_capabilities_get = pmd_tm_node_capabilities_get,
+};
--
2.9.3
^ permalink raw reply [flat|nested] 79+ messages in thread
* [dpdk-dev] [PATCH v4 4/4] net/softnic: add TM hierarchy related ops
2017-09-18 9:10 ` [dpdk-dev] [PATCH v4 0/4] net/softnic: sw fall-back pmd for traffic mgmt and others Jasvinder Singh
` (2 preceding siblings ...)
2017-09-18 9:10 ` [dpdk-dev] [PATCH v4 3/4] net/softnic: add TM capabilities ops Jasvinder Singh
@ 2017-09-18 9:10 ` Jasvinder Singh
2017-09-25 7:14 ` Lu, Wenzhuo
2017-09-20 15:35 ` [dpdk-dev] [PATCH v4 0/4] net/softnic: sw fall-back pmd for traffic mgmt and others Thomas Monjalon
4 siblings, 1 reply; 79+ messages in thread
From: Jasvinder Singh @ 2017-09-18 9:10 UTC (permalink / raw)
To: dev; +Cc: cristian.dumitrescu, ferruh.yigit, thomas
Implement ethdev TM hierarchy-related APIs in SoftNIC PMD.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
---
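A minimal usage sketch, assuming a SoftNIC port with TM enabled (the node and
profile IDs, the tm_root_setup() helper and the rate value are hypothetical);
it creates a single-rate shaper profile, adds the root (port-level) node and
commits the hierarchy. The PMD freezes the hierarchy at commit time and
instantiates the librte_sched port when the device is started (see tm_start()
above, which refuses to run on an unfrozen hierarchy). Subport, pipe, traffic
class and queue nodes have to be added between the root node and the commit,
and WRED profiles can be attached to leaf nodes via rte_tm_wred_profile_add():

#include <string.h>

#include <rte_tm.h>

/* Sketch only: root node plus hierarchy commit through the rte_tm API. */
static int
tm_root_setup(uint16_t port_id, uint64_t port_rate_bytes_per_sec)
{
	struct rte_tm_shaper_params sp;
	struct rte_tm_node_params np;
	struct rte_tm_error error;
	uint32_t shaper_id = 0;
	uint32_t root_node_id = 1000000; /* must be outside the leaf (queue) id range */
	int status;

	/* Single-rate shaper profile, as required by this PMD */
	memset(&sp, 0, sizeof(sp));
	sp.peak.rate = port_rate_bytes_per_sec;
	sp.peak.size = 1000000;
	sp.pkt_length_adjust = RTE_TM_ETH_FRAMING_OVERHEAD_FCS;

	status = rte_tm_shaper_profile_add(port_id, shaper_id, &sp, &error);
	if (status)
		return status;

	/* Root node: priority 0, weight 1, single SP priority */
	memset(&np, 0, sizeof(np));
	np.shaper_profile_id = shaper_id;
	np.nonleaf.n_sp_priorities = 1;

	status = rte_tm_node_add(port_id, root_node_id, RTE_TM_NODE_ID_NULL,
		0, 1, RTE_TM_NODE_LEVEL_ID_ANY, &np, &error);
	if (status)
		return status;

	/* ... add subport, pipe, TC and queue nodes here ... */

	/* Freeze the hierarchy; the scheduler is instantiated at device start */
	return rte_tm_hierarchy_commit(port_id, 1, &error);
}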
drivers/net/softnic/rte_eth_softnic_internals.h | 41 +
drivers/net/softnic/rte_eth_softnic_tm.c | 2776 ++++++++++++++++++++++-
2 files changed, 2813 insertions(+), 4 deletions(-)
diff --git a/drivers/net/softnic/rte_eth_softnic_internals.h b/drivers/net/softnic/rte_eth_softnic_internals.h
index 5e93f71..317cb08 100644
--- a/drivers/net/softnic/rte_eth_softnic_internals.h
+++ b/drivers/net/softnic/rte_eth_softnic_internals.h
@@ -138,6 +138,36 @@ enum tm_node_level {
TM_NODE_LEVEL_MAX,
};
+/* TM Shaper Profile */
+struct tm_shaper_profile {
+ TAILQ_ENTRY(tm_shaper_profile) node;
+ uint32_t shaper_profile_id;
+ uint32_t n_users;
+ struct rte_tm_shaper_params params;
+};
+
+TAILQ_HEAD(tm_shaper_profile_list, tm_shaper_profile);
+
+/* TM Shared Shaper */
+struct tm_shared_shaper {
+ TAILQ_ENTRY(tm_shared_shaper) node;
+ uint32_t shared_shaper_id;
+ uint32_t n_users;
+ uint32_t shaper_profile_id;
+};
+
+TAILQ_HEAD(tm_shared_shaper_list, tm_shared_shaper);
+
+/* TM WRED Profile */
+struct tm_wred_profile {
+ TAILQ_ENTRY(tm_wred_profile) node;
+ uint32_t wred_profile_id;
+ uint32_t n_users;
+ struct rte_tm_wred_params params;
+};
+
+TAILQ_HEAD(tm_wred_profile_list, tm_wred_profile);
+
/* TM Node */
struct tm_node {
TAILQ_ENTRY(tm_node) node;
@@ -147,6 +177,8 @@ struct tm_node {
uint32_t weight;
uint32_t level;
struct tm_node *parent_node;
+ struct tm_shaper_profile *shaper_profile;
+ struct tm_wred_profile *wred_profile;
struct rte_tm_node_params params;
struct rte_tm_node_stats stats;
uint32_t n_children;
@@ -156,8 +188,16 @@ TAILQ_HEAD(tm_node_list, tm_node);
/* TM Hierarchy Specification */
struct tm_hierarchy {
+ struct tm_shaper_profile_list shaper_profiles;
+ struct tm_shared_shaper_list shared_shapers;
+ struct tm_wred_profile_list wred_profiles;
struct tm_node_list nodes;
+ uint32_t n_shaper_profiles;
+ uint32_t n_shared_shapers;
+ uint32_t n_wred_profiles;
+ uint32_t n_nodes;
+
uint32_t n_tm_nodes[TM_NODE_LEVEL_MAX];
};
@@ -170,6 +210,7 @@ struct tm_internals {
* sense to keep the hierarchy frozen after the port is started.
*/
struct tm_hierarchy h;
+ int hierarchy_frozen;
/** Blueprints */
struct tm_params params;
diff --git a/drivers/net/softnic/rte_eth_softnic_tm.c b/drivers/net/softnic/rte_eth_softnic_tm.c
index a552006..8014fbd 100644
--- a/drivers/net/softnic/rte_eth_softnic_tm.c
+++ b/drivers/net/softnic/rte_eth_softnic_tm.c
@@ -86,6 +86,79 @@ tm_params_check(struct pmd_params *params, uint32_t hard_rate)
return 0;
}
+static void
+tm_hierarchy_init(struct pmd_internals *p)
+{
+ memset(&p->soft.tm.h, 0, sizeof(p->soft.tm.h));
+
+ /* Initialize shaper profile list */
+ TAILQ_INIT(&p->soft.tm.h.shaper_profiles);
+
+ /* Initialize shared shaper list */
+ TAILQ_INIT(&p->soft.tm.h.shared_shapers);
+
+ /* Initialize wred profile list */
+ TAILQ_INIT(&p->soft.tm.h.wred_profiles);
+
+ /* Initialize TM node list */
+ TAILQ_INIT(&p->soft.tm.h.nodes);
+}
+
+static void
+tm_hierarchy_uninit(struct pmd_internals *p)
+{
+ /* Remove all nodes */
+ for ( ; ; ) {
+ struct tm_node *tm_node;
+
+ tm_node = TAILQ_FIRST(&p->soft.tm.h.nodes);
+ if (tm_node == NULL)
+ break;
+
+ TAILQ_REMOVE(&p->soft.tm.h.nodes, tm_node, node);
+ free(tm_node);
+ }
+
+ /* Remove all WRED profiles */
+ for ( ; ; ) {
+ struct tm_wred_profile *wred_profile;
+
+ wred_profile = TAILQ_FIRST(&p->soft.tm.h.wred_profiles);
+ if (wred_profile == NULL)
+ break;
+
+ TAILQ_REMOVE(&p->soft.tm.h.wred_profiles, wred_profile, node);
+ free(wred_profile);
+ }
+
+ /* Remove all shared shapers */
+ for ( ; ; ) {
+ struct tm_shared_shaper *shared_shaper;
+
+ shared_shaper = TAILQ_FIRST(&p->soft.tm.h.shared_shapers);
+ if (shared_shaper == NULL)
+ break;
+
+ TAILQ_REMOVE(&p->soft.tm.h.shared_shapers, shared_shaper, node);
+ free(shared_shaper);
+ }
+
+ /* Remove all shaper profiles */
+ for ( ; ; ) {
+ struct tm_shaper_profile *shaper_profile;
+
+ shaper_profile = TAILQ_FIRST(&p->soft.tm.h.shaper_profiles);
+ if (shaper_profile == NULL)
+ break;
+
+ TAILQ_REMOVE(&p->soft.tm.h.shaper_profiles,
+ shaper_profile, node);
+ free(shaper_profile);
+ }
+
+ memset(&p->soft.tm.h, 0, sizeof(p->soft.tm.h));
+}
+
int
tm_init(struct pmd_internals *p,
struct pmd_params *params,
@@ -112,12 +185,15 @@ tm_init(struct pmd_internals *p,
return -ENOMEM;
}
+ tm_hierarchy_init(p);
+
return 0;
}
void
tm_free(struct pmd_internals *p)
{
+ tm_hierarchy_uninit(p);
rte_free(p->soft.tm.pkts_enq);
rte_free(p->soft.tm.pkts_deq);
}
@@ -129,6 +205,10 @@ tm_start(struct pmd_internals *p)
uint32_t n_subports, subport_id;
int status;
+ /* Is hierarchy frozen? */
+ if (p->soft.tm.hierarchy_frozen == 0)
+ return -1;
+
/* Port */
p->soft.tm.sched = rte_sched_port_config(&t->port_params);
if (p->soft.tm.sched == NULL)
@@ -178,6 +258,51 @@ tm_stop(struct pmd_internals *p)
{
if (p->soft.tm.sched)
rte_sched_port_free(p->soft.tm.sched);
+
+ /* Unfreeze hierarchy */
+ p->soft.tm.hierarchy_frozen = 0;
+}
+
+static struct tm_shaper_profile *
+tm_shaper_profile_search(struct rte_eth_dev *dev, uint32_t shaper_profile_id)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_shaper_profile_list *spl = &p->soft.tm.h.shaper_profiles;
+ struct tm_shaper_profile *sp;
+
+ TAILQ_FOREACH(sp, spl, node)
+ if (shaper_profile_id == sp->shaper_profile_id)
+ return sp;
+
+ return NULL;
+}
+
+static struct tm_shared_shaper *
+tm_shared_shaper_search(struct rte_eth_dev *dev, uint32_t shared_shaper_id)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_shared_shaper_list *ssl = &p->soft.tm.h.shared_shapers;
+ struct tm_shared_shaper *ss;
+
+ TAILQ_FOREACH(ss, ssl, node)
+ if (shared_shaper_id == ss->shared_shaper_id)
+ return ss;
+
+ return NULL;
+}
+
+static struct tm_wred_profile *
+tm_wred_profile_search(struct rte_eth_dev *dev, uint32_t wred_profile_id)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_wred_profile_list *wpl = &p->soft.tm.h.wred_profiles;
+ struct tm_wred_profile *wp;
+
+ TAILQ_FOREACH(wp, wpl, node)
+ if (wred_profile_id == wp->wred_profile_id)
+ return wp;
+
+ return NULL;
}
static struct tm_node *
@@ -194,6 +319,94 @@ tm_node_search(struct rte_eth_dev *dev, uint32_t node_id)
return NULL;
}
+static struct tm_node *
+tm_root_node_present(struct rte_eth_dev *dev)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_node_list *nl = &p->soft.tm.h.nodes;
+ struct tm_node *n;
+
+ TAILQ_FOREACH(n, nl, node)
+ if (n->parent_node_id == RTE_TM_NODE_ID_NULL)
+ return n;
+
+ return NULL;
+}
+
+static uint32_t
+tm_node_subport_id(struct rte_eth_dev *dev, struct tm_node *subport_node)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_node_list *nl = &p->soft.tm.h.nodes;
+ struct tm_node *ns;
+ uint32_t subport_id;
+
+ subport_id = 0;
+ TAILQ_FOREACH(ns, nl, node) {
+ if (ns->level != TM_NODE_LEVEL_SUBPORT)
+ continue;
+
+ if (ns->node_id == subport_node->node_id)
+ return subport_id;
+
+ subport_id++;
+ }
+
+ return UINT32_MAX;
+}
+
+static uint32_t
+tm_node_pipe_id(struct rte_eth_dev *dev, struct tm_node *pipe_node)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_node_list *nl = &p->soft.tm.h.nodes;
+ struct tm_node *np;
+ uint32_t pipe_id;
+
+ pipe_id = 0;
+ TAILQ_FOREACH(np, nl, node) {
+ if ((np->level != TM_NODE_LEVEL_PIPE) ||
+ (np->parent_node_id != pipe_node->parent_node_id))
+ continue;
+
+ if (np->node_id == pipe_node->node_id)
+ return pipe_id;
+
+ pipe_id++;
+ }
+
+ return UINT32_MAX;
+}
+
+static uint32_t
+tm_node_tc_id(struct rte_eth_dev *dev __rte_unused, struct tm_node *tc_node)
+{
+ return tc_node->priority;
+}
+
+static uint32_t
+tm_node_queue_id(struct rte_eth_dev *dev, struct tm_node *queue_node)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_node_list *nl = &p->soft.tm.h.nodes;
+ struct tm_node *nq;
+ uint32_t queue_id;
+
+ queue_id = 0;
+ TAILQ_FOREACH(nq, nl, node) {
+ if ((nq->level != TM_NODE_LEVEL_QUEUE) ||
+ (nq->parent_node_id != queue_node->parent_node_id))
+ continue;
+
+ if (nq->node_id == queue_node->node_id)
+ return queue_id;
+
+ queue_id++;
+ }
+
+ return UINT32_MAX;
+}
+
static uint32_t
tm_level_get_max_nodes(struct rte_eth_dev *dev, enum tm_node_level level)
{
@@ -219,6 +432,35 @@ tm_level_get_max_nodes(struct rte_eth_dev *dev, enum tm_node_level level)
}
}
+/* Traffic manager node type get */
+static int
+pmd_tm_node_type_get(struct rte_eth_dev *dev,
+ uint32_t node_id,
+ int *is_leaf,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ if (is_leaf == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+
+ if ((node_id == RTE_TM_NODE_ID_NULL) ||
+ (tm_node_search(dev, node_id) == NULL))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ *is_leaf = node_id < p->params.soft.tm.nb_queues;
+
+ return 0;
+}
+
#ifdef RTE_SCHED_RED
#define WRED_SUPPORTED 1
#else
@@ -674,8 +916,2534 @@ pmd_tm_node_capabilities_get(struct rte_eth_dev *dev __rte_unused,
return 0;
}
-const struct rte_tm_ops pmd_tm_ops = {
- .capabilities_get = pmd_tm_capabilities_get,
- .level_capabilities_get = pmd_tm_level_capabilities_get,
- .node_capabilities_get = pmd_tm_node_capabilities_get,
+static int
+shaper_profile_check(struct rte_eth_dev *dev,
+ uint32_t shaper_profile_id,
+ struct rte_tm_shaper_params *profile,
+ struct rte_tm_error *error)
+{
+ struct tm_shaper_profile *sp;
+
+ /* Shaper profile ID must not be NONE. */
+ if (shaper_profile_id == RTE_TM_SHAPER_PROFILE_ID_NONE)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_SHAPER_PROFILE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Shaper profile must not exist. */
+ sp = tm_shaper_profile_search(dev, shaper_profile_id);
+ if (sp)
+ return -rte_tm_error_set(error,
+ EEXIST,
+ RTE_TM_ERROR_TYPE_SHAPER_PROFILE_ID,
+ NULL,
+ rte_strerror(EEXIST));
+
+ /* Profile must not be NULL. */
+ if (profile == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_SHAPER_PROFILE,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Peak rate: non-zero, 32-bit */
+ if ((profile->peak.rate == 0) ||
+ (profile->peak.rate >= UINT32_MAX))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_SHAPER_PROFILE_PEAK_RATE,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Peak size: non-zero, 32-bit */
+ if ((profile->peak.size == 0) ||
+ (profile->peak.size >= UINT32_MAX))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_SHAPER_PROFILE_PEAK_SIZE,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Dual-rate profiles are not supported. */
+ if (profile->committed.rate != 0)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_SHAPER_PROFILE_COMMITTED_RATE,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Packet length adjust: 24 bytes */
+ if (profile->pkt_length_adjust != RTE_TM_ETH_FRAMING_OVERHEAD_FCS)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_SHAPER_PROFILE_PKT_ADJUST_LEN,
+ NULL,
+ rte_strerror(EINVAL));
+
+ return 0;
+}
+
+/* Traffic manager shaper profile add */
+static int
+pmd_tm_shaper_profile_add(struct rte_eth_dev *dev,
+ uint32_t shaper_profile_id,
+ struct rte_tm_shaper_params *profile,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_shaper_profile_list *spl = &p->soft.tm.h.shaper_profiles;
+ struct tm_shaper_profile *sp;
+ int status;
+
+ /* Check input params */
+ status = shaper_profile_check(dev, shaper_profile_id, profile, error);
+ if (status)
+ return status;
+
+ /* Memory allocation */
+ sp = calloc(1, sizeof(struct tm_shaper_profile));
+ if (sp == NULL)
+ return -rte_tm_error_set(error,
+ ENOMEM,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(ENOMEM));
+
+ /* Fill in */
+ sp->shaper_profile_id = shaper_profile_id;
+ memcpy(&sp->params, profile, sizeof(sp->params));
+
+ /* Add to list */
+ TAILQ_INSERT_TAIL(spl, sp, node);
+ p->soft.tm.h.n_shaper_profiles++;
+
+ return 0;
+}
+
+/* Traffic manager shaper profile delete */
+static int
+pmd_tm_shaper_profile_delete(struct rte_eth_dev *dev,
+ uint32_t shaper_profile_id,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_shaper_profile *sp;
+
+ /* Check existing */
+ sp = tm_shaper_profile_search(dev, shaper_profile_id);
+ if (sp == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_SHAPER_PROFILE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Check unused */
+ if (sp->n_users)
+ return -rte_tm_error_set(error,
+ EBUSY,
+ RTE_TM_ERROR_TYPE_SHAPER_PROFILE_ID,
+ NULL,
+ rte_strerror(EBUSY));
+
+ /* Remove from list */
+ TAILQ_REMOVE(&p->soft.tm.h.shaper_profiles, sp, node);
+ p->soft.tm.h.n_shaper_profiles--;
+ free(sp);
+
+ return 0;
+}
+
+static struct tm_node *
+tm_shared_shaper_get_tc(struct rte_eth_dev *dev,
+ struct tm_shared_shaper *ss)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_node_list *nl = &p->soft.tm.h.nodes;
+ struct tm_node *n;
+
+ TAILQ_FOREACH(n, nl, node) {
+ if ((n->level != TM_NODE_LEVEL_TC) ||
+ (n->params.n_shared_shapers == 0) ||
+ (n->params.shared_shaper_id[0] != ss->shared_shaper_id))
+ continue;
+
+ return n;
+ }
+
+ return NULL;
+}
+
+static int
+update_subport_tc_rate(struct rte_eth_dev *dev,
+ struct tm_node *nt,
+ struct tm_shared_shaper *ss,
+ struct tm_shaper_profile *sp_new)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ uint32_t tc_id = tm_node_tc_id(dev, nt);
+
+ struct tm_node *np = nt->parent_node;
+
+ struct tm_node *ns = np->parent_node;
+ uint32_t subport_id = tm_node_subport_id(dev, ns);
+
+ struct rte_sched_subport_params subport_params;
+
+ struct tm_shaper_profile *sp_old = tm_shaper_profile_search(dev,
+ ss->shaper_profile_id);
+
+ /* Derive new subport configuration. */
+ memcpy(&subport_params,
+ &p->soft.tm.params.subport_params[subport_id],
+ sizeof(subport_params));
+ subport_params.tc_rate[tc_id] = sp_new->params.peak.rate;
+
+ /* Update the subport configuration. */
+ if (rte_sched_subport_config(p->soft.tm.sched,
+ subport_id, &subport_params))
+ return -1;
+
+ /* Commit changes. */
+ sp_old->n_users--;
+
+ ss->shaper_profile_id = sp_new->shaper_profile_id;
+ sp_new->n_users++;
+
+ memcpy(&p->soft.tm.params.subport_params[subport_id],
+ &subport_params,
+ sizeof(subport_params));
+
+ return 0;
+}
+
+/* Traffic manager shared shaper add/update */
+static int
+pmd_tm_shared_shaper_add_update(struct rte_eth_dev *dev,
+ uint32_t shared_shaper_id,
+ uint32_t shaper_profile_id,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_shared_shaper *ss;
+ struct tm_shaper_profile *sp;
+ struct tm_node *nt;
+
+ /* Shaper profile must be valid. */
+ sp = tm_shaper_profile_search(dev, shaper_profile_id);
+ if (sp == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_SHAPER_PROFILE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /**
+ * Add new shared shaper
+ */
+ ss = tm_shared_shaper_search(dev, shared_shaper_id);
+ if (ss == NULL) {
+ struct tm_shared_shaper_list *ssl =
+ &p->soft.tm.h.shared_shapers;
+
+ /* Hierarchy must not be frozen */
+ if (p->soft.tm.hierarchy_frozen)
+ return -rte_tm_error_set(error,
+ EBUSY,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EBUSY));
+
+ /* Memory allocation */
+ ss = calloc(1, sizeof(struct tm_shared_shaper));
+ if (ss == NULL)
+ return -rte_tm_error_set(error,
+ ENOMEM,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(ENOMEM));
+
+ /* Fill in */
+ ss->shared_shaper_id = shared_shaper_id;
+ ss->shaper_profile_id = shaper_profile_id;
+
+ /* Add to list */
+ TAILQ_INSERT_TAIL(ssl, ss, node);
+ p->soft.tm.h.n_shared_shapers++;
+
+ return 0;
+ }
+
+ /**
+ * Update existing shared shaper
+ */
+ /* Hierarchy must be frozen (run-time update) */
+ if (p->soft.tm.hierarchy_frozen == 0)
+ return -rte_tm_error_set(error,
+ EBUSY,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EBUSY));
+
+
+ /* Propagate change. */
+ nt = tm_shared_shaper_get_tc(dev, ss);
+ if (update_subport_tc_rate(dev, nt, ss, sp))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+
+ return 0;
+}
+
+/* Traffic manager shared shaper delete */
+static int
+pmd_tm_shared_shaper_delete(struct rte_eth_dev *dev,
+ uint32_t shared_shaper_id,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_shared_shaper *ss;
+
+ /* Check existing */
+ ss = tm_shared_shaper_search(dev, shared_shaper_id);
+ if (ss == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_SHARED_SHAPER_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Check unused */
+ if (ss->n_users)
+ return -rte_tm_error_set(error,
+ EBUSY,
+ RTE_TM_ERROR_TYPE_SHARED_SHAPER_ID,
+ NULL,
+ rte_strerror(EBUSY));
+
+ /* Remove from list */
+ TAILQ_REMOVE(&p->soft.tm.h.shared_shapers, ss, node);
+ p->soft.tm.h.n_shared_shapers--;
+ free(ss);
+
+ return 0;
+}
+
+static int
+wred_profile_check(struct rte_eth_dev *dev,
+ uint32_t wred_profile_id,
+ struct rte_tm_wred_params *profile,
+ struct rte_tm_error *error)
+{
+ struct tm_wred_profile *wp;
+ enum rte_tm_color color;
+
+ /* WRED profile ID must not be NONE. */
+ if (wred_profile_id == RTE_TM_WRED_PROFILE_ID_NONE)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_WRED_PROFILE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* WRED profile must not exist. */
+ wp = tm_wred_profile_search(dev, wred_profile_id);
+ if (wp)
+ return -rte_tm_error_set(error,
+ EEXIST,
+ RTE_TM_ERROR_TYPE_WRED_PROFILE_ID,
+ NULL,
+ rte_strerror(EEXIST));
+
+ /* Profile must not be NULL. */
+ if (profile == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_WRED_PROFILE,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* min_th <= max_th, max_th > 0 */
+ for (color = RTE_TM_GREEN; color < RTE_TM_COLORS; color++) {
+ uint16_t min_th = profile->red_params[color].min_th;
+ uint16_t max_th = profile->red_params[color].max_th;
+
+ if ((min_th > max_th) || (max_th == 0))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_WRED_PROFILE,
+ NULL,
+ rte_strerror(EINVAL));
+ }
+
+ return 0;
+}
+
+/* Traffic manager WRED profile add */
+static int
+pmd_tm_wred_profile_add(struct rte_eth_dev *dev,
+ uint32_t wred_profile_id,
+ struct rte_tm_wred_params *profile,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_wred_profile_list *wpl = &p->soft.tm.h.wred_profiles;
+ struct tm_wred_profile *wp;
+ int status;
+
+ /* Check input params */
+ status = wred_profile_check(dev, wred_profile_id, profile, error);
+ if (status)
+ return status;
+
+ /* Memory allocation */
+ wp = calloc(1, sizeof(struct tm_wred_profile));
+ if (wp == NULL)
+ return -rte_tm_error_set(error,
+ ENOMEM,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(ENOMEM));
+
+ /* Fill in */
+ wp->wred_profile_id = wred_profile_id;
+ memcpy(&wp->params, profile, sizeof(wp->params));
+
+ /* Add to list */
+ TAILQ_INSERT_TAIL(wpl, wp, node);
+ p->soft.tm.h.n_wred_profiles++;
+
+ return 0;
+}
+
+/* Traffic manager WRED profile delete */
+static int
+pmd_tm_wred_profile_delete(struct rte_eth_dev *dev,
+ uint32_t wred_profile_id,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_wred_profile *wp;
+
+ /* Check existing */
+ wp = tm_wred_profile_search(dev, wred_profile_id);
+ if (wp == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_WRED_PROFILE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Check unused */
+ if (wp->n_users)
+ return -rte_tm_error_set(error,
+ EBUSY,
+ RTE_TM_ERROR_TYPE_WRED_PROFILE_ID,
+ NULL,
+ rte_strerror(EBUSY));
+
+ /* Remove from list */
+ TAILQ_REMOVE(&p->soft.tm.h.wred_profiles, wp, node);
+ p->soft.tm.h.n_wred_profiles--;
+ free(wp);
+
+ return 0;
+}
+
+static int
+node_add_check_port(struct rte_eth_dev *dev,
+ uint32_t node_id,
+ uint32_t parent_node_id __rte_unused,
+ uint32_t priority,
+ uint32_t weight,
+ uint32_t level_id __rte_unused,
+ struct rte_tm_node_params *params,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_shaper_profile *sp = tm_shaper_profile_search(dev,
+ params->shaper_profile_id);
+
+ /* node type: non-leaf */
+ if (node_id < p->params.soft.tm.nb_queues)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Priority must be 0 */
+ if (priority != 0)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PRIORITY,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Weight must be 1 */
+ if (weight != 1)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_WEIGHT,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Shaper must be valid.
+ * Shaper profile peak rate must fit the configured port rate.
+ */
+ if ((params->shaper_profile_id == RTE_TM_SHAPER_PROFILE_ID_NONE) ||
+ (sp == NULL) ||
+ (sp->params.peak.rate > p->params.soft.tm.rate))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_SHAPER_PROFILE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* No shared shapers */
+ if (params->n_shared_shapers != 0)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_N_SHARED_SHAPERS,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Number of SP priorities must be 1 */
+ if (params->nonleaf.n_sp_priorities != 1)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_N_SP_PRIORITIES,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Stats */
+ if (params->stats_mask & (~STATS_MASK_DEFAULT))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_STATS,
+ NULL,
+ rte_strerror(EINVAL));
+
+ return 0;
+}
+
+static int
+node_add_check_subport(struct rte_eth_dev *dev,
+ uint32_t node_id,
+ uint32_t parent_node_id __rte_unused,
+ uint32_t priority,
+ uint32_t weight,
+ uint32_t level_id __rte_unused,
+ struct rte_tm_node_params *params,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ /* node type: non-leaf */
+ if (node_id < p->params.soft.tm.nb_queues)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Priority must be 0 */
+ if (priority != 0)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PRIORITY,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Weight must be 1 */
+ if (weight != 1)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_WEIGHT,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Shaper must be valid */
+ if ((params->shaper_profile_id == RTE_TM_SHAPER_PROFILE_ID_NONE) ||
+ (!tm_shaper_profile_search(dev, params->shaper_profile_id)))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_SHAPER_PROFILE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* No shared shapers */
+ if (params->n_shared_shapers != 0)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_N_SHARED_SHAPERS,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Number of SP priorities must be 1 */
+ if (params->nonleaf.n_sp_priorities != 1)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_N_SP_PRIORITIES,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Stats */
+ if (params->stats_mask & (~STATS_MASK_DEFAULT))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_STATS,
+ NULL,
+ rte_strerror(EINVAL));
+
+ return 0;
+}
+
+static int
+node_add_check_pipe(struct rte_eth_dev *dev,
+ uint32_t node_id,
+ uint32_t parent_node_id __rte_unused,
+ uint32_t priority,
+ uint32_t weight __rte_unused,
+ uint32_t level_id __rte_unused,
+ struct rte_tm_node_params *params,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ /* node type: non-leaf */
+ if (node_id < p->params.soft.tm.nb_queues)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Priority must be 0 */
+ if (priority != 0)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PRIORITY,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Shaper must be valid */
+ if ((params->shaper_profile_id == RTE_TM_SHAPER_PROFILE_ID_NONE) ||
+ (!tm_shaper_profile_search(dev, params->shaper_profile_id)))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_SHAPER_PROFILE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* No shared shapers */
+ if (params->n_shared_shapers != 0)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_N_SHARED_SHAPERS,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Number of SP priorities must be 4 */
+ if (params->nonleaf.n_sp_priorities !=
+ RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_N_SP_PRIORITIES,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* WFQ mode must be byte mode */
+ if ((params->nonleaf.wfq_weight_mode != NULL) &&
+ (params->nonleaf.wfq_weight_mode[0] != 0) &&
+ (params->nonleaf.wfq_weight_mode[1] != 0) &&
+ (params->nonleaf.wfq_weight_mode[2] != 0) &&
+ (params->nonleaf.wfq_weight_mode[3] != 0))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_WFQ_WEIGHT_MODE,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Stats */
+ if (params->stats_mask & (~STATS_MASK_DEFAULT))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_STATS,
+ NULL,
+ rte_strerror(EINVAL));
+
+ return 0;
+}
+
+static int
+node_add_check_tc(struct rte_eth_dev *dev,
+ uint32_t node_id,
+ uint32_t parent_node_id __rte_unused,
+ uint32_t priority __rte_unused,
+ uint32_t weight,
+ uint32_t level_id __rte_unused,
+ struct rte_tm_node_params *params,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ /* node type: non-leaf */
+ if (node_id < p->params.soft.tm.nb_queues)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Weight must be 1 */
+ if (weight != 1)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_WEIGHT,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Shaper must be valid */
+ if ((params->shaper_profile_id == RTE_TM_SHAPER_PROFILE_ID_NONE) ||
+ (!tm_shaper_profile_search(dev, params->shaper_profile_id)))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_SHAPER_PROFILE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Single valid shared shaper */
+ if (params->n_shared_shapers > 1)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_N_SHARED_SHAPERS,
+ NULL,
+ rte_strerror(EINVAL));
+
+ if ((params->n_shared_shapers == 1) &&
+ ((params->shared_shaper_id == NULL) ||
+ (!tm_shared_shaper_search(dev, params->shared_shaper_id[0]))))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_SHARED_SHAPER_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Number of priorities must be 1 */
+ if (params->nonleaf.n_sp_priorities != 1)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_N_SP_PRIORITIES,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Stats */
+ if (params->stats_mask & (~STATS_MASK_DEFAULT))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_STATS,
+ NULL,
+ rte_strerror(EINVAL));
+
+ return 0;
+}
+
+static int
+node_add_check_queue(struct rte_eth_dev *dev,
+ uint32_t node_id,
+ uint32_t parent_node_id __rte_unused,
+ uint32_t priority,
+ uint32_t weight __rte_unused,
+ uint32_t level_id __rte_unused,
+ struct rte_tm_node_params *params,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ /* node type: leaf */
+ if (node_id >= p->params.soft.tm.nb_queues)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Priority must be 0 */
+ if (priority != 0)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PRIORITY,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* No shaper */
+ if (params->shaper_profile_id != RTE_TM_SHAPER_PROFILE_ID_NONE)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_SHAPER_PROFILE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* No shared shapers */
+ if (params->n_shared_shapers != 0)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_N_SHARED_SHAPERS,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Congestion management must not be head drop */
+ if (params->leaf.cman == RTE_TM_CMAN_HEAD_DROP)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_CMAN,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Congestion management set to WRED */
+ if (params->leaf.cman == RTE_TM_CMAN_WRED) {
+ uint32_t wred_profile_id = params->leaf.wred.wred_profile_id;
+ struct tm_wred_profile *wp = tm_wred_profile_search(dev,
+ wred_profile_id);
+
+ /* WRED profile (for private WRED context) must be valid */
+ if ((wred_profile_id == RTE_TM_WRED_PROFILE_ID_NONE) ||
+ (wp == NULL))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_WRED_PROFILE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* No shared WRED contexts */
+ if (params->leaf.wred.n_shared_wred_contexts != 0)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_N_SHARED_WRED_CONTEXTS,
+ NULL,
+ rte_strerror(EINVAL));
+ }
+
+ /* Stats */
+ if (params->stats_mask & (~STATS_MASK_QUEUE))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_STATS,
+ NULL,
+ rte_strerror(EINVAL));
+
+ return 0;
+}
+
+static int
+node_add_check(struct rte_eth_dev *dev,
+ uint32_t node_id,
+ uint32_t parent_node_id,
+ uint32_t priority,
+ uint32_t weight,
+ uint32_t level_id,
+ struct rte_tm_node_params *params,
+ struct rte_tm_error *error)
+{
+ struct tm_node *pn;
+ uint32_t level;
+ int status;
+
+ /* node_id, parent_node_id:
+ * -node_id must not be RTE_TM_NODE_ID_NULL
+ * -node_id must not be in use
+ * -root node add (parent_node_id is RTE_TM_NODE_ID_NULL):
+ * -root node must not exist
+ * -non-root node add (parent_node_id is not RTE_TM_NODE_ID_NULL):
+ * -parent_node_id must be valid
+ */
+ if (node_id == RTE_TM_NODE_ID_NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ if (tm_node_search(dev, node_id))
+ return -rte_tm_error_set(error,
+ EEXIST,
+ RTE_TM_ERROR_TYPE_NODE_ID,
+ NULL,
+ rte_strerror(EEXIST));
+
+ if (parent_node_id == RTE_TM_NODE_ID_NULL) {
+ pn = NULL;
+ if (tm_root_node_present(dev))
+ return -rte_tm_error_set(error,
+ EEXIST,
+ RTE_TM_ERROR_TYPE_NODE_PARENT_NODE_ID,
+ NULL,
+ rte_strerror(EEXIST));
+ } else {
+ pn = tm_node_search(dev, parent_node_id);
+ if (pn == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARENT_NODE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+ }
+
+ /* priority: must be 0 .. 3 */
+ if (priority >= RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PRIORITY,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* weight: must be 1 .. 255 */
+ if ((weight == 0) || (weight >= UINT8_MAX))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_WEIGHT,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* level_id: if valid, then
+ * -root node add (parent_node_id is RTE_TM_NODE_ID_NULL):
+ * -level_id must be zero
+ * -non-root node add (parent_node_id is not RTE_TM_NODE_ID_NULL):
+ * -level_id must be parent level ID plus one
+ */
+ level = (pn == NULL) ? 0 : pn->level + 1;
+ if ((level_id != RTE_TM_NODE_LEVEL_ID_ANY) && (level_id != level))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_LEVEL_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* params: must not be NULL */
+ if (params == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* params: per level checks */
+ switch (level) {
+ case TM_NODE_LEVEL_PORT:
+ status = node_add_check_port(dev, node_id,
+ parent_node_id, priority, weight, level_id,
+ params, error);
+ if (status)
+ return status;
+ break;
+
+ case TM_NODE_LEVEL_SUBPORT:
+ status = node_add_check_subport(dev, node_id,
+ parent_node_id, priority, weight, level_id,
+ params, error);
+ if (status)
+ return status;
+ break;
+
+ case TM_NODE_LEVEL_PIPE:
+ status = node_add_check_pipe(dev, node_id,
+ parent_node_id, priority, weight, level_id,
+ params, error);
+ if (status)
+ return status;
+ break;
+
+ case TM_NODE_LEVEL_TC:
+ status = node_add_check_tc(dev, node_id,
+ parent_node_id, priority, weight, level_id,
+ params, error);
+ if (status)
+ return status;
+ break;
+
+ case TM_NODE_LEVEL_QUEUE:
+ status = node_add_check_queue(dev, node_id,
+ parent_node_id, priority, weight, level_id,
+ params, error);
+ if (status)
+ return status;
+ break;
+
+ default:
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_LEVEL_ID,
+ NULL,
+ rte_strerror(EINVAL));
+ }
+
+ return 0;
+}
+
+/* Traffic manager node add */
+static int
+pmd_tm_node_add(struct rte_eth_dev *dev,
+ uint32_t node_id,
+ uint32_t parent_node_id,
+ uint32_t priority,
+ uint32_t weight,
+ uint32_t level_id,
+ struct rte_tm_node_params *params,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_node_list *nl = &p->soft.tm.h.nodes;
+ struct tm_node *n;
+ uint32_t i;
+ int status;
+
+ /* Checks */
+ if (p->soft.tm.hierarchy_frozen)
+ return -rte_tm_error_set(error,
+ EBUSY,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EBUSY));
+
+ status = node_add_check(dev, node_id, parent_node_id, priority, weight,
+ level_id, params, error);
+ if (status)
+ return status;
+
+ /* Memory allocation */
+ n = calloc(1, sizeof(struct tm_node));
+ if (n == NULL)
+ return -rte_tm_error_set(error,
+ ENOMEM,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(ENOMEM));
+
+ /* Fill in */
+ n->node_id = node_id;
+ n->parent_node_id = parent_node_id;
+ n->priority = priority;
+ n->weight = weight;
+
+ if (parent_node_id != RTE_TM_NODE_ID_NULL) {
+ n->parent_node = tm_node_search(dev, parent_node_id);
+ n->level = n->parent_node->level + 1;
+ }
+
+ if (params->shaper_profile_id != RTE_TM_SHAPER_PROFILE_ID_NONE)
+ n->shaper_profile = tm_shaper_profile_search(dev,
+ params->shaper_profile_id);
+
+ if ((n->level == TM_NODE_LEVEL_QUEUE) &&
+ (params->leaf.cman == RTE_TM_CMAN_WRED))
+ n->wred_profile = tm_wred_profile_search(dev,
+ params->leaf.wred.wred_profile_id);
+
+ memcpy(&n->params, params, sizeof(n->params));
+
+ /* Add to list */
+ TAILQ_INSERT_TAIL(nl, n, node);
+ p->soft.tm.h.n_nodes++;
+
+ /* Update dependencies */
+ if (n->parent_node)
+ n->parent_node->n_children++;
+
+ if (n->shaper_profile)
+ n->shaper_profile->n_users++;
+
+ for (i = 0; i < params->n_shared_shapers; i++) {
+ struct tm_shared_shaper *ss;
+
+ ss = tm_shared_shaper_search(dev, params->shared_shaper_id[i]);
+ ss->n_users++;
+ }
+
+ if (n->wred_profile)
+ n->wred_profile->n_users++;
+
+ p->soft.tm.h.n_tm_nodes[n->level]++;
+
+ return 0;
+}
+
+/* Traffic manager node delete */
+static int
+pmd_tm_node_delete(struct rte_eth_dev *dev,
+ uint32_t node_id,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_node *n;
+ uint32_t i;
+
+ /* Check hierarchy changes are currently allowed */
+ if (p->soft.tm.hierarchy_frozen)
+ return -rte_tm_error_set(error,
+ EBUSY,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EBUSY));
+
+ /* Check existing */
+ n = tm_node_search(dev, node_id);
+ if (n == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Check unused */
+ if (n->n_children)
+ return -rte_tm_error_set(error,
+ EBUSY,
+ RTE_TM_ERROR_TYPE_NODE_ID,
+ NULL,
+ rte_strerror(EBUSY));
+
+ /* Update dependencies */
+ p->soft.tm.h.n_tm_nodes[n->level]--;
+
+ if (n->wred_profile)
+ n->wred_profile->n_users--;
+
+ for (i = 0; i < n->params.n_shared_shapers; i++) {
+ struct tm_shared_shaper *ss;
+
+ ss = tm_shared_shaper_search(dev,
+ n->params.shared_shaper_id[i]);
+ ss->n_users--;
+ }
+
+ if (n->shaper_profile)
+ n->shaper_profile->n_users--;
+
+ if (n->parent_node)
+ n->parent_node->n_children--;
+
+ /* Remove from list */
+ TAILQ_REMOVE(&p->soft.tm.h.nodes, n, node);
+ p->soft.tm.h.n_nodes--;
+ free(n);
+
+ return 0;
+}
+
+
+static void
+pipe_profile_build(struct rte_eth_dev *dev,
+ struct tm_node *np,
+ struct rte_sched_pipe_params *pp)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_hierarchy *h = &p->soft.tm.h;
+ struct tm_node_list *nl = &h->nodes;
+ struct tm_node *nt, *nq;
+
+ memset(pp, 0, sizeof(*pp));
+
+ /* Pipe */
+ pp->tb_rate = np->shaper_profile->params.peak.rate;
+ pp->tb_size = np->shaper_profile->params.peak.size;
+
+ /* Traffic Class (TC) */
+ pp->tc_period = 40;
+
+#ifdef RTE_SCHED_SUBPORT_TC_OV
+ pp->tc_ov_weight = np->weight;
+#endif
+
+ TAILQ_FOREACH(nt, nl, node) {
+ uint32_t queue_id = 0;
+
+ if ((nt->level != TM_NODE_LEVEL_TC) ||
+ (nt->parent_node_id != np->node_id))
+ continue;
+
+ pp->tc_rate[nt->priority] =
+ nt->shaper_profile->params.peak.rate;
+
+ /* Queue */
+ TAILQ_FOREACH(nq, nl, node) {
+ uint32_t pipe_queue_id;
+
+ if ((nq->level != TM_NODE_LEVEL_QUEUE) ||
+ (nq->parent_node_id != nt->node_id))
+ continue;
+
+ pipe_queue_id = nt->priority *
+ RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS + queue_id;
+ pp->wrr_weights[pipe_queue_id] = nq->weight;
+
+ queue_id++;
+ }
+ }
+}
+
+static int
+pipe_profile_free_exists(struct rte_eth_dev *dev,
+ uint32_t *pipe_profile_id)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_params *t = &p->soft.tm.params;
+
+ if (t->n_pipe_profiles < RTE_SCHED_PIPE_PROFILES_PER_PORT) {
+ *pipe_profile_id = t->n_pipe_profiles;
+ return 1;
+ }
+
+ return 0;
+}
+
+static int
+pipe_profile_exists(struct rte_eth_dev *dev,
+ struct rte_sched_pipe_params *pp,
+ uint32_t *pipe_profile_id)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_params *t = &p->soft.tm.params;
+ uint32_t i;
+
+ for (i = 0; i < t->n_pipe_profiles; i++)
+ if (memcmp(&t->pipe_profiles[i], pp, sizeof(*pp)) == 0) {
+ if (pipe_profile_id)
+ *pipe_profile_id = i;
+ return 1;
+ }
+
+ return 0;
+}
+
+static void
+pipe_profile_install(struct rte_eth_dev *dev,
+ struct rte_sched_pipe_params *pp,
+ uint32_t pipe_profile_id)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_params *t = &p->soft.tm.params;
+
+ memcpy(&t->pipe_profiles[pipe_profile_id], pp, sizeof(*pp));
+ t->n_pipe_profiles++;
+}
+
+static void
+pipe_profile_mark(struct rte_eth_dev *dev,
+ uint32_t subport_id,
+ uint32_t pipe_id,
+ uint32_t pipe_profile_id)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_hierarchy *h = &p->soft.tm.h;
+ struct tm_params *t = &p->soft.tm.params;
+ uint32_t n_pipes_per_subport, pos;
+
+ n_pipes_per_subport = h->n_tm_nodes[TM_NODE_LEVEL_PIPE] /
+ h->n_tm_nodes[TM_NODE_LEVEL_SUBPORT];
+ pos = subport_id * n_pipes_per_subport + pipe_id;
+
+ t->pipe_to_profile[pos] = pipe_profile_id;
+}
+
+static struct rte_sched_pipe_params *
+pipe_profile_get(struct rte_eth_dev *dev, struct tm_node *np)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_hierarchy *h = &p->soft.tm.h;
+ struct tm_params *t = &p->soft.tm.params;
+ uint32_t n_pipes_per_subport = h->n_tm_nodes[TM_NODE_LEVEL_PIPE] /
+ h->n_tm_nodes[TM_NODE_LEVEL_SUBPORT];
+
+ uint32_t subport_id = tm_node_subport_id(dev, np->parent_node);
+ uint32_t pipe_id = tm_node_pipe_id(dev, np);
+
+ uint32_t pos = subport_id * n_pipes_per_subport + pipe_id;
+ uint32_t pipe_profile_id = t->pipe_to_profile[pos];
+
+ return &t->pipe_profiles[pipe_profile_id];
+}
+
+static int
+pipe_profiles_generate(struct rte_eth_dev *dev)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_hierarchy *h = &p->soft.tm.h;
+ struct tm_node_list *nl = &h->nodes;
+ struct tm_node *ns, *np;
+ uint32_t subport_id;
+
+ /* Objective: Fill in the following fields in struct tm_params:
+ * - pipe_profiles
+ * - n_pipe_profiles
+ * - pipe_to_profile
+ */
+
+ subport_id = 0;
+ TAILQ_FOREACH(ns, nl, node) {
+ uint32_t pipe_id;
+
+ if (ns->level != TM_NODE_LEVEL_SUBPORT)
+ continue;
+
+ pipe_id = 0;
+ TAILQ_FOREACH(np, nl, node) {
+ struct rte_sched_pipe_params pp;
+ uint32_t pos;
+
+ if ((np->level != TM_NODE_LEVEL_PIPE) ||
+ (np->parent_node_id != ns->node_id))
+ continue;
+
+ pipe_profile_build(dev, np, &pp);
+
+ if (!pipe_profile_exists(dev, &pp, &pos)) {
+ if (!pipe_profile_free_exists(dev, &pos))
+ return -1;
+
+ pipe_profile_install(dev, &pp, pos);
+ }
+
+ pipe_profile_mark(dev, subport_id, pipe_id, pos);
+
+ pipe_id++;
+ }
+
+ subport_id++;
+ }
+
+ return 0;
+}
+
+static struct tm_wred_profile *
+tm_tc_wred_profile_get(struct rte_eth_dev *dev, uint32_t tc_id)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_hierarchy *h = &p->soft.tm.h;
+ struct tm_node_list *nl = &h->nodes;
+ struct tm_node *nq;
+
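+	/* Return the WRED profile of the first queue found under a TC with this priority. */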
+ TAILQ_FOREACH(nq, nl, node) {
+ if ((nq->level != TM_NODE_LEVEL_QUEUE) ||
+ (nq->parent_node->priority != tc_id))
+ continue;
+
+ return nq->wred_profile;
+ }
+
+ return NULL;
+}
+
+#ifdef RTE_SCHED_RED
+
+static void
+wred_profiles_set(struct rte_eth_dev *dev)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct rte_sched_port_params *pp = &p->soft.tm.params.port_params;
+ uint32_t tc_id;
+ enum rte_tm_color color;
+
+ for (tc_id = 0; tc_id < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; tc_id++)
+ for (color = RTE_TM_GREEN; color < RTE_TM_COLORS; color++) {
+ struct rte_red_params *dst =
+ &pp->red_params[tc_id][color];
+ struct tm_wred_profile *src_wp =
+ tm_tc_wred_profile_get(dev, tc_id);
+ struct rte_tm_red_params *src =
+ &src_wp->params.red_params[color];
+
+ memcpy(dst, src, sizeof(*dst));
+ }
+}
+
+#else
+
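+/* RED support not compiled into librte_sched, so WRED configuration is a no-op. */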
+#define wred_profiles_set(dev)
+
+#endif
+
+static struct tm_shared_shaper *
+tm_tc_shared_shaper_get(struct rte_eth_dev *dev, struct tm_node *tc_node)
+{
+ return (tc_node->params.n_shared_shapers) ?
+ tm_shared_shaper_search(dev,
+ tc_node->params.shared_shaper_id[0]) :
+ NULL;
+}
+
+static struct tm_shared_shaper *
+tm_subport_tc_shared_shaper_get(struct rte_eth_dev *dev,
+ struct tm_node *subport_node,
+ uint32_t tc_id)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_node_list *nl = &p->soft.tm.h.nodes;
+ struct tm_node *n;
+
+ TAILQ_FOREACH(n, nl, node) {
+ if ((n->level != TM_NODE_LEVEL_TC) ||
+ (n->parent_node->parent_node_id !=
+ subport_node->node_id) ||
+ (n->priority != tc_id))
+ continue;
+
+ return tm_tc_shared_shaper_get(dev, n);
+ }
+
+ return NULL;
+}
+
+static int
+hierarchy_commit_check(struct rte_eth_dev *dev, struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_hierarchy *h = &p->soft.tm.h;
+ struct tm_node_list *nl = &h->nodes;
+ struct tm_shared_shaper_list *ssl = &h->shared_shapers;
+ struct tm_wred_profile_list *wpl = &h->wred_profiles;
+ struct tm_node *nr = tm_root_node_present(dev), *ns, *np, *nt, *nq;
+ struct tm_shared_shaper *ss;
+
+ uint32_t n_pipes_per_subport;
+
+ /* Root node exists. */
+ if (nr == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_LEVEL_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* There is at least one subport, max is not exceeded. */
+ if ((nr->n_children == 0) || (nr->n_children > TM_MAX_SUBPORTS))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_LEVEL_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* There is at least one pipe. */
+ if (h->n_tm_nodes[TM_NODE_LEVEL_PIPE] == 0)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_LEVEL_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Number of pipes is the same for all subports. Maximum number of pipes
+ * per subport is not exceeded.
+ */
+ n_pipes_per_subport = h->n_tm_nodes[TM_NODE_LEVEL_PIPE] /
+ h->n_tm_nodes[TM_NODE_LEVEL_SUBPORT];
+
+ if (n_pipes_per_subport > TM_MAX_PIPES_PER_SUBPORT)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+
+ TAILQ_FOREACH(ns, nl, node) {
+ if (ns->level != TM_NODE_LEVEL_SUBPORT)
+ continue;
+
+ if (ns->n_children != n_pipes_per_subport)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ }
+
+ /* Each pipe has exactly 4 TCs, with exactly one TC for each priority */
+ TAILQ_FOREACH(np, nl, node) {
+ uint32_t mask = 0, mask_expected =
+ RTE_LEN2MASK(RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE,
+ uint32_t);
+
+ if (np->level != TM_NODE_LEVEL_PIPE)
+ continue;
+
+ if (np->n_children != RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+
+ TAILQ_FOREACH(nt, nl, node) {
+ if ((nt->level != TM_NODE_LEVEL_TC) ||
+ (nt->parent_node_id != np->node_id))
+ continue;
+
+ mask |= 1 << nt->priority;
+ }
+
+ if (mask != mask_expected)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ }
+
+ /* Each TC has exactly 4 packet queues. */
+ TAILQ_FOREACH(nt, nl, node) {
+ if (nt->level != TM_NODE_LEVEL_TC)
+ continue;
+
+ if (nt->n_children != RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ }
+
+ /**
+ * Shared shapers:
+ * -For each TC #i, all pipes in the same subport use the same
+ * shared shaper (or no shared shaper) for their TC#i.
+ * -Each shared shaper needs to have at least one user. All its
+ * users have to be TC nodes with the same priority and the same
+ * subport.
+ */
+ TAILQ_FOREACH(ns, nl, node) {
+ struct tm_shared_shaper *s[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
+ uint32_t id;
+
+ if (ns->level != TM_NODE_LEVEL_SUBPORT)
+ continue;
+
+ for (id = 0; id < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; id++)
+ s[id] = tm_subport_tc_shared_shaper_get(dev, ns, id);
+
+ TAILQ_FOREACH(nt, nl, node) {
+ struct tm_shared_shaper *subport_ss, *tc_ss;
+
+ if ((nt->level != TM_NODE_LEVEL_TC) ||
+ (nt->parent_node->parent_node_id !=
+ ns->node_id))
+ continue;
+
+ subport_ss = s[nt->priority];
+ tc_ss = tm_tc_shared_shaper_get(dev, nt);
+
+ if ((subport_ss == NULL) && (tc_ss == NULL))
+ continue;
+
+ if (((subport_ss == NULL) && (tc_ss != NULL)) ||
+ ((subport_ss != NULL) && (tc_ss == NULL)) ||
+ (subport_ss->shared_shaper_id !=
+ tc_ss->shared_shaper_id))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ }
+ }
+
+ TAILQ_FOREACH(ss, ssl, node) {
+ struct tm_node *nt_any = tm_shared_shaper_get_tc(dev, ss);
+ uint32_t n_users = 0;
+
+ if (nt_any != NULL)
+ TAILQ_FOREACH(nt, nl, node) {
+ if ((nt->level != TM_NODE_LEVEL_TC) ||
+ (nt->priority != nt_any->priority) ||
+ (nt->parent_node->parent_node_id !=
+ nt_any->parent_node->parent_node_id))
+ continue;
+
+ n_users++;
+ }
+
+ if ((ss->n_users == 0) || (ss->n_users != n_users))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ }
+
+ /* Not too many pipe profiles. */
+ if (pipe_profiles_generate(dev))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /**
+ * WRED (when used, i.e. at least one WRED profile defined):
+ * -Each WRED profile must have at least one user.
+ * -All leaf nodes must have their private WRED context enabled.
+ * -For each TC #i, all leaf nodes must use the same WRED profile
+ * for their private WRED context.
+ */
+ if (h->n_wred_profiles) {
+ struct tm_wred_profile *wp;
+ struct tm_wred_profile *w[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
+ uint32_t id;
+
+ TAILQ_FOREACH(wp, wpl, node)
+ if (wp->n_users == 0)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+
+ for (id = 0; id < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; id++) {
+ w[id] = tm_tc_wred_profile_get(dev, id);
+
+ if (w[id] == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ }
+
+ TAILQ_FOREACH(nq, nl, node) {
+ uint32_t id;
+
+ if (nq->level != TM_NODE_LEVEL_QUEUE)
+ continue;
+
+ id = nq->parent_node->priority;
+
+ if ((nq->wred_profile == NULL) ||
+ (nq->wred_profile->wred_profile_id !=
+ w[id]->wred_profile_id))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ }
+ }
+
+ return 0;
+}
+
+static void
+hierarchy_blueprints_create(struct rte_eth_dev *dev)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_params *t = &p->soft.tm.params;
+ struct tm_hierarchy *h = &p->soft.tm.h;
+
+ struct tm_node_list *nl = &h->nodes;
+ struct tm_node *root = tm_root_node_present(dev), *n;
+
+ uint32_t subport_id;
+
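+	/* Translate the validated TM hierarchy into the librte_sched port configuration. */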
+ t->port_params = (struct rte_sched_port_params) {
+ .name = dev->data->name,
+ .socket = dev->data->numa_node,
+ .rate = root->shaper_profile->params.peak.rate,
+ .mtu = dev->data->mtu,
+ .frame_overhead =
+ root->shaper_profile->params.pkt_length_adjust,
+ .n_subports_per_port = root->n_children,
+ .n_pipes_per_subport = h->n_tm_nodes[TM_NODE_LEVEL_PIPE] /
+ h->n_tm_nodes[TM_NODE_LEVEL_SUBPORT],
+ .qsize = {p->params.soft.tm.qsize[0],
+ p->params.soft.tm.qsize[1],
+ p->params.soft.tm.qsize[2],
+ p->params.soft.tm.qsize[3],
+ },
+ .pipe_profiles = t->pipe_profiles,
+ .n_pipe_profiles = t->n_pipe_profiles,
+ };
+
+ wred_profiles_set(dev);
+
+ subport_id = 0;
+ TAILQ_FOREACH(n, nl, node) {
+ uint64_t tc_rate[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
+ uint32_t i;
+
+ if (n->level != TM_NODE_LEVEL_SUBPORT)
+ continue;
+
+ for (i = 0; i < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; i++) {
+ struct tm_shared_shaper *ss;
+ struct tm_shaper_profile *sp;
+
+ ss = tm_subport_tc_shared_shaper_get(dev, n, i);
+ sp = (ss) ? tm_shaper_profile_search(dev,
+ ss->shaper_profile_id) :
+ n->shaper_profile;
+ tc_rate[i] = sp->params.peak.rate;
+ }
+
+ t->subport_params[subport_id] =
+ (struct rte_sched_subport_params) {
+ .tb_rate = n->shaper_profile->params.peak.rate,
+ .tb_size = n->shaper_profile->params.peak.size,
+
+ .tc_rate = {tc_rate[0],
+ tc_rate[1],
+ tc_rate[2],
+ tc_rate[3],
+ },
+ .tc_period = 10,
+ };
+
+ subport_id++;
+ }
+}
+
+/* Traffic manager hierarchy commit */
+static int
+pmd_tm_hierarchy_commit(struct rte_eth_dev *dev,
+ int clear_on_fail,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ int status;
+
+ /* Checks */
+ if (p->soft.tm.hierarchy_frozen)
+ return -rte_tm_error_set(error,
+ EBUSY,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EBUSY));
+
+ status = hierarchy_commit_check(dev, error);
+ if (status) {
+ if (clear_on_fail) {
+ tm_hierarchy_uninit(p);
+ tm_hierarchy_init(p);
+ }
+
+ return status;
+ }
+
+ /* Create blueprints */
+ hierarchy_blueprints_create(dev);
+
+ /* Freeze hierarchy */
+ p->soft.tm.hierarchy_frozen = 1;
+
+ return 0;
+}
+
+#ifdef RTE_SCHED_SUBPORT_TC_OV
+
+static int
+update_pipe_weight(struct rte_eth_dev *dev, struct tm_node *np, uint32_t weight)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ uint32_t pipe_id = tm_node_pipe_id(dev, np);
+
+ struct tm_node *ns = np->parent_node;
+ uint32_t subport_id = tm_node_subport_id(dev, ns);
+
+ struct rte_sched_pipe_params *profile0 = pipe_profile_get(dev, np);
+ struct rte_sched_pipe_params profile1;
+ uint32_t pipe_profile_id;
+
+ /* Derive new pipe profile. */
+ memcpy(&profile1, profile0, sizeof(profile1));
+ profile1.tc_ov_weight = (uint8_t)weight;
+
+ /* Since implementation does not allow adding more pipe profiles after
+ * port configuration, the pipe configuration can be successfully
+ * updated only if the new profile is also part of the existing set of
+ * pipe profiles.
+ */
+ if (pipe_profile_exists(dev, &profile1, &pipe_profile_id) == 0)
+ return -1;
+
+ /* Update the pipe profile used by the current pipe. */
+ if (rte_sched_pipe_config(p->soft.tm.sched, subport_id, pipe_id,
+ (int32_t)pipe_profile_id))
+ return -1;
+
+ /* Commit changes. */
+ pipe_profile_mark(dev, subport_id, pipe_id, pipe_profile_id);
+ np->weight = weight;
+
+ return 0;
+}
+
+#endif
+
+static int
+update_queue_weight(struct rte_eth_dev *dev,
+ struct tm_node *nq, uint32_t weight)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ uint32_t queue_id = tm_node_queue_id(dev, nq);
+
+ struct tm_node *nt = nq->parent_node;
+ uint32_t tc_id = tm_node_tc_id(dev, nt);
+
+ struct tm_node *np = nt->parent_node;
+ uint32_t pipe_id = tm_node_pipe_id(dev, np);
+
+ struct tm_node *ns = np->parent_node;
+ uint32_t subport_id = tm_node_subport_id(dev, ns);
+
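+	/* Position of this queue within the pipe's WRR weight array. */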
+ uint32_t pipe_queue_id =
+ tc_id * RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS + queue_id;
+
+ struct rte_sched_pipe_params *profile0 = pipe_profile_get(dev, np);
+ struct rte_sched_pipe_params profile1;
+ uint32_t pipe_profile_id;
+
+ /* Derive new pipe profile. */
+ memcpy(&profile1, profile0, sizeof(profile1));
+ profile1.wrr_weights[pipe_queue_id] = (uint8_t)weight;
+
+ /* Since implementation does not allow adding more pipe profiles after
+ * port configuration, the pipe configuration can be successfully
+ * updated only if the new profile is also part of the existing set
+ * of pipe profiles.
+ */
+ if (pipe_profile_exists(dev, &profile1, &pipe_profile_id) == 0)
+ return -1;
+
+ /* Update the pipe profile used by the current pipe. */
+ if (rte_sched_pipe_config(p->soft.tm.sched, subport_id, pipe_id,
+ (int32_t)pipe_profile_id))
+ return -1;
+
+ /* Commit changes. */
+ pipe_profile_mark(dev, subport_id, pipe_id, pipe_profile_id);
+ nq->weight = weight;
+
+ return 0;
+}
+
+/* Traffic manager node parent update */
+static int
+pmd_tm_node_parent_update(struct rte_eth_dev *dev,
+ uint32_t node_id,
+ uint32_t parent_node_id,
+ uint32_t priority,
+ uint32_t weight,
+ struct rte_tm_error *error)
+{
+ struct tm_node *n;
+
+ /* Port must be started and TM used. */
+ if ((dev->data->dev_started == 0) && (tm_used(dev) == 0))
+ return -rte_tm_error_set(error,
+ EBUSY,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EBUSY));
+
+ /* Node must be valid */
+ n = tm_node_search(dev, node_id);
+ if (n == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Parent node must be the same */
+ if (n->parent_node_id != parent_node_id)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARENT_NODE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Priority must be the same */
+ if (n->priority != priority)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PRIORITY,
+ NULL,
+ rte_strerror(EINVAL));
+
+	/* weight: must be non-zero and less than 255 */
+	if ((weight == 0) || (weight >= UINT8_MAX))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_WEIGHT,
+ NULL,
+ rte_strerror(EINVAL));
+
+ switch (n->level) {
+ case TM_NODE_LEVEL_PORT:
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_WEIGHT,
+ NULL,
+ rte_strerror(EINVAL));
+ /* fall-through */
+ case TM_NODE_LEVEL_SUBPORT:
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_WEIGHT,
+ NULL,
+ rte_strerror(EINVAL));
+ /* fall-through */
+ case TM_NODE_LEVEL_PIPE:
+#ifdef RTE_SCHED_SUBPORT_TC_OV
+ if (update_pipe_weight(dev, n, weight))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ return 0;
+#else
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_WEIGHT,
+ NULL,
+ rte_strerror(EINVAL));
+#endif
+ /* fall-through */
+ case TM_NODE_LEVEL_TC:
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_WEIGHT,
+ NULL,
+ rte_strerror(EINVAL));
+ /* fall-through */
+ case TM_NODE_LEVEL_QUEUE:
+ /* fall-through */
+ default:
+ if (update_queue_weight(dev, n, weight))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ return 0;
+ }
+}
+
+static int
+update_subport_rate(struct rte_eth_dev *dev,
+ struct tm_node *ns,
+ struct tm_shaper_profile *sp)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ uint32_t subport_id = tm_node_subport_id(dev, ns);
+
+ struct rte_sched_subport_params subport_params;
+
+ /* Derive new subport configuration. */
+ memcpy(&subport_params,
+ &p->soft.tm.params.subport_params[subport_id],
+ sizeof(subport_params));
+ subport_params.tb_rate = sp->params.peak.rate;
+ subport_params.tb_size = sp->params.peak.size;
+
+ /* Update the subport configuration. */
+ if (rte_sched_subport_config(p->soft.tm.sched, subport_id,
+ &subport_params))
+ return -1;
+
+ /* Commit changes. */
+ ns->shaper_profile->n_users--;
+
+ ns->shaper_profile = sp;
+ ns->params.shaper_profile_id = sp->shaper_profile_id;
+ sp->n_users++;
+
+ memcpy(&p->soft.tm.params.subport_params[subport_id],
+ &subport_params,
+ sizeof(subport_params));
+
+ return 0;
+}
+
+static int
+update_pipe_rate(struct rte_eth_dev *dev,
+ struct tm_node *np,
+ struct tm_shaper_profile *sp)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ uint32_t pipe_id = tm_node_pipe_id(dev, np);
+
+ struct tm_node *ns = np->parent_node;
+ uint32_t subport_id = tm_node_subport_id(dev, ns);
+
+ struct rte_sched_pipe_params *profile0 = pipe_profile_get(dev, np);
+ struct rte_sched_pipe_params profile1;
+ uint32_t pipe_profile_id;
+
+ /* Derive new pipe profile. */
+ memcpy(&profile1, profile0, sizeof(profile1));
+ profile1.tb_rate = sp->params.peak.rate;
+ profile1.tb_size = sp->params.peak.size;
+
+ /* Since implementation does not allow adding more pipe profiles after
+ * port configuration, the pipe configuration can be successfully
+ * updated only if the new profile is also part of the existing set of
+ * pipe profiles.
+ */
+ if (pipe_profile_exists(dev, &profile1, &pipe_profile_id) == 0)
+ return -1;
+
+ /* Update the pipe profile used by the current pipe. */
+ if (rte_sched_pipe_config(p->soft.tm.sched, subport_id, pipe_id,
+ (int32_t)pipe_profile_id))
+ return -1;
+
+ /* Commit changes. */
+ pipe_profile_mark(dev, subport_id, pipe_id, pipe_profile_id);
+ np->shaper_profile->n_users--;
+ np->shaper_profile = sp;
+ np->params.shaper_profile_id = sp->shaper_profile_id;
+ sp->n_users++;
+
+ return 0;
+}
+
+static int
+update_tc_rate(struct rte_eth_dev *dev,
+ struct tm_node *nt,
+ struct tm_shaper_profile *sp)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ uint32_t tc_id = tm_node_tc_id(dev, nt);
+
+ struct tm_node *np = nt->parent_node;
+ uint32_t pipe_id = tm_node_pipe_id(dev, np);
+
+ struct tm_node *ns = np->parent_node;
+ uint32_t subport_id = tm_node_subport_id(dev, ns);
+
+ struct rte_sched_pipe_params *profile0 = pipe_profile_get(dev, np);
+ struct rte_sched_pipe_params profile1;
+ uint32_t pipe_profile_id;
+
+ /* Derive new pipe profile. */
+ memcpy(&profile1, profile0, sizeof(profile1));
+ profile1.tc_rate[tc_id] = sp->params.peak.rate;
+
+ /* Since implementation does not allow adding more pipe profiles after
+ * port configuration, the pipe configuration can be successfully
+ * updated only if the new profile is also part of the existing set of
+ * pipe profiles.
+ */
+ if (pipe_profile_exists(dev, &profile1, &pipe_profile_id) == 0)
+ return -1;
+
+ /* Update the pipe profile used by the current pipe. */
+ if (rte_sched_pipe_config(p->soft.tm.sched, subport_id, pipe_id,
+ (int32_t)pipe_profile_id))
+ return -1;
+
+ /* Commit changes. */
+ pipe_profile_mark(dev, subport_id, pipe_id, pipe_profile_id);
+ nt->shaper_profile->n_users--;
+ nt->shaper_profile = sp;
+ nt->params.shaper_profile_id = sp->shaper_profile_id;
+ sp->n_users++;
+
+ return 0;
+}
+
+/* Traffic manager node shaper update */
+static int
+pmd_tm_node_shaper_update(struct rte_eth_dev *dev,
+ uint32_t node_id,
+ uint32_t shaper_profile_id,
+ struct rte_tm_error *error)
+{
+ struct tm_node *n;
+ struct tm_shaper_profile *sp;
+
+ /* Port must be started and TM used. */
+ if ((dev->data->dev_started == 0) && (tm_used(dev) == 0))
+ return -rte_tm_error_set(error,
+ EBUSY,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EBUSY));
+
+ /* Node must be valid */
+ n = tm_node_search(dev, node_id);
+ if (n == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Shaper profile must be valid. */
+ sp = tm_shaper_profile_search(dev, shaper_profile_id);
+ if (sp == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_SHAPER_PROFILE,
+ NULL,
+ rte_strerror(EINVAL));
+
+ switch (n->level) {
+ case TM_NODE_LEVEL_PORT:
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ /* fall-through */
+ case TM_NODE_LEVEL_SUBPORT:
+ if (update_subport_rate(dev, n, sp))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ return 0;
+ /* fall-through */
+ case TM_NODE_LEVEL_PIPE:
+ if (update_pipe_rate(dev, n, sp))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ return 0;
+ /* fall-through */
+ case TM_NODE_LEVEL_TC:
+ if (update_tc_rate(dev, n, sp))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ return 0;
+ /* fall-through */
+ case TM_NODE_LEVEL_QUEUE:
+ /* fall-through */
+ default:
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ }
+}
+
+static inline uint32_t
+tm_port_queue_id(struct rte_eth_dev *dev,
+ uint32_t port_subport_id,
+ uint32_t subport_pipe_id,
+ uint32_t pipe_tc_id,
+ uint32_t tc_queue_id)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_hierarchy *h = &p->soft.tm.h;
+ uint32_t n_pipes_per_subport = h->n_tm_nodes[TM_NODE_LEVEL_PIPE] /
+ h->n_tm_nodes[TM_NODE_LEVEL_SUBPORT];
+
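+	/* Flatten (subport, pipe, tc, queue) into the linear queue id used by librte_sched. */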
+ uint32_t port_pipe_id =
+ port_subport_id * n_pipes_per_subport + subport_pipe_id;
+ uint32_t port_tc_id =
+ port_pipe_id * RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE + pipe_tc_id;
+ uint32_t port_queue_id =
+ port_tc_id * RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS + tc_queue_id;
+
+ return port_queue_id;
+}
+
+static int
+read_port_stats(struct rte_eth_dev *dev,
+ struct tm_node *nr,
+ struct rte_tm_node_stats *stats,
+ uint64_t *stats_mask,
+ int clear)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_hierarchy *h = &p->soft.tm.h;
+ uint32_t n_subports_per_port = h->n_tm_nodes[TM_NODE_LEVEL_SUBPORT];
+ uint32_t subport_id;
+
+ for (subport_id = 0; subport_id < n_subports_per_port; subport_id++) {
+ struct rte_sched_subport_stats s;
+ uint32_t tc_ov, id;
+
+ /* Stats read */
+ int status = rte_sched_subport_read_stats(
+ p->soft.tm.sched,
+ subport_id,
+ &s,
+ &tc_ov);
+ if (status)
+ return status;
+
+ /* Stats accumulate */
+ for (id = 0; id < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; id++) {
+ nr->stats.n_pkts +=
+ s.n_pkts_tc[id] - s.n_pkts_tc_dropped[id];
+ nr->stats.n_bytes +=
+ s.n_bytes_tc[id] - s.n_bytes_tc_dropped[id];
+ nr->stats.leaf.n_pkts_dropped[RTE_TM_GREEN] +=
+ s.n_pkts_tc_dropped[id];
+ nr->stats.leaf.n_bytes_dropped[RTE_TM_GREEN] +=
+ s.n_bytes_tc_dropped[id];
+ }
+ }
+
+ /* Stats copy */
+ if (stats)
+ memcpy(stats, &nr->stats, sizeof(*stats));
+
+ if (stats_mask)
+ *stats_mask = STATS_MASK_DEFAULT;
+
+ /* Stats clear */
+ if (clear)
+ memset(&nr->stats, 0, sizeof(nr->stats));
+
+ return 0;
+}
+
+static int
+read_subport_stats(struct rte_eth_dev *dev,
+ struct tm_node *ns,
+ struct rte_tm_node_stats *stats,
+ uint64_t *stats_mask,
+ int clear)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ uint32_t subport_id = tm_node_subport_id(dev, ns);
+ struct rte_sched_subport_stats s;
+ uint32_t tc_ov, tc_id;
+
+ /* Stats read */
+ int status = rte_sched_subport_read_stats(
+ p->soft.tm.sched,
+ subport_id,
+ &s,
+ &tc_ov);
+ if (status)
+ return status;
+
+ /* Stats accumulate */
+ for (tc_id = 0; tc_id < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; tc_id++) {
+ ns->stats.n_pkts +=
+ s.n_pkts_tc[tc_id] - s.n_pkts_tc_dropped[tc_id];
+ ns->stats.n_bytes +=
+ s.n_bytes_tc[tc_id] - s.n_bytes_tc_dropped[tc_id];
+ ns->stats.leaf.n_pkts_dropped[RTE_TM_GREEN] +=
+ s.n_pkts_tc_dropped[tc_id];
+ ns->stats.leaf.n_bytes_dropped[RTE_TM_GREEN] +=
+ s.n_bytes_tc_dropped[tc_id];
+ }
+
+ /* Stats copy */
+ if (stats)
+ memcpy(stats, &ns->stats, sizeof(*stats));
+
+ if (stats_mask)
+ *stats_mask = STATS_MASK_DEFAULT;
+
+ /* Stats clear */
+ if (clear)
+ memset(&ns->stats, 0, sizeof(ns->stats));
+
+ return 0;
+}
+
+static int
+read_pipe_stats(struct rte_eth_dev *dev,
+ struct tm_node *np,
+ struct rte_tm_node_stats *stats,
+ uint64_t *stats_mask,
+ int clear)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ uint32_t pipe_id = tm_node_pipe_id(dev, np);
+
+ struct tm_node *ns = np->parent_node;
+ uint32_t subport_id = tm_node_subport_id(dev, ns);
+
+ uint32_t i;
+
+ /* Stats read */
+ for (i = 0; i < RTE_SCHED_QUEUES_PER_PIPE; i++) {
+ struct rte_sched_queue_stats s;
+ uint16_t qlen;
+
+ uint32_t qid = tm_port_queue_id(dev,
+ subport_id,
+ pipe_id,
+ i / RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS,
+ i % RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS);
+
+ int status = rte_sched_queue_read_stats(
+ p->soft.tm.sched,
+ qid,
+ &s,
+ &qlen);
+ if (status)
+ return status;
+
+ /* Stats accumulate */
+ np->stats.n_pkts += s.n_pkts - s.n_pkts_dropped;
+ np->stats.n_bytes += s.n_bytes - s.n_bytes_dropped;
+ np->stats.leaf.n_pkts_dropped[RTE_TM_GREEN] += s.n_pkts_dropped;
+ np->stats.leaf.n_bytes_dropped[RTE_TM_GREEN] +=
+ s.n_bytes_dropped;
+ np->stats.leaf.n_pkts_queued = qlen;
+ }
+
+ /* Stats copy */
+ if (stats)
+ memcpy(stats, &np->stats, sizeof(*stats));
+
+ if (stats_mask)
+ *stats_mask = STATS_MASK_DEFAULT;
+
+ /* Stats clear */
+ if (clear)
+ memset(&np->stats, 0, sizeof(np->stats));
+
+ return 0;
+}
+
+static int
+read_tc_stats(struct rte_eth_dev *dev,
+ struct tm_node *nt,
+ struct rte_tm_node_stats *stats,
+ uint64_t *stats_mask,
+ int clear)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ uint32_t tc_id = tm_node_tc_id(dev, nt);
+
+ struct tm_node *np = nt->parent_node;
+ uint32_t pipe_id = tm_node_pipe_id(dev, np);
+
+ struct tm_node *ns = np->parent_node;
+ uint32_t subport_id = tm_node_subport_id(dev, ns);
+
+ uint32_t i;
+
+ /* Stats read */
+ for (i = 0; i < RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS; i++) {
+ struct rte_sched_queue_stats s;
+ uint16_t qlen;
+
+ uint32_t qid = tm_port_queue_id(dev,
+ subport_id,
+ pipe_id,
+ tc_id,
+ i);
+
+ int status = rte_sched_queue_read_stats(
+ p->soft.tm.sched,
+ qid,
+ &s,
+ &qlen);
+ if (status)
+ return status;
+
+ /* Stats accumulate */
+ nt->stats.n_pkts += s.n_pkts - s.n_pkts_dropped;
+ nt->stats.n_bytes += s.n_bytes - s.n_bytes_dropped;
+ nt->stats.leaf.n_pkts_dropped[RTE_TM_GREEN] += s.n_pkts_dropped;
+ nt->stats.leaf.n_bytes_dropped[RTE_TM_GREEN] +=
+ s.n_bytes_dropped;
+ nt->stats.leaf.n_pkts_queued = qlen;
+ }
+
+ /* Stats copy */
+ if (stats)
+ memcpy(stats, &nt->stats, sizeof(*stats));
+
+ if (stats_mask)
+ *stats_mask = STATS_MASK_DEFAULT;
+
+ /* Stats clear */
+ if (clear)
+ memset(&nt->stats, 0, sizeof(nt->stats));
+
+ return 0;
+}
+
+static int
+read_queue_stats(struct rte_eth_dev *dev,
+ struct tm_node *nq,
+ struct rte_tm_node_stats *stats,
+ uint64_t *stats_mask,
+ int clear)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct rte_sched_queue_stats s;
+ uint16_t qlen;
+
+ uint32_t queue_id = tm_node_queue_id(dev, nq);
+
+ struct tm_node *nt = nq->parent_node;
+ uint32_t tc_id = tm_node_tc_id(dev, nt);
+
+ struct tm_node *np = nt->parent_node;
+ uint32_t pipe_id = tm_node_pipe_id(dev, np);
+
+ struct tm_node *ns = np->parent_node;
+ uint32_t subport_id = tm_node_subport_id(dev, ns);
+
+ /* Stats read */
+ uint32_t qid = tm_port_queue_id(dev,
+ subport_id,
+ pipe_id,
+ tc_id,
+ queue_id);
+
+ int status = rte_sched_queue_read_stats(
+ p->soft.tm.sched,
+ qid,
+ &s,
+ &qlen);
+ if (status)
+ return status;
+
+ /* Stats accumulate */
+ nq->stats.n_pkts += s.n_pkts - s.n_pkts_dropped;
+ nq->stats.n_bytes += s.n_bytes - s.n_bytes_dropped;
+ nq->stats.leaf.n_pkts_dropped[RTE_TM_GREEN] += s.n_pkts_dropped;
+ nq->stats.leaf.n_bytes_dropped[RTE_TM_GREEN] +=
+ s.n_bytes_dropped;
+ nq->stats.leaf.n_pkts_queued = qlen;
+
+ /* Stats copy */
+ if (stats)
+ memcpy(stats, &nq->stats, sizeof(*stats));
+
+ if (stats_mask)
+ *stats_mask = STATS_MASK_QUEUE;
+
+ /* Stats clear */
+ if (clear)
+ memset(&nq->stats, 0, sizeof(nq->stats));
+
+ return 0;
+}
+
+/* Traffic manager read stats counters for specific node */
+static int
+pmd_tm_node_stats_read(struct rte_eth_dev *dev,
+ uint32_t node_id,
+ struct rte_tm_node_stats *stats,
+ uint64_t *stats_mask,
+ int clear,
+ struct rte_tm_error *error)
+{
+ struct tm_node *n;
+
+ /* Port must be started and TM used. */
+ if ((dev->data->dev_started == 0) && (tm_used(dev) == 0))
+ return -rte_tm_error_set(error,
+ EBUSY,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EBUSY));
+
+ /* Node must be valid */
+ n = tm_node_search(dev, node_id);
+ if (n == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ switch (n->level) {
+ case TM_NODE_LEVEL_PORT:
+ if (read_port_stats(dev, n, stats, stats_mask, clear))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ return 0;
+
+ case TM_NODE_LEVEL_SUBPORT:
+ if (read_subport_stats(dev, n, stats, stats_mask, clear))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ return 0;
+
+ case TM_NODE_LEVEL_PIPE:
+ if (read_pipe_stats(dev, n, stats, stats_mask, clear))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ return 0;
+
+ case TM_NODE_LEVEL_TC:
+ if (read_tc_stats(dev, n, stats, stats_mask, clear))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ return 0;
+
+ case TM_NODE_LEVEL_QUEUE:
+ default:
+ if (read_queue_stats(dev, n, stats, stats_mask, clear))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ return 0;
+ }
+}
+
+const struct rte_tm_ops pmd_tm_ops = {
+ .node_type_get = pmd_tm_node_type_get,
+ .capabilities_get = pmd_tm_capabilities_get,
+ .level_capabilities_get = pmd_tm_level_capabilities_get,
+ .node_capabilities_get = pmd_tm_node_capabilities_get,
+
+ .wred_profile_add = pmd_tm_wred_profile_add,
+ .wred_profile_delete = pmd_tm_wred_profile_delete,
+ .shared_wred_context_add_update = NULL,
+ .shared_wred_context_delete = NULL,
+
+ .shaper_profile_add = pmd_tm_shaper_profile_add,
+ .shaper_profile_delete = pmd_tm_shaper_profile_delete,
+ .shared_shaper_add_update = pmd_tm_shared_shaper_add_update,
+ .shared_shaper_delete = pmd_tm_shared_shaper_delete,
+
+ .node_add = pmd_tm_node_add,
+ .node_delete = pmd_tm_node_delete,
+ .node_suspend = NULL,
+ .node_resume = NULL,
+ .hierarchy_commit = pmd_tm_hierarchy_commit,
+
+ .node_parent_update = pmd_tm_node_parent_update,
+ .node_shaper_update = pmd_tm_node_shaper_update,
+ .node_shared_shaper_update = NULL,
+ .node_stats_update = NULL,
+ .node_wfq_weight_mode_update = NULL,
+ .node_cman_update = NULL,
+ .node_wred_context_update = NULL,
+ .node_shared_wred_context_update = NULL,
+
+ .node_stats_read = pmd_tm_node_stats_read,
};
--
2.9.3
^ permalink raw reply [flat|nested] 79+ messages in thread
* Re: [dpdk-dev] [PATCH v4 1/4] net/softnic: add softnic PMD
2017-09-18 9:10 ` [dpdk-dev] [PATCH v4 1/4] net/softnic: add softnic PMD Jasvinder Singh
@ 2017-09-18 16:58 ` Singh, Jasvinder
2017-09-18 19:09 ` Thomas Monjalon
0 siblings, 1 reply; 79+ messages in thread
From: Singh, Jasvinder @ 2017-09-18 16:58 UTC (permalink / raw)
To: Singh, Jasvinder, dev; +Cc: Dumitrescu, Cristian, Yigit, Ferruh, thomas
Hi Thomas,
I don't see this patch in patchwork, although it is present in the email archive. Any guess as to why it is not showing up there?
Thank you,
Jasvinder
> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Jasvinder Singh
> Sent: Monday, September 18, 2017 10:10 AM
> To: dev@dpdk.org
> Cc: Dumitrescu, Cristian <cristian.dumitrescu@intel.com>; Yigit, Ferruh
> <ferruh.yigit@intel.com>; thomas@monjalon.net
> Subject: [dpdk-dev] [PATCH v4 1/4] net/softnic: add softnic PMD
>
> Add SoftNIC PMD to provide SW fall-back for ethdev APIs.
>
> Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
> Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
> ---
> v4 changes:
> - Implemented feedback from Ferruh [1]
> - rename map file to rte_pmd_eth_softnic_version.map
> - add release notes library version info
> - doxygen: fix hooks in doc/api/doxy-api-index.md
> - add doxygen comment for rte_pmd_softnic_run()
> - free device name memory
> - remove soft_dev param in pmd_ethdev_register()
> - fix checkpatch warnings
>
> v3 changes:
> - rebase to dpdk17.08 release
>
> v2 changes:
> - fix build errors
> - rebased to TM APIs v6 plus dpdk master
>
> [1] Ferruh's feedback on v3:
> http://dpdk.org/ml/archives/dev/2017-September/074576.html
>
> MAINTAINERS | 5 +
> config/common_base | 5 +
> doc/api/doxy-api-index.md | 3 +-
> doc/api/doxy-api.conf | 1 +
> doc/guides/rel_notes/release_17_11.rst | 6 +
> drivers/net/Makefile | 5 +
> drivers/net/softnic/Makefile | 56 ++
> drivers/net/softnic/rte_eth_softnic.c | 595 +++++++++++++++++++++
> drivers/net/softnic/rte_eth_softnic.h | 67 +++
> drivers/net/softnic/rte_eth_softnic_internals.h | 114 ++++
> .../net/softnic/rte_pmd_eth_softnic_version.map | 7 +
> mk/rte.app.mk | 5 +-
> 12 files changed, 867 insertions(+), 2 deletions(-)
> create mode 100644 drivers/net/softnic/Makefile
> create mode 100644 drivers/net/softnic/rte_eth_softnic.c
> create mode 100644 drivers/net/softnic/rte_eth_softnic.h
> create mode 100644 drivers/net/softnic/rte_eth_softnic_internals.h
> create mode 100644 drivers/net/softnic/rte_pmd_eth_softnic_version.map
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index a0cd75e..b6b738d 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -511,6 +511,11 @@ M: Gaetan Rivet <gaetan.rivet@6wind.com>
> F: drivers/net/failsafe/
> F: doc/guides/nics/fail_safe.rst
>
> +Softnic PMD
> +M: Jasvinder Singh <jasvinder.singh@intel.com>
> +M: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
> +F: drivers/net/softnic
> +
>
> Crypto Drivers
> --------------
> diff --git a/config/common_base b/config/common_base
> index 5e97a08..1a0c77d 100644
> --- a/config/common_base
> +++ b/config/common_base
> @@ -273,6 +273,11 @@ CONFIG_RTE_LIBRTE_SFC_EFX_PMD=y
> CONFIG_RTE_LIBRTE_SFC_EFX_DEBUG=n
>
> #
> +# Compile SOFTNIC PMD
> +#
> +CONFIG_RTE_LIBRTE_PMD_SOFTNIC=y
> +
> +#
> # Compile software PMD backed by SZEDATA2 device
> #
> CONFIG_RTE_LIBRTE_PMD_SZEDATA2=n
> diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
> index 19e0d4f..626ab51 100644
> --- a/doc/api/doxy-api-index.md
> +++ b/doc/api/doxy-api-index.md
> @@ -55,7 +55,8 @@ The public API headers are grouped by topics:
> [KNI] (@ref rte_kni.h),
> [ixgbe] (@ref rte_pmd_ixgbe.h),
> [i40e] (@ref rte_pmd_i40e.h),
> - [crypto_scheduler] (@ref rte_cryptodev_scheduler.h)
> + [crypto_scheduler] (@ref rte_cryptodev_scheduler.h),
> + [softnic] (@ref rte_eth_softnic.h)
>
> - **memory**:
> [memseg] (@ref rte_memory.h),
> diff --git a/doc/api/doxy-api.conf b/doc/api/doxy-api.conf
> index 823554f..b27755d 100644
> --- a/doc/api/doxy-api.conf
> +++ b/doc/api/doxy-api.conf
> @@ -32,6 +32,7 @@ PROJECT_NAME = DPDK
> INPUT = doc/api/doxy-api-index.md \
> drivers/crypto/scheduler \
> drivers/net/bonding \
> + drivers/net/softnic \
> drivers/net/i40e \
> drivers/net/ixgbe \
> lib/librte_eal/common/include \
> diff --git a/doc/guides/rel_notes/release_17_11.rst b/doc/guides/rel_notes/release_17_11.rst
> index 170f4f9..d5a760b 100644
> --- a/doc/guides/rel_notes/release_17_11.rst
> +++ b/doc/guides/rel_notes/release_17_11.rst
> @@ -41,6 +41,11 @@ New Features
> Also, make sure to start the actual text at the margin.
> =========================================================
>
> +* **Added SoftNIC PMD.**
> +
> +  Added a new SoftNIC PMD. This virtual device provides applications with
> +  software fallback support for traffic management.
> +
>
> Resolved Issues
> ---------------
> @@ -170,6 +175,7 @@ The libraries prepended with a plus sign were
> incremented in this version.
> librte_pipeline.so.3
> librte_pmd_bond.so.1
> librte_pmd_ring.so.2
> + + librte_pmd_softnic.so.1
> librte_port.so.3
> librte_power.so.1
> librte_reorder.so.1
> diff --git a/drivers/net/Makefile b/drivers/net/Makefile
> index d33c959..b552a51 100644
> --- a/drivers/net/Makefile
> +++ b/drivers/net/Makefile
> @@ -110,4 +110,9 @@ DIRS-$(CONFIG_RTE_LIBRTE_PMD_VHOST) += vhost
> endif # $(CONFIG_RTE_LIBRTE_VHOST)
> DEPDIRS-vhost = $(core-libs) librte_vhost
>
> +ifeq ($(CONFIG_RTE_LIBRTE_SCHED),y)
> +DIRS-$(CONFIG_RTE_LIBRTE_PMD_SOFTNIC) += softnic
> +endif # $(CONFIG_RTE_LIBRTE_SCHED)
> +DEPDIRS-softnic = $(core-libs) librte_sched
> +
> include $(RTE_SDK)/mk/rte.subdir.mk
> diff --git a/drivers/net/softnic/Makefile b/drivers/net/softnic/Makefile
> new file mode 100644
> index 0000000..c2f42ef
> --- /dev/null
> +++ b/drivers/net/softnic/Makefile
> @@ -0,0 +1,56 @@
> +# BSD LICENSE
> +#
> +# Copyright(c) 2017 Intel Corporation. All rights reserved.
> +# All rights reserved.
> +#
> +# Redistribution and use in source and binary forms, with or without
> +# modification, are permitted provided that the following conditions
> +# are met:
> +#
> +# * Redistributions of source code must retain the above copyright
> +# notice, this list of conditions and the following disclaimer.
> +# * Redistributions in binary form must reproduce the above copyright
> +# notice, this list of conditions and the following disclaimer in
> +# the documentation and/or other materials provided with the
> +# distribution.
> +# * Neither the name of Intel Corporation nor the names of its
> +# contributors may be used to endorse or promote products derived
> +# from this software without specific prior written permission.
> +#
> +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND
> CONTRIBUTORS
> +# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT
> NOT
> +# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND
> FITNESS FOR
> +# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
> COPYRIGHT
> +# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
> INCIDENTAL,
> +# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
> NOT
> +# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS
> OF USE,
> +# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED
> AND ON ANY
> +# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR
> TORT
> +# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF
> THE USE
> +# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH
> DAMAGE.
> +
> +include $(RTE_SDK)/mk/rte.vars.mk
> +
> +#
> +# library name
> +#
> +LIB = librte_pmd_softnic.a
> +
> +CFLAGS += -O3
> +CFLAGS += $(WERROR_FLAGS)
> +
> +EXPORT_MAP := rte_pmd_eth_softnic_version.map
> +
> +LIBABIVER := 1
> +
> +#
> +# all source are stored in SRCS-y
> +#
> +SRCS-$(CONFIG_RTE_LIBRTE_PMD_SOFTNIC) += rte_eth_softnic.c
> +
> +#
> +# Export include files
> +#
> +SYMLINK-y-include += rte_eth_softnic.h
> +
> +include $(RTE_SDK)/mk/rte.lib.mk
> diff --git a/drivers/net/softnic/rte_eth_softnic.c
> b/drivers/net/softnic/rte_eth_softnic.c
> new file mode 100644
> index 0000000..792e7ea
> --- /dev/null
> +++ b/drivers/net/softnic/rte_eth_softnic.c
> @@ -0,0 +1,595 @@
> +/*-
> + * BSD LICENSE
> + *
> + * Copyright(c) 2017 Intel Corporation. All rights reserved.
> + * All rights reserved.
> + *
> + * Redistribution and use in source and binary forms, with or without
> + * modification, are permitted provided that the following conditions
> + * are met:
> + *
> + * * Redistributions of source code must retain the above copyright
> + * notice, this list of conditions and the following disclaimer.
> + * * Redistributions in binary form must reproduce the above copyright
> + * notice, this list of conditions and the following disclaimer in
> + * the documentation and/or other materials provided with the
> + * distribution.
> + * * Neither the name of Intel Corporation nor the names of its
> + * contributors may be used to endorse or promote products derived
> + * from this software without specific prior written permission.
> + *
> + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND
> CONTRIBUTORS
> + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT
> NOT
> + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND
> FITNESS FOR
> + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
> COPYRIGHT
> + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
> INCIDENTAL,
> + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
> NOT
> + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS
> OF USE,
> + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED
> AND ON ANY
> + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR
> TORT
> + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF
> THE USE
> + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH
> DAMAGE.
> + */
> +
> +#include <stdint.h>
> +#include <stdlib.h>
> +#include <string.h>
> +
> +#include <rte_ethdev.h>
> +#include <rte_ethdev_vdev.h>
> +#include <rte_malloc.h>
> +#include <rte_vdev.h>
> +#include <rte_kvargs.h>
> +#include <rte_errno.h>
> +#include <rte_ring.h>
> +
> +#include "rte_eth_softnic.h"
> +#include "rte_eth_softnic_internals.h"
> +
> +#define PRIV_TO_HARD_DEV(p) \
> + (&rte_eth_devices[p->hard.port_id])
> +
> +#define PMD_PARAM_HARD_NAME		"hard_name"
> +#define PMD_PARAM_HARD_TX_QUEUE_ID	"hard_tx_queue_id"
> +
> +static const char *pmd_valid_args[] = {
> + PMD_PARAM_HARD_NAME,
> + PMD_PARAM_HARD_TX_QUEUE_ID,
> + NULL
> +};
> +
> +static const struct rte_eth_dev_info pmd_dev_info = {
> + .min_rx_bufsize = 0,
> + .max_rx_pktlen = UINT32_MAX,
> + .max_rx_queues = UINT16_MAX,
> + .max_tx_queues = UINT16_MAX,
> + .rx_desc_lim = {
> + .nb_max = UINT16_MAX,
> + .nb_min = 0,
> + .nb_align = 1,
> + },
> + .tx_desc_lim = {
> + .nb_max = UINT16_MAX,
> + .nb_min = 0,
> + .nb_align = 1,
> + },
> +};
> +
> +static void
> +pmd_dev_infos_get(struct rte_eth_dev *dev __rte_unused,
> + struct rte_eth_dev_info *dev_info)
> +{
> + memcpy(dev_info, &pmd_dev_info, sizeof(*dev_info)); }
> +
> +static int
> +pmd_dev_configure(struct rte_eth_dev *dev) {
> + struct pmd_internals *p = dev->data->dev_private;
> + struct rte_eth_dev *hard_dev = PRIV_TO_HARD_DEV(p);
> +
> + if (dev->data->nb_rx_queues > hard_dev->data->nb_rx_queues)
> + return -1;
> +
> + if (p->params.hard.tx_queue_id >= hard_dev->data->nb_tx_queues)
> + return -1;
> +
> + return 0;
> +}
> +
> +static int
> +pmd_rx_queue_setup(struct rte_eth_dev *dev,
> + uint16_t rx_queue_id,
> + uint16_t nb_rx_desc __rte_unused,
> + unsigned int socket_id,
> + const struct rte_eth_rxconf *rx_conf __rte_unused,
> + struct rte_mempool *mb_pool __rte_unused) {
> + struct pmd_internals *p = dev->data->dev_private;
> +
> + if (p->params.soft.intrusive == 0) {
> + struct pmd_rx_queue *rxq;
> +
> + rxq = rte_zmalloc_socket(p->params.soft.name,
> + sizeof(struct pmd_rx_queue), 0, socket_id);
> + if (rxq == NULL)
> + return -ENOMEM;
> +
> + rxq->hard.port_id = p->hard.port_id;
> + rxq->hard.rx_queue_id = rx_queue_id;
> + dev->data->rx_queues[rx_queue_id] = rxq;
> + } else {
> + struct rte_eth_dev *hard_dev = PRIV_TO_HARD_DEV(p);
> + void *rxq = hard_dev->data->rx_queues[rx_queue_id];
> +
> + if (rxq == NULL)
> + return -1;
> +
> + dev->data->rx_queues[rx_queue_id] = rxq;
> + }
> + return 0;
> +}
> +
> +static int
> +pmd_tx_queue_setup(struct rte_eth_dev *dev,
> + uint16_t tx_queue_id,
> + uint16_t nb_tx_desc,
> + unsigned int socket_id,
> + const struct rte_eth_txconf *tx_conf __rte_unused) {
> + uint32_t size = RTE_ETH_NAME_MAX_LEN + strlen("_txq") + 4;
> + char name[size];
> + struct rte_ring *r;
> +
> + snprintf(name, sizeof(name), "%s_txq%04x",
> + dev->data->name, tx_queue_id);
> + r = rte_ring_create(name, nb_tx_desc, socket_id,
> + RING_F_SP_ENQ | RING_F_SC_DEQ);
> + if (r == NULL)
> + return -1;
> +
> + dev->data->tx_queues[tx_queue_id] = r;
> + return 0;
> +}
> +
> +static int
> +pmd_dev_start(struct rte_eth_dev *dev)
> +{
> + struct pmd_internals *p = dev->data->dev_private;
> +
> + dev->data->dev_link.link_status = ETH_LINK_UP;
> +
> + if (p->params.soft.intrusive) {
> + struct rte_eth_dev *hard_dev = PRIV_TO_HARD_DEV(p);
> +
> + /* The hard_dev->rx_pkt_burst should be stable by now */
> + dev->rx_pkt_burst = hard_dev->rx_pkt_burst;
> + }
> +
> + return 0;
> +}
> +
> +static void
> +pmd_dev_stop(struct rte_eth_dev *dev)
> +{
> + dev->data->dev_link.link_status = ETH_LINK_DOWN; }
> +
> +static void
> +pmd_dev_close(struct rte_eth_dev *dev)
> +{
> + uint32_t i;
> +
> + /* TX queues */
> + for (i = 0; i < dev->data->nb_tx_queues; i++)
> + rte_ring_free((struct rte_ring *)dev->data->tx_queues[i]); }
> +
> +static int
> +pmd_link_update(struct rte_eth_dev *dev __rte_unused,
> + int wait_to_complete __rte_unused)
> +{
> + return 0;
> +}
> +
> +static const struct eth_dev_ops pmd_ops = {
> + .dev_configure = pmd_dev_configure,
> + .dev_start = pmd_dev_start,
> + .dev_stop = pmd_dev_stop,
> + .dev_close = pmd_dev_close,
> + .link_update = pmd_link_update,
> + .dev_infos_get = pmd_dev_infos_get,
> + .rx_queue_setup = pmd_rx_queue_setup,
> + .tx_queue_setup = pmd_tx_queue_setup,
> + .tm_ops_get = NULL,
> +};
> +
> +static uint16_t
> +pmd_rx_pkt_burst(void *rxq,
> + struct rte_mbuf **rx_pkts,
> + uint16_t nb_pkts)
> +{
> + struct pmd_rx_queue *rx_queue = rxq;
> +
> + return rte_eth_rx_burst(rx_queue->hard.port_id,
> + rx_queue->hard.rx_queue_id,
> + rx_pkts,
> + nb_pkts);
> +}
> +
> +static uint16_t
> +pmd_tx_pkt_burst(void *txq,
> + struct rte_mbuf **tx_pkts,
> + uint16_t nb_pkts)
> +{
> + return (uint16_t)rte_ring_enqueue_burst(txq,
> + (void **)tx_pkts,
> + nb_pkts,
> + NULL);
> +}
> +
> +static __rte_always_inline int
> +rte_pmd_softnic_run_default(struct rte_eth_dev *dev) {
> + struct pmd_internals *p = dev->data->dev_private;
> +
> + /* Persistent context: Read Only (update not required) */
> + struct rte_mbuf **pkts = p->soft.def.pkts;
> + uint16_t nb_tx_queues = dev->data->nb_tx_queues;
> +
> + /* Persistent context: Read - Write (update required) */
> + uint32_t txq_pos = p->soft.def.txq_pos;
> + uint32_t pkts_len = p->soft.def.pkts_len;
> + uint32_t flush_count = p->soft.def.flush_count;
> +
> + /* Not part of the persistent context */
> + uint32_t pos;
> + uint16_t i;
> +
> + /* Soft device TXQ read, Hard device TXQ write */
> + for (i = 0; i < nb_tx_queues; i++) {
> + struct rte_ring *txq = dev->data->tx_queues[txq_pos];
> +
> + /* Read soft device TXQ burst to packet enqueue buffer */
> + pkts_len += rte_ring_sc_dequeue_burst(txq,
> + (void **)&pkts[pkts_len],
> + DEFAULT_BURST_SIZE,
> + NULL);
> +
> + /* Increment soft device TXQ */
> + txq_pos++;
> + if (txq_pos >= nb_tx_queues)
> + txq_pos = 0;
> +
> + /* Hard device TXQ write when complete burst is available */
> + if (pkts_len >= DEFAULT_BURST_SIZE) {
> + for (pos = 0; pos < pkts_len; )
> + pos += rte_eth_tx_burst(p->hard.port_id,
> + p->params.hard.tx_queue_id,
> + &pkts[pos],
> + (uint16_t)(pkts_len - pos));
> +
> + pkts_len = 0;
> + flush_count = 0;
> + break;
> + }
> + }
> +
> + if (flush_count >= FLUSH_COUNT_THRESHOLD) {
> + for (pos = 0; pos < pkts_len; )
> + pos += rte_eth_tx_burst(p->hard.port_id,
> + p->params.hard.tx_queue_id,
> + &pkts[pos],
> + (uint16_t)(pkts_len - pos));
> +
> + pkts_len = 0;
> + flush_count = 0;
> + }
> +
> + p->soft.def.txq_pos = txq_pos;
> + p->soft.def.pkts_len = pkts_len;
> + p->soft.def.flush_count = flush_count + 1;
> +
> + return 0;
> +}
> +
> +int
> +rte_pmd_softnic_run(uint8_t port_id)
> +{
> + struct rte_eth_dev *dev = &rte_eth_devices[port_id];
> +
> +#ifdef RTE_LIBRTE_ETHDEV_DEBUG
> +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, 0);
> +#endif
> +
> + return rte_pmd_softnic_run_default(dev); }
> +
> +static struct ether_addr eth_addr = { .addr_bytes = {0} };
> +
> +static uint32_t
> +eth_dev_speed_max_mbps(uint32_t speed_capa) {
> + uint32_t rate_mbps[32] = {
> + ETH_SPEED_NUM_NONE,
> + ETH_SPEED_NUM_10M,
> + ETH_SPEED_NUM_10M,
> + ETH_SPEED_NUM_100M,
> + ETH_SPEED_NUM_100M,
> + ETH_SPEED_NUM_1G,
> + ETH_SPEED_NUM_2_5G,
> + ETH_SPEED_NUM_5G,
> + ETH_SPEED_NUM_10G,
> + ETH_SPEED_NUM_20G,
> + ETH_SPEED_NUM_25G,
> + ETH_SPEED_NUM_40G,
> + ETH_SPEED_NUM_50G,
> + ETH_SPEED_NUM_56G,
> + ETH_SPEED_NUM_100G,
> + };
> +
> + uint32_t pos = (speed_capa) ? (31 - __builtin_clz(speed_capa)) : 0;
> + return rate_mbps[pos];
> +}
> +
> +static int
> +default_init(struct pmd_internals *p,
> + struct pmd_params *params,
> + int numa_node)
> +{
> + p->soft.def.pkts = rte_zmalloc_socket(params->soft.name,
> + 2 * DEFAULT_BURST_SIZE * sizeof(struct rte_mbuf *),
> + 0,
> + numa_node);
> +
> + if (p->soft.def.pkts == NULL)
> + return -ENOMEM;
> +
> + return 0;
> +}
> +
> +static void
> +default_free(struct pmd_internals *p)
> +{
> + free((void *)p->params.hard.name);
> + rte_free(p->soft.def.pkts);
> +}
> +
> +static void *
> +pmd_init(struct pmd_params *params, int numa_node) {
> + struct pmd_internals *p;
> + int status;
> +
> + p = rte_zmalloc_socket(params->soft.name,
> + sizeof(struct pmd_internals),
> + 0,
> + numa_node);
> + if (p == NULL)
> + return NULL;
> +
> + memcpy(&p->params, params, sizeof(p->params));
> + status = rte_eth_dev_get_port_by_name(params->hard.name,
> + &p->hard.port_id);
> + if (status) {
> + rte_free(p);
> + return NULL;
> + }
> +
> + /* Default */
> + status = default_init(p, params, numa_node);
> + if (status) {
> + rte_free(p);
> + return NULL;
> + }
> +
> + return p;
> +}
> +
> +static void
> +pmd_free(struct pmd_internals *p)
> +{
> + default_free(p);
> +
> + rte_free(p);
> +}
> +
> +static int
> +pmd_ethdev_register(struct rte_vdev_device *vdev,
> + struct pmd_params *params,
> + void *dev_private)
> +{
> + struct rte_eth_dev_info hard_info;
> + struct rte_eth_dev *soft_dev;
> + uint32_t hard_speed;
> + int numa_node;
> + uint8_t hard_port_id;
> +
> + rte_eth_dev_get_port_by_name(params->hard.name,
> &hard_port_id);
> + rte_eth_dev_info_get(hard_port_id, &hard_info);
> + hard_speed = eth_dev_speed_max_mbps(hard_info.speed_capa);
> + numa_node = rte_eth_dev_socket_id(hard_port_id);
> +
> + /* Ethdev entry allocation */
> + soft_dev = rte_eth_dev_allocate(params->soft.name);
> + if (!soft_dev)
> + return -ENOMEM;
> +
> + /* dev */
> + soft_dev->rx_pkt_burst = (params->soft.intrusive) ?
> + NULL : /* set up later */
> + pmd_rx_pkt_burst;
> + soft_dev->tx_pkt_burst = pmd_tx_pkt_burst;
> + soft_dev->tx_pkt_prepare = NULL;
> + soft_dev->dev_ops = &pmd_ops;
> + soft_dev->device = &vdev->device;
> +
> + /* dev->data */
> + soft_dev->data->dev_private = dev_private;
> + soft_dev->data->dev_link.link_speed = hard_speed;
> + soft_dev->data->dev_link.link_duplex = ETH_LINK_FULL_DUPLEX;
> + soft_dev->data->dev_link.link_autoneg = ETH_LINK_SPEED_FIXED;
> + soft_dev->data->dev_link.link_status = ETH_LINK_DOWN;
> +	soft_dev->data->mac_addrs = &eth_addr;
> + soft_dev->data->promiscuous = 1;
> + soft_dev->data->kdrv = RTE_KDRV_NONE;
> + soft_dev->data->numa_node = numa_node;
> + soft_dev->data->dev_flags = RTE_ETH_DEV_DETACHABLE;
> +
> + return 0;
> +}
> +
> +static int
> +get_string(const char *key __rte_unused, const char *value, void
> +*extra_args) {
> + if (!value || !extra_args)
> + return -EINVAL;
> +
> + *(char **)extra_args = strdup(value);
> +
> + if (!*(char **)extra_args)
> + return -ENOMEM;
> +
> + return 0;
> +}
> +
> +static int
> +get_uint32(const char *key __rte_unused, const char *value, void
> +*extra_args) {
> + if (!value || !extra_args)
> + return -EINVAL;
> +
> + *(uint32_t *)extra_args = strtoull(value, NULL, 0);
> +
> + return 0;
> +}
> +
> +static int
> +pmd_parse_args(struct pmd_params *p, const char *name, const char
> +*params) {
> + struct rte_kvargs *kvlist;
> + int ret;
> +
> + kvlist = rte_kvargs_parse(params, pmd_valid_args);
> + if (kvlist == NULL)
> + return -EINVAL;
> +
> + /* Set default values */
> + memset(p, 0, sizeof(*p));
> + p->soft.name = name;
> + p->soft.intrusive = INTRUSIVE;
> + p->hard.tx_queue_id = SOFTNIC_HARD_TX_QUEUE_ID;
> +
> + /* HARD: name (mandatory) */
> + if (rte_kvargs_count(kvlist, PMD_PARAM_HARD_NAME) == 1) {
> + ret = rte_kvargs_process(kvlist, PMD_PARAM_HARD_NAME,
> + &get_string, &p->hard.name);
> + if (ret < 0)
> + goto out_free;
> + } else {
> + ret = -EINVAL;
> + goto out_free;
> + }
> +
> + /* HARD: tx_queue_id (optional) */
> + if (rte_kvargs_count(kvlist, PMD_PARAM_HARD_TX_QUEUE_ID) == 1)
> {
> + ret = rte_kvargs_process(kvlist,
> PMD_PARAM_HARD_TX_QUEUE_ID,
> + &get_uint32, &p->hard.tx_queue_id);
> + if (ret < 0)
> + goto out_free;
> + }
> +
> +out_free:
> + rte_kvargs_free(kvlist);
> + return ret;
> +}
> +
> +static int
> +pmd_probe(struct rte_vdev_device *vdev) {
> + struct pmd_params p;
> + const char *params;
> + int status;
> +
> + struct rte_eth_dev_info hard_info;
> + uint8_t hard_port_id;
> + int numa_node;
> + void *dev_private;
> +
> + RTE_LOG(INFO, PMD,
> + "Probing device \"%s\"\n",
> + rte_vdev_device_name(vdev));
> +
> + /* Parse input arguments */
> + params = rte_vdev_device_args(vdev);
> + if (!params)
> + return -EINVAL;
> +
> + status = pmd_parse_args(&p, rte_vdev_device_name(vdev),
> params);
> + if (status)
> + return status;
> +
> + /* Check input arguments */
> + if (rte_eth_dev_get_port_by_name(p.hard.name, &hard_port_id))
> + return -EINVAL;
> +
> + rte_eth_dev_info_get(hard_port_id, &hard_info);
> + numa_node = rte_eth_dev_socket_id(hard_port_id);
> +
> + if (p.hard.tx_queue_id >= hard_info.max_tx_queues)
> + return -EINVAL;
> +
> + /* Allocate and initialize soft ethdev private data */
> + dev_private = pmd_init(&p, numa_node);
> + if (dev_private == NULL)
> + return -ENOMEM;
> +
> + /* Register soft ethdev */
> + RTE_LOG(INFO, PMD,
> + "Creating soft ethdev \"%s\" for hard ethdev \"%s\"\n",
> + p.soft.name, p.hard.name);
> +
> + status = pmd_ethdev_register(vdev, &p, dev_private);
> + if (status) {
> + pmd_free(dev_private);
> + return status;
> + }
> +
> + return 0;
> +}
> +
> +static int
> +pmd_remove(struct rte_vdev_device *vdev) {
> + struct rte_eth_dev *dev = NULL;
> + struct pmd_internals *p;
> +
> + if (!vdev)
> + return -EINVAL;
> +
> + RTE_LOG(INFO, PMD, "Removing device \"%s\"\n",
> + rte_vdev_device_name(vdev));
> +
> + /* Find the ethdev entry */
> + dev = rte_eth_dev_allocated(rte_vdev_device_name(vdev));
> + if (dev == NULL)
> + return -ENODEV;
> + p = dev->data->dev_private;
> +
> + /* Free device data structures*/
> + pmd_free(p);
> + rte_free(dev->data);
> + rte_eth_dev_release_port(dev);
> +
> + return 0;
> +}
> +
> +static struct rte_vdev_driver pmd_softnic_drv = {
> + .probe = pmd_probe,
> + .remove = pmd_remove,
> +};
> +
> +RTE_PMD_REGISTER_VDEV(net_softnic, pmd_softnic_drv);
> +RTE_PMD_REGISTER_PARAM_STRING(net_softnic,
> + PMD_PARAM_HARD_NAME "=<string> "
> + PMD_PARAM_HARD_TX_QUEUE_ID "=<int>");
> diff --git a/drivers/net/softnic/rte_eth_softnic.h
> b/drivers/net/softnic/rte_eth_softnic.h
> new file mode 100644
> index 0000000..e6996f3
> --- /dev/null
> +++ b/drivers/net/softnic/rte_eth_softnic.h
> @@ -0,0 +1,67 @@
> +/*-
> + * BSD LICENSE
> + *
> + * Copyright(c) 2017 Intel Corporation. All rights reserved.
> + * All rights reserved.
> + *
> + * Redistribution and use in source and binary forms, with or without
> + * modification, are permitted provided that the following conditions
> + * are met:
> + *
> + * * Redistributions of source code must retain the above copyright
> + * notice, this list of conditions and the following disclaimer.
> + * * Redistributions in binary form must reproduce the above copyright
> + * notice, this list of conditions and the following disclaimer in
> + * the documentation and/or other materials provided with the
> + * distribution.
> + * * Neither the name of Intel Corporation nor the names of its
> + * contributors may be used to endorse or promote products derived
> + * from this software without specific prior written permission.
> + *
> + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND
> CONTRIBUTORS
> + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT
> NOT
> + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND
> FITNESS FOR
> + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
> COPYRIGHT
> + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
> INCIDENTAL,
> + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
> NOT
> + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS
> OF USE,
> + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED
> AND ON ANY
> + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR
> TORT
> + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF
> THE USE
> + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH
> DAMAGE.
> + */
> +
> +#ifndef __INCLUDE_RTE_ETH_SOFTNIC_H__
> +#define __INCLUDE_RTE_ETH_SOFTNIC_H__
> +
> +#include <stdint.h>
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +#ifndef SOFTNIC_HARD_TX_QUEUE_ID
> +#define SOFTNIC_HARD_TX_QUEUE_ID 0
> +#endif
> +
> +/**
> + * Run the traffic management function on the softnic device
> + *
> + * This function reads packets from the softnic input queues, inserts
> + * them into the QoS scheduler queues based on the mbuf sched field
> + * value and transmits the scheduled packets out through the hard
> + * device interface.
> + *
> + * @param port_id
> + *   Port ID of the soft device.
> + * @return
> + *   Zero.
> + */
> +
> +int
> +rte_pmd_softnic_run(uint8_t port_id);
> +
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif /* __INCLUDE_RTE_ETH_SOFTNIC_H__ */
> diff --git a/drivers/net/softnic/rte_eth_softnic_internals.h
> b/drivers/net/softnic/rte_eth_softnic_internals.h
> new file mode 100644
> index 0000000..96995b5
> --- /dev/null
> +++ b/drivers/net/softnic/rte_eth_softnic_internals.h
> @@ -0,0 +1,114 @@
> +/*-
> + * BSD LICENSE
> + *
> + * Copyright(c) 2017 Intel Corporation. All rights reserved.
> + * All rights reserved.
> + *
> + * Redistribution and use in source and binary forms, with or without
> + * modification, are permitted provided that the following conditions
> + * are met:
> + *
> + * * Redistributions of source code must retain the above copyright
> + * notice, this list of conditions and the following disclaimer.
> + * * Redistributions in binary form must reproduce the above copyright
> + * notice, this list of conditions and the following disclaimer in
> + * the documentation and/or other materials provided with the
> + * distribution.
> + * * Neither the name of Intel Corporation nor the names of its
> + * contributors may be used to endorse or promote products derived
> + * from this software without specific prior written permission.
> + *
> + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND
> CONTRIBUTORS
> + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT
> NOT
> + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND
> FITNESS FOR
> + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
> COPYRIGHT
> + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
> INCIDENTAL,
> + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
> NOT
> + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS
> OF USE,
> + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED
> AND ON ANY
> + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR
> TORT
> + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF
> THE USE
> + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH
> DAMAGE.
> + */
> +
> +#ifndef __INCLUDE_RTE_ETH_SOFTNIC_INTERNALS_H__
> +#define __INCLUDE_RTE_ETH_SOFTNIC_INTERNALS_H__
> +
> +#include <stdint.h>
> +
> +#include <rte_mbuf.h>
> +#include <rte_ethdev.h>
> +
> +#include "rte_eth_softnic.h"
> +
> +#ifndef INTRUSIVE
> +#define INTRUSIVE 0
> +#endif
> +
> +struct pmd_params {
> + /** Parameters for the soft device (to be created) */
> + struct {
> + const char *name; /**< Name */
> + uint32_t flags; /**< Flags */
> +
> + /** 0 = Access hard device through API only (potentially slower,
> + * but safer);
> + * 1 = Access to hard device private data structures is allowed
> + * (potentially faster).
> + */
> + int intrusive;
> + } soft;
> +
> + /** Parameters for the hard device (existing) */
> + struct {
> + char *name; /**< Name */
> + uint16_t tx_queue_id; /**< TX queue ID */
> + } hard;
> +};
> +
> +/**
> + * Default Internals
> + */
> +
> +#ifndef DEFAULT_BURST_SIZE
> +#define DEFAULT_BURST_SIZE 32
> +#endif
> +
> +#ifndef FLUSH_COUNT_THRESHOLD
> +#define FLUSH_COUNT_THRESHOLD (1 << 17)
> +#endif
> +
> +struct default_internals {
> + struct rte_mbuf **pkts;
> + uint32_t pkts_len;
> + uint32_t txq_pos;
> + uint32_t flush_count;
> +};
> +
> +/**
> + * PMD Internals
> + */
> +struct pmd_internals {
> + /** Params */
> + struct pmd_params params;
> +
> + /** Soft device */
> + struct {
> + struct default_internals def; /**< Default */
> + } soft;
> +
> + /** Hard device */
> + struct {
> + uint8_t port_id;
> + } hard;
> +};
> +
> +struct pmd_rx_queue {
> + /** Hard device */
> + struct {
> + uint8_t port_id;
> + uint16_t rx_queue_id;
> + } hard;
> +};
> +
> +#endif /* __INCLUDE_RTE_ETH_SOFTNIC_INTERNALS_H__ */
> diff --git a/drivers/net/softnic/rte_pmd_eth_softnic_version.map
> b/drivers/net/softnic/rte_pmd_eth_softnic_version.map
> new file mode 100644
> index 0000000..fb2cb68
> --- /dev/null
> +++ b/drivers/net/softnic/rte_pmd_eth_softnic_version.map
> @@ -0,0 +1,7 @@
> +DPDK_17.11 {
> + global:
> +
> + rte_pmd_softnic_run;
> +
> + local: *;
> +};
> diff --git a/mk/rte.app.mk b/mk/rte.app.mk
> index c25fdd9..3dc82fb 100644
> --- a/mk/rte.app.mk
> +++ b/mk/rte.app.mk
> @@ -67,7 +67,6 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR) += -lrte_distributor
> _LDLIBS-$(CONFIG_RTE_LIBRTE_IP_FRAG) += -lrte_ip_frag
> _LDLIBS-$(CONFIG_RTE_LIBRTE_GRO) += -lrte_gro
> _LDLIBS-$(CONFIG_RTE_LIBRTE_METER) += -lrte_meter
> -_LDLIBS-$(CONFIG_RTE_LIBRTE_SCHED) += -lrte_sched
> _LDLIBS-$(CONFIG_RTE_LIBRTE_LPM) += -lrte_lpm
> # librte_acl needs --whole-archive because of weak functions
> _LDLIBS-$(CONFIG_RTE_LIBRTE_ACL) += --whole-archive
> @@ -99,6 +98,7 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_RING) += -lrte_ring
> _LDLIBS-$(CONFIG_RTE_LIBRTE_EAL) += -lrte_eal
> _LDLIBS-$(CONFIG_RTE_LIBRTE_CMDLINE) += -lrte_cmdline
> _LDLIBS-$(CONFIG_RTE_LIBRTE_REORDER) += -lrte_reorder
> +_LDLIBS-$(CONFIG_RTE_LIBRTE_SCHED) += -lrte_sched
>
> ifeq ($(CONFIG_RTE_EXEC_ENV_LINUXAPP),y)
> _LDLIBS-$(CONFIG_RTE_LIBRTE_KNI) += -lrte_kni
> @@ -135,6 +135,9 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_NULL) += -lrte_pmd_null
> _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_PCAP) += -lrte_pmd_pcap -lpcap
> _LDLIBS-$(CONFIG_RTE_LIBRTE_QEDE_PMD) += -lrte_pmd_qede
> _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_RING) += -lrte_pmd_ring
> +ifeq ($(CONFIG_RTE_LIBRTE_SCHED),y)
> +_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_SOFTNIC) += -lrte_pmd_softnic
> +endif
> _LDLIBS-$(CONFIG_RTE_LIBRTE_SFC_EFX_PMD) += -lrte_pmd_sfc_efx
> _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_SZEDATA2) += -lrte_pmd_szedata2 -lsze2
> _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_TAP) += -lrte_pmd_tap
> --
> 2.9.3
^ permalink raw reply [flat|nested] 79+ messages in thread
* Re: [dpdk-dev] [PATCH v4 1/4] net/softnic: add softnic PMD
2017-09-18 16:58 ` Singh, Jasvinder
@ 2017-09-18 19:09 ` Thomas Monjalon
0 siblings, 0 replies; 79+ messages in thread
From: Thomas Monjalon @ 2017-09-18 19:09 UTC (permalink / raw)
To: Singh, Jasvinder; +Cc: dev, Dumitrescu, Cristian, Yigit, Ferruh
18/09/2017 18:58, Singh, Jasvinder:
> Hi Thomas,
>
> I don't see this patch in patchwork, although it is present in email archive. Any guess why it is not showing up there?
No idea.
Not a big deal, others are there.
^ permalink raw reply [flat|nested] 79+ messages in thread
* Re: [dpdk-dev] [PATCH v4 0/4] net/softnic: sw fall-back pmd for traffic mgmt and others
2017-09-18 9:10 ` [dpdk-dev] [PATCH v4 0/4] net/softnic: sw fall-back pmd for traffic mgmt and others Jasvinder Singh
` (3 preceding siblings ...)
2017-09-18 9:10 ` [dpdk-dev] [PATCH v4 4/4] net/softnic: add TM hierarchy related ops Jasvinder Singh
@ 2017-09-20 15:35 ` Thomas Monjalon
2017-09-22 22:07 ` Singh, Jasvinder
2017-10-06 10:40 ` Dumitrescu, Cristian
4 siblings, 2 replies; 79+ messages in thread
From: Thomas Monjalon @ 2017-09-20 15:35 UTC (permalink / raw)
To: Jasvinder Singh, cristian.dumitrescu; +Cc: dev, ferruh.yigit
Hi,
18/09/2017 11:10, Jasvinder Singh:
> The SoftNIC PMD is intended to provide SW fall-back options for specific
> ethdev APIs in a generic way to the NICs not supporting those features.
I agree it is important to have a solution in DPDK to better manage
SW fallbacks. One question is to know whether we can implement and
maintain many solutions. We probably must choose only one solution.
I have not read the code. I am just interested in the design for now.
I think it is a smart idea but maybe less convenient than calling fallback
from ethdev API glue code. My opinion has not changed since v1.
Thanks for the detailed explanations. Let's discuss below.
[...]
> * RX/TX: The app reads packets from/writes packets to the "soft" port
> instead of the "hard" port. The RX and TX queues of the "soft" port are
> thread safe, as any ethdev.
"thread safe as any ethdev"?
I would say the ethdev queues are not thread safe.
[...]
> * Meets the NFV vision: The app should be (almost) agnostic about the NIC
> implementation (different vendors/models, HW-SW mix), the app should not
> require changes to use different NICs, the app should use the same API
> for all NICs. If a NIC does not implement a specific feature, the HW
> should be augmented with SW to meet the functionality while still
> preserving the same API.
This goal could also be achieved by adding the SW capability to the API.
After getting capabilities of a hardware, the app could set the capability
of some driver features to "SW fallback".
So the capability would become a tristate:
- not supported
- HW supported
- SW supported
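For illustration only, a minimal sketch of such a tristate capability is shown below; the enum and its value names are hypothetical and not part of any existing ethdev API:

/* Hypothetical per-feature capability state (e.g. reported for TM) */
enum rte_eth_feature_support {
	RTE_ETH_FEATURE_UNSUPPORTED = 0, /* feature not available */
	RTE_ETH_FEATURE_HW_SUPPORTED,    /* implemented by the NIC hardware */
	RTE_ETH_FEATURE_SW_SUPPORTED,    /* implemented by a SW fallback */
};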
The unique-API goal is defeated if we must manage two ports:
the HW port for some features and the softnic port for other features.
You explain it in A5 below.
[...]
> Example: Create "soft" port for "hard" port "0000:04:00.1", enable the TM
> feature with default settings:
> --vdev 'net_softnic0,hard_name=0000:04:00.1,soft_tm=on'
So the app will use only the vdev net_softnic0 which will forward packets
to 0000:04:00.1?
Can we say in this example that net_softnic0 owns 0000:04:00.1?
Probably not, because the config of the HW must be done separately (cf. Q5).
See my "ownership proposal":
http://dpdk.org/ml/archives/dev/2017-September/074656.html
The issue I see in this example is that we must define how to enable
every features. It should be equivalent to defining the ethdev capabilities.
In this example, the option soft_tm=on is probably not enough fine-grain.
We could support some parts of TM API in HW and other parts in SW.
[...]
> Q3: Why not change the "hard" device (and keep a single device) instead of
> creating a new "soft" device (and thus having two devices)?
> A3: This is not possible with the current librte_ether ethdev
> implementation. The ethdev->dev_ops are defined as constant structure,
> so it cannot be changed per device (nor per PMD). The new ops also
> need memory space to store their context data structures, which
> requires updating the ethdev->data->dev_private of the existing
> device; at best, maybe a resize of ethdev->data->dev_private could be
> done, assuming that librte_ether will introduce a way to find out its
> size, but this cannot be done while device is running. Other side
> effects might exist, as the changes are very intrusive, plus it likely
> needs more changes in librte_ether.
Q3 is about calling SW fallback from the driver code, right?
We must not implement fallbacks in drivers because it would hide
it to the application.
If a feature is not available in hardware, the application can choose
to bypass this feature or integrate the fallback in its own workflow.
> Q4: Why not call the SW fall-back dev_ops directly in librte_ether for
> devices which do not support the specific feature? If the device
> supports the capability, let's call its dev_ops, otherwise call the
> SW fall-back dev_ops.
> A4: First, similar reasons to Q&A3. This fixes the need to change
> ethdev->dev_ops of the device, but it does not do anything to fix the
> other significant issue of where to store the context data structures
> needed by the SW fall-back functions (which, in this approach, are
> called implicitly by librte_ether).
> Second, the SW fall-back options should not be restricted arbitrarily
> by the librte_ether library, the decision should belong to the app.
> For example, the TM SW fall-back should not be limited to only
> librte_sched, which (like any SW fall-back) is limited to a specific
> hierarchy and feature set, it cannot do any possible hierarchy. If
> alternatives exist, the one to use should be picked by the app, not by
> the ethdev layer.
Q4 is about calling SW callback from the API glue code, right?
We could summarize Q3/Q4 as "it could be done but we propose another way".
I think we must consider the pros and cons of both approaches from
a user perspective.
I agree the application must decide which fallback to use.
We could propose one fallback in ethdev which can be enabled explicitly
(see my tristate capabilities proposal above).
> Q5: Why is the app required to continue to configure both the "hard" and
> the "soft" devices even after the "soft" device has been created? Why
> not hiding the "hard" device under the "soft" device and have the
> "soft" device configure the "hard" device under the hood?
> A5: This was the approach tried in the V2 of this patch set (overlay
> "soft" device taking over the configuration of the underlay "hard"
> device) and eventually dropped due to increased complexity of having
> to keep the configuration of two distinct devices in sync with
> librte_ether implementation that is not friendly towards such
> approach. Basically, each ethdev API call for the overlay device
> needs to configure the overlay device, invoke the same configuration
> with possibly modified parameters for the underlay device, then resume
> the configuration of overlay device, turning this into a device
> emulation project.
> V2 minuses: increased complexity (deal with two devices at same time);
> need to implement every ethdev API, even those not needed for the scope
> of SW fall-back; intrusive; sometimes have to silently take decisions
> that should be left to the app.
> V3 pluses: lower complexity (only one device); only need to implement
> those APIs that are in scope of the SW fall-back; non-intrusive (deal
> with "hard" device through ethdev API); app decisions taken by the app
> in an explicit way.
I think it is breaking what you call the NFV vision in several places.
[...]
> 9. [rte_ring proliferation] Thread safety requirements for ethdev
> RX/TXqueues require an rte_ring to be used for every RX/TX queue
> of each "soft" ethdev. This rte_ring proliferation unnecessarily
> increases the memory footprint and lowers performance, especially
> when each "soft" ethdev ends up on a different CPU core (ping-pong
> of cache lines).
I am curious to understand why you consider thread safety as a requirement
for queues. No need to reply here, the question is already asked
at the beginning of this email ;)
^ permalink raw reply [flat|nested] 79+ messages in thread
* Re: [dpdk-dev] [PATCH v4 0/4] net/softnic: sw fall-back pmd for traffic mgmt and others
2017-09-20 15:35 ` [dpdk-dev] [PATCH v4 0/4] net/softnic: sw fall-back pmd for traffic mgmt and others Thomas Monjalon
@ 2017-09-22 22:07 ` Singh, Jasvinder
2017-10-06 10:40 ` Dumitrescu, Cristian
1 sibling, 0 replies; 79+ messages in thread
From: Singh, Jasvinder @ 2017-09-22 22:07 UTC (permalink / raw)
To: Thomas Monjalon, Dumitrescu, Cristian; +Cc: dev, Yigit, Ferruh
Hi Thomas,
> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas@monjalon.net]
> Sent: Wednesday, September 20, 2017 4:36 PM
> To: Singh, Jasvinder <jasvinder.singh@intel.com>; Dumitrescu, Cristian
> <cristian.dumitrescu@intel.com>
> Cc: dev@dpdk.org; Yigit, Ferruh <ferruh.yigit@intel.com>
> Subject: Re: [dpdk-dev] [PATCH v4 0/4] net/softnic: sw fall-back pmd for
> traffic mgmt and others
>
> Hi,
>
> 18/09/2017 11:10, Jasvinder Singh:
> > The SoftNIC PMD is intended to provide SW fall-back options for
> > specific ethdev APIs in a generic way to the NICs not supporting those
> features.
>
> I agree it is important to have a solution in DPDK to better manage SW
> fallbacks. One question is to know whether we can implement and maintain
> many solutions. We probably must choose only one solution.
>
> I have not read the code. I am just interested in the design for now.
> I think it is a smart idea but maybe less convenient than calling fallback from
> ethdev API glue code. My opinion has not changed since v1.
IMHO, calling the fallback from ethdev API glue code suffers from a scalability issue. Let's assume another
SW fallback implementation for TM, or for one of its specific features, becomes available. What will the approach be when we already have something glued into the TM API?
The softnic could be considered a placeholder for adding and enabling more features at any granularity, in addition
to providing the complete TM feature.
> Thanks for the detailed explanations. Let's discuss below.
>
> [...]
> > * RX/TX: The app reads packets from/writes packets to the "soft" port
> > instead of the "hard" port. The RX and TX queues of the "soft" port are
> > thread safe, as any ethdev.
>
> "thread safe as any ethdev"?
> I would say the ethdev queues are not thread safe.
[Jasvinder] Agree.
> [...]
> > * Meets the NFV vision: The app should be (almost) agnostic about the NIC
> > implementation (different vendors/models, HW-SW mix), the app should
> not
> > require changes to use different NICs, the app should use the same API
> > for all NICs. If a NIC does not implement a specific feature, the HW
> > should be augmented with SW to meet the functionality while still
> > preserving the same API.
>
> This goal could also be achieved by adding the SW capability to the API.
> After getting capabilities of a hardware, the app could set the capability of
> some driver features to "SW fallback".
> So the capability would become a tristate:
> - not supported
> - HW supported
> - SW supported
>
> The unique API goal is failed if we must manage two ports, the HW port for
> some features and the softnic port for other features.
> You explain it in A5 below.
[Jasvinder] The TM API is agnostic to the underlying implementation and allows applications to implement the
solution in SW, HW or a hybrid of HW and SW at any granularity, and on any number of devices, depending
upon the availability of features. No restriction. Thus, managing and configuring the devices (physical and virtual) through the
high-level API is at the disposal of the application-level framework. When the softnic device is enabled, the application sends and receives
packets through the soft device instead of the hard device, as the soft device implements the features missing in the hard device. That doesn't
mean the softnic device should hide the hard device, nor does it prevent the application from communicating directly with the hard device.
If desired, the application can bypass the softnic device and send TX packets straight to the hard device through the queues not used by the soft device.
> [...]
> > Example: Create "soft" port for "hard" port "0000:04:00.1", enable the
> > TM feature with default settings:
> > --vdev 'net_softnic0,hard_name=0000:04:00.1,soft_tm=on'
>
> So the app will use only the vdev net_softnic0 which will forward packets to
> 0000:04:00.1?
> Can we say in this example that net_softnic0 owns 0000:04:00.1?
> Probably not, because the config of the HW must be done separately (cf. Q5).
> See my "ownership proposal":
> http://dpdk.org/ml/archives/dev/2017-September/074656.html
>
> The issue I see in this example is that we must define how to enable every
> features. It should be equivalent to defining the ethdev capabilities.
> In this example, the option soft_tm=on is probably not enough fine-grain.
> We could support some parts of TM API in HW and other parts in SW.
>
[Jasvinder] - This is one instance where the complete hierarchical scheduler is presented as the software fallback. But the
approach doesn't prevent adding more features (at any granularity) to the softnic and enabling them by
naming the corresponding arguments during device creation.
> [...]
> > Q3: Why not change the "hard" device (and keep a single device) instead of
> > creating a new "soft" device (and thus having two devices)?
> > A3: This is not possible with the current librte_ether ethdev
> > implementation. The ethdev->dev_ops are defined as constant structure,
> > so it cannot be changed per device (nor per PMD). The new ops also
> > need memory space to store their context data structures, which
> > requires updating the ethdev->data->dev_private of the existing
> > device; at best, maybe a resize of ethdev->data->dev_private could be
> > done, assuming that librte_ether will introduce a way to find out its
> > size, but this cannot be done while device is running. Other side
> > effects might exist, as the changes are very intrusive, plus it likely
> > needs more changes in librte_ether.
>
> Q3 is about calling SW fallback from the driver code, right?
>
> We must not implement fallbacks in drivers because it would hide it to the
> application.
> If a feature is not available in hardware, the application can choose to bypass
> this feature or integrate the fallback in its own workflow.
[Jasvinder]: Naturally, if the hard device has the TM feature, the TM-specific ops invoked by the API functions are implemented by its PMD.
A similar approach is followed in the SW fallback solution, where the softnic port complements the hard device.
> > Q4: Why not call the SW fall-back dev_ops directly in librte_ether for
> > devices which do not support the specific feature? If the device
> > supports the capability, let's call its dev_ops, otherwise call the
> > SW fall-back dev_ops.
> > A4: First, similar reasons to Q&A3. This fixes the need to change
> > ethdev->dev_ops of the device, but it does not do anything to fix the
> > other significant issue of where to store the context data structures
> > needed by the SW fall-back functions (which, in this approach, are
> > called implicitly by librte_ether).
> > Second, the SW fall-back options should not be restricted arbitrarily
> > by the librte_ether library, the decision should belong to the app.
> > For example, the TM SW fall-back should not be limited to only
> > librte_sched, which (like any SW fall-back) is limited to a specific
> > hierarchy and feature set, it cannot do any possible hierarchy. If
> > alternatives exist, the one to use should be picked by the app, not by
> > the ethdev layer.
>
> Q4 is about calling SW callback from the API glue code, right?
>
> We could summarize Q3/Q4 as "it could be done but we propose another
> way".
> I think we must consider the pros and cons of both approaches from a user
> perspective.
> I agree the application must decide which fallback to use.
> We could propose one fallback in ethdev which can be enabled explicitly (see
> my tristate capabilities proposal above).
[Jasvinder] As explained above as well, the approach of tying the SW solution to the API will
create a scalability issue: how will two different SW solutions coexist, or, for instance, N software solutions?
That's why the softnic (virtual device) is proposed as an alternative which can be extended
to include and enable features.
> > Q5: Why is the app required to continue to configure both the "hard" and
> > the "soft" devices even after the "soft" device has been created? Why
> > not hiding the "hard" device under the "soft" device and have the
> > "soft" device configure the "hard" device under the hood?
> > A5: This was the approach tried in the V2 of this patch set (overlay
> > "soft" device taking over the configuration of the underlay "hard"
> > device) and eventually dropped due to increased complexity of having
> > to keep the configuration of two distinct devices in sync with
> > librte_ether implementation that is not friendly towards such
> > approach. Basically, each ethdev API call for the overlay device
> > needs to configure the overlay device, invoke the same configuration
> > with possibly modified parameters for the underlay device, then resume
> > the configuration of overlay device, turning this into a device
> > emulation project.
> > V2 minuses: increased complexity (deal with two devices at same time);
> > need to implement every ethdev API, even those not needed for the
> scope
> > of SW fall-back; intrusive; sometimes have to silently take decisions
> > that should be left to the app.
> > V3 pluses: lower complexity (only one device); only need to implement
> > those APIs that are in scope of the SW fall-back; non-intrusive (deal
> > with "hard" device through ethdev API); app decisions taken by the app
> > in an explicit way.
>
> I think it is breaking what you call the NFV vision in several places.
[Jasvinder] Mentioning the NFV vision is about hiding the heterogeneous implementation
(HW, SW, HW-SW hybrid) under the abstraction layer provided by the TM API,
instead of restricting the app to using the API on a specific port.
>
> [...]
> > 9. [rte_ring proliferation] Thread safety requirements for ethdev
> > RX/TXqueues require an rte_ring to be used for every RX/TX queue
> > of each "soft" ethdev. This rte_ring proliferation unnecessarily
> > increases the memory footprint and lowers performance, especially
> > when each "soft" ethdev ends up on a different CPU core (ping-pong
> > of cache lines).
>
> I am curious to understand why you consider thread safety as a requirement
> for queues. No need to reply here, the question is already asked at the
> beginning of this email ;)
^ permalink raw reply [flat|nested] 79+ messages in thread
* Re: [dpdk-dev] [PATCH v4 2/4] net/softnic: add traffic management support
2017-09-18 9:10 ` [dpdk-dev] [PATCH v4 2/4] net/softnic: add traffic management support Jasvinder Singh
@ 2017-09-25 1:58 ` Lu, Wenzhuo
2017-09-28 8:14 ` Singh, Jasvinder
2017-09-29 14:04 ` [dpdk-dev] [PATCH v5 0/5] net/softnic: sw fall-back pmd for traffic mgmt and others Jasvinder Singh
1 sibling, 1 reply; 79+ messages in thread
From: Lu, Wenzhuo @ 2017-09-25 1:58 UTC (permalink / raw)
To: Singh, Jasvinder, dev; +Cc: Dumitrescu, Cristian, Yigit, Ferruh, thomas
Hi Jasvinder,
> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Jasvinder Singh
> Sent: Monday, September 18, 2017 5:10 PM
> To: dev@dpdk.org
> Cc: Dumitrescu, Cristian <cristian.dumitrescu@intel.com>; Yigit, Ferruh
> <ferruh.yigit@intel.com>; thomas@monjalon.net
> Subject: [dpdk-dev] [PATCH v4 2/4] net/softnic: add traffic management
> support
>
> Add ethdev Traffic Management API support to SoftNIC PMD.
>
> Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
> Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
> ---
> v3 changes:
> - add more configuration parameters (tm rate, tm queue sizes)
>
> drivers/net/softnic/Makefile | 1 +
> drivers/net/softnic/rte_eth_softnic.c | 252
> +++++++++++++++++++++++-
> drivers/net/softnic/rte_eth_softnic.h | 16 ++
> drivers/net/softnic/rte_eth_softnic_internals.h | 106 +++++++++-
> drivers/net/softnic/rte_eth_softnic_tm.c | 181 +++++++++++++++++
> 5 files changed, 553 insertions(+), 3 deletions(-) create mode 100644
> drivers/net/softnic/rte_eth_softnic_tm.c
>
> static void
> @@ -293,6 +325,77 @@ rte_pmd_softnic_run_default(struct rte_eth_dev
> *dev)
> return 0;
> }
>
> +static __rte_always_inline int
> +rte_pmd_softnic_run_tm(struct rte_eth_dev *dev) {
This function name seems a little misleading. If it's an inline function and not an API, better to just name it 'softnic_run_tm'.
And a common comment on the names, like pmd_feature, tm_params_check, tm_init ...: if they're only for softnic, better to add the prefix 'softnic_' to them.
^ permalink raw reply [flat|nested] 79+ messages in thread
* Re: [dpdk-dev] [PATCH v4 3/4] net/softnic: add TM capabilities ops
2017-09-18 9:10 ` [dpdk-dev] [PATCH v4 3/4] net/softnic: add TM capabilities ops Jasvinder Singh
@ 2017-09-25 2:33 ` Lu, Wenzhuo
2017-09-28 8:16 ` Singh, Jasvinder
0 siblings, 1 reply; 79+ messages in thread
From: Lu, Wenzhuo @ 2017-09-25 2:33 UTC (permalink / raw)
To: Singh, Jasvinder, dev; +Cc: Dumitrescu, Cristian, Yigit, Ferruh, thomas
Hi Jasvinder,
> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Jasvinder Singh
> Sent: Monday, September 18, 2017 5:10 PM
> To: dev@dpdk.org
> Cc: Dumitrescu, Cristian <cristian.dumitrescu@intel.com>; Yigit, Ferruh
> <ferruh.yigit@intel.com>; thomas@monjalon.net
> Subject: [dpdk-dev] [PATCH v4 3/4] net/softnic: add TM capabilities ops
>
> Implement ethdev TM capability APIs in SoftNIC PMD.
>
> Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
> Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
> ---
> drivers/net/softnic/rte_eth_softnic.c | 12 +-
> drivers/net/softnic/rte_eth_softnic_internals.h | 32 ++
> drivers/net/softnic/rte_eth_softnic_tm.c | 500
> ++++++++++++++++++++++++
> 3 files changed, 543 insertions(+), 1 deletion(-)
The same concern about the naming as in patch 2. The function and structure names are too common; better to add the prefix 'softnic_' too.
Except that, the patch looks good to me.
^ permalink raw reply [flat|nested] 79+ messages in thread
* Re: [dpdk-dev] [PATCH v4 4/4] net/softnic: add TM hierarchy related ops
2017-09-18 9:10 ` [dpdk-dev] [PATCH v4 4/4] net/softnic: add TM hierarchy related ops Jasvinder Singh
@ 2017-09-25 7:14 ` Lu, Wenzhuo
2017-09-28 8:39 ` Singh, Jasvinder
0 siblings, 1 reply; 79+ messages in thread
From: Lu, Wenzhuo @ 2017-09-25 7:14 UTC (permalink / raw)
To: Singh, Jasvinder, dev; +Cc: Dumitrescu, Cristian, Yigit, Ferruh, thomas
Hi Jasvinder,
> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Jasvinder Singh
> Sent: Monday, September 18, 2017 5:10 PM
> To: dev@dpdk.org
> Cc: Dumitrescu, Cristian <cristian.dumitrescu@intel.com>; Yigit, Ferruh
> <ferruh.yigit@intel.com>; thomas@monjalon.net
> Subject: [dpdk-dev] [PATCH v4 4/4] net/softnic: add TM hierarchy related ops
>
> Implement ethdev TM hierarchy related APIs in SoftNIC PMD.
>
> Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
> Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
> ---
> drivers/net/softnic/rte_eth_softnic_internals.h | 41 +
> drivers/net/softnic/rte_eth_softnic_tm.c | 2776
> ++++++++++++++++++++++-
> 2 files changed, 2813 insertions(+), 4 deletions(-)
> +
> +static uint32_t
> +tm_node_subport_id(struct rte_eth_dev *dev, struct tm_node
> *subport_node)
> +{
> + struct pmd_internals *p = dev->data->dev_private;
> + struct tm_node_list *nl = &p->soft.tm.h.nodes;
> + struct tm_node *ns;
> + uint32_t subport_id;
> +
> + subport_id = 0;
> + TAILQ_FOREACH(ns, nl, node) {
> + if (ns->level != TM_NODE_LEVEL_SUBPORT)
> + continue;
> +
> + if (ns->node_id == subport_node->node_id)
> + return subport_id;
> +
> + subport_id++;
> + }
> +
> + return UINT32_MAX;
UINT32_MAX means an invalid number, right? Better to define a specific macro for the invalid number, in case you may not want to use 0xff.. or uint32.
The same suggestion for the functions below.
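For illustration only, such a macro could look like the sketch below; the name is hypothetical and not taken from the patch:

/* Hypothetical invalid-ID marker, instead of a bare UINT32_MAX return */
#define TM_NODE_ID_INVALID UINT32_MAX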
> +static int
> +shaper_profile_check(struct rte_eth_dev *dev,
> + uint32_t shaper_profile_id,
> + struct rte_tm_shaper_params *profile,
> + struct rte_tm_error *error)
> +{
> + struct tm_shaper_profile *sp;
> +
> + /* Shaper profile ID must not be NONE. */
> + if (shaper_profile_id == RTE_TM_SHAPER_PROFILE_ID_NONE)
> + return -rte_tm_error_set(error,
> + EINVAL,
> + RTE_TM_ERROR_TYPE_SHAPER_PROFILE_ID,
> + NULL,
> + rte_strerror(EINVAL));
> +
> + /* Shaper profile must not exist. */
> + sp = tm_shaper_profile_search(dev, shaper_profile_id);
> + if (sp)
> + return -rte_tm_error_set(error,
> + EEXIST,
> + RTE_TM_ERROR_TYPE_SHAPER_PROFILE_ID,
> + NULL,
> + rte_strerror(EEXIST));
> +
> + /* Profile must not be NULL. */
> + if (profile == NULL)
> + return -rte_tm_error_set(error,
> + EINVAL,
> + RTE_TM_ERROR_TYPE_SHAPER_PROFILE,
> + NULL,
> + rte_strerror(EINVAL));
A slight suggestion: we could do the easiest check first.
> +
> +/* Traffic manager shaper profile add */
> +static int
> +pmd_tm_shaper_profile_add(struct rte_eth_dev *dev,
> + uint32_t shaper_profile_id,
> + struct rte_tm_shaper_params *profile,
> + struct rte_tm_error *error)
> +{
> + struct pmd_internals *p = dev->data->dev_private;
> + struct tm_shaper_profile_list *spl = &p->soft.tm.h.shaper_profiles;
> + struct tm_shaper_profile *sp;
> + int status;
> +
> + /* Check input params */
> + status = shaper_profile_check(dev, shaper_profile_id, profile, error);
> + if (status)
> + return status;
> +
> + /* Memory allocation */
> + sp = calloc(1, sizeof(struct tm_shaper_profile));
Just curious, why not use rte_zmalloc?
> + if (sp == NULL)
> + return -rte_tm_error_set(error,
> + ENOMEM,
> + RTE_TM_ERROR_TYPE_UNSPECIFIED,
> + NULL,
> + rte_strerror(ENOMEM));
> +
> + /* Fill in */
> + sp->shaper_profile_id = shaper_profile_id;
> + memcpy(&sp->params, profile, sizeof(sp->params));
> +
> + /* Add to list */
> + TAILQ_INSERT_TAIL(spl, sp, node);
> + p->soft.tm.h.n_shaper_profiles++;
> +
> + return 0;
> +}
> +
> +
> +static struct tm_node *
> +tm_shared_shaper_get_tc(struct rte_eth_dev *dev,
> + struct tm_shared_shaper *ss)
> +{
> + struct pmd_internals *p = dev->data->dev_private;
> + struct tm_node_list *nl = &p->soft.tm.h.nodes;
> + struct tm_node *n;
> +
> + TAILQ_FOREACH(n, nl, node) {
> + if ((n->level != TM_NODE_LEVEL_TC) ||
> + (n->params.n_shared_shapers == 0) ||
> + (n->params.shared_shaper_id[0] != ss-
> >shared_shaper_id))
According to node_add_check_tc, only one shared shaper is supported, right? Better to add some comments here?
> + continue;
> +
> + return n;
> + }
> +
> + return NULL;
> +}
> +
> +static void
> +pipe_profile_build(struct rte_eth_dev *dev,
> + struct tm_node *np,
> + struct rte_sched_pipe_params *pp)
> +{
> + struct pmd_internals *p = dev->data->dev_private;
> + struct tm_hierarchy *h = &p->soft.tm.h;
> + struct tm_node_list *nl = &h->nodes;
> + struct tm_node *nt, *nq;
> +
> + memset(pp, 0, sizeof(*pp));
> +
> + /* Pipe */
> + pp->tb_rate = np->shaper_profile->params.peak.rate;
> + pp->tb_size = np->shaper_profile->params.peak.size;
> +
> + /* Traffic Class (TC) */
> + pp->tc_period = 40;
What does 40 mean? Wouldn't a macro be better?
> +
> +static int
> +pipe_profile_free_exists(struct rte_eth_dev *dev,
> + uint32_t *pipe_profile_id)
> +{
> + struct pmd_internals *p = dev->data->dev_private;
> + struct tm_params *t = &p->soft.tm.params;
> +
> + if (t->n_pipe_profiles < RTE_SCHED_PIPE_PROFILES_PER_PORT) {
> + *pipe_profile_id = t->n_pipe_profiles;
> + return 1;
Returning true or false is easier to understand?
Also the same concern of the naming as patch 3.
^ permalink raw reply [flat|nested] 79+ messages in thread
* Re: [dpdk-dev] [PATCH v4 2/4] net/softnic: add traffic management support
2017-09-25 1:58 ` Lu, Wenzhuo
@ 2017-09-28 8:14 ` Singh, Jasvinder
0 siblings, 0 replies; 79+ messages in thread
From: Singh, Jasvinder @ 2017-09-28 8:14 UTC (permalink / raw)
To: Lu, Wenzhuo, dev; +Cc: Dumitrescu, Cristian, Yigit, Ferruh, thomas
>
> Hi Jasvinder,
>
> > -----Original Message-----
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Jasvinder Singh
> > Sent: Monday, September 18, 2017 5:10 PM
> > To: dev@dpdk.org
> > Cc: Dumitrescu, Cristian <cristian.dumitrescu@intel.com>; Yigit,
> > Ferruh <ferruh.yigit@intel.com>; thomas@monjalon.net
> > Subject: [dpdk-dev] [PATCH v4 2/4] net/softnic: add traffic management
> > support
> >
> > Add ethdev Traffic Management API support to SoftNIC PMD.
> >
> > Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
> > Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
> > ---
> > v3 changes:
> > - add more confguration parameters (tm rate, tm queue sizes)
> >
> > drivers/net/softnic/Makefile | 1 +
> > drivers/net/softnic/rte_eth_softnic.c | 252
> > +++++++++++++++++++++++-
> > drivers/net/softnic/rte_eth_softnic.h | 16 ++
> > drivers/net/softnic/rte_eth_softnic_internals.h | 106 +++++++++-
> > drivers/net/softnic/rte_eth_softnic_tm.c | 181 +++++++++++++++++
> > 5 files changed, 553 insertions(+), 3 deletions(-) create mode
> > 100644 drivers/net/softnic/rte_eth_softnic_tm.c
>
>
>
> >
> > static void
> > @@ -293,6 +325,77 @@ rte_pmd_softnic_run_default(struct rte_eth_dev
> > *dev)
> > return 0;
> > }
> >
> > +static __rte_always_inline int
> > +rte_pmd_softnic_run_tm(struct rte_eth_dev *dev) {
> This function name seems a little misleading. If it's a inline function not an
> API, better just name it 'softnic_run_tm".
> And a common comments for the names, like, pmd_feature,
> tm_params_check, tm_init ... if they're only for soft nic, better add the prefix
> 'softnc_' for them.
Ok, will change the above function name for clarity. Thanks.
^ permalink raw reply [flat|nested] 79+ messages in thread
* Re: [dpdk-dev] [PATCH v4 3/4] net/softnic: add TM capabilities ops
2017-09-25 2:33 ` Lu, Wenzhuo
@ 2017-09-28 8:16 ` Singh, Jasvinder
0 siblings, 0 replies; 79+ messages in thread
From: Singh, Jasvinder @ 2017-09-28 8:16 UTC (permalink / raw)
To: Lu, Wenzhuo, dev; +Cc: Dumitrescu, Cristian, Yigit, Ferruh, thomas
> Hi Jasvinder,
>
> > -----Original Message-----
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Jasvinder Singh
> > Sent: Monday, September 18, 2017 5:10 PM
> > To: dev@dpdk.org
> > Cc: Dumitrescu, Cristian <cristian.dumitrescu@intel.com>; Yigit,
> > Ferruh <ferruh.yigit@intel.com>; thomas@monjalon.net
> > Subject: [dpdk-dev] [PATCH v4 3/4] net/softnic: add TM capabilities
> > ops
> >
> > Implement ethdev TM capability APIs in SoftNIC PMD.
> >
> > Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
> > Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
> > ---
> > drivers/net/softnic/rte_eth_softnic.c | 12 +-
> > drivers/net/softnic/rte_eth_softnic_internals.h | 32 ++
> > drivers/net/softnic/rte_eth_softnic_tm.c | 500
> > ++++++++++++++++++++++++
> > 3 files changed, 543 insertions(+), 1 deletion(-)
> The same concern of the naming as patch 2.The function and structure
> names are too common. Better add the prefix 'softnic_' too.
> Except that, the patch looks good to me.
Ok, thanks. Will clarify and add comments as well.
^ permalink raw reply [flat|nested] 79+ messages in thread
* Re: [dpdk-dev] [PATCH v4 4/4] net/softnic: add TM hierarchy related ops
2017-09-25 7:14 ` Lu, Wenzhuo
@ 2017-09-28 8:39 ` Singh, Jasvinder
0 siblings, 0 replies; 79+ messages in thread
From: Singh, Jasvinder @ 2017-09-28 8:39 UTC (permalink / raw)
To: Lu, Wenzhuo, dev; +Cc: Dumitrescu, Cristian, Yigit, Ferruh, thomas
> Hi Jasvinder,
>
>
> > -----Original Message-----
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Jasvinder Singh
> > Sent: Monday, September 18, 2017 5:10 PM
> > To: dev@dpdk.org
> > Cc: Dumitrescu, Cristian <cristian.dumitrescu@intel.com>; Yigit,
> > Ferruh <ferruh.yigit@intel.com>; thomas@monjalon.net
> > Subject: [dpdk-dev] [PATCH v4 4/4] net/softnic: add TM hierarchy
> > related ops
> >
> > Implement ethdev TM hierarchy related APIs in SoftNIC PMD.
> >
> > Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
> > Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
> > ---
> > drivers/net/softnic/rte_eth_softnic_internals.h | 41 +
> > drivers/net/softnic/rte_eth_softnic_tm.c | 2776
> > ++++++++++++++++++++++-
> > 2 files changed, 2813 insertions(+), 4 deletions(-)
>
>
> > +
> > +static uint32_t
> > +tm_node_subport_id(struct rte_eth_dev *dev, struct tm_node
> > *subport_node)
> > +{
> > + struct pmd_internals *p = dev->data->dev_private;
> > + struct tm_node_list *nl = &p->soft.tm.h.nodes;
> > + struct tm_node *ns;
> > + uint32_t subport_id;
> > +
> > + subport_id = 0;
> > + TAILQ_FOREACH(ns, nl, node) {
> > + if (ns->level != TM_NODE_LEVEL_SUBPORT)
> > + continue;
> > +
> > + if (ns->node_id == subport_node->node_id)
> > + return subport_id;
> > +
> > + subport_id++;
> > + }
> > +
> > + return UINT32_MAX;
> UINT32_MAX means invalid number, right? Better define a specific MACRO
> for the invalid number in case you may not want to use 0xff.. or uint32.
> The same suggestion for the below functions.
Ok.
> > +static int
> > +shaper_profile_check(struct rte_eth_dev *dev,
> > + uint32_t shaper_profile_id,
> > + struct rte_tm_shaper_params *profile,
> > + struct rte_tm_error *error)
> > +{
> > + struct tm_shaper_profile *sp;
> > +
> > + /* Shaper profile ID must not be NONE. */
> > + if (shaper_profile_id == RTE_TM_SHAPER_PROFILE_ID_NONE)
> > + return -rte_tm_error_set(error,
> > + EINVAL,
> > + RTE_TM_ERROR_TYPE_SHAPER_PROFILE_ID,
> > + NULL,
> > + rte_strerror(EINVAL));
> > +
> > + /* Shaper profile must not exist. */
> > + sp = tm_shaper_profile_search(dev, shaper_profile_id);
> > + if (sp)
> > + return -rte_tm_error_set(error,
> > + EEXIST,
> > + RTE_TM_ERROR_TYPE_SHAPER_PROFILE_ID,
> > + NULL,
> > + rte_strerror(EEXIST));
> > +
> > + /* Profile must not be NULL. */
> > + if (profile == NULL)
> > + return -rte_tm_error_set(error,
> > + EINVAL,
> > + RTE_TM_ERROR_TYPE_SHAPER_PROFILE,
> > + NULL,
> > + rte_strerror(EINVAL));
> A slight suggestion. We can do the easiest check at first.
We preferred to perform the checks in the order the arguments appear in the function definition, so that all the parameters could
be scanned in a systematic manner instead of being picked randomly.
> > +
> > +/* Traffic manager shaper profile add */ static int
> > +pmd_tm_shaper_profile_add(struct rte_eth_dev *dev,
> > + uint32_t shaper_profile_id,
> > + struct rte_tm_shaper_params *profile,
> > + struct rte_tm_error *error)
> > +{
> > + struct pmd_internals *p = dev->data->dev_private;
> > + struct tm_shaper_profile_list *spl = &p->soft.tm.h.shaper_profiles;
> > + struct tm_shaper_profile *sp;
> > + int status;
> > +
> > + /* Check input params */
> > + status = shaper_profile_check(dev, shaper_profile_id, profile, error);
> > + if (status)
> > + return status;
> > +
> > + /* Memory allocation */
> > + sp = calloc(1, sizeof(struct tm_shaper_profile));
> Just curious, why not use rte_zmalloc?
This relates to the high-level hierarchy specification objects, which don't need to be allocated on a specific
NUMA node as they are not used once the hierarchy is committed. All these objects eventually get translated into
TM implementation (librte_sched) specific objects. Those objects are allocated using rte_zmalloc and need
lots of memory (approx. 2M mbufs for a single instance of the TM hierarchy ports) on a specific NUMA node.
> > + if (sp == NULL)
> > + return -rte_tm_error_set(error,
> > + ENOMEM,
> > + RTE_TM_ERROR_TYPE_UNSPECIFIED,
> > + NULL,
> > + rte_strerror(ENOMEM));
> > +
> > + /* Fill in */
> > + sp->shaper_profile_id = shaper_profile_id;
> > + memcpy(&sp->params, profile, sizeof(sp->params));
> > +
> > + /* Add to list */
> > + TAILQ_INSERT_TAIL(spl, sp, node);
> > + p->soft.tm.h.n_shaper_profiles++;
> > +
> > + return 0;
> > +}
> > +
>
> > +
> > +static struct tm_node *
> > +tm_shared_shaper_get_tc(struct rte_eth_dev *dev,
> > + struct tm_shared_shaper *ss)
> > +{
> > + struct pmd_internals *p = dev->data->dev_private;
> > + struct tm_node_list *nl = &p->soft.tm.h.nodes;
> > + struct tm_node *n;
> > +
> > + TAILQ_FOREACH(n, nl, node) {
> > + if ((n->level != TM_NODE_LEVEL_TC) ||
> > + (n->params.n_shared_shapers == 0) ||
> > + (n->params.shared_shaper_id[0] != ss-
> > >shared_shaper_id))
> According to node_add_check_tc, only one shared shaper supported, right?
> Better adding some comments here?
The subport has 4 shared shapers, one for each of the pipe traffic classes. Will add a comment.
> > + continue;
> > +
> > + return n;
> > + }
> > +
> > + return NULL;
> > +}
>
>
>
> > +
> > +static void
> > +pipe_profile_build(struct rte_eth_dev *dev,
> > + struct tm_node *np,
> > + struct rte_sched_pipe_params *pp)
> > +{
> > + struct pmd_internals *p = dev->data->dev_private;
> > + struct tm_hierarchy *h = &p->soft.tm.h;
> > + struct tm_node_list *nl = &h->nodes;
> > + struct tm_node *nt, *nq;
> > +
> > + memset(pp, 0, sizeof(*pp));
> > +
> > + /* Pipe */
> > + pp->tb_rate = np->shaper_profile->params.peak.rate;
> > + pp->tb_size = np->shaper_profile->params.peak.size;
> > +
> > + /* Traffic Class (TC) */
> > + pp->tc_period = 40;
> 40 means? A MACRO is better?
will add macro. Thanks.
> > +
> > +static int
> > +pipe_profile_free_exists(struct rte_eth_dev *dev,
> > + uint32_t *pipe_profile_id)
> > +{
> > + struct pmd_internals *p = dev->data->dev_private;
> > + struct tm_params *t = &p->soft.tm.params;
> > +
> > + if (t->n_pipe_profiles < RTE_SCHED_PIPE_PROFILES_PER_PORT) {
> > + *pipe_profile_id = t->n_pipe_profiles;
> > + return 1;
> Returning true or false is easier to understand?
Ok.
> Also the same concern of the naming as patch 3.
Ok. Will bring clarity to the names.
Thanks for the comments and time.
^ permalink raw reply [flat|nested] 79+ messages in thread
* [dpdk-dev] [PATCH v5 0/5] net/softnic: sw fall-back pmd for traffic mgmt and others
2017-09-18 9:10 ` [dpdk-dev] [PATCH v4 2/4] net/softnic: add traffic management support Jasvinder Singh
2017-09-25 1:58 ` Lu, Wenzhuo
@ 2017-09-29 14:04 ` Jasvinder Singh
2017-09-29 14:04 ` [dpdk-dev] [PATCH v5 1/5] net/softnic: add softnic PMD Jasvinder Singh
` (4 more replies)
1 sibling, 5 replies; 79+ messages in thread
From: Jasvinder Singh @ 2017-09-29 14:04 UTC (permalink / raw)
To: dev; +Cc: cristian.dumitrescu, ferruh.yigit, thomas
The SoftNIC PMD is intended to provide SW fall-back options for specific
ethdev APIs in a generic way to the NICs not supporting those features.
Currently, the only implemented ethdev API is Traffic Management (TM),
but other ethdev APIs such as rte_flow, traffic metering & policing, etc
can be easily implemented.
Overview:
* Generic: The SoftNIC PMD works with any "hard" PMD that implements the
ethdev API. It does not change the "hard" PMD in any way.
* Creation: For any given "hard" ethdev port, the user can decide to
create an associated "soft" ethdev port to drive the "hard" port. The
"soft" port is a virtual device that can be created at app start-up
through EAL vdev arg or later through the virtual device API.
* Configuration: The app explicitly decides which features are to be
enabled on the "soft" port and which features are still to be used from
the "hard" port. The app continues to explicitly configure both the
"hard" and the "soft" ports after the creation of the "soft" port.
* RX/TX: The app reads packets from/writes packets to the "soft" port
instead of the "hard" port. The RX and TX queues of the "soft" port are
thread safe, as any ethdev.
* Execution: The "soft" port is a feature-rich NIC implemented by the CPU,
so the run function of the "soft" port has to be executed by the CPU in
order to get packets moving between "hard" port and the app.
* Meets the NFV vision: The app should be (almost) agnostic about the NIC
implementation (different vendors/models, HW-SW mix), the app should not
require changes to use different NICs, the app should use the same API
for all NICs. If a NIC does not implement a specific feature, the HW
should be augmented with SW to meet the functionality while still
preserving the same API.
Traffic Management SW fall-back overview:
* Implements the ethdev traffic management API (rte_tm.h).
* Based on the existing librte_sched DPDK library.
Example: Create "soft" port for "hard" port "0000:04:00.1", enable the TM
feature with default settings:
--vdev 'net_softnic0,hard_name=0000:04:00.1,soft_tm=on'
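As a rough illustration (not part of this patch set): once the "soft" port exists, an application datapath thread might drive it as sketched below. The function name, the soft_port_id variable and the burst size are assumptions made for the example; rte_pmd_softnic_run() is the run function exported by this PMD.

#include <rte_ethdev.h>
#include <rte_mbuf.h>
#include "rte_eth_softnic.h"

/* Hypothetical app loop: RX/TX on the "soft" port, plus the run call */
static void
app_softnic_loop(uint8_t soft_port_id)
{
	struct rte_mbuf *pkts[32];
	uint16_t nb_rx, nb_tx;

	for ( ; ; ) {
		/* The app talks to the "soft" port, not the "hard" port */
		nb_rx = rte_eth_rx_burst(soft_port_id, 0, pkts, 32);

		/* ... classify packets and fill in the mbuf sched metadata ... */

		nb_tx = rte_eth_tx_burst(soft_port_id, 0, pkts, nb_rx);
		while (nb_tx < nb_rx)
			rte_pktmbuf_free(pkts[nb_tx++]); /* drop what was not enqueued */

		/* The CPU must run the "soft" port so that packets move through
		 * the TM scheduler and out through the "hard" port.
		 */
		rte_pmd_softnic_run(soft_port_id);
	}
}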
Q1: Why generic name, if only TM is supported (for now)?
A1: The intention is to have SoftNIC PMD implement many other (all?)
ethdev APIs under a single "ideal" ethdev, hence the generic name.
The initial motivation is TM API, but the mechanism is generic and can
be used for many other ethdev APIs. Somebody looking to provide SW
fall-back for other ethdev API is likely to end up inventing the same,
hence it would be good to consolidate all under a single PMD and have
the user explicitly enable/disable the features it needs for each
"soft" device.
Q2: Are there any performance requirements for SoftNIC?
A2: Yes, performance should be great/decent for every feature, otherwise
the SW fall-back is unusable, thus useless.
Q3: Why not change the "hard" device (and keep a single device) instead of
creating a new "soft" device (and thus having two devices)?
A3: This is not possible with the current librte_ether ethdev
implementation. The ethdev->dev_ops are defined as constant structure,
so it cannot be changed per device (nor per PMD). The new ops also
need memory space to store their context data structures, which
requires updating the ethdev->data->dev_private of the existing
device; at best, maybe a resize of ethdev->data->dev_private could be
done, assuming that librte_ether will introduce a way to find out its
size, but this cannot be done while device is running. Other side
effects might exist, as the changes are very intrusive, plus it likely
needs more changes in librte_ether.
Q4: Why not call the SW fall-back dev_ops directly in librte_ether for
devices which do not support the specific feature? If the device
supports the capability, let's call its dev_ops, otherwise call the
SW fall-back dev_ops.
A4: First, similar reasons to Q&A3. This fixes the need to change
ethdev->dev_ops of the device, but it does not do anything to fix the
other significant issue of where to store the context data structures
needed by the SW fall-back functions (which, in this approach, are
called implicitly by librte_ether).
Second, the SW fall-back options should not be restricted arbitrarily
by the librte_ether library, the decision should belong to the app.
For example, the TM SW fall-back should not be limited to only
librte_sched, which (like any SW fall-back) is limited to a specific
hierarchy and feature set, it cannot do any possible hierarchy. If
alternatives exist, the one to use should be picked by the app, not by
the ethdev layer.
Q5: Why is the app required to continue to configure both the "hard" and
the "soft" devices even after the "soft" device has been created? Why
not hiding the "hard" device under the "soft" device and have the
"soft" device configure the "hard" device under the hood?
A5: This was the approach tried in the V2 of this patch set (overlay
"soft" device taking over the configuration of the underlay "hard"
device) and eventually dropped due to increased complexity of having
to keep the configuration of two distinct devices in sync with
librte_ether implementation that is not friendly towards such
approach. Basically, each ethdev API call for the overlay device
needs to configure the overlay device, invoke the same configuration
with possibly modified parameters for the underlay device, then resume
the configuration of overlay device, turning this into a device
emulation project.
V2 minuses: increased complexity (deal with two devices at same time);
need to implement every ethdev API, even those not needed for the scope
of SW fall-back; intrusive; sometimes have to silently take decisions
that should be left to the app.
V3 pluses: lower complexity (only one device); only need to implement
those APIs that are in scope of the SW fall-back; non-intrusive (deal
with "hard" device through ethdev API); app decisions taken by the app
in an explicit way.
Q6: Why expose the SW fall-back in a PMD and not in a SW library?
A6: The SW fall-back for an ethdev API has to implement that specific
ethdev API, (hence expose an ethdev object through a PMD), as opposed
to providing a different API. This approach allows the app to use the
same API (NFV vision). For example, we already have a library for TM
SW fall-back (librte_sched) that can be called directly by the apps
that need to call it outside of ethdev context (use-cases exist), but
an app that works with TM-aware NICs through the ethdev TM API would
have to be changed significantly in order to work with different
TM-agnostic NICs through the librte_sched API.
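To make A6 concrete, a hedged sketch of the point: the same rte_tm call works whether port_id refers to a TM-capable HW NIC or to a softnic "soft" port backed by librte_sched. The helper name below is an assumption made for illustration.

#include <stdio.h>
#include <rte_tm.h>

/* Hypothetical helper: commit a TM hierarchy on any port exposing the
 * ethdev TM API (HW NIC or softnic "soft" port alike).
 */
static int
app_tm_commit(uint8_t port_id)
{
	struct rte_tm_error tm_err = {0};
	int status;

	/* clear_on_fail = 1: drop the uncommitted hierarchy on failure */
	status = rte_tm_hierarchy_commit(port_id, 1, &tm_err);
	if (status != 0)
		printf("TM commit failed on port %u: %s\n",
			(unsigned int)port_id,
			tm_err.message ? tm_err.message : "unspecified");
	return status;
}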
Q7: Why have all the SW fall-backs in a single PMD? Why not develop
the SW fall-back for each different ethdev API in a separate PMD, then
create a chain of "soft" devices for each "hard" device? Potentially,
this results in smaller size PMDs that are easier to maintain.
A7: Arguments for single ethdev/PMD and against chain of ethdevs/PMDs:
1. All the existing PMDs for HW NICs implement a lot of features under
the same PMD, so there is no reason for single PMD approach to break
code modularity. See the V3 code, a lot of care has been taken for
code modularity.
2. We should avoid the proliferation of SW PMDs.
3. A single device should be handled by a single PMD.
4. People are used to feature-rich PMDs, not to single-feature
PMDs, so why change the mindset?
5. [Configuration nightmare] A chain of "soft" devices attached to
single "hard" device requires the app to be aware that the N "soft"
devices in the chain plus the "hard" device refer to the same HW
device, and which device should be invoked to configure which
feature. Also the length of the chain and functionality of each
link is different for each HW device. This breaks the requirement
of preserving the same API while working with different NICs (NFV).
This most likely results in a configuration nightmare, nobody is
going to seriously use this.
6. [Feature inter-dependency] Sometimes different features need to be
configured and executed together (e.g. share the same set of
resources, are inter-dependent, etc), so it is better and more
performant to do them in the same ethdev/PMD.
7. [Code duplication] There is a lot of duplication in the
configuration code for the chain of ethdevs approach. The ethdev
dev_configure, rx_queue_setup, tx_queue_setup API functions have to
be implemented per device, and they become meaningless/inconsistent
with the chain approach.
8. [Data structure duplication] The per device data structures have to
be duplicated and read repeatedly for each "soft" ethdev. The
ethdev device, dev_private, data, per RX/TX queue data structures
have to be replicated per "soft" device. They have to be re-read for
each stage, so the same cache misses are now multiplied with the
number of stages in the chain.
9. [rte_ring proliferation] Thread safety requirements for ethdev
RX/TXqueues require an rte_ring to be used for every RX/TX queue
of each "soft" ethdev. This rte_ring proliferation unnecessarily
increases the memory footprint and lowers performance, especially
when each "soft" ethdev ends up on a different CPU core (ping-pong
of cache lines).
10.[Meta-data proliferation] A chain of ethdevs is likely to result
in proliferation of meta-data that has to be passed between the
ethdevs (e.g. policing needs the output of flow classification),
which results in more cache line ping-pong between cores, hence
performance drops.
Cristian Dumitrescu (4):
Jasvinder Singh (4):
net/softnic: add softnic PMD
net/softnic: add traffic management support
net/softnic: add TM capabilities ops
net/softnic: add TM hierarchy related ops
Jasvinder Singh (1):
app/testpmd: add traffic management forwarding mode
MAINTAINERS | 5 +
app/test-pmd/Makefile | 8 +
app/test-pmd/cmdline.c | 88 +
app/test-pmd/testpmd.c | 15 +
app/test-pmd/testpmd.h | 46 +
app/test-pmd/tm.c | 865 +++++
config/common_base | 5 +
doc/api/doxy-api-index.md | 3 +-
doc/api/doxy-api.conf | 1 +
doc/guides/rel_notes/release_17_11.rst | 6 +
drivers/net/Makefile | 5 +
drivers/net/softnic/Makefile | 57 +
drivers/net/softnic/rte_eth_softnic.c | 852 +++++
drivers/net/softnic/rte_eth_softnic.h | 83 +
drivers/net/softnic/rte_eth_softnic_internals.h | 291 ++
drivers/net/softnic/rte_eth_softnic_tm.c | 3452 ++++++++++++++++++++
.../net/softnic/rte_pmd_eth_softnic_version.map | 7 +
mk/rte.app.mk | 5 +-
18 files changed, 5792 insertions(+), 2 deletions(-)
create mode 100644 app/test-pmd/tm.c
create mode 100644 drivers/net/softnic/Makefile
create mode 100644 drivers/net/softnic/rte_eth_softnic.c
create mode 100644 drivers/net/softnic/rte_eth_softnic.h
create mode 100644 drivers/net/softnic/rte_eth_softnic_internals.h
create mode 100644 drivers/net/softnic/rte_eth_softnic_tm.c
create mode 100644 drivers/net/softnic/rte_pmd_eth_softnic_version.map
Series Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
--
2.9.3
* [dpdk-dev] [PATCH v5 1/5] net/softnic: add softnic PMD
2017-09-29 14:04 ` [dpdk-dev] [PATCH v5 0/5] net/softnic: sw fall-back pmd for traffic mgmt and others Jasvinder Singh
@ 2017-09-29 14:04 ` Jasvinder Singh
2017-09-29 14:04 ` [dpdk-dev] [PATCH v5 2/5] net/softnic: add traffic management support Jasvinder Singh
` (3 subsequent siblings)
4 siblings, 0 replies; 79+ messages in thread
From: Jasvinder Singh @ 2017-09-29 14:04 UTC (permalink / raw)
To: dev; +Cc: cristian.dumitrescu, ferruh.yigit, thomas
Add SoftNIC PMD to provide SW fall-back for ethdev APIs.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
---
v5 changes:
- change function name rte_pmd_softnic_run_default() to run_default()
v4 changes:
- Implemented feedback from Ferruh [1]
- rename map file to rte_pmd_eth_softnic_version.map
- add release notes library version info
- doxygen: fix hooks in doc/api/doxy-api-index.md
- add doxygen comment for rte_pmd_softnic_run()
- free device name memory
- remove soft_dev param in pmd_ethdev_register()
- fix checkpatch warnings
v3 changes:
- rebase to dpdk17.08 release
v2 changes:
- fix build errors
- rebased to TM APIs v6 plus dpdk master
[1] Ferruh's feedback on v3: http://dpdk.org/ml/archives/dev/2017-September/074576.html
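For context, a minimal usage sketch of the PMD added by this patch
(device args and the PCI name are illustrative; hard_name must refer
to an ethdev that already exists, and the run function is the one
exported through rte_eth_softnic.h):

    #include <rte_ethdev.h>
    #include <rte_vdev.h>
    #include "rte_eth_softnic.h"

    static int
    softnic_create_and_poll(void)
    {
        uint8_t port_id;
        int ret;

        /* Create the soft device on top of an existing hard device */
        ret = rte_vdev_init("net_softnic0",
            "hard_name=0000:02:00.0,hard_tx_queue_id=0");
        if (ret)
            return ret;

        ret = rte_eth_dev_get_port_by_name("net_softnic0", &port_id);
        if (ret)
            return ret;

        /* ... rte_eth_dev_configure()/queue setup/start() as usual ... */

        /* One core keeps draining the soft TX queues into the hard
         * device by calling the run function in its packet loop. */
        for ( ; ; )
            rte_pmd_softnic_run(port_id);
    }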
MAINTAINERS | 5 +
config/common_base | 5 +
doc/api/doxy-api-index.md | 3 +-
doc/api/doxy-api.conf | 1 +
doc/guides/rel_notes/release_17_11.rst | 6 +
drivers/net/Makefile | 5 +
drivers/net/softnic/Makefile | 56 ++
drivers/net/softnic/rte_eth_softnic.c | 591 +++++++++++++++++++++
drivers/net/softnic/rte_eth_softnic.h | 67 +++
drivers/net/softnic/rte_eth_softnic_internals.h | 114 ++++
.../net/softnic/rte_pmd_eth_softnic_version.map | 7 +
mk/rte.app.mk | 5 +-
12 files changed, 863 insertions(+), 2 deletions(-)
create mode 100644 drivers/net/softnic/Makefile
create mode 100644 drivers/net/softnic/rte_eth_softnic.c
create mode 100644 drivers/net/softnic/rte_eth_softnic.h
create mode 100644 drivers/net/softnic/rte_eth_softnic_internals.h
create mode 100644 drivers/net/softnic/rte_pmd_eth_softnic_version.map
diff --git a/MAINTAINERS b/MAINTAINERS
index 8df2a7f..8719e5f 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -514,6 +514,11 @@ M: Gaetan Rivet <gaetan.rivet@6wind.com>
F: drivers/net/failsafe/
F: doc/guides/nics/fail_safe.rst
+Softnic PMD
+M: Jasvinder Singh <jasvinder.singh@intel.com>
+M: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
+F: drivers/net/softnic
+
Crypto Drivers
--------------
diff --git a/config/common_base b/config/common_base
index 12f6be9..76ef2b9 100644
--- a/config/common_base
+++ b/config/common_base
@@ -274,6 +274,11 @@ CONFIG_RTE_LIBRTE_SFC_EFX_PMD=y
CONFIG_RTE_LIBRTE_SFC_EFX_DEBUG=n
#
+# Compile SOFTNIC PMD
+#
+CONFIG_RTE_LIBRTE_PMD_SOFTNIC=y
+
+#
# Compile software PMD backed by SZEDATA2 device
#
CONFIG_RTE_LIBRTE_PMD_SZEDATA2=n
diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
index 19e0d4f..626ab51 100644
--- a/doc/api/doxy-api-index.md
+++ b/doc/api/doxy-api-index.md
@@ -55,7 +55,8 @@ The public API headers are grouped by topics:
[KNI] (@ref rte_kni.h),
[ixgbe] (@ref rte_pmd_ixgbe.h),
[i40e] (@ref rte_pmd_i40e.h),
- [crypto_scheduler] (@ref rte_cryptodev_scheduler.h)
+ [crypto_scheduler] (@ref rte_cryptodev_scheduler.h),
+ [softnic] (@ref rte_eth_softnic.h)
- **memory**:
[memseg] (@ref rte_memory.h),
diff --git a/doc/api/doxy-api.conf b/doc/api/doxy-api.conf
index 823554f..b27755d 100644
--- a/doc/api/doxy-api.conf
+++ b/doc/api/doxy-api.conf
@@ -32,6 +32,7 @@ PROJECT_NAME = DPDK
INPUT = doc/api/doxy-api-index.md \
drivers/crypto/scheduler \
drivers/net/bonding \
+ drivers/net/softnic \
drivers/net/i40e \
drivers/net/ixgbe \
lib/librte_eal/common/include \
diff --git a/doc/guides/rel_notes/release_17_11.rst b/doc/guides/rel_notes/release_17_11.rst
index 8bf91bd..32d3af3 100644
--- a/doc/guides/rel_notes/release_17_11.rst
+++ b/doc/guides/rel_notes/release_17_11.rst
@@ -41,6 +41,11 @@ New Features
Also, make sure to start the actual text at the margin.
=========================================================
+* **Added SoftNIC PMD.**
+
+ Added new SoftNIC PMD. This virtual device offers applications a software
+ fallback support for traffic management.
+
Resolved Issues
---------------
@@ -190,6 +195,7 @@ The libraries prepended with a plus sign were incremented in this version.
librte_pipeline.so.3
librte_pmd_bond.so.1
librte_pmd_ring.so.2
+ + librte_pmd_softnic.so.1
librte_port.so.3
librte_power.so.1
librte_reorder.so.1
diff --git a/drivers/net/Makefile b/drivers/net/Makefile
index d33c959..b552a51 100644
--- a/drivers/net/Makefile
+++ b/drivers/net/Makefile
@@ -110,4 +110,9 @@ DIRS-$(CONFIG_RTE_LIBRTE_PMD_VHOST) += vhost
endif # $(CONFIG_RTE_LIBRTE_VHOST)
DEPDIRS-vhost = $(core-libs) librte_vhost
+ifeq ($(CONFIG_RTE_LIBRTE_SCHED),y)
+DIRS-$(CONFIG_RTE_LIBRTE_PMD_SOFTNIC) += softnic
+endif # $(CONFIG_RTE_LIBRTE_SCHED)
+DEPDIRS-softnic = $(core-libs) librte_sched
+
include $(RTE_SDK)/mk/rte.subdir.mk
diff --git a/drivers/net/softnic/Makefile b/drivers/net/softnic/Makefile
new file mode 100644
index 0000000..c2f42ef
--- /dev/null
+++ b/drivers/net/softnic/Makefile
@@ -0,0 +1,56 @@
+# BSD LICENSE
+#
+# Copyright(c) 2017 Intel Corporation. All rights reserved.
+# All rights reserved.
+#
+# Redistribution and use in source and binary forms, with or without
+# modification, are permitted provided that the following conditions
+# are met:
+#
+# * Redistributions of source code must retain the above copyright
+# notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+# notice, this list of conditions and the following disclaimer in
+# the documentation and/or other materials provided with the
+# distribution.
+# * Neither the name of Intel Corporation nor the names of its
+# contributors may be used to endorse or promote products derived
+# from this software without specific prior written permission.
+#
+# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+#
+# library name
+#
+LIB = librte_pmd_softnic.a
+
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS)
+
+EXPORT_MAP := rte_pmd_eth_softnic_version.map
+
+LIBABIVER := 1
+
+#
+# all source are stored in SRCS-y
+#
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_SOFTNIC) += rte_eth_softnic.c
+
+#
+# Export include files
+#
+SYMLINK-y-include += rte_eth_softnic.h
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/drivers/net/softnic/rte_eth_softnic.c b/drivers/net/softnic/rte_eth_softnic.c
new file mode 100644
index 0000000..aa5ea8b
--- /dev/null
+++ b/drivers/net/softnic/rte_eth_softnic.c
@@ -0,0 +1,591 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright(c) 2017 Intel Corporation. All rights reserved.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdint.h>
+#include <stdlib.h>
+#include <string.h>
+
+#include <rte_ethdev.h>
+#include <rte_ethdev_vdev.h>
+#include <rte_malloc.h>
+#include <rte_vdev.h>
+#include <rte_kvargs.h>
+#include <rte_errno.h>
+#include <rte_ring.h>
+
+#include "rte_eth_softnic.h"
+#include "rte_eth_softnic_internals.h"
+
+#define DEV_HARD(p) \
+ (&rte_eth_devices[p->hard.port_id])
+
+#define PMD_PARAM_HARD_NAME "hard_name"
+#define PMD_PARAM_HARD_TX_QUEUE_ID "hard_tx_queue_id"
+
+static const char *pmd_valid_args[] = {
+ PMD_PARAM_HARD_NAME,
+ PMD_PARAM_HARD_TX_QUEUE_ID,
+ NULL
+};
+
+static const struct rte_eth_dev_info pmd_dev_info = {
+ .min_rx_bufsize = 0,
+ .max_rx_pktlen = UINT32_MAX,
+ .max_rx_queues = UINT16_MAX,
+ .max_tx_queues = UINT16_MAX,
+ .rx_desc_lim = {
+ .nb_max = UINT16_MAX,
+ .nb_min = 0,
+ .nb_align = 1,
+ },
+ .tx_desc_lim = {
+ .nb_max = UINT16_MAX,
+ .nb_min = 0,
+ .nb_align = 1,
+ },
+};
+
+static void
+pmd_dev_infos_get(struct rte_eth_dev *dev __rte_unused,
+ struct rte_eth_dev_info *dev_info)
+{
+ memcpy(dev_info, &pmd_dev_info, sizeof(*dev_info));
+}
+
+static int
+pmd_dev_configure(struct rte_eth_dev *dev)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct rte_eth_dev *hard_dev = DEV_HARD(p);
+
+ if (dev->data->nb_rx_queues > hard_dev->data->nb_rx_queues)
+ return -1;
+
+ if (p->params.hard.tx_queue_id >= hard_dev->data->nb_tx_queues)
+ return -1;
+
+ return 0;
+}
+
+static int
+pmd_rx_queue_setup(struct rte_eth_dev *dev,
+ uint16_t rx_queue_id,
+ uint16_t nb_rx_desc __rte_unused,
+ unsigned int socket_id,
+ const struct rte_eth_rxconf *rx_conf __rte_unused,
+ struct rte_mempool *mb_pool __rte_unused)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ if (p->params.soft.intrusive == 0) {
+ struct pmd_rx_queue *rxq;
+
+ rxq = rte_zmalloc_socket(p->params.soft.name,
+ sizeof(struct pmd_rx_queue), 0, socket_id);
+ if (rxq == NULL)
+ return -ENOMEM;
+
+ rxq->hard.port_id = p->hard.port_id;
+ rxq->hard.rx_queue_id = rx_queue_id;
+ dev->data->rx_queues[rx_queue_id] = rxq;
+ } else {
+ struct rte_eth_dev *hard_dev = DEV_HARD(p);
+ void *rxq = hard_dev->data->rx_queues[rx_queue_id];
+
+ if (rxq == NULL)
+ return -1;
+
+ dev->data->rx_queues[rx_queue_id] = rxq;
+ }
+ return 0;
+}
+
+static int
+pmd_tx_queue_setup(struct rte_eth_dev *dev,
+ uint16_t tx_queue_id,
+ uint16_t nb_tx_desc,
+ unsigned int socket_id,
+ const struct rte_eth_txconf *tx_conf __rte_unused)
+{
+ uint32_t size = RTE_ETH_NAME_MAX_LEN + strlen("_txq") + 4;
+ char name[size];
+ struct rte_ring *r;
+
+ snprintf(name, sizeof(name), "%s_txq%04x",
+ dev->data->name, tx_queue_id);
+ r = rte_ring_create(name, nb_tx_desc, socket_id,
+ RING_F_SP_ENQ | RING_F_SC_DEQ);
+ if (r == NULL)
+ return -1;
+
+ dev->data->tx_queues[tx_queue_id] = r;
+ return 0;
+}
+
+static int
+pmd_dev_start(struct rte_eth_dev *dev)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ dev->data->dev_link.link_status = ETH_LINK_UP;
+
+ if (p->params.soft.intrusive) {
+ struct rte_eth_dev *hard_dev = DEV_HARD(p);
+
+ /* The hard_dev->rx_pkt_burst should be stable by now */
+ dev->rx_pkt_burst = hard_dev->rx_pkt_burst;
+ }
+
+ return 0;
+}
+
+static void
+pmd_dev_stop(struct rte_eth_dev *dev)
+{
+ dev->data->dev_link.link_status = ETH_LINK_DOWN;
+}
+
+static void
+pmd_dev_close(struct rte_eth_dev *dev)
+{
+ uint32_t i;
+
+ /* TX queues */
+ for (i = 0; i < dev->data->nb_tx_queues; i++)
+ rte_ring_free((struct rte_ring *)dev->data->tx_queues[i]);
+}
+
+static int
+pmd_link_update(struct rte_eth_dev *dev __rte_unused,
+ int wait_to_complete __rte_unused)
+{
+ return 0;
+}
+
+static const struct eth_dev_ops pmd_ops = {
+ .dev_configure = pmd_dev_configure,
+ .dev_start = pmd_dev_start,
+ .dev_stop = pmd_dev_stop,
+ .dev_close = pmd_dev_close,
+ .link_update = pmd_link_update,
+ .dev_infos_get = pmd_dev_infos_get,
+ .rx_queue_setup = pmd_rx_queue_setup,
+ .tx_queue_setup = pmd_tx_queue_setup,
+ .tm_ops_get = NULL,
+};
+
+static uint16_t
+pmd_rx_pkt_burst(void *rxq,
+ struct rte_mbuf **rx_pkts,
+ uint16_t nb_pkts)
+{
+ struct pmd_rx_queue *rx_queue = rxq;
+
+ return rte_eth_rx_burst(rx_queue->hard.port_id,
+ rx_queue->hard.rx_queue_id,
+ rx_pkts,
+ nb_pkts);
+}
+
+static uint16_t
+pmd_tx_pkt_burst(void *txq,
+ struct rte_mbuf **tx_pkts,
+ uint16_t nb_pkts)
+{
+ return (uint16_t)rte_ring_enqueue_burst(txq,
+ (void **)tx_pkts,
+ nb_pkts,
+ NULL);
+}
+
+static __rte_always_inline int
+run_default(struct rte_eth_dev *dev)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ /* Persistent context: Read Only (update not required) */
+ struct rte_mbuf **pkts = p->soft.def.pkts;
+ uint16_t nb_tx_queues = dev->data->nb_tx_queues;
+
+ /* Persistent context: Read - Write (update required) */
+ uint32_t txq_pos = p->soft.def.txq_pos;
+ uint32_t pkts_len = p->soft.def.pkts_len;
+ uint32_t flush_count = p->soft.def.flush_count;
+
+ /* Not part of the persistent context */
+ uint32_t pos;
+ uint16_t i;
+
+ /* Soft device TXQ read, Hard device TXQ write */
+ for (i = 0; i < nb_tx_queues; i++) {
+ struct rte_ring *txq = dev->data->tx_queues[txq_pos];
+
+ /* Read soft device TXQ burst to packet enqueue buffer */
+ pkts_len += rte_ring_sc_dequeue_burst(txq,
+ (void **)&pkts[pkts_len],
+ DEFAULT_BURST_SIZE,
+ NULL);
+
+ /* Increment soft device TXQ */
+ txq_pos++;
+ if (txq_pos >= nb_tx_queues)
+ txq_pos = 0;
+
+ /* Hard device TXQ write when complete burst is available */
+ if (pkts_len >= DEFAULT_BURST_SIZE) {
+ for (pos = 0; pos < pkts_len; )
+ pos += rte_eth_tx_burst(p->hard.port_id,
+ p->params.hard.tx_queue_id,
+ &pkts[pos],
+ (uint16_t)(pkts_len - pos));
+
+ pkts_len = 0;
+ flush_count = 0;
+ break;
+ }
+ }
+
+ if (flush_count >= FLUSH_COUNT_THRESHOLD) {
+ for (pos = 0; pos < pkts_len; )
+ pos += rte_eth_tx_burst(p->hard.port_id,
+ p->params.hard.tx_queue_id,
+ &pkts[pos],
+ (uint16_t)(pkts_len - pos));
+
+ pkts_len = 0;
+ flush_count = 0;
+ }
+
+ p->soft.def.txq_pos = txq_pos;
+ p->soft.def.pkts_len = pkts_len;
+ p->soft.def.flush_count = flush_count + 1;
+
+ return 0;
+}
+
+int
+rte_pmd_softnic_run(uint8_t port_id)
+{
+ struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+
+#ifdef RTE_LIBRTE_ETHDEV_DEBUG
+ RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, 0);
+#endif
+
+ return run_default(dev);
+}
+
+static struct ether_addr eth_addr = { .addr_bytes = {0} };
+
+static uint32_t
+eth_dev_speed_max_mbps(uint32_t speed_capa)
+{
+ uint32_t rate_mbps[32] = {
+ ETH_SPEED_NUM_NONE,
+ ETH_SPEED_NUM_10M,
+ ETH_SPEED_NUM_10M,
+ ETH_SPEED_NUM_100M,
+ ETH_SPEED_NUM_100M,
+ ETH_SPEED_NUM_1G,
+ ETH_SPEED_NUM_2_5G,
+ ETH_SPEED_NUM_5G,
+ ETH_SPEED_NUM_10G,
+ ETH_SPEED_NUM_20G,
+ ETH_SPEED_NUM_25G,
+ ETH_SPEED_NUM_40G,
+ ETH_SPEED_NUM_50G,
+ ETH_SPEED_NUM_56G,
+ ETH_SPEED_NUM_100G,
+ };
+
+ uint32_t pos = (speed_capa) ? (31 - __builtin_clz(speed_capa)) : 0;
+ return rate_mbps[pos];
+}
+
+static int
+default_init(struct pmd_internals *p,
+ struct pmd_params *params,
+ int numa_node)
+{
+ p->soft.def.pkts = rte_zmalloc_socket(params->soft.name,
+ 2 * DEFAULT_BURST_SIZE * sizeof(struct rte_mbuf *),
+ 0,
+ numa_node);
+
+ if (p->soft.def.pkts == NULL)
+ return -ENOMEM;
+
+ return 0;
+}
+
+static void
+default_free(struct pmd_internals *p)
+{
+ rte_free(p->soft.def.pkts);
+}
+
+static void *
+pmd_init(struct pmd_params *params, int numa_node)
+{
+ struct pmd_internals *p;
+ int status;
+
+ p = rte_zmalloc_socket(params->soft.name,
+ sizeof(struct pmd_internals),
+ 0,
+ numa_node);
+ if (p == NULL)
+ return NULL;
+
+ memcpy(&p->params, params, sizeof(p->params));
+ rte_eth_dev_get_port_by_name(params->hard.name, &p->hard.port_id);
+
+ /* Default */
+ status = default_init(p, params, numa_node);
+ if (status) {
+ free(p->params.hard.name);
+ rte_free(p);
+ return NULL;
+ }
+
+ return p;
+}
+
+static void
+pmd_free(struct pmd_internals *p)
+{
+ default_free(p);
+
+ free(p->params.hard.name);
+ rte_free(p);
+}
+
+static int
+pmd_ethdev_register(struct rte_vdev_device *vdev,
+ struct pmd_params *params,
+ void *dev_private)
+{
+ struct rte_eth_dev_info hard_info;
+ struct rte_eth_dev *soft_dev;
+ uint32_t hard_speed;
+ int numa_node;
+ uint8_t hard_port_id;
+
+ rte_eth_dev_get_port_by_name(params->hard.name, &hard_port_id);
+ rte_eth_dev_info_get(hard_port_id, &hard_info);
+ hard_speed = eth_dev_speed_max_mbps(hard_info.speed_capa);
+ numa_node = rte_eth_dev_socket_id(hard_port_id);
+
+ /* Ethdev entry allocation */
+ soft_dev = rte_eth_dev_allocate(params->soft.name);
+ if (!soft_dev)
+ return -ENOMEM;
+
+ /* dev */
+ soft_dev->rx_pkt_burst = (params->soft.intrusive) ?
+ NULL : /* set up later */
+ pmd_rx_pkt_burst;
+ soft_dev->tx_pkt_burst = pmd_tx_pkt_burst;
+ soft_dev->tx_pkt_prepare = NULL;
+ soft_dev->dev_ops = &pmd_ops;
+ soft_dev->device = &vdev->device;
+
+ /* dev->data */
+ soft_dev->data->dev_private = dev_private;
+ soft_dev->data->dev_link.link_speed = hard_speed;
+ soft_dev->data->dev_link.link_duplex = ETH_LINK_FULL_DUPLEX;
+ soft_dev->data->dev_link.link_autoneg = ETH_LINK_SPEED_FIXED;
+ soft_dev->data->dev_link.link_status = ETH_LINK_DOWN;
+ soft_dev->data->mac_addrs = &eth_addr;
+ soft_dev->data->promiscuous = 1;
+ soft_dev->data->kdrv = RTE_KDRV_NONE;
+ soft_dev->data->numa_node = numa_node;
+ soft_dev->data->dev_flags = RTE_ETH_DEV_DETACHABLE;
+
+ return 0;
+}
+
+static int
+get_string(const char *key __rte_unused, const char *value, void *extra_args)
+{
+ if (!value || !extra_args)
+ return -EINVAL;
+
+ *(char **)extra_args = strdup(value);
+
+ if (!*(char **)extra_args)
+ return -ENOMEM;
+
+ return 0;
+}
+
+static int
+get_uint32(const char *key __rte_unused, const char *value, void *extra_args)
+{
+ if (!value || !extra_args)
+ return -EINVAL;
+
+ *(uint32_t *)extra_args = strtoull(value, NULL, 0);
+
+ return 0;
+}
+
+static int
+pmd_parse_args(struct pmd_params *p, const char *name, const char *params)
+{
+ struct rte_kvargs *kvlist;
+ int ret;
+
+ kvlist = rte_kvargs_parse(params, pmd_valid_args);
+ if (kvlist == NULL)
+ return -EINVAL;
+
+ /* Set default values */
+ memset(p, 0, sizeof(*p));
+ p->soft.name = name;
+ p->soft.intrusive = INTRUSIVE;
+ p->hard.tx_queue_id = SOFTNIC_HARD_TX_QUEUE_ID;
+
+ /* HARD: name (mandatory) */
+ if (rte_kvargs_count(kvlist, PMD_PARAM_HARD_NAME) == 1) {
+ ret = rte_kvargs_process(kvlist, PMD_PARAM_HARD_NAME,
+ &get_string, &p->hard.name);
+ if (ret < 0)
+ goto out_free;
+ } else {
+ ret = -EINVAL;
+ goto out_free;
+ }
+
+ /* HARD: tx_queue_id (optional) */
+ if (rte_kvargs_count(kvlist, PMD_PARAM_HARD_TX_QUEUE_ID) == 1) {
+ ret = rte_kvargs_process(kvlist, PMD_PARAM_HARD_TX_QUEUE_ID,
+ &get_uint32, &p->hard.tx_queue_id);
+ if (ret < 0)
+ goto out_free;
+ }
+
+out_free:
+ rte_kvargs_free(kvlist);
+ return ret;
+}
+
+static int
+pmd_probe(struct rte_vdev_device *vdev)
+{
+ struct pmd_params p;
+ const char *params;
+ int status;
+
+ struct rte_eth_dev_info hard_info;
+ uint8_t hard_port_id;
+ int numa_node;
+ void *dev_private;
+
+ RTE_LOG(INFO, PMD,
+ "Probing device \"%s\"\n",
+ rte_vdev_device_name(vdev));
+
+ /* Parse input arguments */
+ params = rte_vdev_device_args(vdev);
+ if (!params)
+ return -EINVAL;
+
+ status = pmd_parse_args(&p, rte_vdev_device_name(vdev), params);
+ if (status)
+ return status;
+
+ /* Check input arguments */
+ if (rte_eth_dev_get_port_by_name(p.hard.name, &hard_port_id))
+ return -EINVAL;
+
+ rte_eth_dev_info_get(hard_port_id, &hard_info);
+ numa_node = rte_eth_dev_socket_id(hard_port_id);
+
+ if (p.hard.tx_queue_id >= hard_info.max_tx_queues)
+ return -EINVAL;
+
+ /* Allocate and initialize soft ethdev private data */
+ dev_private = pmd_init(&p, numa_node);
+ if (dev_private == NULL)
+ return -ENOMEM;
+
+ /* Register soft ethdev */
+ RTE_LOG(INFO, PMD,
+ "Creating soft ethdev \"%s\" for hard ethdev \"%s\"\n",
+ p.soft.name, p.hard.name);
+
+ status = pmd_ethdev_register(vdev, &p, dev_private);
+ if (status) {
+ pmd_free(dev_private);
+ return status;
+ }
+
+ return 0;
+}
+
+static int
+pmd_remove(struct rte_vdev_device *vdev)
+{
+ struct rte_eth_dev *dev = NULL;
+ struct pmd_internals *p;
+
+ if (!vdev)
+ return -EINVAL;
+
+ RTE_LOG(INFO, PMD, "Removing device \"%s\"\n",
+ rte_vdev_device_name(vdev));
+
+ /* Find the ethdev entry */
+ dev = rte_eth_dev_allocated(rte_vdev_device_name(vdev));
+ if (dev == NULL)
+ return -ENODEV;
+ p = dev->data->dev_private;
+
+ /* Free device data structures*/
+ pmd_free(p);
+ rte_free(dev->data);
+ rte_eth_dev_release_port(dev);
+
+ return 0;
+}
+
+static struct rte_vdev_driver pmd_softnic_drv = {
+ .probe = pmd_probe,
+ .remove = pmd_remove,
+};
+
+RTE_PMD_REGISTER_VDEV(net_softnic, pmd_softnic_drv);
+RTE_PMD_REGISTER_PARAM_STRING(net_softnic,
+ PMD_PARAM_HARD_NAME "=<string> "
+ PMD_PARAM_HARD_TX_QUEUE_ID "=<int>");
diff --git a/drivers/net/softnic/rte_eth_softnic.h b/drivers/net/softnic/rte_eth_softnic.h
new file mode 100644
index 0000000..e6996f3
--- /dev/null
+++ b/drivers/net/softnic/rte_eth_softnic.h
@@ -0,0 +1,67 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright(c) 2017 Intel Corporation. All rights reserved.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef __INCLUDE_RTE_ETH_SOFTNIC_H__
+#define __INCLUDE_RTE_ETH_SOFTNIC_H__
+
+#include <stdint.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#ifndef SOFTNIC_HARD_TX_QUEUE_ID
+#define SOFTNIC_HARD_TX_QUEUE_ID 0
+#endif
+
+/**
+ * Run the traffic management function on the softnic device
+ *
+ * This function reads packets from the softnic input queues, inserts them
+ * into the QoS scheduler queues based on the mbuf sched field value, and
+ * transmits the scheduled packets out through the hard device interface.
+ *
+ * @param port_id
+ * Port id of the soft device.
+ * @return
+ * Zero.
+ */
+
+int
+rte_pmd_softnic_run(uint8_t port_id);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* __INCLUDE_RTE_ETH_SOFTNIC_H__ */
diff --git a/drivers/net/softnic/rte_eth_softnic_internals.h b/drivers/net/softnic/rte_eth_softnic_internals.h
new file mode 100644
index 0000000..96995b5
--- /dev/null
+++ b/drivers/net/softnic/rte_eth_softnic_internals.h
@@ -0,0 +1,114 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright(c) 2017 Intel Corporation. All rights reserved.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef __INCLUDE_RTE_ETH_SOFTNIC_INTERNALS_H__
+#define __INCLUDE_RTE_ETH_SOFTNIC_INTERNALS_H__
+
+#include <stdint.h>
+
+#include <rte_mbuf.h>
+#include <rte_ethdev.h>
+
+#include "rte_eth_softnic.h"
+
+#ifndef INTRUSIVE
+#define INTRUSIVE 0
+#endif
+
+struct pmd_params {
+ /** Parameters for the soft device (to be created) */
+ struct {
+ const char *name; /**< Name */
+ uint32_t flags; /**< Flags */
+
+ /** 0 = Access hard device through API only (potentially slower,
+ * but safer);
+ * 1 = Accessing hard device private data structures is allowed
+ * (potentially faster).
+ */
+ int intrusive;
+ } soft;
+
+ /** Parameters for the hard device (existing) */
+ struct {
+ char *name; /**< Name */
+ uint16_t tx_queue_id; /**< TX queue ID */
+ } hard;
+};
+
+/**
+ * Default Internals
+ */
+
+#ifndef DEFAULT_BURST_SIZE
+#define DEFAULT_BURST_SIZE 32
+#endif
+
+#ifndef FLUSH_COUNT_THRESHOLD
+#define FLUSH_COUNT_THRESHOLD (1 << 17)
+#endif
+
+struct default_internals {
+ struct rte_mbuf **pkts;
+ uint32_t pkts_len;
+ uint32_t txq_pos;
+ uint32_t flush_count;
+};
+
+/**
+ * PMD Internals
+ */
+struct pmd_internals {
+ /** Params */
+ struct pmd_params params;
+
+ /** Soft device */
+ struct {
+ struct default_internals def; /**< Default */
+ } soft;
+
+ /** Hard device */
+ struct {
+ uint8_t port_id;
+ } hard;
+};
+
+struct pmd_rx_queue {
+ /** Hard device */
+ struct {
+ uint8_t port_id;
+ uint16_t rx_queue_id;
+ } hard;
+};
+
+#endif /* __INCLUDE_RTE_ETH_SOFTNIC_INTERNALS_H__ */
diff --git a/drivers/net/softnic/rte_pmd_eth_softnic_version.map b/drivers/net/softnic/rte_pmd_eth_softnic_version.map
new file mode 100644
index 0000000..fb2cb68
--- /dev/null
+++ b/drivers/net/softnic/rte_pmd_eth_softnic_version.map
@@ -0,0 +1,7 @@
+DPDK_17.11 {
+ global:
+
+ rte_pmd_softnic_run;
+
+ local: *;
+};
diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index c25fdd9..3dc82fb 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -67,7 +67,6 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR) += -lrte_distributor
_LDLIBS-$(CONFIG_RTE_LIBRTE_IP_FRAG) += -lrte_ip_frag
_LDLIBS-$(CONFIG_RTE_LIBRTE_GRO) += -lrte_gro
_LDLIBS-$(CONFIG_RTE_LIBRTE_METER) += -lrte_meter
-_LDLIBS-$(CONFIG_RTE_LIBRTE_SCHED) += -lrte_sched
_LDLIBS-$(CONFIG_RTE_LIBRTE_LPM) += -lrte_lpm
# librte_acl needs --whole-archive because of weak functions
_LDLIBS-$(CONFIG_RTE_LIBRTE_ACL) += --whole-archive
@@ -99,6 +98,7 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_RING) += -lrte_ring
_LDLIBS-$(CONFIG_RTE_LIBRTE_EAL) += -lrte_eal
_LDLIBS-$(CONFIG_RTE_LIBRTE_CMDLINE) += -lrte_cmdline
_LDLIBS-$(CONFIG_RTE_LIBRTE_REORDER) += -lrte_reorder
+_LDLIBS-$(CONFIG_RTE_LIBRTE_SCHED) += -lrte_sched
ifeq ($(CONFIG_RTE_EXEC_ENV_LINUXAPP),y)
_LDLIBS-$(CONFIG_RTE_LIBRTE_KNI) += -lrte_kni
@@ -135,6 +135,9 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_NULL) += -lrte_pmd_null
_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_PCAP) += -lrte_pmd_pcap -lpcap
_LDLIBS-$(CONFIG_RTE_LIBRTE_QEDE_PMD) += -lrte_pmd_qede
_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_RING) += -lrte_pmd_ring
+ifeq ($(CONFIG_RTE_LIBRTE_SCHED),y)
+_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_SOFTNIC) += -lrte_pmd_softnic
+endif
_LDLIBS-$(CONFIG_RTE_LIBRTE_SFC_EFX_PMD) += -lrte_pmd_sfc_efx
_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_SZEDATA2) += -lrte_pmd_szedata2 -lsze2
_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_TAP) += -lrte_pmd_tap
--
2.9.3
* [dpdk-dev] [PATCH v5 2/5] net/softnic: add traffic management support
2017-09-29 14:04 ` [dpdk-dev] [PATCH v5 0/5] net/softnic: sw fall-back pmd for traffic mgmt and others Jasvinder Singh
2017-09-29 14:04 ` [dpdk-dev] [PATCH v5 1/5] net/softnic: add softnic PMD Jasvinder Singh
@ 2017-09-29 14:04 ` Jasvinder Singh
2017-10-06 16:59 ` [dpdk-dev] [PATCH v6 0/5] net/softnic: sw fall-back pmd for traffic mgmt and others Jasvinder Singh
2017-09-29 14:04 ` [dpdk-dev] [PATCH v5 3/5] net/softnic: add TM capabilities ops Jasvinder Singh
` (2 subsequent siblings)
4 siblings, 1 reply; 79+ messages in thread
From: Jasvinder Singh @ 2017-09-29 14:04 UTC (permalink / raw)
To: dev; +Cc: cristian.dumitrescu, ferruh.yigit, thomas
Add ethdev Traffic Management API support to SoftNIC PMD.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Acked-by: Lu, Wenzhuo <wenzhuo.lu@intel.com>
---
v5 changes:
- change function name rte_pmd_softnic_run_tm() to run_tm()
v3 changes:
- add more configuration parameters (tm rate, tm queue sizes)
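For context, a hedged sketch of how the TM-enabled device could be
instantiated with the new parameters registered at the end of this
patch (all values are illustrative; soft_tm_rate is in bytes/second
and must not exceed the hard device rate, and soft_tm_deq_bsz must be
smaller than soft_tm_enq_bsz, see tm_params_check()):

    #include <rte_vdev.h>

    static int
    softnic_tm_create(void)
    {
        /* 1250000000 bytes/s corresponds to a 10 Gbps line rate */
        return rte_vdev_init("net_softnic0",
            "hard_name=0000:02:00.0,hard_tx_queue_id=0,"
            "soft_tm=on,soft_tm_rate=1250000000,"
            "soft_tm_nb_queues=65536,"
            "soft_tm_qsize0=64,soft_tm_qsize1=64,"
            "soft_tm_qsize2=64,soft_tm_qsize3=64,"
            "soft_tm_enq_bsz=32,soft_tm_deq_bsz=24");
    }

With TM enabled and a hierarchy committed, rte_pmd_softnic_run()
dispatches to the TM path (run_tm()) instead of the default path.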
drivers/net/softnic/Makefile | 1 +
drivers/net/softnic/rte_eth_softnic.c | 255 +++++++++++++++++++++++-
drivers/net/softnic/rte_eth_softnic.h | 16 ++
drivers/net/softnic/rte_eth_softnic_internals.h | 104 ++++++++++
drivers/net/softnic/rte_eth_softnic_tm.c | 181 +++++++++++++++++
5 files changed, 555 insertions(+), 2 deletions(-)
create mode 100644 drivers/net/softnic/rte_eth_softnic_tm.c
diff --git a/drivers/net/softnic/Makefile b/drivers/net/softnic/Makefile
index c2f42ef..8b848a9 100644
--- a/drivers/net/softnic/Makefile
+++ b/drivers/net/softnic/Makefile
@@ -47,6 +47,7 @@ LIBABIVER := 1
# all source are stored in SRCS-y
#
SRCS-$(CONFIG_RTE_LIBRTE_PMD_SOFTNIC) += rte_eth_softnic.c
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_SOFTNIC) += rte_eth_softnic_tm.c
#
# Export include files
diff --git a/drivers/net/softnic/rte_eth_softnic.c b/drivers/net/softnic/rte_eth_softnic.c
index aa5ea8b..ab26948 100644
--- a/drivers/net/softnic/rte_eth_softnic.c
+++ b/drivers/net/softnic/rte_eth_softnic.c
@@ -42,6 +42,7 @@
#include <rte_kvargs.h>
#include <rte_errno.h>
#include <rte_ring.h>
+#include <rte_sched.h>
#include "rte_eth_softnic.h"
#include "rte_eth_softnic_internals.h"
@@ -49,10 +50,29 @@
#define DEV_HARD(p) \
(&rte_eth_devices[p->hard.port_id])
+#define PMD_PARAM_SOFT_TM "soft_tm"
+#define PMD_PARAM_SOFT_TM_RATE "soft_tm_rate"
+#define PMD_PARAM_SOFT_TM_NB_QUEUES "soft_tm_nb_queues"
+#define PMD_PARAM_SOFT_TM_QSIZE0 "soft_tm_qsize0"
+#define PMD_PARAM_SOFT_TM_QSIZE1 "soft_tm_qsize1"
+#define PMD_PARAM_SOFT_TM_QSIZE2 "soft_tm_qsize2"
+#define PMD_PARAM_SOFT_TM_QSIZE3 "soft_tm_qsize3"
+#define PMD_PARAM_SOFT_TM_ENQ_BSZ "soft_tm_enq_bsz"
+#define PMD_PARAM_SOFT_TM_DEQ_BSZ "soft_tm_deq_bsz"
+
#define PMD_PARAM_HARD_NAME "hard_name"
#define PMD_PARAM_HARD_TX_QUEUE_ID "hard_tx_queue_id"
static const char *pmd_valid_args[] = {
+ PMD_PARAM_SOFT_TM,
+ PMD_PARAM_SOFT_TM_RATE,
+ PMD_PARAM_SOFT_TM_NB_QUEUES,
+ PMD_PARAM_SOFT_TM_QSIZE0,
+ PMD_PARAM_SOFT_TM_QSIZE1,
+ PMD_PARAM_SOFT_TM_QSIZE2,
+ PMD_PARAM_SOFT_TM_QSIZE3,
+ PMD_PARAM_SOFT_TM_ENQ_BSZ,
+ PMD_PARAM_SOFT_TM_DEQ_BSZ,
PMD_PARAM_HARD_NAME,
PMD_PARAM_HARD_TX_QUEUE_ID,
NULL
@@ -157,6 +177,13 @@ pmd_dev_start(struct rte_eth_dev *dev)
{
struct pmd_internals *p = dev->data->dev_private;
+ if (tm_used(dev)) {
+ int status = tm_start(p);
+
+ if (status)
+ return status;
+ }
+
dev->data->dev_link.link_status = ETH_LINK_UP;
if (p->params.soft.intrusive) {
@@ -172,7 +199,12 @@ pmd_dev_start(struct rte_eth_dev *dev)
static void
pmd_dev_stop(struct rte_eth_dev *dev)
{
+ struct pmd_internals *p = dev->data->dev_private;
+
dev->data->dev_link.link_status = ETH_LINK_DOWN;
+
+ if (tm_used(dev))
+ tm_stop(p);
}
static void
@@ -293,6 +325,77 @@ run_default(struct rte_eth_dev *dev)
return 0;
}
+static __rte_always_inline int
+run_tm(struct rte_eth_dev *dev)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ /* Persistent context: Read Only (update not required) */
+ struct rte_sched_port *sched = p->soft.tm.sched;
+ struct rte_mbuf **pkts_enq = p->soft.tm.pkts_enq;
+ struct rte_mbuf **pkts_deq = p->soft.tm.pkts_deq;
+ uint32_t enq_bsz = p->params.soft.tm.enq_bsz;
+ uint32_t deq_bsz = p->params.soft.tm.deq_bsz;
+ uint16_t nb_tx_queues = dev->data->nb_tx_queues;
+
+ /* Persistent context: Read - Write (update required) */
+ uint32_t txq_pos = p->soft.tm.txq_pos;
+ uint32_t pkts_enq_len = p->soft.tm.pkts_enq_len;
+ uint32_t flush_count = p->soft.tm.flush_count;
+
+ /* Not part of the persistent context */
+ uint32_t pkts_deq_len, pos;
+ uint16_t i;
+
+ /* Soft device TXQ read, TM enqueue */
+ for (i = 0; i < nb_tx_queues; i++) {
+ struct rte_ring *txq = dev->data->tx_queues[txq_pos];
+
+ /* Read TXQ burst to packet enqueue buffer */
+ pkts_enq_len += rte_ring_sc_dequeue_burst(txq,
+ (void **)&pkts_enq[pkts_enq_len],
+ enq_bsz,
+ NULL);
+
+ /* Increment TXQ */
+ txq_pos++;
+ if (txq_pos >= nb_tx_queues)
+ txq_pos = 0;
+
+ /* TM enqueue when complete burst is available */
+ if (pkts_enq_len >= enq_bsz) {
+ rte_sched_port_enqueue(sched, pkts_enq, pkts_enq_len);
+
+ pkts_enq_len = 0;
+ flush_count = 0;
+ break;
+ }
+ }
+
+ if (flush_count >= FLUSH_COUNT_THRESHOLD) {
+ if (pkts_enq_len)
+ rte_sched_port_enqueue(sched, pkts_enq, pkts_enq_len);
+
+ pkts_enq_len = 0;
+ flush_count = 0;
+ }
+
+ p->soft.tm.txq_pos = txq_pos;
+ p->soft.tm.pkts_enq_len = pkts_enq_len;
+ p->soft.tm.flush_count = flush_count + 1;
+
+ /* TM dequeue, Hard device TXQ write */
+ pkts_deq_len = rte_sched_port_dequeue(sched, pkts_deq, deq_bsz);
+
+ for (pos = 0; pos < pkts_deq_len; )
+ pos += rte_eth_tx_burst(p->hard.port_id,
+ p->params.hard.tx_queue_id,
+ &pkts_deq[pos],
+ (uint16_t)(pkts_deq_len - pos));
+
+ return 0;
+}
+
int
rte_pmd_softnic_run(uint8_t port_id)
{
@@ -302,7 +405,7 @@ rte_pmd_softnic_run(uint8_t port_id)
RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, 0);
#endif
- return run_default(dev);
+ return (tm_used(dev)) ? run_tm(dev) : run_default(dev);
}
static struct ether_addr eth_addr = { .addr_bytes = {0} };
@@ -378,12 +481,26 @@ pmd_init(struct pmd_params *params, int numa_node)
return NULL;
}
+ /* Traffic Management (TM)*/
+ if (params->soft.flags & PMD_FEATURE_TM) {
+ status = tm_init(p, params, numa_node);
+ if (status) {
+ default_free(p);
+ free(p->params.hard.name);
+ rte_free(p);
+ return NULL;
+ }
+ }
+
return p;
}
static void
pmd_free(struct pmd_internals *p)
{
+ if (p->params.soft.flags & PMD_FEATURE_TM)
+ tm_free(p);
+
default_free(p);
free(p->params.hard.name);
@@ -464,7 +581,7 @@ static int
pmd_parse_args(struct pmd_params *p, const char *name, const char *params)
{
struct rte_kvargs *kvlist;
- int ret;
+ int i, ret;
kvlist = rte_kvargs_parse(params, pmd_valid_args);
if (kvlist == NULL)
@@ -474,8 +591,124 @@ pmd_parse_args(struct pmd_params *p, const char *name, const char *params)
memset(p, 0, sizeof(*p));
p->soft.name = name;
p->soft.intrusive = INTRUSIVE;
+ p->soft.tm.rate = 0;
+ p->soft.tm.nb_queues = SOFTNIC_SOFT_TM_NB_QUEUES;
+ for (i = 0; i < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; i++)
+ p->soft.tm.qsize[i] = SOFTNIC_SOFT_TM_QUEUE_SIZE;
+ p->soft.tm.enq_bsz = SOFTNIC_SOFT_TM_ENQ_BSZ;
+ p->soft.tm.deq_bsz = SOFTNIC_SOFT_TM_DEQ_BSZ;
p->hard.tx_queue_id = SOFTNIC_HARD_TX_QUEUE_ID;
+ /* SOFT: TM (optional) */
+ if (rte_kvargs_count(kvlist, PMD_PARAM_SOFT_TM) == 1) {
+ char *s;
+
+ ret = rte_kvargs_process(kvlist, PMD_PARAM_SOFT_TM,
+ &get_string, &s);
+ if (ret < 0)
+ goto out_free;
+
+ if (strcmp(s, "on") == 0)
+ p->soft.flags |= PMD_FEATURE_TM;
+ else if (strcmp(s, "off") == 0)
+ p->soft.flags &= ~PMD_FEATURE_TM;
+ else
+ ret = -EINVAL;
+
+ free(s);
+ if (ret)
+ goto out_free;
+ }
+
+ /* SOFT: TM rate (measured in bytes/second) (optional) */
+ if (rte_kvargs_count(kvlist, PMD_PARAM_SOFT_TM_RATE) == 1) {
+ ret = rte_kvargs_process(kvlist, PMD_PARAM_SOFT_TM_RATE,
+ &get_uint32, &p->soft.tm.rate);
+ if (ret < 0)
+ goto out_free;
+
+ p->soft.flags |= PMD_FEATURE_TM;
+ }
+
+ /* SOFT: TM number of queues (optional) */
+ if (rte_kvargs_count(kvlist, PMD_PARAM_SOFT_TM_NB_QUEUES) == 1) {
+ ret = rte_kvargs_process(kvlist, PMD_PARAM_SOFT_TM_NB_QUEUES,
+ &get_uint32, &p->soft.tm.nb_queues);
+ if (ret < 0)
+ goto out_free;
+
+ p->soft.flags |= PMD_FEATURE_TM;
+ }
+
+ /* SOFT: TM queue size 0 .. 3 (optional) */
+ if (rte_kvargs_count(kvlist, PMD_PARAM_SOFT_TM_QSIZE0) == 1) {
+ uint32_t qsize;
+
+ ret = rte_kvargs_process(kvlist, PMD_PARAM_SOFT_TM_QSIZE0,
+ &get_uint32, &qsize);
+ if (ret < 0)
+ goto out_free;
+
+ p->soft.tm.qsize[0] = (uint16_t)qsize;
+ p->soft.flags |= PMD_FEATURE_TM;
+ }
+
+ if (rte_kvargs_count(kvlist, PMD_PARAM_SOFT_TM_QSIZE1) == 1) {
+ uint32_t qsize;
+
+ ret = rte_kvargs_process(kvlist, PMD_PARAM_SOFT_TM_QSIZE1,
+ &get_uint32, &qsize);
+ if (ret < 0)
+ goto out_free;
+
+ p->soft.tm.qsize[1] = (uint16_t)qsize;
+ p->soft.flags |= PMD_FEATURE_TM;
+ }
+
+ if (rte_kvargs_count(kvlist, PMD_PARAM_SOFT_TM_QSIZE2) == 1) {
+ uint32_t qsize;
+
+ ret = rte_kvargs_process(kvlist, PMD_PARAM_SOFT_TM_QSIZE2,
+ &get_uint32, &qsize);
+ if (ret < 0)
+ goto out_free;
+
+ p->soft.tm.qsize[2] = (uint16_t)qsize;
+ p->soft.flags |= PMD_FEATURE_TM;
+ }
+
+ if (rte_kvargs_count(kvlist, PMD_PARAM_SOFT_TM_QSIZE3) == 1) {
+ uint32_t qsize;
+
+ ret = rte_kvargs_process(kvlist, PMD_PARAM_SOFT_TM_QSIZE3,
+ &get_uint32, &qsize);
+ if (ret < 0)
+ goto out_free;
+
+ p->soft.tm.qsize[3] = (uint16_t)qsize;
+ p->soft.flags |= PMD_FEATURE_TM;
+ }
+
+ /* SOFT: TM enqueue burst size (optional) */
+ if (rte_kvargs_count(kvlist, PMD_PARAM_SOFT_TM_ENQ_BSZ) == 1) {
+ ret = rte_kvargs_process(kvlist, PMD_PARAM_SOFT_TM_ENQ_BSZ,
+ &get_uint32, &p->soft.tm.enq_bsz);
+ if (ret < 0)
+ goto out_free;
+
+ p->soft.flags |= PMD_FEATURE_TM;
+ }
+
+ /* SOFT: TM dequeue burst size (optional) */
+ if (rte_kvargs_count(kvlist, PMD_PARAM_SOFT_TM_DEQ_BSZ) == 1) {
+ ret = rte_kvargs_process(kvlist, PMD_PARAM_SOFT_TM_DEQ_BSZ,
+ &get_uint32, &p->soft.tm.deq_bsz);
+ if (ret < 0)
+ goto out_free;
+
+ p->soft.flags |= PMD_FEATURE_TM;
+ }
+
/* HARD: name (mandatory) */
if (rte_kvargs_count(kvlist, PMD_PARAM_HARD_NAME) == 1) {
ret = rte_kvargs_process(kvlist, PMD_PARAM_HARD_NAME,
@@ -508,6 +741,7 @@ pmd_probe(struct rte_vdev_device *vdev)
int status;
struct rte_eth_dev_info hard_info;
+ uint32_t hard_speed;
uint8_t hard_port_id;
int numa_node;
void *dev_private;
@@ -530,11 +764,19 @@ pmd_probe(struct rte_vdev_device *vdev)
return -EINVAL;
rte_eth_dev_info_get(hard_port_id, &hard_info);
+ hard_speed = eth_dev_speed_max_mbps(hard_info.speed_capa);
numa_node = rte_eth_dev_socket_id(hard_port_id);
if (p.hard.tx_queue_id >= hard_info.max_tx_queues)
return -EINVAL;
+ if (p.soft.flags & PMD_FEATURE_TM) {
+ status = tm_params_check(&p, hard_speed);
+
+ if (status)
+ return status;
+ }
+
/* Allocate and initialize soft ethdev private data */
dev_private = pmd_init(&p, numa_node);
if (dev_private == NULL)
@@ -587,5 +829,14 @@ static struct rte_vdev_driver pmd_softnic_drv = {
RTE_PMD_REGISTER_VDEV(net_softnic, pmd_softnic_drv);
RTE_PMD_REGISTER_PARAM_STRING(net_softnic,
+ PMD_PARAM_SOFT_TM "=on|off "
+ PMD_PARAM_SOFT_TM_RATE "=<int> "
+ PMD_PARAM_SOFT_TM_NB_QUEUES "=<int> "
+ PMD_PARAM_SOFT_TM_QSIZE0 "=<int> "
+ PMD_PARAM_SOFT_TM_QSIZE1 "=<int> "
+ PMD_PARAM_SOFT_TM_QSIZE2 "=<int> "
+ PMD_PARAM_SOFT_TM_QSIZE3 "=<int> "
+ PMD_PARAM_SOFT_TM_ENQ_BSZ "=<int> "
+ PMD_PARAM_SOFT_TM_DEQ_BSZ "=<int> "
PMD_PARAM_HARD_NAME "=<string> "
PMD_PARAM_HARD_TX_QUEUE_ID "=<int>");
diff --git a/drivers/net/softnic/rte_eth_softnic.h b/drivers/net/softnic/rte_eth_softnic.h
index e6996f3..517b96a 100644
--- a/drivers/net/softnic/rte_eth_softnic.h
+++ b/drivers/net/softnic/rte_eth_softnic.h
@@ -40,6 +40,22 @@
extern "C" {
#endif
+#ifndef SOFTNIC_SOFT_TM_NB_QUEUES
+#define SOFTNIC_SOFT_TM_NB_QUEUES 65536
+#endif
+
+#ifndef SOFTNIC_SOFT_TM_QUEUE_SIZE
+#define SOFTNIC_SOFT_TM_QUEUE_SIZE 64
+#endif
+
+#ifndef SOFTNIC_SOFT_TM_ENQ_BSZ
+#define SOFTNIC_SOFT_TM_ENQ_BSZ 32
+#endif
+
+#ifndef SOFTNIC_SOFT_TM_DEQ_BSZ
+#define SOFTNIC_SOFT_TM_DEQ_BSZ 24
+#endif
+
#ifndef SOFTNIC_HARD_TX_QUEUE_ID
#define SOFTNIC_HARD_TX_QUEUE_ID 0
#endif
diff --git a/drivers/net/softnic/rte_eth_softnic_internals.h b/drivers/net/softnic/rte_eth_softnic_internals.h
index 96995b5..11f88d8 100644
--- a/drivers/net/softnic/rte_eth_softnic_internals.h
+++ b/drivers/net/softnic/rte_eth_softnic_internals.h
@@ -37,10 +37,19 @@
#include <stdint.h>
#include <rte_mbuf.h>
+#include <rte_sched.h>
#include <rte_ethdev.h>
#include "rte_eth_softnic.h"
+/**
+ * PMD Parameters
+ */
+
+enum pmd_feature {
+ PMD_FEATURE_TM = 1, /**< Traffic Management (TM) */
+};
+
#ifndef INTRUSIVE
#define INTRUSIVE 0
#endif
@@ -57,6 +66,16 @@ struct pmd_params {
* (potentially faster).
*/
int intrusive;
+
+ /** Traffic Management (TM) */
+ struct {
+ uint32_t rate; /**< Rate (bytes/second) */
+ uint32_t nb_queues; /**< Number of queues */
+ uint16_t qsize[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
+ /**< Queue size per traffic class */
+ uint32_t enq_bsz; /**< Enqueue burst size */
+ uint32_t deq_bsz; /**< Dequeue burst size */
+ } tm;
} soft;
/** Parameters for the hard device (existing) */
@@ -86,6 +105,66 @@ struct default_internals {
};
/**
+ * Traffic Management (TM) Internals
+ */
+
+#ifndef TM_MAX_SUBPORTS
+#define TM_MAX_SUBPORTS 8
+#endif
+
+#ifndef TM_MAX_PIPES_PER_SUBPORT
+#define TM_MAX_PIPES_PER_SUBPORT 4096
+#endif
+
+struct tm_params {
+ struct rte_sched_port_params port_params;
+
+ struct rte_sched_subport_params subport_params[TM_MAX_SUBPORTS];
+
+ struct rte_sched_pipe_params
+ pipe_profiles[RTE_SCHED_PIPE_PROFILES_PER_PORT];
+ uint32_t n_pipe_profiles;
+ uint32_t pipe_to_profile[TM_MAX_SUBPORTS * TM_MAX_PIPES_PER_SUBPORT];
+};
+
+/* TM Levels */
+enum tm_node_level {
+ TM_NODE_LEVEL_PORT = 0,
+ TM_NODE_LEVEL_SUBPORT,
+ TM_NODE_LEVEL_PIPE,
+ TM_NODE_LEVEL_TC,
+ TM_NODE_LEVEL_QUEUE,
+ TM_NODE_LEVEL_MAX,
+};
+
+/* TM Hierarchy Specification */
+struct tm_hierarchy {
+ uint32_t n_tm_nodes[TM_NODE_LEVEL_MAX];
+};
+
+struct tm_internals {
+ /** Hierarchy specification
+ *
+ * -Hierarchy is unfrozen at init and when port is stopped.
+ * -Hierarchy is frozen on successful hierarchy commit.
+ * -Run-time hierarchy changes are not allowed, therefore it makes
+ * sense to keep the hierarchy frozen after the port is started.
+ */
+ struct tm_hierarchy h;
+
+ /** Blueprints */
+ struct tm_params params;
+
+ /** Run-time */
+ struct rte_sched_port *sched;
+ struct rte_mbuf **pkts_enq;
+ struct rte_mbuf **pkts_deq;
+ uint32_t pkts_enq_len;
+ uint32_t txq_pos;
+ uint32_t flush_count;
+};
+
+/**
* PMD Internals
*/
struct pmd_internals {
@@ -95,6 +174,7 @@ struct pmd_internals {
/** Soft device */
struct {
struct default_internals def; /**< Default */
+ struct tm_internals tm; /**< Traffic Management */
} soft;
/** Hard device */
@@ -111,4 +191,28 @@ struct pmd_rx_queue {
} hard;
};
+int
+tm_params_check(struct pmd_params *params, uint32_t hard_rate);
+
+int
+tm_init(struct pmd_internals *p, struct pmd_params *params, int numa_node);
+
+void
+tm_free(struct pmd_internals *p);
+
+int
+tm_start(struct pmd_internals *p);
+
+void
+tm_stop(struct pmd_internals *p);
+
+static inline int
+tm_used(struct rte_eth_dev *dev)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ return (p->params.soft.flags & PMD_FEATURE_TM) &&
+ p->soft.tm.h.n_tm_nodes[TM_NODE_LEVEL_PORT];
+}
+
#endif /* __INCLUDE_RTE_ETH_SOFTNIC_INTERNALS_H__ */
diff --git a/drivers/net/softnic/rte_eth_softnic_tm.c b/drivers/net/softnic/rte_eth_softnic_tm.c
new file mode 100644
index 0000000..bb28798
--- /dev/null
+++ b/drivers/net/softnic/rte_eth_softnic_tm.c
@@ -0,0 +1,181 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright(c) 2017 Intel Corporation. All rights reserved.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdint.h>
+#include <stdlib.h>
+#include <string.h>
+
+#include <rte_malloc.h>
+
+#include "rte_eth_softnic_internals.h"
+#include "rte_eth_softnic.h"
+
+#define BYTES_IN_MBPS (1000 * 1000 / 8)
+
+int
+tm_params_check(struct pmd_params *params, uint32_t hard_rate)
+{
+ uint64_t hard_rate_bytes_per_sec = hard_rate * BYTES_IN_MBPS;
+ uint32_t i;
+
+ /* rate */
+ if (params->soft.tm.rate) {
+ if (params->soft.tm.rate > hard_rate_bytes_per_sec)
+ return -EINVAL;
+ } else {
+ params->soft.tm.rate =
+ (hard_rate_bytes_per_sec > UINT32_MAX) ?
+ UINT32_MAX : hard_rate_bytes_per_sec;
+ }
+
+ /* nb_queues */
+ if (params->soft.tm.nb_queues == 0)
+ return -EINVAL;
+
+ if (params->soft.tm.nb_queues < RTE_SCHED_QUEUES_PER_PIPE)
+ params->soft.tm.nb_queues = RTE_SCHED_QUEUES_PER_PIPE;
+
+ params->soft.tm.nb_queues =
+ rte_align32pow2(params->soft.tm.nb_queues);
+
+ /* qsize */
+ for (i = 0; i < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; i++) {
+ if (params->soft.tm.qsize[i] == 0)
+ return -EINVAL;
+
+ params->soft.tm.qsize[i] =
+ rte_align32pow2(params->soft.tm.qsize[i]);
+ }
+
+ /* enq_bsz, deq_bsz */
+ if ((params->soft.tm.enq_bsz == 0) ||
+ (params->soft.tm.deq_bsz == 0) ||
+ (params->soft.tm.deq_bsz >= params->soft.tm.enq_bsz))
+ return -EINVAL;
+
+ return 0;
+}
+
+int
+tm_init(struct pmd_internals *p,
+ struct pmd_params *params,
+ int numa_node)
+{
+ uint32_t enq_bsz = params->soft.tm.enq_bsz;
+ uint32_t deq_bsz = params->soft.tm.deq_bsz;
+
+ p->soft.tm.pkts_enq = rte_zmalloc_socket(params->soft.name,
+ 2 * enq_bsz * sizeof(struct rte_mbuf *),
+ 0,
+ numa_node);
+
+ if (p->soft.tm.pkts_enq == NULL)
+ return -ENOMEM;
+
+ p->soft.tm.pkts_deq = rte_zmalloc_socket(params->soft.name,
+ deq_bsz * sizeof(struct rte_mbuf *),
+ 0,
+ numa_node);
+
+ if (p->soft.tm.pkts_deq == NULL) {
+ rte_free(p->soft.tm.pkts_enq);
+ return -ENOMEM;
+ }
+
+ return 0;
+}
+
+void
+tm_free(struct pmd_internals *p)
+{
+ rte_free(p->soft.tm.pkts_enq);
+ rte_free(p->soft.tm.pkts_deq);
+}
+
+int
+tm_start(struct pmd_internals *p)
+{
+ struct tm_params *t = &p->soft.tm.params;
+ uint32_t n_subports, subport_id;
+ int status;
+
+ /* Port */
+ p->soft.tm.sched = rte_sched_port_config(&t->port_params);
+ if (p->soft.tm.sched == NULL)
+ return -1;
+
+ /* Subport */
+ n_subports = t->port_params.n_subports_per_port;
+ for (subport_id = 0; subport_id < n_subports; subport_id++) {
+ uint32_t n_pipes_per_subport =
+ t->port_params.n_pipes_per_subport;
+ uint32_t pipe_id;
+
+ status = rte_sched_subport_config(p->soft.tm.sched,
+ subport_id,
+ &t->subport_params[subport_id]);
+ if (status) {
+ rte_sched_port_free(p->soft.tm.sched);
+ return -1;
+ }
+
+ /* Pipe */
+ n_pipes_per_subport = t->port_params.n_pipes_per_subport;
+ for (pipe_id = 0; pipe_id < n_pipes_per_subport; pipe_id++) {
+ int pos = subport_id * TM_MAX_PIPES_PER_SUBPORT +
+ pipe_id;
+ int profile_id = t->pipe_to_profile[pos];
+
+ if (profile_id < 0)
+ continue;
+
+ status = rte_sched_pipe_config(p->soft.tm.sched,
+ subport_id,
+ pipe_id,
+ profile_id);
+ if (status) {
+ rte_sched_port_free(p->soft.tm.sched);
+ return -1;
+ }
+ }
+ }
+
+ return 0;
+}
+
+void
+tm_stop(struct pmd_internals *p)
+{
+ if (p->soft.tm.sched)
+ rte_sched_port_free(p->soft.tm.sched);
+}
--
2.9.3
* [dpdk-dev] [PATCH v5 3/5] net/softnic: add TM capabilities ops
2017-09-29 14:04 ` [dpdk-dev] [PATCH v5 0/5] net/softnic: sw fall-back pmd for traffic mgmt and others Jasvinder Singh
2017-09-29 14:04 ` [dpdk-dev] [PATCH v5 1/5] net/softnic: add softnic PMD Jasvinder Singh
2017-09-29 14:04 ` [dpdk-dev] [PATCH v5 2/5] net/softnic: add traffic management support Jasvinder Singh
@ 2017-09-29 14:04 ` Jasvinder Singh
2017-09-29 14:04 ` [dpdk-dev] [PATCH v5 4/5] net/softnic: add TM hierarchy related ops Jasvinder Singh
2017-09-29 14:04 ` [dpdk-dev] [PATCH v5 5/5] app/testpmd: add traffic management forwarding mode Jasvinder Singh
4 siblings, 0 replies; 79+ messages in thread
From: Jasvinder Singh @ 2017-09-29 14:04 UTC (permalink / raw)
To: dev; +Cc: cristian.dumitrescu, ferruh.yigit, thomas
Implement ethdev TM capability APIs in SoftNIC PMD.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Acked-by: Lu, Wenzhuo <wenzhuo.lu@intel.com>
---
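For context, a minimal sketch of how an application could read back
the capabilities implemented below through the generic ethdev TM API
(the print format is illustrative; the values come from
pmd_tm_capabilities_get() in this patch):

    #include <stdio.h>
    #include <rte_tm.h>

    static int
    tm_caps_dump(uint8_t port_id)
    {
        struct rte_tm_capabilities cap;
        struct rte_tm_error error;
        int ret;

        ret = rte_tm_capabilities_get(port_id, &cap, &error);
        if (ret)
            return ret;

        printf("levels=%u nodes=%u private shapers=%u\n",
            cap.n_levels_max, cap.n_nodes_max,
            cap.shaper_private_n_max);

        return 0;
    }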
drivers/net/softnic/rte_eth_softnic.c | 12 +-
drivers/net/softnic/rte_eth_softnic_internals.h | 32 ++
drivers/net/softnic/rte_eth_softnic_tm.c | 500 ++++++++++++++++++++++++
3 files changed, 543 insertions(+), 1 deletion(-)
diff --git a/drivers/net/softnic/rte_eth_softnic.c b/drivers/net/softnic/rte_eth_softnic.c
index ab26948..4d70ebf 100644
--- a/drivers/net/softnic/rte_eth_softnic.c
+++ b/drivers/net/softnic/rte_eth_softnic.c
@@ -43,6 +43,7 @@
#include <rte_errno.h>
#include <rte_ring.h>
#include <rte_sched.h>
+#include <rte_tm_driver.h>
#include "rte_eth_softnic.h"
#include "rte_eth_softnic_internals.h"
@@ -224,6 +225,15 @@ pmd_link_update(struct rte_eth_dev *dev __rte_unused,
return 0;
}
+static int
+pmd_tm_ops_get(struct rte_eth_dev *dev, void *arg)
+{
+ *(const struct rte_tm_ops **)arg =
+ (tm_enabled(dev)) ? &pmd_tm_ops : NULL;
+
+ return 0;
+}
+
static const struct eth_dev_ops pmd_ops = {
.dev_configure = pmd_dev_configure,
.dev_start = pmd_dev_start,
@@ -233,7 +243,7 @@ static const struct eth_dev_ops pmd_ops = {
.dev_infos_get = pmd_dev_infos_get,
.rx_queue_setup = pmd_rx_queue_setup,
.tx_queue_setup = pmd_tx_queue_setup,
- .tm_ops_get = NULL,
+ .tm_ops_get = pmd_tm_ops_get,
};
static uint16_t
diff --git a/drivers/net/softnic/rte_eth_softnic_internals.h b/drivers/net/softnic/rte_eth_softnic_internals.h
index 11f88d8..9b313d0 100644
--- a/drivers/net/softnic/rte_eth_softnic_internals.h
+++ b/drivers/net/softnic/rte_eth_softnic_internals.h
@@ -39,6 +39,7 @@
#include <rte_mbuf.h>
#include <rte_sched.h>
#include <rte_ethdev.h>
+#include <rte_tm_driver.h>
#include "rte_eth_softnic.h"
@@ -137,8 +138,26 @@ enum tm_node_level {
TM_NODE_LEVEL_MAX,
};
+/* TM Node */
+struct tm_node {
+ TAILQ_ENTRY(tm_node) node;
+ uint32_t node_id;
+ uint32_t parent_node_id;
+ uint32_t priority;
+ uint32_t weight;
+ uint32_t level;
+ struct tm_node *parent_node;
+ struct rte_tm_node_params params;
+ struct rte_tm_node_stats stats;
+ uint32_t n_children;
+};
+
+TAILQ_HEAD(tm_node_list, tm_node);
+
/* TM Hierarchy Specification */
struct tm_hierarchy {
+ struct tm_node_list nodes;
+
uint32_t n_tm_nodes[TM_NODE_LEVEL_MAX];
};
@@ -191,6 +210,11 @@ struct pmd_rx_queue {
} hard;
};
+/**
+ * Traffic Management (TM) Operation
+ */
+extern const struct rte_tm_ops pmd_tm_ops;
+
int
tm_params_check(struct pmd_params *params, uint32_t hard_rate);
@@ -207,6 +231,14 @@ void
tm_stop(struct pmd_internals *p);
static inline int
+tm_enabled(struct rte_eth_dev *dev)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ return (p->params.soft.flags & PMD_FEATURE_TM);
+}
+
+static inline int
tm_used(struct rte_eth_dev *dev)
{
struct pmd_internals *p = dev->data->dev_private;
diff --git a/drivers/net/softnic/rte_eth_softnic_tm.c b/drivers/net/softnic/rte_eth_softnic_tm.c
index bb28798..a552006 100644
--- a/drivers/net/softnic/rte_eth_softnic_tm.c
+++ b/drivers/net/softnic/rte_eth_softnic_tm.c
@@ -179,3 +179,503 @@ tm_stop(struct pmd_internals *p)
if (p->soft.tm.sched)
rte_sched_port_free(p->soft.tm.sched);
}
+
+static struct tm_node *
+tm_node_search(struct rte_eth_dev *dev, uint32_t node_id)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_node_list *nl = &p->soft.tm.h.nodes;
+ struct tm_node *n;
+
+ TAILQ_FOREACH(n, nl, node)
+ if (n->node_id == node_id)
+ return n;
+
+ return NULL;
+}
+
+static uint32_t
+tm_level_get_max_nodes(struct rte_eth_dev *dev, enum tm_node_level level)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ uint32_t n_queues_max = p->params.soft.tm.nb_queues;
+ uint32_t n_tc_max = n_queues_max / RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS;
+ uint32_t n_pipes_max = n_tc_max / RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE;
+ uint32_t n_subports_max = n_pipes_max;
+ uint32_t n_root_max = 1;
+
+ switch (level) {
+ case TM_NODE_LEVEL_PORT:
+ return n_root_max;
+ case TM_NODE_LEVEL_SUBPORT:
+ return n_subports_max;
+ case TM_NODE_LEVEL_PIPE:
+ return n_pipes_max;
+ case TM_NODE_LEVEL_TC:
+ return n_tc_max;
+ case TM_NODE_LEVEL_QUEUE:
+ default:
+ return n_queues_max;
+ }
+}
+
+#ifdef RTE_SCHED_RED
+#define WRED_SUPPORTED 1
+#else
+#define WRED_SUPPORTED 0
+#endif
+
+#define STATS_MASK_DEFAULT \
+ (RTE_TM_STATS_N_PKTS | \
+ RTE_TM_STATS_N_BYTES | \
+ RTE_TM_STATS_N_PKTS_GREEN_DROPPED | \
+ RTE_TM_STATS_N_BYTES_GREEN_DROPPED)
+
+#define STATS_MASK_QUEUE \
+ (STATS_MASK_DEFAULT | \
+ RTE_TM_STATS_N_PKTS_QUEUED)
+
+static const struct rte_tm_capabilities tm_cap = {
+ .n_nodes_max = UINT32_MAX,
+ .n_levels_max = TM_NODE_LEVEL_MAX,
+
+ .non_leaf_nodes_identical = 0,
+ .leaf_nodes_identical = 1,
+
+ .shaper_n_max = UINT32_MAX,
+ .shaper_private_n_max = UINT32_MAX,
+ .shaper_private_dual_rate_n_max = 0,
+ .shaper_private_rate_min = 1,
+ .shaper_private_rate_max = UINT32_MAX,
+
+ .shaper_shared_n_max = UINT32_MAX,
+ .shaper_shared_n_nodes_per_shaper_max = UINT32_MAX,
+ .shaper_shared_n_shapers_per_node_max = 1,
+ .shaper_shared_dual_rate_n_max = 0,
+ .shaper_shared_rate_min = 1,
+ .shaper_shared_rate_max = UINT32_MAX,
+
+ .shaper_pkt_length_adjust_min = RTE_TM_ETH_FRAMING_OVERHEAD_FCS,
+ .shaper_pkt_length_adjust_max = RTE_TM_ETH_FRAMING_OVERHEAD_FCS,
+
+ .sched_n_children_max = UINT32_MAX,
+ .sched_sp_n_priorities_max = RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE,
+ .sched_wfq_n_children_per_group_max = UINT32_MAX,
+ .sched_wfq_n_groups_max = 1,
+ .sched_wfq_weight_max = UINT32_MAX,
+
+ .cman_head_drop_supported = 0,
+ .cman_wred_context_n_max = 0,
+ .cman_wred_context_private_n_max = 0,
+ .cman_wred_context_shared_n_max = 0,
+ .cman_wred_context_shared_n_nodes_per_context_max = 0,
+ .cman_wred_context_shared_n_contexts_per_node_max = 0,
+
+ .mark_vlan_dei_supported = {0, 0, 0},
+ .mark_ip_ecn_tcp_supported = {0, 0, 0},
+ .mark_ip_ecn_sctp_supported = {0, 0, 0},
+ .mark_ip_dscp_supported = {0, 0, 0},
+
+ .dynamic_update_mask = 0,
+
+ .stats_mask = STATS_MASK_QUEUE,
+};
+
+/* Traffic manager capabilities get */
+static int
+pmd_tm_capabilities_get(struct rte_eth_dev *dev __rte_unused,
+ struct rte_tm_capabilities *cap,
+ struct rte_tm_error *error)
+{
+ if (cap == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_CAPABILITIES,
+ NULL,
+ rte_strerror(EINVAL));
+
+ memcpy(cap, &tm_cap, sizeof(*cap));
+
+ cap->n_nodes_max = tm_level_get_max_nodes(dev, TM_NODE_LEVEL_PORT) +
+ tm_level_get_max_nodes(dev, TM_NODE_LEVEL_SUBPORT) +
+ tm_level_get_max_nodes(dev, TM_NODE_LEVEL_PIPE) +
+ tm_level_get_max_nodes(dev, TM_NODE_LEVEL_TC) +
+ tm_level_get_max_nodes(dev, TM_NODE_LEVEL_QUEUE);
+
+ cap->shaper_private_n_max =
+ tm_level_get_max_nodes(dev, TM_NODE_LEVEL_PORT) +
+ tm_level_get_max_nodes(dev, TM_NODE_LEVEL_SUBPORT) +
+ tm_level_get_max_nodes(dev, TM_NODE_LEVEL_PIPE) +
+ tm_level_get_max_nodes(dev, TM_NODE_LEVEL_TC);
+
+ cap->shaper_shared_n_max = RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE *
+ tm_level_get_max_nodes(dev, TM_NODE_LEVEL_SUBPORT);
+
+ cap->shaper_n_max = cap->shaper_private_n_max +
+ cap->shaper_shared_n_max;
+
+ cap->shaper_shared_n_nodes_per_shaper_max =
+ tm_level_get_max_nodes(dev, TM_NODE_LEVEL_PIPE);
+
+ cap->sched_n_children_max = RTE_MAX(
+ tm_level_get_max_nodes(dev, TM_NODE_LEVEL_PIPE),
+ (uint32_t)RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE);
+
+ cap->sched_wfq_n_children_per_group_max = cap->sched_n_children_max;
+
+ if (WRED_SUPPORTED)
+ cap->cman_wred_context_private_n_max =
+ tm_level_get_max_nodes(dev, TM_NODE_LEVEL_QUEUE);
+
+ cap->cman_wred_context_n_max = cap->cman_wred_context_private_n_max +
+ cap->cman_wred_context_shared_n_max;
+
+ return 0;
+}
+
+static const struct rte_tm_level_capabilities tm_level_cap[] = {
+ [TM_NODE_LEVEL_PORT] = {
+ .n_nodes_max = 1,
+ .n_nodes_nonleaf_max = 1,
+ .n_nodes_leaf_max = 0,
+ .non_leaf_nodes_identical = 1,
+ .leaf_nodes_identical = 0,
+
+ .nonleaf = {
+ .shaper_private_supported = 1,
+ .shaper_private_dual_rate_supported = 0,
+ .shaper_private_rate_min = 1,
+ .shaper_private_rate_max = UINT32_MAX,
+ .shaper_shared_n_max = 0,
+
+ .sched_n_children_max = UINT32_MAX,
+ .sched_sp_n_priorities_max = 1,
+ .sched_wfq_n_children_per_group_max = UINT32_MAX,
+ .sched_wfq_n_groups_max = 1,
+ .sched_wfq_weight_max = 1,
+
+ .stats_mask = STATS_MASK_DEFAULT,
+ },
+ },
+
+ [TM_NODE_LEVEL_SUBPORT] = {
+ .n_nodes_max = UINT32_MAX,
+ .n_nodes_nonleaf_max = UINT32_MAX,
+ .n_nodes_leaf_max = 0,
+ .non_leaf_nodes_identical = 1,
+ .leaf_nodes_identical = 0,
+
+ .nonleaf = {
+ .shaper_private_supported = 1,
+ .shaper_private_dual_rate_supported = 0,
+ .shaper_private_rate_min = 1,
+ .shaper_private_rate_max = UINT32_MAX,
+ .shaper_shared_n_max = 0,
+
+ .sched_n_children_max = UINT32_MAX,
+ .sched_sp_n_priorities_max = 1,
+ .sched_wfq_n_children_per_group_max = UINT32_MAX,
+ .sched_wfq_n_groups_max = 1,
+#ifdef RTE_SCHED_SUBPORT_TC_OV
+ .sched_wfq_weight_max = UINT32_MAX,
+#else
+ .sched_wfq_weight_max = 1,
+#endif
+ .stats_mask = STATS_MASK_DEFAULT,
+ },
+ },
+
+ [TM_NODE_LEVEL_PIPE] = {
+ .n_nodes_max = UINT32_MAX,
+ .n_nodes_nonleaf_max = UINT32_MAX,
+ .n_nodes_leaf_max = 0,
+ .non_leaf_nodes_identical = 1,
+ .leaf_nodes_identical = 0,
+
+ .nonleaf = {
+ .shaper_private_supported = 1,
+ .shaper_private_dual_rate_supported = 0,
+ .shaper_private_rate_min = 1,
+ .shaper_private_rate_max = UINT32_MAX,
+ .shaper_shared_n_max = 0,
+
+ .sched_n_children_max =
+ RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE,
+ .sched_sp_n_priorities_max =
+ RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE,
+ .sched_wfq_n_children_per_group_max = 1,
+ .sched_wfq_n_groups_max = 0,
+ .sched_wfq_weight_max = 1,
+
+ .stats_mask = STATS_MASK_DEFAULT,
+ },
+ },
+
+ [TM_NODE_LEVEL_TC] = {
+ .n_nodes_max = UINT32_MAX,
+ .n_nodes_nonleaf_max = UINT32_MAX,
+ .n_nodes_leaf_max = 0,
+ .non_leaf_nodes_identical = 1,
+ .leaf_nodes_identical = 0,
+
+ .nonleaf = {
+ .shaper_private_supported = 1,
+ .shaper_private_dual_rate_supported = 0,
+ .shaper_private_rate_min = 1,
+ .shaper_private_rate_max = UINT32_MAX,
+ .shaper_shared_n_max = 1,
+
+ .sched_n_children_max =
+ RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS,
+ .sched_sp_n_priorities_max = 1,
+ .sched_wfq_n_children_per_group_max =
+ RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS,
+ .sched_wfq_n_groups_max = 1,
+ .sched_wfq_weight_max = UINT32_MAX,
+
+ .stats_mask = STATS_MASK_DEFAULT,
+ },
+ },
+
+ [TM_NODE_LEVEL_QUEUE] = {
+ .n_nodes_max = UINT32_MAX,
+ .n_nodes_nonleaf_max = 0,
+ .n_nodes_leaf_max = UINT32_MAX,
+ .non_leaf_nodes_identical = 0,
+ .leaf_nodes_identical = 1,
+
+ .leaf = {
+ .shaper_private_supported = 0,
+ .shaper_private_dual_rate_supported = 0,
+ .shaper_private_rate_min = 0,
+ .shaper_private_rate_max = 0,
+ .shaper_shared_n_max = 0,
+
+ .cman_head_drop_supported = 0,
+ .cman_wred_context_private_supported = WRED_SUPPORTED,
+ .cman_wred_context_shared_n_max = 0,
+
+ .stats_mask = STATS_MASK_QUEUE,
+ },
+ },
+};
+
+/* Traffic manager level capabilities get */
+static int
+pmd_tm_level_capabilities_get(struct rte_eth_dev *dev __rte_unused,
+ uint32_t level_id,
+ struct rte_tm_level_capabilities *cap,
+ struct rte_tm_error *error)
+{
+ if (cap == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_CAPABILITIES,
+ NULL,
+ rte_strerror(EINVAL));
+
+ if (level_id >= TM_NODE_LEVEL_MAX)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_LEVEL_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ memcpy(cap, &tm_level_cap[level_id], sizeof(*cap));
+
+ switch (level_id) {
+ case TM_NODE_LEVEL_PORT:
+ cap->nonleaf.sched_n_children_max =
+ tm_level_get_max_nodes(dev,
+ TM_NODE_LEVEL_SUBPORT);
+ cap->nonleaf.sched_wfq_n_children_per_group_max =
+ cap->nonleaf.sched_n_children_max;
+ break;
+
+ case TM_NODE_LEVEL_SUBPORT:
+ cap->n_nodes_max = tm_level_get_max_nodes(dev,
+ TM_NODE_LEVEL_SUBPORT);
+ cap->n_nodes_nonleaf_max = cap->n_nodes_max;
+ cap->nonleaf.sched_n_children_max =
+ tm_level_get_max_nodes(dev,
+ TM_NODE_LEVEL_PIPE);
+ cap->nonleaf.sched_wfq_n_children_per_group_max =
+ cap->nonleaf.sched_n_children_max;
+ break;
+
+ case TM_NODE_LEVEL_PIPE:
+ cap->n_nodes_max = tm_level_get_max_nodes(dev,
+ TM_NODE_LEVEL_PIPE);
+ cap->n_nodes_nonleaf_max = cap->n_nodes_max;
+ break;
+
+ case TM_NODE_LEVEL_TC:
+ cap->n_nodes_max = tm_level_get_max_nodes(dev,
+ TM_NODE_LEVEL_TC);
+ cap->n_nodes_nonleaf_max = cap->n_nodes_max;
+ break;
+
+ case TM_NODE_LEVEL_QUEUE:
+ default:
+ cap->n_nodes_max = tm_level_get_max_nodes(dev,
+ TM_NODE_LEVEL_QUEUE);
+ cap->n_nodes_leaf_max = cap->n_nodes_max;
+ break;
+ }
+
+ return 0;
+}
+
+static const struct rte_tm_node_capabilities tm_node_cap[] = {
+ [TM_NODE_LEVEL_PORT] = {
+ .shaper_private_supported = 1,
+ .shaper_private_dual_rate_supported = 0,
+ .shaper_private_rate_min = 1,
+ .shaper_private_rate_max = UINT32_MAX,
+ .shaper_shared_n_max = 0,
+
+ .nonleaf = {
+ .sched_n_children_max = UINT32_MAX,
+ .sched_sp_n_priorities_max = 1,
+ .sched_wfq_n_children_per_group_max = UINT32_MAX,
+ .sched_wfq_n_groups_max = 1,
+ .sched_wfq_weight_max = 1,
+ },
+
+ .stats_mask = STATS_MASK_DEFAULT,
+ },
+
+ [TM_NODE_LEVEL_SUBPORT] = {
+ .shaper_private_supported = 1,
+ .shaper_private_dual_rate_supported = 0,
+ .shaper_private_rate_min = 1,
+ .shaper_private_rate_max = UINT32_MAX,
+ .shaper_shared_n_max = 0,
+
+ .nonleaf = {
+ .sched_n_children_max = UINT32_MAX,
+ .sched_sp_n_priorities_max = 1,
+ .sched_wfq_n_children_per_group_max = UINT32_MAX,
+ .sched_wfq_n_groups_max = 1,
+ .sched_wfq_weight_max = UINT32_MAX,
+ },
+
+ .stats_mask = STATS_MASK_DEFAULT,
+ },
+
+ [TM_NODE_LEVEL_PIPE] = {
+ .shaper_private_supported = 1,
+ .shaper_private_dual_rate_supported = 0,
+ .shaper_private_rate_min = 1,
+ .shaper_private_rate_max = UINT32_MAX,
+ .shaper_shared_n_max = 0,
+
+ .nonleaf = {
+ .sched_n_children_max =
+ RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE,
+ .sched_sp_n_priorities_max =
+ RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE,
+ .sched_wfq_n_children_per_group_max = 1,
+ .sched_wfq_n_groups_max = 0,
+ .sched_wfq_weight_max = 1,
+ },
+
+ .stats_mask = STATS_MASK_DEFAULT,
+ },
+
+ [TM_NODE_LEVEL_TC] = {
+ .shaper_private_supported = 1,
+ .shaper_private_dual_rate_supported = 0,
+ .shaper_private_rate_min = 1,
+ .shaper_private_rate_max = UINT32_MAX,
+ .shaper_shared_n_max = 1,
+
+ .nonleaf = {
+ .sched_n_children_max =
+ RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS,
+ .sched_sp_n_priorities_max = 1,
+ .sched_wfq_n_children_per_group_max =
+ RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS,
+ .sched_wfq_n_groups_max = 1,
+ .sched_wfq_weight_max = UINT32_MAX,
+ },
+
+ .stats_mask = STATS_MASK_DEFAULT,
+ },
+
+ [TM_NODE_LEVEL_QUEUE] = {
+ .shaper_private_supported = 0,
+ .shaper_private_dual_rate_supported = 0,
+ .shaper_private_rate_min = 0,
+ .shaper_private_rate_max = 0,
+ .shaper_shared_n_max = 0,
+
+ .leaf = {
+ .cman_head_drop_supported = 0,
+ .cman_wred_context_private_supported = WRED_SUPPORTED,
+ .cman_wred_context_shared_n_max = 0,
+ },
+
+ .stats_mask = STATS_MASK_QUEUE,
+ },
+};
+
+/* Traffic manager node capabilities get */
+static int
+pmd_tm_node_capabilities_get(struct rte_eth_dev *dev __rte_unused,
+ uint32_t node_id,
+ struct rte_tm_node_capabilities *cap,
+ struct rte_tm_error *error)
+{
+ struct tm_node *tm_node;
+
+ if (cap == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_CAPABILITIES,
+ NULL,
+ rte_strerror(EINVAL));
+
+ tm_node = tm_node_search(dev, node_id);
+ if (tm_node == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ memcpy(cap, &tm_node_cap[tm_node->level], sizeof(*cap));
+
+ switch (tm_node->level) {
+ case TM_NODE_LEVEL_PORT:
+ cap->nonleaf.sched_n_children_max =
+ tm_level_get_max_nodes(dev,
+ TM_NODE_LEVEL_SUBPORT);
+ cap->nonleaf.sched_wfq_n_children_per_group_max =
+ cap->nonleaf.sched_n_children_max;
+ break;
+
+ case TM_NODE_LEVEL_SUBPORT:
+ cap->nonleaf.sched_n_children_max =
+ tm_level_get_max_nodes(dev,
+ TM_NODE_LEVEL_PIPE);
+ cap->nonleaf.sched_wfq_n_children_per_group_max =
+ cap->nonleaf.sched_n_children_max;
+ break;
+
+ case TM_NODE_LEVEL_PIPE:
+ case TM_NODE_LEVEL_TC:
+ case TM_NODE_LEVEL_QUEUE:
+ default:
+ break;
+ }
+
+ return 0;
+}
+
+const struct rte_tm_ops pmd_tm_ops = {
+ .capabilities_get = pmd_tm_capabilities_get,
+ .level_capabilities_get = pmd_tm_level_capabilities_get,
+ .node_capabilities_get = pmd_tm_node_capabilities_get,
+};
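
For reference, a minimal sketch (not part of the patch) of how an application
reaches these ops through the generic ethdev TM API; port_id is assumed to
identify the softnic port with traffic management enabled:

#include <stdio.h>
#include <inttypes.h>
#include <rte_tm.h>

static int
show_tm_caps(uint16_t port_id)
{
	struct rte_tm_capabilities cap;
	struct rte_tm_error error;

	/* Dispatched to pmd_tm_capabilities_get() through pmd_tm_ops */
	if (rte_tm_capabilities_get(port_id, &cap, &error) != 0)
		return -1;

	printf("nodes max %" PRIu32 ", levels %" PRIu32
		", private shapers %" PRIu32 "\n",
		cap.n_nodes_max, cap.n_levels_max,
		cap.shaper_private_n_max);

	return 0;
}
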
--
2.9.3
* [dpdk-dev] [PATCH v5 4/5] net/softnic: add TM hierarchy related ops
2017-09-29 14:04 ` [dpdk-dev] [PATCH v5 0/5] net/softnic: sw fall-back pmd for traffic mgmt and others Jasvinder Singh
` (2 preceding siblings ...)
2017-09-29 14:04 ` [dpdk-dev] [PATCH v5 3/5] net/softnic: add TM capabilities ops Jasvinder Singh
@ 2017-09-29 14:04 ` Jasvinder Singh
2017-09-29 14:04 ` [dpdk-dev] [PATCH v5 5/5] app/testpmd: add traffic management forwarding mode Jasvinder Singh
4 siblings, 0 replies; 79+ messages in thread
From: Jasvinder Singh @ 2017-09-29 14:04 UTC (permalink / raw)
To: dev; +Cc: cristian.dumitrescu, ferruh.yigit, thomas
Implement the ethdev TM hierarchy-related APIs in the SoftNIC PMD.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Acked-by: Lu, Wenzhuo <wenzhuo.lu@intel.com>
---
v5 changes:
- add macros for the subport and pipe TC periods
- add more comments
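
A minimal usage sketch (not part of the patch) of the hierarchy these ops
accept: one subport and one pipe with 4 TCs and 4 queues per TC, built through
the generic ethdev TM API (rte_tm.h). Node IDs, shaper rates and the queue
count below are placeholders: leaf node IDs must stay below the softnic port's
configured number of queues, non-leaf IDs must be at or above it, shaper rates
must fit the configured port rate, and a commit-able hierarchy has to cover
all of the port's queues.

#include <rte_tm.h>

#define ROOT_ID		1000000	/* non-leaf IDs: >= number of queues */
#define SUBPORT_ID	1000001
#define PIPE_ID		1000002
#define TC_ID(i)	(1000010 + (i))

static int
softnic_tm_hierarchy_build(uint16_t port_id, struct rte_tm_error *err)
{
	struct rte_tm_shaper_params sp = {
		.peak = { .rate = 1250000000, .size = 1000000 },
		.pkt_length_adjust = RTE_TM_ETH_FRAMING_OVERHEAD_FCS,
	};
	struct rte_tm_node_params nonleaf = {
		.shaper_profile_id = 0,
		.nonleaf = { .n_sp_priorities = 1 },
	};
	struct rte_tm_node_params pipe_np = {
		.shaper_profile_id = 0,
		.nonleaf = { .n_sp_priorities = 4 },
	};
	struct rte_tm_node_params leaf = {
		.shaper_profile_id = RTE_TM_SHAPER_PROFILE_ID_NONE,
	};
	uint32_t tc, q;

	if (rte_tm_shaper_profile_add(port_id, 0, &sp, err))
		return -1;

	/* Port (root), subport and pipe nodes: priority 0, weight 1 */
	if (rte_tm_node_add(port_id, ROOT_ID, RTE_TM_NODE_ID_NULL, 0, 1,
			RTE_TM_NODE_LEVEL_ID_ANY, &nonleaf, err) ||
	    rte_tm_node_add(port_id, SUBPORT_ID, ROOT_ID, 0, 1,
			RTE_TM_NODE_LEVEL_ID_ANY, &nonleaf, err) ||
	    rte_tm_node_add(port_id, PIPE_ID, SUBPORT_ID, 0, 1,
			RTE_TM_NODE_LEVEL_ID_ANY, &pipe_np, err))
		return -1;

	/* 4 TCs per pipe (node priority = TC index), 4 leaf queues per TC */
	for (tc = 0; tc < 4; tc++) {
		if (rte_tm_node_add(port_id, TC_ID(tc), PIPE_ID, tc, 1,
				RTE_TM_NODE_LEVEL_ID_ANY, &nonleaf, err))
			return -1;

		for (q = 0; q < 4; q++)
			if (rte_tm_node_add(port_id, tc * 4 + q, TC_ID(tc),
					0, 1, RTE_TM_NODE_LEVEL_ID_ANY,
					&leaf, err))
				return -1;
	}

	/* Freeze the hierarchy and map it onto the librte_sched port */
	return rte_tm_hierarchy_commit(port_id, 1, err);
}
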
drivers/net/softnic/rte_eth_softnic_internals.h | 41 +
drivers/net/softnic/rte_eth_softnic_tm.c | 2781 ++++++++++++++++++++++-
2 files changed, 2817 insertions(+), 5 deletions(-)
diff --git a/drivers/net/softnic/rte_eth_softnic_internals.h b/drivers/net/softnic/rte_eth_softnic_internals.h
index 9b313d0..a2675e0 100644
--- a/drivers/net/softnic/rte_eth_softnic_internals.h
+++ b/drivers/net/softnic/rte_eth_softnic_internals.h
@@ -138,6 +138,36 @@ enum tm_node_level {
TM_NODE_LEVEL_MAX,
};
+/* TM Shaper Profile */
+struct tm_shaper_profile {
+ TAILQ_ENTRY(tm_shaper_profile) node;
+ uint32_t shaper_profile_id;
+ uint32_t n_users;
+ struct rte_tm_shaper_params params;
+};
+
+TAILQ_HEAD(tm_shaper_profile_list, tm_shaper_profile);
+
+/* TM Shared Shaper */
+struct tm_shared_shaper {
+ TAILQ_ENTRY(tm_shared_shaper) node;
+ uint32_t shared_shaper_id;
+ uint32_t n_users;
+ uint32_t shaper_profile_id;
+};
+
+TAILQ_HEAD(tm_shared_shaper_list, tm_shared_shaper);
+
+/* TM WRED Profile */
+struct tm_wred_profile {
+ TAILQ_ENTRY(tm_wred_profile) node;
+ uint32_t wred_profile_id;
+ uint32_t n_users;
+ struct rte_tm_wred_params params;
+};
+
+TAILQ_HEAD(tm_wred_profile_list, tm_wred_profile);
+
/* TM Node */
struct tm_node {
TAILQ_ENTRY(tm_node) node;
@@ -147,6 +177,8 @@ struct tm_node {
uint32_t weight;
uint32_t level;
struct tm_node *parent_node;
+ struct tm_shaper_profile *shaper_profile;
+ struct tm_wred_profile *wred_profile;
struct rte_tm_node_params params;
struct rte_tm_node_stats stats;
uint32_t n_children;
@@ -156,8 +188,16 @@ TAILQ_HEAD(tm_node_list, tm_node);
/* TM Hierarchy Specification */
struct tm_hierarchy {
+ struct tm_shaper_profile_list shaper_profiles;
+ struct tm_shared_shaper_list shared_shapers;
+ struct tm_wred_profile_list wred_profiles;
struct tm_node_list nodes;
+ uint32_t n_shaper_profiles;
+ uint32_t n_shared_shapers;
+ uint32_t n_wred_profiles;
+ uint32_t n_nodes;
+
uint32_t n_tm_nodes[TM_NODE_LEVEL_MAX];
};
@@ -170,6 +210,7 @@ struct tm_internals {
* sense to keep the hierarchy frozen after the port is started.
*/
struct tm_hierarchy h;
+ int hierarchy_frozen;
/** Blueprints */
struct tm_params params;
diff --git a/drivers/net/softnic/rte_eth_softnic_tm.c b/drivers/net/softnic/rte_eth_softnic_tm.c
index a552006..21cc93b 100644
--- a/drivers/net/softnic/rte_eth_softnic_tm.c
+++ b/drivers/net/softnic/rte_eth_softnic_tm.c
@@ -40,7 +40,9 @@
#include "rte_eth_softnic_internals.h"
#include "rte_eth_softnic.h"
-#define BYTES_IN_MBPS (1000 * 1000 / 8)
+#define BYTES_IN_MBPS (1000 * 1000 / 8)
+#define SUBPORT_TC_PERIOD 10
+#define PIPE_TC_PERIOD 40
int
tm_params_check(struct pmd_params *params, uint32_t hard_rate)
@@ -86,6 +88,79 @@ tm_params_check(struct pmd_params *params, uint32_t hard_rate)
return 0;
}
+static void
+tm_hierarchy_init(struct pmd_internals *p)
+{
+ memset(&p->soft.tm.h, 0, sizeof(p->soft.tm.h));
+
+ /* Initialize shaper profile list */
+ TAILQ_INIT(&p->soft.tm.h.shaper_profiles);
+
+ /* Initialize shared shaper list */
+ TAILQ_INIT(&p->soft.tm.h.shared_shapers);
+
+ /* Initialize wred profile list */
+ TAILQ_INIT(&p->soft.tm.h.wred_profiles);
+
+ /* Initialize TM node list */
+ TAILQ_INIT(&p->soft.tm.h.nodes);
+}
+
+static void
+tm_hierarchy_uninit(struct pmd_internals *p)
+{
+	/* Remove all nodes */
+ for ( ; ; ) {
+ struct tm_node *tm_node;
+
+ tm_node = TAILQ_FIRST(&p->soft.tm.h.nodes);
+ if (tm_node == NULL)
+ break;
+
+ TAILQ_REMOVE(&p->soft.tm.h.nodes, tm_node, node);
+ free(tm_node);
+ }
+
+ /* Remove all WRED profiles */
+ for ( ; ; ) {
+ struct tm_wred_profile *wred_profile;
+
+ wred_profile = TAILQ_FIRST(&p->soft.tm.h.wred_profiles);
+ if (wred_profile == NULL)
+ break;
+
+ TAILQ_REMOVE(&p->soft.tm.h.wred_profiles, wred_profile, node);
+ free(wred_profile);
+ }
+
+ /* Remove all shared shapers */
+ for ( ; ; ) {
+ struct tm_shared_shaper *shared_shaper;
+
+ shared_shaper = TAILQ_FIRST(&p->soft.tm.h.shared_shapers);
+ if (shared_shaper == NULL)
+ break;
+
+ TAILQ_REMOVE(&p->soft.tm.h.shared_shapers, shared_shaper, node);
+ free(shared_shaper);
+ }
+
+ /* Remove all shaper profiles */
+ for ( ; ; ) {
+ struct tm_shaper_profile *shaper_profile;
+
+ shaper_profile = TAILQ_FIRST(&p->soft.tm.h.shaper_profiles);
+ if (shaper_profile == NULL)
+ break;
+
+ TAILQ_REMOVE(&p->soft.tm.h.shaper_profiles,
+ shaper_profile, node);
+ free(shaper_profile);
+ }
+
+ memset(&p->soft.tm.h, 0, sizeof(p->soft.tm.h));
+}
+
int
tm_init(struct pmd_internals *p,
struct pmd_params *params,
@@ -112,12 +187,15 @@ tm_init(struct pmd_internals *p,
return -ENOMEM;
}
+ tm_hierarchy_init(p);
+
return 0;
}
void
tm_free(struct pmd_internals *p)
{
+ tm_hierarchy_uninit(p);
rte_free(p->soft.tm.pkts_enq);
rte_free(p->soft.tm.pkts_deq);
}
@@ -129,6 +207,10 @@ tm_start(struct pmd_internals *p)
uint32_t n_subports, subport_id;
int status;
+ /* Is hierarchy frozen? */
+ if (p->soft.tm.hierarchy_frozen == 0)
+ return -1;
+
/* Port */
p->soft.tm.sched = rte_sched_port_config(&t->port_params);
if (p->soft.tm.sched == NULL)
@@ -178,6 +260,51 @@ tm_stop(struct pmd_internals *p)
{
if (p->soft.tm.sched)
rte_sched_port_free(p->soft.tm.sched);
+
+ /* Unfreeze hierarchy */
+ p->soft.tm.hierarchy_frozen = 0;
+}
+
+static struct tm_shaper_profile *
+tm_shaper_profile_search(struct rte_eth_dev *dev, uint32_t shaper_profile_id)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_shaper_profile_list *spl = &p->soft.tm.h.shaper_profiles;
+ struct tm_shaper_profile *sp;
+
+ TAILQ_FOREACH(sp, spl, node)
+ if (shaper_profile_id == sp->shaper_profile_id)
+ return sp;
+
+ return NULL;
+}
+
+static struct tm_shared_shaper *
+tm_shared_shaper_search(struct rte_eth_dev *dev, uint32_t shared_shaper_id)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_shared_shaper_list *ssl = &p->soft.tm.h.shared_shapers;
+ struct tm_shared_shaper *ss;
+
+ TAILQ_FOREACH(ss, ssl, node)
+ if (shared_shaper_id == ss->shared_shaper_id)
+ return ss;
+
+ return NULL;
+}
+
+static struct tm_wred_profile *
+tm_wred_profile_search(struct rte_eth_dev *dev, uint32_t wred_profile_id)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_wred_profile_list *wpl = &p->soft.tm.h.wred_profiles;
+ struct tm_wred_profile *wp;
+
+ TAILQ_FOREACH(wp, wpl, node)
+ if (wred_profile_id == wp->wred_profile_id)
+ return wp;
+
+ return NULL;
}
static struct tm_node *
@@ -194,6 +321,94 @@ tm_node_search(struct rte_eth_dev *dev, uint32_t node_id)
return NULL;
}
+static struct tm_node *
+tm_root_node_present(struct rte_eth_dev *dev)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_node_list *nl = &p->soft.tm.h.nodes;
+ struct tm_node *n;
+
+ TAILQ_FOREACH(n, nl, node)
+ if (n->parent_node_id == RTE_TM_NODE_ID_NULL)
+ return n;
+
+ return NULL;
+}
+
+static uint32_t
+tm_node_subport_id(struct rte_eth_dev *dev, struct tm_node *subport_node)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_node_list *nl = &p->soft.tm.h.nodes;
+ struct tm_node *ns;
+ uint32_t subport_id;
+
+ subport_id = 0;
+ TAILQ_FOREACH(ns, nl, node) {
+ if (ns->level != TM_NODE_LEVEL_SUBPORT)
+ continue;
+
+ if (ns->node_id == subport_node->node_id)
+ return subport_id;
+
+ subport_id++;
+ }
+
+ return UINT32_MAX;
+}
+
+static uint32_t
+tm_node_pipe_id(struct rte_eth_dev *dev, struct tm_node *pipe_node)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_node_list *nl = &p->soft.tm.h.nodes;
+ struct tm_node *np;
+ uint32_t pipe_id;
+
+ pipe_id = 0;
+ TAILQ_FOREACH(np, nl, node) {
+ if ((np->level != TM_NODE_LEVEL_PIPE) ||
+ (np->parent_node_id != pipe_node->parent_node_id))
+ continue;
+
+ if (np->node_id == pipe_node->node_id)
+ return pipe_id;
+
+ pipe_id++;
+ }
+
+ return UINT32_MAX;
+}
+
+static uint32_t
+tm_node_tc_id(struct rte_eth_dev *dev __rte_unused, struct tm_node *tc_node)
+{
+ return tc_node->priority;
+}
+
+static uint32_t
+tm_node_queue_id(struct rte_eth_dev *dev, struct tm_node *queue_node)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_node_list *nl = &p->soft.tm.h.nodes;
+ struct tm_node *nq;
+ uint32_t queue_id;
+
+ queue_id = 0;
+ TAILQ_FOREACH(nq, nl, node) {
+ if ((nq->level != TM_NODE_LEVEL_QUEUE) ||
+ (nq->parent_node_id != queue_node->parent_node_id))
+ continue;
+
+ if (nq->node_id == queue_node->node_id)
+ return queue_id;
+
+ queue_id++;
+ }
+
+ return UINT32_MAX;
+}
+
static uint32_t
tm_level_get_max_nodes(struct rte_eth_dev *dev, enum tm_node_level level)
{
@@ -219,6 +434,35 @@ tm_level_get_max_nodes(struct rte_eth_dev *dev, enum tm_node_level level)
}
}
+/* Traffic manager node type get */
+static int
+pmd_tm_node_type_get(struct rte_eth_dev *dev,
+ uint32_t node_id,
+ int *is_leaf,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ if (is_leaf == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+
+ if ((node_id == RTE_TM_NODE_ID_NULL) ||
+ (tm_node_search(dev, node_id) == NULL))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ *is_leaf = node_id < p->params.soft.tm.nb_queues;
+
+ return 0;
+}
+
#ifdef RTE_SCHED_RED
#define WRED_SUPPORTED 1
#else
@@ -674,8 +918,2535 @@ pmd_tm_node_capabilities_get(struct rte_eth_dev *dev __rte_unused,
return 0;
}
-const struct rte_tm_ops pmd_tm_ops = {
- .capabilities_get = pmd_tm_capabilities_get,
- .level_capabilities_get = pmd_tm_level_capabilities_get,
- .node_capabilities_get = pmd_tm_node_capabilities_get,
+static int
+shaper_profile_check(struct rte_eth_dev *dev,
+ uint32_t shaper_profile_id,
+ struct rte_tm_shaper_params *profile,
+ struct rte_tm_error *error)
+{
+ struct tm_shaper_profile *sp;
+
+ /* Shaper profile ID must not be NONE. */
+ if (shaper_profile_id == RTE_TM_SHAPER_PROFILE_ID_NONE)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_SHAPER_PROFILE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Shaper profile must not exist. */
+ sp = tm_shaper_profile_search(dev, shaper_profile_id);
+ if (sp)
+ return -rte_tm_error_set(error,
+ EEXIST,
+ RTE_TM_ERROR_TYPE_SHAPER_PROFILE_ID,
+ NULL,
+ rte_strerror(EEXIST));
+
+ /* Profile must not be NULL. */
+ if (profile == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_SHAPER_PROFILE,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Peak rate: non-zero, 32-bit */
+ if ((profile->peak.rate == 0) ||
+ (profile->peak.rate >= UINT32_MAX))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_SHAPER_PROFILE_PEAK_RATE,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Peak size: non-zero, 32-bit */
+ if ((profile->peak.size == 0) ||
+ (profile->peak.size >= UINT32_MAX))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_SHAPER_PROFILE_PEAK_SIZE,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Dual-rate profiles are not supported. */
+ if (profile->committed.rate != 0)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_SHAPER_PROFILE_COMMITTED_RATE,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Packet length adjust: 24 bytes */
+ if (profile->pkt_length_adjust != RTE_TM_ETH_FRAMING_OVERHEAD_FCS)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_SHAPER_PROFILE_PKT_ADJUST_LEN,
+ NULL,
+ rte_strerror(EINVAL));
+
+ return 0;
+}
+
+/* Traffic manager shaper profile add */
+static int
+pmd_tm_shaper_profile_add(struct rte_eth_dev *dev,
+ uint32_t shaper_profile_id,
+ struct rte_tm_shaper_params *profile,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_shaper_profile_list *spl = &p->soft.tm.h.shaper_profiles;
+ struct tm_shaper_profile *sp;
+ int status;
+
+ /* Check input params */
+ status = shaper_profile_check(dev, shaper_profile_id, profile, error);
+ if (status)
+ return status;
+
+ /* Memory allocation */
+ sp = calloc(1, sizeof(struct tm_shaper_profile));
+ if (sp == NULL)
+ return -rte_tm_error_set(error,
+ ENOMEM,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(ENOMEM));
+
+ /* Fill in */
+ sp->shaper_profile_id = shaper_profile_id;
+ memcpy(&sp->params, profile, sizeof(sp->params));
+
+ /* Add to list */
+ TAILQ_INSERT_TAIL(spl, sp, node);
+ p->soft.tm.h.n_shaper_profiles++;
+
+ return 0;
+}
+
+/* Traffic manager shaper profile delete */
+static int
+pmd_tm_shaper_profile_delete(struct rte_eth_dev *dev,
+ uint32_t shaper_profile_id,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_shaper_profile *sp;
+
+ /* Check existing */
+ sp = tm_shaper_profile_search(dev, shaper_profile_id);
+ if (sp == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_SHAPER_PROFILE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Check unused */
+ if (sp->n_users)
+ return -rte_tm_error_set(error,
+ EBUSY,
+ RTE_TM_ERROR_TYPE_SHAPER_PROFILE_ID,
+ NULL,
+ rte_strerror(EBUSY));
+
+ /* Remove from list */
+ TAILQ_REMOVE(&p->soft.tm.h.shaper_profiles, sp, node);
+ p->soft.tm.h.n_shaper_profiles--;
+ free(sp);
+
+ return 0;
+}
+
+static struct tm_node *
+tm_shared_shaper_get_tc(struct rte_eth_dev *dev,
+ struct tm_shared_shaper *ss)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_node_list *nl = &p->soft.tm.h.nodes;
+ struct tm_node *n;
+
+ /* Subport: each TC uses shared shaper */
+ TAILQ_FOREACH(n, nl, node) {
+ if ((n->level != TM_NODE_LEVEL_TC) ||
+ (n->params.n_shared_shapers == 0) ||
+ (n->params.shared_shaper_id[0] != ss->shared_shaper_id))
+ continue;
+
+ return n;
+ }
+
+ return NULL;
+}
+
+static int
+update_subport_tc_rate(struct rte_eth_dev *dev,
+ struct tm_node *nt,
+ struct tm_shared_shaper *ss,
+ struct tm_shaper_profile *sp_new)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ uint32_t tc_id = tm_node_tc_id(dev, nt);
+
+ struct tm_node *np = nt->parent_node;
+
+ struct tm_node *ns = np->parent_node;
+ uint32_t subport_id = tm_node_subport_id(dev, ns);
+
+ struct rte_sched_subport_params subport_params;
+
+ struct tm_shaper_profile *sp_old = tm_shaper_profile_search(dev,
+ ss->shaper_profile_id);
+
+ /* Derive new subport configuration. */
+ memcpy(&subport_params,
+ &p->soft.tm.params.subport_params[subport_id],
+ sizeof(subport_params));
+ subport_params.tc_rate[tc_id] = sp_new->params.peak.rate;
+
+ /* Update the subport configuration. */
+ if (rte_sched_subport_config(p->soft.tm.sched,
+ subport_id, &subport_params))
+ return -1;
+
+ /* Commit changes. */
+ sp_old->n_users--;
+
+ ss->shaper_profile_id = sp_new->shaper_profile_id;
+ sp_new->n_users++;
+
+ memcpy(&p->soft.tm.params.subport_params[subport_id],
+ &subport_params,
+ sizeof(subport_params));
+
+ return 0;
+}
+
+/* Traffic manager shared shaper add/update */
+static int
+pmd_tm_shared_shaper_add_update(struct rte_eth_dev *dev,
+ uint32_t shared_shaper_id,
+ uint32_t shaper_profile_id,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_shared_shaper *ss;
+ struct tm_shaper_profile *sp;
+ struct tm_node *nt;
+
+ /* Shaper profile must be valid. */
+ sp = tm_shaper_profile_search(dev, shaper_profile_id);
+ if (sp == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_SHAPER_PROFILE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /**
+ * Add new shared shaper
+ */
+ ss = tm_shared_shaper_search(dev, shared_shaper_id);
+ if (ss == NULL) {
+ struct tm_shared_shaper_list *ssl =
+ &p->soft.tm.h.shared_shapers;
+
+ /* Hierarchy must not be frozen */
+ if (p->soft.tm.hierarchy_frozen)
+ return -rte_tm_error_set(error,
+ EBUSY,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EBUSY));
+
+ /* Memory allocation */
+ ss = calloc(1, sizeof(struct tm_shared_shaper));
+ if (ss == NULL)
+ return -rte_tm_error_set(error,
+ ENOMEM,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(ENOMEM));
+
+ /* Fill in */
+ ss->shared_shaper_id = shared_shaper_id;
+ ss->shaper_profile_id = shaper_profile_id;
+
+ /* Add to list */
+ TAILQ_INSERT_TAIL(ssl, ss, node);
+ p->soft.tm.h.n_shared_shapers++;
+
+ return 0;
+ }
+
+ /**
+ * Update existing shared shaper
+ */
+ /* Hierarchy must be frozen (run-time update) */
+ if (p->soft.tm.hierarchy_frozen == 0)
+ return -rte_tm_error_set(error,
+ EBUSY,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EBUSY));
+
+ /* Propagate change. */
+ nt = tm_shared_shaper_get_tc(dev, ss);
+ if (update_subport_tc_rate(dev, nt, ss, sp))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+
+ return 0;
+}
+
+/* Traffic manager shared shaper delete */
+static int
+pmd_tm_shared_shaper_delete(struct rte_eth_dev *dev,
+ uint32_t shared_shaper_id,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_shared_shaper *ss;
+
+ /* Check existing */
+ ss = tm_shared_shaper_search(dev, shared_shaper_id);
+ if (ss == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_SHARED_SHAPER_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Check unused */
+ if (ss->n_users)
+ return -rte_tm_error_set(error,
+ EBUSY,
+ RTE_TM_ERROR_TYPE_SHARED_SHAPER_ID,
+ NULL,
+ rte_strerror(EBUSY));
+
+ /* Remove from list */
+ TAILQ_REMOVE(&p->soft.tm.h.shared_shapers, ss, node);
+ p->soft.tm.h.n_shared_shapers--;
+ free(ss);
+
+ return 0;
+}
+
+static int
+wred_profile_check(struct rte_eth_dev *dev,
+ uint32_t wred_profile_id,
+ struct rte_tm_wred_params *profile,
+ struct rte_tm_error *error)
+{
+ struct tm_wred_profile *wp;
+ enum rte_tm_color color;
+
+ /* WRED profile ID must not be NONE. */
+ if (wred_profile_id == RTE_TM_WRED_PROFILE_ID_NONE)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_WRED_PROFILE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* WRED profile must not exist. */
+ wp = tm_wred_profile_search(dev, wred_profile_id);
+ if (wp)
+ return -rte_tm_error_set(error,
+ EEXIST,
+ RTE_TM_ERROR_TYPE_WRED_PROFILE_ID,
+ NULL,
+ rte_strerror(EEXIST));
+
+ /* Profile must not be NULL. */
+ if (profile == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_WRED_PROFILE,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* min_th <= max_th, max_th > 0 */
+ for (color = RTE_TM_GREEN; color < RTE_TM_COLORS; color++) {
+ uint16_t min_th = profile->red_params[color].min_th;
+ uint16_t max_th = profile->red_params[color].max_th;
+
+ if ((min_th > max_th) || (max_th == 0))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_WRED_PROFILE,
+ NULL,
+ rte_strerror(EINVAL));
+ }
+
+ return 0;
+}
+
+/* Traffic manager WRED profile add */
+static int
+pmd_tm_wred_profile_add(struct rte_eth_dev *dev,
+ uint32_t wred_profile_id,
+ struct rte_tm_wred_params *profile,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_wred_profile_list *wpl = &p->soft.tm.h.wred_profiles;
+ struct tm_wred_profile *wp;
+ int status;
+
+ /* Check input params */
+ status = wred_profile_check(dev, wred_profile_id, profile, error);
+ if (status)
+ return status;
+
+ /* Memory allocation */
+ wp = calloc(1, sizeof(struct tm_wred_profile));
+ if (wp == NULL)
+ return -rte_tm_error_set(error,
+ ENOMEM,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(ENOMEM));
+
+ /* Fill in */
+ wp->wred_profile_id = wred_profile_id;
+ memcpy(&wp->params, profile, sizeof(wp->params));
+
+ /* Add to list */
+ TAILQ_INSERT_TAIL(wpl, wp, node);
+ p->soft.tm.h.n_wred_profiles++;
+
+ return 0;
+}
+
+/* Traffic manager WRED profile delete */
+static int
+pmd_tm_wred_profile_delete(struct rte_eth_dev *dev,
+ uint32_t wred_profile_id,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_wred_profile *wp;
+
+ /* Check existing */
+ wp = tm_wred_profile_search(dev, wred_profile_id);
+ if (wp == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_WRED_PROFILE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Check unused */
+ if (wp->n_users)
+ return -rte_tm_error_set(error,
+ EBUSY,
+ RTE_TM_ERROR_TYPE_WRED_PROFILE_ID,
+ NULL,
+ rte_strerror(EBUSY));
+
+ /* Remove from list */
+ TAILQ_REMOVE(&p->soft.tm.h.wred_profiles, wp, node);
+ p->soft.tm.h.n_wred_profiles--;
+ free(wp);
+
+ return 0;
+}
+
+static int
+node_add_check_port(struct rte_eth_dev *dev,
+ uint32_t node_id,
+ uint32_t parent_node_id __rte_unused,
+ uint32_t priority,
+ uint32_t weight,
+ uint32_t level_id __rte_unused,
+ struct rte_tm_node_params *params,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_shaper_profile *sp = tm_shaper_profile_search(dev,
+ params->shaper_profile_id);
+
+ /* node type: non-leaf */
+ if (node_id < p->params.soft.tm.nb_queues)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Priority must be 0 */
+ if (priority != 0)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PRIORITY,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Weight must be 1 */
+ if (weight != 1)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_WEIGHT,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Shaper must be valid.
+ * Shaper profile peak rate must fit the configured port rate.
+ */
+ if ((params->shaper_profile_id == RTE_TM_SHAPER_PROFILE_ID_NONE) ||
+ (sp == NULL) ||
+ (sp->params.peak.rate > p->params.soft.tm.rate))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_SHAPER_PROFILE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* No shared shapers */
+ if (params->n_shared_shapers != 0)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_N_SHARED_SHAPERS,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Number of SP priorities must be 1 */
+ if (params->nonleaf.n_sp_priorities != 1)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_N_SP_PRIORITIES,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Stats */
+ if (params->stats_mask & (~STATS_MASK_DEFAULT))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_STATS,
+ NULL,
+ rte_strerror(EINVAL));
+
+ return 0;
+}
+
+static int
+node_add_check_subport(struct rte_eth_dev *dev,
+ uint32_t node_id,
+ uint32_t parent_node_id __rte_unused,
+ uint32_t priority,
+ uint32_t weight,
+ uint32_t level_id __rte_unused,
+ struct rte_tm_node_params *params,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ /* node type: non-leaf */
+ if (node_id < p->params.soft.tm.nb_queues)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Priority must be 0 */
+ if (priority != 0)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PRIORITY,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Weight must be 1 */
+ if (weight != 1)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_WEIGHT,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Shaper must be valid */
+ if ((params->shaper_profile_id == RTE_TM_SHAPER_PROFILE_ID_NONE) ||
+ (!tm_shaper_profile_search(dev, params->shaper_profile_id)))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_SHAPER_PROFILE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* No shared shapers */
+ if (params->n_shared_shapers != 0)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_N_SHARED_SHAPERS,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Number of SP priorities must be 1 */
+ if (params->nonleaf.n_sp_priorities != 1)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_N_SP_PRIORITIES,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Stats */
+ if (params->stats_mask & (~STATS_MASK_DEFAULT))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_STATS,
+ NULL,
+ rte_strerror(EINVAL));
+
+ return 0;
+}
+
+static int
+node_add_check_pipe(struct rte_eth_dev *dev,
+ uint32_t node_id,
+ uint32_t parent_node_id __rte_unused,
+ uint32_t priority,
+ uint32_t weight __rte_unused,
+ uint32_t level_id __rte_unused,
+ struct rte_tm_node_params *params,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ /* node type: non-leaf */
+ if (node_id < p->params.soft.tm.nb_queues)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Priority must be 0 */
+ if (priority != 0)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PRIORITY,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Shaper must be valid */
+ if ((params->shaper_profile_id == RTE_TM_SHAPER_PROFILE_ID_NONE) ||
+ (!tm_shaper_profile_search(dev, params->shaper_profile_id)))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_SHAPER_PROFILE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* No shared shapers */
+ if (params->n_shared_shapers != 0)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_N_SHARED_SHAPERS,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Number of SP priorities must be 4 */
+ if (params->nonleaf.n_sp_priorities !=
+ RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_N_SP_PRIORITIES,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* WFQ mode must be byte mode */
+ if ((params->nonleaf.wfq_weight_mode != NULL) &&
+ (params->nonleaf.wfq_weight_mode[0] != 0) &&
+ (params->nonleaf.wfq_weight_mode[1] != 0) &&
+ (params->nonleaf.wfq_weight_mode[2] != 0) &&
+ (params->nonleaf.wfq_weight_mode[3] != 0))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_WFQ_WEIGHT_MODE,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Stats */
+ if (params->stats_mask & (~STATS_MASK_DEFAULT))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_STATS,
+ NULL,
+ rte_strerror(EINVAL));
+
+ return 0;
+}
+
+static int
+node_add_check_tc(struct rte_eth_dev *dev,
+ uint32_t node_id,
+ uint32_t parent_node_id __rte_unused,
+ uint32_t priority __rte_unused,
+ uint32_t weight,
+ uint32_t level_id __rte_unused,
+ struct rte_tm_node_params *params,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ /* node type: non-leaf */
+ if (node_id < p->params.soft.tm.nb_queues)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Weight must be 1 */
+ if (weight != 1)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_WEIGHT,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Shaper must be valid */
+ if ((params->shaper_profile_id == RTE_TM_SHAPER_PROFILE_ID_NONE) ||
+ (!tm_shaper_profile_search(dev, params->shaper_profile_id)))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_SHAPER_PROFILE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Single valid shared shaper */
+ if (params->n_shared_shapers > 1)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_N_SHARED_SHAPERS,
+ NULL,
+ rte_strerror(EINVAL));
+
+ if ((params->n_shared_shapers == 1) &&
+ ((params->shared_shaper_id == NULL) ||
+ (!tm_shared_shaper_search(dev, params->shared_shaper_id[0]))))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_SHARED_SHAPER_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Number of priorities must be 1 */
+ if (params->nonleaf.n_sp_priorities != 1)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_N_SP_PRIORITIES,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Stats */
+ if (params->stats_mask & (~STATS_MASK_DEFAULT))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_STATS,
+ NULL,
+ rte_strerror(EINVAL));
+
+ return 0;
+}
+
+static int
+node_add_check_queue(struct rte_eth_dev *dev,
+ uint32_t node_id,
+ uint32_t parent_node_id __rte_unused,
+ uint32_t priority,
+ uint32_t weight __rte_unused,
+ uint32_t level_id __rte_unused,
+ struct rte_tm_node_params *params,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ /* node type: leaf */
+ if (node_id >= p->params.soft.tm.nb_queues)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Priority must be 0 */
+ if (priority != 0)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PRIORITY,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* No shaper */
+ if (params->shaper_profile_id != RTE_TM_SHAPER_PROFILE_ID_NONE)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_SHAPER_PROFILE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* No shared shapers */
+ if (params->n_shared_shapers != 0)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_N_SHARED_SHAPERS,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Congestion management must not be head drop */
+ if (params->leaf.cman == RTE_TM_CMAN_HEAD_DROP)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_CMAN,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Congestion management set to WRED */
+ if (params->leaf.cman == RTE_TM_CMAN_WRED) {
+ uint32_t wred_profile_id = params->leaf.wred.wred_profile_id;
+ struct tm_wred_profile *wp = tm_wred_profile_search(dev,
+ wred_profile_id);
+
+ /* WRED profile (for private WRED context) must be valid */
+ if ((wred_profile_id == RTE_TM_WRED_PROFILE_ID_NONE) ||
+ (wp == NULL))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_WRED_PROFILE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* No shared WRED contexts */
+ if (params->leaf.wred.n_shared_wred_contexts != 0)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_N_SHARED_WRED_CONTEXTS,
+ NULL,
+ rte_strerror(EINVAL));
+ }
+
+ /* Stats */
+ if (params->stats_mask & (~STATS_MASK_QUEUE))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_STATS,
+ NULL,
+ rte_strerror(EINVAL));
+
+ return 0;
+}
+
+static int
+node_add_check(struct rte_eth_dev *dev,
+ uint32_t node_id,
+ uint32_t parent_node_id,
+ uint32_t priority,
+ uint32_t weight,
+ uint32_t level_id,
+ struct rte_tm_node_params *params,
+ struct rte_tm_error *error)
+{
+ struct tm_node *pn;
+ uint32_t level;
+ int status;
+
+ /* node_id, parent_node_id:
+ * -node_id must not be RTE_TM_NODE_ID_NULL
+ * -node_id must not be in use
+ * -root node add (parent_node_id is RTE_TM_NODE_ID_NULL):
+ * -root node must not exist
+ * -non-root node add (parent_node_id is not RTE_TM_NODE_ID_NULL):
+ * -parent_node_id must be valid
+ */
+ if (node_id == RTE_TM_NODE_ID_NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ if (tm_node_search(dev, node_id))
+ return -rte_tm_error_set(error,
+ EEXIST,
+ RTE_TM_ERROR_TYPE_NODE_ID,
+ NULL,
+ rte_strerror(EEXIST));
+
+ if (parent_node_id == RTE_TM_NODE_ID_NULL) {
+ pn = NULL;
+ if (tm_root_node_present(dev))
+ return -rte_tm_error_set(error,
+ EEXIST,
+ RTE_TM_ERROR_TYPE_NODE_PARENT_NODE_ID,
+ NULL,
+ rte_strerror(EEXIST));
+ } else {
+ pn = tm_node_search(dev, parent_node_id);
+ if (pn == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARENT_NODE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+ }
+
+ /* priority: must be 0 .. 3 */
+ if (priority >= RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PRIORITY,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* weight: must be 1 .. 255 */
+ if ((weight == 0) || (weight >= UINT8_MAX))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_WEIGHT,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* level_id: if valid, then
+ * -root node add (parent_node_id is RTE_TM_NODE_ID_NULL):
+ * -level_id must be zero
+ * -non-root node add (parent_node_id is not RTE_TM_NODE_ID_NULL):
+ * -level_id must be parent level ID plus one
+ */
+ level = (pn == NULL) ? 0 : pn->level + 1;
+ if ((level_id != RTE_TM_NODE_LEVEL_ID_ANY) && (level_id != level))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_LEVEL_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* params: must not be NULL */
+ if (params == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* params: per level checks */
+ switch (level) {
+ case TM_NODE_LEVEL_PORT:
+ status = node_add_check_port(dev, node_id,
+ parent_node_id, priority, weight, level_id,
+ params, error);
+ if (status)
+ return status;
+ break;
+
+ case TM_NODE_LEVEL_SUBPORT:
+ status = node_add_check_subport(dev, node_id,
+ parent_node_id, priority, weight, level_id,
+ params, error);
+ if (status)
+ return status;
+ break;
+
+ case TM_NODE_LEVEL_PIPE:
+ status = node_add_check_pipe(dev, node_id,
+ parent_node_id, priority, weight, level_id,
+ params, error);
+ if (status)
+ return status;
+ break;
+
+ case TM_NODE_LEVEL_TC:
+ status = node_add_check_tc(dev, node_id,
+ parent_node_id, priority, weight, level_id,
+ params, error);
+ if (status)
+ return status;
+ break;
+
+ case TM_NODE_LEVEL_QUEUE:
+ status = node_add_check_queue(dev, node_id,
+ parent_node_id, priority, weight, level_id,
+ params, error);
+ if (status)
+ return status;
+ break;
+
+ default:
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_LEVEL_ID,
+ NULL,
+ rte_strerror(EINVAL));
+ }
+
+ return 0;
+}
+
+/* Traffic manager node add */
+static int
+pmd_tm_node_add(struct rte_eth_dev *dev,
+ uint32_t node_id,
+ uint32_t parent_node_id,
+ uint32_t priority,
+ uint32_t weight,
+ uint32_t level_id,
+ struct rte_tm_node_params *params,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_node_list *nl = &p->soft.tm.h.nodes;
+ struct tm_node *n;
+ uint32_t i;
+ int status;
+
+ /* Checks */
+ if (p->soft.tm.hierarchy_frozen)
+ return -rte_tm_error_set(error,
+ EBUSY,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EBUSY));
+
+ status = node_add_check(dev, node_id, parent_node_id, priority, weight,
+ level_id, params, error);
+ if (status)
+ return status;
+
+ /* Memory allocation */
+ n = calloc(1, sizeof(struct tm_node));
+ if (n == NULL)
+ return -rte_tm_error_set(error,
+ ENOMEM,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(ENOMEM));
+
+ /* Fill in */
+ n->node_id = node_id;
+ n->parent_node_id = parent_node_id;
+ n->priority = priority;
+ n->weight = weight;
+
+ if (parent_node_id != RTE_TM_NODE_ID_NULL) {
+ n->parent_node = tm_node_search(dev, parent_node_id);
+ n->level = n->parent_node->level + 1;
+ }
+
+ if (params->shaper_profile_id != RTE_TM_SHAPER_PROFILE_ID_NONE)
+ n->shaper_profile = tm_shaper_profile_search(dev,
+ params->shaper_profile_id);
+
+ if ((n->level == TM_NODE_LEVEL_QUEUE) &&
+ (params->leaf.cman == RTE_TM_CMAN_WRED))
+ n->wred_profile = tm_wred_profile_search(dev,
+ params->leaf.wred.wred_profile_id);
+
+ memcpy(&n->params, params, sizeof(n->params));
+
+ /* Add to list */
+ TAILQ_INSERT_TAIL(nl, n, node);
+ p->soft.tm.h.n_nodes++;
+
+ /* Update dependencies */
+ if (n->parent_node)
+ n->parent_node->n_children++;
+
+ if (n->shaper_profile)
+ n->shaper_profile->n_users++;
+
+ for (i = 0; i < params->n_shared_shapers; i++) {
+ struct tm_shared_shaper *ss;
+
+ ss = tm_shared_shaper_search(dev, params->shared_shaper_id[i]);
+ ss->n_users++;
+ }
+
+ if (n->wred_profile)
+ n->wred_profile->n_users++;
+
+ p->soft.tm.h.n_tm_nodes[n->level]++;
+
+ return 0;
+}
+
+/* Traffic manager node delete */
+static int
+pmd_tm_node_delete(struct rte_eth_dev *dev,
+ uint32_t node_id,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_node *n;
+ uint32_t i;
+
+ /* Check hierarchy changes are currently allowed */
+ if (p->soft.tm.hierarchy_frozen)
+ return -rte_tm_error_set(error,
+ EBUSY,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EBUSY));
+
+ /* Check existing */
+ n = tm_node_search(dev, node_id);
+ if (n == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Check unused */
+ if (n->n_children)
+ return -rte_tm_error_set(error,
+ EBUSY,
+ RTE_TM_ERROR_TYPE_NODE_ID,
+ NULL,
+ rte_strerror(EBUSY));
+
+ /* Update dependencies */
+ p->soft.tm.h.n_tm_nodes[n->level]--;
+
+ if (n->wred_profile)
+ n->wred_profile->n_users--;
+
+ for (i = 0; i < n->params.n_shared_shapers; i++) {
+ struct tm_shared_shaper *ss;
+
+ ss = tm_shared_shaper_search(dev,
+ n->params.shared_shaper_id[i]);
+ ss->n_users--;
+ }
+
+ if (n->shaper_profile)
+ n->shaper_profile->n_users--;
+
+ if (n->parent_node)
+ n->parent_node->n_children--;
+
+ /* Remove from list */
+ TAILQ_REMOVE(&p->soft.tm.h.nodes, n, node);
+ p->soft.tm.h.n_nodes--;
+ free(n);
+
+ return 0;
+}
+
+static void
+pipe_profile_build(struct rte_eth_dev *dev,
+ struct tm_node *np,
+ struct rte_sched_pipe_params *pp)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_hierarchy *h = &p->soft.tm.h;
+ struct tm_node_list *nl = &h->nodes;
+ struct tm_node *nt, *nq;
+
+ memset(pp, 0, sizeof(*pp));
+
+ /* Pipe */
+ pp->tb_rate = np->shaper_profile->params.peak.rate;
+ pp->tb_size = np->shaper_profile->params.peak.size;
+
+ /* Traffic Class (TC) */
+ pp->tc_period = PIPE_TC_PERIOD;
+
+#ifdef RTE_SCHED_SUBPORT_TC_OV
+ pp->tc_ov_weight = np->weight;
+#endif
+
+ TAILQ_FOREACH(nt, nl, node) {
+ uint32_t queue_id = 0;
+
+ if ((nt->level != TM_NODE_LEVEL_TC) ||
+ (nt->parent_node_id != np->node_id))
+ continue;
+
+ pp->tc_rate[nt->priority] =
+ nt->shaper_profile->params.peak.rate;
+
+ /* Queue */
+ TAILQ_FOREACH(nq, nl, node) {
+ uint32_t pipe_queue_id;
+
+ if ((nq->level != TM_NODE_LEVEL_QUEUE) ||
+ (nq->parent_node_id != nt->node_id))
+ continue;
+
+ pipe_queue_id = nt->priority *
+ RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS + queue_id;
+ pp->wrr_weights[pipe_queue_id] = nq->weight;
+
+ queue_id++;
+ }
+ }
+}
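+
+/*
+ * Example mapping (illustrative): with 4 queues per TC, the second
+ * queue child (queue_id = 1) of the TC with priority 2 is stored at
+ * wrr_weights[2 * 4 + 1] = wrr_weights[9] in the pipe profile.
+ */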
+
+static int
+pipe_profile_free_exists(struct rte_eth_dev *dev,
+ uint32_t *pipe_profile_id)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_params *t = &p->soft.tm.params;
+
+ if (t->n_pipe_profiles < RTE_SCHED_PIPE_PROFILES_PER_PORT) {
+ *pipe_profile_id = t->n_pipe_profiles;
+ return 1;
+ }
+
+ return 0;
+}
+
+static int
+pipe_profile_exists(struct rte_eth_dev *dev,
+ struct rte_sched_pipe_params *pp,
+ uint32_t *pipe_profile_id)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_params *t = &p->soft.tm.params;
+ uint32_t i;
+
+ for (i = 0; i < t->n_pipe_profiles; i++)
+ if (memcmp(&t->pipe_profiles[i], pp, sizeof(*pp)) == 0) {
+ if (pipe_profile_id)
+ *pipe_profile_id = i;
+ return 1;
+ }
+
+ return 0;
+}
+
+static void
+pipe_profile_install(struct rte_eth_dev *dev,
+ struct rte_sched_pipe_params *pp,
+ uint32_t pipe_profile_id)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_params *t = &p->soft.tm.params;
+
+ memcpy(&t->pipe_profiles[pipe_profile_id], pp, sizeof(*pp));
+ t->n_pipe_profiles++;
+}
+
+static void
+pipe_profile_mark(struct rte_eth_dev *dev,
+ uint32_t subport_id,
+ uint32_t pipe_id,
+ uint32_t pipe_profile_id)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_hierarchy *h = &p->soft.tm.h;
+ struct tm_params *t = &p->soft.tm.params;
+ uint32_t n_pipes_per_subport, pos;
+
+ n_pipes_per_subport = h->n_tm_nodes[TM_NODE_LEVEL_PIPE] /
+ h->n_tm_nodes[TM_NODE_LEVEL_SUBPORT];
+ pos = subport_id * n_pipes_per_subport + pipe_id;
+
+ t->pipe_to_profile[pos] = pipe_profile_id;
+}
+
+static struct rte_sched_pipe_params *
+pipe_profile_get(struct rte_eth_dev *dev, struct tm_node *np)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_hierarchy *h = &p->soft.tm.h;
+ struct tm_params *t = &p->soft.tm.params;
+ uint32_t n_pipes_per_subport = h->n_tm_nodes[TM_NODE_LEVEL_PIPE] /
+ h->n_tm_nodes[TM_NODE_LEVEL_SUBPORT];
+
+ uint32_t subport_id = tm_node_subport_id(dev, np->parent_node);
+ uint32_t pipe_id = tm_node_pipe_id(dev, np);
+
+ uint32_t pos = subport_id * n_pipes_per_subport + pipe_id;
+ uint32_t pipe_profile_id = t->pipe_to_profile[pos];
+
+ return &t->pipe_profiles[pipe_profile_id];
+}
+
+static int
+pipe_profiles_generate(struct rte_eth_dev *dev)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_hierarchy *h = &p->soft.tm.h;
+ struct tm_node_list *nl = &h->nodes;
+ struct tm_node *ns, *np;
+ uint32_t subport_id;
+
+ /* Objective: Fill in the following fields in struct tm_params:
+ * - pipe_profiles
+ * - n_pipe_profiles
+ * - pipe_to_profile
+ */
+
+ subport_id = 0;
+ TAILQ_FOREACH(ns, nl, node) {
+ uint32_t pipe_id;
+
+ if (ns->level != TM_NODE_LEVEL_SUBPORT)
+ continue;
+
+ pipe_id = 0;
+ TAILQ_FOREACH(np, nl, node) {
+ struct rte_sched_pipe_params pp;
+ uint32_t pos;
+
+ if ((np->level != TM_NODE_LEVEL_PIPE) ||
+ (np->parent_node_id != ns->node_id))
+ continue;
+
+ pipe_profile_build(dev, np, &pp);
+
+ if (!pipe_profile_exists(dev, &pp, &pos)) {
+ if (!pipe_profile_free_exists(dev, &pos))
+ return -1;
+
+ pipe_profile_install(dev, &pp, pos);
+ }
+
+ pipe_profile_mark(dev, subport_id, pipe_id, pos);
+
+ pipe_id++;
+ }
+
+ subport_id++;
+ }
+
+ return 0;
+}
+
+static struct tm_wred_profile *
+tm_tc_wred_profile_get(struct rte_eth_dev *dev, uint32_t tc_id)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_hierarchy *h = &p->soft.tm.h;
+ struct tm_node_list *nl = &h->nodes;
+ struct tm_node *nq;
+
+ TAILQ_FOREACH(nq, nl, node) {
+ if ((nq->level != TM_NODE_LEVEL_QUEUE) ||
+ (nq->parent_node->priority != tc_id))
+ continue;
+
+ return nq->wred_profile;
+ }
+
+ return NULL;
+}
+
+#ifdef RTE_SCHED_RED
+
+static void
+wred_profiles_set(struct rte_eth_dev *dev)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct rte_sched_port_params *pp = &p->soft.tm.params.port_params;
+ uint32_t tc_id;
+ enum rte_tm_color color;
+
+ for (tc_id = 0; tc_id < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; tc_id++)
+ for (color = RTE_TM_GREEN; color < RTE_TM_COLORS; color++) {
+ struct rte_red_params *dst =
+ &pp->red_params[tc_id][color];
+ struct tm_wred_profile *src_wp =
+ tm_tc_wred_profile_get(dev, tc_id);
+ struct rte_tm_red_params *src =
+ &src_wp->params.red_params[color];
+
+ memcpy(dst, src, sizeof(*dst));
+ }
+}
+
+#else
+
+#define wred_profiles_set(dev)
+
+#endif
+
+static struct tm_shared_shaper *
+tm_tc_shared_shaper_get(struct rte_eth_dev *dev, struct tm_node *tc_node)
+{
+ return (tc_node->params.n_shared_shapers) ?
+ tm_shared_shaper_search(dev,
+ tc_node->params.shared_shaper_id[0]) :
+ NULL;
+}
+
+static struct tm_shared_shaper *
+tm_subport_tc_shared_shaper_get(struct rte_eth_dev *dev,
+ struct tm_node *subport_node,
+ uint32_t tc_id)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_node_list *nl = &p->soft.tm.h.nodes;
+ struct tm_node *n;
+
+ TAILQ_FOREACH(n, nl, node) {
+ if ((n->level != TM_NODE_LEVEL_TC) ||
+ (n->parent_node->parent_node_id !=
+ subport_node->node_id) ||
+ (n->priority != tc_id))
+ continue;
+
+ return tm_tc_shared_shaper_get(dev, n);
+ }
+
+ return NULL;
+}
+
+static int
+hierarchy_commit_check(struct rte_eth_dev *dev, struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_hierarchy *h = &p->soft.tm.h;
+ struct tm_node_list *nl = &h->nodes;
+ struct tm_shared_shaper_list *ssl = &h->shared_shapers;
+ struct tm_wred_profile_list *wpl = &h->wred_profiles;
+ struct tm_node *nr = tm_root_node_present(dev), *ns, *np, *nt, *nq;
+ struct tm_shared_shaper *ss;
+
+ uint32_t n_pipes_per_subport;
+
+ /* Root node exists. */
+ if (nr == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_LEVEL_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* There is at least one subport, max is not exceeded. */
+ if ((nr->n_children == 0) || (nr->n_children > TM_MAX_SUBPORTS))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_LEVEL_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* There is at least one pipe. */
+ if (h->n_tm_nodes[TM_NODE_LEVEL_PIPE] == 0)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_LEVEL_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Number of pipes is the same for all subports. Maximum number of pipes
+ * per subport is not exceeded.
+ */
+ n_pipes_per_subport = h->n_tm_nodes[TM_NODE_LEVEL_PIPE] /
+ h->n_tm_nodes[TM_NODE_LEVEL_SUBPORT];
+
+ if (n_pipes_per_subport > TM_MAX_PIPES_PER_SUBPORT)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+
+ TAILQ_FOREACH(ns, nl, node) {
+ if (ns->level != TM_NODE_LEVEL_SUBPORT)
+ continue;
+
+ if (ns->n_children != n_pipes_per_subport)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ }
+
+ /* Each pipe has exactly 4 TCs, with exactly one TC for each priority */
+ TAILQ_FOREACH(np, nl, node) {
+ uint32_t mask = 0, mask_expected =
+ RTE_LEN2MASK(RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE,
+ uint32_t);
+
+ if (np->level != TM_NODE_LEVEL_PIPE)
+ continue;
+
+ if (np->n_children != RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+
+ TAILQ_FOREACH(nt, nl, node) {
+ if ((nt->level != TM_NODE_LEVEL_TC) ||
+ (nt->parent_node_id != np->node_id))
+ continue;
+
+ mask |= 1 << nt->priority;
+ }
+
+ if (mask != mask_expected)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ }
+
+ /* Each TC has exactly 4 packet queues. */
+ TAILQ_FOREACH(nt, nl, node) {
+ if (nt->level != TM_NODE_LEVEL_TC)
+ continue;
+
+ if (nt->n_children != RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ }
+
+ /**
+ * Shared shapers:
+ * -For each TC #i, all pipes in the same subport use the same
+ * shared shaper (or no shared shaper) for their TC#i.
+ * -Each shared shaper needs to have at least one user. All its
+ * users have to be TC nodes with the same priority and the same
+ * subport.
+ */
+ TAILQ_FOREACH(ns, nl, node) {
+ struct tm_shared_shaper *s[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
+ uint32_t id;
+
+ if (ns->level != TM_NODE_LEVEL_SUBPORT)
+ continue;
+
+ for (id = 0; id < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; id++)
+ s[id] = tm_subport_tc_shared_shaper_get(dev, ns, id);
+
+ TAILQ_FOREACH(nt, nl, node) {
+ struct tm_shared_shaper *subport_ss, *tc_ss;
+
+ if ((nt->level != TM_NODE_LEVEL_TC) ||
+ (nt->parent_node->parent_node_id !=
+ ns->node_id))
+ continue;
+
+ subport_ss = s[nt->priority];
+ tc_ss = tm_tc_shared_shaper_get(dev, nt);
+
+ if ((subport_ss == NULL) && (tc_ss == NULL))
+ continue;
+
+ if (((subport_ss == NULL) && (tc_ss != NULL)) ||
+ ((subport_ss != NULL) && (tc_ss == NULL)) ||
+ (subport_ss->shared_shaper_id !=
+ tc_ss->shared_shaper_id))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ }
+ }
+
+ TAILQ_FOREACH(ss, ssl, node) {
+ struct tm_node *nt_any = tm_shared_shaper_get_tc(dev, ss);
+ uint32_t n_users = 0;
+
+ if (nt_any != NULL)
+ TAILQ_FOREACH(nt, nl, node) {
+ if ((nt->level != TM_NODE_LEVEL_TC) ||
+ (nt->priority != nt_any->priority) ||
+ (nt->parent_node->parent_node_id !=
+ nt_any->parent_node->parent_node_id))
+ continue;
+
+ n_users++;
+ }
+
+ if ((ss->n_users == 0) || (ss->n_users != n_users))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ }
+
+ /* Not too many pipe profiles. */
+ if (pipe_profiles_generate(dev))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /**
+ * WRED (when used, i.e. at least one WRED profile defined):
+ * -Each WRED profile must have at least one user.
+ * -All leaf nodes must have their private WRED context enabled.
+ * -For each TC #i, all leaf nodes must use the same WRED profile
+ * for their private WRED context.
+ */
+ if (h->n_wred_profiles) {
+ struct tm_wred_profile *wp;
+ struct tm_wred_profile *w[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
+ uint32_t id;
+
+ TAILQ_FOREACH(wp, wpl, node)
+ if (wp->n_users == 0)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+
+ for (id = 0; id < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; id++) {
+ w[id] = tm_tc_wred_profile_get(dev, id);
+
+ if (w[id] == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ }
+
+ TAILQ_FOREACH(nq, nl, node) {
+ uint32_t id;
+
+ if (nq->level != TM_NODE_LEVEL_QUEUE)
+ continue;
+
+ id = nq->parent_node->priority;
+
+ if ((nq->wred_profile == NULL) ||
+ (nq->wred_profile->wred_profile_id !=
+ w[id]->wred_profile_id))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ }
+ }
+
+ return 0;
+}
+
+static void
+hierarchy_blueprints_create(struct rte_eth_dev *dev)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_params *t = &p->soft.tm.params;
+ struct tm_hierarchy *h = &p->soft.tm.h;
+
+ struct tm_node_list *nl = &h->nodes;
+ struct tm_node *root = tm_root_node_present(dev), *n;
+
+ uint32_t subport_id;
+
+ t->port_params = (struct rte_sched_port_params) {
+ .name = dev->data->name,
+ .socket = dev->data->numa_node,
+ .rate = root->shaper_profile->params.peak.rate,
+ .mtu = dev->data->mtu,
+ .frame_overhead =
+ root->shaper_profile->params.pkt_length_adjust,
+ .n_subports_per_port = root->n_children,
+ .n_pipes_per_subport = h->n_tm_nodes[TM_NODE_LEVEL_PIPE] /
+ h->n_tm_nodes[TM_NODE_LEVEL_SUBPORT],
+ .qsize = {p->params.soft.tm.qsize[0],
+ p->params.soft.tm.qsize[1],
+ p->params.soft.tm.qsize[2],
+ p->params.soft.tm.qsize[3],
+ },
+ .pipe_profiles = t->pipe_profiles,
+ .n_pipe_profiles = t->n_pipe_profiles,
+ };
+
+ wred_profiles_set(dev);
+
+ subport_id = 0;
+ TAILQ_FOREACH(n, nl, node) {
+ uint64_t tc_rate[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
+ uint32_t i;
+
+ if (n->level != TM_NODE_LEVEL_SUBPORT)
+ continue;
+
+ for (i = 0; i < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; i++) {
+ struct tm_shared_shaper *ss;
+ struct tm_shaper_profile *sp;
+
+ ss = tm_subport_tc_shared_shaper_get(dev, n, i);
+ sp = (ss) ? tm_shaper_profile_search(dev,
+ ss->shaper_profile_id) :
+ n->shaper_profile;
+ tc_rate[i] = sp->params.peak.rate;
+ }
+
+ t->subport_params[subport_id] =
+ (struct rte_sched_subport_params) {
+ .tb_rate = n->shaper_profile->params.peak.rate,
+ .tb_size = n->shaper_profile->params.peak.size,
+
+ .tc_rate = {tc_rate[0],
+ tc_rate[1],
+ tc_rate[2],
+ tc_rate[3],
+ },
+ .tc_period = SUBPORT_TC_PERIOD,
+ };
+
+ subport_id++;
+ }
+}
+
+/* Traffic manager hierarchy commit */
+static int
+pmd_tm_hierarchy_commit(struct rte_eth_dev *dev,
+ int clear_on_fail,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ int status;
+
+ /* Checks */
+ if (p->soft.tm.hierarchy_frozen)
+ return -rte_tm_error_set(error,
+ EBUSY,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EBUSY));
+
+ status = hierarchy_commit_check(dev, error);
+ if (status) {
+ if (clear_on_fail) {
+ tm_hierarchy_uninit(p);
+ tm_hierarchy_init(p);
+ }
+
+ return status;
+ }
+
+ /* Create blueprints */
+ hierarchy_blueprints_create(dev);
+
+ /* Freeze hierarchy */
+ p->soft.tm.hierarchy_frozen = 1;
+
+ return 0;
+}
+
+#ifdef RTE_SCHED_SUBPORT_TC_OV
+
+static int
+update_pipe_weight(struct rte_eth_dev *dev, struct tm_node *np, uint32_t weight)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ uint32_t pipe_id = tm_node_pipe_id(dev, np);
+
+ struct tm_node *ns = np->parent_node;
+ uint32_t subport_id = tm_node_subport_id(dev, ns);
+
+ struct rte_sched_pipe_params *profile0 = pipe_profile_get(dev, np);
+ struct rte_sched_pipe_params profile1;
+ uint32_t pipe_profile_id;
+
+ /* Derive new pipe profile. */
+ memcpy(&profile1, profile0, sizeof(profile1));
+ profile1.tc_ov_weight = (uint8_t)weight;
+
+ /* Since implementation does not allow adding more pipe profiles after
+ * port configuration, the pipe configuration can be successfully
+ * updated only if the new profile is also part of the existing set of
+ * pipe profiles.
+ */
+ if (pipe_profile_exists(dev, &profile1, &pipe_profile_id) == 0)
+ return -1;
+
+ /* Update the pipe profile used by the current pipe. */
+ if (rte_sched_pipe_config(p->soft.tm.sched, subport_id, pipe_id,
+ (int32_t)pipe_profile_id))
+ return -1;
+
+ /* Commit changes. */
+ pipe_profile_mark(dev, subport_id, pipe_id, pipe_profile_id);
+ np->weight = weight;
+
+ return 0;
+}
+
+#endif
+
+static int
+update_queue_weight(struct rte_eth_dev *dev,
+ struct tm_node *nq, uint32_t weight)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ uint32_t queue_id = tm_node_queue_id(dev, nq);
+
+ struct tm_node *nt = nq->parent_node;
+ uint32_t tc_id = tm_node_tc_id(dev, nt);
+
+ struct tm_node *np = nt->parent_node;
+ uint32_t pipe_id = tm_node_pipe_id(dev, np);
+
+ struct tm_node *ns = np->parent_node;
+ uint32_t subport_id = tm_node_subport_id(dev, ns);
+
+ uint32_t pipe_queue_id =
+ tc_id * RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS + queue_id;
+
+ struct rte_sched_pipe_params *profile0 = pipe_profile_get(dev, np);
+ struct rte_sched_pipe_params profile1;
+ uint32_t pipe_profile_id;
+
+ /* Derive new pipe profile. */
+ memcpy(&profile1, profile0, sizeof(profile1));
+ profile1.wrr_weights[pipe_queue_id] = (uint8_t)weight;
+
+ /* Since implementation does not allow adding more pipe profiles after
+ * port configuration, the pipe configuration can be successfully
+ * updated only if the new profile is also part of the existing set
+ * of pipe profiles.
+ */
+ if (pipe_profile_exists(dev, &profile1, &pipe_profile_id) == 0)
+ return -1;
+
+ /* Update the pipe profile used by the current pipe. */
+ if (rte_sched_pipe_config(p->soft.tm.sched, subport_id, pipe_id,
+ (int32_t)pipe_profile_id))
+ return -1;
+
+ /* Commit changes. */
+ pipe_profile_mark(dev, subport_id, pipe_id, pipe_profile_id);
+ nq->weight = weight;
+
+ return 0;
+}
+
+/* Traffic manager node parent update */
+static int
+pmd_tm_node_parent_update(struct rte_eth_dev *dev,
+ uint32_t node_id,
+ uint32_t parent_node_id,
+ uint32_t priority,
+ uint32_t weight,
+ struct rte_tm_error *error)
+{
+ struct tm_node *n;
+
+ /* Port must be started and TM used. */
+ if ((dev->data->dev_started == 0) && (tm_used(dev) == 0))
+ return -rte_tm_error_set(error,
+ EBUSY,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EBUSY));
+
+ /* Node must be valid */
+ n = tm_node_search(dev, node_id);
+ if (n == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Parent node must be the same */
+ if (n->parent_node_id != parent_node_id)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARENT_NODE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Priority must be the same */
+ if (n->priority != priority)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PRIORITY,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* weight: must be 1 .. 255 */
+ if ((weight == 0) || (weight >= UINT8_MAX))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_WEIGHT,
+ NULL,
+ rte_strerror(EINVAL));
+
+ switch (n->level) {
+ case TM_NODE_LEVEL_PORT:
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_WEIGHT,
+ NULL,
+ rte_strerror(EINVAL));
+ /* fall-through */
+ case TM_NODE_LEVEL_SUBPORT:
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_WEIGHT,
+ NULL,
+ rte_strerror(EINVAL));
+ /* fall-through */
+ case TM_NODE_LEVEL_PIPE:
+#ifdef RTE_SCHED_SUBPORT_TC_OV
+ if (update_pipe_weight(dev, n, weight))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ return 0;
+#else
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_WEIGHT,
+ NULL,
+ rte_strerror(EINVAL));
+#endif
+ /* fall-through */
+ case TM_NODE_LEVEL_TC:
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_WEIGHT,
+ NULL,
+ rte_strerror(EINVAL));
+ /* fall-through */
+ case TM_NODE_LEVEL_QUEUE:
+ /* fall-through */
+ default:
+ if (update_queue_weight(dev, n, weight))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ return 0;
+ }
+}
+
+static int
+update_subport_rate(struct rte_eth_dev *dev,
+ struct tm_node *ns,
+ struct tm_shaper_profile *sp)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ uint32_t subport_id = tm_node_subport_id(dev, ns);
+
+ struct rte_sched_subport_params subport_params;
+
+ /* Derive new subport configuration. */
+ memcpy(&subport_params,
+ &p->soft.tm.params.subport_params[subport_id],
+ sizeof(subport_params));
+ subport_params.tb_rate = sp->params.peak.rate;
+ subport_params.tb_size = sp->params.peak.size;
+
+ /* Update the subport configuration. */
+ if (rte_sched_subport_config(p->soft.tm.sched, subport_id,
+ &subport_params))
+ return -1;
+
+ /* Commit changes. */
+ ns->shaper_profile->n_users--;
+
+ ns->shaper_profile = sp;
+ ns->params.shaper_profile_id = sp->shaper_profile_id;
+ sp->n_users++;
+
+ memcpy(&p->soft.tm.params.subport_params[subport_id],
+ &subport_params,
+ sizeof(subport_params));
+
+ return 0;
+}
+
+static int
+update_pipe_rate(struct rte_eth_dev *dev,
+ struct tm_node *np,
+ struct tm_shaper_profile *sp)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ uint32_t pipe_id = tm_node_pipe_id(dev, np);
+
+ struct tm_node *ns = np->parent_node;
+ uint32_t subport_id = tm_node_subport_id(dev, ns);
+
+ struct rte_sched_pipe_params *profile0 = pipe_profile_get(dev, np);
+ struct rte_sched_pipe_params profile1;
+ uint32_t pipe_profile_id;
+
+ /* Derive new pipe profile. */
+ memcpy(&profile1, profile0, sizeof(profile1));
+ profile1.tb_rate = sp->params.peak.rate;
+ profile1.tb_size = sp->params.peak.size;
+
+ /* Since implementation does not allow adding more pipe profiles after
+ * port configuration, the pipe configuration can be successfully
+ * updated only if the new profile is also part of the existing set of
+ * pipe profiles.
+ */
+ if (pipe_profile_exists(dev, &profile1, &pipe_profile_id) == 0)
+ return -1;
+
+ /* Update the pipe profile used by the current pipe. */
+ if (rte_sched_pipe_config(p->soft.tm.sched, subport_id, pipe_id,
+ (int32_t)pipe_profile_id))
+ return -1;
+
+ /* Commit changes. */
+ pipe_profile_mark(dev, subport_id, pipe_id, pipe_profile_id);
+ np->shaper_profile->n_users--;
+ np->shaper_profile = sp;
+ np->params.shaper_profile_id = sp->shaper_profile_id;
+ sp->n_users++;
+
+ return 0;
+}
+
+static int
+update_tc_rate(struct rte_eth_dev *dev,
+ struct tm_node *nt,
+ struct tm_shaper_profile *sp)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ uint32_t tc_id = tm_node_tc_id(dev, nt);
+
+ struct tm_node *np = nt->parent_node;
+ uint32_t pipe_id = tm_node_pipe_id(dev, np);
+
+ struct tm_node *ns = np->parent_node;
+ uint32_t subport_id = tm_node_subport_id(dev, ns);
+
+ struct rte_sched_pipe_params *profile0 = pipe_profile_get(dev, np);
+ struct rte_sched_pipe_params profile1;
+ uint32_t pipe_profile_id;
+
+ /* Derive new pipe profile. */
+ memcpy(&profile1, profile0, sizeof(profile1));
+ profile1.tc_rate[tc_id] = sp->params.peak.rate;
+
+ /* Since implementation does not allow adding more pipe profiles after
+ * port configuration, the pipe configuration can be successfully
+ * updated only if the new profile is also part of the existing set of
+ * pipe profiles.
+ */
+ if (pipe_profile_exists(dev, &profile1, &pipe_profile_id) == 0)
+ return -1;
+
+ /* Update the pipe profile used by the current pipe. */
+ if (rte_sched_pipe_config(p->soft.tm.sched, subport_id, pipe_id,
+ (int32_t)pipe_profile_id))
+ return -1;
+
+ /* Commit changes. */
+ pipe_profile_mark(dev, subport_id, pipe_id, pipe_profile_id);
+ nt->shaper_profile->n_users--;
+ nt->shaper_profile = sp;
+ nt->params.shaper_profile_id = sp->shaper_profile_id;
+ sp->n_users++;
+
+ return 0;
+}
+
+/* Traffic manager node shaper update */
+static int
+pmd_tm_node_shaper_update(struct rte_eth_dev *dev,
+ uint32_t node_id,
+ uint32_t shaper_profile_id,
+ struct rte_tm_error *error)
+{
+ struct tm_node *n;
+ struct tm_shaper_profile *sp;
+
+ /* Port must be started and TM used. */
+ if ((dev->data->dev_started == 0) && (tm_used(dev) == 0))
+ return -rte_tm_error_set(error,
+ EBUSY,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EBUSY));
+
+ /* Node must be valid */
+ n = tm_node_search(dev, node_id);
+ if (n == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Shaper profile must be valid. */
+ sp = tm_shaper_profile_search(dev, shaper_profile_id);
+ if (sp == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_SHAPER_PROFILE,
+ NULL,
+ rte_strerror(EINVAL));
+
+ switch (n->level) {
+ case TM_NODE_LEVEL_PORT:
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ /* fall-through */
+ case TM_NODE_LEVEL_SUBPORT:
+ if (update_subport_rate(dev, n, sp))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ return 0;
+ /* fall-through */
+ case TM_NODE_LEVEL_PIPE:
+ if (update_pipe_rate(dev, n, sp))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ return 0;
+ /* fall-through */
+ case TM_NODE_LEVEL_TC:
+ if (update_tc_rate(dev, n, sp))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ return 0;
+ /* fall-through */
+ case TM_NODE_LEVEL_QUEUE:
+ /* fall-through */
+ default:
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ }
+}
+
+static inline uint32_t
+tm_port_queue_id(struct rte_eth_dev *dev,
+ uint32_t port_subport_id,
+ uint32_t subport_pipe_id,
+ uint32_t pipe_tc_id,
+ uint32_t tc_queue_id)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_hierarchy *h = &p->soft.tm.h;
+ uint32_t n_pipes_per_subport = h->n_tm_nodes[TM_NODE_LEVEL_PIPE] /
+ h->n_tm_nodes[TM_NODE_LEVEL_SUBPORT];
+
+ uint32_t port_pipe_id =
+ port_subport_id * n_pipes_per_subport + subport_pipe_id;
+ uint32_t port_tc_id =
+ port_pipe_id * RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE + pipe_tc_id;
+ uint32_t port_queue_id =
+ port_tc_id * RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS + tc_queue_id;
+
+ return port_queue_id;
+}
+
+static int
+read_port_stats(struct rte_eth_dev *dev,
+ struct tm_node *nr,
+ struct rte_tm_node_stats *stats,
+ uint64_t *stats_mask,
+ int clear)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_hierarchy *h = &p->soft.tm.h;
+ uint32_t n_subports_per_port = h->n_tm_nodes[TM_NODE_LEVEL_SUBPORT];
+ uint32_t subport_id;
+
+ for (subport_id = 0; subport_id < n_subports_per_port; subport_id++) {
+ struct rte_sched_subport_stats s;
+ uint32_t tc_ov, id;
+
+ /* Stats read */
+ int status = rte_sched_subport_read_stats(
+ p->soft.tm.sched,
+ subport_id,
+ &s,
+ &tc_ov);
+ if (status)
+ return status;
+
+ /* Stats accumulate */
+ for (id = 0; id < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; id++) {
+ nr->stats.n_pkts +=
+ s.n_pkts_tc[id] - s.n_pkts_tc_dropped[id];
+ nr->stats.n_bytes +=
+ s.n_bytes_tc[id] - s.n_bytes_tc_dropped[id];
+ nr->stats.leaf.n_pkts_dropped[RTE_TM_GREEN] +=
+ s.n_pkts_tc_dropped[id];
+ nr->stats.leaf.n_bytes_dropped[RTE_TM_GREEN] +=
+ s.n_bytes_tc_dropped[id];
+ }
+ }
+
+ /* Stats copy */
+ if (stats)
+ memcpy(stats, &nr->stats, sizeof(*stats));
+
+ if (stats_mask)
+ *stats_mask = STATS_MASK_DEFAULT;
+
+ /* Stats clear */
+ if (clear)
+ memset(&nr->stats, 0, sizeof(nr->stats));
+
+ return 0;
+}
+
+static int
+read_subport_stats(struct rte_eth_dev *dev,
+ struct tm_node *ns,
+ struct rte_tm_node_stats *stats,
+ uint64_t *stats_mask,
+ int clear)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ uint32_t subport_id = tm_node_subport_id(dev, ns);
+ struct rte_sched_subport_stats s;
+ uint32_t tc_ov, tc_id;
+
+ /* Stats read */
+ int status = rte_sched_subport_read_stats(
+ p->soft.tm.sched,
+ subport_id,
+ &s,
+ &tc_ov);
+ if (status)
+ return status;
+
+ /* Stats accumulate */
+ for (tc_id = 0; tc_id < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; tc_id++) {
+ ns->stats.n_pkts +=
+ s.n_pkts_tc[tc_id] - s.n_pkts_tc_dropped[tc_id];
+ ns->stats.n_bytes +=
+ s.n_bytes_tc[tc_id] - s.n_bytes_tc_dropped[tc_id];
+ ns->stats.leaf.n_pkts_dropped[RTE_TM_GREEN] +=
+ s.n_pkts_tc_dropped[tc_id];
+ ns->stats.leaf.n_bytes_dropped[RTE_TM_GREEN] +=
+ s.n_bytes_tc_dropped[tc_id];
+ }
+
+ /* Stats copy */
+ if (stats)
+ memcpy(stats, &ns->stats, sizeof(*stats));
+
+ if (stats_mask)
+ *stats_mask = STATS_MASK_DEFAULT;
+
+ /* Stats clear */
+ if (clear)
+ memset(&ns->stats, 0, sizeof(ns->stats));
+
+ return 0;
+}
+
+static int
+read_pipe_stats(struct rte_eth_dev *dev,
+ struct tm_node *np,
+ struct rte_tm_node_stats *stats,
+ uint64_t *stats_mask,
+ int clear)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ uint32_t pipe_id = tm_node_pipe_id(dev, np);
+
+ struct tm_node *ns = np->parent_node;
+ uint32_t subport_id = tm_node_subport_id(dev, ns);
+
+ uint32_t i;
+
+ /* Stats read */
+ for (i = 0; i < RTE_SCHED_QUEUES_PER_PIPE; i++) {
+ struct rte_sched_queue_stats s;
+ uint16_t qlen;
+
+ uint32_t qid = tm_port_queue_id(dev,
+ subport_id,
+ pipe_id,
+ i / RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS,
+ i % RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS);
+
+ int status = rte_sched_queue_read_stats(
+ p->soft.tm.sched,
+ qid,
+ &s,
+ &qlen);
+ if (status)
+ return status;
+
+ /* Stats accumulate */
+ np->stats.n_pkts += s.n_pkts - s.n_pkts_dropped;
+ np->stats.n_bytes += s.n_bytes - s.n_bytes_dropped;
+ np->stats.leaf.n_pkts_dropped[RTE_TM_GREEN] += s.n_pkts_dropped;
+ np->stats.leaf.n_bytes_dropped[RTE_TM_GREEN] +=
+ s.n_bytes_dropped;
+ np->stats.leaf.n_pkts_queued = qlen;
+ }
+
+ /* Stats copy */
+ if (stats)
+ memcpy(stats, &np->stats, sizeof(*stats));
+
+ if (stats_mask)
+ *stats_mask = STATS_MASK_DEFAULT;
+
+ /* Stats clear */
+ if (clear)
+ memset(&np->stats, 0, sizeof(np->stats));
+
+ return 0;
+}
+
+static int
+read_tc_stats(struct rte_eth_dev *dev,
+ struct tm_node *nt,
+ struct rte_tm_node_stats *stats,
+ uint64_t *stats_mask,
+ int clear)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ uint32_t tc_id = tm_node_tc_id(dev, nt);
+
+ struct tm_node *np = nt->parent_node;
+ uint32_t pipe_id = tm_node_pipe_id(dev, np);
+
+ struct tm_node *ns = np->parent_node;
+ uint32_t subport_id = tm_node_subport_id(dev, ns);
+
+ uint32_t i;
+
+ /* Stats read */
+ for (i = 0; i < RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS; i++) {
+ struct rte_sched_queue_stats s;
+ uint16_t qlen;
+
+ uint32_t qid = tm_port_queue_id(dev,
+ subport_id,
+ pipe_id,
+ tc_id,
+ i);
+
+ int status = rte_sched_queue_read_stats(
+ p->soft.tm.sched,
+ qid,
+ &s,
+ &qlen);
+ if (status)
+ return status;
+
+ /* Stats accumulate */
+ nt->stats.n_pkts += s.n_pkts - s.n_pkts_dropped;
+ nt->stats.n_bytes += s.n_bytes - s.n_bytes_dropped;
+ nt->stats.leaf.n_pkts_dropped[RTE_TM_GREEN] += s.n_pkts_dropped;
+ nt->stats.leaf.n_bytes_dropped[RTE_TM_GREEN] +=
+ s.n_bytes_dropped;
+ nt->stats.leaf.n_pkts_queued = qlen;
+ }
+
+ /* Stats copy */
+ if (stats)
+ memcpy(stats, &nt->stats, sizeof(*stats));
+
+ if (stats_mask)
+ *stats_mask = STATS_MASK_DEFAULT;
+
+ /* Stats clear */
+ if (clear)
+ memset(&nt->stats, 0, sizeof(nt->stats));
+
+ return 0;
+}
+
+static int
+read_queue_stats(struct rte_eth_dev *dev,
+ struct tm_node *nq,
+ struct rte_tm_node_stats *stats,
+ uint64_t *stats_mask,
+ int clear)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct rte_sched_queue_stats s;
+ uint16_t qlen;
+
+ uint32_t queue_id = tm_node_queue_id(dev, nq);
+
+ struct tm_node *nt = nq->parent_node;
+ uint32_t tc_id = tm_node_tc_id(dev, nt);
+
+ struct tm_node *np = nt->parent_node;
+ uint32_t pipe_id = tm_node_pipe_id(dev, np);
+
+ struct tm_node *ns = np->parent_node;
+ uint32_t subport_id = tm_node_subport_id(dev, ns);
+
+ /* Stats read */
+ uint32_t qid = tm_port_queue_id(dev,
+ subport_id,
+ pipe_id,
+ tc_id,
+ queue_id);
+
+ int status = rte_sched_queue_read_stats(
+ p->soft.tm.sched,
+ qid,
+ &s,
+ &qlen);
+ if (status)
+ return status;
+
+ /* Stats accumulate */
+ nq->stats.n_pkts += s.n_pkts - s.n_pkts_dropped;
+ nq->stats.n_bytes += s.n_bytes - s.n_bytes_dropped;
+ nq->stats.leaf.n_pkts_dropped[RTE_TM_GREEN] += s.n_pkts_dropped;
+ nq->stats.leaf.n_bytes_dropped[RTE_TM_GREEN] +=
+ s.n_bytes_dropped;
+ nq->stats.leaf.n_pkts_queued = qlen;
+
+ /* Stats copy */
+ if (stats)
+ memcpy(stats, &nq->stats, sizeof(*stats));
+
+ if (stats_mask)
+ *stats_mask = STATS_MASK_QUEUE;
+
+ /* Stats clear */
+ if (clear)
+ memset(&nq->stats, 0, sizeof(nq->stats));
+
+ return 0;
+}
+
+/* Traffic manager read stats counters for specific node */
+static int
+pmd_tm_node_stats_read(struct rte_eth_dev *dev,
+ uint32_t node_id,
+ struct rte_tm_node_stats *stats,
+ uint64_t *stats_mask,
+ int clear,
+ struct rte_tm_error *error)
+{
+ struct tm_node *n;
+
+ /* Port must be started and TM used. */
+ if ((dev->data->dev_started == 0) && (tm_used(dev) == 0))
+ return -rte_tm_error_set(error,
+ EBUSY,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EBUSY));
+
+ /* Node must be valid */
+ n = tm_node_search(dev, node_id);
+ if (n == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ switch (n->level) {
+ case TM_NODE_LEVEL_PORT:
+ if (read_port_stats(dev, n, stats, stats_mask, clear))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ return 0;
+
+ case TM_NODE_LEVEL_SUBPORT:
+ if (read_subport_stats(dev, n, stats, stats_mask, clear))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ return 0;
+
+ case TM_NODE_LEVEL_PIPE:
+ if (read_pipe_stats(dev, n, stats, stats_mask, clear))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ return 0;
+
+ case TM_NODE_LEVEL_TC:
+ if (read_tc_stats(dev, n, stats, stats_mask, clear))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ return 0;
+
+ case TM_NODE_LEVEL_QUEUE:
+ default:
+ if (read_queue_stats(dev, n, stats, stats_mask, clear))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ return 0;
+ }
+}
+
+const struct rte_tm_ops pmd_tm_ops = {
+ .node_type_get = pmd_tm_node_type_get,
+ .capabilities_get = pmd_tm_capabilities_get,
+ .level_capabilities_get = pmd_tm_level_capabilities_get,
+ .node_capabilities_get = pmd_tm_node_capabilities_get,
+
+ .wred_profile_add = pmd_tm_wred_profile_add,
+ .wred_profile_delete = pmd_tm_wred_profile_delete,
+ .shared_wred_context_add_update = NULL,
+ .shared_wred_context_delete = NULL,
+
+ .shaper_profile_add = pmd_tm_shaper_profile_add,
+ .shaper_profile_delete = pmd_tm_shaper_profile_delete,
+ .shared_shaper_add_update = pmd_tm_shared_shaper_add_update,
+ .shared_shaper_delete = pmd_tm_shared_shaper_delete,
+
+ .node_add = pmd_tm_node_add,
+ .node_delete = pmd_tm_node_delete,
+ .node_suspend = NULL,
+ .node_resume = NULL,
+ .hierarchy_commit = pmd_tm_hierarchy_commit,
+
+ .node_parent_update = pmd_tm_node_parent_update,
+ .node_shaper_update = pmd_tm_node_shaper_update,
+ .node_shared_shaper_update = NULL,
+ .node_stats_update = NULL,
+ .node_wfq_weight_mode_update = NULL,
+ .node_cman_update = NULL,
+ .node_wred_context_update = NULL,
+ .node_shared_wred_context_update = NULL,
+
+ .node_stats_read = pmd_tm_node_stats_read,
};
--
2.9.3
^ permalink raw reply [flat|nested] 79+ messages in thread
* [dpdk-dev] [PATCH v5 5/5] app/testpmd: add traffic management forwarding mode
2017-09-29 14:04 ` [dpdk-dev] [PATCH v5 0/5] net/softnic: sw fall-back pmd for traffic mgmt and others Jasvinder Singh
` (3 preceding siblings ...)
2017-09-29 14:04 ` [dpdk-dev] [PATCH v5 4/5] net/softnic: add TM hierarchy related ops Jasvinder Singh
@ 2017-09-29 14:04 ` Jasvinder Singh
4 siblings, 0 replies; 79+ messages in thread
From: Jasvinder Singh @ 2017-09-29 14:04 UTC (permalink / raw)
To: dev; +Cc: cristian.dumitrescu, ferruh.yigit, thomas
This commit extends the testpmd application with a new forwarding engine
that demonstrates the use of the ethdev traffic management APIs and the
softnic PMD for QoS traffic management.
In this mode, a 5-level hierarchical tree of the QoS scheduler is built
with the help of the ethdev TM APIs such as shaper profile add/delete,
shared shaper add/update, node add/delete, hierarchy commit, etc.
The hierarchical tree has the following nodes: root node (x1, level 0),
subport node (x1, level 1), pipe node (x4096, level 2),
tc node (x16384, level 3), queue node (x65536, level 4).
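For reference, the TM API call sequence used by this engine boils down to
registering shaper profiles, adding nodes top-down and committing the
hierarchy. A minimal, illustrative C sketch of that flow is given below;
node IDs, level IDs and rates are placeholder values, and the softnic
commit check requires all five levels to be present, so the real code in
tm.c below extends this same pattern down to the queue nodes:

#include <string.h>
#include <stdint.h>
#include <rte_tm.h>

/* Illustrative only: register one shaper profile, add a root (port level)
 * node and one subport node under it, then commit the hierarchy.
 */
static int
tm_hierarchy_sketch(uint16_t port_id, struct rte_tm_error *error)
{
	struct rte_tm_shaper_params sp;
	struct rte_tm_node_params np;

	memset(&sp, 0, sizeof(sp));
	sp.peak.rate = 1250000000;    /* 10 Gbps in bytes/sec, example value */
	sp.peak.size = 1000000;       /* token bucket size, example value */

	memset(&np, 0, sizeof(np));
	np.shaper_profile_id = 0;
	np.nonleaf.n_sp_priorities = 1;

	/* 1. Register the shaper profile. */
	if (rte_tm_shaper_profile_add(port_id, 0, &sp, error))
		return -1;

	/* 2. Add the root node (no parent, level 0). */
	if (rte_tm_node_add(port_id, 1000000, RTE_TM_NODE_ID_NULL,
			0, 1, 0, &np, error))
		return -1;

	/* 3. Add one subport node under the root (level 1). */
	if (rte_tm_node_add(port_id, 900000, 1000000, 0, 1, 1, &np, error))
		return -1;

	/* ... pipe, tc and queue nodes are added the same way ... */

	/* 4. Freeze the hierarchy; softnic builds its rte_sched port here. */
	return rte_tm_hierarchy_commit(port_id, 1 /* clear_on_fail */, error);
}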
During runtime, each received packet is first classified by mapping the
packet field information to a 5-tuple (HQoS subport, pipe, traffic class,
queue within traffic class, and color), which is stored in the packet
mbuf sched field. After classification, each packet is sent to the
softnic port, which prioritizes the transmission of the received packets
and sends them out accordingly on the output interface.
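The sched word layout matches the RTE_SCHED_PORT_HIERARCHY macro defined
in tm.c below; as a minimal illustration (the helper name is hypothetical),
classifying one packet amounts to packing the 5-tuple into 64 bits and
splitting it across the mbuf sched field:

#include <stdint.h>
#include <rte_mbuf.h>

/* Pack (subport, pipe, tc, queue, color) into a 64-bit sched word and
 * store it in the mbuf, mirroring what pkt_metadata_set() in tm.c does.
 * Bit layout: queue[1:0], tc[3:2], color[5:4], subport[31:16], pipe[63:32].
 */
static inline void
mbuf_sched_set(struct rte_mbuf *m, uint32_t subport, uint32_t pipe,
	uint32_t tc, uint32_t queue, uint32_t color)
{
	uint64_t sched = ((uint64_t)queue & 0x3) |
		(((uint64_t)tc & 0x3) << 2) |
		(((uint64_t)color & 0x3) << 4) |
		(((uint64_t)subport & 0xFFFF) << 16) |
		(((uint64_t)pipe & 0xFFFFFFFF) << 32);

	m->hash.sched.lo = sched & 0xFFFFFFFF;
	m->hash.sched.hi = sched >> 32;
}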
To enable the traffic management mode, the following testpmd command is used:
$ ./testpmd -c c -n 4 --vdev
'net_softnic0,hard_name=0000:06:00.1,soft_tm=on' -- -i
--forward-mode=tm
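Once the tm forwarding mode is active, the default 5-level hierarchy can
then be set up from the testpmd prompt with the new command added in v5:
"set port tm hierarchy default <port_id>".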
Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
---
v5 change:
- add CLI to enable default tm hierarchy
v3 change:
- Implement feedback from Pablo [1]
- add flag to check required librte_sched lib and softnic pmd
- code cleanup
v2 change:
- change file name softnictm.c to tm.c
- change forward mode name to "tm"
- code clean up
[1] http://dpdk.org/ml/archives/dev/2017-September/075744.html
app/test-pmd/Makefile | 8 +
app/test-pmd/cmdline.c | 88 +++++
app/test-pmd/testpmd.c | 15 +
app/test-pmd/testpmd.h | 46 +++
app/test-pmd/tm.c | 865 +++++++++++++++++++++++++++++++++++++++++++++++++
5 files changed, 1022 insertions(+)
create mode 100644 app/test-pmd/tm.c
diff --git a/app/test-pmd/Makefile b/app/test-pmd/Makefile
index c36be19..2c50f68 100644
--- a/app/test-pmd/Makefile
+++ b/app/test-pmd/Makefile
@@ -59,6 +59,10 @@ SRCS-y += csumonly.c
SRCS-y += icmpecho.c
SRCS-$(CONFIG_RTE_LIBRTE_IEEE1588) += ieee1588fwd.c
+ifeq ($(CONFIG_RTE_LIBRTE_PMD_SOFTNIC)$(CONFIG_RTE_LIBRTE_SCHED),yy)
+SRCS-y += tm.c
+endif
+
ifeq ($(CONFIG_RTE_BUILD_SHARED_LIB),y)
ifeq ($(CONFIG_RTE_LIBRTE_PMD_BOND),y)
@@ -81,6 +85,10 @@ ifeq ($(CONFIG_RTE_LIBRTE_PMD_XENVIRT),y)
LDLIBS += -lrte_pmd_xenvirt
endif
+ifeq ($(CONFIG_RTE_LIBRTE_PMD_SOFTNIC),y)
+LDLIBS += -lrte_pmd_softnic
+endif
+
endif
CFLAGS_cmdline.o := -D_GNU_SOURCE
diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index ccdf239..435ee0d 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -619,6 +619,11 @@ static void cmd_help_long_parsed(void *parsed_result,
"E-tag set filter del e-tag-id (value) port (port_id)\n"
" Delete an E-tag forwarding filter on a port\n\n"
+#if defined RTE_LIBRTE_PMD_SOFTNIC && defined RTE_LIBRTE_SCHED
+ "set port tm hierarchy default (port_id)\n"
+ " Set default traffic Management hierarchy on a port\n\n"
+
+#endif
"ddp add (port_id) (profile_path[,output_path])\n"
" Load a profile package on a port\n\n"
@@ -13185,6 +13190,86 @@ cmdline_parse_inst_t cmd_vf_tc_max_bw = {
},
};
+
+#if defined RTE_LIBRTE_PMD_SOFTNIC && defined RTE_LIBRTE_SCHED
+
+/* *** Set Port default Traffic Management Hierarchy *** */
+struct cmd_set_port_tm_hierarchy_default_result {
+ cmdline_fixed_string_t set;
+ cmdline_fixed_string_t port;
+ cmdline_fixed_string_t tm;
+ cmdline_fixed_string_t hierarchy;
+ cmdline_fixed_string_t def;
+ uint8_t port_id;
+};
+
+cmdline_parse_token_string_t cmd_set_port_tm_hierarchy_default_set =
+ TOKEN_STRING_INITIALIZER(
+ struct cmd_set_port_tm_hierarchy_default_result, set, "set");
+cmdline_parse_token_string_t cmd_set_port_tm_hierarchy_default_port =
+ TOKEN_STRING_INITIALIZER(
+ struct cmd_set_port_tm_hierarchy_default_result, port, "port");
+cmdline_parse_token_string_t cmd_set_port_tm_hierarchy_default_tm =
+ TOKEN_STRING_INITIALIZER(
+ struct cmd_set_port_tm_hierarchy_default_result, tm, "tm");
+cmdline_parse_token_string_t cmd_set_port_tm_hierarchy_default_hierarchy =
+ TOKEN_STRING_INITIALIZER(
+ struct cmd_set_port_tm_hierarchy_default_result,
+ hierarchy, "hierarchy");
+cmdline_parse_token_string_t cmd_set_port_tm_hierarchy_default_default =
+ TOKEN_STRING_INITIALIZER(
+ struct cmd_set_port_tm_hierarchy_default_result,
+ def, "default");
+cmdline_parse_token_num_t cmd_set_port_tm_hierarchy_default_port_id =
+ TOKEN_NUM_INITIALIZER(
+ struct cmd_set_port_tm_hierarchy_default_result,
+ port_id, UINT8);
+
+static void cmd_set_port_tm_hierarchy_default_parsed(void *parsed_result,
+ __attribute__((unused)) struct cmdline *cl,
+ __attribute__((unused)) void *data)
+{
+ struct cmd_set_port_tm_hierarchy_default_result *res = parsed_result;
+ struct rte_port *p;
+ uint8_t port_id = res->port_id;
+
+ if (port_id_is_invalid(port_id, ENABLED_WARN))
+ return;
+
+ p = &ports[port_id];
+
+ /* Port tm flag */
+ if (p->softport.tm_flag == 0) {
+ printf(" tm not enabled on port %u (error)\n", port_id);
+ return;
+ }
+
+ /* Forward mode: tm */
+ if (strcmp(cur_fwd_config.fwd_eng->fwd_mode_name, "tm")) {
+ printf(" tm mode not enabled(error)\n");
+ return;
+ }
+
+ /* Set the default tm hierarchy */
+ p->softport.tm.default_hierarchy_enable = 1;
+}
+
+cmdline_parse_inst_t cmd_set_port_tm_hierarchy_default = {
+ .f = cmd_set_port_tm_hierarchy_default_parsed,
+ .data = NULL,
+ .help_str = "set port tm hierarchy default <port_id>",
+ .tokens = {
+ (void *)&cmd_set_port_tm_hierarchy_default_set,
+ (void *)&cmd_set_port_tm_hierarchy_default_port,
+ (void *)&cmd_set_port_tm_hierarchy_default_tm,
+ (void *)&cmd_set_port_tm_hierarchy_default_hierarchy,
+ (void *)&cmd_set_port_tm_hierarchy_default_default,
+ (void *)&cmd_set_port_tm_hierarchy_default_port_id,
+ NULL,
+ },
+};
+#endif
+
/* Strict link priority scheduling mode setting */
static void
cmd_strict_link_prio_parsed(
@@ -14370,6 +14455,9 @@ cmdline_parse_ctx_t main_ctx[] = {
(cmdline_parse_inst_t *)&cmd_vf_tc_max_bw,
(cmdline_parse_inst_t *)&cmd_strict_link_prio,
(cmdline_parse_inst_t *)&cmd_tc_min_bw,
+#if defined RTE_LIBRTE_PMD_SOFTNIC && defined RTE_LIBRTE_SCHED
+ (cmdline_parse_inst_t *)&cmd_set_port_tm_hierarchy_default,
+#endif
(cmdline_parse_inst_t *)&cmd_ddp_add,
(cmdline_parse_inst_t *)&cmd_ddp_del,
(cmdline_parse_inst_t *)&cmd_ddp_get_list,
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index e097ee0..d729267 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -167,6 +167,10 @@ struct fwd_engine * fwd_engines[] = {
&tx_only_engine,
&csum_fwd_engine,
&icmp_echo_engine,
+#if defined RTE_LIBRTE_PMD_SOFTNIC && defined RTE_LIBRTE_SCHED
+ &softnic_tm_engine,
+ &softnic_tm_bypass_engine,
+#endif
#ifdef RTE_LIBRTE_IEEE1588
&ieee1588_fwd_engine,
#endif
@@ -2085,6 +2089,17 @@ init_port_config(void)
(rte_eth_devices[pid].data->dev_flags &
RTE_ETH_DEV_INTR_RMV))
port->dev_conf.intr_conf.rmv = 1;
+
+#if defined RTE_LIBRTE_PMD_SOFTNIC && defined RTE_LIBRTE_SCHED
+ /* Detect softnic port */
+ if (!strcmp(port->dev_info.driver_name, "net_softnic")) {
+ port->softnic_enable = 1;
+ memset(&port->softport, 0, sizeof(struct softnic_port));
+
+ if (!strcmp(cur_fwd_eng->fwd_mode_name, "tm"))
+ port->softport.tm_flag = 1;
+ }
+#endif
}
}
diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
index 1d1ee75..bc7c6e2 100644
--- a/app/test-pmd/testpmd.h
+++ b/app/test-pmd/testpmd.h
@@ -84,6 +84,12 @@ typedef uint16_t streamid_t;
#define MAX_QUEUE_ID ((1 << (sizeof(queueid_t) * 8)) - 1)
+#if defined RTE_LIBRTE_PMD_SOFTNIC && defined RTE_LIBRTE_SCHED
+#define TM_MODE 1
+#else
+#define TM_MODE 0
+#endif
+
enum {
PORT_TOPOLOGY_PAIRED,
PORT_TOPOLOGY_CHAINED,
@@ -162,6 +168,38 @@ struct port_flow {
uint8_t data[]; /**< Storage for pattern/actions. */
};
+#ifdef TM_MODE
+/**
+ * Soft port tm related parameters
+ */
+struct softnic_port_tm {
+ uint32_t default_hierarchy_enable; /**< def hierarchy enable flag */
+ uint32_t hierarchy_config; /**< set to 1 if hierarchy configured */
+
+ uint32_t n_subports_per_port; /**< Num of subport nodes per port */
+ uint32_t n_pipes_per_subport; /**< Num of pipe nodes per subport */
+
+ uint64_t tm_pktfield0_slabpos; /**< Pkt field position for subport */
+ uint64_t tm_pktfield0_slabmask; /**< Pkt field mask for the subport */
+ uint64_t tm_pktfield0_slabshr;
+ uint64_t tm_pktfield1_slabpos; /**< Pkt field position for the pipe */
+ uint64_t tm_pktfield1_slabmask; /**< Pkt field mask for the pipe */
+ uint64_t tm_pktfield1_slabshr;
+ uint64_t tm_pktfield2_slabpos; /**< Pkt field position table index */
+ uint64_t tm_pktfield2_slabmask; /**< Pkt field mask for tc table idx */
+ uint64_t tm_pktfield2_slabshr;
+ uint64_t tm_tc_table[64]; /**< TC translation table */
+};
+
+/**
+ * The data structure associate with softnic port
+ */
+struct softnic_port {
+ unsigned int tm_flag; /**< set to 1 if tm feature is enabled */
+ struct softnic_port_tm tm; /**< softnic port tm parameters */
+};
+#endif
+
/**
* The data structure associated with each port.
*/
@@ -195,6 +233,10 @@ struct rte_port {
uint32_t mc_addr_nb; /**< nb. of addr. in mc_addr_pool */
uint8_t slave_flag; /**< bonding slave port */
struct port_flow *flow_list; /**< Associated flows. */
+#ifdef TM_MODE
+ unsigned int softnic_enable; /**< softnic flag */
+ struct softnic_port softport; /**< softnic port params */
+#endif
};
/**
@@ -253,6 +295,10 @@ extern struct fwd_engine rx_only_engine;
extern struct fwd_engine tx_only_engine;
extern struct fwd_engine csum_fwd_engine;
extern struct fwd_engine icmp_echo_engine;
+#ifdef TM_MODE
+extern struct fwd_engine softnic_tm_engine;
+extern struct fwd_engine softnic_tm_bypass_engine;
+#endif
#ifdef RTE_LIBRTE_IEEE1588
extern struct fwd_engine ieee1588_fwd_engine;
#endif
diff --git a/app/test-pmd/tm.c b/app/test-pmd/tm.c
new file mode 100644
index 0000000..9021e26
--- /dev/null
+++ b/app/test-pmd/tm.c
@@ -0,0 +1,865 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright(c) 2017 Intel Corporation. All rights reserved.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+#include <stdio.h>
+#include <sys/stat.h>
+
+#include <rte_cycles.h>
+#include <rte_mbuf.h>
+#include <rte_ethdev.h>
+#include <rte_flow.h>
+#include <rte_meter.h>
+#include <rte_eth_softnic.h>
+#include <rte_tm.h>
+
+#include "testpmd.h"
+
+#define SUBPORT_NODES_PER_PORT 1
+#define PIPE_NODES_PER_SUBPORT 4096
+#define TC_NODES_PER_PIPE 4
+#define QUEUE_NODES_PER_TC 4
+
+#define NUM_PIPE_NODES \
+ (SUBPORT_NODES_PER_PORT * PIPE_NODES_PER_SUBPORT)
+
+#define NUM_TC_NODES \
+ (NUM_PIPE_NODES * TC_NODES_PER_PIPE)
+
+#define ROOT_NODE_ID 1000000
+#define SUBPORT_NODES_START_ID 900000
+#define PIPE_NODES_START_ID 800000
+#define TC_NODES_START_ID 700000
+
+#define STATS_MASK_DEFAULT \
+ (RTE_TM_STATS_N_PKTS | \
+ RTE_TM_STATS_N_BYTES | \
+ RTE_TM_STATS_N_PKTS_GREEN_DROPPED | \
+ RTE_TM_STATS_N_BYTES_GREEN_DROPPED)
+
+#define STATS_MASK_QUEUE \
+ (STATS_MASK_DEFAULT | \
+ RTE_TM_STATS_N_PKTS_QUEUED)
+
+#define BYTES_IN_MBPS (1000 * 1000 / 8)
+#define TOKEN_BUCKET_SIZE 1000000
+
+/* TM Hierarchy Levels */
+enum tm_hierarchy_level {
+ TM_NODE_LEVEL_PORT = 0,
+ TM_NODE_LEVEL_SUBPORT,
+ TM_NODE_LEVEL_PIPE,
+ TM_NODE_LEVEL_TC,
+ TM_NODE_LEVEL_QUEUE,
+ TM_NODE_LEVEL_MAX,
+};
+
+struct tm_hierarchy {
+ /* TM Nodes */
+ uint32_t root_node_id;
+ uint32_t subport_node_id[SUBPORT_NODES_PER_PORT];
+ uint32_t pipe_node_id[SUBPORT_NODES_PER_PORT][PIPE_NODES_PER_SUBPORT];
+ uint32_t tc_node_id[NUM_PIPE_NODES][TC_NODES_PER_PIPE];
+ uint32_t queue_node_id[NUM_TC_NODES][QUEUE_NODES_PER_TC];
+
+ /* TM Hierarchy Nodes Shaper Rates */
+ uint32_t root_node_shaper_rate;
+ uint32_t subport_node_shaper_rate;
+ uint32_t pipe_node_shaper_rate;
+ uint32_t tc_node_shaper_rate;
+ uint32_t tc_node_shared_shaper_rate;
+
+ uint32_t n_shapers;
+};
+
+#define BITFIELD(byte_array, slab_pos, slab_mask, slab_shr) \
+({ \
+ uint64_t slab = *((uint64_t *) &byte_array[slab_pos]); \
+ uint64_t val = \
+ (rte_be_to_cpu_64(slab) & slab_mask) >> slab_shr; \
+ val; \
+})
+
+#define RTE_SCHED_PORT_HIERARCHY(subport, pipe, \
+ traffic_class, queue, color) \
+ ((((uint64_t) (queue)) & 0x3) | \
+ ((((uint64_t) (traffic_class)) & 0x3) << 2) | \
+ ((((uint64_t) (color)) & 0x3) << 4) | \
+ ((((uint64_t) (subport)) & 0xFFFF) << 16) | \
+ ((((uint64_t) (pipe)) & 0xFFFFFFFF) << 32))
+
+
+static void
+pkt_metadata_set(struct rte_port *p, struct rte_mbuf **pkts,
+ uint32_t n_pkts)
+{
+ struct softnic_port_tm *tm = &p->softport.tm;
+ uint32_t i;
+
+ for (i = 0; i < (n_pkts & (~0x3)); i += 4) {
+ struct rte_mbuf *pkt0 = pkts[i];
+ struct rte_mbuf *pkt1 = pkts[i + 1];
+ struct rte_mbuf *pkt2 = pkts[i + 2];
+ struct rte_mbuf *pkt3 = pkts[i + 3];
+
+ uint8_t *pkt0_data = rte_pktmbuf_mtod(pkt0, uint8_t *);
+ uint8_t *pkt1_data = rte_pktmbuf_mtod(pkt1, uint8_t *);
+ uint8_t *pkt2_data = rte_pktmbuf_mtod(pkt2, uint8_t *);
+ uint8_t *pkt3_data = rte_pktmbuf_mtod(pkt3, uint8_t *);
+
+ uint64_t pkt0_subport = BITFIELD(pkt0_data,
+ tm->tm_pktfield0_slabpos,
+ tm->tm_pktfield0_slabmask,
+ tm->tm_pktfield0_slabshr);
+ uint64_t pkt0_pipe = BITFIELD(pkt0_data,
+ tm->tm_pktfield1_slabpos,
+ tm->tm_pktfield1_slabmask,
+ tm->tm_pktfield1_slabshr);
+ uint64_t pkt0_dscp = BITFIELD(pkt0_data,
+ tm->tm_pktfield2_slabpos,
+ tm->tm_pktfield2_slabmask,
+ tm->tm_pktfield2_slabshr);
+ uint32_t pkt0_tc = tm->tm_tc_table[pkt0_dscp & 0x3F] >> 2;
+ uint32_t pkt0_tc_q = tm->tm_tc_table[pkt0_dscp & 0x3F] & 0x3;
+ uint64_t pkt1_subport = BITFIELD(pkt1_data,
+ tm->tm_pktfield0_slabpos,
+ tm->tm_pktfield0_slabmask,
+ tm->tm_pktfield0_slabshr);
+ uint64_t pkt1_pipe = BITFIELD(pkt1_data,
+ tm->tm_pktfield1_slabpos,
+ tm->tm_pktfield1_slabmask,
+ tm->tm_pktfield1_slabshr);
+ uint64_t pkt1_dscp = BITFIELD(pkt1_data,
+ tm->tm_pktfield2_slabpos,
+ tm->tm_pktfield2_slabmask,
+ tm->tm_pktfield2_slabshr);
+ uint32_t pkt1_tc = tm->tm_tc_table[pkt1_dscp & 0x3F] >> 2;
+ uint32_t pkt1_tc_q = tm->tm_tc_table[pkt1_dscp & 0x3F] & 0x3;
+
+ uint64_t pkt2_subport = BITFIELD(pkt2_data,
+ tm->tm_pktfield0_slabpos,
+ tm->tm_pktfield0_slabmask,
+ tm->tm_pktfield0_slabshr);
+ uint64_t pkt2_pipe = BITFIELD(pkt2_data,
+ tm->tm_pktfield1_slabpos,
+ tm->tm_pktfield1_slabmask,
+ tm->tm_pktfield1_slabshr);
+ uint64_t pkt2_dscp = BITFIELD(pkt2_data,
+ tm->tm_pktfield2_slabpos,
+ tm->tm_pktfield2_slabmask,
+ tm->tm_pktfield2_slabshr);
+ uint32_t pkt2_tc = tm->tm_tc_table[pkt2_dscp & 0x3F] >> 2;
+ uint32_t pkt2_tc_q = tm->tm_tc_table[pkt2_dscp & 0x3F] & 0x3;
+
+ uint64_t pkt3_subport = BITFIELD(pkt3_data,
+ tm->tm_pktfield0_slabpos,
+ tm->tm_pktfield0_slabmask,
+ tm->tm_pktfield0_slabshr);
+ uint64_t pkt3_pipe = BITFIELD(pkt3_data,
+ tm->tm_pktfield1_slabpos,
+ tm->tm_pktfield1_slabmask,
+ tm->tm_pktfield1_slabshr);
+ uint64_t pkt3_dscp = BITFIELD(pkt3_data,
+ tm->tm_pktfield2_slabpos,
+ tm->tm_pktfield2_slabmask,
+ tm->tm_pktfield2_slabshr);
+ uint32_t pkt3_tc = tm->tm_tc_table[pkt3_dscp & 0x3F] >> 2;
+ uint32_t pkt3_tc_q = tm->tm_tc_table[pkt3_dscp & 0x3F] & 0x3;
+
+ uint64_t pkt0_sched = RTE_SCHED_PORT_HIERARCHY(pkt0_subport,
+ pkt0_pipe,
+ pkt0_tc,
+ pkt0_tc_q,
+ 0);
+ uint64_t pkt1_sched = RTE_SCHED_PORT_HIERARCHY(pkt1_subport,
+ pkt1_pipe,
+ pkt1_tc,
+ pkt1_tc_q,
+ 0);
+ uint64_t pkt2_sched = RTE_SCHED_PORT_HIERARCHY(pkt2_subport,
+ pkt2_pipe,
+ pkt2_tc,
+ pkt2_tc_q,
+ 0);
+ uint64_t pkt3_sched = RTE_SCHED_PORT_HIERARCHY(pkt3_subport,
+ pkt3_pipe,
+ pkt3_tc,
+ pkt3_tc_q,
+ 0);
+
+ pkt0->hash.sched.lo = pkt0_sched & 0xFFFFFFFF;
+ pkt0->hash.sched.hi = pkt0_sched >> 32;
+ pkt1->hash.sched.lo = pkt1_sched & 0xFFFFFFFF;
+ pkt1->hash.sched.hi = pkt1_sched >> 32;
+ pkt2->hash.sched.lo = pkt2_sched & 0xFFFFFFFF;
+ pkt2->hash.sched.hi = pkt2_sched >> 32;
+ pkt3->hash.sched.lo = pkt3_sched & 0xFFFFFFFF;
+ pkt3->hash.sched.hi = pkt3_sched >> 32;
+ }
+
+ for (; i < n_pkts; i++) {
+ struct rte_mbuf *pkt = pkts[i];
+
+ uint8_t *pkt_data = rte_pktmbuf_mtod(pkt, uint8_t *);
+
+ uint64_t pkt_subport = BITFIELD(pkt_data,
+ tm->tm_pktfield0_slabpos,
+ tm->tm_pktfield0_slabmask,
+ tm->tm_pktfield0_slabshr);
+ uint64_t pkt_pipe = BITFIELD(pkt_data,
+ tm->tm_pktfield1_slabpos,
+ tm->tm_pktfield1_slabmask,
+ tm->tm_pktfield1_slabshr);
+ uint64_t pkt_dscp = BITFIELD(pkt_data,
+ tm->tm_pktfield2_slabpos,
+ tm->tm_pktfield2_slabmask,
+ tm->tm_pktfield2_slabshr);
+ uint32_t pkt_tc = tm->tm_tc_table[pkt_dscp & 0x3F] >> 2;
+ uint32_t pkt_tc_q = tm->tm_tc_table[pkt_dscp & 0x3F] & 0x3;
+
+ uint64_t pkt_sched = RTE_SCHED_PORT_HIERARCHY(pkt_subport,
+ pkt_pipe,
+ pkt_tc,
+ pkt_tc_q,
+ 0);
+
+ pkt->hash.sched.lo = pkt_sched & 0xFFFFFFFF;
+ pkt->hash.sched.hi = pkt_sched >> 32;
+ }
+}
+
+/*
+ * Soft port packet forward
+ */
+static void
+softport_packet_fwd(struct fwd_stream *fs)
+{
+ struct rte_mbuf *pkts_burst[MAX_PKT_BURST];
+ struct rte_port *rte_tx_port = &ports[fs->tx_port];
+ uint16_t nb_rx;
+ uint16_t nb_tx;
+ uint32_t retry;
+
+#ifdef RTE_TEST_PMD_RECORD_CORE_CYCLES
+ uint64_t start_tsc;
+ uint64_t end_tsc;
+ uint64_t core_cycles;
+#endif
+
+#ifdef RTE_TEST_PMD_RECORD_CORE_CYCLES
+ start_tsc = rte_rdtsc();
+#endif
+
+ /* Packets Receive */
+ nb_rx = rte_eth_rx_burst(fs->rx_port, fs->rx_queue,
+ pkts_burst, nb_pkt_per_burst);
+ fs->rx_packets += nb_rx;
+
+#ifdef RTE_TEST_PMD_RECORD_BURST_STATS
+ fs->rx_burst_stats.pkt_burst_spread[nb_rx]++;
+#endif
+
+ if (rte_tx_port->softnic_enable) {
+ /* Set packet metadata if tm flag enabled */
+ if (rte_tx_port->softport.tm_flag)
+ pkt_metadata_set(rte_tx_port, pkts_burst, nb_rx);
+
+ /* Softport run */
+ rte_pmd_softnic_run(fs->tx_port);
+ }
+ nb_tx = rte_eth_tx_burst(fs->tx_port, fs->tx_queue,
+ pkts_burst, nb_rx);
+
+ /* Retry if necessary */
+ if (unlikely(nb_tx < nb_rx) && fs->retry_enabled) {
+ retry = 0;
+ while (nb_tx < nb_rx && retry++ < burst_tx_retry_num) {
+ rte_delay_us(burst_tx_delay_time);
+ nb_tx += rte_eth_tx_burst(fs->tx_port, fs->tx_queue,
+ &pkts_burst[nb_tx], nb_rx - nb_tx);
+ }
+ }
+ fs->tx_packets += nb_tx;
+
+#ifdef RTE_TEST_PMD_RECORD_BURST_STATS
+ fs->tx_burst_stats.pkt_burst_spread[nb_tx]++;
+#endif
+
+ if (unlikely(nb_tx < nb_rx)) {
+ fs->fwd_dropped += (nb_rx - nb_tx);
+ do {
+ rte_pktmbuf_free(pkts_burst[nb_tx]);
+ } while (++nb_tx < nb_rx);
+ }
+#ifdef RTE_TEST_PMD_RECORD_CORE_CYCLES
+ end_tsc = rte_rdtsc();
+ core_cycles = (end_tsc - start_tsc);
+ fs->core_cycles = (uint64_t) (fs->core_cycles + core_cycles);
+#endif
+}
+
+static void
+set_tm_hiearchy_nodes_shaper_rate(portid_t port_id, struct tm_hierarchy *h)
+{
+ struct rte_eth_link link_params;
+ uint64_t tm_port_rate;
+
+ memset(&link_params, 0, sizeof(link_params));
+
+ rte_eth_link_get(port_id, &link_params);
+ tm_port_rate = link_params.link_speed * BYTES_IN_MBPS;
+
+ if (tm_port_rate > UINT32_MAX)
+ tm_port_rate = UINT32_MAX;
+
+ /* Set tm hierarchy shapers rate */
+ h->root_node_shaper_rate = tm_port_rate;
+ h->subport_node_shaper_rate =
+ tm_port_rate / SUBPORT_NODES_PER_PORT;
+ h->pipe_node_shaper_rate
+ = h->subport_node_shaper_rate / PIPE_NODES_PER_SUBPORT;
+ h->tc_node_shaper_rate = h->pipe_node_shaper_rate;
+ h->tc_node_shared_shaper_rate = h->subport_node_shaper_rate;
+}
+
+static int
+softport_tm_root_node_add(portid_t port_id, struct tm_hierarchy *h,
+ struct rte_tm_error *error)
+{
+ struct rte_tm_node_params rnp;
+ struct rte_tm_shaper_params rsp;
+ uint32_t priority, weight, level_id, shaper_profile_id;
+
+ memset(&rsp, 0, sizeof(struct rte_tm_shaper_params));
+ memset(&rnp, 0, sizeof(struct rte_tm_node_params));
+
+ /* Shaper profile Parameters */
+ rsp.peak.rate = h->root_node_shaper_rate;
+ rsp.peak.size = TOKEN_BUCKET_SIZE;
+ rsp.pkt_length_adjust = RTE_TM_ETH_FRAMING_OVERHEAD_FCS;
+ shaper_profile_id = 0;
+
+ if (rte_tm_shaper_profile_add(port_id, shaper_profile_id,
+ &rsp, error)) {
+ printf("%s ERROR(%d)-%s!(shaper_id %u)\n ",
+ __func__, error->type, error->message,
+ shaper_profile_id);
+ return -1;
+ }
+
+ /* Root Node Parameters */
+ h->root_node_id = ROOT_NODE_ID;
+ weight = 1;
+ priority = 0;
+ level_id = TM_NODE_LEVEL_PORT;
+ rnp.shaper_profile_id = shaper_profile_id;
+ rnp.nonleaf.n_sp_priorities = 1;
+ rnp.stats_mask = STATS_MASK_DEFAULT;
+
+ /* Add Node to TM Hierarchy */
+ if (rte_tm_node_add(port_id, h->root_node_id, RTE_TM_NODE_ID_NULL,
+ priority, weight, level_id, &rnp, error)) {
+ printf("%s ERROR(%d)-%s!(node_id %u, parent_id %u, level %u)\n",
+ __func__, error->type, error->message,
+ h->root_node_id, RTE_TM_NODE_ID_NULL,
+ level_id);
+ return -1;
+ }
+ /* Update */
+ h->n_shapers++;
+
+ printf(" Root node added (Start id %u, Count %u, level %u)\n",
+ h->root_node_id, 1, level_id);
+
+ return 0;
+}
+
+static int
+softport_tm_subport_node_add(portid_t port_id, struct tm_hierarchy *h,
+ struct rte_tm_error *error)
+{
+ uint32_t subport_parent_node_id, subport_node_id;
+ struct rte_tm_node_params snp;
+ struct rte_tm_shaper_params ssp;
+ uint32_t priority, weight, level_id, shaper_profile_id;
+ uint32_t i;
+
+ memset(&ssp, 0, sizeof(struct rte_tm_shaper_params));
+ memset(&snp, 0, sizeof(struct rte_tm_node_params));
+
+ shaper_profile_id = h->n_shapers;
+
+ /* Add Shaper Profile to TM Hierarchy */
+ for (i = 0; i < SUBPORT_NODES_PER_PORT; i++) {
+ ssp.peak.rate = h->subport_node_shaper_rate;
+ ssp.peak.size = TOKEN_BUCKET_SIZE;
+ ssp.pkt_length_adjust = RTE_TM_ETH_FRAMING_OVERHEAD_FCS;
+
+ if (rte_tm_shaper_profile_add(port_id, shaper_profile_id,
+ &ssp, error)) {
+ printf("%s ERROR(%d)-%s!(shaper_id %u)\n ",
+ __func__, error->type, error->message,
+ shaper_profile_id);
+ return -1;
+ }
+
+ /* Node Parameters */
+ h->subport_node_id[i] = SUBPORT_NODES_START_ID + i;
+ subport_parent_node_id = h->root_node_id;
+ weight = 1;
+ priority = 0;
+ level_id = TM_NODE_LEVEL_SUBPORT;
+ snp.shaper_profile_id = shaper_profile_id;
+ snp.nonleaf.n_sp_priorities = 1;
+ snp.stats_mask = STATS_MASK_DEFAULT;
+
+ /* Add Node to TM Hierarchy */
+ if (rte_tm_node_add(port_id,
+ h->subport_node_id[i],
+ subport_parent_node_id,
+ priority, weight,
+ level_id,
+ &snp,
+ error)) {
+ printf("%s ERROR(%d)-%s!(node %u,parent %u,level %u)\n",
+ __func__,
+ error->type,
+ error->message,
+ h->subport_node_id[i],
+ subport_parent_node_id,
+ level_id);
+ return -1;
+ }
+ shaper_profile_id++;
+ subport_node_id++;
+ }
+ /* Update */
+ h->n_shapers = shaper_profile_id;
+
+ printf(" Subport nodes added (Start id %u, Count %u, level %u)\n",
+ h->subport_node_id[0], SUBPORT_NODES_PER_PORT, level_id);
+
+ return 0;
+}
+
+static int
+softport_tm_pipe_node_add(portid_t port_id, struct tm_hierarchy *h,
+ struct rte_tm_error *error)
+{
+ uint32_t pipe_parent_node_id;
+ struct rte_tm_node_params pnp;
+ struct rte_tm_shaper_params psp;
+ uint32_t priority, weight, level_id, shaper_profile_id;
+ uint32_t i, j;
+
+ memset(&psp, 0, sizeof(struct rte_tm_shaper_params));
+ memset(&pnp, 0, sizeof(struct rte_tm_node_params));
+
+ shaper_profile_id = h->n_shapers;
+
+ /* Shaper Profile Parameters */
+ psp.peak.rate = h->pipe_node_shaper_rate;
+ psp.peak.size = TOKEN_BUCKET_SIZE;
+ psp.pkt_length_adjust = RTE_TM_ETH_FRAMING_OVERHEAD_FCS;
+
+ /* Pipe Node Parameters */
+ weight = 1;
+ priority = 0;
+ level_id = TM_NODE_LEVEL_PIPE;
+ pnp.nonleaf.n_sp_priorities = 4;
+ pnp.stats_mask = STATS_MASK_DEFAULT;
+
+ /* Add Shaper Profiles and Nodes to TM Hierarchy */
+ for (i = 0; i < SUBPORT_NODES_PER_PORT; i++) {
+ for (j = 0; j < PIPE_NODES_PER_SUBPORT; j++) {
+ if (rte_tm_shaper_profile_add(port_id,
+ shaper_profile_id, &psp, error)) {
+ printf("%s ERROR(%d)-%s!(shaper_id %u)\n ",
+ __func__, error->type, error->message,
+ shaper_profile_id);
+ return -1;
+ }
+ pnp.shaper_profile_id = shaper_profile_id;
+ pipe_parent_node_id = h->subport_node_id[i];
+ h->pipe_node_id[i][j] = PIPE_NODES_START_ID +
+ (i * PIPE_NODES_PER_SUBPORT) + j;
+
+ if (rte_tm_node_add(port_id,
+ h->pipe_node_id[i][j],
+ pipe_parent_node_id,
+ priority, weight, level_id,
+ &pnp,
+ error)) {
+ printf("%s ERROR(%d)-%s!(node %u,parent %u )\n",
+ __func__,
+ error->type,
+ error->message,
+ h->pipe_node_id[i][j],
+ pipe_parent_node_id);
+
+ return -1;
+ }
+ shaper_profile_id++;
+ }
+ }
+ /* Update */
+ h->n_shapers = shaper_profile_id;
+
+ printf(" Pipe nodes added (Start id %u, Count %u, level %u)\n",
+ h->pipe_node_id[0][0], NUM_PIPE_NODES, level_id);
+
+ return 0;
+}
+
+static int
+softport_tm_tc_node_add(portid_t port_id, struct tm_hierarchy *h,
+ struct rte_tm_error *error)
+{
+ uint32_t tc_parent_node_id;
+ struct rte_tm_node_params tnp;
+ struct rte_tm_shaper_params tsp, tssp;
+ uint32_t shared_shaper_profile_id[TC_NODES_PER_PIPE];
+ uint32_t priority, weight, level_id, shaper_profile_id;
+ uint32_t pos, n_tc_nodes, i, j, k;
+
+ memset(&tsp, 0, sizeof(struct rte_tm_shaper_params));
+ memset(&tssp, 0, sizeof(struct rte_tm_shaper_params));
+ memset(&tnp, 0, sizeof(struct rte_tm_node_params));
+
+ shaper_profile_id = h->n_shapers;
+
+ /* Private Shaper Profile (TC) Parameters */
+ tsp.peak.rate = h->tc_node_shaper_rate;
+ tsp.peak.size = TOKEN_BUCKET_SIZE;
+ tsp.pkt_length_adjust = RTE_TM_ETH_FRAMING_OVERHEAD_FCS;
+
+ /* Shared Shaper Profile (TC) Parameters */
+ tssp.peak.rate = h->tc_node_shared_shaper_rate;
+ tssp.peak.size = TOKEN_BUCKET_SIZE;
+ tssp.pkt_length_adjust = RTE_TM_ETH_FRAMING_OVERHEAD_FCS;
+
+ /* TC Node Parameters */
+ weight = 1;
+ level_id = TM_NODE_LEVEL_TC;
+ tnp.n_shared_shapers = 1;
+ tnp.nonleaf.n_sp_priorities = 1;
+ tnp.stats_mask = STATS_MASK_DEFAULT;
+
+ /* Add Shared Shaper Profiles to TM Hierarchy */
+ for (i = 0; i < TC_NODES_PER_PIPE; i++) {
+ shared_shaper_profile_id[i] = shaper_profile_id;
+
+ if (rte_tm_shaper_profile_add(port_id,
+ shared_shaper_profile_id[i], &tssp, error)) {
+ printf("%s ERROR(%d)-%s!(Shared shaper profileid %u)\n",
+ __func__, error->type, error->message,
+ shared_shaper_profile_id[i]);
+
+ return -1;
+ }
+ if (rte_tm_shared_shaper_add_update(port_id, i,
+ shared_shaper_profile_id[i], error)) {
+ printf("%s ERROR(%d)-%s!(Shared shaper id %u)\n",
+ __func__, error->type, error->message, i);
+
+ return -1;
+ }
+ shaper_profile_id++;
+ }
+
+ /* Add Shaper Profiles and Nodes to TM Hierarchy */
+ n_tc_nodes = 0;
+ for (i = 0; i < SUBPORT_NODES_PER_PORT; i++) {
+ for (j = 0; j < PIPE_NODES_PER_SUBPORT; j++) {
+ for (k = 0; k < TC_NODES_PER_PIPE ; k++) {
+ priority = k;
+ tc_parent_node_id = h->pipe_node_id[i][j];
+ tnp.shared_shaper_id =
+ (uint32_t *)calloc(1, sizeof(uint32_t));
+ tnp.shared_shaper_id[0] = k;
+ pos = j + (i * PIPE_NODES_PER_SUBPORT);
+ h->tc_node_id[pos][k] =
+ TC_NODES_START_ID + n_tc_nodes;
+
+ if (rte_tm_shaper_profile_add(port_id,
+ shaper_profile_id, &tsp, error)) {
+ printf("%s ERROR(%d)-%s!(shaper %u)\n",
+ __func__, error->type,
+ error->message,
+ shaper_profile_id);
+
+ return -1;
+ }
+ tnp.shaper_profile_id = shaper_profile_id;
+ if (rte_tm_node_add(port_id,
+ h->tc_node_id[pos][k],
+ tc_parent_node_id,
+ priority, weight,
+ level_id,
+ &tnp, error)) {
+ printf("%s ERROR(%d)-%s!(node id %u)\n",
+ __func__,
+ error->type,
+ error->message,
+ h->tc_node_id[pos][k]);
+
+ return -1;
+ }
+ shaper_profile_id++;
+ n_tc_nodes++;
+ }
+ }
+ }
+ /* Update */
+ h->n_shapers = shaper_profile_id;
+
+ printf(" TC nodes added (Start id %u, Count %u, level %u)\n",
+ h->tc_node_id[0][0], n_tc_nodes, level_id);
+
+ return 0;
+}
+
+static int
+softport_tm_queue_node_add(portid_t port_id, struct tm_hierarchy *h,
+ struct rte_tm_error *error)
+{
+ uint32_t queue_parent_node_id;
+ struct rte_tm_node_params qnp;
+ uint32_t priority, weight, level_id, pos;
+ uint32_t n_queue_nodes, i, j, k;
+
+ memset(&qnp, 0, sizeof(struct rte_tm_node_params));
+
+ /* Queue Node Parameters */
+ priority = 0;
+ weight = 1;
+ level_id = TM_NODE_LEVEL_QUEUE;
+ qnp.shaper_profile_id = RTE_TM_SHAPER_PROFILE_ID_NONE;
+ qnp.leaf.cman = RTE_TM_CMAN_TAIL_DROP;
+ qnp.stats_mask = STATS_MASK_QUEUE;
+
+ /* Add Queue Nodes to TM Hierarchy */
+ n_queue_nodes = 0;
+ for (i = 0; i < NUM_PIPE_NODES; i++) {
+ for (j = 0; j < TC_NODES_PER_PIPE; j++) {
+ queue_parent_node_id = h->tc_node_id[i][j];
+ for (k = 0; k < QUEUE_NODES_PER_TC; k++) {
+ pos = j + (i * TC_NODES_PER_PIPE);
+ h->queue_node_id[pos][k] = n_queue_nodes;
+ if (rte_tm_node_add(port_id,
+ h->queue_node_id[pos][k],
+ queue_parent_node_id,
+ priority,
+ weight,
+ level_id,
+ &qnp, error)) {
+ printf("%s ERROR(%d)-%s!(node %u)\n",
+ __func__,
+ error->type,
+ error->message,
+ h->queue_node_id[pos][k]);
+
+ return -1;
+ }
+ n_queue_nodes++;
+ }
+ }
+ }
+ printf(" Queue nodes added (Start id %u, Count %u, level %u)\n",
+ h->queue_node_id[0][0], n_queue_nodes, level_id);
+
+ return 0;
+}
+
+/*
+ * TM Packet Field Setup
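+ *
+ * Each classification field is described by the byte offset of a 64-bit
+ * slab read from the packet (slabpos), a bit mask applied to that slab
+ * (slabmask) and a right shift equal to the number of trailing zeros of
+ * the mask (slabshr), i.e. the field value is extracted as
+ * (slab & slabmask) >> slabshr.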
+ */
+static void
+softport_tm_pktfield_setup(portid_t port_id)
+{
+ struct rte_port *p = &ports[port_id];
+ uint64_t pktfield0_mask = 0;
+ uint64_t pktfield1_mask = 0x0000000FFF000000LLU;
+ uint64_t pktfield2_mask = 0x00000000000000FCLLU;
+
+ p->softport.tm = (struct softnic_port_tm) {
+ .n_subports_per_port = SUBPORT_NODES_PER_PORT,
+ .n_pipes_per_subport = PIPE_NODES_PER_SUBPORT,
+
+ /* Packet field to identify subport
+ *
+ * Default configuration assumes only one subport, thus
+ * the subport ID is hardcoded to 0
+ */
+ .tm_pktfield0_slabpos = 0,
+ .tm_pktfield0_slabmask = pktfield0_mask,
+ .tm_pktfield0_slabshr =
+ __builtin_ctzll(pktfield0_mask),
+
+ /* Packet field to identify pipe.
+ *
+ * Default value assumes Ethernet/IPv4/UDP packets,
+ * UDP payload bits 12 .. 23
+ */
+ .tm_pktfield1_slabpos = 40,
+ .tm_pktfield1_slabmask = pktfield1_mask,
+ .tm_pktfield1_slabshr =
+ __builtin_ctzll(pktfield1_mask),
+
+ /* Packet field used as index into TC translation table
+ * to identify the traffic class and queue.
+ *
+ * Default value assumes Ethernet/IPv4 packets, IPv4
+ * DSCP field
+ */
+ .tm_pktfield2_slabpos = 8,
+ .tm_pktfield2_slabmask = pktfield2_mask,
+ .tm_pktfield2_slabshr =
+ __builtin_ctzll(pktfield2_mask),
+
+ .tm_tc_table = {
+ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,
+ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,
+ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,
+ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,
+ }, /**< TC translation table */
+ };
+}
+
+static int
+softport_tm_hierarchy_specify(portid_t port_id, struct rte_tm_error *error)
+{
+
+ struct tm_hierarchy h;
+ int status;
+
+ memset(&h, 0, sizeof(struct tm_hierarchy));
+
+ /* TM hierarchy shapers rate */
+ set_tm_hiearchy_nodes_shaper_rate(port_id, &h);
+
+ /* Add root node (level 0) */
+ status = softport_tm_root_node_add(port_id, &h, error);
+ if (status)
+ return status;
+
+ /* Add subport node (level 1) */
+ status = softport_tm_subport_node_add(port_id, &h, error);
+ if (status)
+ return status;
+
+ /* Add pipe nodes (level 2) */
+ status = softport_tm_pipe_node_add(port_id, &h, error);
+ if (status)
+ return status;
+
+ /* Add traffic class nodes (level 3) */
+ status = softport_tm_tc_node_add(port_id, &h, error);
+ if (status)
+ return status;
+
+ /* Add queue nodes (level 4) */
+ status = softport_tm_queue_node_add(port_id, &h, error);
+ if (status)
+ return status;
+
+ /* TM packet fields setup */
+ softport_tm_pktfield_setup(port_id);
+
+ return 0;
+}
+
+/*
+ * Soft port Init
+ */
+static void
+softport_tm_begin(portid_t pi)
+{
+ struct rte_port *port = &ports[pi];
+
+ /* Soft port TM flag */
+ if (port->softport.tm_flag == 1) {
+ printf("\n\n TM feature available on port %u\n", pi);
+
+ /* Soft port TM hierarchy configuration */
+ if ((port->softport.tm.hierarchy_config == 0) &&
+ (port->softport.tm.default_hierarchy_enable == 1)) {
+ struct rte_tm_error error;
+ int status;
+
+ /* Stop port */
+ rte_eth_dev_stop(pi);
+
+ /* TM hierarchy specification */
+ status = softport_tm_hierarchy_specify(pi, &error);
+ if (status) {
+ printf(" TM Hierarchy built error(%d) - %s\n",
+ error.type, error.message);
+ return;
+ }
+ printf("\n TM Hierarchy Specified!\n\v");
+
+ /* TM hierarchy commit */
+ status = rte_tm_hierarchy_commit(pi, 0, &error);
+ if (status) {
+ printf(" Hierarchy commit error(%d) - %s\n",
+ error.type, error.message);
+ return;
+ }
+ printf(" Hierarchy Committed (port %u)!", pi);
+ port->softport.tm.hierarchy_config = 1;
+
+ /* Start port */
+ status = rte_eth_dev_start(pi);
+ if (status) {
+ printf("\n Port %u start error!\n", pi);
+ return;
+ }
+ printf("\n Port %u started!\n", pi);
+ return;
+ }
+ }
+ printf("\n TM feature not available on port %u", pi);
+}
+
+struct fwd_engine softnic_tm_engine = {
+ .fwd_mode_name = "tm",
+ .port_fwd_begin = softport_tm_begin,
+ .port_fwd_end = NULL,
+ .packet_fwd = softport_packet_fwd,
+};
+
+struct fwd_engine softnic_tm_bypass_engine = {
+ .fwd_mode_name = "tm-bypass",
+ .port_fwd_begin = NULL,
+ .port_fwd_end = NULL,
+ .packet_fwd = softport_packet_fwd,
+};
--
2.9.3
^ permalink raw reply [flat|nested] 79+ messages in thread
* Re: [dpdk-dev] [PATCH v4 0/4] net/softnic: sw fall-back pmd for traffic mgmt and others
2017-09-20 15:35 ` [dpdk-dev] [PATCH v4 0/4] net/softnic: sw fall-back pmd for traffic mgmt and others Thomas Monjalon
2017-09-22 22:07 ` Singh, Jasvinder
@ 2017-10-06 10:40 ` Dumitrescu, Cristian
2017-10-06 12:13 ` Thomas Monjalon
1 sibling, 1 reply; 79+ messages in thread
From: Dumitrescu, Cristian @ 2017-10-06 10:40 UTC (permalink / raw)
To: Thomas Monjalon, Singh, Jasvinder; +Cc: dev, Yigit, Ferruh
Hi Thomas,
Thanks for taking the time to read through our rationale and provide quality comments on a topic where people are usually shouting rather than listening!
> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas@monjalon.net]
> Sent: Wednesday, September 20, 2017 4:36 PM
> To: Singh, Jasvinder <jasvinder.singh@intel.com>; Dumitrescu, Cristian
> <cristian.dumitrescu@intel.com>
> Cc: dev@dpdk.org; Yigit, Ferruh <ferruh.yigit@intel.com>
> Subject: Re: [dpdk-dev] [PATCH v4 0/4] net/softnic: sw fall-back pmd for
> traffic mgmt and others
>
> Hi,
>
> 18/09/2017 11:10, Jasvinder Singh:
> > The SoftNIC PMD is intended to provide SW fall-back options for specific
> > ethdev APIs in a generic way to the NICs not supporting those features.
>
> I agree it is important to have a solution in DPDK to better manage
> SW fallbacks. One question is to know whether we can implement and
> maintain many solutions. We probably must choose only one solution.
>
> I have not read the code. I am just interested in the design for now.
> I think it is a smart idea but maybe less convenient than calling fallback
> from ethdev API glue code. My opinion has not changed since v1.
> Thanks for the detailed explanations. Let's discuss below.
>
Don't get me wrong, I would also like to have the single-device solution (hard NIC augmented with SW-implemented features) as opposed to the current proposal, which requires two devices (the hard device, plus a soft device acting as the app front-end for the hard device).
The problem is that right now the single-device solution is not an option with the current librte_ether, as there are simply a lot of changes required that need more time to think through and agree on, and likely several incremental stages are required to make it happen. As detailed in the Dublin presentation, they mostly refer to:
- the need for the SW fall-back to maintain its own data structures and functions (per device, per RX queue, per TX queue)
- coexistence of all the features together
- how to bind an ethdev to one (or several) SW threads
- thread safety requirements between ethdev SW thread and app threads
Per our Dublin discussion, here is my proposal:
1. Get Soft NIC PMD into release 17.11.
a) It is the imperfect 2-device solution, but it works and provides a usable interim step.
b) It allows us to make progress on the development of a few key features such as traffic management (on TX) and hopefully flow & metering (on RX), and to get feedback on this code that we can later restructure into the final single-device solution.
c) It is purely yet another PMD, which we can fold into the final solution later.
2. Start an RFC on the librte_ether changes required to get the single-device solution in place.
a) I can spend some time to summarize the objectives, requirements, current issues and potential approaches and send the first draft of this RFC in the next week or two?
b) We can then discuss, poll for more ideas and hopefully draft an incremental path forward
What do you think?
> [...]
> > * RX/TX: The app reads packets from/writes packets to the "soft" port
> > instead of the "hard" port. The RX and TX queues of the "soft" port are
> > thread safe, as any ethdev.
>
> "thread safe as any ethdev"?
> I would say the ethdev queues are not thread safe.
>
> [...]
Yes, per the Dublin presentation, the thread safety mentioned here is between the Soft NIC thread and the application thread(s).
> > * Meets the NFV vision: The app should be (almost) agnostic about the NIC
> > implementation (different vendors/models, HW-SW mix), the app should
> not
> > require changes to use different NICs, the app should use the same API
> > for all NICs. If a NIC does not implement a specific feature, the HW
> > should be augmented with SW to meet the functionality while still
> > preserving the same API.
>
> This goal could also be achieved by adding the SW capability to the API.
> After getting capabilities of a hardware, the app could set the capability
> of some driver features to "SW fallback".
> So the capability would become a tristate:
> - not supported
> - HW supported
> - SW supported
>
> The unique API goal is failed if we must manage two ports,
> the HW port for some features and the softnic port for other features.
> You explain it in A5 below.
>
Yes, agree that 2-device solution is not fully meeting this goal, but IMHO this is the best we can do today; hopefully we can come up with a path forward for the single-device solution.
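To make sure we are picturing the same thing, here is a rough sketch of the tristate capability idea (the names below are purely illustrative, nothing like this exists in ethdev today):

	enum rte_eth_feature_cap {
		RTE_ETH_FEATURE_UNSUPPORTED = 0, /* feature not available */
		RTE_ETH_FEATURE_HW,              /* implemented by the NIC */
		RTE_ETH_FEATURE_SW_FALLBACK,     /* SW fall-back available */
	};

	/* The app reads the capability, then explicitly opts in:
	 *   if (cap == RTE_ETH_FEATURE_SW_FALLBACK)
	 *           enable the SW fall-back for this port;
	 */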
> [...]
> > Example: Create "soft" port for "hard" port "0000:04:00.1", enable the TM
> > feature with default settings:
> > --vdev 'net_softnic0,hard_name=0000:04:00.1,soft_tm=on'
>
> So the app will use only the vdev net_softnic0 which will forward packets
> to 0000:04:00.1?
> Can we say in this example that net_softnic0 owns 0000:04:00.1?
> Probably not, because the config of the HW must be done separately (cf.
> Q5).
> See my "ownership proposal":
> http://dpdk.org/ml/archives/dev/2017-September/074656.html
>
> The issue I see in this example is that we must define how to enable
> every features. It should be equivalent to defining the ethdev capabilities.
> In this example, the option soft_tm=on is probably not enough fine-grain.
> We could support some parts of TM API in HW and other parts in SW.
>
There are optional parameters for each feature (i.e. only TM at this point) that are left at their default values in this simple example; they can easily be added on the command line for fine-grained tuning of each feature, as illustrated below.
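For example, a command line with these knobs spelled out explicitly could look like the following (the parameter names below are illustrative only; the exact list comes with the TM patch of this series):

	--vdev 'net_softnic0,hard_name=0000:04:00.1,soft_tm=on,soft_tm_rate=10000000000,soft_tm_deq_bsz=24'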
> [...]
> > Q3: Why not change the "hard" device (and keep a single device) instead of
> > creating a new "soft" device (and thus having two devices)?
> > A3: This is not possible with the current librte_ether ethdev
> > implementation. The ethdev->dev_ops are defined as constant
> structure,
> > so it cannot be changed per device (nor per PMD). The new ops also
> > need memory space to store their context data structures, which
> > requires updating the ethdev->data->dev_private of the existing
> > device; at best, maybe a resize of ethdev->data->dev_private could be
> > done, assuming that librte_ether will introduce a way to find out its
> > size, but this cannot be done while device is running. Other side
> > effects might exist, as the changes are very intrusive, plus it likely
> > needs more changes in librte_ether.
>
> Q3 is about calling SW fallback from the driver code, right?
>
Yes, correct, but the answer is applicable to Q4 as well.
> We must not implement fallbacks in drivers because it would hide
> it to the application.
> If a feature is not available in hardware, the application can choose
> to bypass this feature or integrate the fallback in its own workflow.
>
I agree.
> > Q4: Why not call the SW fall-back dev_ops directly in librte_ether for
> > devices which do not support the specific feature? If the device
> > supports the capability, let's call its dev_ops, otherwise call the
> > SW fall-back dev_ops.
> > A4: First, similar reasons to Q&A3. This fixes the need to change
> > ethdev->dev_ops of the device, but it does not do anything to fix the
> > other significant issue of where to store the context data structures
> > needed by the SW fall-back functions (which, in this approach, are
> > called implicitly by librte_ether).
> > Second, the SW fall-back options should not be restricted arbitrarily
> > by the librte_ether library, the decision should belong to the app.
> > For example, the TM SW fall-back should not be limited to only
> > librte_sched, which (like any SW fall-back) is limited to a specific
> > hierarchy and feature set, it cannot do any possible hierarchy. If
> > alternatives exist, the one to use should be picked by the app, not by
> > the ethdev layer.
>
> Q4 is about calling SW callback from the API glue code, right?
>
Yes.
> We could summarize Q3/Q4 as "it could be done but we propose another
> way".
> I think we must consider the pros and cons of both approaches from
> a user perspective.
> I agree the application must decide which fallback to use.
> We could propose one fallback in ethdev which can be enabled explicitly
> (see my tristate capabilities proposal above).
>
My summary would be: it would be great to do it this way, but significant roadblocks exist that need to be removed first.
> > Q5: Why is the app required to continue to configure both the "hard" and
> > the "soft" devices even after the "soft" device has been created? Why
> > not hiding the "hard" device under the "soft" device and have the
> > "soft" device configure the "hard" device under the hood?
> > A5: This was the approach tried in the V2 of this patch set (overlay
> > "soft" device taking over the configuration of the underlay "hard"
> > device) and eventually dropped due to increased complexity of having
> > to keep the configuration of two distinct devices in sync with
> > librte_ether implementation that is not friendly towards such
> > approach. Basically, each ethdev API call for the overlay device
> > needs to configure the overlay device, invoke the same configuration
> > with possibly modified parameters for the underlay device, then resume
> > the configuration of overlay device, turning this into a device
> > emulation project.
> > V2 minuses: increased complexity (deal with two devices at same time);
> > need to implement every ethdev API, even those not needed for the
> scope
> > of SW fall-back; intrusive; sometimes have to silently take decisions
> > that should be left to the app.
> > V3 pluses: lower complexity (only one device); only need to implement
> > those APIs that are in scope of the SW fall-back; non-intrusive (deal
> > with "hard" device through ethdev API); app decisions taken by the app
> > in an explicit way.
>
> I think it is breaking what you call the NFV vision in several places.
>
Personally, I also agree with you here.
> [...]
> > 9. [rte_ring proliferation] Thread safety requirements for ethdev
> > RX/TXqueues require an rte_ring to be used for every RX/TX queue
> > of each "soft" ethdev. This rte_ring proliferation unnecessarily
> > increases the memory footprint and lowers performance, especially
> > when each "soft" ethdev ends up on a different CPU core (ping-pong
> > of cache lines).
>
> I am curious to understand why you consider thread safety as a requirement
> for queues. No need to reply here, the question is already asked
> at the beginning of this email ;)
Regards,
Cristian
^ permalink raw reply [flat|nested] 79+ messages in thread
* Re: [dpdk-dev] [PATCH v4 0/4] net/softnic: sw fall-back pmd for traffic mgmt and others
2017-10-06 10:40 ` Dumitrescu, Cristian
@ 2017-10-06 12:13 ` Thomas Monjalon
0 siblings, 0 replies; 79+ messages in thread
From: Thomas Monjalon @ 2017-10-06 12:13 UTC (permalink / raw)
To: Dumitrescu, Cristian; +Cc: Singh, Jasvinder, dev, Yigit, Ferruh
06/10/2017 12:40, Dumitrescu, Cristian:
> From: Thomas Monjalon [mailto:thomas@monjalon.net]
> > 18/09/2017 11:10, Jasvinder Singh:
> > > The SoftNIC PMD is intended to provide SW fall-back options for specific
> > > ethdev APIs in a generic way to the NICs not supporting those features.
> >
> > I agree it is important to have a solution in DPDK to better manage
> > SW fallbacks. One question is to know whether we can implement and
> > maintain many solutions. We probably must choose only one solution.
> >
> > I have not read the code. I am just interested in the design for now.
> > I think it is a smart idea but maybe less convenient than calling fallback
> > from ethdev API glue code. My opinion has not changed since v1.
> > Thanks for the detailed explanations. Let's discuss below.
> >
>
> Don't understand me wrong, I would also like to have the single device solution (hard NIC augmented with SW-implemented features) as opposed to the current proposal, which requires two devices (hard device and soft device acting as app front-end for the hard device).
>
> The problem is that right now the single device solution is not an option with the current librte_ether, as there simply a lot of changes required that need more time to think through and get agreement, and likely several incremental stages are required to make it happen. As detailed in the Dublin presentation, they mostly refer to:
> - the need of the SW fall-back to maintain its owns data structures and functions (per device, per RX queue, per TX queue)
> - coexistence of all the features together
> - how to bind an ethdev to one (or several) SW threads
> - thread safety requirements between ethdev SW thread and app threads
>
> Per our Dublin discussion, here is my proposal:
> 1. Get Soft NIC PMD into release 17.11.
> a) It is the imperfect 2-device solution, but it works and provides an interim solution.
> b) It allows us to make progress on the development for a few key features such as traffic management (on TX) and hopefully flow & metering (on RX) and get feedback on this code that we can later restructure into the final single-device solution.
> c) It is purely yet another PMD which we can melt away into the final solution later.
> 2. Start an RFC on librte_ether required changes to get the single-device solution in place.
> a) I can spend some time to summarize the objectives, requirements, current issues and potential approaches and send the first draft of this RFC in the next week or two?
> b) We can then discuss, poll for more ideas and hopefully draft an incremental path forward
>
> What do you think?
I think temporary solutions (which often become definitive) must be avoided,
especially when they imply a new API.
In the case of softnic, there is no new API really, just a new workflow
for applications and some new driver parameters.
So my conclusion is that we should merge it and gain experience with it.
It does not prevent from working on another solution, as you suggest.
Acked-by: Thomas Monjalon <thomas@monjalon.net>
PS: thank you for having given your opinion on other questions
^ permalink raw reply [flat|nested] 79+ messages in thread
* [dpdk-dev] [PATCH v6 0/5] net/softnic: sw fall-back pmd for traffic mgmt and others
2017-09-29 14:04 ` [dpdk-dev] [PATCH v5 2/5] net/softnic: add traffic management support Jasvinder Singh
@ 2017-10-06 16:59 ` Jasvinder Singh
2017-10-06 16:59 ` [dpdk-dev] [PATCH v6 1/5] net/softnic: add softnic PMD Jasvinder Singh
` (5 more replies)
0 siblings, 6 replies; 79+ messages in thread
From: Jasvinder Singh @ 2017-10-06 16:59 UTC (permalink / raw)
To: dev; +Cc: cristian.dumitrescu, ferruh.yigit, thomas, wenzhuo.lu
The SoftNIC PMD is intended to provide SW fall-back options for specific
ethdev APIs in a generic way to the NICs not supporting those features.
Currently, the only implemented ethdev API is Traffic Management (TM),
but other ethdev APIs such as rte_flow, traffic metering & policing, etc.
can easily be implemented.
Overview:
* Generic: The SoftNIC PMD works with any "hard" PMD that implements the
ethdev API. It does not change the "hard" PMD in any way.
* Creation: For any given "hard" ethdev port, the user can decide to
create an associated "soft" ethdev port to drive the "hard" port. The
"soft" port is a virtual device that can be created at app start-up
through EAL vdev arg or later through the virtual device API.
* Configuration: The app explicitly decides which features are to be
enabled on the "soft" port and which features are still to be used from
the "hard" port. The app continues to explicitly configure both the
"hard" and the "soft" ports after the creation of the "soft" port.
* RX/TX: The app reads packets from/writes packets to the "soft" port
instead of the "hard" port. The RX and TX queues of the "soft" port are
thread safe, as any ethdev.
* Execution: The "soft" port is a feature-rich NIC implemented by the CPU,
so the run function of the "soft" port has to be executed by the CPU in
order to get packets moving between "hard" port and the app.
* Meets the NFV vision: The app should be (almost) agnostic about the NIC
implementation (different vendors/models, HW-SW mix), the app should not
require changes to use different NICs, the app should use the same API
for all NICs. If a NIC does not implement a specific feature, the HW
should be augmented with SW to meet the functionality while still
preserving the same API.
Traffic Management SW fall-back overview:
* Implements the ethdev traffic management API (rte_tm.h).
* Based on the existing librte_sched DPDK library.
Example: Create "soft" port for "hard" port "0000:04:00.1", enable the TM
feature with default settings:
--vdev 'net_softnic0,hard_name=0000:04:00.1,soft_tm=on'
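The soft port can also be created at run time from the application; a minimal usage sketch (assuming the rte_vdev_init() helper from rte_vdev.h and the rte_pmd_softnic_run() function added by this series; error handling omitted):

	rte_vdev_init("net_softnic0", "hard_name=0000:04:00.1,soft_tm=on");

	uint8_t soft_port_id;
	rte_eth_dev_get_port_by_name("net_softnic0", &soft_port_id);

	/* Configure and start the soft port as any other ethdev, then have
	 * one lcore periodically advance it, e.g. from its main loop:
	 */
	for ( ; ; ) {
		/* ... app RX/TX bursts on the soft port ... */
		rte_pmd_softnic_run(soft_port_id);
	}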
Q1: Why generic name, if only TM is supported (for now)?
A1: The intention is to have SoftNIC PMD implement many other (all?)
ethdev APIs under a single "ideal" ethdev, hence the generic name.
The initial motivation is TM API, but the mechanism is generic and can
be used for many other ethdev APIs. Somebody looking to provide SW
fall-back for another ethdev API is likely to end up reinventing the same,
hence it would be good to consolidate all under a single PMD and have
the user explicitly enable/disable the features they need for each
"soft" device.
Q2: Are there any performance requirements for SoftNIC?
A2: Yes, performance should be great/decent for every feature, otherwise
the SW fall-back is unusable, thus useless.
Q3: Why not change the "hard" device (and keep a single device) instead of
creating a new "soft" device (and thus having two devices)?
A3: This is not possible with the current librte_ether ethdev
implementation. The ethdev->dev_ops are defined as constant structure,
so it cannot be changed per device (nor per PMD). The new ops also
need memory space to store their context data structures, which
requires updating the ethdev->data->dev_private of the existing
device; at best, maybe a resize of ethdev->data->dev_private could be
done, assuming that librte_ether will introduce a way to find out its
size, but this cannot be done while device is running. Other side
effects might exist, as the changes are very intrusive, plus it likely
needs more changes in librte_ether.
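For reference, this is the pattern A3 refers to; every PMD (including the one in this series) declares its callbacks as a shared constant table that each port merely points to, so individual callbacks cannot be overridden per device at run time:

	static const struct eth_dev_ops pmd_ops = {
		.dev_configure = pmd_dev_configure,
		.dev_start = pmd_dev_start,
		/* ... */
	};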
Q4: Why not call the SW fall-back dev_ops directly in librte_ether for
devices which do not support the specific feature? If the device
supports the capability, let's call its dev_ops, otherwise call the
SW fall-back dev_ops.
A4: First, similar reasons to Q&A3. This fixes the need to change
ethdev->dev_ops of the device, but it does not do anything to fix the
other significant issue of where to store the context data structures
needed by the SW fall-back functions (which, in this approach, are
called implicitly by librte_ether).
Second, the SW fall-back options should not be restricted arbitrarily
by the librte_ether library, the decision should belong to the app.
For example, the TM SW fall-back should not be limited to only
librte_sched, which (like any SW fall-back) is limited to a specific
hierarchy and feature set, it cannot do any possible hierarchy. If
alternatives exist, the one to use should be picked by the app, not by
the ethdev layer.
Q5: Why is the app required to continue to configure both the "hard" and
the "soft" devices even after the "soft" device has been created? Why
not hiding the "hard" device under the "soft" device and have the
"soft" device configure the "hard" device under the hood?
A5: This was the approach tried in the V2 of this patch set (overlay
"soft" device taking over the configuration of the underlay "hard"
device) and eventually dropped due to increased complexity of having
to keep the configuration of two distinct devices in sync with
librte_ether implementation that is not friendly towards such
approach. Basically, each ethdev API call for the overlay device
needs to configure the overlay device, invoke the same configuration
with possibly modified parameters for the underlay device, then resume
the configuration of overlay device, turning this into a device
emulation project.
V2 minuses: increased complexity (deal with two devices at same time);
need to implement every ethdev API, even those not needed for the scope
of SW fall-back; intrusive; sometimes have to silently take decisions
that should be left to the app.
V3 pluses: lower complexity (only one device); only need to implement
those APIs that are in scope of the SW fall-back; non-intrusive (deal
with "hard" device through ethdev API); app decisions taken by the app
in an explicit way.
Q6: Why expose the SW fall-back in a PMD and not in a SW library?
A6: The SW fall-back for an ethdev API has to implement that specific
ethdev API (hence expose an ethdev object through a PMD), as opposed
to providing a different API. This approach allows the app to use the
same API (NFV vision). For example, we already have a library for TM
SW fall-back (librte_sched) that can be called directly by the apps
that need to call it outside of ethdev context (use-cases exist), but
an app that works with TM-aware NICs through the ethdev TM API would
have to be changed significantly in order to work with different
TM-agnostic NICs through the librte_sched API.
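To illustrate, the application-side sketch below stays identical whether port_id refers to a TM-capable hard NIC or to a softnic port wrapping a TM-agnostic one (fragment only, setup and error handling omitted):

	struct rte_tm_error error;

	/* Same ethdev TM calls for any port that exposes the TM API */
	rte_tm_node_add(port_id, node_id, parent_node_id, priority, weight,
		level_id, &node_params, &error);
	rte_tm_hierarchy_commit(port_id, 0, &error);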
Q7: Why have all the SW fall-backs in a single PMD? Why not develop
the SW fall-back for each different ethdev API in a separate PMD, then
create a chain of "soft" devices for each "hard" device? Potentially,
this results in smaller size PMDs that are easier to maintain.
A7: Arguments for single ethdev/PMD and against chain of ethdevs/PMDs:
1. All the existing PMDs for HW NICs implement a lot of features under
the same PMD, so there is no reason for the single PMD approach to break
code modularity. See the V3 code, a lot of care has been taken for
code modularity.
2. We should avoid the proliferation of SW PMDs.
3. A single device should be handled by a single PMD.
4. People are used to feature-rich PMDs, not to single-feature
PMDs, so why change this mindset?
5. [Configuration nightmare] A chain of "soft" devices attached to
single "hard" device requires the app to be aware that the N "soft"
devices in the chain plus the "hard" device refer to the same HW
device, and which device should be invoked to configure which
feature. Also the length of the chain and functionality of each
link is different for each HW device. This breaks the requirement
of preserving the same API while working with different NICs (NFV).
This most likely results in a configuration nightmare, nobody is
going to seriously use this.
6. [Feature inter-dependency] Sometimes different features need to be
configured and executed together (e.g. share the same set of
resources, are inter-dependent, etc), so it is better and more
performant to do them in the same ethdev/PMD.
7. [Code duplication] There is a lot of duplication in the
configuration code for the chain of ethdevs approach. The ethdev
dev_configure, rx_queue_setup, tx_queue_setup API functions have to
be implemented per device, and they become meaningless/inconsistent
with the chain approach.
8. [Data structure duplication] The per device data structures have to
be duplicated and read repeatedly for each "soft" ethdev. The
ethdev device, dev_private, data, per RX/TX queue data structures
have to be replicated per "soft" device. They have to be re-read for
each stage, so the same cache misses are now multiplied with the
number of stages in the chain.
9. [rte_ring proliferation] Thread safety requirements for ethdev
RX/TX queues require an rte_ring to be used for every RX/TX queue
of each "soft" ethdev. This rte_ring proliferation unnecessarily
increases the memory footprint and lowers performance, especially
when each "soft" ethdev ends up on a different CPU core (ping-pong
of cache lines).
10. [Meta-data proliferation] A chain of ethdevs is likely to result
in proliferation of meta-data that has to be passed between the
ethdevs (e.g. policing needs the output of flow classification),
which results in more cache line ping-pong between cores, hence
performance drops.
Cristian Dumitrescu (4):
Jasvinder Singh (4):
net/softnic: add softnic PMD
net/softnic: add traffic management support
net/softnic: add TM capabilities ops
net/softnic: add TM hierarchy related ops
Jasvinder Singh (1):
app/testpmd: add traffic management forwarding mode
MAINTAINERS | 5 +
app/test-pmd/Makefile | 8 +
app/test-pmd/cmdline.c | 88 +
app/test-pmd/testpmd.c | 15 +
app/test-pmd/testpmd.h | 46 +
app/test-pmd/tm.c | 865 +++++
config/common_base | 5 +
doc/api/doxy-api-index.md | 3 +-
doc/api/doxy-api.conf | 1 +
doc/guides/rel_notes/release_17_11.rst | 6 +
drivers/net/Makefile | 5 +
drivers/net/softnic/Makefile | 57 +
drivers/net/softnic/rte_eth_softnic.c | 852 +++++
drivers/net/softnic/rte_eth_softnic.h | 83 +
drivers/net/softnic/rte_eth_softnic_internals.h | 291 ++
drivers/net/softnic/rte_eth_softnic_tm.c | 3452 ++++++++++++++++++++
.../net/softnic/rte_pmd_eth_softnic_version.map | 7 +
mk/rte.app.mk | 5 +-
18 files changed, 5792 insertions(+), 2 deletions(-)
create mode 100644 app/test-pmd/tm.c
create mode 100644 drivers/net/softnic/Makefile
create mode 100644 drivers/net/softnic/rte_eth_softnic.c
create mode 100644 drivers/net/softnic/rte_eth_softnic.h
create mode 100644 drivers/net/softnic/rte_eth_softnic_internals.h
create mode 100644 drivers/net/softnic/rte_eth_softnic_tm.c
create mode 100644 drivers/net/softnic/rte_pmd_eth_softnic_version.map
Series Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Series Acked-by: Thomas Monjalon <thomas@monjalon.net>
--
2.9.3
^ permalink raw reply [flat|nested] 79+ messages in thread
* [dpdk-dev] [PATCH v6 1/5] net/softnic: add softnic PMD
2017-10-06 16:59 ` [dpdk-dev] [PATCH v6 0/5] net/softnic: sw fall-back pmd for traffic mgmt and others Jasvinder Singh
@ 2017-10-06 16:59 ` Jasvinder Singh
2017-10-09 12:58 ` [dpdk-dev] [PATCH v7 0/5] net/softnic: sw fall-back pmd for traffic mgmt and others Jasvinder Singh
2017-10-06 17:00 ` [dpdk-dev] [PATCH v6 2/5] net/softnic: add traffic management support Jasvinder Singh
` (4 subsequent siblings)
5 siblings, 1 reply; 79+ messages in thread
From: Jasvinder Singh @ 2017-10-06 16:59 UTC (permalink / raw)
To: dev; +Cc: cristian.dumitrescu, ferruh.yigit, thomas, wenzhuo.lu
Add SoftNIC PMD to provide SW fall-back for ethdev APIs.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
---
v5 changes:
- change function name rte_pmd_softnic_run_default() to run_default()
v4 changes:
- Implemented feedback from Ferruh [1]
- rename map file to rte_pmd_eth_softnic_version.map
- add release notes library version info
- doxygen: fix hooks in doc/api/doxy-api-index.md
- add doxygen comment for rte_pmd_softnic_run()
- free device name memory
- remove soft_dev param in pmd_ethdev_register()
- fix checkpatch warnings
v3 changes:
- rebase to dpdk17.08 release
v2 changes:
- fix build errors
- rebased to TM APIs v6 plus dpdk master
[1] Feedback from Ferruh on v3: http://dpdk.org/ml/archives/dev/2017-September/074576.html
MAINTAINERS | 5 +
config/common_base | 5 +
doc/api/doxy-api-index.md | 3 +-
doc/api/doxy-api.conf | 1 +
doc/guides/rel_notes/release_17_11.rst | 6 +
drivers/net/Makefile | 5 +
drivers/net/softnic/Makefile | 56 ++
drivers/net/softnic/rte_eth_softnic.c | 591 +++++++++++++++++++++
drivers/net/softnic/rte_eth_softnic.h | 67 +++
drivers/net/softnic/rte_eth_softnic_internals.h | 114 ++++
.../net/softnic/rte_pmd_eth_softnic_version.map | 7 +
mk/rte.app.mk | 5 +-
12 files changed, 863 insertions(+), 2 deletions(-)
create mode 100644 drivers/net/softnic/Makefile
create mode 100644 drivers/net/softnic/rte_eth_softnic.c
create mode 100644 drivers/net/softnic/rte_eth_softnic.h
create mode 100644 drivers/net/softnic/rte_eth_softnic_internals.h
create mode 100644 drivers/net/softnic/rte_pmd_eth_softnic_version.map
diff --git a/MAINTAINERS b/MAINTAINERS
index cd0d6bc..91b8afe 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -524,6 +524,11 @@ M: Gaetan Rivet <gaetan.rivet@6wind.com>
F: drivers/net/failsafe/
F: doc/guides/nics/fail_safe.rst
+Softnic PMD
+M: Jasvinder Singh <jasvinder.singh@intel.com>
+M: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
+F: drivers/net/softnic
+
Crypto Drivers
--------------
diff --git a/config/common_base b/config/common_base
index ca47615..3d10c37 100644
--- a/config/common_base
+++ b/config/common_base
@@ -271,6 +271,11 @@ CONFIG_RTE_LIBRTE_SFC_EFX_PMD=y
CONFIG_RTE_LIBRTE_SFC_EFX_DEBUG=n
#
+# Compile SOFTNIC PMD
+#
+CONFIG_RTE_LIBRTE_PMD_SOFTNIC=y
+
+#
# Compile software PMD backed by SZEDATA2 device
#
CONFIG_RTE_LIBRTE_PMD_SZEDATA2=n
diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
index 19e0d4f..626ab51 100644
--- a/doc/api/doxy-api-index.md
+++ b/doc/api/doxy-api-index.md
@@ -55,7 +55,8 @@ The public API headers are grouped by topics:
[KNI] (@ref rte_kni.h),
[ixgbe] (@ref rte_pmd_ixgbe.h),
[i40e] (@ref rte_pmd_i40e.h),
- [crypto_scheduler] (@ref rte_cryptodev_scheduler.h)
+ [crypto_scheduler] (@ref rte_cryptodev_scheduler.h),
+ [softnic] (@ref rte_eth_softnic.h)
- **memory**:
[memseg] (@ref rte_memory.h),
diff --git a/doc/api/doxy-api.conf b/doc/api/doxy-api.conf
index 823554f..b27755d 100644
--- a/doc/api/doxy-api.conf
+++ b/doc/api/doxy-api.conf
@@ -32,6 +32,7 @@ PROJECT_NAME = DPDK
INPUT = doc/api/doxy-api-index.md \
drivers/crypto/scheduler \
drivers/net/bonding \
+ drivers/net/softnic \
drivers/net/i40e \
drivers/net/ixgbe \
lib/librte_eal/common/include \
diff --git a/doc/guides/rel_notes/release_17_11.rst b/doc/guides/rel_notes/release_17_11.rst
index 60f7097..91879f7 100644
--- a/doc/guides/rel_notes/release_17_11.rst
+++ b/doc/guides/rel_notes/release_17_11.rst
@@ -58,6 +58,11 @@ New Features
* Support for Flow API
* Support for Tx and Rx descriptor status functions
+* **Added SoftNIC PMD.**
+
+ Added new SoftNIC PMD. This virtual device offers applications software
+ fallback support for traffic management.
+
Resolved Issues
---------------
@@ -211,6 +216,7 @@ The libraries prepended with a plus sign were incremented in this version.
librte_pmd_ixgbe.so.1
librte_pmd_ring.so.2
librte_pmd_vhost.so.1
+ + librte_pmd_softnic.so.1
librte_port.so.3
librte_power.so.1
librte_reorder.so.1
diff --git a/drivers/net/Makefile b/drivers/net/Makefile
index 2bd42f8..218dd3c 100644
--- a/drivers/net/Makefile
+++ b/drivers/net/Makefile
@@ -112,4 +112,9 @@ DIRS-$(CONFIG_RTE_LIBRTE_PMD_VHOST) += vhost
endif # $(CONFIG_RTE_LIBRTE_VHOST)
DEPDIRS-vhost = $(core-libs) librte_vhost
+ifeq ($(CONFIG_RTE_LIBRTE_SCHED),y)
+DIRS-$(CONFIG_RTE_LIBRTE_PMD_SOFTNIC) += softnic
+endif # $(CONFIG_RTE_LIBRTE_SCHED)
+DEPDIRS-softnic = $(core-libs) librte_sched
+
include $(RTE_SDK)/mk/rte.subdir.mk
diff --git a/drivers/net/softnic/Makefile b/drivers/net/softnic/Makefile
new file mode 100644
index 0000000..c2f42ef
--- /dev/null
+++ b/drivers/net/softnic/Makefile
@@ -0,0 +1,56 @@
+# BSD LICENSE
+#
+# Copyright(c) 2017 Intel Corporation. All rights reserved.
+# All rights reserved.
+#
+# Redistribution and use in source and binary forms, with or without
+# modification, are permitted provided that the following conditions
+# are met:
+#
+# * Redistributions of source code must retain the above copyright
+# notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+# notice, this list of conditions and the following disclaimer in
+# the documentation and/or other materials provided with the
+# distribution.
+# * Neither the name of Intel Corporation nor the names of its
+# contributors may be used to endorse or promote products derived
+# from this software without specific prior written permission.
+#
+# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+#
+# library name
+#
+LIB = librte_pmd_softnic.a
+
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS)
+
+EXPORT_MAP := rte_pmd_eth_softnic_version.map
+
+LIBABIVER := 1
+
+#
+# all source are stored in SRCS-y
+#
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_SOFTNIC) += rte_eth_softnic.c
+
+#
+# Export include files
+#
+SYMLINK-y-include += rte_eth_softnic.h
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/drivers/net/softnic/rte_eth_softnic.c b/drivers/net/softnic/rte_eth_softnic.c
new file mode 100644
index 0000000..aa5ea8b
--- /dev/null
+++ b/drivers/net/softnic/rte_eth_softnic.c
@@ -0,0 +1,591 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright(c) 2017 Intel Corporation. All rights reserved.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdint.h>
+#include <stdlib.h>
+#include <string.h>
+
+#include <rte_ethdev.h>
+#include <rte_ethdev_vdev.h>
+#include <rte_malloc.h>
+#include <rte_vdev.h>
+#include <rte_kvargs.h>
+#include <rte_errno.h>
+#include <rte_ring.h>
+
+#include "rte_eth_softnic.h"
+#include "rte_eth_softnic_internals.h"
+
+#define DEV_HARD(p) \
+ (&rte_eth_devices[p->hard.port_id])
+
+#define PMD_PARAM_HARD_NAME "hard_name"
+#define PMD_PARAM_HARD_TX_QUEUE_ID "hard_tx_queue_id"
+
+static const char *pmd_valid_args[] = {
+ PMD_PARAM_HARD_NAME,
+ PMD_PARAM_HARD_TX_QUEUE_ID,
+ NULL
+};
+
+static const struct rte_eth_dev_info pmd_dev_info = {
+ .min_rx_bufsize = 0,
+ .max_rx_pktlen = UINT32_MAX,
+ .max_rx_queues = UINT16_MAX,
+ .max_tx_queues = UINT16_MAX,
+ .rx_desc_lim = {
+ .nb_max = UINT16_MAX,
+ .nb_min = 0,
+ .nb_align = 1,
+ },
+ .tx_desc_lim = {
+ .nb_max = UINT16_MAX,
+ .nb_min = 0,
+ .nb_align = 1,
+ },
+};
+
+static void
+pmd_dev_infos_get(struct rte_eth_dev *dev __rte_unused,
+ struct rte_eth_dev_info *dev_info)
+{
+ memcpy(dev_info, &pmd_dev_info, sizeof(*dev_info));
+}
+
+static int
+pmd_dev_configure(struct rte_eth_dev *dev)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct rte_eth_dev *hard_dev = DEV_HARD(p);
+
+ if (dev->data->nb_rx_queues > hard_dev->data->nb_rx_queues)
+ return -1;
+
+ if (p->params.hard.tx_queue_id >= hard_dev->data->nb_tx_queues)
+ return -1;
+
+ return 0;
+}
+
+static int
+pmd_rx_queue_setup(struct rte_eth_dev *dev,
+ uint16_t rx_queue_id,
+ uint16_t nb_rx_desc __rte_unused,
+ unsigned int socket_id,
+ const struct rte_eth_rxconf *rx_conf __rte_unused,
+ struct rte_mempool *mb_pool __rte_unused)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ if (p->params.soft.intrusive == 0) {
+ struct pmd_rx_queue *rxq;
+
+ rxq = rte_zmalloc_socket(p->params.soft.name,
+ sizeof(struct pmd_rx_queue), 0, socket_id);
+ if (rxq == NULL)
+ return -ENOMEM;
+
+ rxq->hard.port_id = p->hard.port_id;
+ rxq->hard.rx_queue_id = rx_queue_id;
+ dev->data->rx_queues[rx_queue_id] = rxq;
+ } else {
+ struct rte_eth_dev *hard_dev = DEV_HARD(p);
+ void *rxq = hard_dev->data->rx_queues[rx_queue_id];
+
+ if (rxq == NULL)
+ return -1;
+
+ dev->data->rx_queues[rx_queue_id] = rxq;
+ }
+ return 0;
+}
+
+static int
+pmd_tx_queue_setup(struct rte_eth_dev *dev,
+ uint16_t tx_queue_id,
+ uint16_t nb_tx_desc,
+ unsigned int socket_id,
+ const struct rte_eth_txconf *tx_conf __rte_unused)
+{
+ uint32_t size = RTE_ETH_NAME_MAX_LEN + strlen("_txq") + 4;
+ char name[size];
+ struct rte_ring *r;
+
+ snprintf(name, sizeof(name), "%s_txq%04x",
+ dev->data->name, tx_queue_id);
+ r = rte_ring_create(name, nb_tx_desc, socket_id,
+ RING_F_SP_ENQ | RING_F_SC_DEQ);
+ if (r == NULL)
+ return -1;
+
+ dev->data->tx_queues[tx_queue_id] = r;
+ return 0;
+}
+
+static int
+pmd_dev_start(struct rte_eth_dev *dev)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ dev->data->dev_link.link_status = ETH_LINK_UP;
+
+ if (p->params.soft.intrusive) {
+ struct rte_eth_dev *hard_dev = DEV_HARD(p);
+
+ /* The hard_dev->rx_pkt_burst should be stable by now */
+ dev->rx_pkt_burst = hard_dev->rx_pkt_burst;
+ }
+
+ return 0;
+}
+
+static void
+pmd_dev_stop(struct rte_eth_dev *dev)
+{
+ dev->data->dev_link.link_status = ETH_LINK_DOWN;
+}
+
+static void
+pmd_dev_close(struct rte_eth_dev *dev)
+{
+ uint32_t i;
+
+ /* TX queues */
+ for (i = 0; i < dev->data->nb_tx_queues; i++)
+ rte_ring_free((struct rte_ring *)dev->data->tx_queues[i]);
+}
+
+static int
+pmd_link_update(struct rte_eth_dev *dev __rte_unused,
+ int wait_to_complete __rte_unused)
+{
+ return 0;
+}
+
+static const struct eth_dev_ops pmd_ops = {
+ .dev_configure = pmd_dev_configure,
+ .dev_start = pmd_dev_start,
+ .dev_stop = pmd_dev_stop,
+ .dev_close = pmd_dev_close,
+ .link_update = pmd_link_update,
+ .dev_infos_get = pmd_dev_infos_get,
+ .rx_queue_setup = pmd_rx_queue_setup,
+ .tx_queue_setup = pmd_tx_queue_setup,
+ .tm_ops_get = NULL,
+};
+
+static uint16_t
+pmd_rx_pkt_burst(void *rxq,
+ struct rte_mbuf **rx_pkts,
+ uint16_t nb_pkts)
+{
+ struct pmd_rx_queue *rx_queue = rxq;
+
+ return rte_eth_rx_burst(rx_queue->hard.port_id,
+ rx_queue->hard.rx_queue_id,
+ rx_pkts,
+ nb_pkts);
+}
+
+static uint16_t
+pmd_tx_pkt_burst(void *txq,
+ struct rte_mbuf **tx_pkts,
+ uint16_t nb_pkts)
+{
+ return (uint16_t)rte_ring_enqueue_burst(txq,
+ (void **)tx_pkts,
+ nb_pkts,
+ NULL);
+}
+
+static __rte_always_inline int
+run_default(struct rte_eth_dev *dev)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ /* Persistent context: Read Only (update not required) */
+ struct rte_mbuf **pkts = p->soft.def.pkts;
+ uint16_t nb_tx_queues = dev->data->nb_tx_queues;
+
+ /* Persistent context: Read - Write (update required) */
+ uint32_t txq_pos = p->soft.def.txq_pos;
+ uint32_t pkts_len = p->soft.def.pkts_len;
+ uint32_t flush_count = p->soft.def.flush_count;
+
+ /* Not part of the persistent context */
+ uint32_t pos;
+ uint16_t i;
+
+ /* Soft device TXQ read, Hard device TXQ write */
+ for (i = 0; i < nb_tx_queues; i++) {
+ struct rte_ring *txq = dev->data->tx_queues[txq_pos];
+
+ /* Read soft device TXQ burst to packet enqueue buffer */
+ pkts_len += rte_ring_sc_dequeue_burst(txq,
+ (void **)&pkts[pkts_len],
+ DEFAULT_BURST_SIZE,
+ NULL);
+
+ /* Increment soft device TXQ */
+ txq_pos++;
+ if (txq_pos >= nb_tx_queues)
+ txq_pos = 0;
+
+ /* Hard device TXQ write when complete burst is available */
+ if (pkts_len >= DEFAULT_BURST_SIZE) {
+ for (pos = 0; pos < pkts_len; )
+ pos += rte_eth_tx_burst(p->hard.port_id,
+ p->params.hard.tx_queue_id,
+ &pkts[pos],
+ (uint16_t)(pkts_len - pos));
+
+ pkts_len = 0;
+ flush_count = 0;
+ break;
+ }
+ }
+
+ if (flush_count >= FLUSH_COUNT_THRESHOLD) {
+ for (pos = 0; pos < pkts_len; )
+ pos += rte_eth_tx_burst(p->hard.port_id,
+ p->params.hard.tx_queue_id,
+ &pkts[pos],
+ (uint16_t)(pkts_len - pos));
+
+ pkts_len = 0;
+ flush_count = 0;
+ }
+
+ p->soft.def.txq_pos = txq_pos;
+ p->soft.def.pkts_len = pkts_len;
+ p->soft.def.flush_count = flush_count + 1;
+
+ return 0;
+}
+
+int
+rte_pmd_softnic_run(uint8_t port_id)
+{
+ struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+
+#ifdef RTE_LIBRTE_ETHDEV_DEBUG
+ RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, 0);
+#endif
+
+ return run_default(dev);
+}
+
+static struct ether_addr eth_addr = { .addr_bytes = {0} };
+
+static uint32_t
+eth_dev_speed_max_mbps(uint32_t speed_capa)
+{
+ uint32_t rate_mbps[32] = {
+ ETH_SPEED_NUM_NONE,
+ ETH_SPEED_NUM_10M,
+ ETH_SPEED_NUM_10M,
+ ETH_SPEED_NUM_100M,
+ ETH_SPEED_NUM_100M,
+ ETH_SPEED_NUM_1G,
+ ETH_SPEED_NUM_2_5G,
+ ETH_SPEED_NUM_5G,
+ ETH_SPEED_NUM_10G,
+ ETH_SPEED_NUM_20G,
+ ETH_SPEED_NUM_25G,
+ ETH_SPEED_NUM_40G,
+ ETH_SPEED_NUM_50G,
+ ETH_SPEED_NUM_56G,
+ ETH_SPEED_NUM_100G,
+ };
+
+ uint32_t pos = (speed_capa) ? (31 - __builtin_clz(speed_capa)) : 0;
+ return rate_mbps[pos];
+}
+
+static int
+default_init(struct pmd_internals *p,
+ struct pmd_params *params,
+ int numa_node)
+{
+ p->soft.def.pkts = rte_zmalloc_socket(params->soft.name,
+ 2 * DEFAULT_BURST_SIZE * sizeof(struct rte_mbuf *),
+ 0,
+ numa_node);
+
+ if (p->soft.def.pkts == NULL)
+ return -ENOMEM;
+
+ return 0;
+}
+
+static void
+default_free(struct pmd_internals *p)
+{
+ rte_free(p->soft.def.pkts);
+}
+
+static void *
+pmd_init(struct pmd_params *params, int numa_node)
+{
+ struct pmd_internals *p;
+ int status;
+
+ p = rte_zmalloc_socket(params->soft.name,
+ sizeof(struct pmd_internals),
+ 0,
+ numa_node);
+ if (p == NULL)
+ return NULL;
+
+ memcpy(&p->params, params, sizeof(p->params));
+ rte_eth_dev_get_port_by_name(params->hard.name, &p->hard.port_id);
+
+ /* Default */
+ status = default_init(p, params, numa_node);
+ if (status) {
+ free(p->params.hard.name);
+ rte_free(p);
+ return NULL;
+ }
+
+ return p;
+}
+
+static void
+pmd_free(struct pmd_internals *p)
+{
+ default_free(p);
+
+ free(p->params.hard.name);
+ rte_free(p);
+}
+
+static int
+pmd_ethdev_register(struct rte_vdev_device *vdev,
+ struct pmd_params *params,
+ void *dev_private)
+{
+ struct rte_eth_dev_info hard_info;
+ struct rte_eth_dev *soft_dev;
+ uint32_t hard_speed;
+ int numa_node;
+ uint8_t hard_port_id;
+
+ rte_eth_dev_get_port_by_name(params->hard.name, &hard_port_id);
+ rte_eth_dev_info_get(hard_port_id, &hard_info);
+ hard_speed = eth_dev_speed_max_mbps(hard_info.speed_capa);
+ numa_node = rte_eth_dev_socket_id(hard_port_id);
+
+ /* Ethdev entry allocation */
+ soft_dev = rte_eth_dev_allocate(params->soft.name);
+ if (!soft_dev)
+ return -ENOMEM;
+
+ /* dev */
+ soft_dev->rx_pkt_burst = (params->soft.intrusive) ?
+ NULL : /* set up later */
+ pmd_rx_pkt_burst;
+ soft_dev->tx_pkt_burst = pmd_tx_pkt_burst;
+ soft_dev->tx_pkt_prepare = NULL;
+ soft_dev->dev_ops = &pmd_ops;
+ soft_dev->device = &vdev->device;
+
+ /* dev->data */
+ soft_dev->data->dev_private = dev_private;
+ soft_dev->data->dev_link.link_speed = hard_speed;
+ soft_dev->data->dev_link.link_duplex = ETH_LINK_FULL_DUPLEX;
+ soft_dev->data->dev_link.link_autoneg = ETH_LINK_SPEED_FIXED;
+ soft_dev->data->dev_link.link_status = ETH_LINK_DOWN;
+ soft_dev->data->mac_addrs = &eth_addr;
+ soft_dev->data->promiscuous = 1;
+ soft_dev->data->kdrv = RTE_KDRV_NONE;
+ soft_dev->data->numa_node = numa_node;
+ soft_dev->data->dev_flags = RTE_ETH_DEV_DETACHABLE;
+
+ return 0;
+}
+
+static int
+get_string(const char *key __rte_unused, const char *value, void *extra_args)
+{
+ if (!value || !extra_args)
+ return -EINVAL;
+
+ *(char **)extra_args = strdup(value);
+
+ if (!*(char **)extra_args)
+ return -ENOMEM;
+
+ return 0;
+}
+
+static int
+get_uint32(const char *key __rte_unused, const char *value, void *extra_args)
+{
+ if (!value || !extra_args)
+ return -EINVAL;
+
+ *(uint32_t *)extra_args = strtoull(value, NULL, 0);
+
+ return 0;
+}
+
+static int
+pmd_parse_args(struct pmd_params *p, const char *name, const char *params)
+{
+ struct rte_kvargs *kvlist;
+ int ret;
+
+ kvlist = rte_kvargs_parse(params, pmd_valid_args);
+ if (kvlist == NULL)
+ return -EINVAL;
+
+ /* Set default values */
+ memset(p, 0, sizeof(*p));
+ p->soft.name = name;
+ p->soft.intrusive = INTRUSIVE;
+ p->hard.tx_queue_id = SOFTNIC_HARD_TX_QUEUE_ID;
+
+ /* HARD: name (mandatory) */
+ if (rte_kvargs_count(kvlist, PMD_PARAM_HARD_NAME) == 1) {
+ ret = rte_kvargs_process(kvlist, PMD_PARAM_HARD_NAME,
+ &get_string, &p->hard.name);
+ if (ret < 0)
+ goto out_free;
+ } else {
+ ret = -EINVAL;
+ goto out_free;
+ }
+
+ /* HARD: tx_queue_id (optional) */
+ if (rte_kvargs_count(kvlist, PMD_PARAM_HARD_TX_QUEUE_ID) == 1) {
+ ret = rte_kvargs_process(kvlist, PMD_PARAM_HARD_TX_QUEUE_ID,
+ &get_uint32, &p->hard.tx_queue_id);
+ if (ret < 0)
+ goto out_free;
+ }
+
+out_free:
+ rte_kvargs_free(kvlist);
+ return ret;
+}
+
+static int
+pmd_probe(struct rte_vdev_device *vdev)
+{
+ struct pmd_params p;
+ const char *params;
+ int status;
+
+ struct rte_eth_dev_info hard_info;
+ uint8_t hard_port_id;
+ int numa_node;
+ void *dev_private;
+
+ RTE_LOG(INFO, PMD,
+ "Probing device \"%s\"\n",
+ rte_vdev_device_name(vdev));
+
+ /* Parse input arguments */
+ params = rte_vdev_device_args(vdev);
+ if (!params)
+ return -EINVAL;
+
+ status = pmd_parse_args(&p, rte_vdev_device_name(vdev), params);
+ if (status)
+ return status;
+
+ /* Check input arguments */
+ if (rte_eth_dev_get_port_by_name(p.hard.name, &hard_port_id))
+ return -EINVAL;
+
+ rte_eth_dev_info_get(hard_port_id, &hard_info);
+ numa_node = rte_eth_dev_socket_id(hard_port_id);
+
+ if (p.hard.tx_queue_id >= hard_info.max_tx_queues)
+ return -EINVAL;
+
+ /* Allocate and initialize soft ethdev private data */
+ dev_private = pmd_init(&p, numa_node);
+ if (dev_private == NULL)
+ return -ENOMEM;
+
+ /* Register soft ethdev */
+ RTE_LOG(INFO, PMD,
+ "Creating soft ethdev \"%s\" for hard ethdev \"%s\"\n",
+ p.soft.name, p.hard.name);
+
+ status = pmd_ethdev_register(vdev, &p, dev_private);
+ if (status) {
+ pmd_free(dev_private);
+ return status;
+ }
+
+ return 0;
+}
+
+static int
+pmd_remove(struct rte_vdev_device *vdev)
+{
+ struct rte_eth_dev *dev = NULL;
+ struct pmd_internals *p;
+
+ if (!vdev)
+ return -EINVAL;
+
+ RTE_LOG(INFO, PMD, "Removing device \"%s\"\n",
+ rte_vdev_device_name(vdev));
+
+ /* Find the ethdev entry */
+ dev = rte_eth_dev_allocated(rte_vdev_device_name(vdev));
+ if (dev == NULL)
+ return -ENODEV;
+ p = dev->data->dev_private;
+
+ /* Free device data structures*/
+ pmd_free(p);
+ rte_free(dev->data);
+ rte_eth_dev_release_port(dev);
+
+ return 0;
+}
+
+static struct rte_vdev_driver pmd_softnic_drv = {
+ .probe = pmd_probe,
+ .remove = pmd_remove,
+};
+
+RTE_PMD_REGISTER_VDEV(net_softnic, pmd_softnic_drv);
+RTE_PMD_REGISTER_PARAM_STRING(net_softnic,
+ PMD_PARAM_HARD_NAME "=<string> "
+ PMD_PARAM_HARD_TX_QUEUE_ID "=<int>");
diff --git a/drivers/net/softnic/rte_eth_softnic.h b/drivers/net/softnic/rte_eth_softnic.h
new file mode 100644
index 0000000..e6996f3
--- /dev/null
+++ b/drivers/net/softnic/rte_eth_softnic.h
@@ -0,0 +1,67 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright(c) 2017 Intel Corporation. All rights reserved.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef __INCLUDE_RTE_ETH_SOFTNIC_H__
+#define __INCLUDE_RTE_ETH_SOFTNIC_H__
+
+#include <stdint.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#ifndef SOFTNIC_HARD_TX_QUEUE_ID
+#define SOFTNIC_HARD_TX_QUEUE_ID 0
+#endif
+
+/**
+ * Run the traffic management function on the softnic device
+ *
+ * This function reads the packets from the softnic input queues, inserts
+ * them into the QoS scheduler queues based on the mbuf sched field value
+ * and transmits the scheduled packets out through the hard device interface.
+ *
+ * @param port_id
+ * Port id of the soft device.
+ * @return
+ * Zero.
+ */
+
+int
+rte_pmd_softnic_run(uint8_t port_id);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* __INCLUDE_RTE_ETH_SOFTNIC_H__ */
diff --git a/drivers/net/softnic/rte_eth_softnic_internals.h b/drivers/net/softnic/rte_eth_softnic_internals.h
new file mode 100644
index 0000000..96995b5
--- /dev/null
+++ b/drivers/net/softnic/rte_eth_softnic_internals.h
@@ -0,0 +1,114 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright(c) 2017 Intel Corporation. All rights reserved.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef __INCLUDE_RTE_ETH_SOFTNIC_INTERNALS_H__
+#define __INCLUDE_RTE_ETH_SOFTNIC_INTERNALS_H__
+
+#include <stdint.h>
+
+#include <rte_mbuf.h>
+#include <rte_ethdev.h>
+
+#include "rte_eth_softnic.h"
+
+#ifndef INTRUSIVE
+#define INTRUSIVE 0
+#endif
+
+struct pmd_params {
+ /** Parameters for the soft device (to be created) */
+ struct {
+ const char *name; /**< Name */
+ uint32_t flags; /**< Flags */
+
+ /** 0 = Access hard device through API only (potentially slower,
+ * but safer);
+ * 1 = Direct access to the hard device private data structures is
+ * allowed (potentially faster).
+ */
+ int intrusive;
+ } soft;
+
+ /** Parameters for the hard device (existing) */
+ struct {
+ char *name; /**< Name */
+ uint16_t tx_queue_id; /**< TX queue ID */
+ } hard;
+};
+
+/**
+ * Default Internals
+ */
+
+#ifndef DEFAULT_BURST_SIZE
+#define DEFAULT_BURST_SIZE 32
+#endif
+
+#ifndef FLUSH_COUNT_THRESHOLD
+#define FLUSH_COUNT_THRESHOLD (1 << 17)
+#endif
+
+struct default_internals {
+ struct rte_mbuf **pkts;
+ uint32_t pkts_len;
+ uint32_t txq_pos;
+ uint32_t flush_count;
+};
+
+/**
+ * PMD Internals
+ */
+struct pmd_internals {
+ /** Params */
+ struct pmd_params params;
+
+ /** Soft device */
+ struct {
+ struct default_internals def; /**< Default */
+ } soft;
+
+ /** Hard device */
+ struct {
+ uint8_t port_id;
+ } hard;
+};
+
+struct pmd_rx_queue {
+ /** Hard device */
+ struct {
+ uint8_t port_id;
+ uint16_t rx_queue_id;
+ } hard;
+};
+
+#endif /* __INCLUDE_RTE_ETH_SOFTNIC_INTERNALS_H__ */
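For reference, the defaults above imply that the non-TM ("default") datapath works in bursts of DEFAULT_BURST_SIZE = 32 packets, and that a partially filled burst is force-flushed to the hard device after FLUSH_COUNT_THRESHOLD = 1 << 17 = 131,072 consecutive run invocations without reaching a full burst; this reading is inferred from how the analogous counters are used in the TM run loop added later in this series.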
diff --git a/drivers/net/softnic/rte_pmd_eth_softnic_version.map b/drivers/net/softnic/rte_pmd_eth_softnic_version.map
new file mode 100644
index 0000000..fb2cb68
--- /dev/null
+++ b/drivers/net/softnic/rte_pmd_eth_softnic_version.map
@@ -0,0 +1,7 @@
+DPDK_17.11 {
+ global:
+
+ rte_pmd_softnic_run;
+
+ local: *;
+};
diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index 29507dc..443a3ab 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -67,7 +67,6 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR) += -lrte_distributor
_LDLIBS-$(CONFIG_RTE_LIBRTE_IP_FRAG) += -lrte_ip_frag
_LDLIBS-$(CONFIG_RTE_LIBRTE_GRO) += -lrte_gro
_LDLIBS-$(CONFIG_RTE_LIBRTE_METER) += -lrte_meter
-_LDLIBS-$(CONFIG_RTE_LIBRTE_SCHED) += -lrte_sched
_LDLIBS-$(CONFIG_RTE_LIBRTE_LPM) += -lrte_lpm
# librte_acl needs --whole-archive because of weak functions
_LDLIBS-$(CONFIG_RTE_LIBRTE_ACL) += --whole-archive
@@ -99,6 +98,7 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_RING) += -lrte_ring
_LDLIBS-$(CONFIG_RTE_LIBRTE_EAL) += -lrte_eal
_LDLIBS-$(CONFIG_RTE_LIBRTE_CMDLINE) += -lrte_cmdline
_LDLIBS-$(CONFIG_RTE_LIBRTE_REORDER) += -lrte_reorder
+_LDLIBS-$(CONFIG_RTE_LIBRTE_SCHED) += -lrte_sched
ifeq ($(CONFIG_RTE_EXEC_ENV_LINUXAPP),y)
_LDLIBS-$(CONFIG_RTE_LIBRTE_KNI) += -lrte_kni
@@ -140,6 +140,9 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_NULL) += -lrte_pmd_null
_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_PCAP) += -lrte_pmd_pcap -lpcap
_LDLIBS-$(CONFIG_RTE_LIBRTE_QEDE_PMD) += -lrte_pmd_qede
_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_RING) += -lrte_pmd_ring
+ifeq ($(CONFIG_RTE_LIBRTE_SCHED),y)
+_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_SOFTNIC) += -lrte_pmd_softnic
+endif
_LDLIBS-$(CONFIG_RTE_LIBRTE_SFC_EFX_PMD) += -lrte_pmd_sfc_efx
_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_SZEDATA2) += -lrte_pmd_szedata2 -lsze2
_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_TAP) += -lrte_pmd_tap
--
2.9.3
* [dpdk-dev] [PATCH v6 2/5] net/softnic: add traffic management support
2017-10-06 16:59 ` [dpdk-dev] [PATCH v6 0/5] net/softnic: sw fall-back pmd for traffic mgmt and others Jasvinder Singh
2017-10-06 16:59 ` [dpdk-dev] [PATCH v6 1/5] net/softnic: add softnic PMD Jasvinder Singh
@ 2017-10-06 17:00 ` Jasvinder Singh
2017-10-06 17:00 ` [dpdk-dev] [PATCH v6 3/5] net/softnic: add TM capabilities ops Jasvinder Singh
` (3 subsequent siblings)
5 siblings, 0 replies; 79+ messages in thread
From: Jasvinder Singh @ 2017-10-06 17:00 UTC (permalink / raw)
To: dev; +Cc: cristian.dumitrescu, ferruh.yigit, thomas, wenzhuo.lu
Add ethdev Traffic Management API support to SoftNIC PMD.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Acked-by: Lu, Wenzhuo <wenzhuo.lu@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
---
v5 changes:
- change function name rte_pmd_softnic_run_tm() to run_tm()
v3 changes:
- add more configuration parameters (tm rate, tm queue sizes)
drivers/net/softnic/Makefile | 1 +
drivers/net/softnic/rte_eth_softnic.c | 255 +++++++++++++++++++++++-
drivers/net/softnic/rte_eth_softnic.h | 16 ++
drivers/net/softnic/rte_eth_softnic_internals.h | 104 ++++++++++
drivers/net/softnic/rte_eth_softnic_tm.c | 181 +++++++++++++++++
5 files changed, 555 insertions(+), 2 deletions(-)
create mode 100644 drivers/net/softnic/rte_eth_softnic_tm.c
diff --git a/drivers/net/softnic/Makefile b/drivers/net/softnic/Makefile
index c2f42ef..8b848a9 100644
--- a/drivers/net/softnic/Makefile
+++ b/drivers/net/softnic/Makefile
@@ -47,6 +47,7 @@ LIBABIVER := 1
# all source are stored in SRCS-y
#
SRCS-$(CONFIG_RTE_LIBRTE_PMD_SOFTNIC) += rte_eth_softnic.c
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_SOFTNIC) += rte_eth_softnic_tm.c
#
# Export include files
diff --git a/drivers/net/softnic/rte_eth_softnic.c b/drivers/net/softnic/rte_eth_softnic.c
index aa5ea8b..ab26948 100644
--- a/drivers/net/softnic/rte_eth_softnic.c
+++ b/drivers/net/softnic/rte_eth_softnic.c
@@ -42,6 +42,7 @@
#include <rte_kvargs.h>
#include <rte_errno.h>
#include <rte_ring.h>
+#include <rte_sched.h>
#include "rte_eth_softnic.h"
#include "rte_eth_softnic_internals.h"
@@ -49,10 +50,29 @@
#define DEV_HARD(p) \
(&rte_eth_devices[p->hard.port_id])
+#define PMD_PARAM_SOFT_TM "soft_tm"
+#define PMD_PARAM_SOFT_TM_RATE "soft_tm_rate"
+#define PMD_PARAM_SOFT_TM_NB_QUEUES "soft_tm_nb_queues"
+#define PMD_PARAM_SOFT_TM_QSIZE0 "soft_tm_qsize0"
+#define PMD_PARAM_SOFT_TM_QSIZE1 "soft_tm_qsize1"
+#define PMD_PARAM_SOFT_TM_QSIZE2 "soft_tm_qsize2"
+#define PMD_PARAM_SOFT_TM_QSIZE3 "soft_tm_qsize3"
+#define PMD_PARAM_SOFT_TM_ENQ_BSZ "soft_tm_enq_bsz"
+#define PMD_PARAM_SOFT_TM_DEQ_BSZ "soft_tm_deq_bsz"
+
#define PMD_PARAM_HARD_NAME "hard_name"
#define PMD_PARAM_HARD_TX_QUEUE_ID "hard_tx_queue_id"
static const char *pmd_valid_args[] = {
+ PMD_PARAM_SOFT_TM,
+ PMD_PARAM_SOFT_TM_RATE,
+ PMD_PARAM_SOFT_TM_NB_QUEUES,
+ PMD_PARAM_SOFT_TM_QSIZE0,
+ PMD_PARAM_SOFT_TM_QSIZE1,
+ PMD_PARAM_SOFT_TM_QSIZE2,
+ PMD_PARAM_SOFT_TM_QSIZE3,
+ PMD_PARAM_SOFT_TM_ENQ_BSZ,
+ PMD_PARAM_SOFT_TM_DEQ_BSZ,
PMD_PARAM_HARD_NAME,
PMD_PARAM_HARD_TX_QUEUE_ID,
NULL
@@ -157,6 +177,13 @@ pmd_dev_start(struct rte_eth_dev *dev)
{
struct pmd_internals *p = dev->data->dev_private;
+ if (tm_used(dev)) {
+ int status = tm_start(p);
+
+ if (status)
+ return status;
+ }
+
dev->data->dev_link.link_status = ETH_LINK_UP;
if (p->params.soft.intrusive) {
@@ -172,7 +199,12 @@ pmd_dev_start(struct rte_eth_dev *dev)
static void
pmd_dev_stop(struct rte_eth_dev *dev)
{
+ struct pmd_internals *p = dev->data->dev_private;
+
dev->data->dev_link.link_status = ETH_LINK_DOWN;
+
+ if (tm_used(dev))
+ tm_stop(p);
}
static void
@@ -293,6 +325,77 @@ run_default(struct rte_eth_dev *dev)
return 0;
}
+static __rte_always_inline int
+run_tm(struct rte_eth_dev *dev)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ /* Persistent context: Read Only (update not required) */
+ struct rte_sched_port *sched = p->soft.tm.sched;
+ struct rte_mbuf **pkts_enq = p->soft.tm.pkts_enq;
+ struct rte_mbuf **pkts_deq = p->soft.tm.pkts_deq;
+ uint32_t enq_bsz = p->params.soft.tm.enq_bsz;
+ uint32_t deq_bsz = p->params.soft.tm.deq_bsz;
+ uint16_t nb_tx_queues = dev->data->nb_tx_queues;
+
+ /* Persistent context: Read - Write (update required) */
+ uint32_t txq_pos = p->soft.tm.txq_pos;
+ uint32_t pkts_enq_len = p->soft.tm.pkts_enq_len;
+ uint32_t flush_count = p->soft.tm.flush_count;
+
+ /* Not part of the persistent context */
+ uint32_t pkts_deq_len, pos;
+ uint16_t i;
+
+ /* Soft device TXQ read, TM enqueue */
+ for (i = 0; i < nb_tx_queues; i++) {
+ struct rte_ring *txq = dev->data->tx_queues[txq_pos];
+
+ /* Read TXQ burst to packet enqueue buffer */
+ pkts_enq_len += rte_ring_sc_dequeue_burst(txq,
+ (void **)&pkts_enq[pkts_enq_len],
+ enq_bsz,
+ NULL);
+
+ /* Increment TXQ */
+ txq_pos++;
+ if (txq_pos >= nb_tx_queues)
+ txq_pos = 0;
+
+ /* TM enqueue when complete burst is available */
+ if (pkts_enq_len >= enq_bsz) {
+ rte_sched_port_enqueue(sched, pkts_enq, pkts_enq_len);
+
+ pkts_enq_len = 0;
+ flush_count = 0;
+ break;
+ }
+ }
+
+ if (flush_count >= FLUSH_COUNT_THRESHOLD) {
+ if (pkts_enq_len)
+ rte_sched_port_enqueue(sched, pkts_enq, pkts_enq_len);
+
+ pkts_enq_len = 0;
+ flush_count = 0;
+ }
+
+ p->soft.tm.txq_pos = txq_pos;
+ p->soft.tm.pkts_enq_len = pkts_enq_len;
+ p->soft.tm.flush_count = flush_count + 1;
+
+ /* TM dequeue, Hard device TXQ write */
+ pkts_deq_len = rte_sched_port_dequeue(sched, pkts_deq, deq_bsz);
+
+ for (pos = 0; pos < pkts_deq_len; )
+ pos += rte_eth_tx_burst(p->hard.port_id,
+ p->params.hard.tx_queue_id,
+ &pkts_deq[pos],
+ (uint16_t)(pkts_deq_len - pos));
+
+ return 0;
+}
+
int
rte_pmd_softnic_run(uint8_t port_id)
{
@@ -302,7 +405,7 @@ rte_pmd_softnic_run(uint8_t port_id)
RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, 0);
#endif
- return run_default(dev);
+ return (tm_used(dev)) ? run_tm(dev) : run_default(dev);
}
static struct ether_addr eth_addr = { .addr_bytes = {0} };
@@ -378,12 +481,26 @@ pmd_init(struct pmd_params *params, int numa_node)
return NULL;
}
+ /* Traffic Management (TM) */
+ if (params->soft.flags & PMD_FEATURE_TM) {
+ status = tm_init(p, params, numa_node);
+ if (status) {
+ default_free(p);
+ free(p->params.hard.name);
+ rte_free(p);
+ return NULL;
+ }
+ }
+
return p;
}
static void
pmd_free(struct pmd_internals *p)
{
+ if (p->params.soft.flags & PMD_FEATURE_TM)
+ tm_free(p);
+
default_free(p);
free(p->params.hard.name);
@@ -464,7 +581,7 @@ static int
pmd_parse_args(struct pmd_params *p, const char *name, const char *params)
{
struct rte_kvargs *kvlist;
- int ret;
+ int i, ret;
kvlist = rte_kvargs_parse(params, pmd_valid_args);
if (kvlist == NULL)
@@ -474,8 +591,124 @@ pmd_parse_args(struct pmd_params *p, const char *name, const char *params)
memset(p, 0, sizeof(*p));
p->soft.name = name;
p->soft.intrusive = INTRUSIVE;
+ p->soft.tm.rate = 0;
+ p->soft.tm.nb_queues = SOFTNIC_SOFT_TM_NB_QUEUES;
+ for (i = 0; i < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; i++)
+ p->soft.tm.qsize[i] = SOFTNIC_SOFT_TM_QUEUE_SIZE;
+ p->soft.tm.enq_bsz = SOFTNIC_SOFT_TM_ENQ_BSZ;
+ p->soft.tm.deq_bsz = SOFTNIC_SOFT_TM_DEQ_BSZ;
p->hard.tx_queue_id = SOFTNIC_HARD_TX_QUEUE_ID;
+ /* SOFT: TM (optional) */
+ if (rte_kvargs_count(kvlist, PMD_PARAM_SOFT_TM) == 1) {
+ char *s;
+
+ ret = rte_kvargs_process(kvlist, PMD_PARAM_SOFT_TM,
+ &get_string, &s);
+ if (ret < 0)
+ goto out_free;
+
+ if (strcmp(s, "on") == 0)
+ p->soft.flags |= PMD_FEATURE_TM;
+ else if (strcmp(s, "off") == 0)
+ p->soft.flags &= ~PMD_FEATURE_TM;
+ else
+ ret = -EINVAL;
+
+ free(s);
+ if (ret)
+ goto out_free;
+ }
+
+ /* SOFT: TM rate (measured in bytes/second) (optional) */
+ if (rte_kvargs_count(kvlist, PMD_PARAM_SOFT_TM_RATE) == 1) {
+ ret = rte_kvargs_process(kvlist, PMD_PARAM_SOFT_TM_RATE,
+ &get_uint32, &p->soft.tm.rate);
+ if (ret < 0)
+ goto out_free;
+
+ p->soft.flags |= PMD_FEATURE_TM;
+ }
+
+ /* SOFT: TM number of queues (optional) */
+ if (rte_kvargs_count(kvlist, PMD_PARAM_SOFT_TM_NB_QUEUES) == 1) {
+ ret = rte_kvargs_process(kvlist, PMD_PARAM_SOFT_TM_NB_QUEUES,
+ &get_uint32, &p->soft.tm.nb_queues);
+ if (ret < 0)
+ goto out_free;
+
+ p->soft.flags |= PMD_FEATURE_TM;
+ }
+
+ /* SOFT: TM queue size 0 .. 3 (optional) */
+ if (rte_kvargs_count(kvlist, PMD_PARAM_SOFT_TM_QSIZE0) == 1) {
+ uint32_t qsize;
+
+ ret = rte_kvargs_process(kvlist, PMD_PARAM_SOFT_TM_QSIZE0,
+ &get_uint32, &qsize);
+ if (ret < 0)
+ goto out_free;
+
+ p->soft.tm.qsize[0] = (uint16_t)qsize;
+ p->soft.flags |= PMD_FEATURE_TM;
+ }
+
+ if (rte_kvargs_count(kvlist, PMD_PARAM_SOFT_TM_QSIZE1) == 1) {
+ uint32_t qsize;
+
+ ret = rte_kvargs_process(kvlist, PMD_PARAM_SOFT_TM_QSIZE1,
+ &get_uint32, &qsize);
+ if (ret < 0)
+ goto out_free;
+
+ p->soft.tm.qsize[1] = (uint16_t)qsize;
+ p->soft.flags |= PMD_FEATURE_TM;
+ }
+
+ if (rte_kvargs_count(kvlist, PMD_PARAM_SOFT_TM_QSIZE2) == 1) {
+ uint32_t qsize;
+
+ ret = rte_kvargs_process(kvlist, PMD_PARAM_SOFT_TM_QSIZE2,
+ &get_uint32, &qsize);
+ if (ret < 0)
+ goto out_free;
+
+ p->soft.tm.qsize[2] = (uint16_t)qsize;
+ p->soft.flags |= PMD_FEATURE_TM;
+ }
+
+ if (rte_kvargs_count(kvlist, PMD_PARAM_SOFT_TM_QSIZE3) == 1) {
+ uint32_t qsize;
+
+ ret = rte_kvargs_process(kvlist, PMD_PARAM_SOFT_TM_QSIZE3,
+ &get_uint32, &qsize);
+ if (ret < 0)
+ goto out_free;
+
+ p->soft.tm.qsize[3] = (uint16_t)qsize;
+ p->soft.flags |= PMD_FEATURE_TM;
+ }
+
+ /* SOFT: TM enqueue burst size (optional) */
+ if (rte_kvargs_count(kvlist, PMD_PARAM_SOFT_TM_ENQ_BSZ) == 1) {
+ ret = rte_kvargs_process(kvlist, PMD_PARAM_SOFT_TM_ENQ_BSZ,
+ &get_uint32, &p->soft.tm.enq_bsz);
+ if (ret < 0)
+ goto out_free;
+
+ p->soft.flags |= PMD_FEATURE_TM;
+ }
+
+ /* SOFT: TM dequeue burst size (optional) */
+ if (rte_kvargs_count(kvlist, PMD_PARAM_SOFT_TM_DEQ_BSZ) == 1) {
+ ret = rte_kvargs_process(kvlist, PMD_PARAM_SOFT_TM_DEQ_BSZ,
+ &get_uint32, &p->soft.tm.deq_bsz);
+ if (ret < 0)
+ goto out_free;
+
+ p->soft.flags |= PMD_FEATURE_TM;
+ }
+
/* HARD: name (mandatory) */
if (rte_kvargs_count(kvlist, PMD_PARAM_HARD_NAME) == 1) {
ret = rte_kvargs_process(kvlist, PMD_PARAM_HARD_NAME,
@@ -508,6 +741,7 @@ pmd_probe(struct rte_vdev_device *vdev)
int status;
struct rte_eth_dev_info hard_info;
+ uint32_t hard_speed;
uint8_t hard_port_id;
int numa_node;
void *dev_private;
@@ -530,11 +764,19 @@ pmd_probe(struct rte_vdev_device *vdev)
return -EINVAL;
rte_eth_dev_info_get(hard_port_id, &hard_info);
+ hard_speed = eth_dev_speed_max_mbps(hard_info.speed_capa);
numa_node = rte_eth_dev_socket_id(hard_port_id);
if (p.hard.tx_queue_id >= hard_info.max_tx_queues)
return -EINVAL;
+ if (p.soft.flags & PMD_FEATURE_TM) {
+ status = tm_params_check(&p, hard_speed);
+
+ if (status)
+ return status;
+ }
+
/* Allocate and initialize soft ethdev private data */
dev_private = pmd_init(&p, numa_node);
if (dev_private == NULL)
@@ -587,5 +829,14 @@ static struct rte_vdev_driver pmd_softnic_drv = {
RTE_PMD_REGISTER_VDEV(net_softnic, pmd_softnic_drv);
RTE_PMD_REGISTER_PARAM_STRING(net_softnic,
+ PMD_PARAM_SOFT_TM "=on|off "
+ PMD_PARAM_SOFT_TM_RATE "=<int> "
+ PMD_PARAM_SOFT_TM_NB_QUEUES "=<int> "
+ PMD_PARAM_SOFT_TM_QSIZE0 "=<int> "
+ PMD_PARAM_SOFT_TM_QSIZE1 "=<int> "
+ PMD_PARAM_SOFT_TM_QSIZE2 "=<int> "
+ PMD_PARAM_SOFT_TM_QSIZE3 "=<int> "
+ PMD_PARAM_SOFT_TM_ENQ_BSZ "=<int> "
+ PMD_PARAM_SOFT_TM_DEQ_BSZ "=<int> "
PMD_PARAM_HARD_NAME "=<string> "
PMD_PARAM_HARD_TX_QUEUE_ID "=<int>");
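For illustration only (the device name, PCI address and numeric values below are arbitrary, chosen to satisfy the checks applied later in this patch), the parameters registered above combine into a vdev argument string of the form:

--vdev 'net_softnic0,hard_name=0000:02:00.1,hard_tx_queue_id=0,soft_tm=on,soft_tm_rate=1250000000,soft_tm_nb_queues=65536,soft_tm_qsize0=64,soft_tm_qsize1=64,soft_tm_qsize2=64,soft_tm_qsize3=64,soft_tm_enq_bsz=32,soft_tm_deq_bsz=24'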
diff --git a/drivers/net/softnic/rte_eth_softnic.h b/drivers/net/softnic/rte_eth_softnic.h
index e6996f3..517b96a 100644
--- a/drivers/net/softnic/rte_eth_softnic.h
+++ b/drivers/net/softnic/rte_eth_softnic.h
@@ -40,6 +40,22 @@
extern "C" {
#endif
+#ifndef SOFTNIC_SOFT_TM_NB_QUEUES
+#define SOFTNIC_SOFT_TM_NB_QUEUES 65536
+#endif
+
+#ifndef SOFTNIC_SOFT_TM_QUEUE_SIZE
+#define SOFTNIC_SOFT_TM_QUEUE_SIZE 64
+#endif
+
+#ifndef SOFTNIC_SOFT_TM_ENQ_BSZ
+#define SOFTNIC_SOFT_TM_ENQ_BSZ 32
+#endif
+
+#ifndef SOFTNIC_SOFT_TM_DEQ_BSZ
+#define SOFTNIC_SOFT_TM_DEQ_BSZ 24
+#endif
+
#ifndef SOFTNIC_HARD_TX_QUEUE_ID
#define SOFTNIC_HARD_TX_QUEUE_ID 0
#endif
diff --git a/drivers/net/softnic/rte_eth_softnic_internals.h b/drivers/net/softnic/rte_eth_softnic_internals.h
index 96995b5..11f88d8 100644
--- a/drivers/net/softnic/rte_eth_softnic_internals.h
+++ b/drivers/net/softnic/rte_eth_softnic_internals.h
@@ -37,10 +37,19 @@
#include <stdint.h>
#include <rte_mbuf.h>
+#include <rte_sched.h>
#include <rte_ethdev.h>
#include "rte_eth_softnic.h"
+/**
+ * PMD Parameters
+ */
+
+enum pmd_feature {
+ PMD_FEATURE_TM = 1, /**< Traffic Management (TM) */
+};
+
#ifndef INTRUSIVE
#define INTRUSIVE 0
#endif
@@ -57,6 +66,16 @@ struct pmd_params {
* (potentially faster).
*/
int intrusive;
+
+ /** Traffic Management (TM) */
+ struct {
+ uint32_t rate; /**< Rate (bytes/second) */
+ uint32_t nb_queues; /**< Number of queues */
+ uint16_t qsize[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
+ /**< Queue size per traffic class */
+ uint32_t enq_bsz; /**< Enqueue burst size */
+ uint32_t deq_bsz; /**< Dequeue burst size */
+ } tm;
} soft;
/** Parameters for the hard device (existing) */
@@ -86,6 +105,66 @@ struct default_internals {
};
/**
+ * Traffic Management (TM) Internals
+ */
+
+#ifndef TM_MAX_SUBPORTS
+#define TM_MAX_SUBPORTS 8
+#endif
+
+#ifndef TM_MAX_PIPES_PER_SUBPORT
+#define TM_MAX_PIPES_PER_SUBPORT 4096
+#endif
+
+struct tm_params {
+ struct rte_sched_port_params port_params;
+
+ struct rte_sched_subport_params subport_params[TM_MAX_SUBPORTS];
+
+ struct rte_sched_pipe_params
+ pipe_profiles[RTE_SCHED_PIPE_PROFILES_PER_PORT];
+ uint32_t n_pipe_profiles;
+ uint32_t pipe_to_profile[TM_MAX_SUBPORTS * TM_MAX_PIPES_PER_SUBPORT];
+};
+
+/* TM Levels */
+enum tm_node_level {
+ TM_NODE_LEVEL_PORT = 0,
+ TM_NODE_LEVEL_SUBPORT,
+ TM_NODE_LEVEL_PIPE,
+ TM_NODE_LEVEL_TC,
+ TM_NODE_LEVEL_QUEUE,
+ TM_NODE_LEVEL_MAX,
+};
+
+/* TM Hierarchy Specification */
+struct tm_hierarchy {
+ uint32_t n_tm_nodes[TM_NODE_LEVEL_MAX];
+};
+
+struct tm_internals {
+ /** Hierarchy specification
+ *
+ * - Hierarchy is unfrozen at init and when port is stopped.
+ * - Hierarchy is frozen on successful hierarchy commit.
+ * - Run-time hierarchy changes are not allowed, therefore it makes
+ * sense to keep the hierarchy frozen after the port is started.
+ */
+ struct tm_hierarchy h;
+
+ /** Blueprints */
+ struct tm_params params;
+
+ /** Run-time */
+ struct rte_sched_port *sched;
+ struct rte_mbuf **pkts_enq;
+ struct rte_mbuf **pkts_deq;
+ uint32_t pkts_enq_len;
+ uint32_t txq_pos;
+ uint32_t flush_count;
+};
+
+/**
* PMD Internals
*/
struct pmd_internals {
@@ -95,6 +174,7 @@ struct pmd_internals {
/** Soft device */
struct {
struct default_internals def; /**< Default */
+ struct tm_internals tm; /**< Traffic Management */
} soft;
/** Hard device */
@@ -111,4 +191,28 @@ struct pmd_rx_queue {
} hard;
};
+int
+tm_params_check(struct pmd_params *params, uint32_t hard_rate);
+
+int
+tm_init(struct pmd_internals *p, struct pmd_params *params, int numa_node);
+
+void
+tm_free(struct pmd_internals *p);
+
+int
+tm_start(struct pmd_internals *p);
+
+void
+tm_stop(struct pmd_internals *p);
+
+static inline int
+tm_used(struct rte_eth_dev *dev)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ return (p->params.soft.flags & PMD_FEATURE_TM) &&
+ p->soft.tm.h.n_tm_nodes[TM_NODE_LEVEL_PORT];
+}
+
#endif /* __INCLUDE_RTE_ETH_SOFTNIC_INTERNALS_H__ */
diff --git a/drivers/net/softnic/rte_eth_softnic_tm.c b/drivers/net/softnic/rte_eth_softnic_tm.c
new file mode 100644
index 0000000..bb28798
--- /dev/null
+++ b/drivers/net/softnic/rte_eth_softnic_tm.c
@@ -0,0 +1,181 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright(c) 2017 Intel Corporation. All rights reserved.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdint.h>
+#include <stdlib.h>
+#include <string.h>
+
+#include <rte_malloc.h>
+
+#include "rte_eth_softnic_internals.h"
+#include "rte_eth_softnic.h"
+
+#define BYTES_IN_MBPS (1000 * 1000 / 8)
+
+int
+tm_params_check(struct pmd_params *params, uint32_t hard_rate)
+{
+ uint64_t hard_rate_bytes_per_sec = (uint64_t)hard_rate * BYTES_IN_MBPS;
+ uint32_t i;
+
+ /* rate */
+ if (params->soft.tm.rate) {
+ if (params->soft.tm.rate > hard_rate_bytes_per_sec)
+ return -EINVAL;
+ } else {
+ params->soft.tm.rate =
+ (hard_rate_bytes_per_sec > UINT32_MAX) ?
+ UINT32_MAX : hard_rate_bytes_per_sec;
+ }
+
+ /* nb_queues */
+ if (params->soft.tm.nb_queues == 0)
+ return -EINVAL;
+
+ if (params->soft.tm.nb_queues < RTE_SCHED_QUEUES_PER_PIPE)
+ params->soft.tm.nb_queues = RTE_SCHED_QUEUES_PER_PIPE;
+
+ params->soft.tm.nb_queues =
+ rte_align32pow2(params->soft.tm.nb_queues);
+
+ /* qsize */
+ for (i = 0; i < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; i++) {
+ if (params->soft.tm.qsize[i] == 0)
+ return -EINVAL;
+
+ params->soft.tm.qsize[i] =
+ rte_align32pow2(params->soft.tm.qsize[i]);
+ }
+
+ /* enq_bsz, deq_bsz */
+ if ((params->soft.tm.enq_bsz == 0) ||
+ (params->soft.tm.deq_bsz == 0) ||
+ (params->soft.tm.deq_bsz >= params->soft.tm.enq_bsz))
+ return -EINVAL;
+
+ return 0;
+}
+
+int
+tm_init(struct pmd_internals *p,
+ struct pmd_params *params,
+ int numa_node)
+{
+ uint32_t enq_bsz = params->soft.tm.enq_bsz;
+ uint32_t deq_bsz = params->soft.tm.deq_bsz;
+
+ p->soft.tm.pkts_enq = rte_zmalloc_socket(params->soft.name,
+ 2 * enq_bsz * sizeof(struct rte_mbuf *),
+ 0,
+ numa_node);
+
+ if (p->soft.tm.pkts_enq == NULL)
+ return -ENOMEM;
+
+ p->soft.tm.pkts_deq = rte_zmalloc_socket(params->soft.name,
+ deq_bsz * sizeof(struct rte_mbuf *),
+ 0,
+ numa_node);
+
+ if (p->soft.tm.pkts_deq == NULL) {
+ rte_free(p->soft.tm.pkts_enq);
+ return -ENOMEM;
+ }
+
+ return 0;
+}
+
+void
+tm_free(struct pmd_internals *p)
+{
+ rte_free(p->soft.tm.pkts_enq);
+ rte_free(p->soft.tm.pkts_deq);
+}
+
+int
+tm_start(struct pmd_internals *p)
+{
+ struct tm_params *t = &p->soft.tm.params;
+ uint32_t n_subports, subport_id;
+ int status;
+
+ /* Port */
+ p->soft.tm.sched = rte_sched_port_config(&t->port_params);
+ if (p->soft.tm.sched == NULL)
+ return -1;
+
+ /* Subport */
+ n_subports = t->port_params.n_subports_per_port;
+ for (subport_id = 0; subport_id < n_subports; subport_id++) {
+ uint32_t n_pipes_per_subport =
+ t->port_params.n_pipes_per_subport;
+ uint32_t pipe_id;
+
+ status = rte_sched_subport_config(p->soft.tm.sched,
+ subport_id,
+ &t->subport_params[subport_id]);
+ if (status) {
+ rte_sched_port_free(p->soft.tm.sched);
+ return -1;
+ }
+
+ /* Pipe */
+ n_pipes_per_subport = t->port_params.n_pipes_per_subport;
+ for (pipe_id = 0; pipe_id < n_pipes_per_subport; pipe_id++) {
+ int pos = subport_id * TM_MAX_PIPES_PER_SUBPORT +
+ pipe_id;
+ int profile_id = t->pipe_to_profile[pos];
+
+ if (profile_id < 0)
+ continue;
+
+ status = rte_sched_pipe_config(p->soft.tm.sched,
+ subport_id,
+ pipe_id,
+ profile_id);
+ if (status) {
+ rte_sched_port_free(p->soft.tm.sched);
+ return -1;
+ }
+ }
+ }
+
+ return 0;
+}
+
+void
+tm_stop(struct pmd_internals *p)
+{
+ if (p->soft.tm.sched)
+ rte_sched_port_free(p->soft.tm.sched);
+}
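As a worked example of the rate handling above: BYTES_IN_MBPS is 1000 * 1000 / 8 = 125,000, so a 10 Gbps hard device (hard_rate = 10,000 Mbps) yields 10,000 * 125,000 = 1,250,000,000 bytes/second, which fits in 32 bits and becomes the default soft TM rate when soft_tm_rate is not specified; a 100 Gbps device would yield 12,500,000,000 bytes/second, which exceeds UINT32_MAX and is therefore clamped to UINT32_MAX.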
--
2.9.3
* [dpdk-dev] [PATCH v6 3/5] net/softnic: add TM capabilities ops
2017-10-06 16:59 ` [dpdk-dev] [PATCH v6 0/5] net/softnic: sw fall-back pmd for traffic mgmt and others Jasvinder Singh
2017-10-06 16:59 ` [dpdk-dev] [PATCH v6 1/5] net/softnic: add softnic PMD Jasvinder Singh
2017-10-06 17:00 ` [dpdk-dev] [PATCH v6 2/5] net/softnic: add traffic management support Jasvinder Singh
@ 2017-10-06 17:00 ` Jasvinder Singh
2017-10-06 17:00 ` [dpdk-dev] [PATCH v6 4/5] net/softnic: add TM hierarchy related ops Jasvinder Singh
` (2 subsequent siblings)
5 siblings, 0 replies; 79+ messages in thread
From: Jasvinder Singh @ 2017-10-06 17:00 UTC (permalink / raw)
To: dev; +Cc: cristian.dumitrescu, ferruh.yigit, thomas, wenzhuo.lu
Implement ethdev TM capability APIs in SoftNIC PMD.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Acked-by: Lu, Wenzhuo <wenzhuo.lu@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
---
drivers/net/softnic/rte_eth_softnic.c | 12 +-
drivers/net/softnic/rte_eth_softnic_internals.h | 32 ++
drivers/net/softnic/rte_eth_softnic_tm.c | 500 ++++++++++++++++++++++++
3 files changed, 543 insertions(+), 1 deletion(-)
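For context (not part of the patch), a hedged sketch of how an application could query these capabilities through the generic ethdev TM API once the SoftNIC port is created; the port id value and the printed fields are assumptions, and the exact port id width depends on the DPDK release:

#include <stdio.h>
#include <string.h>
#include <rte_tm.h>

/* Illustrative only: fetch and print the aggregate TM capabilities. */
static int
query_softnic_tm_caps(uint16_t soft_port_id)
{
	struct rte_tm_capabilities cap;
	struct rte_tm_error err;

	memset(&cap, 0, sizeof(cap));
	if (rte_tm_capabilities_get(soft_port_id, &cap, &err) != 0)
		return -1;

	printf("TM levels: %u, max nodes: %u\n",
		cap.n_levels_max, cap.n_nodes_max);
	return 0;
}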
diff --git a/drivers/net/softnic/rte_eth_softnic.c b/drivers/net/softnic/rte_eth_softnic.c
index ab26948..4d70ebf 100644
--- a/drivers/net/softnic/rte_eth_softnic.c
+++ b/drivers/net/softnic/rte_eth_softnic.c
@@ -43,6 +43,7 @@
#include <rte_errno.h>
#include <rte_ring.h>
#include <rte_sched.h>
+#include <rte_tm_driver.h>
#include "rte_eth_softnic.h"
#include "rte_eth_softnic_internals.h"
@@ -224,6 +225,15 @@ pmd_link_update(struct rte_eth_dev *dev __rte_unused,
return 0;
}
+static int
+pmd_tm_ops_get(struct rte_eth_dev *dev, void *arg)
+{
+ *(const struct rte_tm_ops **)arg =
+ (tm_enabled(dev)) ? &pmd_tm_ops : NULL;
+
+ return 0;
+}
+
static const struct eth_dev_ops pmd_ops = {
.dev_configure = pmd_dev_configure,
.dev_start = pmd_dev_start,
@@ -233,7 +243,7 @@ static const struct eth_dev_ops pmd_ops = {
.dev_infos_get = pmd_dev_infos_get,
.rx_queue_setup = pmd_rx_queue_setup,
.tx_queue_setup = pmd_tx_queue_setup,
- .tm_ops_get = NULL,
+ .tm_ops_get = pmd_tm_ops_get,
};
static uint16_t
diff --git a/drivers/net/softnic/rte_eth_softnic_internals.h b/drivers/net/softnic/rte_eth_softnic_internals.h
index 11f88d8..9b313d0 100644
--- a/drivers/net/softnic/rte_eth_softnic_internals.h
+++ b/drivers/net/softnic/rte_eth_softnic_internals.h
@@ -39,6 +39,7 @@
#include <rte_mbuf.h>
#include <rte_sched.h>
#include <rte_ethdev.h>
+#include <rte_tm_driver.h>
#include "rte_eth_softnic.h"
@@ -137,8 +138,26 @@ enum tm_node_level {
TM_NODE_LEVEL_MAX,
};
+/* TM Node */
+struct tm_node {
+ TAILQ_ENTRY(tm_node) node;
+ uint32_t node_id;
+ uint32_t parent_node_id;
+ uint32_t priority;
+ uint32_t weight;
+ uint32_t level;
+ struct tm_node *parent_node;
+ struct rte_tm_node_params params;
+ struct rte_tm_node_stats stats;
+ uint32_t n_children;
+};
+
+TAILQ_HEAD(tm_node_list, tm_node);
+
/* TM Hierarchy Specification */
struct tm_hierarchy {
+ struct tm_node_list nodes;
+
uint32_t n_tm_nodes[TM_NODE_LEVEL_MAX];
};
@@ -191,6 +210,11 @@ struct pmd_rx_queue {
} hard;
};
+/**
+ * Traffic Management (TM) Operation
+ */
+extern const struct rte_tm_ops pmd_tm_ops;
+
int
tm_params_check(struct pmd_params *params, uint32_t hard_rate);
@@ -207,6 +231,14 @@ void
tm_stop(struct pmd_internals *p);
static inline int
+tm_enabled(struct rte_eth_dev *dev)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ return (p->params.soft.flags & PMD_FEATURE_TM);
+}
+
+static inline int
tm_used(struct rte_eth_dev *dev)
{
struct pmd_internals *p = dev->data->dev_private;
diff --git a/drivers/net/softnic/rte_eth_softnic_tm.c b/drivers/net/softnic/rte_eth_softnic_tm.c
index bb28798..a552006 100644
--- a/drivers/net/softnic/rte_eth_softnic_tm.c
+++ b/drivers/net/softnic/rte_eth_softnic_tm.c
@@ -179,3 +179,503 @@ tm_stop(struct pmd_internals *p)
if (p->soft.tm.sched)
rte_sched_port_free(p->soft.tm.sched);
}
+
+static struct tm_node *
+tm_node_search(struct rte_eth_dev *dev, uint32_t node_id)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_node_list *nl = &p->soft.tm.h.nodes;
+ struct tm_node *n;
+
+ TAILQ_FOREACH(n, nl, node)
+ if (n->node_id == node_id)
+ return n;
+
+ return NULL;
+}
+
+static uint32_t
+tm_level_get_max_nodes(struct rte_eth_dev *dev, enum tm_node_level level)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ uint32_t n_queues_max = p->params.soft.tm.nb_queues;
+ uint32_t n_tc_max = n_queues_max / RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS;
+ uint32_t n_pipes_max = n_tc_max / RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE;
+ uint32_t n_subports_max = n_pipes_max;
+ uint32_t n_root_max = 1;
+
+ switch (level) {
+ case TM_NODE_LEVEL_PORT:
+ return n_root_max;
+ case TM_NODE_LEVEL_SUBPORT:
+ return n_subports_max;
+ case TM_NODE_LEVEL_PIPE:
+ return n_pipes_max;
+ case TM_NODE_LEVEL_TC:
+ return n_tc_max;
+ case TM_NODE_LEVEL_QUEUE:
+ default:
+ return n_queues_max;
+ }
+}
+
+#ifdef RTE_SCHED_RED
+#define WRED_SUPPORTED 1
+#else
+#define WRED_SUPPORTED 0
+#endif
+
+#define STATS_MASK_DEFAULT \
+ (RTE_TM_STATS_N_PKTS | \
+ RTE_TM_STATS_N_BYTES | \
+ RTE_TM_STATS_N_PKTS_GREEN_DROPPED | \
+ RTE_TM_STATS_N_BYTES_GREEN_DROPPED)
+
+#define STATS_MASK_QUEUE \
+ (STATS_MASK_DEFAULT | \
+ RTE_TM_STATS_N_PKTS_QUEUED)
+
+static const struct rte_tm_capabilities tm_cap = {
+ .n_nodes_max = UINT32_MAX,
+ .n_levels_max = TM_NODE_LEVEL_MAX,
+
+ .non_leaf_nodes_identical = 0,
+ .leaf_nodes_identical = 1,
+
+ .shaper_n_max = UINT32_MAX,
+ .shaper_private_n_max = UINT32_MAX,
+ .shaper_private_dual_rate_n_max = 0,
+ .shaper_private_rate_min = 1,
+ .shaper_private_rate_max = UINT32_MAX,
+
+ .shaper_shared_n_max = UINT32_MAX,
+ .shaper_shared_n_nodes_per_shaper_max = UINT32_MAX,
+ .shaper_shared_n_shapers_per_node_max = 1,
+ .shaper_shared_dual_rate_n_max = 0,
+ .shaper_shared_rate_min = 1,
+ .shaper_shared_rate_max = UINT32_MAX,
+
+ .shaper_pkt_length_adjust_min = RTE_TM_ETH_FRAMING_OVERHEAD_FCS,
+ .shaper_pkt_length_adjust_max = RTE_TM_ETH_FRAMING_OVERHEAD_FCS,
+
+ .sched_n_children_max = UINT32_MAX,
+ .sched_sp_n_priorities_max = RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE,
+ .sched_wfq_n_children_per_group_max = UINT32_MAX,
+ .sched_wfq_n_groups_max = 1,
+ .sched_wfq_weight_max = UINT32_MAX,
+
+ .cman_head_drop_supported = 0,
+ .cman_wred_context_n_max = 0,
+ .cman_wred_context_private_n_max = 0,
+ .cman_wred_context_shared_n_max = 0,
+ .cman_wred_context_shared_n_nodes_per_context_max = 0,
+ .cman_wred_context_shared_n_contexts_per_node_max = 0,
+
+ .mark_vlan_dei_supported = {0, 0, 0},
+ .mark_ip_ecn_tcp_supported = {0, 0, 0},
+ .mark_ip_ecn_sctp_supported = {0, 0, 0},
+ .mark_ip_dscp_supported = {0, 0, 0},
+
+ .dynamic_update_mask = 0,
+
+ .stats_mask = STATS_MASK_QUEUE,
+};
+
+/* Traffic manager capabilities get */
+static int
+pmd_tm_capabilities_get(struct rte_eth_dev *dev __rte_unused,
+ struct rte_tm_capabilities *cap,
+ struct rte_tm_error *error)
+{
+ if (cap == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_CAPABILITIES,
+ NULL,
+ rte_strerror(EINVAL));
+
+ memcpy(cap, &tm_cap, sizeof(*cap));
+
+ cap->n_nodes_max = tm_level_get_max_nodes(dev, TM_NODE_LEVEL_PORT) +
+ tm_level_get_max_nodes(dev, TM_NODE_LEVEL_SUBPORT) +
+ tm_level_get_max_nodes(dev, TM_NODE_LEVEL_PIPE) +
+ tm_level_get_max_nodes(dev, TM_NODE_LEVEL_TC) +
+ tm_level_get_max_nodes(dev, TM_NODE_LEVEL_QUEUE);
+
+ cap->shaper_private_n_max =
+ tm_level_get_max_nodes(dev, TM_NODE_LEVEL_PORT) +
+ tm_level_get_max_nodes(dev, TM_NODE_LEVEL_SUBPORT) +
+ tm_level_get_max_nodes(dev, TM_NODE_LEVEL_PIPE) +
+ tm_level_get_max_nodes(dev, TM_NODE_LEVEL_TC);
+
+ cap->shaper_shared_n_max = RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE *
+ tm_level_get_max_nodes(dev, TM_NODE_LEVEL_SUBPORT);
+
+ cap->shaper_n_max = cap->shaper_private_n_max +
+ cap->shaper_shared_n_max;
+
+ cap->shaper_shared_n_nodes_per_shaper_max =
+ tm_level_get_max_nodes(dev, TM_NODE_LEVEL_PIPE);
+
+ cap->sched_n_children_max = RTE_MAX(
+ tm_level_get_max_nodes(dev, TM_NODE_LEVEL_PIPE),
+ (uint32_t)RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE);
+
+ cap->sched_wfq_n_children_per_group_max = cap->sched_n_children_max;
+
+ if (WRED_SUPPORTED)
+ cap->cman_wred_context_private_n_max =
+ tm_level_get_max_nodes(dev, TM_NODE_LEVEL_QUEUE);
+
+ cap->cman_wred_context_n_max = cap->cman_wred_context_private_n_max +
+ cap->cman_wred_context_shared_n_max;
+
+ return 0;
+}
+
+static const struct rte_tm_level_capabilities tm_level_cap[] = {
+ [TM_NODE_LEVEL_PORT] = {
+ .n_nodes_max = 1,
+ .n_nodes_nonleaf_max = 1,
+ .n_nodes_leaf_max = 0,
+ .non_leaf_nodes_identical = 1,
+ .leaf_nodes_identical = 0,
+
+ .nonleaf = {
+ .shaper_private_supported = 1,
+ .shaper_private_dual_rate_supported = 0,
+ .shaper_private_rate_min = 1,
+ .shaper_private_rate_max = UINT32_MAX,
+ .shaper_shared_n_max = 0,
+
+ .sched_n_children_max = UINT32_MAX,
+ .sched_sp_n_priorities_max = 1,
+ .sched_wfq_n_children_per_group_max = UINT32_MAX,
+ .sched_wfq_n_groups_max = 1,
+ .sched_wfq_weight_max = 1,
+
+ .stats_mask = STATS_MASK_DEFAULT,
+ },
+ },
+
+ [TM_NODE_LEVEL_SUBPORT] = {
+ .n_nodes_max = UINT32_MAX,
+ .n_nodes_nonleaf_max = UINT32_MAX,
+ .n_nodes_leaf_max = 0,
+ .non_leaf_nodes_identical = 1,
+ .leaf_nodes_identical = 0,
+
+ .nonleaf = {
+ .shaper_private_supported = 1,
+ .shaper_private_dual_rate_supported = 0,
+ .shaper_private_rate_min = 1,
+ .shaper_private_rate_max = UINT32_MAX,
+ .shaper_shared_n_max = 0,
+
+ .sched_n_children_max = UINT32_MAX,
+ .sched_sp_n_priorities_max = 1,
+ .sched_wfq_n_children_per_group_max = UINT32_MAX,
+ .sched_wfq_n_groups_max = 1,
+#ifdef RTE_SCHED_SUBPORT_TC_OV
+ .sched_wfq_weight_max = UINT32_MAX,
+#else
+ .sched_wfq_weight_max = 1,
+#endif
+ .stats_mask = STATS_MASK_DEFAULT,
+ },
+ },
+
+ [TM_NODE_LEVEL_PIPE] = {
+ .n_nodes_max = UINT32_MAX,
+ .n_nodes_nonleaf_max = UINT32_MAX,
+ .n_nodes_leaf_max = 0,
+ .non_leaf_nodes_identical = 1,
+ .leaf_nodes_identical = 0,
+
+ .nonleaf = {
+ .shaper_private_supported = 1,
+ .shaper_private_dual_rate_supported = 0,
+ .shaper_private_rate_min = 1,
+ .shaper_private_rate_max = UINT32_MAX,
+ .shaper_shared_n_max = 0,
+
+ .sched_n_children_max =
+ RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE,
+ .sched_sp_n_priorities_max =
+ RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE,
+ .sched_wfq_n_children_per_group_max = 1,
+ .sched_wfq_n_groups_max = 0,
+ .sched_wfq_weight_max = 1,
+
+ .stats_mask = STATS_MASK_DEFAULT,
+ },
+ },
+
+ [TM_NODE_LEVEL_TC] = {
+ .n_nodes_max = UINT32_MAX,
+ .n_nodes_nonleaf_max = UINT32_MAX,
+ .n_nodes_leaf_max = 0,
+ .non_leaf_nodes_identical = 1,
+ .leaf_nodes_identical = 0,
+
+ .nonleaf = {
+ .shaper_private_supported = 1,
+ .shaper_private_dual_rate_supported = 0,
+ .shaper_private_rate_min = 1,
+ .shaper_private_rate_max = UINT32_MAX,
+ .shaper_shared_n_max = 1,
+
+ .sched_n_children_max =
+ RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS,
+ .sched_sp_n_priorities_max = 1,
+ .sched_wfq_n_children_per_group_max =
+ RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS,
+ .sched_wfq_n_groups_max = 1,
+ .sched_wfq_weight_max = UINT32_MAX,
+
+ .stats_mask = STATS_MASK_DEFAULT,
+ },
+ },
+
+ [TM_NODE_LEVEL_QUEUE] = {
+ .n_nodes_max = UINT32_MAX,
+ .n_nodes_nonleaf_max = 0,
+ .n_nodes_leaf_max = UINT32_MAX,
+ .non_leaf_nodes_identical = 0,
+ .leaf_nodes_identical = 1,
+
+ .leaf = {
+ .shaper_private_supported = 0,
+ .shaper_private_dual_rate_supported = 0,
+ .shaper_private_rate_min = 0,
+ .shaper_private_rate_max = 0,
+ .shaper_shared_n_max = 0,
+
+ .cman_head_drop_supported = 0,
+ .cman_wred_context_private_supported = WRED_SUPPORTED,
+ .cman_wred_context_shared_n_max = 0,
+
+ .stats_mask = STATS_MASK_QUEUE,
+ },
+ },
+};
+
+/* Traffic manager level capabilities get */
+static int
+pmd_tm_level_capabilities_get(struct rte_eth_dev *dev __rte_unused,
+ uint32_t level_id,
+ struct rte_tm_level_capabilities *cap,
+ struct rte_tm_error *error)
+{
+ if (cap == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_CAPABILITIES,
+ NULL,
+ rte_strerror(EINVAL));
+
+ if (level_id >= TM_NODE_LEVEL_MAX)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_LEVEL_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ memcpy(cap, &tm_level_cap[level_id], sizeof(*cap));
+
+ switch (level_id) {
+ case TM_NODE_LEVEL_PORT:
+ cap->nonleaf.sched_n_children_max =
+ tm_level_get_max_nodes(dev,
+ TM_NODE_LEVEL_SUBPORT);
+ cap->nonleaf.sched_wfq_n_children_per_group_max =
+ cap->nonleaf.sched_n_children_max;
+ break;
+
+ case TM_NODE_LEVEL_SUBPORT:
+ cap->n_nodes_max = tm_level_get_max_nodes(dev,
+ TM_NODE_LEVEL_SUBPORT);
+ cap->n_nodes_nonleaf_max = cap->n_nodes_max;
+ cap->nonleaf.sched_n_children_max =
+ tm_level_get_max_nodes(dev,
+ TM_NODE_LEVEL_PIPE);
+ cap->nonleaf.sched_wfq_n_children_per_group_max =
+ cap->nonleaf.sched_n_children_max;
+ break;
+
+ case TM_NODE_LEVEL_PIPE:
+ cap->n_nodes_max = tm_level_get_max_nodes(dev,
+ TM_NODE_LEVEL_PIPE);
+ cap->n_nodes_nonleaf_max = cap->n_nodes_max;
+ break;
+
+ case TM_NODE_LEVEL_TC:
+ cap->n_nodes_max = tm_level_get_max_nodes(dev,
+ TM_NODE_LEVEL_TC);
+ cap->n_nodes_nonleaf_max = cap->n_nodes_max;
+ break;
+
+ case TM_NODE_LEVEL_QUEUE:
+ default:
+ cap->n_nodes_max = tm_level_get_max_nodes(dev,
+ TM_NODE_LEVEL_QUEUE);
+ cap->n_nodes_leaf_max = cap->n_nodes_max;
+ break;
+ }
+
+ return 0;
+}
+
+static const struct rte_tm_node_capabilities tm_node_cap[] = {
+ [TM_NODE_LEVEL_PORT] = {
+ .shaper_private_supported = 1,
+ .shaper_private_dual_rate_supported = 0,
+ .shaper_private_rate_min = 1,
+ .shaper_private_rate_max = UINT32_MAX,
+ .shaper_shared_n_max = 0,
+
+ .nonleaf = {
+ .sched_n_children_max = UINT32_MAX,
+ .sched_sp_n_priorities_max = 1,
+ .sched_wfq_n_children_per_group_max = UINT32_MAX,
+ .sched_wfq_n_groups_max = 1,
+ .sched_wfq_weight_max = 1,
+ },
+
+ .stats_mask = STATS_MASK_DEFAULT,
+ },
+
+ [TM_NODE_LEVEL_SUBPORT] = {
+ .shaper_private_supported = 1,
+ .shaper_private_dual_rate_supported = 0,
+ .shaper_private_rate_min = 1,
+ .shaper_private_rate_max = UINT32_MAX,
+ .shaper_shared_n_max = 0,
+
+ .nonleaf = {
+ .sched_n_children_max = UINT32_MAX,
+ .sched_sp_n_priorities_max = 1,
+ .sched_wfq_n_children_per_group_max = UINT32_MAX,
+ .sched_wfq_n_groups_max = 1,
+ .sched_wfq_weight_max = UINT32_MAX,
+ },
+
+ .stats_mask = STATS_MASK_DEFAULT,
+ },
+
+ [TM_NODE_LEVEL_PIPE] = {
+ .shaper_private_supported = 1,
+ .shaper_private_dual_rate_supported = 0,
+ .shaper_private_rate_min = 1,
+ .shaper_private_rate_max = UINT32_MAX,
+ .shaper_shared_n_max = 0,
+
+ .nonleaf = {
+ .sched_n_children_max =
+ RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE,
+ .sched_sp_n_priorities_max =
+ RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE,
+ .sched_wfq_n_children_per_group_max = 1,
+ .sched_wfq_n_groups_max = 0,
+ .sched_wfq_weight_max = 1,
+ },
+
+ .stats_mask = STATS_MASK_DEFAULT,
+ },
+
+ [TM_NODE_LEVEL_TC] = {
+ .shaper_private_supported = 1,
+ .shaper_private_dual_rate_supported = 0,
+ .shaper_private_rate_min = 1,
+ .shaper_private_rate_max = UINT32_MAX,
+ .shaper_shared_n_max = 1,
+
+ .nonleaf = {
+ .sched_n_children_max =
+ RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS,
+ .sched_sp_n_priorities_max = 1,
+ .sched_wfq_n_children_per_group_max =
+ RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS,
+ .sched_wfq_n_groups_max = 1,
+ .sched_wfq_weight_max = UINT32_MAX,
+ },
+
+ .stats_mask = STATS_MASK_DEFAULT,
+ },
+
+ [TM_NODE_LEVEL_QUEUE] = {
+ .shaper_private_supported = 0,
+ .shaper_private_dual_rate_supported = 0,
+ .shaper_private_rate_min = 0,
+ .shaper_private_rate_max = 0,
+ .shaper_shared_n_max = 0,
+
+
+ .leaf = {
+ .cman_head_drop_supported = 0,
+ .cman_wred_context_private_supported = WRED_SUPPORTED,
+ .cman_wred_context_shared_n_max = 0,
+ },
+
+ .stats_mask = STATS_MASK_QUEUE,
+ },
+};
+
+/* Traffic manager node capabilities get */
+static int
+pmd_tm_node_capabilities_get(struct rte_eth_dev *dev __rte_unused,
+ uint32_t node_id,
+ struct rte_tm_node_capabilities *cap,
+ struct rte_tm_error *error)
+{
+ struct tm_node *tm_node;
+
+ if (cap == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_CAPABILITIES,
+ NULL,
+ rte_strerror(EINVAL));
+
+ tm_node = tm_node_search(dev, node_id);
+ if (tm_node == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ memcpy(cap, &tm_node_cap[tm_node->level], sizeof(*cap));
+
+ switch (tm_node->level) {
+ case TM_NODE_LEVEL_PORT:
+ cap->nonleaf.sched_n_children_max =
+ tm_level_get_max_nodes(dev,
+ TM_NODE_LEVEL_SUBPORT);
+ cap->nonleaf.sched_wfq_n_children_per_group_max =
+ cap->nonleaf.sched_n_children_max;
+ break;
+
+ case TM_NODE_LEVEL_SUBPORT:
+ cap->nonleaf.sched_n_children_max =
+ tm_level_get_max_nodes(dev,
+ TM_NODE_LEVEL_PIPE);
+ cap->nonleaf.sched_wfq_n_children_per_group_max =
+ cap->nonleaf.sched_n_children_max;
+ break;
+
+ case TM_NODE_LEVEL_PIPE:
+ case TM_NODE_LEVEL_TC:
+ case TM_NODE_LEVEL_QUEUE:
+ default:
+ break;
+ }
+
+ return 0;
+}
+
+const struct rte_tm_ops pmd_tm_ops = {
+ .capabilities_get = pmd_tm_capabilities_get,
+ .level_capabilities_get = pmd_tm_level_capabilities_get,
+ .node_capabilities_get = pmd_tm_node_capabilities_get,
+};
--
2.9.3
* [dpdk-dev] [PATCH v6 4/5] net/softnic: add TM hierarchy related ops
2017-10-06 16:59 ` [dpdk-dev] [PATCH v6 0/5] net/softnic: sw fall-back pmd for traffic mgmt and others Jasvinder Singh
` (2 preceding siblings ...)
2017-10-06 17:00 ` [dpdk-dev] [PATCH v6 3/5] net/softnic: add TM capabilities ops Jasvinder Singh
@ 2017-10-06 17:00 ` Jasvinder Singh
2017-10-06 17:00 ` [dpdk-dev] [PATCH v6 5/5] app/testpmd: add traffic management forwarding mode Jasvinder Singh
2017-10-06 18:57 ` [dpdk-dev] [PATCH v6 0/5] net/softnic: sw fall-back pmd for traffic mgmt and others Ferruh Yigit
5 siblings, 0 replies; 79+ messages in thread
From: Jasvinder Singh @ 2017-10-06 17:00 UTC (permalink / raw)
To: dev; +Cc: cristian.dumitrescu, ferruh.yigit, thomas, wenzhuo.lu
Implement ethdev TM hierarchy related APIs in SoftNIC PMD.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Acked-by: Lu, Wenzhuo <wenzhuo.lu@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
---
v5 change:
- add macro for the tc period
- add more comments
drivers/net/softnic/rte_eth_softnic_internals.h | 41 +
drivers/net/softnic/rte_eth_softnic_tm.c | 2781 ++++++++++++++++++++++-
2 files changed, 2817 insertions(+), 5 deletions(-)
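For context (not part of the patch), a hedged sketch of the hierarchy-building flow these ops enable through the generic rte_tm API: add a shaper profile, add nodes level by level starting from the port-level root, then commit. All identifiers, rates and sizes below are arbitrary assumptions, only the root node is shown, and a complete hierarchy would add subport, pipe, traffic-class and queue nodes in the same way before the commit:

#include <string.h>
#include <rte_tm.h>

static int
build_minimal_hierarchy(uint16_t port_id)
{
	struct rte_tm_shaper_params sp;
	struct rte_tm_node_params np;
	struct rte_tm_error err;

	/* Shaper profile 0: ~10 Gbps expressed in bytes/second */
	memset(&sp, 0, sizeof(sp));
	sp.peak.rate = 1250000000;
	sp.peak.size = 1000000;
	sp.pkt_length_adjust = RTE_TM_ETH_FRAMING_OVERHEAD_FCS;
	if (rte_tm_shaper_profile_add(port_id, 0, &sp, &err) != 0)
		return -1;

	/* Port-level (root) node: no parent, level 0, one SP priority */
	memset(&np, 0, sizeof(np));
	np.shaper_profile_id = 0;
	np.nonleaf.n_sp_priorities = 1;
	if (rte_tm_node_add(port_id, 1000000, RTE_TM_NODE_ID_NULL,
			0, 1, 0, &np, &err) != 0)
		return -1;

	/* ... subport, pipe, TC and queue nodes would be added here ... */

	return rte_tm_hierarchy_commit(port_id, 1, &err);
}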
diff --git a/drivers/net/softnic/rte_eth_softnic_internals.h b/drivers/net/softnic/rte_eth_softnic_internals.h
index 9b313d0..a2675e0 100644
--- a/drivers/net/softnic/rte_eth_softnic_internals.h
+++ b/drivers/net/softnic/rte_eth_softnic_internals.h
@@ -138,6 +138,36 @@ enum tm_node_level {
TM_NODE_LEVEL_MAX,
};
+/* TM Shaper Profile */
+struct tm_shaper_profile {
+ TAILQ_ENTRY(tm_shaper_profile) node;
+ uint32_t shaper_profile_id;
+ uint32_t n_users;
+ struct rte_tm_shaper_params params;
+};
+
+TAILQ_HEAD(tm_shaper_profile_list, tm_shaper_profile);
+
+/* TM Shared Shaper */
+struct tm_shared_shaper {
+ TAILQ_ENTRY(tm_shared_shaper) node;
+ uint32_t shared_shaper_id;
+ uint32_t n_users;
+ uint32_t shaper_profile_id;
+};
+
+TAILQ_HEAD(tm_shared_shaper_list, tm_shared_shaper);
+
+/* TM WRED Profile */
+struct tm_wred_profile {
+ TAILQ_ENTRY(tm_wred_profile) node;
+ uint32_t wred_profile_id;
+ uint32_t n_users;
+ struct rte_tm_wred_params params;
+};
+
+TAILQ_HEAD(tm_wred_profile_list, tm_wred_profile);
+
/* TM Node */
struct tm_node {
TAILQ_ENTRY(tm_node) node;
@@ -147,6 +177,8 @@ struct tm_node {
uint32_t weight;
uint32_t level;
struct tm_node *parent_node;
+ struct tm_shaper_profile *shaper_profile;
+ struct tm_wred_profile *wred_profile;
struct rte_tm_node_params params;
struct rte_tm_node_stats stats;
uint32_t n_children;
@@ -156,8 +188,16 @@ TAILQ_HEAD(tm_node_list, tm_node);
/* TM Hierarchy Specification */
struct tm_hierarchy {
+ struct tm_shaper_profile_list shaper_profiles;
+ struct tm_shared_shaper_list shared_shapers;
+ struct tm_wred_profile_list wred_profiles;
struct tm_node_list nodes;
+ uint32_t n_shaper_profiles;
+ uint32_t n_shared_shapers;
+ uint32_t n_wred_profiles;
+ uint32_t n_nodes;
+
uint32_t n_tm_nodes[TM_NODE_LEVEL_MAX];
};
@@ -170,6 +210,7 @@ struct tm_internals {
* sense to keep the hierarchy frozen after the port is started.
*/
struct tm_hierarchy h;
+ int hierarchy_frozen;
/** Blueprints */
struct tm_params params;
diff --git a/drivers/net/softnic/rte_eth_softnic_tm.c b/drivers/net/softnic/rte_eth_softnic_tm.c
index a552006..21cc93b 100644
--- a/drivers/net/softnic/rte_eth_softnic_tm.c
+++ b/drivers/net/softnic/rte_eth_softnic_tm.c
@@ -40,7 +40,9 @@
#include "rte_eth_softnic_internals.h"
#include "rte_eth_softnic.h"
-#define BYTES_IN_MBPS (1000 * 1000 / 8)
+#define BYTES_IN_MBPS (1000 * 1000 / 8)
+#define SUBPORT_TC_PERIOD 10
+#define PIPE_TC_PERIOD 40
int
tm_params_check(struct pmd_params *params, uint32_t hard_rate)
@@ -86,6 +88,79 @@ tm_params_check(struct pmd_params *params, uint32_t hard_rate)
return 0;
}
+static void
+tm_hierarchy_init(struct pmd_internals *p)
+{
+ memset(&p->soft.tm.h, 0, sizeof(p->soft.tm.h));
+
+ /* Initialize shaper profile list */
+ TAILQ_INIT(&p->soft.tm.h.shaper_profiles);
+
+ /* Initialize shared shaper list */
+ TAILQ_INIT(&p->soft.tm.h.shared_shapers);
+
+ /* Initialize wred profile list */
+ TAILQ_INIT(&p->soft.tm.h.wred_profiles);
+
+ /* Initialize TM node list */
+ TAILQ_INIT(&p->soft.tm.h.nodes);
+}
+
+static void
+tm_hierarchy_uninit(struct pmd_internals *p)
+{
+ /* Remove all nodes */
+ for ( ; ; ) {
+ struct tm_node *tm_node;
+
+ tm_node = TAILQ_FIRST(&p->soft.tm.h.nodes);
+ if (tm_node == NULL)
+ break;
+
+ TAILQ_REMOVE(&p->soft.tm.h.nodes, tm_node, node);
+ free(tm_node);
+ }
+
+ /* Remove all WRED profiles */
+ for ( ; ; ) {
+ struct tm_wred_profile *wred_profile;
+
+ wred_profile = TAILQ_FIRST(&p->soft.tm.h.wred_profiles);
+ if (wred_profile == NULL)
+ break;
+
+ TAILQ_REMOVE(&p->soft.tm.h.wred_profiles, wred_profile, node);
+ free(wred_profile);
+ }
+
+ /* Remove all shared shapers */
+ for ( ; ; ) {
+ struct tm_shared_shaper *shared_shaper;
+
+ shared_shaper = TAILQ_FIRST(&p->soft.tm.h.shared_shapers);
+ if (shared_shaper == NULL)
+ break;
+
+ TAILQ_REMOVE(&p->soft.tm.h.shared_shapers, shared_shaper, node);
+ free(shared_shaper);
+ }
+
+ /* Remove all shaper profiles */
+ for ( ; ; ) {
+ struct tm_shaper_profile *shaper_profile;
+
+ shaper_profile = TAILQ_FIRST(&p->soft.tm.h.shaper_profiles);
+ if (shaper_profile == NULL)
+ break;
+
+ TAILQ_REMOVE(&p->soft.tm.h.shaper_profiles,
+ shaper_profile, node);
+ free(shaper_profile);
+ }
+
+ memset(&p->soft.tm.h, 0, sizeof(p->soft.tm.h));
+}
+
int
tm_init(struct pmd_internals *p,
struct pmd_params *params,
@@ -112,12 +187,15 @@ tm_init(struct pmd_internals *p,
return -ENOMEM;
}
+ tm_hierarchy_init(p);
+
return 0;
}
void
tm_free(struct pmd_internals *p)
{
+ tm_hierarchy_uninit(p);
rte_free(p->soft.tm.pkts_enq);
rte_free(p->soft.tm.pkts_deq);
}
@@ -129,6 +207,10 @@ tm_start(struct pmd_internals *p)
uint32_t n_subports, subport_id;
int status;
+ /* Is hierarchy frozen? */
+ if (p->soft.tm.hierarchy_frozen == 0)
+ return -1;
+
/* Port */
p->soft.tm.sched = rte_sched_port_config(&t->port_params);
if (p->soft.tm.sched == NULL)
@@ -178,6 +260,51 @@ tm_stop(struct pmd_internals *p)
{
if (p->soft.tm.sched)
rte_sched_port_free(p->soft.tm.sched);
+
+ /* Unfreeze hierarchy */
+ p->soft.tm.hierarchy_frozen = 0;
+}
+
+static struct tm_shaper_profile *
+tm_shaper_profile_search(struct rte_eth_dev *dev, uint32_t shaper_profile_id)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_shaper_profile_list *spl = &p->soft.tm.h.shaper_profiles;
+ struct tm_shaper_profile *sp;
+
+ TAILQ_FOREACH(sp, spl, node)
+ if (shaper_profile_id == sp->shaper_profile_id)
+ return sp;
+
+ return NULL;
+}
+
+static struct tm_shared_shaper *
+tm_shared_shaper_search(struct rte_eth_dev *dev, uint32_t shared_shaper_id)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_shared_shaper_list *ssl = &p->soft.tm.h.shared_shapers;
+ struct tm_shared_shaper *ss;
+
+ TAILQ_FOREACH(ss, ssl, node)
+ if (shared_shaper_id == ss->shared_shaper_id)
+ return ss;
+
+ return NULL;
+}
+
+static struct tm_wred_profile *
+tm_wred_profile_search(struct rte_eth_dev *dev, uint32_t wred_profile_id)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_wred_profile_list *wpl = &p->soft.tm.h.wred_profiles;
+ struct tm_wred_profile *wp;
+
+ TAILQ_FOREACH(wp, wpl, node)
+ if (wred_profile_id == wp->wred_profile_id)
+ return wp;
+
+ return NULL;
}
static struct tm_node *
@@ -194,6 +321,94 @@ tm_node_search(struct rte_eth_dev *dev, uint32_t node_id)
return NULL;
}
+static struct tm_node *
+tm_root_node_present(struct rte_eth_dev *dev)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_node_list *nl = &p->soft.tm.h.nodes;
+ struct tm_node *n;
+
+ TAILQ_FOREACH(n, nl, node)
+ if (n->parent_node_id == RTE_TM_NODE_ID_NULL)
+ return n;
+
+ return NULL;
+}
+
+static uint32_t
+tm_node_subport_id(struct rte_eth_dev *dev, struct tm_node *subport_node)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_node_list *nl = &p->soft.tm.h.nodes;
+ struct tm_node *ns;
+ uint32_t subport_id;
+
+ subport_id = 0;
+ TAILQ_FOREACH(ns, nl, node) {
+ if (ns->level != TM_NODE_LEVEL_SUBPORT)
+ continue;
+
+ if (ns->node_id == subport_node->node_id)
+ return subport_id;
+
+ subport_id++;
+ }
+
+ return UINT32_MAX;
+}
+
+static uint32_t
+tm_node_pipe_id(struct rte_eth_dev *dev, struct tm_node *pipe_node)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_node_list *nl = &p->soft.tm.h.nodes;
+ struct tm_node *np;
+ uint32_t pipe_id;
+
+ pipe_id = 0;
+ TAILQ_FOREACH(np, nl, node) {
+ if ((np->level != TM_NODE_LEVEL_PIPE) ||
+ (np->parent_node_id != pipe_node->parent_node_id))
+ continue;
+
+ if (np->node_id == pipe_node->node_id)
+ return pipe_id;
+
+ pipe_id++;
+ }
+
+ return UINT32_MAX;
+}
+
+static uint32_t
+tm_node_tc_id(struct rte_eth_dev *dev __rte_unused, struct tm_node *tc_node)
+{
+ return tc_node->priority;
+}
+
+static uint32_t
+tm_node_queue_id(struct rte_eth_dev *dev, struct tm_node *queue_node)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_node_list *nl = &p->soft.tm.h.nodes;
+ struct tm_node *nq;
+ uint32_t queue_id;
+
+ queue_id = 0;
+ TAILQ_FOREACH(nq, nl, node) {
+ if ((nq->level != TM_NODE_LEVEL_QUEUE) ||
+ (nq->parent_node_id != queue_node->parent_node_id))
+ continue;
+
+ if (nq->node_id == queue_node->node_id)
+ return queue_id;
+
+ queue_id++;
+ }
+
+ return UINT32_MAX;
+}
+
static uint32_t
tm_level_get_max_nodes(struct rte_eth_dev *dev, enum tm_node_level level)
{
@@ -219,6 +434,35 @@ tm_level_get_max_nodes(struct rte_eth_dev *dev, enum tm_node_level level)
}
}
+/* Traffic manager node type get */
+static int
+pmd_tm_node_type_get(struct rte_eth_dev *dev,
+ uint32_t node_id,
+ int *is_leaf,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ if (is_leaf == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+
+ if ((node_id == RTE_TM_NODE_ID_NULL) ||
+ (tm_node_search(dev, node_id) == NULL))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ *is_leaf = node_id < p->params.soft.tm.nb_queues;
+
+ return 0;
+}
+
#ifdef RTE_SCHED_RED
#define WRED_SUPPORTED 1
#else
@@ -674,8 +918,2535 @@ pmd_tm_node_capabilities_get(struct rte_eth_dev *dev __rte_unused,
return 0;
}
-const struct rte_tm_ops pmd_tm_ops = {
- .capabilities_get = pmd_tm_capabilities_get,
- .level_capabilities_get = pmd_tm_level_capabilities_get,
- .node_capabilities_get = pmd_tm_node_capabilities_get,
+static int
+shaper_profile_check(struct rte_eth_dev *dev,
+ uint32_t shaper_profile_id,
+ struct rte_tm_shaper_params *profile,
+ struct rte_tm_error *error)
+{
+ struct tm_shaper_profile *sp;
+
+ /* Shaper profile ID must not be NONE. */
+ if (shaper_profile_id == RTE_TM_SHAPER_PROFILE_ID_NONE)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_SHAPER_PROFILE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Shaper profile must not exist. */
+ sp = tm_shaper_profile_search(dev, shaper_profile_id);
+ if (sp)
+ return -rte_tm_error_set(error,
+ EEXIST,
+ RTE_TM_ERROR_TYPE_SHAPER_PROFILE_ID,
+ NULL,
+ rte_strerror(EEXIST));
+
+ /* Profile must not be NULL. */
+ if (profile == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_SHAPER_PROFILE,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Peak rate: non-zero, 32-bit */
+ if ((profile->peak.rate == 0) ||
+ (profile->peak.rate >= UINT32_MAX))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_SHAPER_PROFILE_PEAK_RATE,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Peak size: non-zero, 32-bit */
+ if ((profile->peak.size == 0) ||
+ (profile->peak.size >= UINT32_MAX))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_SHAPER_PROFILE_PEAK_SIZE,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Dual-rate profiles are not supported. */
+ if (profile->committed.rate != 0)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_SHAPER_PROFILE_COMMITTED_RATE,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Packet length adjust: 24 bytes */
+ if (profile->pkt_length_adjust != RTE_TM_ETH_FRAMING_OVERHEAD_FCS)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_SHAPER_PROFILE_PKT_ADJUST_LEN,
+ NULL,
+ rte_strerror(EINVAL));
+
+ return 0;
+}
+
+/* Traffic manager shaper profile add */
+static int
+pmd_tm_shaper_profile_add(struct rte_eth_dev *dev,
+ uint32_t shaper_profile_id,
+ struct rte_tm_shaper_params *profile,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_shaper_profile_list *spl = &p->soft.tm.h.shaper_profiles;
+ struct tm_shaper_profile *sp;
+ int status;
+
+ /* Check input params */
+ status = shaper_profile_check(dev, shaper_profile_id, profile, error);
+ if (status)
+ return status;
+
+ /* Memory allocation */
+ sp = calloc(1, sizeof(struct tm_shaper_profile));
+ if (sp == NULL)
+ return -rte_tm_error_set(error,
+ ENOMEM,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(ENOMEM));
+
+ /* Fill in */
+ sp->shaper_profile_id = shaper_profile_id;
+ memcpy(&sp->params, profile, sizeof(sp->params));
+
+ /* Add to list */
+ TAILQ_INSERT_TAIL(spl, sp, node);
+ p->soft.tm.h.n_shaper_profiles++;
+
+ return 0;
+}
+
+/* Traffic manager shaper profile delete */
+static int
+pmd_tm_shaper_profile_delete(struct rte_eth_dev *dev,
+ uint32_t shaper_profile_id,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_shaper_profile *sp;
+
+ /* Check existing */
+ sp = tm_shaper_profile_search(dev, shaper_profile_id);
+ if (sp == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_SHAPER_PROFILE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Check unused */
+ if (sp->n_users)
+ return -rte_tm_error_set(error,
+ EBUSY,
+ RTE_TM_ERROR_TYPE_SHAPER_PROFILE_ID,
+ NULL,
+ rte_strerror(EBUSY));
+
+ /* Remove from list */
+ TAILQ_REMOVE(&p->soft.tm.h.shaper_profiles, sp, node);
+ p->soft.tm.h.n_shaper_profiles--;
+ free(sp);
+
+ return 0;
+}
+
+static struct tm_node *
+tm_shared_shaper_get_tc(struct rte_eth_dev *dev,
+ struct tm_shared_shaper *ss)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_node_list *nl = &p->soft.tm.h.nodes;
+ struct tm_node *n;
+
+ /* Subport: each TC uses shared shaper */
+ TAILQ_FOREACH(n, nl, node) {
+ if ((n->level != TM_NODE_LEVEL_TC) ||
+ (n->params.n_shared_shapers == 0) ||
+ (n->params.shared_shaper_id[0] != ss->shared_shaper_id))
+ continue;
+
+ return n;
+ }
+
+ return NULL;
+}
+
+static int
+update_subport_tc_rate(struct rte_eth_dev *dev,
+ struct tm_node *nt,
+ struct tm_shared_shaper *ss,
+ struct tm_shaper_profile *sp_new)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ uint32_t tc_id = tm_node_tc_id(dev, nt);
+
+ struct tm_node *np = nt->parent_node;
+
+ struct tm_node *ns = np->parent_node;
+ uint32_t subport_id = tm_node_subport_id(dev, ns);
+
+ struct rte_sched_subport_params subport_params;
+
+ struct tm_shaper_profile *sp_old = tm_shaper_profile_search(dev,
+ ss->shaper_profile_id);
+
+ /* Derive new subport configuration. */
+ memcpy(&subport_params,
+ &p->soft.tm.params.subport_params[subport_id],
+ sizeof(subport_params));
+ subport_params.tc_rate[tc_id] = sp_new->params.peak.rate;
+
+ /* Update the subport configuration. */
+ if (rte_sched_subport_config(p->soft.tm.sched,
+ subport_id, &subport_params))
+ return -1;
+
+ /* Commit changes. */
+ sp_old->n_users--;
+
+ ss->shaper_profile_id = sp_new->shaper_profile_id;
+ sp_new->n_users++;
+
+ memcpy(&p->soft.tm.params.subport_params[subport_id],
+ &subport_params,
+ sizeof(subport_params));
+
+ return 0;
+}
+
+/* Traffic manager shared shaper add/update */
+static int
+pmd_tm_shared_shaper_add_update(struct rte_eth_dev *dev,
+ uint32_t shared_shaper_id,
+ uint32_t shaper_profile_id,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_shared_shaper *ss;
+ struct tm_shaper_profile *sp;
+ struct tm_node *nt;
+
+ /* Shaper profile must be valid. */
+ sp = tm_shaper_profile_search(dev, shaper_profile_id);
+ if (sp == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_SHAPER_PROFILE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /**
+ * Add new shared shaper
+ */
+ ss = tm_shared_shaper_search(dev, shared_shaper_id);
+ if (ss == NULL) {
+ struct tm_shared_shaper_list *ssl =
+ &p->soft.tm.h.shared_shapers;
+
+ /* Hierarchy must not be frozen */
+ if (p->soft.tm.hierarchy_frozen)
+ return -rte_tm_error_set(error,
+ EBUSY,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EBUSY));
+
+ /* Memory allocation */
+ ss = calloc(1, sizeof(struct tm_shared_shaper));
+ if (ss == NULL)
+ return -rte_tm_error_set(error,
+ ENOMEM,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(ENOMEM));
+
+ /* Fill in */
+ ss->shared_shaper_id = shared_shaper_id;
+ ss->shaper_profile_id = shaper_profile_id;
+
+ /* Add to list */
+ TAILQ_INSERT_TAIL(ssl, ss, node);
+ p->soft.tm.h.n_shared_shapers++;
+
+ return 0;
+ }
+
+ /**
+ * Update existing shared shaper
+ */
+ /* Hierarchy must be frozen (run-time update) */
+ if (p->soft.tm.hierarchy_frozen == 0)
+ return -rte_tm_error_set(error,
+ EBUSY,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EBUSY));
+
+
+ /* Propagate change. */
+ nt = tm_shared_shaper_get_tc(dev, ss);
+ if (update_subport_tc_rate(dev, nt, ss, sp))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+
+ return 0;
+}
+
+/* Traffic manager shared shaper delete */
+static int
+pmd_tm_shared_shaper_delete(struct rte_eth_dev *dev,
+ uint32_t shared_shaper_id,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_shared_shaper *ss;
+
+ /* Check existing */
+ ss = tm_shared_shaper_search(dev, shared_shaper_id);
+ if (ss == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_SHARED_SHAPER_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Check unused */
+ if (ss->n_users)
+ return -rte_tm_error_set(error,
+ EBUSY,
+ RTE_TM_ERROR_TYPE_SHARED_SHAPER_ID,
+ NULL,
+ rte_strerror(EBUSY));
+
+ /* Remove from list */
+ TAILQ_REMOVE(&p->soft.tm.h.shared_shapers, ss, node);
+ p->soft.tm.h.n_shared_shapers--;
+ free(ss);
+
+ return 0;
+}
+
+static int
+wred_profile_check(struct rte_eth_dev *dev,
+ uint32_t wred_profile_id,
+ struct rte_tm_wred_params *profile,
+ struct rte_tm_error *error)
+{
+ struct tm_wred_profile *wp;
+ enum rte_tm_color color;
+
+ /* WRED profile ID must not be NONE. */
+ if (wred_profile_id == RTE_TM_WRED_PROFILE_ID_NONE)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_WRED_PROFILE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* WRED profile must not exist. */
+ wp = tm_wred_profile_search(dev, wred_profile_id);
+ if (wp)
+ return -rte_tm_error_set(error,
+ EEXIST,
+ RTE_TM_ERROR_TYPE_WRED_PROFILE_ID,
+ NULL,
+ rte_strerror(EEXIST));
+
+ /* Profile must not be NULL. */
+ if (profile == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_WRED_PROFILE,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* min_th <= max_th, max_th > 0 */
+ for (color = RTE_TM_GREEN; color < RTE_TM_COLORS; color++) {
+ uint16_t min_th = profile->red_params[color].min_th;
+ uint16_t max_th = profile->red_params[color].max_th;
+
+ if ((min_th > max_th) || (max_th == 0))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_WRED_PROFILE,
+ NULL,
+ rte_strerror(EINVAL));
+ }
+
+ return 0;
+}
+
+/* Traffic manager WRED profile add */
+static int
+pmd_tm_wred_profile_add(struct rte_eth_dev *dev,
+ uint32_t wred_profile_id,
+ struct rte_tm_wred_params *profile,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_wred_profile_list *wpl = &p->soft.tm.h.wred_profiles;
+ struct tm_wred_profile *wp;
+ int status;
+
+ /* Check input params */
+ status = wred_profile_check(dev, wred_profile_id, profile, error);
+ if (status)
+ return status;
+
+ /* Memory allocation */
+ wp = calloc(1, sizeof(struct tm_wred_profile));
+ if (wp == NULL)
+ return -rte_tm_error_set(error,
+ ENOMEM,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(ENOMEM));
+
+ /* Fill in */
+ wp->wred_profile_id = wred_profile_id;
+ memcpy(&wp->params, profile, sizeof(wp->params));
+
+ /* Add to list */
+ TAILQ_INSERT_TAIL(wpl, wp, node);
+ p->soft.tm.h.n_wred_profiles++;
+
+ return 0;
+}
+
+/* Traffic manager WRED profile delete */
+static int
+pmd_tm_wred_profile_delete(struct rte_eth_dev *dev,
+ uint32_t wred_profile_id,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_wred_profile *wp;
+
+ /* Check existing */
+ wp = tm_wred_profile_search(dev, wred_profile_id);
+ if (wp == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_WRED_PROFILE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Check unused */
+ if (wp->n_users)
+ return -rte_tm_error_set(error,
+ EBUSY,
+ RTE_TM_ERROR_TYPE_WRED_PROFILE_ID,
+ NULL,
+ rte_strerror(EBUSY));
+
+ /* Remove from list */
+ TAILQ_REMOVE(&p->soft.tm.h.wred_profiles, wp, node);
+ p->soft.tm.h.n_wred_profiles--;
+ free(wp);
+
+ return 0;
+}
+
+static int
+node_add_check_port(struct rte_eth_dev *dev,
+ uint32_t node_id,
+ uint32_t parent_node_id __rte_unused,
+ uint32_t priority,
+ uint32_t weight,
+ uint32_t level_id __rte_unused,
+ struct rte_tm_node_params *params,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_shaper_profile *sp = tm_shaper_profile_search(dev,
+ params->shaper_profile_id);
+
+ /* node type: non-leaf */
+ if (node_id < p->params.soft.tm.nb_queues)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Priority must be 0 */
+ if (priority != 0)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PRIORITY,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Weight must be 1 */
+ if (weight != 1)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_WEIGHT,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Shaper must be valid.
+ * Shaper profile peak rate must fit the configured port rate.
+ */
+ if ((params->shaper_profile_id == RTE_TM_SHAPER_PROFILE_ID_NONE) ||
+ (sp == NULL) ||
+ (sp->params.peak.rate > p->params.soft.tm.rate))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_SHAPER_PROFILE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* No shared shapers */
+ if (params->n_shared_shapers != 0)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_N_SHARED_SHAPERS,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Number of SP priorities must be 1 */
+ if (params->nonleaf.n_sp_priorities != 1)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_N_SP_PRIORITIES,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Stats */
+ if (params->stats_mask & (~STATS_MASK_DEFAULT))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_STATS,
+ NULL,
+ rte_strerror(EINVAL));
+
+ return 0;
+}
+
+static int
+node_add_check_subport(struct rte_eth_dev *dev,
+ uint32_t node_id,
+ uint32_t parent_node_id __rte_unused,
+ uint32_t priority,
+ uint32_t weight,
+ uint32_t level_id __rte_unused,
+ struct rte_tm_node_params *params,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ /* node type: non-leaf */
+ if (node_id < p->params.soft.tm.nb_queues)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Priority must be 0 */
+ if (priority != 0)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PRIORITY,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Weight must be 1 */
+ if (weight != 1)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_WEIGHT,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Shaper must be valid */
+ if ((params->shaper_profile_id == RTE_TM_SHAPER_PROFILE_ID_NONE) ||
+ (!tm_shaper_profile_search(dev, params->shaper_profile_id)))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_SHAPER_PROFILE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* No shared shapers */
+ if (params->n_shared_shapers != 0)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_N_SHARED_SHAPERS,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Number of SP priorities must be 1 */
+ if (params->nonleaf.n_sp_priorities != 1)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_N_SP_PRIORITIES,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Stats */
+ if (params->stats_mask & (~STATS_MASK_DEFAULT))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_STATS,
+ NULL,
+ rte_strerror(EINVAL));
+
+ return 0;
+}
+
+static int
+node_add_check_pipe(struct rte_eth_dev *dev,
+ uint32_t node_id,
+ uint32_t parent_node_id __rte_unused,
+ uint32_t priority,
+ uint32_t weight __rte_unused,
+ uint32_t level_id __rte_unused,
+ struct rte_tm_node_params *params,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ /* node type: non-leaf */
+ if (node_id < p->params.soft.tm.nb_queues)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Priority must be 0 */
+ if (priority != 0)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PRIORITY,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Shaper must be valid */
+ if ((params->shaper_profile_id == RTE_TM_SHAPER_PROFILE_ID_NONE) ||
+ (!tm_shaper_profile_search(dev, params->shaper_profile_id)))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_SHAPER_PROFILE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* No shared shapers */
+ if (params->n_shared_shapers != 0)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_N_SHARED_SHAPERS,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Number of SP priorities must be 4 */
+ if (params->nonleaf.n_sp_priorities !=
+ RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_N_SP_PRIORITIES,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* WFQ mode must be byte mode */
+ if ((params->nonleaf.wfq_weight_mode != NULL) &&
+ (params->nonleaf.wfq_weight_mode[0] != 0) &&
+ (params->nonleaf.wfq_weight_mode[1] != 0) &&
+ (params->nonleaf.wfq_weight_mode[2] != 0) &&
+ (params->nonleaf.wfq_weight_mode[3] != 0))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_WFQ_WEIGHT_MODE,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Stats */
+ if (params->stats_mask & (~STATS_MASK_DEFAULT))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_STATS,
+ NULL,
+ rte_strerror(EINVAL));
+
+ return 0;
+}
+
+static int
+node_add_check_tc(struct rte_eth_dev *dev,
+ uint32_t node_id,
+ uint32_t parent_node_id __rte_unused,
+ uint32_t priority __rte_unused,
+ uint32_t weight,
+ uint32_t level_id __rte_unused,
+ struct rte_tm_node_params *params,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ /* node type: non-leaf */
+ if (node_id < p->params.soft.tm.nb_queues)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Weight must be 1 */
+ if (weight != 1)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_WEIGHT,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Shaper must be valid */
+ if ((params->shaper_profile_id == RTE_TM_SHAPER_PROFILE_ID_NONE) ||
+ (!tm_shaper_profile_search(dev, params->shaper_profile_id)))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_SHAPER_PROFILE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Single valid shared shaper */
+ if (params->n_shared_shapers > 1)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_N_SHARED_SHAPERS,
+ NULL,
+ rte_strerror(EINVAL));
+
+ if ((params->n_shared_shapers == 1) &&
+ ((params->shared_shaper_id == NULL) ||
+ (!tm_shared_shaper_search(dev, params->shared_shaper_id[0]))))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_SHARED_SHAPER_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Number of SP priorities must be 1 */
+ if (params->nonleaf.n_sp_priorities != 1)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_N_SP_PRIORITIES,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Stats */
+ if (params->stats_mask & (~STATS_MASK_DEFAULT))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_STATS,
+ NULL,
+ rte_strerror(EINVAL));
+
+ return 0;
+}
+
+static int
+node_add_check_queue(struct rte_eth_dev *dev,
+ uint32_t node_id,
+ uint32_t parent_node_id __rte_unused,
+ uint32_t priority,
+ uint32_t weight __rte_unused,
+ uint32_t level_id __rte_unused,
+ struct rte_tm_node_params *params,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ /* node type: leaf */
+ if (node_id >= p->params.soft.tm.nb_queues)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Priority must be 0 */
+ if (priority != 0)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PRIORITY,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* No shaper */
+ if (params->shaper_profile_id != RTE_TM_SHAPER_PROFILE_ID_NONE)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_SHAPER_PROFILE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* No shared shapers */
+ if (params->n_shared_shapers != 0)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_N_SHARED_SHAPERS,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Congestion management must not be head drop */
+ if (params->leaf.cman == RTE_TM_CMAN_HEAD_DROP)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_CMAN,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Congestion management set to WRED */
+ if (params->leaf.cman == RTE_TM_CMAN_WRED) {
+ uint32_t wred_profile_id = params->leaf.wred.wred_profile_id;
+ struct tm_wred_profile *wp = tm_wred_profile_search(dev,
+ wred_profile_id);
+
+ /* WRED profile (for private WRED context) must be valid */
+ if ((wred_profile_id == RTE_TM_WRED_PROFILE_ID_NONE) ||
+ (wp == NULL))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_WRED_PROFILE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* No shared WRED contexts */
+ if (params->leaf.wred.n_shared_wred_contexts != 0)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_N_SHARED_WRED_CONTEXTS,
+ NULL,
+ rte_strerror(EINVAL));
+ }
+
+ /* Stats */
+ if (params->stats_mask & (~STATS_MASK_QUEUE))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_STATS,
+ NULL,
+ rte_strerror(EINVAL));
+
+ return 0;
+}
+
+static int
+node_add_check(struct rte_eth_dev *dev,
+ uint32_t node_id,
+ uint32_t parent_node_id,
+ uint32_t priority,
+ uint32_t weight,
+ uint32_t level_id,
+ struct rte_tm_node_params *params,
+ struct rte_tm_error *error)
+{
+ struct tm_node *pn;
+ uint32_t level;
+ int status;
+
+ /* node_id, parent_node_id:
+ * -node_id must not be RTE_TM_NODE_ID_NULL
+ * -node_id must not be in use
+ * -root node add (parent_node_id is RTE_TM_NODE_ID_NULL):
+ * -root node must not exist
+ * -non-root node add (parent_node_id is not RTE_TM_NODE_ID_NULL):
+ * -parent_node_id must be valid
+ */
+ if (node_id == RTE_TM_NODE_ID_NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ if (tm_node_search(dev, node_id))
+ return -rte_tm_error_set(error,
+ EEXIST,
+ RTE_TM_ERROR_TYPE_NODE_ID,
+ NULL,
+ rte_strerror(EEXIST));
+
+ if (parent_node_id == RTE_TM_NODE_ID_NULL) {
+ pn = NULL;
+ if (tm_root_node_present(dev))
+ return -rte_tm_error_set(error,
+ EEXIST,
+ RTE_TM_ERROR_TYPE_NODE_PARENT_NODE_ID,
+ NULL,
+ rte_strerror(EEXIST));
+ } else {
+ pn = tm_node_search(dev, parent_node_id);
+ if (pn == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARENT_NODE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+ }
+
+ /* priority: must be 0 .. 3 */
+ if (priority >= RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PRIORITY,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* weight: must be 1 .. 255 */
+ if ((weight == 0) || (weight >= UINT8_MAX))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_WEIGHT,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* level_id: if valid, then
+ * -root node add (parent_node_id is RTE_TM_NODE_ID_NULL):
+ * -level_id must be zero
+ * -non-root node add (parent_node_id is not RTE_TM_NODE_ID_NULL):
+ * -level_id must be parent level ID plus one
+ */
+ level = (pn == NULL) ? 0 : pn->level + 1;
+ if ((level_id != RTE_TM_NODE_LEVEL_ID_ANY) && (level_id != level))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_LEVEL_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* params: must not be NULL */
+ if (params == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* params: per level checks */
+ switch (level) {
+ case TM_NODE_LEVEL_PORT:
+ status = node_add_check_port(dev, node_id,
+ parent_node_id, priority, weight, level_id,
+ params, error);
+ if (status)
+ return status;
+ break;
+
+ case TM_NODE_LEVEL_SUBPORT:
+ status = node_add_check_subport(dev, node_id,
+ parent_node_id, priority, weight, level_id,
+ params, error);
+ if (status)
+ return status;
+ break;
+
+ case TM_NODE_LEVEL_PIPE:
+ status = node_add_check_pipe(dev, node_id,
+ parent_node_id, priority, weight, level_id,
+ params, error);
+ if (status)
+ return status;
+ break;
+
+ case TM_NODE_LEVEL_TC:
+ status = node_add_check_tc(dev, node_id,
+ parent_node_id, priority, weight, level_id,
+ params, error);
+ if (status)
+ return status;
+ break;
+
+ case TM_NODE_LEVEL_QUEUE:
+ status = node_add_check_queue(dev, node_id,
+ parent_node_id, priority, weight, level_id,
+ params, error);
+ if (status)
+ return status;
+ break;
+
+ default:
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_LEVEL_ID,
+ NULL,
+ rte_strerror(EINVAL));
+ }
+
+ return 0;
+}
+
+/* Traffic manager node add */
+static int
+pmd_tm_node_add(struct rte_eth_dev *dev,
+ uint32_t node_id,
+ uint32_t parent_node_id,
+ uint32_t priority,
+ uint32_t weight,
+ uint32_t level_id,
+ struct rte_tm_node_params *params,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_node_list *nl = &p->soft.tm.h.nodes;
+ struct tm_node *n;
+ uint32_t i;
+ int status;
+
+ /* Checks */
+ if (p->soft.tm.hierarchy_frozen)
+ return -rte_tm_error_set(error,
+ EBUSY,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EBUSY));
+
+ status = node_add_check(dev, node_id, parent_node_id, priority, weight,
+ level_id, params, error);
+ if (status)
+ return status;
+
+ /* Memory allocation */
+ n = calloc(1, sizeof(struct tm_node));
+ if (n == NULL)
+ return -rte_tm_error_set(error,
+ ENOMEM,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(ENOMEM));
+
+ /* Fill in */
+ n->node_id = node_id;
+ n->parent_node_id = parent_node_id;
+ n->priority = priority;
+ n->weight = weight;
+
+ if (parent_node_id != RTE_TM_NODE_ID_NULL) {
+ n->parent_node = tm_node_search(dev, parent_node_id);
+ n->level = n->parent_node->level + 1;
+ }
+
+ if (params->shaper_profile_id != RTE_TM_SHAPER_PROFILE_ID_NONE)
+ n->shaper_profile = tm_shaper_profile_search(dev,
+ params->shaper_profile_id);
+
+ if ((n->level == TM_NODE_LEVEL_QUEUE) &&
+ (params->leaf.cman == RTE_TM_CMAN_WRED))
+ n->wred_profile = tm_wred_profile_search(dev,
+ params->leaf.wred.wred_profile_id);
+
+ memcpy(&n->params, params, sizeof(n->params));
+
+ /* Add to list */
+ TAILQ_INSERT_TAIL(nl, n, node);
+ p->soft.tm.h.n_nodes++;
+
+ /* Update dependencies */
+ if (n->parent_node)
+ n->parent_node->n_children++;
+
+ if (n->shaper_profile)
+ n->shaper_profile->n_users++;
+
+ for (i = 0; i < params->n_shared_shapers; i++) {
+ struct tm_shared_shaper *ss;
+
+ ss = tm_shared_shaper_search(dev, params->shared_shaper_id[i]);
+ ss->n_users++;
+ }
+
+ if (n->wred_profile)
+ n->wred_profile->n_users++;
+
+ p->soft.tm.h.n_tm_nodes[n->level]++;
+
+ return 0;
+}
+
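[Editor's note] For reference, a sketch (illustrative, not part of the patch) of the application-side calls that this handler services when building one branch of the hierarchy. The node IDs, shaper profile ID and stats mask are assumptions; per the checks above, non-leaf node IDs must be greater than or equal to the number of TX queues.

	struct rte_tm_error err;
	struct rte_tm_node_params np = {
		.shaper_profile_id = 0,			/* profile added earlier */
		.nonleaf = { .n_sp_priorities = 1 },
		.stats_mask = RTE_TM_STATS_N_PKTS | RTE_TM_STATS_N_BYTES,
	};

	/* Root node (port level): no parent, priority 0, weight 1. */
	rte_tm_node_add(port_id, 1000000, RTE_TM_NODE_ID_NULL, 0, 1,
		RTE_TM_NODE_LEVEL_ID_ANY, &np, &err);

	/* One subport node under the root; pipe, TC and queue nodes follow
	 * the same pattern (pipe nodes use n_sp_priorities = 4, TC nodes may
	 * attach a shared shaper, queue nodes use no shaper profile).
	 */
	rte_tm_node_add(port_id, 900000, 1000000, 0, 1,
		RTE_TM_NODE_LEVEL_ID_ANY, &np, &err);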
+/* Traffic manager node delete */
+static int
+pmd_tm_node_delete(struct rte_eth_dev *dev,
+ uint32_t node_id,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_node *n;
+ uint32_t i;
+
+ /* Check hierarchy changes are currently allowed */
+ if (p->soft.tm.hierarchy_frozen)
+ return -rte_tm_error_set(error,
+ EBUSY,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EBUSY));
+
+ /* Check existing */
+ n = tm_node_search(dev, node_id);
+ if (n == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Check unused */
+ if (n->n_children)
+ return -rte_tm_error_set(error,
+ EBUSY,
+ RTE_TM_ERROR_TYPE_NODE_ID,
+ NULL,
+ rte_strerror(EBUSY));
+
+ /* Update dependencies */
+ p->soft.tm.h.n_tm_nodes[n->level]--;
+
+ if (n->wred_profile)
+ n->wred_profile->n_users--;
+
+ for (i = 0; i < n->params.n_shared_shapers; i++) {
+ struct tm_shared_shaper *ss;
+
+ ss = tm_shared_shaper_search(dev,
+ n->params.shared_shaper_id[i]);
+ ss->n_users--;
+ }
+
+ if (n->shaper_profile)
+ n->shaper_profile->n_users--;
+
+ if (n->parent_node)
+ n->parent_node->n_children--;
+
+ /* Remove from list */
+ TAILQ_REMOVE(&p->soft.tm.h.nodes, n, node);
+ p->soft.tm.h.n_nodes--;
+ free(n);
+
+ return 0;
+}
+
+
+static void
+pipe_profile_build(struct rte_eth_dev *dev,
+ struct tm_node *np,
+ struct rte_sched_pipe_params *pp)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_hierarchy *h = &p->soft.tm.h;
+ struct tm_node_list *nl = &h->nodes;
+ struct tm_node *nt, *nq;
+
+ memset(pp, 0, sizeof(*pp));
+
+ /* Pipe */
+ pp->tb_rate = np->shaper_profile->params.peak.rate;
+ pp->tb_size = np->shaper_profile->params.peak.size;
+
+ /* Traffic Class (TC) */
+ pp->tc_period = PIPE_TC_PERIOD;
+
+#ifdef RTE_SCHED_SUBPORT_TC_OV
+ pp->tc_ov_weight = np->weight;
+#endif
+
+ TAILQ_FOREACH(nt, nl, node) {
+ uint32_t queue_id = 0;
+
+ if ((nt->level != TM_NODE_LEVEL_TC) ||
+ (nt->parent_node_id != np->node_id))
+ continue;
+
+ pp->tc_rate[nt->priority] =
+ nt->shaper_profile->params.peak.rate;
+
+ /* Queue */
+ TAILQ_FOREACH(nq, nl, node) {
+ uint32_t pipe_queue_id;
+
+ if ((nq->level != TM_NODE_LEVEL_QUEUE) ||
+ (nq->parent_node_id != nt->node_id))
+ continue;
+
+ pipe_queue_id = nt->priority *
+ RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS + queue_id;
+ pp->wrr_weights[pipe_queue_id] = nq->weight;
+
+ queue_id++;
+ }
+ }
+}
+
+static int
+pipe_profile_free_exists(struct rte_eth_dev *dev,
+ uint32_t *pipe_profile_id)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_params *t = &p->soft.tm.params;
+
+ if (t->n_pipe_profiles < RTE_SCHED_PIPE_PROFILES_PER_PORT) {
+ *pipe_profile_id = t->n_pipe_profiles;
+ return 1;
+ }
+
+ return 0;
+}
+
+static int
+pipe_profile_exists(struct rte_eth_dev *dev,
+ struct rte_sched_pipe_params *pp,
+ uint32_t *pipe_profile_id)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_params *t = &p->soft.tm.params;
+ uint32_t i;
+
+ for (i = 0; i < t->n_pipe_profiles; i++)
+ if (memcmp(&t->pipe_profiles[i], pp, sizeof(*pp)) == 0) {
+ if (pipe_profile_id)
+ *pipe_profile_id = i;
+ return 1;
+ }
+
+ return 0;
+}
+
+static void
+pipe_profile_install(struct rte_eth_dev *dev,
+ struct rte_sched_pipe_params *pp,
+ uint32_t pipe_profile_id)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_params *t = &p->soft.tm.params;
+
+ memcpy(&t->pipe_profiles[pipe_profile_id], pp, sizeof(*pp));
+ t->n_pipe_profiles++;
+}
+
+static void
+pipe_profile_mark(struct rte_eth_dev *dev,
+ uint32_t subport_id,
+ uint32_t pipe_id,
+ uint32_t pipe_profile_id)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_hierarchy *h = &p->soft.tm.h;
+ struct tm_params *t = &p->soft.tm.params;
+ uint32_t n_pipes_per_subport, pos;
+
+ n_pipes_per_subport = h->n_tm_nodes[TM_NODE_LEVEL_PIPE] /
+ h->n_tm_nodes[TM_NODE_LEVEL_SUBPORT];
+ pos = subport_id * n_pipes_per_subport + pipe_id;
+
+ t->pipe_to_profile[pos] = pipe_profile_id;
+}
+
+static struct rte_sched_pipe_params *
+pipe_profile_get(struct rte_eth_dev *dev, struct tm_node *np)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_hierarchy *h = &p->soft.tm.h;
+ struct tm_params *t = &p->soft.tm.params;
+ uint32_t n_pipes_per_subport = h->n_tm_nodes[TM_NODE_LEVEL_PIPE] /
+ h->n_tm_nodes[TM_NODE_LEVEL_SUBPORT];
+
+ uint32_t subport_id = tm_node_subport_id(dev, np->parent_node);
+ uint32_t pipe_id = tm_node_pipe_id(dev, np);
+
+ uint32_t pos = subport_id * n_pipes_per_subport + pipe_id;
+ uint32_t pipe_profile_id = t->pipe_to_profile[pos];
+
+ return &t->pipe_profiles[pipe_profile_id];
+}
+
+static int
+pipe_profiles_generate(struct rte_eth_dev *dev)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_hierarchy *h = &p->soft.tm.h;
+ struct tm_node_list *nl = &h->nodes;
+ struct tm_node *ns, *np;
+ uint32_t subport_id;
+
+ /* Objective: Fill in the following fields in struct tm_params:
+ * - pipe_profiles
+ * - n_pipe_profiles
+ * - pipe_to_profile
+ */
+
+ subport_id = 0;
+ TAILQ_FOREACH(ns, nl, node) {
+ uint32_t pipe_id;
+
+ if (ns->level != TM_NODE_LEVEL_SUBPORT)
+ continue;
+
+ pipe_id = 0;
+ TAILQ_FOREACH(np, nl, node) {
+ struct rte_sched_pipe_params pp;
+ uint32_t pos;
+
+ if ((np->level != TM_NODE_LEVEL_PIPE) ||
+ (np->parent_node_id != ns->node_id))
+ continue;
+
+ pipe_profile_build(dev, np, &pp);
+
+ if (!pipe_profile_exists(dev, &pp, &pos)) {
+ if (!pipe_profile_free_exists(dev, &pos))
+ return -1;
+
+ pipe_profile_install(dev, &pp, pos);
+ }
+
+ pipe_profile_mark(dev, subport_id, pipe_id, pos);
+
+ pipe_id++;
+ }
+
+ subport_id++;
+ }
+
+ return 0;
+}
+
+static struct tm_wred_profile *
+tm_tc_wred_profile_get(struct rte_eth_dev *dev, uint32_t tc_id)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_hierarchy *h = &p->soft.tm.h;
+ struct tm_node_list *nl = &h->nodes;
+ struct tm_node *nq;
+
+ TAILQ_FOREACH(nq, nl, node) {
+ if ((nq->level != TM_NODE_LEVEL_QUEUE) ||
+ (nq->parent_node->priority != tc_id))
+ continue;
+
+ return nq->wred_profile;
+ }
+
+ return NULL;
+}
+
+#ifdef RTE_SCHED_RED
+
+static void
+wred_profiles_set(struct rte_eth_dev *dev)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct rte_sched_port_params *pp = &p->soft.tm.params.port_params;
+ uint32_t tc_id;
+ enum rte_tm_color color;
+
+ for (tc_id = 0; tc_id < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; tc_id++)
+ for (color = RTE_TM_GREEN; color < RTE_TM_COLORS; color++) {
+ struct rte_red_params *dst =
+ &pp->red_params[tc_id][color];
+ struct tm_wred_profile *src_wp =
+ tm_tc_wred_profile_get(dev, tc_id);
+ struct rte_tm_red_params *src =
+ &src_wp->params.red_params[color];
+
+ memcpy(dst, src, sizeof(*dst));
+ }
+}
+
+#else
+
+#define wred_profiles_set(dev)
+
+#endif
+
+static struct tm_shared_shaper *
+tm_tc_shared_shaper_get(struct rte_eth_dev *dev, struct tm_node *tc_node)
+{
+ return (tc_node->params.n_shared_shapers) ?
+ tm_shared_shaper_search(dev,
+ tc_node->params.shared_shaper_id[0]) :
+ NULL;
+}
+
+static struct tm_shared_shaper *
+tm_subport_tc_shared_shaper_get(struct rte_eth_dev *dev,
+ struct tm_node *subport_node,
+ uint32_t tc_id)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_node_list *nl = &p->soft.tm.h.nodes;
+ struct tm_node *n;
+
+ TAILQ_FOREACH(n, nl, node) {
+ if ((n->level != TM_NODE_LEVEL_TC) ||
+ (n->parent_node->parent_node_id !=
+ subport_node->node_id) ||
+ (n->priority != tc_id))
+ continue;
+
+ return tm_tc_shared_shaper_get(dev, n);
+ }
+
+ return NULL;
+}
+
+static int
+hierarchy_commit_check(struct rte_eth_dev *dev, struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_hierarchy *h = &p->soft.tm.h;
+ struct tm_node_list *nl = &h->nodes;
+ struct tm_shared_shaper_list *ssl = &h->shared_shapers;
+ struct tm_wred_profile_list *wpl = &h->wred_profiles;
+ struct tm_node *nr = tm_root_node_present(dev), *ns, *np, *nt, *nq;
+ struct tm_shared_shaper *ss;
+
+ uint32_t n_pipes_per_subport;
+
+ /* Root node exists. */
+ if (nr == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_LEVEL_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* There is at least one subport, max is not exceeded. */
+ if ((nr->n_children == 0) || (nr->n_children > TM_MAX_SUBPORTS))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_LEVEL_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* There is at least one pipe. */
+ if (h->n_tm_nodes[TM_NODE_LEVEL_PIPE] == 0)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_LEVEL_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Number of pipes is the same for all subports. Maximum number of pipes
+ * per subport is not exceeded.
+ */
+ n_pipes_per_subport = h->n_tm_nodes[TM_NODE_LEVEL_PIPE] /
+ h->n_tm_nodes[TM_NODE_LEVEL_SUBPORT];
+
+ if (n_pipes_per_subport > TM_MAX_PIPES_PER_SUBPORT)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+
+ TAILQ_FOREACH(ns, nl, node) {
+ if (ns->level != TM_NODE_LEVEL_SUBPORT)
+ continue;
+
+ if (ns->n_children != n_pipes_per_subport)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ }
+
+ /* Each pipe has exactly 4 TCs, with exactly one TC for each priority */
+ TAILQ_FOREACH(np, nl, node) {
+ uint32_t mask = 0, mask_expected =
+ RTE_LEN2MASK(RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE,
+ uint32_t);
+
+ if (np->level != TM_NODE_LEVEL_PIPE)
+ continue;
+
+ if (np->n_children != RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+
+ TAILQ_FOREACH(nt, nl, node) {
+ if ((nt->level != TM_NODE_LEVEL_TC) ||
+ (nt->parent_node_id != np->node_id))
+ continue;
+
+ mask |= 1 << nt->priority;
+ }
+
+ if (mask != mask_expected)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ }
+
+ /* Each TC has exactly 4 packet queues. */
+ TAILQ_FOREACH(nt, nl, node) {
+ if (nt->level != TM_NODE_LEVEL_TC)
+ continue;
+
+ if (nt->n_children != RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ }
+
+ /**
+ * Shared shapers:
+ * -For each TC #i, all pipes in the same subport use the same
+ * shared shaper (or no shared shaper) for their TC#i.
+ * -Each shared shaper needs to have at least one user. All its
+ * users have to be TC nodes with the same priority and the same
+ * subport.
+ */
+ TAILQ_FOREACH(ns, nl, node) {
+ struct tm_shared_shaper *s[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
+ uint32_t id;
+
+ if (ns->level != TM_NODE_LEVEL_SUBPORT)
+ continue;
+
+ for (id = 0; id < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; id++)
+ s[id] = tm_subport_tc_shared_shaper_get(dev, ns, id);
+
+ TAILQ_FOREACH(nt, nl, node) {
+ struct tm_shared_shaper *subport_ss, *tc_ss;
+
+ if ((nt->level != TM_NODE_LEVEL_TC) ||
+ (nt->parent_node->parent_node_id !=
+ ns->node_id))
+ continue;
+
+ subport_ss = s[nt->priority];
+ tc_ss = tm_tc_shared_shaper_get(dev, nt);
+
+ if ((subport_ss == NULL) && (tc_ss == NULL))
+ continue;
+
+ if (((subport_ss == NULL) && (tc_ss != NULL)) ||
+ ((subport_ss != NULL) && (tc_ss == NULL)) ||
+ (subport_ss->shared_shaper_id !=
+ tc_ss->shared_shaper_id))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ }
+ }
+
+ TAILQ_FOREACH(ss, ssl, node) {
+ struct tm_node *nt_any = tm_shared_shaper_get_tc(dev, ss);
+ uint32_t n_users = 0;
+
+ if (nt_any != NULL)
+ TAILQ_FOREACH(nt, nl, node) {
+ if ((nt->level != TM_NODE_LEVEL_TC) ||
+ (nt->priority != nt_any->priority) ||
+ (nt->parent_node->parent_node_id !=
+ nt_any->parent_node->parent_node_id))
+ continue;
+
+ n_users++;
+ }
+
+ if ((ss->n_users == 0) || (ss->n_users != n_users))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ }
+
+ /* Not too many pipe profiles. */
+ if (pipe_profiles_generate(dev))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /**
+ * WRED (when used, i.e. at least one WRED profile defined):
+ * -Each WRED profile must have at least one user.
+ * -All leaf nodes must have their private WRED context enabled.
+ * -For each TC #i, all leaf nodes must use the same WRED profile
+ * for their private WRED context.
+ */
+ if (h->n_wred_profiles) {
+ struct tm_wred_profile *wp;
+ struct tm_wred_profile *w[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
+ uint32_t id;
+
+ TAILQ_FOREACH(wp, wpl, node)
+ if (wp->n_users == 0)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+
+ for (id = 0; id < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; id++) {
+ w[id] = tm_tc_wred_profile_get(dev, id);
+
+ if (w[id] == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ }
+
+ TAILQ_FOREACH(nq, nl, node) {
+ uint32_t id;
+
+ if (nq->level != TM_NODE_LEVEL_QUEUE)
+ continue;
+
+ id = nq->parent_node->priority;
+
+ if ((nq->wred_profile == NULL) ||
+ (nq->wred_profile->wred_profile_id !=
+ w[id]->wred_profile_id))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ }
+ }
+
+ return 0;
+}
+
+static void
+hierarchy_blueprints_create(struct rte_eth_dev *dev)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_params *t = &p->soft.tm.params;
+ struct tm_hierarchy *h = &p->soft.tm.h;
+
+ struct tm_node_list *nl = &h->nodes;
+ struct tm_node *root = tm_root_node_present(dev), *n;
+
+ uint32_t subport_id;
+
+ t->port_params = (struct rte_sched_port_params) {
+ .name = dev->data->name,
+ .socket = dev->data->numa_node,
+ .rate = root->shaper_profile->params.peak.rate,
+ .mtu = dev->data->mtu,
+ .frame_overhead =
+ root->shaper_profile->params.pkt_length_adjust,
+ .n_subports_per_port = root->n_children,
+ .n_pipes_per_subport = h->n_tm_nodes[TM_NODE_LEVEL_PIPE] /
+ h->n_tm_nodes[TM_NODE_LEVEL_SUBPORT],
+ .qsize = {p->params.soft.tm.qsize[0],
+ p->params.soft.tm.qsize[1],
+ p->params.soft.tm.qsize[2],
+ p->params.soft.tm.qsize[3],
+ },
+ .pipe_profiles = t->pipe_profiles,
+ .n_pipe_profiles = t->n_pipe_profiles,
+ };
+
+ wred_profiles_set(dev);
+
+ subport_id = 0;
+ TAILQ_FOREACH(n, nl, node) {
+ uint64_t tc_rate[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
+ uint32_t i;
+
+ if (n->level != TM_NODE_LEVEL_SUBPORT)
+ continue;
+
+ for (i = 0; i < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; i++) {
+ struct tm_shared_shaper *ss;
+ struct tm_shaper_profile *sp;
+
+ ss = tm_subport_tc_shared_shaper_get(dev, n, i);
+ sp = (ss) ? tm_shaper_profile_search(dev,
+ ss->shaper_profile_id) :
+ n->shaper_profile;
+ tc_rate[i] = sp->params.peak.rate;
+ }
+
+ t->subport_params[subport_id] =
+ (struct rte_sched_subport_params) {
+ .tb_rate = n->shaper_profile->params.peak.rate,
+ .tb_size = n->shaper_profile->params.peak.size,
+
+ .tc_rate = {tc_rate[0],
+ tc_rate[1],
+ tc_rate[2],
+ tc_rate[3],
+ },
+ .tc_period = SUBPORT_TC_PERIOD,
+ };
+
+ subport_id++;
+ }
+}
+
+/* Traffic manager hierarchy commit */
+static int
+pmd_tm_hierarchy_commit(struct rte_eth_dev *dev,
+ int clear_on_fail,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ int status;
+
+ /* Checks */
+ if (p->soft.tm.hierarchy_frozen)
+ return -rte_tm_error_set(error,
+ EBUSY,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EBUSY));
+
+ status = hierarchy_commit_check(dev, error);
+ if (status) {
+ if (clear_on_fail) {
+ tm_hierarchy_uninit(p);
+ tm_hierarchy_init(p);
+ }
+
+ return status;
+ }
+
+ /* Create blueprints */
+ hierarchy_blueprints_create(dev);
+
+ /* Freeze hierarchy */
+ p->soft.tm.hierarchy_frozen = 1;
+
+ return 0;
+}
+
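[Editor's note] For reference, once all profiles and nodes have been supplied, the application triggers this handler with a single API call (sketch; error handling trimmed):

	struct rte_tm_error err;

	/* clear_on_fail = 1: on failure the staged hierarchy is discarded
	 * so it can be rebuilt from scratch.
	 */
	if (rte_tm_hierarchy_commit(port_id, 1, &err) != 0)
		printf("hierarchy commit failed: %s\n",
			err.message ? err.message : "unknown");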
+#ifdef RTE_SCHED_SUBPORT_TC_OV
+
+static int
+update_pipe_weight(struct rte_eth_dev *dev, struct tm_node *np, uint32_t weight)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ uint32_t pipe_id = tm_node_pipe_id(dev, np);
+
+ struct tm_node *ns = np->parent_node;
+ uint32_t subport_id = tm_node_subport_id(dev, ns);
+
+ struct rte_sched_pipe_params *profile0 = pipe_profile_get(dev, np);
+ struct rte_sched_pipe_params profile1;
+ uint32_t pipe_profile_id;
+
+ /* Derive new pipe profile. */
+ memcpy(&profile1, profile0, sizeof(profile1));
+ profile1.tc_ov_weight = (uint8_t)weight;
+
+ /* Since the implementation does not allow adding more pipe profiles after
+ * port configuration, the pipe configuration can be successfully
+ * updated only if the new profile is also part of the existing set of
+ * pipe profiles.
+ */
+ if (pipe_profile_exists(dev, &profile1, &pipe_profile_id) == 0)
+ return -1;
+
+ /* Update the pipe profile used by the current pipe. */
+ if (rte_sched_pipe_config(p->soft.tm.sched, subport_id, pipe_id,
+ (int32_t)pipe_profile_id))
+ return -1;
+
+ /* Commit changes. */
+ pipe_profile_mark(dev, subport_id, pipe_id, pipe_profile_id);
+ np->weight = weight;
+
+ return 0;
+}
+
+#endif
+
+static int
+update_queue_weight(struct rte_eth_dev *dev,
+ struct tm_node *nq, uint32_t weight)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ uint32_t queue_id = tm_node_queue_id(dev, nq);
+
+ struct tm_node *nt = nq->parent_node;
+ uint32_t tc_id = tm_node_tc_id(dev, nt);
+
+ struct tm_node *np = nt->parent_node;
+ uint32_t pipe_id = tm_node_pipe_id(dev, np);
+
+ struct tm_node *ns = np->parent_node;
+ uint32_t subport_id = tm_node_subport_id(dev, ns);
+
+ uint32_t pipe_queue_id =
+ tc_id * RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS + queue_id;
+
+ struct rte_sched_pipe_params *profile0 = pipe_profile_get(dev, np);
+ struct rte_sched_pipe_params profile1;
+ uint32_t pipe_profile_id;
+
+ /* Derive new pipe profile. */
+ memcpy(&profile1, profile0, sizeof(profile1));
+ profile1.wrr_weights[pipe_queue_id] = (uint8_t)weight;
+
+ /* Since the implementation does not allow adding more pipe profiles after
+ * port configuration, the pipe configuration can be successfully
+ * updated only if the new profile is also part of the existing set
+ * of pipe profiles.
+ */
+ if (pipe_profile_exists(dev, &profile1, &pipe_profile_id) == 0)
+ return -1;
+
+ /* Update the pipe profile used by the current pipe. */
+ if (rte_sched_pipe_config(p->soft.tm.sched, subport_id, pipe_id,
+ (int32_t)pipe_profile_id))
+ return -1;
+
+ /* Commit changes. */
+ pipe_profile_mark(dev, subport_id, pipe_id, pipe_profile_id);
+ nq->weight = weight;
+
+ return 0;
+}
+
+/* Traffic manager node parent update */
+static int
+pmd_tm_node_parent_update(struct rte_eth_dev *dev,
+ uint32_t node_id,
+ uint32_t parent_node_id,
+ uint32_t priority,
+ uint32_t weight,
+ struct rte_tm_error *error)
+{
+ struct tm_node *n;
+
+ /* Port must be started and TM used. */
+ if ((dev->data->dev_started == 0) && (tm_used(dev) == 0))
+ return -rte_tm_error_set(error,
+ EBUSY,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EBUSY));
+
+ /* Node must be valid */
+ n = tm_node_search(dev, node_id);
+ if (n == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Parent node must be the same */
+ if (n->parent_node_id != parent_node_id)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARENT_NODE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Priority must be the same */
+ if (n->priority != priority)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PRIORITY,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* weight: must be 1 .. 255 */
+ if ((weight == 0) || (weight >= UINT8_MAX))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_WEIGHT,
+ NULL,
+ rte_strerror(EINVAL));
+
+ switch (n->level) {
+ case TM_NODE_LEVEL_PORT:
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_WEIGHT,
+ NULL,
+ rte_strerror(EINVAL));
+ /* fall-through */
+ case TM_NODE_LEVEL_SUBPORT:
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_WEIGHT,
+ NULL,
+ rte_strerror(EINVAL));
+ /* fall-through */
+ case TM_NODE_LEVEL_PIPE:
+#ifdef RTE_SCHED_SUBPORT_TC_OV
+ if (update_pipe_weight(dev, n, weight))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ return 0;
+#else
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_WEIGHT,
+ NULL,
+ rte_strerror(EINVAL));
+#endif
+ /* fall-through */
+ case TM_NODE_LEVEL_TC:
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_WEIGHT,
+ NULL,
+ rte_strerror(EINVAL));
+ /* fall-through */
+ case TM_NODE_LEVEL_QUEUE:
+ /* fall-through */
+ default:
+ if (update_queue_weight(dev, n, weight))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ return 0;
+ }
+}
+
+static int
+update_subport_rate(struct rte_eth_dev *dev,
+ struct tm_node *ns,
+ struct tm_shaper_profile *sp)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ uint32_t subport_id = tm_node_subport_id(dev, ns);
+
+ struct rte_sched_subport_params subport_params;
+
+ /* Derive new subport configuration. */
+ memcpy(&subport_params,
+ &p->soft.tm.params.subport_params[subport_id],
+ sizeof(subport_params));
+ subport_params.tb_rate = sp->params.peak.rate;
+ subport_params.tb_size = sp->params.peak.size;
+
+ /* Update the subport configuration. */
+ if (rte_sched_subport_config(p->soft.tm.sched, subport_id,
+ &subport_params))
+ return -1;
+
+ /* Commit changes. */
+ ns->shaper_profile->n_users--;
+
+ ns->shaper_profile = sp;
+ ns->params.shaper_profile_id = sp->shaper_profile_id;
+ sp->n_users++;
+
+ memcpy(&p->soft.tm.params.subport_params[subport_id],
+ &subport_params,
+ sizeof(subport_params));
+
+ return 0;
+}
+
+static int
+update_pipe_rate(struct rte_eth_dev *dev,
+ struct tm_node *np,
+ struct tm_shaper_profile *sp)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ uint32_t pipe_id = tm_node_pipe_id(dev, np);
+
+ struct tm_node *ns = np->parent_node;
+ uint32_t subport_id = tm_node_subport_id(dev, ns);
+
+ struct rte_sched_pipe_params *profile0 = pipe_profile_get(dev, np);
+ struct rte_sched_pipe_params profile1;
+ uint32_t pipe_profile_id;
+
+ /* Derive new pipe profile. */
+ memcpy(&profile1, profile0, sizeof(profile1));
+ profile1.tb_rate = sp->params.peak.rate;
+ profile1.tb_size = sp->params.peak.size;
+
+ /* Since the implementation does not allow adding more pipe profiles after
+ * port configuration, the pipe configuration can be successfully
+ * updated only if the new profile is also part of the existing set of
+ * pipe profiles.
+ */
+ if (pipe_profile_exists(dev, &profile1, &pipe_profile_id) == 0)
+ return -1;
+
+ /* Update the pipe profile used by the current pipe. */
+ if (rte_sched_pipe_config(p->soft.tm.sched, subport_id, pipe_id,
+ (int32_t)pipe_profile_id))
+ return -1;
+
+ /* Commit changes. */
+ pipe_profile_mark(dev, subport_id, pipe_id, pipe_profile_id);
+ np->shaper_profile->n_users--;
+ np->shaper_profile = sp;
+ np->params.shaper_profile_id = sp->shaper_profile_id;
+ sp->n_users++;
+
+ return 0;
+}
+
+static int
+update_tc_rate(struct rte_eth_dev *dev,
+ struct tm_node *nt,
+ struct tm_shaper_profile *sp)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ uint32_t tc_id = tm_node_tc_id(dev, nt);
+
+ struct tm_node *np = nt->parent_node;
+ uint32_t pipe_id = tm_node_pipe_id(dev, np);
+
+ struct tm_node *ns = np->parent_node;
+ uint32_t subport_id = tm_node_subport_id(dev, ns);
+
+ struct rte_sched_pipe_params *profile0 = pipe_profile_get(dev, np);
+ struct rte_sched_pipe_params profile1;
+ uint32_t pipe_profile_id;
+
+ /* Derive new pipe profile. */
+ memcpy(&profile1, profile0, sizeof(profile1));
+ profile1.tc_rate[tc_id] = sp->params.peak.rate;
+
+ /* Since the implementation does not allow adding more pipe profiles after
+ * port configuration, the pipe configuration can be successfully
+ * updated only if the new profile is also part of the existing set of
+ * pipe profiles.
+ */
+ if (pipe_profile_exists(dev, &profile1, &pipe_profile_id) == 0)
+ return -1;
+
+ /* Update the pipe profile used by the current pipe. */
+ if (rte_sched_pipe_config(p->soft.tm.sched, subport_id, pipe_id,
+ (int32_t)pipe_profile_id))
+ return -1;
+
+ /* Commit changes. */
+ pipe_profile_mark(dev, subport_id, pipe_id, pipe_profile_id);
+ nt->shaper_profile->n_users--;
+ nt->shaper_profile = sp;
+ nt->params.shaper_profile_id = sp->shaper_profile_id;
+ sp->n_users++;
+
+ return 0;
+}
+
+/* Traffic manager node shaper update */
+static int
+pmd_tm_node_shaper_update(struct rte_eth_dev *dev,
+ uint32_t node_id,
+ uint32_t shaper_profile_id,
+ struct rte_tm_error *error)
+{
+ struct tm_node *n;
+ struct tm_shaper_profile *sp;
+
+ /* Port must be started and TM used. */
+ if ((dev->data->dev_started == 0) && (tm_used(dev) == 0))
+ return -rte_tm_error_set(error,
+ EBUSY,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EBUSY));
+
+ /* Node must be valid */
+ n = tm_node_search(dev, node_id);
+ if (n == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Shaper profile must be valid. */
+ sp = tm_shaper_profile_search(dev, shaper_profile_id);
+ if (sp == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_SHAPER_PROFILE,
+ NULL,
+ rte_strerror(EINVAL));
+
+ switch (n->level) {
+ case TM_NODE_LEVEL_PORT:
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ /* fall-through */
+ case TM_NODE_LEVEL_SUBPORT:
+ if (update_subport_rate(dev, n, sp))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ return 0;
+ /* fall-through */
+ case TM_NODE_LEVEL_PIPE:
+ if (update_pipe_rate(dev, n, sp))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ return 0;
+ /* fall-through */
+ case TM_NODE_LEVEL_TC:
+ if (update_tc_rate(dev, n, sp))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ return 0;
+ /* fall-through */
+ case TM_NODE_LEVEL_QUEUE:
+ /* fall-through */
+ default:
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ }
+}
+
+static inline uint32_t
+tm_port_queue_id(struct rte_eth_dev *dev,
+ uint32_t port_subport_id,
+ uint32_t subport_pipe_id,
+ uint32_t pipe_tc_id,
+ uint32_t tc_queue_id)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_hierarchy *h = &p->soft.tm.h;
+ uint32_t n_pipes_per_subport = h->n_tm_nodes[TM_NODE_LEVEL_PIPE] /
+ h->n_tm_nodes[TM_NODE_LEVEL_SUBPORT];
+
+ uint32_t port_pipe_id =
+ port_subport_id * n_pipes_per_subport + subport_pipe_id;
+ uint32_t port_tc_id =
+ port_pipe_id * RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE + pipe_tc_id;
+ uint32_t port_queue_id =
+ port_tc_id * RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS + tc_queue_id;
+
+ return port_queue_id;
+}
+
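[Editor's note] As a worked example of this mapping (assuming the default librte_sched layout of 4 traffic classes x 4 queues per traffic class and, say, 4096 pipes per subport): subport 0, pipe 5, TC 2, queue 3 yields port queue ((0 * 4096 + 5) * 4 + 2) * 4 + 3 = 91.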
+static int
+read_port_stats(struct rte_eth_dev *dev,
+ struct tm_node *nr,
+ struct rte_tm_node_stats *stats,
+ uint64_t *stats_mask,
+ int clear)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_hierarchy *h = &p->soft.tm.h;
+ uint32_t n_subports_per_port = h->n_tm_nodes[TM_NODE_LEVEL_SUBPORT];
+ uint32_t subport_id;
+
+ for (subport_id = 0; subport_id < n_subports_per_port; subport_id++) {
+ struct rte_sched_subport_stats s;
+ uint32_t tc_ov, id;
+
+ /* Stats read */
+ int status = rte_sched_subport_read_stats(
+ p->soft.tm.sched,
+ subport_id,
+ &s,
+ &tc_ov);
+ if (status)
+ return status;
+
+ /* Stats accumulate */
+ for (id = 0; id < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; id++) {
+ nr->stats.n_pkts +=
+ s.n_pkts_tc[id] - s.n_pkts_tc_dropped[id];
+ nr->stats.n_bytes +=
+ s.n_bytes_tc[id] - s.n_bytes_tc_dropped[id];
+ nr->stats.leaf.n_pkts_dropped[RTE_TM_GREEN] +=
+ s.n_pkts_tc_dropped[id];
+ nr->stats.leaf.n_bytes_dropped[RTE_TM_GREEN] +=
+ s.n_bytes_tc_dropped[id];
+ }
+ }
+
+ /* Stats copy */
+ if (stats)
+ memcpy(stats, &nr->stats, sizeof(*stats));
+
+ if (stats_mask)
+ *stats_mask = STATS_MASK_DEFAULT;
+
+ /* Stats clear */
+ if (clear)
+ memset(&nr->stats, 0, sizeof(nr->stats));
+
+ return 0;
+}
+
+static int
+read_subport_stats(struct rte_eth_dev *dev,
+ struct tm_node *ns,
+ struct rte_tm_node_stats *stats,
+ uint64_t *stats_mask,
+ int clear)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ uint32_t subport_id = tm_node_subport_id(dev, ns);
+ struct rte_sched_subport_stats s;
+ uint32_t tc_ov, tc_id;
+
+ /* Stats read */
+ int status = rte_sched_subport_read_stats(
+ p->soft.tm.sched,
+ subport_id,
+ &s,
+ &tc_ov);
+ if (status)
+ return status;
+
+ /* Stats accumulate */
+ for (tc_id = 0; tc_id < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; tc_id++) {
+ ns->stats.n_pkts +=
+ s.n_pkts_tc[tc_id] - s.n_pkts_tc_dropped[tc_id];
+ ns->stats.n_bytes +=
+ s.n_bytes_tc[tc_id] - s.n_bytes_tc_dropped[tc_id];
+ ns->stats.leaf.n_pkts_dropped[RTE_TM_GREEN] +=
+ s.n_pkts_tc_dropped[tc_id];
+ ns->stats.leaf.n_bytes_dropped[RTE_TM_GREEN] +=
+ s.n_bytes_tc_dropped[tc_id];
+ }
+
+ /* Stats copy */
+ if (stats)
+ memcpy(stats, &ns->stats, sizeof(*stats));
+
+ if (stats_mask)
+ *stats_mask = STATS_MASK_DEFAULT;
+
+ /* Stats clear */
+ if (clear)
+ memset(&ns->stats, 0, sizeof(ns->stats));
+
+ return 0;
+}
+
+static int
+read_pipe_stats(struct rte_eth_dev *dev,
+ struct tm_node *np,
+ struct rte_tm_node_stats *stats,
+ uint64_t *stats_mask,
+ int clear)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ uint32_t pipe_id = tm_node_pipe_id(dev, np);
+
+ struct tm_node *ns = np->parent_node;
+ uint32_t subport_id = tm_node_subport_id(dev, ns);
+
+ uint32_t i;
+
+ /* Stats read */
+ for (i = 0; i < RTE_SCHED_QUEUES_PER_PIPE; i++) {
+ struct rte_sched_queue_stats s;
+ uint16_t qlen;
+
+ uint32_t qid = tm_port_queue_id(dev,
+ subport_id,
+ pipe_id,
+ i / RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS,
+ i % RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS);
+
+ int status = rte_sched_queue_read_stats(
+ p->soft.tm.sched,
+ qid,
+ &s,
+ &qlen);
+ if (status)
+ return status;
+
+ /* Stats accumulate */
+ np->stats.n_pkts += s.n_pkts - s.n_pkts_dropped;
+ np->stats.n_bytes += s.n_bytes - s.n_bytes_dropped;
+ np->stats.leaf.n_pkts_dropped[RTE_TM_GREEN] += s.n_pkts_dropped;
+ np->stats.leaf.n_bytes_dropped[RTE_TM_GREEN] +=
+ s.n_bytes_dropped;
+ np->stats.leaf.n_pkts_queued = qlen;
+ }
+
+ /* Stats copy */
+ if (stats)
+ memcpy(stats, &np->stats, sizeof(*stats));
+
+ if (stats_mask)
+ *stats_mask = STATS_MASK_DEFAULT;
+
+ /* Stats clear */
+ if (clear)
+ memset(&np->stats, 0, sizeof(np->stats));
+
+ return 0;
+}
+
+static int
+read_tc_stats(struct rte_eth_dev *dev,
+ struct tm_node *nt,
+ struct rte_tm_node_stats *stats,
+ uint64_t *stats_mask,
+ int clear)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ uint32_t tc_id = tm_node_tc_id(dev, nt);
+
+ struct tm_node *np = nt->parent_node;
+ uint32_t pipe_id = tm_node_pipe_id(dev, np);
+
+ struct tm_node *ns = np->parent_node;
+ uint32_t subport_id = tm_node_subport_id(dev, ns);
+
+ uint32_t i;
+
+ /* Stats read */
+ for (i = 0; i < RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS; i++) {
+ struct rte_sched_queue_stats s;
+ uint16_t qlen;
+
+ uint32_t qid = tm_port_queue_id(dev,
+ subport_id,
+ pipe_id,
+ tc_id,
+ i);
+
+ int status = rte_sched_queue_read_stats(
+ p->soft.tm.sched,
+ qid,
+ &s,
+ &qlen);
+ if (status)
+ return status;
+
+ /* Stats accumulate */
+ nt->stats.n_pkts += s.n_pkts - s.n_pkts_dropped;
+ nt->stats.n_bytes += s.n_bytes - s.n_bytes_dropped;
+ nt->stats.leaf.n_pkts_dropped[RTE_TM_GREEN] += s.n_pkts_dropped;
+ nt->stats.leaf.n_bytes_dropped[RTE_TM_GREEN] +=
+ s.n_bytes_dropped;
+ nt->stats.leaf.n_pkts_queued = qlen;
+ }
+
+ /* Stats copy */
+ if (stats)
+ memcpy(stats, &nt->stats, sizeof(*stats));
+
+ if (stats_mask)
+ *stats_mask = STATS_MASK_DEFAULT;
+
+ /* Stats clear */
+ if (clear)
+ memset(&nt->stats, 0, sizeof(nt->stats));
+
+ return 0;
+}
+
+static int
+read_queue_stats(struct rte_eth_dev *dev,
+ struct tm_node *nq,
+ struct rte_tm_node_stats *stats,
+ uint64_t *stats_mask,
+ int clear)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct rte_sched_queue_stats s;
+ uint16_t qlen;
+
+ uint32_t queue_id = tm_node_queue_id(dev, nq);
+
+ struct tm_node *nt = nq->parent_node;
+ uint32_t tc_id = tm_node_tc_id(dev, nt);
+
+ struct tm_node *np = nt->parent_node;
+ uint32_t pipe_id = tm_node_pipe_id(dev, np);
+
+ struct tm_node *ns = np->parent_node;
+ uint32_t subport_id = tm_node_subport_id(dev, ns);
+
+ /* Stats read */
+ uint32_t qid = tm_port_queue_id(dev,
+ subport_id,
+ pipe_id,
+ tc_id,
+ queue_id);
+
+ int status = rte_sched_queue_read_stats(
+ p->soft.tm.sched,
+ qid,
+ &s,
+ &qlen);
+ if (status)
+ return status;
+
+ /* Stats accumulate */
+ nq->stats.n_pkts += s.n_pkts - s.n_pkts_dropped;
+ nq->stats.n_bytes += s.n_bytes - s.n_bytes_dropped;
+ nq->stats.leaf.n_pkts_dropped[RTE_TM_GREEN] += s.n_pkts_dropped;
+ nq->stats.leaf.n_bytes_dropped[RTE_TM_GREEN] +=
+ s.n_bytes_dropped;
+ nq->stats.leaf.n_pkts_queued = qlen;
+
+ /* Stats copy */
+ if (stats)
+ memcpy(stats, &nq->stats, sizeof(*stats));
+
+ if (stats_mask)
+ *stats_mask = STATS_MASK_QUEUE;
+
+ /* Stats clear */
+ if (clear)
+ memset(&nq->stats, 0, sizeof(nq->stats));
+
+ return 0;
+}
+
+/* Traffic manager read stats counters for specific node */
+static int
+pmd_tm_node_stats_read(struct rte_eth_dev *dev,
+ uint32_t node_id,
+ struct rte_tm_node_stats *stats,
+ uint64_t *stats_mask,
+ int clear,
+ struct rte_tm_error *error)
+{
+ struct tm_node *n;
+
+ /* Port must be started and TM used. */
+ if ((dev->data->dev_started == 0) && (tm_used(dev) == 0))
+ return -rte_tm_error_set(error,
+ EBUSY,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EBUSY));
+
+ /* Node must be valid */
+ n = tm_node_search(dev, node_id);
+ if (n == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ switch (n->level) {
+ case TM_NODE_LEVEL_PORT:
+ if (read_port_stats(dev, n, stats, stats_mask, clear))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ return 0;
+
+ case TM_NODE_LEVEL_SUBPORT:
+ if (read_subport_stats(dev, n, stats, stats_mask, clear))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ return 0;
+
+ case TM_NODE_LEVEL_PIPE:
+ if (read_pipe_stats(dev, n, stats, stats_mask, clear))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ return 0;
+
+ case TM_NODE_LEVEL_TC:
+ if (read_tc_stats(dev, n, stats, stats_mask, clear))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ return 0;
+
+ case TM_NODE_LEVEL_QUEUE:
+ default:
+ if (read_queue_stats(dev, n, stats, stats_mask, clear))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ return 0;
+ }
+}
+
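[Editor's note] For reference, the application-side counterpart of this dispatcher is a single call per node (sketch; node_id and the clear flag are illustrative):

	struct rte_tm_node_stats stats;
	uint64_t stats_mask = 0;
	struct rte_tm_error err;

	/* Read and clear the counters of one node (any level). */
	if (rte_tm_node_stats_read(port_id, node_id, &stats, &stats_mask,
			1 /* clear */, &err) == 0)
		printf("pkts=%" PRIu64 " bytes=%" PRIu64 "\n",
			stats.n_pkts, stats.n_bytes);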
+const struct rte_tm_ops pmd_tm_ops = {
+ .node_type_get = pmd_tm_node_type_get,
+ .capabilities_get = pmd_tm_capabilities_get,
+ .level_capabilities_get = pmd_tm_level_capabilities_get,
+ .node_capabilities_get = pmd_tm_node_capabilities_get,
+
+ .wred_profile_add = pmd_tm_wred_profile_add,
+ .wred_profile_delete = pmd_tm_wred_profile_delete,
+ .shared_wred_context_add_update = NULL,
+ .shared_wred_context_delete = NULL,
+
+ .shaper_profile_add = pmd_tm_shaper_profile_add,
+ .shaper_profile_delete = pmd_tm_shaper_profile_delete,
+ .shared_shaper_add_update = pmd_tm_shared_shaper_add_update,
+ .shared_shaper_delete = pmd_tm_shared_shaper_delete,
+
+ .node_add = pmd_tm_node_add,
+ .node_delete = pmd_tm_node_delete,
+ .node_suspend = NULL,
+ .node_resume = NULL,
+ .hierarchy_commit = pmd_tm_hierarchy_commit,
+
+ .node_parent_update = pmd_tm_node_parent_update,
+ .node_shaper_update = pmd_tm_node_shaper_update,
+ .node_shared_shaper_update = NULL,
+ .node_stats_update = NULL,
+ .node_wfq_weight_mode_update = NULL,
+ .node_cman_update = NULL,
+ .node_wred_context_update = NULL,
+ .node_shared_wred_context_update = NULL,
+
+ .node_stats_read = pmd_tm_node_stats_read,
};
--
2.9.3
^ permalink raw reply [flat|nested] 79+ messages in thread
* [dpdk-dev] [PATCH v6 5/5] app/testpmd: add traffic management forwarding mode
2017-10-06 16:59 ` [dpdk-dev] [PATCH v6 0/5] net/softnic: sw fall-back pmd for traffic mgmt and others Jasvinder Singh
` (3 preceding siblings ...)
2017-10-06 17:00 ` [dpdk-dev] [PATCH v6 4/5] net/softnic: add TM hierarchy related ops Jasvinder Singh
@ 2017-10-06 17:00 ` Jasvinder Singh
2017-10-06 18:57 ` [dpdk-dev] [PATCH v6 0/5] net/softnic: sw fall-back pmd for traffic mgmt and others Ferruh Yigit
5 siblings, 0 replies; 79+ messages in thread
From: Jasvinder Singh @ 2017-10-06 17:00 UTC (permalink / raw)
To: dev; +Cc: cristian.dumitrescu, ferruh.yigit, thomas, wenzhuo.lu
This commit extends the testpmd application with a new forwarding engine
that demonstrates the use of the ethdev traffic management APIs and the
softnic PMD for QoS traffic management.
In this mode, a 5-level hierarchical tree of the QoS scheduler is built
with the help of ethdev TM APIs such as shaper profile add/delete,
shared shaper add/update, node add/delete, hierarchy commit, etc.
The hierarchical tree has the following nodes: root node (x1, level 0),
subport node (x1, level 1), pipe node (x4096, level 2),
tc node (x16384, level 3), queue node (x65536, level 4). With 4 traffic
classes per pipe and 4 queues per traffic class, the 4096 pipes give
16384 tc nodes and 65536 queue nodes.
At runtime, each received packet is first classified by mapping its
header fields to a 5-tuple (HQoS subport, pipe, traffic class, queue
within traffic class, and color), which is stored in the packet mbuf
sched field. After classification, each packet is sent to the softnic
port, which prioritizes the transmission of the received packets and
sends them on to the output interface.
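For reference, the encoding of that sched field is a plain bit-pack; a
minimal sketch follows (the helper name is illustrative and not part of this
patch; the field layout is taken from the RTE_SCHED_PORT_HIERARCHY macro
added in tm.c below):

    /* Illustrative helper, not part of this patch; the layout follows the
     * RTE_SCHED_PORT_HIERARCHY macro defined later in tm.c. */
    static inline void
    pkt_sched_set(struct rte_mbuf *m, uint32_t subport, uint32_t pipe,
            uint32_t tc, uint32_t queue, uint32_t color)
    {
            uint64_t sched = ((uint64_t)queue & 0x3) |
                    (((uint64_t)tc & 0x3) << 2) |
                    (((uint64_t)color & 0x3) << 4) |
                    (((uint64_t)subport & 0xFFFF) << 16) |
                    (((uint64_t)pipe & 0xFFFFFFFF) << 32);

            m->hash.sched.lo = sched & 0xFFFFFFFF; /* lower 32 bits */
            m->hash.sched.hi = sched >> 32;        /* upper 32 bits */
    }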
To enable traffic management mode, the following testpmd command is used:
$ ./testpmd -c c -n 4 --vdev
'net_softnic0,hard_name=0000:06:00.1,soft_tm=on' -- -i
--forward-mode=tm
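Once testpmd is up in this mode, the default hierarchy added by this patch
can be enabled on the softnic port from the testpmd CLI before starting
forwarding (the port id below is just an example; it must refer to the
softnic port):

    testpmd> set port tm hierarchy default 1
    testpmd> start

The hierarchy itself is specified and committed by the tm forwarding engine
when forwarding begins.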
Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
---
v5 change:
- add CLI to enable default tm hierarchy
v3 change:
- Implements feedback from Pablo[1]
- add flag to check required librte_sched lib and softnic pmd
- code cleanup
v2 change:
- change file name softnictm.c to tm.c
- change forward mode name to "tm"
- code clean up
[1] http://dpdk.org/ml/archives/dev/2017-September/075744.html
app/test-pmd/Makefile | 8 +
app/test-pmd/cmdline.c | 88 +++++
app/test-pmd/testpmd.c | 15 +
app/test-pmd/testpmd.h | 46 +++
app/test-pmd/tm.c | 865 +++++++++++++++++++++++++++++++++++++++++++++++++
5 files changed, 1022 insertions(+)
create mode 100644 app/test-pmd/tm.c
diff --git a/app/test-pmd/Makefile b/app/test-pmd/Makefile
index c36be19..2c50f68 100644
--- a/app/test-pmd/Makefile
+++ b/app/test-pmd/Makefile
@@ -59,6 +59,10 @@ SRCS-y += csumonly.c
SRCS-y += icmpecho.c
SRCS-$(CONFIG_RTE_LIBRTE_IEEE1588) += ieee1588fwd.c
+ifeq ($(CONFIG_RTE_LIBRTE_PMD_SOFTNIC)$(CONFIG_RTE_LIBRTE_SCHED),yy)
+SRCS-y += tm.c
+endif
+
ifeq ($(CONFIG_RTE_BUILD_SHARED_LIB),y)
ifeq ($(CONFIG_RTE_LIBRTE_PMD_BOND),y)
@@ -81,6 +85,10 @@ ifeq ($(CONFIG_RTE_LIBRTE_PMD_XENVIRT),y)
LDLIBS += -lrte_pmd_xenvirt
endif
+ifeq ($(CONFIG_RTE_LIBRTE_PMD_SOFTNIC),y)
+LDLIBS += -lrte_pmd_softnic
+endif
+
endif
CFLAGS_cmdline.o := -D_GNU_SOURCE
diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index cdde5a1..91c5d38 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -623,6 +623,11 @@ static void cmd_help_long_parsed(void *parsed_result,
"E-tag set filter del e-tag-id (value) port (port_id)\n"
" Delete an E-tag forwarding filter on a port\n\n"
+#if defined RTE_LIBRTE_PMD_SOFTNIC && defined RTE_LIBRTE_SCHED
+ "set port tm hierarchy default (port_id)\n"
+ " Set default traffic Management hierarchy on a port\n\n"
+
+#endif
"ddp add (port_id) (profile_path[,output_path])\n"
" Load a profile package on a port\n\n"
@@ -13212,6 +13217,86 @@ cmdline_parse_inst_t cmd_vf_tc_max_bw = {
},
};
+
+#if defined RTE_LIBRTE_PMD_SOFTNIC && defined RTE_LIBRTE_SCHED
+
+/* *** Set Port default Traffic Management Hierarchy *** */
+struct cmd_set_port_tm_hierarchy_default_result {
+ cmdline_fixed_string_t set;
+ cmdline_fixed_string_t port;
+ cmdline_fixed_string_t tm;
+ cmdline_fixed_string_t hierarchy;
+ cmdline_fixed_string_t def;
+ uint8_t port_id;
+};
+
+cmdline_parse_token_string_t cmd_set_port_tm_hierarchy_default_set =
+ TOKEN_STRING_INITIALIZER(
+ struct cmd_set_port_tm_hierarchy_default_result, set, "set");
+cmdline_parse_token_string_t cmd_set_port_tm_hierarchy_default_port =
+ TOKEN_STRING_INITIALIZER(
+ struct cmd_set_port_tm_hierarchy_default_result, port, "port");
+cmdline_parse_token_string_t cmd_set_port_tm_hierarchy_default_tm =
+ TOKEN_STRING_INITIALIZER(
+ struct cmd_set_port_tm_hierarchy_default_result, tm, "tm");
+cmdline_parse_token_string_t cmd_set_port_tm_hierarchy_default_hierarchy =
+ TOKEN_STRING_INITIALIZER(
+ struct cmd_set_port_tm_hierarchy_default_result,
+ hierarchy, "hierarchy");
+cmdline_parse_token_string_t cmd_set_port_tm_hierarchy_default_default =
+ TOKEN_STRING_INITIALIZER(
+ struct cmd_set_port_tm_hierarchy_default_result,
+ def, "default");
+cmdline_parse_token_num_t cmd_set_port_tm_hierarchy_default_port_id =
+ TOKEN_NUM_INITIALIZER(
+ struct cmd_set_port_tm_hierarchy_default_result,
+ port_id, UINT8);
+
+static void cmd_set_port_tm_hierarchy_default_parsed(void *parsed_result,
+ __attribute__((unused)) struct cmdline *cl,
+ __attribute__((unused)) void *data)
+{
+ struct cmd_set_port_tm_hierarchy_default_result *res = parsed_result;
+ struct rte_port *p;
+ uint8_t port_id = res->port_id;
+
+ if (port_id_is_invalid(port_id, ENABLED_WARN))
+ return;
+
+ p = &ports[port_id];
+
+ /* Port tm flag */
+ if (p->softport.tm_flag == 0) {
+ printf(" tm not enabled on port %u (error)\n", port_id);
+ return;
+ }
+
+ /* Forward mode: tm */
+ if (strcmp(cur_fwd_config.fwd_eng->fwd_mode_name, "tm")) {
+ printf(" tm mode not enabled(error)\n");
+ return;
+ }
+
+ /* Set the default tm hierarchy */
+ p->softport.tm.default_hierarchy_enable = 1;
+}
+
+cmdline_parse_inst_t cmd_set_port_tm_hierarchy_default = {
+ .f = cmd_set_port_tm_hierarchy_default_parsed,
+ .data = NULL,
+ .help_str = "set port tm hierarchy default <port_id>",
+ .tokens = {
+ (void *)&cmd_set_port_tm_hierarchy_default_set,
+ (void *)&cmd_set_port_tm_hierarchy_default_port,
+ (void *)&cmd_set_port_tm_hierarchy_default_tm,
+ (void *)&cmd_set_port_tm_hierarchy_default_hierarchy,
+ (void *)&cmd_set_port_tm_hierarchy_default_default,
+ (void *)&cmd_set_port_tm_hierarchy_default_port_id,
+ NULL,
+ },
+};
+#endif
+
/* Strict link priority scheduling mode setting */
static void
cmd_strict_link_prio_parsed(
@@ -14804,6 +14889,9 @@ cmdline_parse_ctx_t main_ctx[] = {
(cmdline_parse_inst_t *)&cmd_vf_tc_max_bw,
(cmdline_parse_inst_t *)&cmd_strict_link_prio,
(cmdline_parse_inst_t *)&cmd_tc_min_bw,
+#if defined RTE_LIBRTE_PMD_SOFTNIC && defined RTE_LIBRTE_SCHED
+ (cmdline_parse_inst_t *)&cmd_set_port_tm_hierarchy_default,
+#endif
(cmdline_parse_inst_t *)&cmd_ddp_add,
(cmdline_parse_inst_t *)&cmd_ddp_del,
(cmdline_parse_inst_t *)&cmd_ddp_get_list,
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index 5507c0f..b900675 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -167,6 +167,10 @@ struct fwd_engine * fwd_engines[] = {
&tx_only_engine,
&csum_fwd_engine,
&icmp_echo_engine,
+#if defined RTE_LIBRTE_PMD_SOFTNIC && defined RTE_LIBRTE_SCHED
+ &softnic_tm_engine,
+ &softnic_tm_bypass_engine,
+#endif
#ifdef RTE_LIBRTE_IEEE1588
&ieee1588_fwd_engine,
#endif
@@ -2085,6 +2089,17 @@ init_port_config(void)
(rte_eth_devices[pid].data->dev_flags &
RTE_ETH_DEV_INTR_RMV))
port->dev_conf.intr_conf.rmv = 1;
+
+#if defined RTE_LIBRTE_PMD_SOFTNIC && defined RTE_LIBRTE_SCHED
+ /* Detect softnic port */
+ if (!strcmp(port->dev_info.driver_name, "net_softnic")) {
+ port->softnic_enable = 1;
+ memset(&port->softport, 0, sizeof(struct softnic_port));
+
+ if (!strcmp(cur_fwd_eng->fwd_mode_name, "tm"))
+ port->softport.tm_flag = 1;
+ }
+#endif
}
}
diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
index 1d1ee75..b3a8665 100644
--- a/app/test-pmd/testpmd.h
+++ b/app/test-pmd/testpmd.h
@@ -84,6 +84,12 @@ typedef uint16_t streamid_t;
#define MAX_QUEUE_ID ((1 << (sizeof(queueid_t) * 8)) - 1)
+#if defined RTE_LIBRTE_PMD_SOFTNIC && defined RTE_LIBRTE_SCHED
+#define TM_MODE 1
+#else
+#define TM_MODE 0
+#endif
+
enum {
PORT_TOPOLOGY_PAIRED,
PORT_TOPOLOGY_CHAINED,
@@ -162,6 +168,38 @@ struct port_flow {
uint8_t data[]; /**< Storage for pattern/actions. */
};
+#ifdef TM_MODE
+/**
+ * Soft port tm related parameters
+ */
+struct softnic_port_tm {
+ uint32_t default_hierarchy_enable; /**< def hierarchy enable flag */
+ uint32_t hierarchy_config; /**< set to 1 if hierarchy configured */
+
+ uint32_t n_subports_per_port; /**< Num of subport nodes per port */
+ uint32_t n_pipes_per_subport; /**< Num of pipe nodes per subport */
+
+ uint64_t tm_pktfield0_slabpos; /**< Pkt field position for subport */
+ uint64_t tm_pktfield0_slabmask; /**< Pkt field mask for the subport */
+ uint64_t tm_pktfield0_slabshr;
+ uint64_t tm_pktfield1_slabpos; /**< Pkt field position for the pipe */
+ uint64_t tm_pktfield1_slabmask; /**< Pkt field mask for the pipe */
+ uint64_t tm_pktfield1_slabshr;
+ uint64_t tm_pktfield2_slabpos; /**< Pkt field position table index */
+ uint64_t tm_pktfield2_slabmask; /**< Pkt field mask for tc table idx */
+ uint64_t tm_pktfield2_slabshr;
+ uint64_t tm_tc_table[64]; /**< TC translation table */
+};
+
+/**
+ * The data structure associated with the softnic port
+ */
+struct softnic_port {
+ unsigned int tm_flag; /**< set to 1 if tm feature is enabled */
+ struct softnic_port_tm tm; /**< softnic port tm parameters */
+};
+#endif
+
/**
* The data structure associated with each port.
*/
@@ -195,6 +233,10 @@ struct rte_port {
uint32_t mc_addr_nb; /**< nb. of addr. in mc_addr_pool */
uint8_t slave_flag; /**< bonding slave port */
struct port_flow *flow_list; /**< Associated flows. */
+#ifdef TM_MODE
+ unsigned int softnic_enable; /**< softnic flag */
+ struct softnic_port softport; /**< softnic port params */
+#endif
};
/**
@@ -253,6 +295,10 @@ extern struct fwd_engine rx_only_engine;
extern struct fwd_engine tx_only_engine;
extern struct fwd_engine csum_fwd_engine;
extern struct fwd_engine icmp_echo_engine;
+#ifdef TM_MODE
+extern struct fwd_engine softnic_tm_engine;
+extern struct fwd_engine softnic_tm_bypass_engine;
+#endif
#ifdef RTE_LIBRTE_IEEE1588
extern struct fwd_engine ieee1588_fwd_engine;
#endif
diff --git a/app/test-pmd/tm.c b/app/test-pmd/tm.c
new file mode 100644
index 0000000..9021e26
--- /dev/null
+++ b/app/test-pmd/tm.c
@@ -0,0 +1,865 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright(c) 2017 Intel Corporation. All rights reserved.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+#include <stdio.h>
+#include <sys/stat.h>
+
+#include <rte_cycles.h>
+#include <rte_mbuf.h>
+#include <rte_ethdev.h>
+#include <rte_flow.h>
+#include <rte_meter.h>
+#include <rte_eth_softnic.h>
+#include <rte_tm.h>
+
+#include "testpmd.h"
+
+#define SUBPORT_NODES_PER_PORT 1
+#define PIPE_NODES_PER_SUBPORT 4096
+#define TC_NODES_PER_PIPE 4
+#define QUEUE_NODES_PER_TC 4
+
+#define NUM_PIPE_NODES \
+ (SUBPORT_NODES_PER_PORT * PIPE_NODES_PER_SUBPORT)
+
+#define NUM_TC_NODES \
+ (NUM_PIPE_NODES * TC_NODES_PER_PIPE)
+
+#define ROOT_NODE_ID 1000000
+#define SUBPORT_NODES_START_ID 900000
+#define PIPE_NODES_START_ID 800000
+#define TC_NODES_START_ID 700000
+
+#define STATS_MASK_DEFAULT \
+ (RTE_TM_STATS_N_PKTS | \
+ RTE_TM_STATS_N_BYTES | \
+ RTE_TM_STATS_N_PKTS_GREEN_DROPPED | \
+ RTE_TM_STATS_N_BYTES_GREEN_DROPPED)
+
+#define STATS_MASK_QUEUE \
+ (STATS_MASK_DEFAULT | \
+ RTE_TM_STATS_N_PKTS_QUEUED)
+
+#define BYTES_IN_MBPS (1000 * 1000 / 8)
+#define TOKEN_BUCKET_SIZE 1000000
+
+/* TM Hierarchy Levels */
+enum tm_hierarchy_level {
+ TM_NODE_LEVEL_PORT = 0,
+ TM_NODE_LEVEL_SUBPORT,
+ TM_NODE_LEVEL_PIPE,
+ TM_NODE_LEVEL_TC,
+ TM_NODE_LEVEL_QUEUE,
+ TM_NODE_LEVEL_MAX,
+};
+
+struct tm_hierarchy {
+ /* TM Nodes */
+ uint32_t root_node_id;
+ uint32_t subport_node_id[SUBPORT_NODES_PER_PORT];
+ uint32_t pipe_node_id[SUBPORT_NODES_PER_PORT][PIPE_NODES_PER_SUBPORT];
+ uint32_t tc_node_id[NUM_PIPE_NODES][TC_NODES_PER_PIPE];
+ uint32_t queue_node_id[NUM_TC_NODES][QUEUE_NODES_PER_TC];
+
+ /* TM Hierarchy Nodes Shaper Rates */
+ uint32_t root_node_shaper_rate;
+ uint32_t subport_node_shaper_rate;
+ uint32_t pipe_node_shaper_rate;
+ uint32_t tc_node_shaper_rate;
+ uint32_t tc_node_shared_shaper_rate;
+
+ uint32_t n_shapers;
+};
+
+#define BITFIELD(byte_array, slab_pos, slab_mask, slab_shr) \
+({ \
+ uint64_t slab = *((uint64_t *) &byte_array[slab_pos]); \
+ uint64_t val = \
+ (rte_be_to_cpu_64(slab) & slab_mask) >> slab_shr; \
+ val; \
+})
+
+#define RTE_SCHED_PORT_HIERARCHY(subport, pipe, \
+ traffic_class, queue, color) \
+ ((((uint64_t) (queue)) & 0x3) | \
+ ((((uint64_t) (traffic_class)) & 0x3) << 2) | \
+ ((((uint64_t) (color)) & 0x3) << 4) | \
+ ((((uint64_t) (subport)) & 0xFFFF) << 16) | \
+ ((((uint64_t) (pipe)) & 0xFFFFFFFF) << 32))
+
+
+static void
+pkt_metadata_set(struct rte_port *p, struct rte_mbuf **pkts,
+ uint32_t n_pkts)
+{
+ struct softnic_port_tm *tm = &p->softport.tm;
+ uint32_t i;
+
+ for (i = 0; i < (n_pkts & (~0x3)); i += 4) {
+ struct rte_mbuf *pkt0 = pkts[i];
+ struct rte_mbuf *pkt1 = pkts[i + 1];
+ struct rte_mbuf *pkt2 = pkts[i + 2];
+ struct rte_mbuf *pkt3 = pkts[i + 3];
+
+ uint8_t *pkt0_data = rte_pktmbuf_mtod(pkt0, uint8_t *);
+ uint8_t *pkt1_data = rte_pktmbuf_mtod(pkt1, uint8_t *);
+ uint8_t *pkt2_data = rte_pktmbuf_mtod(pkt2, uint8_t *);
+ uint8_t *pkt3_data = rte_pktmbuf_mtod(pkt3, uint8_t *);
+
+ uint64_t pkt0_subport = BITFIELD(pkt0_data,
+ tm->tm_pktfield0_slabpos,
+ tm->tm_pktfield0_slabmask,
+ tm->tm_pktfield0_slabshr);
+ uint64_t pkt0_pipe = BITFIELD(pkt0_data,
+ tm->tm_pktfield1_slabpos,
+ tm->tm_pktfield1_slabmask,
+ tm->tm_pktfield1_slabshr);
+ uint64_t pkt0_dscp = BITFIELD(pkt0_data,
+ tm->tm_pktfield2_slabpos,
+ tm->tm_pktfield2_slabmask,
+ tm->tm_pktfield2_slabshr);
+ uint32_t pkt0_tc = tm->tm_tc_table[pkt0_dscp & 0x3F] >> 2;
+ uint32_t pkt0_tc_q = tm->tm_tc_table[pkt0_dscp & 0x3F] & 0x3;
+ uint64_t pkt1_subport = BITFIELD(pkt1_data,
+ tm->tm_pktfield0_slabpos,
+ tm->tm_pktfield0_slabmask,
+ tm->tm_pktfield0_slabshr);
+ uint64_t pkt1_pipe = BITFIELD(pkt1_data,
+ tm->tm_pktfield1_slabpos,
+ tm->tm_pktfield1_slabmask,
+ tm->tm_pktfield1_slabshr);
+ uint64_t pkt1_dscp = BITFIELD(pkt1_data,
+ tm->tm_pktfield2_slabpos,
+ tm->tm_pktfield2_slabmask,
+ tm->tm_pktfield2_slabshr);
+ uint32_t pkt1_tc = tm->tm_tc_table[pkt1_dscp & 0x3F] >> 2;
+ uint32_t pkt1_tc_q = tm->tm_tc_table[pkt1_dscp & 0x3F] & 0x3;
+
+ uint64_t pkt2_subport = BITFIELD(pkt2_data,
+ tm->tm_pktfield0_slabpos,
+ tm->tm_pktfield0_slabmask,
+ tm->tm_pktfield0_slabshr);
+ uint64_t pkt2_pipe = BITFIELD(pkt2_data,
+ tm->tm_pktfield1_slabpos,
+ tm->tm_pktfield1_slabmask,
+ tm->tm_pktfield1_slabshr);
+ uint64_t pkt2_dscp = BITFIELD(pkt2_data,
+ tm->tm_pktfield2_slabpos,
+ tm->tm_pktfield2_slabmask,
+ tm->tm_pktfield2_slabshr);
+ uint32_t pkt2_tc = tm->tm_tc_table[pkt2_dscp & 0x3F] >> 2;
+ uint32_t pkt2_tc_q = tm->tm_tc_table[pkt2_dscp & 0x3F] & 0x3;
+
+ uint64_t pkt3_subport = BITFIELD(pkt3_data,
+ tm->tm_pktfield0_slabpos,
+ tm->tm_pktfield0_slabmask,
+ tm->tm_pktfield0_slabshr);
+ uint64_t pkt3_pipe = BITFIELD(pkt3_data,
+ tm->tm_pktfield1_slabpos,
+ tm->tm_pktfield1_slabmask,
+ tm->tm_pktfield1_slabshr);
+ uint64_t pkt3_dscp = BITFIELD(pkt3_data,
+ tm->tm_pktfield2_slabpos,
+ tm->tm_pktfield2_slabmask,
+ tm->tm_pktfield2_slabshr);
+ uint32_t pkt3_tc = tm->tm_tc_table[pkt3_dscp & 0x3F] >> 2;
+ uint32_t pkt3_tc_q = tm->tm_tc_table[pkt3_dscp & 0x3F] & 0x3;
+
+ uint64_t pkt0_sched = RTE_SCHED_PORT_HIERARCHY(pkt0_subport,
+ pkt0_pipe,
+ pkt0_tc,
+ pkt0_tc_q,
+ 0);
+ uint64_t pkt1_sched = RTE_SCHED_PORT_HIERARCHY(pkt1_subport,
+ pkt1_pipe,
+ pkt1_tc,
+ pkt1_tc_q,
+ 0);
+ uint64_t pkt2_sched = RTE_SCHED_PORT_HIERARCHY(pkt2_subport,
+ pkt2_pipe,
+ pkt2_tc,
+ pkt2_tc_q,
+ 0);
+ uint64_t pkt3_sched = RTE_SCHED_PORT_HIERARCHY(pkt3_subport,
+ pkt3_pipe,
+ pkt3_tc,
+ pkt3_tc_q,
+ 0);
+
+ pkt0->hash.sched.lo = pkt0_sched & 0xFFFFFFFF;
+ pkt0->hash.sched.hi = pkt0_sched >> 32;
+ pkt1->hash.sched.lo = pkt1_sched & 0xFFFFFFFF;
+ pkt1->hash.sched.hi = pkt1_sched >> 32;
+ pkt2->hash.sched.lo = pkt2_sched & 0xFFFFFFFF;
+ pkt2->hash.sched.hi = pkt2_sched >> 32;
+ pkt3->hash.sched.lo = pkt3_sched & 0xFFFFFFFF;
+ pkt3->hash.sched.hi = pkt3_sched >> 32;
+ }
+
+ for (; i < n_pkts; i++) {
+ struct rte_mbuf *pkt = pkts[i];
+
+ uint8_t *pkt_data = rte_pktmbuf_mtod(pkt, uint8_t *);
+
+ uint64_t pkt_subport = BITFIELD(pkt_data,
+ tm->tm_pktfield0_slabpos,
+ tm->tm_pktfield0_slabmask,
+ tm->tm_pktfield0_slabshr);
+ uint64_t pkt_pipe = BITFIELD(pkt_data,
+ tm->tm_pktfield1_slabpos,
+ tm->tm_pktfield1_slabmask,
+ tm->tm_pktfield1_slabshr);
+ uint64_t pkt_dscp = BITFIELD(pkt_data,
+ tm->tm_pktfield2_slabpos,
+ tm->tm_pktfield2_slabmask,
+ tm->tm_pktfield2_slabshr);
+ uint32_t pkt_tc = tm->tm_tc_table[pkt_dscp & 0x3F] >> 2;
+ uint32_t pkt_tc_q = tm->tm_tc_table[pkt_dscp & 0x3F] & 0x3;
+
+ uint64_t pkt_sched = RTE_SCHED_PORT_HIERARCHY(pkt_subport,
+ pkt_pipe,
+ pkt_tc,
+ pkt_tc_q,
+ 0);
+
+ pkt->hash.sched.lo = pkt_sched & 0xFFFFFFFF;
+ pkt->hash.sched.hi = pkt_sched >> 32;
+ }
+}
+
+/*
+ * Soft port packet forward
+ */
+static void
+softport_packet_fwd(struct fwd_stream *fs)
+{
+ struct rte_mbuf *pkts_burst[MAX_PKT_BURST];
+ struct rte_port *rte_tx_port = &ports[fs->tx_port];
+ uint16_t nb_rx;
+ uint16_t nb_tx;
+ uint32_t retry;
+
+#ifdef RTE_TEST_PMD_RECORD_CORE_CYCLES
+ uint64_t start_tsc;
+ uint64_t end_tsc;
+ uint64_t core_cycles;
+#endif
+
+#ifdef RTE_TEST_PMD_RECORD_CORE_CYCLES
+ start_tsc = rte_rdtsc();
+#endif
+
+ /* Packets Receive */
+ nb_rx = rte_eth_rx_burst(fs->rx_port, fs->rx_queue,
+ pkts_burst, nb_pkt_per_burst);
+ fs->rx_packets += nb_rx;
+
+#ifdef RTE_TEST_PMD_RECORD_BURST_STATS
+ fs->rx_burst_stats.pkt_burst_spread[nb_rx]++;
+#endif
+
+ if (rte_tx_port->softnic_enable) {
+ /* Set packet metadata if tm flag enabled */
+ if (rte_tx_port->softport.tm_flag)
+ pkt_metadata_set(rte_tx_port, pkts_burst, nb_rx);
+
+ /* Softport run */
+ rte_pmd_softnic_run(fs->tx_port);
+ }
+ nb_tx = rte_eth_tx_burst(fs->tx_port, fs->tx_queue,
+ pkts_burst, nb_rx);
+
+ /* Retry if necessary */
+ if (unlikely(nb_tx < nb_rx) && fs->retry_enabled) {
+ retry = 0;
+ while (nb_tx < nb_rx && retry++ < burst_tx_retry_num) {
+ rte_delay_us(burst_tx_delay_time);
+ nb_tx += rte_eth_tx_burst(fs->tx_port, fs->tx_queue,
+ &pkts_burst[nb_tx], nb_rx - nb_tx);
+ }
+ }
+ fs->tx_packets += nb_tx;
+
+#ifdef RTE_TEST_PMD_RECORD_BURST_STATS
+ fs->tx_burst_stats.pkt_burst_spread[nb_tx]++;
+#endif
+
+ if (unlikely(nb_tx < nb_rx)) {
+ fs->fwd_dropped += (nb_rx - nb_tx);
+ do {
+ rte_pktmbuf_free(pkts_burst[nb_tx]);
+ } while (++nb_tx < nb_rx);
+ }
+#ifdef RTE_TEST_PMD_RECORD_CORE_CYCLES
+ end_tsc = rte_rdtsc();
+ core_cycles = (end_tsc - start_tsc);
+ fs->core_cycles = (uint64_t) (fs->core_cycles + core_cycles);
+#endif
+}
+
+static void
+set_tm_hiearchy_nodes_shaper_rate(portid_t port_id, struct tm_hierarchy *h)
+{
+ struct rte_eth_link link_params;
+ uint64_t tm_port_rate;
+
+ memset(&link_params, 0, sizeof(link_params));
+
+ rte_eth_link_get(port_id, &link_params);
+ tm_port_rate = link_params.link_speed * BYTES_IN_MBPS;
+
+ if (tm_port_rate > UINT32_MAX)
+ tm_port_rate = UINT32_MAX;
+
+ /* Set tm hierarchy shapers rate */
+ h->root_node_shaper_rate = tm_port_rate;
+ h->subport_node_shaper_rate =
+ tm_port_rate / SUBPORT_NODES_PER_PORT;
+ h->pipe_node_shaper_rate
+ = h->subport_node_shaper_rate / PIPE_NODES_PER_SUBPORT;
+ h->tc_node_shaper_rate = h->pipe_node_shaper_rate;
+ h->tc_node_shared_shaper_rate = h->subport_node_shaper_rate;
+}
+
+static int
+softport_tm_root_node_add(portid_t port_id, struct tm_hierarchy *h,
+ struct rte_tm_error *error)
+{
+ struct rte_tm_node_params rnp;
+ struct rte_tm_shaper_params rsp;
+ uint32_t priority, weight, level_id, shaper_profile_id;
+
+ memset(&rsp, 0, sizeof(struct rte_tm_shaper_params));
+ memset(&rnp, 0, sizeof(struct rte_tm_node_params));
+
+ /* Shaper profile Parameters */
+ rsp.peak.rate = h->root_node_shaper_rate;
+ rsp.peak.size = TOKEN_BUCKET_SIZE;
+ rsp.pkt_length_adjust = RTE_TM_ETH_FRAMING_OVERHEAD_FCS;
+ shaper_profile_id = 0;
+
+ if (rte_tm_shaper_profile_add(port_id, shaper_profile_id,
+ &rsp, error)) {
+ printf("%s ERROR(%d)-%s!(shaper_id %u)\n ",
+ __func__, error->type, error->message,
+ shaper_profile_id);
+ return -1;
+ }
+
+ /* Root Node Parameters */
+ h->root_node_id = ROOT_NODE_ID;
+ weight = 1;
+ priority = 0;
+ level_id = TM_NODE_LEVEL_PORT;
+ rnp.shaper_profile_id = shaper_profile_id;
+ rnp.nonleaf.n_sp_priorities = 1;
+ rnp.stats_mask = STATS_MASK_DEFAULT;
+
+ /* Add Node to TM Hierarchy */
+ if (rte_tm_node_add(port_id, h->root_node_id, RTE_TM_NODE_ID_NULL,
+ priority, weight, level_id, &rnp, error)) {
+ printf("%s ERROR(%d)-%s!(node_id %u, parent_id %u, level %u)\n",
+ __func__, error->type, error->message,
+ h->root_node_id, RTE_TM_NODE_ID_NULL,
+ level_id);
+ return -1;
+ }
+ /* Update */
+ h->n_shapers++;
+
+ printf(" Root node added (Start id %u, Count %u, level %u)\n",
+ h->root_node_id, 1, level_id);
+
+ return 0;
+}
+
+static int
+softport_tm_subport_node_add(portid_t port_id, struct tm_hierarchy *h,
+ struct rte_tm_error *error)
+{
+ uint32_t subport_parent_node_id, subport_node_id;
+ struct rte_tm_node_params snp;
+ struct rte_tm_shaper_params ssp;
+ uint32_t priority, weight, level_id, shaper_profile_id;
+ uint32_t i;
+
+ memset(&ssp, 0, sizeof(struct rte_tm_shaper_params));
+ memset(&snp, 0, sizeof(struct rte_tm_node_params));
+
+ shaper_profile_id = h->n_shapers;
+
+ /* Add Shaper Profile to TM Hierarchy */
+ for (i = 0; i < SUBPORT_NODES_PER_PORT; i++) {
+ ssp.peak.rate = h->subport_node_shaper_rate;
+ ssp.peak.size = TOKEN_BUCKET_SIZE;
+ ssp.pkt_length_adjust = RTE_TM_ETH_FRAMING_OVERHEAD_FCS;
+
+ if (rte_tm_shaper_profile_add(port_id, shaper_profile_id,
+ &ssp, error)) {
+ printf("%s ERROR(%d)-%s!(shaper_id %u)\n ",
+ __func__, error->type, error->message,
+ shaper_profile_id);
+ return -1;
+ }
+
+ /* Node Parameters */
+ h->subport_node_id[i] = SUBPORT_NODES_START_ID + i;
+ subport_parent_node_id = h->root_node_id;
+ weight = 1;
+ priority = 0;
+ level_id = TM_NODE_LEVEL_SUBPORT;
+ snp.shaper_profile_id = shaper_profile_id;
+ snp.nonleaf.n_sp_priorities = 1;
+ snp.stats_mask = STATS_MASK_DEFAULT;
+
+ /* Add Node to TM Hierarchy */
+ if (rte_tm_node_add(port_id,
+ h->subport_node_id[i],
+ subport_parent_node_id,
+ priority, weight,
+ level_id,
+ &snp,
+ error)) {
+ printf("%s ERROR(%d)-%s!(node %u,parent %u,level %u)\n",
+ __func__,
+ error->type,
+ error->message,
+ h->subport_node_id[i],
+ subport_parent_node_id,
+ level_id);
+ return -1;
+ }
+ shaper_profile_id++;
+ subport_node_id++;
+ }
+ /* Update */
+ h->n_shapers = shaper_profile_id;
+
+ printf(" Subport nodes added (Start id %u, Count %u, level %u)\n",
+ h->subport_node_id[0], SUBPORT_NODES_PER_PORT, level_id);
+
+ return 0;
+}
+
+static int
+softport_tm_pipe_node_add(portid_t port_id, struct tm_hierarchy *h,
+ struct rte_tm_error *error)
+{
+ uint32_t pipe_parent_node_id;
+ struct rte_tm_node_params pnp;
+ struct rte_tm_shaper_params psp;
+ uint32_t priority, weight, level_id, shaper_profile_id;
+ uint32_t i, j;
+
+ memset(&psp, 0, sizeof(struct rte_tm_shaper_params));
+ memset(&pnp, 0, sizeof(struct rte_tm_node_params));
+
+ shaper_profile_id = h->n_shapers;
+
+ /* Shaper Profile Parameters */
+ psp.peak.rate = h->pipe_node_shaper_rate;
+ psp.peak.size = TOKEN_BUCKET_SIZE;
+ psp.pkt_length_adjust = RTE_TM_ETH_FRAMING_OVERHEAD_FCS;
+
+ /* Pipe Node Parameters */
+ weight = 1;
+ priority = 0;
+ level_id = TM_NODE_LEVEL_PIPE;
+ pnp.nonleaf.n_sp_priorities = 4;
+ pnp.stats_mask = STATS_MASK_DEFAULT;
+
+ /* Add Shaper Profiles and Nodes to TM Hierarchy */
+ for (i = 0; i < SUBPORT_NODES_PER_PORT; i++) {
+ for (j = 0; j < PIPE_NODES_PER_SUBPORT; j++) {
+ if (rte_tm_shaper_profile_add(port_id,
+ shaper_profile_id, &psp, error)) {
+ printf("%s ERROR(%d)-%s!(shaper_id %u)\n ",
+ __func__, error->type, error->message,
+ shaper_profile_id);
+ return -1;
+ }
+ pnp.shaper_profile_id = shaper_profile_id;
+ pipe_parent_node_id = h->subport_node_id[i];
+ h->pipe_node_id[i][j] = PIPE_NODES_START_ID +
+ (i * PIPE_NODES_PER_SUBPORT) + j;
+
+ if (rte_tm_node_add(port_id,
+ h->pipe_node_id[i][j],
+ pipe_parent_node_id,
+ priority, weight, level_id,
+ &pnp,
+ error)) {
+ printf("%s ERROR(%d)-%s!(node %u,parent %u )\n",
+ __func__,
+ error->type,
+ error->message,
+ h->pipe_node_id[i][j],
+ pipe_parent_node_id);
+
+ return -1;
+ }
+ shaper_profile_id++;
+ }
+ }
+ /* Update */
+ h->n_shapers = shaper_profile_id;
+
+ printf(" Pipe nodes added (Start id %u, Count %u, level %u)\n",
+ h->pipe_node_id[0][0], NUM_PIPE_NODES, level_id);
+
+ return 0;
+}
+
+static int
+softport_tm_tc_node_add(portid_t port_id, struct tm_hierarchy *h,
+ struct rte_tm_error *error)
+{
+ uint32_t tc_parent_node_id;
+ struct rte_tm_node_params tnp;
+ struct rte_tm_shaper_params tsp, tssp;
+ uint32_t shared_shaper_profile_id[TC_NODES_PER_PIPE];
+ uint32_t priority, weight, level_id, shaper_profile_id;
+ uint32_t pos, n_tc_nodes, i, j, k;
+
+ memset(&tsp, 0, sizeof(struct rte_tm_shaper_params));
+ memset(&tssp, 0, sizeof(struct rte_tm_shaper_params));
+ memset(&tnp, 0, sizeof(struct rte_tm_node_params));
+
+ shaper_profile_id = h->n_shapers;
+
+ /* Private Shaper Profile (TC) Parameters */
+ tsp.peak.rate = h->tc_node_shaper_rate;
+ tsp.peak.size = TOKEN_BUCKET_SIZE;
+ tsp.pkt_length_adjust = RTE_TM_ETH_FRAMING_OVERHEAD_FCS;
+
+ /* Shared Shaper Profile (TC) Parameters */
+ tssp.peak.rate = h->tc_node_shared_shaper_rate;
+ tssp.peak.size = TOKEN_BUCKET_SIZE;
+ tssp.pkt_length_adjust = RTE_TM_ETH_FRAMING_OVERHEAD_FCS;
+
+ /* TC Node Parameters */
+ weight = 1;
+ level_id = TM_NODE_LEVEL_TC;
+ tnp.n_shared_shapers = 1;
+ tnp.nonleaf.n_sp_priorities = 1;
+ tnp.stats_mask = STATS_MASK_DEFAULT;
+
+ /* Add Shared Shaper Profiles to TM Hierarchy */
+ for (i = 0; i < TC_NODES_PER_PIPE; i++) {
+ shared_shaper_profile_id[i] = shaper_profile_id;
+
+ if (rte_tm_shaper_profile_add(port_id,
+ shared_shaper_profile_id[i], &tssp, error)) {
+ printf("%s ERROR(%d)-%s!(Shared shaper profileid %u)\n",
+ __func__, error->type, error->message,
+ shared_shaper_profile_id[i]);
+
+ return -1;
+ }
+ if (rte_tm_shared_shaper_add_update(port_id, i,
+ shared_shaper_profile_id[i], error)) {
+ printf("%s ERROR(%d)-%s!(Shared shaper id %u)\n",
+ __func__, error->type, error->message, i);
+
+ return -1;
+ }
+ shaper_profile_id++;
+ }
+
+ /* Add Shaper Profiles and Nodes to TM Hierarchy */
+ n_tc_nodes = 0;
+ for (i = 0; i < SUBPORT_NODES_PER_PORT; i++) {
+ for (j = 0; j < PIPE_NODES_PER_SUBPORT; j++) {
+ for (k = 0; k < TC_NODES_PER_PIPE ; k++) {
+ priority = k;
+ tc_parent_node_id = h->pipe_node_id[i][j];
+ tnp.shared_shaper_id =
+ (uint32_t *)calloc(1, sizeof(uint32_t));
+ tnp.shared_shaper_id[0] = k;
+ pos = j + (i * PIPE_NODES_PER_SUBPORT);
+ h->tc_node_id[pos][k] =
+ TC_NODES_START_ID + n_tc_nodes;
+
+ if (rte_tm_shaper_profile_add(port_id,
+ shaper_profile_id, &tsp, error)) {
+ printf("%s ERROR(%d)-%s!(shaper %u)\n",
+ __func__, error->type,
+ error->message,
+ shaper_profile_id);
+
+ return -1;
+ }
+ tnp.shaper_profile_id = shaper_profile_id;
+ if (rte_tm_node_add(port_id,
+ h->tc_node_id[pos][k],
+ tc_parent_node_id,
+ priority, weight,
+ level_id,
+ &tnp, error)) {
+ printf("%s ERROR(%d)-%s!(node id %u)\n",
+ __func__,
+ error->type,
+ error->message,
+ h->tc_node_id[pos][k]);
+
+ return -1;
+ }
+ shaper_profile_id++;
+ n_tc_nodes++;
+ }
+ }
+ }
+ /* Update */
+ h->n_shapers = shaper_profile_id;
+
+ printf(" TC nodes added (Start id %u, Count %u, level %u)\n",
+ h->tc_node_id[0][0], n_tc_nodes, level_id);
+
+ return 0;
+}
+
+static int
+softport_tm_queue_node_add(portid_t port_id, struct tm_hierarchy *h,
+ struct rte_tm_error *error)
+{
+ uint32_t queue_parent_node_id;
+ struct rte_tm_node_params qnp;
+ uint32_t priority, weight, level_id, pos;
+ uint32_t n_queue_nodes, i, j, k;
+
+ memset(&qnp, 0, sizeof(struct rte_tm_node_params));
+
+ /* Queue Node Parameters */
+ priority = 0;
+ weight = 1;
+ level_id = TM_NODE_LEVEL_QUEUE;
+ qnp.shaper_profile_id = RTE_TM_SHAPER_PROFILE_ID_NONE;
+ qnp.leaf.cman = RTE_TM_CMAN_TAIL_DROP;
+ qnp.stats_mask = STATS_MASK_QUEUE;
+
+ /* Add Queue Nodes to TM Hierarchy */
+ n_queue_nodes = 0;
+ for (i = 0; i < NUM_PIPE_NODES; i++) {
+ for (j = 0; j < TC_NODES_PER_PIPE; j++) {
+ queue_parent_node_id = h->tc_node_id[i][j];
+ for (k = 0; k < QUEUE_NODES_PER_TC; k++) {
+ pos = j + (i * TC_NODES_PER_PIPE);
+ h->queue_node_id[pos][k] = n_queue_nodes;
+ if (rte_tm_node_add(port_id,
+ h->queue_node_id[pos][k],
+ queue_parent_node_id,
+ priority,
+ weight,
+ level_id,
+ &qnp, error)) {
+ printf("%s ERROR(%d)-%s!(node %u)\n",
+ __func__,
+ error->type,
+ error->message,
+ h->queue_node_id[pos][k]);
+
+ return -1;
+ }
+ n_queue_nodes++;
+ }
+ }
+ }
+ printf(" Queue nodes added (Start id %u, Count %u, level %u)\n",
+ h->queue_node_id[0][0], n_queue_nodes, level_id);
+
+ return 0;
+}
+
+/*
+ * TM Packet Field Setup
+ */
+static void
+softport_tm_pktfield_setup(portid_t port_id)
+{
+ struct rte_port *p = &ports[port_id];
+ uint64_t pktfield0_mask = 0;
+ uint64_t pktfield1_mask = 0x0000000FFF000000LLU;
+ uint64_t pktfield2_mask = 0x00000000000000FCLLU;
+
+ p->softport.tm = (struct softnic_port_tm) {
+ .n_subports_per_port = SUBPORT_NODES_PER_PORT,
+ .n_pipes_per_subport = PIPE_NODES_PER_SUBPORT,
+
+ /* Packet field to identify subport
+ *
+ * Default configuration assumes only one subport, thus
+ * the subport ID is hardcoded to 0
+ */
+ .tm_pktfield0_slabpos = 0,
+ .tm_pktfield0_slabmask = pktfield0_mask,
+ .tm_pktfield0_slabshr =
+ __builtin_ctzll(pktfield0_mask),
+
+ /* Packet field to identify pipe.
+ *
+ * Default value assumes Ethernet/IPv4/UDP packets,
+ * UDP payload bits 12 .. 23
+ */
+ .tm_pktfield1_slabpos = 40,
+ .tm_pktfield1_slabmask = pktfield1_mask,
+ .tm_pktfield1_slabshr =
+ __builtin_ctzll(pktfield1_mask),
+
+ /* Packet field used as index into TC translation table
+ * to identify the traffic class and queue.
+ *
+ * Default value assumes Ethernet/IPv4 packets, IPv4
+ * DSCP field
+ */
+ .tm_pktfield2_slabpos = 8,
+ .tm_pktfield2_slabmask = pktfield2_mask,
+ .tm_pktfield2_slabshr =
+ __builtin_ctzll(pktfield2_mask),
+
+ .tm_tc_table = {
+ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,
+ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,
+ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,
+ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,
+ }, /**< TC translation table */
+ };
+}
+
+static int
+softport_tm_hierarchy_specify(portid_t port_id, struct rte_tm_error *error)
+{
+
+ struct tm_hierarchy h;
+ int status;
+
+ memset(&h, 0, sizeof(struct tm_hierarchy));
+
+ /* TM hierarchy shapers rate */
+ set_tm_hiearchy_nodes_shaper_rate(port_id, &h);
+
+ /* Add root node (level 0) */
+ status = softport_tm_root_node_add(port_id, &h, error);
+ if (status)
+ return status;
+
+ /* Add subport node (level 1) */
+ status = softport_tm_subport_node_add(port_id, &h, error);
+ if (status)
+ return status;
+
+ /* Add pipe nodes (level 2) */
+ status = softport_tm_pipe_node_add(port_id, &h, error);
+ if (status)
+ return status;
+
+ /* Add traffic class nodes (level 3) */
+ status = softport_tm_tc_node_add(port_id, &h, error);
+ if (status)
+ return status;
+
+ /* Add queue nodes (level 4) */
+ status = softport_tm_queue_node_add(port_id, &h, error);
+ if (status)
+ return status;
+
+ /* TM packet fields setup */
+ softport_tm_pktfield_setup(port_id);
+
+ return 0;
+}
+
+/*
+ * Soft port Init
+ */
+static void
+softport_tm_begin(portid_t pi)
+{
+ struct rte_port *port = &ports[pi];
+
+ /* Soft port TM flag */
+ if (port->softport.tm_flag == 1) {
+ printf("\n\n TM feature available on port %u\n", pi);
+
+ /* Soft port TM hierarchy configuration */
+ if ((port->softport.tm.hierarchy_config == 0) &&
+ (port->softport.tm.default_hierarchy_enable == 1)) {
+ struct rte_tm_error error;
+ int status;
+
+ /* Stop port */
+ rte_eth_dev_stop(pi);
+
+ /* TM hierarchy specification */
+ status = softport_tm_hierarchy_specify(pi, &error);
+ if (status) {
+ printf(" TM Hierarchy built error(%d) - %s\n",
+ error.type, error.message);
+ return;
+ }
+ printf("\n TM Hierarchy Specified!\n\v");
+
+ /* TM hierarchy commit */
+ status = rte_tm_hierarchy_commit(pi, 0, &error);
+ if (status) {
+ printf(" Hierarchy commit error(%d) - %s\n",
+ error.type, error.message);
+ return;
+ }
+ printf(" Hierarchy Committed (port %u)!", pi);
+ port->softport.tm.hierarchy_config = 1;
+
+ /* Start port */
+ status = rte_eth_dev_start(pi);
+ if (status) {
+ printf("\n Port %u start error!\n", pi);
+ return;
+ }
+ printf("\n Port %u started!\n", pi);
+ return;
+ }
+ }
+ printf("\n TM feature not available on port %u", pi);
+}
+
+struct fwd_engine softnic_tm_engine = {
+ .fwd_mode_name = "tm",
+ .port_fwd_begin = softport_tm_begin,
+ .port_fwd_end = NULL,
+ .packet_fwd = softport_packet_fwd,
+};
+
+struct fwd_engine softnic_tm_bypass_engine = {
+ .fwd_mode_name = "tm-bypass",
+ .port_fwd_begin = NULL,
+ .port_fwd_end = NULL,
+ .packet_fwd = softport_packet_fwd,
+};
--
2.9.3
^ permalink raw reply [flat|nested] 79+ messages in thread
* Re: [dpdk-dev] [PATCH v6 0/5] net/softnic: sw fall-back pmd for traffic mgmt and others
2017-10-06 16:59 ` [dpdk-dev] [PATCH v6 0/5] net/softnic: sw fall-back pmd for traffic mgmt and others Jasvinder Singh
` (4 preceding siblings ...)
2017-10-06 17:00 ` [dpdk-dev] [PATCH v6 5/5] app/testpmd: add traffic management forwarding mode Jasvinder Singh
@ 2017-10-06 18:57 ` Ferruh Yigit
2017-10-09 11:32 ` Singh, Jasvinder
5 siblings, 1 reply; 79+ messages in thread
From: Ferruh Yigit @ 2017-10-06 18:57 UTC (permalink / raw)
To: Jasvinder Singh, dev; +Cc: cristian.dumitrescu, thomas, wenzhuo.lu
On 10/6/2017 5:59 PM, Jasvinder Singh wrote:
> The SoftNIC PMD is intended to provide SW fall-back options for specific
> ethdev APIs in a generic way to the NICs not supporting those features.
>
> Currently, the only implemented ethdev API is Traffic Management (TM),
> but other ethdev APIs such as rte_flow, traffic metering & policing, etc
> can be easily implemented.
<...>
> Cristian Dumitrescu (4):
> Jasvinder Singh (4):
> net/softnic: add softnic PMD
> net/softnic: add traffic management support
> net/softnic: add TM capabilities ops
> net/softnic: add TM hierarchy related ops
>
> Jasvinder Singh (1):
> app/testpmd: add traffic management forwarding mode
Hi Jasvinder,
There is a patchset that has been merged into the main and next-net repos
to convert port_id from 8 bits to 16 bits.
This PMD should also reflect this conversion; can you please rebase the
PMD on top of the latest next-net?
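For illustration, the conversion amounts to changes of this kind across the
series (a sketch only, not an actual hunk; exact locations differ per file,
e.g. the testpmd cmdline structure added in patch 5/5):

     struct cmd_set_port_tm_hierarchy_default_result {
             ...
    -        uint8_t port_id;
    +        uint16_t port_id;
     };

    -                port_id, UINT8);
    +                port_id, UINT16);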
Please keep existing Acks in next version.
Also there are lots of UNNECESSARY_PARENTHESES checkpatch warnings, can
you please fix them?
>
> MAINTAINERS | 5 +
Checkpatch also gives warning about MAINTAINERS file updates, not sure
why, can you please check this?
<...>
^ permalink raw reply [flat|nested] 79+ messages in thread
* Re: [dpdk-dev] [PATCH v6 0/5] net/softnic: sw fall-back pmd for traffic mgmt and others
2017-10-06 18:57 ` [dpdk-dev] [PATCH v6 0/5] net/softnic: sw fall-back pmd for traffic mgmt and others Ferruh Yigit
@ 2017-10-09 11:32 ` Singh, Jasvinder
0 siblings, 0 replies; 79+ messages in thread
From: Singh, Jasvinder @ 2017-10-09 11:32 UTC (permalink / raw)
To: Yigit, Ferruh, dev; +Cc: Dumitrescu, Cristian, thomas, Lu, Wenzhuo
> -----Original Message-----
> From: Yigit, Ferruh
> Sent: Friday, October 6, 2017 7:57 PM
> To: Singh, Jasvinder <jasvinder.singh@intel.com>; dev@dpdk.org
> Cc: Dumitrescu, Cristian <cristian.dumitrescu@intel.com>;
> thomas@monjalon.net; Lu, Wenzhuo <wenzhuo.lu@intel.com>
> Subject: Re: [PATCH v6 0/5] net/softnic: sw fall-back pmd for traffic mgmt
> and others
>
> On 10/6/2017 5:59 PM, Jasvinder Singh wrote:
> > The SoftNIC PMD is intended to provide SW fall-back options for
> > specific ethdev APIs in a generic way to the NICs not supporting those
> features.
> >
> > Currently, the only implemented ethdev API is Traffic Management (TM),
> > but other ethdev APIs such as rte_flow, traffic metering & policing,
> > etc can be easily implemented.
>
> <...>
>
> > Cristian Dumitrescu (4):
> > Jasvinder Singh (4):
> > net/softnic: add softnic PMD
> > net/softnic: add traffic management support
> > net/softnic: add TM capabilities ops
> > net/softnic: add TM hierarchy related ops
> >
> > Jasvinder Singh (1):
> > app/testpmd: add traffic management forwarding mode
>
> Hi Jasvinder,
>
> There is a patchset that has been merged into the main and next-net repos
> to convert port_id from 8 bits to 16 bits.
> This PMD should also reflect this conversion; can you please rebase the
> PMD on top of the latest next-net?
>
> Please keep existing Acks in next version.
Ok, will do in the next version.
> Also there are lots of UNNECESSARY_PARENTHESES checkpatch warnings, can
> you please fix them?
Ok, will fix them.
> >
> > MAINTAINERS | 5 +
>
> Checkpatch also gives warning about MAINTAINERS file updates, not sure
> why, can you please check this?
>
The latest checkpatch script issues a warning on the use of a space instead of a tab after the entry type. The existing pattern in the MAINTAINERS
file doesn't follow this: a space is used for each type everywhere instead of a tab, which is why I ignored this warning.
^ permalink raw reply [flat|nested] 79+ messages in thread
* [dpdk-dev] [PATCH v7 0/5] net/softnic: sw fall-back pmd for traffic mgmt and others
2017-10-06 16:59 ` [dpdk-dev] [PATCH v6 1/5] net/softnic: add softnic PMD Jasvinder Singh
@ 2017-10-09 12:58 ` Jasvinder Singh
2017-10-09 12:58 ` [dpdk-dev] [PATCH v7 1/5] net/softnic: add softnic PMD Jasvinder Singh
` (4 more replies)
0 siblings, 5 replies; 79+ messages in thread
From: Jasvinder Singh @ 2017-10-09 12:58 UTC (permalink / raw)
To: dev; +Cc: cristian.dumitrescu, ferruh.yigit, thomas, wenzhuo.lu
The SoftNIC PMD is intended to provide SW fall-back options for specific
ethdev APIs in a generic way to the NICs not supporting those features.
Currently, the only implemented ethdev API is Traffic Management (TM),
but other ethdev APIs such as rte_flow, traffic metering & policing, etc
can be easily implemented.
Overview:
* Generic: The SoftNIC PMD works with any "hard" PMD that implements the
ethdev API. It does not change the "hard" PMD in any way.
* Creation: For any given "hard" ethdev port, the user can decide to
create an associated "soft" ethdev port to drive the "hard" port. The
"soft" port is a virtual device that can be created at app start-up
through EAL vdev arg or later through the virtual device API.
* Configuration: The app explicitly decides which features are to be
enabled on the "soft" port and which features are still to be used from
the "hard" port. The app continues to explicitly configure both the
"hard" and the "soft" ports after the creation of the "soft" port.
* RX/TX: The app reads packets from/writes packets to the "soft" port
instead of the "hard" port. The RX and TX queues of the "soft" port are
thread safe, as any ethdev.
* Execution: The "soft" port is a feature-rich NIC implemented by the CPU,
so the run function of the "soft" port has to be executed by the CPU in
order to get packets moving between "hard" port and the app.
* Meets the NFV vision: The app should be (almost) agnostic about the NIC
implementation (different vendors/models, HW-SW mix); it should not
require changes to use different NICs and should use the same API
for all NICs. If a NIC does not implement a specific feature, the HW
should be augmented with SW to provide the functionality while still
preserving the same API.
Traffic Management SW fall-back overview:
* Implements the ethdev traffic management API (rte_tm.h).
* Based on the existing librte_sched DPDK library.
Example: Create "soft" port for "hard" port "0000:04:00.1", enable the TM
feature with default settings:
--vdev 'net_softnic0,hard_name=0000:04:00.1,soft_tm=on'
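The same "soft" port can also be created later from the application through
the virtual device API; a minimal sketch (it assumes the EAL vdev helper
rte_vdev_init() available in this release; the header name may vary between
releases, and error handling is reduced to a message):

    #include <stdio.h>
    #include <rte_vdev.h>

    /* Programmatic equivalent of the --vdev EAL argument above. */
    if (rte_vdev_init("net_softnic0",
            "hard_name=0000:04:00.1,soft_tm=on") < 0)
        printf("Failed to create the softnic vdev\n");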
Q1: Why generic name, if only TM is supported (for now)?
A1: The intention is to have SoftNIC PMD implement many other (all?)
ethdev APIs under a single "ideal" ethdev, hence the generic name.
The initial motivation is TM API, but the mechanism is generic and can
be used for many other ethdev APIs. Somebody looking to provide SW
fall-back for other ethdev API is likely to end up inventing the same,
hence it would be good to consolidate all under a single PMD and have
the user explicitly enable/disable the features it needs for each
"soft" device.
Q2: Are there any performance requirements for SoftNIC?
A2: Yes, performance should be great/decent for every feature, otherwise
the SW fall-back is unusable, thus useless.
Q3: Why not change the "hard" device (and keep a single device) instead of
creating a new "soft" device (and thus having two devices)?
A3: This is not possible with the current librte_ether ethdev
implementation. The ethdev->dev_ops are defined as a constant structure,
so they cannot be changed per device (nor per PMD). The new ops also
need memory space to store their context data structures, which
requires updating the ethdev->data->dev_private of the existing
device; at best, maybe a resize of ethdev->data->dev_private could be
done, assuming that librte_ether will introduce a way to find out its
size, but this cannot be done while device is running. Other side
effects might exist, as the changes are very intrusive, plus it likely
needs more changes in librte_ether.
Q4: Why not call the SW fall-back dev_ops directly in librte_ether for
devices which do not support the specific feature? If the device
supports the capability, let's call its dev_ops, otherwise call the
SW fall-back dev_ops.
A4: First, similar reasons to Q&A3. This fixes the need to change
ethdev->dev_ops of the device, but it does not do anything to fix the
other significant issue of where to store the context data structures
needed by the SW fall-back functions (which, in this approach, are
called implicitly by librte_ether).
Second, the SW fall-back options should not be restricted arbitrarily
by the librte_ether library, the decision should belong to the app.
For example, the TM SW fall-back should not be limited to only
librte_sched, which (like any SW fall-back) is limited to a specific
hierarchy and feature set, it cannot do any possible hierarchy. If
alternatives exist, the one to use should be picked by the app, not by
the ethdev layer.
Q5: Why is the app required to continue to configure both the "hard" and
the "soft" devices even after the "soft" device has been created? Why
not hiding the "hard" device under the "soft" device and have the
"soft" device configure the "hard" device under the hood?
A5: This was the approach tried in the V2 of this patch set (overlay
"soft" device taking over the configuration of the underlay "hard"
device) and eventually dropped due to increased complexity of having
to keep the configuration of two distinct devices in sync with
librte_ether implementation that is not friendly towards such
approach. Basically, each ethdev API call for the overlay device
needs to configure the overlay device, invoke the same configuration
with possibly modified parameters for the underlay device, then resume
the configuration of overlay device, turning this into a device
emulation project.
V2 minuses: increased complexity (deal with two devices at same time);
need to implement every ethdev API, even those not needed for the scope
of SW fall-back; intrusive; sometimes have to silently take decisions
that should be left to the app.
V3 pluses: lower complexity (only one device); only need to implement
those APIs that are in scope of the SW fall-back; non-intrusive (deal
with "hard" device through ethdev API); app decisions taken by the app
in an explicit way.
Q6: Why expose the SW fall-back in a PMD and not in a SW library?
A6: The SW fall-back for an ethdev API has to implement that specific
ethdev API, (hence expose an ethdev object through a PMD), as opposed
to providing a different API. This approach allows the app to use the
same API (NFV vision). For example, we already have a library for TM
SW fall-back (librte_sched) that can be called directly by the apps
that need to call it outside of ethdev context (use-cases exist), but
an app that works with TM-aware NICs through the ethdev TM API would
have to be changed significantly in order to work with different
TM-agnostic NICs through the librte_sched API.
Q7: Why have all the SW fall-backs in a single PMD? Why not develop
the SW fall-back for each different ethdev API in a separate PMD, then
create a chain of "soft" devices for each "hard" device? Potentially,
this results in smaller size PMDs that are easier to maintain.
A7: Arguments for single ethdev/PMD and against chain of ethdevs/PMDs:
1. All the existing PMDs for HW NICs implement a lot of features under
the same PMD, so there is no reason for single PMD approach to break
code modularity. See the V3 code, a lot of care has been taken for
code modularity.
2. We should avoid the proliferation of SW PMDs.
3. A single device should be handled by a single PMD.
4. People are used to feature-rich PMDs, not single-feature PMDs,
so why force a change of mindset?
5. [Configuration nightmare] A chain of "soft" devices attached to
single "hard" device requires the app to be aware that the N "soft"
devices in the chain plus the "hard" device refer to the same HW
device, and which device should be invoked to configure which
feature. Also the length of the chain and functionality of each
link is different for each HW device. This breaks the requirement
of preserving the same API while working with different NICs (NFV).
This most likely results in a configuration nightmare, nobody is
going to seriously use this.
6. [Feature inter-dependency] Sometimes different features need to be
configured and executed together (e.g. share the same set of
resources, are inter-dependent, etc), so it is better and more
performant to do them in the same ethdev/PMD.
7. [Code duplication] There is a lot of duplication in the
configuration code for the chain of ethdevs approach. The ethdev
dev_configure, rx_queue_setup, tx_queue_setup API functions have to
be implemented per device, and they become meaningless/inconsistent
with the chain approach.
8. [Data structure duplication] The per device data structures have to
be duplicated and read repeatedly for each "soft" ethdev. The
ethdev device, dev_private, data, per RX/TX queue data structures
have to be replicated per "soft" device. They have to be re-read for
each stage, so the same cache misses are now multiplied with the
number of stages in the chain.
9. [rte_ring proliferation] Thread safety requirements for ethdev
RX/TX queues require an rte_ring to be used for every RX/TX queue
of each "soft" ethdev. This rte_ring proliferation unnecessarily
increases the memory footprint and lowers performance, especially
when each "soft" ethdev ends up on a different CPU core (ping-pong
of cache lines).
10. [Meta-data proliferation] A chain of ethdevs is likely to result
in proliferation of meta-data that has to be passed between the
ethdevs (e.g. policing needs the output of flow classification),
which results in more cache line ping-pong between cores, hence
performance drops.
Cristian Dumitrescu (4):
Jasvinder Singh (4):
net/softnic: add softnic PMD
net/softnic: add traffic management support
net/softnic: add TM capabilities ops
net/softnic: add TM hierarchy related ops
Jasvinder Singh (1):
app/testpmd: add traffic management forwarding mode
MAINTAINERS | 5 +
app/test-pmd/Makefile | 12 +
app/test-pmd/cmdline.c | 88 +
app/test-pmd/testpmd.c | 15 +
app/test-pmd/testpmd.h | 46 +
app/test-pmd/tm.c | 865 +++++
config/common_base | 5 +
doc/api/doxy-api-index.md | 3 +-
doc/api/doxy-api.conf | 1 +
doc/guides/rel_notes/release_17_11.rst | 8 +
drivers/net/Makefile | 5 +
drivers/net/softnic/Makefile | 57 +
drivers/net/softnic/rte_eth_softnic.c | 852 +++++
drivers/net/softnic/rte_eth_softnic.h | 83 +
drivers/net/softnic/rte_eth_softnic_internals.h | 291 ++
drivers/net/softnic/rte_eth_softnic_tm.c | 3452 ++++++++++++++++++++
.../net/softnic/rte_pmd_eth_softnic_version.map | 7 +
mk/rte.app.mk | 5 +-
18 files changed, 5798 insertions(+), 2 deletions(-)
create mode 100644 app/test-pmd/tm.c
create mode 100644 drivers/net/softnic/Makefile
create mode 100644 drivers/net/softnic/rte_eth_softnic.c
create mode 100644 drivers/net/softnic/rte_eth_softnic.h
create mode 100644 drivers/net/softnic/rte_eth_softnic_internals.h
create mode 100644 drivers/net/softnic/rte_eth_softnic_tm.c
create mode 100644 drivers/net/softnic/rte_pmd_eth_softnic_version.map
Series Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Series Acked-by: Thomas Monjalon <thomas@monjalon.net>
--
2.9.3
^ permalink raw reply [flat|nested] 79+ messages in thread
* [dpdk-dev] [PATCH v7 1/5] net/softnic: add softnic PMD
2017-10-09 12:58 ` [dpdk-dev] [PATCH v7 0/5] net/softnic: sw fall-back pmd for traffic mgmt and others Jasvinder Singh
@ 2017-10-09 12:58 ` Jasvinder Singh
2017-10-09 20:18 ` Ferruh Yigit
2017-10-10 10:18 ` [dpdk-dev] [PATCH v8 0/5] net/softnic: sw fall-back pmd for traffic mgmt and others Jasvinder Singh
2017-10-09 12:58 ` [dpdk-dev] [PATCH v7 2/5] net/softnic: add traffic management support Jasvinder Singh
` (3 subsequent siblings)
4 siblings, 2 replies; 79+ messages in thread
From: Jasvinder Singh @ 2017-10-09 12:58 UTC (permalink / raw)
To: dev; +Cc: cristian.dumitrescu, ferruh.yigit, thomas, wenzhuo.lu
Add SoftNIC PMD to provide SW fall-back for ethdev APIs.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
---
v7 changes:
- rebase on dpdk-next-net
- change port_id type to uint16_t
v5 changes:
- change function name rte_pmd_softnic_run_default() to run_default()
v4 changes:
- Implemented feedback from Ferruh [1]
- rename map file to rte_pmd_eth_softnic_version.map
- add release notes library version info
- doxygen: fix hooks in doc/api/doxy-api-index.md
- add doxygen comment for rte_pmd_softnic_run()
- free device name memory
- remove soft_dev param in pmd_ethdev_register()
- fix checkpatch warnings
v3 changes:
- rebase to dpdk17.08 release
v2 changes:
- fix build errors
- rebased to TM APIs v6 plus dpdk master
[1] Feedback from Ferruh on v3: http://dpdk.org/ml/archives/dev/2017-September/074576.html
MAINTAINERS | 5 +
config/common_base | 5 +
doc/api/doxy-api-index.md | 3 +-
doc/api/doxy-api.conf | 1 +
doc/guides/rel_notes/release_17_11.rst | 8 +
drivers/net/Makefile | 5 +
drivers/net/softnic/Makefile | 56 ++
drivers/net/softnic/rte_eth_softnic.c | 591 +++++++++++++++++++++
drivers/net/softnic/rte_eth_softnic.h | 67 +++
drivers/net/softnic/rte_eth_softnic_internals.h | 114 ++++
.../net/softnic/rte_pmd_eth_softnic_version.map | 7 +
mk/rte.app.mk | 5 +-
12 files changed, 865 insertions(+), 2 deletions(-)
create mode 100644 drivers/net/softnic/Makefile
create mode 100644 drivers/net/softnic/rte_eth_softnic.c
create mode 100644 drivers/net/softnic/rte_eth_softnic.h
create mode 100644 drivers/net/softnic/rte_eth_softnic_internals.h
create mode 100644 drivers/net/softnic/rte_pmd_eth_softnic_version.map
diff --git a/MAINTAINERS b/MAINTAINERS
index a6bb061..fff80c4 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -526,6 +526,11 @@ M: Gaetan Rivet <gaetan.rivet@6wind.com>
F: drivers/net/failsafe/
F: doc/guides/nics/fail_safe.rst
+Softnic PMD
+M: Jasvinder Singh <jasvinder.singh@intel.com>
+M: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
+F: drivers/net/softnic
+
Crypto Drivers
--------------
diff --git a/config/common_base b/config/common_base
index 9ebe589..75b63a6 100644
--- a/config/common_base
+++ b/config/common_base
@@ -271,6 +271,11 @@ CONFIG_RTE_LIBRTE_SFC_EFX_PMD=y
CONFIG_RTE_LIBRTE_SFC_EFX_DEBUG=n
#
+# Compile SOFTNIC PMD
+#
+CONFIG_RTE_LIBRTE_PMD_SOFTNIC=y
+
+#
# Compile software PMD backed by SZEDATA2 device
#
CONFIG_RTE_LIBRTE_PMD_SZEDATA2=n
diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
index e032ae1..950a553 100644
--- a/doc/api/doxy-api-index.md
+++ b/doc/api/doxy-api-index.md
@@ -56,7 +56,8 @@ The public API headers are grouped by topics:
[ixgbe] (@ref rte_pmd_ixgbe.h),
[i40e] (@ref rte_pmd_i40e.h),
[bnxt] (@ref rte_pmd_bnxt.h),
- [crypto_scheduler] (@ref rte_cryptodev_scheduler.h)
+ [crypto_scheduler] (@ref rte_cryptodev_scheduler.h),
+ [softnic] (@ref rte_eth_softnic.h)
- **memory**:
[memseg] (@ref rte_memory.h),
diff --git a/doc/api/doxy-api.conf b/doc/api/doxy-api.conf
index 63fe6cb..1310dc7 100644
--- a/doc/api/doxy-api.conf
+++ b/doc/api/doxy-api.conf
@@ -32,6 +32,7 @@ PROJECT_NAME = DPDK
INPUT = doc/api/doxy-api-index.md \
drivers/crypto/scheduler \
drivers/net/bonding \
+ drivers/net/softnic \
drivers/net/i40e \
drivers/net/ixgbe \
drivers/net/bnxt \
diff --git a/doc/guides/rel_notes/release_17_11.rst b/doc/guides/rel_notes/release_17_11.rst
index 36139e5..6057a1a 100644
--- a/doc/guides/rel_notes/release_17_11.rst
+++ b/doc/guides/rel_notes/release_17_11.rst
@@ -81,6 +81,7 @@ New Features
See the :ref:`Membership Library <Member_Library>` documentation in
the Programmers Guide document, for more information.
+<<<<<<< a3dabd30369bd73017fa367725fb1b455074a2ca
* **Added the Generic Segmentation Offload Library.**
Added the Generic Segmentation Offload (GSO) library to enable
@@ -96,6 +97,12 @@ New Features
The GSO library doesn't check if the input packets have correct
checksums, and doesn't update checksums for output packets.
Additionally, the GSO library doesn't process IP fragmented packets.
+=======
+* **Added SoftNIC PMD.**
+
+ Added new SoftNIC PMD. This virtual device offers applications a software
+ fallback support for traffic management.
+>>>>>>> net/softnic: add softnic PMD
Resolved Issues
@@ -267,6 +274,7 @@ The libraries prepended with a plus sign were incremented in this version.
librte_pmd_ixgbe.so.2
librte_pmd_ring.so.2
librte_pmd_vhost.so.2
+ + librte_pmd_softnic.so.1
librte_port.so.3
librte_power.so.1
librte_reorder.so.1
diff --git a/drivers/net/Makefile b/drivers/net/Makefile
index 1daa87f..9480c51 100644
--- a/drivers/net/Makefile
+++ b/drivers/net/Makefile
@@ -112,4 +112,9 @@ DIRS-$(CONFIG_RTE_LIBRTE_PMD_VHOST) += vhost
endif # $(CONFIG_RTE_LIBRTE_VHOST)
DEPDIRS-vhost = $(core-libs) librte_vhost
+ifeq ($(CONFIG_RTE_LIBRTE_SCHED),y)
+DIRS-$(CONFIG_RTE_LIBRTE_PMD_SOFTNIC) += softnic
+endif # $(CONFIG_RTE_LIBRTE_SCHED)
+DEPDIRS-softnic = $(core-libs) librte_sched
+
include $(RTE_SDK)/mk/rte.subdir.mk
diff --git a/drivers/net/softnic/Makefile b/drivers/net/softnic/Makefile
new file mode 100644
index 0000000..c2f42ef
--- /dev/null
+++ b/drivers/net/softnic/Makefile
@@ -0,0 +1,56 @@
+# BSD LICENSE
+#
+# Copyright(c) 2017 Intel Corporation. All rights reserved.
+# All rights reserved.
+#
+# Redistribution and use in source and binary forms, with or without
+# modification, are permitted provided that the following conditions
+# are met:
+#
+# * Redistributions of source code must retain the above copyright
+# notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+# notice, this list of conditions and the following disclaimer in
+# the documentation and/or other materials provided with the
+# distribution.
+# * Neither the name of Intel Corporation nor the names of its
+# contributors may be used to endorse or promote products derived
+# from this software without specific prior written permission.
+#
+# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+#
+# library name
+#
+LIB = librte_pmd_softnic.a
+
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS)
+
+EXPORT_MAP := rte_pmd_eth_softnic_version.map
+
+LIBABIVER := 1
+
+#
+# all source are stored in SRCS-y
+#
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_SOFTNIC) += rte_eth_softnic.c
+
+#
+# Export include files
+#
+SYMLINK-y-include += rte_eth_softnic.h
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/drivers/net/softnic/rte_eth_softnic.c b/drivers/net/softnic/rte_eth_softnic.c
new file mode 100644
index 0000000..2f92594
--- /dev/null
+++ b/drivers/net/softnic/rte_eth_softnic.c
@@ -0,0 +1,591 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright(c) 2017 Intel Corporation. All rights reserved.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdint.h>
+#include <stdlib.h>
+#include <string.h>
+
+#include <rte_ethdev.h>
+#include <rte_ethdev_vdev.h>
+#include <rte_malloc.h>
+#include <rte_vdev.h>
+#include <rte_kvargs.h>
+#include <rte_errno.h>
+#include <rte_ring.h>
+
+#include "rte_eth_softnic.h"
+#include "rte_eth_softnic_internals.h"
+
+#define DEV_HARD(p) \
+ (&rte_eth_devices[p->hard.port_id])
+
+#define PMD_PARAM_HARD_NAME "hard_name"
+#define PMD_PARAM_HARD_TX_QUEUE_ID "hard_tx_queue_id"
+
+static const char *pmd_valid_args[] = {
+ PMD_PARAM_HARD_NAME,
+ PMD_PARAM_HARD_TX_QUEUE_ID,
+ NULL
+};
+
+static const struct rte_eth_dev_info pmd_dev_info = {
+ .min_rx_bufsize = 0,
+ .max_rx_pktlen = UINT32_MAX,
+ .max_rx_queues = UINT16_MAX,
+ .max_tx_queues = UINT16_MAX,
+ .rx_desc_lim = {
+ .nb_max = UINT16_MAX,
+ .nb_min = 0,
+ .nb_align = 1,
+ },
+ .tx_desc_lim = {
+ .nb_max = UINT16_MAX,
+ .nb_min = 0,
+ .nb_align = 1,
+ },
+};
+
+static void
+pmd_dev_infos_get(struct rte_eth_dev *dev __rte_unused,
+ struct rte_eth_dev_info *dev_info)
+{
+ memcpy(dev_info, &pmd_dev_info, sizeof(*dev_info));
+}
+
+static int
+pmd_dev_configure(struct rte_eth_dev *dev)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct rte_eth_dev *hard_dev = DEV_HARD(p);
+
+ if (dev->data->nb_rx_queues > hard_dev->data->nb_rx_queues)
+ return -1;
+
+ if (p->params.hard.tx_queue_id >= hard_dev->data->nb_tx_queues)
+ return -1;
+
+ return 0;
+}
+
+static int
+pmd_rx_queue_setup(struct rte_eth_dev *dev,
+ uint16_t rx_queue_id,
+ uint16_t nb_rx_desc __rte_unused,
+ unsigned int socket_id,
+ const struct rte_eth_rxconf *rx_conf __rte_unused,
+ struct rte_mempool *mb_pool __rte_unused)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ if (p->params.soft.intrusive == 0) {
+ struct pmd_rx_queue *rxq;
+
+ rxq = rte_zmalloc_socket(p->params.soft.name,
+ sizeof(struct pmd_rx_queue), 0, socket_id);
+ if (rxq == NULL)
+ return -ENOMEM;
+
+ rxq->hard.port_id = p->hard.port_id;
+ rxq->hard.rx_queue_id = rx_queue_id;
+ dev->data->rx_queues[rx_queue_id] = rxq;
+ } else {
+ struct rte_eth_dev *hard_dev = DEV_HARD(p);
+ void *rxq = hard_dev->data->rx_queues[rx_queue_id];
+
+ if (rxq == NULL)
+ return -1;
+
+ dev->data->rx_queues[rx_queue_id] = rxq;
+ }
+ return 0;
+}
+
+static int
+pmd_tx_queue_setup(struct rte_eth_dev *dev,
+ uint16_t tx_queue_id,
+ uint16_t nb_tx_desc,
+ unsigned int socket_id,
+ const struct rte_eth_txconf *tx_conf __rte_unused)
+{
+ uint32_t size = RTE_ETH_NAME_MAX_LEN + strlen("_txq") + 4;
+ char name[size];
+ struct rte_ring *r;
+
+ snprintf(name, sizeof(name), "%s_txq%04x",
+ dev->data->name, tx_queue_id);
+ r = rte_ring_create(name, nb_tx_desc, socket_id,
+ RING_F_SP_ENQ | RING_F_SC_DEQ);
+ if (r == NULL)
+ return -1;
+
+ dev->data->tx_queues[tx_queue_id] = r;
+ return 0;
+}
+
+static int
+pmd_dev_start(struct rte_eth_dev *dev)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ dev->data->dev_link.link_status = ETH_LINK_UP;
+
+ if (p->params.soft.intrusive) {
+ struct rte_eth_dev *hard_dev = DEV_HARD(p);
+
+ /* The hard_dev->rx_pkt_burst should be stable by now */
+ dev->rx_pkt_burst = hard_dev->rx_pkt_burst;
+ }
+
+ return 0;
+}
+
+static void
+pmd_dev_stop(struct rte_eth_dev *dev)
+{
+ dev->data->dev_link.link_status = ETH_LINK_DOWN;
+}
+
+static void
+pmd_dev_close(struct rte_eth_dev *dev)
+{
+ uint32_t i;
+
+ /* TX queues */
+ for (i = 0; i < dev->data->nb_tx_queues; i++)
+ rte_ring_free((struct rte_ring *)dev->data->tx_queues[i]);
+}
+
+static int
+pmd_link_update(struct rte_eth_dev *dev __rte_unused,
+ int wait_to_complete __rte_unused)
+{
+ return 0;
+}
+
+static const struct eth_dev_ops pmd_ops = {
+ .dev_configure = pmd_dev_configure,
+ .dev_start = pmd_dev_start,
+ .dev_stop = pmd_dev_stop,
+ .dev_close = pmd_dev_close,
+ .link_update = pmd_link_update,
+ .dev_infos_get = pmd_dev_infos_get,
+ .rx_queue_setup = pmd_rx_queue_setup,
+ .tx_queue_setup = pmd_tx_queue_setup,
+ .tm_ops_get = NULL,
+};
+
+static uint16_t
+pmd_rx_pkt_burst(void *rxq,
+ struct rte_mbuf **rx_pkts,
+ uint16_t nb_pkts)
+{
+ struct pmd_rx_queue *rx_queue = rxq;
+
+ return rte_eth_rx_burst(rx_queue->hard.port_id,
+ rx_queue->hard.rx_queue_id,
+ rx_pkts,
+ nb_pkts);
+}
+
+static uint16_t
+pmd_tx_pkt_burst(void *txq,
+ struct rte_mbuf **tx_pkts,
+ uint16_t nb_pkts)
+{
+ return (uint16_t)rte_ring_enqueue_burst(txq,
+ (void **)tx_pkts,
+ nb_pkts,
+ NULL);
+}
+
+static __rte_always_inline int
+run_default(struct rte_eth_dev *dev)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ /* Persistent context: Read Only (update not required) */
+ struct rte_mbuf **pkts = p->soft.def.pkts;
+ uint16_t nb_tx_queues = dev->data->nb_tx_queues;
+
+ /* Persistent context: Read - Write (update required) */
+ uint32_t txq_pos = p->soft.def.txq_pos;
+ uint32_t pkts_len = p->soft.def.pkts_len;
+ uint32_t flush_count = p->soft.def.flush_count;
+
+ /* Not part of the persistent context */
+ uint32_t pos;
+ uint16_t i;
+
+ /* Soft device TXQ read, Hard device TXQ write */
+ for (i = 0; i < nb_tx_queues; i++) {
+ struct rte_ring *txq = dev->data->tx_queues[txq_pos];
+
+ /* Read soft device TXQ burst to packet enqueue buffer */
+ pkts_len += rte_ring_sc_dequeue_burst(txq,
+ (void **)&pkts[pkts_len],
+ DEFAULT_BURST_SIZE,
+ NULL);
+
+ /* Increment soft device TXQ */
+ txq_pos++;
+ if (txq_pos >= nb_tx_queues)
+ txq_pos = 0;
+
+ /* Hard device TXQ write when complete burst is available */
+ if (pkts_len >= DEFAULT_BURST_SIZE) {
+ for (pos = 0; pos < pkts_len; )
+ pos += rte_eth_tx_burst(p->hard.port_id,
+ p->params.hard.tx_queue_id,
+ &pkts[pos],
+ (uint16_t)(pkts_len - pos));
+
+ pkts_len = 0;
+ flush_count = 0;
+ break;
+ }
+ }
+
+ if (flush_count >= FLUSH_COUNT_THRESHOLD) {
+ for (pos = 0; pos < pkts_len; )
+ pos += rte_eth_tx_burst(p->hard.port_id,
+ p->params.hard.tx_queue_id,
+ &pkts[pos],
+ (uint16_t)(pkts_len - pos));
+
+ pkts_len = 0;
+ flush_count = 0;
+ }
+
+ p->soft.def.txq_pos = txq_pos;
+ p->soft.def.pkts_len = pkts_len;
+ p->soft.def.flush_count = flush_count + 1;
+
+ return 0;
+}
+
+int
+rte_pmd_softnic_run(uint16_t port_id)
+{
+ struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+
+#ifdef RTE_LIBRTE_ETHDEV_DEBUG
+ RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, 0);
+#endif
+
+ return run_default(dev);
+}
+
+static struct ether_addr eth_addr = { .addr_bytes = {0} };
+
+static uint32_t
+eth_dev_speed_max_mbps(uint32_t speed_capa)
+{
+ uint32_t rate_mbps[32] = {
+ ETH_SPEED_NUM_NONE,
+ ETH_SPEED_NUM_10M,
+ ETH_SPEED_NUM_10M,
+ ETH_SPEED_NUM_100M,
+ ETH_SPEED_NUM_100M,
+ ETH_SPEED_NUM_1G,
+ ETH_SPEED_NUM_2_5G,
+ ETH_SPEED_NUM_5G,
+ ETH_SPEED_NUM_10G,
+ ETH_SPEED_NUM_20G,
+ ETH_SPEED_NUM_25G,
+ ETH_SPEED_NUM_40G,
+ ETH_SPEED_NUM_50G,
+ ETH_SPEED_NUM_56G,
+ ETH_SPEED_NUM_100G,
+ };
+
+ uint32_t pos = (speed_capa) ? (31 - __builtin_clz(speed_capa)) : 0;
+ return rate_mbps[pos];
+}
+
+static int
+default_init(struct pmd_internals *p,
+ struct pmd_params *params,
+ int numa_node)
+{
+ p->soft.def.pkts = rte_zmalloc_socket(params->soft.name,
+ 2 * DEFAULT_BURST_SIZE * sizeof(struct rte_mbuf *),
+ 0,
+ numa_node);
+
+ if (p->soft.def.pkts == NULL)
+ return -ENOMEM;
+
+ return 0;
+}
+
+static void
+default_free(struct pmd_internals *p)
+{
+ rte_free(p->soft.def.pkts);
+}
+
+static void *
+pmd_init(struct pmd_params *params, int numa_node)
+{
+ struct pmd_internals *p;
+ int status;
+
+ p = rte_zmalloc_socket(params->soft.name,
+ sizeof(struct pmd_internals),
+ 0,
+ numa_node);
+ if (p == NULL)
+ return NULL;
+
+ memcpy(&p->params, params, sizeof(p->params));
+ rte_eth_dev_get_port_by_name(params->hard.name, &p->hard.port_id);
+
+ /* Default */
+ status = default_init(p, params, numa_node);
+ if (status) {
+ free(p->params.hard.name);
+ rte_free(p);
+ return NULL;
+ }
+
+ return p;
+}
+
+static void
+pmd_free(struct pmd_internals *p)
+{
+ default_free(p);
+
+ free(p->params.hard.name);
+ rte_free(p);
+}
+
+static int
+pmd_ethdev_register(struct rte_vdev_device *vdev,
+ struct pmd_params *params,
+ void *dev_private)
+{
+ struct rte_eth_dev_info hard_info;
+ struct rte_eth_dev *soft_dev;
+ uint32_t hard_speed;
+ int numa_node;
+ uint16_t hard_port_id;
+
+ rte_eth_dev_get_port_by_name(params->hard.name, &hard_port_id);
+ rte_eth_dev_info_get(hard_port_id, &hard_info);
+ hard_speed = eth_dev_speed_max_mbps(hard_info.speed_capa);
+ numa_node = rte_eth_dev_socket_id(hard_port_id);
+
+ /* Ethdev entry allocation */
+ soft_dev = rte_eth_dev_allocate(params->soft.name);
+ if (!soft_dev)
+ return -ENOMEM;
+
+ /* dev */
+ soft_dev->rx_pkt_burst = (params->soft.intrusive) ?
+ NULL : /* set up later */
+ pmd_rx_pkt_burst;
+ soft_dev->tx_pkt_burst = pmd_tx_pkt_burst;
+ soft_dev->tx_pkt_prepare = NULL;
+ soft_dev->dev_ops = &pmd_ops;
+ soft_dev->device = &vdev->device;
+
+ /* dev->data */
+ soft_dev->data->dev_private = dev_private;
+ soft_dev->data->dev_link.link_speed = hard_speed;
+ soft_dev->data->dev_link.link_duplex = ETH_LINK_FULL_DUPLEX;
+ soft_dev->data->dev_link.link_autoneg = ETH_LINK_SPEED_FIXED;
+ soft_dev->data->dev_link.link_status = ETH_LINK_DOWN;
+ soft_dev->data->mac_addrs = &eth_addr;
+ soft_dev->data->promiscuous = 1;
+ soft_dev->data->kdrv = RTE_KDRV_NONE;
+ soft_dev->data->numa_node = numa_node;
+ soft_dev->data->dev_flags = RTE_ETH_DEV_DETACHABLE;
+
+ return 0;
+}
+
+static int
+get_string(const char *key __rte_unused, const char *value, void *extra_args)
+{
+ if (!value || !extra_args)
+ return -EINVAL;
+
+ *(char **)extra_args = strdup(value);
+
+ if (!*(char **)extra_args)
+ return -ENOMEM;
+
+ return 0;
+}
+
+static int
+get_uint32(const char *key __rte_unused, const char *value, void *extra_args)
+{
+ if (!value || !extra_args)
+ return -EINVAL;
+
+ *(uint32_t *)extra_args = strtoull(value, NULL, 0);
+
+ return 0;
+}
+
+static int
+pmd_parse_args(struct pmd_params *p, const char *name, const char *params)
+{
+ struct rte_kvargs *kvlist;
+ int ret;
+
+ kvlist = rte_kvargs_parse(params, pmd_valid_args);
+ if (kvlist == NULL)
+ return -EINVAL;
+
+ /* Set default values */
+ memset(p, 0, sizeof(*p));
+ p->soft.name = name;
+ p->soft.intrusive = INTRUSIVE;
+ p->hard.tx_queue_id = SOFTNIC_HARD_TX_QUEUE_ID;
+
+ /* HARD: name (mandatory) */
+ if (rte_kvargs_count(kvlist, PMD_PARAM_HARD_NAME) == 1) {
+ ret = rte_kvargs_process(kvlist, PMD_PARAM_HARD_NAME,
+ &get_string, &p->hard.name);
+ if (ret < 0)
+ goto out_free;
+ } else {
+ ret = -EINVAL;
+ goto out_free;
+ }
+
+ /* HARD: tx_queue_id (optional) */
+ if (rte_kvargs_count(kvlist, PMD_PARAM_HARD_TX_QUEUE_ID) == 1) {
+ ret = rte_kvargs_process(kvlist, PMD_PARAM_HARD_TX_QUEUE_ID,
+ &get_uint32, &p->hard.tx_queue_id);
+ if (ret < 0)
+ goto out_free;
+ }
+
+out_free:
+ rte_kvargs_free(kvlist);
+ return ret;
+}
+
+static int
+pmd_probe(struct rte_vdev_device *vdev)
+{
+ struct pmd_params p;
+ const char *params;
+ int status;
+
+ struct rte_eth_dev_info hard_info;
+ uint16_t hard_port_id;
+ int numa_node;
+ void *dev_private;
+
+ RTE_LOG(INFO, PMD,
+ "Probing device \"%s\"\n",
+ rte_vdev_device_name(vdev));
+
+ /* Parse input arguments */
+ params = rte_vdev_device_args(vdev);
+ if (!params)
+ return -EINVAL;
+
+ status = pmd_parse_args(&p, rte_vdev_device_name(vdev), params);
+ if (status)
+ return status;
+
+ /* Check input arguments */
+ if (rte_eth_dev_get_port_by_name(p.hard.name, &hard_port_id))
+ return -EINVAL;
+
+ rte_eth_dev_info_get(hard_port_id, &hard_info);
+ numa_node = rte_eth_dev_socket_id(hard_port_id);
+
+ if (p.hard.tx_queue_id >= hard_info.max_tx_queues)
+ return -EINVAL;
+
+ /* Allocate and initialize soft ethdev private data */
+ dev_private = pmd_init(&p, numa_node);
+ if (dev_private == NULL)
+ return -ENOMEM;
+
+ /* Register soft ethdev */
+ RTE_LOG(INFO, PMD,
+ "Creating soft ethdev \"%s\" for hard ethdev \"%s\"\n",
+ p.soft.name, p.hard.name);
+
+ status = pmd_ethdev_register(vdev, &p, dev_private);
+ if (status) {
+ pmd_free(dev_private);
+ return status;
+ }
+
+ return 0;
+}
+
+static int
+pmd_remove(struct rte_vdev_device *vdev)
+{
+ struct rte_eth_dev *dev = NULL;
+ struct pmd_internals *p;
+
+ if (!vdev)
+ return -EINVAL;
+
+ RTE_LOG(INFO, PMD, "Removing device \"%s\"\n",
+ rte_vdev_device_name(vdev));
+
+ /* Find the ethdev entry */
+ dev = rte_eth_dev_allocated(rte_vdev_device_name(vdev));
+ if (dev == NULL)
+ return -ENODEV;
+ p = dev->data->dev_private;
+
+ /* Free device data structures */
+ pmd_free(p);
+ rte_free(dev->data);
+ rte_eth_dev_release_port(dev);
+
+ return 0;
+}
+
+static struct rte_vdev_driver pmd_softnic_drv = {
+ .probe = pmd_probe,
+ .remove = pmd_remove,
+};
+
+RTE_PMD_REGISTER_VDEV(net_softnic, pmd_softnic_drv);
+RTE_PMD_REGISTER_PARAM_STRING(net_softnic,
+ PMD_PARAM_HARD_NAME "=<string> "
+ PMD_PARAM_HARD_TX_QUEUE_ID "=<int>");
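For reference, a minimal sketch (not part of this patch) of how an application could instantiate the soft device at run time using the two devargs registered above. It assumes rte_vdev_init() from rte_vdev.h is available for run-time vdev creation; the hard device name is a placeholder.

#include <rte_vdev.h>

/* Create the softnic overlay on top of an already probed hard port.
 * "0000:04:00.1" is a placeholder for the underlay ethdev name.
 */
static int
softnic_port_create(void)
{
	return rte_vdev_init("net_softnic0",
		"hard_name=0000:04:00.1,hard_tx_queue_id=0");
}

The same devargs can equally be passed on the EAL command line through the --vdev option.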
diff --git a/drivers/net/softnic/rte_eth_softnic.h b/drivers/net/softnic/rte_eth_softnic.h
new file mode 100644
index 0000000..566490a
--- /dev/null
+++ b/drivers/net/softnic/rte_eth_softnic.h
@@ -0,0 +1,67 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright(c) 2017 Intel Corporation. All rights reserved.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef __INCLUDE_RTE_ETH_SOFTNIC_H__
+#define __INCLUDE_RTE_ETH_SOFTNIC_H__
+
+#include <stdint.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#ifndef SOFTNIC_HARD_TX_QUEUE_ID
+#define SOFTNIC_HARD_TX_QUEUE_ID 0
+#endif
+
+/**
+ * Run the traffic management function on the softnic device
+ *
+ * This function reads packets from the softnic input queues, inserts them
+ * into the QoS scheduler queues based on the mbuf sched field value, and
+ * transmits the scheduled packets out through the hard device interface.
+ *
+ * @param port_id
+ *   Port ID of the soft device.
+ * @return
+ *   Zero.
+ */
+
+int
+rte_pmd_softnic_run(uint16_t port_id);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* __INCLUDE_RTE_ETH_SOFTNIC_H__ */
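As a usage illustration (not part of this patch), a dedicated core could drive the soft device by calling rte_pmd_softnic_run() in a tight loop; the stop flag and thread entry point below are application-side assumptions.

#include <rte_ethdev.h>
#include "rte_eth_softnic.h"

static volatile int force_quit;	/* assumed application stop flag */

/* Poll loop for the core servicing the soft device: drains the soft
 * device TX queues and forwards the packets to the hard device TX queue.
 */
static int
softnic_run_loop(void *arg)
{
	uint16_t soft_port_id = *(uint16_t *)arg;

	while (!force_quit)
		rte_pmd_softnic_run(soft_port_id);

	return 0;
}

Such a loop would typically be launched with rte_eal_remote_launch() on a core that is not running the application's regular forwarding path.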
diff --git a/drivers/net/softnic/rte_eth_softnic_internals.h b/drivers/net/softnic/rte_eth_softnic_internals.h
new file mode 100644
index 0000000..08a633f
--- /dev/null
+++ b/drivers/net/softnic/rte_eth_softnic_internals.h
@@ -0,0 +1,114 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright(c) 2017 Intel Corporation. All rights reserved.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef __INCLUDE_RTE_ETH_SOFTNIC_INTERNALS_H__
+#define __INCLUDE_RTE_ETH_SOFTNIC_INTERNALS_H__
+
+#include <stdint.h>
+
+#include <rte_mbuf.h>
+#include <rte_ethdev.h>
+
+#include "rte_eth_softnic.h"
+
+#ifndef INTRUSIVE
+#define INTRUSIVE 0
+#endif
+
+struct pmd_params {
+ /** Parameters for the soft device (to be created) */
+ struct {
+ const char *name; /**< Name */
+ uint32_t flags; /**< Flags */
+
+ /** 0 = Access hard device through API only (potentially slower,
+ * but safer);
+ * 1 = Direct access to hard device private data structures is
+ * allowed (potentially faster).
+ */
+ int intrusive;
+ } soft;
+
+ /** Parameters for the hard device (existing) */
+ struct {
+ char *name; /**< Name */
+ uint16_t tx_queue_id; /**< TX queue ID */
+ } hard;
+};
+
+/**
+ * Default Internals
+ */
+
+#ifndef DEFAULT_BURST_SIZE
+#define DEFAULT_BURST_SIZE 32
+#endif
+
+#ifndef FLUSH_COUNT_THRESHOLD
+#define FLUSH_COUNT_THRESHOLD (1 << 17)
+#endif
+
+struct default_internals {
+ struct rte_mbuf **pkts;
+ uint32_t pkts_len;
+ uint32_t txq_pos;
+ uint32_t flush_count;
+};
+
+/**
+ * PMD Internals
+ */
+struct pmd_internals {
+ /** Params */
+ struct pmd_params params;
+
+ /** Soft device */
+ struct {
+ struct default_internals def; /**< Default */
+ } soft;
+
+ /** Hard device */
+ struct {
+ uint16_t port_id;
+ } hard;
+};
+
+struct pmd_rx_queue {
+ /** Hard device */
+ struct {
+ uint16_t port_id;
+ uint16_t rx_queue_id;
+ } hard;
+};
+
+#endif /* __INCLUDE_RTE_ETH_SOFTNIC_INTERNALS_H__ */
diff --git a/drivers/net/softnic/rte_pmd_eth_softnic_version.map b/drivers/net/softnic/rte_pmd_eth_softnic_version.map
new file mode 100644
index 0000000..fb2cb68
--- /dev/null
+++ b/drivers/net/softnic/rte_pmd_eth_softnic_version.map
@@ -0,0 +1,7 @@
+DPDK_17.11 {
+ global:
+
+ rte_pmd_softnic_run;
+
+ local: *;
+};
diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index 83c952e..54e6268 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -68,7 +68,6 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_IP_FRAG) += -lrte_ip_frag
_LDLIBS-$(CONFIG_RTE_LIBRTE_GRO) += -lrte_gro
_LDLIBS-$(CONFIG_RTE_LIBRTE_GSO) += -lrte_gso
_LDLIBS-$(CONFIG_RTE_LIBRTE_METER) += -lrte_meter
-_LDLIBS-$(CONFIG_RTE_LIBRTE_SCHED) += -lrte_sched
_LDLIBS-$(CONFIG_RTE_LIBRTE_LPM) += -lrte_lpm
# librte_acl needs --whole-archive because of weak functions
_LDLIBS-$(CONFIG_RTE_LIBRTE_ACL) += --whole-archive
@@ -101,6 +100,7 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_RING) += -lrte_ring
_LDLIBS-$(CONFIG_RTE_LIBRTE_EAL) += -lrte_eal
_LDLIBS-$(CONFIG_RTE_LIBRTE_CMDLINE) += -lrte_cmdline
_LDLIBS-$(CONFIG_RTE_LIBRTE_REORDER) += -lrte_reorder
+_LDLIBS-$(CONFIG_RTE_LIBRTE_SCHED) += -lrte_sched
ifeq ($(CONFIG_RTE_EXEC_ENV_LINUXAPP),y)
_LDLIBS-$(CONFIG_RTE_LIBRTE_KNI) += -lrte_kni
@@ -142,6 +142,9 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_NULL) += -lrte_pmd_null
_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_PCAP) += -lrte_pmd_pcap -lpcap
_LDLIBS-$(CONFIG_RTE_LIBRTE_QEDE_PMD) += -lrte_pmd_qede
_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_RING) += -lrte_pmd_ring
+ifeq ($(CONFIG_RTE_LIBRTE_SCHED),y)
+_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_SOFTNIC) += -lrte_pmd_softnic
+endif
_LDLIBS-$(CONFIG_RTE_LIBRTE_SFC_EFX_PMD) += -lrte_pmd_sfc_efx
_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_SZEDATA2) += -lrte_pmd_szedata2 -lsze2
_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_TAP) += -lrte_pmd_tap
--
2.9.3
^ permalink raw reply [flat|nested] 79+ messages in thread
* [dpdk-dev] [PATCH v7 2/5] net/softnic: add traffic management support
2017-10-09 12:58 ` [dpdk-dev] [PATCH v7 0/5] net/softnic: sw fall-back pmd for traffic mgmt and others Jasvinder Singh
2017-10-09 12:58 ` [dpdk-dev] [PATCH v7 1/5] net/softnic: add softnic PMD Jasvinder Singh
@ 2017-10-09 12:58 ` Jasvinder Singh
2017-10-09 12:58 ` [dpdk-dev] [PATCH v7 3/5] net/softnic: add TM capabilities ops Jasvinder Singh
` (2 subsequent siblings)
4 siblings, 0 replies; 79+ messages in thread
From: Jasvinder Singh @ 2017-10-09 12:58 UTC (permalink / raw)
To: dev; +Cc: cristian.dumitrescu, ferruh.yigit, thomas, wenzhuo.lu
Add ethdev Traffic Management API support to SoftNIC PMD.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Acked-by: Lu, Wenzhuo <wenzhuo.lu@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
---
v7 changes:
- fix checkpatch warning
v5 changes:
- change function name rte_pmd_softnic_run_tm() to run_tm()
v3 changes:
- add more configuration parameters (tm rate, tm queue sizes)
drivers/net/softnic/Makefile | 1 +
drivers/net/softnic/rte_eth_softnic.c | 255 +++++++++++++++++++++++-
drivers/net/softnic/rte_eth_softnic.h | 16 ++
drivers/net/softnic/rte_eth_softnic_internals.h | 104 ++++++++++
drivers/net/softnic/rte_eth_softnic_tm.c | 181 +++++++++++++++++
5 files changed, 555 insertions(+), 2 deletions(-)
create mode 100644 drivers/net/softnic/rte_eth_softnic_tm.c
diff --git a/drivers/net/softnic/Makefile b/drivers/net/softnic/Makefile
index c2f42ef..8b848a9 100644
--- a/drivers/net/softnic/Makefile
+++ b/drivers/net/softnic/Makefile
@@ -47,6 +47,7 @@ LIBABIVER := 1
# all source are stored in SRCS-y
#
SRCS-$(CONFIG_RTE_LIBRTE_PMD_SOFTNIC) += rte_eth_softnic.c
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_SOFTNIC) += rte_eth_softnic_tm.c
#
# Export include files
diff --git a/drivers/net/softnic/rte_eth_softnic.c b/drivers/net/softnic/rte_eth_softnic.c
index 2f92594..2f19159 100644
--- a/drivers/net/softnic/rte_eth_softnic.c
+++ b/drivers/net/softnic/rte_eth_softnic.c
@@ -42,6 +42,7 @@
#include <rte_kvargs.h>
#include <rte_errno.h>
#include <rte_ring.h>
+#include <rte_sched.h>
#include "rte_eth_softnic.h"
#include "rte_eth_softnic_internals.h"
@@ -49,10 +50,29 @@
#define DEV_HARD(p) \
(&rte_eth_devices[p->hard.port_id])
+#define PMD_PARAM_SOFT_TM "soft_tm"
+#define PMD_PARAM_SOFT_TM_RATE "soft_tm_rate"
+#define PMD_PARAM_SOFT_TM_NB_QUEUES "soft_tm_nb_queues"
+#define PMD_PARAM_SOFT_TM_QSIZE0 "soft_tm_qsize0"
+#define PMD_PARAM_SOFT_TM_QSIZE1 "soft_tm_qsize1"
+#define PMD_PARAM_SOFT_TM_QSIZE2 "soft_tm_qsize2"
+#define PMD_PARAM_SOFT_TM_QSIZE3 "soft_tm_qsize3"
+#define PMD_PARAM_SOFT_TM_ENQ_BSZ "soft_tm_enq_bsz"
+#define PMD_PARAM_SOFT_TM_DEQ_BSZ "soft_tm_deq_bsz"
+
#define PMD_PARAM_HARD_NAME "hard_name"
#define PMD_PARAM_HARD_TX_QUEUE_ID "hard_tx_queue_id"
static const char *pmd_valid_args[] = {
+ PMD_PARAM_SOFT_TM,
+ PMD_PARAM_SOFT_TM_RATE,
+ PMD_PARAM_SOFT_TM_NB_QUEUES,
+ PMD_PARAM_SOFT_TM_QSIZE0,
+ PMD_PARAM_SOFT_TM_QSIZE1,
+ PMD_PARAM_SOFT_TM_QSIZE2,
+ PMD_PARAM_SOFT_TM_QSIZE3,
+ PMD_PARAM_SOFT_TM_ENQ_BSZ,
+ PMD_PARAM_SOFT_TM_DEQ_BSZ,
PMD_PARAM_HARD_NAME,
PMD_PARAM_HARD_TX_QUEUE_ID,
NULL
@@ -157,6 +177,13 @@ pmd_dev_start(struct rte_eth_dev *dev)
{
struct pmd_internals *p = dev->data->dev_private;
+ if (tm_used(dev)) {
+ int status = tm_start(p);
+
+ if (status)
+ return status;
+ }
+
dev->data->dev_link.link_status = ETH_LINK_UP;
if (p->params.soft.intrusive) {
@@ -172,7 +199,12 @@ pmd_dev_start(struct rte_eth_dev *dev)
static void
pmd_dev_stop(struct rte_eth_dev *dev)
{
+ struct pmd_internals *p = dev->data->dev_private;
+
dev->data->dev_link.link_status = ETH_LINK_DOWN;
+
+ if (tm_used(dev))
+ tm_stop(p);
}
static void
@@ -293,6 +325,77 @@ run_default(struct rte_eth_dev *dev)
return 0;
}
+static __rte_always_inline int
+run_tm(struct rte_eth_dev *dev)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ /* Persistent context: Read Only (update not required) */
+ struct rte_sched_port *sched = p->soft.tm.sched;
+ struct rte_mbuf **pkts_enq = p->soft.tm.pkts_enq;
+ struct rte_mbuf **pkts_deq = p->soft.tm.pkts_deq;
+ uint32_t enq_bsz = p->params.soft.tm.enq_bsz;
+ uint32_t deq_bsz = p->params.soft.tm.deq_bsz;
+ uint16_t nb_tx_queues = dev->data->nb_tx_queues;
+
+ /* Persistent context: Read - Write (update required) */
+ uint32_t txq_pos = p->soft.tm.txq_pos;
+ uint32_t pkts_enq_len = p->soft.tm.pkts_enq_len;
+ uint32_t flush_count = p->soft.tm.flush_count;
+
+ /* Not part of the persistent context */
+ uint32_t pkts_deq_len, pos;
+ uint16_t i;
+
+ /* Soft device TXQ read, TM enqueue */
+ for (i = 0; i < nb_tx_queues; i++) {
+ struct rte_ring *txq = dev->data->tx_queues[txq_pos];
+
+ /* Read TXQ burst to packet enqueue buffer */
+ pkts_enq_len += rte_ring_sc_dequeue_burst(txq,
+ (void **)&pkts_enq[pkts_enq_len],
+ enq_bsz,
+ NULL);
+
+ /* Increment TXQ */
+ txq_pos++;
+ if (txq_pos >= nb_tx_queues)
+ txq_pos = 0;
+
+ /* TM enqueue when complete burst is available */
+ if (pkts_enq_len >= enq_bsz) {
+ rte_sched_port_enqueue(sched, pkts_enq, pkts_enq_len);
+
+ pkts_enq_len = 0;
+ flush_count = 0;
+ break;
+ }
+ }
+
+ if (flush_count >= FLUSH_COUNT_THRESHOLD) {
+ if (pkts_enq_len)
+ rte_sched_port_enqueue(sched, pkts_enq, pkts_enq_len);
+
+ pkts_enq_len = 0;
+ flush_count = 0;
+ }
+
+ p->soft.tm.txq_pos = txq_pos;
+ p->soft.tm.pkts_enq_len = pkts_enq_len;
+ p->soft.tm.flush_count = flush_count + 1;
+
+ /* TM dequeue, Hard device TXQ write */
+ pkts_deq_len = rte_sched_port_dequeue(sched, pkts_deq, deq_bsz);
+
+ for (pos = 0; pos < pkts_deq_len; )
+ pos += rte_eth_tx_burst(p->hard.port_id,
+ p->params.hard.tx_queue_id,
+ &pkts_deq[pos],
+ (uint16_t)(pkts_deq_len - pos));
+
+ return 0;
+}
+
int
rte_pmd_softnic_run(uint16_t port_id)
{
@@ -302,7 +405,7 @@ rte_pmd_softnic_run(uint16_t port_id)
RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, 0);
#endif
- return run_default(dev);
+ return (tm_used(dev)) ? run_tm(dev) : run_default(dev);
}
static struct ether_addr eth_addr = { .addr_bytes = {0} };
@@ -378,12 +481,26 @@ pmd_init(struct pmd_params *params, int numa_node)
return NULL;
}
+ /* Traffic Management (TM)*/
+ if (params->soft.flags & PMD_FEATURE_TM) {
+ status = tm_init(p, params, numa_node);
+ if (status) {
+ default_free(p);
+ free(p->params.hard.name);
+ rte_free(p);
+ return NULL;
+ }
+ }
+
return p;
}
static void
pmd_free(struct pmd_internals *p)
{
+ if (p->params.soft.flags & PMD_FEATURE_TM)
+ tm_free(p);
+
default_free(p);
free(p->params.hard.name);
@@ -464,7 +581,7 @@ static int
pmd_parse_args(struct pmd_params *p, const char *name, const char *params)
{
struct rte_kvargs *kvlist;
- int ret;
+ int i, ret;
kvlist = rte_kvargs_parse(params, pmd_valid_args);
if (kvlist == NULL)
@@ -474,8 +591,124 @@ pmd_parse_args(struct pmd_params *p, const char *name, const char *params)
memset(p, 0, sizeof(*p));
p->soft.name = name;
p->soft.intrusive = INTRUSIVE;
+ p->soft.tm.rate = 0;
+ p->soft.tm.nb_queues = SOFTNIC_SOFT_TM_NB_QUEUES;
+ for (i = 0; i < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; i++)
+ p->soft.tm.qsize[i] = SOFTNIC_SOFT_TM_QUEUE_SIZE;
+ p->soft.tm.enq_bsz = SOFTNIC_SOFT_TM_ENQ_BSZ;
+ p->soft.tm.deq_bsz = SOFTNIC_SOFT_TM_DEQ_BSZ;
p->hard.tx_queue_id = SOFTNIC_HARD_TX_QUEUE_ID;
+ /* SOFT: TM (optional) */
+ if (rte_kvargs_count(kvlist, PMD_PARAM_SOFT_TM) == 1) {
+ char *s;
+
+ ret = rte_kvargs_process(kvlist, PMD_PARAM_SOFT_TM,
+ &get_string, &s);
+ if (ret < 0)
+ goto out_free;
+
+ if (strcmp(s, "on") == 0)
+ p->soft.flags |= PMD_FEATURE_TM;
+ else if (strcmp(s, "off") == 0)
+ p->soft.flags &= ~PMD_FEATURE_TM;
+ else
+ ret = -EINVAL;
+
+ free(s);
+ if (ret)
+ goto out_free;
+ }
+
+ /* SOFT: TM rate (measured in bytes/second) (optional) */
+ if (rte_kvargs_count(kvlist, PMD_PARAM_SOFT_TM_RATE) == 1) {
+ ret = rte_kvargs_process(kvlist, PMD_PARAM_SOFT_TM_RATE,
+ &get_uint32, &p->soft.tm.rate);
+ if (ret < 0)
+ goto out_free;
+
+ p->soft.flags |= PMD_FEATURE_TM;
+ }
+
+ /* SOFT: TM number of queues (optional) */
+ if (rte_kvargs_count(kvlist, PMD_PARAM_SOFT_TM_NB_QUEUES) == 1) {
+ ret = rte_kvargs_process(kvlist, PMD_PARAM_SOFT_TM_NB_QUEUES,
+ &get_uint32, &p->soft.tm.nb_queues);
+ if (ret < 0)
+ goto out_free;
+
+ p->soft.flags |= PMD_FEATURE_TM;
+ }
+
+ /* SOFT: TM queue size 0 .. 3 (optional) */
+ if (rte_kvargs_count(kvlist, PMD_PARAM_SOFT_TM_QSIZE0) == 1) {
+ uint32_t qsize;
+
+ ret = rte_kvargs_process(kvlist, PMD_PARAM_SOFT_TM_QSIZE0,
+ &get_uint32, &qsize);
+ if (ret < 0)
+ goto out_free;
+
+ p->soft.tm.qsize[0] = (uint16_t)qsize;
+ p->soft.flags |= PMD_FEATURE_TM;
+ }
+
+ if (rte_kvargs_count(kvlist, PMD_PARAM_SOFT_TM_QSIZE1) == 1) {
+ uint32_t qsize;
+
+ ret = rte_kvargs_process(kvlist, PMD_PARAM_SOFT_TM_QSIZE1,
+ &get_uint32, &qsize);
+ if (ret < 0)
+ goto out_free;
+
+ p->soft.tm.qsize[1] = (uint16_t)qsize;
+ p->soft.flags |= PMD_FEATURE_TM;
+ }
+
+ if (rte_kvargs_count(kvlist, PMD_PARAM_SOFT_TM_QSIZE2) == 1) {
+ uint32_t qsize;
+
+ ret = rte_kvargs_process(kvlist, PMD_PARAM_SOFT_TM_QSIZE2,
+ &get_uint32, &qsize);
+ if (ret < 0)
+ goto out_free;
+
+ p->soft.tm.qsize[2] = (uint16_t)qsize;
+ p->soft.flags |= PMD_FEATURE_TM;
+ }
+
+ if (rte_kvargs_count(kvlist, PMD_PARAM_SOFT_TM_QSIZE3) == 1) {
+ uint32_t qsize;
+
+ ret = rte_kvargs_process(kvlist, PMD_PARAM_SOFT_TM_QSIZE3,
+ &get_uint32, &qsize);
+ if (ret < 0)
+ goto out_free;
+
+ p->soft.tm.qsize[3] = (uint16_t)qsize;
+ p->soft.flags |= PMD_FEATURE_TM;
+ }
+
+ /* SOFT: TM enqueue burst size (optional) */
+ if (rte_kvargs_count(kvlist, PMD_PARAM_SOFT_TM_ENQ_BSZ) == 1) {
+ ret = rte_kvargs_process(kvlist, PMD_PARAM_SOFT_TM_ENQ_BSZ,
+ &get_uint32, &p->soft.tm.enq_bsz);
+ if (ret < 0)
+ goto out_free;
+
+ p->soft.flags |= PMD_FEATURE_TM;
+ }
+
+ /* SOFT: TM dequeue burst size (optional) */
+ if (rte_kvargs_count(kvlist, PMD_PARAM_SOFT_TM_DEQ_BSZ) == 1) {
+ ret = rte_kvargs_process(kvlist, PMD_PARAM_SOFT_TM_DEQ_BSZ,
+ &get_uint32, &p->soft.tm.deq_bsz);
+ if (ret < 0)
+ goto out_free;
+
+ p->soft.flags |= PMD_FEATURE_TM;
+ }
+
/* HARD: name (mandatory) */
if (rte_kvargs_count(kvlist, PMD_PARAM_HARD_NAME) == 1) {
ret = rte_kvargs_process(kvlist, PMD_PARAM_HARD_NAME,
@@ -508,6 +741,7 @@ pmd_probe(struct rte_vdev_device *vdev)
int status;
struct rte_eth_dev_info hard_info;
+ uint32_t hard_speed;
uint16_t hard_port_id;
int numa_node;
void *dev_private;
@@ -530,11 +764,19 @@ pmd_probe(struct rte_vdev_device *vdev)
return -EINVAL;
rte_eth_dev_info_get(hard_port_id, &hard_info);
+ hard_speed = eth_dev_speed_max_mbps(hard_info.speed_capa);
numa_node = rte_eth_dev_socket_id(hard_port_id);
if (p.hard.tx_queue_id >= hard_info.max_tx_queues)
return -EINVAL;
+ if (p.soft.flags & PMD_FEATURE_TM) {
+ status = tm_params_check(&p, hard_speed);
+
+ if (status)
+ return status;
+ }
+
/* Allocate and initialize soft ethdev private data */
dev_private = pmd_init(&p, numa_node);
if (dev_private == NULL)
@@ -587,5 +829,14 @@ static struct rte_vdev_driver pmd_softnic_drv = {
RTE_PMD_REGISTER_VDEV(net_softnic, pmd_softnic_drv);
RTE_PMD_REGISTER_PARAM_STRING(net_softnic,
+ PMD_PARAM_SOFT_TM "=on|off "
+ PMD_PARAM_SOFT_TM_RATE "=<int> "
+ PMD_PARAM_SOFT_TM_NB_QUEUES "=<int> "
+ PMD_PARAM_SOFT_TM_QSIZE0 "=<int> "
+ PMD_PARAM_SOFT_TM_QSIZE1 "=<int> "
+ PMD_PARAM_SOFT_TM_QSIZE2 "=<int> "
+ PMD_PARAM_SOFT_TM_QSIZE3 "=<int> "
+ PMD_PARAM_SOFT_TM_ENQ_BSZ "=<int> "
+ PMD_PARAM_SOFT_TM_DEQ_BSZ "=<int> "
PMD_PARAM_HARD_NAME "=<string> "
PMD_PARAM_HARD_TX_QUEUE_ID "=<int>");
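For reference, an illustrative devargs string (not part of this patch) combining the TM parameters registered above; all numeric values are example choices only, with the rate expressed in bytes/second as soft_tm_rate expects.

/* Example devargs enabling the software traffic manager.
 * 1250000000 bytes/s corresponds to a 10 Gbps underlay port;
 * deq_bsz is kept smaller than enq_bsz as required by tm_params_check().
 */
static const char softnic_tm_args[] =
	"hard_name=0000:04:00.1,hard_tx_queue_id=0,"
	"soft_tm=on,soft_tm_rate=1250000000,soft_tm_nb_queues=65536,"
	"soft_tm_qsize0=64,soft_tm_qsize1=64,"
	"soft_tm_qsize2=64,soft_tm_qsize3=64,"
	"soft_tm_enq_bsz=32,soft_tm_deq_bsz=24";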
diff --git a/drivers/net/softnic/rte_eth_softnic.h b/drivers/net/softnic/rte_eth_softnic.h
index 566490a..b49e582 100644
--- a/drivers/net/softnic/rte_eth_softnic.h
+++ b/drivers/net/softnic/rte_eth_softnic.h
@@ -40,6 +40,22 @@
extern "C" {
#endif
+#ifndef SOFTNIC_SOFT_TM_NB_QUEUES
+#define SOFTNIC_SOFT_TM_NB_QUEUES 65536
+#endif
+
+#ifndef SOFTNIC_SOFT_TM_QUEUE_SIZE
+#define SOFTNIC_SOFT_TM_QUEUE_SIZE 64
+#endif
+
+#ifndef SOFTNIC_SOFT_TM_ENQ_BSZ
+#define SOFTNIC_SOFT_TM_ENQ_BSZ 32
+#endif
+
+#ifndef SOFTNIC_SOFT_TM_DEQ_BSZ
+#define SOFTNIC_SOFT_TM_DEQ_BSZ 24
+#endif
+
#ifndef SOFTNIC_HARD_TX_QUEUE_ID
#define SOFTNIC_HARD_TX_QUEUE_ID 0
#endif
diff --git a/drivers/net/softnic/rte_eth_softnic_internals.h b/drivers/net/softnic/rte_eth_softnic_internals.h
index 08a633f..fd9cbbe 100644
--- a/drivers/net/softnic/rte_eth_softnic_internals.h
+++ b/drivers/net/softnic/rte_eth_softnic_internals.h
@@ -37,10 +37,19 @@
#include <stdint.h>
#include <rte_mbuf.h>
+#include <rte_sched.h>
#include <rte_ethdev.h>
#include "rte_eth_softnic.h"
+/**
+ * PMD Parameters
+ */
+
+enum pmd_feature {
+ PMD_FEATURE_TM = 1, /**< Traffic Management (TM) */
+};
+
#ifndef INTRUSIVE
#define INTRUSIVE 0
#endif
@@ -57,6 +66,16 @@ struct pmd_params {
* (potentially faster).
*/
int intrusive;
+
+ /** Traffic Management (TM) */
+ struct {
+ uint32_t rate; /**< Rate (bytes/second) */
+ uint32_t nb_queues; /**< Number of queues */
+ uint16_t qsize[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
+ /**< Queue size per traffic class */
+ uint32_t enq_bsz; /**< Enqueue burst size */
+ uint32_t deq_bsz; /**< Dequeue burst size */
+ } tm;
} soft;
/** Parameters for the hard device (existing) */
@@ -86,6 +105,66 @@ struct default_internals {
};
/**
+ * Traffic Management (TM) Internals
+ */
+
+#ifndef TM_MAX_SUBPORTS
+#define TM_MAX_SUBPORTS 8
+#endif
+
+#ifndef TM_MAX_PIPES_PER_SUBPORT
+#define TM_MAX_PIPES_PER_SUBPORT 4096
+#endif
+
+struct tm_params {
+ struct rte_sched_port_params port_params;
+
+ struct rte_sched_subport_params subport_params[TM_MAX_SUBPORTS];
+
+ struct rte_sched_pipe_params
+ pipe_profiles[RTE_SCHED_PIPE_PROFILES_PER_PORT];
+ uint32_t n_pipe_profiles;
+ uint32_t pipe_to_profile[TM_MAX_SUBPORTS * TM_MAX_PIPES_PER_SUBPORT];
+};
+
+/* TM Levels */
+enum tm_node_level {
+ TM_NODE_LEVEL_PORT = 0,
+ TM_NODE_LEVEL_SUBPORT,
+ TM_NODE_LEVEL_PIPE,
+ TM_NODE_LEVEL_TC,
+ TM_NODE_LEVEL_QUEUE,
+ TM_NODE_LEVEL_MAX,
+};
+
+/* TM Hierarchy Specification */
+struct tm_hierarchy {
+ uint32_t n_tm_nodes[TM_NODE_LEVEL_MAX];
+};
+
+struct tm_internals {
+ /** Hierarchy specification
+ *
+ * -Hierarchy is unfrozen at init and when port is stopped.
+ * -Hierarchy is frozen on successful hierarchy commit.
+ * -Run-time hierarchy changes are not allowed, therefore it makes
+ * sense to keep the hierarchy frozen after the port is started.
+ */
+ struct tm_hierarchy h;
+
+ /** Blueprints */
+ struct tm_params params;
+
+ /** Run-time */
+ struct rte_sched_port *sched;
+ struct rte_mbuf **pkts_enq;
+ struct rte_mbuf **pkts_deq;
+ uint32_t pkts_enq_len;
+ uint32_t txq_pos;
+ uint32_t flush_count;
+};
+
+/**
* PMD Internals
*/
struct pmd_internals {
@@ -95,6 +174,7 @@ struct pmd_internals {
/** Soft device */
struct {
struct default_internals def; /**< Default */
+ struct tm_internals tm; /**< Traffic Management */
} soft;
/** Hard device */
@@ -111,4 +191,28 @@ struct pmd_rx_queue {
} hard;
};
+int
+tm_params_check(struct pmd_params *params, uint32_t hard_rate);
+
+int
+tm_init(struct pmd_internals *p, struct pmd_params *params, int numa_node);
+
+void
+tm_free(struct pmd_internals *p);
+
+int
+tm_start(struct pmd_internals *p);
+
+void
+tm_stop(struct pmd_internals *p);
+
+static inline int
+tm_used(struct rte_eth_dev *dev)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ return (p->params.soft.flags & PMD_FEATURE_TM) &&
+ p->soft.tm.h.n_tm_nodes[TM_NODE_LEVEL_PORT];
+}
+
#endif /* __INCLUDE_RTE_ETH_SOFTNIC_INTERNALS_H__ */
diff --git a/drivers/net/softnic/rte_eth_softnic_tm.c b/drivers/net/softnic/rte_eth_softnic_tm.c
new file mode 100644
index 0000000..165abfe
--- /dev/null
+++ b/drivers/net/softnic/rte_eth_softnic_tm.c
@@ -0,0 +1,181 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright(c) 2017 Intel Corporation. All rights reserved.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdint.h>
+#include <stdlib.h>
+#include <string.h>
+
+#include <rte_malloc.h>
+
+#include "rte_eth_softnic_internals.h"
+#include "rte_eth_softnic.h"
+
+#define BYTES_IN_MBPS (1000 * 1000 / 8)
+
+int
+tm_params_check(struct pmd_params *params, uint32_t hard_rate)
+{
+ uint64_t hard_rate_bytes_per_sec = hard_rate * BYTES_IN_MBPS;
+ uint32_t i;
+
+ /* rate */
+ if (params->soft.tm.rate) {
+ if (params->soft.tm.rate > hard_rate_bytes_per_sec)
+ return -EINVAL;
+ } else {
+ params->soft.tm.rate =
+ (hard_rate_bytes_per_sec > UINT32_MAX) ?
+ UINT32_MAX : hard_rate_bytes_per_sec;
+ }
+
+ /* nb_queues */
+ if (params->soft.tm.nb_queues == 0)
+ return -EINVAL;
+
+ if (params->soft.tm.nb_queues < RTE_SCHED_QUEUES_PER_PIPE)
+ params->soft.tm.nb_queues = RTE_SCHED_QUEUES_PER_PIPE;
+
+ params->soft.tm.nb_queues =
+ rte_align32pow2(params->soft.tm.nb_queues);
+
+ /* qsize */
+ for (i = 0; i < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; i++) {
+ if (params->soft.tm.qsize[i] == 0)
+ return -EINVAL;
+
+ params->soft.tm.qsize[i] =
+ rte_align32pow2(params->soft.tm.qsize[i]);
+ }
+
+ /* enq_bsz, deq_bsz */
+ if (params->soft.tm.enq_bsz == 0 ||
+ params->soft.tm.deq_bsz == 0 ||
+ params->soft.tm.deq_bsz >= params->soft.tm.enq_bsz)
+ return -EINVAL;
+
+ return 0;
+}
+
+int
+tm_init(struct pmd_internals *p,
+ struct pmd_params *params,
+ int numa_node)
+{
+ uint32_t enq_bsz = params->soft.tm.enq_bsz;
+ uint32_t deq_bsz = params->soft.tm.deq_bsz;
+
+ p->soft.tm.pkts_enq = rte_zmalloc_socket(params->soft.name,
+ 2 * enq_bsz * sizeof(struct rte_mbuf *),
+ 0,
+ numa_node);
+
+ if (p->soft.tm.pkts_enq == NULL)
+ return -ENOMEM;
+
+ p->soft.tm.pkts_deq = rte_zmalloc_socket(params->soft.name,
+ deq_bsz * sizeof(struct rte_mbuf *),
+ 0,
+ numa_node);
+
+ if (p->soft.tm.pkts_deq == NULL) {
+ rte_free(p->soft.tm.pkts_enq);
+ return -ENOMEM;
+ }
+
+ return 0;
+}
+
+void
+tm_free(struct pmd_internals *p)
+{
+ rte_free(p->soft.tm.pkts_enq);
+ rte_free(p->soft.tm.pkts_deq);
+}
+
+int
+tm_start(struct pmd_internals *p)
+{
+ struct tm_params *t = &p->soft.tm.params;
+ uint32_t n_subports, subport_id;
+ int status;
+
+ /* Port */
+ p->soft.tm.sched = rte_sched_port_config(&t->port_params);
+ if (p->soft.tm.sched == NULL)
+ return -1;
+
+ /* Subport */
+ n_subports = t->port_params.n_subports_per_port;
+ for (subport_id = 0; subport_id < n_subports; subport_id++) {
+ uint32_t n_pipes_per_subport =
+ t->port_params.n_pipes_per_subport;
+ uint32_t pipe_id;
+
+ status = rte_sched_subport_config(p->soft.tm.sched,
+ subport_id,
+ &t->subport_params[subport_id]);
+ if (status) {
+ rte_sched_port_free(p->soft.tm.sched);
+ return -1;
+ }
+
+ /* Pipe */
+ n_pipes_per_subport = t->port_params.n_pipes_per_subport;
+ for (pipe_id = 0; pipe_id < n_pipes_per_subport; pipe_id++) {
+ int pos = subport_id * TM_MAX_PIPES_PER_SUBPORT +
+ pipe_id;
+ int profile_id = t->pipe_to_profile[pos];
+
+ if (profile_id < 0)
+ continue;
+
+ status = rte_sched_pipe_config(p->soft.tm.sched,
+ subport_id,
+ pipe_id,
+ profile_id);
+ if (status) {
+ rte_sched_port_free(p->soft.tm.sched);
+ return -1;
+ }
+ }
+ }
+
+ return 0;
+}
+
+void
+tm_stop(struct pmd_internals *p)
+{
+ if (p->soft.tm.sched)
+ rte_sched_port_free(p->soft.tm.sched);
+}
--
2.9.3
^ permalink raw reply [flat|nested] 79+ messages in thread
* [dpdk-dev] [PATCH v7 3/5] net/softnic: add TM capabilities ops
2017-10-09 12:58 ` [dpdk-dev] [PATCH v7 0/5] net/softnic: sw fall-back pmd for traffic mgmt and others Jasvinder Singh
2017-10-09 12:58 ` [dpdk-dev] [PATCH v7 1/5] net/softnic: add softnic PMD Jasvinder Singh
2017-10-09 12:58 ` [dpdk-dev] [PATCH v7 2/5] net/softnic: add traffic management support Jasvinder Singh
@ 2017-10-09 12:58 ` Jasvinder Singh
2017-10-09 12:58 ` [dpdk-dev] [PATCH v7 4/5] net/softnic: add TM hierarchy related ops Jasvinder Singh
2017-10-09 12:58 ` [dpdk-dev] [PATCH v7 5/5] app/testpmd: add traffic management forwarding mode Jasvinder Singh
4 siblings, 0 replies; 79+ messages in thread
From: Jasvinder Singh @ 2017-10-09 12:58 UTC (permalink / raw)
To: dev; +Cc: cristian.dumitrescu, ferruh.yigit, thomas, wenzhuo.lu
Implement ethdev TM capability APIs in SoftNIC PMD.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Acked-by: Lu, Wenzhuo <wenzhuo.lu@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
---
drivers/net/softnic/rte_eth_softnic.c | 12 +-
drivers/net/softnic/rte_eth_softnic_internals.h | 32 ++
drivers/net/softnic/rte_eth_softnic_tm.c | 500 ++++++++++++++++++++++++
3 files changed, 543 insertions(+), 1 deletion(-)
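As a usage illustration (not part of this patch), once the capability ops below are wired into tm_ops_get, an application can query them through the generic rte_tm API. The port_id type follows the 17.11 ethdev convention and error handling is kept minimal.

#include <stdio.h>
#include <string.h>
#include <rte_tm.h>

/* Query and print a couple of the TM capability fields exposed by the
 * softnic port through rte_tm_capabilities_get().
 */
static int
softnic_tm_caps_show(uint16_t soft_port_id)
{
	struct rte_tm_capabilities cap;
	struct rte_tm_error error;
	int status;

	memset(&cap, 0, sizeof(cap));
	status = rte_tm_capabilities_get(soft_port_id, &cap, &error);
	if (status)
		return status;

	printf("TM: %u levels, up to %u nodes\n",
		cap.n_levels_max, cap.n_nodes_max);
	return 0;
}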
diff --git a/drivers/net/softnic/rte_eth_softnic.c b/drivers/net/softnic/rte_eth_softnic.c
index 2f19159..34dceae 100644
--- a/drivers/net/softnic/rte_eth_softnic.c
+++ b/drivers/net/softnic/rte_eth_softnic.c
@@ -43,6 +43,7 @@
#include <rte_errno.h>
#include <rte_ring.h>
#include <rte_sched.h>
+#include <rte_tm_driver.h>
#include "rte_eth_softnic.h"
#include "rte_eth_softnic_internals.h"
@@ -224,6 +225,15 @@ pmd_link_update(struct rte_eth_dev *dev __rte_unused,
return 0;
}
+static int
+pmd_tm_ops_get(struct rte_eth_dev *dev, void *arg)
+{
+ *(const struct rte_tm_ops **)arg =
+ (tm_enabled(dev)) ? &pmd_tm_ops : NULL;
+
+ return 0;
+}
+
static const struct eth_dev_ops pmd_ops = {
.dev_configure = pmd_dev_configure,
.dev_start = pmd_dev_start,
@@ -233,7 +243,7 @@ static const struct eth_dev_ops pmd_ops = {
.dev_infos_get = pmd_dev_infos_get,
.rx_queue_setup = pmd_rx_queue_setup,
.tx_queue_setup = pmd_tx_queue_setup,
- .tm_ops_get = NULL,
+ .tm_ops_get = pmd_tm_ops_get,
};
static uint16_t
diff --git a/drivers/net/softnic/rte_eth_softnic_internals.h b/drivers/net/softnic/rte_eth_softnic_internals.h
index fd9cbbe..75d9387 100644
--- a/drivers/net/softnic/rte_eth_softnic_internals.h
+++ b/drivers/net/softnic/rte_eth_softnic_internals.h
@@ -39,6 +39,7 @@
#include <rte_mbuf.h>
#include <rte_sched.h>
#include <rte_ethdev.h>
+#include <rte_tm_driver.h>
#include "rte_eth_softnic.h"
@@ -137,8 +138,26 @@ enum tm_node_level {
TM_NODE_LEVEL_MAX,
};
+/* TM Node */
+struct tm_node {
+ TAILQ_ENTRY(tm_node) node;
+ uint32_t node_id;
+ uint32_t parent_node_id;
+ uint32_t priority;
+ uint32_t weight;
+ uint32_t level;
+ struct tm_node *parent_node;
+ struct rte_tm_node_params params;
+ struct rte_tm_node_stats stats;
+ uint32_t n_children;
+};
+
+TAILQ_HEAD(tm_node_list, tm_node);
+
/* TM Hierarchy Specification */
struct tm_hierarchy {
+ struct tm_node_list nodes;
+
uint32_t n_tm_nodes[TM_NODE_LEVEL_MAX];
};
@@ -191,6 +210,11 @@ struct pmd_rx_queue {
} hard;
};
+/**
+ * Traffic Management (TM) Operation
+ */
+extern const struct rte_tm_ops pmd_tm_ops;
+
int
tm_params_check(struct pmd_params *params, uint32_t hard_rate);
@@ -207,6 +231,14 @@ void
tm_stop(struct pmd_internals *p);
static inline int
+tm_enabled(struct rte_eth_dev *dev)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ return (p->params.soft.flags & PMD_FEATURE_TM);
+}
+
+static inline int
tm_used(struct rte_eth_dev *dev)
{
struct pmd_internals *p = dev->data->dev_private;
diff --git a/drivers/net/softnic/rte_eth_softnic_tm.c b/drivers/net/softnic/rte_eth_softnic_tm.c
index 165abfe..73274d4 100644
--- a/drivers/net/softnic/rte_eth_softnic_tm.c
+++ b/drivers/net/softnic/rte_eth_softnic_tm.c
@@ -179,3 +179,503 @@ tm_stop(struct pmd_internals *p)
if (p->soft.tm.sched)
rte_sched_port_free(p->soft.tm.sched);
}
+
+static struct tm_node *
+tm_node_search(struct rte_eth_dev *dev, uint32_t node_id)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_node_list *nl = &p->soft.tm.h.nodes;
+ struct tm_node *n;
+
+ TAILQ_FOREACH(n, nl, node)
+ if (n->node_id == node_id)
+ return n;
+
+ return NULL;
+}
+
+static uint32_t
+tm_level_get_max_nodes(struct rte_eth_dev *dev, enum tm_node_level level)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ uint32_t n_queues_max = p->params.soft.tm.nb_queues;
+ uint32_t n_tc_max = n_queues_max / RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS;
+ uint32_t n_pipes_max = n_tc_max / RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE;
+ uint32_t n_subports_max = n_pipes_max;
+ uint32_t n_root_max = 1;
+
+ switch (level) {
+ case TM_NODE_LEVEL_PORT:
+ return n_root_max;
+ case TM_NODE_LEVEL_SUBPORT:
+ return n_subports_max;
+ case TM_NODE_LEVEL_PIPE:
+ return n_pipes_max;
+ case TM_NODE_LEVEL_TC:
+ return n_tc_max;
+ case TM_NODE_LEVEL_QUEUE:
+ default:
+ return n_queues_max;
+ }
+}
+
+#ifdef RTE_SCHED_RED
+#define WRED_SUPPORTED 1
+#else
+#define WRED_SUPPORTED 0
+#endif
+
+#define STATS_MASK_DEFAULT \
+ (RTE_TM_STATS_N_PKTS | \
+ RTE_TM_STATS_N_BYTES | \
+ RTE_TM_STATS_N_PKTS_GREEN_DROPPED | \
+ RTE_TM_STATS_N_BYTES_GREEN_DROPPED)
+
+#define STATS_MASK_QUEUE \
+ (STATS_MASK_DEFAULT | \
+ RTE_TM_STATS_N_PKTS_QUEUED)
+
+static const struct rte_tm_capabilities tm_cap = {
+ .n_nodes_max = UINT32_MAX,
+ .n_levels_max = TM_NODE_LEVEL_MAX,
+
+ .non_leaf_nodes_identical = 0,
+ .leaf_nodes_identical = 1,
+
+ .shaper_n_max = UINT32_MAX,
+ .shaper_private_n_max = UINT32_MAX,
+ .shaper_private_dual_rate_n_max = 0,
+ .shaper_private_rate_min = 1,
+ .shaper_private_rate_max = UINT32_MAX,
+
+ .shaper_shared_n_max = UINT32_MAX,
+ .shaper_shared_n_nodes_per_shaper_max = UINT32_MAX,
+ .shaper_shared_n_shapers_per_node_max = 1,
+ .shaper_shared_dual_rate_n_max = 0,
+ .shaper_shared_rate_min = 1,
+ .shaper_shared_rate_max = UINT32_MAX,
+
+ .shaper_pkt_length_adjust_min = RTE_TM_ETH_FRAMING_OVERHEAD_FCS,
+ .shaper_pkt_length_adjust_max = RTE_TM_ETH_FRAMING_OVERHEAD_FCS,
+
+ .sched_n_children_max = UINT32_MAX,
+ .sched_sp_n_priorities_max = RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE,
+ .sched_wfq_n_children_per_group_max = UINT32_MAX,
+ .sched_wfq_n_groups_max = 1,
+ .sched_wfq_weight_max = UINT32_MAX,
+
+ .cman_head_drop_supported = 0,
+ .cman_wred_context_n_max = 0,
+ .cman_wred_context_private_n_max = 0,
+ .cman_wred_context_shared_n_max = 0,
+ .cman_wred_context_shared_n_nodes_per_context_max = 0,
+ .cman_wred_context_shared_n_contexts_per_node_max = 0,
+
+ .mark_vlan_dei_supported = {0, 0, 0},
+ .mark_ip_ecn_tcp_supported = {0, 0, 0},
+ .mark_ip_ecn_sctp_supported = {0, 0, 0},
+ .mark_ip_dscp_supported = {0, 0, 0},
+
+ .dynamic_update_mask = 0,
+
+ .stats_mask = STATS_MASK_QUEUE,
+};
+
+/* Traffic manager capabilities get */
+static int
+pmd_tm_capabilities_get(struct rte_eth_dev *dev __rte_unused,
+ struct rte_tm_capabilities *cap,
+ struct rte_tm_error *error)
+{
+ if (cap == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_CAPABILITIES,
+ NULL,
+ rte_strerror(EINVAL));
+
+ memcpy(cap, &tm_cap, sizeof(*cap));
+
+ cap->n_nodes_max = tm_level_get_max_nodes(dev, TM_NODE_LEVEL_PORT) +
+ tm_level_get_max_nodes(dev, TM_NODE_LEVEL_SUBPORT) +
+ tm_level_get_max_nodes(dev, TM_NODE_LEVEL_PIPE) +
+ tm_level_get_max_nodes(dev, TM_NODE_LEVEL_TC) +
+ tm_level_get_max_nodes(dev, TM_NODE_LEVEL_QUEUE);
+
+ cap->shaper_private_n_max =
+ tm_level_get_max_nodes(dev, TM_NODE_LEVEL_PORT) +
+ tm_level_get_max_nodes(dev, TM_NODE_LEVEL_SUBPORT) +
+ tm_level_get_max_nodes(dev, TM_NODE_LEVEL_PIPE) +
+ tm_level_get_max_nodes(dev, TM_NODE_LEVEL_TC);
+
+ cap->shaper_shared_n_max = RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE *
+ tm_level_get_max_nodes(dev, TM_NODE_LEVEL_SUBPORT);
+
+ cap->shaper_n_max = cap->shaper_private_n_max +
+ cap->shaper_shared_n_max;
+
+ cap->shaper_shared_n_nodes_per_shaper_max =
+ tm_level_get_max_nodes(dev, TM_NODE_LEVEL_PIPE);
+
+ cap->sched_n_children_max = RTE_MAX(
+ tm_level_get_max_nodes(dev, TM_NODE_LEVEL_PIPE),
+ (uint32_t)RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE);
+
+ cap->sched_wfq_n_children_per_group_max = cap->sched_n_children_max;
+
+ if (WRED_SUPPORTED)
+ cap->cman_wred_context_private_n_max =
+ tm_level_get_max_nodes(dev, TM_NODE_LEVEL_QUEUE);
+
+ cap->cman_wred_context_n_max = cap->cman_wred_context_private_n_max +
+ cap->cman_wred_context_shared_n_max;
+
+ return 0;
+}
+
+static const struct rte_tm_level_capabilities tm_level_cap[] = {
+ [TM_NODE_LEVEL_PORT] = {
+ .n_nodes_max = 1,
+ .n_nodes_nonleaf_max = 1,
+ .n_nodes_leaf_max = 0,
+ .non_leaf_nodes_identical = 1,
+ .leaf_nodes_identical = 0,
+
+ .nonleaf = {
+ .shaper_private_supported = 1,
+ .shaper_private_dual_rate_supported = 0,
+ .shaper_private_rate_min = 1,
+ .shaper_private_rate_max = UINT32_MAX,
+ .shaper_shared_n_max = 0,
+
+ .sched_n_children_max = UINT32_MAX,
+ .sched_sp_n_priorities_max = 1,
+ .sched_wfq_n_children_per_group_max = UINT32_MAX,
+ .sched_wfq_n_groups_max = 1,
+ .sched_wfq_weight_max = 1,
+
+ .stats_mask = STATS_MASK_DEFAULT,
+ },
+ },
+
+ [TM_NODE_LEVEL_SUBPORT] = {
+ .n_nodes_max = UINT32_MAX,
+ .n_nodes_nonleaf_max = UINT32_MAX,
+ .n_nodes_leaf_max = 0,
+ .non_leaf_nodes_identical = 1,
+ .leaf_nodes_identical = 0,
+
+ .nonleaf = {
+ .shaper_private_supported = 1,
+ .shaper_private_dual_rate_supported = 0,
+ .shaper_private_rate_min = 1,
+ .shaper_private_rate_max = UINT32_MAX,
+ .shaper_shared_n_max = 0,
+
+ .sched_n_children_max = UINT32_MAX,
+ .sched_sp_n_priorities_max = 1,
+ .sched_wfq_n_children_per_group_max = UINT32_MAX,
+ .sched_wfq_n_groups_max = 1,
+#ifdef RTE_SCHED_SUBPORT_TC_OV
+ .sched_wfq_weight_max = UINT32_MAX,
+#else
+ .sched_wfq_weight_max = 1,
+#endif
+ .stats_mask = STATS_MASK_DEFAULT,
+ },
+ },
+
+ [TM_NODE_LEVEL_PIPE] = {
+ .n_nodes_max = UINT32_MAX,
+ .n_nodes_nonleaf_max = UINT32_MAX,
+ .n_nodes_leaf_max = 0,
+ .non_leaf_nodes_identical = 1,
+ .leaf_nodes_identical = 0,
+
+ .nonleaf = {
+ .shaper_private_supported = 1,
+ .shaper_private_dual_rate_supported = 0,
+ .shaper_private_rate_min = 1,
+ .shaper_private_rate_max = UINT32_MAX,
+ .shaper_shared_n_max = 0,
+
+ .sched_n_children_max =
+ RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE,
+ .sched_sp_n_priorities_max =
+ RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE,
+ .sched_wfq_n_children_per_group_max = 1,
+ .sched_wfq_n_groups_max = 0,
+ .sched_wfq_weight_max = 1,
+
+ .stats_mask = STATS_MASK_DEFAULT,
+ },
+ },
+
+ [TM_NODE_LEVEL_TC] = {
+ .n_nodes_max = UINT32_MAX,
+ .n_nodes_nonleaf_max = UINT32_MAX,
+ .n_nodes_leaf_max = 0,
+ .non_leaf_nodes_identical = 1,
+ .leaf_nodes_identical = 0,
+
+ .nonleaf = {
+ .shaper_private_supported = 1,
+ .shaper_private_dual_rate_supported = 0,
+ .shaper_private_rate_min = 1,
+ .shaper_private_rate_max = UINT32_MAX,
+ .shaper_shared_n_max = 1,
+
+ .sched_n_children_max =
+ RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS,
+ .sched_sp_n_priorities_max = 1,
+ .sched_wfq_n_children_per_group_max =
+ RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS,
+ .sched_wfq_n_groups_max = 1,
+ .sched_wfq_weight_max = UINT32_MAX,
+
+ .stats_mask = STATS_MASK_DEFAULT,
+ },
+ },
+
+ [TM_NODE_LEVEL_QUEUE] = {
+ .n_nodes_max = UINT32_MAX,
+ .n_nodes_nonleaf_max = 0,
+ .n_nodes_leaf_max = UINT32_MAX,
+ .non_leaf_nodes_identical = 0,
+ .leaf_nodes_identical = 1,
+
+ .leaf = {
+ .shaper_private_supported = 0,
+ .shaper_private_dual_rate_supported = 0,
+ .shaper_private_rate_min = 0,
+ .shaper_private_rate_max = 0,
+ .shaper_shared_n_max = 0,
+
+ .cman_head_drop_supported = 0,
+ .cman_wred_context_private_supported = WRED_SUPPORTED,
+ .cman_wred_context_shared_n_max = 0,
+
+ .stats_mask = STATS_MASK_QUEUE,
+ },
+ },
+};
+
+/* Traffic manager level capabilities get */
+static int
+pmd_tm_level_capabilities_get(struct rte_eth_dev *dev __rte_unused,
+ uint32_t level_id,
+ struct rte_tm_level_capabilities *cap,
+ struct rte_tm_error *error)
+{
+ if (cap == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_CAPABILITIES,
+ NULL,
+ rte_strerror(EINVAL));
+
+ if (level_id >= TM_NODE_LEVEL_MAX)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_LEVEL_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ memcpy(cap, &tm_level_cap[level_id], sizeof(*cap));
+
+ switch (level_id) {
+ case TM_NODE_LEVEL_PORT:
+ cap->nonleaf.sched_n_children_max =
+ tm_level_get_max_nodes(dev,
+ TM_NODE_LEVEL_SUBPORT);
+ cap->nonleaf.sched_wfq_n_children_per_group_max =
+ cap->nonleaf.sched_n_children_max;
+ break;
+
+ case TM_NODE_LEVEL_SUBPORT:
+ cap->n_nodes_max = tm_level_get_max_nodes(dev,
+ TM_NODE_LEVEL_SUBPORT);
+ cap->n_nodes_nonleaf_max = cap->n_nodes_max;
+ cap->nonleaf.sched_n_children_max =
+ tm_level_get_max_nodes(dev,
+ TM_NODE_LEVEL_PIPE);
+ cap->nonleaf.sched_wfq_n_children_per_group_max =
+ cap->nonleaf.sched_n_children_max;
+ break;
+
+ case TM_NODE_LEVEL_PIPE:
+ cap->n_nodes_max = tm_level_get_max_nodes(dev,
+ TM_NODE_LEVEL_PIPE);
+ cap->n_nodes_nonleaf_max = cap->n_nodes_max;
+ break;
+
+ case TM_NODE_LEVEL_TC:
+ cap->n_nodes_max = tm_level_get_max_nodes(dev,
+ TM_NODE_LEVEL_TC);
+ cap->n_nodes_nonleaf_max = cap->n_nodes_max;
+ break;
+
+ case TM_NODE_LEVEL_QUEUE:
+ default:
+ cap->n_nodes_max = tm_level_get_max_nodes(dev,
+ TM_NODE_LEVEL_QUEUE);
+ cap->n_nodes_leaf_max = cap->n_nodes_max;
+ break;
+ }
+
+ return 0;
+}
+
+static const struct rte_tm_node_capabilities tm_node_cap[] = {
+ [TM_NODE_LEVEL_PORT] = {
+ .shaper_private_supported = 1,
+ .shaper_private_dual_rate_supported = 0,
+ .shaper_private_rate_min = 1,
+ .shaper_private_rate_max = UINT32_MAX,
+ .shaper_shared_n_max = 0,
+
+ .nonleaf = {
+ .sched_n_children_max = UINT32_MAX,
+ .sched_sp_n_priorities_max = 1,
+ .sched_wfq_n_children_per_group_max = UINT32_MAX,
+ .sched_wfq_n_groups_max = 1,
+ .sched_wfq_weight_max = 1,
+ },
+
+ .stats_mask = STATS_MASK_DEFAULT,
+ },
+
+ [TM_NODE_LEVEL_SUBPORT] = {
+ .shaper_private_supported = 1,
+ .shaper_private_dual_rate_supported = 0,
+ .shaper_private_rate_min = 1,
+ .shaper_private_rate_max = UINT32_MAX,
+ .shaper_shared_n_max = 0,
+
+ .nonleaf = {
+ .sched_n_children_max = UINT32_MAX,
+ .sched_sp_n_priorities_max = 1,
+ .sched_wfq_n_children_per_group_max = UINT32_MAX,
+ .sched_wfq_n_groups_max = 1,
+ .sched_wfq_weight_max = UINT32_MAX,
+ },
+
+ .stats_mask = STATS_MASK_DEFAULT,
+ },
+
+ [TM_NODE_LEVEL_PIPE] = {
+ .shaper_private_supported = 1,
+ .shaper_private_dual_rate_supported = 0,
+ .shaper_private_rate_min = 1,
+ .shaper_private_rate_max = UINT32_MAX,
+ .shaper_shared_n_max = 0,
+
+ .nonleaf = {
+ .sched_n_children_max =
+ RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE,
+ .sched_sp_n_priorities_max =
+ RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE,
+ .sched_wfq_n_children_per_group_max = 1,
+ .sched_wfq_n_groups_max = 0,
+ .sched_wfq_weight_max = 1,
+ },
+
+ .stats_mask = STATS_MASK_DEFAULT,
+ },
+
+ [TM_NODE_LEVEL_TC] = {
+ .shaper_private_supported = 1,
+ .shaper_private_dual_rate_supported = 0,
+ .shaper_private_rate_min = 1,
+ .shaper_private_rate_max = UINT32_MAX,
+ .shaper_shared_n_max = 1,
+
+ .nonleaf = {
+ .sched_n_children_max =
+ RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS,
+ .sched_sp_n_priorities_max = 1,
+ .sched_wfq_n_children_per_group_max =
+ RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS,
+ .sched_wfq_n_groups_max = 1,
+ .sched_wfq_weight_max = UINT32_MAX,
+ },
+
+ .stats_mask = STATS_MASK_DEFAULT,
+ },
+
+ [TM_NODE_LEVEL_QUEUE] = {
+ .shaper_private_supported = 0,
+ .shaper_private_dual_rate_supported = 0,
+ .shaper_private_rate_min = 0,
+ .shaper_private_rate_max = 0,
+ .shaper_shared_n_max = 0,
+
+
+ .leaf = {
+ .cman_head_drop_supported = 0,
+ .cman_wred_context_private_supported = WRED_SUPPORTED,
+ .cman_wred_context_shared_n_max = 0,
+ },
+
+ .stats_mask = STATS_MASK_QUEUE,
+ },
+};
+
+/* Traffic manager node capabilities get */
+static int
+pmd_tm_node_capabilities_get(struct rte_eth_dev *dev __rte_unused,
+ uint32_t node_id,
+ struct rte_tm_node_capabilities *cap,
+ struct rte_tm_error *error)
+{
+ struct tm_node *tm_node;
+
+ if (cap == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_CAPABILITIES,
+ NULL,
+ rte_strerror(EINVAL));
+
+ tm_node = tm_node_search(dev, node_id);
+ if (tm_node == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ memcpy(cap, &tm_node_cap[tm_node->level], sizeof(*cap));
+
+ switch (tm_node->level) {
+ case TM_NODE_LEVEL_PORT:
+ cap->nonleaf.sched_n_children_max =
+ tm_level_get_max_nodes(dev,
+ TM_NODE_LEVEL_SUBPORT);
+ cap->nonleaf.sched_wfq_n_children_per_group_max =
+ cap->nonleaf.sched_n_children_max;
+ break;
+
+ case TM_NODE_LEVEL_SUBPORT:
+ cap->nonleaf.sched_n_children_max =
+ tm_level_get_max_nodes(dev,
+ TM_NODE_LEVEL_PIPE);
+ cap->nonleaf.sched_wfq_n_children_per_group_max =
+ cap->nonleaf.sched_n_children_max;
+ break;
+
+ case TM_NODE_LEVEL_PIPE:
+ case TM_NODE_LEVEL_TC:
+ case TM_NODE_LEVEL_QUEUE:
+ default:
+ break;
+ }
+
+ return 0;
+}
+
+const struct rte_tm_ops pmd_tm_ops = {
+ .capabilities_get = pmd_tm_capabilities_get,
+ .level_capabilities_get = pmd_tm_level_capabilities_get,
+ .node_capabilities_get = pmd_tm_node_capabilities_get,
+};
--
2.9.3
* [dpdk-dev] [PATCH v7 4/5] net/softnic: add TM hierarchy related ops
2017-10-09 12:58 ` [dpdk-dev] [PATCH v7 0/5] net/softnic: sw fall-back pmd for traffic mgmt and others Jasvinder Singh
` (2 preceding siblings ...)
2017-10-09 12:58 ` [dpdk-dev] [PATCH v7 3/5] net/softnic: add TM capabilities ops Jasvinder Singh
@ 2017-10-09 12:58 ` Jasvinder Singh
2017-10-09 12:58 ` [dpdk-dev] [PATCH v7 5/5] app/testpmd: add traffic management forwarding mode Jasvinder Singh
4 siblings, 0 replies; 79+ messages in thread
From: Jasvinder Singh @ 2017-10-09 12:58 UTC (permalink / raw)
To: dev; +Cc: cristian.dumitrescu, ferruh.yigit, thomas, wenzhuo.lu
Implement the ethdev TM hierarchy related ops (shaper/WRED profiles, shared shapers, node add/delete, hierarchy commit) in the SoftNIC PMD. A minimal usage sketch of these ops is included in the notes below the changelog.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Acked-by: Lu, Wenzhuo <wenzhuo.lu@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
---
v7 change:
- fix checkpatch warnings
v5 change:
- add macro for the tc period
- add more comments
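Note (not part of the patch): a minimal sketch of how an application is expected to drive these hierarchy ops through the generic rte_tm API — create a shaper profile, build the port/subport/pipe/TC/queue tree with rte_tm_node_add(), then freeze it with rte_tm_hierarchy_commit(). The port_id, node IDs and rates below are illustrative assumptions; with this PMD, leaf (queue) node IDs have to be lower than the number of scheduler queues, non-leaf node IDs higher, and the root shaper rate must not exceed the port rate configured at vdev creation time.
#include <rte_tm.h>
#include <rte_sched.h>
static int
softnic_tm_hierarchy_example(uint16_t port_id)
{
	struct rte_tm_error error;
	uint32_t tc, q;
	/* Illustrative node IDs: non-leaf IDs assumed above the number of
	 * scheduler queues, leaf IDs below it.
	 */
	uint32_t port_node = 1000, subport_node = 1001, pipe_node = 1002;
	/* Single-rate shaper profile reused by all non-leaf nodes */
	struct rte_tm_shaper_params sp = {
		.peak = { .rate = 1250000000, .size = 1000000 }, /* ~10 Gbps */
		.pkt_length_adjust = RTE_TM_ETH_FRAMING_OVERHEAD_FCS,
	};
	/* Non-leaf node params (port, subport, TC levels) */
	struct rte_tm_node_params nonleaf = {
		.shaper_profile_id = 0,
		.nonleaf = { .n_sp_priorities = 1 },
	};
	/* Leaf (queue) node params: no private shaper, tail drop */
	struct rte_tm_node_params leaf = {
		.shaper_profile_id = RTE_TM_SHAPER_PROFILE_ID_NONE,
		.leaf = { .cman = RTE_TM_CMAN_TAIL_DROP },
	};
	if (rte_tm_shaper_profile_add(port_id, 0, &sp, &error))
		return -1;
	/* Root node (port level), then one subport under it */
	if (rte_tm_node_add(port_id, port_node, RTE_TM_NODE_ID_NULL,
			0, 1, RTE_TM_NODE_LEVEL_ID_ANY, &nonleaf, &error) ||
	    rte_tm_node_add(port_id, subport_node, port_node,
			0, 1, RTE_TM_NODE_LEVEL_ID_ANY, &nonleaf, &error))
		return -1;
	/* One pipe; the pipe level expects 4 strict priorities (one per TC) */
	nonleaf.nonleaf.n_sp_priorities = RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE;
	if (rte_tm_node_add(port_id, pipe_node, subport_node,
			0, 1, RTE_TM_NODE_LEVEL_ID_ANY, &nonleaf, &error))
		return -1;
	nonleaf.nonleaf.n_sp_priorities = 1;
	/* 4 TCs per pipe (the node priority selects the TC), 4 queues per TC */
	for (tc = 0; tc < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; tc++) {
		uint32_t tc_node = 1003 + tc;
		if (rte_tm_node_add(port_id, tc_node, pipe_node,
				tc, 1, RTE_TM_NODE_LEVEL_ID_ANY,
				&nonleaf, &error))
			return -1;
		for (q = 0; q < RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS; q++)
			if (rte_tm_node_add(port_id,
					tc * RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS + q,
					tc_node, 0, 1, RTE_TM_NODE_LEVEL_ID_ANY,
					&leaf, &error))
				return -1;
	}
	/* Freeze the hierarchy; the PMD maps it onto librte_sched */
	return rte_tm_hierarchy_commit(port_id, 1 /* clear_on_fail */, &error);
}
With this PMD the commit has to complete successfully before the ethdev is started, since tm_start() refuses to run on a hierarchy that is not frozen.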
drivers/net/softnic/rte_eth_softnic_internals.h | 41 +
drivers/net/softnic/rte_eth_softnic_tm.c | 2781 ++++++++++++++++++++++-
2 files changed, 2817 insertions(+), 5 deletions(-)
diff --git a/drivers/net/softnic/rte_eth_softnic_internals.h b/drivers/net/softnic/rte_eth_softnic_internals.h
index 75d9387..1f75806 100644
--- a/drivers/net/softnic/rte_eth_softnic_internals.h
+++ b/drivers/net/softnic/rte_eth_softnic_internals.h
@@ -138,6 +138,36 @@ enum tm_node_level {
TM_NODE_LEVEL_MAX,
};
+/* TM Shaper Profile */
+struct tm_shaper_profile {
+ TAILQ_ENTRY(tm_shaper_profile) node;
+ uint32_t shaper_profile_id;
+ uint32_t n_users;
+ struct rte_tm_shaper_params params;
+};
+
+TAILQ_HEAD(tm_shaper_profile_list, tm_shaper_profile);
+
+/* TM Shared Shaper */
+struct tm_shared_shaper {
+ TAILQ_ENTRY(tm_shared_shaper) node;
+ uint32_t shared_shaper_id;
+ uint32_t n_users;
+ uint32_t shaper_profile_id;
+};
+
+TAILQ_HEAD(tm_shared_shaper_list, tm_shared_shaper);
+
+/* TM WRED Profile */
+struct tm_wred_profile {
+ TAILQ_ENTRY(tm_wred_profile) node;
+ uint32_t wred_profile_id;
+ uint32_t n_users;
+ struct rte_tm_wred_params params;
+};
+
+TAILQ_HEAD(tm_wred_profile_list, tm_wred_profile);
+
/* TM Node */
struct tm_node {
TAILQ_ENTRY(tm_node) node;
@@ -147,6 +177,8 @@ struct tm_node {
uint32_t weight;
uint32_t level;
struct tm_node *parent_node;
+ struct tm_shaper_profile *shaper_profile;
+ struct tm_wred_profile *wred_profile;
struct rte_tm_node_params params;
struct rte_tm_node_stats stats;
uint32_t n_children;
@@ -156,8 +188,16 @@ TAILQ_HEAD(tm_node_list, tm_node);
/* TM Hierarchy Specification */
struct tm_hierarchy {
+ struct tm_shaper_profile_list shaper_profiles;
+ struct tm_shared_shaper_list shared_shapers;
+ struct tm_wred_profile_list wred_profiles;
struct tm_node_list nodes;
+ uint32_t n_shaper_profiles;
+ uint32_t n_shared_shapers;
+ uint32_t n_wred_profiles;
+ uint32_t n_nodes;
+
uint32_t n_tm_nodes[TM_NODE_LEVEL_MAX];
};
@@ -170,6 +210,7 @@ struct tm_internals {
* sense to keep the hierarchy frozen after the port is started.
*/
struct tm_hierarchy h;
+ int hierarchy_frozen;
/** Blueprints */
struct tm_params params;
diff --git a/drivers/net/softnic/rte_eth_softnic_tm.c b/drivers/net/softnic/rte_eth_softnic_tm.c
index 73274d4..682cc4d 100644
--- a/drivers/net/softnic/rte_eth_softnic_tm.c
+++ b/drivers/net/softnic/rte_eth_softnic_tm.c
@@ -40,7 +40,9 @@
#include "rte_eth_softnic_internals.h"
#include "rte_eth_softnic.h"
-#define BYTES_IN_MBPS (1000 * 1000 / 8)
+#define BYTES_IN_MBPS (1000 * 1000 / 8)
+#define SUBPORT_TC_PERIOD 10
+#define PIPE_TC_PERIOD 40
int
tm_params_check(struct pmd_params *params, uint32_t hard_rate)
@@ -86,6 +88,79 @@ tm_params_check(struct pmd_params *params, uint32_t hard_rate)
return 0;
}
+static void
+tm_hierarchy_init(struct pmd_internals *p)
+{
+ memset(&p->soft.tm.h, 0, sizeof(p->soft.tm.h));
+
+ /* Initialize shaper profile list */
+ TAILQ_INIT(&p->soft.tm.h.shaper_profiles);
+
+ /* Initialize shared shaper list */
+ TAILQ_INIT(&p->soft.tm.h.shared_shapers);
+
+ /* Initialize wred profile list */
+ TAILQ_INIT(&p->soft.tm.h.wred_profiles);
+
+ /* Initialize TM node list */
+ TAILQ_INIT(&p->soft.tm.h.nodes);
+}
+
+static void
+tm_hierarchy_uninit(struct pmd_internals *p)
+{
+ /* Remove all nodes */
+ for ( ; ; ) {
+ struct tm_node *tm_node;
+
+ tm_node = TAILQ_FIRST(&p->soft.tm.h.nodes);
+ if (tm_node == NULL)
+ break;
+
+ TAILQ_REMOVE(&p->soft.tm.h.nodes, tm_node, node);
+ free(tm_node);
+ }
+
+ /* Remove all WRED profiles */
+ for ( ; ; ) {
+ struct tm_wred_profile *wred_profile;
+
+ wred_profile = TAILQ_FIRST(&p->soft.tm.h.wred_profiles);
+ if (wred_profile == NULL)
+ break;
+
+ TAILQ_REMOVE(&p->soft.tm.h.wred_profiles, wred_profile, node);
+ free(wred_profile);
+ }
+
+ /* Remove all shared shapers */
+ for ( ; ; ) {
+ struct tm_shared_shaper *shared_shaper;
+
+ shared_shaper = TAILQ_FIRST(&p->soft.tm.h.shared_shapers);
+ if (shared_shaper == NULL)
+ break;
+
+ TAILQ_REMOVE(&p->soft.tm.h.shared_shapers, shared_shaper, node);
+ free(shared_shaper);
+ }
+
+ /* Remove all shaper profiles */
+ for ( ; ; ) {
+ struct tm_shaper_profile *shaper_profile;
+
+ shaper_profile = TAILQ_FIRST(&p->soft.tm.h.shaper_profiles);
+ if (shaper_profile == NULL)
+ break;
+
+ TAILQ_REMOVE(&p->soft.tm.h.shaper_profiles,
+ shaper_profile, node);
+ free(shaper_profile);
+ }
+
+ memset(&p->soft.tm.h, 0, sizeof(p->soft.tm.h));
+}
+
int
tm_init(struct pmd_internals *p,
struct pmd_params *params,
@@ -112,12 +187,15 @@ tm_init(struct pmd_internals *p,
return -ENOMEM;
}
+ tm_hierarchy_init(p);
+
return 0;
}
void
tm_free(struct pmd_internals *p)
{
+ tm_hierarchy_uninit(p);
rte_free(p->soft.tm.pkts_enq);
rte_free(p->soft.tm.pkts_deq);
}
@@ -129,6 +207,10 @@ tm_start(struct pmd_internals *p)
uint32_t n_subports, subport_id;
int status;
+ /* Is hierarchy frozen? */
+ if (p->soft.tm.hierarchy_frozen == 0)
+ return -1;
+
/* Port */
p->soft.tm.sched = rte_sched_port_config(&t->port_params);
if (p->soft.tm.sched == NULL)
@@ -178,6 +260,51 @@ tm_stop(struct pmd_internals *p)
{
if (p->soft.tm.sched)
rte_sched_port_free(p->soft.tm.sched);
+
+ /* Unfreeze hierarchy */
+ p->soft.tm.hierarchy_frozen = 0;
+}
+
+static struct tm_shaper_profile *
+tm_shaper_profile_search(struct rte_eth_dev *dev, uint32_t shaper_profile_id)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_shaper_profile_list *spl = &p->soft.tm.h.shaper_profiles;
+ struct tm_shaper_profile *sp;
+
+ TAILQ_FOREACH(sp, spl, node)
+ if (shaper_profile_id == sp->shaper_profile_id)
+ return sp;
+
+ return NULL;
+}
+
+static struct tm_shared_shaper *
+tm_shared_shaper_search(struct rte_eth_dev *dev, uint32_t shared_shaper_id)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_shared_shaper_list *ssl = &p->soft.tm.h.shared_shapers;
+ struct tm_shared_shaper *ss;
+
+ TAILQ_FOREACH(ss, ssl, node)
+ if (shared_shaper_id == ss->shared_shaper_id)
+ return ss;
+
+ return NULL;
+}
+
+static struct tm_wred_profile *
+tm_wred_profile_search(struct rte_eth_dev *dev, uint32_t wred_profile_id)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_wred_profile_list *wpl = &p->soft.tm.h.wred_profiles;
+ struct tm_wred_profile *wp;
+
+ TAILQ_FOREACH(wp, wpl, node)
+ if (wred_profile_id == wp->wred_profile_id)
+ return wp;
+
+ return NULL;
}
static struct tm_node *
@@ -194,6 +321,94 @@ tm_node_search(struct rte_eth_dev *dev, uint32_t node_id)
return NULL;
}
+static struct tm_node *
+tm_root_node_present(struct rte_eth_dev *dev)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_node_list *nl = &p->soft.tm.h.nodes;
+ struct tm_node *n;
+
+ TAILQ_FOREACH(n, nl, node)
+ if (n->parent_node_id == RTE_TM_NODE_ID_NULL)
+ return n;
+
+ return NULL;
+}
+
+static uint32_t
+tm_node_subport_id(struct rte_eth_dev *dev, struct tm_node *subport_node)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_node_list *nl = &p->soft.tm.h.nodes;
+ struct tm_node *ns;
+ uint32_t subport_id;
+
+ subport_id = 0;
+ TAILQ_FOREACH(ns, nl, node) {
+ if (ns->level != TM_NODE_LEVEL_SUBPORT)
+ continue;
+
+ if (ns->node_id == subport_node->node_id)
+ return subport_id;
+
+ subport_id++;
+ }
+
+ return UINT32_MAX;
+}
+
+static uint32_t
+tm_node_pipe_id(struct rte_eth_dev *dev, struct tm_node *pipe_node)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_node_list *nl = &p->soft.tm.h.nodes;
+ struct tm_node *np;
+ uint32_t pipe_id;
+
+ pipe_id = 0;
+ TAILQ_FOREACH(np, nl, node) {
+ if (np->level != TM_NODE_LEVEL_PIPE ||
+ np->parent_node_id != pipe_node->parent_node_id)
+ continue;
+
+ if (np->node_id == pipe_node->node_id)
+ return pipe_id;
+
+ pipe_id++;
+ }
+
+ return UINT32_MAX;
+}
+
+static uint32_t
+tm_node_tc_id(struct rte_eth_dev *dev __rte_unused, struct tm_node *tc_node)
+{
+ return tc_node->priority;
+}
+
+static uint32_t
+tm_node_queue_id(struct rte_eth_dev *dev, struct tm_node *queue_node)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_node_list *nl = &p->soft.tm.h.nodes;
+ struct tm_node *nq;
+ uint32_t queue_id;
+
+ queue_id = 0;
+ TAILQ_FOREACH(nq, nl, node) {
+ if (nq->level != TM_NODE_LEVEL_QUEUE ||
+ nq->parent_node_id != queue_node->parent_node_id)
+ continue;
+
+ if (nq->node_id == queue_node->node_id)
+ return queue_id;
+
+ queue_id++;
+ }
+
+ return UINT32_MAX;
+}
+
static uint32_t
tm_level_get_max_nodes(struct rte_eth_dev *dev, enum tm_node_level level)
{
@@ -219,6 +434,35 @@ tm_level_get_max_nodes(struct rte_eth_dev *dev, enum tm_node_level level)
}
}
+/* Traffic manager node type get */
+static int
+pmd_tm_node_type_get(struct rte_eth_dev *dev,
+ uint32_t node_id,
+ int *is_leaf,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ if (is_leaf == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+
+ if (node_id == RTE_TM_NODE_ID_NULL ||
+ (tm_node_search(dev, node_id) == NULL))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ *is_leaf = node_id < p->params.soft.tm.nb_queues;
+
+ return 0;
+}
+
#ifdef RTE_SCHED_RED
#define WRED_SUPPORTED 1
#else
@@ -674,8 +918,2535 @@ pmd_tm_node_capabilities_get(struct rte_eth_dev *dev __rte_unused,
return 0;
}
-const struct rte_tm_ops pmd_tm_ops = {
- .capabilities_get = pmd_tm_capabilities_get,
- .level_capabilities_get = pmd_tm_level_capabilities_get,
- .node_capabilities_get = pmd_tm_node_capabilities_get,
+static int
+shaper_profile_check(struct rte_eth_dev *dev,
+ uint32_t shaper_profile_id,
+ struct rte_tm_shaper_params *profile,
+ struct rte_tm_error *error)
+{
+ struct tm_shaper_profile *sp;
+
+ /* Shaper profile ID must not be NONE. */
+ if (shaper_profile_id == RTE_TM_SHAPER_PROFILE_ID_NONE)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_SHAPER_PROFILE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Shaper profile must not exist. */
+ sp = tm_shaper_profile_search(dev, shaper_profile_id);
+ if (sp)
+ return -rte_tm_error_set(error,
+ EEXIST,
+ RTE_TM_ERROR_TYPE_SHAPER_PROFILE_ID,
+ NULL,
+ rte_strerror(EEXIST));
+
+ /* Profile must not be NULL. */
+ if (profile == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_SHAPER_PROFILE,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Peak rate: non-zero, 32-bit */
+ if (profile->peak.rate == 0 ||
+ profile->peak.rate >= UINT32_MAX)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_SHAPER_PROFILE_PEAK_RATE,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Peak size: non-zero, 32-bit */
+ if (profile->peak.size == 0 ||
+ profile->peak.size >= UINT32_MAX)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_SHAPER_PROFILE_PEAK_SIZE,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Dual-rate profiles are not supported. */
+ if (profile->committed.rate != 0)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_SHAPER_PROFILE_COMMITTED_RATE,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Packet length adjust: 24 bytes */
+ if (profile->pkt_length_adjust != RTE_TM_ETH_FRAMING_OVERHEAD_FCS)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_SHAPER_PROFILE_PKT_ADJUST_LEN,
+ NULL,
+ rte_strerror(EINVAL));
+
+ return 0;
+}
+
+/* Traffic manager shaper profile add */
+static int
+pmd_tm_shaper_profile_add(struct rte_eth_dev *dev,
+ uint32_t shaper_profile_id,
+ struct rte_tm_shaper_params *profile,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_shaper_profile_list *spl = &p->soft.tm.h.shaper_profiles;
+ struct tm_shaper_profile *sp;
+ int status;
+
+ /* Check input params */
+ status = shaper_profile_check(dev, shaper_profile_id, profile, error);
+ if (status)
+ return status;
+
+ /* Memory allocation */
+ sp = calloc(1, sizeof(struct tm_shaper_profile));
+ if (sp == NULL)
+ return -rte_tm_error_set(error,
+ ENOMEM,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(ENOMEM));
+
+ /* Fill in */
+ sp->shaper_profile_id = shaper_profile_id;
+ memcpy(&sp->params, profile, sizeof(sp->params));
+
+ /* Add to list */
+ TAILQ_INSERT_TAIL(spl, sp, node);
+ p->soft.tm.h.n_shaper_profiles++;
+
+ return 0;
+}
+
+/* Traffic manager shaper profile delete */
+static int
+pmd_tm_shaper_profile_delete(struct rte_eth_dev *dev,
+ uint32_t shaper_profile_id,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_shaper_profile *sp;
+
+ /* Check existing */
+ sp = tm_shaper_profile_search(dev, shaper_profile_id);
+ if (sp == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_SHAPER_PROFILE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Check unused */
+ if (sp->n_users)
+ return -rte_tm_error_set(error,
+ EBUSY,
+ RTE_TM_ERROR_TYPE_SHAPER_PROFILE_ID,
+ NULL,
+ rte_strerror(EBUSY));
+
+ /* Remove from list */
+ TAILQ_REMOVE(&p->soft.tm.h.shaper_profiles, sp, node);
+ p->soft.tm.h.n_shaper_profiles--;
+ free(sp);
+
+ return 0;
+}
+
+static struct tm_node *
+tm_shared_shaper_get_tc(struct rte_eth_dev *dev,
+ struct tm_shared_shaper *ss)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_node_list *nl = &p->soft.tm.h.nodes;
+ struct tm_node *n;
+
+ /* Subport: each TC uses shared shaper */
+ TAILQ_FOREACH(n, nl, node) {
+ if (n->level != TM_NODE_LEVEL_TC ||
+ n->params.n_shared_shapers == 0 ||
+ n->params.shared_shaper_id[0] != ss->shared_shaper_id)
+ continue;
+
+ return n;
+ }
+
+ return NULL;
+}
+
+static int
+update_subport_tc_rate(struct rte_eth_dev *dev,
+ struct tm_node *nt,
+ struct tm_shared_shaper *ss,
+ struct tm_shaper_profile *sp_new)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ uint32_t tc_id = tm_node_tc_id(dev, nt);
+
+ struct tm_node *np = nt->parent_node;
+
+ struct tm_node *ns = np->parent_node;
+ uint32_t subport_id = tm_node_subport_id(dev, ns);
+
+ struct rte_sched_subport_params subport_params;
+
+ struct tm_shaper_profile *sp_old = tm_shaper_profile_search(dev,
+ ss->shaper_profile_id);
+
+ /* Derive new subport configuration. */
+ memcpy(&subport_params,
+ &p->soft.tm.params.subport_params[subport_id],
+ sizeof(subport_params));
+ subport_params.tc_rate[tc_id] = sp_new->params.peak.rate;
+
+ /* Update the subport configuration. */
+ if (rte_sched_subport_config(p->soft.tm.sched,
+ subport_id, &subport_params))
+ return -1;
+
+ /* Commit changes. */
+ sp_old->n_users--;
+
+ ss->shaper_profile_id = sp_new->shaper_profile_id;
+ sp_new->n_users++;
+
+ memcpy(&p->soft.tm.params.subport_params[subport_id],
+ &subport_params,
+ sizeof(subport_params));
+
+ return 0;
+}
+
+/* Traffic manager shared shaper add/update */
+static int
+pmd_tm_shared_shaper_add_update(struct rte_eth_dev *dev,
+ uint32_t shared_shaper_id,
+ uint32_t shaper_profile_id,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_shared_shaper *ss;
+ struct tm_shaper_profile *sp;
+ struct tm_node *nt;
+
+ /* Shaper profile must be valid. */
+ sp = tm_shaper_profile_search(dev, shaper_profile_id);
+ if (sp == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_SHAPER_PROFILE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /**
+ * Add new shared shaper
+ */
+ ss = tm_shared_shaper_search(dev, shared_shaper_id);
+ if (ss == NULL) {
+ struct tm_shared_shaper_list *ssl =
+ &p->soft.tm.h.shared_shapers;
+
+ /* Hierarchy must not be frozen */
+ if (p->soft.tm.hierarchy_frozen)
+ return -rte_tm_error_set(error,
+ EBUSY,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EBUSY));
+
+ /* Memory allocation */
+ ss = calloc(1, sizeof(struct tm_shared_shaper));
+ if (ss == NULL)
+ return -rte_tm_error_set(error,
+ ENOMEM,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(ENOMEM));
+
+ /* Fill in */
+ ss->shared_shaper_id = shared_shaper_id;
+ ss->shaper_profile_id = shaper_profile_id;
+
+ /* Add to list */
+ TAILQ_INSERT_TAIL(ssl, ss, node);
+ p->soft.tm.h.n_shared_shapers++;
+
+ return 0;
+ }
+
+ /**
+ * Update existing shared shaper
+ */
+ /* Hierarchy must be frozen (run-time update) */
+ if (p->soft.tm.hierarchy_frozen == 0)
+ return -rte_tm_error_set(error,
+ EBUSY,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EBUSY));
+
+
+ /* Propagate change. */
+ nt = tm_shared_shaper_get_tc(dev, ss);
+ if (update_subport_tc_rate(dev, nt, ss, sp))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+
+ return 0;
+}
+
+/* Traffic manager shared shaper delete */
+static int
+pmd_tm_shared_shaper_delete(struct rte_eth_dev *dev,
+ uint32_t shared_shaper_id,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_shared_shaper *ss;
+
+ /* Check existing */
+ ss = tm_shared_shaper_search(dev, shared_shaper_id);
+ if (ss == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_SHARED_SHAPER_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Check unused */
+ if (ss->n_users)
+ return -rte_tm_error_set(error,
+ EBUSY,
+ RTE_TM_ERROR_TYPE_SHARED_SHAPER_ID,
+ NULL,
+ rte_strerror(EBUSY));
+
+ /* Remove from list */
+ TAILQ_REMOVE(&p->soft.tm.h.shared_shapers, ss, node);
+ p->soft.tm.h.n_shared_shapers--;
+ free(ss);
+
+ return 0;
+}
+
+static int
+wred_profile_check(struct rte_eth_dev *dev,
+ uint32_t wred_profile_id,
+ struct rte_tm_wred_params *profile,
+ struct rte_tm_error *error)
+{
+ struct tm_wred_profile *wp;
+ enum rte_tm_color color;
+
+ /* WRED profile ID must not be NONE. */
+ if (wred_profile_id == RTE_TM_WRED_PROFILE_ID_NONE)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_WRED_PROFILE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* WRED profile must not exist. */
+ wp = tm_wred_profile_search(dev, wred_profile_id);
+ if (wp)
+ return -rte_tm_error_set(error,
+ EEXIST,
+ RTE_TM_ERROR_TYPE_WRED_PROFILE_ID,
+ NULL,
+ rte_strerror(EEXIST));
+
+ /* Profile must not be NULL. */
+ if (profile == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_WRED_PROFILE,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* min_th <= max_th, max_th > 0 */
+ for (color = RTE_TM_GREEN; color < RTE_TM_COLORS; color++) {
+ uint16_t min_th = profile->red_params[color].min_th;
+ uint16_t max_th = profile->red_params[color].max_th;
+
+ if (min_th > max_th || max_th == 0)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_WRED_PROFILE,
+ NULL,
+ rte_strerror(EINVAL));
+ }
+
+ return 0;
+}
+
+/* Traffic manager WRED profile add */
+static int
+pmd_tm_wred_profile_add(struct rte_eth_dev *dev,
+ uint32_t wred_profile_id,
+ struct rte_tm_wred_params *profile,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_wred_profile_list *wpl = &p->soft.tm.h.wred_profiles;
+ struct tm_wred_profile *wp;
+ int status;
+
+ /* Check input params */
+ status = wred_profile_check(dev, wred_profile_id, profile, error);
+ if (status)
+ return status;
+
+ /* Memory allocation */
+ wp = calloc(1, sizeof(struct tm_wred_profile));
+ if (wp == NULL)
+ return -rte_tm_error_set(error,
+ ENOMEM,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(ENOMEM));
+
+ /* Fill in */
+ wp->wred_profile_id = wred_profile_id;
+ memcpy(&wp->params, profile, sizeof(wp->params));
+
+ /* Add to list */
+ TAILQ_INSERT_TAIL(wpl, wp, node);
+ p->soft.tm.h.n_wred_profiles++;
+
+ return 0;
+}
+
+/* Traffic manager WRED profile delete */
+static int
+pmd_tm_wred_profile_delete(struct rte_eth_dev *dev,
+ uint32_t wred_profile_id,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_wred_profile *wp;
+
+ /* Check existing */
+ wp = tm_wred_profile_search(dev, wred_profile_id);
+ if (wp == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_WRED_PROFILE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Check unused */
+ if (wp->n_users)
+ return -rte_tm_error_set(error,
+ EBUSY,
+ RTE_TM_ERROR_TYPE_WRED_PROFILE_ID,
+ NULL,
+ rte_strerror(EBUSY));
+
+ /* Remove from list */
+ TAILQ_REMOVE(&p->soft.tm.h.wred_profiles, wp, node);
+ p->soft.tm.h.n_wred_profiles--;
+ free(wp);
+
+ return 0;
+}
+
+static int
+node_add_check_port(struct rte_eth_dev *dev,
+ uint32_t node_id,
+ uint32_t parent_node_id __rte_unused,
+ uint32_t priority,
+ uint32_t weight,
+ uint32_t level_id __rte_unused,
+ struct rte_tm_node_params *params,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_shaper_profile *sp = tm_shaper_profile_search(dev,
+ params->shaper_profile_id);
+
+ /* node type: non-leaf */
+ if (node_id < p->params.soft.tm.nb_queues)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Priority must be 0 */
+ if (priority != 0)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PRIORITY,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Weight must be 1 */
+ if (weight != 1)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_WEIGHT,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Shaper must be valid.
+ * Shaper profile peak rate must fit the configured port rate.
+ */
+ if (params->shaper_profile_id == RTE_TM_SHAPER_PROFILE_ID_NONE ||
+ sp == NULL ||
+ sp->params.peak.rate > p->params.soft.tm.rate)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_SHAPER_PROFILE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* No shared shapers */
+ if (params->n_shared_shapers != 0)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_N_SHARED_SHAPERS,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Number of SP priorities must be 1 */
+ if (params->nonleaf.n_sp_priorities != 1)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_N_SP_PRIORITIES,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Stats */
+ if (params->stats_mask & ~STATS_MASK_DEFAULT)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_STATS,
+ NULL,
+ rte_strerror(EINVAL));
+
+ return 0;
+}
+
+static int
+node_add_check_subport(struct rte_eth_dev *dev,
+ uint32_t node_id,
+ uint32_t parent_node_id __rte_unused,
+ uint32_t priority,
+ uint32_t weight,
+ uint32_t level_id __rte_unused,
+ struct rte_tm_node_params *params,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ /* node type: non-leaf */
+ if (node_id < p->params.soft.tm.nb_queues)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Priority must be 0 */
+ if (priority != 0)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PRIORITY,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Weight must be 1 */
+ if (weight != 1)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_WEIGHT,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Shaper must be valid */
+ if (params->shaper_profile_id == RTE_TM_SHAPER_PROFILE_ID_NONE ||
+ (!tm_shaper_profile_search(dev, params->shaper_profile_id)))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_SHAPER_PROFILE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* No shared shapers */
+ if (params->n_shared_shapers != 0)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_N_SHARED_SHAPERS,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Number of SP priorities must be 1 */
+ if (params->nonleaf.n_sp_priorities != 1)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_N_SP_PRIORITIES,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Stats */
+ if (params->stats_mask & ~STATS_MASK_DEFAULT)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_STATS,
+ NULL,
+ rte_strerror(EINVAL));
+
+ return 0;
+}
+
+static int
+node_add_check_pipe(struct rte_eth_dev *dev,
+ uint32_t node_id,
+ uint32_t parent_node_id __rte_unused,
+ uint32_t priority,
+ uint32_t weight __rte_unused,
+ uint32_t level_id __rte_unused,
+ struct rte_tm_node_params *params,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ /* node type: non-leaf */
+ if (node_id < p->params.soft.tm.nb_queues)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Priority must be 0 */
+ if (priority != 0)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PRIORITY,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Shaper must be valid */
+ if (params->shaper_profile_id == RTE_TM_SHAPER_PROFILE_ID_NONE ||
+ (!tm_shaper_profile_search(dev, params->shaper_profile_id)))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_SHAPER_PROFILE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* No shared shapers */
+ if (params->n_shared_shapers != 0)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_N_SHARED_SHAPERS,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Number of SP priorities must be 4 */
+ if (params->nonleaf.n_sp_priorities !=
+ RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_N_SP_PRIORITIES,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* WFQ mode must be byte mode */
+ if (params->nonleaf.wfq_weight_mode != NULL &&
+ params->nonleaf.wfq_weight_mode[0] != 0 &&
+ params->nonleaf.wfq_weight_mode[1] != 0 &&
+ params->nonleaf.wfq_weight_mode[2] != 0 &&
+ params->nonleaf.wfq_weight_mode[3] != 0)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_WFQ_WEIGHT_MODE,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Stats */
+ if (params->stats_mask & ~STATS_MASK_DEFAULT)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_STATS,
+ NULL,
+ rte_strerror(EINVAL));
+
+ return 0;
+}
+
+static int
+node_add_check_tc(struct rte_eth_dev *dev,
+ uint32_t node_id,
+ uint32_t parent_node_id __rte_unused,
+ uint32_t priority __rte_unused,
+ uint32_t weight,
+ uint32_t level_id __rte_unused,
+ struct rte_tm_node_params *params,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ /* node type: non-leaf */
+ if (node_id < p->params.soft.tm.nb_queues)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Weight must be 1 */
+ if (weight != 1)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_WEIGHT,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Shaper must be valid */
+ if (params->shaper_profile_id == RTE_TM_SHAPER_PROFILE_ID_NONE ||
+ (!tm_shaper_profile_search(dev, params->shaper_profile_id)))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_SHAPER_PROFILE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Single valid shared shaper */
+ if (params->n_shared_shapers > 1)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_N_SHARED_SHAPERS,
+ NULL,
+ rte_strerror(EINVAL));
+
+ if (params->n_shared_shapers == 1 &&
+ (params->shared_shaper_id == NULL ||
+ (!tm_shared_shaper_search(dev, params->shared_shaper_id[0]))))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_SHARED_SHAPER_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Number of priorities must be 1 */
+ if (params->nonleaf.n_sp_priorities != 1)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_N_SP_PRIORITIES,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Stats */
+ if (params->stats_mask & ~STATS_MASK_DEFAULT)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_STATS,
+ NULL,
+ rte_strerror(EINVAL));
+
+ return 0;
+}
+
+static int
+node_add_check_queue(struct rte_eth_dev *dev,
+ uint32_t node_id,
+ uint32_t parent_node_id __rte_unused,
+ uint32_t priority,
+ uint32_t weight __rte_unused,
+ uint32_t level_id __rte_unused,
+ struct rte_tm_node_params *params,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ /* node type: leaf */
+ if (node_id >= p->params.soft.tm.nb_queues)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Priority must be 0 */
+ if (priority != 0)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PRIORITY,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* No shaper */
+ if (params->shaper_profile_id != RTE_TM_SHAPER_PROFILE_ID_NONE)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_SHAPER_PROFILE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* No shared shapers */
+ if (params->n_shared_shapers != 0)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_N_SHARED_SHAPERS,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Congestion management must not be head drop */
+ if (params->leaf.cman == RTE_TM_CMAN_HEAD_DROP)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_CMAN,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Congestion management set to WRED */
+ if (params->leaf.cman == RTE_TM_CMAN_WRED) {
+ uint32_t wred_profile_id = params->leaf.wred.wred_profile_id;
+ struct tm_wred_profile *wp = tm_wred_profile_search(dev,
+ wred_profile_id);
+
+ /* WRED profile (for private WRED context) must be valid */
+ if (wred_profile_id == RTE_TM_WRED_PROFILE_ID_NONE ||
+ wp == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_WRED_PROFILE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* No shared WRED contexts */
+ if (params->leaf.wred.n_shared_wred_contexts != 0)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_N_SHARED_WRED_CONTEXTS,
+ NULL,
+ rte_strerror(EINVAL));
+ }
+
+ /* Stats */
+ if (params->stats_mask & ~STATS_MASK_QUEUE)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_STATS,
+ NULL,
+ rte_strerror(EINVAL));
+
+ return 0;
+}
+
+static int
+node_add_check(struct rte_eth_dev *dev,
+ uint32_t node_id,
+ uint32_t parent_node_id,
+ uint32_t priority,
+ uint32_t weight,
+ uint32_t level_id,
+ struct rte_tm_node_params *params,
+ struct rte_tm_error *error)
+{
+ struct tm_node *pn;
+ uint32_t level;
+ int status;
+
+ /* node_id, parent_node_id:
+ * -node_id must not be RTE_TM_NODE_ID_NULL
+ * -node_id must not be in use
+ * -root node add (parent_node_id is RTE_TM_NODE_ID_NULL):
+ * -root node must not exist
+ * -non-root node add (parent_node_id is not RTE_TM_NODE_ID_NULL):
+ * -parent_node_id must be valid
+ */
+ if (node_id == RTE_TM_NODE_ID_NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ if (tm_node_search(dev, node_id))
+ return -rte_tm_error_set(error,
+ EEXIST,
+ RTE_TM_ERROR_TYPE_NODE_ID,
+ NULL,
+ rte_strerror(EEXIST));
+
+ if (parent_node_id == RTE_TM_NODE_ID_NULL) {
+ pn = NULL;
+ if (tm_root_node_present(dev))
+ return -rte_tm_error_set(error,
+ EEXIST,
+ RTE_TM_ERROR_TYPE_NODE_PARENT_NODE_ID,
+ NULL,
+ rte_strerror(EEXIST));
+ } else {
+ pn = tm_node_search(dev, parent_node_id);
+ if (pn == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARENT_NODE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+ }
+
+ /* priority: must be 0 .. 3 */
+ if (priority >= RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PRIORITY,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* weight: must be 1 .. 255 */
+ if (weight == 0 || weight >= UINT8_MAX)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_WEIGHT,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* level_id: if valid, then
+ * -root node add (parent_node_id is RTE_TM_NODE_ID_NULL):
+ * -level_id must be zero
+ * -non-root node add (parent_node_id is not RTE_TM_NODE_ID_NULL):
+ * -level_id must be parent level ID plus one
+ */
+ level = (pn == NULL) ? 0 : pn->level + 1;
+ if (level_id != RTE_TM_NODE_LEVEL_ID_ANY && level_id != level)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_LEVEL_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* params: must not be NULL */
+ if (params == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* params: per level checks */
+ switch (level) {
+ case TM_NODE_LEVEL_PORT:
+ status = node_add_check_port(dev, node_id,
+ parent_node_id, priority, weight, level_id,
+ params, error);
+ if (status)
+ return status;
+ break;
+
+ case TM_NODE_LEVEL_SUBPORT:
+ status = node_add_check_subport(dev, node_id,
+ parent_node_id, priority, weight, level_id,
+ params, error);
+ if (status)
+ return status;
+ break;
+
+ case TM_NODE_LEVEL_PIPE:
+ status = node_add_check_pipe(dev, node_id,
+ parent_node_id, priority, weight, level_id,
+ params, error);
+ if (status)
+ return status;
+ break;
+
+ case TM_NODE_LEVEL_TC:
+ status = node_add_check_tc(dev, node_id,
+ parent_node_id, priority, weight, level_id,
+ params, error);
+ if (status)
+ return status;
+ break;
+
+ case TM_NODE_LEVEL_QUEUE:
+ status = node_add_check_queue(dev, node_id,
+ parent_node_id, priority, weight, level_id,
+ params, error);
+ if (status)
+ return status;
+ break;
+
+ default:
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_LEVEL_ID,
+ NULL,
+ rte_strerror(EINVAL));
+ }
+
+ return 0;
+}
+
+/* Traffic manager node add */
+static int
+pmd_tm_node_add(struct rte_eth_dev *dev,
+ uint32_t node_id,
+ uint32_t parent_node_id,
+ uint32_t priority,
+ uint32_t weight,
+ uint32_t level_id,
+ struct rte_tm_node_params *params,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_node_list *nl = &p->soft.tm.h.nodes;
+ struct tm_node *n;
+ uint32_t i;
+ int status;
+
+ /* Checks */
+ if (p->soft.tm.hierarchy_frozen)
+ return -rte_tm_error_set(error,
+ EBUSY,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EBUSY));
+
+ status = node_add_check(dev, node_id, parent_node_id, priority, weight,
+ level_id, params, error);
+ if (status)
+ return status;
+
+ /* Memory allocation */
+ n = calloc(1, sizeof(struct tm_node));
+ if (n == NULL)
+ return -rte_tm_error_set(error,
+ ENOMEM,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(ENOMEM));
+
+ /* Fill in */
+ n->node_id = node_id;
+ n->parent_node_id = parent_node_id;
+ n->priority = priority;
+ n->weight = weight;
+
+ if (parent_node_id != RTE_TM_NODE_ID_NULL) {
+ n->parent_node = tm_node_search(dev, parent_node_id);
+ n->level = n->parent_node->level + 1;
+ }
+
+ if (params->shaper_profile_id != RTE_TM_SHAPER_PROFILE_ID_NONE)
+ n->shaper_profile = tm_shaper_profile_search(dev,
+ params->shaper_profile_id);
+
+ if (n->level == TM_NODE_LEVEL_QUEUE &&
+ params->leaf.cman == RTE_TM_CMAN_WRED)
+ n->wred_profile = tm_wred_profile_search(dev,
+ params->leaf.wred.wred_profile_id);
+
+ memcpy(&n->params, params, sizeof(n->params));
+
+ /* Add to list */
+ TAILQ_INSERT_TAIL(nl, n, node);
+ p->soft.tm.h.n_nodes++;
+
+ /* Update dependencies */
+ if (n->parent_node)
+ n->parent_node->n_children++;
+
+ if (n->shaper_profile)
+ n->shaper_profile->n_users++;
+
+ for (i = 0; i < params->n_shared_shapers; i++) {
+ struct tm_shared_shaper *ss;
+
+ ss = tm_shared_shaper_search(dev, params->shared_shaper_id[i]);
+ ss->n_users++;
+ }
+
+ if (n->wred_profile)
+ n->wred_profile->n_users++;
+
+ p->soft.tm.h.n_tm_nodes[n->level]++;
+
+ return 0;
+}
+
+/* Traffic manager node delete */
+static int
+pmd_tm_node_delete(struct rte_eth_dev *dev,
+ uint32_t node_id,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_node *n;
+ uint32_t i;
+
+ /* Check hierarchy changes are currently allowed */
+ if (p->soft.tm.hierarchy_frozen)
+ return -rte_tm_error_set(error,
+ EBUSY,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EBUSY));
+
+ /* Check existing */
+ n = tm_node_search(dev, node_id);
+ if (n == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Check unused */
+ if (n->n_children)
+ return -rte_tm_error_set(error,
+ EBUSY,
+ RTE_TM_ERROR_TYPE_NODE_ID,
+ NULL,
+ rte_strerror(EBUSY));
+
+ /* Update dependencies */
+ p->soft.tm.h.n_tm_nodes[n->level]--;
+
+ if (n->wred_profile)
+ n->wred_profile->n_users--;
+
+ for (i = 0; i < n->params.n_shared_shapers; i++) {
+ struct tm_shared_shaper *ss;
+
+ ss = tm_shared_shaper_search(dev,
+ n->params.shared_shaper_id[i]);
+ ss->n_users--;
+ }
+
+ if (n->shaper_profile)
+ n->shaper_profile->n_users--;
+
+ if (n->parent_node)
+ n->parent_node->n_children--;
+
+ /* Remove from list */
+ TAILQ_REMOVE(&p->soft.tm.h.nodes, n, node);
+ p->soft.tm.h.n_nodes--;
+ free(n);
+
+ return 0;
+}
+
+
+static void
+pipe_profile_build(struct rte_eth_dev *dev,
+ struct tm_node *np,
+ struct rte_sched_pipe_params *pp)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_hierarchy *h = &p->soft.tm.h;
+ struct tm_node_list *nl = &h->nodes;
+ struct tm_node *nt, *nq;
+
+ memset(pp, 0, sizeof(*pp));
+
+ /* Pipe */
+ pp->tb_rate = np->shaper_profile->params.peak.rate;
+ pp->tb_size = np->shaper_profile->params.peak.size;
+
+ /* Traffic Class (TC) */
+ pp->tc_period = PIPE_TC_PERIOD;
+
+#ifdef RTE_SCHED_SUBPORT_TC_OV
+ pp->tc_ov_weight = np->weight;
+#endif
+
+ TAILQ_FOREACH(nt, nl, node) {
+ uint32_t queue_id = 0;
+
+ if (nt->level != TM_NODE_LEVEL_TC ||
+ nt->parent_node_id != np->node_id)
+ continue;
+
+ pp->tc_rate[nt->priority] =
+ nt->shaper_profile->params.peak.rate;
+
+ /* Queue */
+ TAILQ_FOREACH(nq, nl, node) {
+ uint32_t pipe_queue_id;
+
+ if (nq->level != TM_NODE_LEVEL_QUEUE ||
+ nq->parent_node_id != nt->node_id)
+ continue;
+
+ pipe_queue_id = nt->priority *
+ RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS + queue_id;
+ pp->wrr_weights[pipe_queue_id] = nq->weight;
+
+ queue_id++;
+ }
+ }
+}
+
+static int
+pipe_profile_free_exists(struct rte_eth_dev *dev,
+ uint32_t *pipe_profile_id)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_params *t = &p->soft.tm.params;
+
+ if (t->n_pipe_profiles < RTE_SCHED_PIPE_PROFILES_PER_PORT) {
+ *pipe_profile_id = t->n_pipe_profiles;
+ return 1;
+ }
+
+ return 0;
+}
+
+static int
+pipe_profile_exists(struct rte_eth_dev *dev,
+ struct rte_sched_pipe_params *pp,
+ uint32_t *pipe_profile_id)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_params *t = &p->soft.tm.params;
+ uint32_t i;
+
+ for (i = 0; i < t->n_pipe_profiles; i++)
+ if (memcmp(&t->pipe_profiles[i], pp, sizeof(*pp)) == 0) {
+ if (pipe_profile_id)
+ *pipe_profile_id = i;
+ return 1;
+ }
+
+ return 0;
+}
+
+static void
+pipe_profile_install(struct rte_eth_dev *dev,
+ struct rte_sched_pipe_params *pp,
+ uint32_t pipe_profile_id)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_params *t = &p->soft.tm.params;
+
+ memcpy(&t->pipe_profiles[pipe_profile_id], pp, sizeof(*pp));
+ t->n_pipe_profiles++;
+}
+
+static void
+pipe_profile_mark(struct rte_eth_dev *dev,
+ uint32_t subport_id,
+ uint32_t pipe_id,
+ uint32_t pipe_profile_id)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_hierarchy *h = &p->soft.tm.h;
+ struct tm_params *t = &p->soft.tm.params;
+ uint32_t n_pipes_per_subport, pos;
+
+ n_pipes_per_subport = h->n_tm_nodes[TM_NODE_LEVEL_PIPE] /
+ h->n_tm_nodes[TM_NODE_LEVEL_SUBPORT];
+ pos = subport_id * n_pipes_per_subport + pipe_id;
+
+ t->pipe_to_profile[pos] = pipe_profile_id;
+}
+
+static struct rte_sched_pipe_params *
+pipe_profile_get(struct rte_eth_dev *dev, struct tm_node *np)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_hierarchy *h = &p->soft.tm.h;
+ struct tm_params *t = &p->soft.tm.params;
+ uint32_t n_pipes_per_subport = h->n_tm_nodes[TM_NODE_LEVEL_PIPE] /
+ h->n_tm_nodes[TM_NODE_LEVEL_SUBPORT];
+
+ uint32_t subport_id = tm_node_subport_id(dev, np->parent_node);
+ uint32_t pipe_id = tm_node_pipe_id(dev, np);
+
+ uint32_t pos = subport_id * n_pipes_per_subport + pipe_id;
+ uint32_t pipe_profile_id = t->pipe_to_profile[pos];
+
+ return &t->pipe_profiles[pipe_profile_id];
+}
+
+static int
+pipe_profiles_generate(struct rte_eth_dev *dev)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_hierarchy *h = &p->soft.tm.h;
+ struct tm_node_list *nl = &h->nodes;
+ struct tm_node *ns, *np;
+ uint32_t subport_id;
+
+ /* Objective: Fill in the following fields in struct tm_params:
+ * - pipe_profiles
+ * - n_pipe_profiles
+ * - pipe_to_profile
+ */
+
+ subport_id = 0;
+ TAILQ_FOREACH(ns, nl, node) {
+ uint32_t pipe_id;
+
+ if (ns->level != TM_NODE_LEVEL_SUBPORT)
+ continue;
+
+ pipe_id = 0;
+ TAILQ_FOREACH(np, nl, node) {
+ struct rte_sched_pipe_params pp;
+ uint32_t pos;
+
+ if (np->level != TM_NODE_LEVEL_PIPE ||
+ np->parent_node_id != ns->node_id)
+ continue;
+
+ pipe_profile_build(dev, np, &pp);
+
+ if (!pipe_profile_exists(dev, &pp, &pos)) {
+ if (!pipe_profile_free_exists(dev, &pos))
+ return -1;
+
+ pipe_profile_install(dev, &pp, pos);
+ }
+
+ pipe_profile_mark(dev, subport_id, pipe_id, pos);
+
+ pipe_id++;
+ }
+
+ subport_id++;
+ }
+
+ return 0;
+}
+
+static struct tm_wred_profile *
+tm_tc_wred_profile_get(struct rte_eth_dev *dev, uint32_t tc_id)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_hierarchy *h = &p->soft.tm.h;
+ struct tm_node_list *nl = &h->nodes;
+ struct tm_node *nq;
+
+ TAILQ_FOREACH(nq, nl, node) {
+ if (nq->level != TM_NODE_LEVEL_QUEUE ||
+ nq->parent_node->priority != tc_id)
+ continue;
+
+ return nq->wred_profile;
+ }
+
+ return NULL;
+}
+
+#ifdef RTE_SCHED_RED
+
+static void
+wred_profiles_set(struct rte_eth_dev *dev)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct rte_sched_port_params *pp = &p->soft.tm.params.port_params;
+ uint32_t tc_id;
+ enum rte_tm_color color;
+
+ for (tc_id = 0; tc_id < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; tc_id++)
+ for (color = RTE_TM_GREEN; color < RTE_TM_COLORS; color++) {
+ struct rte_red_params *dst =
+ &pp->red_params[tc_id][color];
+ struct tm_wred_profile *src_wp =
+ tm_tc_wred_profile_get(dev, tc_id);
+ struct rte_tm_red_params *src =
+ &src_wp->params.red_params[color];
+
+ memcpy(dst, src, sizeof(*dst));
+ }
+}
+
+#else
+
+#define wred_profiles_set(dev)
+
+#endif
+
+static struct tm_shared_shaper *
+tm_tc_shared_shaper_get(struct rte_eth_dev *dev, struct tm_node *tc_node)
+{
+ return (tc_node->params.n_shared_shapers) ?
+ tm_shared_shaper_search(dev,
+ tc_node->params.shared_shaper_id[0]) :
+ NULL;
+}
+
+static struct tm_shared_shaper *
+tm_subport_tc_shared_shaper_get(struct rte_eth_dev *dev,
+ struct tm_node *subport_node,
+ uint32_t tc_id)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_node_list *nl = &p->soft.tm.h.nodes;
+ struct tm_node *n;
+
+ TAILQ_FOREACH(n, nl, node) {
+ if (n->level != TM_NODE_LEVEL_TC ||
+ n->parent_node->parent_node_id !=
+ subport_node->node_id ||
+ n->priority != tc_id)
+ continue;
+
+ return tm_tc_shared_shaper_get(dev, n);
+ }
+
+ return NULL;
+}
+
+static int
+hierarchy_commit_check(struct rte_eth_dev *dev, struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_hierarchy *h = &p->soft.tm.h;
+ struct tm_node_list *nl = &h->nodes;
+ struct tm_shared_shaper_list *ssl = &h->shared_shapers;
+ struct tm_wred_profile_list *wpl = &h->wred_profiles;
+ struct tm_node *nr = tm_root_node_present(dev), *ns, *np, *nt, *nq;
+ struct tm_shared_shaper *ss;
+
+ uint32_t n_pipes_per_subport;
+
+ /* Root node exists. */
+ if (nr == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_LEVEL_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* There is at least one subport, max is not exceeded. */
+ if (nr->n_children == 0 || nr->n_children > TM_MAX_SUBPORTS)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_LEVEL_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* There is at least one pipe. */
+ if (h->n_tm_nodes[TM_NODE_LEVEL_PIPE] == 0)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_LEVEL_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Number of pipes is the same for all subports. Maximum number of pipes
+ * per subport is not exceeded.
+ */
+ n_pipes_per_subport = h->n_tm_nodes[TM_NODE_LEVEL_PIPE] /
+ h->n_tm_nodes[TM_NODE_LEVEL_SUBPORT];
+
+ if (n_pipes_per_subport > TM_MAX_PIPES_PER_SUBPORT)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+
+ TAILQ_FOREACH(ns, nl, node) {
+ if (ns->level != TM_NODE_LEVEL_SUBPORT)
+ continue;
+
+ if (ns->n_children != n_pipes_per_subport)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ }
+
+ /* Each pipe has exactly 4 TCs, with exactly one TC for each priority */
+ TAILQ_FOREACH(np, nl, node) {
+ uint32_t mask = 0, mask_expected =
+ RTE_LEN2MASK(RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE,
+ uint32_t);
+
+ if (np->level != TM_NODE_LEVEL_PIPE)
+ continue;
+
+ if (np->n_children != RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+
+ TAILQ_FOREACH(nt, nl, node) {
+ if (nt->level != TM_NODE_LEVEL_TC ||
+ nt->parent_node_id != np->node_id)
+ continue;
+
+ mask |= 1 << nt->priority;
+ }
+
+ if (mask != mask_expected)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ }
+
+ /* Each TC has exactly 4 packet queues. */
+ TAILQ_FOREACH(nt, nl, node) {
+ if (nt->level != TM_NODE_LEVEL_TC)
+ continue;
+
+ if (nt->n_children != RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ }
+
+ /**
+ * Shared shapers:
+ * -For each TC #i, all pipes in the same subport use the same
+ * shared shaper (or no shared shaper) for their TC#i.
+ * -Each shared shaper needs to have at least one user. All its
+ * users have to be TC nodes with the same priority and the same
+ * subport.
+ */
+ TAILQ_FOREACH(ns, nl, node) {
+ struct tm_shared_shaper *s[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
+ uint32_t id;
+
+ if (ns->level != TM_NODE_LEVEL_SUBPORT)
+ continue;
+
+ for (id = 0; id < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; id++)
+ s[id] = tm_subport_tc_shared_shaper_get(dev, ns, id);
+
+ TAILQ_FOREACH(nt, nl, node) {
+ struct tm_shared_shaper *subport_ss, *tc_ss;
+
+ if (nt->level != TM_NODE_LEVEL_TC ||
+ nt->parent_node->parent_node_id !=
+ ns->node_id)
+ continue;
+
+ subport_ss = s[nt->priority];
+ tc_ss = tm_tc_shared_shaper_get(dev, nt);
+
+ if (subport_ss == NULL && tc_ss == NULL)
+ continue;
+
+ if ((subport_ss == NULL && tc_ss != NULL) ||
+ (subport_ss != NULL && tc_ss == NULL) ||
+ subport_ss->shared_shaper_id !=
+ tc_ss->shared_shaper_id)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ }
+ }
+
+ TAILQ_FOREACH(ss, ssl, node) {
+ struct tm_node *nt_any = tm_shared_shaper_get_tc(dev, ss);
+ uint32_t n_users = 0;
+
+ if (nt_any != NULL)
+ TAILQ_FOREACH(nt, nl, node) {
+ if (nt->level != TM_NODE_LEVEL_TC ||
+ nt->priority != nt_any->priority ||
+ nt->parent_node->parent_node_id !=
+ nt_any->parent_node->parent_node_id)
+ continue;
+
+ n_users++;
+ }
+
+ if (ss->n_users == 0 || ss->n_users != n_users)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ }
+
+ /* Not too many pipe profiles. */
+ if (pipe_profiles_generate(dev))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /**
+ * WRED (when used, i.e. at least one WRED profile defined):
+ * -Each WRED profile must have at least one user.
+ * -All leaf nodes must have their private WRED context enabled.
+ * -For each TC #i, all leaf nodes must use the same WRED profile
+ * for their private WRED context.
+ */
+ if (h->n_wred_profiles) {
+ struct tm_wred_profile *wp;
+ struct tm_wred_profile *w[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
+ uint32_t id;
+
+ TAILQ_FOREACH(wp, wpl, node)
+ if (wp->n_users == 0)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+
+ for (id = 0; id < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; id++) {
+ w[id] = tm_tc_wred_profile_get(dev, id);
+
+ if (w[id] == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ }
+
+ TAILQ_FOREACH(nq, nl, node) {
+ uint32_t id;
+
+ if (nq->level != TM_NODE_LEVEL_QUEUE)
+ continue;
+
+ id = nq->parent_node->priority;
+
+ if (nq->wred_profile == NULL ||
+ nq->wred_profile->wred_profile_id !=
+ w[id]->wred_profile_id)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ }
+ }
+
+ return 0;
+}
+
+static void
+hierarchy_blueprints_create(struct rte_eth_dev *dev)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_params *t = &p->soft.tm.params;
+ struct tm_hierarchy *h = &p->soft.tm.h;
+
+ struct tm_node_list *nl = &h->nodes;
+ struct tm_node *root = tm_root_node_present(dev), *n;
+
+ uint32_t subport_id;
+
+ t->port_params = (struct rte_sched_port_params) {
+ .name = dev->data->name,
+ .socket = dev->data->numa_node,
+ .rate = root->shaper_profile->params.peak.rate,
+ .mtu = dev->data->mtu,
+ .frame_overhead =
+ root->shaper_profile->params.pkt_length_adjust,
+ .n_subports_per_port = root->n_children,
+ .n_pipes_per_subport = h->n_tm_nodes[TM_NODE_LEVEL_PIPE] /
+ h->n_tm_nodes[TM_NODE_LEVEL_SUBPORT],
+ .qsize = {p->params.soft.tm.qsize[0],
+ p->params.soft.tm.qsize[1],
+ p->params.soft.tm.qsize[2],
+ p->params.soft.tm.qsize[3],
+ },
+ .pipe_profiles = t->pipe_profiles,
+ .n_pipe_profiles = t->n_pipe_profiles,
+ };
+
+ wred_profiles_set(dev);
+
+ subport_id = 0;
+ TAILQ_FOREACH(n, nl, node) {
+ uint64_t tc_rate[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
+ uint32_t i;
+
+ if (n->level != TM_NODE_LEVEL_SUBPORT)
+ continue;
+
+ for (i = 0; i < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; i++) {
+ struct tm_shared_shaper *ss;
+ struct tm_shaper_profile *sp;
+
+ ss = tm_subport_tc_shared_shaper_get(dev, n, i);
+ sp = (ss) ? tm_shaper_profile_search(dev,
+ ss->shaper_profile_id) :
+ n->shaper_profile;
+ tc_rate[i] = sp->params.peak.rate;
+ }
+
+ t->subport_params[subport_id] =
+ (struct rte_sched_subport_params) {
+ .tb_rate = n->shaper_profile->params.peak.rate,
+ .tb_size = n->shaper_profile->params.peak.size,
+
+ .tc_rate = {tc_rate[0],
+ tc_rate[1],
+ tc_rate[2],
+ tc_rate[3],
+ },
+ .tc_period = SUBPORT_TC_PERIOD,
+ };
+
+ subport_id++;
+ }
+}
+
+/* Traffic manager hierarchy commit */
+static int
+pmd_tm_hierarchy_commit(struct rte_eth_dev *dev,
+ int clear_on_fail,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ int status;
+
+ /* Checks */
+ if (p->soft.tm.hierarchy_frozen)
+ return -rte_tm_error_set(error,
+ EBUSY,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EBUSY));
+
+ status = hierarchy_commit_check(dev, error);
+ if (status) {
+ if (clear_on_fail) {
+ tm_hierarchy_uninit(p);
+ tm_hierarchy_init(p);
+ }
+
+ return status;
+ }
+
+ /* Create blueprints */
+ hierarchy_blueprints_create(dev);
+
+ /* Freeze hierarchy */
+ p->soft.tm.hierarchy_frozen = 1;
+
+ return 0;
+}
+
+#ifdef RTE_SCHED_SUBPORT_TC_OV
+
+static int
+update_pipe_weight(struct rte_eth_dev *dev, struct tm_node *np, uint32_t weight)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ uint32_t pipe_id = tm_node_pipe_id(dev, np);
+
+ struct tm_node *ns = np->parent_node;
+ uint32_t subport_id = tm_node_subport_id(dev, ns);
+
+ struct rte_sched_pipe_params *profile0 = pipe_profile_get(dev, np);
+ struct rte_sched_pipe_params profile1;
+ uint32_t pipe_profile_id;
+
+ /* Derive new pipe profile. */
+ memcpy(&profile1, profile0, sizeof(profile1));
+ profile1.tc_ov_weight = (uint8_t)weight;
+
+ /* Since implementation does not allow adding more pipe profiles after
+ * port configuration, the pipe configuration can be successfully
+ * updated only if the new profile is also part of the existing set of
+ * pipe profiles.
+ */
+ if (pipe_profile_exists(dev, &profile1, &pipe_profile_id) == 0)
+ return -1;
+
+ /* Update the pipe profile used by the current pipe. */
+ if (rte_sched_pipe_config(p->soft.tm.sched, subport_id, pipe_id,
+ (int32_t)pipe_profile_id))
+ return -1;
+
+ /* Commit changes. */
+ pipe_profile_mark(dev, subport_id, pipe_id, pipe_profile_id);
+ np->weight = weight;
+
+ return 0;
+}
+
+#endif
+
+static int
+update_queue_weight(struct rte_eth_dev *dev,
+ struct tm_node *nq, uint32_t weight)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ uint32_t queue_id = tm_node_queue_id(dev, nq);
+
+ struct tm_node *nt = nq->parent_node;
+ uint32_t tc_id = tm_node_tc_id(dev, nt);
+
+ struct tm_node *np = nt->parent_node;
+ uint32_t pipe_id = tm_node_pipe_id(dev, np);
+
+ struct tm_node *ns = np->parent_node;
+ uint32_t subport_id = tm_node_subport_id(dev, ns);
+
+ uint32_t pipe_queue_id =
+ tc_id * RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS + queue_id;
+
+ struct rte_sched_pipe_params *profile0 = pipe_profile_get(dev, np);
+ struct rte_sched_pipe_params profile1;
+ uint32_t pipe_profile_id;
+
+ /* Derive new pipe profile. */
+ memcpy(&profile1, profile0, sizeof(profile1));
+ profile1.wrr_weights[pipe_queue_id] = (uint8_t)weight;
+
+ /* Since implementation does not allow adding more pipe profiles after
+ * port configuration, the pipe configuration can be successfully
+ * updated only if the new profile is also part of the existing set
+ * of pipe profiles.
+ */
+ if (pipe_profile_exists(dev, &profile1, &pipe_profile_id) == 0)
+ return -1;
+
+ /* Update the pipe profile used by the current pipe. */
+ if (rte_sched_pipe_config(p->soft.tm.sched, subport_id, pipe_id,
+ (int32_t)pipe_profile_id))
+ return -1;
+
+ /* Commit changes. */
+ pipe_profile_mark(dev, subport_id, pipe_id, pipe_profile_id);
+ nq->weight = weight;
+
+ return 0;
+}
+
+/* Traffic manager node parent update */
+static int
+pmd_tm_node_parent_update(struct rte_eth_dev *dev,
+ uint32_t node_id,
+ uint32_t parent_node_id,
+ uint32_t priority,
+ uint32_t weight,
+ struct rte_tm_error *error)
+{
+ struct tm_node *n;
+
+ /* Port must be started and TM used. */
+ if (dev->data->dev_started == 0 && (tm_used(dev) == 0))
+ return -rte_tm_error_set(error,
+ EBUSY,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EBUSY));
+
+ /* Node must be valid */
+ n = tm_node_search(dev, node_id);
+ if (n == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Parent node must be the same */
+ if (n->parent_node_id != parent_node_id)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARENT_NODE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Priority must be the same */
+ if (n->priority != priority)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PRIORITY,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* weight: must be 1 .. 255 */
+ if (weight == 0 || weight >= UINT8_MAX)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_WEIGHT,
+ NULL,
+ rte_strerror(EINVAL));
+
+ switch (n->level) {
+ case TM_NODE_LEVEL_PORT:
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_WEIGHT,
+ NULL,
+ rte_strerror(EINVAL));
+ /* fall-through */
+ case TM_NODE_LEVEL_SUBPORT:
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_WEIGHT,
+ NULL,
+ rte_strerror(EINVAL));
+ /* fall-through */
+ case TM_NODE_LEVEL_PIPE:
+#ifdef RTE_SCHED_SUBPORT_TC_OV
+ if (update_pipe_weight(dev, n, weight))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ return 0;
+#else
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_WEIGHT,
+ NULL,
+ rte_strerror(EINVAL));
+#endif
+ /* fall-through */
+ case TM_NODE_LEVEL_TC:
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_WEIGHT,
+ NULL,
+ rte_strerror(EINVAL));
+ /* fall-through */
+ case TM_NODE_LEVEL_QUEUE:
+ /* fall-through */
+ default:
+ if (update_queue_weight(dev, n, weight))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ return 0;
+ }
+}
+
+static int
+update_subport_rate(struct rte_eth_dev *dev,
+ struct tm_node *ns,
+ struct tm_shaper_profile *sp)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ uint32_t subport_id = tm_node_subport_id(dev, ns);
+
+ struct rte_sched_subport_params subport_params;
+
+ /* Derive new subport configuration. */
+ memcpy(&subport_params,
+ &p->soft.tm.params.subport_params[subport_id],
+ sizeof(subport_params));
+ subport_params.tb_rate = sp->params.peak.rate;
+ subport_params.tb_size = sp->params.peak.size;
+
+ /* Update the subport configuration. */
+ if (rte_sched_subport_config(p->soft.tm.sched, subport_id,
+ &subport_params))
+ return -1;
+
+ /* Commit changes. */
+ ns->shaper_profile->n_users--;
+
+ ns->shaper_profile = sp;
+ ns->params.shaper_profile_id = sp->shaper_profile_id;
+ sp->n_users++;
+
+ memcpy(&p->soft.tm.params.subport_params[subport_id],
+ &subport_params,
+ sizeof(subport_params));
+
+ return 0;
+}
+
+static int
+update_pipe_rate(struct rte_eth_dev *dev,
+ struct tm_node *np,
+ struct tm_shaper_profile *sp)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ uint32_t pipe_id = tm_node_pipe_id(dev, np);
+
+ struct tm_node *ns = np->parent_node;
+ uint32_t subport_id = tm_node_subport_id(dev, ns);
+
+ struct rte_sched_pipe_params *profile0 = pipe_profile_get(dev, np);
+ struct rte_sched_pipe_params profile1;
+ uint32_t pipe_profile_id;
+
+ /* Derive new pipe profile. */
+ memcpy(&profile1, profile0, sizeof(profile1));
+ profile1.tb_rate = sp->params.peak.rate;
+ profile1.tb_size = sp->params.peak.size;
+
+ /* Since implementation does not allow adding more pipe profiles after
+ * port configuration, the pipe configuration can be successfully
+ * updated only if the new profile is also part of the existing set of
+ * pipe profiles.
+ */
+ if (pipe_profile_exists(dev, &profile1, &pipe_profile_id) == 0)
+ return -1;
+
+ /* Update the pipe profile used by the current pipe. */
+ if (rte_sched_pipe_config(p->soft.tm.sched, subport_id, pipe_id,
+ (int32_t)pipe_profile_id))
+ return -1;
+
+ /* Commit changes. */
+ pipe_profile_mark(dev, subport_id, pipe_id, pipe_profile_id);
+ np->shaper_profile->n_users--;
+ np->shaper_profile = sp;
+ np->params.shaper_profile_id = sp->shaper_profile_id;
+ sp->n_users++;
+
+ return 0;
+}
+
+static int
+update_tc_rate(struct rte_eth_dev *dev,
+ struct tm_node *nt,
+ struct tm_shaper_profile *sp)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ uint32_t tc_id = tm_node_tc_id(dev, nt);
+
+ struct tm_node *np = nt->parent_node;
+ uint32_t pipe_id = tm_node_pipe_id(dev, np);
+
+ struct tm_node *ns = np->parent_node;
+ uint32_t subport_id = tm_node_subport_id(dev, ns);
+
+ struct rte_sched_pipe_params *profile0 = pipe_profile_get(dev, np);
+ struct rte_sched_pipe_params profile1;
+ uint32_t pipe_profile_id;
+
+ /* Derive new pipe profile. */
+ memcpy(&profile1, profile0, sizeof(profile1));
+ profile1.tc_rate[tc_id] = sp->params.peak.rate;
+
+ /* Since implementation does not allow adding more pipe profiles after
+ * port configuration, the pipe configuration can be successfully
+ * updated only if the new profile is also part of the existing set of
+ * pipe profiles.
+ */
+ if (pipe_profile_exists(dev, &profile1, &pipe_profile_id) == 0)
+ return -1;
+
+ /* Update the pipe profile used by the current pipe. */
+ if (rte_sched_pipe_config(p->soft.tm.sched, subport_id, pipe_id,
+ (int32_t)pipe_profile_id))
+ return -1;
+
+ /* Commit changes. */
+ pipe_profile_mark(dev, subport_id, pipe_id, pipe_profile_id);
+ nt->shaper_profile->n_users--;
+ nt->shaper_profile = sp;
+ nt->params.shaper_profile_id = sp->shaper_profile_id;
+ sp->n_users++;
+
+ return 0;
+}
+
+/* Traffic manager node shaper update */
+static int
+pmd_tm_node_shaper_update(struct rte_eth_dev *dev,
+ uint32_t node_id,
+ uint32_t shaper_profile_id,
+ struct rte_tm_error *error)
+{
+ struct tm_node *n;
+ struct tm_shaper_profile *sp;
+
+ /* Port must be started and TM used. */
+ if (dev->data->dev_started == 0 && (tm_used(dev) == 0))
+ return -rte_tm_error_set(error,
+ EBUSY,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EBUSY));
+
+ /* Node must be valid */
+ n = tm_node_search(dev, node_id);
+ if (n == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Shaper profile must be valid. */
+ sp = tm_shaper_profile_search(dev, shaper_profile_id);
+ if (sp == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_SHAPER_PROFILE,
+ NULL,
+ rte_strerror(EINVAL));
+
+ switch (n->level) {
+ case TM_NODE_LEVEL_PORT:
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ /* fall-through */
+ case TM_NODE_LEVEL_SUBPORT:
+ if (update_subport_rate(dev, n, sp))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ return 0;
+ /* fall-through */
+ case TM_NODE_LEVEL_PIPE:
+ if (update_pipe_rate(dev, n, sp))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ return 0;
+ /* fall-through */
+ case TM_NODE_LEVEL_TC:
+ if (update_tc_rate(dev, n, sp))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ return 0;
+ /* fall-through */
+ case TM_NODE_LEVEL_QUEUE:
+ /* fall-through */
+ default:
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ }
+}
+
+static inline uint32_t
+tm_port_queue_id(struct rte_eth_dev *dev,
+ uint32_t port_subport_id,
+ uint32_t subport_pipe_id,
+ uint32_t pipe_tc_id,
+ uint32_t tc_queue_id)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_hierarchy *h = &p->soft.tm.h;
+ uint32_t n_pipes_per_subport = h->n_tm_nodes[TM_NODE_LEVEL_PIPE] /
+ h->n_tm_nodes[TM_NODE_LEVEL_SUBPORT];
+
+ uint32_t port_pipe_id =
+ port_subport_id * n_pipes_per_subport + subport_pipe_id;
+ uint32_t port_tc_id =
+ port_pipe_id * RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE + pipe_tc_id;
+ uint32_t port_queue_id =
+ port_tc_id * RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS + tc_queue_id;
+
+ return port_queue_id;
+}
+
+static int
+read_port_stats(struct rte_eth_dev *dev,
+ struct tm_node *nr,
+ struct rte_tm_node_stats *stats,
+ uint64_t *stats_mask,
+ int clear)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_hierarchy *h = &p->soft.tm.h;
+ uint32_t n_subports_per_port = h->n_tm_nodes[TM_NODE_LEVEL_SUBPORT];
+ uint32_t subport_id;
+
+ for (subport_id = 0; subport_id < n_subports_per_port; subport_id++) {
+ struct rte_sched_subport_stats s;
+ uint32_t tc_ov, id;
+
+ /* Stats read */
+ int status = rte_sched_subport_read_stats(
+ p->soft.tm.sched,
+ subport_id,
+ &s,
+ &tc_ov);
+ if (status)
+ return status;
+
+ /* Stats accumulate */
+ for (id = 0; id < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; id++) {
+ nr->stats.n_pkts +=
+ s.n_pkts_tc[id] - s.n_pkts_tc_dropped[id];
+ nr->stats.n_bytes +=
+ s.n_bytes_tc[id] - s.n_bytes_tc_dropped[id];
+ nr->stats.leaf.n_pkts_dropped[RTE_TM_GREEN] +=
+ s.n_pkts_tc_dropped[id];
+ nr->stats.leaf.n_bytes_dropped[RTE_TM_GREEN] +=
+ s.n_bytes_tc_dropped[id];
+ }
+ }
+
+ /* Stats copy */
+ if (stats)
+ memcpy(stats, &nr->stats, sizeof(*stats));
+
+ if (stats_mask)
+ *stats_mask = STATS_MASK_DEFAULT;
+
+ /* Stats clear */
+ if (clear)
+ memset(&nr->stats, 0, sizeof(nr->stats));
+
+ return 0;
+}
+
+static int
+read_subport_stats(struct rte_eth_dev *dev,
+ struct tm_node *ns,
+ struct rte_tm_node_stats *stats,
+ uint64_t *stats_mask,
+ int clear)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ uint32_t subport_id = tm_node_subport_id(dev, ns);
+ struct rte_sched_subport_stats s;
+ uint32_t tc_ov, tc_id;
+
+ /* Stats read */
+ int status = rte_sched_subport_read_stats(
+ p->soft.tm.sched,
+ subport_id,
+ &s,
+ &tc_ov);
+ if (status)
+ return status;
+
+ /* Stats accumulate */
+ for (tc_id = 0; tc_id < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; tc_id++) {
+ ns->stats.n_pkts +=
+ s.n_pkts_tc[tc_id] - s.n_pkts_tc_dropped[tc_id];
+ ns->stats.n_bytes +=
+ s.n_bytes_tc[tc_id] - s.n_bytes_tc_dropped[tc_id];
+ ns->stats.leaf.n_pkts_dropped[RTE_TM_GREEN] +=
+ s.n_pkts_tc_dropped[tc_id];
+ ns->stats.leaf.n_bytes_dropped[RTE_TM_GREEN] +=
+ s.n_bytes_tc_dropped[tc_id];
+ }
+
+ /* Stats copy */
+ if (stats)
+ memcpy(stats, &ns->stats, sizeof(*stats));
+
+ if (stats_mask)
+ *stats_mask = STATS_MASK_DEFAULT;
+
+ /* Stats clear */
+ if (clear)
+ memset(&ns->stats, 0, sizeof(ns->stats));
+
+ return 0;
+}
+
+static int
+read_pipe_stats(struct rte_eth_dev *dev,
+ struct tm_node *np,
+ struct rte_tm_node_stats *stats,
+ uint64_t *stats_mask,
+ int clear)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ uint32_t pipe_id = tm_node_pipe_id(dev, np);
+
+ struct tm_node *ns = np->parent_node;
+ uint32_t subport_id = tm_node_subport_id(dev, ns);
+
+ uint32_t i;
+
+ /* Stats read */
+ for (i = 0; i < RTE_SCHED_QUEUES_PER_PIPE; i++) {
+ struct rte_sched_queue_stats s;
+ uint16_t qlen;
+
+ uint32_t qid = tm_port_queue_id(dev,
+ subport_id,
+ pipe_id,
+ i / RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS,
+ i % RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS);
+
+ int status = rte_sched_queue_read_stats(
+ p->soft.tm.sched,
+ qid,
+ &s,
+ &qlen);
+ if (status)
+ return status;
+
+ /* Stats accumulate */
+ np->stats.n_pkts += s.n_pkts - s.n_pkts_dropped;
+ np->stats.n_bytes += s.n_bytes - s.n_bytes_dropped;
+ np->stats.leaf.n_pkts_dropped[RTE_TM_GREEN] += s.n_pkts_dropped;
+ np->stats.leaf.n_bytes_dropped[RTE_TM_GREEN] +=
+ s.n_bytes_dropped;
+ np->stats.leaf.n_pkts_queued = qlen;
+ }
+
+ /* Stats copy */
+ if (stats)
+ memcpy(stats, &np->stats, sizeof(*stats));
+
+ if (stats_mask)
+ *stats_mask = STATS_MASK_DEFAULT;
+
+ /* Stats clear */
+ if (clear)
+ memset(&np->stats, 0, sizeof(np->stats));
+
+ return 0;
+}
+
+static int
+read_tc_stats(struct rte_eth_dev *dev,
+ struct tm_node *nt,
+ struct rte_tm_node_stats *stats,
+ uint64_t *stats_mask,
+ int clear)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ uint32_t tc_id = tm_node_tc_id(dev, nt);
+
+ struct tm_node *np = nt->parent_node;
+ uint32_t pipe_id = tm_node_pipe_id(dev, np);
+
+ struct tm_node *ns = np->parent_node;
+ uint32_t subport_id = tm_node_subport_id(dev, ns);
+
+ uint32_t i;
+
+ /* Stats read */
+ for (i = 0; i < RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS; i++) {
+ struct rte_sched_queue_stats s;
+ uint16_t qlen;
+
+ uint32_t qid = tm_port_queue_id(dev,
+ subport_id,
+ pipe_id,
+ tc_id,
+ i);
+
+ int status = rte_sched_queue_read_stats(
+ p->soft.tm.sched,
+ qid,
+ &s,
+ &qlen);
+ if (status)
+ return status;
+
+ /* Stats accumulate */
+ nt->stats.n_pkts += s.n_pkts - s.n_pkts_dropped;
+ nt->stats.n_bytes += s.n_bytes - s.n_bytes_dropped;
+ nt->stats.leaf.n_pkts_dropped[RTE_TM_GREEN] += s.n_pkts_dropped;
+ nt->stats.leaf.n_bytes_dropped[RTE_TM_GREEN] +=
+ s.n_bytes_dropped;
+ nt->stats.leaf.n_pkts_queued = qlen;
+ }
+
+ /* Stats copy */
+ if (stats)
+ memcpy(stats, &nt->stats, sizeof(*stats));
+
+ if (stats_mask)
+ *stats_mask = STATS_MASK_DEFAULT;
+
+ /* Stats clear */
+ if (clear)
+ memset(&nt->stats, 0, sizeof(nt->stats));
+
+ return 0;
+}
+
+static int
+read_queue_stats(struct rte_eth_dev *dev,
+ struct tm_node *nq,
+ struct rte_tm_node_stats *stats,
+ uint64_t *stats_mask,
+ int clear)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct rte_sched_queue_stats s;
+ uint16_t qlen;
+
+ uint32_t queue_id = tm_node_queue_id(dev, nq);
+
+ struct tm_node *nt = nq->parent_node;
+ uint32_t tc_id = tm_node_tc_id(dev, nt);
+
+ struct tm_node *np = nt->parent_node;
+ uint32_t pipe_id = tm_node_pipe_id(dev, np);
+
+ struct tm_node *ns = np->parent_node;
+ uint32_t subport_id = tm_node_subport_id(dev, ns);
+
+ /* Stats read */
+ uint32_t qid = tm_port_queue_id(dev,
+ subport_id,
+ pipe_id,
+ tc_id,
+ queue_id);
+
+ int status = rte_sched_queue_read_stats(
+ p->soft.tm.sched,
+ qid,
+ &s,
+ &qlen);
+ if (status)
+ return status;
+
+ /* Stats accumulate */
+ nq->stats.n_pkts += s.n_pkts - s.n_pkts_dropped;
+ nq->stats.n_bytes += s.n_bytes - s.n_bytes_dropped;
+ nq->stats.leaf.n_pkts_dropped[RTE_TM_GREEN] += s.n_pkts_dropped;
+ nq->stats.leaf.n_bytes_dropped[RTE_TM_GREEN] +=
+ s.n_bytes_dropped;
+ nq->stats.leaf.n_pkts_queued = qlen;
+
+ /* Stats copy */
+ if (stats)
+ memcpy(stats, &nq->stats, sizeof(*stats));
+
+ if (stats_mask)
+ *stats_mask = STATS_MASK_QUEUE;
+
+ /* Stats clear */
+ if (clear)
+ memset(&nq->stats, 0, sizeof(nq->stats));
+
+ return 0;
+}
+
+/* Traffic manager read stats counters for specific node */
+static int
+pmd_tm_node_stats_read(struct rte_eth_dev *dev,
+ uint32_t node_id,
+ struct rte_tm_node_stats *stats,
+ uint64_t *stats_mask,
+ int clear,
+ struct rte_tm_error *error)
+{
+ struct tm_node *n;
+
+ /* Port must be started and TM used. */
+ if (dev->data->dev_started == 0 && (tm_used(dev) == 0))
+ return -rte_tm_error_set(error,
+ EBUSY,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EBUSY));
+
+ /* Node must be valid */
+ n = tm_node_search(dev, node_id);
+ if (n == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ switch (n->level) {
+ case TM_NODE_LEVEL_PORT:
+ if (read_port_stats(dev, n, stats, stats_mask, clear))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ return 0;
+
+ case TM_NODE_LEVEL_SUBPORT:
+ if (read_subport_stats(dev, n, stats, stats_mask, clear))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ return 0;
+
+ case TM_NODE_LEVEL_PIPE:
+ if (read_pipe_stats(dev, n, stats, stats_mask, clear))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ return 0;
+
+ case TM_NODE_LEVEL_TC:
+ if (read_tc_stats(dev, n, stats, stats_mask, clear))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ return 0;
+
+ case TM_NODE_LEVEL_QUEUE:
+ default:
+ if (read_queue_stats(dev, n, stats, stats_mask, clear))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ return 0;
+ }
+}
+
+const struct rte_tm_ops pmd_tm_ops = {
+ .node_type_get = pmd_tm_node_type_get,
+ .capabilities_get = pmd_tm_capabilities_get,
+ .level_capabilities_get = pmd_tm_level_capabilities_get,
+ .node_capabilities_get = pmd_tm_node_capabilities_get,
+
+ .wred_profile_add = pmd_tm_wred_profile_add,
+ .wred_profile_delete = pmd_tm_wred_profile_delete,
+ .shared_wred_context_add_update = NULL,
+ .shared_wred_context_delete = NULL,
+
+ .shaper_profile_add = pmd_tm_shaper_profile_add,
+ .shaper_profile_delete = pmd_tm_shaper_profile_delete,
+ .shared_shaper_add_update = pmd_tm_shared_shaper_add_update,
+ .shared_shaper_delete = pmd_tm_shared_shaper_delete,
+
+ .node_add = pmd_tm_node_add,
+ .node_delete = pmd_tm_node_delete,
+ .node_suspend = NULL,
+ .node_resume = NULL,
+ .hierarchy_commit = pmd_tm_hierarchy_commit,
+
+ .node_parent_update = pmd_tm_node_parent_update,
+ .node_shaper_update = pmd_tm_node_shaper_update,
+ .node_shared_shaper_update = NULL,
+ .node_stats_update = NULL,
+ .node_wfq_weight_mode_update = NULL,
+ .node_cman_update = NULL,
+ .node_wred_context_update = NULL,
+ .node_shared_wred_context_update = NULL,
+
+ .node_stats_read = pmd_tm_node_stats_read,
};
--
2.9.3
^ permalink raw reply [flat|nested] 79+ messages in thread
* [dpdk-dev] [PATCH v7 5/5] app/testpmd: add traffic management forwarding mode
2017-10-09 12:58 ` [dpdk-dev] [PATCH v7 0/5] net/softnic: sw fall-back pmd for traffic mgmt and others Jasvinder Singh
` (3 preceding siblings ...)
2017-10-09 12:58 ` [dpdk-dev] [PATCH v7 4/5] net/softnic: add TM hierarchy related ops Jasvinder Singh
@ 2017-10-09 12:58 ` Jasvinder Singh
2017-10-09 20:17 ` Ferruh Yigit
4 siblings, 1 reply; 79+ messages in thread
From: Jasvinder Singh @ 2017-10-09 12:58 UTC (permalink / raw)
To: dev; +Cc: cristian.dumitrescu, ferruh.yigit, thomas, wenzhuo.lu
This commit extends the testpmd application with a new forwarding engine
that demonstrates the use of the ethdev traffic management APIs and the
softnic PMD for QoS traffic management.
In this mode, a 5-level hierarchical tree of the QoS scheduler is built
with the help of ethdev TM APIs such as shaper profile add/delete,
shared shaper add/update, node add/delete, hierarchy commit, etc.
The hierarchical tree has the following nodes: root node (x1, level 0),
subport node (x1, level 1), pipe nodes (x4096, level 2),
tc nodes (x16384, level 3) and queue nodes (x65536, level 4),
i.e. 4 traffic classes per pipe and 4 queues per traffic class.
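For reference, every level of this tree goes through the same minimal
ethdev TM call sequence, sketched below for the root node only. This is an
illustrative sketch, not part of the patch; the shaper profile id 0, node
id 1000000 and the 10 Gbps rate are hypothetical placeholders.

#include <string.h>
#include <rte_tm.h>

static int
build_root_node(uint16_t port_id, struct rte_tm_error *err)
{
	struct rte_tm_shaper_params sp;
	struct rte_tm_node_params np;

	memset(&sp, 0, sizeof(sp));
	memset(&np, 0, sizeof(np));

	/* 1. Add a private shaper profile (rate is in bytes per second). */
	sp.peak.rate = 1250000000; /* 10 Gbps */
	sp.peak.size = 1000000;    /* token bucket size */
	if (rte_tm_shaper_profile_add(port_id, 0, &sp, err))
		return -1;

	/* 2. Add the node, attaching the shaper profile to it. */
	np.shaper_profile_id = 0;
	np.nonleaf.n_sp_priorities = 1;
	if (rte_tm_node_add(port_id, 1000000, RTE_TM_NODE_ID_NULL,
			0 /* priority */, 1 /* weight */, 0 /* level */,
			&np, err))
		return -1;

	/* 3. After all levels have been added, freeze the hierarchy. */
	return rte_tm_hierarchy_commit(port_id, 1 /* clear on fail */, err);
}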
During runtime, each received packet is first classified by mapping the
packet field information to a 5-tuple (HQoS subport, pipe, traffic class,
queue within traffic class, and color), which is stored in the packet mbuf
sched field. After classification, each packet is sent to the softnic port,
which schedules the transmission of the received packets and sends them
out on the output interface accordingly.
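As a minimal sketch of that classification step (not part of the patch, and
assuming the same bit layout as the RTE_SCHED_PORT_HIERARCHY macro added to
app/test-pmd/tm.c further below), the 5-tuple can be packed into the mbuf
sched field like this:

#include <rte_mbuf.h>

static void
pkt_sched_set(struct rte_mbuf *pkt, uint32_t subport, uint32_t pipe,
	uint32_t tc, uint32_t queue, uint32_t color)
{
	/* Bit layout: queue (2 bits), tc (2 bits), color (2 bits),
	 * subport (16 bits at offset 16), pipe (32 bits at offset 32).
	 */
	uint64_t sched = ((uint64_t)(queue & 0x3)) |
		((uint64_t)(tc & 0x3) << 2) |
		((uint64_t)(color & 0x3) << 4) |
		((uint64_t)(subport & 0xFFFF) << 16) |
		((uint64_t)pipe << 32);

	pkt->hash.sched.lo = sched & 0xFFFFFFFF;
	pkt->hash.sched.hi = sched >> 32;
}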
To enable the traffic management mode, the following testpmd command is used:
$ ./testpmd -c c -n 4 --vdev
'net_softnic0,hard_name=0000:06:00.1,soft_tm=on' -- -i
--forward-mode=tm
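Once testpmd is running in this mode, the default hierarchy added by this
patch can be enabled from the interactive prompt before starting forwarding
(port 0 below is only an example):

testpmd> set port tm hierarchy default 0
testpmd> start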
Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
---
v7 change:
- change port_id type to uint16_t
- rebase on dpdk-next-net
v5 change:
- add CLI to enable default tm hierarchy
v3 change:
- Implements feedback from Pablo[1]
- add flag to check required librte_sched lib and softnic pmd
- code cleanup
v2 change:
- change file name softnictm.c to tm.c
- change forward mode name to "tm"
- code clean up
[1] http://dpdk.org/ml/archives/dev/2017-September/075744.html
app/test-pmd/Makefile | 12 +
app/test-pmd/cmdline.c | 88 +++++
app/test-pmd/testpmd.c | 15 +
app/test-pmd/testpmd.h | 46 +++
app/test-pmd/tm.c | 865 +++++++++++++++++++++++++++++++++++++++++++++++++
5 files changed, 1026 insertions(+)
create mode 100644 app/test-pmd/tm.c
diff --git a/app/test-pmd/Makefile b/app/test-pmd/Makefile
index b6e80dd..2c50f68 100644
--- a/app/test-pmd/Makefile
+++ b/app/test-pmd/Makefile
@@ -59,6 +59,10 @@ SRCS-y += csumonly.c
SRCS-y += icmpecho.c
SRCS-$(CONFIG_RTE_LIBRTE_IEEE1588) += ieee1588fwd.c
+ifeq ($(CONFIG_RTE_LIBRTE_PMD_SOFTNIC)$(CONFIG_RTE_LIBRTE_SCHED),yy)
+SRCS-y += tm.c
+endif
+
ifeq ($(CONFIG_RTE_BUILD_SHARED_LIB),y)
ifeq ($(CONFIG_RTE_LIBRTE_PMD_BOND),y)
@@ -77,6 +81,14 @@ ifeq ($(CONFIG_RTE_LIBRTE_BNXT_PMD),y)
LDLIBS += -lrte_pmd_bnxt
endif
+ifeq ($(CONFIG_RTE_LIBRTE_PMD_XENVIRT),y)
+LDLIBS += -lrte_pmd_xenvirt
+endif
+
+ifeq ($(CONFIG_RTE_LIBRTE_PMD_SOFTNIC),y)
+LDLIBS += -lrte_pmd_softnic
+endif
+
endif
CFLAGS_cmdline.o := -D_GNU_SOURCE
diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 20e04f7..292b9be 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -637,6 +637,11 @@ static void cmd_help_long_parsed(void *parsed_result,
"E-tag set filter del e-tag-id (value) port (port_id)\n"
" Delete an E-tag forwarding filter on a port\n\n"
+#if defined RTE_LIBRTE_PMD_SOFTNIC && defined RTE_LIBRTE_SCHED
+ "set port tm hierarchy default (port_id)\n"
+ " Set default traffic management hierarchy on a port\n\n"
+
+#endif
"ddp add (port_id) (profile_path[,output_path])\n"
" Load a profile package on a port\n\n"
@@ -13424,6 +13429,86 @@ cmdline_parse_inst_t cmd_vf_tc_max_bw = {
},
};
+
+#if defined RTE_LIBRTE_PMD_SOFTNIC && defined RTE_LIBRTE_SCHED
+
+/* *** Set Port default Traffic Management Hierarchy *** */
+struct cmd_set_port_tm_hierarchy_default_result {
+ cmdline_fixed_string_t set;
+ cmdline_fixed_string_t port;
+ cmdline_fixed_string_t tm;
+ cmdline_fixed_string_t hierarchy;
+ cmdline_fixed_string_t def;
+ uint16_t port_id;
+};
+
+cmdline_parse_token_string_t cmd_set_port_tm_hierarchy_default_set =
+ TOKEN_STRING_INITIALIZER(
+ struct cmd_set_port_tm_hierarchy_default_result, set, "set");
+cmdline_parse_token_string_t cmd_set_port_tm_hierarchy_default_port =
+ TOKEN_STRING_INITIALIZER(
+ struct cmd_set_port_tm_hierarchy_default_result, port, "port");
+cmdline_parse_token_string_t cmd_set_port_tm_hierarchy_default_tm =
+ TOKEN_STRING_INITIALIZER(
+ struct cmd_set_port_tm_hierarchy_default_result, tm, "tm");
+cmdline_parse_token_string_t cmd_set_port_tm_hierarchy_default_hierarchy =
+ TOKEN_STRING_INITIALIZER(
+ struct cmd_set_port_tm_hierarchy_default_result,
+ hierarchy, "hierarchy");
+cmdline_parse_token_string_t cmd_set_port_tm_hierarchy_default_default =
+ TOKEN_STRING_INITIALIZER(
+ struct cmd_set_port_tm_hierarchy_default_result,
+ def, "default");
+cmdline_parse_token_num_t cmd_set_port_tm_hierarchy_default_port_id =
+ TOKEN_NUM_INITIALIZER(
+ struct cmd_set_port_tm_hierarchy_default_result,
+ port_id, UINT8);
+
+static void cmd_set_port_tm_hierarchy_default_parsed(void *parsed_result,
+ __attribute__((unused)) struct cmdline *cl,
+ __attribute__((unused)) void *data)
+{
+ struct cmd_set_port_tm_hierarchy_default_result *res = parsed_result;
+ struct rte_port *p;
+ uint16_t port_id = res->port_id;
+
+ if (port_id_is_invalid(port_id, ENABLED_WARN))
+ return;
+
+ p = &ports[port_id];
+
+ /* Port tm flag */
+ if (p->softport.tm_flag == 0) {
+ printf(" tm not enabled on port %u (error)\n", port_id);
+ return;
+ }
+
+ /* Forward mode: tm */
+ if (strcmp(cur_fwd_config.fwd_eng->fwd_mode_name, "tm")) {
+ printf(" tm mode not enabled (error)\n");
+ return;
+ }
+
+ /* Set the default tm hierarchy */
+ p->softport.tm.default_hierarchy_enable = 1;
+}
+
+cmdline_parse_inst_t cmd_set_port_tm_hierarchy_default = {
+ .f = cmd_set_port_tm_hierarchy_default_parsed,
+ .data = NULL,
+ .help_str = "set port tm hierarchy default <port_id>",
+ .tokens = {
+ (void *)&cmd_set_port_tm_hierarchy_default_set,
+ (void *)&cmd_set_port_tm_hierarchy_default_port,
+ (void *)&cmd_set_port_tm_hierarchy_default_tm,
+ (void *)&cmd_set_port_tm_hierarchy_default_hierarchy,
+ (void *)&cmd_set_port_tm_hierarchy_default_default,
+ (void *)&cmd_set_port_tm_hierarchy_default_port_id,
+ NULL,
+ },
+};
+#endif
+
/* Strict link priority scheduling mode setting */
static void
cmd_strict_link_prio_parsed(
@@ -15020,6 +15105,9 @@ cmdline_parse_ctx_t main_ctx[] = {
(cmdline_parse_inst_t *)&cmd_vf_tc_max_bw,
(cmdline_parse_inst_t *)&cmd_strict_link_prio,
(cmdline_parse_inst_t *)&cmd_tc_min_bw,
+#if defined RTE_LIBRTE_PMD_SOFTNIC && defined RTE_LIBRTE_SCHED
+ (cmdline_parse_inst_t *)&cmd_set_port_tm_hierarchy_default,
+#endif
(cmdline_parse_inst_t *)&cmd_ddp_add,
(cmdline_parse_inst_t *)&cmd_ddp_del,
(cmdline_parse_inst_t *)&cmd_ddp_get_list,
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index 3dcc325..552abdf 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -163,6 +163,10 @@ struct fwd_engine * fwd_engines[] = {
&tx_only_engine,
&csum_fwd_engine,
&icmp_echo_engine,
+#if defined RTE_LIBRTE_PMD_SOFTNIC && defined RTE_LIBRTE_SCHED
+ &softnic_tm_engine,
+ &softnic_tm_bypass_engine,
+#endif
#ifdef RTE_LIBRTE_IEEE1588
&ieee1588_fwd_engine,
#endif
@@ -2108,6 +2112,17 @@ init_port_config(void)
(rte_eth_devices[pid].data->dev_flags &
RTE_ETH_DEV_INTR_RMV))
port->dev_conf.intr_conf.rmv = 1;
+
+#if defined RTE_LIBRTE_PMD_SOFTNIC && defined RTE_LIBRTE_SCHED
+ /* Detect softnic port */
+ if (!strcmp(port->dev_info.driver_name, "net_softnic")) {
+ port->softnic_enable = 1;
+ memset(&port->softport, 0, sizeof(struct softnic_port));
+
+ if (!strcmp(cur_fwd_eng->fwd_mode_name, "tm"))
+ port->softport.tm_flag = 1;
+ }
+#endif
}
}
diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
index e2d9e34..7c79f17 100644
--- a/app/test-pmd/testpmd.h
+++ b/app/test-pmd/testpmd.h
@@ -85,6 +85,12 @@ typedef uint16_t streamid_t;
#define MAX_QUEUE_ID ((1 << (sizeof(queueid_t) * 8)) - 1)
+#if defined RTE_LIBRTE_PMD_SOFTNIC && defined RTE_LIBRTE_SCHED
+#define TM_MODE 1
+#else
+#define TM_MODE 0
+#endif
+
enum {
PORT_TOPOLOGY_PAIRED,
PORT_TOPOLOGY_CHAINED,
@@ -164,6 +170,38 @@ struct port_flow {
uint8_t data[]; /**< Storage for pattern/actions. */
};
+#ifdef TM_MODE
+/**
+ * Soft port tm related parameters
+ */
+struct softnic_port_tm {
+ uint32_t default_hierarchy_enable; /**< def hierarchy enable flag */
+ uint32_t hierarchy_config; /**< set to 1 if hierarchy configured */
+
+ uint32_t n_subports_per_port; /**< Num of subport nodes per port */
+ uint32_t n_pipes_per_subport; /**< Num of pipe nodes per subport */
+
+ uint64_t tm_pktfield0_slabpos; /**< Pkt field position for subport */
+ uint64_t tm_pktfield0_slabmask; /**< Pkt field mask for the subport */
+ uint64_t tm_pktfield0_slabshr;
+ uint64_t tm_pktfield1_slabpos; /**< Pkt field position for the pipe */
+ uint64_t tm_pktfield1_slabmask; /**< Pkt field mask for the pipe */
+ uint64_t tm_pktfield1_slabshr;
+ uint64_t tm_pktfield2_slabpos; /**< Pkt field position table index */
+ uint64_t tm_pktfield2_slabmask; /**< Pkt field mask for tc table idx */
+ uint64_t tm_pktfield2_slabshr;
+ uint64_t tm_tc_table[64]; /**< TC translation table */
+};
+
+/**
+ * The data structure associate with softnic port
+ */
+struct softnic_port {
+ unsigned int tm_flag; /**< set to 1 if tm feature is enabled */
+ struct softnic_port_tm tm; /**< softnic port tm parameters */
+};
+#endif
+
/**
* The data structure associated with each port.
*/
@@ -197,6 +235,10 @@ struct rte_port {
uint32_t mc_addr_nb; /**< nb. of addr. in mc_addr_pool */
uint8_t slave_flag; /**< bonding slave port */
struct port_flow *flow_list; /**< Associated flows. */
+#ifdef TM_MODE
+ unsigned int softnic_enable; /**< softnic flag */
+ struct softnic_port softport; /**< softnic port params */
+#endif
};
/**
@@ -257,6 +299,10 @@ extern struct fwd_engine rx_only_engine;
extern struct fwd_engine tx_only_engine;
extern struct fwd_engine csum_fwd_engine;
extern struct fwd_engine icmp_echo_engine;
+#ifdef TM_MODE
+extern struct fwd_engine softnic_tm_engine;
+extern struct fwd_engine softnic_tm_bypass_engine;
+#endif
#ifdef RTE_LIBRTE_IEEE1588
extern struct fwd_engine ieee1588_fwd_engine;
#endif
diff --git a/app/test-pmd/tm.c b/app/test-pmd/tm.c
new file mode 100644
index 0000000..9021e26
--- /dev/null
+++ b/app/test-pmd/tm.c
@@ -0,0 +1,865 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright(c) 2017 Intel Corporation. All rights reserved.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+#include <stdio.h>
+#include <sys/stat.h>
+
+#include <rte_cycles.h>
+#include <rte_mbuf.h>
+#include <rte_ethdev.h>
+#include <rte_flow.h>
+#include <rte_meter.h>
+#include <rte_eth_softnic.h>
+#include <rte_tm.h>
+
+#include "testpmd.h"
+
+#define SUBPORT_NODES_PER_PORT 1
+#define PIPE_NODES_PER_SUBPORT 4096
+#define TC_NODES_PER_PIPE 4
+#define QUEUE_NODES_PER_TC 4
+
+#define NUM_PIPE_NODES \
+ (SUBPORT_NODES_PER_PORT * PIPE_NODES_PER_SUBPORT)
+
+#define NUM_TC_NODES \
+ (NUM_PIPE_NODES * TC_NODES_PER_PIPE)
+
+#define ROOT_NODE_ID 1000000
+#define SUBPORT_NODES_START_ID 900000
+#define PIPE_NODES_START_ID 800000
+#define TC_NODES_START_ID 700000
+
+#define STATS_MASK_DEFAULT \
+ (RTE_TM_STATS_N_PKTS | \
+ RTE_TM_STATS_N_BYTES | \
+ RTE_TM_STATS_N_PKTS_GREEN_DROPPED | \
+ RTE_TM_STATS_N_BYTES_GREEN_DROPPED)
+
+#define STATS_MASK_QUEUE \
+ (STATS_MASK_DEFAULT | \
+ RTE_TM_STATS_N_PKTS_QUEUED)
+
+#define BYTES_IN_MBPS (1000 * 1000 / 8)
+#define TOKEN_BUCKET_SIZE 1000000
+
+/* TM Hierarchy Levels */
+enum tm_hierarchy_level {
+ TM_NODE_LEVEL_PORT = 0,
+ TM_NODE_LEVEL_SUBPORT,
+ TM_NODE_LEVEL_PIPE,
+ TM_NODE_LEVEL_TC,
+ TM_NODE_LEVEL_QUEUE,
+ TM_NODE_LEVEL_MAX,
+};
+
+struct tm_hierarchy {
+ /* TM Nodes */
+ uint32_t root_node_id;
+ uint32_t subport_node_id[SUBPORT_NODES_PER_PORT];
+ uint32_t pipe_node_id[SUBPORT_NODES_PER_PORT][PIPE_NODES_PER_SUBPORT];
+ uint32_t tc_node_id[NUM_PIPE_NODES][TC_NODES_PER_PIPE];
+ uint32_t queue_node_id[NUM_TC_NODES][QUEUE_NODES_PER_TC];
+
+ /* TM Hierarchy Nodes Shaper Rates */
+ uint32_t root_node_shaper_rate;
+ uint32_t subport_node_shaper_rate;
+ uint32_t pipe_node_shaper_rate;
+ uint32_t tc_node_shaper_rate;
+ uint32_t tc_node_shared_shaper_rate;
+
+ uint32_t n_shapers;
+};
+
+#define BITFIELD(byte_array, slab_pos, slab_mask, slab_shr) \
+({ \
+ uint64_t slab = *((uint64_t *) &byte_array[slab_pos]); \
+ uint64_t val = \
+ (rte_be_to_cpu_64(slab) & slab_mask) >> slab_shr; \
+ val; \
+})
+
+#define RTE_SCHED_PORT_HIERARCHY(subport, pipe, \
+ traffic_class, queue, color) \
+ ((((uint64_t) (queue)) & 0x3) | \
+ ((((uint64_t) (traffic_class)) & 0x3) << 2) | \
+ ((((uint64_t) (color)) & 0x3) << 4) | \
+ ((((uint64_t) (subport)) & 0xFFFF) << 16) | \
+ ((((uint64_t) (pipe)) & 0xFFFFFFFF) << 32))
+
+
+static void
+pkt_metadata_set(struct rte_port *p, struct rte_mbuf **pkts,
+ uint32_t n_pkts)
+{
+ struct softnic_port_tm *tm = &p->softport.tm;
+ uint32_t i;
+
+ for (i = 0; i < (n_pkts & (~0x3)); i += 4) {
+ struct rte_mbuf *pkt0 = pkts[i];
+ struct rte_mbuf *pkt1 = pkts[i + 1];
+ struct rte_mbuf *pkt2 = pkts[i + 2];
+ struct rte_mbuf *pkt3 = pkts[i + 3];
+
+ uint8_t *pkt0_data = rte_pktmbuf_mtod(pkt0, uint8_t *);
+ uint8_t *pkt1_data = rte_pktmbuf_mtod(pkt1, uint8_t *);
+ uint8_t *pkt2_data = rte_pktmbuf_mtod(pkt2, uint8_t *);
+ uint8_t *pkt3_data = rte_pktmbuf_mtod(pkt3, uint8_t *);
+
+ uint64_t pkt0_subport = BITFIELD(pkt0_data,
+ tm->tm_pktfield0_slabpos,
+ tm->tm_pktfield0_slabmask,
+ tm->tm_pktfield0_slabshr);
+ uint64_t pkt0_pipe = BITFIELD(pkt0_data,
+ tm->tm_pktfield1_slabpos,
+ tm->tm_pktfield1_slabmask,
+ tm->tm_pktfield1_slabshr);
+ uint64_t pkt0_dscp = BITFIELD(pkt0_data,
+ tm->tm_pktfield2_slabpos,
+ tm->tm_pktfield2_slabmask,
+ tm->tm_pktfield2_slabshr);
+ uint32_t pkt0_tc = tm->tm_tc_table[pkt0_dscp & 0x3F] >> 2;
+ uint32_t pkt0_tc_q = tm->tm_tc_table[pkt0_dscp & 0x3F] & 0x3;
+ uint64_t pkt1_subport = BITFIELD(pkt1_data,
+ tm->tm_pktfield0_slabpos,
+ tm->tm_pktfield0_slabmask,
+ tm->tm_pktfield0_slabshr);
+ uint64_t pkt1_pipe = BITFIELD(pkt1_data,
+ tm->tm_pktfield1_slabpos,
+ tm->tm_pktfield1_slabmask,
+ tm->tm_pktfield1_slabshr);
+ uint64_t pkt1_dscp = BITFIELD(pkt1_data,
+ tm->tm_pktfield2_slabpos,
+ tm->tm_pktfield2_slabmask,
+ tm->tm_pktfield2_slabshr);
+ uint32_t pkt1_tc = tm->tm_tc_table[pkt1_dscp & 0x3F] >> 2;
+ uint32_t pkt1_tc_q = tm->tm_tc_table[pkt1_dscp & 0x3F] & 0x3;
+
+ uint64_t pkt2_subport = BITFIELD(pkt2_data,
+ tm->tm_pktfield0_slabpos,
+ tm->tm_pktfield0_slabmask,
+ tm->tm_pktfield0_slabshr);
+ uint64_t pkt2_pipe = BITFIELD(pkt2_data,
+ tm->tm_pktfield1_slabpos,
+ tm->tm_pktfield1_slabmask,
+ tm->tm_pktfield1_slabshr);
+ uint64_t pkt2_dscp = BITFIELD(pkt2_data,
+ tm->tm_pktfield2_slabpos,
+ tm->tm_pktfield2_slabmask,
+ tm->tm_pktfield2_slabshr);
+ uint32_t pkt2_tc = tm->tm_tc_table[pkt2_dscp & 0x3F] >> 2;
+ uint32_t pkt2_tc_q = tm->tm_tc_table[pkt2_dscp & 0x3F] & 0x3;
+
+ uint64_t pkt3_subport = BITFIELD(pkt3_data,
+ tm->tm_pktfield0_slabpos,
+ tm->tm_pktfield0_slabmask,
+ tm->tm_pktfield0_slabshr);
+ uint64_t pkt3_pipe = BITFIELD(pkt3_data,
+ tm->tm_pktfield1_slabpos,
+ tm->tm_pktfield1_slabmask,
+ tm->tm_pktfield1_slabshr);
+ uint64_t pkt3_dscp = BITFIELD(pkt3_data,
+ tm->tm_pktfield2_slabpos,
+ tm->tm_pktfield2_slabmask,
+ tm->tm_pktfield2_slabshr);
+ uint32_t pkt3_tc = tm->tm_tc_table[pkt3_dscp & 0x3F] >> 2;
+ uint32_t pkt3_tc_q = tm->tm_tc_table[pkt3_dscp & 0x3F] & 0x3;
+
+ uint64_t pkt0_sched = RTE_SCHED_PORT_HIERARCHY(pkt0_subport,
+ pkt0_pipe,
+ pkt0_tc,
+ pkt0_tc_q,
+ 0);
+ uint64_t pkt1_sched = RTE_SCHED_PORT_HIERARCHY(pkt1_subport,
+ pkt1_pipe,
+ pkt1_tc,
+ pkt1_tc_q,
+ 0);
+ uint64_t pkt2_sched = RTE_SCHED_PORT_HIERARCHY(pkt2_subport,
+ pkt2_pipe,
+ pkt2_tc,
+ pkt2_tc_q,
+ 0);
+ uint64_t pkt3_sched = RTE_SCHED_PORT_HIERARCHY(pkt3_subport,
+ pkt3_pipe,
+ pkt3_tc,
+ pkt3_tc_q,
+ 0);
+
+ pkt0->hash.sched.lo = pkt0_sched & 0xFFFFFFFF;
+ pkt0->hash.sched.hi = pkt0_sched >> 32;
+ pkt1->hash.sched.lo = pkt1_sched & 0xFFFFFFFF;
+ pkt1->hash.sched.hi = pkt1_sched >> 32;
+ pkt2->hash.sched.lo = pkt2_sched & 0xFFFFFFFF;
+ pkt2->hash.sched.hi = pkt2_sched >> 32;
+ pkt3->hash.sched.lo = pkt3_sched & 0xFFFFFFFF;
+ pkt3->hash.sched.hi = pkt3_sched >> 32;
+ }
+
+ for (; i < n_pkts; i++) {
+ struct rte_mbuf *pkt = pkts[i];
+
+ uint8_t *pkt_data = rte_pktmbuf_mtod(pkt, uint8_t *);
+
+ uint64_t pkt_subport = BITFIELD(pkt_data,
+ tm->tm_pktfield0_slabpos,
+ tm->tm_pktfield0_slabmask,
+ tm->tm_pktfield0_slabshr);
+ uint64_t pkt_pipe = BITFIELD(pkt_data,
+ tm->tm_pktfield1_slabpos,
+ tm->tm_pktfield1_slabmask,
+ tm->tm_pktfield1_slabshr);
+ uint64_t pkt_dscp = BITFIELD(pkt_data,
+ tm->tm_pktfield2_slabpos,
+ tm->tm_pktfield2_slabmask,
+ tm->tm_pktfield2_slabshr);
+ uint32_t pkt_tc = tm->tm_tc_table[pkt_dscp & 0x3F] >> 2;
+ uint32_t pkt_tc_q = tm->tm_tc_table[pkt_dscp & 0x3F] & 0x3;
+
+ uint64_t pkt_sched = RTE_SCHED_PORT_HIERARCHY(pkt_subport,
+ pkt_pipe,
+ pkt_tc,
+ pkt_tc_q,
+ 0);
+
+ pkt->hash.sched.lo = pkt_sched & 0xFFFFFFFF;
+ pkt->hash.sched.hi = pkt_sched >> 32;
+ }
+}
+
+/*
+ * Soft port packet forward
+ */
+static void
+softport_packet_fwd(struct fwd_stream *fs)
+{
+ struct rte_mbuf *pkts_burst[MAX_PKT_BURST];
+ struct rte_port *rte_tx_port = &ports[fs->tx_port];
+ uint16_t nb_rx;
+ uint16_t nb_tx;
+ uint32_t retry;
+
+#ifdef RTE_TEST_PMD_RECORD_CORE_CYCLES
+ uint64_t start_tsc;
+ uint64_t end_tsc;
+ uint64_t core_cycles;
+#endif
+
+#ifdef RTE_TEST_PMD_RECORD_CORE_CYCLES
+ start_tsc = rte_rdtsc();
+#endif
+
+ /* Packets Receive */
+ nb_rx = rte_eth_rx_burst(fs->rx_port, fs->rx_queue,
+ pkts_burst, nb_pkt_per_burst);
+ fs->rx_packets += nb_rx;
+
+#ifdef RTE_TEST_PMD_RECORD_BURST_STATS
+ fs->rx_burst_stats.pkt_burst_spread[nb_rx]++;
+#endif
+
+ if (rte_tx_port->softnic_enable) {
+ /* Set packet metadata if tm flag enabled */
+ if (rte_tx_port->softport.tm_flag)
+ pkt_metadata_set(rte_tx_port, pkts_burst, nb_rx);
+
+ /* Softport run */
+ rte_pmd_softnic_run(fs->tx_port);
+ }
+ nb_tx = rte_eth_tx_burst(fs->tx_port, fs->tx_queue,
+ pkts_burst, nb_rx);
+
+ /* Retry if necessary */
+ if (unlikely(nb_tx < nb_rx) && fs->retry_enabled) {
+ retry = 0;
+ while (nb_tx < nb_rx && retry++ < burst_tx_retry_num) {
+ rte_delay_us(burst_tx_delay_time);
+ nb_tx += rte_eth_tx_burst(fs->tx_port, fs->tx_queue,
+ &pkts_burst[nb_tx], nb_rx - nb_tx);
+ }
+ }
+ fs->tx_packets += nb_tx;
+
+#ifdef RTE_TEST_PMD_RECORD_BURST_STATS
+ fs->tx_burst_stats.pkt_burst_spread[nb_tx]++;
+#endif
+
+ if (unlikely(nb_tx < nb_rx)) {
+ fs->fwd_dropped += (nb_rx - nb_tx);
+ do {
+ rte_pktmbuf_free(pkts_burst[nb_tx]);
+ } while (++nb_tx < nb_rx);
+ }
+#ifdef RTE_TEST_PMD_RECORD_CORE_CYCLES
+ end_tsc = rte_rdtsc();
+ core_cycles = (end_tsc - start_tsc);
+ fs->core_cycles = (uint64_t) (fs->core_cycles + core_cycles);
+#endif
+}
+
+static void
+set_tm_hiearchy_nodes_shaper_rate(portid_t port_id, struct tm_hierarchy *h)
+{
+ struct rte_eth_link link_params;
+ uint64_t tm_port_rate;
+
+ memset(&link_params, 0, sizeof(link_params));
+
+ rte_eth_link_get(port_id, &link_params);
+ tm_port_rate = link_params.link_speed * BYTES_IN_MBPS;
+
+ if (tm_port_rate > UINT32_MAX)
+ tm_port_rate = UINT32_MAX;
+
+ /* Set tm hierarchy shapers rate */
+ h->root_node_shaper_rate = tm_port_rate;
+ h->subport_node_shaper_rate =
+ tm_port_rate / SUBPORT_NODES_PER_PORT;
+ h->pipe_node_shaper_rate
+ = h->subport_node_shaper_rate / PIPE_NODES_PER_SUBPORT;
+ h->tc_node_shaper_rate = h->pipe_node_shaper_rate;
+ h->tc_node_shared_shaper_rate = h->subport_node_shaper_rate;
+}
+
+static int
+softport_tm_root_node_add(portid_t port_id, struct tm_hierarchy *h,
+ struct rte_tm_error *error)
+{
+ struct rte_tm_node_params rnp;
+ struct rte_tm_shaper_params rsp;
+ uint32_t priority, weight, level_id, shaper_profile_id;
+
+ memset(&rsp, 0, sizeof(struct rte_tm_shaper_params));
+ memset(&rnp, 0, sizeof(struct rte_tm_node_params));
+
+ /* Shaper profile Parameters */
+ rsp.peak.rate = h->root_node_shaper_rate;
+ rsp.peak.size = TOKEN_BUCKET_SIZE;
+ rsp.pkt_length_adjust = RTE_TM_ETH_FRAMING_OVERHEAD_FCS;
+ shaper_profile_id = 0;
+
+ if (rte_tm_shaper_profile_add(port_id, shaper_profile_id,
+ &rsp, error)) {
+ printf("%s ERROR(%d)-%s!(shaper_id %u)\n ",
+ __func__, error->type, error->message,
+ shaper_profile_id);
+ return -1;
+ }
+
+ /* Root Node Parameters */
+ h->root_node_id = ROOT_NODE_ID;
+ weight = 1;
+ priority = 0;
+ level_id = TM_NODE_LEVEL_PORT;
+ rnp.shaper_profile_id = shaper_profile_id;
+ rnp.nonleaf.n_sp_priorities = 1;
+ rnp.stats_mask = STATS_MASK_DEFAULT;
+
+ /* Add Node to TM Hierarchy */
+ if (rte_tm_node_add(port_id, h->root_node_id, RTE_TM_NODE_ID_NULL,
+ priority, weight, level_id, &rnp, error)) {
+ printf("%s ERROR(%d)-%s!(node_id %u, parent_id %u, level %u)\n",
+ __func__, error->type, error->message,
+ h->root_node_id, RTE_TM_NODE_ID_NULL,
+ level_id);
+ return -1;
+ }
+ /* Update */
+ h->n_shapers++;
+
+ printf(" Root node added (Start id %u, Count %u, level %u)\n",
+ h->root_node_id, 1, level_id);
+
+ return 0;
+}
+
+static int
+softport_tm_subport_node_add(portid_t port_id, struct tm_hierarchy *h,
+ struct rte_tm_error *error)
+{
+ uint32_t subport_parent_node_id, subport_node_id;
+ struct rte_tm_node_params snp;
+ struct rte_tm_shaper_params ssp;
+ uint32_t priority, weight, level_id, shaper_profile_id;
+ uint32_t i;
+
+ memset(&ssp, 0, sizeof(struct rte_tm_shaper_params));
+ memset(&snp, 0, sizeof(struct rte_tm_node_params));
+
+ shaper_profile_id = h->n_shapers;
+
+ /* Add Shaper Profile to TM Hierarchy */
+ for (i = 0; i < SUBPORT_NODES_PER_PORT; i++) {
+ ssp.peak.rate = h->subport_node_shaper_rate;
+ ssp.peak.size = TOKEN_BUCKET_SIZE;
+ ssp.pkt_length_adjust = RTE_TM_ETH_FRAMING_OVERHEAD_FCS;
+
+ if (rte_tm_shaper_profile_add(port_id, shaper_profile_id,
+ &ssp, error)) {
+ printf("%s ERROR(%d)-%s!(shaper_id %u)\n ",
+ __func__, error->type, error->message,
+ shaper_profile_id);
+ return -1;
+ }
+
+ /* Node Parameters */
+ h->subport_node_id[i] = SUBPORT_NODES_START_ID + i;
+ subport_parent_node_id = h->root_node_id;
+ weight = 1;
+ priority = 0;
+ level_id = TM_NODE_LEVEL_SUBPORT;
+ snp.shaper_profile_id = shaper_profile_id;
+ snp.nonleaf.n_sp_priorities = 1;
+ snp.stats_mask = STATS_MASK_DEFAULT;
+
+ /* Add Node to TM Hierarchy */
+ if (rte_tm_node_add(port_id,
+ h->subport_node_id[i],
+ subport_parent_node_id,
+ priority, weight,
+ level_id,
+ &snp,
+ error)) {
+ printf("%s ERROR(%d)-%s!(node %u,parent %u,level %u)\n",
+ __func__,
+ error->type,
+ error->message,
+ h->subport_node_id[i],
+ subport_parent_node_id,
+ level_id);
+ return -1;
+ }
+ shaper_profile_id++;
+ subport_node_id++;
+ }
+ /* Update */
+ h->n_shapers = shaper_profile_id;
+
+ printf(" Subport nodes added (Start id %u, Count %u, level %u)\n",
+ h->subport_node_id[0], SUBPORT_NODES_PER_PORT, level_id);
+
+ return 0;
+}
+
+static int
+softport_tm_pipe_node_add(portid_t port_id, struct tm_hierarchy *h,
+ struct rte_tm_error *error)
+{
+ uint32_t pipe_parent_node_id;
+ struct rte_tm_node_params pnp;
+ struct rte_tm_shaper_params psp;
+ uint32_t priority, weight, level_id, shaper_profile_id;
+ uint32_t i, j;
+
+ memset(&psp, 0, sizeof(struct rte_tm_shaper_params));
+ memset(&pnp, 0, sizeof(struct rte_tm_node_params));
+
+ shaper_profile_id = h->n_shapers;
+
+ /* Shaper Profile Parameters */
+ psp.peak.rate = h->pipe_node_shaper_rate;
+ psp.peak.size = TOKEN_BUCKET_SIZE;
+ psp.pkt_length_adjust = RTE_TM_ETH_FRAMING_OVERHEAD_FCS;
+
+ /* Pipe Node Parameters */
+ weight = 1;
+ priority = 0;
+ level_id = TM_NODE_LEVEL_PIPE;
+ pnp.nonleaf.n_sp_priorities = 4;
+ pnp.stats_mask = STATS_MASK_DEFAULT;
+
+ /* Add Shaper Profiles and Nodes to TM Hierarchy */
+ for (i = 0; i < SUBPORT_NODES_PER_PORT; i++) {
+ for (j = 0; j < PIPE_NODES_PER_SUBPORT; j++) {
+ if (rte_tm_shaper_profile_add(port_id,
+ shaper_profile_id, &psp, error)) {
+ printf("%s ERROR(%d)-%s!(shaper_id %u)\n ",
+ __func__, error->type, error->message,
+ shaper_profile_id);
+ return -1;
+ }
+ pnp.shaper_profile_id = shaper_profile_id;
+ pipe_parent_node_id = h->subport_node_id[i];
+ h->pipe_node_id[i][j] = PIPE_NODES_START_ID +
+ (i * PIPE_NODES_PER_SUBPORT) + j;
+
+ if (rte_tm_node_add(port_id,
+ h->pipe_node_id[i][j],
+ pipe_parent_node_id,
+ priority, weight, level_id,
+ &pnp,
+ error)) {
+ printf("%s ERROR(%d)-%s!(node %u,parent %u )\n",
+ __func__,
+ error->type,
+ error->message,
+ h->pipe_node_id[i][j],
+ pipe_parent_node_id);
+
+ return -1;
+ }
+ shaper_profile_id++;
+ }
+ }
+ /* Update */
+ h->n_shapers = shaper_profile_id;
+
+ printf(" Pipe nodes added (Start id %u, Count %u, level %u)\n",
+ h->pipe_node_id[0][0], NUM_PIPE_NODES, level_id);
+
+ return 0;
+}
+
+static int
+softport_tm_tc_node_add(portid_t port_id, struct tm_hierarchy *h,
+ struct rte_tm_error *error)
+{
+ uint32_t tc_parent_node_id;
+ struct rte_tm_node_params tnp;
+ struct rte_tm_shaper_params tsp, tssp;
+ uint32_t shared_shaper_profile_id[TC_NODES_PER_PIPE];
+ uint32_t priority, weight, level_id, shaper_profile_id;
+ uint32_t pos, n_tc_nodes, i, j, k;
+
+ memset(&tsp, 0, sizeof(struct rte_tm_shaper_params));
+ memset(&tssp, 0, sizeof(struct rte_tm_shaper_params));
+ memset(&tnp, 0, sizeof(struct rte_tm_node_params));
+
+ shaper_profile_id = h->n_shapers;
+
+ /* Private Shaper Profile (TC) Parameters */
+ tsp.peak.rate = h->tc_node_shaper_rate;
+ tsp.peak.size = TOKEN_BUCKET_SIZE;
+ tsp.pkt_length_adjust = RTE_TM_ETH_FRAMING_OVERHEAD_FCS;
+
+ /* Shared Shaper Profile (TC) Parameters */
+ tssp.peak.rate = h->tc_node_shared_shaper_rate;
+ tssp.peak.size = TOKEN_BUCKET_SIZE;
+ tssp.pkt_length_adjust = RTE_TM_ETH_FRAMING_OVERHEAD_FCS;
+
+ /* TC Node Parameters */
+ weight = 1;
+ level_id = TM_NODE_LEVEL_TC;
+ tnp.n_shared_shapers = 1;
+ tnp.nonleaf.n_sp_priorities = 1;
+ tnp.stats_mask = STATS_MASK_DEFAULT;
+
+ /* Add Shared Shaper Profiles to TM Hierarchy */
+ for (i = 0; i < TC_NODES_PER_PIPE; i++) {
+ shared_shaper_profile_id[i] = shaper_profile_id;
+
+ if (rte_tm_shaper_profile_add(port_id,
+ shared_shaper_profile_id[i], &tssp, error)) {
+ printf("%s ERROR(%d)-%s!(Shared shaper profileid %u)\n",
+ __func__, error->type, error->message,
+ shared_shaper_profile_id[i]);
+
+ return -1;
+ }
+ if (rte_tm_shared_shaper_add_update(port_id, i,
+ shared_shaper_profile_id[i], error)) {
+ printf("%s ERROR(%d)-%s!(Shared shaper id %u)\n",
+ __func__, error->type, error->message, i);
+
+ return -1;
+ }
+ shaper_profile_id++;
+ }
+
+ /* Add Shaper Profiles and Nodes to TM Hierarchy */
+ n_tc_nodes = 0;
+ for (i = 0; i < SUBPORT_NODES_PER_PORT; i++) {
+ for (j = 0; j < PIPE_NODES_PER_SUBPORT; j++) {
+ for (k = 0; k < TC_NODES_PER_PIPE ; k++) {
+ priority = k;
+ tc_parent_node_id = h->pipe_node_id[i][j];
+ tnp.shared_shaper_id =
+ (uint32_t *)calloc(1, sizeof(uint32_t));
+ tnp.shared_shaper_id[0] = k;
+ pos = j + (i * PIPE_NODES_PER_SUBPORT);
+ h->tc_node_id[pos][k] =
+ TC_NODES_START_ID + n_tc_nodes;
+
+ if (rte_tm_shaper_profile_add(port_id,
+ shaper_profile_id, &tsp, error)) {
+ printf("%s ERROR(%d)-%s!(shaper %u)\n",
+ __func__, error->type,
+ error->message,
+ shaper_profile_id);
+
+ return -1;
+ }
+ tnp.shaper_profile_id = shaper_profile_id;
+ if (rte_tm_node_add(port_id,
+ h->tc_node_id[pos][k],
+ tc_parent_node_id,
+ priority, weight,
+ level_id,
+ &tnp, error)) {
+ printf("%s ERROR(%d)-%s!(node id %u)\n",
+ __func__,
+ error->type,
+ error->message,
+ h->tc_node_id[pos][k]);
+
+ return -1;
+ }
+ shaper_profile_id++;
+ n_tc_nodes++;
+ }
+ }
+ }
+ /* Update */
+ h->n_shapers = shaper_profile_id;
+
+ printf(" TC nodes added (Start id %u, Count %u, level %u)\n",
+ h->tc_node_id[0][0], n_tc_nodes, level_id);
+
+ return 0;
+}
+
+static int
+softport_tm_queue_node_add(portid_t port_id, struct tm_hierarchy *h,
+ struct rte_tm_error *error)
+{
+ uint32_t queue_parent_node_id;
+ struct rte_tm_node_params qnp;
+ uint32_t priority, weight, level_id, pos;
+ uint32_t n_queue_nodes, i, j, k;
+
+ memset(&qnp, 0, sizeof(struct rte_tm_node_params));
+
+ /* Queue Node Parameters */
+ priority = 0;
+ weight = 1;
+ level_id = TM_NODE_LEVEL_QUEUE;
+ qnp.shaper_profile_id = RTE_TM_SHAPER_PROFILE_ID_NONE;
+ qnp.leaf.cman = RTE_TM_CMAN_TAIL_DROP;
+ qnp.stats_mask = STATS_MASK_QUEUE;
+
+ /* Add Queue Nodes to TM Hierarchy */
+ n_queue_nodes = 0;
+ for (i = 0; i < NUM_PIPE_NODES; i++) {
+ for (j = 0; j < TC_NODES_PER_PIPE; j++) {
+ queue_parent_node_id = h->tc_node_id[i][j];
+ for (k = 0; k < QUEUE_NODES_PER_TC; k++) {
+ pos = j + (i * TC_NODES_PER_PIPE);
+ h->queue_node_id[pos][k] = n_queue_nodes;
+ if (rte_tm_node_add(port_id,
+ h->queue_node_id[pos][k],
+ queue_parent_node_id,
+ priority,
+ weight,
+ level_id,
+ &qnp, error)) {
+ printf("%s ERROR(%d)-%s!(node %u)\n",
+ __func__,
+ error->type,
+ error->message,
+ h->queue_node_id[pos][k]);
+
+ return -1;
+ }
+ n_queue_nodes++;
+ }
+ }
+ }
+ printf(" Queue nodes added (Start id %u, Count %u, level %u)\n",
+ h->queue_node_id[0][0], n_queue_nodes, level_id);
+
+ return 0;
+}
+
+/*
+ * TM Packet Field Setup
+ */
+static void
+softport_tm_pktfield_setup(portid_t port_id)
+{
+ struct rte_port *p = &ports[port_id];
+ uint64_t pktfield0_mask = 0;
+ uint64_t pktfield1_mask = 0x0000000FFF000000LLU;
+ uint64_t pktfield2_mask = 0x00000000000000FCLLU;
+
+ p->softport.tm = (struct softnic_port_tm) {
+ .n_subports_per_port = SUBPORT_NODES_PER_PORT,
+ .n_pipes_per_subport = PIPE_NODES_PER_SUBPORT,
+
+ /* Packet field to identify subport
+ *
+ * Default configuration assumes only one subport, thus
+ * the subport ID is hardcoded to 0
+ */
+ .tm_pktfield0_slabpos = 0,
+ .tm_pktfield0_slabmask = pktfield0_mask,
+ .tm_pktfield0_slabshr =
+ __builtin_ctzll(pktfield0_mask),
+
+ /* Packet field to identify pipe.
+ *
+ * Default value assumes Ethernet/IPv4/UDP packets,
+ * UDP payload bits 12 .. 23
+ */
+ .tm_pktfield1_slabpos = 40,
+ .tm_pktfield1_slabmask = pktfield1_mask,
+ .tm_pktfield1_slabshr =
+ __builtin_ctzll(pktfield1_mask),
+
+ /* Packet field used as index into TC translation table
+ * to identify the traffic class and queue.
+ *
+ * Default value assumes Ethernet/IPv4 packets, IPv4
+ * DSCP field
+ */
+ .tm_pktfield2_slabpos = 8,
+ .tm_pktfield2_slabmask = pktfield2_mask,
+ .tm_pktfield2_slabshr =
+ __builtin_ctzll(pktfield2_mask),
+
+ .tm_tc_table = {
+ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,
+ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,
+ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,
+ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,
+ }, /**< TC translation table */
+ };
+}
+
+static int
+softport_tm_hierarchy_specify(portid_t port_id, struct rte_tm_error *error)
+{
+
+ struct tm_hierarchy h;
+ int status;
+
+ memset(&h, 0, sizeof(struct tm_hierarchy));
+
+ /* TM hierarchy shapers rate */
+ set_tm_hiearchy_nodes_shaper_rate(port_id, &h);
+
+ /* Add root node (level 0) */
+ status = softport_tm_root_node_add(port_id, &h, error);
+ if (status)
+ return status;
+
+ /* Add subport node (level 1) */
+ status = softport_tm_subport_node_add(port_id, &h, error);
+ if (status)
+ return status;
+
+ /* Add pipe nodes (level 2) */
+ status = softport_tm_pipe_node_add(port_id, &h, error);
+ if (status)
+ return status;
+
+ /* Add traffic class nodes (level 3) */
+ status = softport_tm_tc_node_add(port_id, &h, error);
+ if (status)
+ return status;
+
+ /* Add queue nodes (level 4) */
+ status = softport_tm_queue_node_add(port_id, &h, error);
+ if (status)
+ return status;
+
+ /* TM packet fields setup */
+ softport_tm_pktfield_setup(port_id);
+
+ return 0;
+}
+
+/*
+ * Soft port Init
+ */
+static void
+softport_tm_begin(portid_t pi)
+{
+ struct rte_port *port = &ports[pi];
+
+ /* Soft port TM flag */
+ if (port->softport.tm_flag == 1) {
+ printf("\n\n TM feature available on port %u\n", pi);
+
+ /* Soft port TM hierarchy configuration */
+ if ((port->softport.tm.hierarchy_config == 0) &&
+ (port->softport.tm.default_hierarchy_enable == 1)) {
+ struct rte_tm_error error;
+ int status;
+
+ /* Stop port */
+ rte_eth_dev_stop(pi);
+
+ /* TM hierarchy specification */
+ status = softport_tm_hierarchy_specify(pi, &error);
+ if (status) {
+ printf(" TM Hierarchy built error(%d) - %s\n",
+ error.type, error.message);
+ return;
+ }
+ printf("\n TM Hierarchy Specified!\n\v");
+
+ /* TM hierarchy commit */
+ status = rte_tm_hierarchy_commit(pi, 0, &error);
+ if (status) {
+ printf(" Hierarchy commit error(%d) - %s\n",
+ error.type, error.message);
+ return;
+ }
+ printf(" Hierarchy Committed (port %u)!", pi);
+ port->softport.tm.hierarchy_config = 1;
+
+ /* Start port */
+ status = rte_eth_dev_start(pi);
+ if (status) {
+ printf("\n Port %u start error!\n", pi);
+ return;
+ }
+ printf("\n Port %u started!\n", pi);
+ return;
+ }
+ }
+ printf("\n TM feature not available on port %u", pi);
+}
+
+struct fwd_engine softnic_tm_engine = {
+ .fwd_mode_name = "tm",
+ .port_fwd_begin = softport_tm_begin,
+ .port_fwd_end = NULL,
+ .packet_fwd = softport_packet_fwd,
+};
+
+struct fwd_engine softnic_tm_bypass_engine = {
+ .fwd_mode_name = "tm-bypass",
+ .port_fwd_begin = NULL,
+ .port_fwd_end = NULL,
+ .packet_fwd = softport_packet_fwd,
+};
--
2.9.3
^ permalink raw reply [flat|nested] 79+ messages in thread
* Re: [dpdk-dev] [PATCH v7 5/5] app/testpmd: add traffic management forwarding mode
2017-10-09 12:58 ` [dpdk-dev] [PATCH v7 5/5] app/testpmd: add traffic management forwarding mode Jasvinder Singh
@ 2017-10-09 20:17 ` Ferruh Yigit
2017-10-10 10:07 ` Singh, Jasvinder
0 siblings, 1 reply; 79+ messages in thread
From: Ferruh Yigit @ 2017-10-09 20:17 UTC (permalink / raw)
To: Jasvinder Singh, dev; +Cc: cristian.dumitrescu, thomas, wenzhuo.lu
On 10/9/2017 1:58 PM, Jasvinder Singh wrote:
> This commit extends the testpmd application with new forwarding engine
> that demonstrates the use of ethdev traffic management APIs and softnic
> PMD for QoS traffic management.
>
> In this mode, 5-level hierarchical tree of the QoS scheduler is built
> with the help of ethdev TM APIs such as shaper profile add/delete,
> shared shaper add/update, node add/delete, hierarchy commit, etc.
> The hierarchical tree has the following nodes: root node (x1, level 0),
> subport node (x1, level 1), pipe node (x4096, level 2),
> tc node (x16384, level 3), queue node (x65536, level 4).
>
> During runtime, each received packet is first classified by mapping the
> packet field information to a 5-tuple (HQoS subport, pipe, traffic class,
> queue within traffic class, and color), which is stored in the packet mbuf
> sched field. After classification, each packet is sent to the softnic port,
> which prioritizes the transmission of the received packets and
> accordingly sends them out on the output interface.
>
> To enable traffic management mode, the following testpmd command is used:
>
> $ ./testpmd -c c -n 4 --vdev
> 'net_softnic0,hard_name=0000:06:00.1,soft_tm=on' -- -i
> --forward-mode=tm
>
> Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
> Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
> Acked-by: Thomas Monjalon <thomas@monjalon.net>
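For reference (not code from this patch): the classification step described in the
commit message above is typically expressed with the librte_sched mbuf helper, which
packs the 5-tuple into the mbuf sched field. A minimal sketch against the 17.11-era
API, with placeholder field values:

    #include <rte_mbuf.h>
    #include <rte_sched.h>

    static inline void
    pkt_classify(struct rte_mbuf *pkt, uint32_t subport, uint32_t pipe,
        uint32_t tc, uint32_t queue)
    {
        /* Store subport/pipe/tc/queue plus color in pkt->hash.sched */
        rte_sched_port_pkt_write(pkt, subport, pipe, tc, queue,
            e_RTE_METER_GREEN);
    }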
<...>
> +static int
> +softport_tm_subport_node_add(portid_t port_id, struct tm_hierarchy *h,
> + struct rte_tm_error *error)
> +{
> + uint32_t subport_parent_node_id, subport_node_id;
> + struct rte_tm_node_params snp;
> + struct rte_tm_shaper_params ssp;
> + uint32_t priority, weight, level_id, shaper_profile_id;
> + uint32_t i;
> +
> + memset(&ssp, 0, sizeof(struct rte_tm_shaper_params));
> + memset(&snp, 0, sizeof(struct rte_tm_node_params));
> +
> + shaper_profile_id = h->n_shapers;
> +
> + /* Add Shaper Profile to TM Hierarchy */
> + for (i = 0; i < SUBPORT_NODES_PER_PORT; i++) {
> + ssp.peak.rate = h->subport_node_shaper_rate;
> + ssp.peak.size = TOKEN_BUCKET_SIZE;
> + ssp.pkt_length_adjust = RTE_TM_ETH_FRAMING_OVERHEAD_FCS;
> +
> + if (rte_tm_shaper_profile_add(port_id, shaper_profile_id,
> + &ssp, error)) {
> + printf("%s ERROR(%d)-%s!(shaper_id %u)\n ",
> + __func__, error->type, error->message,
> + shaper_profile_id);
> + return -1;
> + }
> +
> + /* Node Parameters */
> + h->subport_node_id[i] = SUBPORT_NODES_START_ID + i;
> + subport_parent_node_id = h->root_node_id;
> + weight = 1;
> + priority = 0;
> + level_id = TM_NODE_LEVEL_SUBPORT;
> + snp.shaper_profile_id = shaper_profile_id;
> + snp.nonleaf.n_sp_priorities = 1;
> + snp.stats_mask = STATS_MASK_DEFAULT;
> +
> + /* Add Node to TM Hiearchy */
> + if (rte_tm_node_add(port_id,
> + h->subport_node_id[i],
> + subport_parent_node_id,
> + priority, weight,
> + level_id,
> + &snp,
> + error)) {
> + printf("%s ERROR(%d)-%s!(node %u,parent %u,level %u)\n",
> + __func__,
> + error->type,
> + error->message,
> + h->subport_node_id[i],
> + subport_parent_node_id,
> + level_id);
> + return -1;
> + }
> + shaper_profile_id++;
> + subport_node_id++;
This is causing following build error:
.../dpdk/app/test-pmd/tm.c:462:3: error: variable 'subport_node_id' is
uninitialized when used here [-Werror,-Wuninitialized]
subport_node_id++;
^~~~~~~~~~~~~~~
.../dpdk/app/test-pmd/tm.c:409:50: note: initialize the variable
'subport_node_id' to silence this warning
uint32_t subport_parent_node_id, subport_node_id;
^
= 0
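For context, a minimal sketch of the kind of change this warning calls for, assuming
the incremented value is never actually read (the per-subport node IDs are already
tracked in h->subport_node_id[]); this is not necessarily the exact fix that went
into the next revision:

    /* Option 1: drop the write-only variable and delete the
     * "subport_node_id++;" statement altogether.
     */
    uint32_t subport_parent_node_id;

    /* Option 2: initialize it to silence -Wuninitialized, as the
     * compiler note suggests.
     */
    uint32_t subport_parent_node_id, subport_node_id = 0;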
^ permalink raw reply [flat|nested] 79+ messages in thread
* Re: [dpdk-dev] [PATCH v7 1/5] net/softnic: add softnic PMD
2017-10-09 12:58 ` [dpdk-dev] [PATCH v7 1/5] net/softnic: add softnic PMD Jasvinder Singh
@ 2017-10-09 20:18 ` Ferruh Yigit
2017-10-10 10:08 ` Singh, Jasvinder
2017-10-10 10:18 ` [dpdk-dev] [PATCH v8 0/5] net/softnic: sw fall-back pmd for traffic mgmt and others Jasvinder Singh
1 sibling, 1 reply; 79+ messages in thread
From: Ferruh Yigit @ 2017-10-09 20:18 UTC (permalink / raw)
To: Jasvinder Singh, dev; +Cc: cristian.dumitrescu, thomas, wenzhuo.lu
On 10/9/2017 1:58 PM, Jasvinder Singh wrote:
> Add SoftNIC PMD to provide SW fall-back for ethdev APIs.
>
> Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
> Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
>
> Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
> Acked-by: Thomas Monjalon <thomas@monjalon.net>
<...>
> --- a/doc/guides/rel_notes/release_17_11.rst
> +++ b/doc/guides/rel_notes/release_17_11.rst
> @@ -81,6 +81,7 @@ New Features
> See the :ref:`Membership Library <Member_Library>` documentation in
> the Programmers Guide document, for more information.
>
> +<<<<<<< a3dabd30369bd73017fa367725fb1b455074a2ca
> * **Added the Generic Segmentation Offload Library.**
>
> Added the Generic Segmentation Offload (GSO) library to enable
> @@ -96,6 +97,12 @@ New Features
> The GSO library doesn't check if the input packets have correct
> checksums, and doesn't update checksums for output packets.
> Additionally, the GSO library doesn't process IP fragmented packets.
> +=======
> +* **Added SoftNIC PMD.**
> +
> + Added new SoftNIC PMD. This virtual device offers applications a software
> + fallback support for traffic management.
> +>>>>>>> net/softnic: add softnic PMD
This looks like merge artifact, can you please fix this?
^ permalink raw reply [flat|nested] 79+ messages in thread
* Re: [dpdk-dev] [PATCH v7 5/5] app/testpmd: add traffic management forwarding mode
2017-10-09 20:17 ` Ferruh Yigit
@ 2017-10-10 10:07 ` Singh, Jasvinder
0 siblings, 0 replies; 79+ messages in thread
From: Singh, Jasvinder @ 2017-10-10 10:07 UTC (permalink / raw)
To: Yigit, Ferruh, dev; +Cc: Dumitrescu, Cristian, thomas, Lu, Wenzhuo
<snip>
>
> > +static int
> > +softport_tm_subport_node_add(portid_t port_id, struct tm_hierarchy
> *h,
> > + struct rte_tm_error *error)
> > +{
> > + uint32_t subport_parent_node_id, subport_node_id;
> > + struct rte_tm_node_params snp;
> > + struct rte_tm_shaper_params ssp;
> > + uint32_t priority, weight, level_id, shaper_profile_id;
> > + uint32_t i;
> > +
> > + memset(&ssp, 0, sizeof(struct rte_tm_shaper_params));
> > + memset(&snp, 0, sizeof(struct rte_tm_node_params));
> > +
> > + shaper_profile_id = h->n_shapers;
> > +
> > + /* Add Shaper Profile to TM Hierarchy */
> > + for (i = 0; i < SUBPORT_NODES_PER_PORT; i++) {
> > + ssp.peak.rate = h->subport_node_shaper_rate;
> > + ssp.peak.size = TOKEN_BUCKET_SIZE;
> > + ssp.pkt_length_adjust =
> RTE_TM_ETH_FRAMING_OVERHEAD_FCS;
> > +
> > + if (rte_tm_shaper_profile_add(port_id, shaper_profile_id,
> > + &ssp, error)) {
> > + printf("%s ERROR(%d)-%s!(shaper_id %u)\n ",
> > + __func__, error->type, error->message,
> > + shaper_profile_id);
> > + return -1;
> > + }
> > +
> > + /* Node Parameters */
> > + h->subport_node_id[i] = SUBPORT_NODES_START_ID + i;
> > + subport_parent_node_id = h->root_node_id;
> > + weight = 1;
> > + priority = 0;
> > + level_id = TM_NODE_LEVEL_SUBPORT;
> > + snp.shaper_profile_id = shaper_profile_id;
> > + snp.nonleaf.n_sp_priorities = 1;
> > + snp.stats_mask = STATS_MASK_DEFAULT;
> > +
> > + /* Add Node to TM Hiearchy */
> > + if (rte_tm_node_add(port_id,
> > + h->subport_node_id[i],
> > + subport_parent_node_id,
> > + priority, weight,
> > + level_id,
> > + &snp,
> > + error)) {
> > + printf("%s ERROR(%d)-%s!(node %u,parent %u,level
> %u)\n",
> > + __func__,
> > + error->type,
> > + error->message,
> > + h->subport_node_id[i],
> > + subport_parent_node_id,
> > + level_id);
> > + return -1;
> > + }
> > + shaper_profile_id++;
> > + subport_node_id++;
>
> This is causing following build error:
>
> .../dpdk/app/test-pmd/tm.c:462:3: error: variable 'subport_node_id' is
> uninitialized when used here [-Werror,-Wuninitialized]
> subport_node_id++;
> ^~~~~~~~~~~~~~~
> .../dpdk/app/test-pmd/tm.c:409:50: note: initialize the variable
> 'subport_node_id' to silence this warning
> uint32_t subport_parent_node_id, subport_node_id;
> ^
> = 0
Fixed in the next version. Thanks.
^ permalink raw reply [flat|nested] 79+ messages in thread
* Re: [dpdk-dev] [PATCH v7 1/5] net/softnic: add softnic PMD
2017-10-09 20:18 ` Ferruh Yigit
@ 2017-10-10 10:08 ` Singh, Jasvinder
0 siblings, 0 replies; 79+ messages in thread
From: Singh, Jasvinder @ 2017-10-10 10:08 UTC (permalink / raw)
To: Yigit, Ferruh, dev; +Cc: Dumitrescu, Cristian, thomas, Lu, Wenzhuo
> <...>
>
> > --- a/doc/guides/rel_notes/release_17_11.rst
> > +++ b/doc/guides/rel_notes/release_17_11.rst
> > @@ -81,6 +81,7 @@ New Features
> > See the :ref:`Membership Library <Member_Library>` documentation in
> > the Programmers Guide document, for more information.
> >
> > +<<<<<<< a3dabd30369bd73017fa367725fb1b455074a2ca
> > * **Added the Generic Segmentation Offload Library.**
> >
> > Added the Generic Segmentation Offload (GSO) library to enable @@
> > -96,6 +97,12 @@ New Features
> > The GSO library doesn't check if the input packets have correct
> > checksums, and doesn't update checksums for output packets.
> > Additionally, the GSO library doesn't process IP fragmented packets.
> > +=======
> > +* **Added SoftNIC PMD.**
> > +
> > + Added new SoftNIC PMD. This virtual device offers applications a
> > + software fallback support for traffic management.
> > +>>>>>>> net/softnic: add softnic PMD
>
> This looks like merge artifact, can you please fix this?
Fixed in the next version. Thanks.
^ permalink raw reply [flat|nested] 79+ messages in thread
* [dpdk-dev] [PATCH v8 0/5] net/softnic: sw fall-back pmd for traffic mgmt and others
2017-10-09 12:58 ` [dpdk-dev] [PATCH v7 1/5] net/softnic: add softnic PMD Jasvinder Singh
2017-10-09 20:18 ` Ferruh Yigit
@ 2017-10-10 10:18 ` Jasvinder Singh
2017-10-10 10:18 ` [dpdk-dev] [PATCH v8 1/5] net/softnic: add softnic PMD Jasvinder Singh
` (5 more replies)
1 sibling, 6 replies; 79+ messages in thread
From: Jasvinder Singh @ 2017-10-10 10:18 UTC (permalink / raw)
To: dev; +Cc: cristian.dumitrescu, ferruh.yigit, thomas, wenzhuo.lu
The SoftNIC PMD is intended to provide SW fall-back options for specific
ethdev APIs in a generic way for NICs that do not support those features.
Currently, the only implemented ethdev API is Traffic Management (TM),
but other ethdev APIs such as rte_flow, traffic metering & policing, etc.
can easily be implemented.
Overview:
* Generic: The SoftNIC PMD works with any "hard" PMD that implements the
ethdev API. It does not change the "hard" PMD in any way.
* Creation: For any given "hard" ethdev port, the user can decide to
create an associated "soft" ethdev port to drive the "hard" port. The
"soft" port is a virtual device that can be created at app start-up
through EAL vdev arg or later through the virtual device API.
* Configuration: The app explicitly decides which features are to be
enabled on the "soft" port and which features are still to be used from
the "hard" port. The app continues to explicitly configure both the
"hard" and the "soft" ports after the creation of the "soft" port.
* RX/TX: The app reads packets from/writes packets to the "soft" port
instead of the "hard" port. The RX and TX queues of the "soft" port are
thread safe, as with any ethdev.
* Execution: The "soft" port is a feature-rich NIC implemented by the CPU,
so the run function of the "soft" port has to be executed by the CPU in
order to get packets moving between "hard" port and the app.
* Meets the NFV vision: The app should be (almost) agnostic about the NIC
implementation (different vendors/models, HW-SW mix); the app should not
require changes to use different NICs and should use the same API
for all of them. If a NIC does not implement a specific feature, the HW
should be augmented with SW that provides the missing functionality while
still preserving the same API.
Traffic Management SW fall-back overview:
* Implements the ethdev traffic management API (rte_tm.h).
* Based on the existing librte_sched DPDK library.
Example: Create "soft" port for "hard" port "0000:04:00.1", enable the TM
feature with default settings:
--vdev 'net_softnic0,hard_name=0000:04:00.1,soft_tm=on'
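For illustration only (not part of this patch set): a minimal sketch of how an
application might create the "soft" port at run time instead of via the EAL arg, and
drive it from one of its cores, assuming the vdev bus API (rte_vdev_init()) and the
rte_pmd_softnic_run() helper added by this series; device names and devargs values
are placeholders.

    #include <rte_ethdev.h>
    #include <rte_bus_vdev.h>      /* rte_vdev_init(); <rte_vdev.h> on older releases */
    #include <rte_eth_softnic.h>   /* rte_pmd_softnic_run() */

    /* Create the soft port programmatically (equivalent to the EAL --vdev arg). */
    static int
    soft_port_create(uint16_t *soft_port_id)
    {
        int ret = rte_vdev_init("net_softnic0",
            "hard_name=0000:04:00.1,soft_tm=on");
        if (ret)
            return ret;

        return rte_eth_dev_get_port_by_name("net_softnic0", soft_port_id);
    }

    /* The soft port is implemented by the CPU, so one application lcore must
     * call its run function periodically to keep packets moving between the
     * hard port and the app.
     */
    static int
    soft_port_run_loop(void *arg)
    {
        uint16_t soft_port_id = *(uint16_t *)arg;

        for ( ; ; )
            rte_pmd_softnic_run(soft_port_id);

        return 0;
    }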
Q1: Why generic name, if only TM is supported (for now)?
A1: The intention is to have SoftNIC PMD implement many other (all?)
ethdev APIs under a single "ideal" ethdev, hence the generic name.
The initial motivation is TM API, but the mechanism is generic and can
be used for many other ethdev APIs. Somebody looking to provide SW
fall-back for another ethdev API is likely to end up inventing the same,
hence it would be good to consolidate all under a single PMD and have
the user explicitly enable/disable the features it needs for each
"soft" device.
Q2: Are there any performance requirements for SoftNIC?
A2: Yes, performance should be great/decent for every feature, otherwise
the SW fall-back is unusable, thus useless.
Q3: Why not change the "hard" device (and keep a single device) instead of
creating a new "soft" device (and thus having two devices)?
A3: This is not possible with the current librte_ether ethdev
implementation. The ethdev->dev_ops are defined as a constant structure,
so it cannot be changed per device (nor per PMD). The new ops also
need memory space to store their context data structures, which
requires updating the ethdev->data->dev_private of the existing
device; at best, maybe a resize of ethdev->data->dev_private could be
done, assuming that librte_ether will introduce a way to find out its
size, but this cannot be done while device is running. Other side
effects might exist, as the changes are very intrusive, plus it likely
needs more changes in librte_ether.
Q4: Why not call the SW fall-back dev_ops directly in librte_ether for
devices which do not support the specific feature? If the device
supports the capability, let's call its dev_ops, otherwise call the
SW fall-back dev_ops.
A4: First, similar reasons to Q&A3. This fixes the need to change
ethdev->dev_ops of the device, but it does not do anything to fix the
other significant issue of where to store the context data structures
needed by the SW fall-back functions (which, in this approach, are
called implicitly by librte_ether).
Second, the SW fall-back options should not be restricted arbitrarily
by the librte_ether library, the decision should belong to the app.
For example, the TM SW fall-back should not be limited to only
librte_sched, which (like any SW fall-back) is limited to a specific
hierarchy and feature set, it cannot do any possible hierarchy. If
alternatives exist, the one to use should be picked by the app, not by
the ethdev layer.
Q5: Why is the app required to continue to configure both the "hard" and
the "soft" devices even after the "soft" device has been created? Why
not hiding the "hard" device under the "soft" device and have the
"soft" device configure the "hard" device under the hood?
A5: This was the approach tried in the V2 of this patch set (overlay
"soft" device taking over the configuration of the underlay "hard"
device) and eventually dropped due to increased complexity of having
to keep the configuration of two distinct devices in sync with a
librte_ether implementation that is not friendly towards such an
approach. Basically, each ethdev API call for the overlay device
needs to configure the overlay device, invoke the same configuration
with possibly modified parameters for the underlay device, then resume
the configuration of overlay device, turning this into a device
emulation project.
V2 minuses: increased complexity (deal with two devices at same time);
need to implement every ethdev API, even those not needed for the scope
of SW fall-back; intrusive; sometimes have to silently take decisions
that should be left to the app.
V3 pluses: lower complexity (only one device); only need to implement
those APIs that are in scope of the SW fall-back; non-intrusive (deal
with "hard" device through ethdev API); app decisions taken by the app
in an explicit way.
Q6: Why expose the SW fall-back in a PMD and not in a SW library?
A6: The SW fall-back for an ethdev API has to implement that specific
ethdev API, (hence expose an ethdev object through a PMD), as opposed
to providing a different API. This approach allows the app to use the
same API (NFV vision). For example, we already have a library for TM
SW fall-back (librte_sched) that can be called directly by the apps
that need to call it outside of ethdev context (use-cases exist), but
an app that works with TM-aware NICs through the ethdev TM API would
have to be changed significantly in order to work with different
TM-agnostic NICs through the librte_sched API.
Q7: Why have all the SW fall-backs in a single PMD? Why not develop
the SW fall-back for each different ethdev API in a separate PMD, then
create a chain of "soft" devices for each "hard" device? Potentially,
this results in smaller size PMDs that are easier to maintain.
A7: Arguments for single ethdev/PMD and against chain of ethdevs/PMDs:
1. All the existing PMDs for HW NICs implement a lot of features under
the same PMD, so there is no reason for single PMD approach to break
code modularity. See the V3 code, a lot of care has been taken for
code modularity.
2. We should avoid the proliferation of SW PMDs.
3. A single device should be handled by a single PMD.
4. People are used to feature-rich PMDs, not to single-feature
PMDs, so why change the mindset?
5. [Configuration nightmare] A chain of "soft" devices attached to
single "hard" device requires the app to be aware that the N "soft"
devices in the chain plus the "hard" device refer to the same HW
device, and which device should be invoked to configure which
feature. Also the length of the chain and functionality of each
link is different for each HW device. This breaks the requirement
of preserving the same API while working with different NICs (NFV).
This most likely results in a configuration nightmare, nobody is
going to seriously use this.
6. [Feature inter-dependency] Sometimes different features need to be
configured and executed together (e.g. share the same set of
resources, are inter-dependent, etc), so it is better and more
performant to do them in the same ethdev/PMD.
7. [Code duplication] There is a lot of duplication in the
configuration code for the chain of ethdevs approach. The ethdev
dev_configure, rx_queue_setup, tx_queue_setup API functions have to
be implemented per device, and they become meaningless/inconsistent
with the chain approach.
8. [Data structure duplication] The per device data structures have to
be duplicated and read repeatedly for each "soft" ethdev. The
ethdev device, dev_private, data, per RX/TX queue data structures
have to be replicated per "soft" device. They have to be re-read for
each stage, so the same cache misses are now multiplied with the
number of stages in the chain.
9. [rte_ring proliferation] Thread safety requirements for ethdev
RX/TX queues require an rte_ring to be used for every RX/TX queue
of each "soft" ethdev. This rte_ring proliferation unnecessarily
increases the memory footprint and lowers performance, especially
when each "soft" ethdev ends up on a different CPU core (ping-pong
of cache lines).
10. [Meta-data proliferation] A chain of ethdevs is likely to result
in proliferation of meta-data that has to be passed between the
ethdevs (e.g. policing needs the output of flow classification),
which results in more cache line ping-pong between cores, hence
performance drops.
Cristian Dumitrescu (4):
Jasvinder Singh (4):
net/softnic: add softnic PMD
net/softnic: add traffic management support
net/softnic: add TM capabilities ops
net/softnic: add TM hierarchy related ops
Jasvinder Singh (1):
app/testpmd: add traffic management forwarding mode
MAINTAINERS | 5 +
app/test-pmd/Makefile | 12 +
app/test-pmd/cmdline.c | 88 +
app/test-pmd/testpmd.c | 15 +
app/test-pmd/testpmd.h | 46 +
app/test-pmd/tm.c | 865 +++++
config/common_base | 5 +
doc/api/doxy-api-index.md | 3 +-
doc/api/doxy-api.conf | 1 +
doc/guides/rel_notes/release_17_11.rst | 6 +
drivers/net/Makefile | 5 +
drivers/net/softnic/Makefile | 57 +
drivers/net/softnic/rte_eth_softnic.c | 852 +++++
drivers/net/softnic/rte_eth_softnic.h | 83 +
drivers/net/softnic/rte_eth_softnic_internals.h | 291 ++
drivers/net/softnic/rte_eth_softnic_tm.c | 3452 ++++++++++++++++++++
.../net/softnic/rte_pmd_eth_softnic_version.map | 7 +
mk/rte.app.mk | 5 +-
18 files changed, 5796 insertions(+), 2 deletions(-)
create mode 100644 app/test-pmd/tm.c
create mode 100644 drivers/net/softnic/Makefile
create mode 100644 drivers/net/softnic/rte_eth_softnic.c
create mode 100644 drivers/net/softnic/rte_eth_softnic.h
create mode 100644 drivers/net/softnic/rte_eth_softnic_internals.h
create mode 100644 drivers/net/softnic/rte_eth_softnic_tm.c
create mode 100644 drivers/net/softnic/rte_pmd_eth_softnic_version.map
Series Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Series Acked-by: Thomas Monjalon <thomas@monjalon.net>
--
2.9.3
^ permalink raw reply [flat|nested] 79+ messages in thread
* [dpdk-dev] [PATCH v8 1/5] net/softnic: add softnic PMD
2017-10-10 10:18 ` [dpdk-dev] [PATCH v8 0/5] net/softnic: sw fall-back pmd for traffic mgmt and others Jasvinder Singh
@ 2017-10-10 10:18 ` Jasvinder Singh
2017-10-11 23:18 ` Thomas Monjalon
2017-10-10 10:18 ` [dpdk-dev] [PATCH v8 2/5] net/softnic: add traffic management support Jasvinder Singh
` (4 subsequent siblings)
5 siblings, 1 reply; 79+ messages in thread
From: Jasvinder Singh @ 2017-10-10 10:18 UTC (permalink / raw)
To: dev; +Cc: cristian.dumitrescu, ferruh.yigit, thomas, wenzhuo.lu
Add SoftNIC PMD to provide SW fall-back for ethdev APIs.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
---
v8 changes:
- rebase on dpdk-next-net
- fix merge conflict
v7 changes:
- rebase on dpdk-next-net
- change port_id type to uint16_t
v5 changes:
- change function name rte_pmd_softnic_run_default() to run_default()
v4 changes:
- Implemented feedback from Ferruh [1]
- rename map file to rte_pmd_eth_softnic_version.map
- add release notes library version info
- doxygen: fix hooks in doc/api/doxy-api-index.md
- add doxygen comment for rte_pmd_softnic_run()
- free device name memory
- remove soft_dev param in pmd_ethdev_register()
- fix checkpatch warnings
v3 changes:
- rebase to dpdk17.08 release
v2 changes:
- fix build errors
- rebased to TM APIs v6 plus dpdk master
[1] Feedback from Ferruh on v3: http://dpdk.org/ml/archives/dev/2017-September/074576.html
MAINTAINERS | 5 +
config/common_base | 5 +
doc/api/doxy-api-index.md | 3 +-
doc/api/doxy-api.conf | 1 +
doc/guides/rel_notes/release_17_11.rst | 6 +
drivers/net/Makefile | 5 +
drivers/net/softnic/Makefile | 56 ++
drivers/net/softnic/rte_eth_softnic.c | 591 +++++++++++++++++++++
drivers/net/softnic/rte_eth_softnic.h | 67 +++
drivers/net/softnic/rte_eth_softnic_internals.h | 114 ++++
.../net/softnic/rte_pmd_eth_softnic_version.map | 7 +
mk/rte.app.mk | 5 +-
12 files changed, 863 insertions(+), 2 deletions(-)
create mode 100644 drivers/net/softnic/Makefile
create mode 100644 drivers/net/softnic/rte_eth_softnic.c
create mode 100644 drivers/net/softnic/rte_eth_softnic.h
create mode 100644 drivers/net/softnic/rte_eth_softnic_internals.h
create mode 100644 drivers/net/softnic/rte_pmd_eth_softnic_version.map
diff --git a/MAINTAINERS b/MAINTAINERS
index 84e0464..acab15e 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -536,6 +536,11 @@ M: Gaetan Rivet <gaetan.rivet@6wind.com>
F: drivers/net/failsafe/
F: doc/guides/nics/fail_safe.rst
+Softnic PMD
+M: Jasvinder Singh <jasvinder.singh@intel.com>
+M: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
+F: drivers/net/softnic
+
Crypto Drivers
--------------
diff --git a/config/common_base b/config/common_base
index 4648943..dc4da9a 100644
--- a/config/common_base
+++ b/config/common_base
@@ -279,6 +279,11 @@ CONFIG_RTE_LIBRTE_SFC_EFX_PMD=y
CONFIG_RTE_LIBRTE_SFC_EFX_DEBUG=n
#
+# Compile SOFTNIC PMD
+#
+CONFIG_RTE_LIBRTE_PMD_SOFTNIC=y
+
+#
# Compile software PMD backed by SZEDATA2 device
#
CONFIG_RTE_LIBRTE_PMD_SZEDATA2=n
diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
index e032ae1..950a553 100644
--- a/doc/api/doxy-api-index.md
+++ b/doc/api/doxy-api-index.md
@@ -56,7 +56,8 @@ The public API headers are grouped by topics:
[ixgbe] (@ref rte_pmd_ixgbe.h),
[i40e] (@ref rte_pmd_i40e.h),
[bnxt] (@ref rte_pmd_bnxt.h),
- [crypto_scheduler] (@ref rte_cryptodev_scheduler.h)
+ [crypto_scheduler] (@ref rte_cryptodev_scheduler.h),
+ [softnic] (@ref rte_eth_softnic.h)
- **memory**:
[memseg] (@ref rte_memory.h),
diff --git a/doc/api/doxy-api.conf b/doc/api/doxy-api.conf
index 63fe6cb..1310dc7 100644
--- a/doc/api/doxy-api.conf
+++ b/doc/api/doxy-api.conf
@@ -32,6 +32,7 @@ PROJECT_NAME = DPDK
INPUT = doc/api/doxy-api-index.md \
drivers/crypto/scheduler \
drivers/net/bonding \
+ drivers/net/softnic \
drivers/net/i40e \
drivers/net/ixgbe \
drivers/net/bnxt \
diff --git a/doc/guides/rel_notes/release_17_11.rst b/doc/guides/rel_notes/release_17_11.rst
index 10c240c..c184f1e 100644
--- a/doc/guides/rel_notes/release_17_11.rst
+++ b/doc/guides/rel_notes/release_17_11.rst
@@ -87,6 +87,11 @@ New Features
See the :ref:`Membership Library <Member_Library>` documentation in
the Programmers Guide document, for more information.
+* **Added SoftNIC PMD.**
+
+ Added new SoftNIC PMD. This virtual device offers applications a software
+ fallback support for traffic management.
+
* **Added the Generic Segmentation Offload Library.**
Added the Generic Segmentation Offload (GSO) library to enable
@@ -279,6 +284,7 @@ The libraries prepended with a plus sign were incremented in this version.
librte_pmd_ixgbe.so.2
librte_pmd_ring.so.2
librte_pmd_vhost.so.2
+ + librte_pmd_softnic.so.1
librte_port.so.3
librte_power.so.1
librte_reorder.so.1
diff --git a/drivers/net/Makefile b/drivers/net/Makefile
index aa5edf8..5d2ad2f 100644
--- a/drivers/net/Makefile
+++ b/drivers/net/Makefile
@@ -109,6 +109,11 @@ DIRS-$(CONFIG_RTE_LIBRTE_PMD_KNI) += kni
endif
DEPDIRS-kni = $(core-libs) librte_kni
+ifeq ($(CONFIG_RTE_LIBRTE_SCHED),y)
+DIRS-$(CONFIG_RTE_LIBRTE_PMD_SOFTNIC) += softnic
+endif # $(CONFIG_RTE_LIBRTE_SCHED)
+DEPDIRS-softnic = $(core-libs) librte_sched
+
ifeq ($(CONFIG_RTE_LIBRTE_VHOST),y)
DIRS-$(CONFIG_RTE_LIBRTE_PMD_VHOST) += vhost
endif # $(CONFIG_RTE_LIBRTE_VHOST)
diff --git a/drivers/net/softnic/Makefile b/drivers/net/softnic/Makefile
new file mode 100644
index 0000000..c2f42ef
--- /dev/null
+++ b/drivers/net/softnic/Makefile
@@ -0,0 +1,56 @@
+# BSD LICENSE
+#
+# Copyright(c) 2017 Intel Corporation. All rights reserved.
+# All rights reserved.
+#
+# Redistribution and use in source and binary forms, with or without
+# modification, are permitted provided that the following conditions
+# are met:
+#
+# * Redistributions of source code must retain the above copyright
+# notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+# notice, this list of conditions and the following disclaimer in
+# the documentation and/or other materials provided with the
+# distribution.
+# * Neither the name of Intel Corporation nor the names of its
+# contributors may be used to endorse or promote products derived
+# from this software without specific prior written permission.
+#
+# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+#
+# library name
+#
+LIB = librte_pmd_softnic.a
+
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS)
+
+EXPORT_MAP := rte_pmd_eth_softnic_version.map
+
+LIBABIVER := 1
+
+#
+# all source are stored in SRCS-y
+#
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_SOFTNIC) += rte_eth_softnic.c
+
+#
+# Export include files
+#
+SYMLINK-y-include += rte_eth_softnic.h
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/drivers/net/softnic/rte_eth_softnic.c b/drivers/net/softnic/rte_eth_softnic.c
new file mode 100644
index 0000000..2f92594
--- /dev/null
+++ b/drivers/net/softnic/rte_eth_softnic.c
@@ -0,0 +1,591 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright(c) 2017 Intel Corporation. All rights reserved.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdint.h>
+#include <stdlib.h>
+#include <string.h>
+
+#include <rte_ethdev.h>
+#include <rte_ethdev_vdev.h>
+#include <rte_malloc.h>
+#include <rte_vdev.h>
+#include <rte_kvargs.h>
+#include <rte_errno.h>
+#include <rte_ring.h>
+
+#include "rte_eth_softnic.h"
+#include "rte_eth_softnic_internals.h"
+
+#define DEV_HARD(p) \
+ (&rte_eth_devices[p->hard.port_id])
+
+#define PMD_PARAM_HARD_NAME "hard_name"
+#define PMD_PARAM_HARD_TX_QUEUE_ID "hard_tx_queue_id"
+
+static const char *pmd_valid_args[] = {
+ PMD_PARAM_HARD_NAME,
+ PMD_PARAM_HARD_TX_QUEUE_ID,
+ NULL
+};
+
+static const struct rte_eth_dev_info pmd_dev_info = {
+ .min_rx_bufsize = 0,
+ .max_rx_pktlen = UINT32_MAX,
+ .max_rx_queues = UINT16_MAX,
+ .max_tx_queues = UINT16_MAX,
+ .rx_desc_lim = {
+ .nb_max = UINT16_MAX,
+ .nb_min = 0,
+ .nb_align = 1,
+ },
+ .tx_desc_lim = {
+ .nb_max = UINT16_MAX,
+ .nb_min = 0,
+ .nb_align = 1,
+ },
+};
+
+static void
+pmd_dev_infos_get(struct rte_eth_dev *dev __rte_unused,
+ struct rte_eth_dev_info *dev_info)
+{
+ memcpy(dev_info, &pmd_dev_info, sizeof(*dev_info));
+}
+
+static int
+pmd_dev_configure(struct rte_eth_dev *dev)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct rte_eth_dev *hard_dev = DEV_HARD(p);
+
+ if (dev->data->nb_rx_queues > hard_dev->data->nb_rx_queues)
+ return -1;
+
+ if (p->params.hard.tx_queue_id >= hard_dev->data->nb_tx_queues)
+ return -1;
+
+ return 0;
+}
+
+static int
+pmd_rx_queue_setup(struct rte_eth_dev *dev,
+ uint16_t rx_queue_id,
+ uint16_t nb_rx_desc __rte_unused,
+ unsigned int socket_id,
+ const struct rte_eth_rxconf *rx_conf __rte_unused,
+ struct rte_mempool *mb_pool __rte_unused)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ if (p->params.soft.intrusive == 0) {
+ struct pmd_rx_queue *rxq;
+
+ rxq = rte_zmalloc_socket(p->params.soft.name,
+ sizeof(struct pmd_rx_queue), 0, socket_id);
+ if (rxq == NULL)
+ return -ENOMEM;
+
+ rxq->hard.port_id = p->hard.port_id;
+ rxq->hard.rx_queue_id = rx_queue_id;
+ dev->data->rx_queues[rx_queue_id] = rxq;
+ } else {
+ struct rte_eth_dev *hard_dev = DEV_HARD(p);
+ void *rxq = hard_dev->data->rx_queues[rx_queue_id];
+
+ if (rxq == NULL)
+ return -1;
+
+ dev->data->rx_queues[rx_queue_id] = rxq;
+ }
+ return 0;
+}
+
+static int
+pmd_tx_queue_setup(struct rte_eth_dev *dev,
+ uint16_t tx_queue_id,
+ uint16_t nb_tx_desc,
+ unsigned int socket_id,
+ const struct rte_eth_txconf *tx_conf __rte_unused)
+{
+ uint32_t size = RTE_ETH_NAME_MAX_LEN + strlen("_txq") + 4;
+ char name[size];
+ struct rte_ring *r;
+
+ snprintf(name, sizeof(name), "%s_txq%04x",
+ dev->data->name, tx_queue_id);
+ r = rte_ring_create(name, nb_tx_desc, socket_id,
+ RING_F_SP_ENQ | RING_F_SC_DEQ);
+ if (r == NULL)
+ return -1;
+
+ dev->data->tx_queues[tx_queue_id] = r;
+ return 0;
+}
+
+static int
+pmd_dev_start(struct rte_eth_dev *dev)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ dev->data->dev_link.link_status = ETH_LINK_UP;
+
+ if (p->params.soft.intrusive) {
+ struct rte_eth_dev *hard_dev = DEV_HARD(p);
+
+ /* The hard_dev->rx_pkt_burst should be stable by now */
+ dev->rx_pkt_burst = hard_dev->rx_pkt_burst;
+ }
+
+ return 0;
+}
+
+static void
+pmd_dev_stop(struct rte_eth_dev *dev)
+{
+ dev->data->dev_link.link_status = ETH_LINK_DOWN;
+}
+
+static void
+pmd_dev_close(struct rte_eth_dev *dev)
+{
+ uint32_t i;
+
+ /* TX queues */
+ for (i = 0; i < dev->data->nb_tx_queues; i++)
+ rte_ring_free((struct rte_ring *)dev->data->tx_queues[i]);
+}
+
+static int
+pmd_link_update(struct rte_eth_dev *dev __rte_unused,
+ int wait_to_complete __rte_unused)
+{
+ return 0;
+}
+
+static const struct eth_dev_ops pmd_ops = {
+ .dev_configure = pmd_dev_configure,
+ .dev_start = pmd_dev_start,
+ .dev_stop = pmd_dev_stop,
+ .dev_close = pmd_dev_close,
+ .link_update = pmd_link_update,
+ .dev_infos_get = pmd_dev_infos_get,
+ .rx_queue_setup = pmd_rx_queue_setup,
+ .tx_queue_setup = pmd_tx_queue_setup,
+ .tm_ops_get = NULL,
+};
+
+static uint16_t
+pmd_rx_pkt_burst(void *rxq,
+ struct rte_mbuf **rx_pkts,
+ uint16_t nb_pkts)
+{
+ struct pmd_rx_queue *rx_queue = rxq;
+
+ return rte_eth_rx_burst(rx_queue->hard.port_id,
+ rx_queue->hard.rx_queue_id,
+ rx_pkts,
+ nb_pkts);
+}
+
+static uint16_t
+pmd_tx_pkt_burst(void *txq,
+ struct rte_mbuf **tx_pkts,
+ uint16_t nb_pkts)
+{
+ return (uint16_t)rte_ring_enqueue_burst(txq,
+ (void **)tx_pkts,
+ nb_pkts,
+ NULL);
+}
+
+static __rte_always_inline int
+run_default(struct rte_eth_dev *dev)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ /* Persistent context: Read Only (update not required) */
+ struct rte_mbuf **pkts = p->soft.def.pkts;
+ uint16_t nb_tx_queues = dev->data->nb_tx_queues;
+
+ /* Persistent context: Read - Write (update required) */
+ uint32_t txq_pos = p->soft.def.txq_pos;
+ uint32_t pkts_len = p->soft.def.pkts_len;
+ uint32_t flush_count = p->soft.def.flush_count;
+
+ /* Not part of the persistent context */
+ uint32_t pos;
+ uint16_t i;
+
+ /* Soft device TXQ read, Hard device TXQ write */
+ for (i = 0; i < nb_tx_queues; i++) {
+ struct rte_ring *txq = dev->data->tx_queues[txq_pos];
+
+ /* Read soft device TXQ burst to packet enqueue buffer */
+ pkts_len += rte_ring_sc_dequeue_burst(txq,
+ (void **)&pkts[pkts_len],
+ DEFAULT_BURST_SIZE,
+ NULL);
+
+ /* Increment soft device TXQ */
+ txq_pos++;
+ if (txq_pos >= nb_tx_queues)
+ txq_pos = 0;
+
+ /* Hard device TXQ write when complete burst is available */
+ if (pkts_len >= DEFAULT_BURST_SIZE) {
+ for (pos = 0; pos < pkts_len; )
+ pos += rte_eth_tx_burst(p->hard.port_id,
+ p->params.hard.tx_queue_id,
+ &pkts[pos],
+ (uint16_t)(pkts_len - pos));
+
+ pkts_len = 0;
+ flush_count = 0;
+ break;
+ }
+ }
+
+ if (flush_count >= FLUSH_COUNT_THRESHOLD) {
+ for (pos = 0; pos < pkts_len; )
+ pos += rte_eth_tx_burst(p->hard.port_id,
+ p->params.hard.tx_queue_id,
+ &pkts[pos],
+ (uint16_t)(pkts_len - pos));
+
+ pkts_len = 0;
+ flush_count = 0;
+ }
+
+ p->soft.def.txq_pos = txq_pos;
+ p->soft.def.pkts_len = pkts_len;
+ p->soft.def.flush_count = flush_count + 1;
+
+ return 0;
+}
+
+int
+rte_pmd_softnic_run(uint16_t port_id)
+{
+ struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+
+#ifdef RTE_LIBRTE_ETHDEV_DEBUG
+ RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, 0);
+#endif
+
+ return run_default(dev);
+}
+
+static struct ether_addr eth_addr = { .addr_bytes = {0} };
+
+static uint32_t
+eth_dev_speed_max_mbps(uint32_t speed_capa)
+{
+ uint32_t rate_mbps[32] = {
+ ETH_SPEED_NUM_NONE,
+ ETH_SPEED_NUM_10M,
+ ETH_SPEED_NUM_10M,
+ ETH_SPEED_NUM_100M,
+ ETH_SPEED_NUM_100M,
+ ETH_SPEED_NUM_1G,
+ ETH_SPEED_NUM_2_5G,
+ ETH_SPEED_NUM_5G,
+ ETH_SPEED_NUM_10G,
+ ETH_SPEED_NUM_20G,
+ ETH_SPEED_NUM_25G,
+ ETH_SPEED_NUM_40G,
+ ETH_SPEED_NUM_50G,
+ ETH_SPEED_NUM_56G,
+ ETH_SPEED_NUM_100G,
+ };
+
+ uint32_t pos = (speed_capa) ? (31 - __builtin_clz(speed_capa)) : 0;
+ return rate_mbps[pos];
+}
+
+static int
+default_init(struct pmd_internals *p,
+ struct pmd_params *params,
+ int numa_node)
+{
+ p->soft.def.pkts = rte_zmalloc_socket(params->soft.name,
+ 2 * DEFAULT_BURST_SIZE * sizeof(struct rte_mbuf *),
+ 0,
+ numa_node);
+
+ if (p->soft.def.pkts == NULL)
+ return -ENOMEM;
+
+ return 0;
+}
+
+static void
+default_free(struct pmd_internals *p)
+{
+ rte_free(p->soft.def.pkts);
+}
+
+static void *
+pmd_init(struct pmd_params *params, int numa_node)
+{
+ struct pmd_internals *p;
+ int status;
+
+ p = rte_zmalloc_socket(params->soft.name,
+ sizeof(struct pmd_internals),
+ 0,
+ numa_node);
+ if (p == NULL)
+ return NULL;
+
+ memcpy(&p->params, params, sizeof(p->params));
+ rte_eth_dev_get_port_by_name(params->hard.name, &p->hard.port_id);
+
+ /* Default */
+ status = default_init(p, params, numa_node);
+ if (status) {
+ free(p->params.hard.name);
+ rte_free(p);
+ return NULL;
+ }
+
+ return p;
+}
+
+static void
+pmd_free(struct pmd_internals *p)
+{
+ default_free(p);
+
+ free(p->params.hard.name);
+ rte_free(p);
+}
+
+static int
+pmd_ethdev_register(struct rte_vdev_device *vdev,
+ struct pmd_params *params,
+ void *dev_private)
+{
+ struct rte_eth_dev_info hard_info;
+ struct rte_eth_dev *soft_dev;
+ uint32_t hard_speed;
+ int numa_node;
+ uint16_t hard_port_id;
+
+ rte_eth_dev_get_port_by_name(params->hard.name, &hard_port_id);
+ rte_eth_dev_info_get(hard_port_id, &hard_info);
+ hard_speed = eth_dev_speed_max_mbps(hard_info.speed_capa);
+ numa_node = rte_eth_dev_socket_id(hard_port_id);
+
+ /* Ethdev entry allocation */
+ soft_dev = rte_eth_dev_allocate(params->soft.name);
+ if (!soft_dev)
+ return -ENOMEM;
+
+ /* dev */
+ soft_dev->rx_pkt_burst = (params->soft.intrusive) ?
+ NULL : /* set up later */
+ pmd_rx_pkt_burst;
+ soft_dev->tx_pkt_burst = pmd_tx_pkt_burst;
+ soft_dev->tx_pkt_prepare = NULL;
+ soft_dev->dev_ops = &pmd_ops;
+ soft_dev->device = &vdev->device;
+
+ /* dev->data */
+ soft_dev->data->dev_private = dev_private;
+ soft_dev->data->dev_link.link_speed = hard_speed;
+ soft_dev->data->dev_link.link_duplex = ETH_LINK_FULL_DUPLEX;
+ soft_dev->data->dev_link.link_autoneg = ETH_LINK_SPEED_FIXED;
+ soft_dev->data->dev_link.link_status = ETH_LINK_DOWN;
+ soft_dev->data->mac_addrs = &eth_addr;
+ soft_dev->data->promiscuous = 1;
+ soft_dev->data->kdrv = RTE_KDRV_NONE;
+ soft_dev->data->numa_node = numa_node;
+ soft_dev->data->dev_flags = RTE_ETH_DEV_DETACHABLE;
+
+ return 0;
+}
+
+static int
+get_string(const char *key __rte_unused, const char *value, void *extra_args)
+{
+ if (!value || !extra_args)
+ return -EINVAL;
+
+ *(char **)extra_args = strdup(value);
+
+ if (!*(char **)extra_args)
+ return -ENOMEM;
+
+ return 0;
+}
+
+static int
+get_uint32(const char *key __rte_unused, const char *value, void *extra_args)
+{
+ if (!value || !extra_args)
+ return -EINVAL;
+
+ *(uint32_t *)extra_args = strtoull(value, NULL, 0);
+
+ return 0;
+}
+
+static int
+pmd_parse_args(struct pmd_params *p, const char *name, const char *params)
+{
+ struct rte_kvargs *kvlist;
+ int ret;
+
+ kvlist = rte_kvargs_parse(params, pmd_valid_args);
+ if (kvlist == NULL)
+ return -EINVAL;
+
+ /* Set default values */
+ memset(p, 0, sizeof(*p));
+ p->soft.name = name;
+ p->soft.intrusive = INTRUSIVE;
+ p->hard.tx_queue_id = SOFTNIC_HARD_TX_QUEUE_ID;
+
+ /* HARD: name (mandatory) */
+ if (rte_kvargs_count(kvlist, PMD_PARAM_HARD_NAME) == 1) {
+ ret = rte_kvargs_process(kvlist, PMD_PARAM_HARD_NAME,
+ &get_string, &p->hard.name);
+ if (ret < 0)
+ goto out_free;
+ } else {
+ ret = -EINVAL;
+ goto out_free;
+ }
+
+ /* HARD: tx_queue_id (optional) */
+ if (rte_kvargs_count(kvlist, PMD_PARAM_HARD_TX_QUEUE_ID) == 1) {
+ ret = rte_kvargs_process(kvlist, PMD_PARAM_HARD_TX_QUEUE_ID,
+ &get_uint32, &p->hard.tx_queue_id);
+ if (ret < 0)
+ goto out_free;
+ }
+
+out_free:
+ rte_kvargs_free(kvlist);
+ return ret;
+}
+
+static int
+pmd_probe(struct rte_vdev_device *vdev)
+{
+ struct pmd_params p;
+ const char *params;
+ int status;
+
+ struct rte_eth_dev_info hard_info;
+ uint16_t hard_port_id;
+ int numa_node;
+ void *dev_private;
+
+ RTE_LOG(INFO, PMD,
+ "Probing device \"%s\"\n",
+ rte_vdev_device_name(vdev));
+
+ /* Parse input arguments */
+ params = rte_vdev_device_args(vdev);
+ if (!params)
+ return -EINVAL;
+
+ status = pmd_parse_args(&p, rte_vdev_device_name(vdev), params);
+ if (status)
+ return status;
+
+ /* Check input arguments */
+ if (rte_eth_dev_get_port_by_name(p.hard.name, &hard_port_id))
+ return -EINVAL;
+
+ rte_eth_dev_info_get(hard_port_id, &hard_info);
+ numa_node = rte_eth_dev_socket_id(hard_port_id);
+
+ if (p.hard.tx_queue_id >= hard_info.max_tx_queues)
+ return -EINVAL;
+
+ /* Allocate and initialize soft ethdev private data */
+ dev_private = pmd_init(&p, numa_node);
+ if (dev_private == NULL)
+ return -ENOMEM;
+
+ /* Register soft ethdev */
+ RTE_LOG(INFO, PMD,
+ "Creating soft ethdev \"%s\" for hard ethdev \"%s\"\n",
+ p.soft.name, p.hard.name);
+
+ status = pmd_ethdev_register(vdev, &p, dev_private);
+ if (status) {
+ pmd_free(dev_private);
+ return status;
+ }
+
+ return 0;
+}
+
+static int
+pmd_remove(struct rte_vdev_device *vdev)
+{
+ struct rte_eth_dev *dev = NULL;
+ struct pmd_internals *p;
+
+ if (!vdev)
+ return -EINVAL;
+
+ RTE_LOG(INFO, PMD, "Removing device \"%s\"\n",
+ rte_vdev_device_name(vdev));
+
+ /* Find the ethdev entry */
+ dev = rte_eth_dev_allocated(rte_vdev_device_name(vdev));
+ if (dev == NULL)
+ return -ENODEV;
+ p = dev->data->dev_private;
+
+ /* Free device data structures*/
+ pmd_free(p);
+ rte_free(dev->data);
+ rte_eth_dev_release_port(dev);
+
+ return 0;
+}
+
+static struct rte_vdev_driver pmd_softnic_drv = {
+ .probe = pmd_probe,
+ .remove = pmd_remove,
+};
+
+RTE_PMD_REGISTER_VDEV(net_softnic, pmd_softnic_drv);
+RTE_PMD_REGISTER_PARAM_STRING(net_softnic,
+ PMD_PARAM_HARD_NAME "=<string> "
+ PMD_PARAM_HARD_TX_QUEUE_ID "=<int>");
diff --git a/drivers/net/softnic/rte_eth_softnic.h b/drivers/net/softnic/rte_eth_softnic.h
new file mode 100644
index 0000000..566490a
--- /dev/null
+++ b/drivers/net/softnic/rte_eth_softnic.h
@@ -0,0 +1,67 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright(c) 2017 Intel Corporation. All rights reserved.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef __INCLUDE_RTE_ETH_SOFTNIC_H__
+#define __INCLUDE_RTE_ETH_SOFTNIC_H__
+
+#include <stdint.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#ifndef SOFTNIC_HARD_TX_QUEUE_ID
+#define SOFTNIC_HARD_TX_QUEUE_ID 0
+#endif
+
+/**
+ * Run the traffic management function on the softnic device
+ *
+ * This function reads packets from the softnic input queues, inserts them
+ * into the QoS scheduler queues based on the mbuf sched field value and
+ * transmits the scheduled packets out through the hard device interface.
+ *
+ * @param port_id
+ * Port ID of the soft device.
+ * @return
+ * zero.
+ */
+
+int
+rte_pmd_softnic_run(uint16_t port_id);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* __INCLUDE_RTE_ETH_SOFTNIC_H__ */
diff --git a/drivers/net/softnic/rte_eth_softnic_internals.h b/drivers/net/softnic/rte_eth_softnic_internals.h
new file mode 100644
index 0000000..08a633f
--- /dev/null
+++ b/drivers/net/softnic/rte_eth_softnic_internals.h
@@ -0,0 +1,114 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright(c) 2017 Intel Corporation. All rights reserved.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef __INCLUDE_RTE_ETH_SOFTNIC_INTERNALS_H__
+#define __INCLUDE_RTE_ETH_SOFTNIC_INTERNALS_H__
+
+#include <stdint.h>
+
+#include <rte_mbuf.h>
+#include <rte_ethdev.h>
+
+#include "rte_eth_softnic.h"
+
+#ifndef INTRUSIVE
+#define INTRUSIVE 0
+#endif
+
+struct pmd_params {
+ /** Parameters for the soft device (to be created) */
+ struct {
+ const char *name; /**< Name */
+ uint32_t flags; /**< Flags */
+
+ /** 0 = Access hard device through API only (potentially slower,
+ * but safer);
+ * 1 = Access to hard device private data structures is allowed
+ * (potentially faster).
+ */
+ int intrusive;
+ } soft;
+
+ /** Parameters for the hard device (existing) */
+ struct {
+ char *name; /**< Name */
+ uint16_t tx_queue_id; /**< TX queue ID */
+ } hard;
+};
+
+/**
+ * Default Internals
+ */
+
+#ifndef DEFAULT_BURST_SIZE
+#define DEFAULT_BURST_SIZE 32
+#endif
+
+#ifndef FLUSH_COUNT_THRESHOLD
+#define FLUSH_COUNT_THRESHOLD (1 << 17)
+#endif
+
+struct default_internals {
+ struct rte_mbuf **pkts;
+ uint32_t pkts_len;
+ uint32_t txq_pos;
+ uint32_t flush_count;
+};
+
+/**
+ * PMD Internals
+ */
+struct pmd_internals {
+ /** Params */
+ struct pmd_params params;
+
+ /** Soft device */
+ struct {
+ struct default_internals def; /**< Default */
+ } soft;
+
+ /** Hard device */
+ struct {
+ uint16_t port_id;
+ } hard;
+};
+
+struct pmd_rx_queue {
+ /** Hard device */
+ struct {
+ uint16_t port_id;
+ uint16_t rx_queue_id;
+ } hard;
+};
+
+#endif /* __INCLUDE_RTE_ETH_SOFTNIC_INTERNALS_H__ */
diff --git a/drivers/net/softnic/rte_pmd_eth_softnic_version.map b/drivers/net/softnic/rte_pmd_eth_softnic_version.map
new file mode 100644
index 0000000..fb2cb68
--- /dev/null
+++ b/drivers/net/softnic/rte_pmd_eth_softnic_version.map
@@ -0,0 +1,7 @@
+DPDK_17.11 {
+ global:
+
+ rte_pmd_softnic_run;
+
+ local: *;
+};
diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index cee21ea..0b8f612 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -68,7 +68,6 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_IP_FRAG) += -lrte_ip_frag
_LDLIBS-$(CONFIG_RTE_LIBRTE_GRO) += -lrte_gro
_LDLIBS-$(CONFIG_RTE_LIBRTE_GSO) += -lrte_gso
_LDLIBS-$(CONFIG_RTE_LIBRTE_METER) += -lrte_meter
-_LDLIBS-$(CONFIG_RTE_LIBRTE_SCHED) += -lrte_sched
_LDLIBS-$(CONFIG_RTE_LIBRTE_LPM) += -lrte_lpm
# librte_acl needs --whole-archive because of weak functions
_LDLIBS-$(CONFIG_RTE_LIBRTE_ACL) += --whole-archive
@@ -101,6 +100,7 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_RING) += -lrte_ring
_LDLIBS-$(CONFIG_RTE_LIBRTE_EAL) += -lrte_eal
_LDLIBS-$(CONFIG_RTE_LIBRTE_CMDLINE) += -lrte_cmdline
_LDLIBS-$(CONFIG_RTE_LIBRTE_REORDER) += -lrte_reorder
+_LDLIBS-$(CONFIG_RTE_LIBRTE_SCHED) += -lrte_sched
ifeq ($(CONFIG_RTE_EXEC_ENV_LINUXAPP),y)
_LDLIBS-$(CONFIG_RTE_LIBRTE_KNI) += -lrte_kni
@@ -143,6 +143,9 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_NULL) += -lrte_pmd_null
_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_PCAP) += -lrte_pmd_pcap -lpcap
_LDLIBS-$(CONFIG_RTE_LIBRTE_QEDE_PMD) += -lrte_pmd_qede
_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_RING) += -lrte_pmd_ring
+ifeq ($(CONFIG_RTE_LIBRTE_SCHED),y)
+_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_SOFTNIC) += -lrte_pmd_softnic
+endif
_LDLIBS-$(CONFIG_RTE_LIBRTE_SFC_EFX_PMD) += -lrte_pmd_sfc_efx
_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_SZEDATA2) += -lrte_pmd_szedata2 -lsze2
_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_TAP) += -lrte_pmd_tap
--
2.9.3
^ permalink raw reply [flat|nested] 79+ messages in thread
* [dpdk-dev] [PATCH v8 2/5] net/softnic: add traffic management support
2017-10-10 10:18 ` [dpdk-dev] [PATCH v8 0/5] net/softnic: sw fall-back pmd for traffic mgmt and others Jasvinder Singh
2017-10-10 10:18 ` [dpdk-dev] [PATCH v8 1/5] net/softnic: add softnic PMD Jasvinder Singh
@ 2017-10-10 10:18 ` Jasvinder Singh
2017-10-10 10:18 ` [dpdk-dev] [PATCH v8 3/5] net/softnic: add TM capabilities ops Jasvinder Singh
` (3 subsequent siblings)
5 siblings, 0 replies; 79+ messages in thread
From: Jasvinder Singh @ 2017-10-10 10:18 UTC (permalink / raw)
To: dev; +Cc: cristian.dumitrescu, ferruh.yigit, thomas, wenzhuo.lu
Add ethdev Traffic Management API support to SoftNIC PMD.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Acked-by: Lu, Wenzhuo <wenzhuo.lu@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
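For illustration (not part of the commit message): a hypothetical devargs string
enabling the TM fall-back with the parameters introduced by this patch; the PCI
address and all numeric values are placeholders, and the parameter names are the
ones defined in the code below:

    --vdev 'net_softnic0,hard_name=0000:06:00.1,soft_tm=on,soft_tm_rate=1250000000,soft_tm_nb_queues=65536,soft_tm_qsize0=64,soft_tm_qsize1=64,soft_tm_qsize2=64,soft_tm_qsize3=64,soft_tm_enq_bsz=32,soft_tm_deq_bsz=24'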
---
v7 changes:
- fix checkpatch warning
v5 changes:
- change function name rte_pmd_softnic_run_tm() to run_tm()
v3 changes:
- add more configuration parameters (tm rate, tm queue sizes)
drivers/net/softnic/Makefile | 1 +
drivers/net/softnic/rte_eth_softnic.c | 255 +++++++++++++++++++++++-
drivers/net/softnic/rte_eth_softnic.h | 16 ++
drivers/net/softnic/rte_eth_softnic_internals.h | 104 ++++++++++
drivers/net/softnic/rte_eth_softnic_tm.c | 181 +++++++++++++++++
5 files changed, 555 insertions(+), 2 deletions(-)
create mode 100644 drivers/net/softnic/rte_eth_softnic_tm.c
diff --git a/drivers/net/softnic/Makefile b/drivers/net/softnic/Makefile
index c2f42ef..8b848a9 100644
--- a/drivers/net/softnic/Makefile
+++ b/drivers/net/softnic/Makefile
@@ -47,6 +47,7 @@ LIBABIVER := 1
# all source are stored in SRCS-y
#
SRCS-$(CONFIG_RTE_LIBRTE_PMD_SOFTNIC) += rte_eth_softnic.c
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_SOFTNIC) += rte_eth_softnic_tm.c
#
# Export include files
diff --git a/drivers/net/softnic/rte_eth_softnic.c b/drivers/net/softnic/rte_eth_softnic.c
index 2f92594..2f19159 100644
--- a/drivers/net/softnic/rte_eth_softnic.c
+++ b/drivers/net/softnic/rte_eth_softnic.c
@@ -42,6 +42,7 @@
#include <rte_kvargs.h>
#include <rte_errno.h>
#include <rte_ring.h>
+#include <rte_sched.h>
#include "rte_eth_softnic.h"
#include "rte_eth_softnic_internals.h"
@@ -49,10 +50,29 @@
#define DEV_HARD(p) \
(&rte_eth_devices[p->hard.port_id])
+#define PMD_PARAM_SOFT_TM "soft_tm"
+#define PMD_PARAM_SOFT_TM_RATE "soft_tm_rate"
+#define PMD_PARAM_SOFT_TM_NB_QUEUES "soft_tm_nb_queues"
+#define PMD_PARAM_SOFT_TM_QSIZE0 "soft_tm_qsize0"
+#define PMD_PARAM_SOFT_TM_QSIZE1 "soft_tm_qsize1"
+#define PMD_PARAM_SOFT_TM_QSIZE2 "soft_tm_qsize2"
+#define PMD_PARAM_SOFT_TM_QSIZE3 "soft_tm_qsize3"
+#define PMD_PARAM_SOFT_TM_ENQ_BSZ "soft_tm_enq_bsz"
+#define PMD_PARAM_SOFT_TM_DEQ_BSZ "soft_tm_deq_bsz"
+
#define PMD_PARAM_HARD_NAME "hard_name"
#define PMD_PARAM_HARD_TX_QUEUE_ID "hard_tx_queue_id"
static const char *pmd_valid_args[] = {
+ PMD_PARAM_SOFT_TM,
+ PMD_PARAM_SOFT_TM_RATE,
+ PMD_PARAM_SOFT_TM_NB_QUEUES,
+ PMD_PARAM_SOFT_TM_QSIZE0,
+ PMD_PARAM_SOFT_TM_QSIZE1,
+ PMD_PARAM_SOFT_TM_QSIZE2,
+ PMD_PARAM_SOFT_TM_QSIZE3,
+ PMD_PARAM_SOFT_TM_ENQ_BSZ,
+ PMD_PARAM_SOFT_TM_DEQ_BSZ,
PMD_PARAM_HARD_NAME,
PMD_PARAM_HARD_TX_QUEUE_ID,
NULL
@@ -157,6 +177,13 @@ pmd_dev_start(struct rte_eth_dev *dev)
{
struct pmd_internals *p = dev->data->dev_private;
+ if (tm_used(dev)) {
+ int status = tm_start(p);
+
+ if (status)
+ return status;
+ }
+
dev->data->dev_link.link_status = ETH_LINK_UP;
if (p->params.soft.intrusive) {
@@ -172,7 +199,12 @@ pmd_dev_start(struct rte_eth_dev *dev)
static void
pmd_dev_stop(struct rte_eth_dev *dev)
{
+ struct pmd_internals *p = dev->data->dev_private;
+
dev->data->dev_link.link_status = ETH_LINK_DOWN;
+
+ if (tm_used(dev))
+ tm_stop(p);
}
static void
@@ -293,6 +325,77 @@ run_default(struct rte_eth_dev *dev)
return 0;
}
+static __rte_always_inline int
+run_tm(struct rte_eth_dev *dev)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ /* Persistent context: Read Only (update not required) */
+ struct rte_sched_port *sched = p->soft.tm.sched;
+ struct rte_mbuf **pkts_enq = p->soft.tm.pkts_enq;
+ struct rte_mbuf **pkts_deq = p->soft.tm.pkts_deq;
+ uint32_t enq_bsz = p->params.soft.tm.enq_bsz;
+ uint32_t deq_bsz = p->params.soft.tm.deq_bsz;
+ uint16_t nb_tx_queues = dev->data->nb_tx_queues;
+
+ /* Persistent context: Read - Write (update required) */
+ uint32_t txq_pos = p->soft.tm.txq_pos;
+ uint32_t pkts_enq_len = p->soft.tm.pkts_enq_len;
+ uint32_t flush_count = p->soft.tm.flush_count;
+
+ /* Not part of the persistent context */
+ uint32_t pkts_deq_len, pos;
+ uint16_t i;
+
+ /* Soft device TXQ read, TM enqueue */
+ for (i = 0; i < nb_tx_queues; i++) {
+ struct rte_ring *txq = dev->data->tx_queues[txq_pos];
+
+ /* Read TXQ burst to packet enqueue buffer */
+ pkts_enq_len += rte_ring_sc_dequeue_burst(txq,
+ (void **)&pkts_enq[pkts_enq_len],
+ enq_bsz,
+ NULL);
+
+ /* Increment TXQ */
+ txq_pos++;
+ if (txq_pos >= nb_tx_queues)
+ txq_pos = 0;
+
+ /* TM enqueue when complete burst is available */
+ if (pkts_enq_len >= enq_bsz) {
+ rte_sched_port_enqueue(sched, pkts_enq, pkts_enq_len);
+
+ pkts_enq_len = 0;
+ flush_count = 0;
+ break;
+ }
+ }
+
+ if (flush_count >= FLUSH_COUNT_THRESHOLD) {
+ if (pkts_enq_len)
+ rte_sched_port_enqueue(sched, pkts_enq, pkts_enq_len);
+
+ pkts_enq_len = 0;
+ flush_count = 0;
+ }
+
+ p->soft.tm.txq_pos = txq_pos;
+ p->soft.tm.pkts_enq_len = pkts_enq_len;
+ p->soft.tm.flush_count = flush_count + 1;
+
+ /* TM dequeue, Hard device TXQ write */
+ pkts_deq_len = rte_sched_port_dequeue(sched, pkts_deq, deq_bsz);
+
+ for (pos = 0; pos < pkts_deq_len; )
+ pos += rte_eth_tx_burst(p->hard.port_id,
+ p->params.hard.tx_queue_id,
+ &pkts_deq[pos],
+ (uint16_t)(pkts_deq_len - pos));
+
+ return 0;
+}
+
int
rte_pmd_softnic_run(uint16_t port_id)
{
@@ -302,7 +405,7 @@ rte_pmd_softnic_run(uint16_t port_id)
RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, 0);
#endif
- return run_default(dev);
+ return (tm_used(dev)) ? run_tm(dev) : run_default(dev);
}
static struct ether_addr eth_addr = { .addr_bytes = {0} };
@@ -378,12 +481,26 @@ pmd_init(struct pmd_params *params, int numa_node)
return NULL;
}
+ /* Traffic Management (TM) */
+ if (params->soft.flags & PMD_FEATURE_TM) {
+ status = tm_init(p, params, numa_node);
+ if (status) {
+ default_free(p);
+ free(p->params.hard.name);
+ rte_free(p);
+ return NULL;
+ }
+ }
+
return p;
}
static void
pmd_free(struct pmd_internals *p)
{
+ if (p->params.soft.flags & PMD_FEATURE_TM)
+ tm_free(p);
+
default_free(p);
free(p->params.hard.name);
@@ -464,7 +581,7 @@ static int
pmd_parse_args(struct pmd_params *p, const char *name, const char *params)
{
struct rte_kvargs *kvlist;
- int ret;
+ int i, ret;
kvlist = rte_kvargs_parse(params, pmd_valid_args);
if (kvlist == NULL)
@@ -474,8 +591,124 @@ pmd_parse_args(struct pmd_params *p, const char *name, const char *params)
memset(p, 0, sizeof(*p));
p->soft.name = name;
p->soft.intrusive = INTRUSIVE;
+ p->soft.tm.rate = 0;
+ p->soft.tm.nb_queues = SOFTNIC_SOFT_TM_NB_QUEUES;
+ for (i = 0; i < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; i++)
+ p->soft.tm.qsize[i] = SOFTNIC_SOFT_TM_QUEUE_SIZE;
+ p->soft.tm.enq_bsz = SOFTNIC_SOFT_TM_ENQ_BSZ;
+ p->soft.tm.deq_bsz = SOFTNIC_SOFT_TM_DEQ_BSZ;
p->hard.tx_queue_id = SOFTNIC_HARD_TX_QUEUE_ID;
+ /* SOFT: TM (optional) */
+ if (rte_kvargs_count(kvlist, PMD_PARAM_SOFT_TM) == 1) {
+ char *s;
+
+ ret = rte_kvargs_process(kvlist, PMD_PARAM_SOFT_TM,
+ &get_string, &s);
+ if (ret < 0)
+ goto out_free;
+
+ if (strcmp(s, "on") == 0)
+ p->soft.flags |= PMD_FEATURE_TM;
+ else if (strcmp(s, "off") == 0)
+ p->soft.flags &= ~PMD_FEATURE_TM;
+ else
+ ret = -EINVAL;
+
+ free(s);
+ if (ret)
+ goto out_free;
+ }
+
+ /* SOFT: TM rate (measured in bytes/second) (optional) */
+ if (rte_kvargs_count(kvlist, PMD_PARAM_SOFT_TM_RATE) == 1) {
+ ret = rte_kvargs_process(kvlist, PMD_PARAM_SOFT_TM_RATE,
+ &get_uint32, &p->soft.tm.rate);
+ if (ret < 0)
+ goto out_free;
+
+ p->soft.flags |= PMD_FEATURE_TM;
+ }
+
+ /* SOFT: TM number of queues (optional) */
+ if (rte_kvargs_count(kvlist, PMD_PARAM_SOFT_TM_NB_QUEUES) == 1) {
+ ret = rte_kvargs_process(kvlist, PMD_PARAM_SOFT_TM_NB_QUEUES,
+ &get_uint32, &p->soft.tm.nb_queues);
+ if (ret < 0)
+ goto out_free;
+
+ p->soft.flags |= PMD_FEATURE_TM;
+ }
+
+ /* SOFT: TM queue size 0 .. 3 (optional) */
+ if (rte_kvargs_count(kvlist, PMD_PARAM_SOFT_TM_QSIZE0) == 1) {
+ uint32_t qsize;
+
+ ret = rte_kvargs_process(kvlist, PMD_PARAM_SOFT_TM_QSIZE0,
+ &get_uint32, &qsize);
+ if (ret < 0)
+ goto out_free;
+
+ p->soft.tm.qsize[0] = (uint16_t)qsize;
+ p->soft.flags |= PMD_FEATURE_TM;
+ }
+
+ if (rte_kvargs_count(kvlist, PMD_PARAM_SOFT_TM_QSIZE1) == 1) {
+ uint32_t qsize;
+
+ ret = rte_kvargs_process(kvlist, PMD_PARAM_SOFT_TM_QSIZE1,
+ &get_uint32, &qsize);
+ if (ret < 0)
+ goto out_free;
+
+ p->soft.tm.qsize[1] = (uint16_t)qsize;
+ p->soft.flags |= PMD_FEATURE_TM;
+ }
+
+ if (rte_kvargs_count(kvlist, PMD_PARAM_SOFT_TM_QSIZE2) == 1) {
+ uint32_t qsize;
+
+ ret = rte_kvargs_process(kvlist, PMD_PARAM_SOFT_TM_QSIZE2,
+ &get_uint32, &qsize);
+ if (ret < 0)
+ goto out_free;
+
+ p->soft.tm.qsize[2] = (uint16_t)qsize;
+ p->soft.flags |= PMD_FEATURE_TM;
+ }
+
+ if (rte_kvargs_count(kvlist, PMD_PARAM_SOFT_TM_QSIZE3) == 1) {
+ uint32_t qsize;
+
+ ret = rte_kvargs_process(kvlist, PMD_PARAM_SOFT_TM_QSIZE3,
+ &get_uint32, &qsize);
+ if (ret < 0)
+ goto out_free;
+
+ p->soft.tm.qsize[3] = (uint16_t)qsize;
+ p->soft.flags |= PMD_FEATURE_TM;
+ }
+
+ /* SOFT: TM enqueue burst size (optional) */
+ if (rte_kvargs_count(kvlist, PMD_PARAM_SOFT_TM_ENQ_BSZ) == 1) {
+ ret = rte_kvargs_process(kvlist, PMD_PARAM_SOFT_TM_ENQ_BSZ,
+ &get_uint32, &p->soft.tm.enq_bsz);
+ if (ret < 0)
+ goto out_free;
+
+ p->soft.flags |= PMD_FEATURE_TM;
+ }
+
+ /* SOFT: TM dequeue burst size (optional) */
+ if (rte_kvargs_count(kvlist, PMD_PARAM_SOFT_TM_DEQ_BSZ) == 1) {
+ ret = rte_kvargs_process(kvlist, PMD_PARAM_SOFT_TM_DEQ_BSZ,
+ &get_uint32, &p->soft.tm.deq_bsz);
+ if (ret < 0)
+ goto out_free;
+
+ p->soft.flags |= PMD_FEATURE_TM;
+ }
+
/* HARD: name (mandatory) */
if (rte_kvargs_count(kvlist, PMD_PARAM_HARD_NAME) == 1) {
ret = rte_kvargs_process(kvlist, PMD_PARAM_HARD_NAME,
@@ -508,6 +741,7 @@ pmd_probe(struct rte_vdev_device *vdev)
int status;
struct rte_eth_dev_info hard_info;
+ uint32_t hard_speed;
uint16_t hard_port_id;
int numa_node;
void *dev_private;
@@ -530,11 +764,19 @@ pmd_probe(struct rte_vdev_device *vdev)
return -EINVAL;
rte_eth_dev_info_get(hard_port_id, &hard_info);
+ hard_speed = eth_dev_speed_max_mbps(hard_info.speed_capa);
numa_node = rte_eth_dev_socket_id(hard_port_id);
if (p.hard.tx_queue_id >= hard_info.max_tx_queues)
return -EINVAL;
+ if (p.soft.flags & PMD_FEATURE_TM) {
+ status = tm_params_check(&p, hard_speed);
+
+ if (status)
+ return status;
+ }
+
/* Allocate and initialize soft ethdev private data */
dev_private = pmd_init(&p, numa_node);
if (dev_private == NULL)
@@ -587,5 +829,14 @@ static struct rte_vdev_driver pmd_softnic_drv = {
RTE_PMD_REGISTER_VDEV(net_softnic, pmd_softnic_drv);
RTE_PMD_REGISTER_PARAM_STRING(net_softnic,
+ PMD_PARAM_SOFT_TM "=on|off "
+ PMD_PARAM_SOFT_TM_RATE "=<int> "
+ PMD_PARAM_SOFT_TM_NB_QUEUES "=<int> "
+ PMD_PARAM_SOFT_TM_QSIZE0 "=<int> "
+ PMD_PARAM_SOFT_TM_QSIZE1 "=<int> "
+ PMD_PARAM_SOFT_TM_QSIZE2 "=<int> "
+ PMD_PARAM_SOFT_TM_QSIZE3 "=<int> "
+ PMD_PARAM_SOFT_TM_ENQ_BSZ "=<int> "
+ PMD_PARAM_SOFT_TM_DEQ_BSZ "=<int> "
PMD_PARAM_HARD_NAME "=<string> "
PMD_PARAM_HARD_TX_QUEUE_ID "=<int>");
diff --git a/drivers/net/softnic/rte_eth_softnic.h b/drivers/net/softnic/rte_eth_softnic.h
index 566490a..b49e582 100644
--- a/drivers/net/softnic/rte_eth_softnic.h
+++ b/drivers/net/softnic/rte_eth_softnic.h
@@ -40,6 +40,22 @@
extern "C" {
#endif
+#ifndef SOFTNIC_SOFT_TM_NB_QUEUES
+#define SOFTNIC_SOFT_TM_NB_QUEUES 65536
+#endif
+
+#ifndef SOFTNIC_SOFT_TM_QUEUE_SIZE
+#define SOFTNIC_SOFT_TM_QUEUE_SIZE 64
+#endif
+
+#ifndef SOFTNIC_SOFT_TM_ENQ_BSZ
+#define SOFTNIC_SOFT_TM_ENQ_BSZ 32
+#endif
+
+#ifndef SOFTNIC_SOFT_TM_DEQ_BSZ
+#define SOFTNIC_SOFT_TM_DEQ_BSZ 24
+#endif
+
#ifndef SOFTNIC_HARD_TX_QUEUE_ID
#define SOFTNIC_HARD_TX_QUEUE_ID 0
#endif
diff --git a/drivers/net/softnic/rte_eth_softnic_internals.h b/drivers/net/softnic/rte_eth_softnic_internals.h
index 08a633f..fd9cbbe 100644
--- a/drivers/net/softnic/rte_eth_softnic_internals.h
+++ b/drivers/net/softnic/rte_eth_softnic_internals.h
@@ -37,10 +37,19 @@
#include <stdint.h>
#include <rte_mbuf.h>
+#include <rte_sched.h>
#include <rte_ethdev.h>
#include "rte_eth_softnic.h"
+/**
+ * PMD Parameters
+ */
+
+enum pmd_feature {
+ PMD_FEATURE_TM = 1, /**< Traffic Management (TM) */
+};
+
#ifndef INTRUSIVE
#define INTRUSIVE 0
#endif
@@ -57,6 +66,16 @@ struct pmd_params {
* (potentially faster).
*/
int intrusive;
+
+ /** Traffic Management (TM) */
+ struct {
+ uint32_t rate; /**< Rate (bytes/second) */
+ uint32_t nb_queues; /**< Number of queues */
+ uint16_t qsize[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
+ /**< Queue size per traffic class */
+ uint32_t enq_bsz; /**< Enqueue burst size */
+ uint32_t deq_bsz; /**< Dequeue burst size */
+ } tm;
} soft;
/** Parameters for the hard device (existing) */
@@ -86,6 +105,66 @@ struct default_internals {
};
/**
+ * Traffic Management (TM) Internals
+ */
+
+#ifndef TM_MAX_SUBPORTS
+#define TM_MAX_SUBPORTS 8
+#endif
+
+#ifndef TM_MAX_PIPES_PER_SUBPORT
+#define TM_MAX_PIPES_PER_SUBPORT 4096
+#endif
+
+struct tm_params {
+ struct rte_sched_port_params port_params;
+
+ struct rte_sched_subport_params subport_params[TM_MAX_SUBPORTS];
+
+ struct rte_sched_pipe_params
+ pipe_profiles[RTE_SCHED_PIPE_PROFILES_PER_PORT];
+ uint32_t n_pipe_profiles;
+ uint32_t pipe_to_profile[TM_MAX_SUBPORTS * TM_MAX_PIPES_PER_SUBPORT];
+};
+
+/* TM Levels */
+enum tm_node_level {
+ TM_NODE_LEVEL_PORT = 0,
+ TM_NODE_LEVEL_SUBPORT,
+ TM_NODE_LEVEL_PIPE,
+ TM_NODE_LEVEL_TC,
+ TM_NODE_LEVEL_QUEUE,
+ TM_NODE_LEVEL_MAX,
+};
+
+/* TM Hierarchy Specification */
+struct tm_hierarchy {
+ uint32_t n_tm_nodes[TM_NODE_LEVEL_MAX];
+};
+
+struct tm_internals {
+ /** Hierarchy specification
+ *
+ * -Hierarchy is unfrozen at init and when port is stopped.
+ * -Hierarchy is frozen on successful hierarchy commit.
+ * -Run-time hierarchy changes are not allowed, therefore it makes
+ * sense to keep the hierarchy frozen after the port is started.
+ */
+ struct tm_hierarchy h;
+
+ /** Blueprints */
+ struct tm_params params;
+
+ /** Run-time */
+ struct rte_sched_port *sched;
+ struct rte_mbuf **pkts_enq;
+ struct rte_mbuf **pkts_deq;
+ uint32_t pkts_enq_len;
+ uint32_t txq_pos;
+ uint32_t flush_count;
+};
+
+/**
* PMD Internals
*/
struct pmd_internals {
@@ -95,6 +174,7 @@ struct pmd_internals {
/** Soft device */
struct {
struct default_internals def; /**< Default */
+ struct tm_internals tm; /**< Traffic Management */
} soft;
/** Hard device */
@@ -111,4 +191,28 @@ struct pmd_rx_queue {
} hard;
};
+int
+tm_params_check(struct pmd_params *params, uint32_t hard_rate);
+
+int
+tm_init(struct pmd_internals *p, struct pmd_params *params, int numa_node);
+
+void
+tm_free(struct pmd_internals *p);
+
+int
+tm_start(struct pmd_internals *p);
+
+void
+tm_stop(struct pmd_internals *p);
+
+static inline int
+tm_used(struct rte_eth_dev *dev)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ return (p->params.soft.flags & PMD_FEATURE_TM) &&
+ p->soft.tm.h.n_tm_nodes[TM_NODE_LEVEL_PORT];
+}
+
#endif /* __INCLUDE_RTE_ETH_SOFTNIC_INTERNALS_H__ */
diff --git a/drivers/net/softnic/rte_eth_softnic_tm.c b/drivers/net/softnic/rte_eth_softnic_tm.c
new file mode 100644
index 0000000..165abfe
--- /dev/null
+++ b/drivers/net/softnic/rte_eth_softnic_tm.c
@@ -0,0 +1,181 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright(c) 2017 Intel Corporation. All rights reserved.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdint.h>
+#include <stdlib.h>
+#include <string.h>
+
+#include <rte_malloc.h>
+
+#include "rte_eth_softnic_internals.h"
+#include "rte_eth_softnic.h"
+
+#define BYTES_IN_MBPS (1000 * 1000 / 8)
+
+int
+tm_params_check(struct pmd_params *params, uint32_t hard_rate)
+{
+ uint64_t hard_rate_bytes_per_sec = hard_rate * BYTES_IN_MBPS;
+ uint32_t i;
+
+ /* rate */
+ if (params->soft.tm.rate) {
+ if (params->soft.tm.rate > hard_rate_bytes_per_sec)
+ return -EINVAL;
+ } else {
+ params->soft.tm.rate =
+ (hard_rate_bytes_per_sec > UINT32_MAX) ?
+ UINT32_MAX : hard_rate_bytes_per_sec;
+ }
+
+ /* nb_queues */
+ if (params->soft.tm.nb_queues == 0)
+ return -EINVAL;
+
+ if (params->soft.tm.nb_queues < RTE_SCHED_QUEUES_PER_PIPE)
+ params->soft.tm.nb_queues = RTE_SCHED_QUEUES_PER_PIPE;
+
+ params->soft.tm.nb_queues =
+ rte_align32pow2(params->soft.tm.nb_queues);
+
+ /* qsize */
+ for (i = 0; i < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; i++) {
+ if (params->soft.tm.qsize[i] == 0)
+ return -EINVAL;
+
+ params->soft.tm.qsize[i] =
+ rte_align32pow2(params->soft.tm.qsize[i]);
+ }
+
+ /* enq_bsz, deq_bsz */
+ if (params->soft.tm.enq_bsz == 0 ||
+ params->soft.tm.deq_bsz == 0 ||
+ params->soft.tm.deq_bsz >= params->soft.tm.enq_bsz)
+ return -EINVAL;
+
+ return 0;
+}
+
+int
+tm_init(struct pmd_internals *p,
+ struct pmd_params *params,
+ int numa_node)
+{
+ uint32_t enq_bsz = params->soft.tm.enq_bsz;
+ uint32_t deq_bsz = params->soft.tm.deq_bsz;
+
+ p->soft.tm.pkts_enq = rte_zmalloc_socket(params->soft.name,
+ 2 * enq_bsz * sizeof(struct rte_mbuf *),
+ 0,
+ numa_node);
+
+ if (p->soft.tm.pkts_enq == NULL)
+ return -ENOMEM;
+
+ p->soft.tm.pkts_deq = rte_zmalloc_socket(params->soft.name,
+ deq_bsz * sizeof(struct rte_mbuf *),
+ 0,
+ numa_node);
+
+ if (p->soft.tm.pkts_deq == NULL) {
+ rte_free(p->soft.tm.pkts_enq);
+ return -ENOMEM;
+ }
+
+ return 0;
+}
+
+void
+tm_free(struct pmd_internals *p)
+{
+ rte_free(p->soft.tm.pkts_enq);
+ rte_free(p->soft.tm.pkts_deq);
+}
+
+int
+tm_start(struct pmd_internals *p)
+{
+ struct tm_params *t = &p->soft.tm.params;
+ uint32_t n_subports, subport_id;
+ int status;
+
+ /* Port */
+ p->soft.tm.sched = rte_sched_port_config(&t->port_params);
+ if (p->soft.tm.sched == NULL)
+ return -1;
+
+ /* Subport */
+ n_subports = t->port_params.n_subports_per_port;
+ for (subport_id = 0; subport_id < n_subports; subport_id++) {
+ uint32_t n_pipes_per_subport =
+ t->port_params.n_pipes_per_subport;
+ uint32_t pipe_id;
+
+ status = rte_sched_subport_config(p->soft.tm.sched,
+ subport_id,
+ &t->subport_params[subport_id]);
+ if (status) {
+ rte_sched_port_free(p->soft.tm.sched);
+ return -1;
+ }
+
+ /* Pipe */
+ n_pipes_per_subport = t->port_params.n_pipes_per_subport;
+ for (pipe_id = 0; pipe_id < n_pipes_per_subport; pipe_id++) {
+ int pos = subport_id * TM_MAX_PIPES_PER_SUBPORT +
+ pipe_id;
+ int profile_id = t->pipe_to_profile[pos];
+
+ if (profile_id < 0)
+ continue;
+
+ status = rte_sched_pipe_config(p->soft.tm.sched,
+ subport_id,
+ pipe_id,
+ profile_id);
+ if (status) {
+ rte_sched_port_free(p->soft.tm.sched);
+ return -1;
+ }
+ }
+ }
+
+ return 0;
+}
+
+void
+tm_stop(struct pmd_internals *p)
+{
+ if (p->soft.tm.sched)
+ rte_sched_port_free(p->soft.tm.sched);
+}
--
2.9.3
^ permalink raw reply [flat|nested] 79+ messages in thread
* [dpdk-dev] [PATCH v8 3/5] net/softnic: add TM capabilities ops
2017-10-10 10:18 ` [dpdk-dev] [PATCH v8 0/5] net/softnic: sw fall-back pmd for traffic mgmt and others Jasvinder Singh
2017-10-10 10:18 ` [dpdk-dev] [PATCH v8 1/5] net/softnic: add softnic PMD Jasvinder Singh
2017-10-10 10:18 ` [dpdk-dev] [PATCH v8 2/5] net/softnic: add traffic management support Jasvinder Singh
@ 2017-10-10 10:18 ` Jasvinder Singh
2017-10-10 10:18 ` [dpdk-dev] [PATCH v8 4/5] net/softnic: add TM hierarchy related ops Jasvinder Singh
` (2 subsequent siblings)
5 siblings, 0 replies; 79+ messages in thread
From: Jasvinder Singh @ 2017-10-10 10:18 UTC (permalink / raw)
To: dev; +Cc: cristian.dumitrescu, ferruh.yigit, thomas, wenzhuo.lu
Implement the ethdev Traffic Management (TM) capability APIs (port, level and node capabilities get) in the SoftNIC PMD and expose them through the driver's tm_ops_get callback.
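
As a usage reference (not part of this patch), the sketch below shows how an
application could read these capabilities through the generic rte_tm API once
the softnic port is probed; the call resolves through the tm_ops_get callback
added here to pmd_tm_capabilities_get(). The port_id, the function name and
the printed fields are illustrative only.

    #include <stdio.h>
    #include <rte_tm.h>

    static int
    app_show_tm_caps(uint16_t port_id)
    {
        struct rte_tm_capabilities cap;
        struct rte_tm_error error;

        /* Dispatched via eth_dev_ops.tm_ops_get to pmd_tm_capabilities_get() */
        if (rte_tm_capabilities_get(port_id, &cap, &error))
            return -1;

        printf("levels=%u nodes_max=%u private_shapers_max=%u\n",
            cap.n_levels_max, cap.n_nodes_max, cap.shaper_private_n_max);
        return 0;
    }
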
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Acked-by: Lu, Wenzhuo <wenzhuo.lu@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
---
drivers/net/softnic/rte_eth_softnic.c | 12 +-
drivers/net/softnic/rte_eth_softnic_internals.h | 32 ++
drivers/net/softnic/rte_eth_softnic_tm.c | 500 ++++++++++++++++++++++++
3 files changed, 543 insertions(+), 1 deletion(-)
diff --git a/drivers/net/softnic/rte_eth_softnic.c b/drivers/net/softnic/rte_eth_softnic.c
index 2f19159..34dceae 100644
--- a/drivers/net/softnic/rte_eth_softnic.c
+++ b/drivers/net/softnic/rte_eth_softnic.c
@@ -43,6 +43,7 @@
#include <rte_errno.h>
#include <rte_ring.h>
#include <rte_sched.h>
+#include <rte_tm_driver.h>
#include "rte_eth_softnic.h"
#include "rte_eth_softnic_internals.h"
@@ -224,6 +225,15 @@ pmd_link_update(struct rte_eth_dev *dev __rte_unused,
return 0;
}
+static int
+pmd_tm_ops_get(struct rte_eth_dev *dev, void *arg)
+{
+ *(const struct rte_tm_ops **)arg =
+ (tm_enabled(dev)) ? &pmd_tm_ops : NULL;
+
+ return 0;
+}
+
static const struct eth_dev_ops pmd_ops = {
.dev_configure = pmd_dev_configure,
.dev_start = pmd_dev_start,
@@ -233,7 +243,7 @@ static const struct eth_dev_ops pmd_ops = {
.dev_infos_get = pmd_dev_infos_get,
.rx_queue_setup = pmd_rx_queue_setup,
.tx_queue_setup = pmd_tx_queue_setup,
- .tm_ops_get = NULL,
+ .tm_ops_get = pmd_tm_ops_get,
};
static uint16_t
diff --git a/drivers/net/softnic/rte_eth_softnic_internals.h b/drivers/net/softnic/rte_eth_softnic_internals.h
index fd9cbbe..75d9387 100644
--- a/drivers/net/softnic/rte_eth_softnic_internals.h
+++ b/drivers/net/softnic/rte_eth_softnic_internals.h
@@ -39,6 +39,7 @@
#include <rte_mbuf.h>
#include <rte_sched.h>
#include <rte_ethdev.h>
+#include <rte_tm_driver.h>
#include "rte_eth_softnic.h"
@@ -137,8 +138,26 @@ enum tm_node_level {
TM_NODE_LEVEL_MAX,
};
+/* TM Node */
+struct tm_node {
+ TAILQ_ENTRY(tm_node) node;
+ uint32_t node_id;
+ uint32_t parent_node_id;
+ uint32_t priority;
+ uint32_t weight;
+ uint32_t level;
+ struct tm_node *parent_node;
+ struct rte_tm_node_params params;
+ struct rte_tm_node_stats stats;
+ uint32_t n_children;
+};
+
+TAILQ_HEAD(tm_node_list, tm_node);
+
/* TM Hierarchy Specification */
struct tm_hierarchy {
+ struct tm_node_list nodes;
+
uint32_t n_tm_nodes[TM_NODE_LEVEL_MAX];
};
@@ -191,6 +210,11 @@ struct pmd_rx_queue {
} hard;
};
+/**
+ * Traffic Management (TM) Operation
+ */
+extern const struct rte_tm_ops pmd_tm_ops;
+
int
tm_params_check(struct pmd_params *params, uint32_t hard_rate);
@@ -207,6 +231,14 @@ void
tm_stop(struct pmd_internals *p);
static inline int
+tm_enabled(struct rte_eth_dev *dev)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ return (p->params.soft.flags & PMD_FEATURE_TM);
+}
+
+static inline int
tm_used(struct rte_eth_dev *dev)
{
struct pmd_internals *p = dev->data->dev_private;
diff --git a/drivers/net/softnic/rte_eth_softnic_tm.c b/drivers/net/softnic/rte_eth_softnic_tm.c
index 165abfe..73274d4 100644
--- a/drivers/net/softnic/rte_eth_softnic_tm.c
+++ b/drivers/net/softnic/rte_eth_softnic_tm.c
@@ -179,3 +179,503 @@ tm_stop(struct pmd_internals *p)
if (p->soft.tm.sched)
rte_sched_port_free(p->soft.tm.sched);
}
+
+static struct tm_node *
+tm_node_search(struct rte_eth_dev *dev, uint32_t node_id)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_node_list *nl = &p->soft.tm.h.nodes;
+ struct tm_node *n;
+
+ TAILQ_FOREACH(n, nl, node)
+ if (n->node_id == node_id)
+ return n;
+
+ return NULL;
+}
+
+static uint32_t
+tm_level_get_max_nodes(struct rte_eth_dev *dev, enum tm_node_level level)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ uint32_t n_queues_max = p->params.soft.tm.nb_queues;
+ uint32_t n_tc_max = n_queues_max / RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS;
+ uint32_t n_pipes_max = n_tc_max / RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE;
+ uint32_t n_subports_max = n_pipes_max;
+ uint32_t n_root_max = 1;
+
+ switch (level) {
+ case TM_NODE_LEVEL_PORT:
+ return n_root_max;
+ case TM_NODE_LEVEL_SUBPORT:
+ return n_subports_max;
+ case TM_NODE_LEVEL_PIPE:
+ return n_pipes_max;
+ case TM_NODE_LEVEL_TC:
+ return n_tc_max;
+ case TM_NODE_LEVEL_QUEUE:
+ default:
+ return n_queues_max;
+ }
+}
+
+#ifdef RTE_SCHED_RED
+#define WRED_SUPPORTED 1
+#else
+#define WRED_SUPPORTED 0
+#endif
+
+#define STATS_MASK_DEFAULT \
+ (RTE_TM_STATS_N_PKTS | \
+ RTE_TM_STATS_N_BYTES | \
+ RTE_TM_STATS_N_PKTS_GREEN_DROPPED | \
+ RTE_TM_STATS_N_BYTES_GREEN_DROPPED)
+
+#define STATS_MASK_QUEUE \
+ (STATS_MASK_DEFAULT | \
+ RTE_TM_STATS_N_PKTS_QUEUED)
+
+static const struct rte_tm_capabilities tm_cap = {
+ .n_nodes_max = UINT32_MAX,
+ .n_levels_max = TM_NODE_LEVEL_MAX,
+
+ .non_leaf_nodes_identical = 0,
+ .leaf_nodes_identical = 1,
+
+ .shaper_n_max = UINT32_MAX,
+ .shaper_private_n_max = UINT32_MAX,
+ .shaper_private_dual_rate_n_max = 0,
+ .shaper_private_rate_min = 1,
+ .shaper_private_rate_max = UINT32_MAX,
+
+ .shaper_shared_n_max = UINT32_MAX,
+ .shaper_shared_n_nodes_per_shaper_max = UINT32_MAX,
+ .shaper_shared_n_shapers_per_node_max = 1,
+ .shaper_shared_dual_rate_n_max = 0,
+ .shaper_shared_rate_min = 1,
+ .shaper_shared_rate_max = UINT32_MAX,
+
+ .shaper_pkt_length_adjust_min = RTE_TM_ETH_FRAMING_OVERHEAD_FCS,
+ .shaper_pkt_length_adjust_max = RTE_TM_ETH_FRAMING_OVERHEAD_FCS,
+
+ .sched_n_children_max = UINT32_MAX,
+ .sched_sp_n_priorities_max = RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE,
+ .sched_wfq_n_children_per_group_max = UINT32_MAX,
+ .sched_wfq_n_groups_max = 1,
+ .sched_wfq_weight_max = UINT32_MAX,
+
+ .cman_head_drop_supported = 0,
+ .cman_wred_context_n_max = 0,
+ .cman_wred_context_private_n_max = 0,
+ .cman_wred_context_shared_n_max = 0,
+ .cman_wred_context_shared_n_nodes_per_context_max = 0,
+ .cman_wred_context_shared_n_contexts_per_node_max = 0,
+
+ .mark_vlan_dei_supported = {0, 0, 0},
+ .mark_ip_ecn_tcp_supported = {0, 0, 0},
+ .mark_ip_ecn_sctp_supported = {0, 0, 0},
+ .mark_ip_dscp_supported = {0, 0, 0},
+
+ .dynamic_update_mask = 0,
+
+ .stats_mask = STATS_MASK_QUEUE,
+};
+
+/* Traffic manager capabilities get */
+static int
+pmd_tm_capabilities_get(struct rte_eth_dev *dev __rte_unused,
+ struct rte_tm_capabilities *cap,
+ struct rte_tm_error *error)
+{
+ if (cap == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_CAPABILITIES,
+ NULL,
+ rte_strerror(EINVAL));
+
+ memcpy(cap, &tm_cap, sizeof(*cap));
+
+ cap->n_nodes_max = tm_level_get_max_nodes(dev, TM_NODE_LEVEL_PORT) +
+ tm_level_get_max_nodes(dev, TM_NODE_LEVEL_SUBPORT) +
+ tm_level_get_max_nodes(dev, TM_NODE_LEVEL_PIPE) +
+ tm_level_get_max_nodes(dev, TM_NODE_LEVEL_TC) +
+ tm_level_get_max_nodes(dev, TM_NODE_LEVEL_QUEUE);
+
+ cap->shaper_private_n_max =
+ tm_level_get_max_nodes(dev, TM_NODE_LEVEL_PORT) +
+ tm_level_get_max_nodes(dev, TM_NODE_LEVEL_SUBPORT) +
+ tm_level_get_max_nodes(dev, TM_NODE_LEVEL_PIPE) +
+ tm_level_get_max_nodes(dev, TM_NODE_LEVEL_TC);
+
+ cap->shaper_shared_n_max = RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE *
+ tm_level_get_max_nodes(dev, TM_NODE_LEVEL_SUBPORT);
+
+ cap->shaper_n_max = cap->shaper_private_n_max +
+ cap->shaper_shared_n_max;
+
+ cap->shaper_shared_n_nodes_per_shaper_max =
+ tm_level_get_max_nodes(dev, TM_NODE_LEVEL_PIPE);
+
+ cap->sched_n_children_max = RTE_MAX(
+ tm_level_get_max_nodes(dev, TM_NODE_LEVEL_PIPE),
+ (uint32_t)RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE);
+
+ cap->sched_wfq_n_children_per_group_max = cap->sched_n_children_max;
+
+ if (WRED_SUPPORTED)
+ cap->cman_wred_context_private_n_max =
+ tm_level_get_max_nodes(dev, TM_NODE_LEVEL_QUEUE);
+
+ cap->cman_wred_context_n_max = cap->cman_wred_context_private_n_max +
+ cap->cman_wred_context_shared_n_max;
+
+ return 0;
+}
+
+static const struct rte_tm_level_capabilities tm_level_cap[] = {
+ [TM_NODE_LEVEL_PORT] = {
+ .n_nodes_max = 1,
+ .n_nodes_nonleaf_max = 1,
+ .n_nodes_leaf_max = 0,
+ .non_leaf_nodes_identical = 1,
+ .leaf_nodes_identical = 0,
+
+ .nonleaf = {
+ .shaper_private_supported = 1,
+ .shaper_private_dual_rate_supported = 0,
+ .shaper_private_rate_min = 1,
+ .shaper_private_rate_max = UINT32_MAX,
+ .shaper_shared_n_max = 0,
+
+ .sched_n_children_max = UINT32_MAX,
+ .sched_sp_n_priorities_max = 1,
+ .sched_wfq_n_children_per_group_max = UINT32_MAX,
+ .sched_wfq_n_groups_max = 1,
+ .sched_wfq_weight_max = 1,
+
+ .stats_mask = STATS_MASK_DEFAULT,
+ },
+ },
+
+ [TM_NODE_LEVEL_SUBPORT] = {
+ .n_nodes_max = UINT32_MAX,
+ .n_nodes_nonleaf_max = UINT32_MAX,
+ .n_nodes_leaf_max = 0,
+ .non_leaf_nodes_identical = 1,
+ .leaf_nodes_identical = 0,
+
+ .nonleaf = {
+ .shaper_private_supported = 1,
+ .shaper_private_dual_rate_supported = 0,
+ .shaper_private_rate_min = 1,
+ .shaper_private_rate_max = UINT32_MAX,
+ .shaper_shared_n_max = 0,
+
+ .sched_n_children_max = UINT32_MAX,
+ .sched_sp_n_priorities_max = 1,
+ .sched_wfq_n_children_per_group_max = UINT32_MAX,
+ .sched_wfq_n_groups_max = 1,
+#ifdef RTE_SCHED_SUBPORT_TC_OV
+ .sched_wfq_weight_max = UINT32_MAX,
+#else
+ .sched_wfq_weight_max = 1,
+#endif
+ .stats_mask = STATS_MASK_DEFAULT,
+ },
+ },
+
+ [TM_NODE_LEVEL_PIPE] = {
+ .n_nodes_max = UINT32_MAX,
+ .n_nodes_nonleaf_max = UINT32_MAX,
+ .n_nodes_leaf_max = 0,
+ .non_leaf_nodes_identical = 1,
+ .leaf_nodes_identical = 0,
+
+ .nonleaf = {
+ .shaper_private_supported = 1,
+ .shaper_private_dual_rate_supported = 0,
+ .shaper_private_rate_min = 1,
+ .shaper_private_rate_max = UINT32_MAX,
+ .shaper_shared_n_max = 0,
+
+ .sched_n_children_max =
+ RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE,
+ .sched_sp_n_priorities_max =
+ RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE,
+ .sched_wfq_n_children_per_group_max = 1,
+ .sched_wfq_n_groups_max = 0,
+ .sched_wfq_weight_max = 1,
+
+ .stats_mask = STATS_MASK_DEFAULT,
+ },
+ },
+
+ [TM_NODE_LEVEL_TC] = {
+ .n_nodes_max = UINT32_MAX,
+ .n_nodes_nonleaf_max = UINT32_MAX,
+ .n_nodes_leaf_max = 0,
+ .non_leaf_nodes_identical = 1,
+ .leaf_nodes_identical = 0,
+
+ .nonleaf = {
+ .shaper_private_supported = 1,
+ .shaper_private_dual_rate_supported = 0,
+ .shaper_private_rate_min = 1,
+ .shaper_private_rate_max = UINT32_MAX,
+ .shaper_shared_n_max = 1,
+
+ .sched_n_children_max =
+ RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS,
+ .sched_sp_n_priorities_max = 1,
+ .sched_wfq_n_children_per_group_max =
+ RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS,
+ .sched_wfq_n_groups_max = 1,
+ .sched_wfq_weight_max = UINT32_MAX,
+
+ .stats_mask = STATS_MASK_DEFAULT,
+ },
+ },
+
+ [TM_NODE_LEVEL_QUEUE] = {
+ .n_nodes_max = UINT32_MAX,
+ .n_nodes_nonleaf_max = 0,
+ .n_nodes_leaf_max = UINT32_MAX,
+ .non_leaf_nodes_identical = 0,
+ .leaf_nodes_identical = 1,
+
+ .leaf = {
+ .shaper_private_supported = 0,
+ .shaper_private_dual_rate_supported = 0,
+ .shaper_private_rate_min = 0,
+ .shaper_private_rate_max = 0,
+ .shaper_shared_n_max = 0,
+
+ .cman_head_drop_supported = 0,
+ .cman_wred_context_private_supported = WRED_SUPPORTED,
+ .cman_wred_context_shared_n_max = 0,
+
+ .stats_mask = STATS_MASK_QUEUE,
+ },
+ },
+};
+
+/* Traffic manager level capabilities get */
+static int
+pmd_tm_level_capabilities_get(struct rte_eth_dev *dev __rte_unused,
+ uint32_t level_id,
+ struct rte_tm_level_capabilities *cap,
+ struct rte_tm_error *error)
+{
+ if (cap == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_CAPABILITIES,
+ NULL,
+ rte_strerror(EINVAL));
+
+ if (level_id >= TM_NODE_LEVEL_MAX)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_LEVEL_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ memcpy(cap, &tm_level_cap[level_id], sizeof(*cap));
+
+ switch (level_id) {
+ case TM_NODE_LEVEL_PORT:
+ cap->nonleaf.sched_n_children_max =
+ tm_level_get_max_nodes(dev,
+ TM_NODE_LEVEL_SUBPORT);
+ cap->nonleaf.sched_wfq_n_children_per_group_max =
+ cap->nonleaf.sched_n_children_max;
+ break;
+
+ case TM_NODE_LEVEL_SUBPORT:
+ cap->n_nodes_max = tm_level_get_max_nodes(dev,
+ TM_NODE_LEVEL_SUBPORT);
+ cap->n_nodes_nonleaf_max = cap->n_nodes_max;
+ cap->nonleaf.sched_n_children_max =
+ tm_level_get_max_nodes(dev,
+ TM_NODE_LEVEL_PIPE);
+ cap->nonleaf.sched_wfq_n_children_per_group_max =
+ cap->nonleaf.sched_n_children_max;
+ break;
+
+ case TM_NODE_LEVEL_PIPE:
+ cap->n_nodes_max = tm_level_get_max_nodes(dev,
+ TM_NODE_LEVEL_PIPE);
+ cap->n_nodes_nonleaf_max = cap->n_nodes_max;
+ break;
+
+ case TM_NODE_LEVEL_TC:
+ cap->n_nodes_max = tm_level_get_max_nodes(dev,
+ TM_NODE_LEVEL_TC);
+ cap->n_nodes_nonleaf_max = cap->n_nodes_max;
+ break;
+
+ case TM_NODE_LEVEL_QUEUE:
+ default:
+ cap->n_nodes_max = tm_level_get_max_nodes(dev,
+ TM_NODE_LEVEL_QUEUE);
+ cap->n_nodes_leaf_max = cap->n_nodes_max;
+ break;
+ }
+
+ return 0;
+}
+
+static const struct rte_tm_node_capabilities tm_node_cap[] = {
+ [TM_NODE_LEVEL_PORT] = {
+ .shaper_private_supported = 1,
+ .shaper_private_dual_rate_supported = 0,
+ .shaper_private_rate_min = 1,
+ .shaper_private_rate_max = UINT32_MAX,
+ .shaper_shared_n_max = 0,
+
+ .nonleaf = {
+ .sched_n_children_max = UINT32_MAX,
+ .sched_sp_n_priorities_max = 1,
+ .sched_wfq_n_children_per_group_max = UINT32_MAX,
+ .sched_wfq_n_groups_max = 1,
+ .sched_wfq_weight_max = 1,
+ },
+
+ .stats_mask = STATS_MASK_DEFAULT,
+ },
+
+ [TM_NODE_LEVEL_SUBPORT] = {
+ .shaper_private_supported = 1,
+ .shaper_private_dual_rate_supported = 0,
+ .shaper_private_rate_min = 1,
+ .shaper_private_rate_max = UINT32_MAX,
+ .shaper_shared_n_max = 0,
+
+ .nonleaf = {
+ .sched_n_children_max = UINT32_MAX,
+ .sched_sp_n_priorities_max = 1,
+ .sched_wfq_n_children_per_group_max = UINT32_MAX,
+ .sched_wfq_n_groups_max = 1,
+ .sched_wfq_weight_max = UINT32_MAX,
+ },
+
+ .stats_mask = STATS_MASK_DEFAULT,
+ },
+
+ [TM_NODE_LEVEL_PIPE] = {
+ .shaper_private_supported = 1,
+ .shaper_private_dual_rate_supported = 0,
+ .shaper_private_rate_min = 1,
+ .shaper_private_rate_max = UINT32_MAX,
+ .shaper_shared_n_max = 0,
+
+ .nonleaf = {
+ .sched_n_children_max =
+ RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE,
+ .sched_sp_n_priorities_max =
+ RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE,
+ .sched_wfq_n_children_per_group_max = 1,
+ .sched_wfq_n_groups_max = 0,
+ .sched_wfq_weight_max = 1,
+ },
+
+ .stats_mask = STATS_MASK_DEFAULT,
+ },
+
+ [TM_NODE_LEVEL_TC] = {
+ .shaper_private_supported = 1,
+ .shaper_private_dual_rate_supported = 0,
+ .shaper_private_rate_min = 1,
+ .shaper_private_rate_max = UINT32_MAX,
+ .shaper_shared_n_max = 1,
+
+ .nonleaf = {
+ .sched_n_children_max =
+ RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS,
+ .sched_sp_n_priorities_max = 1,
+ .sched_wfq_n_children_per_group_max =
+ RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS,
+ .sched_wfq_n_groups_max = 1,
+ .sched_wfq_weight_max = UINT32_MAX,
+ },
+
+ .stats_mask = STATS_MASK_DEFAULT,
+ },
+
+ [TM_NODE_LEVEL_QUEUE] = {
+ .shaper_private_supported = 0,
+ .shaper_private_dual_rate_supported = 0,
+ .shaper_private_rate_min = 0,
+ .shaper_private_rate_max = 0,
+ .shaper_shared_n_max = 0,
+
+
+ .leaf = {
+ .cman_head_drop_supported = 0,
+ .cman_wred_context_private_supported = WRED_SUPPORTED,
+ .cman_wred_context_shared_n_max = 0,
+ },
+
+ .stats_mask = STATS_MASK_QUEUE,
+ },
+};
+
+/* Traffic manager node capabilities get */
+static int
+pmd_tm_node_capabilities_get(struct rte_eth_dev *dev __rte_unused,
+ uint32_t node_id,
+ struct rte_tm_node_capabilities *cap,
+ struct rte_tm_error *error)
+{
+ struct tm_node *tm_node;
+
+ if (cap == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_CAPABILITIES,
+ NULL,
+ rte_strerror(EINVAL));
+
+ tm_node = tm_node_search(dev, node_id);
+ if (tm_node == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ memcpy(cap, &tm_node_cap[tm_node->level], sizeof(*cap));
+
+ switch (tm_node->level) {
+ case TM_NODE_LEVEL_PORT:
+ cap->nonleaf.sched_n_children_max =
+ tm_level_get_max_nodes(dev,
+ TM_NODE_LEVEL_SUBPORT);
+ cap->nonleaf.sched_wfq_n_children_per_group_max =
+ cap->nonleaf.sched_n_children_max;
+ break;
+
+ case TM_NODE_LEVEL_SUBPORT:
+ cap->nonleaf.sched_n_children_max =
+ tm_level_get_max_nodes(dev,
+ TM_NODE_LEVEL_PIPE);
+ cap->nonleaf.sched_wfq_n_children_per_group_max =
+ cap->nonleaf.sched_n_children_max;
+ break;
+
+ case TM_NODE_LEVEL_PIPE:
+ case TM_NODE_LEVEL_TC:
+ case TM_NODE_LEVEL_QUEUE:
+ default:
+ break;
+ }
+
+ return 0;
+}
+
+const struct rte_tm_ops pmd_tm_ops = {
+ .capabilities_get = pmd_tm_capabilities_get,
+ .level_capabilities_get = pmd_tm_level_capabilities_get,
+ .node_capabilities_get = pmd_tm_node_capabilities_get,
+};
--
2.9.3
^ permalink raw reply [flat|nested] 79+ messages in thread
* [dpdk-dev] [PATCH v8 4/5] net/softnic: add TM hierarchy related ops
2017-10-10 10:18 ` [dpdk-dev] [PATCH v8 0/5] net/softnic: sw fall-back pmd for traffic mgmt and others Jasvinder Singh
` (2 preceding siblings ...)
2017-10-10 10:18 ` [dpdk-dev] [PATCH v8 3/5] net/softnic: add TM capabilities ops Jasvinder Singh
@ 2017-10-10 10:18 ` Jasvinder Singh
2017-10-10 10:18 ` [dpdk-dev] [PATCH v8 5/5] app/testpmd: add traffic management forwarding mode Jasvinder Singh
2017-10-10 18:31 ` [dpdk-dev] [PATCH v8 0/5] net/softnic: sw fall-back pmd for traffic mgmt and others Ferruh Yigit
5 siblings, 0 replies; 79+ messages in thread
From: Jasvinder Singh @ 2017-10-10 10:18 UTC (permalink / raw)
To: dev; +Cc: cristian.dumitrescu, ferruh.yigit, thomas, wenzhuo.lu
Implement the ethdev Traffic Management (TM) hierarchy-related APIs in the SoftNIC PMD, covering shaper profiles, shared shapers, WRED profiles and hierarchy node management.
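
For context (again not part of this patch), a minimal application-side sketch
of the intended call flow is: add a shaper profile, add the port-level root
node, then freeze the hierarchy with a commit so that tm_start() can build the
librte_sched port. All IDs, the rate and the token bucket size below are
illustrative; a complete hierarchy also needs subport, pipe, traffic class and
queue nodes, and with this driver the root node ID has to sit outside the leaf
(queue) ID range.

    #include <rte_tm.h>

    static int
    app_tm_commit_root(uint16_t port_id, uint32_t root_node_id)
    {
        struct rte_tm_shaper_params sp = {
            .peak = { .rate = 1250000000, .size = 1000000 },
            .pkt_length_adjust = RTE_TM_ETH_FRAMING_OVERHEAD_FCS,
        };
        struct rte_tm_node_params np = {
            .shaper_profile_id = 0,
            .nonleaf = { .n_sp_priorities = 1 },
            .stats_mask = RTE_TM_STATS_N_PKTS | RTE_TM_STATS_N_BYTES,
        };
        struct rte_tm_error error;

        if (rte_tm_shaper_profile_add(port_id, 0, &sp, &error))
            return -1;

        /* Root node: no parent, priority 0, weight 1 */
        if (rte_tm_node_add(port_id, root_node_id, RTE_TM_NODE_ID_NULL,
                0, 1, RTE_TM_NODE_LEVEL_ID_ANY, &np, &error))
            return -1;

        /* Successful commit freezes the hierarchy; tm_start() checks this */
        return rte_tm_hierarchy_commit(port_id, 1, &error);
    }
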
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Acked-by: Lu, Wenzhuo <wenzhuo.lu@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
---
v7 change:
- fix checkpatch warnings
v5 change:
- add macro for the tc period
- add more comments
drivers/net/softnic/rte_eth_softnic_internals.h | 41 +
drivers/net/softnic/rte_eth_softnic_tm.c | 2781 ++++++++++++++++++++++-
2 files changed, 2817 insertions(+), 5 deletions(-)
diff --git a/drivers/net/softnic/rte_eth_softnic_internals.h b/drivers/net/softnic/rte_eth_softnic_internals.h
index 75d9387..1f75806 100644
--- a/drivers/net/softnic/rte_eth_softnic_internals.h
+++ b/drivers/net/softnic/rte_eth_softnic_internals.h
@@ -138,6 +138,36 @@ enum tm_node_level {
TM_NODE_LEVEL_MAX,
};
+/* TM Shaper Profile */
+struct tm_shaper_profile {
+ TAILQ_ENTRY(tm_shaper_profile) node;
+ uint32_t shaper_profile_id;
+ uint32_t n_users;
+ struct rte_tm_shaper_params params;
+};
+
+TAILQ_HEAD(tm_shaper_profile_list, tm_shaper_profile);
+
+/* TM Shared Shaper */
+struct tm_shared_shaper {
+ TAILQ_ENTRY(tm_shared_shaper) node;
+ uint32_t shared_shaper_id;
+ uint32_t n_users;
+ uint32_t shaper_profile_id;
+};
+
+TAILQ_HEAD(tm_shared_shaper_list, tm_shared_shaper);
+
+/* TM WRED Profile */
+struct tm_wred_profile {
+ TAILQ_ENTRY(tm_wred_profile) node;
+ uint32_t wred_profile_id;
+ uint32_t n_users;
+ struct rte_tm_wred_params params;
+};
+
+TAILQ_HEAD(tm_wred_profile_list, tm_wred_profile);
+
/* TM Node */
struct tm_node {
TAILQ_ENTRY(tm_node) node;
@@ -147,6 +177,8 @@ struct tm_node {
uint32_t weight;
uint32_t level;
struct tm_node *parent_node;
+ struct tm_shaper_profile *shaper_profile;
+ struct tm_wred_profile *wred_profile;
struct rte_tm_node_params params;
struct rte_tm_node_stats stats;
uint32_t n_children;
@@ -156,8 +188,16 @@ TAILQ_HEAD(tm_node_list, tm_node);
/* TM Hierarchy Specification */
struct tm_hierarchy {
+ struct tm_shaper_profile_list shaper_profiles;
+ struct tm_shared_shaper_list shared_shapers;
+ struct tm_wred_profile_list wred_profiles;
struct tm_node_list nodes;
+ uint32_t n_shaper_profiles;
+ uint32_t n_shared_shapers;
+ uint32_t n_wred_profiles;
+ uint32_t n_nodes;
+
uint32_t n_tm_nodes[TM_NODE_LEVEL_MAX];
};
@@ -170,6 +210,7 @@ struct tm_internals {
* sense to keep the hierarchy frozen after the port is started.
*/
struct tm_hierarchy h;
+ int hierarchy_frozen;
/** Blueprints */
struct tm_params params;
diff --git a/drivers/net/softnic/rte_eth_softnic_tm.c b/drivers/net/softnic/rte_eth_softnic_tm.c
index 73274d4..682cc4d 100644
--- a/drivers/net/softnic/rte_eth_softnic_tm.c
+++ b/drivers/net/softnic/rte_eth_softnic_tm.c
@@ -40,7 +40,9 @@
#include "rte_eth_softnic_internals.h"
#include "rte_eth_softnic.h"
-#define BYTES_IN_MBPS (1000 * 1000 / 8)
+#define BYTES_IN_MBPS (1000 * 1000 / 8)
+#define SUBPORT_TC_PERIOD 10
+#define PIPE_TC_PERIOD 40
int
tm_params_check(struct pmd_params *params, uint32_t hard_rate)
@@ -86,6 +88,79 @@ tm_params_check(struct pmd_params *params, uint32_t hard_rate)
return 0;
}
+static void
+tm_hierarchy_init(struct pmd_internals *p)
+{
+ memset(&p->soft.tm.h, 0, sizeof(p->soft.tm.h));
+
+ /* Initialize shaper profile list */
+ TAILQ_INIT(&p->soft.tm.h.shaper_profiles);
+
+ /* Initialize shared shaper list */
+ TAILQ_INIT(&p->soft.tm.h.shared_shapers);
+
+ /* Initialize wred profile list */
+ TAILQ_INIT(&p->soft.tm.h.wred_profiles);
+
+ /* Initialize TM node list */
+ TAILQ_INIT(&p->soft.tm.h.nodes);
+}
+
+static void
+tm_hierarchy_uninit(struct pmd_internals *p)
+{
+ /* Remove all nodes */
+ for ( ; ; ) {
+ struct tm_node *tm_node;
+
+ tm_node = TAILQ_FIRST(&p->soft.tm.h.nodes);
+ if (tm_node == NULL)
+ break;
+
+ TAILQ_REMOVE(&p->soft.tm.h.nodes, tm_node, node);
+ free(tm_node);
+ }
+
+ /* Remove all WRED profiles */
+ for ( ; ; ) {
+ struct tm_wred_profile *wred_profile;
+
+ wred_profile = TAILQ_FIRST(&p->soft.tm.h.wred_profiles);
+ if (wred_profile == NULL)
+ break;
+
+ TAILQ_REMOVE(&p->soft.tm.h.wred_profiles, wred_profile, node);
+ free(wred_profile);
+ }
+
+ /* Remove all shared shapers */
+ for ( ; ; ) {
+ struct tm_shared_shaper *shared_shaper;
+
+ shared_shaper = TAILQ_FIRST(&p->soft.tm.h.shared_shapers);
+ if (shared_shaper == NULL)
+ break;
+
+ TAILQ_REMOVE(&p->soft.tm.h.shared_shapers, shared_shaper, node);
+ free(shared_shaper);
+ }
+
+ /* Remove all shaper profiles */
+ for ( ; ; ) {
+ struct tm_shaper_profile *shaper_profile;
+
+ shaper_profile = TAILQ_FIRST(&p->soft.tm.h.shaper_profiles);
+ if (shaper_profile == NULL)
+ break;
+
+ TAILQ_REMOVE(&p->soft.tm.h.shaper_profiles,
+ shaper_profile, node);
+ free(shaper_profile);
+ }
+
+ memset(&p->soft.tm.h, 0, sizeof(p->soft.tm.h));
+}
+
int
tm_init(struct pmd_internals *p,
struct pmd_params *params,
@@ -112,12 +187,15 @@ tm_init(struct pmd_internals *p,
return -ENOMEM;
}
+ tm_hierarchy_init(p);
+
return 0;
}
void
tm_free(struct pmd_internals *p)
{
+ tm_hierarchy_uninit(p);
rte_free(p->soft.tm.pkts_enq);
rte_free(p->soft.tm.pkts_deq);
}
@@ -129,6 +207,10 @@ tm_start(struct pmd_internals *p)
uint32_t n_subports, subport_id;
int status;
+ /* Is hierarchy frozen? */
+ if (p->soft.tm.hierarchy_frozen == 0)
+ return -1;
+
/* Port */
p->soft.tm.sched = rte_sched_port_config(&t->port_params);
if (p->soft.tm.sched == NULL)
@@ -178,6 +260,51 @@ tm_stop(struct pmd_internals *p)
{
if (p->soft.tm.sched)
rte_sched_port_free(p->soft.tm.sched);
+
+ /* Unfreeze hierarchy */
+ p->soft.tm.hierarchy_frozen = 0;
+}
+
+static struct tm_shaper_profile *
+tm_shaper_profile_search(struct rte_eth_dev *dev, uint32_t shaper_profile_id)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_shaper_profile_list *spl = &p->soft.tm.h.shaper_profiles;
+ struct tm_shaper_profile *sp;
+
+ TAILQ_FOREACH(sp, spl, node)
+ if (shaper_profile_id == sp->shaper_profile_id)
+ return sp;
+
+ return NULL;
+}
+
+static struct tm_shared_shaper *
+tm_shared_shaper_search(struct rte_eth_dev *dev, uint32_t shared_shaper_id)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_shared_shaper_list *ssl = &p->soft.tm.h.shared_shapers;
+ struct tm_shared_shaper *ss;
+
+ TAILQ_FOREACH(ss, ssl, node)
+ if (shared_shaper_id == ss->shared_shaper_id)
+ return ss;
+
+ return NULL;
+}
+
+static struct tm_wred_profile *
+tm_wred_profile_search(struct rte_eth_dev *dev, uint32_t wred_profile_id)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_wred_profile_list *wpl = &p->soft.tm.h.wred_profiles;
+ struct tm_wred_profile *wp;
+
+ TAILQ_FOREACH(wp, wpl, node)
+ if (wred_profile_id == wp->wred_profile_id)
+ return wp;
+
+ return NULL;
}
static struct tm_node *
@@ -194,6 +321,94 @@ tm_node_search(struct rte_eth_dev *dev, uint32_t node_id)
return NULL;
}
+static struct tm_node *
+tm_root_node_present(struct rte_eth_dev *dev)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_node_list *nl = &p->soft.tm.h.nodes;
+ struct tm_node *n;
+
+ TAILQ_FOREACH(n, nl, node)
+ if (n->parent_node_id == RTE_TM_NODE_ID_NULL)
+ return n;
+
+ return NULL;
+}
+
+static uint32_t
+tm_node_subport_id(struct rte_eth_dev *dev, struct tm_node *subport_node)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_node_list *nl = &p->soft.tm.h.nodes;
+ struct tm_node *ns;
+ uint32_t subport_id;
+
+ subport_id = 0;
+ TAILQ_FOREACH(ns, nl, node) {
+ if (ns->level != TM_NODE_LEVEL_SUBPORT)
+ continue;
+
+ if (ns->node_id == subport_node->node_id)
+ return subport_id;
+
+ subport_id++;
+ }
+
+ return UINT32_MAX;
+}
+
+static uint32_t
+tm_node_pipe_id(struct rte_eth_dev *dev, struct tm_node *pipe_node)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_node_list *nl = &p->soft.tm.h.nodes;
+ struct tm_node *np;
+ uint32_t pipe_id;
+
+ pipe_id = 0;
+ TAILQ_FOREACH(np, nl, node) {
+ if (np->level != TM_NODE_LEVEL_PIPE ||
+ np->parent_node_id != pipe_node->parent_node_id)
+ continue;
+
+ if (np->node_id == pipe_node->node_id)
+ return pipe_id;
+
+ pipe_id++;
+ }
+
+ return UINT32_MAX;
+}
+
+static uint32_t
+tm_node_tc_id(struct rte_eth_dev *dev __rte_unused, struct tm_node *tc_node)
+{
+ return tc_node->priority;
+}
+
+static uint32_t
+tm_node_queue_id(struct rte_eth_dev *dev, struct tm_node *queue_node)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_node_list *nl = &p->soft.tm.h.nodes;
+ struct tm_node *nq;
+ uint32_t queue_id;
+
+ queue_id = 0;
+ TAILQ_FOREACH(nq, nl, node) {
+ if (nq->level != TM_NODE_LEVEL_QUEUE ||
+ nq->parent_node_id != queue_node->parent_node_id)
+ continue;
+
+ if (nq->node_id == queue_node->node_id)
+ return queue_id;
+
+ queue_id++;
+ }
+
+ return UINT32_MAX;
+}
+
static uint32_t
tm_level_get_max_nodes(struct rte_eth_dev *dev, enum tm_node_level level)
{
@@ -219,6 +434,35 @@ tm_level_get_max_nodes(struct rte_eth_dev *dev, enum tm_node_level level)
}
}
+/* Traffic manager node type get */
+static int
+pmd_tm_node_type_get(struct rte_eth_dev *dev,
+ uint32_t node_id,
+ int *is_leaf,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ if (is_leaf == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+
+ if (node_id == RTE_TM_NODE_ID_NULL ||
+ (tm_node_search(dev, node_id) == NULL))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ *is_leaf = node_id < p->params.soft.tm.nb_queues;
+
+ return 0;
+}
+
#ifdef RTE_SCHED_RED
#define WRED_SUPPORTED 1
#else
@@ -674,8 +918,2535 @@ pmd_tm_node_capabilities_get(struct rte_eth_dev *dev __rte_unused,
return 0;
}
-const struct rte_tm_ops pmd_tm_ops = {
- .capabilities_get = pmd_tm_capabilities_get,
- .level_capabilities_get = pmd_tm_level_capabilities_get,
- .node_capabilities_get = pmd_tm_node_capabilities_get,
+static int
+shaper_profile_check(struct rte_eth_dev *dev,
+ uint32_t shaper_profile_id,
+ struct rte_tm_shaper_params *profile,
+ struct rte_tm_error *error)
+{
+ struct tm_shaper_profile *sp;
+
+ /* Shaper profile ID must not be NONE. */
+ if (shaper_profile_id == RTE_TM_SHAPER_PROFILE_ID_NONE)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_SHAPER_PROFILE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Shaper profile must not exist. */
+ sp = tm_shaper_profile_search(dev, shaper_profile_id);
+ if (sp)
+ return -rte_tm_error_set(error,
+ EEXIST,
+ RTE_TM_ERROR_TYPE_SHAPER_PROFILE_ID,
+ NULL,
+ rte_strerror(EEXIST));
+
+ /* Profile must not be NULL. */
+ if (profile == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_SHAPER_PROFILE,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Peak rate: non-zero, 32-bit */
+ if (profile->peak.rate == 0 ||
+ profile->peak.rate >= UINT32_MAX)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_SHAPER_PROFILE_PEAK_RATE,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Peak size: non-zero, 32-bit */
+ if (profile->peak.size == 0 ||
+ profile->peak.size >= UINT32_MAX)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_SHAPER_PROFILE_PEAK_SIZE,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Dual-rate profiles are not supported. */
+ if (profile->committed.rate != 0)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_SHAPER_PROFILE_COMMITTED_RATE,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Packet length adjust: 24 bytes */
+ if (profile->pkt_length_adjust != RTE_TM_ETH_FRAMING_OVERHEAD_FCS)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_SHAPER_PROFILE_PKT_ADJUST_LEN,
+ NULL,
+ rte_strerror(EINVAL));
+
+ return 0;
+}
+
+/* Traffic manager shaper profile add */
+static int
+pmd_tm_shaper_profile_add(struct rte_eth_dev *dev,
+ uint32_t shaper_profile_id,
+ struct rte_tm_shaper_params *profile,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_shaper_profile_list *spl = &p->soft.tm.h.shaper_profiles;
+ struct tm_shaper_profile *sp;
+ int status;
+
+ /* Check input params */
+ status = shaper_profile_check(dev, shaper_profile_id, profile, error);
+ if (status)
+ return status;
+
+ /* Memory allocation */
+ sp = calloc(1, sizeof(struct tm_shaper_profile));
+ if (sp == NULL)
+ return -rte_tm_error_set(error,
+ ENOMEM,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(ENOMEM));
+
+ /* Fill in */
+ sp->shaper_profile_id = shaper_profile_id;
+ memcpy(&sp->params, profile, sizeof(sp->params));
+
+ /* Add to list */
+ TAILQ_INSERT_TAIL(spl, sp, node);
+ p->soft.tm.h.n_shaper_profiles++;
+
+ return 0;
+}
+
+/* Traffic manager shaper profile delete */
+static int
+pmd_tm_shaper_profile_delete(struct rte_eth_dev *dev,
+ uint32_t shaper_profile_id,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_shaper_profile *sp;
+
+ /* Check existing */
+ sp = tm_shaper_profile_search(dev, shaper_profile_id);
+ if (sp == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_SHAPER_PROFILE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Check unused */
+ if (sp->n_users)
+ return -rte_tm_error_set(error,
+ EBUSY,
+ RTE_TM_ERROR_TYPE_SHAPER_PROFILE_ID,
+ NULL,
+ rte_strerror(EBUSY));
+
+ /* Remove from list */
+ TAILQ_REMOVE(&p->soft.tm.h.shaper_profiles, sp, node);
+ p->soft.tm.h.n_shaper_profiles--;
+ free(sp);
+
+ return 0;
+}
+
+static struct tm_node *
+tm_shared_shaper_get_tc(struct rte_eth_dev *dev,
+ struct tm_shared_shaper *ss)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_node_list *nl = &p->soft.tm.h.nodes;
+ struct tm_node *n;
+
+ /* Subport: each TC uses shared shaper */
+ TAILQ_FOREACH(n, nl, node) {
+ if (n->level != TM_NODE_LEVEL_TC ||
+ n->params.n_shared_shapers == 0 ||
+ n->params.shared_shaper_id[0] != ss->shared_shaper_id)
+ continue;
+
+ return n;
+ }
+
+ return NULL;
+}
+
+static int
+update_subport_tc_rate(struct rte_eth_dev *dev,
+ struct tm_node *nt,
+ struct tm_shared_shaper *ss,
+ struct tm_shaper_profile *sp_new)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ uint32_t tc_id = tm_node_tc_id(dev, nt);
+
+ struct tm_node *np = nt->parent_node;
+
+ struct tm_node *ns = np->parent_node;
+ uint32_t subport_id = tm_node_subport_id(dev, ns);
+
+ struct rte_sched_subport_params subport_params;
+
+ struct tm_shaper_profile *sp_old = tm_shaper_profile_search(dev,
+ ss->shaper_profile_id);
+
+ /* Derive new subport configuration. */
+ memcpy(&subport_params,
+ &p->soft.tm.params.subport_params[subport_id],
+ sizeof(subport_params));
+ subport_params.tc_rate[tc_id] = sp_new->params.peak.rate;
+
+ /* Update the subport configuration. */
+ if (rte_sched_subport_config(p->soft.tm.sched,
+ subport_id, &subport_params))
+ return -1;
+
+ /* Commit changes. */
+ sp_old->n_users--;
+
+ ss->shaper_profile_id = sp_new->shaper_profile_id;
+ sp_new->n_users++;
+
+ memcpy(&p->soft.tm.params.subport_params[subport_id],
+ &subport_params,
+ sizeof(subport_params));
+
+ return 0;
+}
+
+/* Traffic manager shared shaper add/update */
+static int
+pmd_tm_shared_shaper_add_update(struct rte_eth_dev *dev,
+ uint32_t shared_shaper_id,
+ uint32_t shaper_profile_id,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_shared_shaper *ss;
+ struct tm_shaper_profile *sp;
+ struct tm_node *nt;
+
+ /* Shaper profile must be valid. */
+ sp = tm_shaper_profile_search(dev, shaper_profile_id);
+ if (sp == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_SHAPER_PROFILE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /**
+ * Add new shared shaper
+ */
+ ss = tm_shared_shaper_search(dev, shared_shaper_id);
+ if (ss == NULL) {
+ struct tm_shared_shaper_list *ssl =
+ &p->soft.tm.h.shared_shapers;
+
+ /* Hierarchy must not be frozen */
+ if (p->soft.tm.hierarchy_frozen)
+ return -rte_tm_error_set(error,
+ EBUSY,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EBUSY));
+
+ /* Memory allocation */
+ ss = calloc(1, sizeof(struct tm_shared_shaper));
+ if (ss == NULL)
+ return -rte_tm_error_set(error,
+ ENOMEM,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(ENOMEM));
+
+ /* Fill in */
+ ss->shared_shaper_id = shared_shaper_id;
+ ss->shaper_profile_id = shaper_profile_id;
+
+ /* Add to list */
+ TAILQ_INSERT_TAIL(ssl, ss, node);
+ p->soft.tm.h.n_shared_shapers++;
+
+ return 0;
+ }
+
+ /**
+ * Update existing shared shaper
+ */
+ /* Hierarchy must be frozen (run-time update) */
+ if (p->soft.tm.hierarchy_frozen == 0)
+ return -rte_tm_error_set(error,
+ EBUSY,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EBUSY));
+
+
+ /* Propagate change. */
+ nt = tm_shared_shaper_get_tc(dev, ss);
+ if (update_subport_tc_rate(dev, nt, ss, sp))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+
+ return 0;
+}
+
+/* Traffic manager shared shaper delete */
+static int
+pmd_tm_shared_shaper_delete(struct rte_eth_dev *dev,
+ uint32_t shared_shaper_id,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_shared_shaper *ss;
+
+ /* Check existing */
+ ss = tm_shared_shaper_search(dev, shared_shaper_id);
+ if (ss == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_SHARED_SHAPER_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Check unused */
+ if (ss->n_users)
+ return -rte_tm_error_set(error,
+ EBUSY,
+ RTE_TM_ERROR_TYPE_SHARED_SHAPER_ID,
+ NULL,
+ rte_strerror(EBUSY));
+
+ /* Remove from list */
+ TAILQ_REMOVE(&p->soft.tm.h.shared_shapers, ss, node);
+ p->soft.tm.h.n_shared_shapers--;
+ free(ss);
+
+ return 0;
+}
+
+static int
+wred_profile_check(struct rte_eth_dev *dev,
+ uint32_t wred_profile_id,
+ struct rte_tm_wred_params *profile,
+ struct rte_tm_error *error)
+{
+ struct tm_wred_profile *wp;
+ enum rte_tm_color color;
+
+ /* WRED profile ID must not be NONE. */
+ if (wred_profile_id == RTE_TM_WRED_PROFILE_ID_NONE)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_WRED_PROFILE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* WRED profile must not exist. */
+ wp = tm_wred_profile_search(dev, wred_profile_id);
+ if (wp)
+ return -rte_tm_error_set(error,
+ EEXIST,
+ RTE_TM_ERROR_TYPE_WRED_PROFILE_ID,
+ NULL,
+ rte_strerror(EEXIST));
+
+ /* Profile must not be NULL. */
+ if (profile == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_WRED_PROFILE,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* min_th <= max_th, max_th > 0 */
+ for (color = RTE_TM_GREEN; color < RTE_TM_COLORS; color++) {
+ uint16_t min_th = profile->red_params[color].min_th;
+ uint16_t max_th = profile->red_params[color].max_th;
+
+ if (min_th > max_th || max_th == 0)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_WRED_PROFILE,
+ NULL,
+ rte_strerror(EINVAL));
+ }
+
+ return 0;
+}
+
+/* Traffic manager WRED profile add */
+static int
+pmd_tm_wred_profile_add(struct rte_eth_dev *dev,
+ uint32_t wred_profile_id,
+ struct rte_tm_wred_params *profile,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_wred_profile_list *wpl = &p->soft.tm.h.wred_profiles;
+ struct tm_wred_profile *wp;
+ int status;
+
+ /* Check input params */
+ status = wred_profile_check(dev, wred_profile_id, profile, error);
+ if (status)
+ return status;
+
+ /* Memory allocation */
+ wp = calloc(1, sizeof(struct tm_wred_profile));
+ if (wp == NULL)
+ return -rte_tm_error_set(error,
+ ENOMEM,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(ENOMEM));
+
+ /* Fill in */
+ wp->wred_profile_id = wred_profile_id;
+ memcpy(&wp->params, profile, sizeof(wp->params));
+
+ /* Add to list */
+ TAILQ_INSERT_TAIL(wpl, wp, node);
+ p->soft.tm.h.n_wred_profiles++;
+
+ return 0;
+}
+
+/* Traffic manager WRED profile delete */
+static int
+pmd_tm_wred_profile_delete(struct rte_eth_dev *dev,
+ uint32_t wred_profile_id,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_wred_profile *wp;
+
+ /* Check existing */
+ wp = tm_wred_profile_search(dev, wred_profile_id);
+ if (wp == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_WRED_PROFILE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Check unused */
+ if (wp->n_users)
+ return -rte_tm_error_set(error,
+ EBUSY,
+ RTE_TM_ERROR_TYPE_WRED_PROFILE_ID,
+ NULL,
+ rte_strerror(EBUSY));
+
+ /* Remove from list */
+ TAILQ_REMOVE(&p->soft.tm.h.wred_profiles, wp, node);
+ p->soft.tm.h.n_wred_profiles--;
+ free(wp);
+
+ return 0;
+}
+
+static int
+node_add_check_port(struct rte_eth_dev *dev,
+ uint32_t node_id,
+ uint32_t parent_node_id __rte_unused,
+ uint32_t priority,
+ uint32_t weight,
+ uint32_t level_id __rte_unused,
+ struct rte_tm_node_params *params,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_shaper_profile *sp = tm_shaper_profile_search(dev,
+ params->shaper_profile_id);
+
+ /* node type: non-leaf */
+ if (node_id < p->params.soft.tm.nb_queues)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Priority must be 0 */
+ if (priority != 0)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PRIORITY,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Weight must be 1 */
+ if (weight != 1)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_WEIGHT,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Shaper must be valid.
+ * Shaper profile peak rate must fit the configured port rate.
+ */
+ if (params->shaper_profile_id == RTE_TM_SHAPER_PROFILE_ID_NONE ||
+ sp == NULL ||
+ sp->params.peak.rate > p->params.soft.tm.rate)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_SHAPER_PROFILE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* No shared shapers */
+ if (params->n_shared_shapers != 0)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_N_SHARED_SHAPERS,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Number of SP priorities must be 1 */
+ if (params->nonleaf.n_sp_priorities != 1)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_N_SP_PRIORITIES,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Stats */
+ if (params->stats_mask & ~STATS_MASK_DEFAULT)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_STATS,
+ NULL,
+ rte_strerror(EINVAL));
+
+ return 0;
+}
+
+static int
+node_add_check_subport(struct rte_eth_dev *dev,
+ uint32_t node_id,
+ uint32_t parent_node_id __rte_unused,
+ uint32_t priority,
+ uint32_t weight,
+ uint32_t level_id __rte_unused,
+ struct rte_tm_node_params *params,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ /* node type: non-leaf */
+ if (node_id < p->params.soft.tm.nb_queues)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Priority must be 0 */
+ if (priority != 0)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PRIORITY,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Weight must be 1 */
+ if (weight != 1)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_WEIGHT,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Shaper must be valid */
+ if (params->shaper_profile_id == RTE_TM_SHAPER_PROFILE_ID_NONE ||
+ (!tm_shaper_profile_search(dev, params->shaper_profile_id)))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_SHAPER_PROFILE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* No shared shapers */
+ if (params->n_shared_shapers != 0)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_N_SHARED_SHAPERS,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Number of SP priorities must be 1 */
+ if (params->nonleaf.n_sp_priorities != 1)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_N_SP_PRIORITIES,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Stats */
+ if (params->stats_mask & ~STATS_MASK_DEFAULT)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_STATS,
+ NULL,
+ rte_strerror(EINVAL));
+
+ return 0;
+}
+
+static int
+node_add_check_pipe(struct rte_eth_dev *dev,
+ uint32_t node_id,
+ uint32_t parent_node_id __rte_unused,
+ uint32_t priority,
+ uint32_t weight __rte_unused,
+ uint32_t level_id __rte_unused,
+ struct rte_tm_node_params *params,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ /* node type: non-leaf */
+ if (node_id < p->params.soft.tm.nb_queues)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Priority must be 0 */
+ if (priority != 0)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PRIORITY,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Shaper must be valid */
+ if (params->shaper_profile_id == RTE_TM_SHAPER_PROFILE_ID_NONE ||
+ (!tm_shaper_profile_search(dev, params->shaper_profile_id)))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_SHAPER_PROFILE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* No shared shapers */
+ if (params->n_shared_shapers != 0)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_N_SHARED_SHAPERS,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Number of SP priorities must be 4 */
+ if (params->nonleaf.n_sp_priorities !=
+ RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_N_SP_PRIORITIES,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* WFQ mode must be byte mode */
+ if (params->nonleaf.wfq_weight_mode != NULL &&
+ params->nonleaf.wfq_weight_mode[0] != 0 &&
+ params->nonleaf.wfq_weight_mode[1] != 0 &&
+ params->nonleaf.wfq_weight_mode[2] != 0 &&
+ params->nonleaf.wfq_weight_mode[3] != 0)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_WFQ_WEIGHT_MODE,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Stats */
+ if (params->stats_mask & ~STATS_MASK_DEFAULT)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_STATS,
+ NULL,
+ rte_strerror(EINVAL));
+
+ return 0;
+}
+
+static int
+node_add_check_tc(struct rte_eth_dev *dev,
+ uint32_t node_id,
+ uint32_t parent_node_id __rte_unused,
+ uint32_t priority __rte_unused,
+ uint32_t weight,
+ uint32_t level_id __rte_unused,
+ struct rte_tm_node_params *params,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ /* node type: non-leaf */
+ if (node_id < p->params.soft.tm.nb_queues)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Weight must be 1 */
+ if (weight != 1)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_WEIGHT,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Shaper must be valid */
+ if (params->shaper_profile_id == RTE_TM_SHAPER_PROFILE_ID_NONE ||
+ (!tm_shaper_profile_search(dev, params->shaper_profile_id)))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_SHAPER_PROFILE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Single valid shared shaper */
+ if (params->n_shared_shapers > 1)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_N_SHARED_SHAPERS,
+ NULL,
+ rte_strerror(EINVAL));
+
+ if (params->n_shared_shapers == 1 &&
+ (params->shared_shaper_id == NULL ||
+ (!tm_shared_shaper_search(dev, params->shared_shaper_id[0]))))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_SHARED_SHAPER_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Number of priorities must be 1 */
+ if (params->nonleaf.n_sp_priorities != 1)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_N_SP_PRIORITIES,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Stats */
+ if (params->stats_mask & ~STATS_MASK_DEFAULT)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_STATS,
+ NULL,
+ rte_strerror(EINVAL));
+
+ return 0;
+}
+
+static int
+node_add_check_queue(struct rte_eth_dev *dev,
+ uint32_t node_id,
+ uint32_t parent_node_id __rte_unused,
+ uint32_t priority,
+ uint32_t weight __rte_unused,
+ uint32_t level_id __rte_unused,
+ struct rte_tm_node_params *params,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ /* node type: leaf */
+ if (node_id >= p->params.soft.tm.nb_queues)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Priority must be 0 */
+ if (priority != 0)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PRIORITY,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* No shaper */
+ if (params->shaper_profile_id != RTE_TM_SHAPER_PROFILE_ID_NONE)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_SHAPER_PROFILE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* No shared shapers */
+ if (params->n_shared_shapers != 0)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_N_SHARED_SHAPERS,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Congestion management must not be head drop */
+ if (params->leaf.cman == RTE_TM_CMAN_HEAD_DROP)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_CMAN,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Congestion management set to WRED */
+ if (params->leaf.cman == RTE_TM_CMAN_WRED) {
+ uint32_t wred_profile_id = params->leaf.wred.wred_profile_id;
+ struct tm_wred_profile *wp = tm_wred_profile_search(dev,
+ wred_profile_id);
+
+ /* WRED profile (for private WRED context) must be valid */
+ if (wred_profile_id == RTE_TM_WRED_PROFILE_ID_NONE ||
+ wp == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_WRED_PROFILE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* No shared WRED contexts */
+ if (params->leaf.wred.n_shared_wred_contexts != 0)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_N_SHARED_WRED_CONTEXTS,
+ NULL,
+ rte_strerror(EINVAL));
+ }
+
+ /* Stats */
+ if (params->stats_mask & ~STATS_MASK_QUEUE)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS_STATS,
+ NULL,
+ rte_strerror(EINVAL));
+
+ return 0;
+}
+
+static int
+node_add_check(struct rte_eth_dev *dev,
+ uint32_t node_id,
+ uint32_t parent_node_id,
+ uint32_t priority,
+ uint32_t weight,
+ uint32_t level_id,
+ struct rte_tm_node_params *params,
+ struct rte_tm_error *error)
+{
+ struct tm_node *pn;
+ uint32_t level;
+ int status;
+
+ /* node_id, parent_node_id:
+ * -node_id must not be RTE_TM_NODE_ID_NULL
+ * -node_id must not be in use
+ * -root node add (parent_node_id is RTE_TM_NODE_ID_NULL):
+ * -root node must not exist
+ * -non-root node add (parent_node_id is not RTE_TM_NODE_ID_NULL):
+ * -parent_node_id must be valid
+ */
+ if (node_id == RTE_TM_NODE_ID_NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ if (tm_node_search(dev, node_id))
+ return -rte_tm_error_set(error,
+ EEXIST,
+ RTE_TM_ERROR_TYPE_NODE_ID,
+ NULL,
+ rte_strerror(EEXIST));
+
+ if (parent_node_id == RTE_TM_NODE_ID_NULL) {
+ pn = NULL;
+ if (tm_root_node_present(dev))
+ return -rte_tm_error_set(error,
+ EEXIST,
+ RTE_TM_ERROR_TYPE_NODE_PARENT_NODE_ID,
+ NULL,
+ rte_strerror(EEXIST));
+ } else {
+ pn = tm_node_search(dev, parent_node_id);
+ if (pn == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARENT_NODE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+ }
+
+ /* priority: must be 0 .. 3 */
+ if (priority >= RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PRIORITY,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* weight: must be 1 .. 255 */
+ if (weight == 0 || weight >= UINT8_MAX)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_WEIGHT,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* level_id: if valid, then
+ * -root node add (parent_node_id is RTE_TM_NODE_ID_NULL):
+ * -level_id must be zero
+ * -non-root node add (parent_node_id is not RTE_TM_NODE_ID_NULL):
+ * -level_id must be parent level ID plus one
+ */
+ level = (pn == NULL) ? 0 : pn->level + 1;
+ if (level_id != RTE_TM_NODE_LEVEL_ID_ANY && level_id != level)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_LEVEL_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* params: must not be NULL */
+ if (params == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARAMS,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* params: per level checks */
+ switch (level) {
+ case TM_NODE_LEVEL_PORT:
+ status = node_add_check_port(dev, node_id,
+ parent_node_id, priority, weight, level_id,
+ params, error);
+ if (status)
+ return status;
+ break;
+
+ case TM_NODE_LEVEL_SUBPORT:
+ status = node_add_check_subport(dev, node_id,
+ parent_node_id, priority, weight, level_id,
+ params, error);
+ if (status)
+ return status;
+ break;
+
+ case TM_NODE_LEVEL_PIPE:
+ status = node_add_check_pipe(dev, node_id,
+ parent_node_id, priority, weight, level_id,
+ params, error);
+ if (status)
+ return status;
+ break;
+
+ case TM_NODE_LEVEL_TC:
+ status = node_add_check_tc(dev, node_id,
+ parent_node_id, priority, weight, level_id,
+ params, error);
+ if (status)
+ return status;
+ break;
+
+ case TM_NODE_LEVEL_QUEUE:
+ status = node_add_check_queue(dev, node_id,
+ parent_node_id, priority, weight, level_id,
+ params, error);
+ if (status)
+ return status;
+ break;
+
+ default:
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_LEVEL_ID,
+ NULL,
+ rte_strerror(EINVAL));
+ }
+
+ return 0;
+}
+
+/* Traffic manager node add */
+static int
+pmd_tm_node_add(struct rte_eth_dev *dev,
+ uint32_t node_id,
+ uint32_t parent_node_id,
+ uint32_t priority,
+ uint32_t weight,
+ uint32_t level_id,
+ struct rte_tm_node_params *params,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_node_list *nl = &p->soft.tm.h.nodes;
+ struct tm_node *n;
+ uint32_t i;
+ int status;
+
+ /* Checks */
+ if (p->soft.tm.hierarchy_frozen)
+ return -rte_tm_error_set(error,
+ EBUSY,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EBUSY));
+
+ status = node_add_check(dev, node_id, parent_node_id, priority, weight,
+ level_id, params, error);
+ if (status)
+ return status;
+
+ /* Memory allocation */
+ n = calloc(1, sizeof(struct tm_node));
+ if (n == NULL)
+ return -rte_tm_error_set(error,
+ ENOMEM,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(ENOMEM));
+
+ /* Fill in */
+ n->node_id = node_id;
+ n->parent_node_id = parent_node_id;
+ n->priority = priority;
+ n->weight = weight;
+
+ if (parent_node_id != RTE_TM_NODE_ID_NULL) {
+ n->parent_node = tm_node_search(dev, parent_node_id);
+ n->level = n->parent_node->level + 1;
+ }
+
+ if (params->shaper_profile_id != RTE_TM_SHAPER_PROFILE_ID_NONE)
+ n->shaper_profile = tm_shaper_profile_search(dev,
+ params->shaper_profile_id);
+
+ if (n->level == TM_NODE_LEVEL_QUEUE &&
+ params->leaf.cman == RTE_TM_CMAN_WRED)
+ n->wred_profile = tm_wred_profile_search(dev,
+ params->leaf.wred.wred_profile_id);
+
+ memcpy(&n->params, params, sizeof(n->params));
+
+ /* Add to list */
+ TAILQ_INSERT_TAIL(nl, n, node);
+ p->soft.tm.h.n_nodes++;
+
+ /* Update dependencies */
+ if (n->parent_node)
+ n->parent_node->n_children++;
+
+ if (n->shaper_profile)
+ n->shaper_profile->n_users++;
+
+ for (i = 0; i < params->n_shared_shapers; i++) {
+ struct tm_shared_shaper *ss;
+
+ ss = tm_shared_shaper_search(dev, params->shared_shaper_id[i]);
+ ss->n_users++;
+ }
+
+ if (n->wred_profile)
+ n->wred_profile->n_users++;
+
+ p->soft.tm.h.n_tm_nodes[n->level]++;
+
+ return 0;
+}
+
+/* Traffic manager node delete */
+static int
+pmd_tm_node_delete(struct rte_eth_dev *dev,
+ uint32_t node_id,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_node *n;
+ uint32_t i;
+
+ /* Check hierarchy changes are currently allowed */
+ if (p->soft.tm.hierarchy_frozen)
+ return -rte_tm_error_set(error,
+ EBUSY,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EBUSY));
+
+ /* Check existing */
+ n = tm_node_search(dev, node_id);
+ if (n == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Check unused */
+ if (n->n_children)
+ return -rte_tm_error_set(error,
+ EBUSY,
+ RTE_TM_ERROR_TYPE_NODE_ID,
+ NULL,
+ rte_strerror(EBUSY));
+
+ /* Update dependencies */
+ p->soft.tm.h.n_tm_nodes[n->level]--;
+
+ if (n->wred_profile)
+ n->wred_profile->n_users--;
+
+ for (i = 0; i < n->params.n_shared_shapers; i++) {
+ struct tm_shared_shaper *ss;
+
+ ss = tm_shared_shaper_search(dev,
+ n->params.shared_shaper_id[i]);
+ ss->n_users--;
+ }
+
+ if (n->shaper_profile)
+ n->shaper_profile->n_users--;
+
+ if (n->parent_node)
+ n->parent_node->n_children--;
+
+ /* Remove from list */
+ TAILQ_REMOVE(&p->soft.tm.h.nodes, n, node);
+ p->soft.tm.h.n_nodes--;
+ free(n);
+
+ return 0;
+}
+
+
+static void
+pipe_profile_build(struct rte_eth_dev *dev,
+ struct tm_node *np,
+ struct rte_sched_pipe_params *pp)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_hierarchy *h = &p->soft.tm.h;
+ struct tm_node_list *nl = &h->nodes;
+ struct tm_node *nt, *nq;
+
+ memset(pp, 0, sizeof(*pp));
+
+ /* Pipe */
+ pp->tb_rate = np->shaper_profile->params.peak.rate;
+ pp->tb_size = np->shaper_profile->params.peak.size;
+
+ /* Traffic Class (TC) */
+ pp->tc_period = PIPE_TC_PERIOD;
+
+#ifdef RTE_SCHED_SUBPORT_TC_OV
+ pp->tc_ov_weight = np->weight;
+#endif
+
+ TAILQ_FOREACH(nt, nl, node) {
+ uint32_t queue_id = 0;
+
+ if (nt->level != TM_NODE_LEVEL_TC ||
+ nt->parent_node_id != np->node_id)
+ continue;
+
+ pp->tc_rate[nt->priority] =
+ nt->shaper_profile->params.peak.rate;
+
+ /* Queue */
+ TAILQ_FOREACH(nq, nl, node) {
+ uint32_t pipe_queue_id;
+
+ if (nq->level != TM_NODE_LEVEL_QUEUE ||
+ nq->parent_node_id != nt->node_id)
+ continue;
+
+ pipe_queue_id = nt->priority *
+ RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS + queue_id;
+ pp->wrr_weights[pipe_queue_id] = nq->weight;
+
+ queue_id++;
+ }
+ }
+}
+
+static int
+pipe_profile_free_exists(struct rte_eth_dev *dev,
+ uint32_t *pipe_profile_id)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_params *t = &p->soft.tm.params;
+
+ if (t->n_pipe_profiles < RTE_SCHED_PIPE_PROFILES_PER_PORT) {
+ *pipe_profile_id = t->n_pipe_profiles;
+ return 1;
+ }
+
+ return 0;
+}
+
+static int
+pipe_profile_exists(struct rte_eth_dev *dev,
+ struct rte_sched_pipe_params *pp,
+ uint32_t *pipe_profile_id)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_params *t = &p->soft.tm.params;
+ uint32_t i;
+
+ for (i = 0; i < t->n_pipe_profiles; i++)
+ if (memcmp(&t->pipe_profiles[i], pp, sizeof(*pp)) == 0) {
+ if (pipe_profile_id)
+ *pipe_profile_id = i;
+ return 1;
+ }
+
+ return 0;
+}
+
+static void
+pipe_profile_install(struct rte_eth_dev *dev,
+ struct rte_sched_pipe_params *pp,
+ uint32_t pipe_profile_id)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_params *t = &p->soft.tm.params;
+
+ memcpy(&t->pipe_profiles[pipe_profile_id], pp, sizeof(*pp));
+ t->n_pipe_profiles++;
+}
+
+static void
+pipe_profile_mark(struct rte_eth_dev *dev,
+ uint32_t subport_id,
+ uint32_t pipe_id,
+ uint32_t pipe_profile_id)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_hierarchy *h = &p->soft.tm.h;
+ struct tm_params *t = &p->soft.tm.params;
+ uint32_t n_pipes_per_subport, pos;
+
+ n_pipes_per_subport = h->n_tm_nodes[TM_NODE_LEVEL_PIPE] /
+ h->n_tm_nodes[TM_NODE_LEVEL_SUBPORT];
+ pos = subport_id * n_pipes_per_subport + pipe_id;
+
+ t->pipe_to_profile[pos] = pipe_profile_id;
+}
+
+static struct rte_sched_pipe_params *
+pipe_profile_get(struct rte_eth_dev *dev, struct tm_node *np)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_hierarchy *h = &p->soft.tm.h;
+ struct tm_params *t = &p->soft.tm.params;
+ uint32_t n_pipes_per_subport = h->n_tm_nodes[TM_NODE_LEVEL_PIPE] /
+ h->n_tm_nodes[TM_NODE_LEVEL_SUBPORT];
+
+ uint32_t subport_id = tm_node_subport_id(dev, np->parent_node);
+ uint32_t pipe_id = tm_node_pipe_id(dev, np);
+
+ uint32_t pos = subport_id * n_pipes_per_subport + pipe_id;
+ uint32_t pipe_profile_id = t->pipe_to_profile[pos];
+
+ return &t->pipe_profiles[pipe_profile_id];
+}
+
+static int
+pipe_profiles_generate(struct rte_eth_dev *dev)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_hierarchy *h = &p->soft.tm.h;
+ struct tm_node_list *nl = &h->nodes;
+ struct tm_node *ns, *np;
+ uint32_t subport_id;
+
+ /* Objective: Fill in the following fields in struct tm_params:
+ * - pipe_profiles
+ * - n_pipe_profiles
+ * - pipe_to_profile
+ */
+
+ subport_id = 0;
+ TAILQ_FOREACH(ns, nl, node) {
+ uint32_t pipe_id;
+
+ if (ns->level != TM_NODE_LEVEL_SUBPORT)
+ continue;
+
+ pipe_id = 0;
+ TAILQ_FOREACH(np, nl, node) {
+ struct rte_sched_pipe_params pp;
+ uint32_t pos;
+
+ if (np->level != TM_NODE_LEVEL_PIPE ||
+ np->parent_node_id != ns->node_id)
+ continue;
+
+ pipe_profile_build(dev, np, &pp);
+
+ if (!pipe_profile_exists(dev, &pp, &pos)) {
+ if (!pipe_profile_free_exists(dev, &pos))
+ return -1;
+
+ pipe_profile_install(dev, &pp, pos);
+ }
+
+ pipe_profile_mark(dev, subport_id, pipe_id, pos);
+
+ pipe_id++;
+ }
+
+ subport_id++;
+ }
+
+ return 0;
+}
+
+static struct tm_wred_profile *
+tm_tc_wred_profile_get(struct rte_eth_dev *dev, uint32_t tc_id)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_hierarchy *h = &p->soft.tm.h;
+ struct tm_node_list *nl = &h->nodes;
+ struct tm_node *nq;
+
+ TAILQ_FOREACH(nq, nl, node) {
+ if (nq->level != TM_NODE_LEVEL_QUEUE ||
+ nq->parent_node->priority != tc_id)
+ continue;
+
+ return nq->wred_profile;
+ }
+
+ return NULL;
+}
+
+#ifdef RTE_SCHED_RED
+
+static void
+wred_profiles_set(struct rte_eth_dev *dev)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct rte_sched_port_params *pp = &p->soft.tm.params.port_params;
+ uint32_t tc_id;
+ enum rte_tm_color color;
+
+ for (tc_id = 0; tc_id < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; tc_id++)
+ for (color = RTE_TM_GREEN; color < RTE_TM_COLORS; color++) {
+ struct rte_red_params *dst =
+ &pp->red_params[tc_id][color];
+ struct tm_wred_profile *src_wp =
+ tm_tc_wred_profile_get(dev, tc_id);
+ struct rte_tm_red_params *src =
+ &src_wp->params.red_params[color];
+
+ memcpy(dst, src, sizeof(*dst));
+ }
+}
+
+#else
+
+#define wred_profiles_set(dev)
+
+#endif
+
+static struct tm_shared_shaper *
+tm_tc_shared_shaper_get(struct rte_eth_dev *dev, struct tm_node *tc_node)
+{
+ return (tc_node->params.n_shared_shapers) ?
+ tm_shared_shaper_search(dev,
+ tc_node->params.shared_shaper_id[0]) :
+ NULL;
+}
+
+static struct tm_shared_shaper *
+tm_subport_tc_shared_shaper_get(struct rte_eth_dev *dev,
+ struct tm_node *subport_node,
+ uint32_t tc_id)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_node_list *nl = &p->soft.tm.h.nodes;
+ struct tm_node *n;
+
+ TAILQ_FOREACH(n, nl, node) {
+ if (n->level != TM_NODE_LEVEL_TC ||
+ n->parent_node->parent_node_id !=
+ subport_node->node_id ||
+ n->priority != tc_id)
+ continue;
+
+ return tm_tc_shared_shaper_get(dev, n);
+ }
+
+ return NULL;
+}
+
+static int
+hierarchy_commit_check(struct rte_eth_dev *dev, struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_hierarchy *h = &p->soft.tm.h;
+ struct tm_node_list *nl = &h->nodes;
+ struct tm_shared_shaper_list *ssl = &h->shared_shapers;
+ struct tm_wred_profile_list *wpl = &h->wred_profiles;
+ struct tm_node *nr = tm_root_node_present(dev), *ns, *np, *nt, *nq;
+ struct tm_shared_shaper *ss;
+
+ uint32_t n_pipes_per_subport;
+
+ /* Root node exists. */
+ if (nr == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_LEVEL_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* There is at least one subport, max is not exceeded. */
+ if (nr->n_children == 0 || nr->n_children > TM_MAX_SUBPORTS)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_LEVEL_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* There is at least one pipe. */
+ if (h->n_tm_nodes[TM_NODE_LEVEL_PIPE] == 0)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_LEVEL_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Number of pipes is the same for all subports. Maximum number of pipes
+ * per subport is not exceeded.
+ */
+ n_pipes_per_subport = h->n_tm_nodes[TM_NODE_LEVEL_PIPE] /
+ h->n_tm_nodes[TM_NODE_LEVEL_SUBPORT];
+
+ if (n_pipes_per_subport > TM_MAX_PIPES_PER_SUBPORT)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+
+ TAILQ_FOREACH(ns, nl, node) {
+ if (ns->level != TM_NODE_LEVEL_SUBPORT)
+ continue;
+
+ if (ns->n_children != n_pipes_per_subport)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ }
+
+ /* Each pipe has exactly 4 TCs, with exactly one TC for each priority */
+ TAILQ_FOREACH(np, nl, node) {
+ uint32_t mask = 0, mask_expected =
+ RTE_LEN2MASK(RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE,
+ uint32_t);
+
+ if (np->level != TM_NODE_LEVEL_PIPE)
+ continue;
+
+ if (np->n_children != RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+
+ TAILQ_FOREACH(nt, nl, node) {
+ if (nt->level != TM_NODE_LEVEL_TC ||
+ nt->parent_node_id != np->node_id)
+ continue;
+
+ mask |= 1 << nt->priority;
+ }
+
+ if (mask != mask_expected)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ }
+
+ /* Each TC has exactly 4 packet queues. */
+ TAILQ_FOREACH(nt, nl, node) {
+ if (nt->level != TM_NODE_LEVEL_TC)
+ continue;
+
+ if (nt->n_children != RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ }
+
+ /**
+ * Shared shapers:
+ * -For each TC #i, all pipes in the same subport use the same
+ * shared shaper (or no shared shaper) for their TC#i.
+ * -Each shared shaper needs to have at least one user. All its
+ * users have to be TC nodes with the same priority and the same
+ * subport.
+ */
+ TAILQ_FOREACH(ns, nl, node) {
+ struct tm_shared_shaper *s[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
+ uint32_t id;
+
+ if (ns->level != TM_NODE_LEVEL_SUBPORT)
+ continue;
+
+ for (id = 0; id < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; id++)
+ s[id] = tm_subport_tc_shared_shaper_get(dev, ns, id);
+
+ TAILQ_FOREACH(nt, nl, node) {
+ struct tm_shared_shaper *subport_ss, *tc_ss;
+
+ if (nt->level != TM_NODE_LEVEL_TC ||
+ nt->parent_node->parent_node_id !=
+ ns->node_id)
+ continue;
+
+ subport_ss = s[nt->priority];
+ tc_ss = tm_tc_shared_shaper_get(dev, nt);
+
+ if (subport_ss == NULL && tc_ss == NULL)
+ continue;
+
+ if ((subport_ss == NULL && tc_ss != NULL) ||
+ (subport_ss != NULL && tc_ss == NULL) ||
+ subport_ss->shared_shaper_id !=
+ tc_ss->shared_shaper_id)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ }
+ }
+
+ TAILQ_FOREACH(ss, ssl, node) {
+ struct tm_node *nt_any = tm_shared_shaper_get_tc(dev, ss);
+ uint32_t n_users = 0;
+
+ if (nt_any != NULL)
+ TAILQ_FOREACH(nt, nl, node) {
+ if (nt->level != TM_NODE_LEVEL_TC ||
+ nt->priority != nt_any->priority ||
+ nt->parent_node->parent_node_id !=
+ nt_any->parent_node->parent_node_id)
+ continue;
+
+ n_users++;
+ }
+
+ if (ss->n_users == 0 || ss->n_users != n_users)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ }
+
+ /* Not too many pipe profiles. */
+ if (pipe_profiles_generate(dev))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /**
+ * WRED (when used, i.e. at least one WRED profile defined):
+ * -Each WRED profile must have at least one user.
+ * -All leaf nodes must have their private WRED context enabled.
+ * -For each TC #i, all leaf nodes must use the same WRED profile
+ * for their private WRED context.
+ */
+ if (h->n_wred_profiles) {
+ struct tm_wred_profile *wp;
+ struct tm_wred_profile *w[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
+ uint32_t id;
+
+ TAILQ_FOREACH(wp, wpl, node)
+ if (wp->n_users == 0)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+
+ for (id = 0; id < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; id++) {
+ w[id] = tm_tc_wred_profile_get(dev, id);
+
+ if (w[id] == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ }
+
+ TAILQ_FOREACH(nq, nl, node) {
+ uint32_t id;
+
+ if (nq->level != TM_NODE_LEVEL_QUEUE)
+ continue;
+
+ id = nq->parent_node->priority;
+
+ if (nq->wred_profile == NULL ||
+ nq->wred_profile->wred_profile_id !=
+ w[id]->wred_profile_id)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ }
+ }
+
+ return 0;
+}
+
+static void
+hierarchy_blueprints_create(struct rte_eth_dev *dev)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_params *t = &p->soft.tm.params;
+ struct tm_hierarchy *h = &p->soft.tm.h;
+
+ struct tm_node_list *nl = &h->nodes;
+ struct tm_node *root = tm_root_node_present(dev), *n;
+
+ uint32_t subport_id;
+
+ t->port_params = (struct rte_sched_port_params) {
+ .name = dev->data->name,
+ .socket = dev->data->numa_node,
+ .rate = root->shaper_profile->params.peak.rate,
+ .mtu = dev->data->mtu,
+ .frame_overhead =
+ root->shaper_profile->params.pkt_length_adjust,
+ .n_subports_per_port = root->n_children,
+ .n_pipes_per_subport = h->n_tm_nodes[TM_NODE_LEVEL_PIPE] /
+ h->n_tm_nodes[TM_NODE_LEVEL_SUBPORT],
+ .qsize = {p->params.soft.tm.qsize[0],
+ p->params.soft.tm.qsize[1],
+ p->params.soft.tm.qsize[2],
+ p->params.soft.tm.qsize[3],
+ },
+ .pipe_profiles = t->pipe_profiles,
+ .n_pipe_profiles = t->n_pipe_profiles,
+ };
+
+ wred_profiles_set(dev);
+
+ subport_id = 0;
+ TAILQ_FOREACH(n, nl, node) {
+ uint64_t tc_rate[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
+ uint32_t i;
+
+ if (n->level != TM_NODE_LEVEL_SUBPORT)
+ continue;
+
+ for (i = 0; i < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; i++) {
+ struct tm_shared_shaper *ss;
+ struct tm_shaper_profile *sp;
+
+ ss = tm_subport_tc_shared_shaper_get(dev, n, i);
+ sp = (ss) ? tm_shaper_profile_search(dev,
+ ss->shaper_profile_id) :
+ n->shaper_profile;
+ tc_rate[i] = sp->params.peak.rate;
+ }
+
+ t->subport_params[subport_id] =
+ (struct rte_sched_subport_params) {
+ .tb_rate = n->shaper_profile->params.peak.rate,
+ .tb_size = n->shaper_profile->params.peak.size,
+
+ .tc_rate = {tc_rate[0],
+ tc_rate[1],
+ tc_rate[2],
+ tc_rate[3],
+ },
+ .tc_period = SUBPORT_TC_PERIOD,
+ };
+
+ subport_id++;
+ }
+}
+
+/* Traffic manager hierarchy commit */
+static int
+pmd_tm_hierarchy_commit(struct rte_eth_dev *dev,
+ int clear_on_fail,
+ struct rte_tm_error *error)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ int status;
+
+ /* Checks */
+ if (p->soft.tm.hierarchy_frozen)
+ return -rte_tm_error_set(error,
+ EBUSY,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EBUSY));
+
+ status = hierarchy_commit_check(dev, error);
+ if (status) {
+ if (clear_on_fail) {
+ tm_hierarchy_uninit(p);
+ tm_hierarchy_init(p);
+ }
+
+ return status;
+ }
+
+ /* Create blueprints */
+ hierarchy_blueprints_create(dev);
+
+ /* Freeze hierarchy */
+ p->soft.tm.hierarchy_frozen = 1;
+
+ return 0;
+}
+
+#ifdef RTE_SCHED_SUBPORT_TC_OV
+
+static int
+update_pipe_weight(struct rte_eth_dev *dev, struct tm_node *np, uint32_t weight)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ uint32_t pipe_id = tm_node_pipe_id(dev, np);
+
+ struct tm_node *ns = np->parent_node;
+ uint32_t subport_id = tm_node_subport_id(dev, ns);
+
+ struct rte_sched_pipe_params *profile0 = pipe_profile_get(dev, np);
+ struct rte_sched_pipe_params profile1;
+ uint32_t pipe_profile_id;
+
+ /* Derive new pipe profile. */
+ memcpy(&profile1, profile0, sizeof(profile1));
+ profile1.tc_ov_weight = (uint8_t)weight;
+
+ /* Since implementation does not allow adding more pipe profiles after
+ * port configuration, the pipe configuration can be successfully
+ * updated only if the new profile is also part of the existing set of
+ * pipe profiles.
+ */
+ if (pipe_profile_exists(dev, &profile1, &pipe_profile_id) == 0)
+ return -1;
+
+ /* Update the pipe profile used by the current pipe. */
+ if (rte_sched_pipe_config(p->soft.tm.sched, subport_id, pipe_id,
+ (int32_t)pipe_profile_id))
+ return -1;
+
+ /* Commit changes. */
+ pipe_profile_mark(dev, subport_id, pipe_id, pipe_profile_id);
+ np->weight = weight;
+
+ return 0;
+}
+
+#endif
+
+static int
+update_queue_weight(struct rte_eth_dev *dev,
+ struct tm_node *nq, uint32_t weight)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ uint32_t queue_id = tm_node_queue_id(dev, nq);
+
+ struct tm_node *nt = nq->parent_node;
+ uint32_t tc_id = tm_node_tc_id(dev, nt);
+
+ struct tm_node *np = nt->parent_node;
+ uint32_t pipe_id = tm_node_pipe_id(dev, np);
+
+ struct tm_node *ns = np->parent_node;
+ uint32_t subport_id = tm_node_subport_id(dev, ns);
+
+ uint32_t pipe_queue_id =
+ tc_id * RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS + queue_id;
+
+ struct rte_sched_pipe_params *profile0 = pipe_profile_get(dev, np);
+ struct rte_sched_pipe_params profile1;
+ uint32_t pipe_profile_id;
+
+ /* Derive new pipe profile. */
+ memcpy(&profile1, profile0, sizeof(profile1));
+ profile1.wrr_weights[pipe_queue_id] = (uint8_t)weight;
+
+ /* Since implementation does not allow adding more pipe profiles after
+ * port configuration, the pipe configuration can be successfully
+ * updated only if the new profile is also part of the existing set
+ * of pipe profiles.
+ */
+ if (pipe_profile_exists(dev, &profile1, &pipe_profile_id) == 0)
+ return -1;
+
+ /* Update the pipe profile used by the current pipe. */
+ if (rte_sched_pipe_config(p->soft.tm.sched, subport_id, pipe_id,
+ (int32_t)pipe_profile_id))
+ return -1;
+
+ /* Commit changes. */
+ pipe_profile_mark(dev, subport_id, pipe_id, pipe_profile_id);
+ nq->weight = weight;
+
+ return 0;
+}
+
+/* Traffic manager node parent update */
+static int
+pmd_tm_node_parent_update(struct rte_eth_dev *dev,
+ uint32_t node_id,
+ uint32_t parent_node_id,
+ uint32_t priority,
+ uint32_t weight,
+ struct rte_tm_error *error)
+{
+ struct tm_node *n;
+
+ /* Port must be started and TM used. */
+ if (dev->data->dev_started == 0 && (tm_used(dev) == 0))
+ return -rte_tm_error_set(error,
+ EBUSY,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EBUSY));
+
+ /* Node must be valid */
+ n = tm_node_search(dev, node_id);
+ if (n == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Parent node must be the same */
+ if (n->parent_node_id != parent_node_id)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PARENT_NODE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Priority must be the same */
+ if (n->priority != priority)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_PRIORITY,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* weight: must be 1 .. 255 */
+ if (weight == 0 || weight >= UINT8_MAX)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_WEIGHT,
+ NULL,
+ rte_strerror(EINVAL));
+
+ switch (n->level) {
+ case TM_NODE_LEVEL_PORT:
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_WEIGHT,
+ NULL,
+ rte_strerror(EINVAL));
+ /* fall-through */
+ case TM_NODE_LEVEL_SUBPORT:
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_WEIGHT,
+ NULL,
+ rte_strerror(EINVAL));
+ /* fall-through */
+ case TM_NODE_LEVEL_PIPE:
+#ifdef RTE_SCHED_SUBPORT_TC_OV
+ if (update_pipe_weight(dev, n, weight))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ return 0;
+#else
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_WEIGHT,
+ NULL,
+ rte_strerror(EINVAL));
+#endif
+ /* fall-through */
+ case TM_NODE_LEVEL_TC:
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_WEIGHT,
+ NULL,
+ rte_strerror(EINVAL));
+ /* fall-through */
+ case TM_NODE_LEVEL_QUEUE:
+ /* fall-through */
+ default:
+ if (update_queue_weight(dev, n, weight))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ return 0;
+ }
+}
+
+static int
+update_subport_rate(struct rte_eth_dev *dev,
+ struct tm_node *ns,
+ struct tm_shaper_profile *sp)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ uint32_t subport_id = tm_node_subport_id(dev, ns);
+
+ struct rte_sched_subport_params subport_params;
+
+ /* Derive new subport configuration. */
+ memcpy(&subport_params,
+ &p->soft.tm.params.subport_params[subport_id],
+ sizeof(subport_params));
+ subport_params.tb_rate = sp->params.peak.rate;
+ subport_params.tb_size = sp->params.peak.size;
+
+ /* Update the subport configuration. */
+ if (rte_sched_subport_config(p->soft.tm.sched, subport_id,
+ &subport_params))
+ return -1;
+
+ /* Commit changes. */
+ ns->shaper_profile->n_users--;
+
+ ns->shaper_profile = sp;
+ ns->params.shaper_profile_id = sp->shaper_profile_id;
+ sp->n_users++;
+
+ memcpy(&p->soft.tm.params.subport_params[subport_id],
+ &subport_params,
+ sizeof(subport_params));
+
+ return 0;
+}
+
+static int
+update_pipe_rate(struct rte_eth_dev *dev,
+ struct tm_node *np,
+ struct tm_shaper_profile *sp)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ uint32_t pipe_id = tm_node_pipe_id(dev, np);
+
+ struct tm_node *ns = np->parent_node;
+ uint32_t subport_id = tm_node_subport_id(dev, ns);
+
+ struct rte_sched_pipe_params *profile0 = pipe_profile_get(dev, np);
+ struct rte_sched_pipe_params profile1;
+ uint32_t pipe_profile_id;
+
+ /* Derive new pipe profile. */
+ memcpy(&profile1, profile0, sizeof(profile1));
+ profile1.tb_rate = sp->params.peak.rate;
+ profile1.tb_size = sp->params.peak.size;
+
+ /* Since implementation does not allow adding more pipe profiles after
+ * port configuration, the pipe configuration can be successfully
+ * updated only if the new profile is also part of the existing set of
+ * pipe profiles.
+ */
+ if (pipe_profile_exists(dev, &profile1, &pipe_profile_id) == 0)
+ return -1;
+
+ /* Update the pipe profile used by the current pipe. */
+ if (rte_sched_pipe_config(p->soft.tm.sched, subport_id, pipe_id,
+ (int32_t)pipe_profile_id))
+ return -1;
+
+ /* Commit changes. */
+ pipe_profile_mark(dev, subport_id, pipe_id, pipe_profile_id);
+ np->shaper_profile->n_users--;
+ np->shaper_profile = sp;
+ np->params.shaper_profile_id = sp->shaper_profile_id;
+ sp->n_users++;
+
+ return 0;
+}
+
+static int
+update_tc_rate(struct rte_eth_dev *dev,
+ struct tm_node *nt,
+ struct tm_shaper_profile *sp)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ uint32_t tc_id = tm_node_tc_id(dev, nt);
+
+ struct tm_node *np = nt->parent_node;
+ uint32_t pipe_id = tm_node_pipe_id(dev, np);
+
+ struct tm_node *ns = np->parent_node;
+ uint32_t subport_id = tm_node_subport_id(dev, ns);
+
+ struct rte_sched_pipe_params *profile0 = pipe_profile_get(dev, np);
+ struct rte_sched_pipe_params profile1;
+ uint32_t pipe_profile_id;
+
+ /* Derive new pipe profile. */
+ memcpy(&profile1, profile0, sizeof(profile1));
+ profile1.tc_rate[tc_id] = sp->params.peak.rate;
+
+ /* Since implementation does not allow adding more pipe profiles after
+ * port configuration, the pipe configuration can be successfully
+ * updated only if the new profile is also part of the existing set of
+ * pipe profiles.
+ */
+ if (pipe_profile_exists(dev, &profile1, &pipe_profile_id) == 0)
+ return -1;
+
+ /* Update the pipe profile used by the current pipe. */
+ if (rte_sched_pipe_config(p->soft.tm.sched, subport_id, pipe_id,
+ (int32_t)pipe_profile_id))
+ return -1;
+
+ /* Commit changes. */
+ pipe_profile_mark(dev, subport_id, pipe_id, pipe_profile_id);
+ nt->shaper_profile->n_users--;
+ nt->shaper_profile = sp;
+ nt->params.shaper_profile_id = sp->shaper_profile_id;
+ sp->n_users++;
+
+ return 0;
+}
+
+/* Traffic manager node shaper update */
+static int
+pmd_tm_node_shaper_update(struct rte_eth_dev *dev,
+ uint32_t node_id,
+ uint32_t shaper_profile_id,
+ struct rte_tm_error *error)
+{
+ struct tm_node *n;
+ struct tm_shaper_profile *sp;
+
+ /* Port must be started and TM used. */
+ if (dev->data->dev_started == 0 && (tm_used(dev) == 0))
+ return -rte_tm_error_set(error,
+ EBUSY,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EBUSY));
+
+ /* Node must be valid */
+ n = tm_node_search(dev, node_id);
+ if (n == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ /* Shaper profile must be valid. */
+ sp = tm_shaper_profile_search(dev, shaper_profile_id);
+ if (sp == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_SHAPER_PROFILE,
+ NULL,
+ rte_strerror(EINVAL));
+
+ switch (n->level) {
+ case TM_NODE_LEVEL_PORT:
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ /* fall-through */
+ case TM_NODE_LEVEL_SUBPORT:
+ if (update_subport_rate(dev, n, sp))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ return 0;
+ /* fall-through */
+ case TM_NODE_LEVEL_PIPE:
+ if (update_pipe_rate(dev, n, sp))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ return 0;
+ /* fall-through */
+ case TM_NODE_LEVEL_TC:
+ if (update_tc_rate(dev, n, sp))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ return 0;
+ /* fall-through */
+ case TM_NODE_LEVEL_QUEUE:
+ /* fall-through */
+ default:
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ }
+}
+
+static inline uint32_t
+tm_port_queue_id(struct rte_eth_dev *dev,
+ uint32_t port_subport_id,
+ uint32_t subport_pipe_id,
+ uint32_t pipe_tc_id,
+ uint32_t tc_queue_id)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_hierarchy *h = &p->soft.tm.h;
+ uint32_t n_pipes_per_subport = h->n_tm_nodes[TM_NODE_LEVEL_PIPE] /
+ h->n_tm_nodes[TM_NODE_LEVEL_SUBPORT];
+
+ uint32_t port_pipe_id =
+ port_subport_id * n_pipes_per_subport + subport_pipe_id;
+ uint32_t port_tc_id =
+ port_pipe_id * RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE + pipe_tc_id;
+ uint32_t port_queue_id =
+ port_tc_id * RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS + tc_queue_id;
+
+ return port_queue_id;
+}
+
+static int
+read_port_stats(struct rte_eth_dev *dev,
+ struct tm_node *nr,
+ struct rte_tm_node_stats *stats,
+ uint64_t *stats_mask,
+ int clear)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct tm_hierarchy *h = &p->soft.tm.h;
+ uint32_t n_subports_per_port = h->n_tm_nodes[TM_NODE_LEVEL_SUBPORT];
+ uint32_t subport_id;
+
+ for (subport_id = 0; subport_id < n_subports_per_port; subport_id++) {
+ struct rte_sched_subport_stats s;
+ uint32_t tc_ov, id;
+
+ /* Stats read */
+ int status = rte_sched_subport_read_stats(
+ p->soft.tm.sched,
+ subport_id,
+ &s,
+ &tc_ov);
+ if (status)
+ return status;
+
+ /* Stats accumulate */
+ for (id = 0; id < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; id++) {
+ nr->stats.n_pkts +=
+ s.n_pkts_tc[id] - s.n_pkts_tc_dropped[id];
+ nr->stats.n_bytes +=
+ s.n_bytes_tc[id] - s.n_bytes_tc_dropped[id];
+ nr->stats.leaf.n_pkts_dropped[RTE_TM_GREEN] +=
+ s.n_pkts_tc_dropped[id];
+ nr->stats.leaf.n_bytes_dropped[RTE_TM_GREEN] +=
+ s.n_bytes_tc_dropped[id];
+ }
+ }
+
+ /* Stats copy */
+ if (stats)
+ memcpy(stats, &nr->stats, sizeof(*stats));
+
+ if (stats_mask)
+ *stats_mask = STATS_MASK_DEFAULT;
+
+ /* Stats clear */
+ if (clear)
+ memset(&nr->stats, 0, sizeof(nr->stats));
+
+ return 0;
+}
+
+static int
+read_subport_stats(struct rte_eth_dev *dev,
+ struct tm_node *ns,
+ struct rte_tm_node_stats *stats,
+ uint64_t *stats_mask,
+ int clear)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ uint32_t subport_id = tm_node_subport_id(dev, ns);
+ struct rte_sched_subport_stats s;
+ uint32_t tc_ov, tc_id;
+
+ /* Stats read */
+ int status = rte_sched_subport_read_stats(
+ p->soft.tm.sched,
+ subport_id,
+ &s,
+ &tc_ov);
+ if (status)
+ return status;
+
+ /* Stats accumulate */
+ for (tc_id = 0; tc_id < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; tc_id++) {
+ ns->stats.n_pkts +=
+ s.n_pkts_tc[tc_id] - s.n_pkts_tc_dropped[tc_id];
+ ns->stats.n_bytes +=
+ s.n_bytes_tc[tc_id] - s.n_bytes_tc_dropped[tc_id];
+ ns->stats.leaf.n_pkts_dropped[RTE_TM_GREEN] +=
+ s.n_pkts_tc_dropped[tc_id];
+ ns->stats.leaf.n_bytes_dropped[RTE_TM_GREEN] +=
+ s.n_bytes_tc_dropped[tc_id];
+ }
+
+ /* Stats copy */
+ if (stats)
+ memcpy(stats, &ns->stats, sizeof(*stats));
+
+ if (stats_mask)
+ *stats_mask = STATS_MASK_DEFAULT;
+
+ /* Stats clear */
+ if (clear)
+ memset(&ns->stats, 0, sizeof(ns->stats));
+
+ return 0;
+}
+
+static int
+read_pipe_stats(struct rte_eth_dev *dev,
+ struct tm_node *np,
+ struct rte_tm_node_stats *stats,
+ uint64_t *stats_mask,
+ int clear)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ uint32_t pipe_id = tm_node_pipe_id(dev, np);
+
+ struct tm_node *ns = np->parent_node;
+ uint32_t subport_id = tm_node_subport_id(dev, ns);
+
+ uint32_t i;
+
+ /* Stats read */
+ for (i = 0; i < RTE_SCHED_QUEUES_PER_PIPE; i++) {
+ struct rte_sched_queue_stats s;
+ uint16_t qlen;
+
+ uint32_t qid = tm_port_queue_id(dev,
+ subport_id,
+ pipe_id,
+ i / RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS,
+ i % RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS);
+
+ int status = rte_sched_queue_read_stats(
+ p->soft.tm.sched,
+ qid,
+ &s,
+ &qlen);
+ if (status)
+ return status;
+
+ /* Stats accumulate */
+ np->stats.n_pkts += s.n_pkts - s.n_pkts_dropped;
+ np->stats.n_bytes += s.n_bytes - s.n_bytes_dropped;
+ np->stats.leaf.n_pkts_dropped[RTE_TM_GREEN] += s.n_pkts_dropped;
+ np->stats.leaf.n_bytes_dropped[RTE_TM_GREEN] +=
+ s.n_bytes_dropped;
+ np->stats.leaf.n_pkts_queued = qlen;
+ }
+
+ /* Stats copy */
+ if (stats)
+ memcpy(stats, &np->stats, sizeof(*stats));
+
+ if (stats_mask)
+ *stats_mask = STATS_MASK_DEFAULT;
+
+ /* Stats clear */
+ if (clear)
+ memset(&np->stats, 0, sizeof(np->stats));
+
+ return 0;
+}
+
+static int
+read_tc_stats(struct rte_eth_dev *dev,
+ struct tm_node *nt,
+ struct rte_tm_node_stats *stats,
+ uint64_t *stats_mask,
+ int clear)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+
+ uint32_t tc_id = tm_node_tc_id(dev, nt);
+
+ struct tm_node *np = nt->parent_node;
+ uint32_t pipe_id = tm_node_pipe_id(dev, np);
+
+ struct tm_node *ns = np->parent_node;
+ uint32_t subport_id = tm_node_subport_id(dev, ns);
+
+ uint32_t i;
+
+ /* Stats read */
+ for (i = 0; i < RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS; i++) {
+ struct rte_sched_queue_stats s;
+ uint16_t qlen;
+
+ uint32_t qid = tm_port_queue_id(dev,
+ subport_id,
+ pipe_id,
+ tc_id,
+ i);
+
+ int status = rte_sched_queue_read_stats(
+ p->soft.tm.sched,
+ qid,
+ &s,
+ &qlen);
+ if (status)
+ return status;
+
+ /* Stats accumulate */
+ nt->stats.n_pkts += s.n_pkts - s.n_pkts_dropped;
+ nt->stats.n_bytes += s.n_bytes - s.n_bytes_dropped;
+ nt->stats.leaf.n_pkts_dropped[RTE_TM_GREEN] += s.n_pkts_dropped;
+ nt->stats.leaf.n_bytes_dropped[RTE_TM_GREEN] +=
+ s.n_bytes_dropped;
+ nt->stats.leaf.n_pkts_queued = qlen;
+ }
+
+ /* Stats copy */
+ if (stats)
+ memcpy(stats, &nt->stats, sizeof(*stats));
+
+ if (stats_mask)
+ *stats_mask = STATS_MASK_DEFAULT;
+
+ /* Stats clear */
+ if (clear)
+ memset(&nt->stats, 0, sizeof(nt->stats));
+
+ return 0;
+}
+
+static int
+read_queue_stats(struct rte_eth_dev *dev,
+ struct tm_node *nq,
+ struct rte_tm_node_stats *stats,
+ uint64_t *stats_mask,
+ int clear)
+{
+ struct pmd_internals *p = dev->data->dev_private;
+ struct rte_sched_queue_stats s;
+ uint16_t qlen;
+
+ uint32_t queue_id = tm_node_queue_id(dev, nq);
+
+ struct tm_node *nt = nq->parent_node;
+ uint32_t tc_id = tm_node_tc_id(dev, nt);
+
+ struct tm_node *np = nt->parent_node;
+ uint32_t pipe_id = tm_node_pipe_id(dev, np);
+
+ struct tm_node *ns = np->parent_node;
+ uint32_t subport_id = tm_node_subport_id(dev, ns);
+
+ /* Stats read */
+ uint32_t qid = tm_port_queue_id(dev,
+ subport_id,
+ pipe_id,
+ tc_id,
+ queue_id);
+
+ int status = rte_sched_queue_read_stats(
+ p->soft.tm.sched,
+ qid,
+ &s,
+ &qlen);
+ if (status)
+ return status;
+
+ /* Stats accumulate */
+ nq->stats.n_pkts += s.n_pkts - s.n_pkts_dropped;
+ nq->stats.n_bytes += s.n_bytes - s.n_bytes_dropped;
+ nq->stats.leaf.n_pkts_dropped[RTE_TM_GREEN] += s.n_pkts_dropped;
+ nq->stats.leaf.n_bytes_dropped[RTE_TM_GREEN] +=
+ s.n_bytes_dropped;
+ nq->stats.leaf.n_pkts_queued = qlen;
+
+ /* Stats copy */
+ if (stats)
+ memcpy(stats, &nq->stats, sizeof(*stats));
+
+ if (stats_mask)
+ *stats_mask = STATS_MASK_QUEUE;
+
+ /* Stats clear */
+ if (clear)
+ memset(&nq->stats, 0, sizeof(nq->stats));
+
+ return 0;
+}
+
+/* Traffic manager read stats counters for specific node */
+static int
+pmd_tm_node_stats_read(struct rte_eth_dev *dev,
+ uint32_t node_id,
+ struct rte_tm_node_stats *stats,
+ uint64_t *stats_mask,
+ int clear,
+ struct rte_tm_error *error)
+{
+ struct tm_node *n;
+
+ /* Port must be started and TM used. */
+ if (dev->data->dev_started == 0 && (tm_used(dev) == 0))
+ return -rte_tm_error_set(error,
+ EBUSY,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EBUSY));
+
+ /* Node must be valid */
+ n = tm_node_search(dev, node_id);
+ if (n == NULL)
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_NODE_ID,
+ NULL,
+ rte_strerror(EINVAL));
+
+ switch (n->level) {
+ case TM_NODE_LEVEL_PORT:
+ if (read_port_stats(dev, n, stats, stats_mask, clear))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ return 0;
+
+ case TM_NODE_LEVEL_SUBPORT:
+ if (read_subport_stats(dev, n, stats, stats_mask, clear))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ return 0;
+
+ case TM_NODE_LEVEL_PIPE:
+ if (read_pipe_stats(dev, n, stats, stats_mask, clear))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ return 0;
+
+ case TM_NODE_LEVEL_TC:
+ if (read_tc_stats(dev, n, stats, stats_mask, clear))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ return 0;
+
+ case TM_NODE_LEVEL_QUEUE:
+ default:
+ if (read_queue_stats(dev, n, stats, stats_mask, clear))
+ return -rte_tm_error_set(error,
+ EINVAL,
+ RTE_TM_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ rte_strerror(EINVAL));
+ return 0;
+ }
+}
+
+const struct rte_tm_ops pmd_tm_ops = {
+ .node_type_get = pmd_tm_node_type_get,
+ .capabilities_get = pmd_tm_capabilities_get,
+ .level_capabilities_get = pmd_tm_level_capabilities_get,
+ .node_capabilities_get = pmd_tm_node_capabilities_get,
+
+ .wred_profile_add = pmd_tm_wred_profile_add,
+ .wred_profile_delete = pmd_tm_wred_profile_delete,
+ .shared_wred_context_add_update = NULL,
+ .shared_wred_context_delete = NULL,
+
+ .shaper_profile_add = pmd_tm_shaper_profile_add,
+ .shaper_profile_delete = pmd_tm_shaper_profile_delete,
+ .shared_shaper_add_update = pmd_tm_shared_shaper_add_update,
+ .shared_shaper_delete = pmd_tm_shared_shaper_delete,
+
+ .node_add = pmd_tm_node_add,
+ .node_delete = pmd_tm_node_delete,
+ .node_suspend = NULL,
+ .node_resume = NULL,
+ .hierarchy_commit = pmd_tm_hierarchy_commit,
+
+ .node_parent_update = pmd_tm_node_parent_update,
+ .node_shaper_update = pmd_tm_node_shaper_update,
+ .node_shared_shaper_update = NULL,
+ .node_stats_update = NULL,
+ .node_wfq_weight_mode_update = NULL,
+ .node_cman_update = NULL,
+ .node_wred_context_update = NULL,
+ .node_shared_wred_context_update = NULL,
+
+ .node_stats_read = pmd_tm_node_stats_read,
};
--
2.9.3
^ permalink raw reply [flat|nested] 79+ messages in thread
* [dpdk-dev] [PATCH v8 5/5] app/testpmd: add traffic management forwarding mode
2017-10-10 10:18 ` [dpdk-dev] [PATCH v8 0/5] net/softnic: sw fall-back pmd for traffic mgmt and others Jasvinder Singh
` (3 preceding siblings ...)
2017-10-10 10:18 ` [dpdk-dev] [PATCH v8 4/5] net/softnic: add TM hierarchy related ops Jasvinder Singh
@ 2017-10-10 10:18 ` Jasvinder Singh
2017-10-10 18:24 ` Ferruh Yigit
2017-10-10 18:31 ` [dpdk-dev] [PATCH v8 0/5] net/softnic: sw fall-back pmd for traffic mgmt and others Ferruh Yigit
5 siblings, 1 reply; 79+ messages in thread
From: Jasvinder Singh @ 2017-10-10 10:18 UTC (permalink / raw)
To: dev; +Cc: cristian.dumitrescu, ferruh.yigit, thomas, wenzhuo.lu
This commit extends the testpmd application with a new forwarding engine
that demonstrates the use of the ethdev traffic management APIs and the
softnic PMD for QoS traffic management.
In this mode, a 5-level hierarchical tree of the QoS scheduler is built
with the help of the ethdev TM APIs such as shaper profile add/delete,
shared shaper add/update, node add/delete, hierarchy commit, etc.
The hierarchical tree has the following nodes: root node (x1, level 0),
subport node (x1, level 1), pipe node (x4096, level 2),
tc node (x16384, level 3) and queue node (x65536, level 4).
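For illustration only (not part of this patch), a minimal sketch of how
the root and subport nodes of such a hierarchy can be created through
the generic rte_tm API; the node ids, rates and the helper name below
are made-up examples, and error handling is reduced to early returns:

#include <rte_tm.h>

/* Illustrative sketch, not taken from this patch. */
static int
tm_sketch_add_root_and_subport(uint16_t port_id, struct rte_tm_error *err)
{
        struct rte_tm_shaper_profile_params sp = {
                .peak = { .rate = 1250000000, .size = 1000000 }, /* example: ~10 Gbps */
                .pkt_length_adjust = 24, /* example framing overhead */
        };
        struct rte_tm_node_params np = {
                .shaper_profile_id = 0,
                .nonleaf = { .n_sp_priorities = 1 },
                .stats_mask = 0,
        };
        int status;

        /* Shaper profile #0, reused by both nodes in this sketch */
        status = rte_tm_shaper_profile_add(port_id, 0, &sp, err);
        if (status)
                return status;

        /* Root node (level 0): no parent, priority 0, weight 1 */
        status = rte_tm_node_add(port_id, 1000000, RTE_TM_NODE_ID_NULL,
                0, 1, 0, &np, err);
        if (status)
                return status;

        /* Subport node (level 1): child of the root node */
        status = rte_tm_node_add(port_id, 900000, 1000000, 0, 1, 1, &np, err);
        if (status)
                return status;

        /* ... pipe, tc and queue nodes are added the same way ... */
        return rte_tm_hierarchy_commit(port_id, 1 /* clear on fail */, err);
}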
During runtime, each received packet is first classified by mapping the
packet field information to a 5-tuple (HQoS subport, pipe, traffic class,
queue within traffic class, and color) and storing it in the packet mbuf
sched field. After classification, each packet is sent to the softnic
port, which prioritizes the transmission of the received packets and
accordingly sends them out on the output interface.
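As a rough illustration of that classification step (again, not the code
added by this patch: the helper name and the hash-based pipe selection
below are invented for the example, while the actual code in tm.c derives
the indices from configurable header field positions/masks):

#include <rte_mbuf.h>
#include <rte_meter.h>
#include <rte_sched.h>

/* Illustrative sketch only. */
static void
tm_sketch_classify(struct rte_mbuf *pkt, uint32_t n_pipes_per_subport)
{
        uint32_t subport = 0;                                /* single subport  */
        uint32_t pipe = pkt->hash.rss % n_pipes_per_subport; /* example mapping */
        uint32_t tc = 0;                                     /* traffic class   */
        uint32_t queue = 0;                                  /* queue within tc */

        /* Store (subport, pipe, tc, queue, color) in the mbuf sched field */
        rte_sched_port_pkt_write(pkt, subport, pipe, tc, queue,
                e_RTE_METER_GREEN);
}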
To enable the traffic management mode, the following testpmd command is used:
$ ./testpmd -c c -n 4 --vdev
'net_softnic0,hard_name=0000:06:00.1,soft_tm=on' -- -i
--forward-mode=tm
Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
---
v8 change:
- fix compilation warning on uninitialised parameter
v7 change:
- change port_id type to uint16_t
- rebase on dpdk-next-net
v5 change:
- add CLI to enable default tm hierarchy
v3 change:
- Implements feedback from Pablo[1]
- add flag to check required librte_sched lib and softnic pmd
- code cleanup
v2 change:
- change file name softnictm.c to tm.c
- change forward mode name to "tm"
- code clean up
[1] http://dpdk.org/ml/archives/dev/2017-September/075744.html
app/test-pmd/Makefile | 12 +
app/test-pmd/cmdline.c | 88 +++++
app/test-pmd/testpmd.c | 15 +
app/test-pmd/testpmd.h | 46 +++
app/test-pmd/tm.c | 865 +++++++++++++++++++++++++++++++++++++++++++++++++
5 files changed, 1026 insertions(+)
create mode 100644 app/test-pmd/tm.c
diff --git a/app/test-pmd/Makefile b/app/test-pmd/Makefile
index b6e80dd..2c50f68 100644
--- a/app/test-pmd/Makefile
+++ b/app/test-pmd/Makefile
@@ -59,6 +59,10 @@ SRCS-y += csumonly.c
SRCS-y += icmpecho.c
SRCS-$(CONFIG_RTE_LIBRTE_IEEE1588) += ieee1588fwd.c
+ifeq ($(CONFIG_RTE_LIBRTE_PMD_SOFTNIC)$(CONFIG_RTE_LIBRTE_SCHED),yy)
+SRCS-y += tm.c
+endif
+
ifeq ($(CONFIG_RTE_BUILD_SHARED_LIB),y)
ifeq ($(CONFIG_RTE_LIBRTE_PMD_BOND),y)
@@ -77,6 +81,14 @@ ifeq ($(CONFIG_RTE_LIBRTE_BNXT_PMD),y)
LDLIBS += -lrte_pmd_bnxt
endif
+ifeq ($(CONFIG_RTE_LIBRTE_PMD_XENVIRT),y)
+LDLIBS += -lrte_pmd_xenvirt
+endif
+
+ifeq ($(CONFIG_RTE_LIBRTE_PMD_SOFTNIC),y)
+LDLIBS += -lrte_pmd_softnic
+endif
+
endif
CFLAGS_cmdline.o := -D_GNU_SOURCE
diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 8dc5c85..a72679d 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -637,6 +637,11 @@ static void cmd_help_long_parsed(void *parsed_result,
"E-tag set filter del e-tag-id (value) port (port_id)\n"
" Delete an E-tag forwarding filter on a port\n\n"
+#if defined RTE_LIBRTE_PMD_SOFTNIC && defined RTE_LIBRTE_SCHED
+ "set port tm hierarchy default (port_id)\n"
+ " Set default traffic Management hierarchy on a port\n\n"
+
+#endif
"ddp add (port_id) (profile_path[,output_path])\n"
" Load a profile package on a port\n\n"
@@ -13424,6 +13429,86 @@ cmdline_parse_inst_t cmd_vf_tc_max_bw = {
},
};
+
+#if defined RTE_LIBRTE_PMD_SOFTNIC && defined RTE_LIBRTE_SCHED
+
+/* *** Set Port default Traffic Management Hierarchy *** */
+struct cmd_set_port_tm_hierarchy_default_result {
+ cmdline_fixed_string_t set;
+ cmdline_fixed_string_t port;
+ cmdline_fixed_string_t tm;
+ cmdline_fixed_string_t hierarchy;
+ cmdline_fixed_string_t def;
+ uint16_t port_id;
+};
+
+cmdline_parse_token_string_t cmd_set_port_tm_hierarchy_default_set =
+ TOKEN_STRING_INITIALIZER(
+ struct cmd_set_port_tm_hierarchy_default_result, set, "set");
+cmdline_parse_token_string_t cmd_set_port_tm_hierarchy_default_port =
+ TOKEN_STRING_INITIALIZER(
+ struct cmd_set_port_tm_hierarchy_default_result, port, "port");
+cmdline_parse_token_string_t cmd_set_port_tm_hierarchy_default_tm =
+ TOKEN_STRING_INITIALIZER(
+ struct cmd_set_port_tm_hierarchy_default_result, tm, "tm");
+cmdline_parse_token_string_t cmd_set_port_tm_hierarchy_default_hierarchy =
+ TOKEN_STRING_INITIALIZER(
+ struct cmd_set_port_tm_hierarchy_default_result,
+ hierarchy, "hierarchy");
+cmdline_parse_token_string_t cmd_set_port_tm_hierarchy_default_default =
+ TOKEN_STRING_INITIALIZER(
+ struct cmd_set_port_tm_hierarchy_default_result,
+ def, "default");
+cmdline_parse_token_num_t cmd_set_port_tm_hierarchy_default_port_id =
+ TOKEN_NUM_INITIALIZER(
+ struct cmd_set_port_tm_hierarchy_default_result,
+ port_id, UINT8);
+
+static void cmd_set_port_tm_hierarchy_default_parsed(void *parsed_result,
+ __attribute__((unused)) struct cmdline *cl,
+ __attribute__((unused)) void *data)
+{
+ struct cmd_set_port_tm_hierarchy_default_result *res = parsed_result;
+ struct rte_port *p;
+ uint16_t port_id = res->port_id;
+
+ if (port_id_is_invalid(port_id, ENABLED_WARN))
+ return;
+
+ p = &ports[port_id];
+
+ /* Port tm flag */
+ if (p->softport.tm_flag == 0) {
+ printf(" tm not enabled on port %u (error)\n", port_id);
+ return;
+ }
+
+ /* Forward mode: tm */
+ if (strcmp(cur_fwd_config.fwd_eng->fwd_mode_name, "tm")) {
+ printf(" tm mode not enabled(error)\n");
+ return;
+ }
+
+ /* Set the default tm hierarchy */
+ p->softport.tm.default_hierarchy_enable = 1;
+}
+
+cmdline_parse_inst_t cmd_set_port_tm_hierarchy_default = {
+ .f = cmd_set_port_tm_hierarchy_default_parsed,
+ .data = NULL,
+ .help_str = "set port tm hierarchy default <port_id>",
+ .tokens = {
+ (void *)&cmd_set_port_tm_hierarchy_default_set,
+ (void *)&cmd_set_port_tm_hierarchy_default_port,
+ (void *)&cmd_set_port_tm_hierarchy_default_tm,
+ (void *)&cmd_set_port_tm_hierarchy_default_hierarchy,
+ (void *)&cmd_set_port_tm_hierarchy_default_default,
+ (void *)&cmd_set_port_tm_hierarchy_default_port_id,
+ NULL,
+ },
+};
+#endif
+
/* Strict link priority scheduling mode setting */
static void
cmd_strict_link_prio_parsed(
@@ -15019,6 +15104,9 @@ cmdline_parse_ctx_t main_ctx[] = {
(cmdline_parse_inst_t *)&cmd_vf_tc_max_bw,
(cmdline_parse_inst_t *)&cmd_strict_link_prio,
(cmdline_parse_inst_t *)&cmd_tc_min_bw,
+#if defined RTE_LIBRTE_PMD_SOFTNIC && defined RTE_LIBRTE_SCHED
+ (cmdline_parse_inst_t *)&cmd_set_port_tm_hierarchy_default,
+#endif
(cmdline_parse_inst_t *)&cmd_ddp_add,
(cmdline_parse_inst_t *)&cmd_ddp_del,
(cmdline_parse_inst_t *)&cmd_ddp_get_list,
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index 667c228..23352be 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -164,6 +164,10 @@ struct fwd_engine * fwd_engines[] = {
&tx_only_engine,
&csum_fwd_engine,
&icmp_echo_engine,
+#if defined RTE_LIBRTE_PMD_SOFTNIC && defined RTE_LIBRTE_SCHED
+ &softnic_tm_engine,
+ &softnic_tm_bypass_engine,
+#endif
#ifdef RTE_LIBRTE_IEEE1588
&ieee1588_fwd_engine,
#endif
@@ -2109,6 +2113,17 @@ init_port_config(void)
(rte_eth_devices[pid].data->dev_flags &
RTE_ETH_DEV_INTR_RMV))
port->dev_conf.intr_conf.rmv = 1;
+
+#if defined RTE_LIBRTE_PMD_SOFTNIC && defined RTE_LIBRTE_SCHED
+ /* Detect softnic port */
+ if (!strcmp(port->dev_info.driver_name, "net_softnic")) {
+ port->softnic_enable = 1;
+ memset(&port->softport, 0, sizeof(struct softnic_port));
+
+ if (!strcmp(cur_fwd_eng->fwd_mode_name, "tm"))
+ port->softport.tm_flag = 1;
+ }
+#endif
}
}
diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
index e2d9e34..7c79f17 100644
--- a/app/test-pmd/testpmd.h
+++ b/app/test-pmd/testpmd.h
@@ -85,6 +85,12 @@ typedef uint16_t streamid_t;
#define MAX_QUEUE_ID ((1 << (sizeof(queueid_t) * 8)) - 1)
+#if defined RTE_LIBRTE_PMD_SOFTNIC && defined RTE_LIBRTE_SCHED
+#define TM_MODE 1
+#else
+#define TM_MODE 0
+#endif
+
enum {
PORT_TOPOLOGY_PAIRED,
PORT_TOPOLOGY_CHAINED,
@@ -164,6 +170,38 @@ struct port_flow {
uint8_t data[]; /**< Storage for pattern/actions. */
};
+#ifdef TM_MODE
+/**
+ * Soft port tm related parameters
+ */
+struct softnic_port_tm {
+ uint32_t default_hierarchy_enable; /**< def hierarchy enable flag */
+ uint32_t hierarchy_config; /**< set to 1 if hierarchy configured */
+
+ uint32_t n_subports_per_port; /**< Num of subport nodes per port */
+ uint32_t n_pipes_per_subport; /**< Num of pipe nodes per subport */
+
+ uint64_t tm_pktfield0_slabpos; /**< Pkt field position for subport */
+ uint64_t tm_pktfield0_slabmask; /**< Pkt field mask for the subport */
+ uint64_t tm_pktfield0_slabshr;
+ uint64_t tm_pktfield1_slabpos; /**< Pkt field position for the pipe */
+ uint64_t tm_pktfield1_slabmask; /**< Pkt field mask for the pipe */
+ uint64_t tm_pktfield1_slabshr;
+ uint64_t tm_pktfield2_slabpos; /**< Pkt field position table index */
+ uint64_t tm_pktfield2_slabmask; /**< Pkt field mask for tc table idx */
+ uint64_t tm_pktfield2_slabshr;
+ uint64_t tm_tc_table[64]; /**< TC translation table */
+};
+
+/**
+ * The data structure associated with the softnic port
+ */
+struct softnic_port {
+ unsigned int tm_flag; /**< set to 1 if tm feature is enabled */
+ struct softnic_port_tm tm; /**< softnic port tm parameters */
+};
+#endif
+
/**
* The data structure associated with each port.
*/
@@ -197,6 +235,10 @@ struct rte_port {
uint32_t mc_addr_nb; /**< nb. of addr. in mc_addr_pool */
uint8_t slave_flag; /**< bonding slave port */
struct port_flow *flow_list; /**< Associated flows. */
+#ifdef TM_MODE
+ unsigned int softnic_enable; /**< softnic flag */
+ struct softnic_port softport; /**< softnic port params */
+#endif
};
/**
@@ -257,6 +299,10 @@ extern struct fwd_engine rx_only_engine;
extern struct fwd_engine tx_only_engine;
extern struct fwd_engine csum_fwd_engine;
extern struct fwd_engine icmp_echo_engine;
+#ifdef TM_MODE
+extern struct fwd_engine softnic_tm_engine;
+extern struct fwd_engine softnic_tm_bypass_engine;
+#endif
#ifdef RTE_LIBRTE_IEEE1588
extern struct fwd_engine ieee1588_fwd_engine;
#endif
diff --git a/app/test-pmd/tm.c b/app/test-pmd/tm.c
new file mode 100644
index 0000000..9021e26
--- /dev/null
+++ b/app/test-pmd/tm.c
@@ -0,0 +1,865 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright(c) 2017 Intel Corporation. All rights reserved.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+#include <stdio.h>
+#include <sys/stat.h>
+
+#include <rte_cycles.h>
+#include <rte_mbuf.h>
+#include <rte_ethdev.h>
+#include <rte_flow.h>
+#include <rte_meter.h>
+#include <rte_eth_softnic.h>
+#include <rte_tm.h>
+
+#include "testpmd.h"
+
+#define SUBPORT_NODES_PER_PORT 1
+#define PIPE_NODES_PER_SUBPORT 4096
+#define TC_NODES_PER_PIPE 4
+#define QUEUE_NODES_PER_TC 4
+
+#define NUM_PIPE_NODES \
+ (SUBPORT_NODES_PER_PORT * PIPE_NODES_PER_SUBPORT)
+
+#define NUM_TC_NODES \
+ (NUM_PIPE_NODES * TC_NODES_PER_PIPE)
+
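+/*
+ * Non-leaf node ids are taken from distinct high ranges so that they cannot
+ * collide with the leaf (queue) node ids, which are assigned sequentially
+ * starting from 0.
+ */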
+#define ROOT_NODE_ID 1000000
+#define SUBPORT_NODES_START_ID 900000
+#define PIPE_NODES_START_ID 800000
+#define TC_NODES_START_ID 700000
+
+#define STATS_MASK_DEFAULT \
+ (RTE_TM_STATS_N_PKTS | \
+ RTE_TM_STATS_N_BYTES | \
+ RTE_TM_STATS_N_PKTS_GREEN_DROPPED | \
+ RTE_TM_STATS_N_BYTES_GREEN_DROPPED)
+
+#define STATS_MASK_QUEUE \
+ (STATS_MASK_DEFAULT | \
+ RTE_TM_STATS_N_PKTS_QUEUED)
+
+#define BYTES_IN_MBPS (1000 * 1000 / 8)
+#define TOKEN_BUCKET_SIZE 1000000
+
+/* TM Hierarchy Levels */
+enum tm_hierarchy_level {
+ TM_NODE_LEVEL_PORT = 0,
+ TM_NODE_LEVEL_SUBPORT,
+ TM_NODE_LEVEL_PIPE,
+ TM_NODE_LEVEL_TC,
+ TM_NODE_LEVEL_QUEUE,
+ TM_NODE_LEVEL_MAX,
+};
+
+struct tm_hierarchy {
+ /* TM Nodes */
+ uint32_t root_node_id;
+ uint32_t subport_node_id[SUBPORT_NODES_PER_PORT];
+ uint32_t pipe_node_id[SUBPORT_NODES_PER_PORT][PIPE_NODES_PER_SUBPORT];
+ uint32_t tc_node_id[NUM_PIPE_NODES][TC_NODES_PER_PIPE];
+ uint32_t queue_node_id[NUM_TC_NODES][QUEUE_NODES_PER_TC];
+
+ /* TM Hierarchy Nodes Shaper Rates */
+ uint32_t root_node_shaper_rate;
+ uint32_t subport_node_shaper_rate;
+ uint32_t pipe_node_shaper_rate;
+ uint32_t tc_node_shaper_rate;
+ uint32_t tc_node_shared_shaper_rate;
+
+ uint32_t n_shapers;
+};
+
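+/*
+ * Read a packet header bit field: load the 8-byte slab found at byte offset
+ * slab_pos, convert it from network to host byte order, then apply slab_mask
+ * and shift right by slab_shr.
+ */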
+#define BITFIELD(byte_array, slab_pos, slab_mask, slab_shr) \
+({ \
+ uint64_t slab = *((uint64_t *) &byte_array[slab_pos]); \
+ uint64_t val = \
+ (rte_be_to_cpu_64(slab) & slab_mask) >> slab_shr; \
+ val; \
+})
+
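+/*
+ * Pack the classification result into the 64-bit scheduler word stored in
+ * the mbuf sched field: queue in bits 0-1, traffic class in bits 2-3,
+ * color in bits 4-5, subport in bits 16-31, pipe in bits 32-63.
+ */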
+#define RTE_SCHED_PORT_HIERARCHY(subport, pipe, \
+ traffic_class, queue, color) \
+ ((((uint64_t) (queue)) & 0x3) | \
+ ((((uint64_t) (traffic_class)) & 0x3) << 2) | \
+ ((((uint64_t) (color)) & 0x3) << 4) | \
+ ((((uint64_t) (subport)) & 0xFFFF) << 16) | \
+ ((((uint64_t) (pipe)) & 0xFFFFFFFF) << 32))
+
+
+static void
+pkt_metadata_set(struct rte_port *p, struct rte_mbuf **pkts,
+ uint32_t n_pkts)
+{
+ struct softnic_port_tm *tm = &p->softport.tm;
+ uint32_t i;
+
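+ /*
+ * Process packets four at a time: extract the subport, pipe and DSCP
+ * fields from each packet, translate DSCP into (tc, queue) through the
+ * TC table, then store the resulting scheduler word in mbuf hash.sched.
+ */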
+ for (i = 0; i < (n_pkts & (~0x3)); i += 4) {
+ struct rte_mbuf *pkt0 = pkts[i];
+ struct rte_mbuf *pkt1 = pkts[i + 1];
+ struct rte_mbuf *pkt2 = pkts[i + 2];
+ struct rte_mbuf *pkt3 = pkts[i + 3];
+
+ uint8_t *pkt0_data = rte_pktmbuf_mtod(pkt0, uint8_t *);
+ uint8_t *pkt1_data = rte_pktmbuf_mtod(pkt1, uint8_t *);
+ uint8_t *pkt2_data = rte_pktmbuf_mtod(pkt2, uint8_t *);
+ uint8_t *pkt3_data = rte_pktmbuf_mtod(pkt3, uint8_t *);
+
+ uint64_t pkt0_subport = BITFIELD(pkt0_data,
+ tm->tm_pktfield0_slabpos,
+ tm->tm_pktfield0_slabmask,
+ tm->tm_pktfield0_slabshr);
+ uint64_t pkt0_pipe = BITFIELD(pkt0_data,
+ tm->tm_pktfield1_slabpos,
+ tm->tm_pktfield1_slabmask,
+ tm->tm_pktfield1_slabshr);
+ uint64_t pkt0_dscp = BITFIELD(pkt0_data,
+ tm->tm_pktfield2_slabpos,
+ tm->tm_pktfield2_slabmask,
+ tm->tm_pktfield2_slabshr);
+ uint32_t pkt0_tc = tm->tm_tc_table[pkt0_dscp & 0x3F] >> 2;
+ uint32_t pkt0_tc_q = tm->tm_tc_table[pkt0_dscp & 0x3F] & 0x3;
+ uint64_t pkt1_subport = BITFIELD(pkt1_data,
+ tm->tm_pktfield0_slabpos,
+ tm->tm_pktfield0_slabmask,
+ tm->tm_pktfield0_slabshr);
+ uint64_t pkt1_pipe = BITFIELD(pkt1_data,
+ tm->tm_pktfield1_slabpos,
+ tm->tm_pktfield1_slabmask,
+ tm->tm_pktfield1_slabshr);
+ uint64_t pkt1_dscp = BITFIELD(pkt1_data,
+ tm->tm_pktfield2_slabpos,
+ tm->tm_pktfield2_slabmask,
+ tm->tm_pktfield2_slabshr);
+ uint32_t pkt1_tc = tm->tm_tc_table[pkt1_dscp & 0x3F] >> 2;
+ uint32_t pkt1_tc_q = tm->tm_tc_table[pkt1_dscp & 0x3F] & 0x3;
+
+ uint64_t pkt2_subport = BITFIELD(pkt2_data,
+ tm->tm_pktfield0_slabpos,
+ tm->tm_pktfield0_slabmask,
+ tm->tm_pktfield0_slabshr);
+ uint64_t pkt2_pipe = BITFIELD(pkt2_data,
+ tm->tm_pktfield1_slabpos,
+ tm->tm_pktfield1_slabmask,
+ tm->tm_pktfield1_slabshr);
+ uint64_t pkt2_dscp = BITFIELD(pkt2_data,
+ tm->tm_pktfield2_slabpos,
+ tm->tm_pktfield2_slabmask,
+ tm->tm_pktfield2_slabshr);
+ uint32_t pkt2_tc = tm->tm_tc_table[pkt2_dscp & 0x3F] >> 2;
+ uint32_t pkt2_tc_q = tm->tm_tc_table[pkt2_dscp & 0x3F] & 0x3;
+
+ uint64_t pkt3_subport = BITFIELD(pkt3_data,
+ tm->tm_pktfield0_slabpos,
+ tm->tm_pktfield0_slabmask,
+ tm->tm_pktfield0_slabshr);
+ uint64_t pkt3_pipe = BITFIELD(pkt3_data,
+ tm->tm_pktfield1_slabpos,
+ tm->tm_pktfield1_slabmask,
+ tm->tm_pktfield1_slabshr);
+ uint64_t pkt3_dscp = BITFIELD(pkt3_data,
+ tm->tm_pktfield2_slabpos,
+ tm->tm_pktfield2_slabmask,
+ tm->tm_pktfield2_slabshr);
+ uint32_t pkt3_tc = tm->tm_tc_table[pkt3_dscp & 0x3F] >> 2;
+ uint32_t pkt3_tc_q = tm->tm_tc_table[pkt3_dscp & 0x3F] & 0x3;
+
+ uint64_t pkt0_sched = RTE_SCHED_PORT_HIERARCHY(pkt0_subport,
+ pkt0_pipe,
+ pkt0_tc,
+ pkt0_tc_q,
+ 0);
+ uint64_t pkt1_sched = RTE_SCHED_PORT_HIERARCHY(pkt1_subport,
+ pkt1_pipe,
+ pkt1_tc,
+ pkt1_tc_q,
+ 0);
+ uint64_t pkt2_sched = RTE_SCHED_PORT_HIERARCHY(pkt2_subport,
+ pkt2_pipe,
+ pkt2_tc,
+ pkt2_tc_q,
+ 0);
+ uint64_t pkt3_sched = RTE_SCHED_PORT_HIERARCHY(pkt3_subport,
+ pkt3_pipe,
+ pkt3_tc,
+ pkt3_tc_q,
+ 0);
+
+ pkt0->hash.sched.lo = pkt0_sched & 0xFFFFFFFF;
+ pkt0->hash.sched.hi = pkt0_sched >> 32;
+ pkt1->hash.sched.lo = pkt1_sched & 0xFFFFFFFF;
+ pkt1->hash.sched.hi = pkt1_sched >> 32;
+ pkt2->hash.sched.lo = pkt2_sched & 0xFFFFFFFF;
+ pkt2->hash.sched.hi = pkt2_sched >> 32;
+ pkt3->hash.sched.lo = pkt3_sched & 0xFFFFFFFF;
+ pkt3->hash.sched.hi = pkt3_sched >> 32;
+ }
+
+ for (; i < n_pkts; i++) {
+ struct rte_mbuf *pkt = pkts[i];
+
+ uint8_t *pkt_data = rte_pktmbuf_mtod(pkt, uint8_t *);
+
+ uint64_t pkt_subport = BITFIELD(pkt_data,
+ tm->tm_pktfield0_slabpos,
+ tm->tm_pktfield0_slabmask,
+ tm->tm_pktfield0_slabshr);
+ uint64_t pkt_pipe = BITFIELD(pkt_data,
+ tm->tm_pktfield1_slabpos,
+ tm->tm_pktfield1_slabmask,
+ tm->tm_pktfield1_slabshr);
+ uint64_t pkt_dscp = BITFIELD(pkt_data,
+ tm->tm_pktfield2_slabpos,
+ tm->tm_pktfield2_slabmask,
+ tm->tm_pktfield2_slabshr);
+ uint32_t pkt_tc = tm->tm_tc_table[pkt_dscp & 0x3F] >> 2;
+ uint32_t pkt_tc_q = tm->tm_tc_table[pkt_dscp & 0x3F] & 0x3;
+
+ uint64_t pkt_sched = RTE_SCHED_PORT_HIERARCHY(pkt_subport,
+ pkt_pipe,
+ pkt_tc,
+ pkt_tc_q,
+ 0);
+
+ pkt->hash.sched.lo = pkt_sched & 0xFFFFFFFF;
+ pkt->hash.sched.hi = pkt_sched >> 32;
+ }
+}
+
+/*
+ * Soft port packet forward
+ */
+static void
+softport_packet_fwd(struct fwd_stream *fs)
+{
+ struct rte_mbuf *pkts_burst[MAX_PKT_BURST];
+ struct rte_port *rte_tx_port = &ports[fs->tx_port];
+ uint16_t nb_rx;
+ uint16_t nb_tx;
+ uint32_t retry;
+
+#ifdef RTE_TEST_PMD_RECORD_CORE_CYCLES
+ uint64_t start_tsc;
+ uint64_t end_tsc;
+ uint64_t core_cycles;
+#endif
+
+#ifdef RTE_TEST_PMD_RECORD_CORE_CYCLES
+ start_tsc = rte_rdtsc();
+#endif
+
+ /* Packets Receive */
+ nb_rx = rte_eth_rx_burst(fs->rx_port, fs->rx_queue,
+ pkts_burst, nb_pkt_per_burst);
+ fs->rx_packets += nb_rx;
+
+#ifdef RTE_TEST_PMD_RECORD_BURST_STATS
+ fs->rx_burst_stats.pkt_burst_spread[nb_rx]++;
+#endif
+
+ if (rte_tx_port->softnic_enable) {
+ /* Set packet metadata if tm flag enabled */
+ if (rte_tx_port->softport.tm_flag)
+ pkt_metadata_set(rte_tx_port, pkts_burst, nb_rx);
+
+ /* Softport run */
+ rte_pmd_softnic_run(fs->tx_port);
+ }
+ nb_tx = rte_eth_tx_burst(fs->tx_port, fs->tx_queue,
+ pkts_burst, nb_rx);
+
+ /* Retry if necessary */
+ if (unlikely(nb_tx < nb_rx) && fs->retry_enabled) {
+ retry = 0;
+ while (nb_tx < nb_rx && retry++ < burst_tx_retry_num) {
+ rte_delay_us(burst_tx_delay_time);
+ nb_tx += rte_eth_tx_burst(fs->tx_port, fs->tx_queue,
+ &pkts_burst[nb_tx], nb_rx - nb_tx);
+ }
+ }
+ fs->tx_packets += nb_tx;
+
+#ifdef RTE_TEST_PMD_RECORD_BURST_STATS
+ fs->tx_burst_stats.pkt_burst_spread[nb_tx]++;
+#endif
+
+ if (unlikely(nb_tx < nb_rx)) {
+ fs->fwd_dropped += (nb_rx - nb_tx);
+ do {
+ rte_pktmbuf_free(pkts_burst[nb_tx]);
+ } while (++nb_tx < nb_rx);
+ }
+#ifdef RTE_TEST_PMD_RECORD_CORE_CYCLES
+ end_tsc = rte_rdtsc();
+ core_cycles = (end_tsc - start_tsc);
+ fs->core_cycles = (uint64_t) (fs->core_cycles + core_cycles);
+#endif
+}
+
+static void
+set_tm_hiearchy_nodes_shaper_rate(portid_t port_id, struct tm_hierarchy *h)
+{
+ struct rte_eth_link link_params;
+ uint64_t tm_port_rate;
+
+ memset(&link_params, 0, sizeof(link_params));
+
+ rte_eth_link_get(port_id, &link_params);
+ tm_port_rate = link_params.link_speed * BYTES_IN_MBPS;
+
+ if (tm_port_rate > UINT32_MAX)
+ tm_port_rate = UINT32_MAX;
+
+ /* Set tm hierarchy shapers rate */
+ h->root_node_shaper_rate = tm_port_rate;
+ h->subport_node_shaper_rate =
+ tm_port_rate / SUBPORT_NODES_PER_PORT;
+ h->pipe_node_shaper_rate
+ = h->subport_node_shaper_rate / PIPE_NODES_PER_SUBPORT;
+ h->tc_node_shaper_rate = h->pipe_node_shaper_rate;
+ h->tc_node_shared_shaper_rate = h->subport_node_shaper_rate;
+}
+
+static int
+softport_tm_root_node_add(portid_t port_id, struct tm_hierarchy *h,
+ struct rte_tm_error *error)
+{
+ struct rte_tm_node_params rnp;
+ struct rte_tm_shaper_params rsp;
+ uint32_t priority, weight, level_id, shaper_profile_id;
+
+ memset(&rsp, 0, sizeof(struct rte_tm_shaper_params));
+ memset(&rnp, 0, sizeof(struct rte_tm_node_params));
+
+ /* Shaper profile Parameters */
+ rsp.peak.rate = h->root_node_shaper_rate;
+ rsp.peak.size = TOKEN_BUCKET_SIZE;
+ rsp.pkt_length_adjust = RTE_TM_ETH_FRAMING_OVERHEAD_FCS;
+ shaper_profile_id = 0;
+
+ if (rte_tm_shaper_profile_add(port_id, shaper_profile_id,
+ &rsp, error)) {
+ printf("%s ERROR(%d)-%s!(shaper_id %u)\n ",
+ __func__, error->type, error->message,
+ shaper_profile_id);
+ return -1;
+ }
+
+ /* Root Node Parameters */
+ h->root_node_id = ROOT_NODE_ID;
+ weight = 1;
+ priority = 0;
+ level_id = TM_NODE_LEVEL_PORT;
+ rnp.shaper_profile_id = shaper_profile_id;
+ rnp.nonleaf.n_sp_priorities = 1;
+ rnp.stats_mask = STATS_MASK_DEFAULT;
+
+ /* Add Node to TM Hierarchy */
+ if (rte_tm_node_add(port_id, h->root_node_id, RTE_TM_NODE_ID_NULL,
+ priority, weight, level_id, &rnp, error)) {
+ printf("%s ERROR(%d)-%s!(node_id %u, parent_id %u, level %u)\n",
+ __func__, error->type, error->message,
+ h->root_node_id, RTE_TM_NODE_ID_NULL,
+ level_id);
+ return -1;
+ }
+ /* Update */
+ h->n_shapers++;
+
+ printf(" Root node added (Start id %u, Count %u, level %u)\n",
+ h->root_node_id, 1, level_id);
+
+ return 0;
+}
+
+static int
+softport_tm_subport_node_add(portid_t port_id, struct tm_hierarchy *h,
+ struct rte_tm_error *error)
+{
+ uint32_t subport_parent_node_id, subport_node_id;
+ struct rte_tm_node_params snp;
+ struct rte_tm_shaper_params ssp;
+ uint32_t priority, weight, level_id, shaper_profile_id;
+ uint32_t i;
+
+ memset(&ssp, 0, sizeof(struct rte_tm_shaper_params));
+ memset(&snp, 0, sizeof(struct rte_tm_node_params));
+
+ shaper_profile_id = h->n_shapers;
+
+ /* Add Shaper Profile to TM Hierarchy */
+ for (i = 0; i < SUBPORT_NODES_PER_PORT; i++) {
+ ssp.peak.rate = h->subport_node_shaper_rate;
+ ssp.peak.size = TOKEN_BUCKET_SIZE;
+ ssp.pkt_length_adjust = RTE_TM_ETH_FRAMING_OVERHEAD_FCS;
+
+ if (rte_tm_shaper_profile_add(port_id, shaper_profile_id,
+ &ssp, error)) {
+ printf("%s ERROR(%d)-%s!(shaper_id %u)\n ",
+ __func__, error->type, error->message,
+ shaper_profile_id);
+ return -1;
+ }
+
+ /* Node Parameters */
+ h->subport_node_id[i] = SUBPORT_NODES_START_ID + i;
+ subport_parent_node_id = h->root_node_id;
+ weight = 1;
+ priority = 0;
+ level_id = TM_NODE_LEVEL_SUBPORT;
+ snp.shaper_profile_id = shaper_profile_id;
+ snp.nonleaf.n_sp_priorities = 1;
+ snp.stats_mask = STATS_MASK_DEFAULT;
+
+ /* Add Node to TM Hierarchy */
+ if (rte_tm_node_add(port_id,
+ h->subport_node_id[i],
+ subport_parent_node_id,
+ priority, weight,
+ level_id,
+ &snp,
+ error)) {
+ printf("%s ERROR(%d)-%s!(node %u,parent %u,level %u)\n",
+ __func__,
+ error->type,
+ error->message,
+ h->subport_node_id[i],
+ subport_parent_node_id,
+ level_id);
+ return -1;
+ }
+ shaper_profile_id++;
+ subport_node_id++;
+ }
+ /* Update */
+ h->n_shapers = shaper_profile_id;
+
+ printf(" Subport nodes added (Start id %u, Count %u, level %u)\n",
+ h->subport_node_id[0], SUBPORT_NODES_PER_PORT, level_id);
+
+ return 0;
+}
+
+static int
+softport_tm_pipe_node_add(portid_t port_id, struct tm_hierarchy *h,
+ struct rte_tm_error *error)
+{
+ uint32_t pipe_parent_node_id;
+ struct rte_tm_node_params pnp;
+ struct rte_tm_shaper_params psp;
+ uint32_t priority, weight, level_id, shaper_profile_id;
+ uint32_t i, j;
+
+ memset(&psp, 0, sizeof(struct rte_tm_shaper_params));
+ memset(&pnp, 0, sizeof(struct rte_tm_node_params));
+
+ shaper_profile_id = h->n_shapers;
+
+ /* Shaper Profile Parameters */
+ psp.peak.rate = h->pipe_node_shaper_rate;
+ psp.peak.size = TOKEN_BUCKET_SIZE;
+ psp.pkt_length_adjust = RTE_TM_ETH_FRAMING_OVERHEAD_FCS;
+
+ /* Pipe Node Parameters */
+ weight = 1;
+ priority = 0;
+ level_id = TM_NODE_LEVEL_PIPE;
+ pnp.nonleaf.n_sp_priorities = 4;
+ pnp.stats_mask = STATS_MASK_DEFAULT;
+
+ /* Add Shaper Profiles and Nodes to TM Hierarchy */
+ for (i = 0; i < SUBPORT_NODES_PER_PORT; i++) {
+ for (j = 0; j < PIPE_NODES_PER_SUBPORT; j++) {
+ if (rte_tm_shaper_profile_add(port_id,
+ shaper_profile_id, &psp, error)) {
+ printf("%s ERROR(%d)-%s!(shaper_id %u)\n ",
+ __func__, error->type, error->message,
+ shaper_profile_id);
+ return -1;
+ }
+ pnp.shaper_profile_id = shaper_profile_id;
+ pipe_parent_node_id = h->subport_node_id[i];
+ h->pipe_node_id[i][j] = PIPE_NODES_START_ID +
+ (i * PIPE_NODES_PER_SUBPORT) + j;
+
+ if (rte_tm_node_add(port_id,
+ h->pipe_node_id[i][j],
+ pipe_parent_node_id,
+ priority, weight, level_id,
+ &pnp,
+ error)) {
+ printf("%s ERROR(%d)-%s!(node %u,parent %u )\n",
+ __func__,
+ error->type,
+ error->message,
+ h->pipe_node_id[i][j],
+ pipe_parent_node_id);
+
+ return -1;
+ }
+ shaper_profile_id++;
+ }
+ }
+ /* Update */
+ h->n_shapers = shaper_profile_id;
+
+ printf(" Pipe nodes added (Start id %u, Count %u, level %u)\n",
+ h->pipe_node_id[0][0], NUM_PIPE_NODES, level_id);
+
+ return 0;
+}
+
+static int
+softport_tm_tc_node_add(portid_t port_id, struct tm_hierarchy *h,
+ struct rte_tm_error *error)
+{
+ uint32_t tc_parent_node_id;
+ struct rte_tm_node_params tnp;
+ struct rte_tm_shaper_params tsp, tssp;
+ uint32_t shared_shaper_profile_id[TC_NODES_PER_PIPE];
+ uint32_t priority, weight, level_id, shaper_profile_id;
+ uint32_t pos, n_tc_nodes, i, j, k;
+
+ memset(&tsp, 0, sizeof(struct rte_tm_shaper_params));
+ memset(&tssp, 0, sizeof(struct rte_tm_shaper_params));
+ memset(&tnp, 0, sizeof(struct rte_tm_node_params));
+
+ shaper_profile_id = h->n_shapers;
+
+ /* Private Shaper Profile (TC) Parameters */
+ tsp.peak.rate = h->tc_node_shaper_rate;
+ tsp.peak.size = TOKEN_BUCKET_SIZE;
+ tsp.pkt_length_adjust = RTE_TM_ETH_FRAMING_OVERHEAD_FCS;
+
+ /* Shared Shaper Profile (TC) Parameters */
+ tssp.peak.rate = h->tc_node_shared_shaper_rate;
+ tssp.peak.size = TOKEN_BUCKET_SIZE;
+ tssp.pkt_length_adjust = RTE_TM_ETH_FRAMING_OVERHEAD_FCS;
+
+ /* TC Node Parameters */
+ weight = 1;
+ level_id = TM_NODE_LEVEL_TC;
+ tnp.n_shared_shapers = 1;
+ tnp.nonleaf.n_sp_priorities = 1;
+ tnp.stats_mask = STATS_MASK_DEFAULT;
+
+ /* Add Shared Shaper Profiles to TM Hierarchy */
+ for (i = 0; i < TC_NODES_PER_PIPE; i++) {
+ shared_shaper_profile_id[i] = shaper_profile_id;
+
+ if (rte_tm_shaper_profile_add(port_id,
+ shared_shaper_profile_id[i], &tssp, error)) {
+ printf("%s ERROR(%d)-%s!(Shared shaper profileid %u)\n",
+ __func__, error->type, error->message,
+ shared_shaper_profile_id[i]);
+
+ return -1;
+ }
+ if (rte_tm_shared_shaper_add_update(port_id, i,
+ shared_shaper_profile_id[i], error)) {
+ printf("%s ERROR(%d)-%s!(Shared shaper id %u)\n",
+ __func__, error->type, error->message, i);
+
+ return -1;
+ }
+ shaper_profile_id++;
+ }
+
+ /* Add Shaper Profiles and Nodes to TM Hierarchy */
+ n_tc_nodes = 0;
+ for (i = 0; i < SUBPORT_NODES_PER_PORT; i++) {
+ for (j = 0; j < PIPE_NODES_PER_SUBPORT; j++) {
+ for (k = 0; k < TC_NODES_PER_PIPE ; k++) {
+ priority = k;
+ tc_parent_node_id = h->pipe_node_id[i][j];
+ tnp.shared_shaper_id =
+ (uint32_t *)calloc(1, sizeof(uint32_t));
+ tnp.shared_shaper_id[0] = k;
+ pos = j + (i * PIPE_NODES_PER_SUBPORT);
+ h->tc_node_id[pos][k] =
+ TC_NODES_START_ID + n_tc_nodes;
+
+ if (rte_tm_shaper_profile_add(port_id,
+ shaper_profile_id, &tsp, error)) {
+ printf("%s ERROR(%d)-%s!(shaper %u)\n",
+ __func__, error->type,
+ error->message,
+ shaper_profile_id);
+
+ return -1;
+ }
+ tnp.shaper_profile_id = shaper_profile_id;
+ if (rte_tm_node_add(port_id,
+ h->tc_node_id[pos][k],
+ tc_parent_node_id,
+ priority, weight,
+ level_id,
+ &tnp, error)) {
+ printf("%s ERROR(%d)-%s!(node id %u)\n",
+ __func__,
+ error->type,
+ error->message,
+ h->tc_node_id[pos][k]);
+
+ return -1;
+ }
+ shaper_profile_id++;
+ n_tc_nodes++;
+ }
+ }
+ }
+ /* Update */
+ h->n_shapers = shaper_profile_id;
+
+ printf(" TC nodes added (Start id %u, Count %u, level %u)\n",
+ h->tc_node_id[0][0], n_tc_nodes, level_id);
+
+ return 0;
+}
+
+static int
+softport_tm_queue_node_add(portid_t port_id, struct tm_hierarchy *h,
+ struct rte_tm_error *error)
+{
+ uint32_t queue_parent_node_id;
+ struct rte_tm_node_params qnp;
+ uint32_t priority, weight, level_id, pos;
+ uint32_t n_queue_nodes, i, j, k;
+
+ memset(&qnp, 0, sizeof(struct rte_tm_node_params));
+
+ /* Queue Node Parameters */
+ priority = 0;
+ weight = 1;
+ level_id = TM_NODE_LEVEL_QUEUE;
+ qnp.shaper_profile_id = RTE_TM_SHAPER_PROFILE_ID_NONE;
+ qnp.leaf.cman = RTE_TM_CMAN_TAIL_DROP;
+ qnp.stats_mask = STATS_MASK_QUEUE;
+
+ /* Add Queue Nodes to TM Hierarchy */
+ n_queue_nodes = 0;
+ for (i = 0; i < NUM_PIPE_NODES; i++) {
+ for (j = 0; j < TC_NODES_PER_PIPE; j++) {
+ queue_parent_node_id = h->tc_node_id[i][j];
+ for (k = 0; k < QUEUE_NODES_PER_TC; k++) {
+ pos = j + (i * TC_NODES_PER_PIPE);
+ h->queue_node_id[pos][k] = n_queue_nodes;
+ if (rte_tm_node_add(port_id,
+ h->queue_node_id[pos][k],
+ queue_parent_node_id,
+ priority,
+ weight,
+ level_id,
+ &qnp, error)) {
+ printf("%s ERROR(%d)-%s!(node %u)\n",
+ __func__,
+ error->type,
+ error->message,
+ h->queue_node_id[pos][k]);
+
+ return -1;
+ }
+ n_queue_nodes++;
+ }
+ }
+ }
+ printf(" Queue nodes added (Start id %u, Count %u, level %u)\n",
+ h->queue_node_id[0][0], n_queue_nodes, level_id);
+
+ return 0;
+}
+
+/*
+ * TM Packet Field Setup
+ */
+static void
+softport_tm_pktfield_setup(portid_t port_id)
+{
+ struct rte_port *p = &ports[port_id];
+ uint64_t pktfield0_mask = 0;
+ uint64_t pktfield1_mask = 0x0000000FFF000000LLU;
+ uint64_t pktfield2_mask = 0x00000000000000FCLLU;
+
+ p->softport.tm = (struct softnic_port_tm) {
+ .n_subports_per_port = SUBPORT_NODES_PER_PORT,
+ .n_pipes_per_subport = PIPE_NODES_PER_SUBPORT,
+
+ /* Packet field to identify subport
+ *
+ * Default configuration assumes only one subport, thus
+ * the subport ID is hardcoded to 0
+ */
+ .tm_pktfield0_slabpos = 0,
+ .tm_pktfield0_slabmask = pktfield0_mask,
+ .tm_pktfield0_slabshr =
+ __builtin_ctzll(pktfield0_mask),
+
+ /* Packet field to identify pipe.
+ *
+ * Default value assumes Ethernet/IPv4/UDP packets,
+ * UDP payload bits 12 .. 23
+ */
+ .tm_pktfield1_slabpos = 40,
+ .tm_pktfield1_slabmask = pktfield1_mask,
+ .tm_pktfield1_slabshr =
+ __builtin_ctzll(pktfield1_mask),
+
+ /* Packet field used as index into TC translation table
+ * to identify the traffic class and queue.
+ *
+ * Default value assumes Ethernet/IPv4 packets, IPv4
+ * DSCP field
+ */
+ .tm_pktfield2_slabpos = 8,
+ .tm_pktfield2_slabmask = pktfield2_mask,
+ .tm_pktfield2_slabshr =
+ __builtin_ctzll(pktfield2_mask),
+
+ .tm_tc_table = {
+ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,
+ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,
+ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,
+ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,
+ }, /**< TC translation table */
+ };
+}
+
+static int
+softport_tm_hierarchy_specify(portid_t port_id, struct rte_tm_error *error)
+{
+
+ struct tm_hierarchy h;
+ int status;
+
+ memset(&h, 0, sizeof(struct tm_hierarchy));
+
+ /* TM hierarchy shapers rate */
+ set_tm_hiearchy_nodes_shaper_rate(port_id, &h);
+
+ /* Add root node (level 0) */
+ status = softport_tm_root_node_add(port_id, &h, error);
+ if (status)
+ return status;
+
+ /* Add subport node (level 1) */
+ status = softport_tm_subport_node_add(port_id, &h, error);
+ if (status)
+ return status;
+
+ /* Add pipe nodes (level 2) */
+ status = softport_tm_pipe_node_add(port_id, &h, error);
+ if (status)
+ return status;
+
+ /* Add traffic class nodes (level 3) */
+ status = softport_tm_tc_node_add(port_id, &h, error);
+ if (status)
+ return status;
+
+ /* Add queue nodes (level 4) */
+ status = softport_tm_queue_node_add(port_id, &h, error);
+ if (status)
+ return status;
+
+ /* TM packet fields setup */
+ softport_tm_pktfield_setup(port_id);
+
+ return 0;
+}
+
+/*
+ * Soft port Init
+ */
+static void
+softport_tm_begin(portid_t pi)
+{
+ struct rte_port *port = &ports[pi];
+
+ /* Soft port TM flag */
+ if (port->softport.tm_flag == 1) {
+ printf("\n\n TM feature available on port %u\n", pi);
+
+ /* Soft port TM hierarchy configuration */
+ if ((port->softport.tm.hierarchy_config == 0) &&
+ (port->softport.tm.default_hierarchy_enable == 1)) {
+ struct rte_tm_error error;
+ int status;
+
+ /* Stop port */
+ rte_eth_dev_stop(pi);
+
+ /* TM hierarchy specification */
+ status = softport_tm_hierarchy_specify(pi, &error);
+ if (status) {
+ printf(" TM Hierarchy built error(%d) - %s\n",
+ error.type, error.message);
+ return;
+ }
+ printf("\n TM Hierarchy Specified!\n\v");
+
+ /* TM hierarchy commit */
+ status = rte_tm_hierarchy_commit(pi, 0, &error);
+ if (status) {
+ printf(" Hierarchy commit error(%d) - %s\n",
+ error.type, error.message);
+ return;
+ }
+ printf(" Hierarchy Committed (port %u)!", pi);
+ port->softport.tm.hierarchy_config = 1;
+
+ /* Start port */
+ status = rte_eth_dev_start(pi);
+ if (status) {
+ printf("\n Port %u start error!\n", pi);
+ return;
+ }
+ printf("\n Port %u started!\n", pi);
+ return;
+ }
+ }
+ printf("\n TM feature not available on port %u", pi);
+}
+
+struct fwd_engine softnic_tm_engine = {
+ .fwd_mode_name = "tm",
+ .port_fwd_begin = softport_tm_begin,
+ .port_fwd_end = NULL,
+ .packet_fwd = softport_packet_fwd,
+};
+
+struct fwd_engine softnic_tm_bypass_engine = {
+ .fwd_mode_name = "tm-bypass",
+ .port_fwd_begin = NULL,
+ .port_fwd_end = NULL,
+ .packet_fwd = softport_packet_fwd,
+};
--
2.9.3
^ permalink raw reply [flat|nested] 79+ messages in thread
* Re: [dpdk-dev] [PATCH v8 5/5] app/testpmd: add traffic management forwarding mode
2017-10-10 10:18 ` [dpdk-dev] [PATCH v8 5/5] app/testpmd: add traffic management forwarding mode Jasvinder Singh
@ 2017-10-10 18:24 ` Ferruh Yigit
0 siblings, 0 replies; 79+ messages in thread
From: Ferruh Yigit @ 2017-10-10 18:24 UTC (permalink / raw)
To: Jasvinder Singh, dev; +Cc: cristian.dumitrescu, thomas, wenzhuo.lu
On 10/10/2017 11:18 AM, Jasvinder Singh wrote:
> This commit extends the testpmd application with a new forwarding engine
> that demonstrates the use of ethdev traffic management APIs and softnic
> PMD for QoS traffic management.
>
> In this mode, a 5-level hierarchical tree of the QoS scheduler is built
> with the help of ethdev TM APIs such as shaper profile add/delete,
> shared shaper add/update, node add/delete, hierarchy commit, etc.
> The hierarchical tree has the following nodes: root node (x1, level 0),
> subport node (x1, level 1), pipe node (x4096, level 2),
> tc node (x16384, level 3), queue node (x65536, level 4).
>
> During runtime, each received packet is first classified by mapping the
> packet field information to a 5-tuple (HQoS subport, pipe, traffic class,
> queue within traffic class, and color) and storing it in the packet mbuf
> sched field. After classification, each packet is sent to the softnic port,
> which prioritizes the transmission of the received packets, and
> accordingly sends them on to the output interface.
>
> To enable traffic management mode, the following testpmd command is used:
>
> $ ./testpmd -c c -n 4 --vdev
> 'net_softnic0,hard_name=0000:06:00.1,soft_tm=on' -- -i
> --forward-mode=tm
>
> Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
> Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
> Acked-by: Thomas Monjalon <thomas@monjalon.net>
>
> ---
> v8 change:
> - fix compilation warning on uninitialised parameter
>
> v7 change:
> - change port_id type to uint16_t
> - rebase on dpdk-next-net
>
> v5 change:
> - add CLI to enable default tm hierarchy
>
> v3 change:
> - Implements feedback from Pablo[1]
> - add flag to check required librte_sched lib and softnic pmd
> - code cleanup
>
> v2 change:
> - change file name softnictm.c to tm.c
> - change forward mode name to "tm"
> - code clean up
>
> [1] http://dpdk.org/ml/archives/dev/2017-September/075744.html
>
> app/test-pmd/Makefile | 12 +
> app/test-pmd/cmdline.c | 88 +++++
> app/test-pmd/testpmd.c | 15 +
> app/test-pmd/testpmd.h | 46 +++
> app/test-pmd/tm.c | 865 +++++++++++++++++++++++++++++++++++++++++++++++++
Testpmd documentation [1] is missing; can you please send a separate
patch to update it? I can squash it later into this patch.
[1]
doc/guides/testpmd_app_ug/testpmd_funcs.rst
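For reference, once testpmd is started in tm forward mode as shown in the
commit message above, the default hierarchy added by this patch would be
enabled from the interactive prompt roughly as follows (a sketch pieced
together from the command help string in the patch, not a captured session;
the port id is illustrative):

  testpmd> set port tm hierarchy default 1
  testpmd> start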
^ permalink raw reply [flat|nested] 79+ messages in thread
* Re: [dpdk-dev] [PATCH v8 0/5] net/softnic: sw fall-back pmd for traffic mgmt and others
2017-10-10 10:18 ` [dpdk-dev] [PATCH v8 0/5] net/softnic: sw fall-back pmd for traffic mgmt and others Jasvinder Singh
` (4 preceding siblings ...)
2017-10-10 10:18 ` [dpdk-dev] [PATCH v8 5/5] app/testpmd: add traffic management forwarding mode Jasvinder Singh
@ 2017-10-10 18:31 ` Ferruh Yigit
2017-10-10 19:09 ` Singh, Jasvinder
5 siblings, 1 reply; 79+ messages in thread
From: Ferruh Yigit @ 2017-10-10 18:31 UTC (permalink / raw)
To: Jasvinder Singh, dev; +Cc: cristian.dumitrescu, thomas, wenzhuo.lu
On 10/10/2017 11:18 AM, Jasvinder Singh wrote:
> The SoftNIC PMD is intended to provide SW fall-back options for specific
> ethdev APIs in a generic way to the NICs not supporting those features.
>
> Currently, the only implemented ethdev API is Traffic Management (TM),
> but other ethdev APIs such as rte_flow, traffic metering & policing, etc
> can be easily implemented.
>
> Overview:
> * Generic: The SoftNIC PMD works with any "hard" PMD that implements the
> ethdev API. It does not change the "hard" PMD in any way.
> * Creation: For any given "hard" ethdev port, the user can decide to
> create an associated "soft" ethdev port to drive the "hard" port. The
> "soft" port is a virtual device that can be created at app start-up
> through EAL vdev arg or later through the virtual device API.
> * Configuration: The app explicitly decides which features are to be
> enabled on the "soft" port and which features are still to be used from
> the "hard" port. The app continues to explicitly configure both the
> "hard" and the "soft" ports after the creation of the "soft" port.
> * RX/TX: The app reads packets from/writes packets to the "soft" port
> instead of the "hard" port. The RX and TX queues of the "soft" port are
> thread safe, as any ethdev.
> * Execution: The "soft" port is a feature-rich NIC implemented by the CPU,
> so the run function of the "soft" port has to be executed by the CPU in
> order to get packets moving between "hard" port and the app.
> * Meets the NFV vision: The app should be (almost) agnostic about the NIC
> implementation (different vendors/models, HW-SW mix), the app should not
> require changes to use different NICs, the app should use the same API
> for all NICs. If a NIC does not implement a specific feature, the HW
> should be augmented with SW to meet the functionality while still
> preserving the same API.
>
> Traffic Management SW fall-back overview:
> * Implements the ethdev traffic management API (rte_tm.h).
> * Based on the existing librte_sched DPDK library.
>
> Example: Create "soft" port for "hard" port "0000:04:00.1", enable the TM
> feature with default settings:
> --vdev 'net_softnic0,hard_name=0000:04:00.1,soft_tm=on'
>
> Q1: Why generic name, if only TM is supported (for now)?
> A1: The intention is to have SoftNIC PMD implement many other (all?)
> ethdev APIs under a single "ideal" ethdev, hence the generic name.
> The initial motivation is TM API, but the mechanism is generic and can
> be used for many other ethdev APIs. Somebody looking to provide SW
> fall-back for other ethdev API is likely to end up inventing the same,
> hence it would be good to consolidate all under a single PMD and have
> the user explicitly enable/disable the features it needs for each
> "soft" device.
>
> Q2: Are there any performance requirements for SoftNIC?
> A2: Yes, performance should be great/decent for every feature, otherwise
> the SW fall-back is unusable, thus useless.
>
> Q3: Why not change the "hard" device (and keep a single device) instead of
> creating a new "soft" device (and thus having two devices)?
> A3: This is not possible with the current librte_ether ethdev
> implementation. The ethdev->dev_ops are defined as constant structure,
> so it cannot be changed per device (nor per PMD). The new ops also
> need memory space to store their context data structures, which
> requires updating the ethdev->data->dev_private of the existing
> device; at best, maybe a resize of ethdev->data->dev_private could be
> done, assuming that librte_ether will introduce a way to find out its
> size, but this cannot be done while device is running. Other side
> effects might exist, as the changes are very intrusive, plus it likely
> needs more changes in librte_ether.
>
> Q4: Why not call the SW fall-back dev_ops directly in librte_ether for
> devices which do not support the specific feature? If the device
> supports the capability, let's call its dev_ops, otherwise call the
> SW fall-back dev_ops.
> A4: First, similar reasons to Q&A3. This fixes the need to change
> ethdev->dev_ops of the device, but it does not do anything to fix the
> other significant issue of where to store the context data structures
> needed by the SW fall-back functions (which, in this approach, are
> called implicitly by librte_ether).
> Second, the SW fall-back options should not be restricted arbitrarily
> by the librte_ether library, the decision should belong to the app.
> For example, the TM SW fall-back should not be limited to only
> librte_sched, which (like any SW fall-back) is limited to a specific
> hierarchy and feature set, it cannot do any possible hierarchy. If
> alternatives exist, the one to use should be picked by the app, not by
> the ethdev layer.
>
> Q5: Why is the app required to continue to configure both the "hard" and
> the "soft" devices even after the "soft" device has been created? Why
> not hiding the "hard" device under the "soft" device and have the
> "soft" device configure the "hard" device under the hood?
> A5: This was the approach tried in the V2 of this patch set (overlay
> "soft" device taking over the configuration of the underlay "hard"
> device) and eventually dropped due to increased complexity of having
> to keep the configuration of two distinct devices in sync with
> librte_ether implementation that is not friendly towards such
> approach. Basically, each ethdev API call for the overlay device
> needs to configure the overlay device, invoke the same configuration
> with possibly modified parameters for the underlay device, then resume
> the configuration of overlay device, turning this into a device
> emulation project.
> V2 minuses: increased complexity (deal with two devices at same time);
> need to implement every ethdev API, even those not needed for the scope
> of SW fall-back; intrusive; sometimes have to silently take decisions
> that should be left to the app.
> V3 pluses: lower complexity (only one device); only need to implement
> those APIs that are in scope of the SW fall-back; non-intrusive (deal
> with "hard" device through ethdev API); app decisions taken by the app
> in an explicit way.
>
> Q6: Why expose the SW fall-back in a PMD and not in a SW library?
> A6: The SW fall-back for an ethdev API has to implement that specific
> ethdev API, (hence expose an ethdev object through a PMD), as opposed
> to providing a different API. This approach allows the app to use the
> same API (NFV vision). For example, we already have a library for TM
> SW fall-back (librte_sched) that can be called directly by the apps
> that need to call it outside of ethdev context (use-cases exist), but
> an app that works with TM-aware NICs through the ethdev TM API would
> have to be changed significantly in order to work with different
> TM-agnostic NICs through the librte_sched API.
>
> Q7: Why have all the SW fall-backs in a single PMD? Why not develop
> the SW fall-back for each different ethdev API in a separate PMD, then
> create a chain of "soft" devices for each "hard" device? Potentially,
> this results in smaller size PMDs that are easier to maintain.
> A7: Arguments for single ethdev/PMD and against chain of ethdevs/PMDs:
> 1. All the existing PMDs for HW NICs implement a lot of features under
> the same PMD, so there is no reason for single PMD approach to break
> code modularity. See the V3 code, a lot of care has been taken for
> code modularity.
> 2. We should avoid the proliferation of SW PMDs.
> 3. A single device should be handled by a single PMD.
> 4. People are used to feature-rich PMDs, not single-feature
> PMDs, so why change the mindset?
> 5. [Configuration nightmare] A chain of "soft" devices attached to
> single "hard" device requires the app to be aware that the N "soft"
> devices in the chain plus the "hard" device refer to the same HW
> device, and which device should be invoked to configure which
> feature. Also the length of the chain and functionality of each
> link is different for each HW device. This breaks the requirement
> of preserving the same API while working with different NICs (NFV).
> This most likely results in a configuration nightmare, nobody is
> going to seriously use this.
> 6. [Feature inter-dependency] Sometimes different features need to be
> configured and executed together (e.g. share the same set of
> resources, are inter-dependent, etc), so it is better and more
> performant to do them in the same ethdev/PMD.
> 7. [Code duplication] There is a lot of duplication in the
> configuration code for the chain of ethdevs approach. The ethdev
> dev_configure, rx_queue_setup, tx_queue_setup API functions have to
> be implemented per device, and they become meaningless/inconsistent
> with the chain approach.
> 8. [Data structure duplication] The per device data structures have to
> be duplicated and read repeatedly for each "soft" ethdev. The
> ethdev device, dev_private, data, per RX/TX queue data structures
> have to be replicated per "soft" device. They have to be re-read for
> each stage, so the same cache misses are now multiplied with the
> number of stages in the chain.
> 9. [rte_ring proliferation] Thread safety requirements for ethdev
> RX/TXqueues require an rte_ring to be used for every RX/TX queue
> of each "soft" ethdev. This rte_ring proliferation unnecessarily
> increases the memory footprint and lowers performance, especially
> when each "soft" ethdev ends up on a different CPU core (ping-pong
> of cache lines).
> 10.[Meta-data proliferation] A chain of ethdevs is likely to result
> in proliferation of meta-data that has to be passed between the
> ethdevs (e.g. policing needs the output of flow classification),
> which results in more cache line ping-pong between cores, hence
> performance drops.
>
> Cristian Dumitrescu (4):
> Jasvinder Singh (4):
> net/softnic: add softnic PMD
> net/softnic: add traffic management support
> net/softnic: add TM capabilities ops
> net/softnic: add TM hierarchy related ops
>
> Jasvinder Singh (1):
> app/testpmd: add traffic management forwarding mode
Series applied to dpdk-next-net/master, thanks.
(Was getting the same build error as with the previous version; fixed while
applying, please confirm the pushed commit.
Also waiting for the testpmd document, to squash into this set later)
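As a side note on the "created ... later through the virtual device API"
option from the overview quoted above, a minimal sketch of that path could
look as follows (assuming the rte_vdev_init() and
rte_eth_dev_get_port_by_name() calls of recent DPDK releases; the device name
and arguments are illustrative only, matching the example in the cover
letter):

#include <rte_bus_vdev.h>
#include <rte_ethdev.h>

/* Create the "soft" overlay port at run time instead of via the EAL --vdev arg. */
static int
softnic_port_create(void)
{
	uint16_t soft_port_id;

	/* Instantiate the softnic vdev on top of the "hard" port. */
	if (rte_vdev_init("net_softnic0",
			"hard_name=0000:04:00.1,soft_tm=on") != 0)
		return -1;

	/* Look up the ethdev port id assigned to the new "soft" port. */
	if (rte_eth_dev_get_port_by_name("net_softnic0", &soft_port_id) != 0)
		return -1;

	return soft_port_id;
}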
^ permalink raw reply [flat|nested] 79+ messages in thread
* Re: [dpdk-dev] [PATCH v8 0/5] net/softnic: sw fall-back pmd for traffic mgmt and others
2017-10-10 18:31 ` [dpdk-dev] [PATCH v8 0/5] net/softnic: sw fall-back pmd for traffic mgmt and others Ferruh Yigit
@ 2017-10-10 19:09 ` Singh, Jasvinder
0 siblings, 0 replies; 79+ messages in thread
From: Singh, Jasvinder @ 2017-10-10 19:09 UTC (permalink / raw)
To: Yigit, Ferruh, dev; +Cc: Dumitrescu, Cristian, thomas, Lu, Wenzhuo
> -----Original Message-----
> From: Yigit, Ferruh
> Sent: Tuesday, October 10, 2017 7:31 PM
> To: Singh, Jasvinder <jasvinder.singh@intel.com>; dev@dpdk.org
> Cc: Dumitrescu, Cristian <cristian.dumitrescu@intel.com>;
> thomas@monjalon.net; Lu, Wenzhuo <wenzhuo.lu@intel.com>
> Subject: Re: [PATCH v8 0/5] net/softnic: sw fall-back pmd for traffic mgmt
> and others
>
> On 10/10/2017 11:18 AM, Jasvinder Singh wrote:
> > The SoftNIC PMD is intended to provide SW fall-back options for
> > specific ethdev APIs in a generic way to the NICs not supporting those
> features.
> >
> > Currently, the only implemented ethdev API is Traffic Management (TM),
> > but other ethdev APIs such as rte_flow, traffic metering & policing,
> > etc can be easily implemented.
> >
> > Overview:
> > * Generic: The SoftNIC PMD works with any "hard" PMD that implements
> the
> > ethdev API. It does not change the "hard" PMD in any way.
> > * Creation: For any given "hard" ethdev port, the user can decide to
> > create an associated "soft" ethdev port to drive the "hard" port. The
> > "soft" port is a virtual device that can be created at app start-up
> > through EAL vdev arg or later through the virtual device API.
> > * Configuration: The app explicitly decides which features are to be
> > enabled on the "soft" port and which features are still to be used from
> > the "hard" port. The app continues to explicitly configure both the
> > "hard" and the "soft" ports after the creation of the "soft" port.
> > * RX/TX: The app reads packets from/writes packets to the "soft" port
> > instead of the "hard" port. The RX and TX queues of the "soft" port are
> > thread safe, as any ethdev.
> > * Execution: The "soft" port is a feature-rich NIC implemented by the CPU,
> > so the run function of the "soft" port has to be executed by the CPU in
> > order to get packets moving between "hard" port and the app.
> > * Meets the NFV vision: The app should be (almost) agnostic about the NIC
> > implementation (different vendors/models, HW-SW mix), the app should
> not
> > require changes to use different NICs, the app should use the same API
> > for all NICs. If a NIC does not implement a specific feature, the HW
> > should be augmented with SW to meet the functionality while still
> > preserving the same API.
> >
> > Traffic Management SW fall-back overview:
> > * Implements the ethdev traffic management API (rte_tm.h).
> > * Based on the existing librte_sched DPDK library.
> >
> > Example: Create "soft" port for "hard" port "0000:04:00.1", enable the
> > TM feature with default settings:
> > --vdev 'net_softnic0,hard_name=0000:04:00.1,soft_tm=on'
> >
> > Q1: Why generic name, if only TM is supported (for now)?
> > A1: The intention is to have SoftNIC PMD implement many other (all?)
> > ethdev APIs under a single "ideal" ethdev, hence the generic name.
> > The initial motivation is TM API, but the mechanism is generic and can
> > be used for many other ethdev APIs. Somebody looking to provide SW
> > fall-back for other ethdev API is likely to end up inventing the same,
> > hence it would be good to consolidate all under a single PMD and have
> > the user explicitly enable/disable the features it needs for each
> > "soft" device.
> >
> > Q2: Are there any performance requirements for SoftNIC?
> > A2: Yes, performance should be great/decent for every feature, otherwise
> > the SW fall-back is unusable, thus useless.
> >
> > Q3: Why not change the "hard" device (and keep a single device) instead of
> > creating a new "soft" device (and thus having two devices)?
> > A3: This is not possible with the current librte_ether ethdev
> > implementation. The ethdev->dev_ops are defined as constant structure,
> > so it cannot be changed per device (nor per PMD). The new ops also
> > need memory space to store their context data structures, which
> > requires updating the ethdev->data->dev_private of the existing
> > device; at best, maybe a resize of ethdev->data->dev_private could be
> > done, assuming that librte_ether will introduce a way to find out its
> > size, but this cannot be done while device is running. Other side
> > effects might exist, as the changes are very intrusive, plus it likely
> > needs more changes in librte_ether.
> >
> > Q4: Why not call the SW fall-back dev_ops directly in librte_ether for
> > devices which do not support the specific feature? If the device
> > supports the capability, let's call its dev_ops, otherwise call the
> > SW fall-back dev_ops.
> > A4: First, similar reasons to Q&A3. This fixes the need to change
> > ethdev->dev_ops of the device, but it does not do anything to fix the
> > other significant issue of where to store the context data structures
> > needed by the SW fall-back functions (which, in this approach, are
> > called implicitly by librte_ether).
> > Second, the SW fall-back options should not be restricted arbitrarily
> > by the librte_ether library, the decision should belong to the app.
> > For example, the TM SW fall-back should not be limited to only
> > librte_sched, which (like any SW fall-back) is limited to a specific
> > hierarchy and feature set, it cannot do any possible hierarchy. If
> > alternatives exist, the one to use should be picked by the app, not by
> > the ethdev layer.
> >
> > Q5: Why is the app required to continue to configure both the "hard" and
> > the "soft" devices even after the "soft" device has been created? Why
> > not hiding the "hard" device under the "soft" device and have the
> > "soft" device configure the "hard" device under the hood?
> > A5: This was the approach tried in the V2 of this patch set (overlay
> > "soft" device taking over the configuration of the underlay "hard"
> > device) and eventually dropped due to increased complexity of having
> > to keep the configuration of two distinct devices in sync with
> > librte_ether implementation that is not friendly towards such
> > approach. Basically, each ethdev API call for the overlay device
> > needs to configure the overlay device, invoke the same configuration
> > with possibly modified parameters for the underlay device, then resume
> > the configuration of overlay device, turning this into a device
> > emulation project.
> > V2 minuses: increased complexity (deal with two devices at same time);
> > need to implement every ethdev API, even those not needed for the
> scope
> > of SW fall-back; intrusive; sometimes have to silently take decisions
> > that should be left to the app.
> > V3 pluses: lower complexity (only one device); only need to implement
> > those APIs that are in scope of the SW fall-back; non-intrusive (deal
> > with "hard" device through ethdev API); app decisions taken by the app
> > in an explicit way.
> >
> > Q6: Why expose the SW fall-back in a PMD and not in a SW library?
> > A6: The SW fall-back for an ethdev API has to implement that specific
> > ethdev API, (hence expose an ethdev object through a PMD), as opposed
> > to providing a different API. This approach allows the app to use the
> > same API (NFV vision). For example, we already have a library for TM
> > SW fall-back (librte_sched) that can be called directly by the apps
> > that need to call it outside of ethdev context (use-cases exist), but
> > an app that works with TM-aware NICs through the ethdev TM API would
> > have to be changed significantly in order to work with different
> > TM-agnostic NICs through the librte_sched API.
> >
> > Q7: Why have all the SW fall-backs in a single PMD? Why not develop
> > the SW fall-back for each different ethdev API in a separate PMD, then
> > create a chain of "soft" devices for each "hard" device? Potentially,
> > this results in smaller size PMDs that are easier to maintain.
> > A7: Arguments for single ethdev/PMD and against chain of ethdevs/PMDs:
> > 1. All the existing PMDs for HW NICs implement a lot of features under
> > the same PMD, so there is no reason for single PMD approach to break
> > code modularity. See the V3 code, a lot of care has been taken for
> > code modularity.
> > 2. We should avoid the proliferation of SW PMDs.
> > 3. A single device should be handled by a single PMD.
> > 4. People are used to feature-rich PMDs, not single-feature
> > PMDs, so why change the mindset?
> > 5. [Configuration nightmare] A chain of "soft" devices attached to
> > single "hard" device requires the app to be aware that the N "soft"
> > devices in the chain plus the "hard" device refer to the same HW
> > device, and which device should be invoked to configure which
> > feature. Also the length of the chain and functionality of each
> > link is different for each HW device. This breaks the requirement
> > of preserving the same API while working with different NICs (NFV).
> > This most likely results in a configuration nightmare, nobody is
> > going to seriously use this.
> > 6. [Feature inter-dependency] Sometimes different features need to be
> > configured and executed together (e.g. share the same set of
> > resources, are inter-dependent, etc), so it is better and more
> > performant to do them in the same ethdev/PMD.
> > 7. [Code duplication] There is a lot of duplication in the
> > configuration code for the chain of ethdevs approach. The ethdev
> > dev_configure, rx_queue_setup, tx_queue_setup API functions have to
> > be implemented per device, and they become
> meaningless/inconsistent
> > with the chain approach.
> > 8. [Data structure duplication] The per device data structures have to
> > be duplicated and read repeatedly for each "soft" ethdev. The
> > ethdev device, dev_private, data, per RX/TX queue data structures
> > have to be replicated per "soft" device. They have to be re-read for
> > each stage, so the same cache misses are now multiplied with the
> > number of stages in the chain.
> > 9. [rte_ring proliferation] Thread safety requirements for ethdev
> > RX/TXqueues require an rte_ring to be used for every RX/TX queue
> > of each "soft" ethdev. This rte_ring proliferation unnecessarily
> > increases the memory footprint and lowers performance, especially
> > when each "soft" ethdev ends up on a different CPU core (ping-pong
> > of cache lines).
> > 10.[Meta-data proliferation] A chain of ethdevs is likely to result
> > in proliferation of meta-data that has to be passed between the
> > ethdevs (e.g. policing needs the output of flow classification),
> > which results in more cache line ping-pong between cores, hence
> > performance drops.
> >
> > Cristian Dumitrescu (4):
> > Jasvinder Singh (4):
> > net/softnic: add softnic PMD
> > net/softnic: add traffic management support
> > net/softnic: add TM capabilities ops
> > net/softnic: add TM hierarchy related ops
> >
> > Jasvinder Singh (1):
> > app/testpmd: add traffic management forwarding mode
>
> Series applied to dpdk-next-net/master, thanks.
>
> (Was getting same build error from previous version, fixed while applying
> please confirm the pushed commit.
> Also waiting for testpmd document to squash this set later)
I have pushed the documentation patch. Thanks.
^ permalink raw reply [flat|nested] 79+ messages in thread
* Re: [dpdk-dev] [PATCH v8 1/5] net/softnic: add softnic PMD
2017-10-10 10:18 ` [dpdk-dev] [PATCH v8 1/5] net/softnic: add softnic PMD Jasvinder Singh
@ 2017-10-11 23:18 ` Thomas Monjalon
2017-10-12 8:22 ` Singh, Jasvinder
0 siblings, 1 reply; 79+ messages in thread
From: Thomas Monjalon @ 2017-10-11 23:18 UTC (permalink / raw)
To: Jasvinder Singh, cristian.dumitrescu; +Cc: dev, ferruh.yigit, wenzhuo.lu
10/10/2017 12:18, Jasvinder Singh:
> +ifeq ($(CONFIG_RTE_LIBRTE_SCHED),y)
> +_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_SOFTNIC) += -lrte_pmd_softnic
> +endif
Why link softnic only if sched is enabled?
Please, can you fix it for RC2?
^ permalink raw reply [flat|nested] 79+ messages in thread
* Re: [dpdk-dev] [PATCH v8 1/5] net/softnic: add softnic PMD
2017-10-11 23:18 ` Thomas Monjalon
@ 2017-10-12 8:22 ` Singh, Jasvinder
0 siblings, 0 replies; 79+ messages in thread
From: Singh, Jasvinder @ 2017-10-12 8:22 UTC (permalink / raw)
To: Thomas Monjalon, Dumitrescu, Cristian; +Cc: dev, Yigit, Ferruh, Lu, Wenzhuo
> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas@monjalon.net]
> Sent: Thursday, October 12, 2017 12:18 AM
> To: Singh, Jasvinder <jasvinder.singh@intel.com>; Dumitrescu, Cristian
> <cristian.dumitrescu@intel.com>
> Cc: dev@dpdk.org; Yigit, Ferruh <ferruh.yigit@intel.com>; Lu, Wenzhuo
> <wenzhuo.lu@intel.com>
> Subject: Re: [dpdk-dev] [PATCH v8 1/5] net/softnic: add softnic PMD
>
> 10/10/2017 12:18, Jasvinder Singh:
> > +ifeq ($(CONFIG_RTE_LIBRTE_SCHED),y)
> > +_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_SOFTNIC) += -lrte_pmd_softnic
> > +endif
>
> Why linking softnic only if sched is enabled?
>
> Please, can you fix it for RC2?
Yes, will fix this. Thanks.
^ permalink raw reply [flat|nested] 79+ messages in thread
end of thread, other threads:[~2017-10-12 8:22 UTC | newest]
Thread overview: 79+ messages
2017-05-26 18:11 [dpdk-dev] [PATCH 0/2] net/softnic: sw fall-back for traffic management Jasvinder Singh
2017-05-26 18:11 ` [dpdk-dev] [PATCH 1/2] net/softnic: add softnic PMD " Jasvinder Singh
2017-06-26 16:43 ` [dpdk-dev] [PATCH v2 0/2] net/softnic: sw fall-back " Jasvinder Singh
2017-06-26 16:43 ` [dpdk-dev] [PATCH v2 1/2] net/softnic: add softnic PMD " Jasvinder Singh
2017-08-11 12:49 ` [dpdk-dev] [PATCH v3 0/4] net/softnic: sw fall-back pmd for traffic mgmt and others Jasvinder Singh
2017-08-11 12:49 ` [dpdk-dev] [PATCH v3 1/4] net/softnic: add softnic PMD Jasvinder Singh
2017-09-05 14:53 ` Ferruh Yigit
2017-09-08 9:30 ` Singh, Jasvinder
2017-09-08 9:48 ` Ferruh Yigit
2017-09-08 10:42 ` Singh, Jasvinder
2017-09-18 9:10 ` [dpdk-dev] [PATCH v4 0/4] net/softnic: sw fall-back pmd for traffic mgmt and others Jasvinder Singh
2017-09-18 9:10 ` [dpdk-dev] [PATCH v4 1/4] net/softnic: add softnic PMD Jasvinder Singh
2017-09-18 16:58 ` Singh, Jasvinder
2017-09-18 19:09 ` Thomas Monjalon
2017-09-18 9:10 ` [dpdk-dev] [PATCH v4 2/4] net/softnic: add traffic management support Jasvinder Singh
2017-09-25 1:58 ` Lu, Wenzhuo
2017-09-28 8:14 ` Singh, Jasvinder
2017-09-29 14:04 ` [dpdk-dev] [PATCH v5 0/5] net/softnic: sw fall-back pmd for traffic mgmt and others Jasvinder Singh
2017-09-29 14:04 ` [dpdk-dev] [PATCH v5 1/5] net/softnic: add softnic PMD Jasvinder Singh
2017-09-29 14:04 ` [dpdk-dev] [PATCH v5 2/5] net/softnic: add traffic management support Jasvinder Singh
2017-10-06 16:59 ` [dpdk-dev] [PATCH v6 0/5] net/softnic: sw fall-back pmd for traffic mgmt and others Jasvinder Singh
2017-10-06 16:59 ` [dpdk-dev] [PATCH v6 1/5] net/softnic: add softnic PMD Jasvinder Singh
2017-10-09 12:58 ` [dpdk-dev] [PATCH v7 0/5] net/softnic: sw fall-back pmd for traffic mgmt and others Jasvinder Singh
2017-10-09 12:58 ` [dpdk-dev] [PATCH v7 1/5] net/softnic: add softnic PMD Jasvinder Singh
2017-10-09 20:18 ` Ferruh Yigit
2017-10-10 10:08 ` Singh, Jasvinder
2017-10-10 10:18 ` [dpdk-dev] [PATCH v8 0/5] net/softnic: sw fall-back pmd for traffic mgmt and others Jasvinder Singh
2017-10-10 10:18 ` [dpdk-dev] [PATCH v8 1/5] net/softnic: add softnic PMD Jasvinder Singh
2017-10-11 23:18 ` Thomas Monjalon
2017-10-12 8:22 ` Singh, Jasvinder
2017-10-10 10:18 ` [dpdk-dev] [PATCH v8 2/5] net/softnic: add traffic management support Jasvinder Singh
2017-10-10 10:18 ` [dpdk-dev] [PATCH v8 3/5] net/softnic: add TM capabilities ops Jasvinder Singh
2017-10-10 10:18 ` [dpdk-dev] [PATCH v8 4/5] net/softnic: add TM hierarchy related ops Jasvinder Singh
2017-10-10 10:18 ` [dpdk-dev] [PATCH v8 5/5] app/testpmd: add traffic management forwarding mode Jasvinder Singh
2017-10-10 18:24 ` Ferruh Yigit
2017-10-10 18:31 ` [dpdk-dev] [PATCH v8 0/5] net/softnic: sw fall-back pmd for traffic mgmt and others Ferruh Yigit
2017-10-10 19:09 ` Singh, Jasvinder
2017-10-09 12:58 ` [dpdk-dev] [PATCH v7 2/5] net/softnic: add traffic management support Jasvinder Singh
2017-10-09 12:58 ` [dpdk-dev] [PATCH v7 3/5] net/softnic: add TM capabilities ops Jasvinder Singh
2017-10-09 12:58 ` [dpdk-dev] [PATCH v7 4/5] net/softnic: add TM hierarchy related ops Jasvinder Singh
2017-10-09 12:58 ` [dpdk-dev] [PATCH v7 5/5] app/testpmd: add traffic management forwarding mode Jasvinder Singh
2017-10-09 20:17 ` Ferruh Yigit
2017-10-10 10:07 ` Singh, Jasvinder
2017-10-06 17:00 ` [dpdk-dev] [PATCH v6 2/5] net/softnic: add traffic management support Jasvinder Singh
2017-10-06 17:00 ` [dpdk-dev] [PATCH v6 3/5] net/softnic: add TM capabilities ops Jasvinder Singh
2017-10-06 17:00 ` [dpdk-dev] [PATCH v6 4/5] net/softnic: add TM hierarchy related ops Jasvinder Singh
2017-10-06 17:00 ` [dpdk-dev] [PATCH v6 5/5] app/testpmd: add traffic management forwarding mode Jasvinder Singh
2017-10-06 18:57 ` [dpdk-dev] [PATCH v6 0/5] net/softnic: sw fall-back pmd for traffic mgmt and others Ferruh Yigit
2017-10-09 11:32 ` Singh, Jasvinder
2017-09-29 14:04 ` [dpdk-dev] [PATCH v5 3/5] net/softnic: add TM capabilities ops Jasvinder Singh
2017-09-29 14:04 ` [dpdk-dev] [PATCH v5 4/5] net/softnic: add TM hierarchy related ops Jasvinder Singh
2017-09-29 14:04 ` [dpdk-dev] [PATCH v5 5/5] app/testpmd: add traffic management forwarding mode Jasvinder Singh
2017-09-18 9:10 ` [dpdk-dev] [PATCH v4 3/4] net/softnic: add TM capabilities ops Jasvinder Singh
2017-09-25 2:33 ` Lu, Wenzhuo
2017-09-28 8:16 ` Singh, Jasvinder
2017-09-18 9:10 ` [dpdk-dev] [PATCH v4 4/4] net/softnic: add TM hierarchy related ops Jasvinder Singh
2017-09-25 7:14 ` Lu, Wenzhuo
2017-09-28 8:39 ` Singh, Jasvinder
2017-09-20 15:35 ` [dpdk-dev] [PATCH v4 0/4] net/softnic: sw fall-back pmd for traffic mgmt and others Thomas Monjalon
2017-09-22 22:07 ` Singh, Jasvinder
2017-10-06 10:40 ` Dumitrescu, Cristian
2017-10-06 12:13 ` Thomas Monjalon
2017-08-11 12:49 ` [dpdk-dev] [PATCH v3 2/4] net/softnic: add traffic management support Jasvinder Singh
2017-08-11 12:49 ` [dpdk-dev] [PATCH v3 3/4] net/softnic: add TM capabilities ops Jasvinder Singh
2017-08-11 12:49 ` [dpdk-dev] [PATCH v3 4/4] net/softnic: add TM hierarchy related ops Jasvinder Singh
2017-09-08 17:08 ` [dpdk-dev] [PATCH v3 0/4] net/softnic: sw fall-back pmd for traffic mgmt and others Dumitrescu, Cristian
2017-06-26 16:43 ` [dpdk-dev] [PATCH v2 2/2] net/softnic: add traffic management ops Jasvinder Singh
2017-05-26 18:11 ` [dpdk-dev] [PATCH " Jasvinder Singh
2017-06-07 14:32 ` [dpdk-dev] [PATCH 0/2] net/softnic: sw fall-back for traffic management Thomas Monjalon
2017-06-08 13:27 ` Dumitrescu, Cristian
2017-06-08 13:59 ` Thomas Monjalon
2017-06-08 15:27 ` Dumitrescu, Cristian
2017-06-08 16:16 ` Thomas Monjalon
2017-06-08 16:43 ` Dumitrescu, Cristian
2017-07-04 23:48 ` Thomas Monjalon
2017-07-05 9:32 ` Dumitrescu, Cristian
2017-07-05 10:17 ` Thomas Monjalon
2017-08-11 15:28 ` Stephen Hemminger
2017-08-11 16:22 ` Dumitrescu, Cristian