DPDK patches and discussions
* [dpdk-dev] [PATCH 0/7 v2] dpaa: fixes and performance improvement changes
  2018-01-23 12:31 [dpdk-dev] [PATCH 0/7] dpaa: fixes and performance improvement changes Nipun Gupta
@ 2018-01-23 12:27 ` Nipun Gupta
  2018-01-23 12:27   ` [dpdk-dev] [PATCH 1/7 v2] bus/dpaa: check flag in qman multi enqueue Nipun Gupta
                     ` (6 more replies)
  2018-01-23 12:31 ` [dpdk-dev] [PATCH 1/7] bus/dpaa: check flag in qman multi enqueue Nipun Gupta
                   ` (6 subsequent siblings)
  7 siblings, 7 replies; 18+ messages in thread
From: Nipun Gupta @ 2018-01-23 12:27 UTC
  To: thomas; +Cc: dev, hemant.agrawal, shreyansh.jain, Nipun Gupta

Patches 1-4 - fixes for some of the issues in the DPAA bus
Patches 5-7 - performance enhancement changes on the DPAA platform

Hemant Agrawal (2):
  mempool/dpaa: fix the phy to virt optimization
  net/dpaa: use phy to virt optimizations

Nipun Gupta (4):
  bus/dpaa: check flag in qman multi enqueue
  bus/dpaa: allocate qman portals in thread safe manner
  bus/dpaa: check portal presence in the caller API
  net/dpaa: further push mode optimizations

Shreyansh Jain (1):
  bus/dpaa: fix port order shuffling

Changes in v2:
  Fix the checkpatch warnings

 drivers/bus/dpaa/base/qbman/qman.c        |  99 +++++++++++++------------
 drivers/bus/dpaa/dpaa_bus.c               |  78 ++++++++++++++------
 drivers/bus/dpaa/include/fsl_qman.h       |  10 +++
 drivers/bus/dpaa/rte_bus_dpaa_version.map |   1 +
 drivers/bus/dpaa/rte_dpaa_bus.h           |   2 +
 drivers/mempool/dpaa/dpaa_mempool.c       |  33 +++++----
 drivers/mempool/dpaa/dpaa_mempool.h       |   4 +-
 drivers/net/dpaa/dpaa_ethdev.c            |  19 +++--
 drivers/net/dpaa/dpaa_rxtx.c              | 115 ++++++++++++++++++++++++------
 drivers/net/dpaa/dpaa_rxtx.h              |   9 ++-
 10 files changed, 247 insertions(+), 123 deletions(-)

-- 
1.9.1


* [dpdk-dev] [PATCH 1/7 v2] bus/dpaa: check flag in qman multi enqueue
  2018-01-23 12:27 ` [dpdk-dev] [PATCH 0/7 v2] " Nipun Gupta
@ 2018-01-23 12:27   ` Nipun Gupta
  2018-01-24  5:24     ` Hemant Agrawal
  2018-01-23 12:27   ` [dpdk-dev] [PATCH 2/7 v2] bus/dpaa: allocate qman portals in thread safe manner Nipun Gupta
                     ` (5 subsequent siblings)
  6 siblings, 1 reply; 18+ messages in thread
From: Nipun Gupta @ 2018-01-23 12:27 UTC
  To: thomas; +Cc: dev, hemant.agrawal, shreyansh.jain, Nipun Gupta, stable

A caller may or may not pass the flags in the qman enqueue multi API.
This patch adds a check on the flags pointer and only accesses the
flags if the caller passed them.
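
For illustration, a minimal sketch of a transmit-side caller that
passes NULL for the flags array, which is what makes the guard
necessary. The burst size and frame-descriptor setup are
hypothetical placeholders; only the qman_enqueue_multi() call shape
(fq, fds, flags, count) is taken from the hunk below:

  struct qm_fd fds[8];
  int sent;

  /* ... fill fds[] for the frames to transmit ... */

  /* flags == NULL: with this fix the API no longer dereferences
   * per-frame DCA flags that the caller never supplied
   */
  sent = qman_enqueue_multi(fq, fds, NULL, 8);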

Fixes: 43797e7b4774 ("bus/dpaa: support event dequeue and consumption")
Cc: stable@dpdk.org

Signed-off-by: Nipun Gupta <nipun.gupta@nxp.com>
---
 drivers/bus/dpaa/base/qbman/qman.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/bus/dpaa/base/qbman/qman.c b/drivers/bus/dpaa/base/qbman/qman.c
index 609bc76..e7fdf03 100644
--- a/drivers/bus/dpaa/base/qbman/qman.c
+++ b/drivers/bus/dpaa/base/qbman/qman.c
@@ -2198,7 +2198,7 @@ int qman_enqueue_multi(struct qman_fq *fq,
 		eq->fd.addr = cpu_to_be40(fd->addr);
 		eq->fd.status = cpu_to_be32(fd->status);
 		eq->fd.opaque = cpu_to_be32(fd->opaque);
-		if (flags[i] & QMAN_ENQUEUE_FLAG_DCA) {
+		if (flags && (flags[i] & QMAN_ENQUEUE_FLAG_DCA)) {
 			eq->dca = QM_EQCR_DCA_ENABLE |
 				((flags[i] >> 8) & QM_EQCR_DCA_IDXMASK);
 		}
-- 
1.9.1


* [dpdk-dev] [PATCH 2/7 v2] bus/dpaa: allocate qman portals in thread safe manner
  2018-01-23 12:27 ` [dpdk-dev] [PATCH 0/7 v2] " Nipun Gupta
  2018-01-23 12:27   ` [dpdk-dev] [PATCH 1/7 v2] bus/dpaa: check flag in qman multi enqueue Nipun Gupta
@ 2018-01-23 12:27   ` Nipun Gupta
  2018-01-23 12:27   ` [dpdk-dev] [PATCH 3/7 v2] mempool/dpaa: fix the phy to virt optimization Nipun Gupta
                     ` (4 subsequent siblings)
  6 siblings, 0 replies; 18+ messages in thread
From: Nipun Gupta @ 2018-01-23 12:27 UTC
  To: thomas; +Cc: dev, hemant.agrawal, shreyansh.jain, Nipun Gupta, stable

Fixes: 9d32ef0f5d61 ("bus/dpaa: support creating dynamic HW portal")
Cc: stable@dpdk.org

Signed-off-by: Nipun Gupta <nipun.gupta@nxp.com>
---
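The race being fixed: two threads could both read
global_portals_used[i] == 0 and claim the same portal.
rte_atomic16_test_and_set() makes the claim atomic. A standalone
sketch of the claim/release pattern, with names chosen to mirror
the hunk below:

  #include <rte_atomic.h>

  #define MAX_GLOBAL_PORTALS 8
  static rte_atomic16_t used[MAX_GLOBAL_PORTALS];

  static int claim_slot(void)
  {
          unsigned int i;

          for (i = 0; i < MAX_GLOBAL_PORTALS; i++) {
                  /* Atomically flips 0 -> 1; returns non-zero only
                   * for the one thread that made the transition.
                   */
                  if (rte_atomic16_test_and_set(&used[i]))
                          return i;
          }
          return -1; /* all slots taken */
  }

  static void release_slot(int i)
  {
          rte_atomic16_clear(&used[i]); /* slot reusable again */
  }
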
 drivers/bus/dpaa/base/qbman/qman.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/drivers/bus/dpaa/base/qbman/qman.c b/drivers/bus/dpaa/base/qbman/qman.c
index e7fdf03..4d8bdae 100644
--- a/drivers/bus/dpaa/base/qbman/qman.c
+++ b/drivers/bus/dpaa/base/qbman/qman.c
@@ -625,7 +625,7 @@ struct qman_portal *qman_create_portal(
 
 #define MAX_GLOBAL_PORTALS 8
 static struct qman_portal global_portals[MAX_GLOBAL_PORTALS];
-static int global_portals_used[MAX_GLOBAL_PORTALS];
+rte_atomic16_t global_portals_used[MAX_GLOBAL_PORTALS];
 
 static struct qman_portal *
 qman_alloc_global_portal(void)
@@ -633,10 +633,8 @@ struct qman_portal *qman_create_portal(
 	unsigned int i;
 
 	for (i = 0; i < MAX_GLOBAL_PORTALS; i++) {
-		if (global_portals_used[i] == 0) {
-			global_portals_used[i] = 1;
+		if (rte_atomic16_test_and_set(&global_portals_used[i]))
 			return &global_portals[i];
-		}
 	}
 	pr_err("No portal available (%x)\n", MAX_GLOBAL_PORTALS);
 
@@ -650,7 +648,7 @@ struct qman_portal *qman_create_portal(
 
 	for (i = 0; i < MAX_GLOBAL_PORTALS; i++) {
 		if (&global_portals[i] == portal) {
-			global_portals_used[i] = 0;
+			rte_atomic16_clear(&global_portals_used[i]);
 			return 0;
 		}
 	}
-- 
1.9.1


* [dpdk-dev] [PATCH 3/7 v2] mempool/dpaa: fix the phy to virt optimization
  2018-01-23 12:27 ` [dpdk-dev] [PATCH 0/7 v2] " Nipun Gupta
  2018-01-23 12:27   ` [dpdk-dev] [PATCH 1/7 v2] bus/dpaa: check flag in qman multi enqueue Nipun Gupta
  2018-01-23 12:27   ` [dpdk-dev] [PATCH 2/7 v2] bus/dpaa: allocate qman portals in thread safe manner Nipun Gupta
@ 2018-01-23 12:27   ` Nipun Gupta
  2018-01-23 12:27   ` [dpdk-dev] [PATCH 4/7 v2] bus/dpaa: fix port order shuffling Nipun Gupta
                     ` (3 subsequent siblings)
  6 siblings, 0 replies; 18+ messages in thread
From: Nipun Gupta @ 2018-01-23 12:27 UTC
  To: thomas; +Cc: dev, hemant.agrawal, shreyansh.jain, stable

From: Hemant Agrawal <hemant.agrawal@nxp.com>

Fixes: 83a4f267f2e3 ("mempool/dpaa: optimize phy to virt conversion")
Cc: stable@dpdk.org

Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---
 drivers/mempool/dpaa/dpaa_mempool.c | 9 ++++-----
 drivers/mempool/dpaa/dpaa_mempool.h | 4 ++--
 2 files changed, 6 insertions(+), 7 deletions(-)

diff --git a/drivers/mempool/dpaa/dpaa_mempool.c b/drivers/mempool/dpaa/dpaa_mempool.c
index ddc4e47..fe22519 100644
--- a/drivers/mempool/dpaa/dpaa_mempool.c
+++ b/drivers/mempool/dpaa/dpaa_mempool.c
@@ -150,8 +150,8 @@
 		uint64_t phy = rte_mempool_virt2iova(obj_table[i]);
 
 		if (unlikely(!bp_info->ptov_off)) {
-			/* buffers are not from multiple memzones */
-			if (!(bp_info->flags & DPAA_MPOOL_MULTI_MEMZONE)) {
+			/* buffers are from single mem segment */
+			if (bp_info->flags & DPAA_MPOOL_SINGLE_SEGMENT) {
 				bp_info->ptov_off
 						= (uint64_t)obj_table[i] - phy;
 				rte_dpaa_bpid_info[bp_info->bpid].ptov_off
@@ -282,9 +282,8 @@
 			   len, total_elt_sz * mp->size);
 
 	/* Detect pool area has sufficient space for elements in this memzone */
-	if (len < total_elt_sz * mp->size)
-		/* Else, Memory will be allocated from multiple memzones */
-		bp_info->flags |= DPAA_MPOOL_MULTI_MEMZONE;
+	if (len >= total_elt_sz * mp->size)
+		bp_info->flags |= DPAA_MPOOL_SINGLE_SEGMENT;
 
 	return 0;
 }
diff --git a/drivers/mempool/dpaa/dpaa_mempool.h b/drivers/mempool/dpaa/dpaa_mempool.h
index 02aa513..9435dd2 100644
--- a/drivers/mempool/dpaa/dpaa_mempool.h
+++ b/drivers/mempool/dpaa/dpaa_mempool.h
@@ -28,8 +28,8 @@
 /* Maximum release/acquire from BMAN */
 #define DPAA_MBUF_MAX_ACQ_REL  8
 
-/* Buffers are allocated from multiple memzones i.e. non phys contiguous */
-#define DPAA_MPOOL_MULTI_MEMZONE  0x01
+/* Buffers are allocated from single mem segment i.e. phys contiguous */
+#define DPAA_MPOOL_SINGLE_SEGMENT  0x01
 
 struct dpaa_bp_info {
 	struct rte_mempool *mp;
-- 
1.9.1


* [dpdk-dev] [PATCH 4/7 v2] bus/dpaa: fix port order shuffling
  2018-01-23 12:27 ` [dpdk-dev] [PATCH 0/7 v2] " Nipun Gupta
                     ` (2 preceding siblings ...)
  2018-01-23 12:27   ` [dpdk-dev] [PATCH 3/7 v2] mempool/dpaa: fix the phy to virt optimization Nipun Gupta
@ 2018-01-23 12:27   ` Nipun Gupta
  2018-01-23 12:27   ` [dpdk-dev] [PATCH 5/7 v2] net/dpaa: use phy to virt optimizations Nipun Gupta
                     ` (2 subsequent siblings)
  6 siblings, 0 replies; 18+ messages in thread
From: Nipun Gupta @ 2018-01-23 12:27 UTC
  To: thomas; +Cc: dev, hemant.agrawal, shreyansh.jain, stable

From: Shreyansh Jain <shreyansh.jain@nxp.com>

While scanning for devices, the order in which devices appear is
different from the MAC sequence.
This can cause confusion for users and automated scripts.
This patch creates a sorted list of devices.

Fixes: 919eeaccb2ba ("bus/dpaa: introduce NXP DPAA bus driver skeleton")
Cc: stable@dpdk.org

Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
---
 drivers/bus/dpaa/dpaa_bus.c | 52 +++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 50 insertions(+), 2 deletions(-)

diff --git a/drivers/bus/dpaa/dpaa_bus.c b/drivers/bus/dpaa/dpaa_bus.c
index ba33566..ef2df48 100644
--- a/drivers/bus/dpaa/dpaa_bus.c
+++ b/drivers/bus/dpaa/dpaa_bus.c
@@ -57,10 +57,58 @@
 RTE_DEFINE_PER_LCORE(bool, _dpaa_io);
 RTE_DEFINE_PER_LCORE(struct dpaa_portal_dqrr, held_bufs);
 
+static int
+compare_dpaa_devices(struct rte_dpaa_device *dev1,
+		     struct rte_dpaa_device *dev2)
+{
+	int comp = 0;
+
+	/* Segragating ETH from SEC devices */
+	if (dev1->device_type > dev2->device_type)
+		comp = 1;
+	else if (dev1->device_type < dev2->device_type)
+		comp = -1;
+	else
+		comp = 0;
+
+	if ((comp != 0) || (dev1->device_type != FSL_DPAA_ETH))
+		return comp;
+
+	if (dev1->id.fman_id > dev2->id.fman_id) {
+		comp = 1;
+	} else if (dev1->id.fman_id < dev2->id.fman_id) {
+		comp = -1;
+	} else {
+		/* FMAN ids match, check for mac_id */
+		if (dev1->id.mac_id > dev2->id.mac_id)
+			comp = 1;
+		else if (dev1->id.mac_id < dev2->id.mac_id)
+			comp = -1;
+		else
+			comp = 0;
+	}
+
+	return comp;
+}
+
 static inline void
-dpaa_add_to_device_list(struct rte_dpaa_device *dev)
+dpaa_add_to_device_list(struct rte_dpaa_device *newdev)
 {
-	TAILQ_INSERT_TAIL(&rte_dpaa_bus.device_list, dev, next);
+	int comp, inserted = 0;
+	struct rte_dpaa_device *dev = NULL;
+	struct rte_dpaa_device *tdev = NULL;
+
+	TAILQ_FOREACH_SAFE(dev, &rte_dpaa_bus.device_list, next, tdev) {
+		comp = compare_dpaa_devices(newdev, dev);
+		if (comp < 0) {
+			TAILQ_INSERT_BEFORE(dev, newdev, next);
+			inserted = 1;
+			break;
+		}
+	}
+
+	if (!inserted)
+		TAILQ_INSERT_TAIL(&rte_dpaa_bus.device_list, newdev, next);
 }
 
 static inline void
-- 
1.9.1


* [dpdk-dev] [PATCH 5/7 v2] net/dpaa: use phy to virt optimizations
  2018-01-23 12:27 ` [dpdk-dev] [PATCH 0/7 v2] " Nipun Gupta
                     ` (3 preceding siblings ...)
  2018-01-23 12:27   ` [dpdk-dev] [PATCH 4/7 v2] bus/dpaa: fix port order shuffling Nipun Gupta
@ 2018-01-23 12:27   ` Nipun Gupta
  2018-01-23 12:27   ` [dpdk-dev] [PATCH 6/7 v2] bus/dpaa: check portal presence in the caller API Nipun Gupta
  2018-01-23 12:27   ` [dpdk-dev] [PATCH 7/7 v2] net/dpaa: further push mode optimizations Nipun Gupta
  6 siblings, 0 replies; 18+ messages in thread
From: Nipun Gupta @ 2018-01-23 12:27 UTC
  To: thomas; +Cc: dev, hemant.agrawal, shreyansh.jain

From: Hemant Agrawal <hemant.agrawal@nxp.com>

Use the optimized routine for phy to virt conversion
when the mempool is allocated from physically contiguous memory.
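
DPAA_MEMPOOL_PTOV itself is not part of this diff; the following is
an assumed sketch of what the single-segment fast path amounts to,
based on the ptov_off bookkeeping from patch 3 (the real helper
lives in dpaa_mempool.h):

  /* When the pool sits in one physically contiguous segment,
   * virt = phys + constant offset, so the conversion is a single
   * add instead of a generic memseg table lookup.
   */
  static inline void *
  dpaa_mempool_ptov(struct dpaa_bp_info *bp_info, uint64_t paddr)
  {
          if (bp_info->ptov_off) /* learned on first buffer release */
                  return (void *)(size_t)(paddr + bp_info->ptov_off);
          return rte_dpaa_mem_ptov(paddr); /* generic slow path */
  }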

Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---
 drivers/net/dpaa/dpaa_rxtx.c | 14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/drivers/net/dpaa/dpaa_rxtx.c b/drivers/net/dpaa/dpaa_rxtx.c
index ab23352..b889d03 100644
--- a/drivers/net/dpaa/dpaa_rxtx.c
+++ b/drivers/net/dpaa/dpaa_rxtx.c
@@ -309,7 +309,7 @@ struct rte_mbuf *
 
 	DPAA_DP_LOG(DEBUG, "Received an SG frame");
 
-	vaddr = rte_dpaa_mem_ptov(qm_fd_addr(fd));
+	vaddr = DPAA_MEMPOOL_PTOV(bp_info, qm_fd_addr(fd));
 	if (!vaddr) {
 		DPAA_PMD_ERR("unable to convert physical address");
 		return NULL;
@@ -318,7 +318,7 @@ struct rte_mbuf *
 	sg_temp = &sgt[i++];
 	hw_sg_to_cpu(sg_temp);
 	temp = (struct rte_mbuf *)((char *)vaddr - bp_info->meta_data_size);
-	sg_vaddr = rte_dpaa_mem_ptov(qm_sg_entry_get64(sg_temp));
+	sg_vaddr = DPAA_MEMPOOL_PTOV(bp_info, qm_sg_entry_get64(sg_temp));
 
 	first_seg = (struct rte_mbuf *)((char *)sg_vaddr -
 						bp_info->meta_data_size);
@@ -334,7 +334,8 @@ struct rte_mbuf *
 	while (i < DPAA_SGT_MAX_ENTRIES) {
 		sg_temp = &sgt[i++];
 		hw_sg_to_cpu(sg_temp);
-		sg_vaddr = rte_dpaa_mem_ptov(qm_sg_entry_get64(sg_temp));
+		sg_vaddr = DPAA_MEMPOOL_PTOV(bp_info,
+					     qm_sg_entry_get64(sg_temp));
 		cur_seg = (struct rte_mbuf *)((char *)sg_vaddr -
 						      bp_info->meta_data_size);
 		cur_seg->data_off = sg_temp->offset;
@@ -361,7 +362,7 @@ struct rte_mbuf *
 {
 	struct rte_mbuf *mbuf;
 	struct dpaa_bp_info *bp_info = DPAA_BPID_TO_POOL_INFO(fd->bpid);
-	void *ptr = rte_dpaa_mem_ptov(qm_fd_addr(fd));
+	void *ptr;
 	uint8_t format =
 		(fd->opaque & DPAA_FD_FORMAT_MASK) >> DPAA_FD_FORMAT_SHIFT;
 	uint16_t offset;
@@ -372,6 +373,8 @@ struct rte_mbuf *
 	if (unlikely(format == qm_fd_sg))
 		return dpaa_eth_sg_to_mbuf(fd, ifid);
 
+	ptr = DPAA_MEMPOOL_PTOV(bp_info, qm_fd_addr(fd));
+
 	rte_prefetch0((void *)((uint8_t *)ptr + DEFAULT_RX_ICEOF));
 
 	offset = (fd->opaque & DPAA_FD_OFFSET_MASK) >> DPAA_FD_OFFSET_SHIFT;
@@ -537,7 +540,8 @@ static void *dpaa_get_pktbuf(struct dpaa_bp_info *bp_info)
 	DPAA_DP_LOG(DEBUG, "got buffer 0x%lx from pool %d",
 		    (uint64_t)bufs.addr, bufs.bpid);
 
-	buf = (uint64_t)rte_dpaa_mem_ptov(bufs.addr) - bp_info->meta_data_size;
+	buf = (uint64_t)DPAA_MEMPOOL_PTOV(bp_info, bufs.addr)
+				- bp_info->meta_data_size;
 	if (!buf)
 		goto out;
 
-- 
1.9.1


* [dpdk-dev] [PATCH 6/7 v2] bus/dpaa: check portal presence in the caller API
  2018-01-23 12:27 ` [dpdk-dev] [PATCH 0/7 v2] " Nipun Gupta
                     ` (4 preceding siblings ...)
  2018-01-23 12:27   ` [dpdk-dev] [PATCH 5/7 v2] net/dpaa: use phy to virt optimizations Nipun Gupta
@ 2018-01-23 12:27   ` Nipun Gupta
  2018-01-23 12:27   ` [dpdk-dev] [PATCH 7/7 v2] net/dpaa: further push mode optimizations Nipun Gupta
  6 siblings, 0 replies; 18+ messages in thread
From: Nipun Gupta @ 2018-01-23 12:27 UTC
  To: thomas; +Cc: dev, hemant.agrawal, shreyansh.jain, Nipun Gupta

In the I/O path we were calling rte_dpaa_portal_init, which
internally checks if a portal is affined to the core.
But this led to a call into that non-static API on every I/O
operation.

Instead, check the portal affinity in the caller itself for
performance reasons.
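
The per-lcore flag is plain C thread-local storage, which is why the
hoisted check costs one TLS load plus a predicted branch instead of
a cross-library function call. Roughly, simplified from DPDK's
rte_per_lcore.h:

  #define RTE_DEFINE_PER_LCORE(type, name) \
          __thread __typeof__(type) per_lcore_##name
  #define RTE_PER_LCORE(name) (per_lcore_##name)

This is also why the version map below gains a per_lcore_dpaa_io
symbol: callers in other libraries now reference the TLS variable
directly.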

Signed-off-by: Nipun Gupta <nipun.gupta@nxp.com>
---
 drivers/bus/dpaa/dpaa_bus.c               | 26 ++++++--------------------
 drivers/bus/dpaa/rte_bus_dpaa_version.map |  1 +
 drivers/bus/dpaa/rte_dpaa_bus.h           |  2 ++
 drivers/mempool/dpaa/dpaa_mempool.c       | 24 ++++++++++++++----------
 drivers/net/dpaa/dpaa_ethdev.c            | 10 ++++++----
 drivers/net/dpaa/dpaa_rxtx.c              | 20 ++++++++++++--------
 6 files changed, 41 insertions(+), 42 deletions(-)

diff --git a/drivers/bus/dpaa/dpaa_bus.c b/drivers/bus/dpaa/dpaa_bus.c
index ef2df48..5039067 100644
--- a/drivers/bus/dpaa/dpaa_bus.c
+++ b/drivers/bus/dpaa/dpaa_bus.c
@@ -54,7 +54,7 @@
 
 unsigned int dpaa_svr_family;
 
-RTE_DEFINE_PER_LCORE(bool, _dpaa_io);
+RTE_DEFINE_PER_LCORE(bool, dpaa_io);
 RTE_DEFINE_PER_LCORE(struct dpaa_portal_dqrr, held_bufs);
 
 static int
@@ -230,9 +230,7 @@
 	}
 }
 
-/** XXX move this function into a separate file */
-static int
-_dpaa_portal_init(void *arg)
+int rte_dpaa_portal_init(void *arg)
 {
 	cpu_set_t cpuset;
 	pthread_t id;
@@ -303,25 +301,13 @@
 		return ret;
 	}
 
-	RTE_PER_LCORE(_dpaa_io) = true;
+	RTE_PER_LCORE(dpaa_io) = true;
 
 	DPAA_BUS_LOG(DEBUG, "QMAN thread initialized");
 
 	return 0;
 }
 
-/*
- * rte_dpaa_portal_init - Wrapper over _dpaa_portal_init with thread level check
- * XXX Complete this
- */
-int rte_dpaa_portal_init(void *arg)
-{
-	if (unlikely(!RTE_PER_LCORE(_dpaa_io)))
-		return _dpaa_portal_init(arg);
-
-	return 0;
-}
-
 int
 rte_dpaa_portal_fq_init(void *arg, struct qman_fq *fq)
 {
@@ -329,8 +315,8 @@ int rte_dpaa_portal_init(void *arg)
 	u32 sdqcr;
 	struct qman_portal *qp;
 
-	if (unlikely(!RTE_PER_LCORE(_dpaa_io)))
-		_dpaa_portal_init(arg);
+	if (unlikely(!RTE_PER_LCORE(dpaa_io)))
+		rte_dpaa_portal_init(arg);
 
 	/* Initialise qman specific portals */
 	qp = fsl_qman_portal_create();
@@ -368,7 +354,7 @@ int rte_dpaa_portal_fq_close(struct qman_fq *fq)
 	rte_free(dpaa_io_portal);
 	dpaa_io_portal = NULL;
 
-	RTE_PER_LCORE(_dpaa_io) = false;
+	RTE_PER_LCORE(dpaa_io) = false;
 }
 
 #define DPAA_DEV_PATH1 "/sys/devices/platform/soc/soc:fsl,dpaa"
diff --git a/drivers/bus/dpaa/rte_bus_dpaa_version.map b/drivers/bus/dpaa/rte_bus_dpaa_version.map
index 925cf91..8d90285 100644
--- a/drivers/bus/dpaa/rte_bus_dpaa_version.map
+++ b/drivers/bus/dpaa/rte_bus_dpaa_version.map
@@ -70,6 +70,7 @@ DPDK_18.02 {
 
 	dpaa_logtype_eventdev;
 	dpaa_svr_family;
+	per_lcore_dpaa_io;
 	per_lcore_held_bufs;
 	qm_channel_pool1;
 	qman_alloc_cgrid_range;
diff --git a/drivers/bus/dpaa/rte_dpaa_bus.h b/drivers/bus/dpaa/rte_dpaa_bus.h
index 6fa0c3d..0352abd 100644
--- a/drivers/bus/dpaa/rte_dpaa_bus.h
+++ b/drivers/bus/dpaa/rte_dpaa_bus.h
@@ -31,6 +31,8 @@
 
 extern unsigned int dpaa_svr_family;
 
+extern RTE_DEFINE_PER_LCORE(bool, dpaa_io);
+
 struct rte_dpaa_device;
 struct rte_dpaa_driver;
 
diff --git a/drivers/mempool/dpaa/dpaa_mempool.c b/drivers/mempool/dpaa/dpaa_mempool.c
index fe22519..eb5b8f9 100644
--- a/drivers/mempool/dpaa/dpaa_mempool.c
+++ b/drivers/mempool/dpaa/dpaa_mempool.c
@@ -139,11 +139,13 @@
 	DPAA_MEMPOOL_DPDEBUG("Request to free %d buffers in bpid = %d",
 			     n, bp_info->bpid);
 
-	ret = rte_dpaa_portal_init((void *)0);
-	if (ret) {
-		DPAA_MEMPOOL_ERR("rte_dpaa_portal_init failed with ret: %d",
-				 ret);
-		return 0;
+	if (unlikely(!RTE_PER_LCORE(dpaa_io))) {
+		ret = rte_dpaa_portal_init((void *)0);
+		if (ret) {
+			DPAA_MEMPOOL_ERR("rte_dpaa_portal_init failed with ret: %d",
+					 ret);
+			return 0;
+		}
 	}
 
 	while (i < n) {
@@ -193,11 +195,13 @@
 		return -1;
 	}
 
-	ret = rte_dpaa_portal_init((void *)0);
-	if (ret) {
-		DPAA_MEMPOOL_ERR("rte_dpaa_portal_init failed with ret: %d",
-				 ret);
-		return -1;
+	if (unlikely(!RTE_PER_LCORE(dpaa_io))) {
+		ret = rte_dpaa_portal_init((void *)0);
+		if (ret) {
+			DPAA_MEMPOOL_ERR("rte_dpaa_portal_init failed with ret: %d",
+					 ret);
+			return -1;
+		}
 	}
 
 	while (n < count) {
diff --git a/drivers/net/dpaa/dpaa_ethdev.c b/drivers/net/dpaa/dpaa_ethdev.c
index bf5eb96..b60ed3b 100644
--- a/drivers/net/dpaa/dpaa_ethdev.c
+++ b/drivers/net/dpaa/dpaa_ethdev.c
@@ -1331,10 +1331,12 @@ static int dpaa_debug_queue_init(struct qman_fq *fq, uint32_t fqid)
 		is_global_init = 1;
 	}
 
-	ret = rte_dpaa_portal_init((void *)1);
-	if (ret) {
-		DPAA_PMD_ERR("Unable to initialize portal");
-		return ret;
+	if (unlikely(!RTE_PER_LCORE(dpaa_io))) {
+		ret = rte_dpaa_portal_init((void *)1);
+		if (ret) {
+			DPAA_PMD_ERR("Unable to initialize portal");
+			return ret;
+		}
 	}
 
 	eth_dev = rte_eth_dev_allocate(dpaa_dev->name);
diff --git a/drivers/net/dpaa/dpaa_rxtx.c b/drivers/net/dpaa/dpaa_rxtx.c
index b889d03..f969ccf 100644
--- a/drivers/net/dpaa/dpaa_rxtx.c
+++ b/drivers/net/dpaa/dpaa_rxtx.c
@@ -503,10 +503,12 @@ uint16_t dpaa_eth_queue_rx(void *q,
 	if (likely(fq->is_static))
 		return dpaa_eth_queue_portal_rx(fq, bufs, nb_bufs);
 
-	ret = rte_dpaa_portal_init((void *)0);
-	if (ret) {
-		DPAA_PMD_ERR("Failure in affining portal");
-		return 0;
+	if (unlikely(!RTE_PER_LCORE(dpaa_io))) {
+		ret = rte_dpaa_portal_init((void *)0);
+		if (ret) {
+			DPAA_PMD_ERR("Failure in affining portal");
+			return 0;
+		}
 	}
 
 	ret = qman_set_vdq(fq, (nb_bufs > DPAA_MAX_DEQUEUE_NUM_FRAMES) ?
@@ -777,10 +779,12 @@ static struct rte_mbuf *dpaa_get_dmable_mbuf(struct rte_mbuf *mbuf,
 	int ret;
 	uint32_t seqn, index, flags[DPAA_TX_BURST_SIZE] = {0};
 
-	ret = rte_dpaa_portal_init((void *)0);
-	if (ret) {
-		DPAA_PMD_ERR("Failure in affining portal");
-		return 0;
+	if (unlikely(!RTE_PER_LCORE(dpaa_io))) {
+		ret = rte_dpaa_portal_init((void *)0);
+		if (ret) {
+			DPAA_PMD_ERR("Failure in affining portal");
+			return 0;
+		}
 	}
 
 	DPAA_DP_LOG(DEBUG, "Transmitting %d buffers on queue: %p", nb_bufs, q);
-- 
1.9.1


* [dpdk-dev] [PATCH 7/7 v2] net/dpaa: further push mode optimizations
  2018-01-23 12:27 ` [dpdk-dev] [PATCH 0/7 v2] " Nipun Gupta
                     ` (5 preceding siblings ...)
  2018-01-23 12:27   ` [dpdk-dev] [PATCH 6/7 v2] bus/dpaa: check portal presence in the caller API Nipun Gupta
@ 2018-01-23 12:27   ` Nipun Gupta
  6 siblings, 0 replies; 18+ messages in thread
From: Nipun Gupta @ 2018-01-23 12:27 UTC
  To: thomas; +Cc: dev, hemant.agrawal, shreyansh.jain, Nipun Gupta

This patch supports batch processing of multiple packets
on the Rx side.
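
At a high level, the rework splits the old per-entry DQRR callback
into a cheap per-entry prepare step and one batched callback per
poll. A condensed paraphrase of the new qman_portal_poll_rx() flow,
where take_next_dqrr_entry() is a hypothetical helper standing in
for the cursor/shadow handling in the diff:

  while (dqrr->fill && rx_number < poll_limit) {
          dq[rx_number] = take_next_dqrr_entry(dqrr);
          /* phase 1: map the FD to an mbuf, prefetch annotation */
          fq[rx_number]->cb.dqrr_prepare(dq[rx_number],
                                         &bufs[rx_number]);
          consume |= 1 << (31 - DQRR_PTR2IDX(dq[rx_number]));
          rx_number++;
  }
  if (rx_number) /* phase 2: finish all mbufs in one call */
          fq[0]->cb.dqrr_dpdk_pull_cb(fq, dq, bufs, rx_number);
  qm_out(DQRR_DCAP, (1 << 8) | consume); /* consume whole batch */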

Signed-off-by: Nipun Gupta <nipun.gupta@nxp.com>
Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---
 drivers/bus/dpaa/base/qbman/qman.c  | 89 ++++++++++++++++++-------------------
 drivers/bus/dpaa/include/fsl_qman.h | 10 +++++
 drivers/net/dpaa/dpaa_ethdev.c      |  9 +++-
 drivers/net/dpaa/dpaa_rxtx.c        | 81 +++++++++++++++++++++++++++++----
 drivers/net/dpaa/dpaa_rxtx.h        |  9 ++--
 5 files changed, 137 insertions(+), 61 deletions(-)

diff --git a/drivers/bus/dpaa/base/qbman/qman.c b/drivers/bus/dpaa/base/qbman/qman.c
index 4d8bdae..2b97671 100644
--- a/drivers/bus/dpaa/base/qbman/qman.c
+++ b/drivers/bus/dpaa/base/qbman/qman.c
@@ -1055,64 +1055,63 @@ unsigned int qman_portal_poll_rx(unsigned int poll_limit,
 				 void **bufs,
 				 struct qman_portal *p)
 {
-	const struct qm_dqrr_entry *dq;
-	struct qman_fq *fq;
-	enum qman_cb_dqrr_result res;
-	unsigned int limit = 0;
-#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
-	struct qm_dqrr_entry *shadow;
-#endif
-	unsigned int rx_number = 0;
+	struct qm_portal *portal = &p->p;
+	register struct qm_dqrr *dqrr = &portal->dqrr;
+	struct qm_dqrr_entry *dq[QM_DQRR_SIZE], *shadow[QM_DQRR_SIZE];
+	struct qman_fq *fq[QM_DQRR_SIZE];
+	unsigned int limit = 0, rx_number = 0;
+	uint32_t consume = 0;
 
 	do {
 		qm_dqrr_pvb_update(&p->p);
-		dq = qm_dqrr_current(&p->p);
-		if (unlikely(!dq))
+		if (!dqrr->fill)
 			break;
+
+		dq[rx_number] = dqrr->cursor;
+		dqrr->cursor = DQRR_CARRYCLEAR(dqrr->cursor + 1);
+		/* Prefetch the next DQRR entry */
+		rte_prefetch0(dqrr->cursor);
+
 #if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
-	/* If running on an LE system the fields of the
-	 * dequeue entry must be swapper.  Because the
-	 * QMan HW will ignore writes the DQRR entry is
-	 * copied and the index stored within the copy
-	 */
-		shadow = &p->shadow_dqrr[DQRR_PTR2IDX(dq)];
-		*shadow = *dq;
-		dq = shadow;
-		shadow->fqid = be32_to_cpu(shadow->fqid);
-		shadow->contextB = be32_to_cpu(shadow->contextB);
-		shadow->seqnum = be16_to_cpu(shadow->seqnum);
-		hw_fd_to_cpu(&shadow->fd);
+		/* If running on an LE system the fields of the
+		 * dequeue entry must be swapper.  Because the
+		 * QMan HW will ignore writes the DQRR entry is
+		 * copied and the index stored within the copy
+		 */
+		shadow[rx_number] =
+			&p->shadow_dqrr[DQRR_PTR2IDX(dq[rx_number])];
+		shadow[rx_number]->fd.opaque_addr =
+			dq[rx_number]->fd.opaque_addr;
+		shadow[rx_number]->fd.addr =
+			be40_to_cpu(dq[rx_number]->fd.addr);
+		shadow[rx_number]->fd.opaque =
+			be32_to_cpu(dq[rx_number]->fd.opaque);
+#else
+		shadow = dq;
 #endif
 
 		/* SDQCR: context_b points to the FQ */
 #ifdef CONFIG_FSL_QMAN_FQ_LOOKUP
-		fq = get_fq_table_entry(dq->contextB);
+		fq[rx_number] = qman_fq_lookup_table[be32_to_cpu(
+						dq[rx_number]->contextB)];
 #else
-		fq = (void *)(uintptr_t)dq->contextB;
+		fq[rx_number] = (void *)(uintptr_t)be32_to_cpu(dq->contextB);
 #endif
-		/* Now let the callback do its stuff */
-		res = fq->cb.dqrr_dpdk_cb(NULL, p, fq, dq, &bufs[rx_number]);
+		fq[rx_number]->cb.dqrr_prepare(shadow[rx_number],
+						 &bufs[rx_number]);
+
+		consume |= (1 << (31 - DQRR_PTR2IDX(shadow[rx_number])));
 		rx_number++;
-		/* Interpret 'dq' from a driver perspective. */
-		/*
-		 * Parking isn't possible unless HELDACTIVE was set. NB,
-		 * FORCEELIGIBLE implies HELDACTIVE, so we only need to
-		 * check for HELDACTIVE to cover both.
-		 */
-		DPAA_ASSERT((dq->stat & QM_DQRR_STAT_FQ_HELDACTIVE) ||
-			    (res != qman_cb_dqrr_park));
-		qm_dqrr_cdc_consume_1ptr(&p->p, dq, res == qman_cb_dqrr_park);
-		/* Move forward */
-		qm_dqrr_next(&p->p);
-		/*
-		 * Entry processed and consumed, increment our counter.  The
-		 * callback can request that we exit after consuming the
-		 * entry, and we also exit if we reach our processing limit,
-		 * so loop back only if neither of these conditions is met.
-		 */
-	} while (likely(++limit < poll_limit));
+		--dqrr->fill;
+	} while (++limit < poll_limit);
 
-	return limit;
+	if (rx_number)
+		fq[0]->cb.dqrr_dpdk_pull_cb(fq, shadow, bufs, rx_number);
+
+	/* Consume all the DQRR enries together */
+	qm_out(DQRR_DCAP, (1 << 8) | consume);
+
+	return rx_number;
 }
 
 u32 qman_portal_dequeue(struct rte_event ev[], unsigned int poll_limit,
diff --git a/drivers/bus/dpaa/include/fsl_qman.h b/drivers/bus/dpaa/include/fsl_qman.h
index 99e46e1..e9793f3 100644
--- a/drivers/bus/dpaa/include/fsl_qman.h
+++ b/drivers/bus/dpaa/include/fsl_qman.h
@@ -1131,6 +1131,14 @@ typedef enum qman_cb_dqrr_result (*qman_dpdk_cb_dqrr)(void *event,
 					const struct qm_dqrr_entry *dqrr,
 					void **bd);
 
+/* This callback type is used when handling buffers in dpdk pull mode */
+typedef void (*qman_dpdk_pull_cb_dqrr)(struct qman_fq **fq,
+					struct qm_dqrr_entry **dqrr,
+					void **bufs,
+					int num_bufs);
+
+typedef void (*qman_dpdk_cb_prepare)(struct qm_dqrr_entry *dq, void **bufs);
+
 /*
  * This callback type is used when handling ERNs, FQRNs and FQRLs via MR. They
  * are always consumed after the callback returns.
@@ -1191,8 +1199,10 @@ enum qman_fq_state {
 struct qman_fq_cb {
 	union { /* for dequeued frames */
 		qman_dpdk_cb_dqrr dqrr_dpdk_cb;
+		qman_dpdk_pull_cb_dqrr dqrr_dpdk_pull_cb;
 		qman_cb_dqrr dqrr;
 	};
+	qman_dpdk_cb_prepare dqrr_prepare;
 	qman_cb_mr ern;		/* for s/w ERNs */
 	qman_cb_mr fqs;		/* frame-queue state changes*/
 };
diff --git a/drivers/net/dpaa/dpaa_ethdev.c b/drivers/net/dpaa/dpaa_ethdev.c
index b60ed3b..97679eb 100644
--- a/drivers/net/dpaa/dpaa_ethdev.c
+++ b/drivers/net/dpaa/dpaa_ethdev.c
@@ -503,7 +503,11 @@ int dpaa_eth_rx_queue_setup(struct rte_eth_dev *dev, uint16_t queue_idx,
 				   QM_FQCTRL_CTXASTASHING |
 				   QM_FQCTRL_PREFERINCACHE;
 		opts.fqd.context_a.stashing.exclusive = 0;
-		opts.fqd.context_a.stashing.annotation_cl =
+		/* In muticore scenario stashing becomes a bottleneck on LS1046.
+		 * So do not enable stashing in this case
+		 */
+		if (dpaa_svr_family != SVR_LS1046A_FAMILY)
+			opts.fqd.context_a.stashing.annotation_cl =
 						DPAA_IF_RX_ANNOTATION_STASH;
 		opts.fqd.context_a.stashing.data_cl = DPAA_IF_RX_DATA_STASH;
 		opts.fqd.context_a.stashing.context_cl =
@@ -526,7 +530,8 @@ int dpaa_eth_rx_queue_setup(struct rte_eth_dev *dev, uint16_t queue_idx,
 		if (ret)
 			DPAA_PMD_ERR("Channel/Queue association failed. fqid %d"
 				     " ret: %d", rxq->fqid, ret);
-		rxq->cb.dqrr_dpdk_cb = dpaa_rx_cb;
+		rxq->cb.dqrr_dpdk_pull_cb = dpaa_rx_cb;
+		rxq->cb.dqrr_prepare = dpaa_rx_cb_prepare;
 		rxq->is_static = true;
 	}
 	dev->data->rx_queues[queue_idx] = rxq;
diff --git a/drivers/net/dpaa/dpaa_rxtx.c b/drivers/net/dpaa/dpaa_rxtx.c
index f969ccf..fc8144f 100644
--- a/drivers/net/dpaa/dpaa_rxtx.c
+++ b/drivers/net/dpaa/dpaa_rxtx.c
@@ -399,17 +399,80 @@ struct rte_mbuf *
 	return mbuf;
 }
 
-enum qman_cb_dqrr_result dpaa_rx_cb(void *event __always_unused,
-				    struct qman_portal *qm __always_unused,
-				    struct qman_fq *fq,
-				    const struct qm_dqrr_entry *dqrr,
-				    void **bufs)
+void
+dpaa_rx_cb(struct qman_fq **fq, struct qm_dqrr_entry **dqrr,
+	   void **bufs, int num_bufs)
 {
-	const struct qm_fd *fd = &dqrr->fd;
+	struct rte_mbuf *mbuf;
+	struct dpaa_bp_info *bp_info;
+	const struct qm_fd *fd;
+	void *ptr;
+	struct dpaa_if *dpaa_intf;
+	uint16_t offset, i;
+	uint32_t length;
+	uint8_t format;
+
+	if (dpaa_svr_family != SVR_LS1046A_FAMILY) {
+		bp_info = DPAA_BPID_TO_POOL_INFO(dqrr[0]->fd.bpid);
+		ptr = rte_dpaa_mem_ptov(qm_fd_addr(&dqrr[0]->fd));
+		rte_prefetch0((void *)((uint8_t *)ptr + DEFAULT_RX_ICEOF));
+		bufs[0] = (struct rte_mbuf *)((char *)ptr -
+				bp_info->meta_data_size);
+	}
 
-	*bufs = dpaa_eth_fd_to_mbuf(fd,
-			((struct dpaa_if *)fq->dpaa_intf)->ifid);
-	return qman_cb_dqrr_consume;
+	for (i = 0; i < num_bufs; i++) {
+		if (dpaa_svr_family != SVR_LS1046A_FAMILY &&
+		    i < num_bufs - 1) {
+			bp_info = DPAA_BPID_TO_POOL_INFO(dqrr[i + 1]->fd.bpid);
+			ptr = rte_dpaa_mem_ptov(qm_fd_addr(&dqrr[i + 1]->fd));
+			rte_prefetch0((void *)((uint8_t *)ptr +
+					DEFAULT_RX_ICEOF));
+			bufs[i + 1] = (struct rte_mbuf *)((char *)ptr -
+					bp_info->meta_data_size);
+		}
+
+		fd = &dqrr[i]->fd;
+		dpaa_intf = fq[i]->dpaa_intf;
+
+		format = (fd->opaque & DPAA_FD_FORMAT_MASK) >>
+				DPAA_FD_FORMAT_SHIFT;
+		if (unlikely(format == qm_fd_sg)) {
+			bufs[i] = dpaa_eth_sg_to_mbuf(fd, dpaa_intf->ifid);
+			continue;
+		}
+
+		offset = (fd->opaque & DPAA_FD_OFFSET_MASK) >>
+				DPAA_FD_OFFSET_SHIFT;
+		length = fd->opaque & DPAA_FD_LENGTH_MASK;
+
+		mbuf = bufs[i];
+		mbuf->data_off = offset;
+		mbuf->data_len = length;
+		mbuf->pkt_len = length;
+		mbuf->port = dpaa_intf->ifid;
+
+		mbuf->nb_segs = 1;
+		mbuf->ol_flags = 0;
+		mbuf->next = NULL;
+		rte_mbuf_refcnt_set(mbuf, 1);
+		dpaa_eth_packet_info(mbuf, (uint64_t)mbuf->buf_addr);
+	}
+}
+
+void dpaa_rx_cb_prepare(struct qm_dqrr_entry *dq, void **bufs)
+{
+	struct dpaa_bp_info *bp_info = DPAA_BPID_TO_POOL_INFO(dq->fd.bpid);
+	void *ptr = rte_dpaa_mem_ptov(qm_fd_addr(&dq->fd));
+
+	/* In case of LS1046, annotation stashing is disabled due to L2 cache
+	 * being bottleneck in case of multicore scanario for this platform.
+	 * So we prefetch the annoation beforehand, so that it is available
+	 * in cache when accessed.
+	 */
+	if (dpaa_svr_family == SVR_LS1046A_FAMILY)
+		rte_prefetch0((void *)((uint8_t *)ptr + DEFAULT_RX_ICEOF));
+
+	*bufs = (struct rte_mbuf *)((char *)ptr - bp_info->meta_data_size);
 }
 
 static uint16_t
diff --git a/drivers/net/dpaa/dpaa_rxtx.h b/drivers/net/dpaa/dpaa_rxtx.h
index 29d8f95..d3e6351 100644
--- a/drivers/net/dpaa/dpaa_rxtx.h
+++ b/drivers/net/dpaa/dpaa_rxtx.h
@@ -268,9 +268,8 @@ int dpaa_eth_mbuf_to_sg_fd(struct rte_mbuf *mbuf,
 			   struct qm_fd *fd,
 			   uint32_t bpid);
 
-enum qman_cb_dqrr_result dpaa_rx_cb(void *event,
-				    struct qman_portal *qm,
-				    struct qman_fq *fq,
-				    const struct qm_dqrr_entry *dqrr,
-				    void **bd);
+void dpaa_rx_cb(struct qman_fq **fq,
+		struct qm_dqrr_entry **dqrr, void **bufs, int num_bufs);
+
+void dpaa_rx_cb_prepare(struct qm_dqrr_entry *dq, void **bufs);
 #endif
-- 
1.9.1


* [dpdk-dev] [PATCH 0/7] dpaa: fixes and performance improvement changes
@ 2018-01-23 12:31 Nipun Gupta
  2018-01-23 12:27 ` [dpdk-dev] [PATCH 0/7 v2] " Nipun Gupta
                   ` (7 more replies)
  0 siblings, 8 replies; 18+ messages in thread
From: Nipun Gupta @ 2018-01-23 12:31 UTC
  To: thomas; +Cc: dev, hemant.agrawal, shreyansh.jain, Nipun Gupta

Patches 1-4 - fixes for some of the issues in the DPAA bus
Patches 5-7 - performance enhancement changes on the DPAA platform

Hemant Agrawal (2):
  mempool/dpaa: fix the phy to virt optimization
  net/dpaa: use phy to virt optimizations

Nipun Gupta (4):
  bus/dpaa: check flag in qman multi enqueue
  bus/dpaa: allocate qman portals in thread safe manner
  bus/dpaa: check portal presence in the caller API
  net/dpaa: further push mode optimizations

Shreyansh Jain (1):
  bus/dpaa: fix port order shuffling

 drivers/bus/dpaa/base/qbman/qman.c        |  99 +++++++++++++-------------
 drivers/bus/dpaa/dpaa_bus.c               |  78 ++++++++++++++------
 drivers/bus/dpaa/include/fsl_qman.h       |  10 +++
 drivers/bus/dpaa/rte_bus_dpaa_version.map |   1 +
 drivers/bus/dpaa/rte_dpaa_bus.h           |   2 +
 drivers/mempool/dpaa/dpaa_mempool.c       |  33 +++++----
 drivers/mempool/dpaa/dpaa_mempool.h       |   4 +-
 drivers/net/dpaa/dpaa_ethdev.c            |  19 +++--
 drivers/net/dpaa/dpaa_rxtx.c              | 114 ++++++++++++++++++++++++------
 drivers/net/dpaa/dpaa_rxtx.h              |   9 ++-
 10 files changed, 246 insertions(+), 123 deletions(-)

-- 
1.9.1


* [dpdk-dev] [PATCH 1/7] bus/dpaa: check flag in qman multi enqueue
  2018-01-23 12:31 [dpdk-dev] [PATCH 0/7] dpaa: fixes and performance improvement changes Nipun Gupta
  2018-01-23 12:27 ` [dpdk-dev] [PATCH 0/7 v2] " Nipun Gupta
@ 2018-01-23 12:31 ` Nipun Gupta
  2018-01-23 12:31 ` [dpdk-dev] [PATCH 2/7] bus/dpaa: allocate qman portals in thread safe manner Nipun Gupta
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 18+ messages in thread
From: Nipun Gupta @ 2018-01-23 12:31 UTC
  To: thomas; +Cc: dev, hemant.agrawal, shreyansh.jain, Nipun Gupta, stable

A caller may or may not pass the flags in the qman enqueue multi API.
This patch adds a check on the flags pointer and only accesses the
flags if the caller passed them.

Fixes: 43797e7b4774 ("bus/dpaa: support event dequeue and consumption")
Cc: stable@dpdk.org

Signed-off-by: Nipun Gupta <nipun.gupta@nxp.com>
---
 drivers/bus/dpaa/base/qbman/qman.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/bus/dpaa/base/qbman/qman.c b/drivers/bus/dpaa/base/qbman/qman.c
index 609bc76..e7fdf03 100644
--- a/drivers/bus/dpaa/base/qbman/qman.c
+++ b/drivers/bus/dpaa/base/qbman/qman.c
@@ -2198,7 +2198,7 @@ int qman_enqueue_multi(struct qman_fq *fq,
 		eq->fd.addr = cpu_to_be40(fd->addr);
 		eq->fd.status = cpu_to_be32(fd->status);
 		eq->fd.opaque = cpu_to_be32(fd->opaque);
-		if (flags[i] & QMAN_ENQUEUE_FLAG_DCA) {
+		if (flags && (flags[i] & QMAN_ENQUEUE_FLAG_DCA)) {
 			eq->dca = QM_EQCR_DCA_ENABLE |
 				((flags[i] >> 8) & QM_EQCR_DCA_IDXMASK);
 		}
-- 
1.9.1


* [dpdk-dev] [PATCH 2/7] bus/dpaa: allocate qman portals in thread safe manner
  2018-01-23 12:31 [dpdk-dev] [PATCH 0/7] dpaa: fixes and performance improvement changes Nipun Gupta
  2018-01-23 12:27 ` [dpdk-dev] [PATCH 0/7 v2] " Nipun Gupta
  2018-01-23 12:31 ` [dpdk-dev] [PATCH 1/7] bus/dpaa: check flag in qman multi enqueue Nipun Gupta
@ 2018-01-23 12:31 ` Nipun Gupta
  2018-01-23 12:31 ` [dpdk-dev] [PATCH 3/7] mempool/dpaa: fix the phy to virt optimization Nipun Gupta
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 18+ messages in thread
From: Nipun Gupta @ 2018-01-23 12:31 UTC
  To: thomas; +Cc: dev, hemant.agrawal, shreyansh.jain, Nipun Gupta, stable

Fixes: 9d32ef0f5d61 ("bus/dpaa: support creating dynamic HW portal")
Cc: stable@dpdk.org

Signed-off-by: Nipun Gupta <nipun.gupta@nxp.com>
---
 drivers/bus/dpaa/base/qbman/qman.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/drivers/bus/dpaa/base/qbman/qman.c b/drivers/bus/dpaa/base/qbman/qman.c
index e7fdf03..4d8bdae 100644
--- a/drivers/bus/dpaa/base/qbman/qman.c
+++ b/drivers/bus/dpaa/base/qbman/qman.c
@@ -625,7 +625,7 @@ struct qman_portal *qman_create_portal(
 
 #define MAX_GLOBAL_PORTALS 8
 static struct qman_portal global_portals[MAX_GLOBAL_PORTALS];
-static int global_portals_used[MAX_GLOBAL_PORTALS];
+rte_atomic16_t global_portals_used[MAX_GLOBAL_PORTALS];
 
 static struct qman_portal *
 qman_alloc_global_portal(void)
@@ -633,10 +633,8 @@ struct qman_portal *qman_create_portal(
 	unsigned int i;
 
 	for (i = 0; i < MAX_GLOBAL_PORTALS; i++) {
-		if (global_portals_used[i] == 0) {
-			global_portals_used[i] = 1;
+		if (rte_atomic16_test_and_set(&global_portals_used[i]))
 			return &global_portals[i];
-		}
 	}
 	pr_err("No portal available (%x)\n", MAX_GLOBAL_PORTALS);
 
@@ -650,7 +648,7 @@ struct qman_portal *qman_create_portal(
 
 	for (i = 0; i < MAX_GLOBAL_PORTALS; i++) {
 		if (&global_portals[i] == portal) {
-			global_portals_used[i] = 0;
+			rte_atomic16_clear(&global_portals_used[i]);
 			return 0;
 		}
 	}
-- 
1.9.1


* [dpdk-dev] [PATCH 3/7] mempool/dpaa: fix the phy to virt optimization
  2018-01-23 12:31 [dpdk-dev] [PATCH 0/7] dpaa: fixes and performance improvement changes Nipun Gupta
                   ` (2 preceding siblings ...)
  2018-01-23 12:31 ` [dpdk-dev] [PATCH 2/7] bus/dpaa: allocate qman portals in thread safe manner Nipun Gupta
@ 2018-01-23 12:31 ` Nipun Gupta
  2018-01-23 12:31 ` [dpdk-dev] [PATCH 4/7] bus/dpaa: fix port order shuffling Nipun Gupta
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 18+ messages in thread
From: Nipun Gupta @ 2018-01-23 12:31 UTC
  To: thomas; +Cc: dev, hemant.agrawal, shreyansh.jain, stable

From: Hemant Agrawal <hemant.agrawal@nxp.com>

Fixes: 83a4f267f2e3 ("mempool/dpaa: optimize phy to virt conversion")
Cc: stable@dpdk.org

Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---
 drivers/mempool/dpaa/dpaa_mempool.c | 9 ++++-----
 drivers/mempool/dpaa/dpaa_mempool.h | 4 ++--
 2 files changed, 6 insertions(+), 7 deletions(-)

diff --git a/drivers/mempool/dpaa/dpaa_mempool.c b/drivers/mempool/dpaa/dpaa_mempool.c
index ddc4e47..fe22519 100644
--- a/drivers/mempool/dpaa/dpaa_mempool.c
+++ b/drivers/mempool/dpaa/dpaa_mempool.c
@@ -150,8 +150,8 @@
 		uint64_t phy = rte_mempool_virt2iova(obj_table[i]);
 
 		if (unlikely(!bp_info->ptov_off)) {
-			/* buffers are not from multiple memzones */
-			if (!(bp_info->flags & DPAA_MPOOL_MULTI_MEMZONE)) {
+			/* buffers are from single mem segment */
+			if (bp_info->flags & DPAA_MPOOL_SINGLE_SEGMENT) {
 				bp_info->ptov_off
 						= (uint64_t)obj_table[i] - phy;
 				rte_dpaa_bpid_info[bp_info->bpid].ptov_off
@@ -282,9 +282,8 @@
 			   len, total_elt_sz * mp->size);
 
 	/* Detect pool area has sufficient space for elements in this memzone */
-	if (len < total_elt_sz * mp->size)
-		/* Else, Memory will be allocated from multiple memzones */
-		bp_info->flags |= DPAA_MPOOL_MULTI_MEMZONE;
+	if (len >= total_elt_sz * mp->size)
+		bp_info->flags |= DPAA_MPOOL_SINGLE_SEGMENT;
 
 	return 0;
 }
diff --git a/drivers/mempool/dpaa/dpaa_mempool.h b/drivers/mempool/dpaa/dpaa_mempool.h
index 02aa513..9435dd2 100644
--- a/drivers/mempool/dpaa/dpaa_mempool.h
+++ b/drivers/mempool/dpaa/dpaa_mempool.h
@@ -28,8 +28,8 @@
 /* Maximum release/acquire from BMAN */
 #define DPAA_MBUF_MAX_ACQ_REL  8
 
-/* Buffers are allocated from multiple memzones i.e. non phys contiguous */
-#define DPAA_MPOOL_MULTI_MEMZONE  0x01
+/* Buffers are allocated from single mem segment i.e. phys contiguous */
+#define DPAA_MPOOL_SINGLE_SEGMENT  0x01
 
 struct dpaa_bp_info {
 	struct rte_mempool *mp;
-- 
1.9.1


* [dpdk-dev] [PATCH 4/7] bus/dpaa: fix port order shuffling
  2018-01-23 12:31 [dpdk-dev] [PATCH 0/7] dpaa: fixes and performance improvement changes Nipun Gupta
                   ` (3 preceding siblings ...)
  2018-01-23 12:31 ` [dpdk-dev] [PATCH 3/7] mempool/dpaa: fix the phy to virt optimization Nipun Gupta
@ 2018-01-23 12:31 ` Nipun Gupta
  2018-01-23 12:31 ` [dpdk-dev] [PATCH 5/7] net/dpaa: use phy to virt optimizations Nipun Gupta
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 18+ messages in thread
From: Nipun Gupta @ 2018-01-23 12:31 UTC
  To: thomas; +Cc: dev, hemant.agrawal, shreyansh.jain, stable

From: Shreyansh Jain <shreyansh.jain@nxp.com>

While scanning for devices, the order in which devices appear is
different from the MAC sequence.
This can cause confusion for users and automated scripts.
This patch creates a sorted list of devices.

Fixes: 919eeaccb2ba ("bus/dpaa: introduce NXP DPAA bus driver skeleton")
Cc: stable@dpdk.org

Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
---
 drivers/bus/dpaa/dpaa_bus.c | 52 +++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 50 insertions(+), 2 deletions(-)

diff --git a/drivers/bus/dpaa/dpaa_bus.c b/drivers/bus/dpaa/dpaa_bus.c
index ba33566..ef2df48 100644
--- a/drivers/bus/dpaa/dpaa_bus.c
+++ b/drivers/bus/dpaa/dpaa_bus.c
@@ -57,10 +57,58 @@
 RTE_DEFINE_PER_LCORE(bool, _dpaa_io);
 RTE_DEFINE_PER_LCORE(struct dpaa_portal_dqrr, held_bufs);
 
+static int
+compare_dpaa_devices(struct rte_dpaa_device *dev1,
+		     struct rte_dpaa_device *dev2)
+{
+	int comp = 0;
+
+	/* Segragating ETH from SEC devices */
+	if (dev1->device_type > dev2->device_type)
+		comp = 1;
+	else if (dev1->device_type < dev2->device_type)
+		comp = -1;
+	else
+		comp = 0;
+
+	if ((comp != 0) || (dev1->device_type != FSL_DPAA_ETH))
+		return comp;
+
+	if (dev1->id.fman_id > dev2->id.fman_id) {
+		comp = 1;
+	} else if (dev1->id.fman_id < dev2->id.fman_id) {
+		comp = -1;
+	} else {
+		/* FMAN ids match, check for mac_id */
+		if (dev1->id.mac_id > dev2->id.mac_id)
+			comp = 1;
+		else if (dev1->id.mac_id < dev2->id.mac_id)
+			comp = -1;
+		else
+			comp = 0;
+	}
+
+	return comp;
+}
+
 static inline void
-dpaa_add_to_device_list(struct rte_dpaa_device *dev)
+dpaa_add_to_device_list(struct rte_dpaa_device *newdev)
 {
-	TAILQ_INSERT_TAIL(&rte_dpaa_bus.device_list, dev, next);
+	int comp, inserted = 0;
+	struct rte_dpaa_device *dev = NULL;
+	struct rte_dpaa_device *tdev = NULL;
+
+	TAILQ_FOREACH_SAFE(dev, &rte_dpaa_bus.device_list, next, tdev) {
+		comp = compare_dpaa_devices(newdev, dev);
+		if (comp < 0) {
+			TAILQ_INSERT_BEFORE(dev, newdev, next);
+			inserted = 1;
+			break;
+		}
+	}
+
+	if (!inserted)
+		TAILQ_INSERT_TAIL(&rte_dpaa_bus.device_list, newdev, next);
 }
 
 static inline void
-- 
1.9.1


* [dpdk-dev] [PATCH 5/7] net/dpaa: use phy to virt optimizations
  2018-01-23 12:31 [dpdk-dev] [PATCH 0/7] dpaa: fixes and performance improvement changes Nipun Gupta
                   ` (4 preceding siblings ...)
  2018-01-23 12:31 ` [dpdk-dev] [PATCH 4/7] bus/dpaa: fix port order shuffling Nipun Gupta
@ 2018-01-23 12:31 ` Nipun Gupta
  2018-01-23 12:31 ` [dpdk-dev] [PATCH 6/7] bus/dpaa: check portal presence in the caller API Nipun Gupta
  2018-01-23 12:31 ` [dpdk-dev] [PATCH 7/7] net/dpaa: further push mode optimizations Nipun Gupta
  7 siblings, 0 replies; 18+ messages in thread
From: Nipun Gupta @ 2018-01-23 12:31 UTC
  To: thomas; +Cc: dev, hemant.agrawal, shreyansh.jain

From: Hemant Agrawal <hemant.agrawal@nxp.com>

Use the optimized routine for phy to virt conversion
when the mempool is allocated from physically contiguous memory.

Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---
 drivers/net/dpaa/dpaa_rxtx.c | 14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/drivers/net/dpaa/dpaa_rxtx.c b/drivers/net/dpaa/dpaa_rxtx.c
index ab23352..b889d03 100644
--- a/drivers/net/dpaa/dpaa_rxtx.c
+++ b/drivers/net/dpaa/dpaa_rxtx.c
@@ -309,7 +309,7 @@ struct rte_mbuf *
 
 	DPAA_DP_LOG(DEBUG, "Received an SG frame");
 
-	vaddr = rte_dpaa_mem_ptov(qm_fd_addr(fd));
+	vaddr = DPAA_MEMPOOL_PTOV(bp_info, qm_fd_addr(fd));
 	if (!vaddr) {
 		DPAA_PMD_ERR("unable to convert physical address");
 		return NULL;
@@ -318,7 +318,7 @@ struct rte_mbuf *
 	sg_temp = &sgt[i++];
 	hw_sg_to_cpu(sg_temp);
 	temp = (struct rte_mbuf *)((char *)vaddr - bp_info->meta_data_size);
-	sg_vaddr = rte_dpaa_mem_ptov(qm_sg_entry_get64(sg_temp));
+	sg_vaddr = DPAA_MEMPOOL_PTOV(bp_info, qm_sg_entry_get64(sg_temp));
 
 	first_seg = (struct rte_mbuf *)((char *)sg_vaddr -
 						bp_info->meta_data_size);
@@ -334,7 +334,8 @@ struct rte_mbuf *
 	while (i < DPAA_SGT_MAX_ENTRIES) {
 		sg_temp = &sgt[i++];
 		hw_sg_to_cpu(sg_temp);
-		sg_vaddr = rte_dpaa_mem_ptov(qm_sg_entry_get64(sg_temp));
+		sg_vaddr = DPAA_MEMPOOL_PTOV(bp_info,
+					     qm_sg_entry_get64(sg_temp));
 		cur_seg = (struct rte_mbuf *)((char *)sg_vaddr -
 						      bp_info->meta_data_size);
 		cur_seg->data_off = sg_temp->offset;
@@ -361,7 +362,7 @@ struct rte_mbuf *
 {
 	struct rte_mbuf *mbuf;
 	struct dpaa_bp_info *bp_info = DPAA_BPID_TO_POOL_INFO(fd->bpid);
-	void *ptr = rte_dpaa_mem_ptov(qm_fd_addr(fd));
+	void *ptr;
 	uint8_t format =
 		(fd->opaque & DPAA_FD_FORMAT_MASK) >> DPAA_FD_FORMAT_SHIFT;
 	uint16_t offset;
@@ -372,6 +373,8 @@ struct rte_mbuf *
 	if (unlikely(format == qm_fd_sg))
 		return dpaa_eth_sg_to_mbuf(fd, ifid);
 
+	ptr = DPAA_MEMPOOL_PTOV(bp_info, qm_fd_addr(fd));
+
 	rte_prefetch0((void *)((uint8_t *)ptr + DEFAULT_RX_ICEOF));
 
 	offset = (fd->opaque & DPAA_FD_OFFSET_MASK) >> DPAA_FD_OFFSET_SHIFT;
@@ -537,7 +540,8 @@ static void *dpaa_get_pktbuf(struct dpaa_bp_info *bp_info)
 	DPAA_DP_LOG(DEBUG, "got buffer 0x%lx from pool %d",
 		    (uint64_t)bufs.addr, bufs.bpid);
 
-	buf = (uint64_t)rte_dpaa_mem_ptov(bufs.addr) - bp_info->meta_data_size;
+	buf = (uint64_t)DPAA_MEMPOOL_PTOV(bp_info, bufs.addr)
+				- bp_info->meta_data_size;
 	if (!buf)
 		goto out;
 
-- 
1.9.1


* [dpdk-dev] [PATCH 6/7] bus/dpaa: check portal presence in the caller API
  2018-01-23 12:31 [dpdk-dev] [PATCH 0/7] dpaa: fixes and performance improvement changes Nipun Gupta
                   ` (5 preceding siblings ...)
  2018-01-23 12:31 ` [dpdk-dev] [PATCH 5/7] net/dpaa: use phy to virt optimizations Nipun Gupta
@ 2018-01-23 12:31 ` Nipun Gupta
  2018-01-23 12:31 ` [dpdk-dev] [PATCH 7/7] net/dpaa: further push mode optimizations Nipun Gupta
  7 siblings, 0 replies; 18+ messages in thread
From: Nipun Gupta @ 2018-01-23 12:31 UTC
  To: thomas; +Cc: dev, hemant.agrawal, shreyansh.jain, Nipun Gupta

In the I/O path we were calling rte_dpaa_portal_init, which
internally checks if a portal is affined to the core.
But this led to a call into that non-static API on every I/O
operation.

Instead, check the portal affinity in the caller itself for
performance reasons.

Signed-off-by: Nipun Gupta <nipun.gupta@nxp.com>
---
 drivers/bus/dpaa/dpaa_bus.c               | 26 ++++++--------------------
 drivers/bus/dpaa/rte_bus_dpaa_version.map |  1 +
 drivers/bus/dpaa/rte_dpaa_bus.h           |  2 ++
 drivers/mempool/dpaa/dpaa_mempool.c       | 24 ++++++++++++++----------
 drivers/net/dpaa/dpaa_ethdev.c            | 10 ++++++----
 drivers/net/dpaa/dpaa_rxtx.c              | 20 ++++++++++++--------
 6 files changed, 41 insertions(+), 42 deletions(-)

diff --git a/drivers/bus/dpaa/dpaa_bus.c b/drivers/bus/dpaa/dpaa_bus.c
index ef2df48..5039067 100644
--- a/drivers/bus/dpaa/dpaa_bus.c
+++ b/drivers/bus/dpaa/dpaa_bus.c
@@ -54,7 +54,7 @@
 
 unsigned int dpaa_svr_family;
 
-RTE_DEFINE_PER_LCORE(bool, _dpaa_io);
+RTE_DEFINE_PER_LCORE(bool, dpaa_io);
 RTE_DEFINE_PER_LCORE(struct dpaa_portal_dqrr, held_bufs);
 
 static int
@@ -230,9 +230,7 @@
 	}
 }
 
-/** XXX move this function into a separate file */
-static int
-_dpaa_portal_init(void *arg)
+int rte_dpaa_portal_init(void *arg)
 {
 	cpu_set_t cpuset;
 	pthread_t id;
@@ -303,25 +301,13 @@
 		return ret;
 	}
 
-	RTE_PER_LCORE(_dpaa_io) = true;
+	RTE_PER_LCORE(dpaa_io) = true;
 
 	DPAA_BUS_LOG(DEBUG, "QMAN thread initialized");
 
 	return 0;
 }
 
-/*
- * rte_dpaa_portal_init - Wrapper over _dpaa_portal_init with thread level check
- * XXX Complete this
- */
-int rte_dpaa_portal_init(void *arg)
-{
-	if (unlikely(!RTE_PER_LCORE(_dpaa_io)))
-		return _dpaa_portal_init(arg);
-
-	return 0;
-}
-
 int
 rte_dpaa_portal_fq_init(void *arg, struct qman_fq *fq)
 {
@@ -329,8 +315,8 @@ int rte_dpaa_portal_init(void *arg)
 	u32 sdqcr;
 	struct qman_portal *qp;
 
-	if (unlikely(!RTE_PER_LCORE(_dpaa_io)))
-		_dpaa_portal_init(arg);
+	if (unlikely(!RTE_PER_LCORE(dpaa_io)))
+		rte_dpaa_portal_init(arg);
 
 	/* Initialise qman specific portals */
 	qp = fsl_qman_portal_create();
@@ -368,7 +354,7 @@ int rte_dpaa_portal_fq_close(struct qman_fq *fq)
 	rte_free(dpaa_io_portal);
 	dpaa_io_portal = NULL;
 
-	RTE_PER_LCORE(_dpaa_io) = false;
+	RTE_PER_LCORE(dpaa_io) = false;
 }
 
 #define DPAA_DEV_PATH1 "/sys/devices/platform/soc/soc:fsl,dpaa"
diff --git a/drivers/bus/dpaa/rte_bus_dpaa_version.map b/drivers/bus/dpaa/rte_bus_dpaa_version.map
index 925cf91..8d90285 100644
--- a/drivers/bus/dpaa/rte_bus_dpaa_version.map
+++ b/drivers/bus/dpaa/rte_bus_dpaa_version.map
@@ -70,6 +70,7 @@ DPDK_18.02 {
 
 	dpaa_logtype_eventdev;
 	dpaa_svr_family;
+	per_lcore_dpaa_io;
 	per_lcore_held_bufs;
 	qm_channel_pool1;
 	qman_alloc_cgrid_range;
diff --git a/drivers/bus/dpaa/rte_dpaa_bus.h b/drivers/bus/dpaa/rte_dpaa_bus.h
index 6fa0c3d..0352abd 100644
--- a/drivers/bus/dpaa/rte_dpaa_bus.h
+++ b/drivers/bus/dpaa/rte_dpaa_bus.h
@@ -31,6 +31,8 @@
 
 extern unsigned int dpaa_svr_family;
 
+extern RTE_DEFINE_PER_LCORE(bool, dpaa_io);
+
 struct rte_dpaa_device;
 struct rte_dpaa_driver;
 
diff --git a/drivers/mempool/dpaa/dpaa_mempool.c b/drivers/mempool/dpaa/dpaa_mempool.c
index fe22519..eb5b8f9 100644
--- a/drivers/mempool/dpaa/dpaa_mempool.c
+++ b/drivers/mempool/dpaa/dpaa_mempool.c
@@ -139,11 +139,13 @@
 	DPAA_MEMPOOL_DPDEBUG("Request to free %d buffers in bpid = %d",
 			     n, bp_info->bpid);
 
-	ret = rte_dpaa_portal_init((void *)0);
-	if (ret) {
-		DPAA_MEMPOOL_ERR("rte_dpaa_portal_init failed with ret: %d",
-				 ret);
-		return 0;
+	if (unlikely(!RTE_PER_LCORE(dpaa_io))) {
+		ret = rte_dpaa_portal_init((void *)0);
+		if (ret) {
+			DPAA_MEMPOOL_ERR("rte_dpaa_portal_init failed with ret: %d",
+					 ret);
+			return 0;
+		}
 	}
 
 	while (i < n) {
@@ -193,11 +195,13 @@
 		return -1;
 	}
 
-	ret = rte_dpaa_portal_init((void *)0);
-	if (ret) {
-		DPAA_MEMPOOL_ERR("rte_dpaa_portal_init failed with ret: %d",
-				 ret);
-		return -1;
+	if (unlikely(!RTE_PER_LCORE(dpaa_io))) {
+		ret = rte_dpaa_portal_init((void *)0);
+		if (ret) {
+			DPAA_MEMPOOL_ERR("rte_dpaa_portal_init failed with ret: %d",
+					 ret);
+			return -1;
+		}
 	}
 
 	while (n < count) {
diff --git a/drivers/net/dpaa/dpaa_ethdev.c b/drivers/net/dpaa/dpaa_ethdev.c
index bf5eb96..b60ed3b 100644
--- a/drivers/net/dpaa/dpaa_ethdev.c
+++ b/drivers/net/dpaa/dpaa_ethdev.c
@@ -1331,10 +1331,12 @@ static int dpaa_debug_queue_init(struct qman_fq *fq, uint32_t fqid)
 		is_global_init = 1;
 	}
 
-	ret = rte_dpaa_portal_init((void *)1);
-	if (ret) {
-		DPAA_PMD_ERR("Unable to initialize portal");
-		return ret;
+	if (unlikely(!RTE_PER_LCORE(dpaa_io))) {
+		ret = rte_dpaa_portal_init((void *)1);
+		if (ret) {
+			DPAA_PMD_ERR("Unable to initialize portal");
+			return ret;
+		}
 	}
 
 	eth_dev = rte_eth_dev_allocate(dpaa_dev->name);
diff --git a/drivers/net/dpaa/dpaa_rxtx.c b/drivers/net/dpaa/dpaa_rxtx.c
index b889d03..f969ccf 100644
--- a/drivers/net/dpaa/dpaa_rxtx.c
+++ b/drivers/net/dpaa/dpaa_rxtx.c
@@ -503,10 +503,12 @@ uint16_t dpaa_eth_queue_rx(void *q,
 	if (likely(fq->is_static))
 		return dpaa_eth_queue_portal_rx(fq, bufs, nb_bufs);
 
-	ret = rte_dpaa_portal_init((void *)0);
-	if (ret) {
-		DPAA_PMD_ERR("Failure in affining portal");
-		return 0;
+	if (unlikely(!RTE_PER_LCORE(dpaa_io))) {
+		ret = rte_dpaa_portal_init((void *)0);
+		if (ret) {
+			DPAA_PMD_ERR("Failure in affining portal");
+			return 0;
+		}
 	}
 
 	ret = qman_set_vdq(fq, (nb_bufs > DPAA_MAX_DEQUEUE_NUM_FRAMES) ?
@@ -777,10 +779,12 @@ static struct rte_mbuf *dpaa_get_dmable_mbuf(struct rte_mbuf *mbuf,
 	int ret;
 	uint32_t seqn, index, flags[DPAA_TX_BURST_SIZE] = {0};
 
-	ret = rte_dpaa_portal_init((void *)0);
-	if (ret) {
-		DPAA_PMD_ERR("Failure in affining portal");
-		return 0;
+	if (unlikely(!RTE_PER_LCORE(dpaa_io))) {
+		ret = rte_dpaa_portal_init((void *)0);
+		if (ret) {
+			DPAA_PMD_ERR("Failure in affining portal");
+			return 0;
+		}
 	}
 
 	DPAA_DP_LOG(DEBUG, "Transmitting %d buffers on queue: %p", nb_bufs, q);
-- 
1.9.1


* [dpdk-dev] [PATCH 7/7] net/dpaa: further push mode optimizations
  2018-01-23 12:31 [dpdk-dev] [PATCH 0/7] dpaa: fixes and performance improvement changes Nipun Gupta
                   ` (6 preceding siblings ...)
  2018-01-23 12:31 ` [dpdk-dev] [PATCH 6/7] bus/dpaa: check portal presence in the caller API Nipun Gupta
@ 2018-01-23 12:31 ` Nipun Gupta
  7 siblings, 0 replies; 18+ messages in thread
From: Nipun Gupta @ 2018-01-23 12:31 UTC
  To: thomas; +Cc: dev, hemant.agrawal, shreyansh.jain, Nipun Gupta

This patch supports batch processing of multiple packets
on the Rx side.

Signed-off-by: Nipun Gupta <nipun.gupta@nxp.com>
Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---
 drivers/bus/dpaa/base/qbman/qman.c  | 89 ++++++++++++++++++-------------------
 drivers/bus/dpaa/include/fsl_qman.h | 10 +++++
 drivers/net/dpaa/dpaa_ethdev.c      |  9 +++-
 drivers/net/dpaa/dpaa_rxtx.c        | 80 +++++++++++++++++++++++++++++----
 drivers/net/dpaa/dpaa_rxtx.h        |  9 ++--
 5 files changed, 136 insertions(+), 61 deletions(-)

diff --git a/drivers/bus/dpaa/base/qbman/qman.c b/drivers/bus/dpaa/base/qbman/qman.c
index 4d8bdae..2b97671 100644
--- a/drivers/bus/dpaa/base/qbman/qman.c
+++ b/drivers/bus/dpaa/base/qbman/qman.c
@@ -1055,64 +1055,63 @@ unsigned int qman_portal_poll_rx(unsigned int poll_limit,
 				 void **bufs,
 				 struct qman_portal *p)
 {
-	const struct qm_dqrr_entry *dq;
-	struct qman_fq *fq;
-	enum qman_cb_dqrr_result res;
-	unsigned int limit = 0;
-#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
-	struct qm_dqrr_entry *shadow;
-#endif
-	unsigned int rx_number = 0;
+	struct qm_portal *portal = &p->p;
+	register struct qm_dqrr *dqrr = &portal->dqrr;
+	struct qm_dqrr_entry *dq[QM_DQRR_SIZE], *shadow[QM_DQRR_SIZE];
+	struct qman_fq *fq[QM_DQRR_SIZE];
+	unsigned int limit = 0, rx_number = 0;
+	uint32_t consume = 0;
 
 	do {
 		qm_dqrr_pvb_update(&p->p);
-		dq = qm_dqrr_current(&p->p);
-		if (unlikely(!dq))
+		if (!dqrr->fill)
 			break;
+
+		dq[rx_number] = dqrr->cursor;
+		dqrr->cursor = DQRR_CARRYCLEAR(dqrr->cursor + 1);
+		/* Prefetch the next DQRR entry */
+		rte_prefetch0(dqrr->cursor);
+
 #if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
-	/* If running on an LE system the fields of the
-	 * dequeue entry must be swapper.  Because the
-	 * QMan HW will ignore writes the DQRR entry is
-	 * copied and the index stored within the copy
-	 */
-		shadow = &p->shadow_dqrr[DQRR_PTR2IDX(dq)];
-		*shadow = *dq;
-		dq = shadow;
-		shadow->fqid = be32_to_cpu(shadow->fqid);
-		shadow->contextB = be32_to_cpu(shadow->contextB);
-		shadow->seqnum = be16_to_cpu(shadow->seqnum);
-		hw_fd_to_cpu(&shadow->fd);
+		/* If running on an LE system the fields of the
+		 * dequeue entry must be swapper.  Because the
+		 * QMan HW will ignore writes the DQRR entry is
+		 * copied and the index stored within the copy
+		 */
+		shadow[rx_number] =
+			&p->shadow_dqrr[DQRR_PTR2IDX(dq[rx_number])];
+		shadow[rx_number]->fd.opaque_addr =
+			dq[rx_number]->fd.opaque_addr;
+		shadow[rx_number]->fd.addr =
+			be40_to_cpu(dq[rx_number]->fd.addr);
+		shadow[rx_number]->fd.opaque =
+			be32_to_cpu(dq[rx_number]->fd.opaque);
+#else
+		shadow = dq;
 #endif
 
 		/* SDQCR: context_b points to the FQ */
 #ifdef CONFIG_FSL_QMAN_FQ_LOOKUP
-		fq = get_fq_table_entry(dq->contextB);
+		fq[rx_number] = qman_fq_lookup_table[be32_to_cpu(
+						dq[rx_number]->contextB)];
 #else
-		fq = (void *)(uintptr_t)dq->contextB;
+		fq[rx_number] = (void *)(uintptr_t)be32_to_cpu(dq->contextB);
 #endif
-		/* Now let the callback do its stuff */
-		res = fq->cb.dqrr_dpdk_cb(NULL, p, fq, dq, &bufs[rx_number]);
+		fq[rx_number]->cb.dqrr_prepare(shadow[rx_number],
+						 &bufs[rx_number]);
+
+		consume |= (1 << (31 - DQRR_PTR2IDX(shadow[rx_number])));
 		rx_number++;
-		/* Interpret 'dq' from a driver perspective. */
-		/*
-		 * Parking isn't possible unless HELDACTIVE was set. NB,
-		 * FORCEELIGIBLE implies HELDACTIVE, so we only need to
-		 * check for HELDACTIVE to cover both.
-		 */
-		DPAA_ASSERT((dq->stat & QM_DQRR_STAT_FQ_HELDACTIVE) ||
-			    (res != qman_cb_dqrr_park));
-		qm_dqrr_cdc_consume_1ptr(&p->p, dq, res == qman_cb_dqrr_park);
-		/* Move forward */
-		qm_dqrr_next(&p->p);
-		/*
-		 * Entry processed and consumed, increment our counter.  The
-		 * callback can request that we exit after consuming the
-		 * entry, and we also exit if we reach our processing limit,
-		 * so loop back only if neither of these conditions is met.
-		 */
-	} while (likely(++limit < poll_limit));
+		--dqrr->fill;
+	} while (++limit < poll_limit);
 
-	return limit;
+	if (rx_number)
+		fq[0]->cb.dqrr_dpdk_pull_cb(fq, shadow, bufs, rx_number);
+
+	/* Consume all the DQRR entries together */
+	qm_out(DQRR_DCAP, (1 << 8) | consume);
+
+	return rx_number;
 }
 
 u32 qman_portal_dequeue(struct rte_event ev[], unsigned int poll_limit,
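
The batched-consume bookkeeping above can be modelled in isolation; the
following is a simplified sketch, assuming a 16-entry DQRR ring and the
DCAP encoding used in the hunk above, with the register write mocked:

	#include <stdint.h>
	#include <stdio.h>

	#define QM_DQRR_SIZE 16			/* the DQRR ring has 16 entries */
	#define DCAP_BITMASK_MODE (1u << 8)	/* bit 8 selects consume-by-bitmask */

	/* mocked register write; the driver uses qm_out(DQRR_DCAP, ...) */
	static void dqrr_dcap_write(uint32_t val)
	{
		printf("DQRR_DCAP <- 0x%08x\n", val);
	}

	int main(void)
	{
		uint32_t consume = 0;
		unsigned int idx;

		/* each dequeued entry sets one bit keyed by its ring index:
		 * index 0 maps to bit 31, index 15 to bit 16
		 */
		for (idx = 0; idx < 4; idx++)	/* pretend 4 entries were pulled */
			consume |= 1u << (31 - idx);

		/* one write retires the whole burst instead of one write
		 * per entry, which is the main saving of this rework
		 */
		dqrr_dcap_write(DCAP_BITMASK_MODE | consume);
		return 0;
	}
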
diff --git a/drivers/bus/dpaa/include/fsl_qman.h b/drivers/bus/dpaa/include/fsl_qman.h
index 99e46e1..e9793f3 100644
--- a/drivers/bus/dpaa/include/fsl_qman.h
+++ b/drivers/bus/dpaa/include/fsl_qman.h
@@ -1131,6 +1131,14 @@ typedef enum qman_cb_dqrr_result (*qman_dpdk_cb_dqrr)(void *event,
 					const struct qm_dqrr_entry *dqrr,
 					void **bd);
 
+/* This callback type is used when handling buffers in dpdk pull mode */
+typedef void (*qman_dpdk_pull_cb_dqrr)(struct qman_fq **fq,
+					struct qm_dqrr_entry **dqrr,
+					void **bufs,
+					int num_bufs);
+
+typedef void (*qman_dpdk_cb_prepare)(struct qm_dqrr_entry *dq, void **bufs);
+
 /*
  * This callback type is used when handling ERNs, FQRNs and FQRLs via MR. They
  * are always consumed after the callback returns.
@@ -1191,8 +1199,10 @@ enum qman_fq_state {
 struct qman_fq_cb {
 	union { /* for dequeued frames */
 		qman_dpdk_cb_dqrr dqrr_dpdk_cb;
+		qman_dpdk_pull_cb_dqrr dqrr_dpdk_pull_cb;
 		qman_cb_dqrr dqrr;
 	};
+	qman_dpdk_cb_prepare dqrr_prepare;
 	qman_cb_mr ern;		/* for s/w ERNs */
 	qman_cb_mr fqs;		/* frame-queue state changes*/
 };
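
For a driver-side view, a minimal sketch of wiring up the two new
callbacks (the callback bodies are hypothetical placeholders; only the
typedefs and the qman_fq_cb layout come from this header):

	#include <fsl_qman.h>

	static void my_prepare(struct qm_dqrr_entry *dq, void **bufs)
	{
		/* translate one DQRR entry into a buffer pointer and
		 * prefetch whatever the burst callback will touch
		 */
		(void)dq; (void)bufs;
	}

	static void my_pull_cb(struct qman_fq **fq, struct qm_dqrr_entry **dqrr,
			       void **bufs, int num_bufs)
	{
		/* post-process the whole burst in a single pass */
		(void)fq; (void)dqrr; (void)bufs; (void)num_bufs;
	}

	static void setup_rx_fq(struct qman_fq *rxq)
	{
		rxq->cb.dqrr_dpdk_pull_cb = my_pull_cb;	/* union member */
		rxq->cb.dqrr_prepare = my_prepare;
	}
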
diff --git a/drivers/net/dpaa/dpaa_ethdev.c b/drivers/net/dpaa/dpaa_ethdev.c
index b60ed3b..97679eb 100644
--- a/drivers/net/dpaa/dpaa_ethdev.c
+++ b/drivers/net/dpaa/dpaa_ethdev.c
@@ -503,7 +503,11 @@ int dpaa_eth_rx_queue_setup(struct rte_eth_dev *dev, uint16_t queue_idx,
 				   QM_FQCTRL_CTXASTASHING |
 				   QM_FQCTRL_PREFERINCACHE;
 		opts.fqd.context_a.stashing.exclusive = 0;
-		opts.fqd.context_a.stashing.annotation_cl =
+		/* In a multicore scenario stashing becomes a bottleneck on LS1046.
+		 * So do not enable stashing in this case
+		 */
+		if (dpaa_svr_family != SVR_LS1046A_FAMILY)
+			opts.fqd.context_a.stashing.annotation_cl =
 						DPAA_IF_RX_ANNOTATION_STASH;
 		opts.fqd.context_a.stashing.data_cl = DPAA_IF_RX_DATA_STASH;
 		opts.fqd.context_a.stashing.context_cl =
@@ -526,7 +530,8 @@ int dpaa_eth_rx_queue_setup(struct rte_eth_dev *dev, uint16_t queue_idx,
 		if (ret)
 			DPAA_PMD_ERR("Channel/Queue association failed. fqid %d"
 				     " ret: %d", rxq->fqid, ret);
-		rxq->cb.dqrr_dpdk_cb = dpaa_rx_cb;
+		rxq->cb.dqrr_dpdk_pull_cb = dpaa_rx_cb;
+		rxq->cb.dqrr_prepare = dpaa_rx_cb_prepare;
 		rxq->is_static = true;
 	}
 	dev->data->rx_queues[queue_idx] = rxq;
diff --git a/drivers/net/dpaa/dpaa_rxtx.c b/drivers/net/dpaa/dpaa_rxtx.c
index f969ccf..89e7918 100644
--- a/drivers/net/dpaa/dpaa_rxtx.c
+++ b/drivers/net/dpaa/dpaa_rxtx.c
@@ -399,17 +399,79 @@ struct rte_mbuf *
 	return mbuf;
 }
 
-enum qman_cb_dqrr_result dpaa_rx_cb(void *event __always_unused,
-				    struct qman_portal *qm __always_unused,
-				    struct qman_fq *fq,
-				    const struct qm_dqrr_entry *dqrr,
-				    void **bufs)
+void
+dpaa_rx_cb(struct qman_fq **fq, struct qm_dqrr_entry **dqrr,
+	   void **bufs, int num_bufs)
 {
-	const struct qm_fd *fd = &dqrr->fd;
+	struct rte_mbuf *mbuf;
+	struct dpaa_bp_info *bp_info;
+	const struct qm_fd *fd;
+	void *ptr;
+	struct dpaa_if *dpaa_intf;
+	uint16_t offset, i;
+	uint32_t length;
+	uint8_t format;
+
+	if (dpaa_svr_family != SVR_LS1046A_FAMILY) {
+		bp_info = DPAA_BPID_TO_POOL_INFO(dqrr[0]->fd.bpid);
+		ptr = rte_dpaa_mem_ptov(qm_fd_addr(&dqrr[0]->fd));
+		rte_prefetch0((void *)((uint8_t *)ptr + DEFAULT_RX_ICEOF));
+		bufs[0] = (struct rte_mbuf *)((char *)ptr -
+				bp_info->meta_data_size);
+	}
 
-	*bufs = dpaa_eth_fd_to_mbuf(fd,
-			((struct dpaa_if *)fq->dpaa_intf)->ifid);
-	return qman_cb_dqrr_consume;
+	for (i = 0; i < num_bufs; i++) {
+		if (dpaa_svr_family != SVR_LS1046A_FAMILY && i < num_bufs - 1) {
+			bp_info = DPAA_BPID_TO_POOL_INFO(dqrr[i + 1]->fd.bpid);
+			ptr = rte_dpaa_mem_ptov(qm_fd_addr(&dqrr[i + 1]->fd));
+			rte_prefetch0((void *)((uint8_t *)ptr +
+					DEFAULT_RX_ICEOF));
+			bufs[i + 1] = (struct rte_mbuf *)((char *)ptr -
+					bp_info->meta_data_size);
+		}
+
+		fd = &dqrr[i]->fd;
+		dpaa_intf = fq[i]->dpaa_intf;
+
+		format = (fd->opaque & DPAA_FD_FORMAT_MASK) >>
+				DPAA_FD_FORMAT_SHIFT;
+		if (unlikely(format == qm_fd_sg)) {
+			bufs[i] = dpaa_eth_sg_to_mbuf(fd, dpaa_intf->ifid);
+			continue;
+		}
+
+		offset = (fd->opaque & DPAA_FD_OFFSET_MASK) >>
+				DPAA_FD_OFFSET_SHIFT;
+		length = fd->opaque & DPAA_FD_LENGTH_MASK;
+
+		mbuf = bufs[i];
+		mbuf->data_off = offset;
+		mbuf->data_len = length;
+		mbuf->pkt_len = length;
+		mbuf->port = dpaa_intf->ifid;
+
+		mbuf->nb_segs = 1;
+		mbuf->ol_flags = 0;
+		mbuf->next = NULL;
+		rte_mbuf_refcnt_set(mbuf, 1);
+		dpaa_eth_packet_info(mbuf, (uint64_t)mbuf->buf_addr);
+	}
+}
+
+void dpaa_rx_cb_prepare(struct qm_dqrr_entry *dq, void **bufs)
+{
+	struct dpaa_bp_info *bp_info = DPAA_BPID_TO_POOL_INFO(dq->fd.bpid);
+	void *ptr = rte_dpaa_mem_ptov(qm_fd_addr(&(dq->fd)));
+
+	/* On LS1046, annotation stashing is disabled because the L2 cache
+	 * becomes the bottleneck in the multicore scenario on this platform.
+	 * So we prefetch the annotation beforehand, so that it is available
+	 * in cache when accessed.
+	 */
+	if (dpaa_svr_family == SVR_LS1046A_FAMILY)
+		rte_prefetch0((void *)((uint8_t *)ptr + DEFAULT_RX_ICEOF));
+
+	*bufs = (struct rte_mbuf *)((char *)ptr - bp_info->meta_data_size);
 }
 
 static uint16_t
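
The prepare/parse split above is a software-pipelining pattern: while
packet i is parsed, packet i+1's annotation is prefetched so the parse
stage runs out of warm cache. A stripped-down sketch of the same idea
(the array layout and the parse stub are hypothetical):

	#include <rte_prefetch.h>

	/* hypothetical per-packet work; stands in for the parsing done
	 * in dpaa_eth_packet_info()
	 */
	static void parse_one(void *annotation)
	{
		(void)annotation;
	}

	static void parse_burst(void **annotations, int n)
	{
		int i;

		for (i = 0; i < n; i++) {
			if (i + 1 < n)
				rte_prefetch0(annotations[i + 1]); /* warm next */
			parse_one(annotations[i]);	/* hits warm cache */
		}
	}
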
diff --git a/drivers/net/dpaa/dpaa_rxtx.h b/drivers/net/dpaa/dpaa_rxtx.h
index 29d8f95..d3e6351 100644
--- a/drivers/net/dpaa/dpaa_rxtx.h
+++ b/drivers/net/dpaa/dpaa_rxtx.h
@@ -268,9 +268,8 @@ int dpaa_eth_mbuf_to_sg_fd(struct rte_mbuf *mbuf,
 			   struct qm_fd *fd,
 			   uint32_t bpid);
 
-enum qman_cb_dqrr_result dpaa_rx_cb(void *event,
-				    struct qman_portal *qm,
-				    struct qman_fq *fq,
-				    const struct qm_dqrr_entry *dqrr,
-				    void **bd);
+void dpaa_rx_cb(struct qman_fq **fq,
+		struct qm_dqrr_entry **dqrr, void **bufs, int num_bufs);
+
+void dpaa_rx_cb_prepare(struct qm_dqrr_entry *dq, void **bufs);
 #endif
-- 
1.9.1

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [dpdk-dev] [PATCH 1/7 v2] bus/dpaa: check flag in qman multi enqueue
  2018-01-23 12:27   ` [dpdk-dev] [PATCH 1/7 v2] bus/dpaa: check flag in qman multi enqueue Nipun Gupta
@ 2018-01-24  5:24     ` Hemant Agrawal
  2018-01-31 12:45       ` Thomas Monjalon
  0 siblings, 1 reply; 18+ messages in thread
From: Hemant Agrawal @ 2018-01-24  5:24 UTC (permalink / raw)
  To: Nipun Gupta, thomas; +Cc: dev, Shreyansh Jain, stable



> -----Original Message-----
> From: Nipun Gupta [mailto:nipun.gupta@nxp.com]
> A caller may/may not pass the flags in qman enqueue multi API.
> This patch adds a check on that flag and only accesses it if passed by the
> caller.
> 
> Fixes: 43797e7b4774 ("bus/dpaa: support event dequeue and
> consumption")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Nipun Gupta <nipun.gupta@nxp.com>
> ---
>  drivers/bus/dpaa/base/qbman/qman.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/bus/dpaa/base/qbman/qman.c
> b/drivers/bus/dpaa/base/qbman/qman.c
> index 609bc76..e7fdf03 100644
> --- a/drivers/bus/dpaa/base/qbman/qman.c
> +++ b/drivers/bus/dpaa/base/qbman/qman.c
> @@ -2198,7 +2198,7 @@ int qman_enqueue_multi(struct qman_fq *fq,
>                 eq->fd.addr = cpu_to_be40(fd->addr);
>                 eq->fd.status = cpu_to_be32(fd->status);
>                 eq->fd.opaque = cpu_to_be32(fd->opaque);
> -               if (flags[i] & QMAN_ENQUEUE_FLAG_DCA) {
> +               if (flags && (flags[i] & QMAN_ENQUEUE_FLAG_DCA)) {
>                         eq->dca = QM_EQCR_DCA_ENABLE |
>                                 ((flags[i] >> 8) & QM_EQCR_DCA_IDXMASK);
>                 }

Series-
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [dpdk-dev] [PATCH 1/7 v2] bus/dpaa: check flag in qman multi enqueue
  2018-01-24  5:24     ` Hemant Agrawal
@ 2018-01-31 12:45       ` Thomas Monjalon
  0 siblings, 0 replies; 18+ messages in thread
From: Thomas Monjalon @ 2018-01-31 12:45 UTC (permalink / raw)
  To: Hemant Agrawal, Nipun Gupta; +Cc: dev, Shreyansh Jain, stable

> > Fixes: 43797e7b4774 ("bus/dpaa: support event dequeue and
> > consumption")
> > Cc: stable@dpdk.org
> > 
> > Signed-off-by: Nipun Gupta <nipun.gupta@nxp.com>
> 
> Series-
> Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>

Applied, thanks

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2018-01-31 12:46 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-01-23 12:31 [dpdk-dev] [PATCH 0/7] dpaa: fixes and performance improvement changes Nipun Gupta
2018-01-23 12:27 ` [dpdk-dev] [PATCH 0/7 v2] " Nipun Gupta
2018-01-23 12:27   ` [dpdk-dev] [PATCH 1/7 v2] bus/dpaa: check flag in qman multi enqueue Nipun Gupta
2018-01-24  5:24     ` Hemant Agrawal
2018-01-31 12:45       ` Thomas Monjalon
2018-01-23 12:27   ` [dpdk-dev] [PATCH 2/7 v2] bus/dpaa: allocate qman portals in thread safe manner Nipun Gupta
2018-01-23 12:27   ` [dpdk-dev] [PATCH 3/7 v2] mempool/dpaa: fix the phy to virt optimization Nipun Gupta
2018-01-23 12:27   ` [dpdk-dev] [PATCH 4/7 v2] bus/dpaa: fix port order shuffling Nipun Gupta
2018-01-23 12:27   ` [dpdk-dev] [PATCH 5/7 v2] net/dpaa: use phy to virt optimizations Nipun Gupta
2018-01-23 12:27   ` [dpdk-dev] [PATCH 6/7 v2] bus/dpaa: check portal presence in the caller API Nipun Gupta
2018-01-23 12:27   ` [dpdk-dev] [PATCH 7/7 v2] net/dpaa: further push mode optimizations Nipun Gupta
2018-01-23 12:31 ` [dpdk-dev] [PATCH 1/7] bus/dpaa: check flag in qman multi enqueue Nipun Gupta
2018-01-23 12:31 ` [dpdk-dev] [PATCH 2/7] bus/dpaa: allocate qman portals in thread safe manner Nipun Gupta
2018-01-23 12:31 ` [dpdk-dev] [PATCH 3/7] mempool/dpaa: fix the phy to virt optimization Nipun Gupta
2018-01-23 12:31 ` [dpdk-dev] [PATCH 4/7] bus/dpaa: fix port order shuffling Nipun Gupta
2018-01-23 12:31 ` [dpdk-dev] [PATCH 5/7] net/dpaa: use phy to virt optimizations Nipun Gupta
2018-01-23 12:31 ` [dpdk-dev] [PATCH 6/7] bus/dpaa: check portal presence in the caller API Nipun Gupta
2018-01-23 12:31 ` [dpdk-dev] [PATCH 7/7] net/dpaa: further push mode optimizations Nipun Gupta
